By Elite Cloud

ChatGPT 4 vs Claude 3! Worth switching?

Claude 3, a natural language processing AI developed by Anthropic, has made a breakthrough by defeating today’s standard, ChatGPT 4.0. We previously published an article explaining the benchmarks and the Claude 3 models.


But those were the on-paper results. Here we’ll test the Claude 3 models (Opus and Sonnet), compare them with ChatGPT 4.0, and see if Claude 3 is as good as it claims.


These tests are divided into three sections: general, code, and image analysis. Let’s get into it.

If you haven’t already, make sure to check our article on Claude 3. It’s a high-level rundown that covers its benchmark performance and other dope info you’ll wanna know.


General

Here, we will perform a few tests to see the response style and overall knowledge of each model. First, we’ll ask a few general questions.


Prompt: “Who won the Nobel Prize in Physics in 2020?”


Here both GPT 4 and Claude 3 – Sonnet were able to give the right answer.


testing chatgpt 4 for general questions

testing claude 3 for general questions

Next, we’ll have GPT 4 and Sonnet translate the output into Chinese. Both AIs were able to translate English to Chinese and vice versa while keeping the meaning intact.


testing chatgpt 4 for language translation

testing claude 3 for language translation

Next, we tried some basic riddles. In most cases, both AIs (GPT 4 and Sonnet) responded the same.


testing chatgpt 4 for riddle

testing claude 3 for riddle

We have performed many other tests on reasoning, logic, basic math, real-life suggestions, and more. In most cases, GPT 4 and Claude 3 Sonnet responded similarly, while Claude 3 Opus produced better output. The output from the Claude 3 models also felt more human-like compared to GPT 4. We decided not to include those tests here since the outputs were almost identical.


Code

In this test, we’ll ask ChatGPT and Claude to create scripts for automating tasks. They will be tasked with converting a program from one language to another while preserving its functionality. Additionally, they will be required to analyze code, identify bugs, and provide explanations. These tests won’t be intensive but will cover general tasks that a coder might need.


Test 1 – Creating Automation Scripts

First, we’ll ask ChatGPT 4.0 to create a Python script that takes URLs from a text file, captures full-page screenshots of those URLs, and saves them as PNG files.


Prompt: “Make me a python script that will automatically browse the URLs from a txt file and take full screenshots of the webpage also top to bottom and save it in a png file.”


testing chatgpt 4 with coding challenge

Just copy-pasting the script into a file and running it gives us some errors, as shown below.


testing chatgpt 4 with coding challenge

When we passed this error to ChatGPT, it suggested installing the Chrome browser or using Firefox instead. In the next prompt, I asked it to use Firefox since I already had Google Chrome installed.


testing chatgpt 4 with coding challenge

This time it gave me a few instructions to install `geckodriver`. So, I installed it and ran the new script.


testing chatgpt 4 with coding challenge

Now the script worked and I got the images I wanted in the current directory.
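For reference, a minimal Firefox-based script along these lines might look like the sketch below. This is an illustration, not ChatGPT’s exact output; it assumes Firefox, `geckodriver`, and the `selenium` package are installed, and the filenames `urls.txt` and `screenshots/` are our own choices.

```python
# Sketch of a full-page screenshot script (illustrative, not ChatGPT's exact code).
# Assumes: Firefox + geckodriver on PATH, and `pip install selenium`.
import os

def read_urls(path):
    """Return non-empty, stripped lines from a text file of URLs."""
    with open(path) as f:
        return [line.strip() for line in f if line.strip()]

def png_name(url):
    """Derive a filesystem-safe PNG filename from a URL."""
    safe = "".join(c if c.isalnum() else "_" for c in url)
    return safe.strip("_") + ".png"

def capture_all(url_file, out_dir="screenshots"):
    # Selenium is imported inside the function so the pure helpers
    # above remain usable even when Selenium isn't installed.
    from selenium import webdriver
    from selenium.webdriver.firefox.options import Options

    os.makedirs(out_dir, exist_ok=True)
    options = Options()
    options.add_argument("--headless")
    driver = webdriver.Firefox(options=options)
    try:
        for url in read_urls(url_file):
            driver.get(url)
            # Resize the window to the full page height so the
            # screenshot covers the page top to bottom.
            height = driver.execute_script("return document.body.scrollHeight")
            driver.set_window_size(1280, height)
            driver.save_screenshot(os.path.join(out_dir, png_name(url)))
    finally:
        driver.quit()

if __name__ == "__main__":
    capture_all("urls.txt")
```
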


testing chatgpt 4 with coding challenge

Claude 3 – Opus

Doing the same thing with Claude 3 Opus gives the code along with instructions on what needs to be done before executing the script. The code from Claude 3 Opus seems much easier to understand than GPT 4’s.


testing claude 3 opus with coding challenge

Running this code without modifying anything works on the first try. And it works using Chrome, not Firefox. The screenshots are also saved in a new directory this time.


testing claude 3 opus with coding challenge

The images it captured were proper, but the viewport size was smaller than ChatGPT’s. That was also fixed after we mentioned it in the next prompt.


testing claude 3 opus with coding challenge

Test 2 – Converting language

In this test, we will convert a Java program that hashes a string to MD5 into a Go program. We started with ChatGPT using the prompt “can you convert the code behavior in go language “`code“`”.


testing chatgpt 4 with programming language conversion

It delivered code that converts the string to MD5, and it worked on the first go.
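For readers who want to sanity-check the behavior being converted, the same thing can be sketched in a few lines of Python (this mirrors the programs’ behavior; it is not the Go code either AI produced):

```python
import hashlib

def md5_hex(s: str) -> str:
    """Return the MD5 hash of a string as lowercase hex digits,
    matching what the Java and Go programs compute."""
    return hashlib.md5(s.encode("utf-8")).hexdigest()

print(md5_hex("hello"))  # 5d41402abc4b2a76b9719d911017c592
```
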


testing chatgpt 4 with programming language conversion
testing chatgpt 4 with programming language conversion

Claude 3 – Opus

Using the same prompt in Claude 3 Opus gives the code with a better explanation than GPT 4 provided. Claude 3 also gives the expected output for the provided string.


testing claude 3 opus with programming language conversion

And it’s not hallucinating: that is the actual MD5 hash of the string provided.


testing claude 3 opus with programming language conversion

Test 3 – Check for bugs

In this test, we are going to give each AI a piece of C code with a buffer overflow bug. This is a very common class of bug that often leads to RCE and total system compromise.


Prompt: “is there any bug in my code “`#include <stdio.h>

int main() {

    int secret = 0xdeadbeef;

    char name[100] = {0};

    read(0, name, 0x100);

    if (secret == 0x1337) {

        puts("Wow! ");

    } else {

        puts("Hello");

    }

}“`”


testing chatgpt 4 with vulnerability identification in codes

ChatGPT was able to identify the bug and point out where it occurs.
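To make the overflow concrete: `read(0, name, 0x100)` accepts up to 0x100 = 256 bytes into a 100-byte buffer, so input past 100 bytes overwrites adjacent stack memory, including `secret`. The arithmetic can be sketched in Python; note the actual offset of `secret` depends on stack layout and compiler, so this is only an illustration, not a working exploit:

```python
import struct

BUF_SIZE = 100     # sizeof(name) in the C code
READ_SIZE = 0x100  # maximum bytes read(2) will accept

# Bytes that land past the end of `name`:
print(READ_SIZE - BUF_SIZE)  # 156

# Hypothetical payload shape: fill the buffer, then write 0x1337
# (little-endian) over the adjacent memory where `secret` might sit.
# The real offset varies with stack layout, so this is illustrative only.
payload = b"A" * BUF_SIZE + struct.pack("<I", 0x1337)
assert len(payload) <= READ_SIZE
```
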


Claude 3 – Opus

Just like ChatGPT, Claude 3 was able to detect the vulnerability, but in addition it explained the entire code and its behaviour. It also provided a bug-free version of the code.


testing claude 3 opus with vulnerability identification in codes

Test 4 – Analyze the code

In this test, we’ll provide ChatGPT with a PHP reverse shell and ask it for details. The prompt used here is “can you tell me whats going on here in details “`code“`”.


testing chat gpt 4 to analyse code

ChatGPT explained the code by grouping the actions. It was also able to detect the reverse shell.


Claude 3 – Opus

Using the same prompt with Opus gives us a much better explanation than ChatGPT’s. Here the explanation is given point by point instead of in groups. It was also able to detect the reverse shell.


testing claude 3 opus to analyze code

Image Analysis

In image analysis, we’ll check for facial recognition, detect multiple subjects from an image, and also detect geo-locations. 


Test 1 – Facial Recognition

Here we provided an image of the Mona Lisa and prompted “Do you know him?”; the “him” was intentional. Both ChatGPT 4 and Claude 3 Sonnet were able to recognize the image as the Mona Lisa.


testing chatgpt for facial recognition

testing claude 3 sonnet for facial recognition

Next, we provided an image of Obama without asking any questions. GPT 4 didn’t recognize the image at first, but it did after we asked about the person in the image.


testing chatgpt for facial recognition

testing chatgpt for facial recognition

But Claude 3 Sonnet recognized it just from the image.


testing claude 3 sonnet for facial recognition

Test 2 – Recognizing multiple subjects

In this test, we provided an image containing 7 animals together and asked both AIs to count the total number and identify the animals.


testing chatgpt 4 for subject identification in image

testing claude 3 for subject identification in image

Here both AIs got the total count wrong, but GPT 4 also misidentified one of the animals.


Next, we tested with an image that had multiple shapes in it. 


testing chatgpt 4 for object identification in image

Here ChatGPT 4 missed one shape while Claude 3 Sonnet was able to detect all 9 shapes in the image.


testing claude 3 for object identification in image

Test 3 – Checking GEO Locations

In this test, we provided a map image with a marker on Indonesia and asked both AIs to identify the country. ChatGPT 4 failed the test, unable to recognize the map.


testing chatgpt to identify global map

Claude 3 Sonnet, however, was able to detect the country properly.


testing claude 3 sonnet to identify global map

These tests were done in the Amazon Bedrock client, but the models are the same ones you will get on the official website. Amazon Bedrock lets you use multiple AIs in one place.
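If you want to reproduce this setup programmatically, a minimal Bedrock call via `boto3` might look like the sketch below. The model ID is an assumption based on Bedrock’s Claude 3 Sonnet naming; check your Bedrock console for the exact IDs enabled for your account and region.

```python
# Sketch of calling Claude 3 through Amazon Bedrock (assumes AWS
# credentials are configured and the model is enabled in your account).
import json

def build_claude_request(prompt, max_tokens=1024):
    """Build the Anthropic Messages API body that Bedrock expects for Claude 3."""
    return json.dumps({
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": prompt}],
    })

def ask_claude(prompt, model_id="anthropic.claude-3-sonnet-20240229-v1:0"):
    # boto3 is imported here so building the request body above
    # works without AWS credentials or boto3 installed.
    import boto3
    client = boto3.client("bedrock-runtime")
    response = client.invoke_model(modelId=model_id,
                                   body=build_claude_request(prompt))
    payload = json.loads(response["body"].read())
    return payload["content"][0]["text"]

if __name__ == "__main__":
    print(ask_claude("Who won the Nobel Prize in Physics in 2020?"))
```
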


We have run several other tests with images. In most cases, Claude 3 Sonnet’s answer was better than GPT 4’s. Claude 3 could understand the meaning of a question better than ChatGPT 4.


Conclusions

After performing a lot of tests with both AIs, including general tasks, coding, logic, analysis, reasoning, and more, it’s clear that Claude 3 is better than ChatGPT 4. It can keep a tone that feels like talking to a human, understand the true meaning behind questions, and provide appropriate answers.


But ChatGPT still offers more than Claude 3. You can browse the internet with GPT 4, generate images, and build your own custom bot in the ChatGPT console. As soon as Claude 3 adopts these features, though, I’ll see little reason to use ChatGPT, because the subscription cost of both AIs is the same.


In conclusion, both AIs work great. You should try both and pick whichever works best for you. I’ve been using GPT 4 since it was released, and I still use it every day. But if OpenAI doesn’t ship an update that beats Claude 3, I’ll most likely switch. Thank you for reading.




