News

Elon Musk reacts to GPT-4 scoring in the 93rd percentile on SAT reading exam

OpenAI's claim of human-level performance, which it says lets GPT-4 outscore human test-takers on SAT exams, is one of the model's most intriguing elements

By Web Desk

March 15, 2023

OpenAI recently released GPT-4, the latest iteration of its main large language model. (AFP/file)

Twitter boss Elon Musk took to the microblogging platform to react to the new talk of the town: GPT-4. The news of the updated and more capable artificial intelligence passing SAT exams with flying colours has surprised many.

While reacting to GPT-4, the billionaire was quick to market his brain-chip startup Neuralink. "What will be left for us humans to do? We better get a move on with Neuralink!" he wrote.

OpenAI recently released GPT-4, the latest iteration of its main large language model. The company claims the new model is more capable and produces better results because it has been trained on more data.

GPT-4 is already being used by Bing's AI chatbot, according to Microsoft, as reported by Ubergizmo. Judging by the way things are going, chatbots in consumer products will probably start using this latest version in the coming weeks.

The new model can respond to images, for example by producing tags and descriptions or suggesting recipes based on a photo of the ingredients. It can process up to 25,000 words at a time, which is around eight times as much as ChatGPT.

Human-level performance

Many of the AI demonstrations that mesmerised us over the past six months have been powered by OpenAI's GPT language models, but what was already pretty good has gotten even better: the startup claims that the new model will give fewer factually incorrect answers.

The so-called human-level performance, which the company claims lets GPT-4 outscore most human test-takers on SAT exams, is one of the more intriguing elements, though. In simulated exams, the new model scored in the 93rd percentile on the SAT reading exam, the 89th percentile on the SAT math exam, and the 90th percentile on the bar exam.

Is it truly the best?

Even with all of its power, GPT-4 is still prone to making things up. In a recent blog post, the company acknowledged that its new model has limitations when it comes to social biases, hallucinations, and adversarial prompts, but said it is already working on improvements.

According to the OpenAI blog post, the differences between GPT-4 and GPT-3.5 are subtle in casual conversation but become apparent once a task's complexity exceeds a certain threshold. At that point, the company says, the newer version is more reliable, more creative, and better able to handle nuanced instructions.