OpenAI O1 Released With Reasoning Abilities


OpenAI O1 Released With Reasoning Abilities: OpenAI has released the long-awaited o1 model, along with a smaller, cheaper version called o1-mini. The company says these models can "reason through complex tasks and solve harder problems than earlier models in science, coding, and math." The models were previously known by the codename Strawberry.

OpenAI says this is the first in a series of reasoning models coming to ChatGPT and its API, and that this release is still a preview, with more to follow. The models were trained to "think through problems more before they react, much like a person would," according to the company. Through this training, they learn to refine their thinking, try different strategies, and recognize their mistakes.

OpenAI says this model update performs comparably to PhD students on difficult benchmark tasks in physics, chemistry, and biology, which is what makes o1 so impressive. It also performs strongly in math and coding.


| Feature | Description |
| --- | --- |
| Model Name | OpenAI o1 |
| Training Method | Reinforcement learning |
| Reasoning Capability | Complex reasoning with an internal chain of thought |
| Competitive Programming | Ranks in the 89th percentile on Codeforces |
| Math Olympiad Performance | Top 500 in the USA Math Olympiad qualifier (AIME) |
| GPQA Benchmark | Surpasses human PhD-level accuracy in physics, biology, and chemistry |
| Training Efficiency | Highly data-efficient training process |
| Test-Time Compute | Performance improves with more test-time compute |
| Reasoning Benchmarks | Outperforms GPT-4o on reasoning-heavy tasks |
| MMLU Subcategories | Improves on 54 of 57 MMLU subcategories |
| Math Performance | 74% average on AIME exams with a single sample per problem |
| Consensus Accuracy | 83% with consensus among 64 samples |
| Re-ranking Accuracy | 93% when re-ranking 1,000 samples with a learned scoring function |
| Human Expert Comparison | Rivals human experts on reasoning-heavy benchmarks |
| Model Availability | Early version available as o1-preview |
| API Access | Available to trusted API users |
| Training Constraints | Constraints differ from LLM pretraining |
| Performance Improvement | Improves consistently with more reinforcement learning |
| Benchmark Performance | Large gains over GPT-4o on challenging reasoning benchmarks |
| Use Cases | Suited to complex reasoning tasks across domains |
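The "consensus among 64 samples" figure refers to sampling the model many times on the same problem and taking the majority answer. A minimal sketch of that voting scheme (the answer strings below are hypothetical, and this is not OpenAI's actual pipeline):

```python
from collections import Counter

def consensus_answer(samples):
    """Majority vote over independent model samples for one problem.

    Returns the most common answer and the fraction of samples
    that agreed with it.
    """
    counts = Counter(samples)
    answer, votes = counts.most_common(1)[0]
    return answer, votes / len(samples)

# Hypothetical final answers from 8 independent samples of one AIME problem
samples = ["204", "204", "197", "204", "204", "113", "204", "204"]
best, agreement = consensus_answer(samples)
print(best, agreement)  # → 204 0.75
```

The 93% re-ranking figure works differently: instead of a simple majority, a learned scoring function ranks 1,000 candidate answers and the top-scored one is chosen.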

On a qualifying exam for the International Math Olympiad (IMO), GPT-4o solved only 13% of the problems correctly, while the new reasoning model solved 83%. OpenAI notes that because o1 is still an early model, it lacks many of the features that make ChatGPT useful, such as browsing the web and uploading files and images; for those uses, the company says GPT-4o is still the better choice.

But OpenAI calls it "a significant advancement and a new level of AI capability" that will be valuable for jobs requiring complex reasoning. ChatGPT Plus and Team users can use the o1 models now by selecting them manually in the model picker. At launch, o1-preview is limited to 30 messages per week and o1-mini to 50. Starting next week, ChatGPT Enterprise and Edu users will get access to both models. The company plans to make o1-mini available to all ChatGPT users in the future.
