Link Digest, Vol 392, Issue 9

link-request Mon, 21 Jul 2025 19:05:36 -0700

Send Link mailing list submissions to
        [email protected]

To subscribe or unsubscribe via the World Wide Web, visit
        https://mailman.anu.edu.au/mailman/listinfo/link
or, via email, send a message with subject or body 'help' to
        [email protected]


You can reach the person managing the list at
        [email protected]

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Link digest..."


Today's Topics:

   1. OpenAI's latest model solved 5 out of 6 problems on the
      International Math Olympiad exam (Stephen Loosley)


----------------------------------------------------------------------

Message: 1
Date: Mon, 21 Jul 2025 20:22:07 +0930
From: Stephen Loosley <[email protected]>
To: "link" <[email protected]>
Subject: [LINK] OpenAI's latest model solved 5 out of 6 problems on
        the International Math Olympiad exam
Message-ID: <[email protected]>
Content-Type: text/plain; charset="UTF-8"

OpenAI just won gold at the world's most prestigious math competition. Here's 
why that's a big deal.
 
By Lakshmi Varanasi Jul 20, 2025, 8:25 AM GMT+10 
https://www.businessinsider.com/openai-gold-iom-math-competition-2025-7

[Photo Caption: An experimental LLM from OpenAI won gold at the International 
Math Olympiad this week. sasirin pamai/Getty Images]


OpenAI's latest model solved five out of six problems on the International Math 
Olympiad exam.

OpenAI CEO Sam Altman called it "a significant marker of how far AI has come 
over the past decade."

AI skeptic Gary Marcus said he was "impressed" but that the model's utility is 
yet to be seen.

OpenAI's latest experimental model is a math whiz, performing so well on an 
insanely difficult math exam that everyone's now talking about it.

"I'm excited to share that our latest @OpenAI experimental reasoning LLM has 
achieved a longstanding grand challenge in AI: gold medal-level performance on 
the world's most prestigious math competition, the International Math Olympiad 
(IMO)," Alexander Wei, a member of OpenAI's technical staff, said on X.

The International Math Olympiad is a global competition that began in 1959 in 
Romania and is now considered one of the hardest in the world. It runs over 
two days, and on each day participants sit a four-and-a-half-hour exam with 
three questions. Some famous winners include Grigori Perelman, who 
helped advance geometry, and Terence Tao, recipient of the Fields Medal, the 
highest honor in mathematics.

In June, Tao predicted on Lex Fridman's podcast that AI would not score high on 
the IMO. He suggested researchers shoot a bit lower. "There are smaller 
competitions. There are competitions where the answer is a number rather than a 
long-form proof," he said.

Yet OpenAI's latest model solved five out of six of the problems correctly, 
working under the same testing conditions as humans, Wei said.

Wei's colleague, Noam Brown, said the model displayed a new level of endurance 
during the exam.

"IMO problems demand a new level of sustained creative thinking compared to 
past benchmarks," he said. "This model thinks for a long time."

Wei said the model is an upgrade in general intelligence. The model's 
performance is "breaking new ground in general-purpose reinforcement learning," 
he said. DeepMind's AlphaGeometry, by contrast, is designed specifically to 
do math.

"This is an LLM doing math and not a specific formal math system; it is part of 
our main push towards general intelligence," Altman said on X.

"When we first started openai, this was a dream but not one that felt very 
realistic to us; it is a significant marker of how far AI has come over the 
past decade," Altman wrote, referring to the model's performance at the IMO.

Altman added that a model with a "gold level of capability" will not be 
available to the public for "many months."

The achievement is an example of how fast the technology is developing. Just 
last year, "AI labs were using grade school math" to evaluate models, Brown 
said. And tech billionaire Peter Thiel said last year it would take at least 
another three years before AI could solve US Math Olympiad problems.


Still, there are always skeptics.

Gary Marcus, a well-known critic of AI hype, called the model's performance 
"genuinely impressive" on X. But he also posed several questions about how the 
model was trained, the scope of its "general intelligence," the utility for the 
general population, and the cost per problem. Marcus also said that the IMO has 
not independently verified these results.

--



------------------------------

Subject: Digest Footer

_______________________________________________
Link mailing list
[email protected]
https://mailman.anu.edu.au/mailman/listinfo/link


------------------------------

End of Link Digest, Vol 392, Issue 9
************************************
