On Sat, Dec 21, 2024 at 5:14 AM PGC <[email protected]> wrote:
> * > there is a statement that when the system is scaled up dramatically > (172 times more compute resources), it manages to score 87.5%. The > difference between the 75.7% result and 87.5% result is thus explained by a > large disparity in the computational budget used for training or inference.* > *Yes, and if O3 had been given even more time it would've scored even higher, to me that indicates that the fundamental problem of AGI has been solved, and now it's just a question of optimizing things to make them more efficient. And if history is any guide that won't take long, today much smaller more compute efficient models can equal the performance of huge compute hungry state of the art models of just a few months ago. * *It's bizarre to realize that just a month and a half ago the majority of people in the USA thought the major problems facing the country were the trivial issues of illegal immigration and transsexual bathrooms, and that's why Donald Trump will be the most powerful hominid on earth during the most critical period in the entire history of his Homo sapiens species. * > > *> the model was explicitly trained on the very same data (or a > substantial subset of it) against which it was later tested. The text > itself says: “trained on the ARC-AGI-1 Public Training set”* > *I don't see how the fact that O3 was trained on the ARC-AGI-1 Public Training set could be considered cheating when the ARC people are the ones who released the ARC-AGI-1 Public Training set for the precise purpose, as its name indicates, of training AIs.* *> Beyond the bare mention of “trained on the ARC-AGI-1 Public Training > set,” there is an implied process of repeated tuning or hyperparameter > searches.* > *Yes, because that's what "training an AI" means! * *> children’s ability to adapt to novel tasks and generalize without being > artificially “trained” on the same data is a key part of the skepticism:* > *Human children need to go to school, so do newly born childish AIs. 
> a quote from the blog: "Passing ARC-AGI does not equate to achieving AGI, and, as a matter of fact, I don't think o3 is AGI yet."

The average human taking the ARC test will receive a score of about 50%; some very exceptionally talented humans can get a score of around 80%. About one year ago, back in the stone age when the best AIs scored only about 2% on the ARC test, Francois Chollet, the author of the above quote and the originator of the ARC test, said that if a computer got a score above 75% he would consider it an AGI. But now that O3 can get a score of 87.5% if it thinks for a long time, and 75.7% if it is only allowed a short time to think, Chollet has done what all AI skeptics have done since the 1960s: he has moved the goalposts.

> Furthermore, early data points suggest that the upcoming ARC-AGI-2 benchmark will still pose a significant challenge to o3

Yes, I'm certain computers will find it more difficult to get a high score on ARC-AGI-2, but human beings will find this new test to be even more difficult than computers do. Today's benchmarks are becoming obsolete because computers are rapidly maxing them out; that's why we need ARC-AGI-2, which will be very useful in comparing one AGI to another AGI.

John K Clark    See what's on my new list at Extropolis
<https://groups.google.com/g/extropolis>

--
You received this message because you are subscribed to the Google Groups "Everything List" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To view this discussion visit https://groups.google.com/d/msgid/everything-list/CAJPayv38NpQOu4uZwgxtAkxg%3DCu1n_K9oT%3Ds0WJpAEFt82BMmw%40mail.gmail.com.

