On Saturday, June 22, 2024 at 1:51:56 AM UTC+2 John Clark wrote:
Yesterday some people were saying that improvement in large language models had hit a wall, but they can't say that today, because Claude 3.5 Sonnet came out today and it beats GPT-4o on most benchmarks. But the really amazing thing is that it's MUCH smaller than GPT-4o, and thus much faster and much cheaper to operate. The company that makes it, Anthropic, says it will release its far larger version, Claude 3.5 Opus, sometime later this year. I think it's going to be amazing.

Anthropic's SHOCKING New Model BREAKS the Software Industry! Claude 3.5 Sonnet Insane Coding Ability <https://www.youtube.com/watch?v=_mkyL0Ww_08>

With all the headlines proclaiming that AI achieved this or surpassed that milestone, does the absence of the basic distinction between narrow and general AI not strike us as conspicuous? LLMs are narrow AI, designed to perform specific tasks within the limits of their training data. They excel at tasks like text prediction, pattern recognition, and generating creative content, but lack the broad understanding and cognitive abilities needed to handle diverse and unpredictable situations, which is the defining property of general AI. Marketing often blurs this distinction, leading the public and investors to overestimate the "intelligence" of narrow AI.

For years, narrow AI has been making significant strides in natural language processing, image recognition, guessing your shopping preferences on Amazon, navigation, game playing, and so on. Companies capitalize on these accomplishments, and on the ambiguous use of terms like "intelligence" and "learning" in public discourse, to suggest that today's AI systems are somehow far more advanced than their progress over the years would indicate. This confusion allows companies to imply that their AI technologies are as competent as a student passing a test, more effective than doctors at finding patterns in datasets for cancer research or drug development, and capable of generating mathematical proofs.
They effectively cash in on this ambiguity, leading people and investors to think their algorithms are getting generally smarter, or that scientists have built AIs so powerful that general intelligence is within reach. While the advances in narrow AI are impressive, and have been for years, they come with limitations. For example, Elon Musk promised us perfect autonomous driving years ago, yet reality continually presents situations outside the training set, leading to safety concerns and... accidents. Real ones. This illustrates the difficulty of achieving the "general" part of AI, which must handle unexpected cases.

The coupling of ever more effective narrow AI, marketing opportunities and profits, the mystique of general AI that said marketing relies on, and the public belief in superintelligence and technological progress all work together to hype the public into believing that their phones, apps, and browsers are smarter than they are. Benchmarks are mainly memory-reliant, unlike the ARC test or similar types of problems, and the public is bombarded with questionable "breakthroughs" in the headlines stating that some model achieved a high score on this or that test. These headlines conveniently omit that expanding the training data toward some narrow, domain-specific task will yield such abilities trivially.

No number of benchmarks aced from memory can prepare narrow AI for the real world and the unexpected cases absent from its training data. This is why it matters how benchmark results are achieved. Anticipating a domain-specific problem and handing the AI a cheat sheet through training-data adaptation and fine-tuning on the fly is not the same as the technology meeting an unexpected situation in reality, without software engineers tweaking the thing live. The distinction, and our understanding of it, affects the technologies we develop that rely on these AIs.
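The memorization point can be made concrete with a deliberately crude sketch. This is purely illustrative — no real benchmark, model, or training procedure is being depicted — but it shows why a perfect score on in-distribution questions says nothing about out-of-distribution ones:

```python
# Toy illustration: a "model" that merely memorizes question-answer
# pairs aces any test drawn from its training data, yet fails the
# moment a question falls outside it. (Hypothetical data throughout.)

training_data = {
    "2+2": "4",
    "capital of France": "Paris",
    "3*3": "9",
}

def memorizing_model(question):
    """Pure lookup -- recall, not understanding."""
    return training_data.get(question, "no idea")

def score(questions):
    """Fraction of questions the model can answer at all."""
    answered = sum(memorizing_model(q) != "no idea" for q in questions)
    return answered / len(questions)

# "Benchmark" drawn from the training distribution: perfect score.
score_seen = score(["2+2", "capital of France", "3*3"])

# Novel questions a generalizing system would handle trivially: total failure.
score_unseen = score(["2+3", "capital of Japan"])

print(score_seen, score_unseen)  # 1.0 vs 0.0
```

A headline reporting only `score_seen` would look like a breakthrough; the gap between the two numbers is the whole point of ARC-style tests.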
Our understanding of the terms we use, their biases, and honest appraisals of the genuine possibilities and limitations are crucial. For now, commercial interests profit from this ambiguity, but clarity and transparency will benefit technological progress and public trust in the long run.

Do note, I am not claiming there is nothing to LLMs and their recent boost in applications, capacity, conveniences offered, and similar developments. But it's narrow AI by definition, and occasional advances and spurts of growth should be expected. It's fascinating, but it's also... what do we call it... marketing and money.

--
You received this message because you are subscribed to the Google Groups "Everything List" group.
To unsubscribe from this group and stop receiving emails from it, send an email to everything-list+unsubscr...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/everything-list/76096d30-3443-4f3f-906e-e0f517fb88a1n%40googlegroups.com.