Yes, I know I wrote a paper in 2013 estimating that automating the economy with AGI would cost $1 quadrillion, mostly to collect 10^17 bits of human knowledge. This proved correct when it took companies with trillion-dollar market caps to produce LLMs. Those are the ones that have access to your emails, texts, and social media posts, which go far beyond the 10^13 bits you can suck off the public internet. Even so, we are less than 1% of the way there, which is why AI has not put a dent in the employment rate yet.
But that's not what I'm trying to build. I'm building a human-level small language model (SLM). It's not a probabilistic logic knowledge base like the ones that Ben Goertzel, YKY, and Pei Wang were developing before they left the group when LLMs proved in 2023 that all you need to pass the Turing test is text prediction, as I predicted in a 1999 paper. That's basically the Hutter prize. I do appreciate that 2 of the 3 Hutter prize committee members (me and James Bowery) are still active here, and that others (Immortal Discoveries, or submerge on encode.su) are pursuing this approach as well.
My math mostly agrees with Turing's 1950 prediction that a computer with 10^9 bits of memory, but no faster than then-current technology (mechanical relays are as fast as neurons), would win the imitation game (now known as the Turing test) by 2000. His forecast of Moore's law was remarkably prescient, given that Gordon Moore didn't state it until 1965. Turing's paper was published just after Shannon invented information theory and estimated the entropy of English at about 1 bit per character, consistent with the top results on my large text benchmark. It also predates Landauer's 1986 estimate of 10^9 bits of human long term memory capacity, although Turing could have easily estimated how many words we process in a lifetime.
My math says an SLM can be implemented on a single CPU at 10,000 x real time, compressing a lifetime of learning into a day. You have a vocabulary of about 50K tokens with a Zipf distribution, where the nth most frequent word has a frequency of about 0.1/n. You have a short term memory of about 7 tokens, where low frequency tokens persist longer. You have a 50K by 50K matrix mapping short term memory to the predicted token, with the sparse parts of the matrix implemented as hidden layers in a neural network to cut the parameter space to 10^9.
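As a rough sanity check on the numbers above, here is a minimal Python sketch. It is not the SLM itself, just arithmetic on the stated figures. All function and constant names are made up for illustration, and it assumes the 0.1/n rule holds exactly for every rank, which real text only approximates. Note that 0.1/n summed over 50K ranks comes to slightly more than 1, so the sketch normalizes before computing entropy.

```python
import math

VOCAB_SIZE = 50_000   # vocabulary size stated in the post
ZIPF_SCALE = 0.1      # frequency of the rank-n word is about ZIPF_SCALE / n
SPEEDUP = 10_000      # stated speed relative to real time


def zipf_frequencies(vocab_size=VOCAB_SIZE, scale=ZIPF_SCALE):
    """Unnormalized Zipf frequencies scale/n for ranks 1..vocab_size."""
    return [scale / n for n in range(1, vocab_size + 1)]


def unigram_entropy_bits(freqs):
    """Entropy in bits per token after normalizing freqs to sum to 1.

    This is the cost of coding each token with no context at all,
    so it is only an upper reference point for a real model.
    """
    total = sum(freqs)
    return -sum((f / total) * math.log2(f / total) for f in freqs)


def simulated_years_per_day(speedup=SPEEDUP):
    """Years of real-time experience covered by one day of compute."""
    return speedup / 365.25


if __name__ == "__main__":
    freqs = zipf_frequencies()
    print(f"sum of 0.1/n over 50K ranks: {sum(freqs):.3f}")
    print(f"context-free entropy: {unigram_entropy_bits(freqs):.2f} bits/token")
    print(f"dense 50K x 50K matrix entries: {VOCAB_SIZE ** 2:.2e}")
    print(f"one day at 10,000x covers {simulated_years_per_day():.1f} years")
```

The dense matrix has 2.5 x 10^9 entries, which is why the sparse parts have to be folded into hidden layers to get down to about 10^9 parameters.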
Updates should be fast because the learning rate is only about 4 bits per token, so only a small number of parameters need to be updated. Predictions should likewise be fast if we implement an attention mechanism in the hidden layer (like in transformers), where all but the few most active neurons are set to 0. But it is still hard. I suppose if it weren't, we would have solved AI 23 years earlier.
Two months ago I released a version that compressed enwik9 to 145 MB in 10 minutes using article sorting by topic, XML unwrapping, capitalization encoding, a tiny dictionary, and a pure linear context model. The plan is to mix these predictions with the language model, which I have yet to write. Instead I spent the last 2 months refining the context model. I had a bunch of ideas to dramatically improve speed or memory usage, but ended up spending days implementing and debugging them, only to see that each one either didn't work or improved things so marginally that it wasn't worth the effort. As the program grows, each update is like brain surgery, carefully changing 1 or 2 lines and testing in case I broke something and have to go back. In 2 months, all I have to show for it is 142 MB in 20 minutes, a tiny movement along the Pareto frontier that isn't even worth releasing. I need to get to 110 MB, but as I do, testing times will go from minutes to hours to days.
There's something I'm not getting. Why does the brain need 10^15 synapses to store 10^9 bits? Maybe it's a speed optimization, like how a server farm has a million copies of Linux, or your body has 10^13 copies of your DNA. Or is it something else? Is it the reason we didn't solve AI in 2000?
-- Matt Mahoney, [email protected]
------------------------------------------
Artificial General Intelligence List: AGI
Permalink: https://agi.topicbox.com/groups/agi/Tc9fe35df94409188-M580dec3e9f299a9aad2af686
