On Sat, Jun 15, 2024 at 1:29 AM twenkid <twen...@gmail.com> wrote:
>
> ...
> 2. Yes, the tokenization in current LLMs is usually "wrong", ... it should
> be on concepts and world models: ... it should predict the *physical* future
> of the virtual worlds
Thanks for the comments. I can see you've done a lot of thinking, and I see similarities in many places, not least Jeff Hawkins' HTM and Friston's Active Inference. But I read what you are suggesting as a solution to the current "token" problem for LLMs to be, like that of a lot of people currently, LeCun prominently, that we need to ground representation more deeply in the real world.

I find this immediate retreat to other sources of data kind of funny, actually. It's like... studying the language problem has worked really well, so the solution to move forward is to stop studying the language problem! We completely ignore why studying the language problem has caused such an advance, and blindly, immediately throw away our success and look elsewhere.

I say look more closely at the language problem. Understand why it has caused such an advance before you look elsewhere. I think the reason language models have led us to such an advance is that the patterns language prompts us to learn are inherently better. "Embeddings", gap fillers, substitution groupings, are just closer to the way the brain works, and language has led us to them.

So OK, if "embeddings" have been the advance, replacing both the fixed labeled objects of supervised learning and the fixed objects based on internal similarities of "unsupervised" learning, leading us instead to open-ended categories based on external relations, why do we still have problems? Why can't we structure better than "tokens"? Why does it seem like they've led us the other way, to no structure at all?

My thesis is actually pretty simple. It is that these open-ended categories of "embeddings" are good, but they contradict. These "open" categories can have a whole new level of "open": they can change all the time. That's why it seems like they've led us to no structure at all. Actually we can have structure; it is just that we have to generate it in real time, not try to learn it all at once.
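To make that concrete, here is a toy sketch of what "generating structure in real time" could mean, purely my own illustration of the idea: a greedy, BPE-style pair merger that derives its "word" tokens from the statistics of the current context window alone, instead of from a single vocabulary learned once and then fixed. The function name and the `min_count` threshold are hypothetical choices, not anything proposed in this thread.

```python
from collections import Counter

def contextual_tokens(window: str, min_count: int = 2) -> list[str]:
    """Toy sketch: grow "word" tokens out of "letter" tokens using only
    the pair statistics of the local context window (greedy, BPE-style
    merging), instead of a fixed, globally learned vocabulary."""
    tokens = list(window)                      # start from "letter" tokens
    while True:
        pairs = Counter(zip(tokens, tokens[1:]))
        if not pairs:
            break
        (a, b), count = pairs.most_common(1)[0]
        if count < min_count:                  # nothing recurs in THIS context
            break
        merged, i = [], 0                      # replace every (a, b) with "ab"
        while i < len(tokens):
            if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) == (a, b):
                merged.append(a + b)
                i += 2
            else:
                merged.append(tokens[i])
                i += 1
        tokens = merged
    return tokens

# The same letters chunk differently depending on the window they sit in,
# and a window with no repetition stays at the "letter" level:
print(contextual_tokens("the cat sat on the mat"))
print(contextual_tokens("then and then and then"))
print(contextual_tokens("abcdef"))
```

The point is only that the "vocabulary" is recomputed per context, so the same character sequence can yield different "words" in different windows; a real system would obviously need something better than raw pair counts, and higher levels ("phrases", "sentences") would repeat the same move over the tokens the level below just produced.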
That's really all I'm saying, and my solution to the "token" problem. It means you can start with "letter" tokens and build "word" tokens, and also "phrases", whole hierarchies. But you have to do it in real time, because the "tokens" ("words", "something", "anything", "any thing", two "words", one "word"... whatever) can contradict, and have to be found always and only in their relevant context.

Do you have any comments on that idea, that patterns of meaning which can be learned contradict, and so have to be generated in real time? I still see basically nobody addressing it in the machine learning community.

It's a little like Matt's "modeling both words and letters" comment, but it gets beneath both. It doesn't only use letters and words; it creates both "letters" and "words" as "fuzzy", or contradictory, constructs in themselves. And then it goes on to create higher-level structures, hierarchies, phrases, sentences, as higher "tokens", facilitating logic, symbolism, and all those other artifacts of higher structure which are currently eluding LLMs. All levels of structure become accessible if we just accept that they may contradict, and so have to be generated in context, at run time.

It's also not unrelated to James' definition of "a thing as anything that can be distinguished from something else." Though that is more at the level of equating definition with relationship, or "embedding", and doesn't get into the missing, contradictory, or "fuzzy" aspect. It does allow that fuzzy aspect to exist, though, and leads to it once you imagine it might, because it decouples the definition of a thing from any single internal structure of the thing itself.

------------------------------------------
Artificial General Intelligence List: AGI
Permalink: https://agi.topicbox.com/groups/agi/T682a307a763c1ced-M9013929b4653571c40328f7b
Delivery options: https://agi.topicbox.com/groups/agi/subscription