Matt,

A very detailed analysis. But what are you analysing? A theoretical limit
for Human Knowledge of the world? Related to a theoretical limit for the
size of language models?

No system will predict tomorrow's lottery numbers... Except the universe
itself.

Perhaps that itself is an argument for an infinite expansion of knowledge.
There will always be more.

This is Wolfram's "computational irreducibility" argument: the smallest
thing which can predict such a system is the system itself.
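As a toy illustration (my sketch, not Wolfram's code): Rule 30 is the
standard example. No closed-form shortcut is known for its centre column;
as far as anyone knows, to know what happens at step n you have to run
all n steps.

```python
# Toy illustration of computational irreducibility: Wolfram's Rule 30.
# No known formula shortcuts the centre column; you simply run the system.

def rule30_center(steps):
    """Run Rule 30 from a single 1 cell; return the centre-cell history."""
    width = 2 * steps + 3                  # wide enough that edges stay 0
    row = [0] * width
    row[width // 2] = 1
    history = [row[width // 2]]
    for _ in range(steps):
        # Rule 30 update: new cell = left XOR (centre OR right)
        row = [row[i - 1] ^ (row[i] | row[i + 1]) if 0 < i < width - 1 else 0
               for i in range(width)]
        history.append(row[width // 2])
    return history

print(rule30_center(8))   # begins [1, 1, 0, 1, ...]
```

The smallest predictor of that column is, in effect, the automaton itself.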

Language too, can only predict itself.

If we think of language as a compression of the world, it will be a lossy
compression. If the world is this kind of system, one which cannot be
(losslessly) compressed, then any compression of it must be lossy.

But it is not whether language is a (lossy) compression of the world which
is at issue. It is whether language itself, and the cognitive system (of
lossy compression of the world) that it represents, is capable of lossless
compression.
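An aside on why incompressibility forces lossiness (my sketch, just the
standard pigeonhole counting argument): no lossless code can shrink every
input, because there are fewer short descriptions than inputs.

```python
# Pigeonhole sketch: there are fewer short bit strings than n-bit strings,
# so no lossless code can map every n-bit input to something shorter.
n = 8
n_inputs = 2 ** n                          # 256 distinct 8-bit strings
n_shorter = sum(2 ** k for k in range(n))  # 255 strings of length 0..7
print(n_inputs, n_shorter)                 # 256 255: not enough codewords
```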

If it is not, if it operates not as a compression but as an expansion,
then that has consequences.

What you get from such a system is not perfect predictions about the
world; what you get is a system which can itself constantly get bigger.

Language might constantly get bigger. Perhaps not all of those expansions
will be useful when set against the world. It will probably be necessary to
set them against the world from time to time to ensure flights of fancy are
not too detached. But there will always be more books to be written.

For video, no doubt the problem of mapping this system to the world is
exacerbated. Language I do believe to be special, yes, because it is the
brain's own invention to influence itself. The brain will have worked to
expose only structure which is pertinent to the core meaning creation part
of the system. For vision, by contrast, no-one has helpfully whittled down
the data to exactly match what our core cognitive system will use. The
world has its own agendas and doesn't package its information in forms
directly relevant to our core cognitive mechanisms. We'll need to burrow
down and look into the biology of the retina and cortex to know what
information the brain uses to interpret it. What it uses to create the
illusion of this desk in front of me, from the clouds of energy fields
which are our current best approximation of a ground truth reality. With
language, the brain has already done that for you.

But that doesn't mean that the system we use for language will not also
apply to vision. Just that we may need to burrow down some to see the
parallels.

For vision, my guess is that saccades are playing some role in translating
the vision problem into sequences, which can then be operated on by the
brain's native prediction grouping mechanisms. Language doesn't need
saccades. It's already serial. But vision may need saccades. And obviously
other biology is filtering raw reality too, squeezing it into three cones
for colour perception etc.

However, I think once the filtering is done, the core cognition will
operate on vision the same way it operates on language: grouping sequences
to maximize prediction. And if we don't apply that core cognitive part of
the mechanism, then no amount of work with models of cones and even
saccades is going to capture a reasonable analogue of our cognitive visual
world.

So, yes, creating a comparable benchmark for vision would involve a lot of
science to whittle the visual system down to a level where it is comparable
with what the brain has already done for us with language.

As an argument for a maximal size of a cognitive model, then, a lot of
work has to be done before you even start getting numbers comparable to
what you have for language.

Your point about repeatability is part of that. It's really just the chaos
argument that no measurement can ever be precise enough.

But if you believe, as I do, that the essence of the system is to get
bigger, then adapting your "theoretical Human Knowledge size limit" model
beyond language is going to be a fruitless task anyway. Because the answer
is that there is no theoretical limit on the size of a model for Human
Knowledge. It's just going to get bigger all the time. (Not always
matching the world, it's true. Some weird art. Some people develop crazy
ideas. But even crazy ideas are an expansion in themselves at some level.)

-R

On Wed, Sep 3, 2025 at 1:27 AM Matt Mahoney <[email protected]> wrote:

> On Mon, Sep 1, 2025, 11:26 PM Rob Freeman <[email protected]>
> wrote:
>
>> On Mon, Sep 1, 2025 at 11:54 AM Matt Mahoney <[email protected]>
>> wrote:
>>
>>>
>>> The model representation in memory is several times larger than the input
>>>
>>
>> I just want to emphasize that line.
>>
>> What might be the theoretical limit in size, I wonder? Could there be no
>> limit?
>>
>
> A Hopfield net stores 0.15 bits per connection. A server farm keeps
> thousands of copies of the Linux kernel in RAM. The human brain stores 10^9
> bits using 10^15 synapses. Your body stores 10^13 copies of your DNA. The
> laws of physics probably have a few hundred bits of Kolmogorov complexity
> but describe a biosphere with 10^37 bits of DNA in a universe with a
> storage capacity of 10^90 bits and an entropy (Bekenstein bound of the
> Hubble radius) of 2.95 x 10^122 bits.
>
> So, yes there is a limit unless you include multiverse theories with an
> infinite number of finite universes and an overall Kolmogorov complexity of
> 0. But even in our observable universe, there is no computer big enough to
> simulate it to predict tomorrow's lottery numbers or to test grand unified
> theories.
>
> But we are just talking about testing LLMs using lossless compression, and
> I need to point out the limitations of this approach.
>
> 1. This test only works on deterministic computers, where you can reset to
> an earlier state and reproduce the same sequence of predictions to
> decompress a file. This is not possible with human brains.
>
> 2. This only works with language. It is good enough for passing the Turing
> test but it does not work with vision or robotics. The problem with pixel
> prediction in video is that most of the data is noise that is not
> perceptible to the eye, but would still have to be compressed. In theory
> you could compress raw video (10^9 bits per second) to a text description
> (10 bits per second) and uncompress by asking an AI to generate another
> video that looks about the same. That would rely on subjective evaluation
> rather than just comparing files.
>
> 3. LLM chatbots output the most likely continuation, which means they
> can't use the chain rule (p(xy) = p(x)p(y|x)) to predict a bit or token at
> a time and use it as context for the next prediction. Suppose you have:
>
> p(00) = .3
> p(01) = .3
> p(10) = 0
> p(11) = .4
>
> A language model would predict the next bit is 0 with probability .6 even
> though the correct response is 11. Solving this requires looking ahead and
> searching over the decision tree. Compression doesn't distinguish between
> chatbots that do this well vs poorly.
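(Sketching your point 3 in a few lines, with your numbers; a toy, not how
any particular chatbot actually decodes:)

```python
# Greedy, one-bit-at-a-time decoding vs the most likely whole sequence,
# using the probabilities from the example above.
p = {"00": 0.3, "01": 0.3, "10": 0.0, "11": 0.4}

p_first_0 = p["00"] + p["01"]              # 0.6: marginal P(first bit = 0)
greedy_first = "0" if p_first_0 > 0.5 else "1"

joint_best = max(p, key=p.get)             # argmax over whole sequences

print(greedy_first, joint_best)            # 0 11
```

The chain rule is correct as far as it goes; it is committing to the
greedy marginal at each step which loses "11".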
>
> 4. Current chatbots have separate training and test phases so that the
> parameters can be fixed and shared without leaking information between
> users. Doing this in a compressor would make compression worse. Compressors
> normally update the model after each prediction.

------------------------------------------
Artificial General Intelligence List: AGI
Permalink: 
https://agi.topicbox.com/groups/agi/Ta9b77fda597cc07a-Mab6992b321efce47350a0be7
Delivery options: https://agi.topicbox.com/groups/agi/subscription
