The top program, NNCP, uses a transformer running on a GPU. I think I can
get the same effect using a short-term memory as context for next-token
prediction, where low-frequency tokens stay in memory longer. I'm not
sure, but I think the earlier versions of NNCP did this using an LSTM. It
is closed source, but there is a paper describing the algorithm.
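To make the idea concrete, here is a minimal sketch of what I mean by a
frequency-weighted short-term memory. The class name and the eviction rule
are my own illustration, not anything from NNCP: when the buffer is full,
the most frequent (least informative) token is evicted first, so rare
tokens persist longer as context.

```python
from collections import Counter

class FrequencyMemory:
    """Bounded short-term memory for next-token prediction context.
    When full, evict the most frequent token first, so low-frequency
    tokens stay in memory longer. Purely illustrative."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.counts = Counter()   # global token frequencies seen so far
        self.memory = []          # tokens currently held, oldest first

    def observe(self, token):
        self.counts[token] += 1
        self.memory.append(token)
        if len(self.memory) > self.capacity:
            # evict the oldest instance of the most common token held
            victim = max(self.memory, key=lambda t: self.counts[t])
            self.memory.remove(victim)

    def context(self):
        return list(self.memory)

m = FrequencyMemory(3)
for t in "the the cat the dog".split():
    m.observe(t)
# the common token "the" gets evicted; rare "cat" and "dog" survive
```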

I prefer simpler programs, but as Legg proved, good predictors are
necessarily complex. His proof goes something like this: suppose you have
a simple but powerful prediction algorithm. Then I can create a simple
sequence that your program can't predict: my program runs a copy of your
program and outputs the opposite of whatever it predicts, so your
predictor is wrong on every symbol.
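The diagonalization step above can be sketched in a few lines. The helper
names are mine; the `repeat_last` predictor is just a stand-in for "your
simple but powerful algorithm":

```python
def adversarial_sequence(predictor, length):
    """Build a bit sequence by running a copy of the predictor on the
    sequence so far and emitting the opposite bit, so the predictor is
    wrong on every single symbol by construction."""
    seq = []
    for _ in range(length):
        guess = predictor(seq)   # predictor's next-bit guess (0 or 1)
        seq.append(1 - guess)    # output the opposite
    return seq

# Stand-in predictor: guess the previous bit (or 0 to start).
def repeat_last(seq):
    return seq[-1] if seq else 0

seq = adversarial_sequence(repeat_last, 8)
# → [1, 0, 1, 0, 1, 0, 1, 0]; repeat_last misses every bit
```

The adversary is barely more complex than the predictor it defeats, which
is the point: no simple program can predict all simple sequences.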

Tuning for a single benchmark like enwik9 doesn't have this problem, but
it loses the generality that you want in an LLM. I'm still working on the
low-hanging fruit like reversing the XML, HTML, and Wiki formatting. The
more interesting work will be a language model that learns to parse the
input into tokens and discovers the syntactic and semantic relations
between them.
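By "reversing the formatting" I mean a lossless preprocessing transform:
map common markup strings to short codes before compression and invert
the map afterward. The dictionary and escape assumptions below are my own
toy illustration, not the actual preprocessor in any Hutter Prize entry:

```python
# Toy reversible preprocessor for enwik-style markup: replace common
# multi-byte markup strings with single reserved bytes, assuming those
# bytes never occur in the input. Illustrative only.

DICT = {"</text>": "\x01", "[[": "\x02", "]]": "\x03", "&quot;": "\x04"}
INV = {v: k for k, v in DICT.items()}

def encode(text):
    for k, v in DICT.items():
        text = text.replace(k, v)
    return text

def decode(text):
    for v, k in INV.items():
        text = text.replace(v, k)
    return text

s = '[[Category:Test]] said &quot;hi&quot;</text>'
assert decode(encode(s)) == s      # the transform is lossless
assert len(encode(s)) < len(s)     # and shrinks the markup
```

The shortened text then compresses better because the back-end model sees
one symbol per markup element instead of a multi-byte string.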

-- Matt Mahoney, [email protected]

On Wed, Oct 29, 2025, 5:45 AM <[email protected]> wrote:

> @Matt. Your best program scores 15.9 MB or so for enwik8, right? And it
> doesn't use any related-words mechanism?
>
> I see 2 of your programs: one is 4 KB when compressed, but the one that
> scores 15.9 is about 404 KB compressed. Is it possible to make that
> program 4 KB? How much can it be shrunk down, and to what?
>
> I was thinking small programs are better for others to use, but then why
> does your top program come in at about 404 KB, unlike the 4 KB one that
> scores about 17.8 MB?

------------------------------------------
Artificial General Intelligence List: AGI
Permalink: 
https://agi.topicbox.com/groups/agi/T6cf3be509c7cd2f2-M8c1c905111a1dca85aa28cc5
Delivery options: https://agi.topicbox.com/groups/agi/subscription
