On Wed, Oct 15, 2025 at 11:40 AM Matt Mahoney <[email protected]> wrote: ...
> 4. Tiny dictionary encoding using byte pair encoding, replacing the least
> frequent bytes with codes for the most frequent byte pairs until there is
> no more size reduction, which takes about 6 passes when the pairing is
> restricted to groups of letters or groups of repeated punctuation symbols
> that are used in the XML, HTML, and Wikipedia markup....
> I have some ideas for tokenization and for modeling a semantic network
> with an attention mechanism like in a transformer, but that doesn't require
> a GPU to run reasonably fast. But it will be awhile before I have any code
> ready to release.
> ...

Yesterday, a dynamical MDL approach to the Re-Pair algorithm emerged from my Lean 4 formalization of Tom Etter's Relation Arithmetic approach to foundations, which I have been working on so that I get the philosophy of causality right from the get-go. A couple of years ago I played around with Re-Pair as an approach to lossless compression, but I couldn't figure out a principled approach to causality using it.

My goal in life at this point is to demonstrate discovery of macrosocial dynamics latent in the data by approximating the Algorithmic Information Criterion for model selection, and to get a prize funded to nuke the social pseudosciences before they succeed in nuking humanity with their "Alignment" theocracy. So I had to continue working on the philosophy of causality itself, which is why I revisited Etter's largely unpublished corpus. I didn't even realize the formalization was incorporating a variation of BPE until I began examining the structure of the causal graphs more carefully.

Too bad the Desis thought their H-1B attack on the West was more important than supporting people like Tom <https://en.wikipedia.org/wiki/Dartmouth_workshop>.

> Around June 18, 1956, the earliest participants (perhaps only Ray
> Solomonoff, maybe with Tom Etter) arrived at the Dartmouth campus in
> Hanover, N.H., to join John McCarthy who already had an apartment there.
> Solomonoff and Minsky stayed at Professors' apartments, but most would stay
> at the Hanover Inn.

------------------------------------------
Artificial General Intelligence List: AGI
Permalink: https://agi.topicbox.com/groups/agi/T6cf3be509c7cd2f2-Md3fbf44283528ff585b03226
Delivery options: https://agi.topicbox.com/groups/agi/subscription
