Yes, it will be open source. I will share a cleaner, better version soon.
I'm testing on the enwik8/enwik9 dataset, but my plan is to use only the first 100,000 bytes for faster testing, until my score matches Byron Knoll's score for that prefix, which is about 20,000 bytes or less. For now I use the pre-processed version, which actually feeds in about 65,000 bytes.

With my plain algorithm of order-15 matches (training on letters as they pop out of the lossless decompression), without recency priming, I score ~24,950, or ~25,052 if I use a self-amplifier instead of controlling which matches are mixed first (because that control could hurt parallelism). With only a tiny amount of the full holes-and-delay mechanism (3 possible arrangements helped a lot and about 3 others helped only slightly, so roughly half helped a lot), I already score 24,252. There were 3 different keys to this that I realized this month which made it work better, and they are not at all obvious if you were to try this without heavy, smart thinking and experimenting! If I add all of it, I might score 22,500. If I add back my priming, that's worth about another 500, so maybe 22,000. Related words should, in theory, bring it down to 20,000 or so.

One other mechanism might be mirroring abilities using a command: matching with an example and then half of an unseen example. You can ask someone to say a very related word for any word, but that's impossible with related-word matches alone, even with holes and delays. And of course we now know that thinking (generating data on demand) allows better prediction too, as does tool use.

The last 2 days were good; I've learned a lot again, fast, and I think I may have a plan that could actually run on GPU. No backprop. It almost seems as if we don't need 10,000 GPUs training GPT-3/4 for a month: I think the network can be "constructed" first on CPU, literally as plain (yes, plain) strings stored hierarchically (not a trie/tree), in 1 day, and then run on GPU for inference using the special algorithm I have in mind, so that these exact strings also match non-exact inputs.
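To give a rough idea of what "order-15 matches scored in bytes" means, here is a toy sketch: a plain order-N back-off byte predictor whose output is scored as an ideal arithmetic-coding size. This is only my illustration of how such scores are counted, not my actual matching/mixing algorithm; the back-off rule and the Laplace smoothing are placeholders.

```python
import math
from collections import defaultdict

def compressed_size_bytes(data: bytes, max_order: int = 15) -> float:
    """Toy order-N predictor scored as an ideal arithmetic-coder size (bytes)."""
    # One context table per order: counts[o][context][next_byte] = frequency.
    counts = [defaultdict(lambda: defaultdict(int)) for _ in range(max_order + 1)]
    total_bits = 0.0
    for i, sym in enumerate(data):
        # Predict: back off from the longest previously-seen context to order 0.
        p = None
        for order in range(min(max_order, i), -1, -1):
            ctx = data[i - order:i]
            seen = counts[order][ctx]
            n = sum(seen.values())
            if n > 0:
                p = (seen[sym] + 1) / (n + 256)  # Laplace smoothing over bytes
                break
        if p is None:
            p = 1 / 256  # nothing learned yet: uniform over all byte values
        total_bits += -math.log2(p)
        # Train online on the byte we just "decompressed", at every order.
        for order in range(min(max_order, i) + 1):
            counts[order][data[i - order:i]][sym] += 1
    return total_bits / 8  # ideal coder uses -log2(p) bits per symbol
```

With an ideal arithmetic coder the total cost is the sum of -log2 p(symbol) bits, so repetitive input (e.g. `b"ab" * 50`) scores well under its raw length, while never-repeating bytes cost about 8 bits each.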
Possibly even all on GPU or all on CPU, but either way I believe it would cost "5 dollars", and not 5 million, to make GPT-3/4.

The hierarchy structure is what allows the half-activated memory to keep being matched once you get to the end of a long sentence with an injected middle part, like "the cat that was on the mat on the bed in the room is sleeping"; the node "the cat is sleeping" is then fully matched, with delay of course. Keep in mind that the delays in the many matches carry pattern too: if in one match the items are all stretched or rotated, it's still "a cat", because the delays all cancel out mathematically (or mostly, with some remaining penalty).

The numbers below, which I made this month, are probably slightly off (the code may lag on smaller data due to optimizations for bigger data, or it may simply do better with more data), but they are still helpful to look at. Byron's are the bottom 2 scores (he recently posted a better score, so these are outdated, but usable for now). If I score the number on the left, I should be able to score the number on the right:

25,000 > 18,497,970
24,000 > 17,758,051
23,000 > 17,018,132
22,000 > 16,278,214
21,000 > 15,538,295
20,054 > 14,838,332 (these are Byron's 2 scores, for 100,000 bytes and for 100,000,000 bytes; scaled linearly, that is a 26.01% drop)

Byron's scores as 10x more data is fed in (note I padded the top entry with trailing 0s for comparison only):

20,054,000 (for 100,000 bytes, inflated only for show)
17,638,800 (a 12.04% drop)
16,514,210 (a 6.38% drop)
14,838,332 (a 10.15% drop)

------------------------------------------
Artificial General Intelligence List: AGI
Permalink: https://agi.topicbox.com/groups/agi/T6cf3be509c7cd2f2-Maf8309f640fab7c45a61b4bb
Delivery options: https://agi.topicbox.com/groups/agi/subscription
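The delay-cancellation idea above (the node "the cat is sleeping" still matching across an injected middle part) can be shown with a toy scoring function: match the node's words as a subsequence of the input, then penalize only how inconsistent the gaps are, so a uniformly stretched match keeps a full score because the per-item delays cancel. The scoring formula here is my own illustration, not my actual mechanism.

```python
def delay_match(node, sentence, gap_penalty=0.1):
    """Return a score in [0, 1]; 0.0 if node is not a subsequence of sentence."""
    positions = []
    start = 0
    for word in node:
        try:
            idx = sentence.index(word, start)  # next occurrence, in order
        except ValueError:
            return 0.0  # a node word is missing entirely
        positions.append(idx)
        start = idx + 1
    # Gaps between consecutive matched items = the delay of each next item.
    gaps = [b - a for a, b in zip(positions, positions[1:])]
    if not gaps:
        return 1.0
    # If every gap equals the smallest gap, the delays cancel out entirely;
    # only the leftover irregularity is penalized.
    leftover = sum(g - min(gaps) for g in gaps)
    return 1.0 / (1.0 + gap_penalty * leftover)

node = ["the", "cat", "is", "sleeping"]
plain = "the cat is sleeping".split()
stretched = "the big cat here is now sleeping".split()
injected = "the cat that was on the mat on the bed in the room is sleeping".split()

print(delay_match(node, plain))      # 1.0 (exact match)
print(delay_match(node, stretched))  # 1.0 (uniform stretch: delays cancel)
print(delay_match(node, injected))   # ~0.48 (long injected middle: leftover penalty)
```

The stretched sentence still scores 1.0 because all gaps are equal, matching the point that stretched or rotated items are "still a cat"; the injected-middle sentence keeps a partial score, which is the "remaining penalty".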
