Oh!! GANs refine/train the Generator by using a Critic until the generations
are more real/lossless, like the Hutter Prize explains. It's just compressing
data so it can extract/generate all the good stuff.
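
To make that Generator/Critic loop concrete, here is a minimal toy sketch in
PyTorch, assuming 1-D Gaussian "real" data and tiny made-up MLPs, nothing from
an actual Hutter Prize setup:

# Minimal GAN training loop (PyTorch): a critic/discriminator pushes the
# generator's samples toward the real data distribution.
# Toy setup: the "real" data is a 1-D Gaussian; everything here is illustrative.
import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 1))   # generator
D = nn.Sequential(nn.Linear(1, 32), nn.ReLU(), nn.Linear(32, 1))   # critic
opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

for step in range(200):
    real = torch.randn(64, 1) * 0.5 + 3.0        # samples from the "real" distribution
    fake = G(torch.randn(64, 8))                 # generator's attempt

    # 1) Train the critic to tell real from fake.
    d_loss = bce(D(real), torch.ones(64, 1)) + bce(D(fake.detach()), torch.zeros(64, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # 2) Train the generator to fool the critic, i.e. make generations "more real".
    g_loss = bce(D(G(torch.randn(64, 8))), torch.ones(64, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
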
But why compress? It's faster to let the context find matches. Again, brute
force can do what AGI can do, but a brain uses SO little data and can know
what will occur after any unseen data using SO little data. So the more
diverse data it learns, the exponentially more it knows, but it can compress
it so it's a tad more usable. Ok, but why compress? You have 100GB and know it
all. Why compress!?
The perfect small-world network would have one of each unique quantized node,
no double nodes. You recognize all the similar contexts using just one stored
node. And maybe a few other hill peaks in my viz ('the cat' activates 'this
dog ate' and 'my mom ate'), which act like Attention Heads in Transformers?
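
Something like this toy sketch is what I mean by one stored node recognizing
all the similar contexts, with plain word overlap standing in for whatever
similarity the real network would use:

# Toy sketch of "one stored node recognizes similar contexts":
# phrases are stored once (no double nodes), and an incoming context
# activates every stored node whose words overlap enough.
# The overlap measure here is just my assumption for illustration.
stored_nodes = {"the cat ate", "this dog ate", "my mom ate", "we went home"}

def activation(context: str, node: str) -> float:
    a, b = context.split(), node.split()
    shared = len(set(a) & set(b))
    return shared / max(len(a), len(b))

def activate(context: str, threshold: float = 0.3):
    # Returns the "hill peaks": stored nodes similar enough to light up.
    return {n: round(activation(context, n), 2)
            for n in stored_nodes if activation(context, n) >= threshold}

print(activate("the cat ate"))
# e.g. {'the cat ate': 1.0, 'this dog ate': 0.33, 'my mom ate': 0.33}
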
So, what if the 100GB is 40% wrong/not real? One bad answer leads to another
bad answer, exponentially worse the more you know such data. You prune false
data by collecting more real-world data. But we are still left with 60GB of
true data. It seems like it wants a small-world network. But why???
It 'feels' like Earth loses data if your compressor gets too small, and it
sometimes takes longer to regenerate/decompress. Well, Byte Pair Encoding
lowers Cost yet retains ALL data. Hierarchy does too. Lossless. So if you
delete some nodes like 'my cat ate' and keep 'my dog ate', it lets you still
regenerate them back because of similar words/positions. But it tells you
something, it connects something.
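
BPE is a nice concrete case of lowering Cost while keeping everything, so here
is a bare-bones, character-level sketch of it (my own toy version, not any
particular library's):

# Minimal Byte Pair Encoding sketch: repeatedly merge the most frequent
# adjacent pair into a new symbol. Cost (sequence length) drops, but the
# merge table lets you expand everything back, so nothing is lost.
from collections import Counter

def bpe_compress(text, num_merges=10):
    seq, merges = list(text), []
    for _ in range(num_merges):
        pairs = Counter(zip(seq, seq[1:]))
        if not pairs:
            break
        (a, b), count = pairs.most_common(1)[0]
        if count < 2:
            break
        new_sym = a + b
        merges.append((a, b, new_sym))
        out, i = [], 0
        while i < len(seq):
            if i + 1 < len(seq) and seq[i] == a and seq[i + 1] == b:
                out.append(new_sym); i += 2
            else:
                out.append(seq[i]); i += 1
        seq = out
    return seq, merges

def bpe_decompress(seq, merges):
    # Undo merges in reverse order: lossless by construction.
    for a, b, new_sym in reversed(merges):
        out = []
        for s in seq:
            out.extend([a, b] if s == new_sym else [s])
        seq = out
    return "".join(seq)

text = "my dog ate, my cat ate, my dog ate"
compressed, merges = bpe_compress(text)
assert bpe_decompress(compressed, merges) == text   # all data retained
print(len(text), "->", len(compressed), "symbols")
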
Let's think: a given phrase node activates many, and each one it activates can
be activated by many, while the Next Token can be gotten from, e.g., 100 nodes
(hence 10,000 inputs), and a given node has, e.g., 50 Next Tokens too. But x
does sit in all 100 nodes, and if cat activates dogs in nodes, that was
because of context handles. When 'dog' is the predicted Next Word, 100 phrases
need to predict it, so they can all activate one single node basically, but
this is already allowed without touching hierarchy... if it's related, it will.
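
Here is a tiny sketch of that pooling: many phrase nodes, each with its own
Next Token counts, all voting together. The bigram nodes and the matching rule
are just placeholders I picked for illustration:

# Sketch of "many phrase nodes all voting for the Next Token":
# each stored phrase node keeps counts of what followed it, an incoming
# context activates the nodes it matches, and their votes get pooled.
from collections import Counter, defaultdict

corpus = "my dog ate food . my cat ate food . my dog ate bones".split()

# Build phrase nodes: every bigram context -> Counter of next tokens.
nodes = defaultdict(Counter)
for i in range(len(corpus) - 2):
    nodes[(corpus[i], corpus[i + 1])][corpus[i + 2]] += 1

def predict_next(context):
    w1, w2 = context[-2], context[-1]
    votes = Counter()
    for (a, b), next_counts in nodes.items():
        # A node activates if it shares its last word with the context;
        # exact matches vote with full weight, partial matches with half.
        if (a, b) == (w1, w2):
            weight = 1.0
        elif b == w2:
            weight = 0.5
        else:
            continue
        for tok, c in next_counts.items():
            votes[tok] += weight * c
    return votes.most_common(3)

print(predict_next(["my", "dog", "ate"]))   # 'food' dominates, 'bones' close behind
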
However, combined nodes like 'wehilcllome' (the welcome, hi, and hello nodes
plus delay trigger this node just enough) would allow for use of it, a
combined phrase that works for many. 'my dog likes to eat food' and 'food is
what his cat loves to still eat' could become an averaged 'my animal loves
food'.
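
The 'my animal loves food' merge could look roughly like this, where the
word-to-category table is hand-made just for the example (a real system would
have to learn it):

# Sketch of collapsing similar phrase nodes into one "averaged" node:
# map near-synonymous words onto a shared symbol, and phrases that then
# read the same get stored as a single combined node.
category = {"dog": "animal", "cat": "animal",
            "likes": "loves", "loves": "loves"}

def generalize(phrase):
    return tuple(category.get(w, w) for w in phrase.split())

nodes = {}
for phrase in ["my dog likes food", "my cat loves food"]:
    key = generalize(phrase)
    nodes.setdefault(key, []).append(phrase)

print(nodes)
# {('my', 'animal', 'loves', 'food'): ['my dog likes food', 'my cat loves food']}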

Internal self-thinking is RL and has actions to move around on the
internet/desktop. Its goals, as said above, use exploit/explore to find data
to expand on, and to compress data and lower the node/connection count Cost
error.
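
As a bare-bones sketch, that inner loop could be epsilon-greedy RL where the
reward is knowledge gained minus Cost added; the actions and the fake
environment below are stand-ins I invented just to show the shape:

# Toy sketch of the inner RL loop: epsilon-greedy choice between actions
# like "explore for new data" and "compress what's stored", rewarded by
# new knowledge gained and by how much the node/connection Cost drops.
import random

actions = ["fetch_new_data", "compress_store", "prune_nodes"]
value = {a: 0.0 for a in actions}      # running estimate of each action's reward
counts = {a: 0 for a in actions}
cost, knowledge = 1000.0, 0.0          # pretend node/connection Cost and knowledge score

def do(action):
    # Fake environment: each action trades new knowledge against storage Cost.
    global cost, knowledge
    if action == "fetch_new_data":
        gained, dcost = random.uniform(0, 10), random.uniform(0, 8)
    elif action == "compress_store":
        gained, dcost = 0.0, -random.uniform(0, 15)
    else:                              # prune_nodes
        gained, dcost = -random.uniform(0, 2), -random.uniform(0, 6)
    knowledge += gained
    cost += dcost
    return gained - dcost              # reward: expand knowledge AND lower Cost

epsilon = 0.1
for step in range(500):
    if random.random() < epsilon:      # explore
        a = random.choice(actions)
    else:                              # exploit the best-looking action so far
        a = max(actions, key=lambda x: value[x])
    r = do(a)
    counts[a] += 1
    value[a] += (r - value[a]) / counts[a]   # incremental mean update

print({a: round(v, 1) for a, v in value.items()})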