Oh!! GANs refine/train the Generator by using a Critic until the generations
are more real/lossless, like the Hutter Prize explains. It's just compressing
data so it can extract/generate all the good stuff. But why compress? It's
faster to let the context find matches. Again, brute force can do what AGI
can do, but a brain uses SO little data and can predict what will occur after
any unseen data using SO little data. So the more diverse data it learns, the
exponentially more it knows, but it can compress it so it's a tad more usable.
OK, but why compress? You have 100GB and know it all. Why compress!? The
perfect small-world network would have 1 of each unique quantized node, no
double nodes.
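A minimal sketch of that "no double nodes" idea: store each phrase exactly once and just count repeats. The 3-word phrase size and the toy corpus are my assumptions, not from the post.

```python
from collections import Counter

def build_unique_nodes(text, n=3):
    """Store each unique n-gram ("phrase node") exactly once,
    with a count of how often it was seen -- no double nodes."""
    words = text.split()
    grams = [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]
    return Counter(grams)

corpus = "my dog ate food and my cat ate food and my dog ate food"
nodes = build_unique_nodes(corpus)
# 'my dog ate' appears twice in the corpus but is stored as ONE node:
print(nodes[("my", "dog", "ate")])  # -> 2
```

The Counter is doing the deduplication: repeated contexts all land on the one stored node, they just bump its count.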
You recognize all the similar contexts using just 1 stored node. And maybe a
few other hill peaks in my viz ('the cat' activates 'this dog ate' and 'my
mom ate') act as Attention Heads like in Transformers? So, what if the 100GB
is 40% wrong/not real? One bad answer leads to another bad answer,
exponentially worse the more you learn from such data. You prune false data
by collecting more real-world data. But we are still left with 60GB of true
data.
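One way to read "prune false data by collecting more real-world data" as code: each stored claim accumulates support from incoming observations, and claims that never get backed up get pruned. The function name, the threshold, and the toy claims are all my assumptions, just to make the pruning step concrete.

```python
from collections import Counter

def prune_by_support(stored_claims, observations, min_support=2):
    """Keep only stored claims that the incoming real-world
    observations back up at least `min_support` times."""
    support = Counter(observations)
    return [c for c in stored_claims if support[c] >= min_support]

claims = ["water is wet", "the moon is cheese", "fire is hot"]
seen = ["water is wet", "fire is hot", "water is wet", "fire is hot"]
print(prune_by_support(claims, seen))  # the cheese claim gets pruned
```

The more real data arrives, the more support true claims pile up, while the 40% of bad data stays unsupported and falls below the threshold.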
It seems like it wants a small-world network. But why??? It 'feels' like it
loses data if your compressor gets too small, and sometimes it takes longer
to regenerate/decompress. Well, Byte Pair Encoding lowers Cost yet retains
ALL data. Hierarchy does too. Lossless. So if you delete some nodes like 'my
cat ate' and keep 'my dog ate', it lets you still regenerate them because of
similar words/positions. But it tells you something, it connects something.
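The Byte Pair Encoding point can be shown in a few lines: keep replacing the most frequent adjacent pair with one merged symbol. The sequence gets shorter (lower Cost), and the recorded merges let you expand everything back, so nothing is lost. The toy string and step count are mine.

```python
from collections import Counter

def bpe_merge_steps(tokens, steps):
    """Repeatedly replace the most frequent adjacent pair with a
    single merged symbol -- fewer symbols, same data (lossless)."""
    merges = []
    for _ in range(steps):
        pairs = Counter(zip(tokens, tokens[1:]))
        if not pairs:
            break
        (a, b), _count = pairs.most_common(1)[0]
        merges.append((a, b))
        merged, i = [], 0
        while i < len(tokens):
            if i + 1 < len(tokens) and tokens[i] == a and tokens[i + 1] == b:
                merged.append(a + b)   # fuse the pair into one symbol
                i += 2
            else:
                merged.append(tokens[i])
                i += 1
        tokens = merged
    return tokens, merges

tokens, merges = bpe_merge_steps(list("abababcab"), steps=2)
print(tokens)   # 9 symbols shrank to 4
print(merges)   # replaying these in reverse decompresses losslessly
```

That's the "hierarchy" version of lossless compression the post is pointing at: merged symbols are just nodes built out of smaller nodes.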
Let's think: a given phrase node activates many others, and each node it
activates can be activated by many, while the Next Token can be predicted by
e.g. 100 nodes (hence 10,000 inputs), and a given node has e.g. 50 Next
Tokens too. But x does sit in all 100 nodes, and if 'cat' activates dogs in
nodes, then that was because of context handles. When 'dog' is the predicted
Next Word, 100 phrases need to predict it, so they can all activate 1 single
node basically, but this is already allowed without touching hierarchy... if
it's related, it will. However, combined nodes like 'wehilcllome' (the
welcome, hi, and hello nodes plus a delay trigger this node just enough)
would allow use of a combined phrase that works for many. 'my dog likes to
eat food' and 'food is what his cat loves to still eat' could become an
averaged 'my animal loves food'.
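A crude sketch of that "averaged node" merge: map specific words to broader ones so two similar phrase nodes collapse into one shared node. The hypernym table here is entirely my assumption (the post doesn't say how the averaging is done), so treat it as illustration only.

```python
# Hypothetical hypernym table -- an assumption, not from the post.
HYPERNYMS = {"dog": "animal", "cat": "animal"}

def generalize(phrase):
    """Replace specific words with broader ones so similar
    phrase nodes collapse into one shared, 'averaged' node."""
    return " ".join(HYPERNYMS.get(w, w) for w in phrase.split())

p1 = "my dog eats food"
p2 = "my cat eats food"
# Both phrases now land on the same single node:
print(generalize(p1) == generalize(p2))  # -> True
```

Two stored nodes become one, which is exactly the node/connection-count Cost reduction the post wants, at the price of some specificity.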
Internal self-thinking is RL and has actions to move around on the
internet/desktop. Its goals, as said above: use exploit/explore to find data
to expand on, and compress data to lower the node/connection-count Cost
error.
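The exploit/explore part has a standard minimal form, epsilon-greedy action selection: mostly take the best-known action, occasionally take a random one. The action values below (e.g. "which data source to crawl next") are made up for the example.

```python
import random

def epsilon_greedy(values, epsilon=0.1, rng=random):
    """Exploit the best-known action most of the time,
    explore a random action with probability epsilon."""
    if rng.random() < epsilon:
        return rng.randrange(len(values))          # explore
    return max(range(len(values)), key=lambda i: values[i])  # exploit

# Estimated value of each action (which data source to expand from).
values = [0.2, 0.9, 0.5]
print(epsilon_greedy(values, epsilon=0.0))  # -> 1 (pure exploitation)
```

With epsilon at 0 it always exploits; raising epsilon trades some immediate reward for finding new data to expand on.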
------------------------------------------
Artificial General Intelligence List: AGI
Permalink:
https://agi.topicbox.com/groups/agi/T409fc28ec41e6e3a-Mea24d23de1b5d3476d1f719a