Oh!! GANs refine/train the Generator by using a Critic until the generations are more real/lossless, like the Hutter Prize explains. It's just compressing data so it can extract/generate all the good stuff. But why compress? It's faster to let the context find matches. Again, brute force can do what AGI can do, but a brain uses SO little data, and it can predict what will occur after any unseen data using SO little data. So the more diverse data it learns, the exponentially more it knows, and it can compress it so it's a tad more usable.

Ok, but why compress? You have 100GB and know it all. Why compress!? The perfect small-world network would have one of each unique quantized node, no duplicate nodes. You recognize all the similar contexts using just one stored copy. And maybe a few other hill peaks in my visualization ('the cat' activates 'this dog ate' and 'my mom ate'), which act like Attention Heads in Transformers?

So, what if the 100GB is 40% wrong/not real? One bad answer leads to another bad answer, exponentially worse the more such data you know. You prune false data by collecting more real-world data. But we are still left with 60GB of true data. It seems like it wants a small-world network. But why??? It 'feels' like Earth loses data if your compressor gets too small and sometimes takes longer to regenerate/decompress.

Well, Byte Pair Encoding lowers Cost yet retains ALL data. Hierarchy does too. Lossless. So if you delete some nodes like 'my cat ate' and keep 'my dog ate', it still lets you regenerate them because of similar words/positions. But it tells you something, it connects something.

Let's think: a given phrase node activates many others, and each node it activates can be activated by many. The Next Token can be predicted by, say, 100 nodes (hence 10,000 inputs), and a given node has, say, 50 possible Next Tokens too. But x does sit in all 100 nodes, and if 'cat' activates 'dog' inside nodes, that was because of context handles.
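The "BPE lowers Cost yet retains ALL data" point can be shown concretely. Below is a toy greedy merge pass, not any production tokenizer: as long as you keep the merge table, every merge is invertible, so the compression is lossless.

```python
from collections import Counter

def bpe_compress(tokens, num_merges):
    """Greedy BPE: repeatedly replace the most frequent adjacent pair
    with one merged symbol. Keeping the merge table makes it lossless."""
    merges = []  # (pair, new_symbol) records, enough to invert every merge
    for _ in range(num_merges):
        pairs = Counter(zip(tokens, tokens[1:]))
        if not pairs:
            break
        (a, b), count = pairs.most_common(1)[0]
        if count < 2:
            break  # no pair repeats, so merging would not shrink anything
        new_sym = a + b
        merges.append(((a, b), new_sym))
        out, i = [], 0
        while i < len(tokens):
            if i + 1 < len(tokens) and tokens[i] == a and tokens[i + 1] == b:
                out.append(new_sym)
                i += 2
            else:
                out.append(tokens[i])
                i += 1
        tokens = out
    return tokens, merges

def bpe_decompress(tokens, merges):
    """Undo merges in reverse order: the original is regenerated exactly."""
    for (a, b), new_sym in reversed(merges):
        out = []
        for t in tokens:
            out.extend([a, b] if t == new_sym else [t])
        tokens = out
    return tokens

data = list("my dog ate my dog ate my cat ate")
small, table = bpe_compress(data, 10)
assert bpe_decompress(small, table) == data  # ALL data retained
print(len(data), "->", len(small))
```

Fewer symbols to store, zero information lost — that's the "lower Cost, still lossless" trade the post is talking about.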
When 'dog' is the predicted Next Word, 100 phrases need to predict it, so they can all activate one single node, basically, but this is already allowed without touching the hierarchy... if it's related, it will. However, combined nodes like 'wehilcllome' (the 'welcome', 'hi', and 'hello' nodes plus a delay trigger this node just enough) would allow use of a combined phrase that works for many. 'my dog likes to eat food' and 'food is what his cat loves to still eat' could become an averaged 'my animal loves food'.
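That "averaged node" idea can be sketched in a few lines. This is a toy illustration with a hand-made word-to-category map ('dog'/'cat' generalize to 'animal'); a real system would have to learn such groupings from which words share contexts.

```python
# Toy sketch of collapsing two similar phrase nodes into one generalized node.
# CATEGORY is hand-made for illustration, not learned.
CATEGORY = {"dog": "animal", "cat": "animal",
            "eat": "food_verb", "loves": "food_verb", "likes": "food_verb"}

def generalize(word):
    return CATEGORY.get(word, word)

def merge_nodes(phrase_a, phrase_b):
    """Merge position-by-position: keep a word where both phrases agree,
    fall back to the shared category where they differ,
    so 'my dog ate' + 'my cat ate' -> 'my animal ate'."""
    merged = []
    for wa, wb in zip(phrase_a.split(), phrase_b.split()):
        ga, gb = generalize(wa), generalize(wb)
        merged.append(wa if wa == wb else (ga if ga == gb else "?"))
    return " ".join(merged)

print(merge_nodes("my dog ate", "my cat ate"))  # my animal ate
```

The merged node matches both original contexts with one stored phrase, which is exactly the node-count saving described above (the real example with different-length phrases would need alignment first, which this positional toy skips).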
Internal self-thinking is an RL loop and has actions to move around on the internet/desktop. Its goals, as said above: use exploit/explore to find data to expand on, and compress data to lower the node/connection-count Cost error.

------------------------------------------
Artificial General Intelligence List: AGI
Permalink: https://agi.topicbox.com/groups/agi/T409fc28ec41e6e3a-Mea24d23de1b5d3476d1f719a
Delivery options: https://agi.topicbox.com/groups/agi/subscription