Someone else recognized the truth long ago:
https://www.reddit.com/r/MachineLearning/comments/47j8j6/is_deep_learning_a_markov_chain_in_disguise/

If you take a 500-dimensional dataset, as Juan showed in that refreshing 
article, the question is: given a new, unseen data point, how can you learn 
general patterns in the high-D space without overfitting? In a 2D example we 
may have a few blue dots surrounded by red dots, and after you do your 
pattern checking on the unseen point, you can determine where in that 2D 
space it lies: is it inside the blue-ball zone or not? Or maybe it is both 
blue and red? It may look like both a rat and a lion, after all.

But the thing here is, it is not all just "tweak the weights until it gives 
you those patterns", else Transformers wouldn't need positional encoding, 
embeddings like GloVe/Word2Vec/Seq2Vec, or self-attention! Or BPE and 
normalization. Mine has all the same things too, just with no curve 
manipulation going on. I still need pooling, an activation function, and 
weights BTW, but it's as easy to understand as a Markov chain.

At best, backpropagation is just an optimization to make HMMs faster; it 
can't be a new way to find patterns. All patterns start as exact matches, 
and clearly it is doing all the things mine does. I'm therefore not one bit 
interested in the curve manipulation in backprop / the net; I'm only 
interested in the pattern keys / organs of it all - BPE, self-attention, 
embedding relations, normalization, pooling energy - all the things that 
actually look for patterns in data. Backprop is not part of AI; it's a 
shirt on a body, not the muscles.
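To make the Markov-chain comparison concrete, here is a minimal sketch of my own (an illustration, not the author's actual system): a first-order word-level Markov chain that stores exact-match "word → next word" counts and predicts by simple lookup, with no weight tweaking or curve manipulation anywhere.

```python
from collections import Counter, defaultdict

def train_markov(text):
    """Count exact next-word matches for each word (first-order chain)."""
    words = text.split()
    chain = defaultdict(Counter)
    for prev, nxt in zip(words, words[1:]):
        chain[prev][nxt] += 1
    return chain

def predict(chain, word):
    """Predict the most frequent follower seen after `word`, if any."""
    if word not in chain:
        return None  # no exact match stored, nothing to predict
    return chain[word].most_common(1)[0][0]

# Toy corpus: "the" is followed by "cat" twice and "mat" once,
# so the chain predicts "cat" after "the" by raw count.
corpus = "the cat sat on the mat the cat ran"
chain = train_markov(corpus)
print(predict(chain, "the"))
```

The whole "training" step is counting exact matches; prediction is a dictionary lookup, which is the sense in which such a model is trivially easy to understand.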
------------------------------------------
Artificial General Intelligence List: AGI
Permalink: 
https://agi.topicbox.com/groups/agi/Ta86fa089ebd8ca28-Mbc756438f56627aebb9a8c2a
Delivery options: https://agi.topicbox.com/groups/agi/subscription
