1. @Can symbolic approach ...
2. @Rob Freeman LLMs, What's wrong with NLP (2009-2024), Whisper

*1.* *IMO the sharp division between "neat" and "scruffy", NN and symbolic, 
is confused: neural networks are also symbolic:* 
http://artificial-mind.blogspot.com/2019/04/neural-networks-are-also-symbolic.html

NNs are a subset of the symbolic, both in implementation and in output, and 
"symbolic" is also a poor term; a better one is *conceptual*: in a developing 
system it is about creating systems of *concepts* and operating with them, 
generalizing from specifics; not about "symbols" (dereferencing, or the 
characters themselves) or mindless algebra, "symbol manipulation", since every 
calculation can be seen as something like that.

Whether it is an NN or any other computational method, it is a program in a 
computer; training an NN is a kind of programming, and in the big-data-trained 
ones the data is just the biggest part of the code. If the "data part" of the 
code is represented more succinctly, or the NN part becomes more complex so 
that it needs less "brute force", they will converge in another intermediate 
representation. The brute force is also relative.

Whole NNs are concepts or "symbols" within "symbolic" systems; such a system 
could incorporate them and use whatever is already available: existing models, 
and training new ones for particular tasks.

An AGI, Mind and Universe, cognitively, are hierarchical simulators of virtual 
universes, with the lowest level being "the real" or "physical" one for the 
current evaluator-observer virtual universe (causality-control unit). Whatever 
works is allowed. The terms are from the Theory of Universe and Mind, classical 
version 2001-2004, taught during the world's first university course in AGI 
(Plovdiv 2010, 2011), and the core reasoning gradually got incorporated into 
mainstream AI (some of it was hiding there earlier). 

That kind of architecture or working, providing an explicit imagination, is 
something that LLMs currently lack; they "have" only an implicit, 
"sweeping-on-the-go" one, encoded within their whole system, the way diffusion 
models and GANs have implicit models of 3D geometry and global illumination.

*2.* Yes, the tokenization in current LLMs is usually "wrong". It is workable 
for shaping and generating plausible matter for the modality, given "knowing 
everything already" and covering all cases, but it should operate on concepts 
and world models: simulators of virtual universes, mapped to imagination. It 
should predict the *physical* future of the virtual worlds, not these tokens, 
which are often not morphemes and not word forms - sometimes they happen to 
match specific "meaningful" ones, and since the vocabularies are now huge, many 
words and MWEs get separate tokens.
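
As a small illustration that subword tokens need not align with morphemes - a 
minimal sketch with a made-up vocabulary (greedy longest match stands in here 
for BPE; this is not any real LLM's tokenizer):

VOCAB = {"un", "unbe", "liev", "believ", "able", "a", "b", "e", "i", "l",
         "n", "u", "v"}

def tokenize(word: str, vocab=VOCAB) -> list[str]:
    """Greedily take the longest vocabulary entry at each position."""
    tokens, i = [], 0
    while i < len(word):
        for j in range(len(word), i, -1):        # longest match first
            if word[i:j] in vocab:
                tokens.append(word[i:j])
                i = j
                break
        else:                                     # unknown character
            tokens.append(word[i])
            i += 1
    return tokens

print(tokenize("unbelievable"))   # -> ['unbe', 'liev', 'able']
# The morphemes are un + believ(e) + able: the splits do not match.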

The models can indirectly create these *world models* in order to generate the 
correct words, but then the data should include wider information and 
intentions, as in the multimodal models. 

The following articles from early 2009 are still valid, although there has been 
progress along some of the suggestions, and now there is a longer "chain of 
intelligent operations" (even "chain-of-thought reasoning" as a term):

*What's wrong with Natural Language Processing? Part I: *
https://artificial-mind.blogspot.com/2009/02/whats-wrong-with-natural-language.html

*What's wrong with Natural Language Processing? Part II: Static, Specific, 
High-level, Not-evolving...*
https://artificial-mind.blogspot.com/2009/03/whats-wrong-with-natural-language.html

It includes criticism of the NLP tests of the time; there were only a few back 
then, POS tagging etc. Now there are plenty, but many seem as funny as back 
then: I once reviewed one for predicting the next word from novels, the 
examples from the paper were all stereotypes and banalities, and the LLMs 
celebrated going up from 68.4 to 69.6%, just like in the 2000s. A part of the 
conclusions of one of the articles:

*""" *
*(...) 
Yes, mainstream NLP at the moment:*

*- Is useful.*
*- Solve[s] some abstract specific problems by heuristics.*
*- It works to some degree for "intelligent" tasks, because of course language 
does map mind.*

*However, the mainstream still does not lead to a chain of intelligent 
operations; there are no loops and no cumulative development.*

*-- The length of the chain of inter-related intelligent operations in NLP 
today is very short. This is related to the lack of will and general goals of 
the systems. These systems are "push-the-button-and-fetch-the-result". 
*[Now: "prompts", but moving to agents]

*-- Swallowing a huge corpus of 1 billion words or so and computing 
statistical dependencies between tokens is not the way mind works.*

!!! Mind learns step by step, modeling simpler 
constructs/situations/dynamics/models before reaching to more complex.
!!! Temporal relations of the input with different complexity is important.
!!! Mind usually uses many sensory inputs while learning. *Very important.* 
[Multimodal models, Vision-Text models]
!!! Mind has will, uses feedback and can actively and evolutionary test and 
improve correctness and effectiveness of its operation, including 
natural-language-related.
(...)
*I suggest:*

1. *Holistic approach* - the goal is building an operational mind with [a] 
long chain of intelligent operations, not completion of a table with values 
94.55% 96.5% 90.4% and a long list with quotes at the end of a paper. *[LOL, 
this is still going on.]*
(...)
""""

Yes, humans seem to need less data (and a proper design should need less), but 
regarding Whisper in particular - I think the comparison is not well aligned, 
because it "learns" all languages. Speech recognition for a single speaker, a 
single language and limited conditions doesn't need so much data (and if it's 
about *understanding* the content, one voice would be enough), even with the 
"dumb" methods, and humans are not perfect recognizers either. Average humans 
are poor in non-native languages, won't correctly recognize many words spoken 
with an accent in their own native language, and won't have high precision for 
rare or complex words and expressions, especially in languages with irregular 
orthography, overlapping phonetics and many accents.

The actual learning time for humans is also not only the "live" time, because 
the brain replays (which is also a form of "data augmentation"), and the 
learning is more multimodal; one additional modality is one's own 
speech-production system, both the commands for controlling it and its output. 
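
As a rough illustration of what such "replay as data augmentation" could look 
like for speech - a minimal sketch using only numpy; the specific transforms 
and parameters are my assumptions, not Whisper's actual pipeline:

import numpy as np

def augment(waveform: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    """Return a perturbed copy of a mono waveform in [-1, 1]."""
    out = waveform.copy()
    out *= rng.uniform(0.7, 1.3)                    # loudness change
    out += rng.normal(0.0, 0.005, size=out.shape)   # background noise
    out = np.roll(out, rng.integers(-800, 800))     # small time shift
    return np.clip(out, -1.0, 1.0)

rng = np.random.default_rng(0)
wave = np.sin(2 * np.pi * 440 * np.arange(16000) / 16000)   # 1 s, 440 Hz tone
replays = [augment(wave, rng) for _ in range(5)]   # five "replayed" variants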

>The only structure is "tokens". 

Yes, but only at the input level; there are implicit structures, however they 
are too "embedded" in the whole, and in typical LLMs the low-level structure is 
not well "grounded" to the actual other-modality input which it represents.

Human language, for human agents, is multimodal, multi-range, multi-resolution, 
and includes intentions (again at multiple scales, precisions and ranges; also 
other agents' inferred or recognized/suggested ones) etc., from the start.


*Theory of Universe and Mind*
https://github.com/Twenkid/Theory-of-Universe-and-Mind
