Re: [agi] Can symbolic approach entirely replace NN approach?

2024-06-24 Thread Rob Freeman
Yes, I seem to have missed where you showed how principal component
analysis would apply to my simple example of ordering sets.

Pity. I would have liked to have seen PCA applied to the simple task
of ordering a set of people alternately by height or age.

Anyway, the good thing is that you can present no coherent objection
to what I suggest.

Your initial objection, that my "contradiction" was just your
"variance", fell to your own observation that what I mean by
"contradiction" is "not what people usually mean." And your second
objection, that it is just PCA, has been forced to lapse into the
obscurity of a missed explanation.

On Mon, Jun 24, 2024 at 5:18 PM Boris Kazachenko  wrote:
>
> Rob, I already explained how it applies to your example, you're just "unable" 
> to comprehend it. Because your talk / think ratio is way too high.

--
Artificial General Intelligence List: AGI
Permalink: 
https://agi.topicbox.com/groups/agi/T682a307a763c1ced-M7762c7f027470fbb36b7dea0
Delivery options: https://agi.topicbox.com/groups/agi/subscription


Re: [agi] Can symbolic approach entirely replace NN approach?

2024-06-24 Thread Rob Freeman
If you mean my focus is on encoding or representation as the key
unsolved problem holding us back from AGI, then yes, you're probably
right.

On Mon, Jun 24, 2024 at 4:27 PM Quan Tesla  wrote:
>
> Rob
>
> I applied my SSM method to your output here, like a broad AI might've done. 
> The resultant context diagram was enlightening. You talk about many things, but 
> seemingly only evidence two primary things. One of those has to do with 
> constructors and embedding (the text-to-number transformations). I read up on 
> it. It reminded me of numbering schemas for telecom systems.

--
Artificial General Intelligence List: AGI
Permalink: 
https://agi.topicbox.com/groups/agi/T682a307a763c1ced-M00535cc92708f61aa419b4ae
Delivery options: https://agi.topicbox.com/groups/agi/subscription


Re: [agi] Can symbolic approach entirely replace NN approach?

2024-06-24 Thread Boris Kazachenko
Rob, I already explained how it applies to your example, you're just "unable" to 
comprehend it. Because your talk / think ratio is way too high. 
--
Artificial General Intelligence List: AGI
Permalink: 
https://agi.topicbox.com/groups/agi/T682a307a763c1ced-Mcdc0859b846789e70bb07a08
Delivery options: https://agi.topicbox.com/groups/agi/subscription


Re: [agi] Can symbolic approach entirely replace NN approach?

2024-06-24 Thread Quan Tesla
Rob

I applied my SSM method to your output here, like a broad AI might've done.
The resultant context diagram was enlightening. You talk about many things, but
seemingly only evidence two primary things. One of those has to do with
constructors and embedding (the text-to-number transformations). I read up
on it. It reminded me of numbering schemas for telecom systems.

However, the second feature is all about you vs. others. It seems that, other
than advancing your own worldview, you don't really have sufficient
knowledge to debate constructively. For this reason, I don't think you're
quite sure of your contribution to the field, even after 35 years. I
derived some rhetorical questions: Are you here hoping to find some
inspiration/help from others, and/or to play one-upmanship?

First, know thyself: "A literal translation of the phrase 'Neuro Linguistic
Programming' is that NLP *empowers, enables and teaches us to better
understand the way our brain (neuro) processes the words we use
(linguistic) and how that can impact on our past, present and future
(programming)*."

For example, by "This British guy Checkland wrote some books on management
techniques. Some kind of "seven step" process:...", you mean Professor
Peter Checkland, the considered *father of soft systems methodology*?
https://en.wikipedia.org/wiki/Peter_Checkland and
https://www.lancaster.ac.uk/lums/people/peter-checkland

I wonder what your view is on Ivar Jacobson then, considered the father of
OO?

"You need to try and write some code."
I do specify code, just differently to you. As evidenced, it probably
produces meaningful results far quicker than trying to write a program to
do the same. Ergo, it's highly accurate, rapid, cost effective, and
economical. By any standard, those are great dev metrics.

A departing word of reciprocal advice for you. Even though you refer to
complex-adaptive systems science as "management techniques", I think you
could learn a lot from its science, methodology, and applications.

My 2007+ field-research and sampled contributions remain on Researchgate
for anyone to access freely.

I think we're quite done now.

On Mon, Jun 24, 2024 at 6:46 AM Rob Freeman 
wrote:

> Quan,
>
> Lots of words. None of which mean anything to me...
>
> OK "soft-systems ontology" turns up something:
>
> https://en.wikipedia.org/wiki/Soft_systems_methodology
>
> This British guy Checkland wrote some books on management techniques.
> Some kind of "seven step" process:
>
> 1) Enter situation in which a problem situation(s) have been identified
> 2) Address the issue at hand
> 3) Formulate root definitions of relevant systems of purposeful activity
> 4) Build conceptual models of the systems named in the root
> definitions: This methodology comes into place from raising concerns /
> capturing problems within an organisation and looking into ways it
> can be solved. Defining the root definition also describes the root
> purpose of a system.
> 5) The comparison stage: The systems thinker is to compare the
> perceived conceptual models against an intuitive perception of a
> real-world situation or scenario. Checkland defines this stage as the
> comparison of Stage 4 with Stage 2, formally, "Comparison of 4 with
> 2". Parts of the problem situation analysed in Stage 2 are to be
> examined alongside the conceptual model(s) created in Stage 4, this
> helps to achieve a "complete" comparison.
> 6) Problems identified should be accompanied now by feasible and
> desirable changes that will distinctly help the problem situation
> based in the system given. Human activity systems and other aspects of
> the system should be considered so that soft systems thinking, and
> Mumford's needs can be achieved with the potential changes. These
> potential changes should not be acted on until step but they should be
> feasible enough to act upon to improve the problem situation.
> 7) Take action to improve the problem situation
>
> CATWOE: Customers, Actors, Transformation process, Weltanschauung,
> Owner, Environmental constraints.
>
> I'm reminded of Edward de Bono. Trying to break pre-conceptions and
> being open to seeing a problem from different perspectives.
>
> Look, Quan, in the most general way these kinds of ideas might be
> relevant. But only in an incredibly general sense.
>
> Would we all benefit from taking a moment to reflect on "how to place
> LLMs in context of such developments." Maybe.
>
> At this point I'm on board with Boris. You need to try and write some
> code. Simply talking about how management theory has some recent
> threads encouraging people to brainstorm together and be open to
> different conceptions of problems, is not a "ready to ship"
> implementation of AGI.
>
> On Mon, Jun 24, 2024 at 5:32 AM Quan Tesla  wrote:
> >
> > Rob. I'm referring to contextualization as general context management
> within complex systems management. ...

--
Artificial General Intelligence List: AGI

Re: [agi] Can symbolic approach entirely replace NN approach?

2024-06-23 Thread Rob Freeman
Quan,

Lots of words. None of which mean anything to me...

OK "soft-systems ontology" turns up something:

https://en.wikipedia.org/wiki/Soft_systems_methodology

This British guy Checkland wrote some books on management techniques.
Some kind of "seven step" process:

1) Enter situation in which a problem situation(s) have been identified
2) Address the issue at hand
3) Formulate root definitions of relevant systems of purposeful activity
4) Build conceptual models of the systems named in the root
definitions: This methodology comes into place from raising concerns /
capturing problems within an organisation and looking into ways it
can be solved. Defining the root definition also describes the root
purpose of a system.
5) The comparison stage: The systems thinker is to compare the
perceived conceptual models against an intuitive perception of a
real-world situation or scenario. Checkland defines this stage as the
comparison of Stage 4 with Stage 2, formally, "Comparison of 4 with
2". Parts of the problem situation analysed in Stage 2 are to be
examined alongside the conceptual model(s) created in Stage 4, this
helps to achieve a "complete" comparison.
6) Problems identified should be accompanied now by feasible and
desirable changes that will distinctly help the problem situation
based in the system given. Human activity systems and other aspects of
the system should be considered so that soft systems thinking, and
Mumford's needs can be achieved with the potential changes. These
potential changes should not be acted on until step but they should be
feasible enough to act upon to improve the problem situation.
7) Take action to improve the problem situation

CATWOE: Customers, Actors, Transformation process, Weltanschauung,
Owner, Environmental constraints.

I'm reminded of Edward de Bono. Trying to break pre-conceptions and
being open to seeing a problem from different perspectives.

Look, Quan, in the most general way these kinds of ideas might be
relevant. But only in an incredibly general sense.

Would we all benefit from taking a moment to reflect on "how to place
LLMs in context of such developments." Maybe.

At this point I'm on board with Boris. You need to try and write some
code. Simply talking about how management theory has some recent
threads encouraging people to brainstorm together and be open to
different conceptions of problems, is not a "ready to ship"
implementation of AGI.

On Mon, Jun 24, 2024 at 5:32 AM Quan Tesla  wrote:
>
> Rob. I'm referring to contextualization as general context management within 
> complex systems management. ...

--
Artificial General Intelligence List: AGI
Permalink: 
https://agi.topicbox.com/groups/agi/T682a307a763c1ced-M3d4918db953d60852bc7ccd0
Delivery options: https://agi.topicbox.com/groups/agi/subscription


Re: [agi] Can symbolic approach entirely replace NN approach?

2024-06-23 Thread Rob Freeman
No, I don't believe I am talking about PCA. But anyway, you are unable
to demonstrate how you implement PCA or anything else, because your
algorithm is "far from complete".

You are unable to apply your conception of the problem to my simple
example of re-ordering a set.

How about PCA itself? If that's what you think I am suggesting, can
you show how PCA would apply to my simple example?

On Mon, Jun 24, 2024 at 9:14 AM Boris Kazachenko  wrote:
>
> What I mean by contradiction is different orderings of an entire set
> of data, not points of contrast within a set of data
>
> That's not what people usually mean by contradiction, definitely not in a 
> general sense.
>
> You are talking about reframing a dataset (or subset) of multivariate items along 
> the spectrum of one or several of the most predictive variables in the items. This is 
> basically PCA, closely related to the spectral clustering I mentioned in the 
> first section of my readme: "Initial frame of reference here is space-time, 
> but higher levels will reorder the input along all sufficiently predictive 
> derived dimensions, similar to spectral clustering."
>
...
> I can't give you any demonstration because my algo is far from complete. It 
> would be like demonstrating how an ANN works before you figure out how a 
> single-node perceptron works. Except that my scheme is hundreds of times more 
> complex than a perceptron. You just have to decide for yourself if it makes 
> sense from first principles.

--
Artificial General Intelligence List: AGI
Permalink: 
https://agi.topicbox.com/groups/agi/T682a307a763c1ced-Ma2a9d38780ee195464364b37
Delivery options: https://agi.topicbox.com/groups/agi/subscription


Re: [agi] Can symbolic approach entirely replace NN approach?

2024-06-23 Thread Boris Kazachenko
> What I mean by contradiction is different orderings of an entire set
of data, not points of contrast within a set of data
That's not what people usually mean by contradiction, definitely not in a 
general sense.

You are talking about reframing a dataset (or subset) of multivariate items along 
the spectrum of one or several of the most predictive variables in the items. This is 
basically PCA, closely related to the spectral clustering I mentioned in the 
first section of my readme: "Initial frame of reference here is space-time, 
but higher levels will reorder the input along all sufficiently predictive 
derived dimensions, similar to spectral clustering."

>Claude 3.5:
Yes, there is indeed a relationship between Principal Component Analysis (PCA) 
and spectral clustering. Both techniques involve eigenvalue decomposition and 
can be used for dimensionality reduction, but they have different primary 
purposes and methodologies. Let me explain the connection:

1. Similarity:
   - Both PCA and spectral clustering use eigenvalue decomposition of a matrix 
derived from the data.
   - Both can be used for dimensionality reduction before applying other 
algorithms.

2. Key Differences:
   - PCA focuses on variance maximization, while spectral clustering focuses on 
graph partitioning.
   - PCA operates on the covariance matrix, while spectral clustering typically 
uses the graph Laplacian matrix.

3. Spectral Clustering Overview:
   - Spectral clustering is a technique that treats the data clustering as a 
graph partitioning problem.
   - It uses the eigenvalues of the similarity matrix to perform dimensionality 
reduction before clustering in fewer dimensions.

4. Connection:
   - The eigenvectors used in spectral clustering can be seen as a nonlinear 
generalization of the principal components in PCA.
   - In some cases, when the data lies on a linear manifold, spectral 
clustering can reduce to PCA.

5. Laplacian Eigenmaps:
   - This is a technique that bridges PCA and spectral clustering.
   - It's similar to PCA but preserves local neighborhoods, making it more 
suitable for nonlinear manifolds.

6. Use Cases:
   - PCA is often used for general dimensionality reduction and feature 
extraction.
   - Spectral clustering is particularly effective for clustering data that 
isn't linearly separable in the original space.
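
A minimal, generic sketch of that contrast, assuming scikit-learn and the toy 
two-moons dataset (an illustration of PCA vs. spectral clustering in general, 
nothing to do with CogAlg):

# Sketch: PCA vs. spectral clustering on data that is not linearly separable.
import numpy as np
from sklearn.datasets import make_moons
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans, SpectralClustering

X, y = make_moons(n_samples=300, noise=0.05, random_state=0)

# PCA: eigen-decomposition of the covariance matrix, variance maximization.
X_1d = PCA(n_components=1).fit_transform(X)
pca_labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X_1d)

# Spectral clustering: eigenvectors of the graph Laplacian of a similarity graph.
spec_labels = SpectralClustering(n_clusters=2, affinity="nearest_neighbors",
                                 n_neighbors=10, random_state=0).fit_predict(X)

def agreement(labels, truth):
    # cluster labels are arbitrary, so take the better of the two assignments
    a = np.mean(labels == truth)
    return max(a, 1 - a)

print("PCA + k-means:", agreement(pca_labels, y))   # usually well below 1.0
print("spectral     :", agreement(spec_labels, y))  # usually close to 1.0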

I can't give you any demonstration because my algo is far from complete. It 
would be like demonstrating how an ANN works before you figure out how a 
single-node perceptron works. Except that my scheme is hundreds of times more 
complex than a perceptron. You just have to decide for yourself if it makes sense 
from first principles.

--
Artificial General Intelligence List: AGI
Permalink: 
https://agi.topicbox.com/groups/agi/T682a307a763c1ced-Ma97988eb78fb8750ac41ec53
Delivery options: https://agi.topicbox.com/groups/agi/subscription


Re: [agi] Can symbolic approach entirely replace NN approach?

2024-06-23 Thread Quan Tesla
Rob. I'm referring to contextualization as general context management
within complex systems management. As an ontology. The application of which
has relevance for knowledge graphs, LLMs, and other knowledge-based
representations. Your quotation: "Contextualization ... in LLM systemic
hierarchies.", is incorrect:  I stated: "To increase contextualization and
instill robustness in the LLM systemic hierarchies." I'm asserting that the
suitably-completed, context-management theory could be applied to LLM
systemic hierarchies to instill robustness (schemas of hierarchical
control). This should've been apparent from the remainder of my message.

I made the assumption that we were discussing this in the context of LLMs, not
stating that LLM theory was completed 17 years ago. I have no idea if, or
when, it was completed.

My point remains, if you're busy working on context management theory for
complex adaptive systems - as you seem to be implying - such as trying to
engineer fully recursive (in the sense of machine learning) functionality
within LLMs, then this would have direct relevance. Furthermore, I'm
asserting that the foundational work has been complete since 2007 (in the
sense of being ready for application, even to modern-day constructor theory
and LLMs). For example, why debate "ambiguity", when that was elegantly
resolved by such an ontology?

I understand your reference elsewhere as to embedding knowledge via a
number schema, but I recognize some telecom-industry thinking in the
numbering system. Yes, ambiguity would eventually result from that. Not
only from a schema perspective, but also from a multilingual perspective.
Such complexity may not be necessary. If, instead of sentence rules
(grammatical constructs), a policy of language (governing laws) were
specified and extracted from domain-specific knowledge, it would matter
more what context the words were being used in (comprehension)
than the actual words being used (sentences). Surely, it must depend on
what LLMs are being purposed for.

In the preceding scenario of use, embedding could then be accurately
understood and referenced via an emergent structure of a context. I think
you were discussing something along those lines, wondering if this may
occur naturally. I can tell you with conviction, I have years of experience
specifying contexts for a large sample of knowledge contexts (domains).
Given a consistent set of algorithms, structure always emerges from full
normalization (optimal knowledge maturation and integration).

My thought was, if any such soft-systems ontology were married up
with LLMs (there are a few of these soft-systems methodologies about), it
may creatively resolve many of the constraints inherent in linear
progression. I think this approach could help fast track developments
towards AGI. In addition, it would probably satisfy Checkland's definition
of emergence, which is an outcome of a "debate" (systemic interaction)
between linear and non-linear systems.

For Japan, consider their GRAPE supercomputer architecture. Maybe they just
play their AI cards closer to their chest. Then, there's their robot,
Honda's ASIMO (2000). It was the first robot in the world to exhibit
human-friendly functionality. I think it once kicked a ball with Obama,
verbalizing cognition of who Obama was. I find it strange that there's an
Internet-based claim that it was retired in 2000. I still watched a video
of a new version of it a few years ago, following its development as much
as public info would allow for. Last version I recall was where it had
achieved companion-like functionality to be of assistance to humans, e.g.,
holding hands while walking and remaining fully conversational, especially
as a lobby assistant, supporting the elderly, and a doggy version
befriending and tutoring children. As far as AGI in Japan is concerned, I'm
also aware of specific research to replicate humans in looks, sound,
movement, personality and reasoning, and to assume human-like, interactive
gender roles. Inter alia, Japan's effectively producing societal AI
products.

For some of us, the notion of capturing the human soul on a chip,
represents a fascinating journey. I think it should form part of the scope
of future-AGI developments.

Let's, for a moment, reflect on how to place LLMs in context of such
developments.



On Mon, Jun 17, 2024 at 1:10 PM Rob Freeman 
wrote:

> On Mon, Jun 17, 2024 at 3:22 PM Quan Tesla  wrote:
> >
> > Rob, basically you're reiterating what I've been saying here all along.
> To increase contextualization and instill robustness in the LLM systemic
> hierarchies. Further, that it seems to be critically lacking within current
> approaches.
> >
> > However, I think this is fast changing, and soon enough, I expect
> breakthroughs in this regard. Neural linking could be one of those
> solutions.
> >
> > While it may not be exactly the same as your hypothesis (?), is it
> because it's part of your PhD that you're not willing to 

Re: [agi] Can symbolic approach entirely replace NN approach?

2024-06-23 Thread Rob Freeman
On Sun, Jun 23, 2024 at 11:05 PM Boris Kazachenko  wrote:
>
> There can be variance on any level of abstraction, be that between pixels or 
> between philosophical categories. And it could be in terms of any property / 
> attribute of compared elements / clusters / concepts: all these are derived 
> by lower-order comparisons.

I'm quite willing to believe there can be variance between anything at
all. But can you give me a concrete example of a "variance" that you
actually implement? One which can demonstrate its equivalence to my
sense of "contradiction" as alternate orderings of a set.

Or are you telling me you have conceptually accounted for my sense of
contradiction, simply by saying the word "variance", which includes
everything, but in practice you do not implement it?

If so, since saying the word "variance" will trivially solve any
implementation, can you sketch an implementation for "variance" in my
sense of "contradiction"?

> None of that falls from the sky, other than pixels or equivalents: sensory 
> data at the limit of resolution. The rest is acquired; what we need to define 
> is the acquisition process itself: cross-comp (derivation) and clustering 
> (aggregation).

You think "cross-comp (derivation) and clustering (aggregation)" will do it?

Good. Yes. Please show me how to use "cross-comp (derivation) and
clustering (aggregation)" to implement "variance" in my sense of
"contradiction", as alternate orderings of sets.

--
Artificial General Intelligence List: AGI
Permalink: 
https://agi.topicbox.com/groups/agi/T682a307a763c1ced-M261ada29388968d40ef34faf
Delivery options: https://agi.topicbox.com/groups/agi/subscription


Re: [agi] Can symbolic approach entirely replace NN approach?

2024-06-23 Thread Boris Kazachenko
There can be variance on any level of abstraction, be that between pixels or 
between philosophical categories. And it could be in terms of any property / 
attribute of compared elements / clusters / concepts: all these are derived by 
lower-order comparisons. None of that falls from the sky, other than pixels or 
equivalents: sensory data at the limit of resolution. The rest is acquired; what 
we need to define is the acquisition process itself: cross-comp (derivation) and 
clustering (aggregation).
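
A toy, one-dimensional reading of that last sentence (a loose paraphrase for
illustration only; the actual CogAlg code is far more involved):

# Toy reading of "cross-comp (derivation) and clustering (aggregation)":
# compare adjacent 1-D "pixels", derive difference and match, then aggregate
# consecutive positions whose match exceeds their difference.
pixels = [10, 11, 12, 30, 31, 32, 33, 5, 6]

derived = []
for a, b in zip(pixels, pixels[1:]):
    d = b - a                 # difference: a "variance"-like derivative
    m = min(a, b)             # match: a "similarity"-like derivative
    derived.append((d, m))

clusters, current = [], [pixels[0]]
for (d, m), b in zip(derived, pixels[1:]):
    if abs(d) < m:            # more match than difference: same cluster
        current.append(b)
    else:                     # strong difference: start a new cluster
        clusters.append(current)
        current = [b]
clusters.append(current)
print(clusters)   # [[10, 11, 12], [30, 31, 32, 33], [5, 6]]
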
--
Artificial General Intelligence List: AGI
Permalink: 
https://agi.topicbox.com/groups/agi/T682a307a763c1ced-M1005e5cf574256f752722a0a
Delivery options: https://agi.topicbox.com/groups/agi/subscription


Re: [agi] Can symbolic approach entirely replace NN approach?

2024-06-23 Thread Rob Freeman
There were 5 or 6 total misinterpretations of my words in there,
Boris. Misinterpretations of my words were almost the whole content of
your argument. I'll limit myself to the most important
misinterpretation below.

On Sun, Jun 23, 2024 at 7:10 PM Boris Kazachenko  wrote:
> ...
> Starting from your "contradiction": that's simply a linguistic equivalent of 
> my variance.

Is it?

What I mean by contradiction is different orderings of an entire set
of data, not points of contrast within a set of data.

E.g. if you take a group of people and order them by height you will
generally disorder them by age. But if you order them by age, you will
generally disorder them by height. The orders are (depending on
correlation of age and height) contradictory. It's impossible to order
them both at the same time.
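
A minimal sketch of that, with made-up numbers (purely illustrative):

# Toy illustration: the same set of people cannot generally be ordered
# by height and by age at the same time.
people = [("Ann", 170, 34), ("Bob", 185, 29), ("Cho", 162, 41), ("Dee", 178, 25)]

by_height = sorted(people, key=lambda p: p[1])
by_age    = sorted(people, key=lambda p: p[2])

print([p[0] for p in by_height])  # ['Cho', 'Ann', 'Dee', 'Bob']
print([p[0] for p in by_age])     # ['Dee', 'Bob', 'Ann', 'Cho']

# Unless height and age happen to be perfectly correlated, the two orderings
# disagree: sorting by one key "disorders" the other.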

Is that really what you mean by "variance"?

I understood your use of "variance" to mean something like edge
detection in CNNs. You write:

"... CNN ... selects for variance"

Are you now saying your "variance" is my re-ordering of sets, and not
"edge" detection or points of contrast within sets?

To clarify what you mean by "variance", can you give me a concrete
example, one which is comparable to my example of ordering people by
height or age, alternately?

--
Artificial General Intelligence List: AGI
Permalink: 
https://agi.topicbox.com/groups/agi/T682a307a763c1ced-Ma5a6d254cc2308c2f66ae27e
Delivery options: https://agi.topicbox.com/groups/agi/subscription


Re: [agi] Can symbolic approach entirely replace NN approach?

2024-06-23 Thread Boris Kazachenko
Rob, a lot of your disagreements stem from your language-first mindset. 
Which is perverse: you must agree that language is a product of a basic 
cognitive ability, possessed by all mammals.
Starting from your "contradiction": that's simply a linguistic equivalent of my 
variance. 
I have no idea why you think my nested hierarchies don't "appear naturally, in 
real time", and are not context-dependent? Maybe it's because you never tried 
to actually implement your ideas, so you don't think procedurally? 
I never said that all patterns must be global; the whole idea of hierarchy means 
that there is a gradual transition from local to global.

>Compared to what I remember you are addressing "lateral" patterns now.
Perhaps as a consequence of the success of transformers?

It's always been lateral; I didn't learn anything worth borrowing from 
transformers, or neural nets in general. The whole idea of statistical fitting 
that underlies them is backwards; we do it because it's simple and we are lazy. 
Incremental functional complexity is what progress is all about. Evolution 
found some quick and dirty fixes: neural nets of whatever kind. Now we need to 
use them to create something way better. Self-improvement. 


--
Artificial General Intelligence List: AGI
Permalink: 
https://agi.topicbox.com/groups/agi/T682a307a763c1ced-Md375cd0b433d4b19f4dcb0e5
Delivery options: https://agi.topicbox.com/groups/agi/subscription


Re: [agi] Can symbolic approach entirely replace NN approach?

2024-06-22 Thread Rob Freeman
On Sat, Jun 22, 2024 at 7:50 PM Boris Kazachenko  wrote:
>
> On Saturday, June 22, 2024, at 7:18 AM, Rob Freeman wrote:
>
> But I'm not sure that just sticking to some idea of learned hierarchy, which 
> is all I remember of your work, without exposing it to criticism, is 
> necessarily going to get you any further.
>
> It's perfectly exposed: https://github.com/boris-kz/CogAlg

I see. The readme seems concise. Quite a good way to expose it.

Trivial typo in that first paragraph BTW, "pattern recognition (is?) a
main focus in ML"?

So what's your idea now? I remember we talked long ago and while you
were early into networks, I couldn't convince you that the key problem
was that meaning could not be fully learned, because meaningful
patterns contradict. You were sure all that was needed was learning
patterns in hierarchy.

Where have you arrived now?

You say, "The problem I have with current ML is conceptual consistency."

By "conceptual consistency" you mean a distinction between searching
for "similarity" with "attention" in transformers, "similarity" being
co-occurrence(?), vs "variance" or edges, in CNNs.

The solution you are working on is to cluster for both "similarity"
and "variance".

FWIW I think it is a mistake to equate transformers with "attention".
Yeah, "attention" was the immediate trigger of the transformer
revolution. But it's a hack to compensate for lack of structure. The
real power of transformers is the "embedding". Embedding was just
waiting for something to liberate it from a lack of structure.
"Attention" did that partially. But it's a hack. The lack of structure
is because the interesting encoding, which is the "embedding",
attempts global optimization, when instead globally it contradicts in
context, and needs to be generated at run-time.

If you do "embedding" at run time, it can naturally involve token
sequences of different length, and embedding sequences of different
length generates a hierarchy, and gives you structure. The structure
pulls context together naturally, and "attention" as some crude dot
product for relevance, won't be necessary.

Ah... Reading on I see you do address embeddings... But you see the
problem there as being back-prop causing information loss over many
layers. So you think the solution is "lateral" clustering first. You
say "This cross-comp and clustering is recursively hierarchical". Yes,
that fits with what I'm saying. You get hierarchy from sequence
embedding.

So what's the problem with these embedding hierarchies in your model?
In mine it is that they contradict and must be found at run time. You
don't have that. Instead you go back to the combination of "similarity
and variance" idea. And you imagine there are some highly complex
"nested derivatives"...

So the contrast between us is still that you don't see that
contradictory patterns prohibit global learning.

Compared to what I remember you are addressing "lateral" patterns now.
Perhaps as a consequence of the success of transformers?

But instead of addressing the historical failure of "lateral",
embedding hierarchies as a consequence of contradictory patterns, as I
do, you imagine that the solution is some mix of this combination of
"similarity" and "variance", combined with some kind of complex
"nested derivatives".

There's a lot of complexity after that. One sentence jumps out "I see
no way evolution could  produce (the) proposed algorithm". With which
I agree. I'm not sure why you don't think that's an argument against
it. This compared with my hypothesis, which sees nested hierarchies
appearing naturally, in real time, as synchronized oscillations over
predictive symmetries in a sequence network.

Can I ask you, have you considered instead my argument, that these
"lateral" hierarchical embeddings might be context dependent, and
contradict globally, so that they can't be globally learned, and must
be generated at run time? Do you have any argument to exclude that?

--
Artificial General Intelligence List: AGI
Permalink: 
https://agi.topicbox.com/groups/agi/T682a307a763c1ced-Mcd56e51f00e643bbf4829174
Delivery options: https://agi.topicbox.com/groups/agi/subscription


Re: [agi] Can symbolic approach entirely replace NN approach?

2024-06-22 Thread Boris Kazachenko
On Saturday, June 22, 2024, at 7:18 AM, Rob Freeman wrote:
> But I'm not sure that
just sticking to some idea of learned hierarchy, which is all I
remember of your work, without exposing it to criticism, is
necessarily going to get you any further.
It's perfectly exposed: https://github.com/boris-kz/CogAlg, I justify it from 
the most general considerations, and do read a bunch. But there must be focus 
and filter, directed generative search if you will. 
--
Artificial General Intelligence List: AGI
Permalink: 
https://agi.topicbox.com/groups/agi/T682a307a763c1ced-Mcef4f83edb25a1b534b55acb
Delivery options: https://agi.topicbox.com/groups/agi/subscription


Re: [agi] Can symbolic approach entirely replace NN approach?

2024-06-22 Thread Rob Freeman
On Sat, Jun 22, 2024 at 6:05 PM Boris Kazachenko  wrote:
> ...
> You both talk too much to get anything done...

Ah well, you may be getting lots done, Boris. The difference is
perhaps, I don't know everything yet.

Though, after 35 years, it can be surprising what other people don't
know. I like to help where I can. Some people just have no clue. Even
LeCun. Vision guy. He's probably been thinking about language only 7
years or so, since transformers. He only knows the mental ruts his
vision, back-prop career has led him to. You can be deep in one
problem and shallow in another.

But I don't know everything. Trying to explain keeps me thinking about
it. And here and there you get good new information.

For instance, that paper James introduced me to, perhaps for the wrong
reasons, was excellent new information:

A logical re-conception of neural networks: Hamiltonian bitwise
part-whole architecture. E.F.W. Bowen, R. Granger, A. Rodriguez
https://openreview.net/pdf?id=hP4dxXvvNc8

Very nice. The only other mention I recall for the open endedness of
"position-based"/"embedding" type encoding, as a key to creativity. A
nice vindication for me. Helps give me confidence I'm on the right
track. And they have some ideas for extensions to vision, etc. Though
I don't think they see the contradiction angle.

And, another example, commenting on that LeCun post (the one
mentioning the "puzzle" of transformer world models which get less
coverage as you increase resolution... A puzzle. Ha. Nice vindication
in itself...) Twitter prompted me to a guy in Australia who it turns
out has just published a paper showing that sequence networks with a
lot of shared "walk" end points, tend to synchronize locally.

Wow. A true wow. Shared endpoints constrain local synchronization. I
was wondering about that!

How shared end points could constrain sub-net synchrony in feed-forward
networks was something I was struggling with. I think I need
it. So a paper explaining that they do is well cool. New information.
It gives me confidence to move forward looking for the right kind of
feed forward net to try and get local synchronizations corresponding
to substitution groupings/embeddings. Those substitution "embeddings"
would be "walks" between such shared end points, and I want them to
synchronize.

Paper here:

Analytic relationship of relative synchronizability to network
structure and motifs
Joseph T. Lizier, Frank Bauer, Fatihcan M. Atay, and Jürgen Jost
https://openreview.net/pdf?id=hP4dxXvvNc8

More populist presentation shared on my FB group here:

https://www.facebook.com/share/absxV8ij9rio2j9a/
He has a github:

https://github.com/jlizier/linsync
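
For anyone who wants a quick feel for the general idea, a much cruder, generic
proxy for synchronizability is the graph Laplacian eigenvalue ratio. The sketch
below (assuming networkx; it is not the motif analysis of the paper, nor
anything from linsync) just compares a sparse ring to a denser random graph:

# Rough illustration: the Laplacian eigenvalue ratio lambda_2 / lambda_max is
# one standard proxy for how synchronizable a network is; under the usual
# master-stability framing, larger tends to mean easier to synchronize.
import numpy as np
import networkx as nx

def sync_ratio(g):
    lam = np.sort(nx.laplacian_spectrum(g))
    return lam[1] / lam[-1]          # lambda_2 / lambda_max

ring = nx.cycle_graph(50)                       # sparse ring, few shared endpoints
dense = nx.erdos_renyi_graph(50, 0.3, seed=0)   # many shared neighbours / endpoints

print("ring:  ", sync_ratio(ring))   # small ratio
print("dense: ", sync_ratio(dense))  # noticeably larger ratio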

And that guy Lizier appears to be part of a hitherto unsuspected
sub-field of neuro-computational research attempting to reconcile
synchronized oscillations to some kind of processing breakdown. None
of it from my point of view though, I think. I need to explore where
there might be points of connection there.

So, lots of opportunity to waste time, sure. But I'm not sure that
just sticking to some idea of learned hierarchy, which is all I
remember of your work, without exposing it to criticism, is
necessarily going to get you any further.

--
Artificial General Intelligence List: AGI
Permalink: 
https://agi.topicbox.com/groups/agi/T682a307a763c1ced-M0018a3d4180b84e0801eae92
Delivery options: https://agi.topicbox.com/groups/agi/subscription


Re: [agi] Can symbolic approach entirely replace NN approach?

2024-06-22 Thread Boris Kazachenko
>Wow. Lots of words. I don't mind detail, but words are slippery.

He is marking territory, like any dog. It's all about self-promotion: the 
more he talks about himself, the better he feels. You both talk too much to get 
anything done; it becomes an end in itself, and substance is secondary.

--
Artificial General Intelligence List: AGI
Permalink: 
https://agi.topicbox.com/groups/agi/T682a307a763c1ced-Mcf035ccd366f978fb7fc5d9d
Delivery options: https://agi.topicbox.com/groups/agi/subscription


Re: [agi] Can symbolic approach entirely replace NN approach?

2024-06-22 Thread Rob Freeman
Twenkid,

Wow. Lots of words. I don't mind detail, but words are slippery.

If you actually want to do stuff, it's better to keep the words to a
minimum and start with concrete examples. At least until some minimum
of consensus is agreed.

Trying to focus on your concrete questions...

On Sat, Jun 22, 2024 at 10:16 AM twenkid  wrote:
>
> Regarding the contradictions - I guess you mean ambiguity

Yeah, I guess so. I was thinking more abstractly in terms of
grammatical classes. You can never learn grammar, because any rule you
make always ends up being violated: AB->C, **except** in certain
cases... etc. But probably ambiguity at the meaning level resolves to
the same kinds of issues, sure.

> * BTW, what is "not to contradict"? How it would look like in a particular 
> case, example?

Oh, I suppose any formal language is an example of a system that
doesn't contradict. Programming languages... Maths, once axioms are
fixed, would be another example of non-contradiction (by definition?
Of course the thing with maths is that different sets of possible
axioms contradict, and that contradiction of possible axiomatizations
is the whole deal.)

> What do you mean by "the language problem"?

"Grammar", text compression, text prediction...

> The language models lead to such an advance: compared to what else, other 
> (non-language?) models.

Advance, compared to everything before transformers.

In terms of company market cap, if you want to quibble.

> Rob: >Do you have any comments on that idea, that patterns of meaning which 
> can be learned contradict, and so have to be generated in real time?
>
> I am not sure about the proper interpretation of your use of "to contradict"; 
> words/texts have multiple meanings, and language and text are lower resolution 
> than thought; if they are supposed to represent the reality "exactly", lower 
> level, higher precision representations are needed as well

In terms of my analogy to maths, this reads to me like saying: the
fact there are multiple axiomatizations for maths, means maths axioms
are somehow "lower resolution", and the solution for maths is to have
"higher precision representations" for maths... :-b

If you can appreciate how nonsensical that analysis would be within
the context of maths, then you may get a read on what it sounds like
to me from the way I'm looking at language. Instead, I think different
grammaticalizations of language are like different axiomatizations of
maths (inherently random and infinite?)

You're not the only one doing that, of course. Just the other day
LeCun was tweeting something comparable in response to a study which
revealed transformer world models seem to... contradict! Contradict?!
Who'd 've thought it?! The more resolution you get, the less coverage
you get. Wow. Surprise. Gee, that must mean that we need to find a
"higher precision representation" somewhere else!

LeCun's post here:

https://x.com/ylecun/status/1803677519314407752

--
Artificial General Intelligence List: AGI
Permalink: 
https://agi.topicbox.com/groups/agi/T682a307a763c1ced-M268d0affcf74d745427a406e
Delivery options: https://agi.topicbox.com/groups/agi/subscription


Re: [agi] Can symbolic approach entirely replace NN approach?

2024-06-17 Thread Rob Freeman
On Mon, Jun 17, 2024 at 3:22 PM Quan Tesla  wrote:
>
> Rob, basically you're reiterating what I've been saying here all along. To 
> increase contextualization and instill robustness in the LLM systemic 
> hierarchies. Further, that it seems to be critically lacking within current 
> approaches.
>
> However, I think this is fast changing, and soon enough, I expect 
> breakthroughs in this regard. Neural linking could be one of those solutions.
>
> While it may not be exactly the same as your hypothesis (?), is it because 
> it's part of your PhD that you're not willing to acknowledge that this 
> theoretical work may have been completed by another researcher more than 17 
> years ago, even submitted for review and subsequently approved? The market, 
> especially Japan, grabbed this research as fast as they could. It's the West 
> that turned out to be all "snooty" about its meaningfulness, yet, it was the 
> West that reviewed and approved of it. Instead of serious collaboration, is 
> research not perhaps being hamstrung by the NIH (Not Invented Here) syndrome, 
> acting like a stuck handbrake?

You intrigue me. "Contextualization ... in LLM systemic hierarchies"
was completed and approved 17 years ago?

"Contextualization" is a pretty broad word. I think the fact that
Bengio retreated to distributed representation with "Neural Language
Models" around... 2003(?) might be seen as one acceptance of... if not
contextualization, at least indeterminacy (I see Bengio refers to "the
curse of dimensionality".) But I see nothing about structure until
Coecke et co. around 2007. And even they (and antecedents going back
to the early '90s with Smolensky?) I'm increasingly appreciating seem
trapped in their tensor formalisms.

The Bengio thread, if it went anywhere, stayed stuck on structure
until deep learning rescued it with LSTMs. And then "attention".

Anyway, the influence of Coecke seems to be tiny. And basically
mis-construed. I think Linas Vepstas followed it, but only saw
encouragement to seek other mathematical abstractions of grammar. And
OpenCog wasted a decade trying to learn those grammars.

Otherwise, I've been pretty clear that I think there are hints to what
I'm arguing in linguistics and maths going back decades, and in
philosophy going back centuries. The linguistics ones specifically
ignored by machine learning.

But that any of this, or anything like it was "grabbed ... as fast as
they could" by the market in Japan, is a puzzle to me (17 years ago?
Specifically 17?)

As is the idea that the West failed to use it, even having "reviewed
and approved it", because it was "snooty" about... Japan's market
having grabbed it first?

Sadly Japanese research in AI, to my knowledge, has been dead since
their big push in the 1980s. Dead, right through their "lost" economic
decades. I met the same team I knew working on symbolic machine
translation grammars 1989-91, at a conference in China in 2002, and as
far as I know they were still working on refinements to the same
symbolic grammar. 10 more years. Same team. Same tech. Just one of the
係長 (subsection chiefs) had become a 課長 (section chief).

What is this event from 17 years ago?

--
Artificial General Intelligence List: AGI
Permalink: 
https://agi.topicbox.com/groups/agi/T682a307a763c1ced-M2187d1c831913c8c67e1fc9c
Delivery options: https://agi.topicbox.com/groups/agi/subscription


Re: [agi] Can symbolic approach entirely replace NN approach?

2024-06-17 Thread Quan Tesla
Rob, basically you're reiterating what I've been saying here all along. To
increase contextualization and instill robustness in the LLM systemic
hierarchies. Further, that it seems to be critically lacking within current
approaches.

However, I think this is fast changing, and soon enough, I expect
breakthroughs in this regard. Neural linking could be one of those
solutions.

While it may not be exactly the same as your hypothesis (?), is it because
it's part of your PhD that you're not willing to acknowledge that this
theoretical work may have been completed by another researcher more than 17
years ago, even submitted for review and subsequently approved? The market,
especially Japan, grabbed this research as fast as they could. It's the
West that turned out to be all "snooty" about its meaningfulness, yet, it
was the West that reviewed and approved of it. Instead of serious
collaboration, is research not perhaps being hamstrung by the NIH (Not
Invented Here) syndrome, acting like a stuck handbrake?

IMO, this is exactly why progress in the West has been so damned slow.
Everyone is competing for the honor of discovering something great, looking
out for number one. Write a book. Grab a TV show, become a hero, or
whatever.

Here's another person on this list who also jetted off into his own space
with a PhD, the fruits of whose labor we have never seen. His theory quickly
became as irrelevant as the time it takes to code newer applications. As
you must be acutely aware: Valid doesn't equate to Reliable, Reliable
doesn't equate to Relevance (Attention), and Relevance doesn't equate to
Intel. Seems to me, pundits of LLMs (used to) think it does. Once
relevance within LLMs is resolved (soon), the real battle for Intel
will begin. An epic battle well worth watching.

Meanwhile, in all probability, whatever we could think of/invent has
already been thought of by another person, somewhere else in the world.
Sometimes, centuries ago. This seems very similar to what Karl Mannheim was
referring to in his view on competition for knowledge as a factor for human
survival, within the context of the sociology of knowledge. I think we
should add this as a mitigating factor to culturally-based knowledge
systems and embed it in LLMs. My 2 cents' worth.

Predictably, at a certain "size" LLMs - on their own - would wander off
into ambiguity. There are multiple reasons for this, one of them due to
exponential complexity. That's the point - I'll predict - at which ~99.7%
of the LLM-dev market's going to be left behind to scramble for
marketing-related contracts/jobs in order to support AI-based sales and
trading efforts.

The remainder ~0.3% are going to emerge as the AI-driven Intel industry.
I'll rate the AI-Intel market segment as a future, trillion-dollar
industry.

Just some thoughts I had. I could be completely wrong. Only time would
tell.

On Sat, Jun 15, 2024 at 9:42 AM Rob Freeman 
wrote:

> On Sat, Jun 15, 2024 at 1:29 AM twenkid  wrote:
> >
> > ...
> > 2. Yes, the tokenization in current LLMs is usually "wrong", ... it
> should be on concepts and world models: ... it should predict the
> *physical* future of the virtual worlds
> 
> Thanks for comments. I can see you've done a lot of thinking, and see
> similarities in many places, not least Jeff Hawkins, HTM, and
> Friston's Active Inference.
> 
> But I read what you are suggesting as a solution to the current
> "token" problem for LLMs, like that of a lot of people currently,
> LeCun prominently, to be that we need to ground representation more
> deeply in the real world.
> 
> I find this immediate retreat to other sources of data kind of funny,
> actually. It's like... studying the language problem has worked really
> well, so the solution to move forward is to stop studying the language
> problem!
> 
> We completely ignore why studying the language problem has caused such
> an advance. And blindly, immediately throw away our success and look
> elsewhere.
> 
> I say look more closely at the language problem. Understand why it has
> caused such an advance before you look elsewhere.
> 
> I think the reason language models have led us to such an advance is
> that the patterns language prompts us to learn are inherently better.
> "Embeddings", gap fillers, substitution groupings, are just closer to
> the way the brain works. And language has led us to them.
> 
> So OK, if "embeddings" have been the advance, replacing both fixed
> labeled objects in supervised learning, and fixed objects based on
> internal similarities in "unsupervised" learning, instead leading us
> to open ended categories based on external relations, why do we still
> have problems? Why can't we structure better than "tokens"? Why does
> it seem like they've led us the other way, to no structure at all?
> 
> My thesis is actually pretty simple. It is that these open ended
> categories of "embeddings" are good, but they contradict. These "open"
> categories can have a whole new level of ...

Re: [agi] Can symbolic approach entirely replace NN approach?

2024-06-16 Thread John Rose
On Sunday, June 16, 2024, at 7:09 PM, Matt Mahoney wrote:
> Not everything can be symbolized in words. I can't describe what a person 
> looks like as well as showing you a picture. I can't describe what a novel 
> chemical smells like except to let you smell it. I can't tell you how to ride 
> a bicycle without you practicing.

That’s the point. You emit symbols that reference the qualia that you 
experienced of what the person looks like. The symbols or words are a 
compressed impressed representation of the original full symbol that you 
experienced in your mind. Your original qualia is your unique experience and 
another person receives your transmission or description to reference their own 
qualia which are also unique. It’s a hit or miss since you can’t transmit the 
full qualia but you can transmit more words to paint a more accurate picture 
and increase accuracy. There isn’t enough bandwidth, sampling capacity and 
instantaneousness but you have to reference something for the purposes of 
transmitting information spatiotemporally. A “thing” is a reference which it 
seems can only be a symbol, ever, unless the thing is the symbol itself and 
that would be the original unique qualia. Maybe there are exceptions? like 
numbers but they are still references to qualia going back in history... or 
computations? They are still derivatives. And no transmission is 100% reliable 
as there is always some small chance of error AFAIK. If I'm wrong I would like 
to know.

--
Artificial General Intelligence List: AGI
Permalink: 
https://agi.topicbox.com/groups/agi/T682a307a763c1ced-M773f13826341af38c56a4e09
Delivery options: https://agi.topicbox.com/groups/agi/subscription


Re: [agi] Can symbolic approach entirely replace NN approach?

2024-06-16 Thread Matt Mahoney
Not everything can be symbolized in words. I can't describe what a person
looks like as well as showing you a picture. I can't describe what a novel
chemical smells like except to let you smell it. I can't tell you how to
ride a bicycle without you practicing.

On Sun, Jun 16, 2024, 5:36 PM John Rose  wrote:

> On Friday, June 14, 2024, at 3:43 PM, James Bowery wrote:
>
> Etter: "Thing (n., singular): anything that can be distinguished from
> something else."
>
>
> I simply use “thing” as anything that can be symbolized, and a unique case
> is qualia, where from a first-person experiential viewpoint a qualia
> experiential symbol = the symbolized but for transmission the qualia are
> fitted or compressed into symbol(s). So, for example “nothing” is a thing
> simply because it can be symbolized. Is there anything that cannot be
> symbolized? Perhaps things that cannot be symbolized, what would they be?
> Pre-qualia? but then they are already symbolized since they are referenced…
> You could generalize it and say all things are ultimately derivatives of
> qualia and I speculate that it is impossible to name one that is not. Note
> that in ML a perceptron or a set of perceptrons could be considered
> artificial qualia symbol emitters and perhaps that’s why they are named
> such, percept -> tron. A basic binary classifier is emitting an
> experiential symbol as a bit and more sophisticated perceptrons emit higher
> symbol complexity such as color codes or text characters.
>

--
Artificial General Intelligence List: AGI
Permalink: 
https://agi.topicbox.com/groups/agi/T682a307a763c1ced-M33eaab901fc926ab4a6ae137
Delivery options: https://agi.topicbox.com/groups/agi/subscription


Re: [agi] Can symbolic approach entirely replace NN approach?

2024-06-16 Thread John Rose
On Friday, June 14, 2024, at 3:43 PM, James Bowery wrote:
>> Etter: "Thing (n., singular): anything that can be distinguished from 
>> something else."

I simply use “thing” as anything that can be symbolized, and a unique case is 
qualia, where from a first-person experiential viewpoint a qualia experiential 
symbol = the symbolized but for transmission the qualia are fitted or 
compressed into symbol(s). So, for example “nothing” is a thing simply because 
it can be symbolized. Is there anything that cannot be symbolized? Perhaps 
things that cannot be symbolized, what would they be? Pre-qualia? but then they 
are already symbolized since they are referenced… You could generalize it and 
say all things are ultimately derivatives of qualia and I speculate that it is 
impossible to name one that is not. Note that in ML a perceptron or a set of 
perceptrons could be considered artificial qualia symbol emitters and perhaps 
that’s why they are named such, percept -> tron. A basic binary classifier is 
emitting an experiential symbol as a bit and more sophisticated perceptrons 
emit higher symbol complexity such as color codes or text characters. 
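
A minimal sketch of a perceptron as a one-bit symbol emitter in that sense
(toy weights, purely illustrative):

# Toy perceptron as a one-bit "symbol emitter": it reduces an input vector to a
# single binary symbol. An illustration of the framing above, nothing more.
def perceptron_bit(x, w, b):
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0

w, b = [0.4, -0.7, 0.2], 0.1
print(perceptron_bit([1.0, 0.5, 2.0], w, b))   # emits 1
print(perceptron_bit([0.0, 1.0, 0.0], w, b))   # emits 0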

--
Artificial General Intelligence List: AGI
Permalink: 
https://agi.topicbox.com/groups/agi/T682a307a763c1ced-Ma5a8d7f7d388f150f9437cf3
Delivery options: https://agi.topicbox.com/groups/agi/subscription


Re: [agi] Can symbolic approach entirely replace NN approach?

2024-06-14 Thread Rob Freeman
On Sat, Jun 15, 2024 at 1:29 AM twenkid  wrote:
>
> ...
> 2. Yes, the tokenization in current LLMs is usually "wrong", ... it  should 
> be on concepts and world models: ... it should predict the *physical* future 
> of the virtual worlds

Thanks for comments. I can see you've done a lot of thinking, and see
similarities in many places, not least Jeff Hawkins, HTM, and
Friston's Active Inference.

But I read what you are suggesting as a solution to the current
"token" problem for LLMs, like that of a lot of people currently,
LeCun prominently, to be that we need to ground representation more
deeply in the real world.

I find this immediate retreat to other sources of data kind of funny,
actually. It's like... studying the language problem has worked really
well, so the solution to move forward is to stop studying the language
problem!

We completely ignore why studying the language problem has caused such
an advance. And blindly, immediately throw away our success and look
elsewhere.

I say look more closely at the language problem. Understand why it has
caused such an advance before you look elsewhere.

I think the reason language models have led us to such an advance is
that the patterns language prompts us to learn are inherently better.
"Embeddings", gap fillers, substitution groupings, are just closer to
the way the brain works. And language has led us to them.

So OK, if "embeddings" have been the advance, replacing both fixed
labeled objects in supervised learning, and fixed objects based on
internal similarities in "unsupervised" learning, instead leading us
to open ended categories based on external relations, why do we still
have problems? Why can't we structure better than "tokens"? Why does
it seem like they've led us the other way, to no structure at all?

My thesis is actually pretty simple. It is that these open ended
categories of "embeddings" are good, but they contradict. These "open"
categories can have a whole new level of "open". They can change all
the time. That's why it seems like they've led us to no structure at
all. Actually we can have structure. It is just we have to generate it
in real time, not try to learn it all at once.

That's really all I'm saying, and my solution to the "token" problem.
It means you can start with "letter" tokens, and build "word" tokens,
and also "phrases", whole hierarchies. But you have to do it in real
time, because the "tokens", "words", "something", "anything", "any
thing", two "words", one "word"... whatever, can contradict and have
to be found always only in their relevant context.
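
A very rough toy sketch of the flavor of this (a context-local, BPE-like merge
over one input at a time; illustrative only, not the actual mechanism I have in
mind):

# Toy sketch: build larger "tokens" from letters at run time, per input,
# by repeatedly merging the most frequent adjacent pair *in this context only*,
# rather than from a globally learned vocabulary.
from collections import Counter

def runtime_tokens(text, merges=5):
    toks = list(text)
    for _ in range(merges):
        pairs = Counter(zip(toks, toks[1:]))
        if not pairs:
            break
        (a, b), _ = pairs.most_common(1)[0]
        merged, i = [], 0
        while i < len(toks):
            if i + 1 < len(toks) and toks[i] == a and toks[i + 1] == b:
                merged.append(a + b)
                i += 2
            else:
                merged.append(toks[i])
                i += 1
        toks = merged
    return toks

# letters merge into larger, context-local chunks for this input only
print(runtime_tokens("any thing anything something"))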

Do you have any comments on that idea, that patterns of meaning which
can be learned contradict, and so have to be generated in real time?

I still basically see nobody addressing it in the machine learning community.

It's a little like Matt's "modeling both words and letters" comment.
But it gets beneath both. It doesn't only use letters and words, it
creates both "letters" and "words" as "fuzzy", or contradictory,
constructs in themselves. And then goes on to create higher level
structures, hierarchies, phrases, sentences, as higher "tokens",
facilitating logic, symbolism, and all those other artifacts of higher
structure which are currently eluding LLMs. All levels of structure
become accessible if we just accept they may contradict, and so have
to be generated in context, at run time.

It's also not unrelated to James' definition of "a thing as anything
that can be distinguished from something else." Though that is more at
the level of equating definition with relationship, or "embedding",
and doesn't get into the missing, contradictory, or "fuzzy" aspect.
Though it allows that fuzzy aspect to exist, and leads to it, if once
you imagine it might, because it decouples the definition of a thing,
from any single internal structure of the thing itself.

--
Artificial General Intelligence List: AGI
Permalink: 
https://agi.topicbox.com/groups/agi/T682a307a763c1ced-M9013929b4653571c40328f7b
Delivery options: https://agi.topicbox.com/groups/agi/subscription


Re: [agi] Can symbolic approach entirely replace NN approach?

2024-06-14 Thread Matt Mahoney
My point was that token boundaries are fuzzy. This causes problems because
LLMs predict tokens, not characters or bits. There was a thread on Reddit
about ChatGPT not being able to count the number of R's in "strawberry".
The problem is that it sees the word but not the letters.
https://www.reddit.com/r/ChatGPT/s/xYBVddV6jw

Text compressors solve this problem by modeling both words and letters and
combining the predictions.
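
A toy sketch of that kind of mixing (a generic logistic mix of two predictions,
illustrative only, not the exact model of any particular compressor):

# Combine a word-level and a letter-level prediction of the next character.
import math

def mix(p_word, p_letter, w_word=0.5, w_letter=0.5):
    # Mix in the logistic (stretch/squash) domain, as context mixers typically do.
    stretch = lambda p: math.log(p / (1 - p))
    squash = lambda x: 1 / (1 + math.exp(-x))
    return squash(w_word * stretch(p_word) + w_letter * stretch(p_letter))

# e.g. the word model is confident the next char continues "strawber" -> "r",
# the letter model is less sure; the mix lands between them, nearer the
# more confident model.
print(mix(0.95, 0.60))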

On Fri, Jun 14, 2024, 3:44 PM James Bowery  wrote:

>
>
> On Wed, May 29, 2024 at 11:24 AM Matt Mahoney 
> wrote:
>
>> Natural language is ambiguous at every level including tokens. Is
>> "someone" one word or two?
>>
>
> Tom Etter's tragically unfinished final paper "Membership and Identity"
> has this quite insightful passage:
>
> Thing (n., singular): anything that can be distinguished from something
>> else.
>> ...
>> ...the word "thing" is a broken-off fragment of the more
>> fundamental compound words "anything" and "something". That these words are
>> fundamental is hardly debatable, since they are two of the four fundamental
>> words of symbolic logic, where they are written as ∀ and ∃. With this in
>> mind, let's reexamine the above definition of a *thing* as anything that
>> can be distinguished from something else...
>

--
Artificial General Intelligence List: AGI
Permalink: 
https://agi.topicbox.com/groups/agi/T682a307a763c1ced-M66075f51488aa63fe906ccfd
Delivery options: https://agi.topicbox.com/groups/agi/subscription


Re: [agi] Can symbolic approach entirely replace NN approach?

2024-06-14 Thread James Bowery
On Wed, May 29, 2024 at 11:24 AM Matt Mahoney 
wrote:

> Natural language is ambiguous at every level including tokens. Is
> "someone" one word or two?
>

Tom Etter's tragically unfinished final paper "Membership and Identity"
has this quite insightful passage:

Thing (n., singular): anything that can be distinguished from something
> else.
> ...
> ...the word "thing" is a broken-off fragment of the more
> fundamental compound words "anything" and "something". That these words are
> fundamental is hardly debatable, since they are two of the four fundamental
> words of symbolic logic, where they are written as ∀ and ∃. With this in
> mind, let's reexamine the above definition of a *thing* as anything that
> can be distinguished from something else...

--
Artificial General Intelligence List: AGI
Permalink: 
https://agi.topicbox.com/groups/agi/T682a307a763c1ced-M85f7e0507c5c4a130f91f15b
Delivery options: https://agi.topicbox.com/groups/agi/subscription


Re: [agi] Can symbolic approach entirely replace NN approach?

2024-06-14 Thread twenkid
1. @Can symbolic approach ...
2. @Rob Freeman LLMs, What's wrong with NLP (2009-2024),  Whisper

*1.* *IMO the sharp division between "neat" and "scruffy", NN and symbolic, is 
confused: Neural Networks are also symbolic:* 
http://artificial-mind.blogspot.com/2019/04/neural-networks-are-also-symbolic.html

NNs are a subset of the symbolic, both in implementation and in output. 
"Symbolic" is also a bad term; a better one is *conceptual*: in a developing 
system it is about creating systems of *concepts* and operating with them, 
generalizing from specifics, not about "symbols" (dereferencing, or the 
characters) or mindless algebra, "symbol manipulation", since every calculation 
can be seen as something like that.

Either an NN or whatever else is computational is a program in a computer; the 
training of an NN is a kind of programming, and in a big-data-trained one the 
data is just the biggest part of the code. If the "data part" of the code is 
represented in a more succinct way, or the NN part is made more complex so it 
needs less "brute force", they will converge in some intermediate 
representation. Brute force is also relative.

Whole NNs can themselves be concepts or "symbols" within "symbolic" systems, 
which could incorporate and use whatever is already available: existing models, 
plus training new ones for particular tasks.

An AGI, a Mind and the Universe are, cognitively, hierarchical simulators of 
virtual universes, with the lowest level being "the real" or "physical" one for 
the current evaluator-observer virtual universe (causality-control unit). 
Whatever works is allowed. The terms are from the Theory of Universe and Mind, 
classical version 2001-2004, taught during the world's first university course 
in AGI (Plovdiv 2010, 2011); the core reasoning has gradually been incorporated 
into mainstream AI (some of it was hiding there earlier). 

That kind of architecture or way of working, providing an explicit imagination, 
is something that LLMs currently lack; they "have" only an implicit, 
"sweeping-on-the-go" one, encoded within their whole system, the way diffusion 
models and GANs have implicit models of 3D structure and global illumination.

*2.* Yes, the tokenization in current LLMs is usually "wrong". It is workable 
for shaping and generating plausible material for the modality, given "knowing 
everything already" and covering all cases, but it should operate on concepts 
and world models: simulators of virtual universes, mapped to imagination. It 
should predict the *physical* future of the virtual worlds, not these tokens, 
which are often not morphemes and not grammatical cases - sometimes they happen 
to match specific "meaningful" units, and since models now use a huge number of 
tokens, many words and multi-word expressions get separate tokens.

The models can indirectly create these *world* models in order to generate the 
correct words, but then the data should include wider information and 
intentions, as in the multimodal models. 

The following early 2009 articles are still valid, although there has been 
progress on some of the suggestions, and there is now a longer "chain of 
intelligent operations" (even "chain of thought reasoning" as a term):

*What's wrong with Natural Language Processing? Part I: *
https://artificial-mind.blogspot.com/2009/02/whats-wrong-with-natural-language.html

*What's wrong with Natural Language Processing? Part II: Static, Specific, 
High-level, Not-evolving...*
https://artificial-mind.blogspot.com/2009/03/whats-wrong-with-natural-language.html

It includes criticism of the NLP tests of the time - there were only a few back 
then, POS-tagging etc. Now there are plenty, but many seem as funny as back 
then. Once I reviewed one for predicting the next word in novels: the examples 
from the paper were all stereotypes and banalities, and the LLMs celebrate 
going up from 68.4 to 69.6%, just like in the 2000s. A part of the conclusions 
of one of the works:

*"""*
*(...) Yes, mainstream NLP at the moment:*

*- Is useful.*
*- Solve[s] some abstract specific problems by heuristics.*
*- It works to some degree for "intelligent" tasks, because of course language 
does map mind.*

*However, the mainstream still does not lead to a chain of intelligent 
operations; there are no loops and no cumulative development.*

*-- The length of the chain of inter-related intelligent operations in NLP 
today is very short. This is related to the lack of will and general goals of 
the systems. These systems are "push-the-button-and-fetch-the-result".*
[Now: "prompts", but moving to agents]

*-- Swallowing a huge corpus of a billion words or so and computing statistical 
dependencies between tokens is not the way the mind works.*

!!! Mind learns step by step, modeling simpler 
constructs/situations/dynamics/models before reaching more complex ones.
!!! Temporal relations of inputs of different complexity are important.
!!! Mind usually uses many sensory inputs while learning. *Very important.*

[Multimodal models, Vision-Text models]
!!! Mind has 

Re: [agi] Can symbolic approach entirely replace NN approach?

2024-05-29 Thread Matt Mahoney
Natural language is ambiguous at every level including tokens. Is "someone"
one word or two? Language models handle this by mixing the predictions
given by the contexts "some", "one", and "someone".

Using fixed dictionaries is a compromise that reduces accuracy for reducing
computation,  like all tradeoffs in data compressors.

--
Artificial General Intelligence List: AGI
Permalink: 
https://agi.topicbox.com/groups/agi/T682a307a763c1ced-M44b0fc5b236911fe9a971c6d
Delivery options: https://agi.topicbox.com/groups/agi/subscription


Re: [agi] Can symbolic approach entirely replace NN approach?

2024-05-29 Thread Rob Freeman
On Wed, May 29, 2024 at 9:37 AM Matt Mahoney  wrote:
>
> On Tue, May 28, 2024 at 7:46 AM Rob Freeman  
> wrote:
>
> > Now, let's try to get some more detail. How do compressors handle the
> > case where you get {A,C} on the basis of AB, CB, but you don't get,
> > say AX, CX? Which is to say, the rules contradict.
>
> Compressors handle contradictory predictions by averaging them

That's what I thought.

> > "Halle (1959, 1962) and especially Chomsky (1964) subjected
> > Bloomfieldian phonemics to a devastating critique."
> >
> > Generative Phonology
> > Michael Kenstowicz
> > http://lingphil.mit.edu/papers/kenstowicz/generative_phonology.pdf
> >
> > But really it's totally ignored. Machine learning does not address
> > this to my knowledge. I'd welcome references to anyone talking about
> > its relevance for machine learning.
>
> Phonology is mostly irrelevant to text prediction.

The point was it invalidated the method of learning linguistic
structure by distributional analysis at any level. If your rules for
phonemes contradict, what doesn't contradict?

Which is a pity. Because we still don't have a clue what governs
language structure. The best we've been able to come up with is crude
hacks like dragging a chunk of important context behind like a ball
and chain in LSTM, or multiplexing pre-guessed "tokens" together in a
big matrix, with "self-attention".

Anyway, your disinterest doesn't invalidate my claim that this result,
pointing to contradiction produced by distributional analysis learning
procedures for natural language, is totally ignored by current machine
learning, which implicitly or otherwise uses those distributional
analysis learning procedures.

> Language evolved to be learnable on neural networks faster than our
> brains evolved to learn language. So understanding our algorithm is
> important.
>
> Hutter prize entrants have to prebuild a lot of the model because
> computation is severely constrained (50 hours in a single thread with
> 10 GB memory). That includes a prebuilt dictionary. The human brain
> takes 20 years to learn language on a 10 petaflop, 1 petabyte neural
> network. So we are asking quite a bit.

Neural networks may have finally gained close to human performance at
prediction. A problem where you can cover a multitude of sins with raw
memory. Something at which computers trivially exceed humans by as
many orders of magnitude as you can stack server farms. You can just
remember each contradiction including the context which selects it. No
superior algorithm required, and certainly none in evidence. (Chinese
makes similar trade-offs, swapping internal mnemonic sound structure
within tokens, with prodigious memory requirements for the tokens
themselves.) Comparing 10 GB with 1 petabyte seems ingenuous. I
strongly doubt any human can recall as much as 10GB of text. (All of
Wikipedia currently ~22GB compressed, without media? Even to read it
all is estimated at 47 years, including 8hrs sleep a night
https://www.reddit.com/r/theydidthemath/comments/80fi3w/self_how_long_would_it_take_to_read_all_of/.
So forget 20 years to learn it, it would take 20 years to read all the
memory you give Prize entrants.) But I would argue our prediction
algorithms totally fail to do any sort of job with language structure.
Whereas you say babies start to structure language before they can
walk? (Walking being something else computers still have problems
with.) And far from stopping at word segmentation, babies go on to
build quite complex structures, including new ones all the time.

Current models do nothing with structure, not at human "data years"
8-10 months, not 77 years (680k hours of audio to train "Whisper" ~77
years? 
https://www.thealgorithmicbridge.com/p/8-features-make-openais-whisper-the.
Perhaps some phoneme structure might help there...) The only structure
is "tokens". I don't even think current algorithms do max entropy to
find words. They just start out with "tokens". Guessed at
pre-training. Here's Karpathy and LeCun talking about it:

Yann LeCun
@ylecun·Feb 21
Text tokenization is almost as much of an abomination for text as it
is for images. Not mentioning video.
...
Replying to @karpathy
We will see that a lot of weird behaviors and problems of LLMs
actually trace back to tokenization. We'll go through a number of
these issues, discuss why tokenization is at fault, and why someone
out there ideally finds a way to delete this stage entirely.

https://x.com/ylecun/status/1760315812345176343

By the way, talking about words. That's another thing which seems to
have contradictory structure in humans, e.g. native Chinese speakers
agree what constitutes a "word" less than 70% of the time:

"Sproat et. al. (1996) give empirical results showing that native
speakers of Chinese frequently agree on the correct segmentation in
less than 70% of the cases."
https://s3.amazonaws.com/tm-town-nlp-resources/ch2.pdf

I guess that will be:

Sproat, Richard W., Chilin Shih, William Gale, and 

Re: [agi] Can symbolic approach entirely replace NN approach?

2024-05-28 Thread Matt Mahoney
On Tue, May 28, 2024 at 7:46 AM Rob Freeman  wrote:

> Now, let's try to get some more detail. How do compressors handle the
> case where you get {A,C} on the basis of AB, CB, but you don't get,
> say AX, CX? Which is to say, the rules contradict.

Compressors handle contradictory predictions by averaging them,
weighted both by the implied confidence of predictions near 0 or 1,
and the model's historical success rate. Although transformer based
LLMs predict a vector of word probabilities, PAQ based compressors
like CMIX predict one bit at a time, which is equivalent but has a
simpler implementation. You could have hundreds of context models
based on the last n bytes or word (the lexical model), short term
memory or sparse models (semantics), and learned word categories
(grammar). The context includes the already predicted bits of the
current word, like when you guess the next word one letter at a time.

The context model predictions are mixed using a simple neural network
with no hidden weights:

p = squash(w · stretch(x))

where x is the vector of input predictions in (0,1), w is the weight
vector, stretch(x) = ln(x/(1-x)), squash is the inverse = 1/(1 +
e^-x), and p is the final bit prediction. The effect of stretch() and
squash() is to favor predictions near 0 or 1. For example, if one
model guesses 0.5 and another is 0.99, the average would be about 0.9.
The weights are then adjusted to favor whichever models were closest:

w := w + L stretch(x) (y - p)

where y is the actual bit (0 or 1), y - p is the prediction error, and
L is the learning rate, typically around 0.001.
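
For concreteness, a toy rendering of those two formulas in Python (my own
sketch, not CMIX code; two hypothetical models, and equal starting weights,
which is an assumption of mine):

import math

def stretch(p):
    # ln(x/(1-x)): map a probability in (0,1) onto the real line
    return math.log(p / (1.0 - p))

def squash(x):
    # inverse of stretch: 1/(1 + e^-x)
    return 1.0 / (1.0 + math.exp(-x))

class Mixer:
    def __init__(self, n, learning_rate=0.001):
        self.w = [1.0 / n] * n       # starting weights (an assumption)
        self.lr = learning_rate
        self.x = [0.0] * n
        self.p = 0.5

    def mix(self, predictions):
        # predictions: each model's probability that the next bit is a 1
        self.x = [stretch(p) for p in predictions]
        self.p = squash(sum(w * xi for w, xi in zip(self.w, self.x)))
        return self.p

    def update(self, bit):
        # bit is the actual bit (0 or 1); favor the models that were closest
        err = bit - self.p
        self.w = [w + self.lr * xi * err for w, xi in zip(self.w, self.x)]

m = Mixer(2)
print(m.mix([0.99, 0.5]))   # about 0.9, as in the example above
m.update(1)                 # the bit was 1; the confident model gains weight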

> "Halle (1959, 1962) and especially Chomsky (1964) subjected
> Bloomfieldian phonemics to a devastating critique."
>
> Generative Phonology
> Michael Kenstowicz
> http://lingphil.mit.edu/papers/kenstowicz/generative_phonology.pdf
>
> But really it's totally ignored. Machine learning does not address
> this to my knowledge. I'd welcome references to anyone talking about
> its relevance for machine learning.

Phonology is mostly irrelevant to text prediction. But an important
lesson is how infants learn to segment continuous speech around 8-10
months, before they learn their first word around 12 months. This is
important for learning languages without spaces like Chinese (a word
is 1 to 4 symbols, each representing a syllable). The solution is
simple. Word boundaries occur when the next symbol is less
predictable, reading either forward or backwards. I did this research
in 2000. https://cs.fit.edu/~mmahoney/dissertation/lex1.html
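
A toy illustration of that boundary rule, using only a forward character model
over a made-up spaceless string (the method described above also reads
backwards and uses much better statistics):

from collections import defaultdict

def boundary_scores(text, order=2):
    # estimate P(next char | previous `order` chars) from the text itself,
    # then score each position by how unpredictable the next character is
    counts = defaultdict(lambda: defaultdict(int))
    for i in range(len(text) - order):
        counts[text[i:i + order]][text[i + order]] += 1
    scores = []
    for i in range(order, len(text)):
        ctx, nxt = text[i - order:i], text[i]
        total = sum(counts[ctx].values())
        p = counts[ctx][nxt] / total if total else 0.0
        scores.append((i, 1.0 - p))   # high score = surprising = likely boundary
    return scores

text = "theboyatethecaketheboysawthecat"   # hypothetical spaceless input
for i, s in boundary_scores(text):
    if s >= 0.5:
        # flags most of the real word boundaries, plus a couple of false hits
        print("possible word boundary before", repr(text[i]), "at position", i)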

Language evolved to be learnable on neural networks faster than our
brains evolved to learn language. So understanding our algorithm is
important.

Hutter prize entrants have to prebuild a lot of the model because
computation is severely constrained (50 hours in a single thread with
10 GB memory). That includes a prebuilt dictionary. The human brain
takes 20 years to learn language on a 10 petaflop, 1 petabyte neural
network. So we are asking quite a bit.

-- 
-- Matt Mahoney, mattmahone...@gmail.com

--
Artificial General Intelligence List: AGI
Permalink: 
https://agi.topicbox.com/groups/agi/T682a307a763c1ced-M1f60044363c6d90c81505bcc
Delivery options: https://agi.topicbox.com/groups/agi/subscription


Re: [agi] Can symbolic approach entirely replace NN approach?

2024-05-28 Thread Rob Freeman
Matt,

Nice break down. You've actually worked with language models, which
makes it easier to bring it back to concrete examples.

On Tue, May 28, 2024 at 2:36 AM Matt Mahoney  wrote:
>
> ...For grammar, AB predicts AB (n-grams),

Yes, this looks like what we call "words". Repeated structure. No
novelty. And nothing internal we can equate to "meaning" either. Only
meaning by association.

> and AB, CB, CD, predicts AD (learning the rule
> {A,C}{B,D}).

This is the interesting one. It actually kind of creates new meaning.
You can think of "meaning" as a way of grouping things which makes
good predictions. And, indeed, those gap filler sets {A,C} do pull
together sets of words that we intuitively associate with similar
meaning. These are also the sets that the HNet paper identifies as
having "meaning" independent of any fixed pattern. A pattern can be
new, and so long as it makes similar predictions {B,D}, for any set
{B,D...}, {X,Y...}..., we can think of it as having "meaning", based
on the fact that arranging the world that way, makes those shared
predictions. (Even moving beyond language, you can say the atoms of a
ball, share the meaning of a "ball", based on the fact they fly
through the air together, and bounce off walls together. It's a way of
defining what it "means" to be a "ball".)
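
(To make those gap-filler sets concrete, a purely illustrative toy sketch:
group the left-hand words by the right-hand contexts they share.)

from collections import defaultdict

pairs = [("A", "B"), ("C", "B"), ("C", "D")]   # toy data: AB, CB, CD observed

fillers = defaultdict(set)
for left, right in pairs:
    fillers[right].add(left)

# words that fill the same gap before "B" form a candidate category {A, C};
# once they are grouped, the observed CD licenses the unseen AD
print(fillers["B"])   # {'A', 'C'}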

Now, let's try to get some more detail. How do compressors handle the
case where you get {A,C} on the basis of AB, CB, but you don't get,
say AX, CX? Which is to say, the rules contradict. Sometimes A and C
are the same, but not other times. You want to trigger the "rule" so
you can capture the symmetries. But you can't make a fixed "rule",
saying {A,C}, because the symmetries only apply to particular sub-sets
of contexts.

You get a lot of this in natural language. There are many such shared
context symmetries in language, but they contradict. Or they're
"entangled". You get one by ordering contexts one way, and another by
ordering contexts another way, but you can't get both at once, because
you can't order contexts both ways at once.

I later learned these contradictions were observed even at the level
of phonemes, and this was crucial to Chomsky's argument that grammar
could not be "learned", back in the '50s. That this essentially broke
consensus in the field of linguistics. Which remains in squabbling
sub-fields over this result, to this day. That's why theoretical
linguistics contributes essentially nothing to contemporary machine
learning. Has anyone ever wondered? Why don't linguists tell us how to
build language models? Even the Chomsky hierarchy cited by James'
DeepMind paper from the "learning" point of view is essentially a
misapprehension of what Chomsky concluded (that observable grammar
contradicts, so formal grammar can't be learned.)

A reference available on the Web I've been able to find is this one:

"Halle (1959, 1962) and especially Chomsky (1964) subjected
Bloomfieldian phonemics to a devastating critique."

Generative Phonology
Michael Kenstowicz
http://lingphil.mit.edu/papers/kenstowicz/generative_phonology.pdf

But really it's totally ignored. Machine learning does not address
this to my knowledge. I'd welcome references to anyone talking about
its relevance for machine learning.

I'm sure all the compression algorithms submitted to the Hutter Prize
ignore this. Maybe I'm wrong. Have any addressed it? They probably
just regress to some optimal compromise, and don't think about it too
much.

If we choose not to ignore this, what do we do? Well, we might try to
"learn" all these contradictions, indexed on context. I think this is
what LLMs do. By accident. That was the big jump, right, "attention",
to index context. Then they just enumerate vast numbers of (an
essentially infinite number of?) predictive patterns in one enormous
training time. That's why they get so large.

No-one knows, or wonders, why neural nets work for this, and symbols
don't, viz. the topic post of this thread. But this will be the
reason.

In practice LLMs learn predictive patterns, and index them on context
using "attention", and it turns out there are a lot of those different
predictive "embeddings", indexed on context. There is no theory.
Everything is a surprise. But if you go back in the literature, there
are these results about contradictions to suggest why it might be so.
And the conclusion is still either Chomsky's one, that language can't
be learned, consistent rules exist, but must be innate. Or, what
Chomsky didn't consider, that complexity of novel patterns defying
abstraction, might be part of the solution. It was before the
discovery of chaos when Chomsky was looking at this, so perhaps it's
not fair to blame him for not considering it.

But then it becomes a complexity issue. Just how many unique orderings
of contexts with useful predictive symmetries are there? Are you ever
at an end of finding different orderings of contexts, which specify
some useful new predictive symmetry or other? The example of

Re: [agi] Can symbolic approach entirely replace NN approach?

2024-05-27 Thread James Bowery
On Mon, May 27, 2024 at 9:34 AM Rob Freeman 
wrote:

> James,
>
> I think you're saying:
>
> 1) Grammatical abstractions may not be real, but they can still be
> useful abstractions to parameterize "learning".
>

And more generally, people pragmatically adopt different fictional abstract
grammars as convenient:  Fictions except perhaps the lowest power ones like
regular/finite-state automata, which are discrete and finite dynamical systems,
*real*izable in physical phenomena.



> 2) Even if after that there are "rules of thumb" which actually govern
> everything.
>

Well, there are abstract grammars (see #1) and there are instances of those
grammars (#2), which may be induced given the
fictional-abstract-grammar-of-convenience.  BUT we would prefer not to have
to _learn_ these maximum-parsimony induced grammars if it can be avoided.
Some fragments of physics may be learnable from the raw data, but we may
not wish to do so depending on the task at hand -- hence "physics informed
machine learning" makes sense in many cases.

Well, you might say why not just learn the "rules of thumb".

Hopefully you meant "one" rather than "you" since in both #1 and #2 I've
repeatedly asserted that these are rules of thumb/heuristics/ML biases
(whether biasing toward a particular abstract grammar (#1) or toward a
particular grammar (#2)) we should be taking more seriously.


> But the best counter against the usefulness of the Chomsky hierarchy
> for parameterizing machine learning, might be that Chomsky himself
> dismissed the idea it might be learned.


A machine learning algorithm is like the phenotype of the genetic code
Chomsky has hypothesized for innate capacity to learn grammar.  It
"parameterizes" the ML algorithm prior to any learning taking place by that
ML algorithm.  Both #1 and #2 are examples of such "genetic code" hardwired
into the ML algorithm -- to "bias" learning in such a manner as to speed up
convergence on lossless compression of the training data (ie: all prior
observations in the ideal case of the Algorithmic Information Criterion for
model selection since "test" and "validation" data are available in
subsequent observations and you can't do any better than the best lossless
compression as the basis for decision).


> And his most damaging
> argument? That learned categories contradict. "Objects" behave
> differently in one context, from how they behave in another context.


I guess the best thing to do at this stage is stop talking about "Chomsky"
and even about "grammar" and instead talk only about which level of "automata"
one wishes to
focus on in the "domain specific computer programming language" one wishes
to use to achieve the shortest algorithm that outputs all prior
observations.  Some domain specific languages are not Turing complete but
may nevertheless be interpreted by a meta-interpreter that is.

This gets away from anything specific to "Chomsky" and lets us focus on the
more well-grounded notion of Algorithmic Information.


> I see it a bit like our friend the Road Runner. You can figure out a
> physics for him. But sometimes that just goes haywire and contradicts
> itself - bodies make holes in rocks, fly high in the sky, or stretch
> wide.
>
> All the juice is in these weird "rules of thumb".
>
> Chomsky too failed to find consistent objects. He was supposed to push
> past the highly successful learning of phoneme "objects", and find
> "objects" for syntax. And he failed. And the most important reason
> I've found, was that even for phonemes, learned category contradicted.
>
> That hierarchy stuff, that wasn't supposed to appear in the data. That
> could only be in our heads. Innate. Why? Well for one thing, because
> the data contradicted. The "learning procedures" of the time generated
> contradictory objects. This is a forgotten result. Machine learning is
> still ignoring this old result from the '50s. (Fair to say the
> DeepMind paper ignores it?) Chomsky insisted these contradictions
> meant the "objects" must be innate. The idea cognitive objects might
> be new all the time (and particularly the idea they might contradict!)
> is completely orthogonal to his hierarchy (well, it might be
> compatible with context sensitivity, if you accept that the real juice
> is in the mechanism to implement the context sensitivity?)
>
> If categories contradict, that is represented on the Chomsky hierarchy
> how? I don't know. How would you represent contradictory categories on
> the Chomsky hierarchy? A form of context sensitivity?
>
> Actually, I think, probably, using entangled objects like quantum. Or
> relation and variance based objects as in category theory.
>
> I believe Coecke's team has been working on "learning" exactly this:
>
> From Conceptual Spaces to Quantum Concepts: Formalising and Learning
> Structured Conceptual Models
> Sean Tull, Razin A. Shaikh, Sara Sabrina Zemljiˇc and Stephen Clark
> Quantinuum
> 

Re: [agi] Can symbolic approach entirely replace NN approach?

2024-05-27 Thread Matt Mahoney
The top text compressors use simple models of semantics and grammar
that group words into categories as fuzzy equivalence relations. For
semantics, the rules are reflexive, A predicts A (but not too close.
Probability peaks 50-100 bytes away), symmetric, A..B predicts A..B
and B..A, and transitive, A..B, B..C predicts A..C. For grammar, AB
predicts AB (n-grams), and AB, CB, CD, predicts AD (learning the rule
{A,C}{B,D}). Even the simplest compressors like zip model n-grams. The
top compressors learn groupings. For example, "white house", "white
car", "red house" predicts the novel "red car". For cmix variants, the
dictionary would be "white red...house car" and take whole groups as
contexts. The dictionary can be built automatically by clustering in
context space.
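
A tiny sketch of that clustering step, using just the three phrases from the
example (cosine similarity over left/right neighbour counts; a real compressor
would cluster a whole dictionary this way):

from collections import defaultdict, Counter
from math import sqrt

phrases = ["white house", "white car", "red house"]   # the example above

ctx = defaultdict(Counter)        # context vector: neighbours of each word
for ph in phrases:
    a, b = ph.split()
    ctx[a][("R", b)] += 1         # b occurs to the right of a
    ctx[b][("L", a)] += 1         # a occurs to the left of b

def cosine(u, v):
    dot = sum(u[k] * v[k] for k in u)
    nu = sqrt(sum(c * c for c in u.values()))
    nv = sqrt(sum(c * c for c in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

# "red" and "white" share the right-context "house", so they cluster together,
# and grouping them is what licenses the novel "red car"
print(cosine(ctx["red"], ctx["white"]))   # about 0.71
print(cosine(ctx["red"], ctx["house"]))   # 0.0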

Compressors model semantics using sparse contexts. To get the reverse
prediction "A..B -> B..A" and the transitive prediction
"A..B, B..C -> A..C" you use a short term memory like LSTM both for
learning associations and as context for prediction.

Humans use lexical, semantic, and grammar induction to predict text.
For example, how do you predict, "The flob ate the glork. What do
flobs eat?"

Your semantic model learned to associate "flob" with "glork", "eat"
with "glork" and "eat" with "ate". Your grammar model learned that
"the" is usually followed by a noun and that nouns are sometimes
followed by the plural "s". Your lexical model tells you that there is
no space before the "s". Thus, you and a good language model predict
the novel word "glorks".

All of this has a straightforward implementation with neural networks.
It takes a lot of computation because you need on the order of as many
parameters as you have bits of training data, around 10^9 for human
level. Current LLMs are far beyond that with 10^13 bits or so. The
basic operations are prediction, y = Wx, and training, W += xy^t,
where x is the input word vector, y is the output word probability
vector, W is the weight matrix, and ^t means transpose. Both
operations require similar computation (the number of parameters,
|W|), but training requires more hardware because you are compressing
a million years worth of text in a few days. Prediction for chatbots
only has to be real time, about 10 bits per second.
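
A minimal numpy sketch of those two operations (toy vocabulary size; I read the
y in the training update as the observed next word, one-hot, which is the
simplest way to make the outer-product form learn):

import numpy as np

vocab = 5                          # toy vocabulary size (an assumption)
W = np.zeros((vocab, vocab))       # next-word scores given the current word

def one_hot(i):
    v = np.zeros(vocab)
    v[i] = 1.0
    return v

def predict(x):
    # prediction: y = Wx, softmaxed into a probability vector over words
    y = W @ x
    e = np.exp(y - y.max())
    return e / e.sum()

def train(x, next_word):
    # training: W += y x^t, with y taken as the observed next word (one-hot)
    global W
    W += np.outer(one_hot(next_word), x)

# toy usage: after seeing word 1 followed by word 3 a few times,
# word 3 comes to dominate the prediction for context word 1
for _ in range(5):
    train(one_hot(1), 3)
print(predict(one_hot(1)))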

And as I have been saying since 2006, text prediction (measured by
compression) is all you need to pass the Turing test, and therefore
all you need to appear conscious or sentient.

-- 
-- Matt Mahoney, mattmahone...@gmail.com

--
Artificial General Intelligence List: AGI
Permalink: 
https://agi.topicbox.com/groups/agi/T682a307a763c1ced-M0a4075c52c080ace6a702efa
Delivery options: https://agi.topicbox.com/groups/agi/subscription


Re: [agi] Can symbolic approach entirely replace NN approach?

2024-05-27 Thread Rob Freeman
James,

I think you're saying:

1) Grammatical abstractions may not be real, but they can still be
useful abstractions to parameterize "learning".

2) Even if after that there are "rules of thumb" which actually govern
everything.

Well, you might say why not just learn the "rules of thumb".

But the best counter against the usefulness of the Chomsky hierarchy
for parameterizing machine learning, might be that Chomsky himself
dismissed the idea it might be learned. And his most damaging
argument? That learned categories contradict. "Objects" behave
differently in one context, from how they behave in another context.

I see it a bit like our friend the Road Runner. You can figure out a
physics for him. But sometimes that just goes haywire and contradicts
itself - bodies make holes in rocks, fly high in the sky, or stretch
wide.

All the juice is in these weird "rules of thumb".

Chomsky too failed to find consistent objects. He was supposed to push
past the highly successful learning of phoneme "objects", and find
"objects" for syntax. And he failed. And the most important reason
I've found, was that even for phonemes, learned category contradicted.

That hierarchy stuff, that wasn't supposed to appear in the data. That
could only be in our heads. Innate. Why? Well for one thing, because
the data contradicted. The "learning procedures" of the time generated
contradictory objects. This is a forgotten result. Machine learning is
still ignoring this old result from the '50s. (Fair to say the
DeepMind paper ignores it?) Chomsky insisted these contradictions
meant the "objects" must be innate. The idea cognitive objects might
be new all the time (and particularly the idea they might contradict!)
is completely orthogonal to his hierarchy (well, it might be
compatible with context sensitivity, if you accept that the real juice
is in the mechanism to implement the context sensitivity?)

If categories contradict, that is represented on the Chomsky hierarchy
how? I don't know. How would you represent contradictory categories on
the Chomsky hierarchy? A form of context sensitivity?

Actually, I think, probably, using entangled objects like quantum. Or
relation and variance based objects as in category theory.

I believe Coecke's team has been working on "learning" exactly this:

From Conceptual Spaces to Quantum Concepts: Formalising and Learning
Structured Conceptual Models
Sean Tull, Razin A. Shaikh, Sara Sabrina Zemljiˇc and Stephen Clark
Quantinuum
https://browse.arxiv.org/pdf/2401.08585

I'm not sure. I think the symbolica.ai people may be working on
something similar: find some level of abstraction which applies even
across varying objects (contradictions?)

For myself, in contrast to Bob Coecke, and the category theory folks,
I think it's pointless, and maybe unduly limiting, to learn this
indeterminate object formalism from data, and then collapse it into
one or other contradictory observable form, each time you observe it.
(Or seek some way you can reason with it even in indeterminate object
formulation, as with the category theory folks?) I think you might as
well collapse observable objects directly from the data.

I believe this collapse "rule of thumb", is the whole game, one shot,
no real "learning" involved.

All the Chomsky hierarchy limitations identified in the DeepMind paper
would disappear too. They are all limitations of not identifying
objects. Context coding hacks like LSTM, or "attention", introduced in
lieu of actual objects, and grammars over those objects, stemming from
the fact grammars of contradictory objects are not "learnable."

On Sun, May 26, 2024 at 11:24 PM James Bowery  wrote:
>
> It's also worth reiterating a point I made before about the confusion between 
> abstract grammar as a prior (heuristic) for grammar induction and the 
> incorporation of so-induced grammars as priors, such as in "physics informed 
> machine learning".
>
> In the case of physics informed machine learning, the language of physics is 
> incorporated into the learning algorithm.  This helps the machine learning 
> algorithm learn things about the physical world without having to re-derive 
> the body of physics knowledge.
>
> Don't confuse the two levels here:
>
> 1) My suspicion that natural language learning may benefit from prioritizing 
> HOPDA as an abstract grammar to learn something about natural languages -- 
> such as their grammars.
>
> 2) My suspicion (supported by "X informed machine learning" exemplified by 
> the aforelinked work) that there may be prior knowledge about natural 
> language more specific than the level of abstract grammar -- such as specific 
> rules of thumb for, say, the English language that may greatly speed training 
> time on English corpora.
>
> On Sun, May 26, 2024 at 9:40 AM James Bowery  wrote:
>>
>> See the recent DeepMind paper "Neural Networks and the Chomsky Hierarchy" 
>> for the sense of "grammar" I'm using when talking about the HNet paper's 
>> connection to 

Re: [agi] Can symbolic approach entirely replace NN approach?

2024-05-26 Thread James Bowery
It's also worth reiterating a point I made before about the confusion between
abstract grammar as a prior (heuristic) for grammar induction and the
incorporation of so-induced grammars as priors, such as in "physics informed
machine learning".

In the case of physics informed machine learning, the language of physics
is incorporated into the learning algorithm.  This helps the machine
learning algorithm learn things about the physical world without having to
re-derive the body of physics knowledge.

Don't confuse the two levels here:

1) My suspicion that natural language learning may benefit from
prioritizing HOPDA as an *abstract* grammar to learn something about
natural languages -- such as their grammars.

2) My suspicion (supported by "X informed machine learning" exemplified by
the aforelinked work) that there may be prior knowledge about natural
language more specific than the level of *abstract* grammar -- such as
specific rules of thumb for, say, the English language that may greatly
speed training time on English corpora.

On Sun, May 26, 2024 at 9:40 AM James Bowery  wrote:

> See the recent DeepMind paper "Neural Networks and the Chomsky Hierarchy" for
> the sense of "grammar" I'm using when talking about the HNet paper's connection
> to Granger's prior papers about "grammar", the most recent being "Toward the
> quantification of cognition".  Although the DeepMind
> paper doesn't refer to Granger's work on HOPDAs, it does at least
> illustrate a fact, long-recognized in the theory of computation:
>
> Grammar, Computation
> Regular, Finite-state automaton
> Context-free, Non-deterministic pushdown automaton
> Context sensitive, Linear-bounded non-deterministic Turing machine
> Recursively enumerable, Turing machine
>
> Moreover, the DeepMind paper's empirical results support the corresponding
> hierarchy of computational power.
>
> Having said that, it is critical to recognize that everything in a finite
> universe reduces to finite-state automata in hardware -- it is only in our
> descriptive languages that the hierarchy exists.  We don't describe all
> computer programs in terms of finite-state automata aka regular grammar
> languages.  We don't describe all computer programs even in terms of Turing
> complete automata aka recursively enumerable grammar languages.
>
> And I *have* stated before (which I first linked to the HNet paper)
> HOPDAs are interesting as a heuristic because they *may* point the way to
> a prioritization if not restriction on the program search space that
> evolution has found useful in creating world models during an individual
> organism's lifetime.
>
> The choice of language, hence the level of grammar, depends on its utility
> in terms of the Algorithmic Information Criterion for model selection.
>
> I suppose one could assert that none of that matters so long as there is
> any portion of the "instruction set" that requires the Turing complete
> fiction, but that's a rather ham-handed critique of my nuanced point.
>
>
>
> On Sat, May 25, 2024 at 9:37 PM Rob Freeman 
> wrote:
>
>> Thanks Matt.
>>
>> The funny thing is though, as I recall, finding semantic primitives
>> was the stated goal of Marcus Hutter when he instigated his prize.
>>
>> That's fine. A negative experimental result is still a result.
>>
>> I really want to emphasize that this is a solution, not a problem, though.
>>
>> As the HNet paper argued, using relational categories, like language
>> embeddings, decouples category from pattern. It means we can have
>> categories, grammar "objects" even, it is just that they may
>> constantly be new. And being constantly new, they can't be finitely
>> "learned".
>>
>> LLMs may have been failing to reveal structure, because there is too
>> much of it, an infinity, and it's all tangled up together.
>>
>> We might pick it apart, and have language models which expose rational
>> structure, the Holy Grail of a neuro-symbolic reconciliation, if we
>> just embrace the constant novelty, and seek it as some kind of
>> instantaneous energy collapse in the relational structure of the data.
>> Either using a formal "Hamiltonian", or, as I suggest, finding
>> prediction symmetries in a network of language sequences, by
>> synchronizing oscillations or spikes.
>>
>> On Sat, May 25, 2024 at 11:33 PM Matt Mahoney 
>> wrote:
>> >
>> > I agree. The top ranked text compressors don't model grammar at all.

--
Artificial General Intelligence List: AGI
Permalink: 
https://agi.topicbox.com/groups/agi/T682a307a763c1ced-M440c27b119465aeba2e4d2bb
Delivery options: 

Re: [agi] Can symbolic approach entirely replace NN approach?

2024-05-26 Thread James Bowery
See the recent DeepMind paper "Neural Networks and the Chomsky Hierarchy" for
the sense of "grammar" I'm using when talking about the HNet paper's connection
to Granger's prior papers about "grammar", the most recent being "Toward the
quantification of cognition".  Although the DeepMind paper
doesn't refer to Granger's work on HOPDAs, it does at least illustrate a
fact, long-recognized in the theory of computation:

Grammar, Computation
Regular, Finite-state automaton
Context-free, Non-deterministic pushdown automaton
Context sensitive, Linear-bounded non-deterministic Turing machine
Recursively enumerable, Turing machine
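
(Purely as a toy concretisation of the first row: the regular grammar
S -> aS | b and a two-state finite-state automaton recognising the same
strings, nothing more.)

def accepts(s: str) -> bool:
    # states: "S" (still reading a's), "ACCEPT" (saw the final b)
    state = "S"
    for ch in s:
        if state == "S" and ch == "a":
            state = "S"          # loop on 'a'
        elif state == "S" and ch == "b":
            state = "ACCEPT"     # a single 'b' ends the string
        else:
            return False
    return state == "ACCEPT"

print(accepts("aaab"), accepts("b"), accepts("ba"))   # True True False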

Moreover, the DeepMind paper's empirical results support the corresponding
hierarchy of computational power.

Having said that, it is critical to recognize that everything in a finite
universe reduces to finite-state automata in hardware -- it is only in our
descriptive languages that the hierarchy exists.  We don't describe all
computer programs in terms of finite-state automata aka regular grammar
languages.  We don't describe all computer programs even in terms of Turing
complete automata aka recursively enumerable grammar languages.

And I *have* stated before (which I first linked to the HNet paper) HOPDAs
are interesting as a heuristic because they *may* point the way to a
prioritization if not restriction on the program search space that
evolution has found useful in creating world models during an individual
organism's lifetime.

The choice of language, hence the level of grammar, depends on its utility
in terms of the Algorithmic Information Criterion for model selection.

I suppose one could assert that none of that matters so long as there is
any portion of the "instruction set" that requires the Turing complete
fiction, but that's a rather ham-handed critique of my nuanced point.



On Sat, May 25, 2024 at 9:37 PM Rob Freeman 
wrote:

> Thanks Matt.
>
> The funny thing is though, as I recall, finding semantic primitives
> was the stated goal of Marcus Hutter when he instigated his prize.
>
> That's fine. A negative experimental result is still a result.
>
> I really want to emphasize that this is a solution, not a problem, though.
>
> As the HNet paper argued, using relational categories, like language
> embeddings, decouples category from pattern. It means we can have
> categories, grammar "objects" even, it is just that they may
> constantly be new. And being constantly new, they can't be finitely
> "learned".
>
> LLMs may have been failing to reveal structure, because there is too
> much of it, an infinity, and it's all tangled up together.
>
> We might pick it apart, and have language models which expose rational
> structure, the Holy Grail of a neuro-symbolic reconciliation, if we
> just embrace the constant novelty, and seek it as some kind of
> instantaneous energy collapse in the relational structure of the data.
> Either using a formal "Hamiltonian", or, as I suggest, finding
> prediction symmetries in a network of language sequences, by
> synchronizing oscillations or spikes.
>
> On Sat, May 25, 2024 at 11:33 PM Matt Mahoney 
> wrote:
> >
> > I agree. The top ranked text compressors don't model grammar at all.

--
Artificial General Intelligence List: AGI
Permalink: 
https://agi.topicbox.com/groups/agi/T682a307a763c1ced-Mcca9a6d522c416b1c95cd3d1
Delivery options: https://agi.topicbox.com/groups/agi/subscription


Re: [agi] Can symbolic approach entirely replace NN approach?

2024-05-25 Thread Rob Freeman
Thanks Matt.

The funny thing is though, as I recall, finding semantic primitives
was the stated goal of Marcus Hutter when he instigated his prize.

That's fine. A negative experimental result is still a result.

I really want to emphasize that this is a solution, not a problem, though.

As the HNet paper argued, using relational categories, like language
embeddings, decouples category from pattern. It means we can have
categories, grammar "objects" even, it is just that they may
constantly be new. And being constantly new, they can't be finitely
"learned".

LLMs may have been failing to reveal structure, because there is too
much of it, an infinity, and it's all tangled up together.

We might pick it apart, and have language models which expose rational
structure, the Holy Grail of a neuro-symbolic reconciliation, if we
just embrace the constant novelty, and seek it as some kind of
instantaneous energy collapse in the relational structure of the data.
Either using a formal "Hamiltonian", or, as I suggest, finding
prediction symmetries in a network of language sequences, by
synchronizing oscillations or spikes.

On Sat, May 25, 2024 at 11:33 PM Matt Mahoney  wrote:
>
> I agree. The top ranked text compressors don't model grammar at all.

--
Artificial General Intelligence List: AGI
Permalink: 
https://agi.topicbox.com/groups/agi/T682a307a763c1ced-Meac024d4e635bb1d9e8f34e9
Delivery options: https://agi.topicbox.com/groups/agi/subscription


Re: [agi] Can symbolic approach entirely replace NN approach?

2024-05-25 Thread Matt Mahoney
I agree. The top ranked text compressors don't model grammar at all.

On Fri, May 24, 2024, 11:47 PM Rob Freeman 
wrote:

> Ah, I see. Yes, I saw that reference. But I interpreted it only to
> mean the general forms of a grammar. Do you think he means the
> mechanism must actually be a grammar?
>
> In the earlier papers I interpret him to be saying, if language is a
> grammar, what kind of a grammar must it be? And, yes, it seemed he was
> toying with actual physical mechanisms relating to levels of brain
> structure. Thalamo-cortical loops?
>
> The problem with that is, language doesn't actually seem to be any
> kind of grammar at all.
>
> It's like saying if the brain had to be an internal combustion engine,
> it might be a Mazda rotary. BFD. It's not an engine at all.
>
> I don't know if the authors realized that. But surely that's the point
> of the HNet paper. That something can generate the general forms of a
> grammar, without actually being a grammar.
>
> I guess this goes back to your assertion in our prior thread that
> "learning" needs to be constrained by "physical priors" of some kind
> (was it?) Are there physical "objects" constraining the "learning", or
> does the "learning" vaguely resolve as physical objects, but not
> quite?
>
> I don't think vague resemblance to objects means the objects must exist,
> at all.
>
> Take Kepler and the planets. If the orbits of planets are epicycles,
> which epicycles would they be? The trouble is, it turns out they are
> not epicycles.
>
> And at least epicycles work! That's the thing for natural language.
> Formal grammar doesn't even work. None of them. Nested stacks, context
> free, Chomsky hierarchy up, down, and sideways. They don't work. So
> figuring out which formal grammar is best, is a pointless exercise.
> None of them work.
>
> Yes, broadly human language seems to resolve itself into forms which
> resemble formal grammar (it's probably designed to do that, so that it
> can usefully represent the world.) And it might be generally useful to
> decide which formal grammar it best (vaguely) resembles.
>
> But in detail it turns out human language does not obey the rules of
> any formal grammar at all.
>
> It seems to be a bit like the way the output of a TV screen looks like
> objects moving around in space. Yes, it looks like objects moving in
> space. You might even generate a physics based on the objects which
> appear to be there. It might work quite well until you came to Road
> Runner cartoons. That doesn't mean the output of a TV screen is
> actually objects moving around in space. If you insist on implementing
> a TV screen as objects moving around in space, well, it might be a
> puppet show similar enough to amuse the kids. But you won't make a TV
> screen. You will always fail. And fail in ways very reminiscent of the
> way formal grammars almost succeed... but fail, to represent human
> language.
>
> Same thing with a movie. Also looks a lot like objects moving around
> on a screen. But is it objects moving on a screen? Different again.
>
> Superficial forms do not always equate to mechanisms.
>
> That's what's good about the HNet paper for me. It discusses how those
> general forms might emerge from something else.
>
> The history of AI in general, and natural language processing in
> particular, has been a search for those elusive "grammars" we see
> chasing around on the TV screens of our minds. And they all failed.
> What has succeeded has been breaking the world into bits (pixels?) and
> allowing them to come together in different ways. Then the game became
> how to bring them together. Supervised "learning" spoon fed the
> "objects" and bound the pixels together explicitly. Unsupervised
> learning tried to resolve "objects" as some kind of similarity between
> pixels. AI got a bump when, by surprise, letting the "objects" go
> entirely turned out to generate text that was more natural than ever!
> Who'd a thunk it? Letting "objects" go entirely works best! If it
> hadn't been for the particular circumstances of language, pushing you
> to a "prediction" conception of the problem, how long would it have
> taken us to stumble on that? The downside to that was, letting
> "objects" go entirely also doesn't totally fit with what we
> experience. We do experience the world as "objects". And without those
> "objects" at all, LLMs are kind of unhinged babblers.
>
> So where's the right balance? Is the solution as LeCun, and perhaps
> you, suggest (or Ben, looking for "semantic primitives" two years
> ago...), to forget about the success LLMs had by letting go of objects
> entirely. To repeat our earlier failures and seek the "objects"
> elsewhere. Some other data. Physics? I see the objects, dammit! Look!
> There's a coyote, and there's a road runner, and... Oh, my physics
> didn't allow for that...
>
> Or could it be the right balance is, yes, to ignore the exact
> structure of the objects as LLMs have done, but no, not to do it as
> LLMs 

Re: [agi] Can symbolic approach entirely replace NN approach?

2024-05-24 Thread Rob Freeman
Ah, I see. Yes, I saw that reference. But I interpreted it only to
mean the general forms of a grammar. Do you think he means the
mechanism must actually be a grammar?

In the earlier papers I interpret him to be saying, if language is a
grammar, what kind of a grammar must it be? And, yes, it seemed he was
toying with actual physical mechanisms relating to levels of brain
structure. Thalamo-cortical loops?

The problem with that is, language doesn't actually seem to be any
kind of grammar at all.

It's like saying if the brain had to be an internal combustion engine,
it might be a Mazda rotary. BFD. It's not an engine at all.

I don't know if the authors realized that. But surely that's the point
of the HNet paper. That something can generate the general forms of a
grammar, without actually being a grammar.

I guess this goes back to your assertion in our prior thread that
"learning" needs to be constrained by "physical priors" of some kind
(was it?) Are there physical "objects" constraining the "learning", or
does the "learning" vaguely resolve as physical objects, but not
quite?

I don't think vague resemblance to objects means the objects must exist, at all.

Take Kepler and the planets. If the orbits of planets are epicycles,
which epicycles would they be? The trouble is, it turns out they are
not epicycles.

And at least epicycles work! That's the thing for natural language.
Formal grammar doesn't even work. None of them. Nested stacks, context
free, Chomsky hierarchy up, down, and sideways. They don't work. So
figuring out which formal grammar is best, is a pointless exercise.
None of them work.

Yes, broadly human language seems to resolve itself into forms which
resemble formal grammar (it's probably designed to do that, so that it
can usefully represent the world.) And it might be generally useful to
decide which formal grammar it best (vaguely) resembles.

But in detail it turns out human language does not obey the rules of
any formal grammar at all.

It seems to be a bit like the way the output of a TV screen looks like
objects moving around in space. Yes, it looks like objects moving in
space. You might even generate a physics based on the objects which
appear to be there. It might work quite well until you came to Road
Runner cartoons. That doesn't mean the output of a TV screen is
actually objects moving around in space. If you insist on implementing
a TV screen as objects moving around in space, well, it might be a
puppet show similar enough to amuse the kids. But you won't make a TV
screen. You will always fail. And fail in ways very reminiscent of the
way formal grammars almost succeed... but fail, to represent human
language.

Same thing with a movie. Also looks a lot like objects moving around
on a screen. But is it objects moving on a screen? Different again.

Superficial forms do not always equate to mechanisms.

That's what's good about the HNet paper for me. It discusses how those
general forms might emerge from something else.

The history of AI in general, and natural language processing in
particular, has been a search for those elusive "grammars" we see
chasing around on the TV screens of our minds. And they all failed.
What has succeeded has been breaking the world into bits (pixels?) and
allowing them to come together in different ways. Then the game became
how to bring them together. Supervised "learning" spoon fed the
"objects" and bound the pixels together explicitly. Unsupervised
learning tried to resolve "objects" as some kind of similarity between
pixels. AI got a bump when, by surprise, letting the "objects" go
entirely turned out to generate text that was more natural than ever!
Who'd a thunk it? Letting "objects" go entirely works best! If it
hadn't been for the particular circumstances of language, pushing you
to a "prediction" conception of the problem, how long would it have
taken us to stumble on that? The downside to that was, letting
"objects" go entirely also doesn't totally fit with what we
experience. We do experience the world as "objects". And without those
"objects" at all, LLMs are kind of unhinged babblers.

So where's the right balance? Is the solution as LeCun, and perhaps
you, suggest (or Ben, looking for "semantic primitives" two years
ago...), to forget about the success LLMs had by letting go of objects
entirely. To repeat our earlier failures and seek the "objects"
elsewhere. Some other data. Physics? I see the objects, dammit! Look!
There's a coyote, and there's a road runner, and... Oh, my physics
didn't allow for that...

Or could it be the right balance is, yes, to ignore the exact
structure of the objects as LLMs have done, but no, not to do it as
LLMs do by totally ignoring "objects", but to ignore only the internal
structure of the "objects", by focusing on relations defining objects
in ways which allow their internal "pattern" to vary.

That's what I see being presented in the HNet paper. Maybe I'm getting
ahead of its authors. Because 

Re: [agi] Can symbolic approach entirely replace NN approach?

2024-05-24 Thread James Bowery
On Thu, May 23, 2024 at 9:19 PM Rob Freeman 
wrote:

> ...(Regarding the HNet paper)
> The ideas of relational category in that paper might really shift the
> needle for current language models.
>
> That as distinct from the older "grammar of mammalian brain capacity"
> paper, which I frankly think is likely a dead end.
>

Quoting the HNet paper:

> We conjecture that ongoing hierarchical construction of such entities can
> enable increasingly “symbol-like” representations, arising from lower-level
> “statistic-like” representations. Figure 9 illustrates construction of simple
> “face” configuration representations, from exemplars constructed within the
> CLEVR system consisting of very simple eyes, nose, mouth features. Categories
> (¢) and sequential relations ($) exhibit full compositionality into sequential
> relations of categories of sequential relations, etc.; these define formal
> grammars (Rodriguez & Granger 2016; Granger 2020). Exemplars (a,b) and near
> misses (c,d) are presented, initially yielding just instances, which are then
> greatly reduced via abductive steps (see Supplemental Figure 13).

--
Artificial General Intelligence List: AGI
Permalink: 
https://agi.topicbox.com/groups/agi/T682a307a763c1ced-Mb30f879a8ccbe35506565e18
Delivery options: https://agi.topicbox.com/groups/agi/subscription


Re: [agi] Can symbolic approach entirely replace NN approach?

2024-05-23 Thread Rob Freeman
James,

Not sure whether all that means you think category theory might be
useful for AI or not.

Anyway, I was moved to post those examples by Rich Hickey and Bartoz
Milewsky in my first post to this thread, by your comment that ideas
of indeterminate categories might annoy what you called 'the risible
tradition of so-called "type theories" in both mathematics and
programming languages'. I see the Hickey and Milewsky refs as examples
of ideas of indeterminate category entering computer programming
theory too.

Whether posted on the basis of a spurious connection or not, thanks
for the Granger HNet paper. That's maybe the most interesting paper
I've seen this year. As I say, it's the only reference I've seen other
than my own presenting the idea that relational categories liberate
category from any given pattern instantiating it. Which I see as
distinct from regression.

The ideas of relational category in that paper might really shift the
needle for current language models.

That as distinct from the older "grammar of mammalian brain capacity"
paper, which I frankly think is likely a dead end.

Real time "energy relaxation" finding new relational categories, as in
the Hamiltonian Net paper, is what I am pushing for. I see current
LLMs as incorporating a lot of that power by accident. But because
they still concentrate on the patterns, and not the relational
generating procedure, they do it only by becoming "large". We need to
understand the (relational) theory behind it in order to jump out of
the current LLM "local minimum".

On Thu, May 23, 2024 at 11:47 PM James Bowery  wrote:
>
>
> On Wed, May 22, 2024 at 10:34 PM Rob Freeman  
> wrote:
>>
>> On Wed, May 22, 2024 at 10:02 PM James Bowery  wrote:
>> > ...
>> > You correctly perceive that the symbolic regression presentation is not to 
>> > the point regarding the HNet paper.  A big failing of the symbolic 
>> > regression world is the same as it is in the rest of computerdom:  Failure 
>> > to recognize that functions are degenerate relations and you had damn well 
>> > better have thought about why you are degenerating when you do so.  But 
>> > likewise, when you are speaking about second-order theories (as opposed to 
>> > first-order theories), such as Category Theory, you had damn well have 
>> > thought about why you are specializing second-order predicate calculus 
>> > when you do so.
>> >
>> > Not being familiar with Category Theory I'm in no position to critique 
>> > this decision to specialize second-order predicate calculus.  I just 
>> > haven't seen Category Theory presented as a second-order theory.  Perhaps 
>> > I could understand Category Theory thence where the enthusiasm for 
>> > Category Theory comes from if someone did so.
>> >
>> > This is very much like my problem with the enthusiasm for type theories in 
>> > general.
>>
>> You seem to have an objection to second order predicate calculus.
>
>
> On the contrary; I see second order predicate calculus as foundational to any 
> attempt to deal with process which, in the classical case, is computation.
>
>> Dismissing category theory because you equate it to that. On what
>> basis do you equate them? Why do you reject second order predicate
>> calculus?
>
>
> I don't "dismiss" category theory.  It's just that I've never seen a category 
> theorist describe it as a second order theory.   Even in type theories 
> covering computation one finds such phenomena as the Wikipedia article on 
> "Type theory as a logic" lacking any reference to "second order".
>
> If I appear to "equate" category theory and second order predicate calculus 
> it is because category theory is a second order theory.  But beyond that, I 
> have an agenda related to Tom Etter's attempt to flesh out his theory of 
> "mind and matter" which I touched on in my first response to this thread 
> about fixing quantum logic.  An aspect of this project is the proof that 
> identity theory belongs to logic in the form of relative identity theory.  My 
> conjecture is that it ends up belonging to second order logic (predicate 
> calculus), which is why I resorted to Isabelle (HOL proof assistant).
>
>> What I like about category theory (as well as quantum formulations) is
>> that I see it as a movement away from definitions in terms of what
>> things are, and towards definitions in terms of how things are
>> related. Which fits with my observations of variation in objects
>> (grammar initially) defying definition, but being accessible to
>> definition in terms of relations.
>
>
> On this we heartily agree.  Why do you think first-order predicate calculus 
> is foundational to Codd's so-called "relational algebra"?  Why do you think 
> that "updates" aka "transactions" aka "atomic actions" are so problematic 
> within that first order theory?
>
>> > But I should also state that my motivation for investigating Granger et 
>> > al's approach to ML is based not the fact that it focuses on abduced 
>> > relations -- but on 

Re: [agi] Can symbolic approach entirely replace NN approach?

2024-05-23 Thread Quan Tesla
Rob

Yes, I understand the difference between a video and a paper. I did not
think John criticized it either. If he had, it's healthy to render
critique. You referenced the paper, and it remains relevant to the topic,
regardless of the medium accessed.

Indeed, I was referring to your decoupling point, Rob. As you are well
aware, technically, there's a significant difference between a relation and
an association. Relation implies functional dependency, whereas the paper
seems to indicate a method for relating an object to itself, and purely by
abstracted value, comparing it to any other object within its universe.
There are a significant number of inherent patterns right there.

I value your point on position-based categories. However, if these
positions are accurately plotted on a spacetime continuum, then we're
looking at a very interesting approach to enabling category theory (and
like you I'm still learning) with accepted, quantum referencing within
specific, topological structures. Some of us think that's exactly how the
known universe achieves equilibrium.

No, I'm not equating 2nd order predicate calculus, or logic to category
theory. My point on "order" relates to the notion of emerging such from the
calculated values of objects (points) relative to other points, within x
space. This possibly may well offer a useful workaround for getting stuck
in Heisenberg's Uncertainty.  Once the positions of data-bearing objects
are known, in a sense it would negate the need to use globalization and
rather move to specifics. In my view, this would provide an environment in
which deabstraction and optimization would functionally fit.

Again, I understand your preference for clarity on relationships, but
relationships change. My view would be more to get a valid and reliable fix
on compound relationships (in the sense of associations), where all
possible changes between objects are accommodated within hierarchies of
control. I think the paper made that point, probably using other terms. A
change in a historical relationship wouldn't necessarily have to mean
destroying all associated historical predictions (statistically calculated).
The approach in the paper seemingly allows for rapid reintegration without
loss of any data. I see pure data objects. Probably, because the robustness
of the system wouldn't get compromised if a point pseudo-randomly moved
from one position to another. Agreed, I also like category theory, but
probably for slightly different reasons to you. I'm biased towards
contexts.

There's more to "language" as I put it, than grammar. The paper did not
mention it as such.

Last, you asked:  "How do you relate their relational encoding to
regression?"
This is an excellent question. I don't quite know how they do their
relational encoding. As for regression, if I understand it correctly, it
relies on functional dependencies to emerge most-probable results, as
implied statistically.  What I think is that, with their way of setting up
each object as its own part/entity/element, they could probably relate
objects statistically by the degree of overall and core similarity. If so,
this would enable a fractal-relational principle for data cohesiveness. I
believe cohesiveness has again become all the rage.

To conclude, my excitement at what the paper contains is not for, or
against any theory. Theories are great. We read and think about them. Heck,
I even have a theory or two. Even so, theory must be tempered by practical
results. I'd put the results in the paper as having significant empirical
value. And exactly for that reason, I could find practical value in it
beyond what was stated. Why not discuss it even further?

I concede that the conversation among you is quite theoretical. Even so,
we see what our eyes see. I see the paradigm shift in specifying a
practical, data approach to systematically converge all of engineering
towards quantum fundamentals. I remain an advocate for quantum engineering
methodologies and practices. It not only gives me hope that the road to
purely-machine-based AGI applications could be shortened significantly. It
also gives me hope for commercialized products, such as "Engineering on a
chip".



On Thu, May 23, 2024 at 7:27 AM Rob Freeman 
wrote:

> On Thu, May 23, 2024 at 10:10 AM Quan Tesla  wrote:
> >
> > The paper is specific to a novel and quantitative approach and method
> for association in general and specifically.
>
> John was talking about the presentation James linked, not the paper,
> Quan. He may be right that in that presentation they use morphisms etc
> to map learned knowledge from one domain to another.
>
> He's not criticising the paper though. Only the presentation. And the
> two were discussing different techniques. John isn't criticising the
> Granger et al. "relational encoding" paper at all.
>
> > The persistence that pattern should be somehow decoupled doesn't make
> much sense to me. Information itself is as a result of pattern. Pattern is
> 

Re: [agi] Can symbolic approach entirely replace NN approach?

2024-05-23 Thread James Bowery
On Wed, May 22, 2024 at 10:34 PM Rob Freeman 
wrote:

> On Wed, May 22, 2024 at 10:02 PM James Bowery  wrote:
> > ...
> > You correctly perceive that the symbolic regression presentation is not
> to the point regarding the HNet paper.  A big failing of the symbolic
> regression world is the same as it is in the rest of computerdom:  Failure
> to recognize that functions are degenerate relations and you had damn well
> better have thought about why you are degenerating when you do so.  But
> likewise, when you are speaking about second-order theories (as opposed to
> first-order theories), such as Category Theory, you had damn well have
> thought about why you are specializing second-order predicate calculus when
> you do so.
> >
> > Not being familiar with Category Theory I'm in no position to critique
> this decision to specialize second-order predicate calculus.  I just
> haven't seen Category Theory presented as a second-order theory.  Perhaps I
> could understand Category Theory thence where the enthusiasm for Category
> Theory comes from if someone did so.
> >
> > This is very much like my problem with the enthusiasm for type theories
> in general.
>
> You seem to have an objection to second order predicate calculus.
>

On the contrary; I see second order predicate calculus as foundational to
any attempt to deal with process which, in the classical case, is
computation.

Dismissing category theory because you equate it to that. On what
> basis do you equate them? Why do you reject second order predicate
> calculus?
>

I don't "dismiss" category theory.  It's just that I've never seen a
category theorist describe it as a second order theory.   Even in type
theories covering computation one finds such phenomena as the Wikipedia
article on "Type theory as a logic" lacking any reference to "second order".

If I appear to "equate" category theory and second order predicate calculus
it is because category theory is a second order theory.  But beyond
that, I have an agenda related to Tom Etter's attempt to flesh out his
theory of "mind and matter" which I touched on in my first response to this
thread about fixing quantum logic.

An aspect of this project is the proof that identity theory belongs to
logic in the form of relative identity theory.
My conjecture is that it ends up belonging to second order logic (predicate
calculus), which is why I resorted to Isabelle (HOL proof assistant).

What I like about category theory (as well as quantum formulations) is
> that I see it as a movement away from definitions in terms of what
> things are, and towards definitions in terms of how things are
> related. Which fits with my observations of variation in objects
> (grammar initially) defying definition, but being accessible to
> definition in terms of relations.
>

On this we heartily agree.  Why do you think first-order predicate calculus
is foundational to Codd's so-called "relational algebra"?  Why do you think
that "updates" aka "transactions" aka "atomic actions" are so problematic
within that *first* order theory?

> But I should also state that my motivation for investigating Granger et
> al's approach to ML is based not the fact that it focuses on abduced
> relations -- but on its basis in "The grammar of mammalian brain capacity"
> being a neglected order of grammar in the Chomsky Hierarchy: High Order
> Push Down Automata.  The fact that the HNet paper is about abduced
> relations was one of those serendipities that the prospector in me sees as
> a of gold in them thar HOPDAs.
>
> Where does the Granger Hamiltonian net paper mention "The grammar of
> mammalian brain capacity"? If it's not mentioned, how do you think
> they imply it?
>

My apologies for not providing the link to the paper by Granger and
Rodriguez:

https://arxiv.org/abs/1612.01150

> To wrap up, your definition of "regression" seems to differ from mine in
> the sense that, to me, "regression" is synonymous with data-driven modeling
> which is that aspect of learning, including machine learning, concerned
> with what IS as opposed to what OUGHT to be the case.
>
> The only time that paper mentions regression seems to indicate that
> they are also making a distinction between their relational encoding
> and regression:
>
> 'LLMs ... introduce sequential information supplementing the standard
> classification-based “isa” relation, although much of the information
> is learned via regression, and remains difficult to inspect or
> explain'
>
> How do you relate their relational encoding to regression?


Consider the 

Re: [agi] Can symbolic approach entirely replace NN approach?

2024-05-22 Thread Rob Freeman
On Wed, May 22, 2024 at 10:02 PM James Bowery  wrote:
> ...
> You correctly perceive that the symbolic regression presentation is not to 
> the point regarding the HNet paper.  A big failing of the symbolic regression 
> world is the same as it is in the rest of computerdom:  Failure to recognize 
> that functions are degenerate relations and you had damn well better have 
> thought about why you are degenerating when you do so.  But likewise, when 
> you are speaking about second-order theories (as opposed to first-order 
> theories), such as Category Theory, you had damn well have thought about why 
> you are specializing second-order predicate calculus when you do so.
>
> Not being familiar with Category Theory I'm in no position to critique this 
> decision to specialize second-order predicate calculus.  I just haven't seen 
> Category Theory presented as a second-order theory.  Perhaps I could 
> understand Category Theory thence where the enthusiasm for Category Theory 
> comes from if someone did so.
>
> This is very much like my problem with the enthusiasm for type theories in 
> general.

You seem to have an objection to second order predicate calculus.
Dismissing category theory because you equate it to that. On what
basis do you equate them? Why do you reject second order predicate
calculus?

What I like about category theory (as well as quantum formulations) is
that I see it as a movement away from definitions in terms of what
things are, and towards definitions in terms of how things are
related. Which fits with my observations of variation in objects
(grammar initially) defying definition, but being accessible to
definition in terms of relations.

> But I should also state that my motivation for investigating Granger et al's 
> approach to ML is based not the fact that it focuses on abduced relations -- 
> but on its basis in "The grammar of mammalian brain capacity" being a 
> neglected order of grammar in the Chomsky Hierarchy: High Order Push Down 
> Automata.  The fact that the HNet paper is about abduced relations was one of 
> those serendipities that the prospector in me sees as a of gold in them thar 
> HOPDAs.

Where does the Granger Hamiltonian net paper mention "The grammar of
mammalian brain capacity"? If it's not mentioned, how do you think
they imply it?

> To wrap up, your definition of "regression" seems to differ from mine in the 
> sense that, to me, "regression" is synonymous with data-driven modeling which 
> is that aspect of learning, including machine learning, concerned with what 
> IS as opposed to what OUGHT to be the case.

The only time that paper mentions regression seems to indicate that
they are also making a distinction between their relational encoding
and regression:

'LLMs ... introduce sequential information supplementing the standard
classification-based “isa” relation, although much of the information
is learned via regression, and remains difficult to inspect or
explain'

How do you relate their relational encoding to regression?

--
Artificial General Intelligence List: AGI
Permalink: 
https://agi.topicbox.com/groups/agi/T682a307a763c1ced-M2f9210fa34834e5bb8e46d0c
Delivery options: https://agi.topicbox.com/groups/agi/subscription


Re: [agi] Can symbolic approach entirely replace NN approach?

2024-05-22 Thread Rob Freeman
On Thu, May 23, 2024 at 10:10 AM Quan Tesla  wrote:
>
> The paper is specific to a novel and quantitative approach and method for 
> association in general and specifically.

John was talking about the presentation James linked, not the paper,
Quan. He may be right that in that presentation they use morphisms etc
to map learned knowledge from one domain to another.

He's not criticising the paper though. Only the presentation. And the
two were discussing different techniques. John isn't criticising the
Granger et al. "relational encoding" paper at all.

> The persistence that pattern should be somehow decoupled doesn't make much 
> sense to me. Information itself is as a result of pattern. Pattern is 
> everything. Light itself is a pattern, so are the four forces. Ergo.  I 
> suppose, it depends on how you view it.

If you're questioning my point, it is that definition in terms of
relations means the pattern can vary. It's like the gap filler example
in the paper:

"If John kissed Mary, Bill kissed Mary, and Hal kissed Mary, etc.,
then a novel category ¢X can be abduced such that ¢X kissed Mary.
Importantly, the new entity ¢X is not a category based on the features
of the members of the category, let alone the similarity of such
features. I.e., it is not a statistical cluster in any usual sense.
Rather, it is a “position-based category,” signifying entities that
stand in a fixed relation with other entities. John, Bill, Hal may not
resemble each other in any way, other than being entities that all
kissed Mary. Position based categories (PBCs) thus fundamentally
differ from “isa” categories, which can be similarity-based (in
unsupervised systems) or outcome-based (in supervised systems)."

If you define your category on the basis of kissing Mary, then who's
to say that you might not find other people who have kissed Mary, and
change your category from moment to moment. As you discovered clusters
of former lovers by fits and starts, the actual pattern of your
"category" might change dramatically. But it would still be defined by
its defining relation of having kissed Mary.

That might also talk to the "regression" distinction. Or
characterizing the system, or indeed all cognition, as "learning"
period. It elides both "similarity-based" unsupervised, and
supervised, "learning". The category can in fact grow as you "learn"
of new lovers. A process which I also have difficulty equating with
regression.
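
A toy sketch of the kissed-Mary point, with invented facts and names;
a plain set stands in here for whatever mechanism actually implements
the position-based category, so this is only an illustration:

# Toy position-based category: membership is defined by standing in a
# fixed relation (kissed, Mary), not by feature similarity among members.

facts = {("John", "kissed", "Mary"),
         ("Bill", "kissed", "Mary"),
         ("Hal",  "kissed", "Mary"),
         ("Hal",  "drove",  "a truck")}

def position_based_category(facts, relation, obj):
    """Abduce ¢X = {x : (x, relation, obj) is observed}."""
    return {s for (s, r, o) in facts if r == relation and o == obj}

print(position_based_category(facts, "kissed", "Mary"))
# -> John, Bill, Hal (in some order): membership comes from the relation,
# not from any resemblance among John, Bill and Hal

facts.add(("Sue", "kissed", "Mary"))
print(position_based_category(facts, "kissed", "Mary"))
# the category is re-derived from its defining relation, so it grows the
# moment a new fact arrives, without retraining any feature-based model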

--
Artificial General Intelligence List: AGI
Permalink: 
https://agi.topicbox.com/groups/agi/T682a307a763c1ced-M8c58bf8eb0a279da79ea34eb
Delivery options: https://agi.topicbox.com/groups/agi/subscription


Re: [agi] Can symbolic approach entirely replace NN approach?

2024-05-22 Thread Quan Tesla
The paper is specific to a novel and quantitative approach and method for
association in general and specifically.

It emerges possible and statistical (most correct) relationships. This
stands in stark contrast to the deterministic commitment to construct
functional relationships. Hence, a polymorphic feature is enabled.
Constructors are implied in the design. How else?

 Further, this paper opens the door to computational entanglement and auto
optimization. Mathematically, a control hierarchy could relatively simply
be set at any order of logic. Thus, it has a scalar feature (deabstraction
readiness).

Rather than  classification dependent, categorization may become fully
enabled. Probably, information and semantics would be contextually enabled
across any number of universes. This paper doesn't venture into all the
implications, which in all fairness justifies scepticism.

The persistence that pattern should be somehow decoupled doesn't make much
sense to me. Information itself is as a result of pattern. Pattern is
everything. Light itself is a pattern, so are the four forces. Ergo.  I
suppose, it depends on how you view it.

Here, we have a "language" in which to emerge and initiate any pattern, to
bring form2function2form (circular, yet  progressive chain reactions). I
think it qualifies the design as having the potential to become fully
recursive. We'll have to wait and see.

For now, I'll contend that 'Design' (as pattern application/architectural
principles) remains key.







On Wed, May 22, 2024, 15:01 John Rose  wrote:

> On Tuesday, May 21, 2024, at 10:34 PM, Rob Freeman wrote:
>
> Unless I've missed something in that presentation. Is there anywhere in
> the hour long presentation where they address a decoupling of category from
> pattern, and the implications of this for novelty of structure?
>
>
> I didn’t watch the video but isn’t this just morphisms and functors so you
> can map ML between knowledge domains? Some may need to be fuzzy and the
> best structure I’ve found is Smarandache’s neutrosophic... So a generalized
> intelligence will manage sets of various morphisms across N domains. For
> example, if an AI that knows how to drive a car attempts to build a
> birdhouse it takes a small subset of morphisms between the two but grows
> more towards the birdhouse. As it attempts to build the birdhouse there
> actually may be some morphismic structure that apply to driving a car but
> most will be utilized and grow one way… N morphisms for example epi, mono,
> homo, homeo, endo, auto, zero, etc. and most obvious iso. Another mapping
> from car driving to motorcycle driving would have more utilizable
> morphisms… like steering wheel to handlebars… there is some symmetry
> mapping between group operations but they are not full iso. The pattern
> recognition is morphism recognition and novelty is created from
> mathematical structure manipulation across knowledge domains. This works
> very well when building new molecules since there are tight, almost
> lossless IOW iso morphismic relationships.
>

--
Artificial General Intelligence List: AGI
Permalink: 
https://agi.topicbox.com/groups/agi/T682a307a763c1ced-M34b70925a493b96ae5ccdf6f
Delivery options: https://agi.topicbox.com/groups/agi/subscription


Re: [agi] Can symbolic approach entirely replace NN approach?

2024-05-22 Thread James Bowery
On Tue, May 21, 2024 at 9:35 PM Rob Freeman 
wrote:

> 
>
> Whereas the NN presentation is talking about NNs regressing to fixed
> encodings. Not about an operator which "calculates energies" in real
> time.
>
> Unless I've missed something in that presentation. Is there anywhere
> in the hour long presentation where they address a decoupling of
> category from pattern, and the implications of this for novelty of
> structure?
>

You correctly perceive that the symbolic regression presentation is not to
the point regarding the HNet paper.  A big failing of the symbolic
regression world is the same as it is in the rest of computerdom:  Failure
to recognize that functions are degenerate relations and you had damn well
better have thought about why you are degenerating when you do so.  But
likewise, when you are speaking about second-order theories (as
opposed to first-order theories),
such as Category Theory, you had damn well have thought about why you are
*specializing* second-order predicate calculus when you do so.

Not being familiar with Category Theory I'm in no position to critique this
decision to specialize second-order predicate calculus.  I just haven't
seen Category Theory presented *as* a second-order theory.  Perhaps I could
understand Category Theory thence where the enthusiasm for Category Theory
comes from if someone did so.

This is very much like my problem with the enthusiasm for type theories in
general.

But I should also state that my motivation for investigating Granger et
al's approach to ML is based *not* the fact that it focuses on abduced
*relations* -- but on its basis in "The grammar of mammalian brain
capacity" being a neglected order of grammar in the Chomsky Hierarchy: High
Order Push Down Automata.  The fact that the HNet paper is about abduced
*relations* was one of those serendipities that the prospector in me sees
as a of gold in them thar HOPDAs.

To wrap up, your definition of "regression" seems to differ from mine in
the sense that, to me, "regression" is synonymous with data-driven modeling
which is that aspect of learning, including machine learning, concerned
with what IS as opposed to what OUGHT to be the case.


>
> On Tue, May 21, 2024 at 11:36 PM James Bowery  wrote:
> >
> > Symbolic Regression is starting to catch on but, as usual, people aren't
> using the Algorithmic Information Criterion so they end up with
> unprincipled choices on the Pareto frontier between residuals and model
> complexity if not unprincipled choices about how to weight the complexity
> of various "nodes" in the model's "expression".
> >
> > https://youtu.be/fk2r8y5TfNY
> >
> > A node's complexity is how much machine language code it takes to
> implement it on a CPU-only implementation.  Error residuals are program
> literals aka "constants".
> >
> > I don't know how many times I'm going to have to point this out to
> people before it gets through to them (probably well beyond the time
> maggots have forgotten what I tasted like) .

--
Artificial General Intelligence List: AGI
Permalink: 
https://agi.topicbox.com/groups/agi/T682a307a763c1ced-Mac2ae2959e680fe509d66197
Delivery options: https://agi.topicbox.com/groups/agi/subscription


Re: [agi] Can symbolic approach entirely replace NN approach?

2024-05-22 Thread John Rose
On Tuesday, May 21, 2024, at 10:34 PM, Rob Freeman wrote:
> Unless I've missed something in that presentation. Is there anywhere
in the hour long presentation where they address a decoupling of
category from pattern, and the implications of this for novelty of
structure?

I didn’t watch the video but isn’t this just morphisms and functors so you can
map ML between knowledge domains? Some may need to be fuzzy and the best
structure I’ve found is Smarandache’s neutrosophic... So a generalized
intelligence will manage sets of various morphisms across N domains. For 
example, if an AI that knows how to drive a car attempts to build a birdhouse 
it takes a small subset of morphisms between the two but grows more towards the 
birdhouse. As it attempts to build the birdhouse there actually may be some 
morphismic structure that apply to driving a car but most will be utilized and 
grow one way… N morphisms for example epi, mono, homo, homeo, endo, auto, zero, 
etc. and most obvious iso. Another mapping from car driving to motorcycle 
driving would have more utilizable morphisms… like steering wheel to 
handlebars… there is some symmetry mapping between group operations but they 
are not full iso. The pattern recognition is morphism recognition and novelty 
is created from mathematical structure manipulation across knowledge domains. 
This works very well when building new molecules since there are tight, almost 
lossless IOW iso morphismic relationships.
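
One way to caricature that in code, with made-up domain tables; a real
treatment would check that composition and structure are preserved, not
just that role names overlap, so this is only an illustration:

# Toy partial mapping between two knowledge domains, caricaturing
# "morphism recognition": only structure shared by both domains maps over.

car = {"steer": "steering wheel", "accelerate": "pedal",
       "brake": "pedal", "shelter": "cabin"}
motorcycle = {"steer": "handlebars", "accelerate": "throttle grip",
              "brake": "lever"}
birdhouse = {"fasten": "nails", "cut": "saw"}

def partial_morphism(src, dst):
    """Map each role present in both domains; unmatched roles don't carry over."""
    return {role: (src[role], dst[role]) for role in src.keys() & dst.keys()}

print(partial_morphism(car, motorcycle))  # most roles map: close to iso
print(partial_morphism(car, birdhouse))   # nothing maps: the mapping must grow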

--
Artificial General Intelligence List: AGI
Permalink: 
https://agi.topicbox.com/groups/agi/T682a307a763c1ced-Me455a509be8e5e3671c3b5e0
Delivery options: https://agi.topicbox.com/groups/agi/subscription


Re: [agi] Can symbolic approach entirely replace NN approach?

2024-05-21 Thread Quan Tesla
Thanks for sharing this paper.

Positively brilliant! I think this is in-line with quantum thinking and
holds great promise for quantum computing. It relates to a concept advanced
by myself and my mentor, namely, gestalt management. Penultimately, we
endeavor to most-correctly represent relativistic, multiversal realities.

This work increases the probability of success of significant value to my
Po1 theory. The day would come, where emergent requirements for locating
"needles in data haystacks", near instantaneously, would place an
unrelenting demand on these types of networks. I think this type of
architecture - when fully matured - would be perfectly suited for that.

On Wed, May 22, 2024 at 6:35 AM Rob Freeman 
wrote:

> James,
>
> The Hamiltonian paper was nice for identifying gap filler tasks as
> decoupling meaning from pattern: "not a category based on the features
> of the members of the category, let alone the similarity of such
> features".
>
> Here, for anyone else:
>
> A logical re-conception of neural networks: Hamiltonian bitwise
> part-whole architecture
> E. F. W. Bowen, R. Granger, A. Rodriguez
> https://openreview.net/pdf?id=hP4dxXvvNc8
>
> "Part-whole architecture". A new thing. Though they 'share some
> characteristics with “embeddings” in transformer architectures'.
>
> So it's a possible alternate reason for the surprise success of
> transformers. That's good. The field blunders about surprising itself.
> But there's no theory behind it. Transformers just stumbled into
> embedding representations because they looked at language. We need to
> start thinking about why these things work. Instead of just blithely
> talking about the miracle of more data. Disingenuously scaring the
> world with idiotic fears about "more data" becoming conscious by
> accident. Or insisting like LeCun that the secret is different data.
>
> But I think you're missing the point of that Hamiltonian paper if you
> think this decoupling of meaning from pattern is regression. I think
> the point of this, and also the category theoretic representations of
> Symbolica, and also quantum mechanical formalizations, is
> indeterminate symbolization, even novelty.
>
> Yeah, maybe regression will work for some things. But that ain't
> language. And it ain't cognition. They are more aligned with a
> different "New Kind of Science", that touted by Wolfram, new
> structure, all the time. Not regression, going backward, but novelty,
> creativity.
>
> In my understanding the point with the Hamiltonian paper is that a
> "position-based encoding" decouples meaning from any given pattern
> which instantiates it.
>
> Whereas the NN presentation is talking about NNs regressing to fixed
> encodings. Not about an operator which "calculates energies" in real
> time.
>
> Unless I've missed something in that presentation. Is there anywhere
> in the hour long presentation where they address a decoupling of
> category from pattern, and the implications of this for novelty of
> structure?
>
> On Tue, May 21, 2024 at 11:36 PM James Bowery  wrote:
> >
> > Symbolic Regression is starting to catch on but, as usual, people aren't
> using the Algorithmic Information Criterion so they end up with
> unprincipled choices on the Pareto frontier between residuals and model
> complexity if not unprincipled choices about how to weight the complexity
> of various "nodes" in the model's "expression".
> >
> > https://youtu.be/fk2r8y5TfNY
> >
> > A node's complexity is how much machine language code it takes to
> implement it on a CPU-only implementation.  Error residuals are program
> literals aka "constants".
> >
> > I don't know how many times I'm going to have to point this out to
> people before it gets through to them (probably well beyond the time
> maggots have forgotten what I tasted like) .

--
Artificial General Intelligence List: AGI
Permalink: 
https://agi.topicbox.com/groups/agi/T682a307a763c1ced-M24e8a3387f6852d9e8287be3
Delivery options: https://agi.topicbox.com/groups/agi/subscription


Re: [agi] Can symbolic approach entirely replace NN approach?

2024-05-21 Thread Rob Freeman
James,

The Hamiltonian paper was nice for identifying gap filler tasks as
decoupling meaning from pattern: "not a category based on the features
of the members of the category, let alone the similarity of such
features".

Here, for anyone else:

A logical re-conception of neural networks: Hamiltonian bitwise
part-whole architecture
E. F. W. Bowen, R. Granger, A. Rodriguez
https://openreview.net/pdf?id=hP4dxXvvNc8

"Part-whole architecture". A new thing. Though they 'share some
characteristics with “embeddings” in transformer architectures'.

So it's a possible alternate reason for the surprise success of
transformers. That's good. The field blunders about surprising itself.
But there's no theory behind it. Transformers just stumbled into
embedding representations because they looked at language. We need to
start thinking about why these things work. Instead of just blithely
talking about the miracle of more data. Disingenuously scaring the
world with idiotic fears about "more data" becoming conscious by
accident. Or insisting like LeCun that the secret is different data.

But I think you're missing the point of that Hamiltonian paper if you
think this decoupling of meaning from pattern is regression. I think
the point of this, and also the category theoretic representations of
Symbolica, and also quantum mechanical formalizations, is
indeterminate symbolization, even novelty.

Yeah, maybe regression will work for some things. But that ain't
language. And it ain't cognition. They are more aligned with a
different "New Kind of Science", that touted by Wolfram, new
structure, all the time. Not regression, going backward, but novelty,
creativity.

In my understanding the point with the Hamiltonian paper is that a
"position-based encoding" decouples meaning from any given pattern
which instantiates it.

Whereas the NN presentation is talking about NNs regressing to fixed
encodings. Not about an operator which "calculates energies" in real
time.

Unless I've missed something in that presentation. Is there anywhere
in the hour long presentation where they address a decoupling of
category from pattern, and the implications of this for novelty of
structure?

On Tue, May 21, 2024 at 11:36 PM James Bowery  wrote:
>
> Symbolic Regression is starting to catch on but, as usual, people aren't 
> using the Algorithmic Information Criterion so they end up with unprincipled 
> choices on the Pareto frontier between residuals and model complexity if not 
> unprincipled choices about how to weight the complexity of various "nodes" in 
> the model's "expression".
>
> https://youtu.be/fk2r8y5TfNY
>
> A node's complexity is how much machine language code it takes to implement 
> it on a CPU-only implementation.  Error residuals are program literals aka 
> "constants".
>
> I don't know how many times I'm going to have to point this out to people 
> before it gets through to them (probably well beyond the time maggots have 
> forgotten what I tasted like) .

--
Artificial General Intelligence List: AGI
Permalink: 
https://agi.topicbox.com/groups/agi/T682a307a763c1ced-M8418e9bd5e49f7ca08dfb816
Delivery options: https://agi.topicbox.com/groups/agi/subscription


Re: [agi] Can symbolic approach entirely replace NN approach?

2024-05-21 Thread James Bowery
Symbolic Regression is starting to catch on but, as usual, people aren't
using the Algorithmic Information Criterion so they end up with
unprincipled choices on the Pareto frontier between residuals and model
complexity if not unprincipled choices about how to weight the complexity
of various "nodes" in the model's "expression".

https://youtu.be/fk2r8y5TfNY

A node's complexity is how much machine language code it takes to implement
it on a CPU-only implementation.  Error residuals are program literals aka
"constants".

I don't know how many times I'm going to have to point this out to people
before it gets through to them (probably well beyond the time maggots have
forgotten what I tasted like) .
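
A rough sketch of the selection rule described above, with the length
of the expression text standing in for machine-language code length (a
crude proxy, assumed here only for illustration) and a simplified code
for the residual literals:

# Rough sketch of an Algorithmic-Information-style criterion for symbolic
# regression: total cost = bits for the model "code" + bits for the
# residuals stored as literal constants.

import math

xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.0, 8.2, 9.9]                 # roughly y = 2x

candidates = {
    "2*x":            lambda x: 2 * x,
    "2*x + 0.05*x*x": lambda x: 2 * x + 0.05 * x * x,
}

def residual_bits(r, precision=0.1):
    # bits to write one residual down as a literal, to a fixed precision
    return math.log2(abs(r) / precision + 1) + 1

def total_cost(expr, f):
    model_bits = 8 * len(expr)                 # proxy: bytes of the expression
    data_bits = sum(residual_bits(y - f(x)) for x, y in zip(xs, ys))
    return model_bits + data_bits

for expr, f in candidates.items():
    print(expr, round(total_cost(expr, f), 1))
# the preferred model minimises the single total, trading residual accuracy
# against model complexity instead of leaving the Pareto choice unprincipled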

On Mon, May 20, 2024 at 10:23 PM Rob Freeman 
wrote:

> "Importantly, the new entity ¢X is not a category based on the
> features of the members of the category, let alone the similarity of
> such features"
>
> Oh, nice. I hadn't seen anyone else making that point. This paper 2023?
>
> That's what I was saying. Nice. A vindication. Such categories
> decouple the pattern itself from the category.
>
> But I'm astonished they don't cite Coecke, as the obvious quantum
> formulation precedent (though I noticed it for language in the '90s.)
>
> I wonder how their formulation relates to what Symbolica are doing
> with their category theoretic formulations:
>
> https://youtu.be/rie-9AEhYdY?si=9RUB3O_8WeFSU3ni
>
> I haven't read closely enough to know if they make that decoupling of
> category from pattern a sense for "creativity" the way I'm suggesting.
> Perhaps that's because a Hamiltonian formulation is still too trapped
> in symbolism. We need to remain trapped in the symbolism for physics.
> Because for physics we don't have access to an underlying reality.
> That's where AI, and particularly language, has an advantage. Because,
> especially for language, the underlying reality of text is the only
> reality we do have access to (though Chomsky tried to swap that
> around, and insist we only access our cognitive insight.)
>
> For AI, and especially for language, we have the opportunity to get
> under even a quantum formalism. It will be there implicitly, but
> instead of laboriously formulating it, and then collapsing it at run
> time, we can simply "collapse" structure directly from observation.
> But that "collapse" must be flexible, and allow different structures
> to arise from different symmetries found in the data from moment to
> moment. So it requires the abandonment of back-prop.
>
> In theory it is easy though. Everything can remain much as it is for
> LLMs. Only, instead of trying to "learn" stable patterns using
> back-prop, we must "collapse" different symmetries in the data in
> response to a different "prompt", at run time.
>
> On Tue, May 21, 2024 at 5:01 AM James Bowery  wrote:
> >
> > From A logical re-conception of neural networks: Hamiltonian bitwise
> part-whole architecture
> >> From hierarchical statistics to abduced symbols
> >> It is perhaps useful to envision some of the ongoing devel-
> >> opments that are arising from enlarging and elaborating the
> >> Hamiltonian logic net architecture. As yet, no large-scale
> >> training whatsoever has gone into the present minimal HNet
> >> model; thus far it is solely implemented at a small, introduc-
> >> tory scale, as an experimental new approach to representa-
> >> tions. It is conjectured that with large-scale training, hierar-
> >> chical constructs would be accreted as in large deep network
> >> systems, with the key difference that, in HNets, such con-
> >> structs would have relational properties beyond the “isa”
> >> (category) relation, as discussed earlier.
> >> Such relational representations lend themselves to abduc-
> >> tive steps (McDermott 1987) (or “retroductive” (Pierce
> >> 1883)); i.e., inferential generalization steps that go beyond
> >> warranted statistical information. If John kissed Mary, Bill
> >> kissed Mary, and Hal kissed Mary, etc., then a novel cate-
> >> gory ¢X can be abduced such that ¢X kissed Mary.
> >> Importantly, the new entity ¢X is not a category based on
> >> the features of the members of the category, let alone the
> >> similarity of such features. I.e., it is not a statistical cluster
> >> in any usual sense. Rather, it is a “position-based category,”
> >> signifying entities that stand in a fixed relation with other
> >> entities. John, Bill, Hal may not resemble each other in any
> >> way, other than being entities that all kissed Mary. Position-
> >> based categories (PBCs) thus fundamentally differ from
> >> “isa” categories, which can be similarity-based (in unsuper-
> >> vised systems) or outcome-based (in supervised systems).
> >> PBCs share some characteristics with “embeddings” in
> >> transformer architectures.
> >> Abducing a category of this kind often entails overgener-
> >> alization, and subsequent learning may require learned ex-
> >> ceptions to the overgeneralization. (Verb past 

Re: [agi] Can symbolic approach entirely replace NN approach?

2024-05-20 Thread Rob Freeman
"Importantly, the new entity ¢X is not a category based on the
features of the members of the category, let alone the similarity of
such features"

Oh, nice. I hadn't seen anyone else making that point. This paper 2023?

That's what I was saying. Nice. A vindication. Such categories
decouple the pattern itself from the category.

But I'm astonished they don't cite Coecke, as the obvious quantum
formulation precedent (though I noticed it for language in the '90s.)

I wonder how their formulation relates to what Symbolica are doing
with their category theoretic formulations:

https://youtu.be/rie-9AEhYdY?si=9RUB3O_8WeFSU3ni

I haven't read closely enough to know if they make that decoupling of
category from pattern a sense for "creativity" the way I'm suggesting.
Perhaps that's because a Hamiltonian formulation is still too trapped
in symbolism. We need to remain trapped in the symbolism for physics.
Because for physics we don't have access to an underlying reality.
That's where AI, and particularly language, has an advantage. Because,
especially for language, the underlying reality of text is the only
reality we do have access to (though Chomsky tried to swap that
around, and insist we only access our cognitive insight.)

For AI, and especially for language, we have the opportunity to get
under even a quantum formalism. It will be there implicitly, but
instead of laboriously formulating it, and then collapsing it at run
time, we can simply "collapse" structure directly from observation.
But that "collapse" must be flexible, and allow different structures
to arise from different symmetries found in the data from moment to
moment. So it requires the abandonment of back-prop.

In theory it is easy though. Everything can remain much as it is for
LLMs. Only, instead of trying to "learn" stable patterns using
back-prop, we must "collapse" different symmetries in the data in
response to a different "prompt", at run time.
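
One possible toy reading of that last step, on an invented mini-corpus,
with "symmetry" reduced to nothing more than shared adjacent-word
contexts; this is only an illustration, not the LLM machinery and not
anyone's published method:

# Toy of "collapse different symmetries at run time": given a prompt word,
# group words by shared contexts on the fly, instead of relying on a fixed
# embedding learned by back-prop.

corpus = (". john kissed mary . bill kissed mary . hal kissed mary"
          " . mary drove home . john drove home .").split()

def contexts(word):
    """The set of (previous, next) pairs the word occurs between."""
    return {(corpus[i - 1], corpus[i + 1])
            for i in range(1, len(corpus) - 1) if corpus[i] == word}

def collapse(prompt_word):
    """Ad-hoc category: words sharing at least one context with the prompt."""
    target = contexts(prompt_word)
    return {w for w in set(corpus)
            if w != prompt_word and contexts(w) & target}

print(collapse("john"))   # bill, hal, mary on this toy corpus (in some order)
# note "mary" joins via ". mary drove": the grouping is re-derived per
# prompt and per corpus, it is not a stored, back-propagated pattern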

On Tue, May 21, 2024 at 5:01 AM James Bowery  wrote:
>
> From A logical re-conception of neural networks: Hamiltonian bitwise 
> part-whole architecture
>
>> From hierarchical statistics to abduced symbols
>> It is perhaps useful to envision some of the ongoing devel-
>> opments that are arising from enlarging and elaborating the
>> Hamiltonian logic net architecture. As yet, no large-scale
>> training whatsoever has gone into the present minimal HNet
>> model; thus far it is solely implemented at a small, introduc-
>> tory scale, as an experimental new approach to representa-
>> tions. It is conjectured that with large-scale training, hierar-
>> chical constructs would be accreted as in large deep network
>> systems, with the key difference that, in HNets, such con-
>> structs would have relational properties beyond the “isa”
>> (category) relation, as discussed earlier.
>> Such relational representations lend themselves to abduc-
>> tive steps (McDermott 1987) (or “retroductive” (Pierce
>> 1883)); i.e., inferential generalization steps that go beyond
>> warranted statistical information. If John kissed Mary, Bill
>> kissed Mary, and Hal kissed Mary, etc., then a novel cate-
>> gory ¢X can be abduced such that ¢X kissed Mary.
>> Importantly, the new entity ¢X is not a category based on
>> the features of the members of the category, let alone the
>> similarity of such features. I.e., it is not a statistical cluster
>> in any usual sense. Rather, it is a “position-based category,”
>> signifying entities that stand in a fixed relation with other
>> entities. John, Bill, Hal may not resemble each other in any
>> way, other than being entities that all kissed Mary. Position-
>> based categories (PBCs) thus fundamentally differ from
>> “isa” categories, which can be similarity-based (in unsuper-
>> vised systems) or outcome-based (in supervised systems).
>> PBCs share some characteristics with “embeddings” in
>> transformer architectures.
>> Abducing a category of this kind often entails overgener-
>> alization, and subsequent learning may require learned ex-
>> ceptions to the overgeneralization. (Verb past tenses typi-
>> cally are formed by appending “-ed”, and a language learner
>> may initially overgeneralize to “runned” and “gived,” neces-
>> sitating subsequent exception learning of “ran” and “gave”.)
>
>
> The abduced "category" ¢X bears some resemblance to the way Currying (as in 
> combinator calculus) binds a parameter of a symbol to define a new symbol.  
> In practice it only makes sense to bother creating this new symbol if it, in 
> concert with all other symbols, compresses the data in evidence.  (As for 
> "overgeneralization", that applies to any error in prediction encountered 
> during learning and, in the ideal compressor, increases the algorithm's 
> length even if only by appending the exceptional data in a conditional -- NOT 
> "falsifying" anything as would that rascal Popper).
>
> This is "related" to quantum-logic in the sense that Tom Etter calls out in 
> the linked 

Re: [agi] Can symbolic approach entirely replace NN approach?

2024-05-20 Thread keghnfeem
Tokens inside transformers are supervised internal symbols.
--
Artificial General Intelligence List: AGI
Permalink: 
https://agi.topicbox.com/groups/agi/T682a307a763c1ced-M102516027fd65ca8c1f90b8b
Delivery options: https://agi.topicbox.com/groups/agi/subscription


Re: [agi] Can symbolic approach entirely replace NN approach?

2024-05-20 Thread James Bowery
From
*A logical re-conception of neural networks: Hamiltonian bitwise part-whole
architecture*


> *From hierarchical statistics to abduced symbols* It is perhaps useful to
> envision some of the ongoing devel-
> opments that are arising from enlarging and elaborating the
> Hamiltonian logic net architecture. As yet, no large-scale
> training whatsoever has gone into the present minimal HNet
> model; thus far it is solely implemented at a small, introduc-
> tory scale, as an experimental new approach to representa-
> tions. It is conjectured that with large-scale training, hierar-
> chical constructs would be accreted as in large deep network
> systems, with
> *the key difference that, in HNets, such constructs would have relational
> properties* beyond the “isa” (category) relation, as discussed earlier.
> Such relational representations lend themselves to abduc-
> tive steps (McDermott 1987) (or “retroductive” (Pierce
> 1883)); i.e., inferential generalization steps that go beyond
> warranted statistical information. If John kissed Mary, Bill
> kissed Mary, and Hal kissed Mary, etc., then a novel cate-
> gory ¢X can be abduced such that ¢X kissed Mary.
> Importantly, the new entity ¢X is not a category based on
> the features of the members of the category, let alone the
> similarity of such features. I.e., it is not a statistical cluster
> in any usual sense. Rather, it is a “position-based category,”
> signifying entities that stand in a fixed relation with other
> entities. John, Bill, Hal may not resemble each other in any
> way, other than being entities that all kissed Mary. Position-
> based categories (PBCs) thus fundamentally differ from
> “isa” categories, which can be similarity-based (in unsuper-
> vised systems) or outcome-based (in supervised systems).
> PBCs share some characteristics with “embeddings” in
> transformer architectures.
> Abducing a category of this kind often entails overgener-
> alization, and subsequent learning may require learned ex-
> ceptions to the overgeneralization. (Verb past tenses typi-
> cally are formed by appending “-ed”, and a language learner
> may initially overgeneralize to “runned” and “gived,” neces-
> sitating subsequent exception learning of “ran” and “gave”.)


The abduced "category" ¢X bears some resemblance to the way Currying
(as in combinator calculus) binds a
parameter of a symbol to define a new symbol.  In practice it only makes
sense to bother creating this new symbol if it, in concert with all other
symbols, compresses the data in evidence.  (As for "overgeneralization",
that applies to any error in prediction encountered during learning and, in
the ideal compressor, increases the algorithm's length even if only by
appending the exceptional data in a conditional -- *NOT* "falsifying"
anything as would that rascal Popper).

This is "related" to quantum-logic in the sense that Tom Etter calls out in
the linked presentation:

Digram box linking, which is based on the *mathematics of relations
> rather than of functions*, is a more general operation than the
> composition of transition matrices.
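
A toy of the Currying analogy above, with token counting standing in
for algorithmic information (a crude stand-in, for illustration only):

# Bind one argument of "kissed" to get a new one-place symbol, and keep it
# only if it shortens the description of the evidence.

facts = [("John", "kissed", "Mary"),
         ("Bill", "kissed", "Mary"),
         ("Hal",  "kissed", "Mary")]

def kissed(x, y):
    return (x, "kissed", y)

def kissed_mary(x):             # the curried, abduced symbol: "¢X kissed Mary"
    return kissed(x, "Mary")

# Without the new symbol every fact spells out three tokens; with it we pay
# three tokens once for the definition and one token per fact thereafter.
cost_without = 3 * len(facts)
cost_with = 3 + 1 * len(facts)
print(cost_without, cost_with)  # 9 vs 6: abducing the symbol compresses
print(kissed_mary("John"))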


On Thu, May 16, 2024 at 7:24 PM James Bowery  wrote:

> First, fix quantum logic:
>
>
> https://web.archive.org/web/20061030044246/http://www.boundaryinstitute.org/articles/Dynamical_Markov.pdf
>
> Then realize that empirically true cases can occur not only in
> multiplicity (OR), but with structure that includes the simultaneous (AND)
> measurement dimensions of those cases.
>
> But don't tell anyone because it might obviate the risible tradition of
> so-called "type theories" in both mathematics and programming languages
> (including SQL and all those "fuzzy logic" kludges) and people would get
> *really* pissy at you.
>
>
> On Thu, May 16, 2024 at 10:27 AM  wrote:
>
>> What should symbolic approach include to entirely replace neural networks
>> approach in creating true AI? Is that task even possible? What benefits and
>> drawbacks we could expect or hope for if it is possible? If it is not
>> possible, what would be the reasons?
>>
>> Thank you all for your time.

--
Artificial General Intelligence List: AGI
Permalink: 
https://agi.topicbox.com/groups/agi/T682a307a763c1ced-Ma9215f03be1998269e14f977
Delivery options: https://agi.topicbox.com/groups/agi/subscription


Re: [agi] Can symbolic approach entirely replace NN approach?

2024-05-20 Thread James Bowery
On Mon, May 20, 2024 at 9:49 AM Rob Freeman 
wrote:

> Well, I don't know number theory well, but what axiomatization of
> maths are you basing the predictions in your series on?
>
> I have a hunch the distinction I am making is similar to a distinction
> about the choice of axiomatization. Which will be random. (The
> randomness demonstrated by Goedel's diagonalization lemma? "True" but
> not provable/predictable within the system?)
>

Here's how I tend to think about it:

Solomonoff addressed this "random" choice of axioms by introducing a random
bit string (the axioms of the theory) interpreted as an algorithm (rules of
inference) which, itself, produces another bit string (theorems).

However, this leaves undefined the "rules of inference" which, in my way of
thinking, is like leaving undefined the choice of UTM within Algorithmic
Information Theory.

I've addressed this before in terms of the axioms of arithmetic by saying
that the choice of UTM is no more "random" than is the choice of axioms of
arithmetic which must, itself, incorporate the rules of inference else you
have no theory.

Marcus Hutter has addressed this "philosophical nuisance" in terms of no
post hoc (after observing the dataset) choice of UTM being permitted by the
principles of prediction.

I've further addressed this philosophical nuisance by permitting the
sophist to examine the dataset prior to "choosing the UTM", but restricted
to NiNOR Complexity, which
further reduces the argument surface available to sophists.


> On Mon, May 20, 2024 at 9:09 PM James Bowery  wrote:
> >
> >
> >
> > On Sun, May 19, 2024 at 11:32 PM Rob Freeman 
> wrote:
> >>
> >> James,
> >>
> >> My working definition of "truth" is a pattern that predicts. And I'm
> >> tending away from compression for that.
> >
> >
> > 2, 4, 6, 8
> >
> > does it mean
> > 2n?
> >
> > or does it mean
> > 10?
> >
> >
> >
> >> Related to your sense of "meaning" in (Algorithmic Information)
> >> randomness. But perhaps not quite the same thing.
> >
> >
> > or does it mean a probability distribution of formulae that all produce
> 2, 4, 6, 8 whatever they may subsequently produce?
> >
> > or does it mean a probability distribution of sequences
> > 10, 12?
> > 10, 12, 14?
> > 10, 13, 14?
> > ...

--
Artificial General Intelligence List: AGI
Permalink: 
https://agi.topicbox.com/groups/agi/T682a307a763c1ced-M1ce471d20cc6a3bfdec9f397
Delivery options: https://agi.topicbox.com/groups/agi/subscription


Re: [agi] Can symbolic approach entirely replace NN approach?

2024-05-20 Thread Rob Freeman
Well, I don't know number theory well, but what axiomatization of
maths are you basing the predictions in your series on?

I have a hunch the distinction I am making is similar to a distinction
about the choice of axiomatization. Which will be random. (The
randomness demonstrated by Goedel's diagonalization lemma? "True" but
not provable/predictable within the system?)

On Mon, May 20, 2024 at 9:09 PM James Bowery  wrote:
>
>
>
> On Sun, May 19, 2024 at 11:32 PM Rob Freeman  
> wrote:
>>
>> James,
>>
>> My working definition of "truth" is a pattern that predicts. And I'm
>> tending away from compression for that.
>
>
> 2, 4, 6, 8
>
> does it mean
> 2n?
>
> or does it mean
> 10?
>
>
>
>> Related to your sense of "meaning" in (Algorithmic Information)
>> randomness. But perhaps not quite the same thing.
>
>
> or does it mean a probability distribution of formulae that all produce 2, 4, 
> 6, 8 whatever they may subsequently produce?
>
> or does it mean a probability distribution of sequences
> 10, 12?
> 10, 12, 14?
> 10, 13, 14?
> ...

--
Artificial General Intelligence List: AGI
Permalink: 
https://agi.topicbox.com/groups/agi/T682a307a763c1ced-M086013ed4b196bdfe9a874c8
Delivery options: https://agi.topicbox.com/groups/agi/subscription


Re: [agi] Can symbolic approach entirely replace NN approach?

2024-05-20 Thread James Bowery
On Sun, May 19, 2024 at 11:32 PM Rob Freeman 
wrote:

> James,
>
> My working definition of "truth" is a pattern that predicts. And I'm
> tending away from compression for that.
>

2, 4, 6, 8

does it mean
2n?

or does it mean
10?



Related to your sense of "meaning" in (Algorithmic Information)
> randomness. But perhaps not quite the same thing.
>

or does it mean a probability distribution of formulae that all produce 2,
4, 6, 8 whatever they may subsequently produce?

or does it mean a probability distribution of sequences
10, 12?
10, 12, 14?
10, 13, 14?
...



> I want to emphasise a sense in which "meaning" is an expansion of the
> world, not a compression. By expansion I mean more than one,
> contradictory, predictive pattern from a single set of data.
>

I hope you can see from the above questions that we are talking about
probability distributions.  What is the difference between the probability
distribution of algorithms (aka formulae) and the probability distribution
of the strings they generate?
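
A toy way to hold the two distributions side by side, with hand-assigned
complexity values standing in for real program lengths:

# A distribution over formulae consistent with 2, 4, 6, 8 induces a
# distribution over the strings they go on to generate.

from collections import defaultdict

observed = [2, 4, 6, 8]

hypotheses = {                   # name: (complexity in bits, n-th term, n >= 1)
    "2n":          (3, lambda n: 2 * n),
    "2n mod 10":   (6, lambda n: (2 * n - 1) % 10 + 1),
    "min(2n, 10)": (7, lambda n: min(2 * n, 10)),
}

# keep formulae that reproduce the observations, weighted by 2^-complexity
consistent = {name: (k, f) for name, (k, f) in hypotheses.items()
              if [f(n) for n in range(1, len(observed) + 1)] == observed}
z = sum(2 ** -k for k, _ in consistent.values())

# the induced distribution over the next two terms: same prefix, different
# continuations, with most of the weight on the shortest formula
continuation = defaultdict(float)
for k, f in consistent.values():
    key = (f(len(observed) + 1), f(len(observed) + 2))
    continuation[key] += (2 ** -k) / z
print(dict(continuation))   # roughly {(10, 12): 0.84, (10, 2): 0.11, (10, 10): 0.05}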


> Note I'm saying a predictive pattern, not a predictable pattern.
> (Perhaps as a random distribution of billiard balls might predict the
> evolution of the table, without being predictable itself?)
>
> There's randomness at the heart of that. Contradictory patterns
> require randomness. A single, predictable, pattern, could not have
> contradictory predictive patterns either? But I see the meaning coming
> from the prediction, not any random pattern that may be making the
> prediction.
>
> Making meaning about prediction, and not any specific pattern itself,
> opens the door to patterns which are meaningful even though new. Which
> can be a sense for creativity.
>
> Anyway, the "creative" aspect of it would explain why LLMs get so big,
> and don't show any interpretable structure.
>
> With a nod to the topic of this thread, it would also explain why
> symbolic systems would never be adequate. It would undermine the idea
> of stable symbols, anyway.
>
> So, not consensus through a single, stable, Algorithmic Information
> most compressed pattern, as I understand you are suggesting (the most
> compressed pattern not knowable anyway?) Though dependent on
> randomness, and consistent with your statement that "truth" should be
> "relative to a given set of observations".
>
> On Sat, May 18, 2024 at 11:57 PM James Bowery  wrote:
> >
> > Rob, the problem I have with things like "type theory" and "category
> theory" is that they almost always elide their foundation in HOL (high
> order logic) which means they don't really admit that they are syntactic
> sugars for second-order predicate calculus.  The reason I describe this as
> "risible" is the same reason I rather insist on the Algorithmic Information
> Criterion for model selection in the natural sciences:
> >
> > Reduce the argument surface that has us all going into hysterics over
> "truth" aka "the science" aka what IS the case as opposed to what OUGHT to
> be the case.
> >
> > Note I said "reduce" rather than "eliminate" the argument surface.  All
> I'm trying to do is get people to recognize that relative to a given set of
> observations the Algorithmic Information Criterion is the best operational
> definition of the truth.
> >
> > It's really hard for people to take even this baby step toward standing
> down from killing each other in a rhyme with The Thirty Years War, given
> that social policy is so centralized that everyone must become a de facto
> theocratic supremacist as a matter of self defence.  It's really obvious
> that the trend is toward capturing us in a control system, e.g. a
> Valley-Girl flirtation friendly interface to Silicon Chutulu that can only
> be fought at the physical level such as sniper bullets through the cooling
> systems of data centers.  This would probably take down civilization itself
> given the over-emphasis on efficiency vs resilience in civilization's
> dependence on information systems infrastructure.

--
Artificial General Intelligence List: AGI
Permalink: 
https://agi.topicbox.com/groups/agi/T682a307a763c1ced-Me2c000d7572de5b0a5769775
Delivery options: https://agi.topicbox.com/groups/agi/subscription


Re: [agi] Can symbolic approach entirely replace NN approach?

2024-05-20 Thread John Rose
On Saturday, May 18, 2024, at 6:53 PM, Matt Mahoney wrote:
> Surely you are aware of the 100% failure rate of symbolic AI over the last 70 
> years? It should work in theory, but we have a long history of 
> underestimating the cost, lured by the early false success of covering half 
> of the cases with just a few hundred rules.
> 

I view LLMs as systems within symbolic systems. Why? Simply that we exist in a 
spacetime environment and ALL COMMUNICATION is symbolic. And sub-symbolic 
representation is required for computation. All bits are symbols based on 
probabilities. Then as LLMs become more intelligent, the physical power 
consumption required to produce similar results will decrease as their symbolic 
networks grow and optimize.

I could be wrong, but it makes sense to me… saying everything is symbolic 
eliminates the argument. I know it's lazy, but that's often how developers look 
at things in order to code them up :) Laziness is a form of optimization... 

--
Artificial General Intelligence List: AGI
Permalink: 
https://agi.topicbox.com/groups/agi/T682a307a763c1ced-M2252941b1c7cca5b59b32c1f
Delivery options: https://agi.topicbox.com/groups/agi/subscription


Re: [agi] Can symbolic approach entirely replace NN approach?

2024-05-19 Thread Rob Freeman
James,

My working definition of "truth" is a pattern that predicts. And I'm
tending away from compression for that.

Related to your sense of "meaning" in (Algorithmic Information)
randomness. But perhaps not quite the same thing.

I want to emphasise a sense in which "meaning" is an expansion of the
world, not a compression. By expansion I mean more than one,
contradictory, predictive pattern from a single set of data.
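
A minimal numeric sketch of that point (an invented example, not the
linguistic case): two patterns that both predict the observed data exactly,
yet contradict each other on everything beyond it.

# Two predictive patterns, both exact on the same data, contradictory beyond it.
data = {0: 0, 1: 1, 2: 4, 3: 9}

def pattern_a(x):
    # "squares"
    return x * x

def pattern_b(x):
    # agrees with pattern_a on every observed point, then diverges
    return x * x + x * (x - 1) * (x - 2) * (x - 3)

assert all(pattern_a(x) == y and pattern_b(x) == y for x, y in data.items())
print(pattern_a(4), pattern_b(4))  # 16 vs 40: one data set, contradictory predictions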

Note I'm saying a predictive pattern, not a predictable pattern.
(Perhaps as a random distribution of billiard balls might predict the
evolution of the table, without being predictable itself?)

There's randomness at the heart of that. Contradictory patterns
require randomness. A single, predictable pattern could not have
contradictory predictive patterns either? But I see the meaning coming
from the prediction, not any random pattern that may be making the
prediction.

Making meaning about prediction, and not any specific pattern itself,
opens the door to patterns which are meaningful even though new. Which
can be a sense for creativity.

Anyway, the "creative" aspect of it would explain why LLMs get so big,
and don't show any interpretable structure.

With a nod to the topic of this thread, it would also explain why
symbolic systems would never be adequate. It would undermine the idea
of stable symbols, anyway.

So, not consensus through a single, stable, Algorithmic Information
most compressed pattern, as I understand you are suggesting (the most
compressed pattern not knowable anyway?) Though dependent on
randomness, and consistent with your statement that "truth" should be
"relative to a given set of observations".

On Sat, May 18, 2024 at 11:57 PM James Bowery  wrote:
>
> Rob, the problem I have with things like "type theory" and "category theory" 
> is that they almost always elide their foundation in HOL (high order logic) 
> which means they don't really admit that they are syntactic sugars for 
> second-order predicate calculus.  The reason I describe this as "risible" is 
> the same reason I rather insist on the Algorithmic Information Criterion for 
> model selection in the natural sciences:
>
> Reduce the argument surface that has us all going into hysterics over "truth" 
> aka "the science" aka what IS the case as opposed to what OUGHT to be the 
> case.
>
> Note I said "reduce" rather than "eliminate" the argument surface.  All I'm 
> trying to do is get people to recognize that relative to a given set of 
> observations the Algorithmic Information Criterion is the best operational 
> definition of the truth.
>
> It's really hard for people to take even this baby step toward standing down 
> from killing each other in a rhyme with The Thirty Years War, given that 
> social policy is so centralized that everyone must become a de facto 
> theocratic supremacist as a matter of self defence.  It's really obvious that 
> the trend is toward capturing us in a control system, e.g. a Valley-Girl 
> flirtation friendly interface to Silicon Chutulu that can only be fought at 
> the physical level such as sniper bullets through the cooling systems of data 
> centers.  This would probably take down civilization itself given the 
> over-emphasis on efficiency vs resilience in civilization's dependence on 
> information systems infrastructure.

--
Artificial General Intelligence List: AGI
Permalink: 
https://agi.topicbox.com/groups/agi/T682a307a763c1ced-M8a84fef3037323602ea7dcca
Delivery options: https://agi.topicbox.com/groups/agi/subscription


Re: [agi] Can symbolic approach entirely replace NN approach?

2024-05-18 Thread Quan Tesla
It's not about who wins the battle of models, but whether the models employed
would theoretically (symbolically) be a true representation of an AGI with
potential for ASI.

I think that LLMs on their own simply won't hack it. You may be satisfied
with the tradeoffs in commercialized value, but there are researchers who
are capable of predicting the quantum limits of current "solutions".

The profiteers profit while old-school scientists slog away, ever waiting
to pounce on their insights. Human behaviour, as self-interested as it is,
suffers the Icarus complex.

In your argument, most likely you'll have to aggregate all those costs,
placing most-correct AGI beyond the reach of the lifetimes of all the
players of the day.

Where does it leave this symbol of human ambition? In a negative value, in
an infinite loop.

Simply because we have proven, and persist in, a model which shows that we
have no respect for - or perhaps insufficient understanding of - the cosmic
perfection in the conservation of energy.

No one model will do, not unless in its holism it would generate a net
positive value. The number of that value would always be equivalent to 1.

Scientists are beginning to understand the simplicity of this thought. It's
about pattern languages, not brute force.

Is an LLM a pattern language? If so, is it sufficient to express all
aspects of a "language" for describing, specifying and managing AGI
evolution holistically?

If not, what is lacking and how can it be realized?

I think nature is pragmatic. It adds and it subtracts. If AGI is a symbol
of a natural system, then do the sum.

On Sun, May 19, 2024, 02:54 Matt Mahoney  wrote:

> On Thu, May 16, 2024, 11:27 AM  wrote:
>
>> What should symbolic approach include to entirely replace neural networks
>> approach in creating true AI? Is that task even possible? What benefits and
>> drawbacks we could expect or hope for if it is possible? If it is not
>> possible, what would be the reasons?
>>
>
> Surely you are aware of the 100% failure rate of symbolic AI over the last
> 70 years? It should work in theory, but we have a long history of
> underestimating the cost, lured by the early false success of covering half
> of the cases with just a few hundred rules.
>
> A human level language model is 10^9 bits, equivalent to 60M lines of code
> according to my compression tests, which yield 16 bits per line. A line of
> code costs $100, so your development cost is $6 billion, far beyond the
> budgets of the most ambitious attempts like Cyc or OpenCog.
>
> Or you can train a LLM with 100 to 1000 times as much knowledge for a few
> million at $2 per GPU hour.
>

--
Artificial General Intelligence List: AGI
Permalink: 
https://agi.topicbox.com/groups/agi/T682a307a763c1ced-M892519d3918783ea7007180d
Delivery options: https://agi.topicbox.com/groups/agi/subscription


Re: [agi] Can symbolic approach entirely replace NN approach?

2024-05-18 Thread Matt Mahoney
On Thu, May 16, 2024, 11:27 AM  wrote:

> What should symbolic approach include to entirely replace neural networks
> approach in creating true AI? Is that task even possible? What benefits and
> drawbacks we could expect or hope for if it is possible? If it is not
> possible, what would be the reasons?
>

Surely you are aware of the 100% failure rate of symbolic AI over the last
70 years? It should work in theory, but we have a long history of
underestimating the cost, lured by the early false success of covering half
of the cases with just a few hundred rules.

A human level language model is 10^9 bits, equivalent to 60M lines of code
according to my compression tests, which yield 16 bits per line. A line of
code costs $100, so your development cost is $6 billion, far beyond the
budgets of the most ambitious attempts like Cyc or OpenCog.
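
Spelled out as explicit arithmetic (a quick sketch; the inputs are the figures
stated above, not independently verified):

# Back-of-the-envelope restatement of the cost estimate above.
model_bits = 1e9           # human level language model, ~10^9 bits
bits_per_line = 16         # from the compression tests cited above
cost_per_line_usd = 100

lines_of_code = model_bits / bits_per_line            # ~62.5 million lines
development_cost_usd = lines_of_code * cost_per_line_usd

print(f"{lines_of_code / 1e6:.1f}M lines, ~${development_cost_usd / 1e9:.2f} billion")
# -> 62.5M lines, ~$6.25 billion (rounded above to 60M lines and $6 billion)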

Or you can train a LLM with 100 to 1000 times as much knowledge for a few
million at $2 per GPU hour.

--
Artificial General Intelligence List: AGI
Permalink: 
https://agi.topicbox.com/groups/agi/T682a307a763c1ced-M5d7336a46b79663a410d119c
Delivery options: https://agi.topicbox.com/groups/agi/subscription


Re: [agi] Can symbolic approach entirely replace NN approach?

2024-05-18 Thread James Bowery
Rob, the problem I have with things like "type theory" and "category
theory" is that they almost always elide their foundation in HOL (high
order logic) which means they don't *really* admit that they are syntactic
sugars for second-order predicate calculus.  The reason I describe this as
"risible" is the same reason I rather insist on the Algorithmic Information
Criterion for model selection in the natural sciences:

Reduce the argument surface that has us all going into hysterics over
"truth" aka "the science" aka what IS the case as opposed to what OUGHT to
be the case.

Note I said "reduce" rather than "eliminate" the argument surface.  All I'm
trying to do is get people to recognize that *relative to a given set of
observations* the Algorithmic Information Criterion is the best operational
definition of the truth.
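
One way to read that operationally is as a two-part code comparison: prefer
the model minimizing |model| + |data given model|. A toy sketch under invented
bit costs (my illustration, not James's formalism):

# Toy two-part-code comparison: |model| + |data given model|, invented bit costs.
data = [2, 4, 6, 8, 10, 12, 14, 16]

def literal_cost(data):
    # "No model": encode each value directly at, say, 8 bits per value.
    return 8 * len(data)

def rule_cost(data):
    # The model "2n" costs a few bits; points it predicts exactly cost
    # nothing, and any exception falls back to an 8-bit literal.
    model_bits = 6
    residual_bits = sum(0 if x == 2 * (i + 1) else 8 for i, x in enumerate(data))
    return model_bits + residual_bits

# The criterion picks whichever total description is shorter.
print(literal_cost(data), rule_cost(data))  # 64 vs 6: the rule is the better "truth"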

It's really hard for people to take even this *baby* step toward standing
down from killing each other in a rhyme with The Thirty Years War, given
that social policy is so centralized that everyone must become a de facto
theocratic supremacist as a matter of self defence.  It's really obvious
that the trend is toward capturing us in a control system, e.g. a
Valley-Girl flirtation friendly interface to Silicon Cthulhu that can only
be fought at the physical level such as sniper bullets through the cooling
systems of data centers.  This would probably take down civilization itself
given the over-emphasis on efficiency vs resilience in civilization's
dependence on information systems infrastructure.

On Thu, May 16, 2024 at 10:36 PM Rob Freeman 
wrote:

> James,
>
> For relevance to type theories in programming I like Bartosz
> Milewski's take on it here. An entire lecture series, but the part
> that resonates with me is in the introductory lecture:
>
> "maybe composability is not a property of nature"
>
> Cued up here:
>
> Category Theory 1.1: Motivation and Philosophy
> Bartosz Milewski
> https://youtu.be/I8LbkfSSR58?si=nAPc1f0unpj8i2JT=2734
>
> Also Rich Hickey, the creator of Clojure language, had some nice
> interpretations in some of his lectures, where he argued for the
> advantages of functional languages over object oriented languages.
> Basically because, in my interpretation, the "objects" can only ever
> be partially "true".
>
> Maybe summarized well here:
>
> https://twobithistory.org/2019/01/31/simula.html
>
> Or here:
>
>
> https://www.flyingmachinestudios.com/programming/the-unofficial-guide-to-rich-hickeys-brain/
>
> Anyway, the code guys are starting to notice it too.
>
> -Rob
>
> On Fri, May 17, 2024 at 7:25 AM James Bowery  wrote:
> >
> > First, fix quantum logic:
> >
> >
> https://web.archive.org/web/20061030044246/http://www.boundaryinstitute.org/articles/Dynamical_Markov.pdf
> >
> > Then realize that empirically true cases can occur not only in
> multiplicity (OR), but with structure that includes the simultaneous (AND)
> measurement dimensions of those cases.
> >
> > But don't tell anyone because it might obviate the risible tradition of
> so-called "type theories" in both mathematics and programming languages
> (including SQL and all those "fuzzy logic" kludges) and people would get
> really pissy at you.

--
Artificial General Intelligence List: AGI
Permalink: 
https://agi.topicbox.com/groups/agi/T682a307a763c1ced-M2f546f083c9091e4e39fabc8
Delivery options: https://agi.topicbox.com/groups/agi/subscription


Re: [agi] Can symbolic approach entirely replace NN approach?

2024-05-17 Thread Nanograte Knowledge Technologies
Mostly agreed, but it depends on your definition of NN. NN is (or is supposed to 
be) equivalent to mutation. If we applied it in that sense, then NN could 
support other schemas of mutation rather than diminish in functional value. 
Ultimately, I think we're heading towards a biochemical model for AGI, even if 
it is a synthetic one.

Synthetic means not naturally made. It doesn't mean that a synthetic machine 
cannot function as a fully-recursive machine, one which demonstrates, of its own 
intuition, an ability to make conscious decisions of the highest order.

The concern with AGI has often been in the region of autonomous decision 
making. Who would predict exactly which moral, strategic-tactical-operational, 
or "necessary" decision a powerful, autonomous machine could come to?

Which tribe would it conclude it belonged to and where would it position its 
sense of fealty? Would it be as fickle as humans on belonging and issues of 
loyalty to greater society? Altruism, would it get it? Would it develop a good 
and bad inclination, and structure society to favor either one of those 
"instincts" it may deem most logically indicated?


Mostly, would it be inclined towards "criminal" behavior, or even "terrorism" 
by any name? And if it decided to turn to rage in a relationship, would it feel 
justified in overpowering a weaker sex?

In that sense, success! We would have duplicated the complications of humanity!

From: John Rose 
Sent: Friday, 17 May 2024 13:48
To: AGI 
Subject: Re: [agi] Can symbolic approach entirely replace NN approach?

On Thursday, May 16, 2024, at 11:26 AM, ivan.moony wrote:
What should symbolic approach include to entirely replace neural networks 
approach in creating true AI?

Symbology will compress NN monstrosities… right? Or I should say, increasing 
efficiency via emerging symbolic activity for complexity reduction. Then less 
NN will be required since the “intelligence” will have been formed. But we 
still need sensory…

There is much room for innovation in mathematics… some of us have been working 
on that for a while.

--
Artificial General Intelligence List: AGI
Permalink: 
https://agi.topicbox.com/groups/agi/T682a307a763c1ced-M844f85d23b2020dafbaecc77
Delivery options: https://agi.topicbox.com/groups/agi/subscription


Re: [agi] Can symbolic approach entirely replace NN approach?

2024-05-17 Thread John Rose
On Thursday, May 16, 2024, at 11:26 AM, ivan.moony wrote:
> What should symbolic approach include to entirely replace neural networks 
> approach in creating true AI?

Symbology will compress NN monstrosities… right? Or I should say, increasing 
efficiency via emerging symbolic activity for complexity reduction. Then less 
NN will be required since the “intelligence” will have been formed. But we 
still need sensory…

There is much room for innovation in mathematics… some of us have been working 
on that for a while.
--
Artificial General Intelligence List: AGI
Permalink: 
https://agi.topicbox.com/groups/agi/T682a307a763c1ced-M5b45da5fff085a720d8ea765
Delivery options: https://agi.topicbox.com/groups/agi/subscription


Re: [agi] Can symbolic approach entirely replace NN approach?

2024-05-16 Thread Rob Freeman
James,

For relevance to type theories in programming I like Bartosz
Milewski's take on it here. An entire lecture series, but the part
that resonates with me is in the introductory lecture:

"maybe composability is not a property of nature"

Cued up here:

Category Theory 1.1: Motivation and Philosophy
Bartosz Milewski
https://youtu.be/I8LbkfSSR58?si=nAPc1f0unpj8i2JT=2734

Also Rich Hickey, the creator of Clojure language, had some nice
interpretations in some of his lectures, where he argued for the
advantages of functional languages over object oriented languages.
Basically because, in my interpretation, the "objects" can only ever
be partially "true".

Maybe summarized well here:

https://twobithistory.org/2019/01/31/simula.html

Or here:

https://www.flyingmachinestudios.com/programming/the-unofficial-guide-to-rich-hickeys-brain/

Anyway, the code guys are starting to notice it too.

-Rob

On Fri, May 17, 2024 at 7:25 AM James Bowery  wrote:
>
> First, fix quantum logic:
>
> https://web.archive.org/web/20061030044246/http://www.boundaryinstitute.org/articles/Dynamical_Markov.pdf
>
> Then realize that empirically true cases can occur not only in multiplicity 
> (OR), but with structure that includes the simultaneous (AND) measurement 
> dimensions of those cases.
>
> But don't tell anyone because it might obviate the risible tradition of 
> so-called "type theories" in both mathematics and programming languages 
> (including SQL and all those "fuzzy logic" kludges) and people would get 
> really pissy at you.

--
Artificial General Intelligence List: AGI
Permalink: 
https://agi.topicbox.com/groups/agi/T682a307a763c1ced-Mea3f554271a532a282d58fa0
Delivery options: https://agi.topicbox.com/groups/agi/subscription


Re: [agi] Can symbolic approach entirely replace NN approach?

2024-05-16 Thread Mike Archbold
Historically the AGI community has not really embraced neural networks --
and the cost has been that the AI explosion has come from the mainstream
more or less.

On Thu, May 16, 2024 at 7:01 PM Quan Tesla  wrote:

> Without neural networks, a symbolic approach wouldn't be effective. My
> view is that, depending on the definition of what "symbolic approach" means
> in the context of AGI, in the least both such operational schemas would be
> required to achieve the level of systems abstraction that would satisfy a
> scientifically-sound (transferable) form of human intelligence. By
> implication, they would have to be seamlessly integrated. Anyone here
> working on such an integration?
>
> On Thu, May 16, 2024 at 7:27 PM  wrote:
>
>> What should symbolic approach include to entirely replace neural networks
>> approach in creating true AI? Is that task even possible? What benefits and
>> drawbacks we could expect or hope for if it is possible? If it is not
>> possible, what would be the reasons?
>>
>> Thank you all for your time.
>>
> *Artificial General Intelligence List *
> / AGI / see discussions  +
> participants  +
> delivery options 
> Permalink
> 
>

--
Artificial General Intelligence List: AGI
Permalink: 
https://agi.topicbox.com/groups/agi/T682a307a763c1ced-M91559e2546f956afaa896d8e
Delivery options: https://agi.topicbox.com/groups/agi/subscription


Re: [agi] Can symbolic approach entirely replace NN approach?

2024-05-16 Thread Quan Tesla
Without neural networks, a symbolic approach wouldn't be effective. My view
is that, depending on the definition of what "symbolic approach" means in
the context of AGI, in the least both such operational schemas would be
required to achieve the level of systems abstraction that would satisfy a
scientifically-sound (transferable) form of human intelligence. By
implication, they would have to be seamlessly integrated. Anyone here
working on such an integration?

On Thu, May 16, 2024 at 7:27 PM  wrote:

> What should symbolic approach include to entirely replace neural networks
> approach in creating true AI? Is that task even possible? What benefits and
> drawbacks we could expect or hope for if it is possible? If it is not
> possible, what would be the reasons?
>
> Thank you all for your time.
>

--
Artificial General Intelligence List: AGI
Permalink: 
https://agi.topicbox.com/groups/agi/T682a307a763c1ced-M2b52d7820d191a6fa0078f55
Delivery options: https://agi.topicbox.com/groups/agi/subscription


Re: [agi] Can symbolic approach entirely replace NN approach?

2024-05-16 Thread James Bowery
First, fix quantum logic:

https://web.archive.org/web/20061030044246/http://www.boundaryinstitute.org/articles/Dynamical_Markov.pdf

Then realize that empirically true cases can occur not only in multiplicity
(OR), but with structure that includes the simultaneous (AND) measurement
dimensions of those cases.

But don't tell anyone because it might obviate the risible tradition of
so-called "type theories" in both mathematics and programming languages
(including SQL and all those "fuzzy logic" kludges) and people would get
*really* pissy at you.


On Thu, May 16, 2024 at 10:27 AM  wrote:

> What should symbolic approach include to entirely replace neural networks
> approach in creating true AI? Is that task even possible? What benefits and
> drawbacks we could expect or hope for if it is possible? If it is not
> possible, what would be the reasons?
>
> Thank you all for your time.
>

--
Artificial General Intelligence List: AGI
Permalink: 
https://agi.topicbox.com/groups/agi/T682a307a763c1ced-M4e5f58df19d779da625ab70e
Delivery options: https://agi.topicbox.com/groups/agi/subscription


Re: [agi] Can symbolic approach entirely replace NN approach?

2024-05-16 Thread Mike Archbold
It seems like most approaches, including symbolic-only, could eventually lead
to "true AI" if you mean ~ passing the Turing test, but it might take 100
years. There is a race-to-the-finish aspect to AI, though.

On Thu, May 16, 2024 at 8:27 AM  wrote:

> What should symbolic approach include to entirely replace neural networks
> approach in creating true AI? Is that task even possible? What benefits and
> drawbacks we could expect or hope for if it is possible? If it is not
> possible, what would be the reasons?
>
> Thank you all for your time.
>

--
Artificial General Intelligence List: AGI
Permalink: 
https://agi.topicbox.com/groups/agi/T682a307a763c1ced-Mf21b8aa56f755a2e5104c181
Delivery options: https://agi.topicbox.com/groups/agi/subscription


Re: [agi] Can symbolic approach entirely replace NN approach?

2024-05-16 Thread Basile Starynkevitch


On 5/16/24 17:26, ivan.mo...@gmail.com wrote:
> What should symbolic approach include to entirely replace neural 
> networks approach in creating true AI? Is that task even possible? 
> What benefits and drawbacks we could expect or hope for if it is 
> possible? If it is not possible, what would be the reasons?



Expert system rules.

Generation of code (in C++ or machine code) through declarative 
rules, including metarules generating rules and code.


Runtime Reflection by inspection of call stacks (e.g. using libbacktrace).

The RefPerSys project (see http://refpersys.org/ 
and open source code on https://github.com/RefPerSys/RefPerSys ...) is 
developed with these ideas.
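
For readers unfamiliar with the style, a minimal sketch of declarative rules,
plus a metarule that generates a further rule, driven by a forward-chaining
loop. This is not RefPerSys code, just an illustration of the idea; see the
project links above for the real thing.

# Minimal illustration (not RefPerSys): declarative rules, a metarule that
# generates a new rule, and a forward-chaining loop that applies them.
facts = {"socrates_is_human"}

rules = [
    # (name, premises, conclusion)
    ("mortality", {"socrates_is_human"}, "socrates_is_mortal"),
]

def metarule(existing_rules):
    # For every rule concluding "X_is_mortal", generate a rule concluding
    # "X_needs_philosophy" -- a rule that writes rules.
    generated = []
    for name, premises, conclusion in existing_rules:
        if conclusion.endswith("_is_mortal"):
            subject = conclusion[: -len("_is_mortal")]
            generated.append((name + "_meta", {conclusion}, subject + "_needs_philosophy"))
    return generated

rules += metarule(rules)

# Forward chaining: fire rules until no new fact is added.
changed = True
while changed:
    changed = False
    for _, premises, conclusion in rules:
        if premises <= facts and conclusion not in facts:
            facts.add(conclusion)
            changed = True

print(sorted(facts))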


Email me for details.


Regards from near Paris in France


--
Basile Starynkevitch
(only my opinions / les opinions sont miennes uniquement)
8 rue de la Faïencerie, 92340 Bourg-la-Reine, France
web page: starynkevitch.net/Basile/
See/voir: https://github.com/RefPerSys/RefPerSys

--
Artificial General Intelligence List: AGI
Permalink: 
https://agi.topicbox.com/groups/agi/T682a307a763c1ced-M911b1bf07aaf1f24f0aaefc1
Delivery options: https://agi.topicbox.com/groups/agi/subscription


Re: [agi] Can symbolic approach entirely replace NN approach?

2024-05-16 Thread Jim Rutt
Seems unlikely as the first approach.  ANNs help us bridge things we have
little understanding of via brute force and lots of data.

Perhaps AFTER we get to ASI, the ASI can figure out how to recode itself
symbolically, at a (likely) huge gain in performance.

On Thu, May 16, 2024 at 11:27 AM  wrote:

> What should symbolic approach include to entirely replace neural networks
> approach in creating true AI? Is that task even possible? What benefits and
> drawbacks we could expect or hope for if it is possible? If it is not
> possible, what would be the reasons?
>
> Thank you all for your time.
>


-- 
Jim Rutt
My podcast: https://www.jimruttshow.com/

--
Artificial General Intelligence List: AGI
Permalink: 
https://agi.topicbox.com/groups/agi/T682a307a763c1ced-Ma7dbfbf1e4ae324a5c8a9ed1
Delivery options: https://agi.topicbox.com/groups/agi/subscription