AW: AW: AW: [agi] How general can be and should be AGI?

2008-05-02 Thread Dr. Matthias Heger


Matt Mahoney [mailto:[EMAIL PROTECTED]] wrote:

>>>
Object oriented programming is good for organizing software but I don't
think for organizing human knowledge.  It is a very rough
approximation.  We have used O-O for designing ontologies and expert
systems (IS-A links, etc), but this approach does not scale well and
does not allow for incremental learning from examples.  It totally does
not work for language modeling, which is the first problem that AI must
solve.
<<<

I agree that the O-O paradigm is not adequate to model all the learning
algorithms and models we use. My own example of recognizing voices should
show that I doubt we use O-O models in our brain for everything in our
environment.

I think our brain learns a somewhat hierarchical model of the world, and the
algorithms for the low levels (e.g. voices, sounds) are probably completely
different from the algorithms for the higher levels of our models. It is
evident that a child has learning capabilities far beyond those of an adult.
The reason is not only that the child's brain is nearly empty; the
physiological architecture is also different to some degree. So we can expect
that learning the basic low levels of a world model requires algorithms which
we only had as children, and the result of that learning is used, to some
degree, as a bias for the learning algorithms we use as adults.

For example, we had to learn to extract syllables from the sound wave of
spoken language. Learning the grammar rules happens at higher levels; learning
semantics at levels higher still; and so on.

But it is a matter of fact that we use an O-O-like model at the top levels of
our world model.
You can see this from language grammar as well: subjects, objects, predicates,
and adjectives have their counterparts in the O-O paradigm.

A photo of a certain scene is physically an array of colored pixels. But you
can ask a human what he sees, and a possible answer could be:
Well, there is a house. A man walks to the door. He wears a blue shirt. A
woman looks through the window ...

Obviously, the answer shows a lot about how people model the world at their
top (= conscious) level, and obviously the model consists of interacting
objects with attributes and behavior.
So knowledge representation at higher levels is indeed O-O-like.
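
A minimal sketch of how such a description might map onto classes, attributes,
and methods (the names below are my own illustration, not a proposed design):

class Door {}
class Window {}
class House { Door door = new Door(); Window window = new Window(); }

class Person {
    String shirtColor;              // attribute: "a blue shirt"
    void walkTo(Door d) {}          // behavior: "walks to the door"
    void lookThrough(Window w) {}   // behavior: "looks through the window"
}

class Scene {
    House house = new House();
    Person man = new Person();
    Person woman = new Person();

    void describe() {
        man.shirtColor = "blue";
        man.walkTo(house.door);
        woman.lookThrough(house.window);
    }
}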

I think both your answer and mine show that we do not use a single algorithm
that is responsible for extracting all the regularities from our perceptions.

And more importantly: there is physiological and psychological evidence that
the algorithms we use change to some degree during the first decade of our
lives.





Re: AW: AW: AW: [agi] How general can be and should be AGI?

2008-05-02 Thread Matt Mahoney
--- "Dr. Matthias Heger" <[EMAIL PROTECTED]> wrote:
> For example, we had to learn to extract syllables from the sound wave
> of spoken language. Learning the grammar rules happens at higher
> levels; learning semantics at levels higher still; and so on.

Actually that's only true in artificial languages.  Children learn
words with semantic content like "ball" and "milk" before they learn
function words like "the" and "of", in spite of their higher frequency.
 Techniques for parsing artificial languages fail for natural languages
because the parse depends on the meanings of the words, as in the
following example:

- I ate pizza with pepperoni.
- I ate pizza with a fork.
- I ate pizza with a friend.

> But it is a matter of fact that we use an O-O-like model at the
> top levels of our world model.
> You can see this from language grammar as well: subjects, objects,
> predicates, and adjectives have their counterparts in the O-O paradigm.

This is the false path of AI that so many have followed.  It seems so
obvious that high level knowledge has a compact representation like
Loves(John, Mary) that is easily represented on a 1960's era computer. 
We can just fill in the low level knowledge later.  This is completely
backwards from the way people learn.  The most spectacular failure is
Cyc's 20+ year effort to manually encode common sense knowledge and
their subsequent failure to attach a natural language interface.

The obvious hierarchical structure of knowledge has a neural
representation, as layers from simple to arbitrarily complex.  In
language, the structure is phonemes -> words -> semantics -> grammar. 
For vision (I am oversimplifying) it is pixels -> lines -> shapes ->
objects.  Learning is from simple to complex, training one layer at a
time.  Children learn phoneme recognition and basic visual perception
at a young age or not at all.

We should not expect that language modeling will be easier than vision.
The brain devotes similar volumes to each.  Long term memory tests by
Landauer [1] show similar learning rates for words and pictures, 2 bits
per second each.  We shy away from a neural approach because a
straightforward brain-sized neural network simulation would require
10^15 bits of memory and 10^16 operations per second.  We note long
term memory has 10^9 bits of complexity [1], so surely we can do
better. But so far we have not, nor have we any explanation why it
takes a million synapses to store one bit of information.
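
A back-of-the-envelope check of those figures (my own arithmetic, assuming
roughly 16 waking hours per day over about 30 years of learning):

  2 bits/s x 16 h/day x 3600 s/h x 365 days/yr x 30 yr ~ 1.3 x 10^9 bits

which is consistent with the ~10^9-bit estimate, while 10^15 bits of simulated
synaptic memory divided by 10^9 learned bits gives the ~10^6 "synapses per
stored bit" puzzle mentioned above.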

1. http://www.merkle.com/humanMemory.html



-- Matt Mahoney, [EMAIL PROTECTED]



AW: AW: AW: AW: [agi] How general can be and should be AGI?

2008-05-02 Thread Dr. Matthias Heger


>>> Matt Mahoney [mailto:[EMAIL PROTECTED]] wrote:

Actually that's only true in artificial languages.  Children learn
words with semantic content like "ball" and "milk" before they learn
function words like "the" and "of", in spite of their higher frequency.



Before they learn the words and their meanings, they have to learn to
recognize the sounds of the words. And even if they only use words like
"with", "of", and "the" later, they must be able to separate these function
words and relation words from object words before they learn any word.
But separating words means classifying words, and that means knowledge of
grammar to a certain degree.




>>> Matt Mahoney [mailto:[EMAIL PROTECTED]] wrote:
Techniques for parsing artificial languages fail for natural languages
because the parse depends on the meanings of the words, as in the
following example:

- I ate pizza with pepperoni.
- I ate pizza with a fork.
- I ate pizza with a friend.


In the days of early AI, the O-O paradigm was not as sophisticated as it is
today. The phenomenon in your example is well known in the O-O paradigm and is
modeled by overloaded functions, which means that objects may have several
functions with the same name but with different signatures.

eat(Food f)
eat(Food f, Topping t)
eat(Food f, Tool t)
eat(Food f, Companion c)
...

Maybe this example is oversimplified, but I think it shows that the O-O
paradigm is powerful enough to model very complex domains.
In fact, nearly all software developed today uses the O-O paradigm with great
success, and the domains are manifold: from banking processes over motor
control in cars to simulations of black holes and the big bang. We can do it
all with O-O-based models.
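
A minimal compilable sketch of how the three pizza sentences would dispatch to
such overloads (class names such as Topping, Tool and Companion are my own
illustrative assumptions, not from the original post):

class Food {}
class Topping {}     // "with pepperoni": part of the food
class Tool {}        // "with a fork": the instrument of eating
class Companion {}   // "with a friend": someone who shares the meal

class Eater {
    void eat(Food f) {}
    void eat(Food f, Topping t) {}
    void eat(Food f, Tool t) {}
    void eat(Food f, Companion c) {}

    void example() {
        Food pizza = new Food();
        eat(pizza, new Topping());    // I ate pizza with pepperoni.
        eat(pizza, new Tool());       // I ate pizza with a fork.
        eat(pizza, new Companion());  // I ate pizza with a friend.
    }
}

The overload that is chosen depends on the type of the second argument, which
is the O-O analogue of letting the meaning of the prepositional phrase decide
the parse.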



>>> Matt Mahoney [mailto:[EMAIL PROTECTED]] wrote:

> (Matthias Heger wrote:)
> But it is a matter of fact that we use an O-O-like model at the
> top levels of our world model.
> You can see this from language grammar as well: subjects, objects,
> predicates, and adjectives have their counterparts in the O-O paradigm.

This is the false path of AI that so many have followed.  It seems so
obvious that high level knowledge has a compact representation like
Loves(John, Mary) that is easily represented on a 1960's era computer. 
We can just fill in the low level knowledge later.  This is completely
backwards from the way people learn.  ---
  Learning is from simple to complex, training one layer at a
time.  
<<<

I do not think that we should try to write a ready-to-use O-O model of the
world for AGI. Instead, I think we agree that there are underlying layers of
models which are not O-O-like, which we do not understand yet, but which are
necessary to understand how our brain creates O-O-like models of the world.

I think it is clear that there are representations like classes, objects,
relations between objects, and attributes of objects.

But the crucial questions are:
How did we, and how do we, build our O-O models?
How did the brain create abstract concepts like "ball" and "milk"?
How do we find classes, objects, and relations?



>>> Matt Mahoney [mailto:[EMAIL PROTECTED]] wrote:
We shy away from a neural approach because a
straightforward brain-sized neural network simulation would require
10^15 bits of memory and 10^16 operations per second.  We note long
term memory has 10^9 bits of complexity [1], so surely we can do
better. But so far we have not, nor have we any explanation why it
takes a million synapses to store one bit of information.

<<<

I think one reason for the apparent waste of memory is fault tolerance. For
example, neurons die when you sneeze.
So there is a need for a lot of redundancy in the brain.

The second reason is that the brain is very strong at association. If some
patterns are active in the brain (or, in O-O language, some classes), then
other patterns (classes) become active, and so on. So the physical
representation of these patterns extends over many neurons. You cannot
precisely locate the representation of a simple class, because the
representations superpose each other across many neurons. I think the ability
to find associations between patterns costs the brain a lot of resources, but
of course this ability is one of the most fruitful in human-like intelligence
and seems to be necessary for creativity.
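
A toy sketch of such superposed, associative storage (my own illustration, not
from the original post) is a Hopfield-style network: every stored pattern is
spread across the entire weight matrix, no single weight belongs to any one
pattern, and a partial cue recalls a complete pattern by association.

public class ToyAssociativeMemory {
    private final int n;
    private final double[][] w;   // one weight matrix shared by all patterns

    ToyAssociativeMemory(int n) {
        this.n = n;
        this.w = new double[n][n];
    }

    // Hebbian outer-product storage: every pattern contributes to every
    // weight, so the stored patterns superpose across the whole matrix.
    void store(int[] pattern) {              // entries must be +1 or -1
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++)
                if (i != j) w[i][j] += pattern[i] * pattern[j];
    }

    // Recall by association: repeatedly update each unit from the weighted
    // sum of the others, so a noisy or partial cue settles into a stored
    // pattern.
    int[] recall(int[] cue, int steps) {
        int[] s = cue.clone();
        for (int t = 0; t < steps; t++)
            for (int i = 0; i < n; i++) {
                double sum = 0;
                for (int j = 0; j < n; j++) sum += w[i][j] * s[j];
                s[i] = sum >= 0 ? 1 : -1;
            }
        return s;
    }
}

The redundancy is visible directly: a single stored pattern of n bits touches
all n^2 weights, and damaging a few weights degrades recall only slightly.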






Language learning (was Re: AW: AW: AW: AW: [agi] How general can be and should be AGI?)

2008-05-02 Thread Matt Mahoney
--- "Dr. Matthias Heger" <[EMAIL PROTECTED]> wrote:

>  Matt Mahoney [mailto:[EMAIL PROTECTED]  wrote
> 
> Actually that's only true in artificial languages.  Children learn
> words with semantic content like "ball" and "milk" before they learn
> function words like "the" and "of", in spite of their higher
> frequency.
> 
> 
> 
> Before they learn the words and their meanings, they have to learn to
> recognize the sounds of the words. And even if they only use words like
> "with", "of", and "the" later, they must be able to separate these
> function words and relation words from object words before they learn
> any word.
> But separating words means classifying words, and that means knowledge
> of grammar to a certain degree.

Lexical segmentation is learned before semantics, but other grammar is
learned afterwards.  Babies learn to segment continuous speech into
words at 7-10 months [1].  This is before they learn their first word,
but is detectable because babies will turn their heads in preference to
segmentable speech.

It is also possible to guess word divisions in text without spaces
given only a statistical knowledge of letter n-grams [2].
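
A toy sketch of that idea (my own illustration, much cruder than the method in
[2]): learn letter-bigram statistics from text that still contains spaces,
then guess a word boundary in spaceless text wherever a hidden space would
make the local statistics more likely.

import java.util.HashMap;
import java.util.Map;

public class ToyWordSegmenter {
    private final Map<String, Integer> bigrams = new HashMap<>();
    private final Map<Character, Integer> unigrams = new HashMap<>();

    // Count letter bigrams (including the space character) in training text.
    void train(String textWithSpaces) {
        String t = " " + textWithSpaces + " ";
        for (int i = 0; i + 1 < t.length(); i++) {
            bigrams.merge(t.substring(i, i + 2), 1, Integer::sum);
            unigrams.merge(t.charAt(i), 1, Integer::sum);
        }
    }

    // P(b | a) with add-one smoothing over 26 letters plus space.
    private double cond(char a, char b) {
        int ab = bigrams.getOrDefault("" + a + b, 0);
        int aCount = unigrams.getOrDefault(a, 0);
        return (ab + 1.0) / (aCount + 27.0);
    }

    // Insert a boundary wherever "a<space>b" is more probable than "ab".
    String segment(String s) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            out.append(s.charAt(i));
            if (i + 1 < s.length()) {
                char a = s.charAt(i), b = s.charAt(i + 1);
                if (cond(a, ' ') * cond(' ', b) > cond(a, b)) out.append(' ');
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        ToyWordSegmenter seg = new ToyWordSegmenter();
        seg.train("the cat sat on the mat and the cat ate the rat");
        System.out.println(seg.segment("thecatatethemat"));
    }
}

Note that the training text here still contains spaces; the method in [2] is
unsupervised and far more careful. This only illustrates that boundary
information is present in low-order letter statistics.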

Natural language has a structure that makes it easy to learn
incrementally from examples with a sufficiently powerful neural
network.  It must, because any unlearnable features will disappear.


> >>> Matt Mahoney [mailto:[EMAIL PROTECTED]  wrote
> Techniques for parsing artificial languages fail for natural
> languages
> because the parse depends on the meanings of the words, as in the
> following example:
> 
> - I ate pizza with pepperoni.
> - I ate pizza with a fork.
> - I ate pizza with a friend.
> 
> 
> In the days of early AI, the O-O paradigm was not as sophisticated as
> it is today. The phenomenon in your example is well known in the O-O
> paradigm and is modeled by overloaded functions, which means that
> objects may have several functions with the same name but with
> different signatures.
> 
> eat(Food f)
> eat(Food f, Topping t)
> eat(Food f, Tool t)
> eat(Food f, Companion c)
> ...

This type of knowledge representation has been tried and it leads to a
morass of rules and no intuition on how children learn grammar.  We do
not know how many grammar rules there are, but it probably exceeds the
number of words in our vocabulary, given how long it takes to learn.

> I think it is clear that there are representations like classes,
> objects, relations between objects, and attributes of objects.
> 
> But the crucial questions are:
> How did we, and how do we, build our O-O models?
> How did the brain create abstract concepts like "ball" and "milk"?
> How do we find classes, objects, and relations?

We need to understand how children learn grammar without any concept of
what a noun or a verb is.  Also, how do people learn hierarchical
relationships before they learn what a hierarchy is?

1. Jusczyk, Peter W. (1996), "Investigations of the word segmentation
abilities of infants", 4th Intl. Conf. on Speech and Language
Processing, Vol. 3, 1561-1564.

2. http://cs.fit.edu/~mmahoney/dissertation/lex1.html


-- Matt Mahoney, [EMAIL PROTECTED]



AW: Language learning (was Re: AW: AW: AW: AW: [agi] How general can be and should be AGI?)

2008-05-02 Thread Dr. Matthias Heger

>>> Matt Mahoney [mailto:[EMAIL PROTECTED]] wrote:

> eat(Food f)
> eat(Food f, Topping t)
> eat(Food f, Tool t)
> eat(Food f, Companion c)
> ...

This type of knowledge representation has been tried and it leads to a
morass of rules and no intuition on how children learn grammar.  We do
not know how many grammar rules there are, but it probably exceeds the
number of words in our vocabulary, given how long it takes to learn.

<<<

As I said, my intention is not to find a set of O-O-like rules with which to
create AGI.
The fact that early approaches failed to build AGI from such rules does not
prove that AGI cannot consist of such rules.

For example, there have also been approaches to create AI with biologically
inspired neural networks, with some minor success, but no real breakthrough
there either.

So this proves nothing except that the problem of AGI is not easy to solve.

The brain is still a black box with regard to many phenomena.

We can analyze our own conscious thoughts and our communication, which is
nothing other than sending ideas and thoughts from one brain to another via
natural language.

I am convinced that the structure and content of our language are not
independent of the internal representation of knowledge.

And from language we must conclude that there are O-O-like models in the
brain, because its semantics is O-O.

There might be millions of classes and relationships.
And surely, every day or night, the brain refactors parts of its model.

The roadmap to AGI will probably be top-down and not bottom-up; the bottom-up
approach is the one used by biological evolution.

Creating AGI by software engineering means that we must first know where we
want to go and then how to get there.

Human language and conscious thought suggest that AGI must be able to
represent the world in an O-O-like way at the top level.
So this ability is the answer to the question of where we want to go.

Again, this does not mean that we must find all the classes and objects. But
we must find an algorithm that generates O-O-like models of its environment
from its perceptions and some bias, where the need for that bias can be
justified on grounds of performance.

We can expect the top-level architecture of AGI to be the easiest part of an
AGI project, because the contents of our own consciousness give us some hints
(though not all) about how our own world representation works at the top
level. And this is O-O, in my opinion. There is also the phenomenon of
associations between patterns (classes), but that is just a question of
retrieving information and directing attention to the relevant parts of the
O-O model, and it is no contradiction to the O-O paradigm.

When we go to lower levels, it is clear that difficulties arise. The reason is
that we have no way of consciously introspecting the low levels of our brain.
Science gives us hints mainly for the lowest levels (chemistry, physics...).

So the middle layers of AGI will be the most difficult layers.
By the way, this is often the case in normal software as well: the middle
layers contain the base functionality and the framework for the top level.




