Matt,

When you said (in the text below):

In every practical case of machine learning, whether it is with decision trees, neural networks, genetic algorithms, linear regression, clustering, or whatever, the problem is you are given training pairs (x,y) and you have to choose a hypothesis h from a hypothesis space H that best classifies novel test instances, h(x) = y.

... you did *exactly* what I was complaining about. Correct me if I am wrong, but it looks like you just declared "learning" to be a particular class of mathematical optimization problem, without making reference to the fact that there is a more general meaning of "learning" that is vastly more complex than your above definition.

What I wanted was a set of non-circular definitions of such terms as "intelligence" and "learning", so that you could somehow *demonstrate* that your mathematical idealizations of these terms correspond with the real thing, ... so that we could believe that the mathematical idealizations were not just a fantasy.

If what you gave was supposed to be a definition, then it was circular (you defined learning to *be* the idealization).

The rest of what you say (about Occam's Razor etc.) is irrelevant if you or Hutter cannot prove something more than a hand-waving connection between the mathematical idealizations of "intelligence," "learning," etc., and the original meanings of those words.

So my original request stands unanswered.


Richard Loosemore.



P.S. The above definition is broken anyway: what about unsupervised learning? What about learning by analogy?




Matt Mahoney wrote:
--- Richard Loosemore <[EMAIL PROTECTED]> wrote:

Matt Mahoney wrote:
--- Richard Loosemore <[EMAIL PROTECTED]> wrote:

Matt Mahoney wrote:
As you probably know, Hutter proved that the optimal behavior of a
goal seeking agent in an unknown environment (modeled as a pair of
interacting Turing machines, with the environment sending an
additional reward signal to the agent that the agent seeks to
maximize) is for the agent to guess at each step that the environment
is modeled by the shortest program consistent with the observed
interaction so far.  The proof requires the assumption that the
environment be computable.  Essentially, the proof says that Occam's
Razor is the best general strategy for problem solving.  The fact
that this works in practice strongly suggests that the universe is
indeed a simulation.
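
(To make the "shortest program consistent with the observed interaction" idea concrete, here is a toy sketch in Python. The three candidate generators and their bit lengths are hand-picked illustrative assumptions standing in for an enumeration over all programs; this is not Hutter's actual construction, just the selection rule in miniature.)

# Toy illustration of "predict with the shortest program consistent with
# the observations so far".  The candidate "programs" and their lengths
# are made-up stand-ins for an enumeration of all programs.
candidates = [
    # (description length in bits, generator: step index -> observation)
    (3, lambda t: 0),                        # "always 0"
    (4, lambda t: t % 2),                    # "alternate 0, 1, 0, 1, ..."
    (6, lambda t: 1 if t % 3 == 2 else 0),   # "a 1 every third step"
]

def predict_next(history):
    """Predict with the shortest candidate that reproduces the history."""
    consistent = [(length, gen) for length, gen in candidates
                  if all(gen(t) == obs for t, obs in enumerate(history))]
    if not consistent:
        return None                          # nothing explains the data
    length, gen = min(consistent, key=lambda c: c[0])
    return gen(len(history))

print(predict_next([0, 1, 0, 1]))            # -> 0, via the "alternate" program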
It suggests nothing of the sort.

Hutter's theory is a mathematical fantasy with no relationship to the real world.
Hutter's theory makes a very general statement about the optimal behavior of
rational agents.  Is this really irrelevant to the field of machine learning?

Define "rational agent".

Define "optimal behavior".

In the framework of Hutter's AIXI, optimal behavior is the behavior that
maximizes the accumulated reward signal from the environment.  In general,
this problem is not computable (it is equivalent to computing the Kolmogorov
complexity of the environment).  An agent with limited computational resources
is rational if it chooses the best strategy within those limits for maximizing
its accumulated reward signal (in general, a suboptimal solution).
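
(For reference, the expression this describes -- as I understand Hutter's notation, with U a universal Turing machine, \ell(q) the length of program q, and m the horizon -- is roughly:

  a_k := \arg\max_{a_k} \sum_{o_k r_k} \cdots \max_{a_m} \sum_{o_m r_m} \bigl( r_k + \cdots + r_m \bigr) \sum_{q \,:\, U(q, a_1 \ldots a_m) = o_1 r_1 \ldots o_m r_m} 2^{-\ell(q)}

that is, expected future reward is taken under a mixture of every program consistent with the interaction history, weighted by 2^{-\ell(q)}, which is why the exact solution is uncomputable.)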

Then prove that a "rational agent" following "optimal behavior" is actually "intelligent" (as we in colloquial speech use the word "intelligent"), and do this *without* circularly defining the meaning of intelligence to be, in effect, the optimal behavior of a rational agent.

Turing defined an agent as intelligent if communication with it is
indistinguishable from communication with a human.  This is not the same as
rational behavior, but
it is probably the best definition we have.

One caveat:

Don't come back and ask me to be precise about what we in colloquial speech mean when we use the word "intelligent," because some of us who reject this theory would state that the term does not have an analytic definition, only an empirical one.

Your position, on the other hand, is that a precise definition does exist and that you know what it is when you say that a "rational agent" following "optimal behavior" is an "intelligent" system.

For this reason the onus is on you (and not me) to say what intelligence is.

My claim is that you cannot, without circularity, prove that "rational agents" following "optimal behavior" are the same thing as intelligent systems, and for that reason your use of all of these terms is just unsubstantiated speculation. Labels attached to an abstract mathematical formalism with nothing but your intuition in the way of justification.

This unsubstantiated speculation then escalates into a zone of complete nonsense when it talks about hypothetical systems of infinite size and power, without showing in any way why we should believe that the properties of such infinitely large systems carry over to systems in the real world.

Hence, it is a mathematical fantasy with no relationship to the real world.

QED.



Richard Loosemore.

Hutter realizes that optimal behavior is not computable, and that even his
space- and time-restricted case, AIXI^tl, is intractable.  That is not the
point.  In every practical case of machine learning, whether it is with
decision trees, neural networks, genetic algorithms, linear regression,
clustering, or whatever, the problem is you are given training pairs (x,y) and
you have to choose a hypothesis h from a hypothesis space H that best
classifies novel test instances, h(x) = y.  Most papers on machine learning
start with some specific data set, and then the authors choose H in some
ad-hoc fashion and claim that their method is superior to prior work.  What
Hutter did was to find a general principle.  The best h is the algorithmically
simplest hypothesis that is consistent with the training data.  For many
forms of machine learning this principle is quite practical.  For example, it
says to select the neural network with the fewest connections, or the decision
tree with the fewest branches, that fits the training data.  Researchers have
already been doing this, but now we know why.
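
(A concrete illustration of that selection rule, using scikit-learn and the number of leaves as the complexity measure -- the data and the search loop are my own toy example, not anything from Hutter:)

# Sketch: pick the smallest decision tree that is consistent with the
# training data, i.e. the simplest hypothesis in this restricted class.
# Requires scikit-learn; the four training points are a made-up example.
from sklearn.tree import DecisionTreeClassifier

X = [[0, 0], [0, 1], [1, 0], [1, 1]]    # toy training inputs
y = [0, 1, 1, 1]                        # OR-style labels

chosen = None
for leaves in range(2, 9):              # try trees of increasing size
    tree = DecisionTreeClassifier(max_leaf_nodes=leaves, random_state=0)
    tree.fit(X, y)
    if tree.score(X, y) == 1.0:         # smallest tree that fits exactly
        chosen = tree
        break

print(chosen.get_n_leaves() if chosen else "no consistent tree found")

The first tree the loop accepts is, by construction, the smallest one tried that reproduces the training labels, which is exactly the "fewest branches that fit the training data" criterion above.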

But this is tangential to my original point.  AIXI applies not just to machine
learning, but also to real life.  Scientists look for the "most elegant"
theory to explain experimental data.  Usually, "most elegant" means the
simplest, the easiest to explain.  This is nothing new.  William of Ockham
noted in the 14th century that the simplest explanation is usually the best.
http://en.wikipedia.org/wiki/Ockham%27s_razor

What I argue is this: the fact that Occam's Razor holds suggests that the
universe is a computation.  It may or may not be.  We are programmed to
believe the universe is real, so I expect some disagreement.



-- Matt Mahoney, [EMAIL PROTECTED]




-----
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?list_id=11983
