--- Richard Loosemore <[EMAIL PROTECTED]> wrote:

> Matt Mahoney wrote:
> > --- Richard Loosemore <[EMAIL PROTECTED]> wrote:
> >
> >> Matt Mahoney wrote:
> >>> As you probably know, Hutter proved that the optimal behavior of a
> >>> goal-seeking agent in an unknown environment (modeled as a pair of
> >>> interacting Turing machines, with the environment sending an
> >>> additional reward signal to the agent, which the agent seeks to
> >>> maximize) is for the agent to guess at each step that the environment
> >>> is modeled by the shortest program consistent with the observed
> >>> interaction so far. The proof requires the assumption that the
> >>> environment be computable. Essentially, the proof says that Occam's
> >>> Razor is the best general strategy for problem solving. The fact
> >>> that this works in practice strongly suggests that the universe is
> >>> indeed a simulation.
> >>
> >> It suggests nothing of the sort.
> >>
> >> Hutter's theory is a mathematical fantasy with no relationship to the
> >> real world.
> >
> > Hutter's theory makes a very general statement about the optimal
> > behavior of rational agents. Is this really irrelevant to the field of
> > machine learning?
>
> Define "rational agent".
>
> Define "optimal behavior".
In the framework of Hutter's AIXI, optimal behavior is the behavior that
maximizes the accumulated reward signal from the environment. In general,
this problem is not computable (it is equivalent to computing the
Kolmogorov complexity of the environment). An agent with limited
computational resources is rational if it chooses the best strategy
within those limits for maximizing its accumulated reward signal (in
general, a suboptimal solution).

> Then prove that a "rational agent" following "optimal behavior" is
> actually "intelligent" (as we in colloquial speech use the word
> "intelligent"), and do this *without* circularly defining the meaning of
> intelligence to be, in effect, the optimal behavior of a rational agent.

Turing defined an agent as intelligent if communication with it is
indistinguishable from communication with a human. This is not the same
as rational behavior, but it is probably the best definition we have.

> One caveat:
>
> Don't come back and ask me to be precise about what we in colloquial
> speech mean when we use the word "intelligent," because some of us who
> reject this theory would state that the term does not have an analytic
> definition, only an empirical one.
>
> Your position, on the other hand, is that a precise definition does
> exist and that you know what it is when you say that a "rational agent"
> following "optimal behavior" is an "intelligent" system.
>
> For this reason the onus is on you (and not me) to say what
> intelligence is.
>
> My claim is that you cannot, without circularity, prove that "rational
> agents" following "optimal behavior" are the same thing as intelligent
> systems, and for that reason your use of all of these terms is just
> unsubstantiated speculation. Labels attached to an abstract
> mathematical formalism with nothing but your intuition in the way of
> justification.
> This unsubstantiated speculation then escalates into a zone of complete
> nonsense when it talks about hypothetical systems of infinite size and
> power, without showing in any way why we should believe that the
> properties of such infinitely large systems carry over to systems in
> the real world.
>
> Hence, it is a mathematical fantasy with no relationship to the real
> world.
>
> QED.
>
> Richard Loosemore.

Hutter realizes that optimal behavior is not computable, and that even
his space- and time-restricted case, AIXI^tl, is intractable. That is
not the point. In every practical case of machine learning, whether with
decision trees, neural networks, genetic algorithms, linear regression,
clustering, or whatever, the problem is that you are given training
pairs (x, y) and must choose a hypothesis h from a hypothesis space H
that best classifies novel test instances, h(x) = y. Most papers on
machine learning start with some specific data set; the authors then
choose H in some ad hoc fashion and claim that their method is superior
to prior work.

What Hutter did was find a general principle: the best h is the
algorithmically simplest hypothesis that is consistent with the training
data. For many forms of machine learning, this principle is quite
practical. For example, it says to select the neural network with the
fewest connections, or the decision tree with the fewest branches, that
fits the training data. Researchers have been doing this all along, but
now we know why.

But this is tangential to my original point. AIXI applies not just to
machine learning, but also to real life. Scientists look for the "most
elegant" theory to explain experimental data. Usually "most elegant"
means the simplest, the easiest to explain. This is nothing new. William
of Ockham noted in the 14th century that the simplest explanation is
usually the best.
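The "simplest consistent hypothesis" principle can be sketched in a few
lines of Python (a toy illustration of my own, not Hutter's formal
construction; the length of a rule's source text stands in for true
program length, since Kolmogorov complexity itself is not computable):

```python
# Toy Occam's Razor: among candidate rules for a sequence, keep the
# ones consistent with the observed data, then prefer the shortest
# description. Description length is a crude proxy for program length.

training = [(1, 2), (2, 4), (3, 8)]  # observed (n, value) pairs

# Hypothesis space H: (description, prediction function) pairs.
hypotheses = [
    ("2**n",         lambda n: 2 ** n),
    ("n*n - n + 2",  lambda n: n * n - n + 2),
    ("n + 1",        lambda n: n + 1),
]

def consistent(h):
    """A hypothesis survives only if it reproduces all training pairs."""
    desc, f = h
    return all(f(x) == y for x, y in training)

candidates = [h for h in hypotheses if consistent(h)]
best_desc, best_f = min(candidates, key=lambda h: len(h[0]))

print(best_desc)   # shortest consistent rule: "2**n"
print(best_f(4))   # its prediction for the unseen case n = 4: 16
```

Note that two of the rules agree on every training pair but disagree on
the unseen case (the polynomial predicts 14 at n = 4, the exponential
predicts 16); the principle breaks the tie in favor of the shorter
description, which is exactly the behavior described above.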
http://en.wikipedia.org/wiki/Ockham%27s_razor

What I argue is this: the fact that Occam's Razor holds suggests that
the universe is a computation. It may or may not be. We are programmed
to believe the universe is real, so I expect some disagreement.

-- Matt Mahoney, [EMAIL PROTECTED]

-----
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?list_id=11983