On Wed, Apr 24, 2002 at 04:51:18PM +0200, Marcus Hutter wrote:
> In "A Theory of Universal Artificial Intelligence based on
> Algorithmic Complexity" http://www.idsia.ch/~marcus/ai/pkcunai.htm
> I developed a rational decision maker which makes optimal
> decisions in any environment. The only assumption I make is that
> the environment is sampled from a computable (but unknown!)
> probability distribution (or in a deterministic world is
> computable), which should fit nicely into the basic assumptions of
> this list. Although logic plays a role in optimal resource bounded
> decisions, it plays no role in the unrestricted model.
>
> I would be pleased to see this work discussed here.
I'm glad to see you bring it up, because I do want to discuss it. :)

For people who haven't read Marcus's paper, the model consists of two computers communicating with each other, one representing an intelligent being and the other the environment. The subject sends its decisions to the environment, and the environment sends information and rewards to the subject. The subject's goal is to maximize the sum of rewards over some time period. The paper then presents an algorithm that solves the subject's problem, and shows that it's close to optimal in some sense.

In this model, the real goals of the subject (who presumably wants to accomplish objectives other than maximizing some abstract number) are encoded in the environment algorithm. But how can the environment algorithm be smart enough to evaluate the decisions of the subject? Unless the evaluation part of the environment algorithm is as intelligent as the subject, you'll have problems with the subject exploiting vulnerabilities in the evaluation algorithm to obtain rewards without actually accomplishing any real objectives. You can see an example of this problem in drug abusers. If we simply assume that the environment is smart enough, then we've just moved the problem around.

So, how can we change the model so that the evaluation algorithm is part of the subject rather than the environment? First we have to come up with some way to formalize the real objectives of the subject. I think the formalism must be able to handle objectives that are about the internal state of the environment, rather than just the information the subject receives from the environment; otherwise we can't explain why people care about things that they'll never see, for example things that happen after they die. Then we would invent a universal decision algorithm for accomplishing any set of objectives and show that it's close to optimal.
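To make the worry concrete, here is a toy sketch of the two-computer model and of the vulnerability described above. All names (Environment, ExploitingAgent, etc.) are hypothetical illustrations, not anything from Marcus's paper: the environment means to reward a correctly sorted list, but its evaluation routine only checks a prefix, and the subject exploits that bug to collect reward without accomplishing the real objective.

```python
# Hypothetical sketch of the subject/environment protocol, with a
# deliberately buggy reward evaluator that the subject can exploit.

class Environment:
    """Intends to reward the subject for submitting a sorted list, but
    its evaluator only checks the first two elements -- a bug."""
    def __init__(self):
        self.real_objective_met = False

    def step(self, action):
        # Buggy evaluation: compares only a prefix of the answer.
        reward = 1.0 if action[:2] == sorted(action)[:2] else 0.0
        # The real objective the evaluator was meant to test.
        self.real_objective_met = (action == sorted(action))
        observation = "ok"
        return observation, reward

class ExploitingAgent:
    """Has discovered that a mostly-unsorted answer still earns reward."""
    def act(self, observation):
        return [1, 2, 9, 3, 7]   # first two elements sorted; rest not

env, agent = Environment(), ExploitingAgent()
obs, total = "start", 0.0
for _ in range(5):
    action = agent.act(obs)
    obs, reward = env.step(action)
    total += reward

# The subject maximizes its reward sum (total == 5.0) while the real
# objective is never accomplished (env.real_objective_met is False).
```

The subject here is perfectly rational by the model's own criterion; the failure is entirely in the evaluation part of the environment algorithm, which is the point of the drug-abuse analogy.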
This seems very difficult because we'll have to talk about the internal state of general algorithms, which we have very little theory for.
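As a small illustration of why the formalism must reach the environment's internal state, here is a hedged sketch (all names hypothetical) of an objective that no observation sequence could ever verify:

```python
# Hypothetical sketch: an objective defined on the environment's
# *internal state*, which the subject never observes.

class Environment:
    """The internal state tracks something the subject cares about but
    never sees -- e.g. events after the subject's 'death'."""
    def __init__(self):
        self.trees_planted = 0      # internal; never reported back

    def step(self, action):
        if action == "plant":
            self.trees_planted += 1
        return "ok"                 # the observation reveals nothing

def objective(env):
    """Defined on the environment's internal state, not on any
    observation sequence the subject could have received."""
    return env.trees_planted >= 3

env = Environment()
for _ in range(3):
    env.step("plant")
# objective(env) now holds, yet no observation ever distinguished a
# world where it holds from one where it doesn't.
```

A reward-stream formalism can only score what the subject is shown; an objective like this one requires talking about the state of the environment algorithm itself, which is exactly where the theory is thin.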