On 7/12/07, Panu Horsmalahti <[EMAIL PROTECTED]> wrote:
>
> It is my understanding that the basic problem in Friendly AI is that it is
> possible for the AI to interpret the command "help humanity" etc. wrongly,
> and then destroy humanity (which we don't want it to do). The whole problem
> is to find some way to make it more probable that it will not destroy us
> all. It is correct that a simple sentence can be interpreted to mean
> something that we don't really mean, even though the interpretation is
> logical for the AI.
>
Right ... this is a special case of the problem that, if an AGI is allowed
to modify itself substantially (likely necessary if we want its intelligence
to progressively increase), then each successive self-modification may
interpret its original supergoals a little differently from the previous one,
so that one has a kind of "supergoal drift"...

There are many ways to formalize the above notion. Here is one way that I
have played with...

Where X is an AI system, let F[X] denote a probability distribution over the
space of AI systems, defined as

  F[X](Y, E) = the probability that X will modify itself into Y, given
  environment E

Then, we can iterate F repeatedly from any initial AI system X_0, obtaining a
probability distribution F^n over the AI systems reached after n successive
self-modifications.

What we want is to create X_0 so as to maximize the odds that {an AI system
chosen randomly from F^n} will be judged by {an AI system chosen randomly
from F^(n-1)} as having the right supergoals. Here the previous sentence is
to be interpreted with E equal to our actual universe (a vexing dependency,
since we don't know our universe all that well).

So, one wants to find a (supergoal, AI system) combination X_0 so that there
is a pathway

-- starting from X_0 as an initial condition
-- where X_i is capable of figuring out how to create X_(i+1)
-- where the X_i continually and rapidly increase in intelligence, as i
   increases
-- where for each X_i, X_(i+1) is judged by X_i as having the right
   supergoals

I have suggested a provably correct way to do this in some old articles
(which are offline, but I will put them back online), but it was horribly
computationally intractable ... so in reality I have no idea how to achieve
this sort of thing with provable reliability. Though intuitively I think
Novamente will fit the bill ;-)

-- Ben Goertzel
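
To make the drift formalism above concrete, here is a minimal Monte Carlo
sketch. It is purely illustrative (it is not Novamente code, and every name
and number in it is a hypothetical assumption): an AI system is collapsed to
a single "goal fidelity" number in [0, 1], F is modeled as adding small
Gaussian interpretation noise at each self-modification, and the script
estimates both the probability that a system drawn from F^(n-1) would
approve one drawn from F^n, and how far the n-th system has drifted from
X_0's original supergoals.

# Toy Monte Carlo sketch of "supergoal drift" under repeated self-modification.
# All names and constants here (DRIFT_NOISE, APPROVAL_TOLERANCE, ...) are
# hypothetical illustrations, not part of any real AGI design.

import random

DRIFT_NOISE = 0.02         # assumed per-step misinterpretation noise
APPROVAL_TOLERANCE = 0.05  # X_(n-1) "approves" X_n if their goals differ by less

def self_modify(goal_fidelity):
    """One draw from F[X]: the goal fidelity of the successor system."""
    return max(0.0, min(1.0, goal_fidelity + random.gauss(0.0, DRIFT_NOISE)))

def drift_statistics(n_steps, trials=10_000):
    """Monte Carlo estimate over n successive self-modifications.

    Returns (probability that X_(n-1) approves X_n,
             mean fidelity of X_n to the original supergoals of X_0)."""
    approved = 0
    fidelity_sum = 0.0
    for _ in range(trials):
        g = 1.0      # X_0 starts with exactly the intended supergoals
        prev = g
        for _ in range(n_steps):
            prev, g = g, self_modify(g)
        if abs(g - prev) < APPROVAL_TOLERANCE:
            approved += 1
        fidelity_sum += g
    return approved / trials, fidelity_sum / trials

if __name__ == "__main__":
    for n in (1, 10, 100):
        p, mean_f = drift_statistics(n)
        print(f"n={n:>3}: P(predecessor approves) ~ {p:.3f}, "
              f"mean fidelity to X_0 ~ {mean_f:.3f}")

The point the toy model makes is the one in the post: even when each
successive pair (X_(n-1), X_n) agrees almost perfectly, the cumulative drift
away from X_0 grows with n, which is exactly what a provably reliable
construction of X_0 would have to prevent.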