On 7/12/07, Panu Horsmalahti <[EMAIL PROTECTED]> wrote:
>
> It is my understanding that the basic problem in Friendly AI is that it is
> possible for the AI to interpret a command like "help humanity" wrongly, and
> then destroy humanity (which we don't want it to do). The whole problem is to
> find some way to make it more probable that it will not destroy us all. It is
> true that a simple sentence can be interpreted to mean something we don't
> really mean, even though the interpretation is logical for the AI.
>

Right ... this is a special case of the problem that, if an AGI is allowed
to modify itself substantially (likely necessary if we want its intelligence
to progressively increase), then each successive self-modification may
interpret its original supergoals a little differently from the previous
one, so that one has a kind of "supergoal drift"...

There are many ways to formalize the above notion.  Here is one way that I
have played with...

Where X is an AI system, let F[X] denote a probability distribution over the
space of AI systems, defined as

F[X](Y, E) = the probability that X will self-modify into Y, given
environment E

Then we can iterate F repeatedly from any initial AI system X_0, obtaining a
probability distribution F^n over the AI systems reached after n successive
self-modifications.
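Just to make the iteration concrete, here is a minimal toy sketch in Python.
It is my own oversimplification, not anything Novamente-specific: an AI
system is caricatured by a single number (its current interpretation of the
original supergoal), and self_modify() stands in for F by perturbing that
number a little at each step.

    import random

    def self_modify(x, env, drift=0.05):
        # Toy stand-in for F[X](Y, E): the successor inherits the parent's
        # supergoal interpretation plus a small random perturbation.
        # 'env' is carried along but unused in this caricature.
        return x + random.gauss(0.0, drift)

    def sample_F_n(x0, env, n):
        # Draw one system from F^n by chaining n self-modifications.
        x = x0
        for _ in range(n):
            x = self_modify(x, env)
        return x

    # Supergoal drift after 20 self-modifications, starting from 0.0:
    print(sample_F_n(0.0, env=None, n=20))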

What we want is to create X_0 so as to maximize the odds that {an AI system
chosen randomly from F^n} will be judged by {an AI system chosen randomly
from F^(n-1)} as having the right supergoals.  Here E is to be taken as our
actual universe (a vexing dependency, since we don't know our universe all
that well).
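Continuing the toy sketch above (and reusing sample_F_n from it), the
quantity to be maximized can be estimated by straightforward Monte Carlo:
draw a system from F^n, draw an independent judge from F^(n-1), and ask
whether the judge still accepts the drawn system's supergoals.  The
judged_ok() test and its tolerance are arbitrary placeholders for whatever
"having the right supergoals" really means.

    def judged_ok(judge, candidate, tol=0.5):
        # Placeholder judgment: accept if the two supergoal
        # interpretations differ by less than tol.
        return abs(judge - candidate) < tol

    def estimate_objective(x0, env, n, trials=10000):
        # Estimate P{ a draw from F^n is judged OK by an
        # independent draw from F^(n-1) }.
        hits = 0
        for _ in range(trials):
            candidate = sample_F_n(x0, env, n)
            judge = sample_F_n(x0, env, n - 1)
            if judged_ok(judge, candidate):
                hits += 1
        return hits / trials

    print(estimate_objective(0.0, env=None, n=20))

Designing X_0 to make this number large, under the real E rather than a toy
one, is of course the hard part.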

I have suggested a provably correct way to do this in some old articles
(which are offline, but I will put them back online); however, it was
horribly computationally intractable, so in reality I have no idea how to
achieve this sort of thing with provable reliability.  Though intuitively I
think Novamente will fit the bill ;-)

-- Ben Goertzel




So, one wants to find a (supergoal, AI system) combination X_0 such that
there is a pathway

-- starting from X_0 as an initial condition
-- where X_i is capable of figuring out how to create X_(i+1)
-- where the X_i continually and rapidly increase in intelligence as i
increases
-- where, for each i, X_(i+1) is judged by X_i as still having the right
supergoals
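
In the same toy vocabulary as above (reusing self_modify and judged_ok), the
pathway conditions reduce to a per-step check along a single trajectory.
Here "intelligence" is just another number that each successor is assumed to
multiply by a constant factor; that growth rule, like everything else in the
sketch, is an illustrative assumption rather than a claim about any real
design.

    def build_pathway(x0, env, steps):
        # Simulate one trajectory X_0, X_1, ..., checking at each step
        # that X_i still accepts X_(i+1)'s supergoal interpretation.
        goal, intelligence = x0, 1.0
        path = [(goal, intelligence)]
        for _ in range(steps):
            next_goal = self_modify(goal, env)
            if not judged_ok(goal, next_goal):
                return path, False   # drift exceeded X_i's tolerance
            goal = next_goal
            intelligence *= 1.1      # assumed rapid intelligence increase
            path.append((goal, intelligence))
        return path, True

    path, ok = build_pathway(0.0, env=None, steps=20)
    print(ok, path[-1])

Maximizing the objective above then amounts to choosing X_0 (and its initial
supergoal content) so that this kind of per-step check keeps passing as i
grows.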
