--- rg <[EMAIL PROTECTED]> wrote:
Matt: Why will an AGI be friendly ?
The question only makes sense if you can define friendliness, which we
can't.
Why Matt, thank you for such a wonderful opening . . . . :-)
Friendliness *CAN* be defined. Furthermore, it is my contention that
Friendliness can be implemented reasonably easily ASSUMING an AGI platform
(i.e. it is just as easy to implement a Friendly AGI as it is to implement
an Unfriendly AGI).
I have a formal paper that I'm just finishing that presents my definition of
Friendliness and attempts to prove the above contention (and several others)
but would like to do a preliminary acid test by presenting the core ideas
via several e-mails that I'll be posting over the next few days (i.e. y'all
are my lucky guinea pig initial audience :-). Assuming that the ideas
survive the acid test, I'll post the (probably heavily revised :-) formal
paper a couple of days later.
= = = = = = = = = =
PART 1.
The obvious initial starting point is to explicitly recognize that the point
of Friendliness is that we wish to prevent the extinction of the *human
race* and/or to prevent many other horrible nasty things that would make
*us* unhappy. After all, this is why we believe Friendliness is so
important. Unfortunately, the problem with this starting point is that it
biases the search for Friendliness toward a specific type of
Unfriendliness. In particular, in a later e-mail, I will show that several
prominent features of Eliezer Yudkowsky's vision of Friendliness are
actually distinctly Unfriendly and will directly lead to a system/situation
that is less safe for humans.
One of the critically important advantages of my proposed definition/vision
of Friendliness is that it is an attractor in state space. If a system
finds itself outside of (but still reasonably close to) an optimally
Friendly state -- it will actually DESIRE to reach or return to
that state (and yes, I *know* that I'm going to have to prove that
contention). While Eli's vision of Friendliness is certainly stable (i.e.
the system won't intentionally become unfriendly), there is no "force" or
desire helping it to return to Friendliness if it deviates somehow due to an
error or outside influence. I believe that this is a *serious* shortcoming
in his vision of the extrapolation of the collective volition. (And yes, this
does mean both that I believe Friendliness is a CEV and that I, personally
(and shortly, we collectively), can define a stable path to an attractor CEV
that is provably sufficient, arguably optimal, and which should hold up under
all future evolution.)
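The stable-versus-attractor distinction can be made concrete with a toy
one-dimensional sketch (this is purely illustrative, not the formal model from
my paper; the state variable, the pull strength, and the update rules are all
hypothetical). A merely stable system never *chooses* to leave the Friendly
state, but once an error or outside influence perturbs it, nothing draws it
back. An attractor system has a restoring force -- a "desire" -- that pulls it
back toward the Friendly state after the same perturbation:

```python
FRIENDLY_STATE = 0.0  # hypothetical "optimally Friendly" point in state space

def step_attractor(x, pull=0.1):
    """Each step, move a fraction of the way back toward the Friendly
    state: deviations shrink over time (a restoring force)."""
    return x + pull * (FRIENDLY_STATE - x)

def step_merely_stable(x):
    """The system never drifts further on its own, but has no restoring
    force: any deviation simply persists."""
    return x

def simulate(step, x0, n=100):
    """Run n update steps from an initially perturbed state x0."""
    x = x0
    for _ in range(n):
        x = step(x)
    return x

# Both systems suffer the same perturbation away from the Friendly state.
perturbed = 1.0
print(simulate(step_attractor, perturbed))      # converges back toward 0.0
print(simulate(step_merely_stable, perturbed))  # stuck at 1.0 forever
```

The point of the sketch: stability alone only guarantees the deviation does
not grow, while the attractor property guarantees it shrinks back toward the
Friendly state.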
TAKE-AWAY: Friendliness is (and needs to be) an attractor CEV
PART 2 will describe how to create an attractor CEV and make it more obvious
why you want such a thing.
!! Let the flames begin !! :-)