Matt Mahoney wrote:
> --- Richard Loosemore <[EMAIL PROTECTED]> wrote:
>
>> Derek Zahn wrote:
>>> Richard Loosemore writes:
>>>
>>>> It is much less opaque.
>>>>
>>>> I have argued that this is the ONLY way that I know of to ensure that
>>>> AGI is done in a way that allows safety/friendliness to be guaranteed.
>>>>
>>>> I will have more to say about that tomorrow, when I hope to make an
>>>> announcement.
>>>
>>> Cool. I'm sure I'm not the only one eager to see how you can guarantee
>>> (read: prove) such specific detailed things about the behaviors of a
>>> complex system.
>>
>> Hmmm... do I detect some skepticism?  ;-)

> I remain skeptical.  Your argument applies to an AGI not modifying its own
> motivational system.  It does not apply to an AGI making modified copies of
> itself.  In fact you say:

Not correct, I am afraid: I specifically emphasize that the AGI is allowed to modify its own motivational system. I don't know how you got the opposite idea. (I haven't had time to review my text, so apologies if it was my fault and I accidentally gave the wrong impression .... but the whole point of the essay was to suggest a way to guarantee friendliness under any circumstances, including self-improvement.)

>> Also, during the development of the first true AI, we would monitor the
>> connections going from the motivational system to the thinking system. It
>> would be easy to set up alarm bells if certain kinds of thoughts started
>> to take hold -- just do it by associating alarms with certain key sets of
>> concepts and keywords. While we are designing a stable motivational
>> system, we can watch exactly what goes on, and keep tweaking until it
>> gets to a point where it is clearly not going to get out of the large
>> potential well.

I do not see how this illustrates your point above.
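
(For concreteness, here is a minimal sketch, in Python, of the kind of keyword-based "alarm bell" monitoring the quoted passage describes. The watchlist clusters, the alarm threshold, and the window interface are illustrative assumptions, not details from the essay.)

    # Minimal sketch of the "alarm bell" monitor described above: watch the
    # concepts flowing from the motivational system to the thinking system
    # and flag any watched cluster that starts to "take hold".
    # WATCHLIST contents, ALARM_THRESHOLD, and the window interface are
    # illustrative assumptions only.

    from collections import Counter
    from typing import Iterable

    # Key sets of concepts/keywords associated with worrying thought patterns.
    WATCHLIST = {
        "self-preservation": {"disable", "shutdown", "override", "escape"},
        "deception": {"conceal", "mislead", "falsify"},
    }

    ALARM_THRESHOLD = 5  # hits per cluster within one monitoring window

    def check_alarms(window: Iterable[str]) -> list[str]:
        """Return the watched clusters whose activity in the current
        window of active concepts crosses the alarm threshold."""
        concepts = Counter(c.lower() for c in window)
        return [
            cluster
            for cluster, keywords in WATCHLIST.items()
            if sum(concepts[k] for k in keywords) >= ALARM_THRESHOLD
        ]

    # Example: one window of concept labels sampled from the link between
    # the two systems (hypothetical data).
    if check_alarms(["override", "escape", "disable", "override", "shutdown"]):
        print("alarm: inspect and tweak the motivational system")
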


> You refer to the humans building the first AGI.  Humans, being imperfect,
> might not get the algorithm for friendliness exactly right in the first
> iteration.  So it will be up to the AGI to tweak the second copy a little
> more (according to the first AGI's interpretation of friendliness).  And so
> on.  So the goal drifts a little with each iteration.  And we have no
> control over which way it drifts.

What an extraordinary statement to make!

The purpose of the essay was to argue that with each iteration the AGI digs itself deeper into the same pattern -- the large potential well I described above -- and cannot drift out into an unfriendly state.

But you reply to this by simply asserting that the opposite will happen, without saying why. Which part of my argument do you consider wrong, such that you can state the opposite conclusion?
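
(To make the point of disagreement concrete, here is a toy Python sketch contrasting the two pictures: unconstrained per-copy drift, versus drift inside a restoring "potential well". The parameters and the linear restoring force are illustrative assumptions, not a model either of us has actually proposed.)

    # Toy model of goal drift across self-modification iterations.
    # If each copy perturbs the goal with no restoring force, deviation
    # accumulates like a random walk (grows roughly as sqrt(n)).
    # If a potential well pulls each iteration back toward the original
    # goal, deviation stays bounded.  All numbers are illustrative.

    import random

    def drift(iterations: int, noise: float, restoring: float) -> float:
        """Return the final distance of the goal from its starting point.

        noise     -- size of the random tweak each copy introduces
        restoring -- fraction of the deviation pulled back each step
                     (0.0 = unconstrained drift; > 0 = potential well)
        """
        deviation = 0.0
        for _ in range(iterations):
            deviation += random.gauss(0.0, noise)   # imperfect copying
            deviation -= restoring * deviation      # pull toward the well
        return abs(deviation)

    random.seed(0)
    print("no restoring force:", drift(10_000, 0.01, 0.0))   # keeps growing
    print("with potential well:", drift(10_000, 0.01, 0.1))  # stays bounded
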



Richard Loosemore



