Ben Goertzel wrote:
Richard,
As I see it, in this long message you have given a conceptual sketch
of an AI design including a motivational subsystem and a cognitive
subsystem, connected via a complex network of continually adapting
connections. You've discussed the way such a system can potentially
build up a self-model involving empathy and a high level of awareness,
and stability, etc.
All this makes sense, conceptually; though as you point out, the story
you give is short on details, and I'm not so sure you really know how
to "cash it out" in terms of mechanisms that will actually function
with adequate intelligence ... but that's another story...
However, you have given no argument as to why the failure of this kind
of architecture to be stably Friendly is so ASTOUNDINGLY UNLIKELY as
you claimed in your original email. You have just argued why it's
plausible to believe such a system would probably have a stable goal
system. As I see it, you did not come close to proving your original
claim, that
>> > The motivational system of some types of AI (the types you would
>> > classify as tainted by complexity) can be made so reliable that the
>> > likelihood of them becoming unfriendly would be similar to the
>> > likelihood of the molecules of an Ideal Gas suddenly deciding to
>> > split into two groups and head for opposite ends of their container.
I don't understand how this extreme level of reliability would be
achieved, in your design.
Rather, it seems to me that the reliance on complex, self-organizing
dynamics makes some degree of indeterminacy in the system almost
inevitable, thus making the system less than absolutely reliable.
Illustrating this point, humans (who are complex dynamical systems) are
certainly NOT reliable in terms of Friendliness or any other subtle
psychological property...
-- Ben G
The problem, Ben, is that your response amounts to "I don't see why that
would work", but without any details. You ask no questions, nor do you
redescribe the proposal back to me in specific terms, so it is hard not
to conclude that your comments are based on not understanding it.
You do go further at one point and say that you don't believe I can cash
out the sketch in terms of mechanisms that work. I fail to see how you
can come to such a strong conclusion, when the rest of what you say
implies (or says directly) that you do not understand the proposed
mechanism.
**************
The central claim was that because the behavior of the system is
constrained by a large number of connections that go from motivational
mechanism to thinking mechanism, the latter is tightly governed. You
know as well as I do about the power of massive numbers of weak
constraints. You know that as the numbers of constraints rise, the
depth of the potential well that they can define increases. I used that
general idea, coupled with some details about a motivational system, to
claim that the latter would constrain the thinking mechanism in just
that way. That leads to the possibility of an extremely deep potential
well whose bottom corresponds to the behavior we call Friendly.
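To make the potential-well idea concrete, here is a minimal sketch of my
own (an illustration of the general principle, not the proposed mechanism
itself). It assumes each weak constraint contributes only a shallow,
sloppily-aimed quadratic penalty over a one-dimensional "behavior space";
the names (TARGET, well_depth, etc.) are mine, chosen for the sketch:

```python
import random

random.seed(2)

TARGET = 0.0  # the desired region of behavior space, reduced to a point

def make_constraints(n, strength=0.01, slop=0.5):
    """n weak constraints: tiny strength, each one sloppily aimed near TARGET."""
    return [(random.gauss(TARGET, slop), strength) for _ in range(n)]

def potential(x, constraints):
    """Total 'energy' at behavior x: the sum of all the weak penalties."""
    return sum(k * (x - c) ** 2 for c, k in constraints)

def well_depth(constraints, edge=2.0):
    """How far the rim of the well (at x = edge) sits above its bottom."""
    bottom = min(potential(x / 100.0, constraints) for x in range(-200, 201))
    return potential(edge, constraints) - bottom

for n in (1, 10, 1000):
    print(n, round(well_depth(make_constraints(n)), 3))
```

No individual penalty governs anything, but the depth of the combined
well grows roughly in proportion to the number of constraints: that is
the sense in which many weak constraints can tightly govern behavior.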
You may disagree about the details, but in that case you should talk
about the details, not try to imply that there is no way whatsoever that
a type of behavior could be extremely tightly constrained. That latter
assertion is just wrong: there are ways to make a system very
predictable when multiple simultaneous weak constraints are applied. So
you are in no position to just deny *that* part. In principle, that is
doable. What matters is how I propose to get those multiple constraints
to work. I gave details. You do not respond with arguments against any
of those details.
**********
ASIDE
In case anyone else is reading this and is puzzled by the idea of
multiple weak constraints, let me give a classic example, due to Hinton:
There is an unknown thing (call it "x").
x is constrained in three ways, each of which is extremely vague.
Actually, it is worse than that: one of the constraints is simply
wrong. (But I will not tell you which one.)
Here are the three constraints:
1 x was intelligent
2 x was once an actor
3 x was once a president
Of all the things in the universe that x could be, most people are
able to identify what x referred to (or were, in the 1980s, when
this example was proposed). And yet there were only three extremely
weak pieces of information: weak because each is ambiguous, and because
one of them is not even reliable.
Now imagine an x constrained from a thousand different directions at
once. In principle, the value of x could be pinned down extremely
precisely. For what it is worth, this is the basic reason why neural
nets work as well as they do.
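The same effect can be shown numerically. Here is a small sketch of my
own (not from Hinton; the function names and parameters are assumptions
for illustration): each constraint is a very noisy reading of an unknown
x, and some fraction of the constraints are simply wrong, yet combining
many of them pins x down well:

```python
import random

random.seed(0)

TRUE_X = 42.0  # the unknown value the constraints are about

def weak_constraint(true_x, noise=10.0, p_wrong=0.1):
    """One vague, unreliable piece of evidence about x.

    With probability p_wrong the constraint is simply wrong (it points
    somewhere unrelated); otherwise it is a very noisy reading centred
    on the true value.
    """
    if random.random() < p_wrong:
        return random.uniform(-100.0, 100.0)  # a wrong constraint
    return random.gauss(true_x, noise)        # a weak (noisy) constraint

def estimate(n_constraints):
    """Combine n weak constraints with a median, robust to the wrong ones."""
    readings = sorted(weak_constraint(TRUE_X) for _ in range(n_constraints))
    return readings[len(readings) // 2]

for n in (3, 30, 1000):
    print(n, round(estimate(n), 2))
```

With three constraints the estimate wanders; with a thousand it sits
close to the true value, even though every individual constraint remains
just as weak and unreliable as before.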
**********
As for the last paragraph, the point you make there is pretty
unbelievable. You are claiming that human beings are complex dynamical
systems, and that they are not reliable in terms of Friendliness, and
that therefore....
This paragraph is just a collection of non sequiturs:
1) Are you saying that humans in general "are not reliable", or that
every individual that ever existed is unreliable? I accept that humans
in general can be unreliable, but it is silly to say that every
individual is unreliable just because other, unreliable individuals
exist somewhere. Human individuals have very different instances of the
same motivational system design.
2) At what point in my description did I make use of the idea that the
human mind is a complex dynamical system? (I did not: that is a
mistaken interpretation of what I said.) I made a distinction between
different manifestations of complexity, and the one most important to me
does not carry the implications that most people seem to associate with
"complex dynamical system". The point is not very relevant to the
mechanism I described.
3) And what would the argument in this paragraph have to do with the
mechanism I proposed, which at no point relied on using *exactly* the
same mechanism as the human one? Why infer that my proposal would
suffer the same problems that exist in some human cases? And especially
when I talked about the differences (not having certain motivations in
common, for example, and with the AI taking active steps to make itself
different by increasing the stabilization).
4) And why does "some degree of indeterminacy in the system ... [make]
the system less than absolutely reliable"? This is just a flat out
contradiction of my thesis, thrown out with no justification. And this,
after I had gone to so much trouble to cite counterexamples: an Ideal
Gas is riddled with indeterminacy, and yet the relationship between
temperature, pressure and volume is extremely reliable; the Sun is a
seething cauldron of indeterminacy, and yet nobody worries about the
non-zero probability that it might quantum-tunnel itself to somewhere in
the vicinity of Betelgeuse....
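The Ideal Gas counterexample can be made quantitative with a toy
simulation (my own sketch, assuming each molecular velocity is drawn
independently at random; the function names are mine): the microscopic
state is completely indeterminate, yet the macroscopic average that
temperature tracks becomes steadily more reliable as the number of
molecules grows:

```python
import random
import statistics

random.seed(1)

def mean_speed_sq(n):
    """Mean squared speed of n molecules with random velocities.

    Each microscopic velocity is indeterminate (drawn at random), but
    temperature tracks only this macroscopic average.
    """
    return statistics.fmean(random.gauss(0, 1) ** 2 for _ in range(n))

def relative_spread(n, trials=100):
    """Fluctuation of the macroscopic average across trials, relative to its mean."""
    samples = [mean_speed_sq(n) for _ in range(trials)]
    return statistics.stdev(samples) / statistics.fmean(samples)

for n in (10, 1000, 10_000):
    print(n, relative_spread(n))
```

The relative fluctuation shrinks roughly as one over the square root of
the number of molecules, which is why indeterminacy at the micro level is
perfectly compatible with extreme reliability at the macro level.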
I think it is okay that you did not understand it, but it would have
been fairer to ask for more detail before rushing to judgment.
Richard Loosemore
-----
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/[EMAIL PROTECTED]