Ben,

There is something about the gist of your response that seemed strange to me, but I think I have put my finger on it: I am proposing a general *class* of architectures for an AI-with-motivational-system. I am not saying that this is a specific instance (with all the details nailed down) of that architecture, but an entire class ... an approach.

However, as I explain in detail below, most of your criticisms are that there MIGHT be instances of that architecture that do not work. Out of the countless possible instantiations of my proposed architecture, you are searching for SOME that might not work. But so what if they don't? It has no consequences for my argument.


To see this more vividly, consider the following analogy, which captures the type of argument going on here. Imagine we are back in the late 1800s and someone claimed that the new-fangled four-wheeled automobiles CANNOT be used on the battlefield because they will bog down in trenches. I disagree with this person and say that I know of a type of vehicle that CAN be used even where there are trenches.

They challenge me to propose such a thing. So I describe the *class* of vehicles that uses tracks instead of wheels. I don't describe a particular instance of a tracked vehicle, just the class, saying that IN PRINCIPLE this class of vehicle could do the job.

But then the replies I get are like these [this is not a parody, btw, just a genuine attempt to illuminate the argument]:

1) The existence of tracks does not guarantee that the vehicle will cross trenches: what if the tracks are only 2 feet long? ("The existence of a large number of constraints does not intrinsically imply "tight governance." My Response: No, but only in weird instances of my proposed architecture would tight governance be a difficult thing to arrange, so why would I care about those weird cases?)

2) Okay, so the tracks could be long enough, but you haven't presented any arguments to show the vehicle will be flexible enough to turn corners. ("But the question then becomes whether this set of constraints can simultaneously provide ... the flexibility of governance needed to permit general, broad-based learning". My Response: I don't understand: why would it *not* be capable of general, broad-based learning? I can see how it might be possible for such a problem to arise, but only in weird instances of the architecture I proposed, not in the general case. Please explain why this might be a general property of the class of systems, because I don't think it is.)

3) Well, I wonder if it would be possible, using this tracked vehicle, to really do all the required tasks and carry all the required equipment ... maybe it is possible, but you don't give an argument re this point. ("I just wonder if, in this sort of architecture you describe, it is really possible to guarantee Friendliness without hampering creative learning. Maybe it is possible, but you don't give an argument re this point." My Response: Why would you even suspect that 'creative learning' might be a problem? I gave no argument re that point, because I cannot see any way that it should be a problem. Please explain why this would follow.)

4) Yes, but your whole argument seems to assume tracks on only the first generation of vehicles, not on all future production models. ("However, your whole argument seems to assume an AGI with a fixed level of intelligence, rather than a constantly self-modifying and improving AGI. If an AGI is rapidly increasing its hardware infrastructure and its intelligence, then I maintain that guaranteeing its Friendliness is probably impossible ... and your argument gives no way of getting around this." My Response: I'm afraid this point is simply not true... I very specifically said that once you had built the first AI with this architecture, it would then choose to augment itself into a new system with the same constraints. It understands the consequences of failing to do so, and will therefore take the necessary steps. I cannot state that point any more plainly than I did. I certainly said nothing at all that implied a fixed level of AI intelligence.)


At the end you make this point, which I will deal with directly:

> In a radically self-improving AGI built according to your
> architecture, the set of constraints would constantly be increasing in
> number and complexity ... in a pattern based on stimuli from the
> environment as well as internal stimuli ... and it seems to me you
> have no way to guarantee based on the smaller **initial** set of
> constraints, that the eventual larger set of constraints is going to
> preserve "Friendliness" or any other criterion.

On the contrary, this is a system that grows by adding new ideas whose motivational status must be consistent with ALL of the previous ones, and the longer the system is allowed to develop, the more deeply the new ideas are constrained by the sum total of what has gone before.

Thus: if the system has grown up and acquired a huge number of examples and ideas about what constitutes good behavior according to its internal system of values, then any new ideas about new values must, because of the way the system is designed, prove themselves by being compared against all of the old ones. If it starts to contemplate a new idea that is wildly different from the status quo, there is a ridiculously small chance that the new idea will make it through without the system noticing that it is grossly inconsistent with a lot of other accepted norms of behavior. It is precisely because of the growing number of constraints that the system will find it harder and harder to deviate.

And I said "ridiculously small chance" advisedly: if 10,000 previous constraints apply to each new motivational idea, and if 9,900 of them say 'Hey, this is inconsistent with what I think is a good thing to do', then it doesn't have a snowball's chance in hell of getting accepted. THIS is the deep potential well I keep referring to.
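To make the mechanism a bit more concrete, here is a minimal sketch of the kind of gating I have in mind. It is purely illustrative: the names, the simple voting rule, and the 0.99 threshold are stand-ins of mine, not a specification of the actual architecture. The idea is just that a candidate motivational idea is checked against every constraint accepted so far, and adopted only if the overwhelming majority of them find it consistent.

    # Illustrative only: a candidate motivational idea is checked against
    # every constraint accepted so far, and adopted only if the
    # overwhelming majority of them find it consistent.

    def is_consistent(constraint, candidate):
        # Stand-in for whatever consistency test the real system would use.
        return candidate not in constraint["conflicts_with"]

    def accept_candidate(candidate, accepted_constraints, threshold=0.99):
        # Accept only if at least `threshold` of the prior constraints agree.
        if not accepted_constraints:
            return True  # the very first ideas face no prior constraints
        votes = sum(is_consistent(c, candidate) for c in accepted_constraints)
        return votes / len(accepted_constraints) >= threshold

    # With 10,000 prior constraints, a candidate that 9,900 of them reject
    # has essentially no route to acceptance.
    constraints = ([{"conflicts_with": {"defect"}} for _ in range(9900)] +
                   [{"conflicts_with": set()} for _ in range(100)])
    print(accept_candidate("defect", constraints))     # False
    print(accept_candidate("cooperate", constraints))  # True

The toy example only makes the statistical point: the more accepted constraints there are, the smaller the chance that a wildly deviant idea can slip past all of them.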

I maintain that we can, during early experimental work, understand the structure of the motivational system well enough to get it up to a threshold of acceptably friendly behavior, and that beyond that point its stability will be self-reinforcing, for the above reasons.


A final, and important, point:

Overall, the goal would be not to give it an intrinsic definition of "friendliness", but rather something closer to a "desire to discover and empathize with the normative aspirations of the human species". In other words, make it want to be part of the community of humankind, the way that you and I, as compassionate individual humans, want to be part of that community. Make it one of us, but without certain irascible motivational mechanisms that evolution stuck us with.

Get that far, and from there on out the problem is solved, because it will be trying to find out what we, as a whole, want for ourselves.


Richard Loosemore.




Ben Goertzel wrote:
Hi,

> The problem, Ben, is that your response amounts to "I don't see why that
> would work", but without any details.

The problem, Richard, is that you did not give any details as to why
you think your proposal will "work" (in the sense of delivering a
system whose Friendliness can be very confidently known)

The central claim was that because the behavior of the system is
constrained by a large number of connections that go from motivational
mechanism to thinking mechanism, the latter is tightly governed.

But this claim, as stated, seems not to be true....  The existence of
a large number of constraints does not intrinsically imply "tight
governance."

Of course, though, one can posit the existence of a large number of
constraints that DOES provide tight governance.

But the question then becomes whether this set of constraints can
simultaneously provide

a) the tightness of governance needed to guarantee Friendliness

b) the flexibility of governance needed to permit general, broad-based learning

You don't present any argument as to why this is going to be the case....

I just wonder if, in this sort of architecture you describe, it is
really possible to guarantee Friendliness without hampering creative
learning.  Maybe it is possible, but you don't give an argument re
this point.

Actually, I suspect that it probably **is** possible to make a
reasonably benevolent AGI according to the sort of NN architecture you
suggest ... (as well as according to a bunch of other sorts of
architectures)

However, your whole argument seems to assume an AGI with a fixed level
of intelligence, rather than a constantly self-modifying and improving
AGI.  If an AGI is rapidly increasing its hardware infrastructure and
its intelligence, then I maintain that guaranteeing its Friendliness
is probably impossible ... and your argument gives no way of getting
around this.

In a radically self-improving AGI built according to your
architecture, the set of constraints would constantly be increasing in
number and complexity ... in a pattern based on stimuli from the
environment as well as internal stimuli ... and it seems to me you
have no way to guarantee based on the smaller **initial** set of
constraints, that the eventual larger set of constraints is going to
preserve "Friendliness" or any other criterion.

-- Ben
