Ben,

There is something about the gist of your response that seemed strange to me, but I think I have put my finger on it: I am proposing a general *class* of architectures for an AI-with-motivational-system. I am not saying that this is a specific instance (with all the details nailed down) of that architecture, but an entire class ... an approach.

However, as I explain in detail below, most of your criticisms are that there MIGHT be instances of that architecture that do not work. Out of the countless possible instantiations of my proposed architecture, you are searching for SOME that might not work. But so what if they don't? It has no consequences for my argument.


To see this more vividly, consider the following analogy, which captures the type of argument going on here. Imagine we are back in the late 1800s and someone claimed that the new-fangled four-wheeled automobiles CANNOT be used on the battlefield because they will bog down in trenches. I disagree with this person and say that I know of a type of vehicle that CAN be used even where there are trenches.

They challenge me to propose such a thing. So I describe the *class* of vehicles that uses tracks instead of wheels. I don't describe a particular instance of a tracked vehicle, just the class, saying that IN PRINCIPLE this class of vehicle could do the job.

But then the replies I get are like these [this is not a parody, btw, just a genuine attempt to illuminate the argument]:

1) The existence of tracks does not guarantee that the vehicle will cross trenches: what if the tracks are only 2 feet long? ("The existence of a large number of constraints does not intrinsically imply "tight governance." My Response: No, but only in weird instances of my proposed architecture would tight governance be a difficult thing to arrange, so why would I care about those weird cases?)

2) Okay, so the tracks could be long enough, but you haven't presented any arguments to show the vehicle will be flexible enough to turn corners. ("But the question then becomes whether this set of constraints can simultaneously provide ... the flexibility of governance needed to permit general, broad-based learning". My Response: I don't understand: why would it *not* be capable of general, broad-based learning? I can see how it might be possible for such a problem to arise, but only in weird instances of the architecture I proposed, not in the general case. Please explain why this might be a general property of the class of systems, because I don't think it is.)

3) Well, I wonder if it would be possible, using this tracked vehicle, to really do all the required tasks and carry all the required equipment ... maybe it is possible, but you don't give an argument re this point. ("I just wonder if, in this sort of architecture you describe, it is really possible to guarantee Friendliness without hampering creative learning. Maybe it is possible, but you don't give an argument re this point." My Response: Why would you even suspect that 'creative learning' might be a problem? I gave no argument re that point, because I cannot see any way that it should be a problem. Please explain why this would follow.)

4) Yes, but your whole argument seems to assume tracks on only the first generation of vehicles, not on all future production models. ("However, your whole argument seems to assume an AGI with a fixed level of intelligence, rather than a constantly self-modifying and improving AGI. If an AGI is rapidly increasing its hardware infrastructure and its intelligence, then I maintain that guaranteeing its Friendliness is probably impossible ... and your argument gives no way of getting around this." My Response: I'm afraid this point is simply not true... I very specifically said that once you had built the first AI with this architecture, it would then choose to augment itself into a new system with the same constraints. It understands the consequences of failing to do so, and will therefore take the necessary steps. I cannot state that point any more plainly than I did. I certainly said nothing at all that implied a fixed level of AI intelligence.)


At the end you make this point, which I will deal with directly:

> In a radically self-improving AGI built according to your
> architecture, the set of constraints would constantly be increasing in
> number and complexity ... in a pattern based on stimuli from the
> environment as well as internal stimuli ... and it seems to me you
> have no way to guarantee based on the smaller **initial** set of
> constraints, that the eventual larger set of constraints is going to
> preserve "Friendliness" or any other criterion.

On the contrary, this is a system that grows by adding new ideas whose motivational status must be consistent with ALL of the previous ones, and the longer the system is allowed to develop, the more deeply the new ideas are constrained by the sum total of what has gone before.

Thus: if the system has grown up and acquired a huge number of examples and ideas about what constitutes good behavior according to its internal system of values, then any new ideas about new values must, because of the way the system is designed, prove themselves by being compared against all of the old ones. If it starts to contemplate a new idea that is wildly different from the status quo, there is a ridiculously small chance that the new idea will make it through without the system noticing that it is grossly inconsistent with a lot of other accepted norms of behavior. It is precisely because of the growing number of constraints that the system will find it harder and harder to deviate.

And I said "ridiculously small chance" advisedly: if 10,000 previous constraints apply to each new motivational idea, and if 9,900 of them say 'Hey, this is inconsistent with what I think is a good thing to do', then it doesn't have a snowball's chance in hell of getting accepted. THIS is the deep potential well I keep referring to.
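To make the mechanism a bit more concrete, here is a minimal sketch of the kind of gating I have in mind. It is purely illustrative: the names, the simple voting rule, and the 0.99 threshold are stand-ins of mine, not a specification of the actual architecture. The idea is just that a candidate motivational idea is checked against every constraint accepted so far, and adopted only if the overwhelming majority of them find it consistent.

    # Illustrative only: a candidate motivational idea is checked against
    # every constraint accepted so far, and adopted only if the
    # overwhelming majority of them find it consistent.

    def is_consistent(constraint, candidate):
        # Stand-in for whatever consistency test the real system would use.
        return candidate not in constraint["conflicts_with"]

    def accept_candidate(candidate, accepted_constraints, threshold=0.99):
        # Accept only if at least `threshold` of the prior constraints agree.
        if not accepted_constraints:
            return True  # the very first ideas face no prior constraints
        votes = sum(is_consistent(c, candidate) for c in accepted_constraints)
        return votes / len(accepted_constraints) >= threshold

    # With 10,000 prior constraints, a candidate that 9,900 of them reject
    # has essentially no route to acceptance.
    constraints = ([{"conflicts_with": {"defect"}} for _ in range(9900)] +
                   [{"conflicts_with": set()} for _ in range(100)])
    print(accept_candidate("defect", constraints))     # False
    print(accept_candidate("cooperate", constraints))  # True

The toy example only makes the statistical point: the more accepted constraints there are, the smaller the chance that a wildly deviant idea can slip past all of them.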

I maintain that we can, during early experimental work, understand the structure of the motivational system well enough to get it up to a threshold of acceptably friendly behavior, and that beyond that point its stability will be self-reinforcing, for the above reasons.


A final, and important, point:

Overall, the goal would be not to give it an intrinsic definition of "friendliness", but rather something closer to a "desire to discover and empathize with the normative aspirations of the human species". In other words, make it want to be part of the community of humankind, the way that you and I, as compassionate individual humans, want to be part of that community. Make it one of us, but without certain irascible motivational mechanisms that evolution stuck us with.

Get that far, and from there on out the problem is solved, because it will be trying to find out what we, as a whole, want for ourselves.


Richard Loosemore.




Ben Goertzel wrote:
Hi,

> The problem, Ben, is that your response amounts to "I don't see why that
> would work", but without any details.

The problem, Richard, is that you did not give any details as to why
you think your proposal will "work" (in the sense of delivering a
system whose Friendliness can be very confidently known)

The central claim was that because the behavior of the system is
constrained by a large number of connections that go from motivational
mechanism to thinking mechanism, the latter is tightly governed.

But this claim, as stated, seems not to be true....  The existence of
a large number of constraints does not intrinsically imply "tight
governance."

Of course, though, one can posit the existence of a large number of
constraints that DOES provide tight governance.

But the question then becomes whether this set of constraints can
simultaneously provide

a) the tightness of governance needed to guarantee Friendliness

b) the flexibility of governance needed to permit general, broad-based learning

You don't present any argument as to why this is going to be the case....

I just wonder if, in this sort of architecture you describe, it is
really possible to guarantee Friendliness without hampering creative
learning.  Maybe it is possible, but you don't give an argument re
this point.

Actually, I suspect that it probably **is** possible to make a
reasonably benevolent AGI according to the sort of NN architecture you
suggest ... (as well as according to a bunch of other sorts of
architectures)

However, your whole argument seems to assume an AGI with a fixed level
of intelligence, rather than a constantly self-modifying and improving
AGI.  If an AGI is rapidly increasing its hardware infrastructure and
its intelligence, then I maintain that guaranteeing its Friendliness
is probably impossible ... and your argument gives no way of getting
around this.

In a radically self-improving AGI built according to your
architecture, the set of constraints would constantly be increasing in
number and complexity ... in a pattern based on stimuli from the
environment as well as internal stimuli ... and it seems to me you
have no way to guarantee based on the smaller **initial** set of
constraints, that the eventual larger set of constraints is going to
preserve "Friendliness" or any other criterion.

-- Ben
