Ben Goertzel wrote:
Richard,

As I see it, in this long message you have given a conceptual sketch
of an AI design including a motivational subsystem and a cognitive
subsystem, connected via a complex network of continually adapting
connections.  You've discussed the way such a system can potentially
build up a self-model involving empathy and a high level of awareness,
and stability, etc.

All this makes sense, conceptually; though as you point out, the story
you give is short on details, and I'm not so sure you really know how
to "cash it out" in terms of mechanisms that will actually function
with adequate intelligence ... but that's another story...

However, you have given no argument as to why the failure of this kind
of architecture to be stably Friendly is so ASTOUNDINGLY UNLIKELY as
you claimed in your original email.  You have just argued why it's
plausible to believe such a system would probably have a stable goal
system.  As I see it, you did not come close to proving your original
claim, that

>> > The motivational system of some types of AI (the types you would
>> > classify as tainted by complexity) can be made so reliable that the
>> > likelihood of them becoming unfriendly would be similar to the
>> > likelihood of the molecules of an Ideal Gas suddenly deciding to split
>> > into two groups and head for opposite ends of their container.

I don't understand how this extreme level of reliability would be
achieved, in your design.

Rather, it seems to me that the reliance on complex, self-organizing
dynamics makes some degree of indeterminacy in the system almost
inevitable, thus making the system less than absolutely reliable.
Illustrating this point, humans (who are complex dynamical systems) are
certainly NOT reliable in terms of Friendliness or any other subtle
psychological property...

-- Ben G

The problem, Ben, is that your response amounts to "I don't see why that would work", offered without any details. You ask no questions, nor do you restate the proposal back to me in specific terms, so it is hard not to conclude that your comments are based on not having understood it.

You do go further at one point and say that you don't believe I can cash out the sketch in terms of mechanisms that work. I fail to see how you can come to such a strong conclusion when the rest of what you say implies (or says directly) that you do not understand the proposed mechanism.

**************

The central claim was that because the behavior of the system is constrained by a large number of connections running from the motivational mechanism to the thinking mechanism, the latter is tightly governed. You know as well as I do about the power of massive numbers of weak constraints. You know that as the number of constraints rises, the depth of the potential well they can define increases. I used that general idea, coupled with some details about a motivational system, to claim that the latter would constrain the thinking mechanism in just that way. That opens up the possibility of an extremely deep potential well whose floor is the behavior we call Friendly.

You may disagree about the details, but in that case you should talk about the details, not try to imply that there is no way whatsoever that a type of behavior could be extremely tightly constrained. That latter assertion is just wrong: there are ways to make a system very predictable when multiple simultaneous weak constraints are applied. So you are in no position to just deny *that* part. In principle, that is doable. What matters is how I propose to get those multiple constraints to work. I gave details. You do not respond with arguments against any of those details.
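
Just to make the arithmetic of that concrete (this is a toy numerical sketch of the general principle only, with invented numbers, not a model of the specific mechanisms I proposed): if each of N weak constraints contributes a small penalty when it is violated, the energy gap between the fully consistent state and a state that violates even a modest fraction of the constraints grows linearly with N, and the relative likelihood of the deviant state collapses exponentially.

  # Toy illustration only (invented penalty values): each violated
  # constraint adds a small penalty, so the relative Boltzmann-style
  # weight of a state violating a fixed fraction of the constraints
  # falls off exponentially as the number of constraints grows.
  import math

  def relative_weight_of_deviant_state(n_constraints,
                                       penalty_per_violation=0.1,
                                       fraction_violated=0.2):
      energy_gap = penalty_per_violation * fraction_violated * n_constraints
      return math.exp(-energy_gap)      # temperature set to 1 for simplicity

  for n in (10, 100, 1000, 10000):
      print(n, relative_weight_of_deviant_state(n))
  # 10     ~0.82
  # 100    ~0.14
  # 1000   ~2e-9
  # 10000  ~1e-87

With ten constraints the well is shallow; with ten thousand, a deviant state is about as likely as the Ideal Gas molecules sorting themselves into two groups. That scaling is what my claim rests on; whether my particular constraints achieve it is exactly the level of detail you have not engaged with.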

**********
ASIDE

In case anyone else is reading this and is puzzled by the idea of multiple weak constraints, let me give a classic example, due to Hinton:

  There is an unknown thing (call it "x").
  x is constrained in three ways, each of which is extremely vague.
  In fact, it is worse than that: one of the constraints is actually wrong (but I will not tell you which one).
  Here are the three constraints:

  1       x was intelligent
  2       x was once an actor
  3       x was once a president

Of all the things in the universe that x could be, most people are able to identify what x refers to (or were able to, at least, in the 1980s when the example was first given). And yet there are only three extremely weak pieces of information: weak because each is ambiguous, and because one of them is not even reliable.

Now imagine an x constrained from a thousand different directions at once. In principle, the value of x could be pinned down extremely precisely. For what it is worth, this is the basic reason why neural nets work as well as they do.
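
For anyone who wants that in executable form, here is a minimal sketch (the candidates and the graded scores are invented purely for illustration; this is a cartoon of weak-constraint combination in general, not of the motivational mechanism itself): each weak cue contributes graded evidence, one cue is deliberately unreliable, and the right answer still comes out on top.

  # Minimal weak-constraint sketch (illustrative, invented scores).
  # Cue order: intelligent?, actor, president -- the first cue is the
  # deliberately unreliable one in Hinton's example.
  candidates = {
      "Ronald Reagan":   [0.2, 1.0, 1.0],
      "Albert Einstein": [1.0, 0.0, 0.0],
      "Meryl Streep":    [0.9, 1.0, 0.0],
      "Abraham Lincoln": [0.9, 0.0, 1.0],
  }

  def total_evidence(scores):
      return sum(scores)              # weak cues simply add up

  best = max(candidates, key=lambda name: total_evidence(candidates[name]))
  print(best)                         # "Ronald Reagan", despite the bad cue

With three cues the winning margin is small; with a thousand cues all pointing at roughly the same region, even a sizeable fraction of bad cues leaves the answer pinned down, and that is the regime I am talking about.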


**********
As for the last paragraph, the point you make there is pretty unbelievable. You are claiming that human beings are complex dynamical systems, and that they are not reliable in terms of Friendliness, and that therefore....

This paragraph is just a collection of non sequiturs:

1) Are you saying that humans in general "are not reliable", or that every individual that ever existed is unreliable? I accept that humans in general can be unreliable, but it is silly to say that every individual is unreliable just because there exist other individuals who are not reliable. Human individuals are very different instances of the same motivational-system design.

2) At what point in my description did I make use of the idea that the human mind is a complex dynamical system? (I did not: that is a mistaken reading of what I said.) I draw a distinction between different manifestations of complexity, and the one that matters most to me does not carry the implications that most people seem to attach to the phrase "complex dynamical system". So the point is not very relevant to the mechanism I described.

3) And what would the argument in this paragraph have to do with the mechanism I proposed, which at no point relied on using *exactly* the same mechanism as the human one? Why infer that my proposal would suffer the same problems that exist in some human cases? Especially when I explicitly talked about the differences (not having certain motivations in common, for example, and the AI taking active steps to make itself different by increasing the stabilization).

4) And why does "some degree of indeterminacy in the system ... [ make ] the system less than absolutely reliable"? This is just a flat-out contradiction of my thesis, thrown out with no justification, and after I had gone to so much trouble to cite counterexamples: an Ideal Gas is riddled with indeterminacy, and yet the relationship between temperature, pressure and volume is extremely reliable; the Sun is a seething cauldron of indeterminacy, and yet nobody worries about the non-zero probability that it might quantum-tunnel itself to somewhere in the vicinity of Betelgeuse. (The gas point is made concrete by the short calculation just below.)
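
Here is the standard back-of-the-envelope version of the gas point (a closely related calculation, not an exact model of the "two groups heading for opposite ends" scenario): the probability that N independent molecules are all found in the same half of the container at one instant is (1/2)^N, and the exponent is already absurd long before you reach a mole of gas.

  # Probability that N independent molecules all sit in one half of the
  # container at once is (1/2)**N; print log10 of that probability.
  import math

  avogadro = 6.022e23
  for n in (100, 1e6, avogadro):
      print(n, -n * math.log10(2))
  # 100      -> about 10**-30
  # 1e6      -> about 10**-301030
  # 6.02e23  -> about 10**-(1.8e23)

Microscopic indeterminacy everywhere; macroscopic behavior reliable to as many decimal places as you care to measure.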


It is okay that you did not understand it, but it would have been fairer to ask for more detail before rushing to judgment.


Richard Loosemore



