Ben Goertzel wrote:
Richard,
As I see it, in this long message you have given a conceptual sketch
of an AI design including a motivational subsystem and a cognitive
subsystem, connected via a complex network of continually adapting
connections. You've discussed the way such a system can potentially
build up a self-model involving empathy and a high level of awareness,
and stability, etc.
All this makes sense, conceptually; though as you point out, the story
you give is short on details, and I'm not so sure you really know how
to "cash it out" in terms of mechanisms that will actually function
with adequate intelligence ... but that's another story...
However, you have given no argument as to why the failure of this kind
of architecture to be stably Friendly is so ASTOUNDINGLY UNLIKELY as
you claimed in your original email. You have just argued why it's
plausible to believe such a system would probably have a stable goal
system. As I see it, you did not come close to proving your original
claim, that
>> > The motivational system of some types of AI (the types you would
>> > classify as tainted by complexity) can be made so reliable that the
>> > likelihood of them becoming unfriendly would be similar to the
>> > likelihood of the molecules of an Ideal Gas suddenly deciding to
>> > split into two groups and head for opposite ends of their container.
I don't understand how this extreme level of reliability would be
achieved, in your design.
Rather, it seems to me that the reliance on complex, self-organizing
dynamics makes some degree of indeterminacy in the system almost
inevitable, thus making the system less than absolutely reliable.
Illustrating this point, humans (who are complex dynamical systems) are
certainly NOT reliable in terms of Friendliness or any other subtle
psychological property...
-- Ben G
The problem, Ben, is that your response amounts to "I don't see why that
would work", but without any details. You ask no questions, nor do you
redescribe the proposal back to me in specific terms, so it is hard not
to conclude that your comments are based on not understanding it.
You do go further at one point and say that you don't believe I can cash
out the sketch in terms of mechanisms that work. I fail to see how you
can come to such a strong conclusion, when the rest of what you say
implies (or says directly) that you do not understand the proposed
mechanism.
**************
The central claim was that because the behavior of the system is
constrained by a large number of connections that go from motivational
mechanism to thinking mechanism, the latter is tightly governed. You
know as well as I do about the power of massive numbers of weak
constraints. You know that as the numbers of constraints rise, the
depth of the potential well that they can define increases. I used that
general idea, coupled with some details about a motivational system, to
claim that the latter would constrain the thinking mechanism in just
that way. That leads to the possibility of an extremely deep potential
well whose bottom corresponds to the behavior we call Friendly.
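To make the potential-well idea concrete, here is a minimal sketch of my
own (an illustration of the general principle, not the proposed mechanism
itself). It assumes each weak constraint contributes only a shallow,
sloppily-aimed quadratic penalty over a one-dimensional "behavior space";
the names (TARGET, well_depth, etc.) are mine, chosen for the sketch:

```python
import random

random.seed(2)

TARGET = 0.0  # the desired region of behavior space, reduced to a point

def make_constraints(n, strength=0.01, slop=0.5):
    """n weak constraints: tiny strength, each one sloppily aimed near TARGET."""
    return [(random.gauss(TARGET, slop), strength) for _ in range(n)]

def potential(x, constraints):
    """Total 'energy' at behavior x: the sum of all the weak penalties."""
    return sum(k * (x - c) ** 2 for c, k in constraints)

def well_depth(constraints, edge=2.0):
    """How far the rim of the well (at x = edge) sits above its bottom."""
    bottom = min(potential(x / 100.0, constraints) for x in range(-200, 201))
    return potential(edge, constraints) - bottom

for n in (1, 10, 1000):
    print(n, round(well_depth(make_constraints(n)), 3))
```

No individual penalty governs anything, but the depth of the combined
well grows roughly in proportion to the number of constraints: that is
the sense in which many weak constraints can tightly govern behavior.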
You may disagree about the details, but in that case you should talk
about the details, not try to imply that there is no way whatsoever that
a type of behavior could be extremely tightly constrained. That latter
assertion is just wrong: there are ways to make a system very
predictable when multiple simultaneous weak constraints are applied. So
you are in no position to just deny *that* part. In principle, that is
doable. What matters is how I propose to get those multiple constraints
to work. I gave details. You do not respond with arguments against any
of those details.
**********
ASIDE
In case anyone else is reading this and is puzzled by the idea of
multiple weak constraints, let me give a classic example, due to Hinton:
There is an unknown thing (call it "x").
x is constrained in three ways, each of which is extremely vague.
Actually, it is worse than that: one of the constraints is simply
wrong. (But I will not tell you which one.)
Here are the three constraints:
1 x was intelligent
2 x was once an actor
3 x was once a president
Of all the things in the universe that x could be, most people are
able to identify what x referred to (or were, in the 1980s, when
this example was proposed). And yet there were only three extremely
weak pieces of information: weak because each is ambiguous, and because
one of them is not even reliable.
Now imagine an x constrained from a thousand different directions at
once. In principle, the value of x could be pinned down extremely
precisely. For what it is worth, this is the basic reason why neural
nets work as well as they do.
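The same effect can be shown numerically. Here is a small sketch of my
own (not from Hinton; the function names and parameters are assumptions
for illustration): each constraint is a very noisy reading of an unknown
x, and some fraction of the constraints are simply wrong, yet combining
many of them pins x down well:

```python
import random

random.seed(0)

TRUE_X = 42.0  # the unknown value the constraints are about

def weak_constraint(true_x, noise=10.0, p_wrong=0.1):
    """One vague, unreliable piece of evidence about x.

    With probability p_wrong the constraint is simply wrong (it points
    somewhere unrelated); otherwise it is a very noisy reading centred
    on the true value.
    """
    if random.random() < p_wrong:
        return random.uniform(-100.0, 100.0)  # a wrong constraint
    return random.gauss(true_x, noise)        # a weak (noisy) constraint

def estimate(n_constraints):
    """Combine n weak constraints with a median, robust to the wrong ones."""
    readings = sorted(weak_constraint(TRUE_X) for _ in range(n_constraints))
    return readings[len(readings) // 2]

for n in (3, 30, 1000):
    print(n, round(estimate(n), 2))
```

With three constraints the estimate wanders; with a thousand it sits
close to the true value, even though every individual constraint remains
just as weak and unreliable as before.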
**********
As for the last paragraph, the point you make there is pretty
unbelievable. You are claiming that human beings are complex dynamical
systems, and that they are not reliable in terms of Friendliness, and
that therefore....
This paragraph is just a collection of non sequiturs:
1) Are you saying that humans in general "are not reliable", or that
every individual that ever existed is unreliable? I accept that humans
in general can be unreliable, but it is silly to say that every
individual is unreliable just because other, unreliable individuals
exist somewhere. Human individuals have very different instances of the
same motivational system design.
2) At what point in my description did I make use of the idea that the
human mind is a complex dynamical system? (I did not: that is a
mistaken interpretation of what I said.) I made a distinction between
different manifestations of complexity, and the one most important to me
does not carry the implications that most people seem to associate with
"complex dynamical system". The point is not very relevant to the
mechanism I described.
3) And what would the argument in this paragraph have to do with the
mechanism I proposed, which at no point relied on using *exactly* the
same mechanism as the human one? Why infer that my proposal would
suffer the same problems that exist in some human cases? And especially
when I talked about the differences (not having certain motivations in
common, for example, and with the AI taking active steps to make itself
different by increasing the stabilization).
4) And why does "some degree of indeterminacy in the system ... [make]
the system less than absolutely reliable"? This is just a flat out
contradiction of my thesis, thrown out with no justification. And this,
after I had gone to so much trouble to cite counterexamples: an Ideal
Gas is riddled with indeterminacy, and yet the relationship between
temperature, pressure and volume is extremely reliable; the Sun is a
seething cauldron of indeterminacy, and yet nobody worries about the
non-zero probability that it might quantum-tunnel itself to somewhere in
the vicinity of Betelgeuse....
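The Ideal Gas counterexample can be made quantitative with a toy
simulation (my own sketch, assuming each molecular velocity is drawn
independently at random; the function names are mine): the microscopic
state is completely indeterminate, yet the macroscopic average that
temperature tracks becomes steadily more reliable as the number of
molecules grows:

```python
import random
import statistics

random.seed(1)

def mean_speed_sq(n):
    """Mean squared speed of n molecules with random velocities.

    Each microscopic velocity is indeterminate (drawn at random), but
    temperature tracks only this macroscopic average.
    """
    return statistics.fmean(random.gauss(0, 1) ** 2 for _ in range(n))

def relative_spread(n, trials=100):
    """Fluctuation of the macroscopic average across trials, relative to its mean."""
    samples = [mean_speed_sq(n) for _ in range(trials)]
    return statistics.stdev(samples) / statistics.fmean(samples)

for n in (10, 1000, 10_000):
    print(n, relative_spread(n))
```

The relative fluctuation shrinks roughly as one over the square root of
the number of molecules, which is why indeterminacy at the micro level is
perfectly compatible with extreme reliability at the macro level.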
I think it is okay that you did not understand it, but it would have
been fairer to ask for more detail before rushing to judgment.
Richard Loosemore
-----
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/[EMAIL PROTECTED]