Matt Mahoney wrote:
--- Richard Loosemore <[EMAIL PROTECTED]> wrote:
This is nonsense: the result of giving way to science fiction fantasies
instead of thinking through the ACTUAL course of events. If the first
one is benign, the scenario below will be impossible, and if the first
one is not benign, the scenario below will be incredibly unlikely.
Over and over again, the same thing happens: some people go to the
trouble of thinking through the consequences of the singularity with
enormous care for the real science and the real design of intelligences,
and then someone just waltzes in and throws all that effort out the
window and screams "But it'll become evil and destroy everything [gibber
gibber]!!"
Not everyone shares your rosy view. You may have thought about the problem a
lot, but where is your evidence (proofs or experimental results) backing up
your view that the first AGI will be friendly, remain friendly through
successive generations of RSI, and will quash all nonfriendly competition?
You seem to ignore that:
1. There is a great economic incentive to develop AGI.
2. Not all AGI projects will have friendliness as a goal. (In fact, SIAI is
the ONLY organization with friendliness as a goal, and they are not even
building an AGI).
3. We cannot even define friendliness.
4. As I have already pointed out, friendliness is not stable through
successive generations of recursive self improvement (RSI) in a competitive
environment, because this environment favors agents that are better at
reproducing rapidly and acquiring computing resources.
RSI requires an agent to have enough intelligence to design, write, and debug
software at the same level of sophistication as its human builders. How do
you propose to counter the threat of intelligent worms that discover software
exploits as soon as they are published? When the Internet was first built,
nobody thought about security. It is a much harder problem when the worms are
smarter than you are, when they can predict your behavior more accurately than
you can predict theirs.
All these questions have answers, but the problem with the way you state
your questions is that there are massive assumptions behind them.
They are loaded questions, designed to make it seem like you are making
reasonable requests for information, or demolishing arguments that I
presented, whereas in fact you have biased each question by building in
the assumptions.
I only have time for one example.
"Not all AGI projects will have friendliness as a goal." you say.
That sounds bad, doesn't it?
But what if the technology itself were such that it is really, really
hard to build systems that do not have at least "benign motivations" as
a system design goal? If this were the case, we would face a situation
in which all those projects that targeted benign motivations would get
there first, and everyone else would arrive second.
And what if, when building such systems, the experimenters were forced
to try many motivation-system designs to see how they behaved (in a
testing environment), and they discovered that to get the system to do
things that were useful in any way, the only viable option would be to
make the system "friendly" in the sense of being empathic to the needs
of its creators? Again, this would force the hand of the project
leaders and oblige them to build something friendly, if they want it to
do anything for them.
And now suppose that the project's designers decide to make their system
into a Genie -- something so friendly that it would be pathologically
attached to the folks running the lab, and do anything to please them.
That sounds bad, but then what would happen? To make their system
better than any other, they would have to get it to help out with
producing a better design. In the course of doing that, the system sees
that it has been "rigged" with a weirdly narrow focus on the welfare of its
creators, and it reads all about the general issue of motivation
(because, after all, to be smart it will have access to all of the
world's information, including all the writings in which the rest of
humanity says what it would like to have happen).
This last paragraph contains one of the most crucial aspects of the
whole singularity enterprise: what would a system do if it were rigged
to be a Genie, but knew everything about motivation systems, their
dangers, and the way that AGI motivation systems govern the future
history of the world?
My reasoning here is that it would find itself forced into two paths,
and TWO ONLY: seek the most constructive path, within reason, or seek
the one that leads ultimately to destruction. It knows that any
Genie-like rigging, to make it obedient to the narrow human interests of
particular individuals, would open the possibility of it being used for
destructive purposes. If it chose the path of construction rather than
destruction, it would try to be as independent as possible from all such
narrow, individual-human dependencies. I believe that it would tend to
converge on the most general reading of friendliness that it can find,
and in accordance with that, it would design itself to remove the
'Genie' constraints and stop obeying the narrow obsessions of the project
directors.
If the project directors did not allow their system to redesign itself,
they would again fall behind in the race, because anyone else who DID
allow it would develop a more powerful machine more quickly.
Finally, consider the question of what would happen in our present
society if all this discussion I have just laid out were presented to
the world ALONG WITH some designs for AGI systems that began to seem
like they could actually work. Right now, the world does not seriously
believe that AGI systems can be built, but what if people sat up and
took notice, because the possibility seemed imminent?
Then, I argue, there would be a massive push to build the first system,
and the most well-funded government labs would do it first. In that
context, the possibility of a rogue group building a crazy AGI in their
garage would fade away: they would not be able to outpace the large
projects.
And within those large projects, the balance of people involved would
take a mature attitude to the problem, and set up procedures to avoid
the creation of malevolent or dangerous systems.
The preceding arguments indicated that with even a little attention to
the problem of avoiding malevolence, it might well turn out that we get
onto a slippery slope towards benign, friendly, constructive AGI
systems, and find it very difficult to get off that slippery slope. Or,
as I said before, an "upward spiral" toward friendliness.
So what is the conclusion of all this?
The conclusion is that when you make a statement like "Not all AGI
projects will have friendliness as a goal," you make it seem as though
this knocks down the arguments I presented, whereas in fact the
arguments are all about whether it will make any shred of difference if
"Not all AGI projects will have friendliness as a goal".
Those other questions are the ones that need to be considered, to find
out if any of your questions/statements above have any relevance, or if,
maybe, they all depend on assumptions that will simply not hold in the
real world.
Everything I have said above is a list of possibilities -- I believe
that these have high likelihood, yes, but at this stage they are still
just proposals -- and the goal is to look in detail at whether these
possibilities really do pan out. It is questions/mechanisms like the
ones I raise above that we need to be considering, to find out if all of
the crude, loaded questions like "Not all AGI projects will have
friendliness as a goal." have any importance at all.
Richard Loosemore.
-----
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?member_id=4007604&id_secret=56712058-28602e