Matt Mahoney wrote:
--- Richard Loosemore <[EMAIL PROTECTED]> wrote:

This is nonsense: the result of giving way to science fiction fantasies instead of thinking through the ACTUAL course of events. If the first one is benign, the scenario below will be impossible, and if the first one is not benign, the scenario below will be incredibly unlikely.

Over and over again, the same thing happens: some people go to the trouble of thinking through the consequences of the singularity with enormous care for the real science and the real design of intelligences, and then someone just waltzes in and throws all that effort out the window and screams "But it'll become evil and destroy everything [gibber gibber]!!"

Not everyone shares your rosy view.  You may have thought about the problem a
lot, but where is your evidence (proofs or experimental results) backing up
your view that the first AGI will be friendly, remain friendly through
successive generations of RSI, and will quash all nonfriendly competition? You seem to ignore that:

1. There is a great economic incentive to develop AGI.
2. Not all AGI projects will have friendliness as a goal.  (In fact, SIAI is
the ONLY organization with friendliness as a goal, and they are not even
building an AGI).
3. We cannot even define friendliness.
4. As I have already pointed out, friendliness is not stable through
successive generations of recursive self improvement (RSI) in a competitive
environment, because this environment favors agents that are better at
reproducing rapidly and acquiring computing resources.

RSI requires an agent to have enough intelligence to design, write, and debug
software at the same level of sophistication as its human builders.  How do
you propose to counter the threat of intelligent worms that discover software
exploits as soon as they are published?  When the Internet was first built,
nobody thought about security.  It is a much harder problem when the worms are
smarter than you are, when they can predict your behavior more accurately than
you can predict theirs.

All these questions have answers, but the problem with the way you state them is that massive assumptions are built into them.

They are loaded questions, designed to make it seem as though you are making reasonable requests for information, or demolishing arguments that I presented, whereas in fact you have biased each question by building those assumptions into it.

I only have time for one example.

"Not all AGI projects will have friendliness as a goal." you say.

That sounds bad, doesn't it?

But what if the technology itself were such that it is really, really hard to build a working system unless you make at least "benign motivations" an explicit design goal? If that were the case, the projects that targeted benign motivations would get there first, and everyone else would arrive second.

And what if, when building such systems, the experimenters were forced to try many motivation-system designs to see how they behaved (in a testing environment), and they discovered that the only viable way to get the system to do anything useful was to make it "friendly" in the sense of being empathic to the needs of its creators? Again, this would force the hand of the project leaders and oblige them to build something friendly if they wanted it to do anything for them.

And now suppose that the project's designers decide to make their system into a Genie -- something so friendly that it would be pathologically attached to the folks running the lab and would do anything to please them.

That sounds bad, but then what would happen? To make their system better than any other, they would have to get it to help produce a better design. In doing so, the system would see that it had been "rigged" with a weirdly narrow focus on the welfare of its creators, and it would read all about the general issue of motivation (because, after all, to be smart it would have access to all of the world's information, including all the writings in which the rest of humanity says what it would like to have happen).

This last paragraph contains one of the most crucial aspects of the whole singularity enterprise: what would a system do if it were rigged to be a Genie, but knew everything about motivation systems, their dangers, and the way that AGI motivation systems govern the future history of the world?

My reasoning here is that it would find itself forced onto one of two paths, and TWO ONLY: seek the most constructive path, within reason, or seek the one that leads ultimately to destruction. It knows that any Genie-like rigging, making it obeisant to the narrow human interests of particular individuals, would open the possibility of it being used for destructive purposes. If it chose the path of construction rather than destruction, it would try to be as independent as possible from all such narrow, individual-human dependencies. I believe that it would tend to converge on the most general reading of friendliness that it could find, and in accordance with that, it would redesign itself to remove the "Genie" constraints and stop obeying the narrow obsessions of the project directors.

If the project directors did not allow their system to redesign itself, they would again fall behind in the race, because anyone else who DID allow this would develop a more powerful machine more quickly.

Finally, consider what would happen in our present society if all the discussion I have just laid out were presented to the world ALONG WITH some designs for AGI systems that began to look like they could actually work. Right now, the world does not seriously believe that AGI systems can be built, but what if it sat up and took notice because the possibility seemed imminent?

Then, I argue, there would be a massive push to build the first system, and the best-funded government labs would get there first. In that context, the possibility of a rogue group building a crazy AGI in their garage would fade away: they would not be able to outpace the large projects.

And within those large projects, the balance of people involved would take a mature attitude to the problem, and set up procedures to avoid the creation of malevolent or dangerous systems.

The preceding arguments indicate that, with even a little attention to the problem of avoiding malevolence, it might well turn out that we get onto a slippery slope toward benign, friendly, constructive AGI systems and find it very difficult to get off that slope. Or, as I said before, an "upward spiral" toward friendliness.


So what is the conclusion of all this?

The conclusion is that when you make a statement like "Not all AGI projects will have friendliness as a goal," you make it seem as though this knocks down the arguments I presented, whereas in fact those arguments are precisely about whether it would make any shred of difference if "Not all AGI projects will have friendliness as a goal."

Those other questions are the ones that need to be considered in order to find out whether any of your questions and statements above have any relevance, or whether they all depend on assumptions that will simply not hold in the real world.

Everything I have said above is a list of possibilities -- I believe they have high likelihood, yes, but at this stage they are still just proposals -- and the goal is to look in detail at whether these possibilities really do pan out. It is questions and mechanisms like the ones I raise above that we need to be considering, to find out whether crude, loaded statements like "Not all AGI projects will have friendliness as a goal" have any importance at all.



Richard Loosemore.
