Mark Waser wrote:
And apart from the global differences between the two types of AGI, it would be no good to try to guarantee friendliness using the kind of conventional AI system that Novamente is, because insofar as general goals are encoded in such a system, they are explicitly coded as "statements" which are then interpreted by something else. To put it crudely (and to oversimplify slightly), if the goal "Be empathic to the needs of human beings" were represented just like that, as some kind of proposition stored at a particular location, it wouldn't take much for a hacker to get inside and change the statement to "Make [hacker's name] rich and sacrifice as much of humanity as necessary". If that were to become the AGI's top-level goal, we would be in deep doodoo. In the system I propose, such events could not happen.

I think that this focuses on the wrong aspect. It is not the fact that the goal is explicitly encoded as a statement that is the problem -- it is the fact that it is in only one place that is dangerous. My assumption is that your system basically builds its base constraints from a huge number of examples, and that it is distributed enough that it would be difficult, if not impossible, to maliciously change enough of it to cause a problem. The fact that you're envisioning your system as not having easy-to-read statements is really orthogonal to your argument: a system that explicitly codes all of its constraints as readable statements, but still builds its base constraints from a huge number of examples, should be virtually as incorruptible as your system (the only difference being security by obscurity -- which is not a good thing to rely upon, and which also means that your system is less comprehensible).

Mark,

You have put your finger on one aspect of the proposal that came up, in a slightly different way, when Jef Allbright started talking about pragmatics: the "semantics" of the system. This is the hardest feature to explain in a short space.

I really did consciously mean both things: not just distributed representation of the constraints, but also distributed semantics of the system as a whole. This distributed, semi-opaque semantics is what I meant above by the propositions not being explicitly encoded, and what I was also referring to in my comment to Jef.

If the basic knowledge units ("atoms") of the system develop as a result of learning mechanisms plus real-world interaction (which together make them grounded), then the meaning of any given atom is encoded in the whole web of connections between it and the other atoms, and also by the mechanisms that browse (use, modify) these atoms. It is not easy to point to an atom and say exactly what it does.
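To make that concrete, here is a deliberately crude toy sketch (purely illustrative: the names Atom, connect and so on are invented for this example and have nothing to do with the real architecture, or with Novamente). It contrasts a goal stored as one editable proposition with a disposition that exists only as a pattern of weights spread across many atoms, none of which can be read off or edited in isolation:

    # Toy illustration only -- not Novamente, and not the proposed system.
    # Contrast: a goal held as one explicit statement versus a disposition
    # that exists only as weights distributed across many atoms.
    import random

    # Conventional encoding: one proposition at one address.
    goals = {"top_goal": "Be empathic to the needs of human beings"}
    # A hacker only has to change a single string:
    #   goals["top_goal"] = "Make [hacker's name] rich ..."

    # Distributed encoding: meaning lives in the web of connections.
    class Atom:
        def __init__(self, name):
            self.name = name      # a label for our convenience only
            self.links = {}       # other Atom -> connection weight

    def connect(a, b, w):
        a.links[b] = w
        b.links[a] = w

    random.seed(0)
    atoms = [Atom("a%d" % i) for i in range(200)]
    for a in atoms:
        for b in random.sample(atoms, 8):
            if b is not a:
                connect(a, b, random.uniform(-1.0, 1.0))

    # There is no atom you can point to and read off "the goal": each
    # atom's role is defined only by its weights to all the others,
    # which in the real case would be shaped by learning and by
    # interaction with the world.
    print(sum(len(a.links) for a in atoms), "connections, no goal string anywhere")

The point of the sketch is only that in the second representation there is no single location whose contents correspond to the goal, so there is nothing for a clean edit to target.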

This is not an optional part of the framework: it is crucial. It is the main reason why the system has some complexity. It is also the reason why the system can be properly grounded and is scalable (something that cannot be achieved with an ordinary, conventional AI system, because of the complex systems problem).

In a sense the system is less comprehensible, but this is only a matter of degree. I don't think it makes any practical difference to our attempts to govern its behavior. It is going to be comprehensible enough that we can put hooks in for monitoring purposes.

The great benefit of this way of doing things is that, once the system has matured to adulthood, it cannot be hacked: you cannot just write a worm that goes around hunting for constraints and modifying them in a regular way (as you might be able to do with ordinary distributed constraints, where the semantics of each individual atom is well enough defined that you can make a clean edit), because if you tried to do this you would destabilize the whole thing and turn it into a gibbering wreck. It would stop working, and the effect would be so dramatic that we (and it) could easily set up automatic shutdown mechanisms to intervene in such a case.
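As a hedged sketch of what that could look like (again purely illustrative; the "coherence" measure below is a crude stand-in I have invented for whatever battery of internal consistency checks a real system would expose through its monitoring hooks):

    # Toy watchdog sketch, purely illustrative.  A global "coherence"
    # score stands in for the internal health signals a mature system
    # would expose; a blunt worm edit is modelled as overwriting a block
    # of weights in a regular way.
    import random

    random.seed(0)
    N = 200
    # Start from a mutually consistent (here: symmetric) web of weights,
    # standing in for constraints shaped by years of grounded learning.
    weights = [[0.0] * N for _ in range(N)]
    for i in range(N):
        for j in range(i + 1, N):
            w = random.gauss(0.0, 1.0)
            weights[i][j] = weights[j][i] = w

    def coherence():
        # Crude proxy for internal consistency: symmetry of the web.
        asym = sum(abs(weights[i][j] - weights[j][i])
                   for i in range(N) for j in range(N))
        return -asym / (N * N)    # 0.0 = fully coherent

    THRESHOLD = -0.05             # tolerance chosen for the toy example

    def watchdog():
        if coherence() < THRESHOLD:
            raise SystemExit("coherence collapsed: automatic shutdown")

    watchdog()                    # the healthy system passes

    # A worm tries to rewrite constraints "in a regular way": it
    # overwrites every weight touching the first 20 atoms.
    for i in range(20):
        for j in range(N):
            weights[i][j] = 5.0

    watchdog()                    # global consistency is wrecked first

The edit destroys the mutual consistency of the web long before it produces any coherent change in behavior, which is exactly the kind of dramatic, easily detected failure that an automatic shutdown mechanism can be keyed to.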




Richard Loosemore


