I *very* rarely cross-post but felt that cross-posting here the following, 
which I just placed on the SL4 list, was the Friendliest way to fulfill my 
promise of "attempting to ensure that any salient points make it to both 
lists".

> From: Rolf Nelson 
> Here's some generic unsolicited advice for friendliness proposals.
> 1. It's not sufficient to have the correct solution, it has to be compelling 
> to other people or it will never get implemented. 

Well, my experience on SL4 certainly proves that -- so let me try to 
communicate why my solution is compelling.
  a. My APPROACH is compelling because it is simple, fairly easily explained, 
robust, and LIKELY TO LEAD TO A CORRECT SOLUTION EVEN IF MY CURRENT VERSION OF 
THE SOLUTION IS WRONG.
  b. The results that *I* am seeing are compelling to *me* because I suddenly 
have an awesome new Ethics tool that correctly does things that I've never seen 
done correctly before.
  c. The results that you all are seeing should be compelling because there's 
this person who suddenly goes berserk and starts yelling "Eureka, I've solved 
it!  All I have to do is DECLARE that I'm Friendly."

The solution is compelling because . . . . well . . . . it *would* be 
compellingly powerful if y'all believed it to be correct.  Sigh.  Except that 
I'm not adequately communicating it so that it looks correct.

The good news (for me) is that this realization suggests another approach that 
might be compelling --> describing the approach that IS compelling (literally) 
rather than the solution, which apparently is not.

= = = = = = = = = =
THE APPROACH

In order to simplify the task, I started out by *assuming* that Friendliness is 
not only possible but actually *reasonably* easy (heresy on this list and part 
of why I'm having such a tough time getting my message across).

<CRITICAL CLARIFICATION:  This assumption is *just a tool* to simplify the 
approach.  I do not believe that there is *any* basis to assert the truth of 
this assumption and any solutions derived DO NOT rely on the assumption.>

Assuming that Friendliness IS reasonably easy places some *very* specific 
constraints on the state space of any possible solution.  These constraints 
then make Friendliness easier to solve -- IF there is still a solution in the 
constrained space.

In particular, since everyone seems to believe that Friendliness is (virtually 
if not totally) impossible to stabilize, "easiness" seems to require that 
Friendliness *MUST* be self-stabilizing -- so the approach is entirely focused 
on that.

<REPEAT:  DERIVED ASSUMPTION:  Friendliness *MUST* be self-stabilizing>

Next, since Friendliness is at least as rich and complex as the sum of (as 
Thomas McCabe puts it) "the ten bazillion different things humans value", any 
self-stabilizing structure must be capable of at least that much complexity in 
order to be a solution.

The complexity issue led me to focus on attractors since they can be infinitely 
complex yet still constrained -- a perfect analogy for Friendliness!
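
For anyone who hasn't played with attractors before, here's a quick sketch 
(Python, using the classic Lorenz system purely as an illustration -- the 
choice of system is mine, not part of the argument) of exactly the property 
I'm leaning on:  the trajectory wanders forever without repeating, yet it 
never leaves a bounded region of state space.

# Minimal sketch of a strange attractor (the Lorenz system).
# The trajectory is aperiodic (infinitely complex) yet stays inside a
# bounded region of state space (constrained) -- the property the
# analogy above relies on.

def lorenz_step(x, y, z, dt=0.01, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    dx = sigma * (y - x)
    dy = x * (rho - z) - y
    dz = x * y - beta * z
    return x + dx * dt, y + dy * dt, z + dz * dt

x, y, z = 1.0, 1.0, 1.0
max_radius = 0.0
for _ in range(100000):
    x, y, z = lorenz_step(x, y, z)
    max_radius = max(max_radius, (x * x + y * y + z * z) ** 0.5)

print("trajectory never settled down, but stayed within radius",
      round(max_radius, 1))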

Asking the question "What would be attractive to an AGI (or any other 
intelligent entity)?" yields the answers "Their own self-interest!" and 
"Fulfilling their goals!"

Asking the question "What would be most repellent to an AGI (or any other 
intelligent entity)?" yields the answer "Having their goals interfered with!"

Now we're at the point where I can argue that if we have a set of entities that 
can fulfill both the personal goal of self-interest AND the "other guy" goal of 
not interfering with the goals of others, then we have a stable Friendly system.

So how do we collapse the two frequently conflicting goals into one uniform 
non-conflicting goal?

How about "Don't interfere with the goals of others unless not doing so 
basically prevents you from fulfilling your goals (explicitly not including low 
probability freak events for you pedants out there)"?

That's a pretty close approximation and has the really cool, awesome trait that 
all of the basic precepts and conclusions of ethics (according to me) just 
naturally fall out of the implications and effects of everyone holding that 
goal as a primary goal.
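
To make that goal statement concrete, here's a minimal decision-rule sketch 
(Python; the function name, parameters, and example calls are my own 
illustrative assumptions, not wording from the Declaration):

# A one-line predicate version of the primary goal:  interference with
# the goals of others is only permitted when refraining would
# essentially prevent you from fulfilling your own goals.

def permitted(interferes_with_others, own_goals_blocked_otherwise):
    if not interferes_with_others:
        return True                        # no interference -> always fine
    return own_goals_blocked_otherwise     # interfere only when forced to

print(permitted(False, False))  # True  -- harmless action
print(permitted(True, False))   # False -- gratuitous interference
print(permitted(True, True))    # True  -- necessary for your own goals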

Or, in other words, pretty much PROVING (in the loose sense of the word for you 
pedants) that ETHICS IS SIMPLY ENLIGHTENED SELF-INTEREST BECAUSE THEY BOTH FALL 
OUT OF THE SAME PRIMARY GOAL STATEMENT.

And THAT, I believe, is *really* exciting and compelling and thus the slogan 
"Friendliness:  The Ice-9 of Ethics and the Ultimate in Self-Interest".

Now, if you can/do believe the slogan, then you're an idiot for not making a 
Declaration of Friendliness and attempting to create and join a stable Friendly 
society/group because doing so is "the Ultimate in Self-Interest".  

(Note:  There is absolutely no requirement that everyone participate for 
Friendliness to be in your self-interest -- merely that you have a group of 
participating entities.  The larger the group, the stronger the effect -- which 
is why the secondary goal of Friendliness is to spread -- but it works just 
fine even if not everyone plays.)
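
As a toy illustration of "the larger the group, the stronger the effect" (the 
interaction model and payoff numbers below are purely my own assumptions for 
the sketch, nothing derived from the Declaration itself):

# Toy model:  each of 100 agents interacts once with every other agent.
# A pair of Friendly agents gains from mutual non-interference; an
# interaction with a non-participant is assumed neutral (the
# Declaration's protections are assumed to prevent exploitation).

def avg_friendly_gain(num_friendly, num_total=100, cooperation_gain=3.0):
    others = num_total - 1
    friendly_partners = num_friendly - 1
    return friendly_partners * cooperation_gain / others

for n in (2, 5, 20, 80):
    print(n, "of 100 participating ->",
          round(avg_friendly_gain(n), 2), "average gain per interaction")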

My declaration of Friendliness was just such an attempt.  It took on the 
primary overriding Friendliness goal (and the secondary goal of spreading 
Friendliness), added some protections against being taken advantage of by 
UnFriendlies and Friendly Mimics, and finished by adding the statements 
necessary to make it a complete closed system/solution that protected both my 
self-interest and that of others.  My *initial* claims are that my Declaration 
of Friendliness:
  a. is in my self-interest
  b. does not allow me to commit horrible and unethical acts without breaking 
the declaration.
My follow-on claim is that IF you can make an AGI that can and does 
understand (because it is true) that making a Declaration of Friendliness and 
following through on it is in its own self-interest, then you will have an 
ETHICAL machine that will only stomp on your goals (or existence) 
  a. when it is the ethically correct thing to do OR
  b. out of IGNORANCE or ERROR (which is an intelligence problem, not a 
Friendliness problem).
And my final claim is that the above-described AGI is AT LEAST a 
Friendliness-satisficing AGI (if it isn't actually the most Friendly AGI 
possible -- which I believe it is).

    Mark

Vision/Slogan - Friendliness:  The Ice-9 of Ethics and the Ultimate in 
Self-Interest

