Mark Waser wrote:
If the motives depend on "satisficing", and the quest for
unlimited fulfillment is avoided, then this limits the danger. The
universe won't be converted into toothpicks, if a part of setting the
goal for "toothpicks!" is limiting the quantity of toothpicks.
(Limiting it reasonably might almost be a definition of friendliness
... or at least neutral behavior.)
You have a good point. Goals should be fulfilled after satisficing
except when the goals are of the form "as <goal> as possible"
(hereafter referred to as "unbounded" goals). Unbounded-goal-entities
*are* particularly dangerous (although being aware of the danger
should mitigate it to some degree).
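The distinction between satisficing and "as <goal> as possible" goals can be made concrete with a toy utility function. This is only an illustrative sketch (the function names and numbers are made up, not anyone's actual proposal): a bounded goal stops paying off once its target is met, while an unbounded goal rewards more production forever.

```python
def bounded_goal_value(quantity, target):
    """Utility of a satisficing goal: flat once the target is met."""
    return min(quantity, target)

def unbounded_goal_value(quantity):
    """Utility of an 'as much as possible' goal: grows without limit."""
    return quantity

# A bounded toothpick-maker gains nothing past its target...
assert bounded_goal_value(1_000_000, target=100) == bounded_goal_value(100, target=100)
# ...while an unbounded one always prefers more, however much it has.
assert unbounded_goal_value(1_000_000) > unbounded_goal_value(100)
```

On this picture, converting the universe into toothpicks is only ever instrumentally attractive to the second kind of goal.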
My Friendliness basically works by limiting the amount of interference
with others' goals (under the theory that doing so will prevent
others from interfering with your goals). Stupid entities that can't
see the self-interest in the parenthetical point are not inclined to
be Friendly. Stupid unbounded-goal-entities are Eliezer's
paperclip-producing nightmare.
And, though I'm not clear on how this should be set up, this
"limitation" should be a built-in primitive, i.e. not something
subject to removal, but only to strengthening or weakening via
learning. It should ante-date the recognition of visual images. But
it needs to have a slightly stronger residual limitation than it does
in people. Or perhaps its initial appearance needs to come during
the formation of the statement of the problem, i.e., a solution to a
problem can't be sought without knowing the limits. People seem to
manage that via a dynamic sensing approach, which sometimes suffers
from inadequate feedback mechanisms (for saying "Enough!").
The limitation is "Don't stomp on other people's goals unless it is
truly necessary" *and* "It is very rarely truly necessary".
(It's not clear to me that it differs from what you are saying, but
it does seem to address a part of what you were addressing, and I
wasn't really clear about how you intended the satisfaction of goals
to be limited.)
As far as my theory/vision goes, I was pretty much counting on the
fact that we are multi-goal systems and that our other goals will
generally limit any single goal from getting out of hand. Further, if
that doesn't do it, the proclamation of not stepping on others' goals
unless absolutely necessary should help handle the problem . . . . but
. . . . actually you do have a very good point. My theory/vision
*does* have a vulnerability toward single-unbounded-goal entities in
that my Friendly attractor has no benefit for such a system (unless,
of course, its goal is Friendliness or it is forced to have a
secondary goal of Friendliness).
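The point that "our other goals will generally limit any single goal" can be sketched numerically. Assuming (purely for illustration; these names and figures are hypothetical) a fixed effort budget and satisficing goals, effort poured into one already-satisfied goal is effort taken from the others, so the remaining goals naturally punish obsession with any single one:

```python
def total_value(allocation, targets):
    """Sum of satisficing utilities: each goal's value caps at its target."""
    return sum(min(effort, target) for effort, target in zip(allocation, targets))

targets = [3, 3, 3]      # three goals, each satisfied by 3 units of effort
balanced = [3, 3, 3]     # a budget of 9 spread across all goals
obsessive = [9, 0, 0]    # the whole budget spent on one goal

assert total_value(balanced, targets) == 9
assert total_value(obsessive, targets) == 3  # overshooting goal 1 buys nothing
```

A single-unbounded-goal entity is exactly the case where this internal check disappears, since nothing caps the one term in the sum.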
The trouble with "not stepping on others' goals unless absolutely
necessary" is that it relies on mind-reading. The goals of others are
often opaque, and not easily verbalized even when they think to try.
Then there's the question of "unless absolutely necessary": how and
why should I decide that their goals are more important than mine? One
needs to know not only how important their goals are to them, but also
how important my conflicting goals are to me. And, of course, whether
there's a means for mutual satisfaction that isn't too expensive. (And
just try to define that "too".)
For some reason I'm reminded of the story about the peasant, his son,
and the donkey carrying a load of sponges. I'd just as soon nobody ends
up in the creek. ("Please all, please none.")