Re: AGI goals (was Re: Information theoretic approaches to AGI (was Re: [agi] The Necessity of Embodiment))

Mark Waser Wed, 27 Aug 2008 19:19:13 -0700

Hi,

   I think that I'm missing some of your points . . . .

Whatever good is, it cannot be something directly
observable, or the AI will just wirehead itself (assuming it gets
intelligent enough to do so, of course).

I don't understand this unless you mean by "directly observable" that the definition is observable and changeable. If I define good as making all humans happy without modifying them, how would the AI wirehead itself? What am I missing here?

So, the AI needs to have a concept of external goodness, with a weak
probabilistic correlation to its directly observable pleasure.

I agree with the concept of external goodness but why does the correlation between external goodness and its pleasure have to be low? Why can't external goodness directly cause pleasure? Clearly, it shouldn't believe that its pleasure causes external goodness (that would be reversing cause and effect and an obvious logic error).


   Mark

P.S. I notice that several others answered your wirehead query so I won't belabor the point. :-)

----- Original Message ----- From: "Abram Demski" <[EMAIL PROTECTED]>

To: <agi@v2.listbox.com>
Sent: Wednesday, August 27, 2008 3:43 PM

Subject: **SPAM** Re: AGI goals (was Re: Information theoretic approaches to AGI (was Re: [agi] The Necessity of Embodiment))

Mark,

The main motivation behind my setup was to avoid the wirehead
scenario. That is why I make the explicit goodness/pleasure
distinction. Whatever good is, it cannot be something directly
observable, or the AI will just wirehead itself (assuming it gets
intelligent enough to do so, of course). But, goodness cannot be
completely unobservable, or the AI will have no idea what it should
do.

So, the AI needs to have a concept of external goodness, with a weak
probabilistic correlation to its directly observable pleasure. That
way, the system will go after pleasant things, but won't be able to
fool itself with things that are maximally pleasant. For example, if
it were to consider rewiring its visual circuits to see only
skin-color, it would not like the idea, because it would know that
such a move would make it less able to maximize goodness in general.
(It would know that seeing only tan does not mean that the entire
world is made of pure goodness.) An AI that was trying to maximize
pleasure would see nothing wrong with self-stimulation of this sort.

So, I think that pushing the problem of goal-setting back to
pleasure-setting is very useful for avoiding certain types of
undesirable behavior.
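
The difference between the two kinds of agent can be sketched in a few lines. This is only an illustration: the `Plan` type, the two candidate plans, and every number in them are made up for the example, and the "predicted goodness" is simply assumed to come from whatever model of external goodness the system has.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    name: str
    predicted_pleasure: float  # what the pleasure signal would read afterwards
    predicted_goodness: float  # the agent's own estimate of external goodness


# Hypothetical candidates; the values are illustrative assumptions only.
plans = [
    Plan("rewire vision to see only skin-color", 1.0, 0.1),
    Plan("help people and keep sensors intact", 0.6, 0.7),
]

# An agent that scores plans by pleasure picks self-stimulation.
pleasure_choice = max(plans, key=lambda p: p.predicted_pleasure)

# An agent that scores plans by estimated external goodness does not,
# because it predicts that rewiring lowers goodness in the world.
goodness_choice = max(plans, key=lambda p: p.predicted_goodness)

print(pleasure_choice.name)
print(goodness_choice.name)
```

The point of the sketch is only that the two agents rank the same plans differently; how the goodness estimate is learned is left open here.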

By the way, where does this term "wireheading" come from? I assume
from context that it simply means self-stimulation.

-Abram Demski

On Wed, Aug 27, 2008 at 2:58 PM, Mark Waser <[EMAIL PROTECTED]> wrote:

Hi,

  A number of problems unfortunately . . . .

-Learning is pleasurable.


. . . . for humans.  We can choose whether to make it so for machines or
not.  Doing so would be equivalent to setting a goal of learning.

-Other things may be pleasurable depending on what we initially want
the AI to enjoy doing.

See . . . all you've done here is pushed goal-setting to pleasure-setting . . . .

= = = = =

Further, if you judge goodness by pleasure, you'll probably create an AGI whose shortest path-to-goal is to wirehead the universe (which I consider to be a seriously suboptimal situation - YMMV).




----- Original Message ----- From: "Abram Demski" <[EMAIL PROTECTED]>
To: <agi@v2.listbox.com>
Sent: Wednesday, August 27, 2008 2:25 PM

Subject: **SPAM** Re: AGI goals (was Re: Information theoretic approaches to AGI (was Re: [agi] The Necessity of Embodiment))

Mark,

OK, I take up the challenge. Here is a different set of goal-axioms:

-"Good" is a property of some entities.
-Maximize good in the world.
-A more-good entity is usually more likely to cause goodness than a
less-good entity.
-A more-good entity is often more likely to cause pleasure than a
less-good entity.
-"Self" is the entity that causes my actions.
-An entity with properties similar to "self" is more likely to be good.

Pleasure, unlike goodness, is directly observable. It comes from many
sources. For example:
-Learning is pleasurable.
-A full battery is pleasurable (if relevant).
-Perhaps the color of human skin is pleasurable in and of itself.
(More specifically, all skin colors of any existing race.)
-Perhaps also the sound of a human voice is pleasurable.
-Other things may be pleasurable depending on what we initially want
the AI to enjoy doing.

So, the definition of "good" is highly probabilistic, and the system's
inferences about goodness will depend on its experiences; but pleasure
can be directly observed, and the pleasure-mechanisms remain fixed.
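
One way to read "inferences about goodness will depend on its experiences" is as a Bayesian update: goodness is hidden, pleasure is observed, and each observation shifts the estimate a little. The sketch below assumes a binary good/not-good entity and two made-up likelihoods (0.7 and 0.4); neither the function name nor the numbers come from the axioms above.

```python
def update_goodness(prior: float, pleasure_observed: bool,
                    p_pleasure_if_good: float = 0.7,
                    p_pleasure_if_not: float = 0.4) -> float:
    """Bayes update of P(entity is good) after one interaction.

    The likelihoods are illustrative assumptions: a good entity is
    somewhat more likely to cause pleasure than a non-good one.
    """
    if pleasure_observed:
        like_good, like_not = p_pleasure_if_good, p_pleasure_if_not
    else:
        like_good, like_not = 1 - p_pleasure_if_good, 1 - p_pleasure_if_not
    evidence = like_good * prior + like_not * (1 - prior)
    return like_good * prior / evidence


belief = 0.5
for _ in range(20):  # twenty pleasant interactions in a row
    belief = update_goodness(belief, pleasure_observed=True)
print(belief)
```

Because the two likelihoods are close together, a single pleasant experience moves the belief only modestly, and the belief stays below certainty after any finite number of observations.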

On Wed, Aug 27, 2008 at 12:32 PM, Mark Waser <[EMAIL PROTECTED]> wrote:


But, how does your description not correspond to giving the AGI the
goals of being helpful and not harmful? In other words, what more does
it do than simply try for these? Does it pick goals randomly such that
they conflict only minimally with these?


Actually, my description gave the AGI four goals: be helpful, don't be
harmful, learn, and keep moving.

Learn, all by itself, is going to generate an infinite number of
subgoals.

Learning subgoals will be picked based upon what is most likely to learn
the most while not being harmful.

(and, by the way, be helpful and learn should both generate a
self-protection sub-goal  in short order with procreation following
immediately behind)

Arguably, be helpful would generate all three of the other goals but
learning and not being harmful without being helpful is a *much* better
goal-set for a novice AI to prevent "accidents" when the AI thinks it is
being helpful. In fact, I've been tempted at times to entirely drop the
be helpful since the other two will eventually generate it with a
lessened probability of trying-to-be-helpful accidents.

Don't be harmful by itself will just turn the AI off.

The trick is that there needs to be a balance between goals. Any
single-goal intelligence is likely to be lethal even if that goal is to
help humanity.

Learn, do no harm, help. Can anyone come up with a better set of goals?

(and, once again, note that learn does *not* override the other two --
there is meant to be a balance between the three).

----- Original Message ----- From: "Abram Demski" <[EMAIL PROTECTED]>

To: <agi@v2.listbox.com>
Sent: Wednesday, August 27, 2008 11:52 AM

Subject: **SPAM** Re: AGI goals (was Re: Information theoretic approaches to AGI (was Re: [agi] The Necessity of Embodiment))

Mark,

I agree that we are mired 5 steps before that; after all, AGI is not
"solved" yet, and it is awfully hard to design prefab concepts in a
knowledge representation we know nothing about!

But, how does your description not correspond to giving the AGI the
goals of being helpful and not harmful? In other words, what more does
it do than simply try for these? Does it pick goals randomly such that
they conflict only minimally with these?

--Abram

On Wed, Aug 27, 2008 at 11:09 AM, Mark Waser <[EMAIL PROTECTED]> wrote:

It is up to humans to define the goals of an AGI, so that it will do
what we want it to do.


Why must we define the goals of an AGI?  What would be wrong with
setting it off with strong incentives to be helpful, even stronger
incentives to not be harmful, and let it chart its own course based upon
the vagaries of the world?  Let its only hard-coded goal be to keep its
satisfaction above a certain level, with helpful actions increasing
satisfaction, harmful actions heavily decreasing satisfaction, learning
increasing satisfaction, and satisfaction naturally decaying over time so
as to promote action . . . .

Seems to me that humans are pretty much coded that way (with evolution's
additional incentives of self-defense and procreation). The real trick
of the matter is defining helpful and harmful clearly but everyone is
still mired five steps before that.
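
A toy version of the satisfaction scheme proposed in this message might look like the following. It is a sketch under stated assumptions: the class name, the threshold, the decay rate, and the gain and penalty sizes are all invented for illustration, and it sidesteps the hard part named above, which is deciding what counts as helpful or harmful.

```python
from dataclasses import dataclass


@dataclass
class SatisfactionAgent:
    """Toy satisfaction bookkeeping; every constant here is an assumption."""
    satisfaction: float = 1.0
    threshold: float = 0.5     # level the agent tries to stay above
    decay: float = 0.9         # multiplicative decay per time step
    helpful_gain: float = 0.3
    learning_gain: float = 0.2
    harm_penalty: float = 1.0  # larger than helpful_gain: harm costs more

    def tick(self) -> None:
        # Satisfaction decays on its own, which pushes the agent to act.
        self.satisfaction *= self.decay

    def apply(self, helpful: float = 0.0, harmful: float = 0.0,
              learned: float = 0.0) -> None:
        # Helping and learning raise satisfaction; harm lowers it heavily.
        self.satisfaction += (self.helpful_gain * helpful
                              + self.learning_gain * learned
                              - self.harm_penalty * harmful)

    def needs_action(self) -> bool:
        return self.satisfaction < self.threshold


agent = SatisfactionAgent()
steps = 0
while not agent.needs_action():
    agent.tick()
    steps += 1
print(steps, agent.needs_action())
agent.apply(helpful=1.0)
print(agent.needs_action())
```

With these particular numbers an idle agent drops below the threshold after a handful of steps, and one helpful action lifts it back above. An action that is equally helpful and harmful still lowers satisfaction, which is the asymmetry the message asks for.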

----- Original Message -----
From: Matt Mahoney
To: agi@v2.listbox.com
Sent: Wednesday, August 27, 2008 10:52 AM
Subject: AGI goals (was Re: Information theoretic approaches to AGI (was Re: [agi] The Necessity of Embodiment))

An AGI will not design its goals. It is up to humans to define the goals
of an AGI, so that it will do what we want it to do.

Unfortunately, this is a problem. We may or may not be successful in
programming the goals of AGI to satisfy human goals. If we are not
successful, then AGI will be useless at best and dangerous at worst. If
we are successful, then we are doomed because human goals evolved in a
primitive environment to maximize reproductive success and not in an
environment where advanced technology can give us whatever we want. AGI
will allow us to connect our brains to simulated worlds with magic
genies, or worse, allow us to directly reprogram our brains to alter our
memories, goals, and thought processes. All rational goal-seeking agents
must have a mental state of maximum utility where any thought or
perception would be unpleasant because it would result in a different
state.

-- Matt Mahoney, [EMAIL PROTECTED]

----- Original Message ----
From: Valentina Poletti <[EMAIL PROTECTED]>
To: agi@v2.listbox.com
Sent: Tuesday, August 26, 2008 11:34:56 AM

Subject: Re: Information theoretic approaches to AGI (was Re: [agi] The Necessity of Embodiment)

Thanks very much for the info. I found those articles very interesting.
Actually though this is not quite what I had in mind with the term
information-theoretic approach. I wasn't very specific, my bad. What I
am looking for is a theory behind the actual R itself. These approaches
(correct me if I'm wrong) take an R-function for granted and work from
that. In real life that is not the case though. What I'm looking for is
how the AGI will create that function. Because the AGI is created by
humans, some sort of direction will be given by the humans creating it.
What kind of direction, in mathematical terms, is my question. In other
words I'm looking for a way to mathematically define how the AGI will
mathematically define its goals.

Valentina


On 8/23/08, Matt Mahoney <[EMAIL PROTECTED]> wrote:


Valentina Poletti <[EMAIL PROTECTED]> wrote:

> I was wondering why no-one had brought up the information-theoretic
> aspect of this yet.

It has been studied. For example, Hutter proved that the optimal
strategy of a rational goal seeking agent in an unknown computable
environment is AIXI: to guess that the environment is simulated by the
shortest program consistent with observation so far [1]. Legg and Hutter
also propose as a measure of universal intelligence the expected reward
over a Solomonoff distribution of environments [2].
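
For readers who have not seen it, the measure in [2] is usually written as below. The notation follows the Legg and Hutter paper as commonly cited: E is the class of computable reward-bounded environments, K(μ) is the Kolmogorov complexity of environment μ, and V is the expected total reward that agent π obtains in μ. The sum over all of E, weighted by an uncomputable K, is what makes the measure itself uncomputable.

```latex
\Upsilon(\pi) \;=\; \sum_{\mu \in E} 2^{-K(\mu)} \, V_{\mu}^{\pi}
```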

These have profound impacts on AGI design. First, AIXI is (provably) not
computable, which means there is no easy shortcut to AGI. Second,
universal intelligence is not computable because it requires testing in
an infinite number of environments. Since there is no other well
accepted test of intelligence above human level, it casts doubt on the
main premise of the singularity: that if humans can create agents with
greater than human intelligence, then so can they.

Prediction is central to intelligence, as I argue in [3]. Legg proved in
[4] that there is no elegant theory of prediction. Predicting all
environments up to a given level of Kolmogorov complexity requires a
predictor with at least the same level of complexity. Furthermore, above
a small level of complexity, such predictors cannot be proven because of
Godel incompleteness. Prediction must therefore be an experimental
science.


There is currently no software or mathematical model of non-evolutionary
recursive self improvement, even for very restricted or simple
definitions of intelligence. Without a model you don't have friendly AI;
you have accelerated evolution with AIs competing for resources.

References

1. Hutter, Marcus (2003), "A Gentle Introduction to The Universal
Algorithmic Agent AIXI", in Artificial General Intelligence,
B. Goertzel and C. Pennachin eds., Springer.
http://www.idsia.ch/~marcus/ai/aixigentle.htm

2. Legg, Shane, and Marcus Hutter (2006), "A Formal Measure of Machine
Intelligence", Proc. Annual machine learning conference of Belgium and
The Netherlands (Benelearn-2006), Ghent, 2006.
http://www.vetta.org/documents/ui_benelearn.pdf

3. http://cs.fit.edu/~mmahoney/compression/rationale.html

4. Legg, Shane (2006), "Is There an Elegant Universal Theory of
Prediction?", Technical Report IDSIA-12-06, IDSIA / USI-SUPSI, Dalle
Molle Institute for Artificial Intelligence, Galleria 2, 6928 Manno,
Switzerland.
http://www.vetta.org/documents/IDSIA-12-06-1.pdf

-- Matt Mahoney, [EMAIL PROTECTED]


-------------------------------------------
agi
Archives: https://www.listbox.com/member/archive/303/=now
RSS Feed: https://www.listbox.com/member/archive/rss/303/
Modify Your Subscription: https://www.listbox.com/member/?&;
Powered by Listbox: http://www.listbox.com




--
A true friend stabs you in the front. - O. Wilde

Einstein once thought he was wrong; then he discovered he was wrong.

For every complex problem, there is an answer which is short, simple
and wrong. - H.L. Mencken


