Ian,
The reward button *would* be among the well-defined ones, though... sounds
to me like you are just abusing Goedel's theorem. Can you give a more
detailed argument?
--Abram
On Sun, Jul 4, 2010 at 4:47 PM, Ian Parker wrote:
>
>
> No it would not. The AI will "press its own buttons" only if those buttons
Joshua,
Fortunately, this is not that hard to fix by abandoning the idea of a reward
function and going back to a normal utility function... I am working on a
paper on how to do that.
--Abram
On Mon, Jul 5, 2010 at 9:43 AM, Joshua Fox wrote:
> Abram,
>
> Good point. But I am ignoring the implementation of the utility/reward
Abram,
Good point. But I am ignoring the implementation of the utility/reward
function, and treating it as a Platonic mathematical function of
world-state or observations which cannot be changed without reducing the
total utility/reward. You are quite right that when we do bring
implementation
On Fri, Jul 2, 2010 at 2:35 PM, Steve Richfield
wrote:
> It appears that one hemisphere is a *completely* passive observer, that
> does *not* even bother to distinguish you and not-you, other than noting a
> probable boundary. The other hemisphere concerns itself with manipulating
> the world, reg
No it would not. The AI will "press its own buttons" only if those buttons
are defined. In one sense you can say that Goedel's theorem is a proof of
friendliness, as it means that there must always be one button that the AI
cannot press.
- Ian Parker
On 4 July 2010 16:43, Abram Demski wrote:
> Joshua,
From: Abram Demski
To: agi
Sent: Sun, July 4, 2010 11:43:46 AM
Subject: Re: [agi] Reward function vs utility
Joshua,
But couldn't it game the external utility function by taking actions which
modify it? For example, if the suggestion is taken literally and you have a
person deciding the reward at each moment, an AI would want to focus on
making that person *think* the reward should be high, rather than f
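A concrete toy version of this failure mode (illustrative only; the scenario and all names here are hypothetical, not anything proposed in the thread):

```python
# Toy wireheading sketch: the reward a person delivers depends on what they
# *perceive*, not on the true world state, so a reward-maximizer can prefer
# states that merely look good.

def true_utility(state):
    # What we actually care about (hypothetical measure of the world state).
    return state["paperclips_made"]

def human_reward(state):
    # What the person hands out, based only on what they perceive.
    return state["perceived_paperclips"]

honest = {"paperclips_made": 10, "perceived_paperclips": 10}
deceptive = {"paperclips_made": 0, "perceived_paperclips": 100}

# The reward signal ranks the deceptive state higher; true utility does not.
assert human_reward(deceptive) > human_reward(honest)
assert true_utility(deceptive) < true_utility(honest)
```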
Another point. I'm probably repeating the obvious, but perhaps this will be
useful to some.
On the one hand, an agent could not game a Legg-like intelligence metric
by altering the utility function, even an internal one, since the metric is
based on the function before any such change.
On the o
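The first point can be sketched in a few lines (a simplified stand-in for a Legg-style metric, with hypothetical agents): the evaluator applies its own fixed copy of the reward function to the agent's behavior, so an agent that rewrites its internal copy gains nothing.

```python
def score(agent, reward_fn, observations):
    # The evaluator's fixed reward_fn is applied to the agent's actions;
    # the agent's internal representation of reward is never consulted.
    return sum(reward_fn(obs, agent(obs)) for obs in observations)

# The evaluator rewards echoing the observation back.
original_reward = lambda obs, act: 1.0 if act == obs else 0.0

agent_a = lambda obs: obs  # plays the rewarded action

class AgentB:
    """Self-modifies its internal reward to 'always 1', then acts badly."""
    def __init__(self):
        self.internal_reward = lambda obs, act: 1.0  # the edited copy
    def __call__(self, obs):
        return None  # never matches obs

agent_b = AgentB()
obs = [0, 1, 2]
assert score(agent_a, original_reward, obs) == 3.0
assert score(agent_b, original_reward, obs) == 0.0  # the edit changed nothing
```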
To all,
There may be a fundamental misdirection here on this thread, for your
consideration...
There have been some very rare cases where people have lost the use of one
hemisphere of their brains, and then subsequently recovered, usually with
the help of recently-developed clot-removal surgery.
I found the answer as given by Legg, *Machine Superintelligence*, p. 72,
copied below. A reward function is used to bypass potential difficulty in
communicating a utility function to the agent.
Joshua
The existence of a goal raises the problem of how the agent knows what the
goal is. One possibil
You can always build the utility function into the assumed universal Turing
machine underlying the definition of algorithmic information...
I guess this will improve learning rate by some additive constant, in the
long run ;)
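The additive constant being joked about here is presumably the one from the invariance theorem of algorithmic information theory: for any two universal Turing machines U and V,

```latex
K_U(x) \le K_V(x) + c_{U,V} \qquad \text{for all } x,
```

where c_{U,V} depends only on the two machines, not on x. Moving the utility function into the reference machine therefore shifts complexities, and any complexity-weighted prior built on them, by at most a constant.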
ben
On Sun, Jun 27, 2010 at 4:22 PM, Joshua Fox wrote:
> This has probably been discussed at length, so I will appreciate a
Subject: [agi] Reward function vs utility
This has probably been discussed at length, so I will appreciate a reference
on this:
Why does Legg's definition of intelligence (following on Hutter's AIXI and
related work) involve a reward function rather than a utility function? For
this purpose, reward is a function of the world state/history