I found the answer in Legg, *Machine Superintelligence*, p. 72, copied
below. A reward function is used to bypass the potential difficulty of
communicating a utility function to the agent.

Joshua

The existence of a goal raises the problem of how the agent knows what the
goal is. One possibility would be for the goal to be known in advance and
for this knowledge to be built into the agent. The problem with this is
that it limits each agent to just one goal. We need to allow agents that
are more flexible, specifically, we need to be able to inform the agent of
what the goal is. For humans this is easily done using language. In general
however, the possession of a sufficiently high level of language is too
strong an assumption to make about the agent. Indeed, even for something as
intelligent as a dog or a cat, direct explanation is not very effective.

Fortunately there is another possibility which is, in some sense, a blend
of the above two. We define an additional communication channel with the
simplest possible semantics: a signal that indicates how good the agent's
current situation is. We will call this signal the reward. The agent simply
has to maximise the amount of reward it receives, which is a function of
the goal. In a complex setting the agent might be rewarded for winning a
game or solving a puzzle. If the agent is to succeed in its environment,
that is, receive a lot of reward, it must learn about the structure of the
environment and in particular what it needs to do in order to get reward.
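
To make that concrete, here is a minimal toy sketch (my own illustration in
Python, not from the book) of the interaction protocol Legg describes: the
environment holds the goal and each cycle returns an observation plus a
scalar reward, while the agent never sees the goal itself, only the reward
channel.

    # Toy sketch of the reward channel (illustrative names, not from the book).
    # The environment knows the goal; the agent only ever sees observations and
    # a scalar reward, and must work out what to do from the reward alone.
    import random

    class GuessEnvironment:
        """Environment whose hidden goal is that the agent name a target digit."""
        def __init__(self, target):
            self.target = target      # the goal; never communicated directly

        def step(self, action):
            observation = None        # nothing informative to observe here
            reward = 1.0 if action == self.target else 0.0
            return observation, reward

    class RandomAgent:
        """Placeholder agent; a real agent would update its policy from reward."""
        def act(self, observation, reward):
            return random.randint(0, 9)

    env = GuessEnvironment(target=7)
    agent = RandomAgent()
    obs, reward = None, 0.0
    total = 0.0
    for _ in range(100):
        action = agent.act(obs, reward)   # perceive observation and reward, then act
        obs, reward = env.step(action)    # environment scores the action against its goal
        total += reward
    print("total reward:", total)

The only semantics attached to the extra channel is "how good the agent's
current situation is"; whether the reward comes from guessing a digit,
winning a game, or solving a puzzle is entirely the environment's business.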




On Mon, Jun 28, 2010 at 1:32 AM, Ben Goertzel <b...@goertzel.org> wrote:

> You can always build the utility function into the assumed universal Turing
> machine underlying the definition of algorithmic information...
>
> I guess this will improve learning rate by some additive constant, in the
> long run ;)
>
> ben
>
> On Sun, Jun 27, 2010 at 4:22 PM, Joshua Fox <joshuat...@gmail.com> wrote:
>
>> This has probably been discussed at length, so I will appreciate a
>> reference on this:
>>
>> Why does Legg's definition of intelligence (following on Hutter's AIXI and
>> related work) involve a reward function rather than a utility function? For
>> this purpose, reward is a function of the world state/history which is
>> unknown to the agent, while a utility function is known to the agent.
>>
>> Even if we replace the former with the latter, we can still have a
>> definition of intelligence that integrates optimization capacity over
>> all possible utility functions.
>>
>> What is the real significance of the difference between the two types of
>> functions here?
>>
>> Joshua
>
>
>
> --
> Ben Goertzel, PhD
> CEO, Novamente LLC and Biomind LLC
> CTO, Genescient Corp
> Vice Chairman, Humanity+
> Advisor, Singularity University and Singularity Institute
> External Research Professor, Xiamen University, China
> b...@goertzel.org
>
> "
> “When nothing seems to help, I go look at a stonecutter hammering away at
> his rock, perhaps a hundred times without as much as a crack showing in it.
> Yet at the hundred and first blow it will split in two, and I know it was
> not that blow that did it, but all that had gone before.”
>


