Re: [agi] Motivational system

2006-06-12 Thread James Ratcliff
A couple of things below.

 William,

 1) I agree that direct reward has to be in-built
 (into brain / AI system).
 2) I don't see why direct reward cannot be used for rewarding mental
 achievements. I think that this "direct rewarding mechanism" is
 preprogrammed in genes and cannot be used directly by the mind.
 This mechanism probably can be cheated to a certain extent by the
 mind. For example, the mind can claim that there is a mental
 achievement when actually there is none.
 That possibility of cheating with rewards is definitely a problem.
 I think this problem is solved (in the human brain) by using only
 small doses of "mental rewards".
 For example, you can get small positive mental rewards by cheating
 your mind into liking finding solutions to the "1+1=2" problem.
 However, if you do it too often you'll eventually get hungry and would
 get a huge negative reward. This negative reward would not just stop
 you doing the "1+1=2" operation over and over, it would also re-set up
 your judgement mechanism, so you will not consider the "1+1=2" problem
 an achievement anymore.

- In theory here, this would be solved by the goal/motivation
mechanism: after having performed a task once, we could reference how
we solved the problem the next time we encounter it, outside the reward
system, because we know how to handle it.

 Also, we are all familiar with what "boring" is.
 When you solve a problem once, it's boring to solve it again.
 I guess that that is another genetically programmed mechanism which
 prevents cheating with mental rewards.

 3) Indirect rewarding mechanisms definitely work too, but they are not
 sufficient for bootstrapping a strong-AI-capable system.
 Consider a baby. She doesn't know why it's good to play (alone or with
 others). Indirect reward from "childhood playing" will come years
 later, from professional success.
 A baby cannot understand human language yet, so she cannot envision
 this success.
 An AI system would face the same problem.
 My conclusion: indirect reward mechanisms (as you described them)
 would not be able to bootstrap a strong-AI-capable system.
 Back to a real baby: typically nobody explains to a baby that it's
 good to play.
 But somehow babies/children like to play.
 My conclusion: there are direct reward mechanisms in humans even for
 things which are not directly beneficial to the system (like mental
 achievements, speech, physical activity).

- Likewise here, I propose to have an "exploration" reward that rewards
every new experience gained, so that at first the baby would 'want' to
do random things, play with the toys, or emulate what the other kids
are doing. Then, as it finds activities it 'likes', it could repeat
variations on the theme. So it's getting different types of rewards for
different types of things, even if it is not a directly approved goal.
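A minimal sketch of the "exploration" reward idea, assuming a simple
count-based novelty bonus; the class name, numbers, and activity labels
are invented for illustration and are not part of the proposal above:

    # Hypothetical sketch: reward novelty so the agent "wants" to try new
    # things, with the bonus fading once an experience stops being new.
    from collections import defaultdict

    class ExplorationReward:
        def __init__(self, bonus=1.0, decay=0.5):
            self.bonus = bonus              # reward for a first-time experience
            self.decay = decay              # how quickly repeats become "boring"
            self.visit_counts = defaultdict(int)

        def reward(self, experience):
            """Return the extra reward for a (hashable) experience label."""
            self.visit_counts[experience] += 1
            n = self.visit_counts[experience]
            # First encounter gets the full bonus; repeats decay toward zero,
            # so the agent drifts toward variations rather than exact repeats.
            return self.bonus * (self.decay ** (n - 1))

    explorer = ExplorationReward()
    print(explorer.reward("stack blocks"))      # 1.0 -> new activity, full bonus
    print(explorer.reward("stack blocks"))      # 0.5 -> repeat, smaller bonus
    print(explorer.reward("watch other kids"))  # 1.0 -> another new activity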

[agi] Motivational system

2006-06-09 Thread Dennis Gorelik
William,

 It is very simple and I wouldn't apply it to everything that
 behaviourists would (we don't get direct rewards for solving crossword
 puzzles).

How do you know that we don't get direct rewards on solving crossword
puzzles (or any other mental task)?
Chances are that under a certain mental condition ("achievement state"),
the brain produces some form of pleasure signal.
If there is no such reward, then what's your explanation of why people
like to solve crossword puzzles?





Re: [agi] Motivational system

2006-06-09 Thread James Ratcliff
I definitely get pleasure out of doing them; that appears to be a
direct feedback that is easily seen.

Another, harder one I saw the other day is long-term gains, which seem
to be much harder to visualize. Take, for instance, flossing your
teeth: it hurts sometimes and can make your mouth bleed, so it is not
really the most pleasant task, but down the road you get the benefit of
having a healthy mouth. But how do we know to look that far down the
road, and how do we represent this tradeoff nicely?

James Ratcliff

Dennis Gorelik [EMAIL PROTECTED] wrote:

 William,

  It is very simple and I wouldn't apply it to everything that
  behaviourists would (we don't get direct rewards for solving
  crossword puzzles).

 How do you know that we don't get direct rewards on solving crossword
 puzzles (or any other mental task)?
 Chances are that under a certain mental condition ("achievement
 state"), the brain produces some form of pleasure signal.
 If there is no such reward, then what's your explanation of why people
 like to solve crossword puzzles?
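One way to make the "how far down the road" tradeoff concrete is a
discounted sum of future rewards. This is only an illustrative sketch
with invented numbers; it is not a claim about how the brain actually
weighs flossing:

    # Hedged illustration: compare flossing (small daily cost, large
    # delayed payoff) with not flossing, under discounted future reward.
    def discounted_value(rewards, gamma):
        """Sum of rewards, each weighted by gamma**t for its delay t."""
        return sum(r * (gamma ** t) for t, r in enumerate(rewards))

    floss    = [-0.1] * 365 + [50.0]   # a year of minor discomfort, then a healthy mouth
    no_floss = [0.0] * 365 + [-50.0]   # no cost now, dental trouble later

    # A patient agent (gamma close to 1) sees flossing come out ahead ...
    print(discounted_value(floss, 0.999), discounted_value(no_floss, 0.999))
    # ... while a short-sighted one does not, which is exactly the problem
    # of knowing to look that far down the road.
    print(discounted_value(floss, 0.99), discounted_value(no_floss, 0.99))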


Re: [agi] Motivational system

2006-06-09 Thread William Pearson

On 09/06/06, Dennis Gorelik [EMAIL PROTECTED] wrote:

William,

 It is very simple and I wouldn't apply it to everything that
 behaviourists would (we don't get direct rewards for solving crossword
 puzzles).

How do you know that we don't get direct rewards on solving crossword
puzzles (or any other mental task)?


I don't know, I only make hypotheses. As far as my model is concerned,
the structures that give direct reward have to be pretty much in-built;
otherwise, in a selectionist system, allowing a selected-for behaviour
to give direct reward would quickly lead to behaviour that gives
itself direct reward and doesn't actually do anything.
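To see why, here is a toy selectionist loop in which behaviours are
credited with the reward they accumulate; if one behaviour is allowed
to grant itself direct reward, it outcompetes the behaviour that does
useful work. The functions and numbers are invented for illustration
and are not Will's actual model:

    import random

    # Toy selectionist setup: "do_work" only earns reward from the in-built
    # environment reward; "self_reward" simply grants itself reward.
    def do_work():
        return ("useful output", 0.0)

    def self_reward():
        return (None, 1.0)                  # wireheading behaviour

    def environment_reward(output):
        return 0.5 if output == "useful output" else 0.0

    credit = {do_work: 0.0, self_reward: 0.0}
    for _ in range(1000):
        behaviour = random.choice(list(credit))
        output, self_granted = behaviour()
        credit[behaviour] += environment_reward(output) + self_granted

    # self_reward accumulates roughly twice the credit of do_work, so a
    # selection rule based on credit keeps the behaviour that "doesn't
    # actually do anything".
    print({b.__name__: round(total) for b, total in credit.items()})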


Chances are that under a certain mental condition ("achievement state"),
the brain produces some form of pleasure signal.
If there is no such reward, then what's your explanation of why people
like to solve crossword puzzles?


Why? By indirect rewards! If you will allow me to slip into my
economics metaphor, I shall try to explain my view of things. The
consumer is the direct reward giver, something that attempts to mold
the system to produce certain products; it doesn't say what it wants,
just what is good, by giving money (direct reward).

In humans this role is played by the genome constructing structures
that say nice food and sex are good, along with respect from your peers
(probably the hypothalamus and amygdala).

The role of raw materials is played by the information coming from the
environment. It can be converted to products or tools.

You have retail outlets that interact directly with the consumer;
being closest to the outputs, they directly get the money that allows
their survival. However, they have to pass some of the money on to the
companies that produced the products they passed on to the consumer.
This network of money passing will have to be carefully controlled so
that more money isn't produced in one company than was given to it
(currently I think of the network of dopaminergic neurons as being this
part).

Now with this sort of system you can make a million just-so stories
about why one program would be selected that passes reward to another,
that is, gives indirect reward. This is where the complexity kicks in.
In terms of crossword solving, one possibility is that a program closer
to the output and with lots of reward has been selected for rewarding
logical problem solving, because in general it is useful for getting
reward, and so it passes reward on to a program that has proven its
ability to solve logical problems, possibly entering into a deal of
some sort.
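Read as an algorithm, the metaphor suggests a credit-conserving ledger:
direct reward enters at the programs nearest the outputs, each program
may pass part of what it received back to its suppliers, and no program
can hand out more than it took in. The sketch below is a rough
illustration under those assumptions; the class, names, and shares are
invented, not Will's actual design:

    # Hypothetical reward economy: money (reward) flows from the "consumer"
    # into the retail program and back to suppliers, with the conservation
    # the dopaminergic network is imagined to enforce.
    class Program:
        def __init__(self, name):
            self.name = name
            self.balance = 0.0        # accumulated reward ("money")
            self.suppliers = {}       # supplier program -> share passed on

        def add_supplier(self, supplier, share):
            self.suppliers[supplier] = share

        def receive(self, amount):
            """Keep part of the incoming reward, pass the rest upstream."""
            passed = 0.0
            for supplier, share in self.suppliers.items():
                cut = amount * share
                supplier.receive(cut)
                passed += cut
            # Conservation: never hand out more than was received.
            assert passed <= amount + 1e-9
            self.balance += amount - passed

    crossword_output = Program("crossword answers")    # closest to the outputs
    logic_solver     = Program("logical problem solver")
    crossword_output.add_supplier(logic_solver, 0.4)   # the "deal": pass 40% back

    crossword_output.receive(10.0)    # direct reward from the "consumer"
    print(crossword_output.balance)   # 6.0 kept at the retail end
    print(logic_solver.balance)       # 4.0 received as indirect reward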

This is all very subconscious, as it needs to be in order to encompass
and explain low-level learning such as neural plasticity, which is very
subconscious itself.

Will Pearson



[agi] Motivational system

2006-06-09 Thread Dennis Gorelik
William,

1) I agree that direct reward has to be in-built
(into brain / AI system).
2) I don't see why direct reward cannot be used for rewarding mental
achievements. I think that this "direct rewarding mechanism" is
preprogrammed in genes and cannot be used directly by the mind.
This mechanism probably can be cheated to a certain extent by the
mind. For example, the mind can claim that there is a mental
achievement when actually there is none.
That possibility of cheating with rewards is definitely a problem.
I think this problem is solved (in the human brain) by using only small
doses of "mental rewards".
For example, you can get small positive mental rewards by cheating your
mind into liking finding solutions to the "1+1=2" problem.
However, if you do it too often you'll eventually get hungry and would
get a huge negative reward. This negative reward would not just stop
you doing the "1+1=2" operation over and over, it would also re-set up
your judgement mechanism, so you will not consider the "1+1=2" problem
an achievement anymore.

Also, we are all familiar with what "boring" is.
When you solve a problem once, it's boring to solve it again.
I guess that that is another genetically programmed mechanism which
prevents cheating with mental rewards.
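A toy sketch of the two anti-cheating mechanisms described above: small
mental rewards that shrink with repetition ("boring"), plus a large
negative bodily reward that re-sets what counts as an achievement. All
names and numbers are invented for illustration:

    # Toy model: repeating "1+1=2" pays a shrinking mental reward, until a
    # large hunger penalty resets the judgement mechanism so the problem no
    # longer counts as an achievement at all.
    class MentalRewarder:
        def __init__(self):
            self.achievements = {"1+1=2": 0.2}   # small reward per achievement
            self.solve_count = 0

        def solve(self, problem):
            self.solve_count += 1
            base = self.achievements.get(problem, 0.0)
            return base / self.solve_count       # repeats become more boring

        def hunger_penalty(self, problem):
            self.achievements[problem] = 0.0     # re-set up judgement: not an achievement
            return -10.0                         # huge negative reward

    rewarder = MentalRewarder()
    total = sum(rewarder.solve("1+1=2") for _ in range(20))   # many tiny rewards
    total += rewarder.hunger_penalty("1+1=2")                 # eventually you get hungry
    print(round(total, 2), rewarder.solve("1+1=2"))           # net loss; repeats now pay 0.0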

3) Indirect rewarding mechanisms definitely work too, but they are not
sufficient for bootstrapping a strong-AI-capable system.
Consider a baby. She doesn't know why it's good to play (alone or with
others). Indirect reward from "childhood playing" will come years later,
from professional success.
A baby cannot understand human language yet, so she cannot envision
this success.
An AI system would face the same problem.

My conclusion: indirect reward mechanisms (as you described them) would not be
able to bootstrap a strong-AI-capable system.

Back to a real baby: typically nobody explains to a baby that it's good to play.
But somehow babies/children like to play.
My conclusion: there are direct reward mechanisms in humans even for
things which are not directly beneficial to the system (like mental
achievements, speech, physical activity).

Friday, June 9, 2006, 4:48:07 PM, you wrote:

 How do you know that we don't get direct rewards on solving crossword
 puzzles (or any other mental task)?

 I don't know, I only make hypotheses. As far as my model is concerned,
 the structures that give direct reward have to be pretty much in-built;
 otherwise, in a selectionist system, allowing a selected-for behaviour
 to give direct reward would quickly lead to behaviour that gives
 itself direct reward and doesn't actually do anything.

 Chances are that under a certain mental condition ("achievement state"),
 the brain produces some form of pleasure signal.
 If there is no such reward, then what's your explanation of why people
 like to solve crossword puzzles?

 Why? By indirect rewards! If you will allow me to slip into my
 economics metaphor, I shall try to explain my view of things. The
 consumer is the direct reward giver, something that attempts to mold
 the system to produce certain products; it doesn't say what it wants,
 just what is good, by giving money (direct reward).

 In humans this role is played by the genome constructing structures
 that say nice food and sex are good, along with respect from your peers
 (probably the hypothalamus and amygdala).

 The role of raw materials is played by the information coming from the
 environment. It can be converted to products or tools.

 You have retail outlets that interact directly with the consumer;
 being closest to the outputs, they directly get the money that allows
 their survival. However, they have to pass some of the money on to the
 companies that produced the products they passed on to the consumer.
 This network of money passing will have to be carefully controlled so
 that more money isn't produced in one company than was given to it
 (currently I think of the network of dopaminergic neurons as being this
 part).

 Now with this sort of system you can make a million just-so stories
 about why one program would be selected that passes reward to another,
 that is, gives indirect reward. This is where the complexity kicks in.
 In terms of crossword solving, one possibility is that a program closer
 to the output and with lots of reward has been selected for rewarding
 logical problem solving, because in general it is useful for getting
 reward, and so it passes reward on to a program that has proven its
 ability to solve logical problems, possibly entering into a deal of
 some sort.

 This is all very subconscious, as it needs to be in order to encompass
 and explain low-level learning such as neural plasticity, which is very
 subconscious itself.

  Will Pearson

