You need more than just reinforcement learning.  You need "regulation" and 
"compensation", psychological terms Piaget used. 
Regulation is the correction of failed behaviors or reinforcement of successful 
behaviors.  Compensation is the inversion of failed behaviors or the 
elimination of undesirable side effects.
Both regulation and compensation should be intrinsic to the cognitive system, 
and in my view, should build new cognitive structures tightly integrated into 
existing and new behaviors.   This is way more than Reinforcement Learning.
~PM
Date: Tue, 29 Jan 2013 08:54:39 -0500
Subject: [agi] RL Does Not Fully Explain Inner Direction
From: [email protected]
To: [email protected]

On Mon, Jan 28, 2013 at 6:21 PM, Aaron Hosford <[email protected]> wrote:

In regards to the idea that intrinsic rewards are somehow different from 
extrinsic ones, a reward signal can just as easily be modulated by internal 
events (thoughts) as external ones (percepts). Furthermore, if you read up on 
RL, you'll see that in all effective multi-step RL-style algorithms, there is a 
backward chaining of reward, so that previous behaviors or other early triggers 
for a behavior are rewarded, not just the immediate actions. All actions, 
whether extrinsically or intrinsically rewarding, derive their value from 
either immediate or indirect/backward-chained reward signals, which means we 
can modulate behavior arbitrarily to any level of complexity with relatively 
minimal difficulty by taking advantage of this backward chaining.
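
A minimal sketch of the backward chaining of reward described above, assuming 
tabular TD(0) value learning on a hypothetical chain of states in which only 
the last step is rewarded. The function name td0_chain and the parameters 
n_states, episodes, alpha and gamma are illustrative assumptions and do not 
come from this thread. It is meant only to show how value leaks backward from 
the rewarded step to earlier ones, not to settle the argument either way.

```python
def td0_chain(n_states=5, episodes=200, alpha=0.5, gamma=0.9):
    """Tabular TD(0) on a one-way chain; only the final transition pays reward.

    Hypothetical setup: state s always moves to state s + 1, and leaving the
    last state yields a reward of 1.0. Every other transition yields 0.0.
    """
    # One extra slot for the terminal state, whose value stays at 0.
    values = [0.0] * (n_states + 1)
    for _ in range(episodes):
        for s in range(n_states):
            reward = 1.0 if s == n_states - 1 else 0.0
            # Bootstrapped target: immediate reward plus discounted next value.
            target = reward + gamma * values[s + 1]
            values[s] += alpha * (target - values[s])
    return values[:n_states]


if __name__ == "__main__":
    # Value reaches one more upstream state with each additional episode.
    for n in (1, 2, 3, 200):
        print(n, [round(v, 4) for v in td0_chain(episodes=n)])
```

After one episode only the final state has nonzero value; each further episode 
pushes some of that value one state further back, and with enough episodes the 
value of each state approaches gamma raised to its distance from the reward.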
  Well, the backward chaining of reward to the actions leading up to a rewarded 
behavior is an interesting point.  And while anyone with a little imagination 
could come up with a way to use RL to reinforce complex behaviors based on the 
parts of a behavior string that are reinforced, that is not explained by the 
backward-chained reward signals that you mentioned.
 But looking beyond that, the claim that any internal motivation could be 
explained by external reinforcement is unnecessarily complicated, because it 
depends on external rewards, which would demand that something like a massively 
complex accumulation of infinitesimal past rewards could explain inner 
direction.  This is the same problem as insisting that Bayesian Reasoning along 
with some priors is all that is necessary to explain human intelligence. Sorry, 
but it just does not work, unless you change the presumptions of what is meant 
by Reinforcement Learning or Bayesian Reasoning.  (Which is ok, I am just 
saying...)
 Jim Bromer



  
    
      