You need more than just reinforcement learning. You need "regulation" and "compensation", psychological terms that Piaget used. Regulation is the correction of failed behaviors or the reinforcement of successful behaviors. Compensation is the inversion of failed behaviors or the elimination of undesirable side effects. Both regulation and compensation should be intrinsic to the cognitive system and, in my view, should build new cognitive structures that are tightly integrated into existing and new behaviors. This is way more than Reinforcement Learning.

~PM

Date: Tue, 29 Jan 2013 08:54:39 -0500
Subject: [agi] RL Does Not Fully Explain Inner Direction
From: [email protected]
To: [email protected]
On Mon, Jan 28, 2013 at 6:21 PM, Aaron Hosford <[email protected]> wrote:

> In regards to the idea that intrinsic rewards are somehow different from extrinsic ones, a reward signal can just as easily be modulated by internal events (thoughts) as external ones (percepts). Furthermore, if you read up on RL, you'll see that in all effective multi-step RL-style algorithms, there is a backward chaining of reward, so that previous behaviors or other early triggers for a behavior are rewarded, not just the immediate actions. All actions, whether extrinsically or intrinsically rewarding, derive their value from either immediate or indirect/backward-chained reward signals, which means we can modulate behavior arbitrarily to any level of complexity with relatively minimal difficulty by taking advantage of this backward chaining.

Well, the backward chaining of reward through the actions leading up to a rewarded behavior is an interesting point. And while anyone with a little imagination could come up with a creative way to use RL to reinforce complex behaviors based on the reinforced parts of a behavior string, this is not explained by the backward-chained reward signals that you mentioned.

But looking beyond that, the claim that any internal motivation could be explained by external reinforcement is unnecessarily complicated, because it depends on external rewards, which would require that something like a massively complex accumulation of infinitesimal past rewards could explain inner direction. This is the same problem as insisting that Bayesian Reasoning along with some priors is all that is necessary to explain human intelligence. Sorry, but it just does not work, unless you change the presumptions of what is meant by Reinforcement Learning or Bayesian Reasoning. (Which is ok, I am just saying...)

Jim Bromer
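
The backward chaining of reward that Aaron describes can be illustrated with a minimal sketch, assuming tabular TD(0) on a one-way chain of states where reward arrives only on the final step. The function name td0_chain, the chain layout, and the parameters alpha (learning rate) and gamma (discount) are illustrative assumptions and are not specified anywhere in the thread. It shows only how credit for a late reward reaches earlier states over repeated episodes; it takes no position on whether that is enough to explain inner direction.

```python
def td0_chain(n_states=5, episodes=50, alpha=0.5, gamma=0.9):
    """Tabular TD(0) on a one-way chain: s0 -> s1 -> ... -> terminal.

    Reward 1.0 is given only on the step out of the last state.
    Returns the learned values and a per-episode history of them.
    """
    # One extra slot for the terminal state, whose value stays 0.
    values = [0.0] * (n_states + 1)
    history = []
    for _ in range(episodes):
        for s in range(n_states):
            reward = 1.0 if s == n_states - 1 else 0.0
            # TD(0): move V(s) toward reward + gamma * V(next state).
            target = reward + gamma * values[s + 1]
            values[s] += alpha * (target - values[s])
        history.append(list(values[:n_states]))
    return values[:n_states], history


if __name__ == "__main__":
    final_values, history = td0_chain()
    # Early episodes: only states near the reward have nonzero value.
    for episode, snapshot in enumerate(history[:5], start=1):
        print(episode, [round(v, 4) for v in snapshot])
    # After many episodes each state approaches gamma ** (steps to reward).
    print("final", [round(v, 4) for v in final_values])
```

After the first episode only the last state has any value; each further episode pushes nonzero value one state further back, which is the "previous behaviors are rewarded, not just the immediate actions" effect. With these assumed settings the values settle near powers of gamma, so earlier states are credited less than later ones.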
