An information theoretic measure of reinforcement (was RE: [agi] AGI and Deity)

Matt Mahoney Tue, 11 Dec 2007 11:09:02 -0800

--- "John G. Rose" <[EMAIL PROTECTED]> wrote:
> Is an AGI really going to feel pain or is it just going to be some numbers?
> I guess that doesn't have a simple answer. The pain has to be engineered
> well for it to REALLY understand it.


An agent capable of reinforcement learning has an upper bound on the amount of
pleasure or pain it can experience in a lifetime, in an information-theoretic
sense.  If an agent responds to input X with output Y, followed by
reinforcement R, then we say that R is a positive reinforcement (pleasure,
R>0) if it increases the probability P(Y|X) and negative reinforcement (pain,
R<0) if it decreases P(Y|X).  Let S1 be the state of the agent before R, and
S2 be the state afterwards.  We may define the bound:

  |R| <= K(S2|S1)

where K(S2|S1) is the conditional Kolmogorov complexity, the length of the
shortest program that outputs an encoding of S2 given S1 as input.  This
definition is intuitive in
that the greater the reinforcement, the greater the change in behavior of the
agent.  Also, it is consistent with the belief that higher animals (like
humans) have greater capacity to feel pleasure and pain than lower animals
(like insects) that have simpler mental states.
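
K is not computable, so the bound cannot be evaluated exactly.  As a rough
sketch only, a general-purpose compressor can stand in for K: the extra
compressed size needed to encode S2 after S1 is a crude proxy for K(S2|S1).
The use of zlib, the function names, and the byte-string encoding of agent
state are all assumptions made for illustration, not part of the definition.

```python
import random
import zlib


def compressed_bits(data: bytes) -> int:
    """Size in bits of the zlib-compressed data (a crude stand-in for K)."""
    return 8 * len(zlib.compress(data, 9))


def conditional_bits(s1: bytes, s2: bytes) -> int:
    """Estimate K(S2|S1) as C(S1 + S2) - C(S1), clamped at zero."""
    return max(0, compressed_bits(s1 + s2) - compressed_bits(s1))


# Hypothetical agent states, encoded as byte strings.
rng = random.Random(0)
before = bytes(range(256)) * 4
unchanged = before                                           # state did not change
rewritten = bytes(rng.getrandbits(8) for _ in range(1024))   # state fully replaced

print(conditional_bits(before, unchanged))   # small: S2 is predictable from S1
print(conditional_bits(before, rewritten))   # large: S2 is unrelated to S1
```

Under this proxy, a reinforcement that leaves the state untouched has a bound
near zero, while one that rewrites the whole state has a bound near the size
of the state.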

We must use the absolute value of R because the behavior X -> Y could be
learned using either positive reinforcement (rewarding X -> Y), negative
reinforcement (penalizing X -> not Y), or by neutral methods such as classical
conditioning (presenting X and Y together).
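
To make the sign convention concrete, here is a minimal sketch of a toy
agent.  The ToyAgent class, its softmax choice rule, and the additive update
are hypothetical, chosen only to show that rewarding Y and penalizing not-Y
can produce the same change in P(Y|X).

```python
import math


class ToyAgent:
    """Hypothetical agent: P(Y|X) is a softmax over per-(X, Y) preferences."""

    def __init__(self, outputs):
        self.outputs = list(outputs)
        self.weights = {}  # (x, y) -> preference, default 0.0

    def prob(self, x, y):
        """Return P(Y = y | X = x) under the current preferences."""
        exps = {o: math.exp(self.weights.get((x, o), 0.0)) for o in self.outputs}
        return exps[y] / sum(exps.values())

    def reinforce(self, x, y, r):
        """Apply reinforcement r to the response x -> y (r > 0 rewards it)."""
        self.weights[(x, y)] = self.weights.get((x, y), 0.0) + r


rewarded = ToyAgent(["a", "b"])
rewarded.reinforce("x", "a", +1.0)    # pleasure: reward X -> Y

penalized = ToyAgent(["a", "b"])
penalized.reinforce("x", "b", -1.0)   # pain: penalize X -> not Y

# Both agents now prefer "a" given "x" to the same degree.
print(rewarded.prob("x", "a"), penalized.prob("x", "a"))
```

The two agents end up in different internal states but with the same
behavior, which is why the bound is stated on |R| and not on R.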

If you accept this definition, then an agent cannot feel more accumulated
pleasure or pain in its lifetime than K(S(death)|S(birth)).  A simple program
like autobliss ( http://www.mattmahoney.net/autobliss.txt ) could not
experience more than 256 bits of reinforcement, whereas a human could
experience up to 10^9 bits according to cognitive models of long-term memory.
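
One way to see why a small program gets a small bound: if the agent's entire
mutable state can be written as a string of n bits, then a program can output
S2 by simply containing it, so the conditional complexity is at most n plus a
constant.  The symbol n is introduced here for illustration, and whether the
mutable state of autobliss actually fits in 256 bits is assumed from the
figure above, not checked against the program.

```latex
K(S_2 \mid S_1) \;\le\; K(S_2) + O(1) \;\le\; n + O(1)
\qquad\Longrightarrow\qquad
K\bigl(S(\mathrm{death}) \mid S(\mathrm{birth})\bigr) \;\le\; n + O(1)
```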


-- Matt Mahoney, [EMAIL PROTECTED]

