...and Ian!
I'd like to ask about your work on reconstruction (the inverse of SP/TP's
compute()); Scott mentioned you've done some research on it. Unfortunately,
it's currently missing from NuPIC.
Use cases I'd love to see in this repo are:
action learning = learning the effects, requirements, costs, etc. of an
action,
and planning = organizing actions into sequences to achieve a goal.
Reconstruction is needed for both of these tasks.
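To make it concrete, here's a minimal plain-Python sketch of what I mean by
reconstruction - mapping the SP's active columns back to an input bitmap by
letting each column "vote" for the input bits it is connected to. The data
layout and names (reconstruct, connected_synapses) are made up for
illustration, not the actual NuPIC API:

```python
import numpy as np

def reconstruct(active_columns, connected_synapses, input_size, threshold=0.5):
    """Estimate the original input bitmap from the SP's active columns.

    connected_synapses: dict mapping column index -> iterable of input-bit
    indices that column is connected to (hypothetical layout).
    """
    votes = np.zeros(input_size)
    for col in active_columns:
        for bit in connected_synapses.get(col, ()):
            votes[bit] += 1.0
    if votes.max() > 0:
        votes /= votes.max()      # normalize votes to [0, 1]
    return votes > threshold      # thresholded reconstructed bitmap

# toy example: two active columns voting for overlapping input bits
syn = {0: [1, 2, 3], 1: [2, 3, 4]}
print(reconstruct([0, 1], syn, input_size=6).astype(int))  # -> [0 0 1 1 0 0]
```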
Experiments I'd like to carry out at some point:
1/ learn directions:
similar to the simple utility map to a target, but instead of a score,
return a best-match way-to-go action.
So for example {1 1 Fwd}, {1 2 Fwd}, {1 5 Right}, ...
Things I'd hope to observe:
A/ the SP's ability to store all the needed memory of a 2D map
B/ the SP's ability to abstract and deal with missing values - so for
noisy/missing input values, we get the best match based on the SP's
abstraction.
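A plain-Python stand-in for experiment 1 (the SP would learn this map and
generalize on its own; here a nearest-neighbor lookup fakes the "best match",
and direction_map/best_action are just illustrative names):

```python
# learned mapping from positions to way-to-go actions, as in the example
direction_map = {(1, 1): "Fwd", (1, 2): "Fwd", (1, 5): "Right"}

def best_action(x, y):
    # pick the action stored for the closest known position (squared distance)
    nearest = min(direction_map,
                  key=lambda p: (p[0] - x) ** 2 + (p[1] - y) ** 2)
    return direction_map[nearest]

print(best_action(1, 1))   # exact hit -> "Fwd"
print(best_action(1, 4))   # unseen/noisy position, closest is (1, 5) -> "Right"
```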
2/ learning actions
Imagine a very young creature exploring its abilities.
The agent does a random "walk" where in each position a random action is
executed, and a score is stored for the triplet {position (x y), action,
inner-state (hungry, ...)}.
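A toy sketch of that random walk (the scoring rule and all names here are
made up for illustration; in the real experiment the score would come from
the agent's inner states and the utility encoder):

```python
import random

actions = ["eat", "walk", "sleep"]
scores = {}  # {(position, action, hungry): score}

def score(pos, action, hungry):
    # invented rule: eating while hungry is rewarded, everything else is flat
    return 1.0 if (hungry and action == "eat") else 0.1

random.seed(42)
for step in range(20):
    pos = (random.randint(0, 4), random.randint(0, 4))
    action = random.choice(actions)
    hungry = random.random() < 0.5
    scores[(pos, action, hungry)] = score(pos, action, hungry)

print(len(scores), "triplets stored")
```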
A number of fun experiments is already waiting here:
C/ abstract the meaning of inner values: learn to eat when hungry:
{has_food, hungry_80%, eat}, {has_food, hungry_70%, eat} ... -> {has_food,
hungry_83%, ???}
D/ TP learns sequences of actions (= planning): {grab a key}-{walk to the
door}-{open} = perfect, vs.
{grab a hammer}-{walk to the door}-{dance} = still cool, but not what we
intended.
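A sketch of how such sequences could be scored (in NuPIC the TP would learn
the transitions; here a simple position-wise overlap with the intended
sequence stands in for its prediction quality):

```python
# the intended "perfect" plan from the example above
GOAL = ["grab a key", "walk to the door", "open"]

def sequence_score(seq):
    # fraction of steps matching the intended sequence, position by position
    hits = sum(1 for a, b in zip(seq, GOAL) if a == b)
    return hits / len(GOAL)

print(sequence_score(["grab a key", "walk to the door", "open"]))      # 1.0
print(sequence_score(["grab a hammer", "walk to the door", "dance"]))  # ~0.33
```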
..
E/ and more complex and fun stuff: the TP does planning, and anomaly +
emotions shape behavior - a personality:
we can make a curious/cowardly agent who executes actions according to the
TP's best prediction, but when the anomaly score is high enough, it either
runs away or starts to explore.
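A sketch of that personality switch (the anomaly score would come from the
TP; the threshold value and the two policies are made up for illustration):

```python
def react(anomaly, personality, predicted_action, threshold=0.7):
    """Follow the TP's prediction unless the anomaly score is too high."""
    if anomaly < threshold:
        return predicted_action  # trust the TP's best prediction
    # high anomaly: personality decides between exploring and fleeing
    return "explore" if personality == "curious" else "run away"

print(react(0.2, "curious", "walk"))  # low anomaly -> "walk"
print(react(0.9, "curious", "walk"))  # high anomaly, curious -> "explore"
print(react(0.9, "coward", "walk"))   # high anomaly, coward -> "run away"
```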
Could you show me whether there's an easy way to get reconstruction back
into NuPIC and use it with the SP/TP, please? Better yet, I'd be glad if you
were interested in some of these examples and could help me with them (as
I'm running short of time right now).
Thanks, Mark
On Thu, Dec 5, 2013 at 12:04 AM, Ian Danforth <[email protected]> wrote:
> This is very, very cool! I'm trying to run these but am running into some issues:
>
> 1. Can you describe your development workflow / git setup?
>
> I just added breznak as a remote, did a git fetch, switched to the
> utility-encoder branch, and I'm now rebuilding.
>
> Are you using the NTAX_DEVELOPER_BUILD flag? Or do you do ./build.sh for
> each new branch?
>
> 2. It looks like there is another requirement, 'vtk', which is a pretty huge
> install process all by itself. As much as I want to see the pretty graphs, I
> can't get it to build on my machine (OS X 10.9).
>
> 3. Ignoring the lack of vtk, I tried running the scripts but got import
> errors; you need an __init__.py file in the alife dir.
>
> 4. After I added that I get this error:
>
> ians-air:ALife iandanforth$ python
> alife/experiments/utility_map/utility_map.py 3 6 24 24
>
> feval not set! do not forget to def(ine) the function and set it with
> setEvaluationFn()
>
> Can't show you nice picture; couldn't import mayavi
> Thanks for all the cool work, I really want to play around with this!
>
> Ian
>
> P.S. I think you mean "homeostasis" rather than "osmosis."
>
> On Wed, Dec 4, 2013 at 1:35 PM, Marek Otahal <[email protected]> wrote:
>
>> This mail introduces my experiments with NuPIC on simulating behavior,
>> emotions, goals and learning.
>>
>> It uses a utility-encoder:
>> https://github.com/breznak/nupic/tree/utility-encoder
>> which I'd like to ask you to review, give opinions on, and consider for
>> mainline.
>> Beyond the practical uses, I hope this encoder could be an entry
>> point to a field of very interesting experiments with CLAs.
>>
>> The principle of the encoder is very simple: it provides a kind of
>> postprocessing of the original input, which is then added to the encoder's
>> output as another field (score).
>>
>> A use case for the encoder is e.g. behavior modeling (which I'll show
>> further on). A typical example: use a vector encoder where two fields carry
>> the meaning of (position X, position Y); at initialization, the encoder is
>> passed a user-defined evaluation function, which accepts the input and
>> produces a score for it. For this example, the score could be the Euclidean
>> distance to a defined target (1,1). The resulting (post)input would be "[x,
>> y], score" -> which is converted to a bitmap as output.
>>
>>
>> ===================
>>
>> The behavior and emotions experiments with NuPIC can be found in my
>> https://github.com/breznak/ALife repo.
>>
>>
>> 1/ Emotions:
>> -I went on to assume that basic emotions (low-level ones, like hunger,
>> pain, "feeling good") can be hardwired into the program, as they are in
>> humans and animals, where they are encoded in hormone levels (adrenaline,
>> ...). Such emotions drive "osmosis", where the body wants to keep certain
>> conditions and inner states - like hunger levels, a reasonable temperature,
>> the "biological clock" for mothers, ...
>>
>> This is modelled by the utility-encoder (above).
>>
>> Emotions can be used to model higher-level goals as well. Here it loses
>> biological plausibility, but the use of utility still holds. Such a case
>> could be "the will to reach a target position, get the highest profit in
>> trades, etc."
>>
>>
>> 2/ Actions' effects
>> Another interesting use is where the creature is discovering its
>> abilities (a young baby, a completely new environment [space], or an
>> artificial limb ["vision" through a taste gadget for blind people]).
>> A similar concept is used in Prolog programming/planning, where actions
>> have their prerequisites and effects (e.g. the cranes & cars and
>> monkey & banana & box examples).
>> This nicely utilizes the SP's (and TP's) ability to learn the effects,
>> requirements, and changes of actions.
>> An example could be: {"hungry", eat, chicken} -> inner-state hunger goes
>> down -> high score!
>> While {"full", eat, chicken} -> not much improvement in inner states ->
>> medium score. And finally: {"extremely hungry", play violin, violin} ->
>> lowers the food amount -> very low score.
>>
>> A stacked-actions example: the sequence {no food, hungry, walk} followed
>> by {have food, hungry, eat} has a high score, whereas the sequence {no
>> food, hungry, eat}, {no food, hungry, walk} does not.
>>
>> 3/ Behavior
>> This is the final stage; it combines the above + some sort of planning.
>> It can be described as pursuing the main goal(s) while switching to more
>> immediate sub-goals as needed. E.g. "Get from NYC to LA, avoid planes, and
>> don't die (of hunger, being hit by cars, ...)."
>>
>> The utility map is quite hard to plot, because it actually changes with
>> position, action, and inner states (and time).
>>
>> This is modelled by the behavior agent, who perceives the world, keeps an
>> inner representation of the explored states (memory - "5 gold on pos
>> [1,5]; troll on [8,8]"), and has a collection of its inner states (hunger,
>> body temperature, oz of gas in the car's tank, ...). This agent updates
>> its utility map for each {state, inner-state, action} taken (similar to
>> reinforcement learning).
>>
>> Like I said, the agent creates a utility map as it consumes its resources
>> and moves through the environment. Emotions allow shaping its direction
>> toward (sub)goals. Progress is made by minimization (or maximization, it
>> doesn't matter) of the utility function, following its gradient. Here,
>> "choose the best" can be done either "artificially" (in a non-biological
>> way), or there could be a higher-level region which takes the possible
>> inputs and chooses by the minimum score. (Out of interest, such a
>> minimizing CLA would be a nice proof of concept.)
>>
>>
>> I'd like to hear your further ideas, other examples, flaws in my plan,
>> etc. :)
>>
>> Cheers,
>> breznak
>>
>> --
>> Marek Otahal :o)
>>
>> _______________________________________________
>> nupic mailing list
>> [email protected]
>> http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org
>>
>>
>
--
Marek Otahal :o)