Should we open an issue to fix that portion of the white paper?


> On Aug 19, 2014, at 7:37 AM, Fergal Byrne <[email protected]> wrote:
> 
> Hi John,
> 
> Good spot. That was an error in the white paper, and nothing ever came of 
> attempts to implement it. It's been superseded by the new theory involving 
> sensorimotor memory, in which inter-region feedforward communication is 
> composed of an SDR of active neurons in L3.
> 
> Regards,
> 
> Fergal Byrne 
> 
> 
>> On Tue, Aug 19, 2014 at 1:15 PM, John Blackburn <[email protected]> 
>> wrote:
>> Fergal, Thanks for your replies. You have certainly made things clearer to 
>> me.
>> 
>> However, I think the whitepaper (v0.2.1, Sep 2011) says that both predictive 
>> and active cells are passed to the next region:
>> 
>> p25: "The output of a region is the activity of all cells in the region, 
>> including the cells active because of feed-forward input and the cells 
>> active in the predictive state. As mentioned earlier, predictions"
>> 
>> p31: "Note that only cells that are active due to feed-forward input 
>> propagate activity within the region, otherwise predictions would lead to 
>> further predictions. But all the active cells (feed-forward and predictive) 
>> form the output of a region and propagate to the next region in the 
>> hierarchy."
>> 
>> John.
>> 
>> 
>>> On Tue, Aug 19, 2014 at 1:01 PM, Fergal Byrne <[email protected]> 
>>> wrote:
>>> Hi Nick,
>>> 
>>> Only active states are ever transmitted.
>>> 
>>> There are several reasons for this. CLA is a computational model of the 
>>> neocortex, so it must conform with neuroscience at certain levels of 
>>> detail. In particular, inter-neuron communication beyond a certain 
>>> distance happens only via action potentials propagated along axons by 
>>> firing neurons (in the neocortex), or via a neuron being active in the 
>>> current timestep (in CLA). Any other information is invisible.
>>> 
>>> Much more locally, however, predictive potential does play a role. In 
>>> forming an SDR in CLA, we enforce sparseness by choosing the n% (usually 
>>> 2%) highest "potentials" among cells (among columns in NuPIC), based on 
>>> their response to the input.  We call this "inhibition". In the neocortex, 
>>> what actually happens is that each cell is depolarised at a different rate 
>>> depending on synaptic inputs. The cells with the highest rates reach 
>>> their firing threshold first and fire, triggering a wave of inhibition 
>>> that spreads outwards and drastically reduces their neighbours' rates of 
>>> depolarisation.
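The inhibition step described here can be sketched as a global k-winners-take-all over column overlap scores. This is a simplified illustration (the function name is hypothetical, not NuPIC's actual API); real NuPIC inhibition also supports local neighbourhoods and boosting.

```python
import numpy as np

def inhibit(overlaps, sparsity=0.02):
    """Keep only the top n% of columns by feedforward overlap score.

    A toy stand-in for CLA inhibition: the columns whose cells would
    depolarise fastest win, and all other columns are suppressed.
    """
    k = max(1, int(len(overlaps) * sparsity))
    winners = np.argsort(overlaps)[-k:]   # indices of the k highest overlaps
    active = np.zeros(len(overlaps), dtype=bool)
    active[winners] = True
    return active

overlaps = np.array([3.0, 9.0, 1.0, 7.0, 5.0, 8.0, 2.0, 6.0, 4.0, 0.0])
print(inhibit(overlaps, sparsity=0.2))   # columns 1 and 5 win
```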
>>> 
>>> In NuPIC, potential due to feedforward alone is used in SP to choose the 
>>> columns, and then potential due to lateral or predictive inputs is used to 
>>> choose the active cell within each column. In the neocortex, and in a more 
>>> faithful CLA implementation, predictive depolarisation is combined with 
>>> feedforward depolarisation to choose individual cells. 
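That two-stage choice can be sketched as below (illustrative names only, not NuPIC's real data structures): the SP has already picked winning columns from feedforward overlap; within each column, correctly predicted cells become active, and a column with no predicted cell "bursts" with all its cells active.

```python
def activate_cells(winning_columns, predictive_cells, cells_per_column=4):
    """Pick active cells within SP-chosen columns.

    winning_columns:  column indices chosen by the spatial pooler.
    predictive_cells: set of (column, cell) pairs that were predictive
                      in the previous timestep (lateral/distal input).
    """
    active = set()
    for col in winning_columns:
        predicted = {(col, c) for c in range(cells_per_column)
                     if (col, c) in predictive_cells}
        if predicted:
            active |= predicted      # context matched: stay sparse
        else:                        # unpredicted input: burst the column
            active |= {(col, c) for c in range(cells_per_column)}
    return active

# column 0 had cell 2 predicted; column 3 was not predicted, so it bursts
print(sorted(activate_cells([0, 3], {(0, 2)})))
# [(0, 2), (3, 0), (3, 1), (3, 2), (3, 3)]
```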
>>> 
>>> Regards,
>>> 
>>> Fergal Byrne
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>>> On Tue, Aug 19, 2014 at 11:48 AM, Nicholas Mitri <[email protected]> 
>>>> wrote:
>>>> Thanks Fergal,
>>>> 
>>>> I’d like to reiterate John’s question, though. Are the predictive and 
>>>> active states both passed on to the next region as ‘1’, or do we follow 
>>>> the same paradigm in assuming all relevant information is encoded in the 
>>>> active bits, only propagating those states upward while ignoring 
>>>> predictive states?
>>>> 
>>>> Nick
>>>> 
>>>> 
>>>>> On Aug 19, 2014, at 1:40 PM, Fergal Byrne <[email protected]> 
>>>>> wrote:
>>>>> 
>>>>> Hi John,
>>>>> 
>>>>> The classifier is extra-cortical - it's a piece of engineering added to 
>>>>> efficiently extract useful predictions. To explain how it works, let's 
>>>>> use a concrete example of predicting energy use 10 steps ahead in the 
>>>>> hotgym use case.
>>>>> 
>>>>> Firstly, at the outset you tell NuPIC you want to predict a certain field 
>>>>> a certain number of steps ahead (you can do multiple predictions but 
>>>>> these are just copies of the same process). The classifier sets up a 
>>>>> virtual histogram for every cell, which will store the 10-step 
>>>>> predictions of energy use for that cell. For every input seen, the 
>>>>> classifier looks at the active cells from 10 steps in the past and 
>>>>> updates their histograms with the current value of energy use.
>>>>> 
>>>>> To extract a prediction for 10 steps in the future, look at all the 
>>>>> active cells' histograms, and combine their predictions. 
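As a minimal sketch of this mechanism (a hypothetical class, not the real CLAClassifier, which also applies a rolling average to the counts): keep a buffer of the last N active-cell sets, let the cells active N steps ago vote for the bucket seen now, and at inference combine the current active cells' histograms.

```python
from collections import defaultdict, deque

class HistogramClassifier:
    """Toy N-steps-ahead predictor in the spirit of the CLA classifier.

    Keeps a rolling buffer of the last `steps` active-cell sets; each
    cell accumulates a histogram of which input bucket arrived `steps`
    timesteps after it was active.
    """
    def __init__(self, steps=10):
        self.steps = steps
        self.buffer = deque(maxlen=steps)                   # past activations
        self.hist = defaultdict(lambda: defaultdict(int))   # cell -> bucket -> count

    def learn(self, active_cells, current_bucket):
        if len(self.buffer) == self.steps:
            # cells active `steps` ago vote for the bucket we see now
            for cell in self.buffer[0]:
                self.hist[cell][current_bucket] += 1
        self.buffer.append(set(active_cells))

    def infer(self, active_cells):
        """Combine the active cells' histograms into bucket likelihoods."""
        votes = defaultdict(float)
        for cell in active_cells:
            total = sum(self.hist[cell].values())
            for bucket, count in self.hist[cell].items():
                votes[bucket] += count / total
        norm = sum(votes.values())
        return {b: v / norm for b, v in votes.items()} if norm else {}

clf = HistogramClassifier(steps=2)
for active, bucket in [({0, 1}, "low"), ({2, 3}, "high"),
                       ({0, 1}, "low"), ({2, 3}, "high")]:
    clf.learn(active, bucket)
print(clf.infer({0, 1}))   # cells 0 and 1 were only ever followed by "low"
```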
>>>>> 
>>>>> The reason this usually works is that the pattern of currently 
>>>>> active cells (not just columns) identifies the current input in the 
>>>>> current learned sequence. This very sparse representation statistically 
>>>>> implies a very limited set of future outcomes, and the layer's collective 
>>>>> beliefs, derived from combining the histograms, form a good estimate of 
>>>>> the future of the data.
>>>>> 
>>>>> The pattern of predictive cells in CLA is a prediction of the next SDR(s) 
>>>>> one timestep ahead. It could also be used for prediction if you're only 
>>>>> interested in exactly one step ahead, but it would have to be "decoded" 
>>>>> to reconstruct the next input for one field; the histogram already has 
>>>>> its data in the input domain, so it's easier and cheaper just to use the 
>>>>> histograms. 
>>>>> 
>>>>> The predictive pattern is, however, crucial in identifying which cells to 
>>>>> activate in the next timestep, which then become the sparse set of active 
>>>>> cells from which we derive the 10-step prediction, so predictive states 
>>>>> are key to NuPIC's predictive power.
>>>>> 
>>>>> Regards,
>>>>> 
>>>>> Fergal Byrne
>>>>> 
>>>>> 
>>>>> 
>>>>>> On Tue, Aug 19, 2014 at 11:14 AM, John Blackburn 
>>>>>> <[email protected]> wrote:
>>>>>> I've been following this discussion with interest. One question, you say 
>>>>>> only active cells are considered in the classifier but my understanding 
>>>>>> is the input to the next region is the union of active and predictive 
>>>>>> cells. That is, if the cell is active or predictive, the next region in 
>>>>>> the hierarchy gets a 1. If it is inactive it gets a 0. Thus, the next 
>>>>>> region cannot distinguish between active and predictive cells. Is that 
>>>>>> still the case? If so, why does the classifier not take the same 
>>>>>> approach?
>>>>>> 
>>>>>> Many thanks for your advice,
>>>>>> 
>>>>>> John Blackburn
>>>>>> 
>>>>>> 
>>>>>>> On Tue, Aug 19, 2014 at 8:42 AM, Nicholas Mitri <[email protected]> 
>>>>>>> wrote:
>>>>>>> Great! Thanks, Subutai. Much appreciated. 
>>>>>>> 
>>>>>>> 
>>>>>>>> On Aug 19, 2014, at 3:32 AM, Subutai Ahmad <[email protected]> wrote:
>>>>>>>> 
>>>>>>>> Hi Nick,
>>>>>>>> 
>>>>>>>> I believe your understanding is exactly right. If we are predicting 10 
>>>>>>>> steps into the future, the classifier has to keep a rolling buffer of 
>>>>>>>> the last 10 sets of active bits. The classifier sort-of outputs the 
>>>>>>>> conditional probability of each bucket given the current activation. I 
>>>>>>>> say "sort-of" because there's a rolling average in there, so it's 
>>>>>>>> really a "recent conditional probability".  This is how the OPF 
>>>>>>>> outputs probabilities for each set of predictions.
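The "recent conditional probability" Subutai describes can be approximated with an exponential moving average over one cell's bucket weights. This is a sketch only; the function name, alpha, and update rule are illustrative, not the exact NuPIC formula.

```python
def ema_update(dist, observed_bucket, alpha=0.1):
    """Decay every bucket's weight, then bump the observed bucket.

    dist: {bucket: weight} for one cell; returns a normalised copy.
    alpha controls how fast old evidence fades.
    """
    new = {b: w * (1 - alpha) for b, w in dist.items()}
    new[observed_bucket] = new.get(observed_bucket, 0.0) + alpha
    total = sum(new.values())
    return {b: w / total for b, w in new.items()}

d = {}
for bucket in ["A", "A", "B"]:
    d = ema_update(d, bucket)
print(d)   # most weight on "A", a recent slice on "B"
```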
>>>>>>>> 
>>>>>>>> I believe the implementation stores the indices only for the 
>>>>>>>> historical buffer.   The C++ code for this is in nupic.core, in 
>>>>>>>> FastClaClassifier.hpp/cpp.  
>>>>>>>> 
>>>>>>>> --Subutai
>>>>>>>> 
>>>>>>>> 
>>>>>>>>> On Sat, Aug 16, 2014 at 6:14 AM, Nicholas Mitri <[email protected]> 
>>>>>>>>> wrote:
>>>>>>>>> Hi Subutai,
>>>>>>>>> 
>>>>>>>>> So we’re using the predictive state of the cells as a middle step 
>>>>>>>>> (during learning) to encode context into the representation of the 
>>>>>>>>> input pattern using only active bits? But that’s the extent of their 
>>>>>>>>> practical use as far as the CLA classifier is concerned. 
>>>>>>>>> 
>>>>>>>>> I understood the point you made about the fact that context encoded 
>>>>>>>>> into active bits gives us all the information we need for prediction, 
>>>>>>>>> but there’s still one issue I’m having with the operation of the CLA 
>>>>>>>>> classifier. 
>>>>>>>>> 
>>>>>>>>> If we’re only using active bits, then the RADC matrix we’re storing 
>>>>>>>>> should maintain and update a coincidence counter between the current 
>>>>>>>>> bucket and the active bits from a previous time step during its 
>>>>>>>>> learning phase. In that way, when the classifier is in inference mode, 
>>>>>>>>> the likelihood becomes the conditional probability of a future bucket 
>>>>>>>>> given current activation. In other words, the classifier learning 
>>>>>>>>> phase creates a relation between past info (active output of TP at 
>>>>>>>>> time = t - x) and current input value (bucket index at time t) so 
>>>>>>>>> that during inference we can use current information (at time = t) to 
>>>>>>>>> predict future values (at time = t + x). (The document attached isn’t 
>>>>>>>>> very clear on that point).
>>>>>>>>> 
>>>>>>>>> If that’s the case, then the active state of the region should be 
>>>>>>>>> stored for future use. Is any of that accurate? and if so, would we 
>>>>>>>>> be storing the state of every cell or only the index of the active 
>>>>>>>>> ones?
>>>>>>>>> 
>>>>>>>>> best,
>>>>>>>>> Nick
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>>> On Aug 15, 2014, at 9:18 PM, Subutai Ahmad <[email protected]> 
>>>>>>>>>> wrote:
>>>>>>>>>> 
>>>>>>>>>> Hi Nick,
>>>>>>>>>> 
>>>>>>>>>> That’s a great question, and one we worked through as well. The 
>>>>>>>>>> classifier really does use only the active bits.  If you think about 
>>>>>>>>>> it, the active bits include all the available information about the 
>>>>>>>>>> high order sequence. It includes the full dynamic context and all 
>>>>>>>>>> future predictions about this sequence can be derived from the 
>>>>>>>>>> active bits. 
>>>>>>>>>> 
>>>>>>>>>> For example, suppose you've learned different melodies and start 
>>>>>>>>>> listening to a song. Once the first few notes are played, there 
>>>>>>>>>> could be many different musical pieces that start the same way. The 
>>>>>>>>>> active state includes all possible melodies that start with these 
>>>>>>>>>> notes. 
>>>>>>>>>> 
>>>>>>>>>> Once you are in the middle of the melody and it’s now unambiguous, 
>>>>>>>>>> the active state at any point is unique to that melody as well as 
>>>>>>>>>> the position within that melody. If you are a musician, you could 
>>>>>>>>>> actually stop listening, take over and play the rest of the song. 
>>>>>>>>>> Similarly, a classifier can take that state as input and predict the 
>>>>>>>>>> sequence of all those notes into the future with 100% accuracy.  
>>>>>>>>>> This is a very cool property. It is a result of the capacity 
>>>>>>>>>> inherent in sparse representations and critical to representing high 
>>>>>>>>>> order sequences.
>>>>>>>>>> 
>>>>>>>>>> As such, the classifier only needs the active state to predict the 
>>>>>>>>>> next N steps.
>>>>>>>>>> 
>>>>>>>>>> So what is the predictive state? The predictive state is in fact 
>>>>>>>>>> just a function of the active bits and the current set of segments. 
>>>>>>>>>> It doesn’t add new information. However it has other uses. The 
>>>>>>>>>> predictive state is used in the Temporal Memory to update the set of 
>>>>>>>>>> active bits given new sensory information. This helps fine tune the 
>>>>>>>>>> active state as you get new information. It also helps the system 
>>>>>>>>>> refine learning as new (possibly unpredicted) information comes in.  
>>>>>>>>>> 
>>>>>>>>>> —Subutai
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>>> On Fri, Aug 15, 2014 at 7:40 AM, Nicholas Mitri 
>>>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>>> Hi Subutai, 
>>>>>>>>>>> 
>>>>>>>>>>> Again, thanks for forwarding the document. It was really helpful. 
>>>>>>>>>>> 
>>>>>>>>>>> I have a quick question before I delve deeper into the classifier. 
>>>>>>>>>>> The document mentions that the classifier makes use of the ‘active’ 
>>>>>>>>>>> bits of the temporal pooler. Are we grouping active and predictive 
>>>>>>>>>>> bits under the label ‘active’ here?
>>>>>>>>>>> 
>>>>>>>>>>> If the predictive bits are not mapped into actual values by the 
>>>>>>>>>>> classifier, then what module is performing that task when I query 
>>>>>>>>>>> for the predicted field value at any time step?
>>>>>>>>>>> 
>>>>>>>>>>> If they are, what process is used to decouple multiple simultaneous 
>>>>>>>>>>> predictions and map each to its corresponding value to compare it 
>>>>>>>>>>> against a value after X time steps? Is it as simple as looking at 
>>>>>>>>>>> the normalized RADC table and picking the top 3 buckets with the 
>>>>>>>>>>> highest likelihoods, mapping them into their actual values, then 
>>>>>>>>>>> attaching the likelihood to the prediction as a confidence measure?
>>>>>>>>>>> 
>>>>>>>>>>> There are clearly some major holes in my understanding of the 
>>>>>>>>>>> algorithms at play, I’d appreciate the clarifications :).
>>>>>>>>>>> 
>>>>>>>>>>> thanks,
>>>>>>>>>>> Nick
>>>>>>>>>>> 
>>>>>>>>>>>> On Aug 13, 2014, at 8:39 PM, Subutai Ahmad <[email protected]> 
>>>>>>>>>>>> wrote:
>>>>>>>>>>>> 
>>>>>>>>>>>> Hi Nick,
>>>>>>>>>>>> 
>>>>>>>>>>>> Nice diagram!  In addition to the video David sent, we have a 
>>>>>>>>>>>> NuPIC issue to create this document:
>>>>>>>>>>>> 
>>>>>>>>>>>> https://github.com/numenta/nupic/issues/578
>>>>>>>>>>>> 
>>>>>>>>>>>> I found some old documentation in our archives. Scott is planning 
>>>>>>>>>>>> to update the wiki with this information. I have also attached it 
>>>>>>>>>>>> here for reference (but warning, it may be a bit outdated!)
>>>>>>>>>>>> 
>>>>>>>>>>>> --Subutai
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>>> On Wed, Aug 13, 2014 at 9:03 AM, cogmission1 . 
>>>>>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>>>>> Hi Nicholas,
>>>>>>>>>>>>> 
>>>>>>>>>>>>> This is the only source with any depth I have seen. Have you seen 
>>>>>>>>>>>>> this?
>>>>>>>>>>>>> 
>>>>>>>>>>>>> https://www.youtube.com/watch?v=z6r3ekreRzY
>>>>>>>>>>>>> 
>>>>>>>>>>>>> David
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> On Wed, Aug 13, 2014 at 10:46 AM, Nicholas Mitri 
>>>>>>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>>>>>> Hey all, 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Based on my understanding of the material in the wiki, the CLA 
>>>>>>>>>>>>>> algorithms can be depicted by the figure below. 
>>>>>>>>>>>>>> There’s plenty of info about SP and TP in both theory and 
>>>>>>>>>>>>>> implementation details. 
>>>>>>>>>>>>>> I can’t seem to find much information about the classifier 
>>>>>>>>>>>>>> though. 
>>>>>>>>>>>>>> If I’ve understood correctly, this is not a classifier in the 
>>>>>>>>>>>>>> Machine Learning sense of the word but rather a mechanism to 
>>>>>>>>>>>>>> translate TP output into values of the same data type as the 
>>>>>>>>>>>>>> input for comparison purposes. 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> I’d really appreciate some more involved explanation of the 
>>>>>>>>>>>>>> process in terms of what data is stored step to step and how the 
>>>>>>>>>>>>>> look-up/mapping mechanics are implemented. 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> best,
>>>>>>>>>>>>>> Nick
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> <Screen Shot 2013-12-02 at 4.00.01 PM.png>
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>>> nupic mailing list
>>>>>>>>>>>>>> [email protected]
>>>>>>>>>>>>>> http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> <multistep_prediction.docx>
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> -- 
>>>>> 
>>>>> Fergal Byrne, Brenter IT
>>>>> 
>>>>> Author, Real Machine Intelligence with Clortex and NuPIC 
>>>>> https://leanpub.com/realsmartmachines
>>>>> 
>>>>> Speaking on Clortex and HTM/CLA at euroClojure Krakow, June 2014: 
>>>>> http://euroclojure.com/2014/
>>>>> and at LambdaJam Chicago, July 2014: http://www.lambdajam.com
>>>>> 
>>>>> http://inbits.com - Better Living through Thoughtful Technology
>>>>> http://ie.linkedin.com/in/fergbyrne/ - https://github.com/fergalbyrne
>>>>> 
>>>>> e:[email protected] t:+353 83 4214179
>>>>> Join the quest for Machine Intelligence at http://numenta.org
>>>>> Formerly of Adnet [email protected] http://www.adnet.ie
>>>> 
>>>> 
>>> 
>>> 
>>> 
>>> 
>> 
>> 
> 
> 
> 
