Thanks for the info, Scott!

From what I understand, it assumes the data is run through the OPF, where the
data looks like:

# val1,val2,meta
1,1,S
1,2,S
2,1,R
2,2,S
2,2,R

So for the model I'd specify column 3 as the meta field, and it will reset
before each 'R' line?
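
For reference, here is a minimal sketch of the two conventions described in the quoted reply below. It uses only the standard library and makes no NuPIC calls. It assumes the three header rows are field names, field types, and special flags; the type names in the second row, the field names `seq` and `reset`, and the helper `resets_from_sequence_ids` are made up for illustration. Unlike the sample above, "S" and "R" go in the flags header row rather than in the data rows: the "S" column holds one ID per sequence, and the "R" column holds a 0/1 flag.

```python
import csv
import io

# Three header rows (assumed layout): field names, field types, special flags.
# "S" marks a sequence-ID column, "R" marks a reset-flag column.
HEADER = [
    ["val1", "val2", "seq", "reset"],
    ["int", "int", "int", "int"],  # type names here are an assumption
    ["", "", "S", "R"],
]

# Two sequences: rows 0-1 belong to sequence 0, rows 2-4 to sequence 1.
ROWS = [
    [1, 1, 0, 1],
    [1, 2, 0, 0],
    [2, 1, 1, 1],
    [2, 2, 1, 0],
    [2, 2, 1, 0],
]


def write_csv(header, rows):
    """Return the file contents as a string."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerows(header)
    writer.writerows(rows)
    return buf.getvalue()


def resets_from_sequence_ids(seq_ids):
    """Flag a reset wherever the sequence ID differs from the previous row's.

    Assumption: the very first row also counts as a sequence start.
    """
    resets = []
    previous = object()  # sentinel that equals no real ID
    for seq_id in seq_ids:
        resets.append(seq_id != previous)
        previous = seq_id
    return resets


text = write_csv(HEADER, ROWS)
records = list(csv.reader(io.StringIO(text)))[3:]  # skip the 3 header rows
seq_resets = resets_from_sequence_ids([r[2] for r in records])
flag_resets = [r[3] == "1" for r in records]
```

In this sketch both columns mark the same rows as sequence starts, so a real file would presumably need only one of them.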

Thanks,


On Fri, Nov 15, 2013 at 10:15 PM, Scott Purdy <[email protected]> wrote:

> It seems like you figured out resets directly with the CLA model, but just
> for future reference: you can specify resets in data for description.py
> files through a specific type of column. The file format has three header
> lines and the third is the "FieldMetaSpecial" row.
>
> You can specify "S" (for sequence) to have a reset inserted right before
> any new value. In other words, you put the same value in the column for
> every row in the sequence and when the OPF sees a new value it will know it
> is the start of a new sequence and insert a reset before the record.
>
> Alternatively, you can specify "R" as a boolean field. This is a different
> method for achieving the same result.
>
> See the FieldMetaSpecial class here:
> py/nupic/data/fieldmeta.py
>
>
> On Thu, Nov 14, 2013 at 8:48 PM, Marek Otahal <[email protected]> wrote:
>
>> Hey, thanks a lot!!
>>
>> First I wanted direct access to TP/SP, but looking at the model I found
>> this, which is great!
>>
>> https://github.com/numenta/nupic/blob/master/py/nupic/frameworks/opf/clamodel.py#L239
>>
>> When calling the reset for sentence separators (!,.,?,:,",....), the
>> results look much more accurate: see below.
>>
>> Btw, the C++ implementation of the SP serves Linguist well. I'll send a PR
>> to your branch tomorrow.
>>
>> Best regards, Mark
>>
>>
>> ---------------- Linguist on child-stories.txt ----------------
>> [27944]  s ==>  would come toge (0.59 | 0.43 | 0.44 | 0.43 | 0.43 | 0.45
>> | 0.64 | 0.45 | 0.60 | 0.43 | 0.45 | 0.65 | 0.47 | 0.43 | 0.43 | 0.45)
>> [27945]  . ==> |If a boat saile (0.92 | 0.87 | 0.87 | 0.87 | 0.60 | 0.69
>> | 0.60 | 0.65 | 0.60 | 0.60 | 0.64 | 0.62 | 0.60 | 0.60 | 0.60 | 0.65)
>> DEBUG:  Result of PyRegion::executeCommand : 'None'
>> reset
>> [27946]  | ==> If a boat sailed (0.84 | 0.84 | 0.85 | 0.57 | 0.62 | 0.57
>> | 0.57 | 0.57 | 0.57 | 0.61 | 0.58 | 0.58 | 0.58 | 0.57 | 0.58 | 0.58)
>> [27947]  T ==> hey were as high (1.00 | 1.00 | 0.92 | 0.93 | 0.91 | 0.85
>> | 0.85 | 0.85 | 0.85 | 0.85 | 0.85 | 0.86 | 0.85 | 0.86 | 0.85 | 0.86)
>> [27948]  h ==> e  were as high  (0.96 | 0.41 | 0.55 | 0.40 | 0.36 | 0.36
>> | 0.37 | 0.53 | 0.36 | 0.36 | 0.44 | 0.36 | 0.37 | 0.37 | 0.42 | 0.41)
>> [27949]  e ==>   atee  the rock (0.52 | 0.36 | 0.36 | 0.33 | 0.29 | 0.25
>> | 0.25 | 0.25 | 0.28 | 0.24 | 0.24 | 0.46 | 0.27 | 0.27 | 0.27 | 0.27)
>> [27950]  s ==> .|Thed come toge (0.51 | 0.51 | 0.35 | 0.35 | 0.35 | 0.45
>> | 0.64 | 0.45 | 0.60 | 0.43 | 0.45 | 0.65 | 0.47 | 0.43 | 0.43 | 0.45)
>> [27951]  e ==> therhehd break t (0.26 | 0.25 | 0.25 | 0.25 | 0.36 | 0.36
>> | 0.26 | 0.34 | 0.36 | 0.32 | 0.31 | 0.31 | 0.31 | 0.31 | 0.32 | 0.31)
>> [27952]    ==> poeces.|T esetle (0.23 | 0.46 | 0.32 | 0.23 | 0.28 | 0.26
>> | 0.26 | 0.26 | 0.23 | 0.29 | 0.26 | 0.23 | 0.27 | 0.29 | 0.25 | 0.54)
>> [27953]  r ==> ocks wo ld tome  (0.36 | 0.36 | 0.65 | 0.36 | 0.35 | 0.35
>> | 0.35 | 0.37 | 0.35 | 0.35 | 0.42 | 0.37 | 0.35 | 0.35 | 0.35 | 0.37)
>> [27954]  o ==> at  ooeeder aehe (0.25 | 0.25 | 0.47 | 0.40 | 0.22 | 0.40
>> | 0.21 | 0.29 | 0.26 | 0.30 | 0.30 | 0.21 | 0.21 | 0.37 | 0.38 | 0.38)
>> [27955]  c ==> ese These ro and (0.34 | 0.63 | 0.34 | 0.34 | 0.34 | 0.34
>> | 0.34 | 0.35 | 0.36 | 0.35 | 0.34 | 0.35 | 0.53 | 0.48 | 0.48 | 0.48)
>> [27956]  k ==> s would come tog (0.66 | 0.66 | 0.64 | 0.65 | 0.64 | 0.64
>> | 0.65 | 0.75 | 0.64 | 0.64 | 0.64 | 0.64 | 0.64 | 0.64 | 0.64 | 0.64)
>>
>>
>>
>> On Fri, Nov 15, 2013 at 4:33 AM, Chetan Surpur <[email protected]> wrote:
>>
>>> Mark,
>>>
>>> Linguist doesn't use the OPF other than for swarming. It directly calls
>>> methods on the CLA model. If you want to have it reset the sequence when it
>>> reads a particular character, you can just add that logic to the Linguist
>>> code.
>>>
>>> - Chetan
>>>
>>>
>>> On Thu, Nov 14, 2013 at 6:51 PM, Marek Otahal <[email protected]> wrote:
>>>
>>>> This problem touches on text prediction/generation, but it is really a
>>>> general NuPIC algorithmic topic.
>>>>
>>>> Playing with Chetan's linguist repo
>>>> https://github.com/chetan51/linguist/issues/1 , I discussed the
>>>> (relatively poor) results with Chetan and Scott. (conversation below)
>>>>
>>>> Then I realized we do not do resets in the text streams, and text
>>>> streams are one example where resets are both reasonable and well
>>>> defined.
>>>>
>>>> From what I recall, the OPF allows forcing a TP reset at periodic time
>>>> intervals, which is unusable here (at best, I could set it to the average
>>>> sentence length). The other case where the OPF does a reset is at the end
>>>> of the dataset and the start of a new epoch. That's why we get relatively
>>>> good results on trivial "Hello World!" datasets.
>>>>
>>>> Ideally, I'd like to define a set of "terminators" = ['!','.','?'] and
>>>> call reset() whenever the new char is one of those. Is there a reasonable
>>>> way to rewrite the OPF (and where?) to allow this behavior?
>>>>
>>>> Related to the OPF & API thread: that's why I'd like the OPF, or its
>>>> successor, to offer a 'fnName' : 'listOfParams' setting, where fnName
>>>> would be executed each round with the parameters listOfParams. This way,
>>>> I could simply pass def _checkTerminate(c, listTerm): if c in listTerm:
>>>> TP.reset()
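
A minimal sketch of the idea above, outside the OPF: feed characters one at a time and reset after each sentence terminator. `StubModel` is a hypothetical stand-in so the snippet runs on its own, and `feed_text` and `TERMINATORS` are illustrative names. With a real CLA model the reset call would presumably be its own sequence-reset method rather than the stub's `reset()`.

```python
TERMINATORS = frozenset(".!?")


class StubModel(object):
    """Hypothetical stand-in for a model; it only records what it is asked."""

    def __init__(self):
        self.events = []

    def run(self, char):
        self.events.append(("run", char))

    def reset(self):
        self.events.append(("reset", None))


def feed_text(model, text, terminators=TERMINATORS):
    """Feed characters one by one, resetting after each terminator.

    Returns the number of resets issued.
    """
    resets = 0
    for char in text:
        model.run(char)
        if char in terminators:
            # The terminator itself is still fed as the end of the sentence;
            # the next character then starts a fresh sequence.
            model.reset()
            resets += 1
    return resets


model = StubModel()
count = feed_text(model, "Hi. Go!")
```

Whether to reset after the terminator (as here) or before it is a design choice; resetting after keeps the punctuation as the last element of each learned sentence.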
>>>>
>>>>
>>>> You may say I shouldn't use the OPF then. For this case I probably will,
>>>> as it makes it easy to chain encoder|SP|TP. The OPF also does some
>>>> improved things for inference etc.; see Scott below.
>>>>
>>>> Cheers! Mark.
>>>>
>>>>
>>>> ---------------------------------------------
>>>>
>>>> The temporal pooler will have a set of cells predicted at each step
>>>> (multiple simultaneous predictions). The classifier converts the predicted
>>>> cells back to letters. So when it sees "m" it may be predicting the TP
>>>> cells for both "a" in "made" and "a" in "matches". The classifier is
>>>> guessing that the "m" is the start of "made" but when the "a" comes the TP
>>>> doesn't necessarily lock on to just the "made" sequence. So in the next
>>>> step the classifier is still guessing whether you are in the "made"
>>>> sequence or the "matches" sequence.
>>>>
>>>> I am sort of spitballing here but it seems like the behavior seen,
>>>> while not intuitive, could be correct, at least for some of the letters.
>>>>
>>>> The spatial pooler and the CLA classifier make it a little hard to
>>>> reason about the results. Perhaps an alternative would be to use just the
>>>> temporal pooler. You could have 40 or so columns for each character that
>>>> you want to include. I would limit the characters you include (convert
>>>> everything to lowercase, for instance). If you have 30 characters with 40
>>>> columns per character then you need a TP with 1200 columns. Assign the
>>>> first 40 columns to "a", the next 40 to "b", etc. And you can directly map
>>>> the predicted cells/columns back into predicted letters (and the more
>>>> predicted columns for a given letter, the more likely you can say that
>>>> letter will come next).
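
The mapping described above can be sketched in plain Python, with no TP involved: each character gets a fixed block of columns, and a set of predicted column indices is turned back into per-letter scores. The 30-symbol alphabet, `COLUMNS_PER_CHAR`, and the function names are assumptions for illustration; in practice the predicted columns would come from the TP.

```python
import string

COLUMNS_PER_CHAR = 40
# Illustrative 30-symbol alphabet: lowercase letters plus a few extras.
ALPHABET = string.ascii_lowercase + " .,!"
NUM_COLUMNS = COLUMNS_PER_CHAR * len(ALPHABET)  # 30 * 40 = 1200


def columns_for(char):
    """Return the block of column indices assigned to `char`."""
    start = ALPHABET.index(char) * COLUMNS_PER_CHAR
    return range(start, start + COLUMNS_PER_CHAR)


def letter_scores(predicted_columns):
    """Map predicted column indices back to letters.

    Returns {char: fraction of that char's columns that are predicted},
    including only characters with at least one predicted column.
    """
    counts = {}
    for col in set(predicted_columns):  # ignore duplicate indices
        char = ALPHABET[col // COLUMNS_PER_CHAR]
        counts[char] = counts.get(char, 0) + 1
    return {c: n / float(COLUMNS_PER_CHAR) for c, n in counts.items()}


# Example input: all of "a"'s columns and a quarter of "d"'s are predicted.
predicted = list(columns_for("a")) + list(columns_for("d"))[:10]
scores = letter_scores(predicted)
best = max(scores, key=scores.get)
```

The score is just the fraction of a letter's columns that are predicted, which follows the "more predicted columns, more likely" reading of the paragraph above; other weightings are possible.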
>>>>
>>>> The downside is that you can only predict one step ahead. So not sure
>>>> if you want to move to this but it would make it easier to reason about the
>>>> results. You can see examples of using the TP directly here:
>>>> https://github.com/numenta/nupic/tree/master/examples/tp
>>>>
>>>> Hope that helps a little.
>>>>
>>>>
>>>> --
>>>> Marek Otahal :o)
>>>>
>>>> _______________________________________________
>>>> nupic mailing list
>>>> [email protected]
>>>> http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org
>>>>
>>>>
>>>
>>>
>>>
>>
>>
>> --
>> Marek Otahal :o)
>>
>>
>>
>
>
>


-- 
Marek Otahal :o)