Thanks for the info, Scott!

From what I understand, this assumes running the data through the OPF, where the data looks like:
# val1,val2,meta
1,1,S
1,2,S
2,1,R
2,2,S
2,2,R

So for the model I'd specify that column #3 is the meta column, and it will reset before each 'R' line?

Thanks,

On Fri, Nov 15, 2013 at 10:15 PM, Scott Purdy <[email protected]> wrote:

> It seems like you figured out resets directly with the CLA model but just
> for future reference: you can specify resets in data for description.py
> files through a specific type of column. The file format has three header
> lines and the third is the "FieldMetaSpecial".
>
> You can specify "S" (for sequence) to have a reset inserted right before
> any new value. In other words, you put the same value in the column for
> every row in the sequence and when the OPF sees a new value it will know it
> is the start of a new sequence and insert a reset before the record.
>
> Alternatively, you can specify "R" as a boolean field. This is a different
> method for achieving the same result.
>
> See the FieldMetaSpecial class here:
> py/nupic/data/fieldmeta.py
>
>
> On Thu, Nov 14, 2013 at 8:48 PM, Marek Otahal <[email protected]> wrote:
>
>> Hey, thanks a lot!!
>>
>> First I wanted direct access to TP/SP, but looking at the model I found
>> this, which is great!
>>
>> https://github.com/numenta/nupic/blob/master/py/nupic/frameworks/opf/clamodel.py#L239
>>
>> When calling the reset for sentence separators (!,.,?,:,",....), the
>> results look much more accurate: see below.
>>
>> Btw, the cpp impl of SP serves linguist well. I'll send a PR to your
>> branch tmr.
>>
>> Best regards, Mark
>>
>>
>> ----------------Linguist on child-stories.txt-------------------------------------
>> [27944] s ==> would come toge (0.59 | 0.43 | 0.44 | 0.43 | 0.43 | 0.45 | 0.64 | 0.45 | 0.60 | 0.43 | 0.45 | 0.65 | 0.47 | 0.43 | 0.43 | 0.45)
>> [27945] . ==> |If a boat saile (0.92 | 0.87 | 0.87 | 0.87 | 0.60 | 0.69 | 0.60 | 0.65 | 0.60 | 0.60 | 0.64 | 0.62 | 0.60 | 0.60 | 0.60 | 0.65)
>> DEBUG: Result of PyRegion::executeCommand : 'None'
>> reset
>> [27946] | ==> If a boat sailed (0.84 | 0.84 | 0.85 | 0.57 | 0.62 | 0.57 | 0.57 | 0.57 | 0.57 | 0.61 | 0.58 | 0.58 | 0.58 | 0.57 | 0.58 | 0.58)
>> [27947] T ==> hey were as high (1.00 | 1.00 | 0.92 | 0.93 | 0.91 | 0.85 | 0.85 | 0.85 | 0.85 | 0.85 | 0.85 | 0.86 | 0.85 | 0.86 | 0.85 | 0.86)
>> [27948] h ==> e were as high (0.96 | 0.41 | 0.55 | 0.40 | 0.36 | 0.36 | 0.37 | 0.53 | 0.36 | 0.36 | 0.44 | 0.36 | 0.37 | 0.37 | 0.42 | 0.41)
>> [27949] e ==> atee the rock (0.52 | 0.36 | 0.36 | 0.33 | 0.29 | 0.25 | 0.25 | 0.25 | 0.28 | 0.24 | 0.24 | 0.46 | 0.27 | 0.27 | 0.27 | 0.27)
>> [27950] s ==> .|Thed come toge (0.51 | 0.51 | 0.35 | 0.35 | 0.35 | 0.45 | 0.64 | 0.45 | 0.60 | 0.43 | 0.45 | 0.65 | 0.47 | 0.43 | 0.43 | 0.45)
>> [27951] e ==> therhehd break t (0.26 | 0.25 | 0.25 | 0.25 | 0.36 | 0.36 | 0.26 | 0.34 | 0.36 | 0.32 | 0.31 | 0.31 | 0.31 | 0.31 | 0.32 | 0.31)
>> [27952] ==> poeces.|T esetle (0.23 | 0.46 | 0.32 | 0.23 | 0.28 | 0.26 | 0.26 | 0.26 | 0.23 | 0.29 | 0.26 | 0.23 | 0.27 | 0.29 | 0.25 | 0.54)
>> [27953] r ==> ocks wo ld tome (0.36 | 0.36 | 0.65 | 0.36 | 0.35 | 0.35 | 0.35 | 0.37 | 0.35 | 0.35 | 0.42 | 0.37 | 0.35 | 0.35 | 0.35 | 0.37)
>> [27954] o ==> at ooeeder aehe (0.25 | 0.25 | 0.47 | 0.40 | 0.22 | 0.40 | 0.21 | 0.29 | 0.26 | 0.30 | 0.30 | 0.21 | 0.21 | 0.37 | 0.38 | 0.38)
>> [27955] c ==> ese These ro and (0.34 | 0.63 | 0.34 | 0.34 | 0.34 | 0.34 | 0.34 | 0.35 | 0.36 | 0.35 | 0.34 | 0.35 | 0.53 | 0.48 | 0.48 | 0.48)
>> [27956] k ==> s would come tog (0.66 | 0.66 | 0.64 | 0.65 | 0.64 | 0.64 | 0.65 | 0.75 | 0.64 | 0.64 | 0.64 | 0.64 | 0.64 | 0.64 | 0.64 | 0.64)
>>
>>
>>
>> On Fri, Nov 15, 2013 at 4:33 AM, Chetan Surpur <[email protected]> wrote:
>>
>>> Mark,
>>>
>>> Linguist doesn't use the OPF other than for swarming.
>>> It directly calls methods on the CLA model. If you want to have it reset
>>> the sequence when it reads a particular character, you can just add that
>>> logic to the Linguist code.
>>>
>>> - Chetan
>>>
>>>
>>> On Thu, Nov 14, 2013 at 6:51 PM, Marek Otahal <[email protected]> wrote:
>>>
>>>> This problem touches text prediction/generation, but it is a general
>>>> NuPIC algorithmic topic.
>>>>
>>>> Playing with Chetan's linguist repo
>>>> https://github.com/chetan51/linguist/issues/1 , I discussed the
>>>> (relatively poor) results with Chetan and Scott (conversation below).
>>>>
>>>> Then I realized we do not do resets in the text streams. And text
>>>> streams are one example where resets are reasonable to do (and well
>>>> defined too).
>>>>
>>>> From what I recall, the OPF allows forcing a TP reset after periodic
>>>> time intervals, which is unusable here (worst case, I could set it to an
>>>> average sentence length). The other case where the OPF does a reset is
>>>> the end of the dataset and the start of a new epoch. That's why we see
>>>> relatively good results on trivial "Hello World!" datasets.
>>>>
>>>> Ideally, I'd like to define a set of "terminators" = ['!','.','?'] and
>>>> call reset() whenever the new char is one of those. Is there a reasonable
>>>> way to rewrite (where?) the OPF to allow this behavior?
>>>>
>>>> Related to the OPF & API thread, that's why I'd like the OPF, or its
>>>> successor, to offer a 'fnName' : 'listOfParams' setting, where fnName
>>>> would be executed each round with parameters listOfParams. This way,
>>>> I could simply pass:
>>>>
>>>>     def _checkTerminate(c, listTerm):
>>>>         if c in listTerm:
>>>>             TP.reset()
>>>>
>>>>
>>>> You may say: don't use the OPF then. For this case I probably will, as
>>>> it's easy to chain encoder|SP|TP. The OPF does some improved things for
>>>> inference etc., see Scott below.
>>>>
>>>> Cheers! Mark.
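The terminator idea above can be sketched without the OPF at all. This is only an illustration under stated assumptions: StubModel, feed, reset_sequence and feed_text are made-up names, not NuPIC calls, and a real model would need its own reset method substituted in. The reset is issued right after the terminator has been fed, which matches the log earlier in the thread where "reset" appears after the "." record.

```python
# Sketch only: StubModel stands in for a sequence model. Its method names
# are hypothetical and are not part of the NuPIC API.

TERMINATORS = frozenset("!.?")


class StubModel(object):
    """Minimal stand-in that records what it is asked to do."""

    def __init__(self):
        self.events = []

    def feed(self, char):
        self.events.append(("feed", char))

    def reset_sequence(self):
        self.events.append(("reset", None))


def feed_text(model, text, terminators=TERMINATORS):
    """Feed text one character at a time, resetting after each terminator.

    Returns the number of resets issued.
    """
    resets = 0
    for char in text:
        model.feed(char)
        if char in terminators:
            # The sentence has ended, so the next character starts fresh.
            model.reset_sequence()
            resets += 1
    return resets


model = StubModel()
n_resets = feed_text(model, "Hi. Go!")
```

Keeping the terminator set as a parameter makes it easy to try the longer separator list (colon, quote, and so on) without touching the loop.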
>>>>
>>>>
>>>> ---------------------------------------------
>>>>
>>>> The temporal pooler will have a set of cells predicted at each step
>>>> (multiple simultaneous predictions). The classifier converts the predicted
>>>> cells back to letters. So when it sees "m" it may be predicting the TP
>>>> cells for both "a" in "made" and "a" in "matches". The classifier is
>>>> guessing that the "m" is the start of "made" but when the "a" comes the TP
>>>> doesn't necessarily lock on to just the "made" sequence. So in the next
>>>> step the classifier is still guessing whether you are in the "made"
>>>> sequence or the "matches" sequence.
>>>>
>>>> I am sort of spitballing here but it seems like the behavior seen,
>>>> while not intuitive, could be correct, at least for some of the letters.
>>>>
>>>> The spatial pooler and the CLA classifier make it a little hard to
>>>> reason about the results. Perhaps an alternative would be to use just the
>>>> temporal pooler. You could have 40 or so columns for each character that
>>>> you want to include. I would limit the characters you include (convert
>>>> everything to lowercase, for instance). If you have 30 characters with 40
>>>> columns per character then you need a TP with 1200 columns. Assign the
>>>> first 40 columns to "a", the next 40 to "b", etc. And you can directly map
>>>> the predicted cells/columns back into predicted letters (and the more
>>>> predicted columns for a given letter, the more likely you can say that
>>>> letter will come next).
>>>>
>>>> The downside is that you can only predict one step ahead. So not sure
>>>> if you want to move to this but it would make it easier to reason about the
>>>> results. You can see examples of using the TP directly here:
>>>> https://github.com/numenta/nupic/tree/master/examples/tp
>>>>
>>>> Hope that helps a little.
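The column-per-character scheme described above (40 columns per character, 30 characters, 1200 columns) is simple enough to sketch in plain Python. No TP is involved here: the alphabet, the function names, and the scoring by fraction of predicted columns are assumptions made for illustration, and the "predicted" columns are hand-made rather than produced by a model.

```python
from collections import Counter

COLUMNS_PER_CHAR = 40
# Hypothetical 30-symbol alphabet: lowercase letters plus four extra symbols.
ALPHABET = "abcdefghijklmnopqrstuvwxyz .,!"
NUM_COLUMNS = COLUMNS_PER_CHAR * len(ALPHABET)  # 30 * 40 = 1200


def encode(char):
    """Return the block of column indices assigned to char."""
    start = ALPHABET.index(char) * COLUMNS_PER_CHAR
    return list(range(start, start + COLUMNS_PER_CHAR))


def decode(predicted_columns):
    """Rank characters by the fraction of their columns that are predicted."""
    counts = Counter(
        ALPHABET[col // COLUMNS_PER_CHAR] for col in predicted_columns
    )
    return [
        (char, n / float(COLUMNS_PER_CHAR))
        for char, n in counts.most_common()
    ]


# Hand-made prediction: all of "a"'s columns plus a quarter of "e"'s.
predicted = encode("a") + encode("e")[:10]
ranking = decode(predicted)
```

With this layout the ranking comes straight from integer division of the column index, which is what makes the TP-only setup easier to reason about than going through the classifier.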
>>>>
>>>>
>>>> --
>>>> Marek Otahal :o)
>>>>
>>
>> --
>> Marek Otahal :o)
>>
>

--
Marek Otahal :o)
_______________________________________________ nupic mailing list [email protected] http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org
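For reference, the two reset styles Scott describes can be mimicked in a few lines. As I read his description, the "S" or "R" letter goes in the third header line for that column, while the rows themselves carry either a sequence id (for "S") or a 0/1 flag (for "R"). The sketch below does not read a real file or call NuPIC; the function names are hypothetical, and treating the very first row as the start of a sequence is an assumption.

```python
def resets_from_sequence_ids(sequence_ids):
    """'S'-style column: reset before any row whose id differs from the
    previous row's id. The first row is assumed to start a sequence."""
    resets = []
    previous = object()  # sentinel that never equals a real id
    for sequence_id in sequence_ids:
        resets.append(sequence_id != previous)
        previous = sequence_id
    return resets


def resets_from_flags(flags):
    """'R'-style column: each row carries its own 0/1 reset flag."""
    return [bool(int(flag)) for flag in flags]


# Five rows forming two sequences, expressed both ways.
by_sequence = resets_from_sequence_ids([1, 1, 2, 2, 2])
by_flag = resets_from_flags(["1", "0", "1", "0", "0"])
```

Both calls mark rows 1 and 3 as sequence starts, which is the sense in which the two column types are different methods for the same result.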
