Mark,

Linguist doesn't use the OPF other than for swarming. It directly calls
methods on the CLA model. If you want to have it reset the sequence when it
reads a particular character, you can just add that logic to the Linguist
code.
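
If it helps, a minimal sketch of that logic (all names here are illustrative glue code, assuming a model object with a resetSequenceStates() method like the OPF's CLAModel — this is not Linguist's actual code):

```python
# Sketch: feed one character into the model and reset its sequence state
# at sentence boundaries. "resetSequenceStates" mirrors the OPF CLAModel
# method name; the rest is hypothetical glue code.
TERMINATORS = {'!', '.', '?'}

def feed_char(model, char):
    model.run({'character': char})   # one step of the CLA model
    if char in TERMINATORS:
        model.resetSequenceStates()  # start a fresh sequence
```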

- Chetan


On Thu, Nov 14, 2013 at 6:51 PM, Marek Otahal <[email protected]> wrote:

> This problem touches text prediction/generation, but it is a general NuPIC
> algorithmic topic.
>
> Playing with Chetan's linguist repo
> https://github.com/chetan51/linguist/issues/1 , I discussed the
> (relatively poor) results with Chetan and Scott. (conversation below)
>
> Then I realized we do not do resets in the text streams, and text streams
> are one example where resets are both reasonable and well defined.
>
> From what I recall, OPF allows forcing a TP reset at periodic time
> intervals, which is unusable here (at worst, I could set it to the average
> sentence length). The other case where OPF does reset is at the end of the
> dataset, when a new epoch starts. That's why we see relatively good results
> on trivial "Hello World!" datasets.
>
> Ideally, I'd like to define a set of "terminators" = ['!','.','?'] and call
> reset() whenever the new char is one of them. Is there a reasonable way to
> rewrite OPF (and where?) to allow this behavior?
>
> Related to the OPF & API thread: that's why I'd like OPF, or its successor,
> to support a 'fnName': 'listOfParams' setting, where fnName would be
> executed each round with the parameters in listOfParams. This way, I could
> simply pass: def _checkTerminate(c, listTerm): if c in listTerm: TP.reset()
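>
> To make the idea concrete, here is a rough sketch of such a hook mechanism
> (all names are hypothetical; this is not existing OPF code):

```python
# Hypothetical per-record hook mechanism: each hook is called after every
# record, with the model passed in so it can trigger a reset.
def check_terminate(model, char, terminators=('!', '.', '?')):
    if char in terminators:
        model.reset()            # clear the TP's sequence state

def run_with_hooks(model, stream, hooks):
    for char in stream:
        model.compute(char)      # normal per-record processing
        for hook in hooks:
            hook(model, char)
```

> A caller would register check_terminate once and get a reset at every
> sentence boundary without touching the main loop.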
>
>
> You may say I should just skip OPF then. For this case I probably will, as
> it's easy to chain encoder|SP|TP directly. Still, OPF does some useful
> extra work for inference etc.; see Scott's notes below.
>
> Cheers! Mark.
>
>
> ---------------------------------------------
>
> The temporal pooler will have a set of cells predicted at each step
> (multiple simultaneous predictions). The classifier converts the predicted
> cells back to letters. So when it sees "m" it may be predicting the TP
> cells for both "a" in "made" and "a" in "matches". The classifier is
> guessing that the "m" is the start of "made" but when the "a" comes the TP
> doesn't necessarily lock on to just the "made" sequence. So in the next
> step the classifier is still guessing whether you are in the "made"
> sequence or the "matches" sequence.
>
> I am sort of spitballing here but it seems like the behavior seen, while
> not intuitive, could be correct, at least for some of the letters.
>
> The spatial pooler and the CLA classifier make it a little hard to reason
> about the results. Perhaps an alternative would be to use just the temporal
> pooler. You could have 40 or so columns for each character that you want to
> include. I would limit the characters you include (convert everything to
> lowercase, for instance). If you have 30 characters with 40 columns per
> character, then you need a TP with 1200 columns. Assign the first 40 columns
> to "a", the next 40 to "b", etc. And you can directly map the predicted
> cells/columns back into predicted letters (and the more predicted columns
> for a given letter, the more likely you can say that letter will come next).
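>
> A rough sketch of that direct encoding, assuming 30 characters and 40
> columns each (the helper names and the exact alphabet are mine, not from
> the NuPIC examples):

```python
import string

import numpy

COLS_PER_CHAR = 40
ALPHABET = string.ascii_lowercase + " .,'"  # 30 characters total (assumed set)
N_COLUMNS = COLS_PER_CHAR * len(ALPHABET)   # 40 * 30 = 1200 columns

def encode(char):
    """One character -> its dedicated block of 40 active columns."""
    i = ALPHABET.index(char.lower())
    sdr = numpy.zeros(N_COLUMNS, dtype='int32')
    sdr[i * COLS_PER_CHAR:(i + 1) * COLS_PER_CHAR] = 1
    return sdr

def decode(predicted_columns):
    """Tally predicted columns per letter; higher counts = more likely next."""
    scores = {}
    for col in predicted_columns:
        char = ALPHABET[col // COLS_PER_CHAR]
        scores[char] = scores.get(char, 0) + 1
    return sorted(scores.items(), key=lambda kv: -kv[1])
```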
>
> The downside is that you can only predict one step ahead. So I'm not sure
> if you want to move to this, but it would make it easier to reason about the
> results. You can see examples of using the TP directly here:
> https://github.com/numenta/nupic/tree/master/examples/tp
>
> Hope that helps a little.
>
>
> --
> Marek Otahal :o)
>
> _______________________________________________
> nupic mailing list
> [email protected]
> http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org
>
>