How anomaly score works and anomaly detection tuning?

Wakan Tanka Tue, 18 Aug 2015 16:26:45 -0700

Hello NuPIC,
I would like to clarify a few questions:

1a. In the "Predicting Sine Waves with NuPIC" video, at 30:00, Matt says that the predicted data are a bit shifted compared to the input data (I suppose this is due to the step of 1 that was chosen). Matt also says that as NuPIC sees more and more data, the predicted wave should have fewer bumps and the anomaly score should have fewer spikes. In my experiments I have a huge anomaly score at the beginning (nearly 1.0 for the whole 1st period of the sine), which seems logical: it takes one full period for NuPIC to see a repeating pattern, so everything before the end of the 1st period is new/unknown to NuPIC, and the high anomaly score is the result of that. But how do you explain the following: I often see in my experiments a high anomaly score together with a nicely predicted wave (no bumps), and vice versa: parts where the predicted wave does not follow the input data closely and has a lot of bumps have a small anomaly score. Is this OK, and if yes, why? How does the anomaly score actually work, and what is the relationship between the anomaly score and the predicted wave?
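For reference while reading the question above, here is a minimal sketch of one common way to define a raw anomaly score: the fraction of currently active columns that were not predicted at the previous step. The function name and arguments are my own (hypothetical), and this is an assumption about the general idea, not NuPIC's actual code.

```python
def raw_anomaly_score(active_columns, prev_predicted_columns):
    """Fraction of active columns that were NOT predicted one step earlier.

    Hypothetical helper for illustration only; inputs are iterables of
    column indices.
    """
    active = set(active_columns)
    if not active:
        # Nothing is active, so nothing can be unexpected.
        return 0.0
    unpredicted = active - set(prev_predicted_columns)
    return len(unpredicted) / float(len(active))


print(raw_anomaly_score([1, 2, 3, 4], [1, 2, 3, 4]))  # fully predicted
print(raw_anomaly_score([1, 2, 3, 4], [1, 2, 9]))     # half predicted
print(raw_anomaly_score([1, 2, 3, 4], []))            # nothing predicted
```

Under this assumption the score is computed on column activity, not on the decoded predicted value, so a plotted prediction and the anomaly score would not have to move together.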

1b. Somewhere I've also read (or seen in videos) that if NuPIC does not know what to predict (or does not have enough data), it simply repeats the input. Can I somehow tell when NuPIC is repeating and when it is predicting? Should I worry about this?

1c. In my sine experiments I see a nice prediction at the beginning (I suppose NuPIC is repeating), then some bumps (I suppose NuPIC is predicting), then again a nice prediction, and then again some bumps. My (very simplified) understanding of this is as follows (in ASCII art):

In the first 1/4 of the period everything is new for NuPIC (and thus the anomaly score is high), and it is repeating what it sees:

/

Then in 2/4 of the period it starts to see some repeating pattern:
/\

After the first period its confidence in what it sees grows stronger (the same pattern two times):

/\/

As it is constantly learning, it does not know at any given moment what the data will look like next (so it still reports some anomaly score). Maybe the repeating pattern is the following: 1 period of frequency N + 1 period of frequency N/2, as shown here:

   __
/\/  \__/
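The hypothetical pattern drawn above (one period at frequency N, then one period at frequency N/2) can be generated for experiments with a short sketch like this. The function name and the default of 40 samples per period are arbitrary choices of mine, not anything from NuPIC or the video.

```python
import math


def two_frequency_signal(samples_per_period=40):
    """One sine period at a base frequency, then one period at half of it.

    Hypothetical helper for illustration; halving the frequency doubles
    the number of samples needed to complete one period.
    """
    fast = [math.sin(2 * math.pi * i / samples_per_period)
            for i in range(samples_per_period)]
    slow_len = 2 * samples_per_period
    slow = [math.sin(2 * math.pi * i / slow_len)
            for i in range(slow_len)]
    return fast + slow


signal = two_frequency_signal()
print(len(signal))
```

Repeating the returned list several times gives a dataset in which the frequency change itself is part of the repeating pattern.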

This leads me to the following conclusion: there will never be enough data (maybe the frequency change will happen after the 2nd, 3rd, ..., 1000th, ... period, who knows) for NuPIC to say with 100% certainty that something has an anomaly score of 0 and was totally expected. There will always be some unpredictability. But in my sine experiments I've often seen the anomaly score equal to 0. My understanding is that the anomaly score was computed only after the whole dataset was seen. But if the anomaly score was calculated after the whole dataset was seen, why isn't the anomaly score 0 in the whole file except for the 1st period? So how and when was it computed?



1d. I've found this interesting article
https://github.com/subutai/nupic.subutai/blob/master/swarm_examples/README.md
which refers to Jeff Hawkins' mail about improving the prediction
http://lists.numenta.org/pipermail/nupic_lists.numenta.org/2013-June/000327.html

This is 2-year-old material. Jeff mentioned there data encoding (sampling rate, noise...) and also an evolutionary algorithm that would experiment with different parameters until it found a set that worked well for solving the exact problem specified. He said: "It is our intention to put this PSO code into NuPIC but we haven't been able to do that yet", so I would like to ask whether this problem is already solved and PSO is in NuPIC now. Isn't the swarm process (finding the best model parameters) what is called PSO there? Should I worry about tuning the prediction, and if yes, how? Or is it part of NuPIC nowadays?

1e. Do I need to know the details of HTM or some other low-level stuff to fully understand these questions?


Thank you

PS: I'm a non-native English speaker, so please bear with me ;)


Best Regards

Name: Wakan Tanka a.k.a. Wakatana a.k.a. MackoP00h
Location: Europe
Contact:
[email protected]
http://stackoverflow.com/users/1616488/wakan-tanka
https://github.com/wakatana
https://twitter.com/MackoP00h
