Hello Daniel,
On 11/03/2015 09:56 PM, Daniel McDonald wrote:
There's one aspect of this thread that I don't feel has been touched on,
which may help in understanding prediction and the anomaly score. I
learned this at the spring hackathon in NY.
If you look at how the anomaly score is implemented [1], you'll see that
it computes the ratio of the difference of the number of active columns
and the number of active columns which were also predicted to the number
of actives. That is, (#active - #activeAndPredicted) / #active. Note
that this formula does not depend on the total number of predicted
columns. In fact, if the HTM predicts all columns, the anomaly score
will be 0 for any subsequent input. In this case, the HTM would be
completely uncertain about the next step in the sequence, so it predicts
a superposition of all possible patterns; therefore, any subsequent
input is not anomalous.
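The formula above can be sketched in a few lines of plain Python (the function name and set-based representation are mine, not NuPIC's internal API; this is just an illustration of (#active - #activeAndPredicted) / #active):

```python
def anomaly_score(active_columns, predicted_columns):
    """Fraction of currently active columns that were NOT predicted."""
    active = set(active_columns)
    if not active:
        return 0.0
    unpredicted = active - set(predicted_columns)
    return len(unpredicted) / float(len(active))

# If the model predicted every column, nothing active is surprising:
print(anomaly_score({1, 2, 3}, set(range(100))))  # 0.0
# If none of the active columns were predicted, the score is maximal:
print(anomaly_score({1, 2, 3}, {7, 8}))  # 1.0
```

Note that the denominator is the number of active columns, so predicting lots of extra columns never raises the score; it can only lower it.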
What do you mean by "active columns" and "number of actives"? Are you
saying that the anomaly score depends on the number of columns (values or
dimensions, if you like) that are predicted (e.g. memory consumption,
CPU consumption, ...)? Is that what you mean? Sorry for my bad English.
That is exactly what happened to my Market Patterns hack at the
hackathon. After training the HTM on years of stock market data, the
anomaly score dropped quite low; however, when I looked carefully at
what was going on, the HTM had, in fact, saturated and was predicting
more than half of the columns to be active at each step in the
sequence. In effect, it was saying that the sequences were
unpredictable and anything was possible in the next step (we already
knew that about the stock market, right?). Consequently, whatever
happened next was not anomalous.
Are you saying that it is possible to get a low anomaly score on nearly
every predicted value, because there are many possible predictions which
are all equally confident?
When I look at your example data, I read it this way:
At 175, 0.0 was read and 0.0 is the prediction for the next step. The
anomaly score of 0.325 is meaningless, because we don't have data from
the previous step.
At 176, 62 was read, which doesn't match the prediction of 0.0 (from
175), so it is anomalous (0.65). 52 is predicted for the next step.
At 177, 402 is read. It is completely anomalous (1.0). That is, there
is no overlap between the columns predicted for the value 52 and the columns
active for the value 402. If you are using a scalar encoder, that makes
sense, since the bit patterns for such different numbers likely have no
overlap in the encoding or in the SDR produced by the SP. 0.0 is
predicted for the next step.
How can I know whether I'm using a scalar encoder? Can I somehow see the
SDRs of 402 and 52; is there any mapping for this? I'm using the OPF.
At 178, 0 is read, and the anomaly score drops low (0.125), since the
actual matches closely to what was predicted at the previous step. The
score isn't exactly 0, because the predicted SDR from the previous step
and the encoded SDR for the new input may differ in some columns. In
other words, in the previous step, when 0.0 was reported as the
prediction, this was only an approximate translation of a predicted SDR,
where 0.0 was the closest decoded representation. 0.0 is predicted for
the next step.
At 179, 402 is read, which is completely anomalous (1.0) because the
predicted SDR for 0.0 had no column in common with the encoded SDR for 402.
At 180, the situation is similar to 178, and 0.0 is predicted.
At 181, 3 is read. The anomaly score is low (0.05), because the scalar
encoder produces overlapping patterns for similar numbers, so there is
likely overlap in the SDR's for 0 and 3. 402 is predicted.
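To make the overlap behavior concrete, here is a toy scalar encoder; it is not NuPIC's actual ScalarEncoder (the parameter values w=5, n=100, and the 0-500 range are assumptions for illustration), but it shows why nearby values like 0 and 3 share bits while distant values like 0 and 402 share none:

```python
def encode(value, w=5, minval=0.0, maxval=500.0, n=100):
    """Return the set of active bit indices: a contiguous block of w bits
    whose starting position depends linearly on the value."""
    value = max(minval, min(maxval, value))
    start = int(round((value - minval) / (maxval - minval) * (n - w)))
    return set(range(start, start + w))

def overlap(a, b):
    """Number of bits two encoded values have in common."""
    return len(encode(a) & encode(b))

print(overlap(0, 3))    # nearby values share most of their bits
print(overlap(0, 402))  # distant values share no bits at all
```

With this kind of encoding, an input of 402 against a prediction of 0.0 has zero overlap at the encoder level, which is consistent with the anomaly score of 1.0 at step 177.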
At 182, 50 is read. The anomaly score is low (0.1), which is a bit
puzzling; however, it may be due to saturation. The prediction of 402
could represent a case where many columns were predicted representing a
superposition of possible states, and 402 was just the strongest one
(i.e., had the highest overlap of the encoded SDR for 402 with the
predicted columns). That is, 52 may have also been predicted, but to a
lesser degree than 402. It may be helpful to look at how many columns
are predicted vs. active in each step to see when this happens. If the
number of predicted columns suddenly jumps, it means that the HTM is
uncertain about the next step (or, that it sees many possible next steps
given the current context).
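A rough sketch of that diagnostic (the data layout here is hypothetical; in practice you would collect the active and predicted column sets from your model at each step):

```python
def saturation_report(step_data, total_columns=2048):
    """For each step, report (step, #active, #predicted, saturated),
    flagging steps where more than half of all columns are predicted."""
    report = []
    for t, (active, predicted) in enumerate(step_data):
        saturated = len(predicted) > total_columns // 2
        report.append((t, len(active), len(predicted), saturated))
    return report

steps = [
    ({1, 2, 3}, {1, 2, 3, 4}),     # normal step: few columns predicted
    ({5, 6}, set(range(1500))),    # saturated: more than half predicted
]
for row in saturation_report(steps):
    print(row)
```

A sudden jump in the predicted count, as in the second step above, is the saturation signature described in the stock-market example: the model is effectively predicting "anything", so nothing that follows looks anomalous.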
What do you mean by "how many columns are predicted vs. active"?
Thank you very much