Hi John,

No, NuPIC is great at looking at multiple fields of data and extracting
both the per-field structure and the inter-field structure, but in practice
it makes sense to proceed step by step: from 18 single-metric models, up to
pairs, and so on, discovering as you go how best to feed data in.
With 18 metrics, the power set of combinations is very large, and most of
these will be useless (or at best marginal), so you add fields one at a
time to the models which already lead the rankings, discarding the ones
which fail to match your events.
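That step-by-step search is essentially a greedy forward selection. Here's an
illustrative sketch of the bookkeeping; `score_model` is a hypothetical
callback standing in for "train a NuPIC model on these fields and measure how
well its anomalies match the known events" (higher is better):

```python
def greedy_field_selection(fields, score_model, max_size=3, beam=3):
    """Greedy forward selection over field combinations.

    Start from single-field models, keep the top `beam` performers, and
    extend only those leaders one field at a time, stopping when adding
    a field no longer improves the best score. `score_model` takes a
    tuple of field names and returns a quality score (hypothetical)."""
    # Rank all single-field models and keep the top `beam` as seeds.
    scored = sorted(((score_model((f,)), (f,)) for f in fields), reverse=True)
    leaders = scored[:beam]
    best_score, best_combo = leaders[0]
    for _ in range(max_size - 1):
        candidates = []
        for _score, combo in leaders:
            for f in fields:
                if f not in combo:
                    new = combo + (f,)
                    candidates.append((score_model(new), new))
        if not candidates:
            break
        candidates.sort(reverse=True)
        leaders = candidates[:beam]
        if leaders[0][0] <= best_score:
            break  # no improvement: stop growing the combinations
        best_score, best_combo = leaders[0]
    return best_combo, best_score
```

With a toy scoring function that rewards a known-good pair, the search finds
that pair and stops before wasting effort on triplets.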

It's almost impossible statistically that the structure is evenly
distributed across all your metrics, and much more likely that the most
interesting inputs will be single fields, pairs or triplets of fields. If
you have a strong intuition (or some evidence) that one pair of fields -
such as temperature and tilt - is correlated, then these should be at the
top of your list when you get up to pairs of metrics.

Combining fields is simply a matter of concatenating the encodings for each
field into a larger bit array (this happens internally in OPF). See the
hotgym example for how to code it. Each column will sample from a subset of
all bits, so NuPIC will identify correlative patterns automatically. The
rate at which it does this will depend on the "density" of structure in the
entire input. Giving the system a combination of all 18 metrics will work,
but will do so very slowly, because much of the input data (coming from
irrelevant metrics) will not contribute to the recognition or the anomaly
detection. On the other hand, treating each single metric as if it were the
only input will help identify (to first order) which metrics contribute the
most to solving the problem.
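To make the concatenation concrete, here's a minimal sketch. This is not
NuPIC's actual ScalarEncoder (which handles resolution, clipping and periodic
values much more carefully); it just shows the idea: each field gets its own
block of bits, and the blocks are joined end to end, as the OPF does
internally.

```python
def encode_scalar(value, min_val, max_val, n=100, w=11):
    """Minimal scalar encoder sketch: a run of `w` active bits whose
    position within `n` bits tracks where `value` falls in the range."""
    value = max(min(value, max_val), min_val)  # clip into range
    buckets = n - w
    i = int(round(buckets * (value - min_val) / (max_val - min_val)))
    return [1 if i <= j < i + w else 0 for j in range(n)]

def encode_record(record, specs):
    """Concatenate the per-field encodings into one larger bit array.
    `specs` is a list of (field_name, (min, max)) pairs."""
    bits = []
    for name, (lo, hi) in specs:
        bits.extend(encode_scalar(record[name], lo, hi))
    return bits
```

A record with two fields then becomes a single 200-bit input, and each
column's subsample can span bits from both fields, which is where the
inter-field structure gets picked up.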

My recommendation is to follow the procedure for identifying "unusual
events" using the likelihood module to filter anomaly detection as outlined
by Subutai and Matt. You're looking for good matches between known
disturbances and the output of signals from the likelihood module (in
Matt's talk he identifies the correlation between changes in the music and
"red zones" in the likelihood plot). Go through this process for each
single metric, and choose the top several metrics to "breed" your
generation of paired metrics. If you get an improved correlation, add the
best to your gene pool and iterate. Terminate when you stop improving the
model, or when you get tired of seeking the last 1%!
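The likelihood filtering Subutai and Matt describe can be illustrated with a
toy sketch. This is not the real nupic.algorithms.anomaly_likelihood module
(which estimates the distribution more carefully); it just shows the idea of
flagging "red zones" only where the recent raw anomaly scores are improbably
high relative to their own history:

```python
import math
from collections import deque

def likelihood_flags(scores, window=50, short=10, threshold=0.9999):
    """Toy stand-in for anomaly-likelihood post-processing: model the
    historical distribution of raw anomaly scores and flag a red zone
    only when the recent average score is improbably high under it."""
    history = deque(maxlen=window)   # long-term distribution of raw scores
    recent = deque(maxlen=short)     # short-term average being tested
    flags = []
    for s in scores:
        recent.append(s)
        if len(history) >= short:
            mean = sum(history) / len(history)
            var = sum((x - mean) ** 2 for x in history) / len(history)
            std = max(math.sqrt(var), 1e-3)  # floor to avoid dividing by ~0
            z = (sum(recent) / len(recent) - mean) / std
            # Gaussian CDF: how unlikely is a recent mean this high?
            likelihood = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
            flags.append(likelihood >= threshold)
        else:
            flags.append(False)  # not enough history yet
        history.append(s)
    return flags
```

The point of the filter is visible even in this sketch: a stream of ordinary
scores produces no flags, while a sustained jump does, which is exactly the
behaviour you want to correlate with your known disturbances.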

Regards,

Fergal Byrne



On Fri, Aug 15, 2014 at 10:42 AM, John Blackburn <[email protected]> wrote:

> Dear Fergal and Ian,
>
> Thanks very much for your replies on this. Are you saying it is not
> possible for NuPIC to take in multiple time series and predict multiple
> time series? As I understand it, you are advising me to input only one of
> the time series e.g. the first tilt sensor. However, in my system there is
> a strong correlation between the temperature and the tilt so it would be
> wrong for NuPIC to be unaware of the temperature data while predicting
> tilt. Is it possible for NuPIC to account for spatial correlations between
> data sets also?
>
> I could presumably give it all the data as a bitmap but then how would I
> extract one stream (e.g. tilt 1) without getting it mixed up with the other
> data. It would be useful to have some more documentation on what the
> decoder does and how to use it. Is any available?
>
> John.
>
>
> On Thu, Aug 14, 2014 at 12:30 PM, Fergal Byrne <
> [email protected]> wrote:
>
>> Hi John,
>>
>> I agree with Ian: the first thing to do is to create a separate model
>> which learns the spatiotemporal characteristics of each input metric. This
>> will give you a picture of how well each metric behaves as a measure of the
>> anomalies in your bridge's lifecycle. Experience with Grok (which does only
>> this model-per-metric regime) on numerous systems shows that this is often
>> enough, in that a single high anomaly likelihood score among all the
>> metrics is enough to identify an event worthy of attention, and a second or
>> third blip on other metrics will confirm it.
>>
>> It's important to use the likelihood score first, as it will filter out
>> many perfectly normal events which your system produces, and which might
>> frequently cause high anomaly scores from the raw predictions. If you can
>> confirm that you are getting good correlations between your known events
>> and likelihood alarms on one or more metrics, this will allow you to
>> identify which single metrics and combinations are best at identifying your
>> disturbances.
>>
>> Once you've identified the clearly best metrics (A, B and C say), you
>> could start adding the others (d, e, f, etc) one at a time, creating a set
>> of metrics which might give you even better correlation (eg Ac, Ba might be
>> better than A or B alone).
>>
>> As Ian says, this is how the swarming algorithm works, but in this case
>> the space of combinations is too large for swarming to make any sense. Use
>> a depth-first approach instead by using single-metric models to group your
>> metrics in quality bands. (The other issue with swarming is that it uses
>> anomaly scores rather than likelihood scores to rank candidate choices of
>> input fields).
>>
>> Please keep us informed about how you get on.
>>
>> Regards,
>>
>> Fergal Byrne
>>
>>
>> On Wed, Aug 13, 2014 at 6:05 PM, Ian Danforth <[email protected]>
>> wrote:
>>
>>> Use separate models for each giving each model time and sensor values.
>>>
>>> Start with two sensors and run both through the swarming process and let
>>> us know what difficulties you run into.
>>>
>>> Ian
>>> On 13 Aug 2014 03:37, "John Blackburn" <[email protected]>
>>> wrote:
>>>
>>>> Dear All,
>>>>
>>>> I am a researcher at the National Physical Laboratory, London and am
>>>> attempting to use NuPIC to model the strain and temperature variations of a
>>>> concrete bridge for anomaly detection. The bridge has 10 temperature
>>>> sensors and 8 "tilt sensors" (basically strain) arranged across it. I have
>>>> hourly readings for all of these sensors for a 3 year period. I would like
>>>> NuPIC to predict all of these quantities (and keep them separate). Compared
>>>> to the "hotgym" example, the difference here is that there are 18 separate
>>>> streams of data which would need to be suitably encoded and decoded to make
>>>> predictions of each one. I suspect the decoding stage would be most
>>>> difficult: from the set of cell activations we need to discover 18 numbers
>>>> and keep them separate. The HTM should account for cross correlations
>>>> between time series as well as auto-correlations. I would like to consider
>>>> +1 and +5 predictions, for example.
>>>>
>>>> During the course of the experiment, various interventions were carried
>>>> out at known times. These include cutting support cables, removing chunks
>>>> of concrete and adding heavy weights. The NN should show anomalous
>>>> behaviour at the time these interventions were done. The system has been
>>>> modelled using an Echo State Network, so I want to compare the performance of
>>>> ESN to HTM.
>>>>
>>>> So, is this task possible with NuPIC and how might I adjust the
>>>> encoder, decoder to deal with multiple streams?
>>>>
>>>> Many thanks for your help,
>>>>
>>>> John Blackburn.
>>>>
>>>> _______________________________________________
>>>> nupic mailing list
>>>> [email protected]
>>>> http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org
>>>>
>>>>
>>>
>>>
>>
>>
>>
>>
>>
>
>
>


-- 

Fergal Byrne, Brenter IT

Author, Real Machine Intelligence with Clortex and NuPIC
https://leanpub.com/realsmartmachines

Speaking on Clortex and HTM/CLA at euroClojure Krakow, June 2014:
http://euroclojure.com/2014/
and at LambdaJam Chicago, July 2014: http://www.lambdajam.com

http://inbits.com - Better Living through Thoughtful Technology
http://ie.linkedin.com/in/fergbyrne/ - https://github.com/fergalbyrne

e:[email protected] t:+353 83 4214179
Join the quest for Machine Intelligence at http://numenta.org
Formerly of Adnet [email protected] http://www.adnet.ie
