Hi Casper,

I'm not 100% sure why the HTM Engine is giving high anomaly scores with the
sine wave. The HTM Engine is not set up for completely artificial data. For
example, it uses the timestamp as an input under the assumption that time
of day matters in many real-world data sets. On the other hand, you have set
it up nicely such that the sine wave has a 24-hour cycle, and the data itself
repeats exactly every 24 hours.

My best guess is that the encoder parameters are not set up for the
fine-grained predictions required for the sine wave. The HTM Engine params
were tuned on noisy real-world data and use a coarser bucket size than you
would want for sine waves.
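
To illustrate the bucket-size point, here is a minimal sketch in plain Python.
It does not use NuPIC or the HTM Engine; the equal-width bucketing, the function
names, the 5-minute sampling interval, and the bucket counts (10 and 1000) are
all assumptions chosen for illustration and are not the Engine's actual
settings. It only shows that with coarse buckets many consecutive sine samples
look identical to the encoder, while finer buckets keep them distinct.

```python
import math


def bucket_index(value, min_val, max_val, num_buckets):
    """Map a scalar to one of num_buckets equal-width buckets over [min_val, max_val]."""
    clipped = max(min_val, min(max_val, value))
    width = (max_val - min_val) / num_buckets
    # The maximum value would land one past the end, so clamp the index.
    return min(int((clipped - min_val) / width), num_buckets - 1)


def sine_wave(samples_per_day, min_val=0.0, max_val=100.0):
    """One 24-hour cycle of a sine wave scaled to span [min_val, max_val]."""
    mid = (min_val + max_val) / 2.0
    amp = (max_val - min_val) / 2.0
    return [mid + amp * math.sin(2.0 * math.pi * i / samples_per_day)
            for i in range(samples_per_day)]


def fraction_unchanged(values, num_buckets, min_val=0.0, max_val=100.0):
    """Fraction of consecutive samples that fall into the same bucket."""
    idx = [bucket_index(v, min_val, max_val, num_buckets) for v in values]
    same = sum(1 for a, b in zip(idx, idx[1:]) if a == b)
    return same / (len(idx) - 1)


wave = sine_wave(samples_per_day=288)  # assumed 5-minute sampling
coarse = fraction_unchanged(wave, num_buckets=10)    # illustrative bucket count
fine = fraction_unchanged(wave, num_buckets=1000)    # illustrative bucket count
print("coarse: %.2f, fine: %.2f" % (coarse, fine))
```

A higher fraction for the coarse setting means more of the wave's small
step-to-step changes are invisible at that bucket size.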

It sounds like the Engine is working well for real-world data like the NAB
data? If you really want anomaly detection to work well with clean
sine-wave-like data rather than real-world data, then I think you are right:
you should probably fine-tune the model params yourself. If you mostly want
to use real-world data, you might want to switch to experimenting with more
realistic data sets but stick with HTM Engine for now.

-- Subutai

On Thu, Dec 3, 2015 at 7:54 AM, Cas <[email protected]> wrote:

> update: I ran one of my datasets through the example at
> https://github.com/rhyolight/nupic.examples/tree/master/sine-prediction,
> after making some small changes to it.
>
> Unlike the HTM Engine, this script runs a swarm to generate model
> parameters. Here I can see that NuPIC definitely recognizes a pattern and
> reports overall low anomaly scores, up until about halfway when it just
> completely loses the pattern and reports all points as highly anomalous. I
> included the output of nupic and a small part of the plotted graph where it
> starts to report anomalies.
>
> This result is more promising, and makes me think that I should definitely
> swarm over self-generated data first. That means I have to step away from
> HTM Engine for good, though.
>
> NuPIC still reports seemingly random anomaly scores in the sine wave
> pattern that I created. I will try the same thing with the other pattern
> and report back if there's any progress.
>
> Kind regards,
>
> Casper Rooker
> [email protected]
>
> On Thu, Dec 3, 2015 at 2:11 PM, Cas <[email protected]> wrote:
>
>> Hello NuPIC,
>>
>> Can you share your thoughts on the anomaly scores that HTM Engine is
>> reporting on my self-generated datasets?
>>
>> I'd like to experiment with different model parameters, but as far as I
>> could see I can only offer the name, minimum value and maximum value.
>>
>> I'm very much interested in experimenting with the anomaly detection
>> functionality, so I really hope someone can help me.
>>
>> In my setup I am using the skeleton-htm-engine-app from within a vagrant
>> box. I am offering all of my data through the API of the HTM Engine. Since
>> I am using the most basic implementation of the HTM Engine I doubt that
>> making a public repo would help. If there are other factors that influence
>> the output of HTM Engine, please let me know.
>>
>>
>>
>> Kind regards,
>>
>> Casper Rooker
>> [email protected]
>>
>> On Tue, Dec 1, 2015 at 4:04 PM, Cas <[email protected]> wrote:
>>
>>> Hi Fergal,
>>>
>>> thanks for your reply. I assume you mean the contents of the model
>>> parameters that are required for anomaly detection by HTM Engine? For these
>>> datasets all I did was specify the name, minimum value and maximum value of
>>> the dataset. The rest of the model parameters are filled in by HTM Engine
>>> itself, as far as I know.
>>>
>>> For all the given datasets I specified 0 as minimum and 100 as maximum.
>>>
>>> I use my own java app to generate data and convert and send datasets to
>>> the HTM Engine.
>>>
>>> For my setup I am using the skeleton htmengine app running on a linux
>>> vagrant box.
>>>
>>> The dates you see in the datasets are formatted so I can use the NuPIC
>>> visualisation project.
>>>
>>> I could try to dig up the model parameters that are stored in the HTM
>>> Engine cache on the Vagrant Box, but I'll have to find out how to get a
>>> clean export of it. I'll let you know when I have it.
>>>
>>> Kind regards,
>>>
>>> Casper Rooker
>>> [email protected]
>>>
>>> On Tue, Dec 1, 2015 at 3:36 PM, Fergal Byrne <
>>> [email protected]> wrote:
>>>
>>>> Hi Casper,
>>>>
>>>> Please provide your parameter description, it's impossible to guess
>>>> what's going on just from the output. A github repo would be the best way
>>>> to share your setup.
>>>>
>>>> On Tue, Dec 1, 2015 at 2:29 PM, Cas <[email protected]> wrote:
>>>>
>>>>> P.S.: To clarify my problem, here is another dataset of a sine wave
>>>>> that HTM Engine is also reporting seemingly random anomaly scores on. I've
>>>>> included a picture of a piece of the data near the end of the file.
>>>>>
>>>>> Kind regards,
>>>>>
>>>>> Casper Rooker
>>>>> [email protected]
>>>>>
>>>>> On Tue, Dec 1, 2015 at 1:03 PM, Cas <[email protected]> wrote:
>>>>>
>>>>>> Hello NuPIC,
>>>>>>
>>>>>> I'm using the HTM Engine to experiment with a self-generated pattern.
>>>>>> I have deployed it on several datasets from the NAB datasets with
>>>>>> promising results.
>>>>>>
>>>>>> I then offered a very simple dataset that I have generated myself,
>>>>>> but the anomaly scores are completely random across the entire dataset. I
>>>>>> have attached one dataset from the NuPIC anomaly benchmark with anomaly
>>>>>> scores and one dataset that I have generated myself along with anomaly
>>>>>> scores. I'd like to know your thoughts on what is making the HTM Engine
>>>>>> report these anomaly scores that are basically useless.
>>>>>>
>>>>>> With regards,
>>>>>>
>>>>>> Casper Rooker
>>>>>> [email protected]
>>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>>
>>>> Fergal Byrne, Brenter IT @fergbyrne
>>>>
>>>> http://inbits.com - Better Living through Thoughtful Technology
>>>> http://ie.linkedin.com/in/fergbyrne/ - https://github.com/fergalbyrne
>>>>
>>>> Founder of Clortex: HTM in Clojure -
>>>> https://github.com/nupic-community/clortex
>>>> Co-creator @OccupyStartups Time-Bombed Open License
>>>> http://occupystartups.me
>>>>
>>>> Author, Real Machine Intelligence with Clortex and NuPIC
>>>> Read for free or buy the book at https://leanpub.com/realsmartmachines
>>>>
>>>> e:[email protected] t:+353 83 4214179
>>>> Join the quest for Machine Intelligence at http://numenta.org
>>>> Formerly of Adnet [email protected] http://www.adnet.ie
>>>>
>>>
>>>
>>
>
