Hi Cas,

I have encountered similar problems before. One way to improve the
prediction accuracy is to first encode your data with a delta encoder
before feeding it into HTM.  For example, if your raw data is y(1), y(2),
y(3), ..., y(t), you feed in y(2)-y(1), y(3)-y(2), ..., y(t)-y(t-1).
Once you get the prediction, you can convert it back using simple
integration.
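
Here is a minimal sketch of that round trip in plain Python, independent of
NuPIC. The names delta_encode and integrate are made up for illustration,
and the predicted delta is a placeholder for whatever the model returns.

```python
import math


def delta_encode(values):
    """Return the differences y(t) - y(t-1) for t = 2..n."""
    return [b - a for a, b in zip(values, values[1:])]


def integrate(first_value, deltas):
    """Rebuild the original series from its first value and the deltas."""
    values = [first_value]
    for d in deltas:
        values.append(values[-1] + d)
    return values


# Sine wave with a high sampling rate: 1000 samples per cycle, scaled to 0..100.
raw = [50 + 50 * math.sin(2 * math.pi * t / 1000) for t in range(3000)]
deltas = delta_encode(raw)

# The model would be fed `deltas` instead of `raw`. A predicted delta is
# converted back into a predicted value by adding it to the last raw value.
predicted_delta = deltas[-1]  # placeholder, not a real model prediction
predicted_value = raw[-1] + predicted_delta
```

Note that the first raw value has to be kept around, since the deltas alone
only determine the series up to a constant offset.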

This trick works if the change of the data at each time step is much
smaller than the entire range of the data, such as a sine wave with a high
sampling rate. In that case, an encoder that covers the full range of the
raw values may not have enough resolution to accurately represent the
change across neighboring time points.
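
To put rough numbers on the resolution argument, the sketch below assumes a
hypothetical encoder that splits its range into 100 equal-width buckets. The
bucket count and the bucket helper are invented for illustration and are not
NuPIC defaults or NuPIC API calls.

```python
import math

N_BUCKETS = 100  # hypothetical resolution, chosen only for illustration


def bucket(value, lo, hi, n=N_BUCKETS):
    """Index of the equal-width bucket in [lo, hi] that `value` falls into."""
    idx = int((value - lo) / (hi - lo) * n)
    return min(max(idx, 0), n - 1)


# One cycle of a densely sampled sine wave scaled to 0..100.
raw = [50 + 50 * math.sin(2 * math.pi * t / 1000) for t in range(1000)]
deltas = [b - a for a, b in zip(raw, raw[1:])]

# Raw values encoded over the full 0..100 range: count how many consecutive
# samples land in the same bucket, i.e. look identical to the encoder.
raw_buckets = [bucket(v, 0.0, 100.0) for v in raw]
same_bucket_steps = sum(1 for a, b in zip(raw_buckets, raw_buckets[1:]) if a == b)

# Deltas encoded over their own, much smaller range: count how many distinct
# buckets the step-to-step changes occupy.
d_lo, d_hi = min(deltas), max(deltas)
distinct_delta_buckets = len({bucket(d, d_lo, d_hi) for d in deltas})

print("steps that stay in the same raw bucket:", same_bucket_steps, "of", len(deltas))
print("distinct buckets used by the deltas:", distinct_delta_buckets)
```

With these assumed settings most neighboring raw samples share a bucket,
while the same number of buckets spread over the delta range separates the
individual changes.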

Yuwei

On Wed, Dec 9, 2015 at 4:40 AM, Cas <[email protected]> wrote:

> Hi Subutai,
>
> I have gotten around to running some medium strength swarms before running
> the dataset through NuPIC.
>
> A very persistent problem is that NuPIC keeps 'forgetting' the order of
> predicted data points. I see this in my dataset, where the predicted value
> often spikes up or down to a value that is closer to the next datapoint
> over. I will be trying again with heavy swarming. A dataset of 10000
> records seems to give the most 'mature' results.
>
> I was wondering if anyone has any tips to improve the temporal memory's
> ability to predict the next value correctly? As I said the main problem
> seems to be that NuPIC is forgetting the order of the pattern, rather than
> the space that the pattern occupies.
>
> If anyone wants to view my results, feel free to send me a personal
> message.
>
> Kind regards,
>
> Casper Rooker
> [email protected]
>
> On Thu, Dec 3, 2015 at 6:36 PM, Subutai Ahmad <[email protected]> wrote:
>
>> Hi Casper,
>>
>> I'm not 100% sure why the HTM Engine is giving high anomaly scores with
>> the sine wave. The HTM Engine is not set up for completely artificial data;
>> for example it uses the time stamp as an input under the assumption that
>> time of day matters in many real world data.  On the other hand you have
>> set it up nicely such that the sine wave has a 24 hour cycle, and the data
>> itself repeats exactly every 24 hours.
>>
>> My best guess is that the encoder parameters are not set up for the fine
>> predictions required for the sine wave. The HTM Engine params were tuned
>> based on noisy real world data and use a coarser bucket size than you would
>> want for sine waves.
>>
>> It sounds like the Engine is working well for real world data like NAB
>> data? If you really want anomaly detection working well with clean sine
>> wave-like data and not real world data, then I think you are right - you
>> should probably fine tune the model params yourself.  If mostly you want to
>> use real-world data you might want to switch to experimenting with more
>> realistic data sets but stick with HTM Engine for now.
>>
>> -- Subutai
>>
>> On Thu, Dec 3, 2015 at 7:54 AM, Cas <[email protected]> wrote:
>>
>>> update: I ran one of my datasets through the example at
>>> https://github.com/rhyolight/nupic.examples/tree/master/sine-prediction,
>>> after making some small changes to it.
>>>
>>> Contrary to the HTM Engine this script runs a swarm to make model
>>> parameters. Here I can see that NuPIC definitely recognizes a pattern and
>>> reports overall low anomaly scores, up until about halfway when it just
>>> completely loses the pattern and reports all points as highly anomalous. I
>>> included the output of nupic and a small part of the plotted graph where it
>>> starts to report anomalies.
>>>
>>> This result is more promising, and makes me think that I should
>>> definitely swarm over self generated data first. That means I have to step
>>> away from HTM Engine definitively, though.
>>>
>>> NuPIC still reports seemingly random anomaly scores in the sine wave
>>> pattern that I created. I will try the same thing with the other pattern
>>> and report back if there's any progress.
>>>
>>> Kind regards,
>>>
>>> Casper Rooker
>>> [email protected]
>>>
>>> On Thu, Dec 3, 2015 at 2:11 PM, Cas <[email protected]> wrote:
>>>
>>>> Hello NuPIC,
>>>>
>>>> Can you share your thoughts on the anomaly scores that HTM Engine is
>>>> reporting on my self-generated datasets?
>>>>
>>>> I'd like to experiment with different model parameters, but as far as I
>>>> could see I can only offer the name, minimum value and maximum value.
>>>>
>>>> I'm very much interested in experimenting with the anomaly detection
>>>> functionality, so I really hope someone can help me.
>>>>
>>>> In my set up I am using the skeleton-htm-engine-app from within a
>>>> vagrant box. I am offering all of my data through the API of the HTM
>>>> Engine. Since I am using the most basic implementation of the HTM Engine I
>>>> doubt that making a public repo would help. If there are other factors that
>>>> influence the output of HTM Engine, please let me know.
>>>>
>>>>
>>>>
>>>> Kind regards,
>>>>
>>>> Casper Rooker
>>>> [email protected]
>>>>
>>>> On Tue, Dec 1, 2015 at 4:04 PM, Cas <[email protected]> wrote:
>>>>
>>>>> Hi Fergal,
>>>>>
>>>>> thanks for your reply. I assume you mean the contents of the model
>>>>> parameters that are required for anomaly detection by HTM Engine? For
>>>>> these datasets all I did was specify the name, minimum value and maximum
>>>>> value of the dataset. The rest of the model parameters are filled in by
>>>>> HTM Engine itself, as far as I know.
>>>>>
>>>>> For all the given datasets I specified 0 as minimum and 100 as maximum.
>>>>>
>>>>> I use my own java app to generate data and convert and send datasets
>>>>> to the HTM Engine.
>>>>>
>>>>> For my setup I am using the skeleton htmengine app running on a linux
>>>>> vagrant box.
>>>>>
>>>>> The dates you see in the datasets are formatted so I can use the NuPIC
>>>>> visualisation project.
>>>>>
>>>>> I could try to dig up the model parameters that are stored in the HTM
>>>>> Engine cache on the Vagrant Box, but I'll have to find out how to get a
>>>>> clean export of it. I'll let you know when I have it.
>>>>>
>>>>> Kind regards,
>>>>>
>>>>> Casper Rooker
>>>>> [email protected]
>>>>>
>>>>> On Tue, Dec 1, 2015 at 3:36 PM, Fergal Byrne <
>>>>> [email protected]> wrote:
>>>>>
>>>>>> Hi Casper,
>>>>>>
>>>>>> Please provide your parameter description, it's impossible to guess
>>>>>> what's going on just from the output. A github repo would be the best way
>>>>>> to share your setup.
>>>>>>
>>>>>> On Tue, Dec 1, 2015 at 2:29 PM, Cas <[email protected]> wrote:
>>>>>>
>>>>>>> P.S.: To clarify my problem, here is another dataset of a sine wave
>>>>>>> that HTM Engine is also reporting seemingly random anomaly scores on.
>>>>>>> I've included a picture of a piece of the data near the end of the
>>>>>>> file.
>>>>>>>
>>>>>>> Kind regards,
>>>>>>>
>>>>>>> Casper Rooker
>>>>>>> [email protected]
>>>>>>>
>>>>>>> On Tue, Dec 1, 2015 at 1:03 PM, Cas <[email protected]> wrote:
>>>>>>>
>>>>>>>> Hello NuPIC,
>>>>>>>>
>>>>>>>> I'm using the HTM Engine to experiment with a self-generated
>>>>>>>> pattern. I have deployed it on several datasets from the NAB datasets
>>>>>>>> with promising results.
>>>>>>>>
>>>>>>>> I then offered a very simple dataset that I have generated myself,
>>>>>>>> but the anomaly scores are completely random across the entire
>>>>>>>> dataset. I have attached one dataset from the NuPIC anomaly benchmark
>>>>>>>> with anomaly scores and one dataset that I have generated myself along
>>>>>>>> with anomaly scores. I'd like to know your thoughts on what is making
>>>>>>>> the HTM Engine report these anomaly scores that are basically useless.
>>>>>>>>
>>>>>>>> With regards,
>>>>>>>>
>>>>>>>> Casper Rooker
>>>>>>>> [email protected]
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>>
>>>>>> Fergal Byrne, Brenter IT @fergbyrne
>>>>>>
>>>>>> http://inbits.com - Better Living through Thoughtful Technology
>>>>>> http://ie.linkedin.com/in/fergbyrne/ - https://github.com/fergalbyrne
>>>>>>
>>>>>> Founder of Clortex: HTM in Clojure -
>>>>>> https://github.com/nupic-community/clortex
>>>>>> Co-creator @OccupyStartups Time-Bombed Open License
>>>>>> http://occupystartups.me
>>>>>>
>>>>>> Author, Real Machine Intelligence with Clortex and NuPIC
>>>>>> Read for free or buy the book at
>>>>>> https://leanpub.com/realsmartmachines
>>>>>>
>>>>>> e:[email protected] t:+353 83 4214179
>>>>>> Join the quest for Machine Intelligence at http://numenta.org
>>>>>> Formerly of Adnet [email protected] http://www.adnet.ie
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>
