Re: dodging overfitting

Subutai Ahmad Tue, 12 Apr 2016 20:30:07 -0700

Hi Sam,

I'm not sure how much data you have, but typically we do not run the swarm
on all the data. We usually swarm on just one to two thousand rows, then use
those model parameters to run through the whole data stream, starting from
the beginning. That should avoid overfitting. If you want to be extra
careful you can ignore the first N predictions, where N is the number of
rows used for swarming.
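
As a rough sketch of that last bit of bookkeeping (plain Python, no NuPIC
calls; the function name and the toy numbers are made up for illustration),
scoring only the predictions that come after the swarm rows could look like:

```python
# Illustrative only: plain Python, no NuPIC calls. The function name and
# the toy numbers below are hypothetical.

def mean_abs_error_after_swarm(actuals, predictions, n_swarm_rows):
    """Mean absolute error over rows that were NOT used for swarming.

    The first n_swarm_rows predictions are skipped, since the model
    parameters were tuned on exactly those rows.
    """
    if len(actuals) != len(predictions):
        raise ValueError("actuals and predictions must have the same length")
    pairs = list(zip(actuals, predictions))[n_swarm_rows:]
    if not pairs:
        raise ValueError("no rows left after skipping the swarm rows")
    return sum(abs(a - p) for a, p in pairs) / len(pairs)


if __name__ == "__main__":
    actuals = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
    predictions = [1.0, 2.0, 3.0, 5.0, 5.0, 8.0]
    # Pretend the swarm used the first 3 rows; score only rows 4 to 6.
    print(mean_abs_error_after_swarm(actuals, predictions, n_swarm_rows=3))
```

Mean absolute error is just a stand-in here; any error metric works as long
as the swarm rows are left out of it.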


Another option for your specific situation is to run the swarm on some of
the subjects, but report results only for the other subjects.
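
A minimal sketch of that split, again in plain Python with hypothetical
names (which subjects go in which group is an arbitrary choice here, not a
recommendation):

```python
# Illustrative only: plain Python, no NuPIC calls. Names are hypothetical.

def split_subjects(subject_ids, swarm_subjects):
    """Return (swarm_ids, report_ids) with no subject in both groups."""
    swarm_set = set(swarm_subjects)
    unknown = swarm_set - set(subject_ids)
    if unknown:
        raise ValueError("unknown subjects: %s" % sorted(unknown))
    # Preserve the original ordering of subject_ids in both groups.
    swarm_ids = [s for s in subject_ids if s in swarm_set]
    report_ids = [s for s in subject_ids if s not in swarm_set]
    return swarm_ids, report_ids


if __name__ == "__main__":
    all_subjects = list(range(1, 21))  # subjects 1 to 20
    # Arbitrary example: swarm on the last three, report on the rest.
    swarm_ids, report_ids = split_subjects(all_subjects, [18, 19, 20])
    print("swarm on:", swarm_ids)
    print("report on:", report_ids)
```

Keeping the two groups disjoint means none of the reported numbers come from
data the parameters were tuned on.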

--Subutai

-----
@SubutaiAhmad

On Tue, Apr 12, 2016 at 2:20 PM, Samuel O Heiserman <[email protected]> wrote:

> Hey Nupic!
>
>     I'm wondering: when running data through Nupic, should I not run the
> same file to build the model as I did to swarm for the parameters? Since
> the parameters were tuned to that exact data, it seems like a potential
> overfitting risk. The data is a series of control actions of subjects
> playing a simple game. What I'm trying to do is train a model on
> subject 1's data, save that model, and use it to forecast for subjects
> 1-20.
>     I hope to show that the HTM can learn the individual behavioral
> patterns of a given subject distinct from the others, and I plan to show
> this capacity with a result where the model does well forecasting for all
> subjects, but especially well at forecasting for the subject it was trained
> on. However, I wonder whether, when testing the model on subject 1, I
> should use different subject 1 data than I used to swarm for the
> parameters. Thanks again!
>
> -- Sam
>
