On Fri, Aug 22, 2014 at 3:28 AM, Cavan Day-Lewis <[email protected]>
wrote:

> Unfortunately, the data file is absolutely massive (it takes over 10 hours
> for a small swarm to go over all the rows). I don’t think we will have time
> to do a large swarm even over a smaller portion of the data.


Aha, so there is a setting in the swarm description JSON file that allows
you to swarm over only a portion of the input file (which is usually fine,
because you don't need to run ALL the data through every swarm model to get
an indication of whether the model params are decent).

From
https://github.com/numenta/nupic/wiki/Running-Swarms#the-swarm-description :

The *iterationCount* value gives the maximum number of *aggregated* records
> to feed into the model. If your data file contains 1000 records spaced 15
> minutes apart, and you've specified a 1 hour aggregation interval, then you
> have a max of 250 aggregated records available to feed into the model.
> Setting this value to -1 means to feed all available aggregated records.
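The arithmetic in the quoted passage can be checked directly (a worked example, not NuPIC API code):

```python
# Worked example from the quoted docs: 1000 records spaced 15 minutes
# apart, aggregated into 1-hour buckets.
records = 1000
record_spacing_min = 15
aggregation_min = 60  # 1 hour

# Each aggregated record covers 60 / 15 = 4 raw records.
aggregated_records = records * record_spacing_min // aggregation_min
print(aggregated_records)  # 250
```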


So I would suggest you set *iterationCount* to a few thousand records and
run a medium swarm. That should cut the swarming time considerably.
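Concretely, that would look something like the following sketch of a swarm description (the field names other than *iterationCount* and *swarmSize* — the stream source, field names, and inference settings — are illustrative placeholders, so adapt them to your own data):

```python
# Minimal sketch of a NuPIC swarm description. Only iterationCount and
# swarmSize are the knobs discussed above; everything else is a
# placeholder you would replace with your own fields.
swarm_description = {
    "includedFields": [
        {"fieldName": "consumption", "fieldType": "float"},
    ],
    "streamDef": {
        "version": 1,
        "streams": [
            {"source": "file://data/input.csv", "columns": ["*"]},
        ],
    },
    "inferenceType": "TemporalMultiStep",
    "inferenceArgs": {
        "predictionSteps": [1],
        "predictedField": "consumption",
    },
    # Swarm over only the first 3000 aggregated records instead of the
    # whole file (-1 would mean "feed all available records").
    "iterationCount": 3000,
    "swarmSize": "medium",
}
```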

---------
Matt Taylor
OS Community Flag-Bearer
Numenta
_______________________________________________
nupic mailing list
[email protected]
http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org
