Making nupic into a distributed system would be a fairly large project, and that's assuming it maps well to the distributed mechanisms in whatever language one would choose (Clojure, Scala, Haskell, etc.). Putting it on top of another distributed system, like Apache Spark or Nokia's Disco, would require either retooling the third-party system or recasting nupic as a map/reduce problem, correct? In either case, I don't think the gains would be realized unless one was using an incredibly massive dataset. These systems, while very useful for many things, introduce significant latencies of their own. But the idea is still pretty cool.
> Date: Thu, 5 Feb 2015 20:40:27 +0100
> From: [email protected]
> Subject: Re: [nupic] Steps toward a distributed NuPic
> To: [email protected]
>
> What about building Nupic on top of Apache Spark, as it has support
> for resiliency and distribution? It also recently added support for
> sparse matrices, although I don't know much about that part. Just
> putting it out as a suggestion. As far as I know, there has been some
> work done in nupic to port it to Java, so it could be easy to use
> Spark with that to make it distributed.
>
> - Gurvinder
