@DavidRagazzi - Thanks! :) The work isn't over yet!
On Sun, Oct 19, 2014 at 12:23 PM, cogmission1 . <[email protected]> wrote:
> Hi Brian,
>
> Thanks! This has been (it's not over yet) both a labor of love and a
> severe push, largely for some of the same reasons you pointed out. This
> is my first foray into Python, and so I had all the same feelings of
> disorientation and intimidation. I (as well as Numenta's flag bearer,
> Matt Taylor) could also foresee, as you do, the enormous opportunity to
> introduce HTM theory to a significantly large group of developers - by
> providing a Java version.
>
> I have to admit, most of my effort and focus have been on getting a
> thorough and tested Java version of NuPIC up and running as quickly as
> possible. As a result, the only comparisons I have made are to the
> Python version. As one would expect, the Java version is orders of
> magnitude faster than the Python version (which mostly exists as a
> research platform and as a knowledge transfer platform for new users,
> due to the ease with which new ideas can be implemented quickly). I
> have not (yet) had a chance to make any comparisons between the Java
> and C++ versions - however, it is my goal to make sure the Java version
> is at least competitive with the C++ version (if not exceeding it - as
> it could very well do in a long-running, primed JVM). The emphasis,
> however, is to augment the utility of NuPIC in general, and to
> introduce as many people as possible to these technologies, because
> they are a unique and important contribution to the field of machine
> learning - which is why I'm doing this!
>
> Regards,
> David
>
> On Sun, Oct 19, 2014 at 11:31 AM, Brian Eppert <[email protected]>
> wrote:
>
>> Very impressive, must have taken a lot of determination, nice work!
>>
>> It’s great to see the Java port is more strongly typed; one of the
>> scariest parts for me looking at the Python code was the wealth of
>> configuration parameters as (mis-typable, unconstrained) strings and
>> arrays.
>> It seems more surmountable as a neophyte to use an IDE that can
>> compile and flag bad values, and provide code completions, in-place
>> documentation or “go to definition” capabilities.
>>
>> Another win is that having this in Java allows for native use by the
>> other JVM-hosted languages like Groovy, Scala, Clojure, JRuby, etc.
>> That’s accessible to quite a few more developers, and with Java’s
>> strong cross-platform-ness a ton of avenues of use open up.
>>
>> That is all wonderful, but I’m bracing myself as I ask this: what have
>> you seen as far as performance compared to the NuPIC Python and C++
>> code?
>>
>>
>> On Oct 18, 2014, at 10:37 AM, cogmission1 . <[email protected]>
>> wrote:
>>
>> Hi Everybody,
>>
>> After 2 (looooooooong) months we finally have usable NuPIC
>> functionality in Java!
>>
>> Repo: https://github.com/numenta/htm.java
>> Wiki: https://github.com/numenta/htm.java/wiki
>> Twitter: https://twitter.com/search?q=%23HtmJavaDevUpdates&src=typd
>>
>> Here's a blurb describing the goals and future plans for the project:
>>
>> ======
>>
>> Throughout the development of the TemporalMemory and the
>> SpatialPooler, there was an emphasis on keeping a 1-to-1
>> correspondence between the methods and functions implementing each
>> algorithm in the Java and Python versions. To this end, I would say
>> that 98% of the Python tests in each module have the *exact* same
>> output produced within the Java unit tests and integration tests. The
>> only places where they differ are those where calls to an underlying
>> RandomNumberGenerator have a significant impact - however, even in
>> those places, every other aspect of the code output is carefully
>> monitored to ensure that, had certain initial parameters been the
>> same, the two versions (Python and Java) would produce the exact same
>> output.
>> This was achieved by temporarily altering the Python tests to be
>> initialized with the same values that the Java version was
>> initialized with - and making sure the output produced was the same!
>>
>> Additionally, a utility object (ArrayUtils) was created to bridge the
>> gap for functionality that is native to Python but doesn't exist in
>> Java, and the SparseMatrix (and its subclasses, SparseBinaryMatrix
>> and SparseObjectMatrix) was created to handle array shaping and
>> vector math operations.
>>
>> There are a few architectural differences in the Java version. One is
>> the abstraction of objects represented in the Python version as
>> arrays and array containers into formal Objects in the Java version.
>> Another is that all methods in the Java version are "functional" in
>> that the data they operate on is passed in, and no state is kept in
>> either the TemporalMemory or the SpatialPooler classes. The
>> "Connections" class (inspired by Chetan's Connections object) acts
>> like an isolated memory - containing all state. This means that two
>> distinct Connections objects (memories) could be passed to the TM or
>> SP, manipulating two entirely different layers *concurrently* or in
>> parallel.
>>
>>
>> Roadmap:
>>
>> At this point the SpatialPooler can be connected to the
>> TemporalMemory to produce output within a given Java project - since
>> those two classes represent the major inference functionality of
>> NuPIC. However, in order to exactly reproduce the convenience of the
>> Online Prediction Framework, other structures would need to be
>> implemented - and so those are next on the list. The anticipated
>> roadmap is as follows:
>>
>> 1.) Create the BaseEncoder and the derivative encoders which are
>> currently relevant (since one or two may have become obsolete). The
>> culmination of this should be the GEOSpatialEncoder, I assume.
>>
>> 2.) Classifiers will then be next on the list, which will complete
>> the current hierarchy of functionality.
>>
>> 3.) Following this, Layer and Regional constructs will be created to
>> coordinate and manage data flow in this hierarchy.
>>
>> 4.) Then we'll loop around and take a look at what new "Research"
>> sensorimotor-based development can be formally pulled in to guide the
>> reshaping of the Java version into a form that reflects the most
>> current theory.
>>
>> 5.) Then we'll do an optimization/performance pass over the entire
>> codebase to make it at least as fast as whatever C++ version is
>> available. (*wink*)
>>
>>
>> --
>> *We find it hard to hear what another is saying because of how loudly
>> "who one is", speaks...*
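
To make the "functional" design from the announcement concrete, here is a minimal sketch of the idea: the algorithm object holds no fields, and a separate memory object carries all state, so one algorithm instance can drive two independent layers from two threads. Everything in it is an assumption for illustration only. `ToyPooler`, `reinforce`, and this stripped-down `Connections` are hypothetical names and are not htm.java's actual classes or method signatures.

```java
import java.util.HashMap;
import java.util.Map;

// Demo entry point: one stateless algorithm instance, two independent memories.
class StatelessDemo {
    public static void main(String[] args) throws InterruptedException {
        ToyPooler pooler = new ToyPooler();      // shared, holds no state
        Connections layerA = new Connections();  // state for layer A
        Connections layerB = new Connections();  // state for layer B

        // Each thread touches only its own Connections object, so no locking
        // is needed in this sketch.
        Thread a = new Thread(() -> pooler.reinforce(layerA, new int[] {0, 1}, 0.1));
        Thread b = new Thread(() -> pooler.reinforce(layerB, new int[] {1, 2}, 0.25));
        a.start();
        b.start();
        a.join();  // join() makes each thread's writes visible to main
        b.join();

        System.out.println(layerA.getPermanence(0)); // prints 0.1
        System.out.println(layerB.getPermanence(0)); // prints 0.0
        System.out.println(layerB.getPermanence(1)); // prints 0.25
    }
}

// Hypothetical stand-in for a "memory" object: it owns ALL mutable state.
final class Connections {
    private final Map<Integer, Double> permanences = new HashMap<>();

    double getPermanence(int synapse) {
        return permanences.getOrDefault(synapse, 0.0);
    }

    void setPermanence(int synapse, double value) {
        permanences.put(synapse, value);
    }
}

// Hypothetical stateless algorithm: no fields, the memory is passed in per call.
final class ToyPooler {
    void reinforce(Connections c, int[] activeSynapses, double increment) {
        for (int s : activeSynapses) {
            // Clamp at 1.0 so a permanence never exceeds its maximum.
            c.setPermanence(s, Math.min(1.0, c.getPermanence(s) + increment));
        }
    }
}
```

Because `ToyPooler` keeps nothing between calls, the two layers cannot interfere with each other. Sharing a single `Connections` object across threads would be a different matter and would need synchronization, which this sketch does not attempt.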
