Hi Takenori,

You might think this is weird (I know I do), but as I am basically just one person writing and supporting HTM.java (with some appreciated help from community members from time to time), I haven't really had the time to **use** NuPIC. Therefore the scope of the questions I can faithfully answer is limited to setting up and using the code, together with any Java-related questions. Questions about NuPIC configuration that affect the performance of the HTM (like DateEncoder parameters, the size of W and N, and actual parameter settings) can be answered by anyone who has used NuPIC and struggled through that learning curve.

The default parameters used are those that were in the Python network examples, plus settings that I have been told are "decent" when asking for help myself. NuPIC parameters are not easy, and require knowledge of the "rules of thumb" (typical rules for usage). For instance, W should be an odd number for reasons having to do with finding the "center" of a series of bits. Also, if you read the class documentation in Encoder.java or base.py (the abstract base encoder for the Python version), you will see some discussion of N and W and how they relate to each other.

In general, the difference between the ScalarEncoder and the RandomDistributedScalarEncoder is that the ScalarEncoder is a bit more efficient but requires prior knowledge of the min and max values in your expected dataset. The RDSE can be used without prior knowledge of the bounds and so is a nice alternative for unknown data. Most people just use the RDSE. Here's a video that discusses the RDSE: https://www.youtube.com/watch?v=_q5W2Ov6C9E

The DateEncoder class Javadoc, and the class file itself (together with DateEncoderTest.java), have lots of documentation in them illustrating their usage. Basically, a DateEncoder is a compound encoder that has ScalarEncoders inside it, each of which handles a different aspect of the date mechanism being used.

The SpatialPooler is an integral part of the HTM - you usually want that. The only time it has been "skipped" is when someone inserts an encoding scheme of their own and wants to preserve the input format. But that is an extreme corner case; I would advise using one in your code.

Don't worry about multiple regions and layers. The capacity to have multiple regions and layers exists for those who need extra flexibility. The ability to assemble Network hierarchies is mostly a "space saver" for when HTM Hierarchy code is released by Numenta in the future.
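
To make the N and W discussion above a bit more concrete, here is a minimal toy sketch of a bounded scalar encoder. It is only an illustration written for this message: the class `ToyScalarEncoder` and all of its methods are hypothetical and are NOT HTM.java's ScalarEncoder, whose actual bucketing logic may differ. It assumes a non-periodic encoder that slides a window of W contiguous active bits across N total bits, and it reuses the n = 50, w = 21, 0 to 10,000,000 values from the configuration in the question.

```java
import java.util.Arrays;

/** Toy scalar encoder for illustration only; this is NOT HTM.java's ScalarEncoder. */
final class ToyScalarEncoder {
    private final int n;      // total number of bits in the encoding
    private final int w;      // number of contiguous active bits
    private final double min; // smallest expected input value
    private final double max; // largest expected input value

    ToyScalarEncoder(int n, int w, double min, double max) {
        if (w <= 0 || w % 2 == 0) {
            throw new IllegalArgumentException("w must be a positive odd number");
        }
        if (n <= w) {
            throw new IllegalArgumentException("n must be larger than w");
        }
        if (max <= min) {
            throw new IllegalArgumentException("max must be larger than min");
        }
        this.n = n;
        this.w = w;
        this.min = min;
        this.max = max;
    }

    /** Index of the first active bit; the input is clipped to [min, max]. */
    int firstActiveBit(double value) {
        double clipped = Math.max(min, Math.min(max, value));
        double fraction = (clipped - min) / (max - min);
        // The w-bit window can start at any index from 0 to n - w.
        return (int) Math.round(fraction * (n - w));
    }

    /** With an odd w there is exactly one bit in the middle of the window. */
    int centerBit(double value) {
        return firstActiveBit(value) + w / 2;
    }

    /** Returns an array of n bits with w contiguous ones. */
    int[] encode(double value) {
        int[] bits = new int[n];
        int start = firstActiveBit(value);
        Arrays.fill(bits, start, start + w, 1);
        return bits;
    }

    /** Number of positions where both encodings have an active bit. */
    static int overlap(int[] a, int[] b) {
        int shared = 0;
        for (int i = 0; i < Math.min(a.length, b.length); i++) {
            if (a[i] == 1 && b[i] == 1) {
                shared++;
            }
        }
        return shared;
    }

    public static void main(String[] args) {
        ToyScalarEncoder enc = new ToyScalarEncoder(50, 21, 0, 10_000_000);
        System.out.println(Arrays.toString(enc.encode(1_000)));
        System.out.println(Arrays.toString(enc.encode(100_000)));
        System.out.println(overlap(enc.encode(0), enc.encode(10_000_000)));
    }
}
```

In this toy scheme, n = 50 and w = 21 leave only 30 possible window positions across the whole 0 to 10,000,000 range, so each step covers roughly 345,000 units. Inputs of 1,000 and 100,000 therefore produce identical encodings, which is one reason the relationship between N, W, and the min/max range deserves attention. Whether the real encoders behave exactly this way is something to confirm against the Encoder.java documentation.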

The "modes" shown in the HotGym Demo are just there for demonstration purposes and really there is no internal concept of "Mode" inside the Network hierarchy. Again, the Mode in the demo is just a switch to instruct the demo to set up different hierarchy styles to show that the output is the same regardless of the number of hierarchical components used to funnel data through.

I hope this helps. You can ask Numenta engineers for rules of thumb regarding the individual Parameter settings.

Cheers,
David

On Wed, Nov 4, 2015 at 9:45 AM, Takenori Sato <[email protected]> wrote:

> Hi NuPIC community and David,
>
> I have some questions about how to configure my network with htm.java.
>
> My use case is to let HTM detect an unexpected high load on a server
> through PING response times. But so far, it produces 0.0 for almost any
> inputs. Sometimes it returns some value, but which are not reasonable at
> all.
>
> The biggest problem is that I am not sure at all about my configurations.
> So I highly suspect my configurations are far from correct ones.
>
> For your reference, you can see my codes here:
>
> CloudSonar project <https://github.com/ggsato/CloudSonar>
> HTMAnomalyDetector
> <https://github.com/ggsato/CloudSonar/blob/master/src/com/cloudian/analytics/HTMAnomalyDetector.java>
>
> My network configurations are based on(or I say copy and paste)
> NetworkDemoHarness. They are modified slightly where I believe I understand.
>
> Here're my questions.
>
> *1. Parameters#getAllDefaultParameters*
>
> private static Network createNetwork(Sensor<ObservableSensor<String>> sensor) {
>     *Parameters p = buildParams();*
>     p = p.union(buildEncoderParams());
>     return Network.create("CloudSonar", p)
>         .add(Network.createRegion("Region")
>             .add(Network.createLayer("Layer", p)
>                 .alterParameter(KEY.AUTO_CLASSIFY, Boolean.TRUE)
>                 .add(Anomaly.create())
>                 .add(new TemporalMemory())
>                 .add(sensor)
>             )
>         );
> }
> private static Parameters buildParams() {
>     return *Parameters.getAllDefaultParameters(); <== THIS ONE*
> }
>
> NetworkDemoHarness#getParameters confused me with many parameters. So I
> picked up only the default ones without overriding anything. Can I start
> like this?
>
> Also, are there any resources to learn about those parameters?
>
> *2. Encoders*
>
> My inputs are [timestamps, duration_in_micro_sec].
>
> private static String generateCSVInput(PollingJob job) {
>     StringBuffer sb = new StringBuffer();
>     sb.append(FULL_DATE_FORMAT.format(new Date())); *<== TIMESTAMP*
>     sb.append(CSVUpdateHandler.DELIM);
>     sb.append(TimeUnit.MICROSECONDS.convert(job.pollingStatus.duration(),
>         TimeUnit.NANOSECONDS)); *<== DURATION*
>     return sb.toString();
> }
>
> I borrowed the config from NetworkDemoHarness#getHotGymFieldEncodingMap
> and getNetworkDemoFieldEncodingMap(noticed mixed up). Then, modified the
> red parts:
>
> public static Map<String, Map<String, Object>> getNetworkFieldEncodingMap() {
>     Map<String, Map<String, Object>> fieldEncodings = setupMap(
>         null,
>         0, // n
>         0, // w
>         0, 0, 0, 0, null, null, null,
>         "timestamp", "datetime", "DateEncoder");
>     fieldEncodings = setupMap(
>         fieldEncodings,
>         50,
>         21,
>         0, *10000000*, 0, 0.1, null, Boolean.TRUE, null, *<== 0 ~ 10 sec*
>         CLASSFIER_FIELD, "int", "ScalarEncoder");
>
>     fieldEncodings.get("timestamp").put(KEY.DATEFIELD_DOFW.getFieldName(), new Tuple(1, 1.0)); // Day of week
>     fieldEncodings.get("timestamp").put(KEY.DATEFIELD_TOFD.getFieldName(), new Tuple(5, 4.0)); // Time of day
>     fieldEncodings.get("timestamp").put(KEY.DATEFIELD_PATTERN.getFieldName(), *FULL_DATE*);
>
>     return fieldEncodings;
> }
>
> Why are all the params of DateEncoder 0 or null?
>
> What is the difference between ScalarEncoder
> and RandomDistributedScalarEncoder?
>
> I happened to use the larger n and w used
> by getNetworkDemoFieldEncodingMap. Compared to HotGym demo, durations is
> much larger than consumption. So a larger n makes sense, but I should have
> set lower w like 6?
>
> I wasn't able to find information how to set those DATEFIELD parameters.
> PATTERN was obvious, but the other two remained unclear. Especially, what
> is the Tuple, and those numbers?
>
> *3. SpatialPooler*
>
> NetworkAPIDemo uses SpatialPooler in every network. But it should be
> related to spatial inputs, correct? So I dropped it from my network
> configuration. I have read the JavaDoc, but got no clue. What is it for?
>
> *4. Multiple Regions and Layers*
>
> I wasn't able to understand the difference between those 3 modes in
> NetworkAPIDemo. I understand MULTILAYER uses multiple layers, and
> MULTIREGION uses multiple regions. But when to use which mode in practice?
>
>
> I gave all of these stupid questions, but in overall, I was impressed that
> the design is easy to understand to integrate htm.java in my own
> application!!
>
> Thanks,
> Takenori
>

--
*With kind regards,*

David Ray
Java Solutions Architect

*Cortical.io <http://cortical.io/>*
Sponsor of: HTM.java <https://github.com/numenta/htm.java>

[email protected]
http://cortical.io
