Re: htm.java config questions

cogmission (David Ray) Wed, 04 Nov 2015 10:46:35 -0800

Hi Takenori,

You might think this is weird (I know I do), but as I am basically just one
person writing and supporting HTM.java (with some appreciated help from
community members from time to time), I haven't really had the time to
**use** NuPIC. Therefore the scope of the questions I can faithfully answer
is limited to setting up and using the code, together with any Java-related
questions. Questions about NuPIC configuration that affect the performance
of the HTM (like DateEncoder parameters, the size of W and N, and actual
parameter settings) are better answered by anyone who has used NuPIC and
worked through that learning curve.

The default parameters used are those that were in the Python network
examples and settings that I have been told are "decent" when asking for
help myself. NuPIC parameters are not easy, and require knowledge of the
"rules of thumb" (typical rules for usage). For instance, W should be an
odd number for reasons having to do with finding the "center" of a series
of bits. Also, if you read the class documentation in Encoder.java or
base.py (the abstract base encoder for the Python version), you will
see some discussion of N and W and how they relate to each other.
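
To make the N and W relationship concrete, here is a minimal plain-Java
sketch of a scalar encoding: N total bits, of which a contiguous run of W
bits is active. It is a simplified illustration only, not HTM.java's
ScalarEncoder; the class and method names are hypothetical. With an odd W,
the run has exactly (W - 1) / 2 bits on each side of a single center bit.

```java
import java.util.Arrays;

/** Simplified illustration only; NOT the HTM.java ScalarEncoder. */
class SimpleScalarEncoderSketch {

    /**
     * Encodes value as n bits containing one contiguous run of w active bits.
     * Assumes minVal < maxVal, w is odd, and n > w.
     */
    static int[] encode(double value, double minVal, double maxVal, int n, int w) {
        if (w % 2 == 0) throw new IllegalArgumentException("w should be odd");
        if (n <= w) throw new IllegalArgumentException("n must be larger than w");
        // Values outside [minVal, maxVal] are clipped to the nearest bound.
        double clipped = Math.max(minVal, Math.min(maxVal, value));
        // A run of w bits can start at n - w + 1 different positions.
        int buckets = n - w + 1;
        int first = (int) Math.round((clipped - minVal) / (maxVal - minVal) * (buckets - 1));
        int[] bits = new int[n];
        Arrays.fill(bits, first, first + w, 1);
        return bits;
    }

    /** Index of the center bit of a run that starts at firstActive. */
    static int centerOf(int firstActive, int w) {
        return firstActive + (w - 1) / 2;
    }

    public static void main(String[] args) {
        // n = 50 and w = 21 are the values from the quoted question below.
        System.out.println(Arrays.toString(encode(50, 0, 100, 50, 21)));
    }
}
```

In this sketch, n - w + 1 is the number of distinct values the encoding can
tell apart, which is one way to see why N and W have to be chosen together.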

In general, the difference between the ScalarEncoder and the
RandomDistributedScalarEncoder is that the ScalarEncoder is a bit more
efficient but requires prior knowledge of the min and max values in your
expected dataset. The RDSE can be used without prior knowledge of the
bounds and so is a nice alternative for unknown data. Most people just use
the RDSE.
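
The practical consequence of needing the bounds can be sketched in plain
Java. This is only an illustration of the bucketing idea under my own
assumptions (hypothetical names, not the HTM.java or NuPIC code, and it
leaves out how the RDSE maps buckets to bits): a fixed-range bucketing
clips everything outside [min, max], while a resolution-based bucketing
needs only a resolution and a reference offset.

```java
/** Simplified illustration only; NOT the HTM.java encoders. */
class BucketingSketch {

    /** Fixed-range bucketing: minVal and maxVal must be known up front. */
    static int boundedBucket(double value, double minVal, double maxVal, int buckets) {
        // Out-of-range values are clipped, so they all land in an end bucket.
        double clipped = Math.max(minVal, Math.min(maxVal, value));
        return (int) Math.round((clipped - minVal) / (maxVal - minVal) * (buckets - 1));
    }

    /** Resolution-based bucketing: no bounds, only a resolution and an offset. */
    static long resolutionBucket(double value, double offset, double resolution) {
        // The index may be negative or arbitrarily large; nothing is clipped.
        return Math.round((value - offset) / resolution);
    }

    public static void main(String[] args) {
        // 150 is outside the assumed range [0, 100].
        System.out.println(boundedBucket(100, 0, 100, 30) + " vs " + boundedBucket(150, 0, 100, 30));
        System.out.println(resolutionBucket(100, 0, 5) + " vs " + resolutionBucket(150, 0, 5));
    }
}
```

With the fixed range, 100 and 150 become indistinguishable; with the
resolution-based version they stay in different buckets.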

Here's a video that discusses the RDSE:
https://www.youtube.com/watch?v=_q5W2Ov6C9E

The DateEncoder class Javadoc, and the class file itself (together with
DateEncoderTest.java), have lots of documentation in them which illustrate
their usage. Basically, a DateEncoder is a compound encoder that has
ScalarEncoders inside it which handle different aspects of the date
mechanism being used.
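
As a rough picture of what "compound encoder" means, the plain-Java sketch
below encodes two aspects of a timestamp separately and concatenates the
results. It is a deliberately crude illustration with hypothetical names:
each sub-encoding here is a single active bit, whereas the real
DateEncoder delegates to ScalarEncoders with their own widths.

```java
import java.time.LocalDateTime;

/** Simplified illustration only; NOT the HTM.java DateEncoder. */
class CompoundDateEncoderSketch {

    /** 7 bits, one per day of the week (Monday first). */
    static int[] encodeDayOfWeek(LocalDateTime t) {
        int[] bits = new int[7];
        bits[t.getDayOfWeek().getValue() - 1] = 1; // getValue(): Monday = 1 ... Sunday = 7
        return bits;
    }

    /** 24 bits, one per hour of the day. */
    static int[] encodeHourOfDay(LocalDateTime t) {
        int[] bits = new int[24];
        bits[t.getHour()] = 1;
        return bits;
    }

    /** The compound encoding is the concatenation of the sub-encodings. */
    static int[] encode(LocalDateTime t) {
        int[] dow = encodeDayOfWeek(t);
        int[] hour = encodeHourOfDay(t);
        int[] out = new int[dow.length + hour.length];
        System.arraycopy(dow, 0, out, 0, dow.length);
        System.arraycopy(hour, 0, out, dow.length, hour.length);
        return out;
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(encode(LocalDateTime.now())));
    }
}
```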

The SpatialPooler is an integral part of the HTM - you usually want that.
The only time it has been "skipped" is when you insert an encoding
scheme of your own and want to preserve the input format. But that is
an extreme corner case, so I would advise using one in your code.

Don't worry about multiple regions and layers. The capacity to have
multiple regions and layers exists for those who need extra flexibility.
The ability to assemble Network hierarchies is mostly a "space saver" for
when HTM Hierarchy code is released by Numenta in the future. The "modes"
shown in the HotGym Demo are just there for demonstration purposes and
really there is no internal concept of "Mode" inside the Network hierarchy.
Again, the Mode in the demo is just a switch that tells the demo to set up
different hierarchy styles, to show that the output is the same regardless
of the number of hierarchical components used to funnel data through.

I hope this helps. You can ask Numenta engineers for rules of thumb
regarding the individual Parameter settings.

Cheers,
David

On Wed, Nov 4, 2015 at 9:45 AM, Takenori Sato <[email protected]> wrote:

> Hi NuPIC community and David,
>
> I have some questions about how to configure my network with htm.java.
>
> My use case is to let HTM detect an unexpected high load on a server
> through PING response times. But so far, it produces 0.0 for almost any
> input. Sometimes it returns some other value, but those values are not
> reasonable at all.
>
> The biggest problem is that I am not sure at all about my configuration,
> so I strongly suspect it is far from correct.
>
> For your reference, you can see my code here:
>
> CloudSonar project <https://github.com/ggsato/CloudSonar>
> HTMAnomalyDetector
> <https://github.com/ggsato/CloudSonar/blob/master/src/com/cloudian/analytics/HTMAnomalyDetector.java>
>
> My network configuration is based on (or, I should say, copied and pasted
> from) NetworkDemoHarness, modified slightly where I believe I understand it.
>
> Here are my questions.
>
> *1. Parameters#getAllDefaultParameters*
>
> private static Network createNetwork(Sensor<ObservableSensor<String>> sensor) {
>     Parameters p = buildParams();
>     p = p.union(buildEncoderParams());
>     return Network.create("CloudSonar", p)
>         .add(Network.createRegion("Region")
>             .add(Network.createLayer("Layer", p)
>                 .alterParameter(KEY.AUTO_CLASSIFY, Boolean.TRUE)
>                 .add(Anomaly.create())
>                 .add(new TemporalMemory())
>                 .add(sensor)));
> }
>
> private static Parameters buildParams() {
>     return Parameters.getAllDefaultParameters(); // <== THIS ONE
> }
>
> NetworkDemoHarness#getParameters confused me with its many parameters, so
> I took only the defaults without overriding anything. Can I start like
> this?
>
> Also, are there any resources to learn about those parameters?
>
> *2. Encoders*
>
> My inputs are [timestamps, duration_in_micro_sec].
>
> private static String generateCSVInput(PollingJob job) {
>     StringBuffer sb = new StringBuffer();
>     sb.append(FULL_DATE_FORMAT.format(new Date())); // <== TIMESTAMP
>     sb.append(CSVUpdateHandler.DELIM);
>     sb.append(TimeUnit.MICROSECONDS.convert(job.pollingStatus.duration(),
>         TimeUnit.NANOSECONDS)); // <== DURATION
>     return sb.toString();
> }
>
> I borrowed the config from NetworkDemoHarness#getHotGymFieldEncodingMap
> and getNetworkDemoFieldEncodingMap (I noticed I had mixed them up), then
> modified the highlighted parts:
>
>     public static Map<String, Map<String, Object>> getNetworkFieldEncodingMap() {
>         Map<String, Map<String, Object>> fieldEncodings = setupMap(
>                 null,
>                 0, // n
>                 0, // w
>                 0, 0, 0, 0, null, null, null,
>                 "timestamp", "datetime", "DateEncoder");
>         fieldEncodings = setupMap(
>                 fieldEncodings,
>                 50,
>                 21,
>                 0, *10000000*, 0, 0.1, null, Boolean.TRUE, null, // <== 0 ~ 10 sec
>                 CLASSFIER_FIELD, "int", "ScalarEncoder");
>
>         fieldEncodings.get("timestamp").put(KEY.DATEFIELD_DOFW.getFieldName(),
>                 new Tuple(1, 1.0)); // Day of week
>         fieldEncodings.get("timestamp").put(KEY.DATEFIELD_TOFD.getFieldName(),
>                 new Tuple(5, 4.0)); // Time of day
>         fieldEncodings.get("timestamp").put(KEY.DATEFIELD_PATTERN.getFieldName(),
>                 *FULL_DATE*);
>
>         return fieldEncodings;
>     }
>
> Why are all the params of DateEncoder 0 or null?
>
> What is the difference between ScalarEncoder
> and RandomDistributedScalarEncoder?
>
> I happened to use the larger n and w from getNetworkDemoFieldEncodingMap.
> Compared to the HotGym demo, my durations are much larger than its
> consumption values, so a larger n makes sense, but should I have set a
> lower w, like 6?
>
> I wasn't able to find information on how to set those DATEFIELD
> parameters. PATTERN was obvious, but the other two remained unclear. In
> particular, what is the Tuple, and what do those numbers mean?
>
> *3. SpatialPooler*
>
> NetworkAPIDemo uses SpatialPooler in every network. But it should be
> related to spatial inputs, correct? So I dropped it from my network
> configuration. I have read the JavaDoc, but got no clue. What is it for?
>
> *4. Multiple Regions and Layers*
>
> I wasn't able to understand the difference between those 3 modes in
> NetworkAPIDemo. I understand MULTILAYER uses multiple layers, and
> MULTIREGION uses multiple regions. But when to use which mode in practice?
>
>
> I know I have asked a lot of stupid questions, but overall I was impressed
> by how easy the design makes it to integrate htm.java into my own
> application!!
>
> Thanks,
> Takenori
>

-- 
*With kind regards,*

David Ray
Java Solutions Architect

*Cortical.io <http://cortical.io/>*
Sponsor of:  HTM.java <https://github.com/numenta/htm.java>

[email protected]
http://cortical.io
