Hi David,
I finally got a decent configuration by running several swarms.
Based on the best result,
resources/sample/swarming/model_params_all-max-30.py,
I have configured htm.java as follows.
Parameters marked with "?" seem to be missing from htm.java.
If you see anything important, could you suggest how to fix it?
private static Parameters buildEncoderParams() {
    Map<String, Map<String, Object>> fieldEncodings =
        getNetworkFieldEncodingMap();
    Parameters p = Parameters.getEncoderDefaultParameters();

    // CLAClassifier
    // alpha, hard coded

    // Spatial Pooler
    // ? 'columnCount': 2048,
    p.setParameterByKey(KEY.GLOBAL_INHIBITIONS, true); // 'globalInhibition': 1,
    // ? 'inputWidth': 0,
    p.setParameterByKey(KEY.MAX_BOOST, 2.0); // 'maxBoost': 2.0,
    p.setParameterByKey(KEY.NUM_ACTIVE_COLUMNS_PER_INH_AREA, 40.0); // 'numActiveColumnsPerInhArea': 40,
    p.setParameterByKey(KEY.POTENTIAL_PCT, 0.8); // 'potentialPct': 0.8,
    // ? 'seed': 1956,
    // 'spVerbosity': 0,
    // ? 'spatialImp': 'cpp',
    p.setParameterByKey(KEY.SYN_PERM_ACTIVE_INC, 0.05); // 'synPermActiveInc': 0.05,
    // 'synPermConnected': 0.1,
    p.setParameterByKey(KEY.SYN_PERM_INACTIVE_DEC, 0.04216241137734589); // 'synPermInactiveDec': 0.04216241137734589

    // Temporal Memory Pooler
    p.setParameterByKey(KEY.ACTIVATION_THRESHOLD, 14); // 'activationThreshold': 14,
    // 'cellsPerColumn': 32,
    // 'columnCount': 2048,
    // ? 'globalDecay': 0.0,
    // 'initialPerm': 0.21,
    // ? 'inputWidth': 2048,
    // ? 'maxAge': 0,
    // ? 'maxSegmentsPerCell': 128,
    // ? 'maxSynapsesPerSegment': 32,
    p.setParameterByKey(KEY.MIN_THRESHOLD, 11); // 'minThreshold': 11,
    // 'newSynapseCount': 20,
    // ? 'outputType': 'normal',
    // ? 'pamLength': 3,
    // 'permanenceDec': 0.1,
    // 'permanenceInc': 0.1,
    // ? 'seed': 1960,
    // ? 'temporalImp': 'cpp',

    p.setParameterByKey(KEY.FIELD_ENCODING_MAP, fieldEncodings);
    return p;
}
Also, I found that AdaptiveScalarEncoder is not available, although the
swarms suggested it.
Was it deliberately left out of MultiEncoder#getEncoder?
Thanks,
Takenori
On Thu, Nov 5, 2015 at 3:47 PM, Takenori Sato <[email protected]> wrote:
> Thanks, David! I added you as a collaborator.
>
> - Takenori
>
> On Thu, Nov 5, 2015 at 3:25 PM, cogmission (David Ray) <
> [email protected]> wrote:
>
>> Hi Takenori,
>>
>> Running a swarm is always an option. Can you give me push rights to your
>> repo and check in some (small) example of data so I can have something to
>> run, and I'll take a look? I'll see if I can get it up and running and push
>> it back to your repo...
>>
>> Cheers,
>> David
>>
>> On Wed, Nov 4, 2015 at 10:30 PM, Takenori Sato <[email protected]>
>> wrote:
>>
>>> Hi David, thanks for your answers!
>>>
>>> I tried a few things, like adding a SpatialPooler and changing n/w, but no luck.
>>>
>>> Perhaps I should run swarming in python against my data,
>>> and study the configuration produced.
>>>
>>> - Takenori
>>>
>>> On Thu, Nov 5, 2015 at 3:44 AM, cogmission (David Ray) <
>>> [email protected]> wrote:
>>>
>>>> Hi Takenori,
>>>>
>>>> You might think this is weird (I know I do), but as I am basically just
>>>> one person writing and supporting HTM.java (with some appreciated help from
>>>> community members from time to time), I haven't really had the time to
>>>> **use** NuPIC. Therefore the scope of the questions I can faithfully answer
>>>> is limited to setting up and using the code, together with any Java-related
>>>> questions. As for NuPIC configurations that have to do with the performance
>>>> of the HTM (like DateEncoder parameters, the size of W and N, and actual
>>>> parameter settings), anyone who has used NuPIC and struggled with that
>>>> learning curve can answer you.
>>>>
>>>> The default parameters used are those that were in the Python network
>>>> examples and settings that I have been told are "decent" when asking for
>>>> help myself. NuPIC parameters are not easy, and require knowledge of the
>>>> "rules of thumb" (typical rules for usage). For instance, W should be an
>>>> odd number for reasons having to do with finding the "center" of a series
>>>> of bits. Also, if you read the class documentation in Encoder.java or
>>>> base.py (the abstract base encoder in the Python version), you will
>>>> see some discussion of N and W and how they relate to each other.
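
The relationship between N and W mentioned above can be made concrete with a little arithmetic. The following is a minimal sketch under a simplified model of a non-periodic scalar encoder, not HTM.java's actual implementation: it assumes a block of W contiguous bits slides across N bits, so there are N - W + 1 positions, and the value range is divided evenly among them. The class and method names are hypothetical. The numbers plugged in are the n = 50, w = 21 and 0 to 10,000,000 microsecond range from the field encoding map quoted further down.

```java
// Simplified model only: names and formulas here are illustrative assumptions,
// not the HTM.java ScalarEncoder API.
class ScalarEncoderResolution {

    /** Number of distinct positions for a block of w contiguous bits inside n bits. */
    static int bucketCount(int n, int w) {
        if (w <= 0 || n <= w) {
            throw new IllegalArgumentException("need 0 < w < n");
        }
        return n - w + 1;
    }

    /** Width of the value range covered by one step between adjacent positions. */
    static double resolution(double min, double max, int n, int w) {
        if (max <= min) {
            throw new IllegalArgumentException("need min < max");
        }
        int steps = bucketCount(n, w) - 1; // equals n - w
        return (max - min) / steps;
    }

    public static void main(String[] args) {
        int n = 50;
        int w = 21;
        double min = 0;
        double max = 10_000_000; // microseconds, i.e. 10 seconds
        System.out.println("buckets    = " + bucketCount(n, w));
        System.out.println("resolution = " + resolution(min, max, n, w) + " microseconds per step");
    }
}
```

Under this model each step covers roughly 0.34 seconds, so response times that differ by only a few milliseconds would land in the same position. Whether that matters depends on the data, but it may be worth checking against the real encoder.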
>>>>
>>>> In general, the difference between the ScalarEncoder and the
>>>> RandomDistributedScalarEncoder is that the ScalarEncoder is a bit more
>>>> efficient but requires prior knowledge of the min and max values in your
>>>> expected dataset. The RDSE can be used without prior knowledge of the
>>>> bounds and so is a nice alternative for unknown data. Most people just use
>>>> the RDSE.
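
To illustrate why a ScalarEncoder needs the min and max up front, here is a minimal sketch of a simplified scalar encoder in plain Java. It is a toy built on stated assumptions, not the HTM.java class: SimpleScalarEncoder and its methods are hypothetical names, and it simply clips the value to [min, max] and switches on W contiguous bits.

```java
import java.util.Arrays;

// Toy encoder for illustration; not the HTM.java ScalarEncoder.
class SimpleScalarEncoder {
    final int n;
    final int w;
    final double min;
    final double max;

    SimpleScalarEncoder(int n, int w, double min, double max) {
        if (w <= 0 || n <= w || max <= min) {
            throw new IllegalArgumentException("need 0 < w < n and min < max");
        }
        this.n = n;
        this.w = w;
        this.min = min;
        this.max = max;
    }

    /** Index of the first active bit, after clipping the value to [min, max]. */
    int firstBit(double value) {
        double clipped = Math.max(min, Math.min(max, value));
        double fraction = (clipped - min) / (max - min);
        return (int) Math.round(fraction * (n - w));
    }

    /** Returns n bits with a block of w contiguous ones. */
    int[] encode(double value) {
        int[] bits = new int[n];
        int start = firstBit(value);
        Arrays.fill(bits, start, start + w, 1);
        return bits;
    }

    /** Counts the bits that are active in both encodings. */
    static int overlap(int[] a, int[] b) {
        int count = 0;
        for (int i = 0; i < a.length; i++) {
            if (a[i] == 1 && b[i] == 1) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        SimpleScalarEncoder enc = new SimpleScalarEncoder(50, 21, 0, 100);
        System.out.println("overlap(40, 45)  = " + overlap(enc.encode(40), enc.encode(45)));
        System.out.println("overlap(0, 100)  = " + overlap(enc.encode(0), enc.encode(100)));
        System.out.println("250 same as 100? " + Arrays.equals(enc.encode(250), enc.encode(100)));
    }
}
```

In this sketch nearby values share most of their bits, while any value beyond max is clipped and becomes indistinguishable from max. That is the cost of guessing the bounds wrong, and the reason an encoder that does not need them is attractive for unknown data.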
>>>>
>>>> Here's a video that discusses the RDSE:
>>>> https://www.youtube.com/watch?v=_q5W2Ov6C9E
>>>>
>>>> The DateEncoder class Javadoc, and the class file itself (together with
>>>> DateEncoderTest.java), contain a lot of documentation that illustrates
>>>> its usage. Basically, a DateEncoder is a compound encoder that has
>>>> ScalarEncoders inside it which handle different aspects of the date
>>>> mechanism being used.
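
The "compound encoder" idea can be sketched in a few lines of plain Java. This is a simplified illustration, not the HTM.java DateEncoder. As far as I know, NuPIC's Python DateEncoder reads a pair such as (5, 4.0) for time of day as (w, radius in hours), and the sketch assumes Tuple(5, 4.0) and Tuple(1, 1.0) in the question follow the same convention. It further assumes each periodic sub-encoder uses n = w * range / radius bits. All names below are hypothetical.

```java
import java.util.Arrays;

// Toy compound encoder: two periodic sub-encodings concatenated into one bit array.
class CompoundDateSketch {

    /** Periodic encoding: w contiguous bits that wrap around the end of an n-bit array. */
    static int[] encodePeriodic(double value, double range, int n, int w) {
        int[] bits = new int[n];
        // Assumes value >= 0; w is assumed odd so the block has a center bit.
        int center = (int) Math.floor((value % range) * n / range);
        int half = (w - 1) / 2;
        for (int offset = -half; offset <= half; offset++) {
            bits[Math.floorMod(center + offset, n)] = 1;
        }
        return bits;
    }

    /** Joins the sub-encodings end to end. */
    static int[] concat(int[]... parts) {
        int total = 0;
        for (int[] part : parts) {
            total += part.length;
        }
        int[] out = new int[total];
        int pos = 0;
        for (int[] part : parts) {
            System.arraycopy(part, 0, out, pos, part.length);
            pos += part.length;
        }
        return out;
    }

    /** dayOfWeek in 0..6, hourOfDay in [0, 24). */
    static int[] encode(int dayOfWeek, double hourOfDay) {
        int[] day = encodePeriodic(dayOfWeek, 7, 7, 1);     // (w = 1, radius = 1.0) gives n = 7
        int[] time = encodePeriodic(hourOfDay, 24, 30, 5);  // (w = 5, radius = 4.0) gives n = 30
        return concat(day, time);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(encode(3, 12.0)));
    }
}
```

Each sub-encoder contributes its own slice of the output, so the day-of-week bits and the time-of-day bits never collide, and times close to midnight wrap around within their slice.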
>>>>
>>>> The SpatialPooler is an integral part of the HTM, so you usually want
>>>> one. The only time it has been "skipped" is when you insert an encoding
>>>> scheme of your own and want to preserve the input format. But that is an
>>>> extreme corner case; I would advise using one in your code.
>>>>
>>>> Don't worry about multiple regions and layers. The capacity to have
>>>> multiple regions and layers exists for those who need extra flexibility.
>>>> The ability to assemble Network hierarchies is mostly a "space saver" for
>>>> when HTM Hierarchy code is released by Numenta in the future. The "modes"
>>>> shown in the HotGym Demo are just there for demonstration purposes and
>>>> really there is no internal concept of "Mode" inside the Network hierarchy.
>>>> Again, the Mode in the demo is just a switch to instruct the demo to set up
>>>> different hierarchy styles to show that the output is the same regardless
>>>> of the number of hierarchical components used to funnel data through.
>>>>
>>>> I hope this helps. You can ask Numenta engineers for rules of thumb
>>>> regarding the individual Parameter settings.
>>>>
>>>> Cheers,
>>>> David
>>>>
>>>> On Wed, Nov 4, 2015 at 9:45 AM, Takenori Sato <[email protected]>
>>>> wrote:
>>>>
>>>>> Hi NuPIC community and David,
>>>>>
>>>>> I have some questions about how to configure my network with htm.java.
>>>>>
>>>>> My use case is to let HTM detect an unexpectedly high load on a server
>>>>> through PING response times. But so far, it produces 0.0 for almost any
>>>>> input. Sometimes it returns other values, but they are not reasonable at
>>>>> all.
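
For context on what a score of 0.0 means, here is a minimal sketch of the raw anomaly score as it is commonly defined for NuPIC: the fraction of currently active columns that were not predicted in the previous step. It is written as standalone Java under the assumption that htm.java's Anomaly follows the same definition. RawAnomalyScore is a hypothetical name, and returning 0.0 when nothing is active is a simplification chosen for this sketch.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Illustrative only; not the htm.java Anomaly class.
class RawAnomalyScore {

    /**
     * Returns 1 - |active AND previously predicted| / |active|.
     * 0.0 means every active column was predicted; 1.0 means none was.
     */
    static double compute(Set<Integer> activeColumns, Set<Integer> previouslyPredictedColumns) {
        if (activeColumns.isEmpty()) {
            return 0.0; // simplification: nothing active, nothing unexpected
        }
        int predictedAndActive = 0;
        for (int column : activeColumns) {
            if (previouslyPredictedColumns.contains(column)) {
                predictedAndActive++;
            }
        }
        return 1.0 - (double) predictedAndActive / activeColumns.size();
    }

    public static void main(String[] args) {
        Set<Integer> active = new HashSet<>(Arrays.asList(1, 2, 3, 4));
        Set<Integer> predicted = new HashSet<>(Arrays.asList(1, 2));
        System.out.println("score = " + compute(active, predicted));
    }
}
```

Seen this way, a steady 0.0 says that the active columns were fully predicted at each step. One possible cause, offered only as a guess, is an encoding that barely changes from one input to the next.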
>>>>>
>>>>> The biggest problem is that I am not at all sure about my
>>>>> configuration, so I strongly suspect it is far from correct.
>>>>>
>>>>> For your reference, you can see my code here:
>>>>>
>>>>> CloudSonar project <https://github.com/ggsato/CloudSonar>
>>>>> HTMAnomalyDetector
>>>>> <https://github.com/ggsato/CloudSonar/blob/master/src/com/cloudian/analytics/HTMAnomalyDetector.java>
>>>>>
>>>>> My network configuration is based on (or rather, copied and pasted from)
>>>>> NetworkDemoHarness, modified slightly where I believe I understand it.
>>>>>
>>>>> Here are my questions.
>>>>>
>>>>> *1. Parameters#getAllDefaultParameters*
>>>>>
>>>>> private static Network createNetwork(Sensor<ObservableSensor<String>> sensor) {
>>>>>     Parameters p = buildParams();
>>>>>     p = p.union(buildEncoderParams());
>>>>>     return Network.create("CloudSonar", p)
>>>>>         .add(Network.createRegion("Region")
>>>>>             .add(Network.createLayer("Layer", p)
>>>>>                 .alterParameter(KEY.AUTO_CLASSIFY, Boolean.TRUE)
>>>>>                 .add(Anomaly.create())
>>>>>                 .add(new TemporalMemory())
>>>>>                 .add(sensor)));
>>>>> }
>>>>>
>>>>> private static Parameters buildParams() {
>>>>>     return Parameters.getAllDefaultParameters(); // <== THIS ONE
>>>>> }
>>>>>
>>>>> NetworkDemoHarness#getParameters confused me with its many parameters, so
>>>>> I used only the defaults without overriding anything. Can I start
>>>>> like this?
>>>>>
>>>>> Also, are there any resources to learn about those parameters?
>>>>>
>>>>> *2. Encoders*
>>>>>
>>>>> My inputs are [timestamps, duration_in_micro_sec].
>>>>>
>>>>> private static String generateCSVInput(PollingJob job) {
>>>>>     StringBuffer sb = new StringBuffer();
>>>>>     sb.append(FULL_DATE_FORMAT.format(new Date())); // <== TIMESTAMP
>>>>>     sb.append(CSVUpdateHandler.DELIM);
>>>>>     sb.append(TimeUnit.MICROSECONDS.convert(job.pollingStatus.duration(),
>>>>>         TimeUnit.NANOSECONDS)); // <== DURATION
>>>>>     return sb.toString();
>>>>> }
>>>>>
>>>>> I borrowed the config from
>>>>> NetworkDemoHarness#getHotGymFieldEncodingMap and
>>>>> getNetworkDemoFieldEncodingMap (I noticed I had mixed them up). Then I
>>>>> modified the highlighted parts:
>>>>>
>>>>> public static Map<String, Map<String, Object>> getNetworkFieldEncodingMap() {
>>>>>     Map<String, Map<String, Object>> fieldEncodings = setupMap(
>>>>>         null,
>>>>>         0, // n
>>>>>         0, // w
>>>>>         0, 0, 0, 0, null, null, null,
>>>>>         "timestamp", "datetime", "DateEncoder");
>>>>>     fieldEncodings = setupMap(
>>>>>         fieldEncodings,
>>>>>         50, // n
>>>>>         21, // w
>>>>>         0, 10000000, 0, 0.1, null, Boolean.TRUE, null, // <== modified: 0 ~ 10 sec
>>>>>         CLASSFIER_FIELD, "int", "ScalarEncoder");
>>>>>
>>>>>     fieldEncodings.get("timestamp").put(KEY.DATEFIELD_DOFW.getFieldName(),
>>>>>         new Tuple(1, 1.0)); // Day of week
>>>>>     fieldEncodings.get("timestamp").put(KEY.DATEFIELD_TOFD.getFieldName(),
>>>>>         new Tuple(5, 4.0)); // Time of day
>>>>>     fieldEncodings.get("timestamp").put(KEY.DATEFIELD_PATTERN.getFieldName(),
>>>>>         FULL_DATE); // <== modified
>>>>>
>>>>>     return fieldEncodings;
>>>>> }
>>>>>
>>>>> Why are all the params of DateEncoder 0 or null?
>>>>>
>>>>> What is the difference between ScalarEncoder
>>>>> and RandomDistributedScalarEncoder?
>>>>>
>>>>> I happened to use the larger n and w used
>>>>> by getNetworkDemoFieldEncodingMap. Compared to the HotGym demo, my durations
>>>>> are much larger than the consumption values. So a larger n makes sense, but
>>>>> should I have set a lower w, like 6?
>>>>>
>>>>> I wasn't able to find information on how to set those DATEFIELD
>>>>> parameters. PATTERN was obvious, but the other two remained unclear.
>>>>> In particular, what is the Tuple, and what do those numbers mean?
>>>>>
>>>>> *3. SpatialPooler*
>>>>>
>>>>> NetworkAPIDemo uses SpatialPooler in every network. But it should be
>>>>> related to spatial inputs, correct? So I dropped it from my network
>>>>> configuration. I have read the Javadoc, but it gave me no clue. What is it for?
>>>>>
>>>>> *4. Multiple Regions and Layers*
>>>>>
>>>>> I wasn't able to understand the difference between those 3 modes in
>>>>> NetworkAPIDemo. I understand MULTILAYER uses multiple layers, and
>>>>> MULTIREGION uses multiple regions. But when to use which mode in practice?
>>>>>
>>>>>
>>>>> I know I have asked a lot of basic questions, but overall I was impressed
>>>>> by how easy the design makes it to understand and integrate htm.java into
>>>>> my own application!!
>>>>>
>>>>> Thanks,
>>>>> Takenori
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> *With kind regards,*
>>>>
>>>> David Ray
>>>> Java Solutions Architect
>>>>
>>>> *Cortical.io <http://cortical.io/>*
>>>> Sponsor of: HTM.java <https://github.com/numenta/htm.java>
>>>>
>>>> [email protected]
>>>> http://cortical.io
>>>>
>>>
>>>
>>
>>
>
>