Re: Question regarding patterns and noise

Matthew Taylor Wed, 27 Apr 2016 08:19:40 -0700

Swarming tries to find the best encoder parameters as well, but it
depends on a lot of things. What size swarm did you use? How much data
was used? What were the min and max scalar values in that data? We
recommend that you provide 3000 rows of data to a swarm to get the
best model params. If you used a "small" swarm, the params you get
back are always bad. (By the way, I think the term "debug" swarm is
much better [1]).
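
To illustrate why the encoder parameters matter here: the swarm output
quoted below has n=22 and w=21 for a range of 5.0 to 6.0. The following
is a minimal sketch, assuming a simplified non-periodic scalar encoding
(one contiguous run of w active bits whose start position scales linearly
with the value). The encode() helper is hypothetical and is not NuPIC's
actual ScalarEncoder code. The n=400 comparison is an arbitrary
illustrative value, not a recommendation from this thread.

```python
def encode(value, minval, maxval, n, w):
    """Return the active bit indices for a simplified scalar encoding.

    Hypothetical helper for illustration only; not the NuPIC API.
    """
    # Clip to the configured range, mirroring 'clipInput': True.
    value = max(minval, min(maxval, value))
    # The run of w bits can start at any of (n - w + 1) positions.
    start = int(round((value - minval) / (maxval - minval) * (n - w)))
    return set(range(start, start + w))


# Parameters from the quoted swarm output: n=22, w=21, range 5.0 to 6.0.
five = encode(5.0, 5.0, 6.0, n=22, w=21)
six = encode(6.0, 5.0, 6.0, n=22, w=21)
overlap = len(five & six)
print("n=22, w=21: shared active bits = %d of 21" % overlap)

# Same range with a much larger n (illustrative value only).
wide_five = encode(5.0, 5.0, 6.0, n=400, w=21)
wide_six = encode(6.0, 5.0, 6.0, n=400, w=21)
wide_overlap = len(wide_five & wide_six)
print("n=400, w=21: shared active bits = %d of 21" % wide_overlap)
```

Under these assumptions, 5 and 6 share 20 of their 21 active bits when
n=22, so the two values look nearly identical to the spatial pooler. With
a larger n the two encodings no longer overlap.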


[1] https://github.com/numenta/nupic/issues/913
---------
Matt Taylor
OS Community Flag-Bearer
Numenta


On Wed, Apr 27, 2016 at 1:54 AM, Alexandre Vivmond <[email protected]> wrote:
> Thank you, I'll try it out with the RandomDistributedScalarEncoder. One more
> question, those n and w parameters were generated by running a swarm. Does a
> swarm only try to optimize the SP and TP parameter values or does it also
> try to find the best encoder and encoder parameters? I'm just wondering,
> because I do understand the issue with the n and w, but I just kind of
> blindly trusted the swarm so far to give me the best parameters for
> everything. If the swarm doesn't really deal with encoders, then I'll be
> more careful with encoders and their parameters.
>
> On Tue, Apr 26, 2016 at 7:34 PM, cogmission (David Ray)
> <[email protected]> wrote:
>>
>> I see. Well you know better! I just thought of my limited experience using
>> NuPIC and what I've run into. The encoding "problem" does make sense... I
>> didn't notice the w/n issue... :-)
>>
>>
>>
>> On Tue, Apr 26, 2016 at 11:59 AM, Matthew Taylor <[email protected]> wrote:
>>>
>>> Calling reset might help, but he should not have to do it. I think the
>>> values just need to be encoded in a way that presents more semantic
>>> difference between "5" and "6".
>>> ---------
>>> Matt Taylor
>>> OS Community Flag-Bearer
>>> Numenta
>>>
>>>
>>> On Tue, Apr 26, 2016 at 9:17 AM, cogmission (David Ray)
>>> <[email protected]> wrote:
>>> > i.e. The "reset" marks the beginning and end of a pattern.
>>> >
>>> > On Tue, Apr 26, 2016 at 11:17 AM, cogmission (David Ray)
>>> > <[email protected]> wrote:
>>> >>
>>> >> Alexandre, are you calling "reset()" after the 20,000 5's then one 6?
>>> >> The "reset()" lets the HTM know that the pattern has concluded and may
>>> >> help yield better results?
>>> >>
>>> >> Cheers,
>>> >> David
>>> >>
>>> >> On Tue, Apr 26, 2016 at 10:03 AM, Alexandre Vivmond
>>> >> <[email protected]>
>>> >> wrote:
>>> >>>
>>> >>> Here are parameters that I'm using for running a swarm
>>> >>>
>>> >>> SWARM_CONFIG = {
>>> >>>   "includedFields": [
>>> >>>     {
>>> >>>       "fieldName": "value",
>>> >>>       "fieldType": "float",
>>> >>>       "maxValue": 6.0,
>>> >>>       "minValue": 5.0
>>> >>>     }
>>> >>>   ],
>>> >>>   "streamDef": {
>>> >>>     "info": "value",
>>> >>>     "version": 1,
>>> >>>     "streams": [
>>> >>>       {
>>> >>>         "info": "Values",
>>> >>>         "source": "file://values.csv",
>>> >>>         "columns": [
>>> >>>           "*"
>>> >>>         ]
>>> >>>       }
>>> >>>     ]
>>> >>>   },
>>> >>>
>>> >>>   "inferenceType": "TemporalAnomaly",
>>> >>>   "inferenceArgs": {
>>> >>>     "predictionSteps": [
>>> >>>       1
>>> >>>     ],
>>> >>>     "predictedField": "value"
>>> >>>   },
>>> >>>   "iterationCount": -1,
>>> >>>   "swarmSize": "medium"
>>> >>> }
>>> >>>
>>> >>>
>>> >>> And here is the generated model_params.py file output
>>> >>>
>>> >>> MODEL_PARAMS = {'aggregationInfo': {'days': 0,
>>> >>>                      'fields': [],
>>> >>>                      'hours': 0,
>>> >>>                      'microseconds': 0,
>>> >>>                      'milliseconds': 0,
>>> >>>                      'minutes': 0,
>>> >>>                      'months': 0,
>>> >>>                      'seconds': 0,
>>> >>>                      'weeks': 0,
>>> >>>                      'years': 0},
>>> >>>  'model': 'CLA',
>>> >>>  'modelParams': {'anomalyParams': {u'anomalyCacheRecords': None,
>>> >>>                                    u'autoDetectThreshold': None,
>>> >>>                                    u'autoDetectWaitRecords': None},
>>> >>>                  'clParams': {'alpha': 0.00634375,
>>> >>>                               'clVerbosity': 0,
>>> >>>                               'regionName': 'CLAClassifierRegion',
>>> >>>                               'steps': '1'},
>>> >>>                  'inferenceType': 'TemporalAnomaly',
>>> >>>                  'sensorParams': {'encoders': {u'value': {'clipInput': True,
>>> >>>                                                           'fieldname': 'value',
>>> >>>                                                           'maxval': 6.0,
>>> >>>                                                           'minval': 5.0,
>>> >>>                                                           'n': 22,
>>> >>>                                                           'name': 'value',
>>> >>>                                                           'type': 'ScalarEncoder',
>>> >>>                                                           'w': 21}},
>>> >>>                                   'sensorAutoReset': None,
>>> >>>                                   'verbosity': 0},
>>> >>>                  'spEnable': True,
>>> >>>                  'spParams': {'columnCount': 2048,
>>> >>>                               'globalInhibition': 1,
>>> >>>                               'inputWidth': 0,
>>> >>>                               'maxBoost': 2.0,
>>> >>>                               'numActiveColumnsPerInhArea': 40,
>>> >>>                               'potentialPct': 0.8,
>>> >>>                               'seed': 1956,
>>> >>>                               'spVerbosity': 0,
>>> >>>                               'spatialImp': 'cpp',
>>> >>>                               'synPermActiveInc': 0.05,
>>> >>>                               'synPermConnected': 0.1,
>>> >>>                               'synPermInactiveDec': 0.09376875},
>>> >>>                  'tpEnable': True,
>>> >>>                  'tpParams': {'activationThreshold': 12,
>>> >>>                               'cellsPerColumn': 32,
>>> >>>                               'columnCount': 2048,
>>> >>>                               'globalDecay': 0.0,
>>> >>>                               'initialPerm': 0.21,
>>> >>>                               'inputWidth': 2048,
>>> >>>                               'maxAge': 0,
>>> >>>                               'maxSegmentsPerCell': 128,
>>> >>>                               'maxSynapsesPerSegment': 32,
>>> >>>                               'minThreshold': 9,
>>> >>>                               'newSynapseCount': 20,
>>> >>>                               'outputType': 'normal',
>>> >>>                               'pamLength': 1,
>>> >>>                               'permanenceDec': 0.1,
>>> >>>                               'permanenceInc': 0.1,
>>> >>>                               'seed': 1960,
>>> >>>                               'temporalImp': 'cpp',
>>> >>>                               'verbosity': 0},
>>> >>>                  'trainSPNetOnlyIfRequested': False},
>>> >>>  'predictAheadTime': None,
>>> >>>  'version': 1}
>>> >>>
>>> >>> On Tue, Apr 26, 2016 at 4:33 PM, Matthew Taylor <[email protected]>
>>> >>> wrote:
>>> >>>>
>>> >>>> What are the encoder parameters you're using to encode these numbers?
>>> >>>> 5 and 6 might be close enough that they get encoded as the same bit
>>> >>>> array. What are your min/max values for the scalar encoder? Or are
>>> >>>> you using another encoder?
>>> >>>> ---------
>>> >>>> Matt Taylor
>>> >>>> OS Community Flag-Bearer
>>> >>>> Numenta
>>> >>>>
>>> >>>>
>>> >>>> On Tue, Apr 26, 2016 at 3:32 AM, Alexandre Vivmond
>>> >>>> <[email protected]>
>>> >>>> wrote:
>>> >>>> > I've got a question regarding patterns and noise. I've experimented a
>>> >>>> > bit with HTM now, and I can get it to learn a wide variety of varying
>>> >>>> > patterns such as for example: 1, 2, 3, 1, 2, 3, 1,... or 5, 6, 5, 6,
>>> >>>> > 5, 6, ... but patterns such as 5, 5, 6, 5, 5, 6, ... or 5, 5, 5, 5, 5,
>>> >>>> > 5, 5, 5, 5, 6, 5, 5, 5, 5, 5, 5, 5, 5, 5, 6, ... are things that HTM
>>> >>>> > struggles with, which is understandable considering HTM is really good
>>> >>>> > at creating "links" between values with respect to time and context.
>>> >>>> > But the previously mentioned example makes it really hard to create
>>> >>>> > "links" between self-repeating values, even though HTM can manage to
>>> >>>> > differ between contexts. So what exactly is the "line" between a
>>> >>>> > pattern and noise? I fed HTM 20000 values of 10 fives followed by one
>>> >>>> > 6 (5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 6, 5, ...) and it still didn't manage
>>> >>>> > to learn that pattern. Any ideas?
>>> >>>>
>>> >>>
>>> >>
>>> >>
>>> >>
>>> >> --
>>> >> With kind regards,
>>> >>
>>> >> David Ray
>>> >> Java Solutions Architect
>>> >>
>>> >> Cortical.io
>>> >> Sponsor of:  HTM.java
>>> >>
>>> >> [email protected]
>>> >> http://cortical.io
>>> >
>>> >
>>> >
>>> >
>>> > --
>>> > With kind regards,
>>> >
>>> > David Ray
>>> > Java Solutions Architect
>>> >
>>> > Cortical.io
>>> > Sponsor of:  HTM.java
>>> >
>>> > [email protected]
>>> > http://cortical.io
>>>
>>
>>
>>
>> --
>> With kind regards,
>>
>> David Ray
>> Java Solutions Architect
>>
>> Cortical.io
>> Sponsor of:  HTM.java
>>
>> [email protected]
>> http://cortical.io
>
>
