Hello Nick,
What you are trying to do sounds very interesting. My guess is that the poor 
generalization comes from insufficient semantics being captured during the 
encoding step. As you may know, we work in the domain of language processing, 
where the semantic depth of the SDRs is key. 
In your case, the semantics of the system are defined by the structure of the 
(human) body and its degrees of freedom of movement.
What you should try to achieve is to capture some of this semantic context in 
your encoding process. The SDRs representing body positions (or movements) 
should be formed so that similar positions (gestures) have similar SDRs 
(many overlapping bits). The better you realize this encoding, the better 
the HTM will be able to generalize.
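
As an illustration, here is a minimal sketch of that idea (a hypothetical encoder, not the NuPIC API; the parameter names are mine): adjacent directions map to overlapping blocks of active bits, so similar movements yield similar SDRs, while opposite directions share nothing.

```python
def encode_direction(direction, n_directions=8, width=32, active_bits=8):
    """Encode a discrete direction as a binary SDR on a circular axis.

    Adjacent directions get overlapping blocks of active bits: with the
    defaults, neighbouring directions share half their bits, while
    opposite directions share none. (Illustrative sketch only.)
    """
    sdr = [0] * width
    # start of the active block, wrapping around the circle
    start = int(direction * width / n_directions)
    for i in range(active_bits):
        sdr[(start + i) % width] = 1
    return sdr
```

With these defaults, direction 0 activates bits 0-7 and direction 1 activates bits 4-11, so they overlap in 4 bits; direction 4 (the opposite) overlaps in none.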
In language processing, we were able to create classifiers that needed only 
four example sentences such as:

"Erwin Schrödinger is a physicist.”
“Marie Curie is a physicist"
“Niels Bohr is a physicist”
“James Maxwell is a physicist”

to give the following response: “Albert Einstein is a” PHYSICIST

In my experience, measurable similarity among SDRs that encode similar data 
seems to be the key to letting an HTM network unfold its full power.
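
That similarity is conventionally measured as the number of bits active in both SDRs; a minimal sketch:

```python
def sdr_overlap(a, b):
    """Number of bits active in both SDRs -- the usual SDR similarity
    measure. Inputs are equal-length binary (0/1) lists."""
    return sum(x & y for x, y in zip(a, b))
```

For example, `sdr_overlap([1, 1, 1, 1, 0, 0, 0, 0], [0, 0, 1, 1, 1, 1, 0, 0])` is 2, the count of shared active positions.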

Francisco

On 29.12.2014, at 16:25, Nicholas Mitri <[email protected]> wrote:

> Hey Matt, everyone, 
> 
> I debugged the code and managed to get some sensible results. HTM is doing a 
> great job of learning sequences but performing very poorly at generalization. 
> So while it can recognize a sequence it had learned with high accuracy, when 
> it’s fed a test sequence that it’s never seen, its classification accuracy 
> plummets. To be clear, classification here is performed by assigning an HTM 
> region to each class and picking the region with the lowest anomaly score 
> averaged over the test sequence. 
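
The decision rule described above (pick the region with the lowest average anomaly) can be sketched as follows; this is an editorial illustration, not the thread's actual code:

```python
import numpy as np

def classify_by_anomaly(avg_anomaly_per_region):
    """Return the index (class) of the HTM region reporting the lowest
    anomaly score averaged over the test sequence."""
    return int(np.argmin(avg_anomaly_per_region))
```

For instance, `classify_by_anomaly([0.8, 0.15, 0.6])` picks class 1, the region least surprised by the sequence.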
> 
> I’ve tried tweaking the encoder parameters to quantize the input with a lower 
> resolution in the hope that similar inputs will be better pooled. That didn’t 
> pan out. Also, changing encoder output length or number of columns is causing 
> the HTM to output no predictions at times even with a non-empty active column 
> list. I have little idea why that keeps happening. 
> 
> Any hints as to how to get HTM to better perform here? I’ve included HMM 
> results for comparison. SVM results are all 95+%.
> 
> Thank you,
> Nick
> 
> 
> HTM Results:
> 
> Data = sequence of directions (8 discrete directions)
> Note on accuracy: M1/M2 denotes two performance metrics. M1 is the average 
> anomaly; M2 is the sum of the normalized average anomaly and the normalized 
> prediction error.
> 
> Base training accuracy: 100% after 2 training passes
> User Dependent: 56.25% / 56.25%
> User Independent: N/A
> Mixed: 65.00% / 71.25%
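
One plausible reading of the M1/M2 metrics above, as a sketch; the min-max normalization across regions is an assumption, since the thread does not specify the scheme:

```python
import numpy as np

def combined_scores(avg_anomaly, pred_error):
    """M1 is the per-region average anomaly; M2 adds a normalized
    prediction-error term. Min-max normalization across regions is an
    assumption here, not taken from the thread."""
    def norm(v):
        v = np.asarray(v, dtype=float)
        span = v.max() - v.min()
        return (v - v.min()) / span if span > 0 else np.zeros_like(v)
    m1 = np.asarray(avg_anomaly, dtype=float)
    m2 = norm(avg_anomaly) + norm(pred_error)
    return m1, m2
```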
> 
> HMM (22-states) Results:
> 
> Data = sequence of directions (16 discrete directions)
> 
> Base training accuracy: 97.5%
> User Dependent: 76.25%
> User Independent: 88.75%
> Mixed: 88.75%
> 
> 
>> On Dec 11, 2014, at 7:16 PM, Matthew Taylor <[email protected]> wrote:
>> 
>> Nicholas, can you paste a sample of the input data file?
>> 
>> ---------
>> Matt Taylor
>> OS Community Flag-Bearer
>> Numenta
>> 
>> On Thu, Dec 11, 2014 at 7:50 AM, Nicholas Mitri <[email protected]> wrote:
>> Hey all, 
>> 
>> I’m running into some trouble with using HTM for a gesture recognition 
>> application and would appreciate some help. 
>> First, the data was collected from 17 users, each performing 5 repetitions 
>> of each of 16 gesture classes using motion sensors. The feature vector for 
>> each sample is a sequence of discretized directions calculated from Bézier 
>> control points after curve-fitting the gesture trace. 
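
The direction features described above might be produced by something like the following; this is a hypothetical sketch of angle binning, not Nick's actual pipeline:

```python
import math

def quantize_direction(dx, dy, n_bins=16):
    """Bin a 2-D displacement into one of n_bins discrete directions.

    The angle of the segment between consecutive points (e.g. Bezier
    control points) is mapped to one of n_bins equal angular sectors.
    The exact pipeline details are assumed, not taken from the thread.
    """
    angle = math.atan2(dy, dx) % (2 * math.pi)   # angle in [0, 2*pi)
    return int(angle / (2 * math.pi / n_bins)) % n_bins
```

With 16 bins, a rightward step maps to direction 0, a diagonal up-right step to direction 2, and a leftward step to direction 8.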
>> 
>> For a baseline, I fed the data to 16 10-state HMMs for training and again 
>> for testing. The classification accuracy achieved is 95.7%. 
>> 
>> For HTM, I created 16 CLA models using parameters from a medium swarm. I ran 
>> the data through the models for training where each model is trained on only 
>> 1 gesture class. For testing, I fed the same data again with learning turned 
>> off and recorded the anomaly score (averaged across each sequence) for each 
>> model. Classification was done by seeking the model with the minimum anomaly 
>> score. Accuracy turned out to be a puzzling 0.0%!!
>> 
>> Below is the relevant section of the code. I would appreciate any hints. 
>> Thanks,
>> Nick
>> 
>> def run_experiment():
>>     print "Running experiment..."
>> 
>>     model = [0]*16
>>     for i in range(0, 16):
>>         model[i] = ModelFactory.create(model_params, logLevel=0)
>>         model[i].enableInference({"predictedField": FIELD_NAME})
>> 
>>     with open(FILE_PATH, "rb") as f:
>>         csv_reader = csv.reader(f)
>>         data = []
>>         labels = []
>>         for row in csv_reader:
>>             r = [int(item) for item in row[:-1]]
>>             data.append(r)
>>             labels.append(int(row[-1]))
>> 
>>         # data_train, data_test, labels_train, labels_test = cross_validation.train_test_split(data, labels, test_size=0.4, random_state=0)
>>         data_train = data
>>         data_test = data
>>         labels_train = labels
>>         labels_test = labels
>> 
>>     for passes in range(0, TRAINING_PASSES):
>>         sample = 0
>>         for (ind, row) in enumerate(data_train):
>>             for r in row:
>>                 value = int(r)
>>                 result = model[labels_train[ind]].run({FIELD_NAME: value, '_learning': True})
>>                 prediction = result.inferences["multiStepBestPredictions"][1]
>>                 anomalyScore = result.inferences["anomalyScore"]
>>             model[labels_train[ind]].resetSequenceStates()
>>             sample += 1
>>             print "Processing training sample %i" % sample
>>             if sample == 100:
>>                 break
>> 
>>     sample = 0
>>     labels_predicted = []
>>     for row in data_test:
>>         anomaly = [0]*16
>>         for i in range(0, 16):
>>             model[i].resetSequenceStates()
>>             for r in row:
>>                 value = int(r)
>>                 result = model[i].run({FIELD_NAME: value, '_learning': False})
>>                 prediction = result.inferences["multiStepBestPredictions"][1]
>>                 anomalyScore = result.inferences["anomalyScore"]
>>                 # print value, prediction, anomalyScore
>>                 if value == int(prediction) and anomalyScore == 0:
>>                     # print "No prediction made"
>>                     anomalyScore = 1
>>                 anomaly[i] += anomalyScore
>>             anomaly[i] /= len(row)
>>         sample += 1
>>         print "Processing testing sample %i" % sample
>>         labels_predicted.append(np.min(np.array(anomaly)))
>>         print anomaly, row[-1]
>>         if sample == 100:
>>             break
>> 
>>     accuracy = np.sum(np.array(labels_predicted) == np.array(labels_test)) * 100.0 / len(labels_test)
>>     print "Testing accuracy is %0.2f" % accuracy
>> 
>> 
> 
