Thanks for your comments, Francisco — I should’ve explained better. Gestures here refer to the shapes drawn by the user’s hand as they move a smartphone. The result is a flattened 3D trace whose trajectory is estimated from motion sensors. The feature vector of every gesture is then a sequence of directions from one control vertex of the trace to the next. Think of it as a piecewise-linear trace represented by discretized directions: e.g., the trace of ‘5’ is 2->3->0->3->2 if we’re using 4 directions and start at the top.
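[Editor’s note: to make that encoding concrete, here is a toy sketch of turning the control vertices of a piecewise-linear trace into a chain code. This is an illustration only, not Nick’s actual pipeline; the function names and the 4-direction layout (0 = up, codes increasing clockwise) are assumptions.]

```python
import math

def quantize_direction(dx, dy, num_dirs=4):
    """Map a segment vector to one of num_dirs discrete direction codes.

    Code 0 points "up" (negative y in screen coordinates) and codes
    increase clockwise, so with 4 directions: 0=up, 1=right, 2=down, 3=left.
    """
    angle = math.atan2(dx, -dy)  # 0 rad = straight up, clockwise positive
    if angle < 0:
        angle += 2 * math.pi
    sector = 2 * math.pi / num_dirs
    # Shift by half a sector so each code owns the wedge centered on it.
    return int((angle + sector / 2) // sector) % num_dirs

def trace_to_chain_code(vertices, num_dirs=4):
    """Convert the control vertices of a piecewise-linear trace
    into a sequence of discretized direction codes."""
    return [quantize_direction(x1 - x0, y1 - y0, num_dirs)
            for (x0, y0), (x1, y1) in zip(vertices, vertices[1:])]

# An "L"-shaped trace in screen coordinates: down, then right -> [2, 1]
codes = trace_to_chain_code([(0, 0), (0, 2), (2, 2)])
```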
That’s the data I’m working with, so there’s very little semantic depth to consider. Encoders here are needed more for their quantization/pooling functionality than anything else.

best,
Nick

> On Dec 29, 2014, at 6:23 PM, Francisco Webber <[email protected]> wrote:
>
> Hello Nick,
> What you are trying to do sounds very interesting. My guess is that the poor
> generalization is due to the fact that there is not sufficient semantics
> captured during the encoding step. As you might know, we are working in the
> domain of language processing, where semantic depth of the SDRs is key.
> In your case the semantics of the system is defined by the way a (human) body
> looks and its degrees of freedom to move.
> What you should try to achieve is to capture some of this semantic context in
> your encoding process. The SDRs representing the body positions (or
> movements) should be formed in a way that similar positions (gestures) have
> similar SDRs (many overlapping bits). The better you are able to realize
> this encoding, the better the HTM will be able to generalize.
> In language processing, we were able to create classifiers that needed only 4
> example sentences like:
>
> “Erwin Schrödinger is a physicist.”
> “Marie Curie is a physicist.”
> “Niels Bohr is a physicist.”
> “James Maxwell is a physicist.”
>
> to give the following response: “Albert Einstein is a” PHYSICIST
>
> In my experience, measurable similarity among SDRs, encoded to represent
> similar data, seems to be key for an HTM network to unfold its full power.
>
> Francisco
>
> On 29.12.2014, at 16:25, Nicholas Mitri <[email protected]> wrote:
>
>> Hey Matt, everyone,
>>
>> I debugged the code and managed to get some sensible results. HTM is doing a
>> great job of learning sequences but performing very poorly at
>> generalization. So while it can recognize a sequence it has learned with
>> high accuracy, when it’s fed a test sequence it has never seen, its
>> classification accuracy plummets. To be clear, classification here is
>> performed by assigning an HTM region to each class and observing which
>> region outputs the lowest anomaly score averaged along a test sequence.
>>
>> I’ve tried tweaking the encoder parameters to quantize the input at a
>> lower resolution, in the hope that similar inputs would be better pooled.
>> That didn’t pan out. Also, changing the encoder output length or the number
>> of columns causes the HTM to output no predictions at times, even with a
>> non-empty active column list. I have little idea why that keeps happening.
>>
>> Any hints on how to get HTM to perform better here? I’ve included HMM
>> results for comparison. SVM results are all 95+%.
>>
>> Thank you,
>> Nick
>>
>> HTM Results:
>>
>> Data = sequence of directions (8 discrete directions)
>> Note on accuracy: M1/M2 denotes two performance metrics. M1 is the average
>> anomaly; M2 is the sum of the normalized average anomaly and the normalized
>> prediction error.
>>
>> Base training accuracy: 100% at 2 training passes
>> User dependent:   56.25% / 56.25%
>> User independent: N/A
>> Mixed:            65.00% / 71.25%
>>
>> HMM (22-state) Results:
>>
>> Data = sequence of directions (16 discrete directions)
>>
>> Base training accuracy: 97.5%
>> User dependent:   76.25%
>> User independent: 88.75%
>> Mixed:            88.75%
>>
>>> On Dec 11, 2014, at 7:16 PM, Matthew Taylor <[email protected]> wrote:
>>>
>>> Nicholas, can you paste a sample of the input data file?
>>>
>>> ---------
>>> Matt Taylor
>>> OS Community Flag-Bearer
>>> Numenta
>>>
>>> On Thu, Dec 11, 2014 at 7:50 AM, Nicholas Mitri <[email protected]> wrote:
>>> Hey all,
>>>
>>> I’m running into some trouble using HTM for a gesture recognition
>>> application and would appreciate some help.
>>> First, the data is collected from 17 users performing 5 gestures from each
>>> of 16 different gesture classes using motion sensors. The feature vector
>>> for each sample is a sequence of discretized directions calculated from
>>> Bezier control points after curve-fitting the gesture trace.
>>>
>>> For a baseline, I fed the data to 16 10-state HMMs for training and again
>>> for testing. The classification accuracy achieved is 95.7%.
>>>
>>> For HTM, I created 16 CLA models using parameters from a medium swarm. I
>>> ran the data through the models for training, where each model is trained
>>> on only 1 gesture class. For testing, I fed the same data again with
>>> learning turned off and recorded the anomaly score (averaged across each
>>> sequence) for each model. Classification was done by picking the model
>>> with the minimum anomaly score. Accuracy turned out to be a puzzling 0.0%!!
>>>
>>> Below is the relevant section of the code. I would appreciate any hints.
>>> Thanks,
>>> Nick
>>>
>>> def run_experiment():
>>>     print "Running experiment..."
>>>
>>>     model = [0]*16
>>>     for i in range(0, 16):
>>>         model[i] = ModelFactory.create(model_params, logLevel=0)
>>>         model[i].enableInference({"predictedField": FIELD_NAME})
>>>
>>>     with open(FILE_PATH, "rb") as f:
>>>         csv_reader = csv.reader(f)
>>>         data = []
>>>         labels = []
>>>         for row in csv_reader:
>>>             r = [int(item) for item in row[:-1]]
>>>             data.append(r)
>>>             labels.append(int(row[-1]))
>>>
>>>     # data_train, data_test, labels_train, labels_test = cross_validation.train_test_split(data, labels, test_size=0.4, random_state=0)
>>>     data_train = data
>>>     data_test = data
>>>     labels_train = labels
>>>     labels_test = labels
>>>
>>>     for passes in range(0, TRAINING_PASSES):
>>>         sample = 0
>>>         for (ind, row) in enumerate(data_train):
>>>             for r in row:
>>>                 value = int(r)
>>>                 result = model[labels_train[ind]].run({FIELD_NAME: value, '_learning': True})
>>>                 prediction = result.inferences["multiStepBestPredictions"][1]
>>>                 anomalyScore = result.inferences["anomalyScore"]
>>>             model[labels[ind]].resetSequenceStates()
>>>             sample += 1
>>>             print "Processing training sample %i" % sample
>>>             if sample == 100:
>>>                 break
>>>
>>>     sample = 0
>>>     labels_predicted = []
>>>     for row in data_test:
>>>         anomaly = [0]*16
>>>         for i in range(0, 16):
>>>             model[i].resetSequenceStates()
>>>             for r in row:
>>>                 value = int(r)
>>>                 result = model[i].run({FIELD_NAME: value, '_learning': False})
>>>                 prediction = result.inferences["multiStepBestPredictions"][1]
>>>                 anomalyScore = result.inferences["anomalyScore"]
>>>                 # print value, prediction, anomalyScore
>>>                 if value == int(prediction) and anomalyScore == 0:
>>>                     # print "No prediction made"
>>>                     anomalyScore = 1
>>>                 anomaly[i] += anomalyScore
>>>             anomaly[i] /= len(row)
>>>         sample += 1
>>>         print "Processing testing sample %i" % sample
>>>         labels_predicted.append(np.min(np.array(anomaly)))
>>>         print anomaly, row[-1]
>>>         if sample == 100:
>>>             break
>>>
>>>     accuracy = np.sum(np.array(labels_predicted) == np.array(labels_test))*100.0/len(labels_test)
>>>     print "Testing accuracy is %0.2f" % accuracy
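[Editor’s note: one thing worth checking in the test loop above is that `labels_predicted.append(np.min(np.array(anomaly)))` stores the minimum anomaly *value*, while `labels_test` holds integer class *indices*, which by itself would drive the reported accuracy toward 0%. `np.argmin` returns the index of the best-matching model instead. A standalone sketch of that decision rule, using synthetic anomaly scores rather than real model output:]

```python
import numpy as np

def classify_by_min_anomaly(anomaly_per_model):
    """Pick the class whose model reported the lowest average anomaly.

    anomaly_per_model: 1-D array of per-class average anomaly scores for a
    single test sequence. Returns the class *index* (not the score), so it
    can be compared directly against integer labels.
    """
    return int(np.argmin(anomaly_per_model))

# Synthetic scores for 16 class models: model 7 is least "surprised".
scores = np.array([0.90, 0.80, 0.95, 0.70, 0.85, 0.90, 0.60, 0.10,
                   0.75, 0.90, 0.80, 0.95, 0.70, 0.85, 0.90, 0.88])
predicted = classify_by_min_anomaly(scores)  # -> 7
```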

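[Editor’s note: Francisco’s point about overlap can be illustrated with a toy encoder — an editor’s sketch, not cortical.io’s retina or any NuPIC encoder. Encoding each of 8 directions as a contiguous block of active bits on a circular axis makes adjacent directions share bits, so similar gestures yield overlapping SDRs while dissimilar ones do not. The parameters (40 bits total, 9 active) are arbitrary.]

```python
def encode_direction(d, num_dirs=8, width=40, active=9):
    """Toy circular scalar encoder: direction code d -> set of active bit indices.

    Each direction activates a contiguous block of `active` bits centered on
    its slot, wrapping around `width`. Neighboring directions get partially
    overlapping blocks; opposite directions share no bits.
    """
    center = d * width // num_dirs  # with width=40, centers are 5 bits apart
    half = active // 2
    return {(center + i) % width for i in range(-half, half + 1)}

def overlap(d1, d2, **kw):
    """Number of active bits shared by the encodings of two directions."""
    return len(encode_direction(d1, **kw) & encode_direction(d2, **kw))

same     = overlap(2, 2)  # identical directions: full overlap
adjacent = overlap(2, 3)  # neighbors: partial overlap
opposite = overlap(2, 6)  # opposite directions: no overlap
```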