Ok, I now better understand how you encode the gestures. But I still think my argument is valid: if you want to generalize, you need semantics to generalize on. Maybe I didn't understand well what kind of generalization you want to achieve.
Francisco

On 29.12.2014, at 17:38, Nicholas Mitri <[email protected]> wrote:

> Thanks for your comments Francisco,
>
> I should've explained better. Gestures here refer to the shapes drawn by the user's hand as he/she moves a smartphone. The result is a flattened 3D trace whose trajectory is estimated using motion sensors. The feature vector of every gesture is subsequently a sequence of directions from one control vertex of the trace to the next. Think of it as a piecewise-linear trace represented by discretized directions, e.g. the trace of '5' is 2->3->0->3->2 if we're using 4 directions and start at the top.
>
> That's the data I'm working with, so there's very little semantic depth to consider. Encoders here are needed more for their quantization/pooling functionality than anything else.
>
> best,
> Nick
>
>> On Dec 29, 2014, at 6:23 PM, Francisco Webber <[email protected]> wrote:
>>
>> Hello Nick,
>> What you are trying to do sounds very interesting. My guess is that the poor generalization is due to the fact that not enough semantics are captured during the encoding step. As you might know, we work in the domain of language processing, where the semantic depth of the SDRs is key.
>> In your case, the semantics of the system are defined by the way a (human) body looks and its degrees of freedom to move.
>> What you should try to achieve is to capture some of this semantic context in your encoding process. The SDRs representing the body positions (or movements) should be formed in a way that similar positions (gestures) have similar SDRs (many overlapping bits). The better you are able to realize this encoding, the better the HTM will be able to generalize.
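Francisco's overlap criterion can be made concrete with a small sketch. Everything below is invented for illustration (SDR size, active-bit indices, and the gesture labels are hypothetical; this is not Cortical.io's or NuPIC's encoder), but it shows the property he describes: similar inputs should yield SDRs with many shared active bits.

```python
def overlap(sdr_a, sdr_b):
    """Number of bits active in both SDRs (size of the set intersection)."""
    return len(set(sdr_a) & set(sdr_b))

# Toy SDRs given as lists of active-bit indices in a (hypothetical) 2048-bit space
gesture_a = [3, 17, 52, 99, 340, 771]      # e.g. "arm raised high"
gesture_b = [3, 17, 52, 101, 340, 902]     # e.g. "arm raised slightly lower"
gesture_c = [5, 200, 413, 608, 944, 1501]  # e.g. an unrelated gesture

print(overlap(gesture_a, gesture_b))  # similar gestures share many bits -> 4
print(overlap(gesture_a, gesture_c))  # dissimilar gestures share none -> 0
```

The point of the sketch: an encoder that maps nearby body positions to high-overlap SDRs (as with gesture_a and gesture_b) gives the HTM something to generalize over, while an encoder that scatters them (as with gesture_c) does not.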
>> In language processing, we were able to create classifiers that needed only 4 example sentences like:
>>
>> "Erwin Schrödinger is a physicist."
>> "Marie Curie is a physicist."
>> "Niels Bohr is a physicist."
>> "James Maxwell is a physicist."
>>
>> to give the following response: "Albert Einstein is a" PHYSICIST
>>
>> In my experience, measurable similarity among SDRs encoded to represent similar data seems to be key for an HTM network to unfold its full power.
>>
>> Francisco
>>
>> On 29.12.2014, at 16:25, Nicholas Mitri <[email protected]> wrote:
>>
>>> Hey Matt, everyone,
>>>
>>> I debugged the code and managed to get some sensible results. HTM is doing a great job of learning sequences but performing very poorly at generalization. So while it can recognize a sequence it had learned with high accuracy, when it's fed a test sequence that it's never seen, its classification accuracy plummets. To be clear, classification here is performed by assigning an HTM region to each class and observing which region outputs the lowest anomaly score averaged over a test sequence.
>>>
>>> I've tried tweaking the encoder parameters to quantize the input with a lower resolution in the hope that similar inputs will be better pooled. That didn't pan out. Also, changing the encoder output length or the number of columns causes the HTM to output no predictions at times, even with a non-empty active column list. I have little idea why that keeps happening.
>>>
>>> Any hints as to how to get HTM to perform better here? I've included HMM results for comparison. SVM results are all 95+%.
>>>
>>> Thank you,
>>> Nick
>>>
>>>
>>> HTM Results:
>>>
>>> Data = sequence of directions (8 discrete directions)
>>> Note on accuracy: M1/M2 is shown here to represent 2 performance metrics. M1 is the average anomaly; M2 is the sum of the normalized average anomaly and the normalized prediction error.
>>>
>>> Base training accuracy: 100% at 2 training passes
>>> User Dependent: 56.25% / 56.25%
>>> User Independent: N/A
>>> Mixed: 65.00% / 71.25%
>>>
>>> HMM (22-state) Results:
>>>
>>> Data = sequence of directions (16 discrete directions)
>>>
>>> Base training accuracy: 97.5%
>>> User Dependent: 76.25%
>>> User Independent: 88.75%
>>> Mixed: 88.75%
>>>
>>>> On Dec 11, 2014, at 7:16 PM, Matthew Taylor <[email protected]> wrote:
>>>>
>>>> Nicholas, can you paste a sample of the input data file?
>>>>
>>>> ---------
>>>> Matt Taylor
>>>> OS Community Flag-Bearer
>>>> Numenta
>>>>
>>>> On Thu, Dec 11, 2014 at 7:50 AM, Nicholas Mitri <[email protected]> wrote:
>>>> Hey all,
>>>>
>>>> I'm running into some trouble using HTM for a gesture recognition application and would appreciate some help.
>>>> First, the data was collected from 17 users performing 5 gestures of each of 16 different gesture classes using motion sensors. The feature vector for each sample is a sequence of discretized directions calculated from Bezier control points after curve-fitting the gesture trace.
>>>>
>>>> For a baseline, I fed the data to 16 10-state HMMs for training and again for testing. The classification accuracy achieved is 95.7%.
>>>>
>>>> For HTM, I created 16 CLA models using parameters from a medium swarm. I ran the data through the models for training, with each model trained on only 1 gesture class. For testing, I fed the same data again with learning turned off and recorded the anomaly score (averaged across each sequence) for each model. Classification was done by seeking the model with the minimum anomaly score. Accuracy turned out to be a puzzling 0.0%!!
>>>>
>>>> Below is the relevant section of the code. I would appreciate any hints.
>>>> Thanks,
>>>> Nick
>>>>
>>>> def run_experiment():
>>>>     print "Running experiment..."
>>>>
>>>>     model = [0]*16
>>>>     for i in range(0, 16):
>>>>         model[i] = ModelFactory.create(model_params, logLevel=0)
>>>>         model[i].enableInference({"predictedField": FIELD_NAME})
>>>>
>>>>     with open(FILE_PATH, "rb") as f:
>>>>         csv_reader = csv.reader(f)
>>>>         data = []
>>>>         labels = []
>>>>         for row in csv_reader:
>>>>             r = [int(item) for item in row[:-1]]
>>>>             data.append(r)
>>>>             labels.append(int(row[-1]))
>>>>
>>>>     # data_train, data_test, labels_train, labels_test = cross_validation.train_test_split(data, labels, test_size=0.4, random_state=0)
>>>>     data_train = data
>>>>     data_test = data
>>>>     labels_train = labels
>>>>     labels_test = labels
>>>>
>>>>     for passes in range(0, TRAINING_PASSES):
>>>>         sample = 0
>>>>         for (ind, row) in enumerate(data_train):
>>>>             for r in row:
>>>>                 value = int(r)
>>>>                 result = model[labels_train[ind]].run({FIELD_NAME: value, '_learning': True})
>>>>                 prediction = result.inferences["multiStepBestPredictions"][1]
>>>>                 anomalyScore = result.inferences["anomalyScore"]
>>>>             model[labels[ind]].resetSequenceStates()
>>>>             sample += 1
>>>>             print "Processing training sample %i" % sample
>>>>             if sample == 100:
>>>>                 break
>>>>
>>>>     sample = 0
>>>>     labels_predicted = []
>>>>     for row in data_test:
>>>>         anomaly = [0]*16
>>>>         for i in range(0, 16):
>>>>             model[i].resetSequenceStates()
>>>>             for r in row:
>>>>                 value = int(r)
>>>>                 result = model[i].run({FIELD_NAME: value, '_learning': False})
>>>>                 prediction = result.inferences["multiStepBestPredictions"][1]
>>>>                 anomalyScore = result.inferences["anomalyScore"]
>>>>                 # print value, prediction, anomalyScore
>>>>                 if value == int(prediction) and anomalyScore == 0:
>>>>                     # print "No prediction made"
>>>>                     anomalyScore = 1
>>>>                 anomaly[i] += anomalyScore
>>>>             anomaly[i] /= len(row)
>>>>         sample += 1
>>>>         print "Processing testing sample %i" % sample
>>>>         labels_predicted.append(np.min(np.array(anomaly)))
>>>>         print anomaly, row[-1]
>>>>         if sample == 100:
>>>>             break
>>>>
>>>>     accuracy = np.sum(np.array(labels_predicted) == np.array(labels_test)) * 100.0 / len(labels_test)
>>>>     print "Testing accuracy is %0.2f" % accuracy
>>>>
>>>
>>
>
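One likely culprit behind the puzzling 0.0% in the code above: `labels_predicted.append(np.min(np.array(anomaly)))` stores the smallest anomaly *value*, while the final accuracy line compares those values against integer class labels. `np.argmin` returns the *index* of the minimum, i.e. the winning model. A minimal sketch of the difference, using invented per-class anomaly scores:

```python
import numpy as np

# Hypothetical average anomaly scores for one test sequence, one per class model
anomaly = [0.91, 0.43, 0.88, 0.12, 0.77]

# np.min yields the lowest score itself; comparing it to class labels fails
lowest_score = np.min(anomaly)          # -> 0.12

# np.argmin yields the index of the lowest score, i.e. the predicted class
predicted_label = int(np.argmin(anomaly))  # -> 3

print(lowest_score, predicted_label)
```

Whether this was the actual bug Nick later fixed isn't stated in the thread, but the min/argmin distinction matches the symptom of accuracy pinned at exactly 0.0%.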
