Ok, I now better understand how you encode the gestures. But I still think my argument is valid: if you want to generalize, you need semantics to generalize on. Maybe I didn't understand well what kind of generalization you want to achieve.
Francisco

On 29.12.2014, at 17:38, Nicholas Mitri <[email protected]> wrote:

> Thanks for your comments Francisco,
>
> I should've explained better. Gestures here refer to the shapes drawn by the user's hand as he/she moves a smartphone. The result is a flattened 3D trace whose trajectory is estimated using motion sensors. The feature vector of every gesture is subsequently a sequence of directions from one control vertex of the trace to the next. Think of it as a piecewise-linear trace represented by discretized directions, e.g. the trace of '5' is 2->3->0->3->2 if we're using 4 directions and start at the top.
>
> That's the data I'm working with, so there's very little semantic depth to consider. Encoders here are needed more for their quantization/pooling functionality than anything else.
>
> best,
> Nick
>
>> On Dec 29, 2014, at 6:23 PM, Francisco Webber <[email protected]> wrote:
>>
>> Hello Nick,
>> What you are trying to do sounds very interesting. My guess is that the poor generalization is due to the fact that not enough semantics are captured during the encoding step. As you might know, we work in the domain of language processing, where the semantic depth of the SDRs is key.
>> In your case, the semantics of the system are defined by the way a (human) body looks and its degrees of freedom to move.
>> What you should try to achieve is to capture some of this semantic context in your encoding process. The SDRs representing the body positions (or movements) should be formed in a way that similar positions (gestures) have similar SDRs (many overlapping bits). The better you are able to realize this encoding, the better the HTM will be able to generalize.
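Francisco's overlap criterion can be made concrete with a small sketch. Everything below is invented for illustration (SDR size, active-bit indices, and the gesture labels are hypothetical; this is not Cortical.io's or NuPIC's encoder), but it shows the property he describes: similar inputs should yield SDRs with many shared active bits.

```python
def overlap(sdr_a, sdr_b):
    """Number of bits active in both SDRs (size of the set intersection)."""
    return len(set(sdr_a) & set(sdr_b))

# Toy SDRs given as lists of active-bit indices in a (hypothetical) 2048-bit space
gesture_a = [3, 17, 52, 99, 340, 771]      # e.g. "arm raised high"
gesture_b = [3, 17, 52, 101, 340, 902]     # e.g. "arm raised slightly lower"
gesture_c = [5, 200, 413, 608, 944, 1501]  # e.g. an unrelated gesture

print(overlap(gesture_a, gesture_b))  # similar gestures share many bits -> 4
print(overlap(gesture_a, gesture_c))  # dissimilar gestures share none -> 0
```

The point of the sketch: an encoder that maps nearby body positions to high-overlap SDRs (as with gesture_a and gesture_b) gives the HTM something to generalize over, while an encoder that scatters them (as with gesture_c) does not.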
>> In language processing, we were able to create classifiers that needed only 4 example sentences like:
>>
>> "Erwin Schrödinger is a physicist."
>> "Marie Curie is a physicist."
>> "Niels Bohr is a physicist."
>> "James Maxwell is a physicist."
>>
>> to give the following response: "Albert Einstein is a" PHYSICIST
>>
>> In my experience, measurable similarity among SDRs encoded to represent similar data seems to be key for an HTM network to unfold its full power.
>>
>> Francisco
>>
>> On 29.12.2014, at 16:25, Nicholas Mitri <[email protected]> wrote:
>>
>>> Hey Matt, everyone,
>>>
>>> I debugged the code and managed to get some sensible results. HTM is doing a great job of learning sequences but performing very poorly at generalization. So while it can recognize a sequence it had learned with high accuracy, when it's fed a test sequence that it's never seen, its classification accuracy plummets. To be clear, classification here is performed by assigning an HTM region to each class and observing which region outputs the lowest anomaly score averaged over a test sequence.
>>>
>>> I've tried tweaking the encoder parameters to quantize the input with a lower resolution in the hope that similar inputs will be better pooled. That didn't pan out. Also, changing the encoder output length or the number of columns causes the HTM to output no predictions at times, even with a non-empty active column list. I have little idea why that keeps happening.
>>>
>>> Any hints as to how to get HTM to perform better here? I've included HMM results for comparison. SVM results are all 95+%.
>>>
>>> Thank you,
>>> Nick
>>>
>>>
>>> HTM Results:
>>>
>>> Data = sequence of directions (8 discrete directions)
>>> Note on accuracy: M1/M2 is shown here to represent 2 performance metrics. M1 is the average anomaly; M2 is the sum of the normalized average anomaly and the normalized prediction error.
>>>
>>> Base training accuracy: 100% at 2 training passes
>>> User Dependent: 56.25% / 56.25%
>>> User Independent: N/A
>>> Mixed: 65.00% / 71.25%
>>>
>>> HMM (22-state) Results:
>>>
>>> Data = sequence of directions (16 discrete directions)
>>>
>>> Base training accuracy: 97.5%
>>> User Dependent: 76.25%
>>> User Independent: 88.75%
>>> Mixed: 88.75%
>>>
>>>> On Dec 11, 2014, at 7:16 PM, Matthew Taylor <[email protected]> wrote:
>>>>
>>>> Nicholas, can you paste a sample of the input data file?
>>>>
>>>> ---------
>>>> Matt Taylor
>>>> OS Community Flag-Bearer
>>>> Numenta
>>>>
>>>> On Thu, Dec 11, 2014 at 7:50 AM, Nicholas Mitri <[email protected]> wrote:
>>>> Hey all,
>>>>
>>>> I'm running into some trouble using HTM for a gesture recognition application and would appreciate some help.
>>>> First, the data was collected from 17 users performing 5 gestures of each of 16 different gesture classes using motion sensors. The feature vector for each sample is a sequence of discretized directions calculated from Bezier control points after curve-fitting the gesture trace.
>>>>
>>>> For a baseline, I fed the data to 16 10-state HMMs for training and again for testing. The classification accuracy achieved is 95.7%.
>>>>
>>>> For HTM, I created 16 CLA models using parameters from a medium swarm. I ran the data through the models for training, with each model trained on only 1 gesture class. For testing, I fed the same data again with learning turned off and recorded the anomaly score (averaged across each sequence) for each model. Classification was done by seeking the model with the minimum anomaly score. Accuracy turned out to be a puzzling 0.0%!!
>>>>
>>>> Below is the relevant section of the code. I would appreciate any hints.
>>>> Thanks,
>>>> Nick
>>>>
>>>> def run_experiment():
>>>>     print "Running experiment..."
>>>>
>>>>     model = [0]*16
>>>>     for i in range(0, 16):
>>>>         model[i] = ModelFactory.create(model_params, logLevel=0)
>>>>         model[i].enableInference({"predictedField": FIELD_NAME})
>>>>
>>>>     with open(FILE_PATH, "rb") as f:
>>>>         csv_reader = csv.reader(f)
>>>>         data = []
>>>>         labels = []
>>>>         for row in csv_reader:
>>>>             r = [int(item) for item in row[:-1]]
>>>>             data.append(r)
>>>>             labels.append(int(row[-1]))
>>>>
>>>>     # data_train, data_test, labels_train, labels_test = cross_validation.train_test_split(data, labels, test_size=0.4, random_state=0)
>>>>     data_train = data
>>>>     data_test = data
>>>>     labels_train = labels
>>>>     labels_test = labels
>>>>
>>>>     for passes in range(0, TRAINING_PASSES):
>>>>         sample = 0
>>>>         for (ind, row) in enumerate(data_train):
>>>>             for r in row:
>>>>                 value = int(r)
>>>>                 result = model[labels_train[ind]].run({FIELD_NAME: value, '_learning': True})
>>>>                 prediction = result.inferences["multiStepBestPredictions"][1]
>>>>                 anomalyScore = result.inferences["anomalyScore"]
>>>>             model[labels[ind]].resetSequenceStates()
>>>>             sample += 1
>>>>             print "Processing training sample %i" % sample
>>>>             if sample == 100:
>>>>                 break
>>>>
>>>>     sample = 0
>>>>     labels_predicted = []
>>>>     for row in data_test:
>>>>         anomaly = [0]*16
>>>>         for i in range(0, 16):
>>>>             model[i].resetSequenceStates()
>>>>             for r in row:
>>>>                 value = int(r)
>>>>                 result = model[i].run({FIELD_NAME: value, '_learning': False})
>>>>                 prediction = result.inferences["multiStepBestPredictions"][1]
>>>>                 anomalyScore = result.inferences["anomalyScore"]
>>>>                 # print value, prediction, anomalyScore
>>>>                 if value == int(prediction) and anomalyScore == 0:
>>>>                     # print "No prediction made"
>>>>                     anomalyScore = 1
>>>>                 anomaly[i] += anomalyScore
>>>>             anomaly[i] /= len(row)
>>>>         sample += 1
>>>>         print "Processing testing sample %i" % sample
>>>>         labels_predicted.append(np.min(np.array(anomaly)))
>>>>         print anomaly, row[-1]
>>>>         if sample == 100:
>>>>             break
>>>>
>>>>     accuracy = np.sum(np.array(labels_predicted) == np.array(labels_test)) * 100.0 / len(labels_test)
>>>>     print "Testing accuracy is %0.2f" % accuracy
>>>>
>>>
>>
>
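One likely culprit behind the puzzling 0.0% in the code above: `labels_predicted.append(np.min(np.array(anomaly)))` stores the smallest anomaly *value*, while the final accuracy line compares those values against integer class labels. `np.argmin` returns the *index* of the minimum, i.e. the winning model. A minimal sketch of the difference, using invented per-class anomaly scores:

```python
import numpy as np

# Hypothetical average anomaly scores for one test sequence, one per class model
anomaly = [0.91, 0.43, 0.88, 0.12, 0.77]

# np.min yields the lowest score itself; comparing it to class labels fails
lowest_score = np.min(anomaly)          # -> 0.12

# np.argmin yields the index of the lowest score, i.e. the predicted class
predicted_label = int(np.argmin(anomaly))  # -> 3

print(lowest_score, predicted_label)
```

Whether this was the actual bug Nick later fixed isn't stated in the thread, but the min/argmin distinction matches the symptom of accuracy pinned at exactly 0.0%.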
