Hi An, I think you are right. If you don't get good SDRs out of the SP, then the rest of the system won't work well. Evaluating the quality of the SDRs is usually one of my first steps in any difficult HTM debugging process. The way I test it is:
a) to see that the distribution of winning columns is relatively even. You don't want a small number of columns winning all the time.
b) to see that small changes in the input cause small changes in the output.
c) you'd like to get unique outputs for very different patterns. Testing with the KNN on the training set is a decent way to check this. You should get close to 100% on the training set. This is similar in spirit to a) above.

To answer your other question, there's been a fair amount of discussion on the list already on the differences between HTM and other networks. We used to have hierarchy before, but the current version of NuPIC does not support hierarchies well, so that is a big difference. As such, I do not recommend using NuPIC today for image recognition problems. (This is a temporary issue, as we want to rethink the way we implement hierarchies.)

Some advantages of the current implementation of HTMs are their ability to handle streaming data, complex temporal sequences, and continuous learning. The biggest advantage, though, is that it is a very good starting point for understanding the computational principles of the cortex. We believe that HTMs are on the path to understanding all the other components of intelligence, such as sensorimotor inference, attention, goal-directed behavior, etc. If you believe that understanding the cortex deeply is the way to build intelligent systems, then HTMs are the way to go!

--Subutai

On Mon, Jan 19, 2015 at 11:14 PM, <[email protected]> wrote:

> Hello, Subutai:
> Thanks for your reply. It is really helpful. Actually, I tried to use it
> before, but it seemed there was something wrong with the code. After I
> fixed the error and ran it, I found the classifier had learned just 1 pattern.
> Now it is working. Great.
>
> The only thing that is not so good is that it is really slow, because the
> program reads the images from 60000+10000 files. I think it would be more
> efficient to read images from the datasets directly.
> Currently I find the SDR I got is not very distributed, because I set a
> high localAreaDensity value in order to get a high recognition accuracy.
> Even so, the best result I got is about 89%. :(
>
> I have some questions about the SDR and TP. Currently, I am testing
> SP+KNN because I think the SDR is really important for TP. If the SDR
> from the SP is not good enough (the KNN classifier doesn't recognize it as
> the right one), it would lead TP down a different path when training with
> frame sequences (same pattern). So, how good should the SDR be for TP? And
> if we get a bad SDR (KNN recognizes it as another pattern), how much will it
> affect TP and the CLA classifier? The last question is about the HTM model.
> Recently, some deep learning models have become very popular in academia,
> like CNN (convolutional neural network) and RNN-LSTM (recurrent neural
> network - long short-term memory). If we compare HTM with them, what
> advantages and disadvantages does HTM have?
>
> Thank you.
>
> An Qi
>
>
> On Mon, 19 Jan 2015 17:54:16 -0800
> Subutai Ahmad <[email protected]> wrote
>
>> Hi An,
>>
>> Please see [1]. It gets 95.5% accuracy. However, please note this is a
>> very simplistic system (just SP+KNN). It does not incorporate hierarchy,
>> temporal pooling, or any sort of learning of invariances. (BTW, anything
>> less than 99% is not considered very good for MNIST. MNIST is all about
>> getting those last few corner cases! :-)
>>
>> --Subutai
>>
>> [1] https://github.com/numenta/nupic.research/tree/master/image_test
>>
>>
>> On Sat, Jan 17, 2015 at 11:00 PM, <[email protected]> wrote:
>>
>>> Hello.
>>>
>>> Sorry for the last email. Thx to the rich formatting :( ... I have to
>>> type again.
>>>
>>> Recently, I got the result of the test. I followed the source code and
>>> built the Spatial Pooler + KNN classifier. Then I extracted images from
>>> the MNIST dataset (train/test: 60000/10000) and passed them to the model.
>>> I tried to test with different parameters (using a small dataset,
>>> train/test: 6000/1000); the best recognition result was about 87.6%.
>>> After that, I tried the full-size MNIST dataset, and the result was 89.6%.
>>> Currently, this is the best result I have.
>>>
>>> Here are the statistics. They show the error counts for each digit. The
>>> row represents the input digit; the column represents the recognition
>>> result. Most of the "7"s are recognized as "9". It seems the SDR from the
>>> SP is still not good enough for the classifier.
>>>
>>> I found some interesting things. When I set "inputDimensions" and
>>> "columnDimensions" to "784" and "1024", the result is around 68%.
>>> If I use "(28,28)" and "(32,32)" and keep everything else the same, the
>>> result is around 82%. That's a big difference. It seems the array shape
>>> affects the SP a lot.
>>>
>>> Did anyone get a better result? Does anyone have suggestions about
>>> the parameters or anything else?
>>>
>>> Thank you.
>>> An Qi
>>> Tokyo University of Agriculture and Technology - Nakagawa Laboratory
>>> 2-24-16 Naka-cho, Koganei-shi, Tokyo 184-8588
>>> [email protected]
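
Checks a) and b) from the reply at the top of this thread can be scripted roughly as follows. This is only a minimal sketch in plain NumPy, not NuPIC code. The helper names (column_usage, usage_summary, overlap_fraction, noise_robustness, make_toy_encoder) are hypothetical, and the toy encoder (random projection plus k-winners-take-all) is just a stand-in so the script runs on its own. With a real Spatial Pooler, `encode` would presumably be a thin wrapper that runs the SP on one input with learning turned off and returns the active columns as a 0/1 vector.

```python
import numpy as np


def column_usage(sdrs):
    """Fraction of inputs for which each column was active.

    sdrs: 2-D 0/1 array, one row per input, one entry per SP column.
    """
    sdrs = np.asarray(sdrs, dtype=bool)
    return sdrs.mean(axis=0)


def usage_summary(sdrs):
    """Check a): are the winning columns spread out relatively evenly?"""
    usage = column_usage(sdrs)
    mean = usage.mean()
    return {
        # Share of columns that never won for any input.
        "never_active": float((usage == 0).mean()),
        # How much more often the busiest column wins than the average one.
        "max_over_mean": float(usage.max() / mean) if mean > 0 else float("inf"),
    }


def overlap_fraction(a, b):
    """Shared active bits, as a fraction of the active bits in a."""
    a = np.asarray(a, dtype=bool)
    b = np.asarray(b, dtype=bool)
    n_active = a.sum()
    return float((a & b).sum() / n_active) if n_active else 0.0


def noise_robustness(encode, x, flip_fraction, rng):
    """Check b): flip a fraction of the input bits and compare the outputs."""
    x = np.asarray(x, dtype=bool)
    noisy = x.copy()
    n_flip = int(round(flip_fraction * x.size))
    idx = rng.choice(x.size, size=n_flip, replace=False)
    noisy[idx] = ~noisy[idx]
    return overlap_fraction(encode(x), encode(noisy))


def make_toy_encoder(n_inputs, n_columns, n_active, rng):
    """Hypothetical stand-in for a trained SP (NOT the SP algorithm)."""
    weights = rng.random((n_columns, n_inputs))

    def encode(x):
        scores = weights @ np.asarray(x, dtype=float)
        out = np.zeros(n_columns, dtype=bool)
        out[np.argsort(scores)[-n_active:]] = True  # keep the top-k columns
        return out

    return encode


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    encode = make_toy_encoder(n_inputs=784, n_columns=1024, n_active=20, rng=rng)
    inputs = rng.random((200, 784)) < 0.2  # random binary "images"
    sdrs = np.array([encode(x) for x in inputs])
    print(usage_summary(sdrs))
    print(noise_robustness(encode, inputs[0], flip_fraction=0.05, rng=rng))
```

A large "never_active" share or a large "max_over_mean" ratio would suggest that a few columns are winning all the time. A noise-robustness value that drops sharply for small flip fractions would suggest that small input changes are causing large output changes.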
