I think you are right. If you don't get good SDRs out of the SP, then
the rest of the system won't work well. Evaluating the quality of the
SDRs is usually one of my first steps in any difficult HTM debugging
process. The way I test it is:
a) to see that the distribution of winning columns is relatively even.
   You don't want a small number of columns winning all the time.
b) to see that small changes in the input cause small changes in the
   output.
c) you'd like to get unique outputs for very different patterns.
   Testing with the KNN on the training set is a decent way to check
   this. You should get close to 100% on the training set. This is
   similar in spirit to a) above.
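As a rough sketch of checks a) to c), assuming the SP outputs have
already been collected as binary NumPy arrays (one row per input, one
entry per column). The helper names below and the `encode` callable are
hypothetical, not part of NuPIC, and the nearest-neighbour function is a
plain overlap-based stand-in for the KNN classifier, not its actual
implementation.

```python
import numpy as np

def column_usage(sdrs):
    """Check a): fraction of inputs for which each column was active."""
    sdrs = np.asarray(sdrs, dtype=bool)
    return sdrs.mean(axis=0)

def overlap(a, b):
    """Number of columns active in both SDRs."""
    return int(np.count_nonzero(np.logical_and(a, b)))

def flip_bits(x, fraction, rng):
    """Return a copy of binary input x with a given fraction of bits flipped."""
    x = np.asarray(x, dtype=bool).copy()
    n_flip = int(round(fraction * x.size))
    idx = rng.choice(x.size, size=n_flip, replace=False)
    x[idx] = ~x[idx]
    return x

def noise_overlap(encode, x, fraction, rng):
    """Check b): overlap between the SDR of x and the SDR of a noisy copy,
    as a fraction of the active columns in the clean SDR.
    `encode` is a hypothetical callable mapping an input to a binary SDR."""
    clean = np.asarray(encode(x), dtype=bool)
    noisy = np.asarray(encode(flip_bits(x, fraction, rng)), dtype=bool)
    active = np.count_nonzero(clean)
    return overlap(clean, noisy) / active if active else 0.0

def training_set_accuracy(sdrs, labels):
    """Check c): classify each training SDR by its highest-overlap training
    SDR (itself included). Errors indicate SDRs that are not unique."""
    sdrs = np.asarray(sdrs, dtype=np.int32)
    labels = np.asarray(labels)
    overlaps = sdrs @ sdrs.T              # pairwise overlap counts
    predicted = labels[np.argmax(overlaps, axis=1)]
    return float(np.mean(predicted == labels))
```

A strongly skewed `column_usage` result would point to a small set of
columns winning all the time, and a `noise_overlap` value that drops
sharply for a small `fraction` would suggest the output is not stable
under small input changes.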
To answer your other question, there's been a fair amount of discussion
on the list already on the differences between HTM and other networks.
We used to have hierarchy, but the current version of NuPIC does not
support hierarchies well, so that is a big difference. As such I do not
recommend using NuPIC today for image recognition problems. (This is a
temporary issue as we want to rethink the way we implement
hierarchies.) Some advantages of the current implementation of HTMs are
their ability to handle streaming data, complex temporal sequences, and
continuous learning. The biggest advantage though is that it is a very
good starting point for understanding computational principles of the
cortex. We believe that HTMs are on the path to understanding all the
other components of intelligence such as sensorimotor inference,
attention, goal-directed behavior, etc. If you believe that
understanding the cortex deeply is the way to build intelligent
systems, then HTMs are the way to go!
--Subutai
On Mon, Jan 19, 2015 at 11:14 PM, <[email protected]> wrote:
Hello, Subutai:
Thanks for your reply. It is really helpful. Actually, I tried to use
it before, but it seemed there was something wrong with the code. After
I fixed the error and ran it, I found the classifier had learned just 1
pattern. Now it is working. Great.
The only thing that is not so good is that it is really slow, because
the program reads the images from 60000+10000 files. I think it would
be more efficient to read the images from the dataset files directly.
Currently I find the SDR I get is not very distributed, because I set a
high localAreaDensity value in order to get high recognition accuracy.
Even so, the best result I got is about 89%. :(
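Reading the images from the dataset files directly could look roughly
like the sketch below. It assumes the standard MNIST IDX layout
(big-endian header, magic number 2051 for image files and 2049 for
label files, then raw unsigned bytes). The function names are
illustrative and are not taken from the image_test code.

```python
import gzip
import struct
import numpy as np

def parse_idx_images(buf):
    """Parse an IDX image buffer into a (count, rows, cols) uint8 array."""
    magic, count, rows, cols = struct.unpack_from(">IIII", buf, 0)
    if magic != 2051:
        raise ValueError("not an IDX image file (magic=%d)" % magic)
    data = np.frombuffer(buf, dtype=np.uint8,
                         count=count * rows * cols, offset=16)
    return data.reshape(count, rows, cols)

def parse_idx_labels(buf):
    """Parse an IDX label buffer into a (count,) uint8 array."""
    magic, count = struct.unpack_from(">II", buf, 0)
    if magic != 2049:
        raise ValueError("not an IDX label file (magic=%d)" % magic)
    return np.frombuffer(buf, dtype=np.uint8, count=count, offset=8)

def read_file(path):
    """Read a whole file as bytes, transparently handling .gz files."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rb") as f:
        return f.read()
```

With the usual file names, `parse_idx_images(read_file("train-images-idx3-ubyte.gz"))`
would load the whole training set in one read instead of opening one
file per image.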
I have some questions about the SDR and TP. Currently, I am testing
SP+KNN because I think the SDR is really important for TP. If the SDR
from the SP is not good enough (the KNN classifier didn't recognize it
as the right one), it would lead TP the wrong way when training with
frame sequences (same pattern). So, how good should the SDR be for TP?
And if we get a bad SDR (KNN recognizes it as another pattern), how
much will it affect TP and the CLA classifier? The last question is
about the HTM model. Recently, some deep learning models have become
very popular in academia, like CNN (convolutional neural network) and
RNN-LSTM (recurrent neural network - long short-term memory). If we
compare HTM with them, what advantages and disadvantages does HTM have?
Thank you.
An Qi
On Mon, 19 Jan 2015 17:54:16 -0800, Subutai Ahmad <[email protected]> wrote:
Hi An,
Please see [1]. It gets 95.5% accuracy. However, please note this is a
very simplistic system (just SP+KNN). It does not incorporate
hierarchy, temporal pooling, or any sort of learning of invariances.
(BTW, anything less than 99% is not considered very good for MNIST.
MNIST is all about getting those last few corner cases! :-)
--Subutai
[1] https://github.com/numenta/nupic.research/tree/master/image_test
On Sat, Jan 17, 2015 at 11:00 PM, <[email protected]> wrote:
Hello.
Sorry for the last email. Thx to the rich formatting :( ... I have to
type again.
Recently, I got the result of the test. I followed the source code and
built the Spatial Pooler + KNN classifier. Then I extracted images from
the MNIST dataset (train/test: 60000/10000) and passed them to the
model. I tried different parameters (using a small dataset, train/test:
6000/1000); the best recognition result was about 87.6%. After that, I
tried the full-size MNIST dataset, and the result was 89.6%. Currently,
this is the best result I have.
Here are the statistics. They show the error counts for each digit.
Each row represents the input digit; each column represents the
recognition result. Most of the "7"s are recognized as "9". It seems
the SDR from the SP is still not good enough for the classifier.
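For reference, a table with this layout (rows for the input digit,
columns for the recognition result) can be built along these lines.
This is only a sketch with hypothetical helper names; it assumes the
true and predicted labels are available as integer sequences and is not
the code that produced the statistics above.

```python
import numpy as np

def confusion_matrix(true_labels, predicted_labels, num_classes=10):
    """Count matrix: rows are the input digit, columns the recognized digit."""
    true_labels = np.asarray(true_labels, dtype=int)
    predicted_labels = np.asarray(predicted_labels, dtype=int)
    m = np.zeros((num_classes, num_classes), dtype=int)
    # add.at accumulates correctly even when (row, col) pairs repeat
    np.add.at(m, (true_labels, predicted_labels), 1)
    return m

def most_confused(m):
    """Return (input digit, recognized digit, count) for the largest
    off-diagonal entry, i.e. the most frequent kind of error."""
    errors = m.copy()
    np.fill_diagonal(errors, 0)
    t, p = np.unravel_index(np.argmax(errors), errors.shape)
    return int(t), int(p), int(errors[t, p])

def accuracy(m):
    """Overall accuracy: correct predictions (diagonal) over all predictions."""
    total = m.sum()
    return float(np.trace(m)) / total if total else 0.0
```

`most_confused` picks out the single largest error cell, which is the
kind of entry the "7" recognized as "9" observation refers to.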
I found some interesting things. When I set "inputDimensions" and
"columnDimensions" to "784" and "1024", the result is around 68%. If I
use "(28,28)" and "(32,32)" and keep the others the same, the result is
around 82%. That's a big difference. It seems the array shape affects
the SP a lot.
Did anyone get a better result? Does anyone have suggestions about the
parameters or anything else?
Thank you.
An Qi
Tokyo University of Agriculture and Technology - Nakagawa Laboratory
2-24-16 Naka-cho, Koganei-shi, Tokyo 184-8588
[email protected]