Hello!

In my quest to make an HTM-based reinforcement learner, I need a value
function approximator.
I could just use a multilayer perceptron (MLP), but there is a problem with
this: MLPs readily forget old information in order to assimilate new
information (this is called "catastrophic forgetting"). A way around this
is to store input/output pairs in an "experience" buffer and then train on
stochastic samples drawn from it. But this approach is inelegant, slow, and
requires a lot of memory.
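For contrast, the replay workaround described above looks roughly like this. A minimal Python sketch; the buffer capacity and batch size are illustrative assumptions, not anything from a specific library:

```python
import random
from collections import deque

class ReplayBuffer:
    """Bounded buffer of (input, output) pairs for an MLP to train on."""

    def __init__(self, capacity=10000):
        # Old pairs fall off the end once capacity is reached,
        # so memory use is bounded but old experience is eventually lost.
        self.buffer = deque(maxlen=capacity)

    def add(self, x, y):
        self.buffer.append((x, y))

    def sample(self, batch_size):
        # Stochastic sampling: each mini-batch mixes old and new pairs,
        # breaking the temporal correlation that causes forgetting.
        return random.sample(self.buffer, min(batch_size, len(self.buffer)))
```

Every training step then pays the cost of re-sampling and re-fitting old data, which is exactly the overhead I want to avoid.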

So, I have devised a new algorithm based on HTM's spatial pooler. I call it
SDRRBFNetwork (sparse distributed representation radial basis function
network).

Essentially, it performs unsupervised learning using the continuous spatial
pooling algorithm I developed, and then computes the output as a linear
combination of the resulting SDR.
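To make the idea concrete, here is a hedged sketch of the architecture (this is my own illustrative reconstruction, not the actual SDRRBFNetwork code; the unit count, sparsity k, radii, and learning rates are all assumptions): RBF units compete, the k most active ones form the SDR, and a linear readout over just those units produces the output.

```python
import numpy as np

class SDRRBFSketch:
    def __init__(self, n_units=64, k=4, in_dim=1,
                 lr_out=0.1, lr_center=0.05, sigma=0.1):
        rng = np.random.default_rng(0)
        self.centers = rng.uniform(-1.0, 1.0, (n_units, in_dim))
        self.w_out = np.zeros(n_units)  # linear readout weights
        self.k = k
        self.lr_out = lr_out
        self.lr_center = lr_center
        self.sigma = sigma

    def sdr(self, x):
        # Radial basis activations, then k-winners-take-all -> sparse code.
        d2 = np.sum((self.centers - x) ** 2, axis=1)
        act = np.exp(-d2 / (2.0 * self.sigma ** 2))
        winners = np.argsort(act)[-self.k:]
        return winners, act

    def predict(self, x):
        winners, act = self.sdr(np.asarray(x, dtype=float))
        return float(np.dot(self.w_out[winners], act[winners]))

    def train(self, x, target):
        x = np.asarray(x, dtype=float)
        winners, act = self.sdr(x)
        err = target - np.dot(self.w_out[winners], act[winners])
        # Only the few winning units learn; every other weight is
        # untouched, which is what limits catastrophic forgetting.
        self.w_out[winners] += self.lr_out * err * act[winners]
        # Unsupervised part: winning centers drift toward the input.
        self.centers[winners] += self.lr_center * (x - self.centers[winners])
```

The delta-rule update on the readout only ever touches the k winners, so learning one region of input space leaves the weights for every other region alone.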

The main advantage of this is that there is almost no catastrophic
forgetting. Since only a few cells receive attention at a time, most
weights are barely touched, keeping old information intact.

I compared it to a standard MLP on a sine-curve learning task. Each network
must produce sin(x) for every input x, but the training inputs arrive in
order (high temporal coherence). The SDRRBFNetwork achieved 270 times less
error than the MLP, in less training time.
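The task setup can be sketched as follows. This is a hypothetical reconstruction of the protocol, not the benchmark code I ran; the sample counts and pass counts are assumptions:

```python
import numpy as np

def ordered_sine_stream(n_samples=200, n_passes=10):
    """Yield (x, sin(x)) training pairs with x swept in order.

    Sweeping x sequentially gives the stream high temporal coherence,
    the regime where a plain MLP forgets earlier parts of the curve.
    """
    xs = np.linspace(0.0, 2.0 * np.pi, n_samples, endpoint=False)
    for _ in range(n_passes):
        for x in xs:  # in order, never shuffled
            yield x, np.sin(x)

def rmse(predict, n_eval=200):
    """Root-mean-square error of a predictor over the whole sine curve."""
    xs = np.linspace(0.0, 2.0 * np.pi, n_eval)
    errs = np.sin(xs) - np.array([predict(x) for x in xs])
    return float(np.sqrt(np.mean(errs ** 2)))
```

Evaluating over the full range after ordered training is what exposes forgetting: a learner that only remembers the most recent stretch of the sweep scores badly everywhere else.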

If that doesn't make a case for SDRs, I don't know what will!
