Hi Michael,

> Hi Dennis,
>
> I think the CLA does connect very well with what's known about biological
> vision, but an important piece of the puzzle hasn't yet been implemented in
> Nupic.

That is temporal pooling, right?

> As you probably know, visual cortex contains "simple cells" that each respond
> to a particular visual feature at a particular place in the visual field, and
> "complex cells" that each respond to a particular visual feature but at a
> range of nearby positions in the visual field. The HMAX model uses
> alternating layers of simple and complex cells, arranged hierarchically. Each
> layer of simple cells produces representations of increasingly complex
> features (and at higher levels, objects) and each layer of complex cells
> introduces a greater degree of position invariance. So at the topmost level
> of complex cells, cells respond to complex objects, presented anywhere within
> the visual field.

Yes. Isn't HMAX based on some known biology? I think the original HMAX was
probably wrong, as it is an old model and has evolved since its first
development. Tomaso Poggio has published many changes and additions to HMAX
since then.

> Competitive Hebbian learning successfully explains how the simple cells learn
> to represent individual visual features.

Which can be replaced by k-means clustering. I remember reading a paper showing
that k-means gives similar results when clustering features and is much more
efficient. Roughly like the sketch below.
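A minimal sketch of what I mean, in plain numpy. The patches here are synthetic
noise, and the patch size, cluster count, and iteration budget are made-up
assumptions on my part:

    import numpy as np

    rng = np.random.default_rng(0)

    def kmeans_features(patches, k=64, iters=20):
        """Cluster patches; each centroid plays the role of a simple cell's
        preferred feature, i.e. the winner of a competitive update."""
        centroids = patches[rng.choice(len(patches), k, replace=False)]
        for _ in range(iters):
            # Squared distances via ||p||^2 - 2 p.c + ||c||^2 (memory-friendly).
            dists = ((patches ** 2).sum(1)[:, None]
                     - 2.0 * patches @ centroids.T
                     + (centroids ** 2).sum(1)[None, :])
            labels = dists.argmin(axis=1)  # the "winning" cell for each patch
            for j in range(k):
                members = patches[labels == j]
                if len(members):
                    # Move the winner toward the mean of the inputs it won,
                    # a batch analogue of a winner-take-all Hebbian update.
                    centroids[j] = members.mean(axis=0)
        return centroids

    # Toy stand-in for 8x8 image patches, flattened to 64-dim vectors.
    patches = rng.standard_normal((10_000, 64)).astype(np.float32)
    features = kmeans_features(patches)  # 64 learned "simple cell" features

The centroid update is basically a batch version of a winner-take-all Hebbian
rule, which I suspect is why the two come out so similar in practice.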
> But how does a complex cell learn to respond to the same visual feature at a
> variety of spatial locations? HMAX implementations typically hand code the
> response properties of complex cells, so that they are made to respond to the
> same visual feature at different spatial locations.

For the sake of saving memory, I think convolutional networks solve this by
sharing weights. Those complex cells could just as well be individual cells
with a pooling mechanism across many simple cells. How they learn to pool from
the same kind of simple cells (ones representing similar patterns) is another
question, but it could probably be explained by connections to the cells that
make those simple cells learn the patterns. I would assume that complex cells
use a similar learning mechanism to simple cells, but add pooling over many
similar simple cells. Something like the sketch below.
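A rough sketch of that idea (again my own assumptions, not code from Nupic or
any HMAX release): one shared feature template swept across the image produces
the simple cell layer, and each complex cell takes the max over a neighbourhood
of those simple cells, so it responds to the same feature anywhere in that
neighbourhood without being hand-coded:

    import numpy as np

    def simple_cell_responses(image, feature):
        """One shared feature template slid over the image (weight sharing):
        every position gets a simple cell tuned to the same feature."""
        fh, fw = feature.shape
        H, W = image.shape
        out = np.empty((H - fh + 1, W - fw + 1))
        for y in range(out.shape[0]):
            for x in range(out.shape[1]):
                out[y, x] = (image[y:y + fh, x:x + fw] * feature).sum()
        return out

    def complex_cell_responses(simple, pool=4):
        """Each complex cell takes the max over a pool x pool neighbourhood
        of simple cells, so it fires for the same feature anywhere in that
        neighbourhood: position invariance without hand-coding each cell."""
        H, W = simple.shape
        Hp, Wp = H // pool, W // pool
        trimmed = simple[:Hp * pool, :Wp * pool]
        return trimmed.reshape(Hp, pool, Wp, pool).max(axis=(1, 3))

    image = np.random.rand(32, 32)               # toy input
    edge = np.array([[-1.0, 1.0]] * 4)           # a toy 4x2 "edge" feature
    s = simple_cell_responses(image, edge)       # (29, 31) simple cell map
    c = complex_cell_responses(s, pool=4)        # (7, 7) invariant responses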
> But, there is increasing evidence that the principle of "temporal slowness"
> can be used to explain how the complex cells automatically learn to respond
> to the same feature in multiple locations. This takes advantage of the fact
> that visual features typically move around in the visual field, and so the
> presence of a particular visual feature at one location is predictive of the
> same visual feature at nearby locations.

Can you explain a bit more? I think that by the time complex cells receive
signals from simple cells, they're not "aware" of the underlying patterns, as
the simple cells are responsible for those.

> Prediction is, of course, what CLA does best. CLA's spatial pooler performs a
> form of competitive Hebbian learning, and teaches columns to represent
> individual visual features. This is equivalent to what visual "simple cells"
> learn. The CLA's temporal pooler can then learn predictions about what will
> be active next based on what's currently active. The current version of Nupic
> learns only nth-order predictions, so as to learn complex temporal sequences.
> But an alternative that isn't currently implemented in Nupic but is discussed
> in the white paper and may be taking place in some cortical layers, is to
> learn 1st-order predictions, predictions based on current activity regardless
> of prior activity. When a particular visual feature is active, 1st-order
> predictions would predict the activity of that same visual feature at a range
> of nearby positions in the visual field. That's exactly the kind of response
> shown by complex cells.

I don't think I understand 1st-order prediction. How does it work? How can you
predict something you don't know exists? I've put my best guess at what you
mean in a sketch at the end of this message; please correct me if I have it
wrong.

> Multiple hierarchically organized CLA layers that work in this way could then
> work a lot like a multiple level HMAX model, learning more complex
> features/objects and greater invariance at each successive level. But unlike
> HMAX, it would do all of its learning in an unsupervised way, using only the
> statistics of the input visual stream.

You'd still need a classifier at the top layer, right? Unless there is another
hierarchy that can generate and read text, for the purpose of telling a user
that the model actually understands what is in the image?

> Pages 31-33 of the white paper go into some more detail about this. Right now
> this can't be done with the current version of Nupic, but it's been near the
> top of the recent poll of where the focus should be for Nupic development in
> 2014.

Actually, I'm very interested in creating a visual system using CLA.
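Here is the sketch I promised. Everything in it is my own guess at the
mechanism, not Nupic code, and the column count, threshold, and toy input
stream are all invented. A 1st-order learner would just count transitions
between columns active on consecutive time steps, so its predictions depend on
the current activity alone. Fed a stream where a feature drifts across the
visual field, the columns for the feature at one position come to predict the
same feature at nearby positions, which I take to be the temporal-slowness
story:

    import numpy as np

    N_COLUMNS = 128                            # hypothetical layer size
    counts = np.zeros((N_COLUMNS, N_COLUMNS))  # counts[i, j]: i at t, j at t+1

    def learn(active_now, active_next):
        """Count every (now, next) pair of active columns. No history is
        kept, which is what makes the predictions 1st-order."""
        for i in active_now:
            counts[i, list(active_next)] += 1.0

    def predict(active, threshold=0.2):
        """Mark as predictive every column whose estimated transition
        probability from some currently active column exceeds the
        threshold. Only the CURRENT activity is consulted."""
        rows = counts[list(active)]
        probs = rows / np.maximum(rows.sum(axis=1, keepdims=True), 1.0)
        return {int(j) for j in np.where(probs.max(axis=0) > threshold)[0]}

    # Toy stream: a two-column-wide "feature" drifting rightward one column
    # per step, wrapping around at the edge.
    stream = [{t % N_COLUMNS, (t + 1) % N_COLUMNS} for t in range(1000)]
    for now, nxt in zip(stream, stream[1:]):
        learn(now, nxt)

    # Columns near the feature's current position become predictive.
    print(predict({10, 11}))   # {10, 11, 12, 13}

Is that roughly what the white paper means?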