A SOM doesn't have to be constrained to two dimensions.

That said, there are many non-linear embedding methods that are more
current than SOMs.  SOMs were part of the neural plausibility movement of
the late 80s, which more recently can be seen as a step toward modern
formulations of stochastic gradient descent.
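
To make both points concrete, here is a minimal sketch of online SOM
training in plain Python.  It is not Mahout code; the names train_som and
best_matching_unit and all of the parameters are hypothetical, and the
linear decay schedules are just one common choice.  The grid shape is an
arbitrary tuple, so the map can be 1-D, 2-D, 3-D, or higher, and the
per-sample update has the same "nudge the weights toward the example" form
as a stochastic gradient step.

```python
import itertools
import math
import random


def best_matching_unit(weights, x):
    """Return the grid coordinate whose weight vector is closest to x."""
    return min(
        weights,
        key=lambda c: sum((wi - xi) ** 2 for wi, xi in zip(weights[c], x)),
    )


def train_som(data, grid_shape, epochs=20, lr0=0.5, sigma0=None, seed=0):
    """Train a SOM on a grid with any number of dimensions.

    data: list of equal-length lists of floats.
    grid_shape: e.g. (10,) for a 1-D map, (5, 5) for 2-D, (3, 3, 3) for 3-D.
    Returns a dict mapping grid coordinates (tuples) to weight vectors.
    """
    rng = random.Random(seed)
    dim = len(data[0])
    coords = list(itertools.product(*(range(n) for n in grid_shape)))
    # Initialise each unit with a copy of a randomly chosen data point.
    weights = {c: list(rng.choice(data)) for c in coords}
    if sigma0 is None:
        sigma0 = max(grid_shape) / 2.0
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        order = list(data)
        rng.shuffle(order)
        for x in order:
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)                  # decaying step size
            sigma = max(sigma0 * (1.0 - frac), 0.5)  # shrinking neighbourhood
            bmu = best_matching_unit(weights, x)
            for c, w in weights.items():
                # Neighbourhood strength depends on distance in the grid,
                # not on distance in the input space.
                grid_d2 = sum((a - b) ** 2 for a, b in zip(c, bmu))
                h = math.exp(-grid_d2 / (2.0 * sigma * sigma))
                for i in range(dim):
                    # SGD-like step: move the weight toward the sample.
                    w[i] += lr * h * (x[i] - w[i])
            step += 1
    return weights
```

Calling train_som(data, (3, 3, 3)) gives a three-dimensional map, and
best_matching_unit(som, x) then returns a 3-tuple that can serve as a
low-dimensional coordinate for x.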

For one example, Hector Yee was just recommending Affinity Based
Embedding [1] as a useful thing to look at.  I would find it hard to
say what would be a useful project in that regard.

More central to Mahout's general areas of excellence would be an
implementation of latent factor log-linear models [2].  These would provide
a very interesting complement to the alternating least squares methods that
have been developed lately in Mahout.

Either of these would strike me as more useful in the Mahout context than
SOMs.

[1] http://arxiv.org/abs/1301.4171

[2] http://arxiv.org/abs/1006.2156


On Sat, Mar 30, 2013 at 12:21 PM, Sean Owen <sro...@gmail.com> wrote:

> Are SOMs actually good at dimension reduction? I had understood it to
> just be a visualization technique. You end up with a mapping with the
> property that things that are near are similar, but no guarantee that
> things that are similar are near.
>
> On Sat, Mar 30, 2013 at 12:06 PM, Dan Filimon
> <dangeorge.fili...@gmail.com> wrote:
> > Hi,
> >
> > I have a larger assignment to work on for my Machine Learning course this
> > semester and I can pick one of 4 problems to solve.
> >
> > One of them is implementing self-organizing maps and using them to cluster
> > the Localization Data for Person Activity Data Set [1] and evaluate the
> > clustering with the Dunn Index and F-measure.
> >
> > I vaguely recall talking to Ted about self organizing maps as a way of
> > achieving dimensionality reduction, so that's where it could be useful.
> >
> > I need to pick a problem anyway and was wondering if there's any sort of
> > interest in this one.
> > If yes, I could work on an implementation for Mahout (likely non-MapReduce,
> > at least for the purposes of this assignment).
> >
> > Thoughts?
> >
> > [1] http://archive.ics.uci.edu/ml/datasets/Localization+Data+for+Person+Activity
>
