Here's the ICML pre-print: J. Weston, A. Makadia, H. Yee, Label Partitioning for Sublinear Ranking <http://www.thespermwhale.com/jaseweston/papers/label_partitioner.pdf>, *ICML 2013*.
On Sat, Mar 30, 2013 at 10:56 AM, Hector Yee <[email protected]> wrote:
> If you're going the embedding route, please consider trying wsabie first.
> AWE is built on top of wsabie.
>
> http://www.thespermwhale.com/jaseweston/papers/wsabie-ijcai.pdf
>
> And so is the following ICML paper (preprint not online yet). Btw, anyone
> going?
>
> http://icml.cc/2013/?page_id=43
> *Label Partitioning For Sublinear Ranking*
> Jason Weston, Ameesh Makadia, Hector Yee
>
> I was going to modify https://issues.apache.org/jira/browse/MAHOUT-703 to
> do this when I was at a startup, as essentially wsabie is very similar to
> a two-layer NN without the sigmoid and with the WARP update rule (in the
> wsabie paper), which optimizes for precision rather than AUC. People may
> prefer high precision at the top of the ranking order when ranking
> millions of items for recommendation algorithms.
>
> An implementation of wsabie is in http://torch5.sourceforge.net somewhere,
> I think.
>
> Hope that helps.
>
> On Sat, Mar 30, 2013 at 7:15 AM, Ted Dunning <[email protected]> wrote:
>
>> SOM doesn't have to be constrained to two dimensions.
>>
>> That said, there are bunches of non-linear embedding methods that are
>> more current than SOMs. SOMs were part of the neural plausibility
>> movement of the late 80's, which more recently can be seen as an
>> approach toward modern formulations of stochastic gradient descent.
>>
>> For one example, Hector Yee was just recommending that Affinity Weighted
>> Embedding [1] would be a useful thing to look at. I would find it hard
>> to say what would be a useful project in that regard.
>>
>> More central to Mahout's general areas of excellence would be an
>> implementation of Latent Factor Log Linear models [2]. These would
>> provide a very interesting complement to the alternating least squares
>> methods that have been developed lately in Mahout.
>>
>> Either of these would strike me as more useful in the Mahout context
>> than SOMs.
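[For archive readers: the WARP update rule mentioned above can be sketched in a few lines. This is an illustrative toy simplification of the sampling trick from the wsabie paper, with invented variable names and a single shared embedding dimension; it is not Mahout or Torch code.]

```python
import numpy as np

rng = np.random.default_rng(0)

def warp_step(U, V, user, pos, lr=0.05, margin=1.0):
    """One WARP SGD step for scores s(u, i) = U[u] @ V[i].

    Sample negative items until one violates the pairwise margin, then
    scale the update by L(rank), where the rank of the positive item is
    estimated from how many draws were needed to find a violator. The
    weight L(k) = sum_{i<=k} 1/i is what pushes precision at the top of
    the ranking rather than AUC.
    """
    n_items = V.shape[0]
    s_pos = U[user] @ V[pos]
    for n_draws in range(1, n_items):
        neg = int(rng.integers(n_items))
        if neg == pos:
            continue
        if U[user] @ V[neg] > s_pos - margin:      # margin violator found
            rank_est = (n_items - 1) // n_draws    # few draws => high rank
            w = sum(1.0 / k for k in range(1, rank_est + 1))
            u = U[user].copy()                     # snapshot before update
            U[user] -= lr * w * (V[neg] - V[pos])
            V[pos] += lr * w * u
            V[neg] -= lr * w * u
            return w
    return 0.0  # no violator sampled: positive already ranked near the top
```

The key property is that items already ranked well need many draws to find a violator, which yields a small rank estimate and a small update, so effort concentrates on positives buried deep in the ranking.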
>>
>> [1] http://arxiv.org/abs/1301.4171
>> [2] http://arxiv.org/abs/1006.2156
>>
>> On Sat, Mar 30, 2013 at 12:21 PM, Sean Owen <[email protected]> wrote:
>>
>>> Are SOMs actually good at dimension reduction? I had understood them to
>>> just be a visualization technique. You end up with a mapping with the
>>> property that things that are near are similar, but no guarantee that
>>> things that are similar are near.
>>>
>>> On Sat, Mar 30, 2013 at 12:06 PM, Dan Filimon <[email protected]> wrote:
>>>> Hi,
>>>>
>>>> I have a larger assignment to work on for my Machine Learning course
>>>> this semester and I can pick one of 4 problems to solve.
>>>>
>>>> One of them is implementing self-organizing maps and using them to
>>>> cluster the Localization Data for Person Activity Data Set [1] and
>>>> evaluate the clustering with the Dunn index and F-measure.
>>>>
>>>> I vaguely recall talking to Ted about self-organizing maps as a way of
>>>> achieving dimensionality reduction, so that's where it could be useful.
>>>>
>>>> I need to pick a problem anyway and was wondering if there's any sort
>>>> of interest in this one. If yes, I could work on an implementation for
>>>> Mahout (likely non-MapReduce, at least for the purposes of this
>>>> assignment).
>>>>
>>>> Thoughts?
>>>>
>>>> [1] http://archive.ics.uci.edu/ml/datasets/Localization+Data+for+Person+Activity

--
Yee Yang Li Hector <https://plus.google.com/106746796711269457249>
Professional Profile <http://www.linkedin.com/in/yeehector>
http://hectorgon.blogspot.com/
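[For archive readers: the SOM training loop being debated in this thread can be sketched briefly. A toy NumPy implementation with assumed grid size and linear decay schedules, purely for illustration; not the assignment's code and not a Mahout implementation.]

```python
import numpy as np

def train_som(data, grid=(5, 5), epochs=20, lr0=0.5, sigma0=2.0, seed=0):
    """Classic online SOM.

    For each sample, find the best-matching unit (BMU) on the grid, then
    pull the BMU and its grid neighbours toward the sample. The learning
    rate and neighbourhood width shrink each epoch, which is what gives
    the "near implies similar" property Sean describes: grid-adjacent
    units are repeatedly updated together, so they end up with similar
    codebook vectors.
    """
    rng = np.random.default_rng(seed)
    h, w = grid
    W = rng.normal(size=(h, w, data.shape[1]))   # codebook vectors
    ys, xs = np.mgrid[0:h, 0:w]                  # grid coordinates
    for t in range(epochs):
        lr = lr0 * (1.0 - t / epochs)
        sigma = sigma0 * (1.0 - t / epochs) + 0.5
        for x in rng.permutation(data):
            # BMU: unit whose codebook vector is closest to the sample
            by, bx = np.unravel_index(((W - x) ** 2).sum(-1).argmin(), (h, w))
            # Gaussian neighbourhood on the *grid*, centred at the BMU
            g = np.exp(-((ys - by) ** 2 + (xs - bx) ** 2) / (2 * sigma ** 2))
            W += lr * g[..., None] * (x - W)
    return W
```

Note the converse guarantee is indeed missing, as pointed out above: two similar inputs can land on distant BMUs if the map folds, so nothing forces "similar implies near".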
