We already discussed this paper before. In fact, I had exactly the same idea for partitioning the factorization task (what the authors call "stratified" SGD) with stochastic learners before I ever saw this paper.
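For anyone who hasn't read the paper, the stratification works roughly like this: block the rating matrix d-by-d, then run d sub-epochs per epoch, where each sub-epoch processes d blocks with disjoint row and column ranges in parallel. A minimal sketch (the schedule below is illustrative, not the authors' code):

```python
def strata(d):
    """Yield the d strata of one DSGD epoch for a d-by-d blocking.

    Each stratum is a list of d (row_block, col_block) pairs whose row
    and column ranges are pairwise disjoint, so SGD can update all d
    blocks in parallel without conflicting writes to the factor matrices.
    """
    for s in range(d):
        yield [(i, (i + s) % d) for i in range(d)]

# One epoch over a 3-way blocking: 3 strata, i.e. 3 synchronization
# points -- which is why the number of restarts grows with the degree
# of parallelism.
for stratum in strata(3):
    print(stratum)
```

Note that every stratum boundary is a synchronization point where factor blocks must be exchanged, which is the source of my concern about restarts below.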
I personally lost interest in this approach even before I read the paper, because the way I understood it at the time, it would have required at least as many MR restarts with data exchange as the degree of parallelism, and consequently just as many data passes. In the Mahout framework it is also difficult because Mahout doesn't support blocking out of the box for its DRM format, so either an additional job would be required to pre-block the data the way they want to process it, or we would have to run over 100% of it during each restart instead of a fraction of it.

All in all, my speculation was that there was little chance this approach would provide a win over the ALS techniques with restarts that we currently already have, at a mid-to-high degree of parallelization (say 50-way and up). But honestly, I would be happy to be wrong, since I may not have understood some of the work or may have missed some of the suggested optimizations. I would be especially happy if it could beat our current ALS-WR by a meaningful margin on bigger data.

-d

On Sat, Jan 14, 2012 at 9:45 AM, Zeno Gantner <[email protected]> wrote:
> Hi list,
>
> I was talking to Isabel Drost in December, and we talked about a nice
> paper from last year's KDD conference that suggests a neat trick that
> allows doing SGD for matrix factorization in parallel.
>
> She said this would be interesting for some of you here.
>
> Here is the paper:
> http://www.mpi-inf.mpg.de/~rgemulla/publications/gemulla11dsgd.pdf
>
> Note that the authors themselves implemented it already in Hadoop.
>
> Maybe someone would like to pick this up.
>
> I am still trying to find my way around the Mahout/Taste source code,
> so do not expect anything from me too soon ;-)
>
> Best regards,
> Zeno
