We already discussed this paper before. In fact, I had exactly the same
idea for partitioning the factorization task (something the authors
call "stratified" SGD) with stochastic learners before I ever saw
this paper.

I personally lost interest in this approach even before I read the
paper because, the way I understood it at the time, it would require
at least as many MR restarts with data exchange as the degree of
parallelism, and consequently just as many data passes. Within the
Mahout framework it is also difficult because Mahout doesn't support
blocking out of the box for its DRM format, so an additional job may
be required to pre-block the data the way they want to process it --
or we have to run over 100% of it during each restart, instead of a
fraction of it.
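For what it's worth, here is a minimal sketch of how I read the
stratified scheduling idea (this is my own toy reconstruction, not
their Hadoop implementation): split the rating matrix into d x d
blocks; the d blocks on a wrapped diagonal ("stratum") touch disjoint
row and column ranges, so independent SGD workers can update them in
parallel, but covering the whole matrix takes d such sub-epochs --
which is where the d restarts/data passes per epoch come from.

```python
# Toy sketch of DSGD-style stratum scheduling (my reading of the paper).
# The m x n rating matrix is partitioned into d x d blocks. Within one
# sub-epoch s, the d blocks (i, (i + s) mod d) use disjoint row bands and
# column bands, so plain SGD can run on them in parallel without
# conflicting factor updates.

def stratum(d, s):
    """Blocks forming stratum s: one block per row band, one per column band."""
    return [(i, (i + s) % d) for i in range(d)]

def epoch_schedule(d):
    """All d strata of one epoch; together they cover every block exactly once."""
    return [stratum(d, s) for s in range(d)]

if __name__ == "__main__":
    d = 3
    for s, blocks in enumerate(epoch_schedule(d)):
        print("sub-epoch", s, "->", blocks)
```

Each of the d sub-epochs corresponds to one MR restart with data
exchange, which is exactly the overhead I was worried about above.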

All in all, my speculation was that there was little chance this
approach would provide a win over the ALS techniques with restarts
that we currently have, at a mid to high degree of parallelization
(say 50-way parallelization and up).

But honestly I would be happy to be wrong, in case I did not
understand some of the work or did not see some of the optimizations
suggested. I would be especially happy if it could beat our current
ALS-WR by a meaningful margin on bigger data.

-d

On Sat, Jan 14, 2012 at 9:45 AM, Zeno Gantner <[email protected]> wrote:
> Hi list,
>
> I was talking to Isabel Drost in December, and we talked about a nice
> paper from last year's KDD conference that suggests a neat trick that
> allows doing SGD for matrix factorization in parallel.
>
> She said this would be interesting for some of you here.
>
> Here is the paper:
> http://www.mpi-inf.mpg.de/~rgemulla/publications/gemulla11dsgd.pdf
>
> Note that the authors themselves implemented it already in Hadoop.
>
> Maybe someone would like to pick this up.
>
> I am still trying to find my way around the Mahout/Taste source code,
> so do not expect anything from me too soon ;-)
>
> Best regards,
>  Zeno
