IIRC, the algorithm behind ParallelSGDFactorizer needs shared memory, which is not available in a shared-nothing environment such as Hadoop.
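
To make the shared-memory point concrete, here is a minimal sketch of lock-free parallel SGD for matrix factorization. It is not Mahout's code: the class name, method names, and parameters are all made up for illustration, and it assumes a scheme in which every worker thread reads and writes the same two in-memory factor matrices without locking.

```java
// Hypothetical illustration only; not the ParallelSGDFactorizer source.
class SharedMemorySgdSketch {
    // Both matrices live in one JVM heap and are shared by all worker threads.
    final double[][] userFactors;
    final double[][] itemFactors;

    SharedMemorySgdSketch(int numUsers, int numItems, int rank, long seed) {
        java.util.Random rnd = new java.util.Random(seed);
        userFactors = randomMatrix(numUsers, rank, rnd);
        itemFactors = randomMatrix(numItems, rank, rnd);
    }

    static double[][] randomMatrix(int rows, int cols, java.util.Random rnd) {
        double[][] m = new double[rows][cols];
        for (double[] row : m)
            for (int k = 0; k < cols; k++) row[k] = 0.1 * rnd.nextDouble();
        return m;
    }

    double predict(int user, int item) {
        double sum = 0;
        for (int k = 0; k < userFactors[user].length; k++)
            sum += userFactors[user][k] * itemFactors[item][k];
        return sum;
    }

    // One SGD step for a single rating. No synchronization: concurrent
    // threads may overwrite each other's updates, which this scheme tolerates.
    void update(int user, int item, double rating, double lr, double lambda) {
        double err = rating - predict(user, item);
        double[] u = userFactors[user];
        double[] v = itemFactors[item];
        for (int k = 0; k < u.length; k++) {
            double uk = u[k], vk = v[k];
            u[k] = uk + lr * (err * vk - lambda * uk);
            v[k] = vk + lr * (err * uk - lambda * vk);
        }
    }

    // Thread t handles ratings t, t + numThreads, t + 2 * numThreads, ...
    // Every thread sees the other threads' writes through the shared arrays.
    void train(int[] users, int[] items, double[] ratings, int epochs,
               int numThreads, double lr, double lambda) {
        Thread[] workers = new Thread[numThreads];
        for (int t = 0; t < numThreads; t++) {
            final int offset = t;
            workers[t] = new Thread(() -> {
                for (int e = 0; e < epochs; e++)
                    for (int i = offset; i < ratings.length; i += numThreads)
                        update(users[i], items[i], ratings[i], lr, lambda);
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException ex) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while training", ex);
            }
        }
    }

    // Root mean squared error of the current factors on the given ratings.
    double rmse(int[] users, int[] items, double[] ratings) {
        double sq = 0;
        for (int i = 0; i < ratings.length; i++) {
            double err = ratings[i] - predict(users[i], items[i]);
            sq += err * err;
        }
        return Math.sqrt(sq / ratings.length);
    }
}
```

Every call to update touches userFactors and itemFactors directly, and a thread benefits from another thread's step as soon as it has been written. In a shared-nothing setup each worker would only hold its own copy of the factors, so something like this would need an extra step to exchange or merge them between workers.
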

On 07.09.2013 19:08, Tevfik Aytekin wrote:
> Hi,
> There seems to be no Hadoop implementation of ParallelSGDFactorizer.
> ALSWRFactorizer has a Hadoop implementation.
>
> ParallelSGDFactorizer (since it is based on stochastic gradient
> descent) is much faster than ALSWRFactorizer.
>
> I don't know Hadoop much. But it seems to me that a Hadoop
> implementation of ParallelSGDFactorizer will also be much faster than
> the Hadoop implementation of ALSWRFactorizer.
>
> Is there a specific reason for why there is no Hadoop implementation
> of ParallelSGDFactorizer? Is it because since Hadoop operations are
> already slow the slowness of ALSWRFactorizer does not matter much? Or
> is it simply because nobody has implemented it yet?
>
> Thanks
> Tevfik