Need for a distributed SVDRecommender

Sanjib Kumar Das Fri, 19 Nov 2010 13:34:32 -0800

Hi All,

I wanted to run a distributed RecommenderJob with the SVDRecommender
implementation.
So I ran the pseudo.RecommenderJob with an
SVDRecommender(numFeatures=30, trainingSteps=50) on the 1M MovieLens
data (6040 users). It generated 10 recommendations for each of the 6040
users but took 14 hours to do so! My Hadoop cluster had 12 machines. My guess
is that it just ran multiple instances of the non-distributed SVD
implementation, and each instance repeated the same work. So unless the
recommender implementation itself is distributed, we don't get any special
benefit from the pseudo.RecommenderJob.
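
A rough sketch of the behavior guessed at above, written as a toy Python
cost model rather than Mahout code. Every name in it (pseudo_distributed,
train, recommend) is hypothetical, and it assumes that each worker builds
its own full model before serving its share of the users:

```python
# Toy cost model, NOT Mahout code: all names here are hypothetical.

def pseudo_distributed(user_ids, num_workers, train, recommend):
    """Split users across workers; each worker trains a full model itself."""
    results = {}
    trainings = 0
    for worker in range(num_workers):
        shard = user_ids[worker::num_workers]  # this worker's share of users
        if not shard:
            continue
        model = train()  # the whole non-distributed training, repeated per worker
        trainings += 1
        for user in shard:
            results[user] = recommend(model, user)
    return results, trainings


# Sizes taken from the run described above: 6040 users, 12 machines.
users = list(range(1, 6041))
recs, trainings = pseudo_distributed(
    users,
    12,
    train=lambda: "model",          # stand-in for the expensive SVD training
    recommend=lambda model, user: [],  # stand-in for the per-user top-10 lookup
)
print(len(recs), trainings)
```

Under that assumption only the per-user recommendation step is divided
across the machines. The training step is paid in full on every worker, so
adding machines does not shorten it.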


But the item.RecommenderJob produces the same 10 recommendations for each
of the 6040 users in 38 minutes. This is because it has an underlying
distributed implementation.

So my question is: do we have a distributed SVDRecommender implementation?
If not, how should I go about writing one? Can I use the new LanczosSolver
to achieve this?

Thanks,
Sanjib
