Hi All,

I wanted to run a distributed RecommenderJob with the SVDRecommender implementation, so I ran pseudo.RecommenderJob with an SVDRecommender(numFeatures=30, trainingSteps=50) on the 1M MovieLens data (6040 users). It generated 10 recommendations for each of the 6040 users, but took 14 hours to do so on a Hadoop cluster of 12 machines. My guess is that it simply ran multiple instances of the non-distributed SVD implementation, and each instance repeated the same training work. So unless the recommender implementation itself is distributed, we don't get any real benefit from pseudo.RecommenderJob.
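To make that guess concrete, here is a toy cost model. It is not Mahout code: the class, the method names, and the cost numbers are all made up for illustration, and it assumes each machine trains the full model once before serving its share of the users. Under that assumption the per-user work is split across machines, but the training cost is never reduced.

```java
// Toy cost model, NOT Mahout code. All names and cost units are hypothetical.
public class PseudoJobCostModel {

    // Wall-clock time if every worker trains the full model itself
    // and then produces recommendations for its share of the users.
    public static double pseudoDistributedTime(double trainCost, double perUserCost,
                                               int users, int workers) {
        double usersPerWorker = Math.ceil((double) users / workers);
        return trainCost + usersPerWorker * perUserCost;
    }

    // Wall-clock time on one machine: train once, then serve all users.
    public static double singleMachineTime(double trainCost, double perUserCost, int users) {
        return trainCost + users * perUserCost;
    }

    public static void main(String[] args) {
        double trainCost = 1000.0; // made-up units, stands in for SVD training
        double perUserCost = 1.0;  // made-up units, stands in for one user's recommendations
        int users = 6040;          // users in the MovieLens 1M data
        int workers = 12;          // machines in the cluster

        System.out.println("single machine:     "
                + singleMachineTime(trainCost, perUserCost, users));
        System.out.println("pseudo-distributed: "
                + pseudoDistributedTime(trainCost, perUserCost, users, workers));
    }
}
```

If training dominates the total cost, as I suspect it does for SVD, adding machines barely changes the wall-clock time in this model.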
By contrast, item.RecommenderJob produces the same 10 recommendations for each of the 6040 users in 38 minutes, roughly 22 times faster, because its underlying implementation is distributed. So my question is: do we have a distributed SVDRecommender implementation? If not, how should I go about writing one? Can I use the new LanczosSolver to achieve this?

Thanks,
Sanjib
