Re: Need for a distributed SVDRecommender

Sean Owen Fri, 19 Nov 2010 14:04:31 -0800

That result sounds confusing. It should take about the same number of
wall-clock hours either way. I don't see why it would take 14 hours -- that
sounds wrong. If anything it should take 38 / N minutes, where N is the
number of recommenders you ran.

SVDRecommender is not distributed at all, no.
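
To make the cost of the pseudo-distributed approach concrete, here is a rough
sketch of a cost model. It assumes each worker trains the full
non-distributed model by itself and then serves only its own slice of users.
The function names and the timing figures (3600 seconds to train, 1 second
per user) are hypothetical and chosen only for illustration; the 6040 users
and 12 machines come from the message below.

```python
def users_per_worker(num_users, num_workers):
    """Largest slice of users any one worker has to serve (ceiling division)."""
    return -(-num_users // num_workers)


def wall_clock_seconds(train_seconds, per_user_seconds, num_users, num_workers):
    """Elapsed time when every worker trains the whole model, then serves its slice.

    Training is not split across workers, so it is paid in full on each one.
    Only the per-user recommendation step is divided by the worker count.
    """
    slice_size = users_per_worker(num_users, num_workers)
    return train_seconds + per_user_seconds * slice_size


def total_cpu_seconds(train_seconds, per_user_seconds, num_users, num_workers):
    """Total work summed over all workers: training is repeated once per worker."""
    return num_workers * train_seconds + per_user_seconds * num_users


# Hypothetical timings: 3600 s to train, 1 s per user to recommend.
single = wall_clock_seconds(3600, 1, 6040, 1)     # one machine
cluster = wall_clock_seconds(3600, 1, 6040, 12)   # twelve machines
wasted = total_cpu_seconds(3600, 1, 6040, 12) - total_cpu_seconds(3600, 1, 6040, 1)
print(single, cluster, wasted)
```

Under these assumptions, adding machines only shrinks the per-user term. The
training term stays the same on every worker, which is why a non-distributed
recommender gains little from being run this way.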

On Fri, Nov 19, 2010 at 9:34 PM, Sanjib Kumar Das <[email protected]> wrote:

> Hi All,
>
> I wanted to run a distributed RecommenderJob with the SVDRecommender
> implementation, so I ran the pseudo.RecommenderJob with an
> SVDRecommender(numFeatures=30, trainingSteps=50) on the 1M MovieLens
> data (6040 users). This generated 10 recommendations for each of the 6040
> users but took 14 hours to do so! My Hadoop cluster had 12 machines, so I
> guess it just ran multiple instances of the non-distributed SVD
> implementation, and each of these instances did the same thing again and
> again. So unless the implementation of the recommender is distributed, we
> don't get any special benefit from the pseudo.RecommenderJob.
>
> But the item.RecommenderJob does the same 10 recommendations each for the
> 6040 users in 38 minutes. This is because it has an underlying distributed
> implementation.
>
> So my question is: do we have a distributed SVDRecommender implementation?
> If not, how should I go about writing one? Can I use the new LanczosSolver
> to achieve this?
>
> Thanks,
> Sanjib
>
