You can use rdd.cartesian and then find the top-k by key to distribute
the work to the executors. There is a trick to boost performance: blockify
the user/product features and then use native matrix-matrix
multiplication. There is a relevant PR from Deb: . -Xiangrui
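
For illustration, here is a rough local sketch of the blockify-then-multiply idea in plain Python. It is not the code from the PR and it does not use Spark: `blockify`, `block_scores`, and `top_k_per_user` are hypothetical helper names, the loop over `itertools.product` only stands in for what rdd.cartesian would do over blocks, and the pure-Python multiply stands in for a native matrix-matrix (BLAS) call.

```python
import heapq
from itertools import product


def blockify(features, block_size):
    """Group (id, factor_vector) pairs into blocks of (ids, rows)."""
    blocks = []
    for start in range(0, len(features), block_size):
        chunk = features[start:start + block_size]
        ids = [i for i, _ in chunk]
        rows = [vec for _, vec in chunk]
        blocks.append((ids, rows))
    return blocks


def block_scores(user_rows, item_rows):
    """Dense U * P^T for one pair of blocks.

    Pure Python to stay dependency-free; this is the step where a
    native matrix-matrix multiply would be used instead.
    """
    return [[sum(u * p for u, p in zip(urow, prow)) for prow in item_rows]
            for urow in user_rows]


def top_k_per_user(user_features, item_features, k, block_size=2):
    """Score every user against every item and keep the k best per user."""
    user_blocks = blockify(user_features, block_size)
    item_blocks = blockify(item_features, block_size)
    best = {}  # user id -> list of (score, item id), at most k entries
    # Assumption: this loop stands in for a cartesian product of block RDDs.
    for (uids, urows), (iids, irows) in product(user_blocks, item_blocks):
        scores = block_scores(urows, irows)
        for uid, row in zip(uids, scores):
            candidates = best.get(uid, []) + list(zip(row, iids))
            # Assumption: this merge stands in for a top-k-by-key reduction.
            best[uid] = heapq.nlargest(k, candidates)
    return best


# Made-up toy factors, only to exercise the sketch.
users = [(1, [1.0, 0.0]), (2, [0.0, 1.0]), (3, [1.0, 1.0])]
items = [(10, [2.0, 0.0]), (20, [0.0, 3.0]), (30, [1.0, 1.0])]
result = top_k_per_user(users, items, k=2)
```

The point of working on blocks rather than single rows is that each pair of blocks needs one matrix multiply instead of many separate dot products, and only k candidates per user have to survive each merge.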

On Mon, Feb 23, 2015 at 4:53 AM, Erlend Hamnaberg <> wrote:
> Hi.
> We are using the ALS model, and would like to get all users and items
> scored.
> Currently we have these methods.
> We want to be able to distribute the calculations to the slaves so we don't
> have to do this on the master.
> Is there an efficient and distributed way of doing this?
> I suppose we could collect all items in the product features and send that
> into a broadcast, but that needs all items on the master, and we want to
> avoid that.
> Regards
> Erlend
