Many thanks for your response. I already figured out the details with some help from another forum.

1. I was trying to predict ratings for all users and all products. This is
   inefficient, and I am now trying to reduce the number of required
   predictions.
2. There is a nice example buried in the Spark source code that shows how to
   use the ML-side ALS.

Regards,
Sahib Aulakh

On Wed, Jun 7, 2017 at 8:17 PM, Ryan <ryan.hd....@gmail.com> wrote:

> 1. Could you give the job, stage & task status from the Spark UI? I found
>    it extremely useful for performance tuning.
>
> 2. Use model.transform for predictions. Usually we have a pipeline for
>    preparing training data, and using the same pipeline to transform the
>    data you want to predict on gives us the prediction column.
>
> On Thu, Jun 1, 2017 at 7:48 AM, Sahib Aulakh [Search] <
> sahibaul...@coupang.com> wrote:
>
>> Hello:
>>
>> I am training the ALS model for recommendations. I have about 200m
>> ratings from about 10m users and 3m products. I have a small cluster with
>> 48 cores and 120gb cluster-wide memory.
>>
>> My code is very similar to the example code in
>> spark/examples/src/main/scala/org/apache/spark/examples/mllib/MovieLensALS.scala.
>>
>> I have a couple of questions:
>>
>>    1. All steps up to model training run reasonably fast. Model
>>    training is under 10 minutes for rank 20. However, the
>>    model.recommendProductsForUsers step is either slow or just does not
>>    work, as the code seems to hang at this point. I have tried user and
>>    product block sizes of -1 and 20, 40, etc., played with executor memory
>>    size, etc. Can someone shed some light on what could be wrong?
>>    2. Also, is there any example code for the ml.recommendation.ALS
>>    algorithm? I can figure out how to train the model, but I don't
>>    understand (from the documentation) how to perform predictions.
>>
>> Thanks for any information you can provide.
>> Sahib Aulakh.
>>
>> --
>> Sahib Aulakh
>> Sr. Principal Engineer
>>
>
-- 
Sahib Aulakh
Sr. Principal Engineer
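
P.S. For anyone who finds this thread in the archive: the two points above (score fewer pairs, and use the ML-side ALS with model.transform) can be sketched roughly as below. This is only a minimal illustration, not the example from the Spark source tree and not my production code. The column names (userId, productId, rating), the toy data, the object and method names, the local master, and the small rank are all assumptions made for the sketch. With 10m users and 3m products, scoring every pair means about 3 x 10^13 predictions, so the idea is to pass model.transform only the (user, product) pairs that are actually needed.

```scala
import org.apache.spark.ml.recommendation.ALS
import org.apache.spark.sql.{DataFrame, SparkSession}

// Hypothetical helper object; all names here are illustrative.
object AlsSubsetPredictions {

  // Train ML-side ALS on `ratings` and score only the rows in `candidates`.
  // Assumed columns: ratings(userId, productId, rating), candidates(userId, productId).
  def scoreCandidates(ratings: DataFrame, candidates: DataFrame): DataFrame = {
    val als = new ALS()
      .setRank(5)        // small rank for the toy data; the thread used rank 20
      .setMaxIter(10)
      .setRegParam(0.1)
      .setUserCol("userId")
      .setItemCol("productId")
      .setRatingCol("rating")

    val model = als.fit(ratings)

    // transform appends a "prediction" column to the candidate pairs only,
    // instead of scoring every user against every product.
    model.transform(candidates)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("AlsSubsetPredictions")
      .master("local[*]") // assumption: local run for illustration
      .getOrCreate()
    import spark.implicits._

    // Toy ratings, made up for the sketch.
    val ratings = Seq(
      (0, 0, 4.0f), (0, 1, 2.0f), (1, 1, 3.0f),
      (1, 2, 5.0f), (2, 0, 1.0f), (2, 2, 4.0f)
    ).toDF("userId", "productId", "rating")

    // Only the pairs we care about, e.g. products a user has not rated yet.
    val candidates = Seq((0, 2), (1, 0), (2, 1)).toDF("userId", "productId")

    scoreCandidates(ratings, candidates).show()
    spark.stop()
  }
}
```

How the candidate pairs are chosen (recently viewed products, a category filter, and so on) is application-specific and left out here. Users or products that did not appear in the training data get NaN predictions from transform, so those rows may need to be filtered out.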