1. Could you share the job, stage, and task status from the Spark UI? I have found it extremely useful for performance tuning.
2. Use model.transform for predictions. Usually we have a pipeline for preparing the training data; applying the same pipeline to the data you want to predict on gives you the prediction column.

On Thu, Jun 1, 2017 at 7:48 AM, Sahib Aulakh [Search] <sahibaul...@coupang.com> wrote:

> Hello:
>
> I am training the ALS model for recommendations. I have about 200m ratings
> from about 10m users and 3m products. I have a small cluster with 48 cores
> and 120gb cluster-wide memory.
>
> My code is very similar to the example code
>
> spark/examples/src/main/scala/org/apache/spark/examples/mllib/MovieLensALS.scala
> code.
>
> I have a couple of questions:
>
>
> 1. All steps up to model training runs reasonably fast. Model training
> is under 10 minutes for rank 20. However, the
> model.recommendProductsForUsers
> step is either slow or just does not work as the code just seems to hang at
> this point. I have tried user and product blocks sizes of -1 and 20, 40,
> etc, played with executor memory size, etc. Can someone shed some light
> here as to what could be wrong?
> 2. Also, is there any example code for the ml.recommendation.ALS
> algorithm? I can figure out how to train the model but I don't understand
> (from the documentation) how to perform predictions?
>
> Thanks for any information you can provide.
> Sahib Aulakh.
>
>
> --
> Sahib Aulakh
> Sr. Principal Engineer
>
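
For the second question in the quoted message, here is a minimal sketch of predicting with ml.recommendation.ALS through model.transform. The column names (userId, productId, rating), the toy ratings, and the hyperparameters are made up for illustration and are not taken from the original code. It assumes Spark 2.x with spark-mllib on the classpath and runs in local mode.

```scala
import org.apache.spark.ml.recommendation.ALS
import org.apache.spark.sql.{DataFrame, SparkSession}

object AlsPredictSketch {

  // Trains a tiny ALS model and scores two (user, product) pairs.
  // All data and column names here are hypothetical.
  def score(spark: SparkSession): DataFrame = {
    import spark.implicits._

    // Toy ratings; a real job would load these from storage.
    val ratings = Seq(
      (0, 0, 4.0f), (0, 1, 2.0f),
      (1, 0, 3.0f), (1, 2, 5.0f),
      (2, 1, 1.0f), (2, 2, 4.0f)
    ).toDF("userId", "productId", "rating")

    val als = new ALS()
      .setRank(2)        // small rank for the toy data; the thread uses 20
      .setMaxIter(5)
      .setRegParam(0.1)
      .setUserCol("userId")
      .setItemCol("productId")
      .setRatingCol("rating")

    val model = als.fit(ratings)

    // Pairs to score; transform appends a "prediction" column.
    val toScore = Seq((0, 2), (1, 1)).toDF("userId", "productId")
    model.transform(toScore)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("AlsPredictSketch")
      .master("local[*]")
      .getOrCreate()

    score(spark).show()
    spark.stop()
  }
}
```

Pairs whose user or product did not appear in the training data get NaN in the prediction column by default, so it is worth filtering those rows before evaluating.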