Great, thanks to both of you. I was struggling with this issue as well.
-Rohit
On Mon, Jul 25, 2016 at 4:12 AM, Krishna Sankar <ksanka...@gmail.com> wrote:
> Thanks Nick. I also ran into this issue.
> VG, one workaround is to drop the NaN from predictions (df.na.drop()) and
> then use the dataset for the evaluator. In real life, probably detect the
> NaN and recommend most popular on some window.
> HTH.
> Cheers
> <k/>
>
> On Sun, Jul 24, 2016 at 12:49 PM, Nick Pentreath <nick.pentre...@gmail.com> wrote:
>
>> It seems likely that you're running into
>> https://issues.apache.org/jira/browse/SPARK-14489 - this occurs when the
>> test dataset in the train/test split contains users or items that were not
>> in the training set. Hence the model doesn't have computed factors for
>> those ids, and ALS 'transform' currently returns NaN for those ids. This in
>> turn results in NaN for the evaluator result.
>>
>> I have a PR open on that issue that will hopefully address this soon.
>>
>>
>> On Sun, 24 Jul 2016 at 17:49 VG <vlin...@gmail.com> wrote:
>>
>>> ping. Does anyone have some suggestions/advice for me?
>>> It would be really helpful.
>>>
>>> VG
>>> On Sun, Jul 24, 2016 at 12:19 AM, VG <vlin...@gmail.com> wrote:
>>>
>>>> Sean,
>>>>
>>>> I did this just to test the model. When I split my data into
>>>> 80% training and 20% test
>>>>
>>>> I get a Root-mean-square error = NaN
>>>>
>>>> So I am wondering where I might be going wrong
>>>>
>>>> Regards,
>>>> VG
>>>>
>>>> On Sun, Jul 24, 2016 at 12:12 AM, Sean Owen <so...@cloudera.com> wrote:
>>>>
>>>>> No, that's certainly not to be expected. ALS works by computing a much
>>>>> lower-rank representation of the input. It would not reproduce the
>>>>> input exactly, and you don't want it to -- this would be seriously
>>>>> overfit. This is why in general you don't evaluate a model on the
>>>>> training set.
>>>>>
>>>>> On Sat, Jul 23, 2016 at 7:37 PM, VG <vlin...@gmail.com> wrote:
>>>>> > I am trying to run ml.ALS to compute some recommendations.
>>>>> >
>>>>> > Just to test, I am using the same dataset for training the ALSModel and for
>>>>> > predicting the results based on the model.
>>>>> >
>>>>> > When I evaluate the result using RegressionEvaluator I get a
>>>>> > Root-mean-square error = 1.5544064263236066
>>>>> >
>>>>> > I think this should be 0. Any suggestions what might be going wrong?
>>>>> >
>>>>> > Regards,
>>>>> > Vipul
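
For anyone landing on this thread later: the NaN behaviour Nick describes, and the drop-the-NaN workaround Krishna suggests, can be illustrated without Spark at all. The sketch below is plain Python, not the Spark API. The `rmse` and `drop_nan` helpers and the sample ratings are hypothetical and exist only for illustration. The last pair stands in for a test row whose user or item was not in the training set, so its prediction is NaN.

```python
import math

def rmse(pairs):
    """Root-mean-square error over (label, prediction) pairs."""
    pairs = list(pairs)
    squared_errors = [(label - pred) ** 2 for label, pred in pairs]
    return math.sqrt(sum(squared_errors) / len(pairs))

def drop_nan(pairs):
    """Keep only pairs with a real-valued prediction.

    Plain-Python stand-in for calling df.na.drop() on the predictions.
    """
    return [(label, pred) for label, pred in pairs if not math.isnan(pred)]

# Hypothetical (rating, prediction) pairs. The last prediction is NaN,
# as it would be for a user or item the model has no factors for.
scored = [(4.0, 3.5), (2.0, 2.5), (5.0, float("nan"))]

rmse_all = rmse(scored)                # a single NaN makes the whole metric NaN
rmse_clean = rmse(drop_nan(scored))    # computed over the two remaining rows

print("RMSE with NaN rows:   ", rmse_all)
print("RMSE after dropping:  ", rmse_clean)
```

Note that dropping rows means the metric only covers users and items seen during training, which is why Krishna suggests a fallback such as recommending the most popular items for the rest in a real system.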