Running with lambda=0 breaks the ALS code: the matrices are no longer
guaranteed to stay positive definite, so the Cholesky decomposition fails.
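
To illustrate, here is a minimal sketch in plain Python (not the MLlib code).
It assumes each ALS step solves normal equations with a matrix of the form
Y^T Y + lambda * I, where Y holds the factor rows of the items one user rated.
The helpers `gram`, `cholesky` and `is_positive_definite`, and the toy factors,
are made up for this example. With one rated item and rank 2, Y^T Y is
singular, so only lambda > 0 makes it positive definite.

```python
import math

def gram(rows, lam):
    """Return Y^T Y + lam * I for the given factor rows (list of lists)."""
    k = len(rows[0])
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    for i in range(k):
        a[i][i] += lam
    return a

def cholesky(a):
    """Textbook Cholesky factorization; raises ValueError on a non-positive pivot."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][p] * low[j][p] for p in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("matrix is not positive definite")
                low[i][j] = math.sqrt(s)
            else:
                low[i][j] = s / low[j][j]
    return low

def is_positive_definite(a):
    """True if the Cholesky factorization succeeds."""
    try:
        cholesky(a)
        return True
    except ValueError:
        return False

# Hypothetical case: a user rated a single item, but the rank is 2.
item_factors = [[1.0, 2.0]]

for lam in (0.0, 1e-4):
    print("lambda =", lam, "positive definite:",
          is_positive_definite(gram(item_factors, lam)))
```

Whether a given dataset hits this depends on how many ratings each user and
item has relative to the rank, so treat the toy case as an illustration only.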

Run with a very small lambda (I tested with 1e-4) and you should see the
RMSE decrease as you expect.
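
For intuition on what ALS actually minimizes, here is a rough rank-1 sketch in
plain Python. It is not the MLlib implementation: the function `als_rank1`,
the toy ratings and the plain lambda * (|u|^2 + |v|^2) penalty are all
assumptions made for illustration. Each update is the closed-form minimizer of
squared loss plus that penalty for one factor, so their sum cannot go up from
one iteration to the next, while the RMSE on its own carries no such guarantee.

```python
import math

def als_rank1(ratings, n_users, n_items, lam, iterations):
    """Rank-1 ALS on (user, item, rating) triples.

    Returns one (training RMSE, regularized objective) pair per iteration.
    """
    u = [1.0] * n_users
    v = [1.0] * n_items
    history = []
    for _ in range(iterations):
        # Fix item factors; each user factor is a 1-D ridge regression.
        for user in range(n_users):
            num = sum(r * v[j] for (i, j, r) in ratings if i == user)
            den = sum(v[j] ** 2 for (i, j, r) in ratings if i == user) + lam
            u[user] = num / den
        # Fix user factors; update each item factor the same way.
        for item in range(n_items):
            num = sum(r * u[i] for (i, j, r) in ratings if j == item)
            den = sum(u[i] ** 2 for (i, j, r) in ratings if j == item) + lam
            v[item] = num / den
        squared_loss = sum((r - u[i] * v[j]) ** 2 for (i, j, r) in ratings)
        penalty = lam * (sum(x * x for x in u) + sum(x * x for x in v))
        rmse = math.sqrt(squared_loss / len(ratings))
        history.append((rmse, squared_loss + penalty))
    return history

# Hypothetical toy data: (user, item, rating).
toy_ratings = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0),
               (1, 2, 1.0), (2, 1, 2.0), (2, 2, 5.0)]

history = als_rank1(toy_ratings, n_users=3, n_items=3, lam=0.1, iterations=10)
for step, (rmse, objective) in enumerate(history, start=1):
    print("iteration", step, "train RMSE =", rmse, "objective =", objective)
```

The column to watch is the objective. The RMSE column is printed only so the
two can be compared side by side.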

On Thu, Nov 27, 2014 at 3:04 AM, Kostas Kloudas <kklou...@gmail.com> wrote:

> Thanks a lot for your time guys and your quick replies!
>
> > On Nov 26, 2014, at 7:53 PM, Xiangrui Meng <men...@gmail.com> wrote:
> >
> > The training RMSE may increase due to regularization. Squared loss
> > only represents part of the global loss. If you watch the sum of the
> > squared loss and the regularization, it should be non-increasing.
> > -Xiangrui
> >
> > On Wed, Nov 26, 2014 at 9:53 AM, Sean Owen <so...@cloudera.com> wrote:
> >> I also modified the example to try 1, 5, 9, ... iterations as you did,
> >> and also ran with the same default parameters. I used the
> >> sample_movielens_data.txt file. Is that what you're using?
> >>
> >> My result is:
> >>
> >> Iteration 1 Test RMSE = 1.426079653593016 Train RMSE = 1.5013155094216357
> >> Iteration 5 Test RMSE = 1.405598012724468 Train RMSE = 1.4847078708333596
> >> Iteration 9 Test RMSE = 1.4055990901261632 Train RMSE = 1.484713206769993
> >> Iteration 13 Test RMSE = 1.4055990999738366 Train RMSE = 1.4847132332994588
> >> Iteration 17 Test RMSE = 1.40559910003368 Train RMSE = 1.48471323345531
> >> Iteration 21 Test RMSE = 1.4055991000342158 Train RMSE = 1.4847132334567061
> >> Iteration 25 Test RMSE = 1.4055991000342174 Train RMSE = 1.4847132334567108
> >>
> >> Train error is higher than test error, consistently, which could be
> >> underfitting. A higher rank=50 gets a reasonable result:
> >>
> >> Iteration 1 Test RMSE = 1.5981883186995312 Train RMSE = 1.4841671360432005
> >> Iteration 5 Test RMSE = 1.5745145659678204 Train RMSE = 1.4672341345080382
> >> Iteration 9 Test RMSE = 1.5745147110505406 Train RMSE = 1.4672385714907996
> >> Iteration 13 Test RMSE = 1.5745147108258577 Train RMSE = 1.4672385929631868
> >> Iteration 17 Test RMSE = 1.5745147108246424 Train RMSE = 1.4672385930428344
> >> Iteration 21 Test RMSE = 1.5745147108246367 Train RMSE = 1.4672385930431973
> >> Iteration 25 Test RMSE = 1.5745147108246367 Train RMSE = 1.467238593043199
> >>
> >> I'm not sure what the difference is. I looked at your modifications
> >> and they seem very similar. Is it the data you're using?
> >>
> >>
> >> On Wed, Nov 26, 2014 at 3:34 PM, Kostas Kloudas <kklou...@gmail.com> wrote:
> >>> For the training I am using the code in the MovieLensALS example with
> >>> trainImplicit set to false
> >>> and for the training RMSE I use the
> >>>
> >>> val rmseTr = computeRmse(model, training, params.implicitPrefs).
> >>>
> >>> The computeRmse() method is provided in the MovieLensALS class.
> >>>
> >>>
> >>> Thanks a lot,
> >>> Kostas
> >>>
> >>>
> >>>> On Nov 26, 2014, at 2:41 PM, Sean Owen <so...@cloudera.com> wrote:
> >>>>
> >>>> How are you computing RMSE?
> >>>> and how are you training the model -- not with trainImplicit right?
> >>>> I wonder if you are somehow optimizing something besides RMSE.
> >>>>
> >>>> On Wed, Nov 26, 2014 at 2:36 PM, Kostas Kloudas <kklou...@gmail.com> wrote:
> >>>>> Once again, the error even with the training dataset increases. The
> >>>>> results are:
> >>>>>
> >>>>> Running 1 iterations
> >>>>> For 1 iter.: Test RMSE  = 1.2447121194304893  Training RMSE =
> >>>>> 1.2394166987104076 (34.751317636 s).
> >>>>> Running 5 iterations
> >>>>> For 5 iter.: Test RMSE  = 1.3253957117600659  Training RMSE =
> >>>>> 1.3206317416138509 (37.693118023000004 s).
> >>>>> Running 9 iterations
> >>>>> For 9 iter.: Test RMSE  = 1.3255293380139364  Training RMSE =
> >>>>> 1.3207661218210436 (41.046175661 s).
> >>>>> Running 13 iterations
> >>>>> For 13 iter.: Test RMSE  = 1.3255295352665748  Training RMSE =
> >>>>> 1.3207663201865092 (47.763619515 s).
> >>>>> Running 17 iterations
> >>>>> For 17 iter.: Test RMSE  = 1.32552953555787  Training RMSE =
> >>>>> 1.3207663204794406 (59.682361103000005 s).
> >>>>> Running 21 iterations
> >>>>> For 21 iter.: Test RMSE  = 1.3255295355583026  Training RMSE =
> >>>>> 1.3207663204798756 (57.210578232 s).
> >>>>> Running 25 iterations
> >>>>> For 25 iter.: Test RMSE  = 1.325529535558303  Training RMSE =
> >>>>> 1.3207663204798765 (65.785485882 s).
> >>>>>
> >>>>> Thanks a lot,
> >>>>> Kostas
> >>>>>
> >>>>> On Nov 26, 2014, at 12:04 PM, Nick Pentreath <nick.pentre...@gmail.com> wrote:
> >>>>>
> >>>>> copying user group - I keep replying directly vs reply all :)
> >>>>>
> >>>>> On Wed, Nov 26, 2014 at 2:03 PM, Nick Pentreath <nick.pentre...@gmail.com> wrote:
> >>>>>>
> >>>>>> ALS will be guaranteed to decrease the squared error (therefore RMSE) in
> >>>>>> each iteration, on the training set.
> >>>>>>
> >>>>>> This does not hold for the test set / cross validation. You would expect
> >>>>>> the test set RMSE to stabilise as iterations increase, since the algorithm
> >>>>>> converges - but not necessarily to decrease.
> >>>>>>
> >>>>>> On Wed, Nov 26, 2014 at 1:57 PM, Kostas Kloudas <kklou...@gmail.com> wrote:
> >>>>>>>
> >>>>>>> Hi all,
> >>>>>>>
> >>>>>>> I am getting familiar with MLlib, and one thing I noticed is that
> >>>>>>> running the MovieLensALS example on the MovieLens dataset for an
> >>>>>>> increasing number of iterations does not decrease the RMSE.
> >>>>>>>
> >>>>>>> The results for 0.6% training set and 0.4% test are below. With the
> >>>>>>> training set at 0.8%, the results are almost identical. Shouldn’t it be
> >>>>>>> normal to see a decreasing error? Especially going from 1 to 5 iterations.
> >>>>>>>
> >>>>>>> Running 1 iterations
> >>>>>>> Test RMSE for 1 iter. = 1.2452964343277886 (52.757125927000004 s).
> >>>>>>> Running 5 iterations
> >>>>>>> Test RMSE for 5 iter. = 1.3258973764470259 (61.183927666 s).
> >>>>>>> Running 9 iterations
> >>>>>>> Test RMSE for 9 iter. = 1.3260308117704385 (61.84948875800001 s).
> >>>>>>> Running 13 iterations
> >>>>>>> Test RMSE for 13 iter. = 1.3260310099809915 (73.799510125 s).
> >>>>>>> Running 17 iterations
> >>>>>>> Test RMSE for 17 iter. = 1.3260310102735398 (77.56512185300001 s).
> >>>>>>> Running 21 iterations
> >>>>>>> Test RMSE for 21 iter. = 1.3260310102739703 (79.607495074 s).
> >>>>>>> Running 25 iterations
> >>>>>>> Test RMSE for 25 iter. = 1.326031010273971 (88.631776301 s).
> >>>>>>> Running 29 iterations
> >>>>>>> Test RMSE for 29 iter. = 1.3260310102739712 (101.178383079 s).
> >>>>>>>
> >>>>>>> Thanks  a lot,
> >>>>>>> Kostas
> >>>>>>>
> >>>>>>> ---------------------------------------------------------------------
> >>>>>>> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
> >>>>>>> For additional commands, e-mail: user-h...@spark.apache.org
> >>>>>>>
> >>>>>>
> >>>>>
> >>>>>
> >>>
> >>
> >>
>
>
>
>
