You could try L-BFGS for more stable convergence. In Spark 1.1, we will be able to use L-BFGS instead of GD in the training process.

On Jul 4, 2014 1:23 PM, "Thomas Robert" <tho...@creativedata.fr> wrote:
> Hi all,
>
> I too am having some issues with the *RegressionWithSGD algorithms.
>
> Concerning your issue, Eustache, this could be because these regression
> algorithms use a fixed step size (which is divided by sqrt(iteration)).
> During my tests, the algorithm quite often diverged to an infinite cost,
> I guess because the step was too big. I reduced it and managed to get
> good results on a very simple generated dataset.
>
> But I was wondering if anyone here had some advice concerning the use of
> these regression algorithms, for example how to choose a good step size
> and number of iterations? I wonder if I'm using those right...
>
> Thanks,
>
> --
>
> *Thomas ROBERT*
> www.creativedata.fr
>
>
> 2014-07-03 16:16 GMT+02:00 Eustache DIEMERT <eusta...@diemert.fr>:
>
>> Printing the model shows the intercept is always 0 :(
>>
>> Should I open a bug for that?
>>
>>
>> 2014-07-02 16:11 GMT+02:00 Eustache DIEMERT <eusta...@diemert.fr>:
>>
>>> Hi list,
>>>
>>> I'm benchmarking MLlib for a regression task [1] and get strange
>>> results.
>>>
>>> Namely, using RidgeRegressionWithSGD it seems the predicted points miss
>>> the intercept:
>>>
>>> {code}
>>> val trainedModel = RidgeRegressionWithSGD.train(trainingData, 1000)
>>> ...
>>> valuesAndPreds.take(10).map(t => println(t))
>>> {code}
>>>
>>> output:
>>>
>>> (2007.0,-3.784588726958493E75)
>>> (2003.0,-1.9562390324037716E75)
>>> (2005.0,-4.147413202985629E75)
>>> (2003.0,-1.524938024096847E75)
>>> ...
>>>
>>> If I change the parameters (step size, regularization and iterations) I
>>> get NaNs more often than not:
>>> (2007.0,NaN)
>>> (2003.0,NaN)
>>> (2005.0,NaN)
>>> ...
>>>
>>> On the other hand, the DecisionTree model gives sensible results.
>>>
>>> I see there is a `setIntercept()` method in abstract class
>>> GeneralizedLinearAlgorithm that seems to trigger the use of the intercept
>>> but I'm unable to use it from the public interface :(
>>>
>>> Any help appreciated :)
>>>
>>> Eustache
>>>
>>> [1] https://archive.ics.uci.edu/ml/datasets/YearPredictionMSD
>>>
>>
>
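
For illustration, the step-size behaviour Thomas describes in the quoted message (a base step divided by sqrt(iteration)) can be sketched in plain Scala with no Spark dependency. This is only a toy under stated assumptions: a 1-D least-squares model without intercept, full-batch gradients, and made-up data. The object name `StepSizeSketch` and its helpers are hypothetical and are not MLlib's implementation. It may help explain why unscaled features combined with a large step could produce huge predictions or NaNs like the ones in the output above.

```scala
object StepSizeSketch {
  // Gradient of the mean squared error (up to a constant factor)
  // for the 1-D model y ≈ w * x, with no intercept term.
  def gradient(w: Double, xs: Array[Double], ys: Array[Double]): Double = {
    var g = 0.0
    var i = 0
    while (i < xs.length) {
      g += (w * xs(i) - ys(i)) * xs(i)
      i += 1
    }
    g / xs.length
  }

  // Gradient descent where the step used at iteration t is stepSize / sqrt(t).
  def fit(xs: Array[Double], ys: Array[Double],
          stepSize: Double, numIterations: Int): Double = {
    var w = 0.0
    for (t <- 1 to numIterations) {
      w -= (stepSize / math.sqrt(t)) * gradient(w, xs, ys)
    }
    w
  }

  def main(args: Array[String]): Unit = {
    // Hypothetical unscaled data generated from y = 2 * x.
    val xs = Array(10.0, 20.0, 30.0, 40.0)
    val ys = xs.map(_ * 2.0)

    println(fit(xs, ys, stepSize = 0.001, numIterations = 100)) // small step
    println(fit(xs, ys, stepSize = 1.0, numIterations = 100))   // step too large
    println(fit(xs, ys, stepSize = 1.0, numIterations = 1000))  // same, more iterations
  }
}
```

On this toy data the first call ends up close to 2.0, the second returns a weight of enormous magnitude because every update overshoots, and the third overflows and ends as NaN. Lowering the step size or rescaling the features are the first things to try in that situation.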