Hi list,

I'm benchmarking MLlib on a regression task [1] and am getting strange results.
Namely, with RidgeRegressionWithSGD the predictions look as if the intercept is missing:

{code}
import org.apache.spark.mllib.regression.RidgeRegressionWithSGD

// Train with 1000 iterations and default step size / regularization
val trainedModel = RidgeRegressionWithSGD.train(trainingData, 1000)
...
// Print the first 10 (label, prediction) pairs
valuesAndPreds.take(10).foreach(println)
{code}

Output:

(2007.0,-3.784588726958493E75)
(2003.0,-1.9562390324037716E75)
(2005.0,-4.147413202985629E75)
(2003.0,-1.524938024096847E75)
...

If I change the parameters (step size, regularization and iterations) I get NaNs more often than not:

(2007.0,NaN)
(2003.0,NaN)
(2005.0,NaN)
...

On the other hand, the DecisionTree model gives sensible results.

I see there is a `setIntercept()` method in the abstract class GeneralizedLinearAlgorithm that seems to enable the intercept, but I'm unable to call it through the public interface :(

Any help appreciated :)

Eustache

[1] https://archive.ics.uci.edu/ml/datasets/YearPredictionMSD
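
P.S. To make concrete what I mean by "missing the intercept", here is a toy sketch in plain Scala. It is not MLlib and not my actual benchmark: the data is made up, the object and function names (`InterceptDemo`, `fitNoIntercept`, `fitWithIntercept`) are hypothetical, and it uses closed-form least squares instead of ridge regression with SGD. With a centered feature and year-like labels, the fit without an intercept predicts values near 0, while the fit with an intercept predicts values near 2005.

```scala
// Toy illustration (plain Scala, no Spark): least-squares fit of y on a single
// feature x, with and without an intercept term. All names and data are made up.
object InterceptDemo {

  // Closed-form least squares for the model y = w * x (no intercept).
  def fitNoIntercept(xs: Seq[Double], ys: Seq[Double]): Double = {
    val sxy = xs.zip(ys).map { case (x, y) => x * y }.sum
    val sxx = xs.map(x => x * x).sum
    sxy / sxx
  }

  // Closed-form least squares for the model y = w * x + b; returns (w, b).
  def fitWithIntercept(xs: Seq[Double], ys: Seq[Double]): (Double, Double) = {
    val n = xs.size.toDouble
    val meanX = xs.sum / n
    val meanY = ys.sum / n
    val cov = xs.zip(ys).map { case (x, y) => (x - meanX) * (y - meanY) }.sum
    val varX = xs.map(x => (x - meanX) * (x - meanX)).sum
    val w = cov / varX
    (w, meanY - w * meanX)
  }

  def main(args: Array[String]): Unit = {
    // Made-up data: a centered feature and year-like labels around 2005.
    val xs = Seq(-1.0, 0.0, 1.0)
    val ys = Seq(2003.0, 2005.0, 2007.0)

    val w0 = fitNoIntercept(xs, ys)
    val (w1, b1) = fitWithIntercept(xs, ys)

    // Print label and both predictions side by side for each point.
    xs.zip(ys).foreach { case (x, y) =>
      println(s"label=$y noIntercept=${w0 * x} withIntercept=${w1 * x + b1}")
    }
  }
}
```

This does not account for the 1E75 magnitudes or the NaNs above; it only shows the kind of offset I would expect from a model that has no intercept term.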