Hi
There are many variants of gradient descent, mostly differing in what the step
size is and how it is adjusted as the algorithm proceeds. There are further
variations depending on whether a stochastic variant is used (as opposed to
batch descent). I don’t know off-hand what MLlib’s
I don’t get those results. I get:
spark         0.14
scikit-learn  0.85
The high scikit-learn mse is due to the very low eta0 setting. Tweak that to
0.1 and push the iterations to 400 and you get an mse ~= 0. Of course, the
coefficients are then both ~1 and the intercept ~0. Similarly if you change the
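Something along these lines (a sketch: the data below is a stand-in since I
don't have your dataset, and older scikit-learn versions spell max_iter as
n_iter):

    import numpy as np
    from sklearn.linear_model import SGDRegressor
    from sklearn.metrics import mean_squared_error

    # Stand-in data: y = x1 + x2, so the true coefficients are ~1
    # and the true intercept is ~0
    rng = np.random.default_rng(0)
    X = rng.uniform(size=(1000, 2))
    y = X[:, 0] + X[:, 1]

    # eta0=0.1 and 400 passes; tol=None disables early stopping
    model = SGDRegressor(eta0=0.1, max_iter=400, tol=None)
    model.fit(X, y)

    print(mean_squared_error(y, model.predict(X)))  # ~0
    print(model.coef_, model.intercept_)            # ~[1, 1], ~0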
Ah I see, thanks!
I was just confused because, given the same configuration, I would have
thought that Spark and scikit-learn would give more similar results, but I
guess this is simply not the case (as in your example, in order to get Spark
to give an mse sufficiently close to scikit-learn's you have to give
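For reference, here is roughly the Spark call I mean, using the old RDD-based
LinearRegressionWithSGD API (a sketch: the step and iterations values are
illustrative guesses on my part, not a verified match for the scikit-learn
numbers):

    from pyspark import SparkContext
    from pyspark.mllib.regression import LabeledPoint, LinearRegressionWithSGD

    sc = SparkContext("local", "sgd-demo")

    # Stand-in data: label = x1 + x2, same shape as the scikit-learn example
    points = [LabeledPoint(x1 + x2, [x1, x2])
              for x1, x2 in ((0.1, 0.2), (0.4, 0.9), (0.5, 0.3), (0.8, 0.7))]
    data = sc.parallelize(points)

    # step and iterations are the knobs that have to be tuned to match
    model = LinearRegressionWithSGD.train(data, iterations=400, step=0.1,
                                          intercept=True)

    preds = data.map(lambda p: (p.label, model.predict(p.features)))
    mse = preds.map(lambda lp: (lp[0] - lp[1]) ** 2).mean()
    print(mse)
    sc.stop()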