Thanks DB Tsai, it is very helpful.
Cheers,
Wei
2015-06-23 16:00 GMT-07:00 DB Tsai :
> Please see the current version of code for better documentation.
>
> https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/regression/LinearRegression.scala
>
> Sincerely,
>
> DB
Please see the current version of code for better documentation.
https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/regression/LinearRegression.scala
Sincerely,
DB Tsai
--
Blog: https://www.dbtsai.com
The regularization is handled after the objective function of the data is
computed. See
https://github.com/apache/spark/blob/6a827d5d1ec520f129e42c3818fe7d0d870dcbef/mllib/src/main/scala/org/apache/spark/ml/regression/LinearRegression.scala
line 348 for L2.
For L1, it's handled by OWLQN, so you don't need to add it to the objective yourself.
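To make this concrete, a minimal Breeze-based sketch of that structure (simplified, not Spark's actual code; dataLossAndGradient is a hypothetical stand-in for Spark's LeastSquaresCostFun / LeastSquaresAggregator): the L2 penalty is added to the loss and gradient after the data objective is computed, while the L1 penalty never appears there because it is passed to OWLQN instead.

import breeze.linalg.{DenseVector => BDV}
import breeze.optimize.{DiffFunction, LBFGS, OWLQN}

// Sketch only. dataLossAndGradient is a hypothetical helper returning the
// data loss 1/(2n) ||A w - y||^2 and its gradient for a given weight vector.
def costFun(dataLossAndGradient: BDV[Double] => (Double, BDV[Double]),
            effectiveL2RegParam: Double): DiffFunction[BDV[Double]] =
  new DiffFunction[BDV[Double]] {
    def calculate(weights: BDV[Double]): (Double, BDV[Double]) = {
      val (dataLoss, dataGrad) = dataLossAndGradient(weights)
      // L2 is handled here, after the data objective (cf. line 348 above).
      val loss = dataLoss + 0.5 * effectiveL2RegParam * (weights dot weights)
      val grad = dataGrad + (weights * effectiveL2RegParam)
      (loss, grad)
    }
  }

// L1 is never added to the objective above: when there is an L1 term the
// optimizer is OWLQN, which applies the L1 penalty internally.
def chooseOptimizer(effectiveL1RegParam: Double, maxIter: Int, tol: Double) =
  if (effectiveL1RegParam == 0.0) new LBFGS[BDV[Double]](maxIter, 10, tol)
  else new OWLQN[Int, BDV[Double]](maxIter, 10, effectiveL1RegParam, tol)

The chosen optimizer is then run on the cost function with something like
optimizer.iterations(costFun(...), initialWeights).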
Hi DB Tsai,
Thanks for your reply. I went through the source code of
LinearRegression.scala. The algorithm minimizes the squared error
L = 1/(2n) ||A weights - y||^2. I cannot match this with the elastic-net loss
function found here: http://web.stanford.edu/~hastie/glmnet/glmnet_alpha.html,
which is the standard elastic-net formulation.
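For cross-reference, the elastic-net objective described on the glmnet page, written in the same notation, is

  L(weights) = 1/(2n) ||A weights - y||^2
               + lambda * ( alpha * ||weights||_1 + (1 - alpha)/2 * ||weights||_2^2 )

so the 1/(2n) squared-error term in LinearRegression.scala is only the data part of the objective; the lambda and alpha penalty terms correspond to regParam and elasticNetParam and are added during optimization, as described in the reply about regularization handling above.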
Hi Wei,
I don't think ML is meant for single-node computation; the
algorithms in ML are designed for the pipeline framework.
In short, the lasso regression in ML is a new algorithm implemented from
scratch; it's faster and converges to the same solution as R's
glmnet, but with scalability.
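As a minimal sketch of the ML (DataFrame/pipeline) API being described, assuming training is a DataFrame with "label" and "features" columns (column names and hyperparameter values here are placeholders):

import org.apache.spark.ml.regression.LinearRegression

// elasticNetParam = 1.0 gives pure L1 (lasso), 0.0 gives pure L2 (ridge),
// values in between mix the two; regParam is the overall penalty strength.
val lr = new LinearRegression()
  .setMaxIter(100)
  .setRegParam(0.1)
  .setElasticNetParam(1.0)

val model = lr.fit(training)

The older RDD-based implementations live under org.apache.spark.mllib.regression (e.g. LassoWithSGD), which use stochastic gradient descent rather than the OWLQN/L-BFGS solver used by ml.regression.LinearRegression.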
Hi Spark experts,
I see lasso regression / elastic-net implementations under both MLlib and ML;
does anyone know what the difference between the two implementations is?
At Spark Summit, one of the keynote speakers mentioned that ML is meant for
single-node computation; could anyone elaborate on this?