[ 
https://issues.apache.org/jira/browse/SPARK-9834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14734170#comment-14734170
 ] 

Debasish Das edited comment on SPARK-9834 at 9/8/15 3:18 PM:
-------------------------------------------------------------

[~mengxr] If you are open to use breeze.proximal.QuadraticMinimizer we can 
support elastic net in this variant as well...The flow will be very similar to 
QuadraticMinimizer integration to ALS...I have done runtime benchmarks compared 
to OWLQN and if we can afford to do dense cholesky QuadraticMinimizer converges 
faster than OWLQN.

There are two new QuadraticMinimizer features I am working on which will 
further improve the solver:
1. sparse ldl through tim davis lgpl code and using breeze sparse matrix for 
sparse gram and conic formulations. Plan is to add it in breeze-native under 
lgpl similar to netlib-java integration.
2. admm acceleration using nesterov method. ADMM can be run in the same 
complexity as FISTA (implemented in TFOCS). Reference: 
http://www.optimization-online.org/DB_FILE/2009/12/2502.pdf

Although in practice I found even the ADMM implemented right now in 
QuadraticMinimizer converges faster than OWLQN. Tom in his paper demonstrated 
faster ADMM convergence compared to FISTA for quadratic problems: 
ftp://ftp.math.ucla.edu/pub/camreport/cam12-35.pdf. 

Due to the X^TX availability in these problems (ALS and linear regression) I 
also compute the min and max eigen values using power iteration 
(breeze.optimize.linear.PowerMethod) in the code which gives the Lipschitz 
estimator L and there is no line search overhead. This trick did not work for 
the nonlinear variant as the hessian estimates are not close to gram matrix !

QuadraticMinimizer is optimized to run at par with blas dposv when there are no 
constraints while BFGS/OWLQN both still have lot of overhead from iterators 
etc. That might also be the reason that I see QuadraticMinimizer is faster than 
BFGS/OWLQN.

It might be the right time to do the micro-benchmark as well that you asked for 
QuadraticMinimizer. Let me know what you think. I can finish up the 
micro-benchmark, bring the runtime of QuadraticMinimizer to ALS 
NormalEquationSolver and then start the L1 experiments.


was (Author: debasish83):
If you are open to use breeze.proximal.QuadraticMinimizer we can support 
elastic net in this variant as well...I can add it on top of your PR...it will 
be very similar to quadraticminimizer integration to ALS...I have done runtime 
benchmarks compared to OWLQN and if we can afford to do dense cholesky 
QuadraticMinimizer converges faster than OWLQN...there are two new features I 
am working on...sparse ldl through tim davis lgpl code and using breeze sparse 
matrix for sparse gram and conic formulations and admm acceleration using 
nesterov method...admm can also be run in the same complexity as FISTA...david 
goldferb proved it.

> Normal equation solver for ordinary least squares
> -------------------------------------------------
>
>                 Key: SPARK-9834
>                 URL: https://issues.apache.org/jira/browse/SPARK-9834
>             Project: Spark
>          Issue Type: New Feature
>          Components: ML
>            Reporter: Xiangrui Meng
>            Assignee: Xiangrui Meng
>
> Add normal equation solver for ordinary least squares with not many features. 
> The approach requires one pass to collect AtA and Atb, then solve the problem 
> on driver. It works well when the problem is not very ill-conditioned and not 
> having many columns. It also provides R-like summary statistics.
> We can hide this implementation under LinearRegression. It is triggered when 
> there are no more than, e.g., 4096 features.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org

Reply via email to