[jira] [Commented] (SPARK-9834) Normal equation solver for ordinary least squares
[ https://issues.apache.org/jira/browse/SPARK-9834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15315935#comment-15315935 ]

Debasish Das commented on SPARK-9834:

Do you have runtime comparisons showing that, when features <= 4096, OLS using normal equations is faster than BFGS?

> Normal equation solver for ordinary least squares
>
> Key: SPARK-9834
> URL: https://issues.apache.org/jira/browse/SPARK-9834
> Project: Spark
> Issue Type: New Feature
> Components: ML
> Reporter: Xiangrui Meng
> Assignee: Xiangrui Meng
> Fix For: 1.6.0
>
> Add a normal equation solver for ordinary least squares with not many features.
> The approach requires one pass to collect AtA and Atb, then solves the problem
> on the driver. It works well when the problem is not very ill-conditioned and does
> not have many columns. It also provides R-like summary statistics.
> We can hide this implementation under LinearRegression. It is triggered when
> there are no more than, e.g., 4096 features.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)

To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
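To make the approach in the issue description concrete, here is a minimal single-machine sketch of the two steps it names: one pass that accumulates AtA and Atb, followed by a local solve of AtA w = Atb. It is not the code from the pull request. The function names (`gram_stats`, `cholesky_solve`, `fit_ols`) are hypothetical, the use of a dense Cholesky factorization is an assumption, and in Spark the accumulation would presumably be a distributed aggregation rather than a Python loop.

```python
import math


def gram_stats(rows):
    """One pass over (features, label) pairs, accumulating A^T A and A^T b."""
    ata, atb = None, None
    for x, y in rows:
        n = len(x)
        if ata is None:
            ata = [[0.0] * n for _ in range(n)]
            atb = [0.0] * n
        for i in range(n):
            atb[i] += x[i] * y
            for j in range(n):
                ata[i][j] += x[i] * x[j]
    return ata, atb


def cholesky_solve(a, b):
    """Solve a w = b for a symmetric positive definite a, using a = L L^T."""
    n = len(b)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    # Rank-deficient or badly ill-conditioned problem.
                    raise ValueError("A^T A is not positive definite")
                low[i][j] = math.sqrt(s)
            else:
                low[i][j] = s / low[j][j]
    # Forward substitution: L z = b.
    z = [0.0] * n
    for i in range(n):
        z[i] = (b[i] - sum(low[i][k] * z[k] for k in range(i))) / low[i][i]
    # Back substitution: L^T w = z.
    w = [0.0] * n
    for i in reversed(range(n)):
        w[i] = (z[i] - sum(low[k][i] * w[k] for k in range(i + 1, n))) / low[i][i]
    return w


def fit_ols(rows):
    """Least-squares coefficients (no intercept) from a single pass over rows."""
    ata, atb = gram_stats(rows)
    return cholesky_solve(ata, atb)
```

The local solve costs O(n^3) in the number of features and the statistics take O(n^2) memory, independent of the number of rows, which is consistent with the issue restricting this path to problems with few columns.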
[jira] [Commented] (SPARK-9834) Normal equation solver for ordinary least squares
[ https://issues.apache.org/jira/browse/SPARK-9834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14734170#comment-14734170 ]

Debasish Das commented on SPARK-9834:

If you are open to using breeze.proximal.QuadraticMinimizer, we can support elastic net in this variant as well. I can add it on top of your PR; it will be very similar to the QuadraticMinimizer integration into ALS. I have run runtime benchmarks against OWLQN, and if we can afford a dense Cholesky, QuadraticMinimizer converges faster than OWLQN. There are two new features I am working on: (1) sparse LDL through Tim Davis's LGPL code, using Breeze sparse matrices for sparse Gram and conic formulations, and (2) ADMM acceleration using Nesterov's method. ADMM can also be run in the same complexity as FISTA; David Goldferb proved it.
[jira] [Commented] (SPARK-9834) Normal equation solver for ordinary least squares
[ https://issues.apache.org/jira/browse/SPARK-9834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14732440#comment-14732440 ]

DB Tsai commented on SPARK-9834:

In fact, for linear regression, if the number of features is small, X^TX is the only statistic required to implement one-pass elastic net. Once X^TX is computed, we can use a local solver to optimize the objective function without going through the data again. Here is the algorithm Kun implemented when he was an intern at Alpine Data Labs: http://arxiv.org/pdf/1307.0048v1.pdf

Maybe what we can implement is the following. When the number of features is < 4096, we compute X^TX first. Then, if there is no L1 penalty, we compute the model using the exact normal equation; if there is an L1 penalty, we compute the model using LBFGS with X^TX, without going through the data again.
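As a rough illustration of the L1 branch described above, the sketch below fits an elastic-net model from the Gram statistics alone, never touching the rows again. Several things here are assumptions rather than anything stated in the thread: it uses cyclic coordinate descent with soft-thresholding instead of the LBFGS variant mentioned in the comment, it assumes X^Ty was collected in the same pass as X^TX (the Atb of the issue description), and the names `solve_from_gram`, `l1`, and `l2` are hypothetical. The objective minimized is 0.5 w'Gw - c'w + l1*|w|_1 + 0.5*l2*|w|^2 with G = X^TX and c = X^Ty.

```python
def soft_threshold(v, t):
    """Proximal operator of t * |.|: shrink v toward zero by t."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def solve_from_gram(gram, xty, l1=0.0, l2=0.0, iters=1000, tol=1e-12):
    """Elastic-net fit from G = X^T X and c = X^T y only.

    Cyclic coordinate descent; assumes every gram[j][j] + l2 is positive.
    """
    n = len(xty)
    w = [0.0] * n
    for _ in range(iters):
        max_delta = 0.0
        for j in range(n):
            # Correlation of feature j with the residual, excluding its own term.
            rho = xty[j] - sum(gram[j][k] * w[k] for k in range(n) if k != j)
            new = soft_threshold(rho, l1) / (gram[j][j] + l2)
            max_delta = max(max_delta, abs(new - w[j]))
            w[j] = new
        if max_delta < tol:
            break
    return w
```

With `l1=0` and `l2=0` the loop reduces to Gauss-Seidel iteration on the normal equations, so for a positive definite Gram matrix it converges to the same solution a direct solve would give; a direct factorization is still the natural choice for that case, as the comment suggests.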
[jira] [Commented] (SPARK-9834) Normal equation solver for ordinary least squares
[ https://issues.apache.org/jira/browse/SPARK-9834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14729388#comment-14729388 ]

Apache Spark commented on SPARK-9834:

User 'mengxr' has created a pull request for this issue: https://github.com/apache/spark/pull/8588