[ https://issues.apache.org/jira/browse/SPARK-20047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

DB Tsai updated SPARK-20047:
----------------------------
    Description: 
For certain applications, such as stacked regressions, it is important to impose 
non-negativity constraints on the regression coefficients. Likewise, if the 
ranges of the coefficients are known in advance, it makes sense to constrain the 
coefficient search space accordingly.

Fitting generalized constrained regression models subject to Cβ ≤ b, where C ∈ 
R^{m×p} and b ∈ R^m are a predefined matrix and vector placing a set of m 
linear constraints on the coefficients, is very challenging, as discussed 
extensively in the literature. 

However, for box constraints on the coefficients, the optimization is well 
understood. With gradient descent, one can run projected gradient descent in the 
primal, zeroing out the negative weights at each step. For LBFGS, an extended 
version, LBFGS-B, can handle large-scale box-constrained optimization 
efficiently (see the sketch below). Unfortunately, for OWLQN there is no good, 
efficient way to optimize under box constraints.
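As a rough illustration, here is a minimal sketch of box-constrained 
optimization with Breeze's LBFGSB solver (Breeze is the optimization library 
MLlib already builds on; the toy objective and all names below are 
illustrative, not the proposed implementation):

{code:scala}
import breeze.linalg.DenseVector
import breeze.optimize.{DiffFunction, LBFGSB}

object BoxConstrainedDemo {
  def main(args: Array[String]): Unit = {
    // Toy objective f(x) = ||x - c||^2 whose unconstrained minimum
    // c = (-1, 2) violates a non-negativity constraint on x(0).
    val c = DenseVector(-1.0, 2.0)
    val f = new DiffFunction[DenseVector[Double]] {
      def calculate(x: DenseVector[Double]): (Double, DenseVector[Double]) = {
        val diff = x - c
        (diff.dot(diff), diff * 2.0)
      }
    }

    // Box constraints 0 <= x(i) <= 10: LBFGSB keeps every iterate inside
    // the box while retaining quasi-Newton convergence behavior.
    val solver = new LBFGSB(DenseVector(0.0, 0.0), DenseVector(10.0, 10.0))
    val xOpt = solver.minimize(f, DenseVector(5.0, 5.0))
    println(xOpt) // expected ~ (0.0, 2.0): x(0) is clamped at its lower bound
  }
}
{code}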

As a result, in this work we only implement constrained LR with box constraints 
and without L1 regularization (L1 would require OWLQN, which cannot handle the 
box constraints). 

Note that since we standardize the data in the training phase, the coefficients 
seen by the optimization subroutine live in the scaled space; as a result, we 
need to convert the user-supplied box constraints into that scaled space as 
well.
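Concretely, if feature j is divided by its standard deviation σ_j during 
standardization, the coefficient the optimizer works with is β'_j = β_j · σ_j, 
so a user bound on β_j maps to the same bound multiplied by σ_j (assuming 
σ_j > 0). A hypothetical helper, just to make the conversion explicit:

{code:scala}
// Hypothetical helper, not Spark's actual internals: map user-supplied
// bounds on the original coefficients into the standardized space.
def boundsToScaledSpace(
    userBounds: Array[Double],
    featuresStd: Array[Double]): Array[Double] =
  userBounds.zip(featuresStd).map { case (bound, sigma) =>
    // A constant feature (sigma == 0) has its scaled coefficient pinned
    // to 0, so any bound on it degenerates to 0 here.
    bound * sigma
  }

// Non-negativity (a bound of 0) is invariant under scaling, but a bound
// of 2.0 on a feature with std 0.5 becomes 1.0 in the scaled space:
val scaledUpper = boundsToScaledSpace(Array(0.0, 2.0), Array(1.5, 0.5))
// scaledUpper == Array(0.0, 1.0)
{code}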

Users will be able to set the lower/upper bounds of each coefficient and 
intercept.
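A sketch of how the user-facing API could look (the setter names below are 
illustrative of the proposal, not a finalized API; `training` is an assumed 
DataFrame of labels and feature vectors):

{code:scala}
import org.apache.spark.ml.classification.LogisticRegression
import org.apache.spark.ml.linalg.{Matrices, Vectors}

// Binary LR over 2 features: coefficients constrained to [0, 10]
// (non-negative and bounded above) and the intercept to [-1, 1].
// Coefficient bounds are (numCoefficientSets x numFeatures) matrices;
// intercept bounds are vectors with one entry per coefficient set.
val lr = new LogisticRegression()
  .setLowerBoundsOnCoefficients(Matrices.dense(1, 2, Array(0.0, 0.0)))
  .setUpperBoundsOnCoefficients(Matrices.dense(1, 2, Array(10.0, 10.0)))
  .setLowerBoundsOnIntercepts(Vectors.dense(-1.0))
  .setUpperBoundsOnIntercepts(Vectors.dense(1.0))
  .setElasticNetParam(0.0) // L2 only: OWLQN/L1 cannot honor the bounds

val model = lr.fit(training) // `training` is an assumed DataFrame
{code}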


 


One alternative would be to modify these implementations and do projected 
gradient descent in the primal by zeroing the negative weights at each step 
(a one-step sketch follows). But this approach is unattractive because the 
nice convergence properties of the quasi-Newton methods are then lost.
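For completeness, a one-step sketch of such a projected gradient update for a 
non-negativity constraint (illustrative only):

{code:scala}
import breeze.linalg.DenseVector

// One projected-gradient step: take the usual gradient step, then
// project back onto the feasible set by zeroing negative coordinates.
def projectedGradientStep(
    x: DenseVector[Double],
    grad: DenseVector[Double],
    stepSize: Double): DenseVector[Double] =
  (x - grad * stepSize).map(v => math.max(0.0, v))
{code}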





> Constrained Logistic Regression
> -------------------------------
>
>                 Key: SPARK-20047
>                 URL: https://issues.apache.org/jira/browse/SPARK-20047
>             Project: Spark
>          Issue Type: New Feature
>          Components: MLlib
>    Affects Versions: 2.1.0
>            Reporter: DB Tsai
>            Assignee: Yanbo Liang
>



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
