[ 
https://issues.apache.org/jira/browse/SPARK-17163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15435497#comment-15435497
 ] 

Seth Hendrickson commented on SPARK-17163:
------------------------------------------

I completely agree that there would not be much need for LOR if the two are 
kept separate. Either way we will likely end up with a single interface, 
whether we force it explicitly by merging them or let it happen as one falls 
out of use. If that's the case, I prefer a single interface called just 
{{LogisticRegression}} which handles all cases. 

As a clarification on the pivoting subject: we explicitly elected NOT to 
implement pivoting for the multinomial case, because that matches glmnet and 
does not single out one class as the pivot. Implementing pivoting could be as 
simple as locking one set of coefficients at zero during training (i.e. we 
don't have to hard code a modified gradient update formula like in mllib). What 
do you think about exposing a merged API which has a {{family}} parameter? When 
the family is set to {{"binomial"}} we produce normal logistic regression with 
pivoting, and when it is set to {{"multinomial"}} (default) we produce logistic 
regression without pivoting. This is basically how glmnet exposes it. The 
concern would be that users get different results between 2.0 and 2.1, but we 
could fix that by making the default {{"binomial"}} for binary classification.
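
To make the pivoting idea concrete, here is a minimal plain-Python sketch (not 
Spark code). It assumes a softmax parameterization with one coefficient vector 
per class, and shows that shifting all classes so the pivot class's 
coefficients are locked at zero leaves the predicted probabilities unchanged; 
for two classes the remaining vector is the ordinary binary logistic 
regression coefficient vector. All function names, including 
{{default_family}}, are hypothetical and only illustrate the proposal above.

```python
import math


def dot(w, x):
    """Inner product of a coefficient vector and a feature vector."""
    return sum(wi * xi for wi, xi in zip(w, x))


def softmax(scores):
    """Numerically stable softmax over a list of class scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def multinomial_probs(coefs, x):
    """Class probabilities with one coefficient vector per class (no pivot)."""
    return softmax([dot(w, x) for w in coefs])


def pivot(coefs, pivot_class=0):
    """Shift every class by the pivot's coefficients so the pivot's are zero."""
    ref = coefs[pivot_class]
    return [[wi - ri for wi, ri in zip(w, ref)] for w in coefs]


def binomial_prob(w, x):
    """Binary logistic regression: P(class 1 | x) with class 0 as the pivot."""
    return 1.0 / (1.0 + math.exp(-dot(w, x)))


def default_family(num_classes):
    """Hypothetical default rule: binomial for binary problems, else multinomial."""
    return "binomial" if num_classes == 2 else "multinomial"


if __name__ == "__main__":
    coefs = [[0.5, -1.0], [1.5, 2.0]]  # two classes, two features (made-up values)
    x = [1.0, 0.3]
    pivoted = pivot(coefs)
    print(multinomial_probs(coefs, x))
    print(multinomial_probs(pivoted, x))
    print(binomial_prob(pivoted[1], x))
    print(default_family(2), default_family(3))
```

Note that this equivalence only covers predictions from a fixed set of 
coefficients. With regularization the penalty is not invariant to such a 
shift, so a pivoted and a non-pivoted fit can yield different models, which is 
why the default matters for users moving from 2.0 to 2.1.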

I vote to merge them into a single interface {{LogisticRegression}}. What a fun 
PR that would be :)

> Decide on unified multinomial and binary logistic regression interfaces
> -----------------------------------------------------------------------
>
>                 Key: SPARK-17163
>                 URL: https://issues.apache.org/jira/browse/SPARK-17163
>             Project: Spark
>          Issue Type: Sub-task
>          Components: ML, MLlib
>            Reporter: Seth Hendrickson
>
> Before the 2.1 release, we should finalize the API for logistic regression. 
> After SPARK-7159, we have both LogisticRegression and 
> MultinomialLogisticRegression models. This may be confusing to users and is 
> a bit superfluous, since MLOR can do basically all of what BLOR does. We 
> should decide if it needs to be changed and implement those changes before 2.1.



