[jira] [Commented] (SPARK-2309) Generalize the binary logistic regression into multinomial logistic regression
[ https://issues.apache.org/jira/browse/SPARK-2309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15110628#comment-15110628 ]

Daniel Darabos commented on SPARK-2309:
---------------------------------------

https://github.com/apache/spark/blob/v1.6.0/docs/ml-classification-regression.md#logistic-regression still says:

> The current implementation of logistic regression in spark.ml only supports
> binary classes. Support for multiclass regression will be added in the future.

That can be removed now, right?

> Generalize the binary logistic regression into multinomial logistic regression
> ------------------------------------------------------------------------------
>
>                 Key: SPARK-2309
>                 URL: https://issues.apache.org/jira/browse/SPARK-2309
>             Project: Spark
>          Issue Type: New Feature
>          Components: MLlib
>            Reporter: DB Tsai
>            Assignee: DB Tsai
>            Priority: Critical
>             Fix For: 1.3.0
>
>
> Currently, there is no multi-class classifier in mllib. Logistic regression
> can be extended to multinomial one straightforwardly.
> The following formula will be implemented.
> http://www.slideshare.net/dbtsai/2014-0620-mlor-36132297/25

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-2309) Generalize the binary logistic regression into multinomial logistic regression
[ https://issues.apache.org/jira/browse/SPARK-2309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14952869#comment-14952869 ]

christian sommeregger commented on SPARK-2309:
----------------------------------------------

Sorry everyone! I got confused by the different terminologies out there. (The model on SlideShare is of course implemented correctly.)

I was talking about a conditional multinomial logit. So instead of

U_is = X_s * w_i

(U_is denotes the utility of item i in choice situation s; the features X_s are constant across alternatives; the weights w_i are item-specific), we would use

U_is = X_si * w

(the weights are the same across alternatives, but the features can be distinct for each item and can differ for each s). See section 6.3.3 in http://data.princeton.edu/wws509/notes/c6s3.html.

I would be happy to contribute some code. Do you think this would be an interesting extension? Should I create a new ticket for this?
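To make the two utility definitions in this comment concrete, here is a minimal sketch in plain Python. It is not Spark code: the function names and all numbers are hypothetical, and it only assumes the usual logit link, where choice probabilities are the softmax of the utilities.

```python
import math


def softmax(utilities):
    """Turn a list of utilities into choice probabilities."""
    m = max(utilities)  # subtract the max for numerical stability
    exps = [math.exp(u - m) for u in utilities]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


def multinomial_probs(x_s, weights_per_item):
    """Multinomial logit: U_is = X_s * w_i.

    One feature vector per choice situation, one weight vector per item.
    """
    return softmax([dot(x_s, w_i) for w_i in weights_per_item])


def conditional_probs(features_per_item, w):
    """Conditional logit: U_is = X_si * w.

    One feature vector per item in the choice set, one shared weight vector.
    """
    return softmax([dot(x_si, w) for x_si in features_per_item])


# Hypothetical numbers, for illustration only.
x_s = [1.0, 2.0]
weights_per_item = [[0.0, 0.0], [0.5, -0.25], [0.1, 0.3]]
p_multinomial = multinomial_probs(x_s, weights_per_item)

features_per_item = [[1.0, 0.0], [0.0, 1.0]]
w = [0.2, 0.2]
p_conditional = conditional_probs(features_per_item, w)
```

Note that in the second form the number of items can change from one choice situation to the next without changing the size of w, which is what makes a varying choice set possible.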
[jira] [Commented] (SPARK-2309) Generalize the binary logistic regression into multinomial logistic regression
[ https://issues.apache.org/jira/browse/SPARK-2309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14953024#comment-14953024 ]

Sean Owen commented on SPARK-2309:
----------------------------------

Hm, I'm not sure I've seen a formulation like that before. Typically you have a set of input features and K output classes, and you learn K one-vs-all classification boundaries. That means the same features in each case, but different coefficients for each class. That's what your reference says too, but in the "Multinomial logit" section, which is what we're talking about here, no?

I'm actually a little confused by http://www.slideshare.net/dbtsai/2014-0620-mlor-36132297/25 on reviewing; does the notation change on the third line, or am I missing a key step? x shouldn't be specific to the output class k; w should be.
[jira] [Commented] (SPARK-2309) Generalize the binary logistic regression into multinomial logistic regression
[ https://issues.apache.org/jira/browse/SPARK-2309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14953136#comment-14953136 ]

christian sommeregger commented on SPARK-2309:
----------------------------------------------

Yes, I think the k should be dropped in x_k on slide 25 in line 2.

Regarding my source: the current multinomial model in Spark corresponds to section 6.3.2. However, I think that something like 6.3.3 or 6.3.4 would be extremely useful.
[jira] [Commented] (SPARK-2309) Generalize the binary logistic regression into multinomial logistic regression
[ https://issues.apache.org/jira/browse/SPARK-2309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14953530#comment-14953530 ]

DB Tsai commented on SPARK-2309:
--------------------------------

I think the priority will be porting MLOR into Spark ML framework first, and then we can think about the extension.
[jira] [Commented] (SPARK-2309) Generalize the binary logistic regression into multinomial logistic regression
[ https://issues.apache.org/jira/browse/SPARK-2309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14953520#comment-14953520 ]

DB Tsai commented on SPARK-2309:
--------------------------------

That was a typo. In line 3, x should be x_k. Several people already sent me email and asked the same question. I'll update the slide. Thanks.
[jira] [Commented] (SPARK-2309) Generalize the binary logistic regression into multinomial logistic regression
[ https://issues.apache.org/jira/browse/SPARK-2309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14951744#comment-14951744 ]

Sean Owen commented on SPARK-2309:
----------------------------------

Yeah, I also don't get it. In multinomial LR you still have the same features for every output class. The slide you link just shows a loss function computed over each of the N classes, not just one. But the features are the same. Implicitly, if an example is in class k, then it's not in the other classes.
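The point about shared features can be sketched in a few lines of plain Python. This is only an illustration of the textbook multinomial (softmax) negative log-likelihood, not the Spark MLlib implementation; the function names, the weights, and the example vector are all hypothetical.

```python
import math


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) over a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def margins(x, class_weights):
    """One score per class: the same features x, a different w_k each time."""
    return [sum(xi * wi for xi, wi in zip(x, w_k)) for w_k in class_weights]


def multinomial_loss(x, label, class_weights):
    """Negative log-likelihood of one example whose true class is `label`.

    The normalizer runs over every class, so assigning the example to
    class `label` implicitly counts against all the other classes.
    """
    scores = margins(x, class_weights)
    return log_sum_exp(scores) - scores[label]


# Hypothetical example: 3 classes, 3 features.
x = [1.0, -2.0, 0.5]
class_weights = [
    [0.0, 0.0, 0.0],
    [0.3, 0.1, -0.2],
    [-0.4, 0.2, 0.6],
]
losses = [multinomial_loss(x, k, class_weights) for k in range(3)]
```

Since exp(-loss) is the predicted probability of the labelled class, those values summed over the three possible labels come to 1 for any fixed x.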
[jira] [Commented] (SPARK-2309) Generalize the binary logistic regression into multinomial logistic regression
[ https://issues.apache.org/jira/browse/SPARK-2309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14951438#comment-14951438 ]

DB Tsai commented on SPARK-2309:
--------------------------------

I don't quite follow; can you elaborate? I'm pretty sure that the implementation in Spark MLlib is the same as the slide, and that's standard multinomial LoR. You can check the test code, which shows that the results match R.
[jira] [Commented] (SPARK-2309) Generalize the binary logistic regression into multinomial logistic regression
[ https://issues.apache.org/jira/browse/SPARK-2309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14950208#comment-14950208 ]

christian sommeregger commented on SPARK-2309:
----------------------------------------------

Hey everybody!

After inspecting the code on GitHub, I believe that we have not really implemented the standard multinomial problem from http://www.slideshare.net/dbtsai/2014-0620-mlor-36132297/25 but a model that covers a set of binary choices with item-specific weights, which is a slightly different thing.

For a true multinomial setup, each row in the training data needs to contain all items (K = number of choices) that were available in a specific choice situation. The current labelled point object, however, has just a choice flag plus the respective features of one item in each row:

e.g.: Labelled point (K=1)

0 | ()
1 | ()
3 | ()
3 | ()
0 | ()

For the model on http://www.slideshare.net/dbtsai/2014-0620-mlor-36132297/25 we would rather need the following structure.

e.g.: Always three items in the choice set (K=3)

Choice Indicator | Item1Features | Item2Features | Item3Features
1 | () | () | ()
3 | () | () | ()*
3 | () | () | ()*

e.g.: Flexible number of items in the choice set (K varies)

8 | () | () | () | () | () | () | () | ()* | ()
2 | () | ()* | ()
3 | () | () | ()* | ()
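The two row layouts described in this comment could be modelled roughly as follows. This is a hypothetical sketch in plain Python, not an existing Spark type: `LabeledRow` and `ChoiceSituation` are made-up names, and the choice indicator is assumed to be the 1-based position of the chosen item, as the starred entries in the tables suggest.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class LabeledRow:
    """One label plus a single feature vector (the K=1 layout)."""
    label: int
    features: List[float]


@dataclass
class ChoiceSituation:
    """One row per choice situation: features of every available item,
    plus the 1-based index of the item that was chosen."""
    chosen: int
    item_features: List[List[float]]

    def __post_init__(self):
        # The indicator must point at an item that is actually in the set.
        if not 1 <= self.chosen <= len(self.item_features):
            raise ValueError("chosen must index an item in the choice set")

    def chosen_features(self) -> List[float]:
        """Feature vector of the chosen item."""
        return self.item_features[self.chosen - 1]


# Hypothetical rows; the number of items may vary between situations.
rows = [
    ChoiceSituation(chosen=2, item_features=[[0.1], [0.2], [0.3]]),
    ChoiceSituation(chosen=3, item_features=[[0.1], [0.2], [0.3], [0.4]]),
]
```

Keeping the item feature vectors in a nested list is one way to allow K to vary per row; a fixed-width layout would work only for the K=3 case.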
[jira] [Commented] (SPARK-2309) Generalize the binary logistic regression into multinomial logistic regression
[ https://issues.apache.org/jira/browse/SPARK-2309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14063163#comment-14063163 ]

Xiangrui Meng commented on SPARK-2309:
--------------------------------------

PR: https://github.com/apache/spark/pull/1379

Generalize the binary logistic regression into multinomial logistic regression
------------------------------------------------------------------------------

                Key: SPARK-2309
                URL: https://issues.apache.org/jira/browse/SPARK-2309
            Project: Spark
         Issue Type: New Feature
         Components: MLlib
           Reporter: DB Tsai
           Assignee: DB Tsai

Currently, there is no multi-class classifier in mllib. Logistic regression can be extended to multinomial one straightforwardly.
The following formula will be implemented.
http://www.slideshare.net/dbtsai/2014-0620-mlor-36132297/25

--
This message was sent by Atlassian JIRA
(v6.2#6252)