[jira] [Comment Edited] (SPARK-17139) Add model summary for MultinomialLogisticRegression

2017-02-07 Thread Joseph K. Bradley (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-17139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15856612#comment-15856612
 ] 

Joseph K. Bradley edited comment on SPARK-17139 at 2/7/17 7:25 PM:
---

I'll offer a few thoughts first:
* A "ClassificationSummary" could be the same as a 
"MulticlassClassificationSummary" because binary is a special type of 
multiclass.
* Following the structure of abstractions for Prediction is reasonable.
* Separating binary and multiclass is reasonable; the separation is more 
significant for evaluation than for the Prediction abstractions.
* Abstract classes have been a pain in the case of Prediction abstractions, so 
I'd prefer we use traits.

The 2 alternatives I see are:
1. BinaryClassificationSummary inherits from ClassificationSummary.  No 
separate MulticlassClassificationSummary.
2. BinaryClassificationSummary and MulticlassClassificationSummary inherit from 
ClassificationSummary.

Both alternatives are semantically reasonable.  However, since 
ClassificationSummary = MulticlassClassificationSummary in terms of 
functionality, and since the Prediction abstractions combine binary and 
multiclass, I prefer option 1.

If we go with option 1, then we need 4 concrete classes:
* LogisticRegressionSummary
* LogisticRegressionTrainingSummary
* BinaryLogisticRegressionSummary
* BinaryLogisticRegressionTrainingSummary

We would definitely want binary summaries to inherit from their multiclass 
counterparts, and for training summaries to inherit from their general 
counterparts:
* LogisticRegressionSummary
* LogisticRegressionTrainingSummary: LogisticRegressionSummary
* BinaryLogisticRegressionSummary: LogisticRegressionSummary
* BinaryLogisticRegressionTrainingSummary: LogisticRegressionTrainingSummary, 
BinaryLogisticRegressionSummary

Of course, this is a problem if these are classes: 
BinaryLogisticRegressionTrainingSummary would have two parents, and a Scala 
class can extend only one class.  But we could solve it by having all of these 
be traits, with concrete classes inheriting from them.  I.e., 
{{LogisticRegressionModel.summary}} could return the trait 
{{LogisticRegressionTrainingSummary}}, which could be of concrete type 
{{LogisticRegressionTrainingSummaryImpl}} (multiclass) or 
{{BinaryLogisticRegressionTrainingSummaryImpl}} (binary).
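
To make the trait-based layout concrete, here is a rough sketch of how the four 
traits and two Impl classes could fit together.  The trait and Impl names come 
from the lists above; the members ({{numClasses}}, {{totalIterations}}, 
{{areaUnderROC}}), the constructor arguments, and the {{SummarySketch.summaryFor}} 
helper are placeholders I made up for illustration, not a proposed API.

```scala
// Illustrative sketch only: member names and constructor arguments are
// hypothetical placeholders, not the actual Spark summary API.

// General (multiclass) summary.
trait LogisticRegressionSummary {
  def numClasses: Int
}

// Training summaries extend their general counterparts.
trait LogisticRegressionTrainingSummary extends LogisticRegressionSummary {
  def totalIterations: Int
}

// Binary summaries extend their multiclass counterparts.
trait BinaryLogisticRegressionSummary extends LogisticRegressionSummary {
  def areaUnderROC: Double
}

// Two parents: legal because both are traits, not classes.
trait BinaryLogisticRegressionTrainingSummary
  extends BinaryLogisticRegressionSummary
  with LogisticRegressionTrainingSummary

// Concrete multiclass implementation.
class LogisticRegressionTrainingSummaryImpl(
    val numClasses: Int,
    val totalIterations: Int)
  extends LogisticRegressionTrainingSummary

// Concrete binary implementation.
class BinaryLogisticRegressionTrainingSummaryImpl(
    val totalIterations: Int,
    val areaUnderROC: Double)
  extends BinaryLogisticRegressionTrainingSummary {
  val numClasses: Int = 2
}

object SummarySketch {
  // Hypothetical stand-in for LogisticRegressionModel.summary: the static
  // return type is the general training trait, the runtime type depends on
  // the number of classes.  The literal values are dummy data.
  def summaryFor(numClasses: Int): LogisticRegressionTrainingSummary =
    if (numClasses == 2) new BinaryLogisticRegressionTrainingSummaryImpl(10, 0.5)
    else new LogisticRegressionTrainingSummaryImpl(numClasses, 10)
}
```

With this shape, a caller who needs binary-only metrics would pattern match on 
{{BinaryLogisticRegressionSummary}}, and everyone else can stay on the general 
trait.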

I suspect MiMa will complain about this, but IIRC it's safe since all of these 
summaries have private constructors and can't be extended outside of Spark.

Btw, we could introduce a set of abstractions matching the Prediction ones, but 
that should probably happen under a separate JIRA.

What do you think?


> Add model summary for MultinomialLogisticRegression
> ---
>
> Key: SPARK-17139
> URL: https://issues.apache.org/jira/browse/SPARK-17139
>

[jira] [Comment Edited] (SPARK-17139) Add model summary for MultinomialLogisticRegression

2016-10-10 Thread Weichen Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-17139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15564097#comment-15564097
 ] 

Weichen Xu edited comment on SPARK-17139 at 10/11/16 1:25 AM:
--

I'm working hard on it and will create a PR this week, thanks!


was (Author: weichenxu123):
I'm working on it hardly and will create PR this week, thanks!

> Add model summary for MultinomialLogisticRegression
> ---
>
> Key: SPARK-17139
> URL: https://issues.apache.org/jira/browse/SPARK-17139
> Project: Spark
> Issue Type: Sub-task
> Components: ML
> Reporter: Seth Hendrickson
>
> Add model summary to multinomial logistic regression using same interface as 
> in other ML models.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Comment Edited] (SPARK-17139) Add model summary for MultinomialLogisticRegression

2016-08-18 Thread Weichen Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-17139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15427519#comment-15427519
 ] 

Weichen Xu edited comment on SPARK-17139 at 8/19/16 3:05 AM:
-

I will work on it and create a PR once the algorithm it depends on is merged, 
thanks.


was (Author: weichenxu123):
I will work on it and create PR soon, thanks.
