[ https://issues.apache.org/jira/browse/SPARK-17906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15594570#comment-15594570 ]
zhengruifeng commented on SPARK-17906:
--------------------------------------

Yes. I think it is useful to expose metrics computed for one label vs. the others.

> MulticlassClassificationEvaluator support target label
> ------------------------------------------------------
>
>                 Key: SPARK-17906
>                 URL: https://issues.apache.org/jira/browse/SPARK-17906
>             Project: Spark
>          Issue Type: Brainstorming
>          Components: ML
>            Reporter: zhengruifeng
>            Priority: Minor
>
> In practice, I sometimes only care about the metric of one particular label.
> For example, in CTR prediction, I usually only care about the F1 of the positive class.
> In sklearn, this is supported:
> {code}
> >>> from sklearn.metrics import classification_report
> >>> y_true = [0, 1, 2, 2, 2]
> >>> y_pred = [0, 0, 2, 2, 1]
> >>> target_names = ['class 0', 'class 1', 'class 2']
> >>> print(classification_report(y_true, y_pred, target_names=target_names))
>              precision    recall  f1-score   support
>     class 0       0.50      1.00      0.67         1
>     class 1       0.00      0.00      0.00         1
>     class 2       1.00      0.67      0.80         3
> avg / total       0.70      0.60      0.61         5
> {code}
> Currently, ml only supports `weightedXXX` metrics, so I think there is room
> for improvement.
> The API may be designed like this:
> {code}
> val dataset = ...
> val evaluator = new MulticlassClassificationEvaluator
> evaluator.setMetricName("f1")
> evaluator.evaluate(dataset) // weightedF1 of all classes
> evaluator.setTarget(0.0).setMetricName("f1")
> evaluator.evaluate(dataset) // F1 of class "0"
> {code}
> What's your opinion? [~yanboliang] [~josephkb] [~sethah] [~srowen]
> If this is useful and acceptable, I'm happy to work on this.
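
For reference, the per-label numbers in the sklearn report quoted above can be reproduced by hand, which is roughly what a "target label" metric would have to compute. This is only a minimal sketch in plain Python: `per_label_f1` is a hypothetical helper written for illustration, not part of sklearn or Spark, and it assumes the usual one-vs-rest definition (the target label is positive, every other label is negative).

```python
def per_label_f1(y_true, y_pred, label):
    """Precision, recall and F1 for one label, treating all other labels as negative.

    Hypothetical helper for illustration only; not an sklearn or Spark API.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)  # true positives
    fp = sum(1 for t, p in pairs if t != label and p == label)  # false positives
    fn = sum(1 for t, p in pairs if t == label and p != label)  # false negatives

    # Guard against division by zero when a label is never predicted or never present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Same toy data as in the sklearn example quoted in the issue.
y_true = [0, 1, 2, 2, 2]
y_pred = [0, 0, 2, 2, 1]

for label in sorted(set(y_true)):
    p, r, f = per_label_f1(y_true, y_pred, label)
    print(f"class {label}: precision={p:.2f} recall={r:.2f} f1={f:.2f}")
```

The weighted metrics that ml already exposes are the support-weighted average of these per-label values, so a target-label option would amount to returning one term of that average instead of the sum.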