[ https://issues.apache.org/jira/browse/SPARK-18844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15746293#comment-15746293 ]
Zak Patterson commented on SPARK-18844:
---------------------------------------

I'm not very familiar with the Python API, but it seems to me that the two methods available in Scala (precision and recall) are not available in Python?

https://github.com/apache/spark/blob/v2.1.0-rc2/python/pyspark/mllib/evaluation.py#L29

> Add more binary classification metrics to BinaryClassificationMetrics
> ---------------------------------------------------------------------
>
>                 Key: SPARK-18844
>                 URL: https://issues.apache.org/jira/browse/SPARK-18844
>             Project: Spark
>          Issue Type: Improvement
>          Components: MLlib
>    Affects Versions: 2.0.2
>            Reporter: Zak Patterson
>            Priority: Minor
>              Labels: evaluation
>             Fix For: 2.0.2
>
>   Original Estimate: 5h
>  Remaining Estimate: 5h
>
> BinaryClassificationMetrics only implements precision (positive predictive
> value) and recall (true positive rate). It should implement more
> comprehensive metrics.
> Moreover, the instance variables storing computed counts are marked private,
> and there are no accessors for them. So if one wanted to add this
> functionality, one would have to duplicate this calculation, which is not
> trivial:
> https://github.com/apache/spark/blob/v2.0.2/mllib/src/main/scala/org/apache/spark/mllib/evaluation/BinaryClassificationMetrics.scala#L144
>
> Currently Implemented Metrics
> ---
> * Precision (PPV): `precisionByThreshold`
> * Recall (Sensitivity, true positive rate): `recallByThreshold`
>
> Desired additional metrics
> ---
> * False omission rate: `forByThreshold`
> * False discovery rate: `fdrByThreshold`
> * Negative predictive value: `npvByThreshold`
> * False negative rate: `fnrByThreshold`
> * True negative rate (Specificity): `specificityByThreshold`
> * False positive rate: `fprByThreshold`
>
> Alternatives
> ---
> The `createCurve` method is marked private. If it were marked public, and the
> trait BinaryClassificationMetricComputer were also marked public, then it
> would be easy to define new computers to get whatever the user wanted.
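
For reference, the metrics requested in the issue all follow from the four confusion-matrix counts at each threshold. The following is only a rough sketch in plain Python, not Spark code and not the MLlib implementation. The metric names are the hypothetical ones proposed in the issue; `Confusion`, `confusions_by_threshold`, and `metrics_by_threshold` are made-up helpers, and the sketch assumes a record is predicted positive when its score is greater than or equal to the threshold and that an empty denominator yields 0.0.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence, Tuple


@dataclass(frozen=True)
class Confusion:
    """Confusion-matrix counts at a single threshold (hypothetical helper)."""
    tp: int
    fp: int
    tn: int
    fn: int


def _ratio(num: int, den: int) -> float:
    # Convention chosen for this sketch: an empty denominator yields 0.0.
    return num / den if den else 0.0


# Metric names are the ones proposed in the issue; none of them exist in Spark.
METRICS: Dict[str, Callable[[Confusion], float]] = {
    "forByThreshold": lambda c: _ratio(c.fn, c.fn + c.tn),          # false omission rate
    "fdrByThreshold": lambda c: _ratio(c.fp, c.fp + c.tp),          # false discovery rate
    "npvByThreshold": lambda c: _ratio(c.tn, c.tn + c.fn),          # negative predictive value
    "fnrByThreshold": lambda c: _ratio(c.fn, c.fn + c.tp),          # false negative rate
    "specificityByThreshold": lambda c: _ratio(c.tn, c.tn + c.fp),  # true negative rate
    "fprByThreshold": lambda c: _ratio(c.fp, c.fp + c.tn),          # false positive rate
}


def confusions_by_threshold(
    score_and_labels: Sequence[Tuple[float, float]]
) -> List[Tuple[float, Confusion]]:
    """Count outcomes at every distinct score, highest threshold first.

    Assumes a record is predicted positive when score >= threshold and
    that a label greater than 0.5 means "positive".
    """
    thresholds = sorted({score for score, _ in score_and_labels}, reverse=True)
    result = []
    for t in thresholds:
        tp = fp = tn = fn = 0
        for score, label in score_and_labels:
            actual_positive = label > 0.5
            if score >= t:
                if actual_positive:
                    tp += 1
                else:
                    fp += 1
            elif actual_positive:
                fn += 1
            else:
                tn += 1
        result.append((t, Confusion(tp, fp, tn, fn)))
    return result


def metrics_by_threshold(
    score_and_labels: Sequence[Tuple[float, float]]
) -> Dict[str, List[Tuple[float, float]]]:
    """Return {metric name: [(threshold, value), ...]} for every metric above."""
    confusions = confusions_by_threshold(score_and_labels)
    return {
        name: [(t, compute(c)) for t, c in confusions]
        for name, compute in METRICS.items()
    }
```

Calling `metrics_by_threshold` on a small list of `(score, label)` pairs returns one `(threshold, value)` list per metric, loosely mirroring the shape of the existing `precisionByThreshold` and `recallByThreshold` results. The quadratic loop keeps the sketch short; a real implementation would presumably reuse the cumulative counts that the issue notes are currently private.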