[ https://issues.apache.org/jira/browse/SPARK-32472 ]
Gideon P deleted comment on SPARK-32472:
----------------------------------

    was (Author: JIRAUSER304403):
[~kmoore] can I raise a PR for this issue?

> Expose confusion matrix elements by threshold in BinaryClassificationMetrics
> ----------------------------------------------------------------------------
>
>                 Key: SPARK-32472
>                 URL: https://issues.apache.org/jira/browse/SPARK-32472
>             Project: Spark
>          Issue Type: Improvement
>          Components: MLlib
>    Affects Versions: 3.0.0
>            Reporter: Kevin Moore
>            Priority: Minor
>
> Currently, the only thresholded metrics available from
> BinaryClassificationMetrics are precision, recall, f-measure, and (indirectly
> through roc()) the false positive rate.
> Unfortunately, you can't always compute the individual thresholded confusion
> matrix elements (TP, FP, TN, FN) from these quantities. You can make a system
> of equations out of the existing thresholded metrics and the total count, but
> they become underdetermined when there are no true positives.
> Fortunately, the individual confusion matrix elements by threshold are
> already computed and sitting in the confusions variable. It would be helpful
> to expose these elements directly. The easiest way would probably be by
> adding methods like
> {code:java}
> def truePositivesByThreshold(): RDD[(Double, Double)] =
>   confusions.map { case (t, c) => (t, c.weightedTruePositives) }
> {code}
> An alternative could be to expose the entire RDD[(Double,
> BinaryConfusionMatrix)] in one method, but BinaryConfusionMatrix is also
> currently package private.
> The closest issue to this I found was this one for adding new calculations to
> BinaryClassificationMetrics
> https://issues.apache.org/jira/browse/SPARK-18844, which was closed without
> any changes being merged.
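
To make the underdetermination described in the issue concrete, here is a minimal Scala sketch that does not depend on Spark. Counts is a hypothetical stand-in for the package-private BinaryConfusionMatrix, the zero-denominator conventions are assumptions rather than Spark's documented behavior, and the numbers are made up for illustration. It builds two different confusion matrices with no true positives that agree on precision, recall, false positive rate, and total count.

```scala
// Hypothetical stand-in for the package-private BinaryConfusionMatrix.
// Field and method names are illustrative, not Spark's API.
final case class Counts(tp: Double, fp: Double, tn: Double, fn: Double) {
  def total: Double = tp + fp + tn + fn

  // Zero-denominator conventions below are assumptions made for this sketch.
  def precision: Double =
    if (tp + fp == 0.0) 1.0 else tp / (tp + fp)

  def recall: Double =
    if (tp + fn == 0.0) 0.0 else tp / (tp + fn)

  def falsePositiveRate: Double =
    if (fp + tn == 0.0) 0.0 else fp / (fp + tn)
}

object UnderdeterminedDemo {
  // Two different confusion matrices, both with zero true positives.
  val a: Counts = Counts(tp = 0.0, fp = 10.0, tn = 10.0, fn = 80.0)
  val b: Counts = Counts(tp = 0.0, fp = 20.0, tn = 20.0, fn = 60.0)

  // The quantities a caller can observe today: precision, recall,
  // false positive rate, plus the total count.
  def summary(c: Counts): (Double, Double, Double, Double) =
    (c.precision, c.recall, c.falsePositiveRate, c.total)

  def main(args: Array[String]): Unit = {
    println(s"summary(a) = ${summary(a)}")
    println(s"summary(b) = ${summary(b)}")
    println(s"same summary:  ${summary(a) == summary(b)}")
    println(s"same matrices: ${a == b}")
  }
}
```

With TP = 0, precision and recall are both zero and say nothing about FN, so only the false positive rate and the total count remain as constraints on the three unknowns FP, TN, and FN. Exposing the per-threshold elements directly, as the issue proposes, avoids the problem.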