[ https://issues.apache.org/jira/browse/SPARK-32472 ]


    Gideon P deleted comment on SPARK-32472:
    ----------------------------------

was (Author: JIRAUSER304403):
[~kmoore] Can I raise a PR for this issue?

> Expose confusion matrix elements by threshold in BinaryClassificationMetrics
> ----------------------------------------------------------------------------
>
>                 Key: SPARK-32472
>                 URL: https://issues.apache.org/jira/browse/SPARK-32472
>             Project: Spark
>          Issue Type: Improvement
>          Components: MLlib
>    Affects Versions: 3.0.0
>            Reporter: Kevin Moore
>            Priority: Minor
>
> Currently, the only thresholded metrics available from
> BinaryClassificationMetrics are precision, recall, F-measure, and (indirectly,
> through roc()) the false positive rate.
> Unfortunately, you can't always compute the individual thresholded confusion
> matrix elements (TP, FP, TN, FN) from these quantities. You can set up a system
> of equations from the existing thresholded metrics and the total count, but
> the system becomes underdetermined when there are no true positives.
> Fortunately, the individual confusion matrix elements by threshold are
> already computed and stored in the confusions variable. It would be helpful
> to expose these elements directly. The easiest way would probably be to add
> methods like
> {code:java}
> def truePositivesByThreshold(): RDD[(Double, Double)] =
>   confusions.map { case (t, c) => (t, c.weightedTruePositives) }
> {code}
> An alternative could be to expose the entire RDD[(Double,
> BinaryConfusionMatrix)] in one method, but BinaryConfusionMatrix is
> currently package-private.
> The closest existing issue I found is
> https://issues.apache.org/jira/browse/SPARK-18844, which proposed adding new
> calculations to BinaryClassificationMetrics and was closed without any
> changes being merged.
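
To illustrate the underdetermined case mentioned in the description, here is a minimal sketch in plain Scala (no Spark dependency). The names ConfusionRecovery, Confusion and recoverConfusion are hypothetical and not part of Spark. The sketch assumes precision is reported as 0 when there are no true positives but some false positives. It solves the system for the number of actual positives and returns None when the system has no unique solution.

```scala
// Hypothetical helper, not part of Spark: tries to recover the confusion
// matrix elements at one threshold from precision, recall, false positive
// rate and the total (weighted) count.
object ConfusionRecovery {
  final case class Confusion(tp: Double, fp: Double, tn: Double, fn: Double)

  def recoverConfusion(
      precision: Double,
      recall: Double,
      fpr: Double,
      total: Double): Option[Confusion] = {
    // Let P = TP + FN be the number of actual positives. Then
    //   TP = recall * P   and   FP = fpr * (total - P).
    // Substituting into precision = TP / (TP + FP) and solving for P gives
    //   P = precision * fpr * total / (recall * (1 - precision) + precision * fpr).
    val denom = recall * (1.0 - precision) + precision * fpr
    if (denom == 0.0) {
      // No unique solution, e.g. with no true positives recall = 0 and
      // precision = 0, so any value of P satisfies the equations.
      None
    } else {
      val positives = precision * fpr * total / denom
      val tp = recall * positives
      val fp = fpr * (total - positives)
      Some(Confusion(tp, fp, total - positives - fp, positives - tp))
    }
  }
}
```

For example, precision 0.75, recall 0.75, false positive rate 1/6 and a total of 100 recover approximately TP = 30, FP = 10, TN = 50, FN = 10, whereas precision 0 and recall 0 yield None regardless of the false positive rate. Exposing the elements directly, as proposed, would avoid this reconstruction altogether.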



