Michael Kubovy wrote:
> Hi Tim and José,
>
>>>Date: Fri, 31 Mar 2006 11:58:14 +0200
>>>From: "Anadon Herrera, Jose Daniel" <[EMAIL PROTECTED]>
>>>Subject: [R] ROC optimal threshold
>>>
>>>I am using the ROC package to evaluate predictive models.
>>>I have successfully plotted the ROC curve; however,
>>>
>>>is there any way to obtain the value of the operating point = optimal
>>>threshold value (i.e. the nearest point of the curve to the top-left
>>>corner of the axes)?
>
> On Mar 31, 2006, at 8:01 AM, Tim Howard wrote:
>
>>I've struggled a bit with the same question, said another way: "how
>>do you find the value in a ROC curve that minimizes false positives
>>while maximizing true positives"?
>>
>>Here's something I've come up with. I'd be curious to hear from the
>>list whether anyone thinks this code might get stuck in local
>>minima, or if it does find the global minimum each time. (I think
>>it's ok).
>>
>>From your ROC object you need to grab the sensitivity (= true
>>positive rate) and specificity (= 1 - false positive rate) and the
>>cutoff levels. Then find the value that minimizes
>>abs(sensitivity - specificity), or sqrt((1-sens)^2 + (1-spec)^2),
>>as follows:
>>
>>absMin <- extract[which.min(abs(extract$sens-extract$spec)),];
>>sqrtMin <- extract[which.min(sqrt((1-extract$sens)^2+(1-extract$spec)^2)),];
>>
>>In this example, 'extract' is a dataframe containing three columns:
>>extract$sens = sensitivity values, extract$spec = specificity
>>values, extract$votes = cutoff values. The command subsets the
>>dataframe to a single row containing the desired cutoff and the
>>sens and spec values that are associated with it.
>>
>>Most of the time these two answers (abs or sqrt) are the same;
>>sometimes they differ quite a bit.
>>
>>I do not see this application of ROC curves very often. A question
>>for those much more knowledgeable than I: is there a problem
>>with using ROC curves in this manner?
>>
>>Tim Howard
>
> @BOOK{MacmillanCreelman2005,
>   title = {Detection theory: {A} user's guide},
>   publisher = {Lawrence Erlbaum Associates},
>   year = {2005},
>   address = {Mahwah, NJ, USA},
>   edition = {2nd},
>   author = {Macmillan, Neil A and Creelman, C Douglas},
> }
> on p. 43 shows that the ideal value of the cutoff depends on the
> reward function R that specifies the payoff for each outcome:
> \[
> LR(x) = \beta = \frac{R(\text{true negative}) - R(\text{false positive})}{R(\text{true positive}) - R(\text{false negative})} \cdot \frac{p(\text{noise})}{p(\text{signal})}
> \]
>
> I believe that your attempt to minimize false positives while
> maximizing true positives amounts to maximizing the proportion of
> correct answers. For that the payoff ratio is 1, so you just set
> $\beta = p(\text{noise})/p(\text{signal})$, which is $\beta = 1$ when
> signal and noise are equally likely. Otherwise it
> might be best to explicitly state your costs and benefits by
> specifying the reward function R.
> _____________________________
> Professor Michael Kubovy
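
As a rough illustration of the formula quoted above, the criterion beta can be computed directly from a stated reward function and prior. This is only a sketch: the helper name optimal_beta, its argument names, and all payoff values are hypothetical and do not come from the book or the thread. Costs are entered as negative rewards.

```r
# Hypothetical helper: likelihood-ratio criterion beta from a reward
# function R and the prior probability of a signal (formula quoted above).
# All names and numbers here are illustrative assumptions.
optimal_beta <- function(R_tn, R_fp, R_tp, R_fn, p_signal) {
  stopifnot(p_signal > 0, p_signal < 1, R_tp != R_fn)
  payoff_ratio <- (R_tn - R_fp) / (R_tp - R_fn)
  prior_ratio  <- (1 - p_signal) / p_signal   # p(noise) / p(signal)
  payoff_ratio * prior_ratio
}

# Made-up scenario: a miss costs five times as much as a false alarm,
# and only 20% of cases are signals.
beta_asym <- optimal_beta(R_tn = 1, R_fp = -1, R_tp = 1, R_fn = -5,
                          p_signal = 0.2)
print(beta_asym)
```

A cutoff would then be the point on the decision axis where the likelihood ratio LR(x) equals this value; locating that point requires a model of the signal and noise distributions, which this sketch does not attempt.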
Choosing cutoffs is fraught with difficulties, arbitrariness, inefficiency, and the necessity to use a complex adjustment for multiple comparisons in later analysis steps, unless the dataset used to generate the cutoff was so large that it could be considered infinite.

--
Frank E Harrell Jr
Professor and Chair, School of Medicine
Department of Biostatistics, Vanderbilt University
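
For reference, the two criteria from Tim Howard's quoted message can be tried on a small table. This is a minimal sketch: the data frame below is invented for illustration (only the column names sens, spec and votes follow the message), and it is chosen so that the two criteria disagree. Because which.min() scans every row, it returns the global minimum over the tabulated cutoffs (the first one in case of ties) and cannot get stuck in a local minimum.

```r
# Toy ROC table; the numbers are invented for illustration only.
# Column names follow the quoted message: sens, spec, votes (= cutoff).
extract <- data.frame(
  votes = c(0.1, 0.3, 0.5, 0.7, 0.9),
  sens  = c(0.99, 0.95, 0.80, 0.50, 0.20),
  spec  = c(0.40, 0.75, 0.80, 0.90, 0.99)
)

# Criterion 1: cutoff where sensitivity and specificity are closest
absMin <- extract[which.min(abs(extract$sens - extract$spec)), ]

# Criterion 2: cutoff nearest (Euclidean distance) to the top-left corner
dist_corner <- sqrt((1 - extract$sens)^2 + (1 - extract$spec)^2)
sqrtMin <- extract[which.min(dist_corner), ]

print(absMin)   # row with votes = 0.5
print(sqrtMin)  # row with votes = 0.3
```

Both criteria weight sensitivity and specificity equally, which is itself an implicit choice of costs.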