If you define a cost function for a given threshold k as

    cost(k) = FP(k) + lambda * FN(k)

then choose the k that minimises cost(k). FP(k) and FN(k) are the numbers of
false positives and false negatives at threshold k. Set lambda to a value
greater than 1 if you want to penalise FN more than FP. There are many
situations where this is desirable, for example when the class sizes are
highly unbalanced. Consider a problem where you want to predict rare events
and you are penalised much more heavily for missing an event than for
flagging a non-event.

I believe the ROC curve was designed to compare two methods over a range of
thresholds, not to choose the threshold itself.

Regards, Adai

On Fri, 2006-03-31 at 08:01 -0500, Tim Howard wrote:
> Jose -
>
> I've struggled a bit with the same question, said another way: "how do you
> find the value in a ROC curve that minimizes false positives while
> maximizing true positives"?
>
> Here's something I've come up with. I'd be curious to hear from the list
> whether anyone thinks this code might get stuck in local minima, or if it
> does find the global minimum each time. (I think it's ok).
>
> From your ROC object you need to grab the sensitivity (= true positive
> rate), the specificity (= 1 - false positive rate) and the cutoff levels.
> Then find the value that minimizes abs(sensitivity - specificity), or
> sqrt((1-sens)^2 + (1-spec)^2), as follows:
>
> absMin <- extract[which.min(abs(extract$sens - extract$spec)), ]
> sqrtMin <- extract[which.min(sqrt((1 - extract$sens)^2 + (1 - extract$spec)^2)), ]
>
> In this example, 'extract' is a dataframe containing three columns:
> extract$sens = sensitivity values, extract$spec = specificity values,
> extract$votes = cutoff values. The command subsets the dataframe to a single
> row containing the desired cutoff and the sens and spec values that are
> associated with it.
>
> Most of the time these two answers (abs or sqrt) are the same; sometimes
> they differ quite a bit.
>
> I do not see this application of ROC curves very often. A question for those
> much more knowledgeable than I....
> is there a problem with using ROC curves in this manner?
>
> Tim Howard
>
>
> Date: Fri, 31 Mar 2006 11:58:14 +0200
> From: "Anadon Herrera, Jose Daniel" <[EMAIL PROTECTED]>
> Subject: [R] ROC optimal threshold
> To: "'r-help@stat.math.ethz.ch'" <r-help@stat.math.ethz.ch>
> Message-ID: <[EMAIL PROTECTED]>
> Content-Type: text/plain; charset=iso-8859-1
>
> hello,
>
> I am using the ROC package to evaluate predictive models.
> I have successfully plotted the ROC curve; however,
>
> is there any way to obtain the value of the operating point = optimal
> threshold value (i.e. the nearest point of the curve to the top-left
> corner of the axes)?
>
> thank you very much,
>
> jose daniel anadon
> area de ecologia
> universidad miguel hernandez
>
> españa
>
> ______________________________________________
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
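
For illustration, the cost-based rule at the top of this message (choose the k that minimises FP(k) + lambda * FN(k)) could be sketched in R roughly as below. This is only a minimal sketch under stated assumptions: the vectors 'score' and 'truth', the helper names fp_fn() and best_threshold(), and the rule "predict an event when score >= k" are all made up for the example and do not come from the ROC package or from anyone's code in this thread.

```r
# Hypothetical example data (illustrative only): 'score' is a classifier
# output, 'truth' is 1 for an event and 0 for a non-event.
score <- c(0.10, 0.20, 0.30, 0.35, 0.40, 0.60, 0.70, 0.80, 0.90, 0.95)
truth <- c(0,    0,    0,    1,    0,    0,    1,    0,    1,    1)

# Count false positives and false negatives at threshold k, assuming we
# predict "event" whenever score >= k.
fp_fn <- function(k, score, truth) {
  pred <- as.integer(score >= k)
  c(FP = sum(pred == 1 & truth == 0),
    FN = sum(pred == 0 & truth == 1))
}

# Evaluate cost(k) = FP(k) + lambda * FN(k) at every observed score and
# return the threshold with the smallest cost.
best_threshold <- function(score, truth, lambda = 1) {
  ks     <- sort(unique(score))
  counts <- t(sapply(ks, fp_fn, score = score, truth = truth))
  fp     <- unname(counts[, "FP"])
  fn     <- unname(counts[, "FN"])
  cost   <- fp + lambda * fn
  i      <- which.min(cost)  # index of the first minimum if thresholds tie
  data.frame(k = ks[i], FP = fp[i], FN = fn[i], cost = cost[i])
}

res1 <- best_threshold(score, truth, lambda = 1)  # FP and FN weighted equally
res5 <- best_threshold(score, truth, lambda = 5)  # a miss costs 5 false alarms
print(res1)
print(res5)
```

On this toy data, lambda = 1 picks k = 0.70 (one false positive, one false negative), while lambda = 5 moves the threshold down to k = 0.35 so that no event is missed, at the price of three false positives. Because every candidate threshold is evaluated and which.min() scans the whole cost vector, the search is exhaustive over the observed scores rather than iterative.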