Re: [R] Proper / Improper scoring Rules

Frank E Harrell Jr Fri, 07 Aug 2009 09:51:03 -0700

Donald Catanzaro, PhD wrote:

Hi All,
I am working on some ordinal logistic regressions using lrm in the Design package. My response variable has three categories (1,2,3), and after creating my model and calling predict to obtain some values, I wanted to use a simple .5 cut-off to classify my probabilities into the categories.
I had two questions:
a) first, I am having trouble directly accessing the probabilities, which may have more to do with my lack of experience with R
For instance, my calls
> ologit.three.NoPerFor <- lrm(Threshold.Three ~ TECI, data=CLD, na.action=na.pass)
> CLD$Threshold.Predict.Three.NoPerFor <- predict(ologit.three.NoPerFor, newdata=CLD, type="fitted.ind")
> CLD$Threshold.Predict.Three.NoPerFor.Cats[CLD$Threshold.Predict.Three.NoPerFor.Threshold.Three=1 > .5] <- 1
Error: unexpected '=' in "CLD$Threshold.Predict.Three.NoPerFor.Cats[CLD$Threshold.Predict.Three.NoPerFor.Threshold.Three="

produce an error message, and it seems as if R does not like the equal sign at all. So how does one access the probabilities so I can classify them into the categories of 1,2,3 and look at the performance of my model?


use == to check equality
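
A minimal base-R sketch of the indexing involved, under stated assumptions: the matrix p is made-up data standing in for what predict(..., type="fitted.ind") appears to return, and the column names "y=1", "y=2", "y=3" are hypothetical labels modeled on the Threshold.Three=1 name in the error message above. A name containing = cannot be reached with an unquoted $, so the sketch selects the column by quoted name and uses == only for comparisons.

```r
# Hypothetical per-category probabilities (one row per observation);
# a stand-in for the matrix of fitted individual probabilities.
p <- matrix(c(0.70, 0.20, 0.10,
              0.20, 0.50, 0.30,
              0.10, 0.30, 0.60,
              0.40, 0.35, 0.25),
            ncol = 3, byrow = TRUE,
            dimnames = list(NULL, c("y=1", "y=2", "y=3")))

# Select a column whose name contains "=" by quoting the name.
p1 <- p[, "y=1"]

# == tests equality and returns a logical vector.
observed <- c(1, 2, 3, 2)
is_one <- observed == 1

# The .5 cut-off from the question, applied to category 1 only.
cats <- rep(NA_integer_, nrow(p))
cats[p1 > 0.5] <- 1L

# Alternative: the most probable category in each row.
pred_class <- max.col(p, ties.method = "first")
```

The cut-off lines only show the syntax the question was reaching for; whether classifying at a cut-off is a sound way to judge the model is taken up in the reply to (b).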

b) which leads me to my next question. I thought that simply calculating the percent correct off of my predictions would be sufficient to look at performance, but since my question is very much in line with this thread http://tolstoy.newcastle.edu.au/R/e4/help/08/04/8987.html I am not so sure anymore. I am afraid I did not understand Frank Harrell's last suggestion regarding improper scoring rule - can someone point me to some internet resources that I might be able to review to see why my approach would not be valid?

Percent correct will give you misleading answers and is game-able. It is also ultra-high-variance. Though not a truly proper scoring rule, Somers' Dxy rank correlation (generalization of ROC area) is helpful. Better still: use the log-likelihood and related quantities (deviance, adequacy index as described in my book).
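
To make the contrast concrete, here is a small base-R sketch with made-up numbers, not taken from the data in this thread. The two probability matrices sharp and vague, and the helper names pct_correct and log_score, are hypothetical. Both sets of predictions pick the same most-probable category for every case, so percent correct cannot tell them apart, while the log-likelihood rewards the one that puts more probability on what actually happened.

```r
# Observed categories (1, 2, 3) for five hypothetical cases.
y <- c(1, 2, 3, 1, 2)

# Two hypothetical sets of predicted probabilities (rows sum to 1).
sharp <- rbind(c(0.90, 0.05, 0.05),
               c(0.05, 0.90, 0.05),
               c(0.05, 0.05, 0.90),
               c(0.90, 0.05, 0.05),
               c(0.05, 0.90, 0.05))
vague <- rbind(c(0.40, 0.30, 0.30),
               c(0.30, 0.40, 0.30),
               c(0.30, 0.30, 0.40),
               c(0.40, 0.30, 0.30),
               c(0.30, 0.40, 0.30))

# Percent correct: classify to the most probable category, then compare.
pct_correct <- function(p, y) {
  mean(max.col(p, ties.method = "first") == y)
}

# Log-likelihood: sum of log probabilities assigned to the observed category.
log_score <- function(p, y) {
  sum(log(p[cbind(seq_along(y), y)]))
}

pct_correct(sharp, y)  # same value for both sets of predictions
pct_correct(vague, y)
log_score(sharp, y)    # closer to zero is better
log_score(vague, y)
```

Because percent correct only looks at which side of the cut-off a probability falls, it discards how confident the prediction was; the log-likelihood uses the predicted probabilities themselves.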


Frank



--
Frank E Harrell Jr   Professor and Chair           School of Medicine
                     Department of Biostatistics   Vanderbilt University

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
