Re: [R] Fw: Logistic regresion - Interpreting (SENS) and (SPEC)

Frank E Harrell Jr Mon, 13 Oct 2008 20:30:58 -0700

John Sorkin wrote:

Of course Prof Baer is correct: the positive predictive value (PPV) and the 
negative predictive value (NPV) serve the function of providing conditional 
post-test probabilities:
PPV: Post-test probability of disease given a positive test.
NPV: Post-test probability of no disease given a negative test.


Further, PPV is a function of sensitivity (for a given specificity in a 
population with a given disease prevalence): the higher the sensitivity, almost 
always the greater the PPV (it can be unchanged, but I don't believe it can be 
lower). Likewise, NPV is a function of specificity (for a given sensitivity in a 
population with a given disease prevalence): the higher the specificity, almost 
always the greater the NPV (it can be unchanged, but I don't believe it can be 
lower).
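
The dependence described above follows from Bayes' rule. A minimal sketch in Python, assuming a single binary test and hypothetical values for prevalence, sensitivity and specificity (the function names `ppv` and `npv` are illustrative, not taken from any package):

```python
def ppv(sens, spec, prev):
    """P(disease | positive test), from Bayes' rule."""
    true_pos = sens * prev                   # diseased and test-positive
    false_pos = (1.0 - spec) * (1.0 - prev)  # non-diseased but test-positive
    return true_pos / (true_pos + false_pos)


def npv(sens, spec, prev):
    """P(no disease | negative test), from Bayes' rule."""
    true_neg = spec * (1.0 - prev)           # non-diseased and test-negative
    false_neg = (1.0 - sens) * prev          # diseased but test-negative
    return true_neg / (true_neg + false_neg)


if __name__ == "__main__":
    prev, spec = 0.10, 0.90                  # hypothetical values
    for sens in (0.70, 0.80, 0.90):
        print(f"sens={sens:.2f}  PPV={ppv(sens, spec, prev):.3f}  "
              f"NPV={npv(sens, spec, prev):.3f}")
```

With specificity and prevalence held fixed, PPV rises as sensitivity rises, which matches the claim in the paragraph above for these assumed inputs.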

Thus, using Prof Harrell's suggestion to use the test that moves a pre-test probability a great deal in one or both directions, the test to choose is the one with the largest sensitivity and/or specificity, and thus sensitivity and specificity are, I believe, good summary measures of the "quality" of a clinical test.

I don't see how that follows. At any rate, the use of prevalence, sens., and spec. is indirect. It is easier and faster to just directly model the probability of interest.
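
To make the "indirect versus direct" contrast concrete, here is a rough sketch in Python for the simplest possible case: one binary test, no covariates, and cohort data. The counts are hypothetical. In this special case, directly estimating the probability of disease given a positive test amounts to conditioning on the test result, and it gives the same number as the detour through prevalence, sensitivity and specificity. A logistic model extends the direct route to continuous test results and additional baseline variables.

```python
from math import isclose

# Hypothetical cohort counts (not from the thread):
# tp = diseased, test positive;      fn = diseased, test negative
# fp = non-diseased, test positive;  tn = non-diseased, test negative
tp, fn, fp, tn = 90, 10, 180, 720


def direct_post_test(tp, fp):
    """Condition on what is known: P(disease | test positive)."""
    return tp / (tp + fp)


def indirect_post_test(tp, fn, fp, tn):
    """Same probability via prevalence, sensitivity and specificity."""
    n = tp + fn + fp + tn
    prev = (tp + fn) / n
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    return sens * prev / (sens * prev + (1.0 - spec) * (1.0 - prev))


direct = direct_post_test(tp, fp)
indirect = indirect_post_test(tp, fn, fp, tn)
print(f"direct={direct:.4f}  indirect={indirect:.4f}")
```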

Finally, I think Prof Harrell's observation that sensitivity and specificity change quite a bit, and mathematically must change if the disease is not all-or-nothing, while true, is a degenerate case of little practical importance.

Absolutely not. This happens in everyday practice. See the Hlatky paper below. One of the explanations is that if the disease has various levels of severity and is not all or nothing, patients with severe disease are easier to detect. And there are risk factors for severe disease. These risk factors relate to sensitivity.
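
A small numerical sketch of that explanation, in Python. All numbers are hypothetical: the test is assumed to detect severe disease with probability 0.9 and mild disease with probability 0.5, and two populations differ only in the share of diseased patients who have severe disease.

```python
def overall_sensitivity(frac_severe, detect_severe, detect_mild):
    """Sensitivity among all diseased patients, as a severity-weighted average."""
    return frac_severe * detect_severe + (1.0 - frac_severe) * detect_mild


# Hypothetical detection probabilities by severity.
DETECT_SEVERE, DETECT_MILD = 0.9, 0.5

# Population A: 80% of diseased patients are severe; population B: 20%.
sens_a = overall_sensitivity(0.8, DETECT_SEVERE, DETECT_MILD)
sens_b = overall_sensitivity(0.2, DETECT_SEVERE, DETECT_MILD)
print(f"sensitivity in A: {sens_a:.2f}, in B: {sens_b:.2f}")
```

The same test has a different sensitivity in each population, so anything that predicts severity also predicts sensitivity under these assumptions.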


  author = {Hlatky, M. A. and Pryor, D. B. and Harrell, F. E. and Califf, R.
            M. and Mark, D. B. and Rosati, R. A.},
  year = 1984,
  title = {Factors affecting the sensitivity and specificity of the exercise
           electrocardiography. {M}ultivariable analysis},
  journal = Am J Med,
  volume = 77,
  pages = {64-71},
  annote = {diagnosis;testing;non-constancy of sensitivity and specificity}
}

@Article{gug00inv,
  author = {{Guggenmoos-Holzmann}, Irene and {van Houwelingen}, Hans C.},
  title = {The (in)validity of sensitivity and specificity},
  journal = Stat in Med,
  year = 2000,
  volume = 19,
  pages = {1783-1792},
  annote = {severe problems with sensitivity and specificity;
            diagnosis; testing; teaching MDs; death of sensitivity and specificity}
}
  author = {Moons, Karel G. M. and Harrell, Frank E.},
  title = {Sensitivity and specificity should be de-emphasized
           in diagnostic accuracy studies},
  journal = {Academic Radiology},
  year = 2003,
  volume = 10,
  pages = {670-672},
  note = {Editorial},
  annote = {diagnosis;accuracy;reasons for avoiding sensitivity
            and specificity}
}

Frank

John David Sorkin M.D., Ph.D.
Chief, Biostatistics and Informatics
University of Maryland School of Medicine Division of Gerontology
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing)
"Robert W. Baer, Ph.D." <[EMAIL PROTECTED]> 10/13/2008 4:41 PM >>>
----- Original Message -----
From: "Frank E Harrell Jr" <[EMAIL PROTECTED]>
To: "John Sorkin" <[EMAIL PROTECTED]>
Cc: <r-help@r-project.org>; <[EMAIL PROTECTED]>; <[EMAIL PROTECTED]>
Sent: Monday, October 13, 2008 2:09 PM
Subject: Re: [R] Fw: Logistic regresion - Interpreting (SENS) and (SPEC)
John Sorkin wrote:
Frank,
Perhaps I was not clear in my previous Email message. Sensitivity and specificity do tell us about the quality of a test in that, given two tests, the one with higher sensitivity will be better at identifying subjects who have a disease in a pool of people who have a disease, and the more specific test will be better at identifying subjects who do not have a disease in a pool of people who do not have a disease. It is true that positive predictive and negative predictive values are of greater utility to a clinician, but as you know these two measures are functions of sensitivity, specificity and disease prevalence. All other things being equal, given two tests one would select the one with greater sensitivity and specificity, so in a sense they do measure the "quality" of a clinical test, but not, as I tried to explain, the quality of a statistical model.
That is not very relevant John. It is a function of all those things because those quantities are all deficient.
I would select the test that can move the pre-test probability a great deal in one or both directions.
Of course, this quantity is known as a likelihood ratio and is a function of sensitivity and specificity. For 2 x 2 data one often speaks of positive likelihood ratio and negative likelihood ratio, but for a multi-row contingency table one can define likelihood ratios for a series of cut-off points. This has become a popular approach in evidence-based medicine when diagnostic tests have continuous rather than binary outputs.
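
For the 2 x 2 case, the likelihood ratios and the way they move a pre-test probability can be sketched as follows in Python. The sensitivity, specificity and pre-test probability are hypothetical, and the function names are illustrative only.

```python
def likelihood_ratios(sens, spec):
    """Positive and negative likelihood ratios for a binary test."""
    lr_pos = sens / (1.0 - spec)   # P(T+ | disease) / P(T+ | no disease)
    lr_neg = (1.0 - sens) / spec   # P(T- | disease) / P(T- | no disease)
    return lr_pos, lr_neg


def post_test_probability(pre_test_prob, lr):
    """Post-test odds = pre-test odds * likelihood ratio, converted back."""
    pre_odds = pre_test_prob / (1.0 - pre_test_prob)
    post_odds = pre_odds * lr
    return post_odds / (1.0 + post_odds)


# Hypothetical test characteristics and pre-test probability.
lr_pos, lr_neg = likelihood_ratios(sens=0.9, spec=0.8)
after_positive = post_test_probability(0.10, lr_pos)
after_negative = post_test_probability(0.10, lr_neg)
print(f"LR+={lr_pos:.2f}  LR-={lr_neg:.3f}")
print(f"post-test probability: {after_positive:.3f} after a positive result, "
      f"{after_negative:.3f} after a negative result")
```

For a test with several ordered result categories, the same odds update applies with one likelihood ratio per category or cut-off.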
You are of course correct that sensitivity and specificity are not truly "inherent" characteristics of a test as their values may change from population-to-population, but practically speaking, they don't change all that much, certainly not as much as positive and negative predictive values.
They change quite a bit, and mathematically must change if the disease is not all-or-nothing.
I guess we will disagree about the utility of sensitivity and specificity as simplifying concepts.
Thank you as always for your clear thoughts and stimulating comments.
And thanks for yours John.
Frank
John
among those subjects with a disease and the one with greater specificity will be better at identifying
Frank E Harrell Jr <[EMAIL PROTECTED]> 10/13/2008 2:35 PM >>>
John Sorkin wrote:
Jumping into a thread can be like jumping into a den of lions but here goes . . .

Sensitivity and specificity are not designed to determine the quality of a fit (i.e. if your model is good), but rather are characteristics of a test. A test that has high sensitivity will properly identify a large portion of people with a disease (or a characteristic) of interest. A test with high specificity will properly identify a large proportion of people without a disease (or characteristic) of interest. Sensitivity and specificity inform the end user about the "quality" of a test. Other metrics have been designed to determine the quality of the fit; none that I know of are completely satisfactory. The pseudo R squared is one such measure.

For a given diagnostic test (or classification scheme), different cut-off points for identifying subjects who have disease can be examined to see how they influence sensitivity and 1-specificity using ROC curves.
I await the flames that will surely come my way

John
John this has been much debated but I fail to see how backwards probabilities are that helpful in judging the usefulness of a test. Why not condition on what we know (the test result and other baseline variables) and quit conditioning on what we are trying to find out (disease status)? The data collected in most studies (other than case-control) allow one to use logistic modeling with the correct time order.
Furthermore, sensitivity and specificity are not constants but vary with subjects' characteristics. So they are not even useful as simplifying concepts.
Frank
Frank E Harrell Jr <[EMAIL PROTECTED]> 10/13/2008 12:27 PM >>>
Maithili Shiva wrote:
Dear Mr Peter Dalgaard and Mr Dieter Menne,
I sincerely thank you for helping me out with my problem. The thing is that I have already calculated SENS = Gg / (Gg + Bg) = 89.97%
and SPEC = Bb / (Bb + Gb) = 74.38%.
Now I have values of SENS and SPEC, which are absolute in nature. My question was how do I interpret these absolute values. How do these values help me to find out whether my model is good?
With regards

Ms Maithili Shiva
I can't understand why you are interested in probabilities that are in backwards time order.
Frank
________________________________________________________________________
Subject: [R] Logistic regresion - Interpreting (SENS) and (SPEC)
To: r-help@r-project.org
Date: Friday, October 10, 2008, 5:54 AM
Hi,

I am working on a credit scoring model using logistic
regression. I have a main sample of 42500 clients and based on
their status as regards to defaulted / non-defaulted, I
have generated the probability of default.

I have a hold out sample of 5000 clients. I have calculated
(1) No of correctly classified goods (Gg), (2) No of correctly
classified bads (Bb) and also (3) number of wrongly classified
bads (Gb) and (4) number of wrongly classified goods (Bg).

My problem is how to interpret these results? What I have
arrived at are the absolute figures.
--
Frank E Harrell Jr   Professor and Chair           School of Medicine
                     Department of Biostatistics   Vanderbilt University

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

