[R] Logistic Regression

Endy BlackEndy Fri, 19 Apr 2013 12:16:37 -0700

Dear colleagues I have a couple of problems related with binary logistic
regression.


  The first problem is how to compute Pearson and Likelihood chi-squeared
tests for grouped data.

 For the same form of data set how to compute sensitivity, specificity and
related measures.

 When I speak about grouped data I mean data of the following form



                                  Alcohol.Consum   Malformed     NoMalformed

                                            0
                   48                    17066

                                          <1
38                    14464

                                           1-2
                    5                        788

                                           3-5
                    1                        126

                                          >=6
                    1                          37



(This data set has been taken from the book Categorical data analysis by
Agresti)



   The second question is the following:is it correct to upload a grouped
data set to an ungrouped one?

The upload is achieved with the aid of the following routine



Createdataframe <- function(d){

 f1 = f2 = numeric(0)

 v = d[,1]

 for(j in 2:3){

  for(i in 1:dim(d)[1]){

   f1 = c( f1, rep( levels(v)[v[i]], d[i,j] ) )

  }

  f2 = c( f2, rep( 3-j, sum(d[,j]) ) )

 }

 df=data.frame(f1,f2)

 return(df)

}



    My finally question is why SPSS computes the Hosmer-Lemeshow test while
the routine listed below gives NaN. (One has to put g=8 in order to get a
numeric value)



The data set is (it is also has been taken from the same book mentioned
above)



FlightNo           Temp    ThermalDisast

1          66        0

2          70        1

3          69        0

4          68        0

5          67        0

6          72        0

7          73        0

8          70        0

9          57        1

10        63        1

11        70        1

12        78        0

13        67        0

14        53        1

15        67        0

16        75        0

17        70        0

18        81        0

19        76        0

20        79        0

21        75        1

22        76        0

23        58        1



and the routine used is



hosmerlemeshow = function(obj, g=10) {

# first, check to see if we fed in the right kind of object

   stopifnot(family(obj)$family=="binomial" && family(obj)$link=="logit")

   y = obj$model[[1]]

   # the double bracket (above) gets the index of items within an object

   if (is.factor(y))

      y = as.numeric(y)==2

   yhat = obj$fitted.values

   cutyhat=cut(yhat,quantile(yhat,0:g/g),include.lowest=TRUE)



   obs = xtabs(cbind(1 - y, y) ~ cutyhat)

   expect = xtabs(cbind(1 - yhat, yhat) ~ cutyhat)

   if (any(expect < 5))

   #   warning("Some expected counts are less than 5. Use smaller number of
groups")

   chisq = sum((obs - expect)^2/expect)

   P = 1 - pchisq(chisq, g - 2)

   cat("H-L2 statistic",round(chisq,4)," and its p value",P,"\n")

   # by returning an object of class "htest", the function will perform
like the

   # built-in hypothesis tests

   return(structure(list(

      method = c(paste("Hosmer and Lemeshow goodness-of-fit test with", g,
"bins", sep=" ")),

      data.name = deparse(substitute(obj)),

      statistic = c(X2=chisq),

      parameter = c(df=g-2),

      p.value = P

   ), class='htest'))

   return(list(chisq=chisq,p.value=P))

}



Thank you for your cooperation in advance. Any suggestion and/or solution
will be greatly appreciated.



With regards

Endy

        [[alternative HTML version deleted]]

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Logistic Regression

Reply via email to