[R] Logistic regression

Endy BlackEndy Fri, 19 Apr 2013 10:34:26 -0700

Dear colleagues I have a couple of problems related with binary logistic
regression.


The first problem is how to compute Pearson and Likelihood chi-squeared
tests for grouped data.

For the same form of data set how to compute sensitivity, specificity and
related measures.

When I speak about grouped data I mean data of the following form


Alcohol.Consum Malformed NoMalformed

           0                  48            17066

         <1                  38            14464

          1-2                  5               788

          3-5                  1               126

          >=6                 1                 37

(This data set has been taken from the book Categorical data analysis by
Agresti)

The second question is the following:is it correct to upload a grouped data
set to an ungrouped one?

The upload is achieved with the aid of the following routine

Createdataframe <- function(d){

  f1 = f2 = numeric(0)

  v = d[,1]

  for(j in 2:3){

     for(i in 1:dim(d)[1]){

        f1 = c( f1, rep( levels(v)[v[i]], d[i,j] ) )

     }

     f2 = c( f2, rep( 3-j, sum(d[,j]) ) )

  }

  df=data.frame(f1,f2)

  return(df)

}

My finally question is why SPSS computes the Hosmer-Lemeshow test while the
routine listed below gives NaN. (One has to put g=8 in order to get a
numeric value)

The data set is (it is also has been taken from the same book mentioned
above)


FlightNo Temp ThermalDisast

      1       66           0

      2       70           1

      3       69           0

      4       68           0

      5       67           0

      6       72           0

      7       73           0

      8       70           0

       9      57           1

     10      63           1

     11      70           1

     12      78           0

     13      67           0

     14      53           1

     15      67           0

     16     75            0

     17     70            0

     18     81            0

     19     76            0

     20     79            0

     21     75            1

     22     76            0

     23     58             1

and the routine used is

hosmerlemeshow = function(obj, g=10) {

  # first, check to see if we fed in the right kind of object

   stopifnot(family(obj)$family=="binomial" && family(obj)$link=="logit")

   y = obj$model[[1]]

   # the double bracket (above) gets the index of items within an object

   if (is.factor(y))

      y = as.numeric(y)==2

      yhat = obj$fitted.values

      cutyhat=cut(yhat,quantile(yhat,0:g/g),include.lowest=TRUE)

      obs = xtabs(cbind(1 - y, y) ~ cutyhat)

      expect = xtabs(cbind(1 - yhat, yhat) ~ cutyhat)

      if (any(expect < 5))

        # warning("Some expected counts are less than 5. Use smaller number
of groups")

        chisq = sum((obs - expect)^2/expect)

        P = 1 - pchisq(chisq, g - 2)

        cat("H-L2 statistic",round(chisq,4)," and its p value",P,"\n")

        # by returning an object of class "htest", the function will
perform like the

       # built-in hypothesis tests

       return(structure(list(

      method = c(paste("Hosmer and Lemeshow goodness-of-fit test with", g,
"bins", sep=" ")),

      data.name = deparse(substitute(obj)),

      statistic = c(X2=chisq),

       parameter = c(df=g-2),

       p.value = P

      ), class='htest'))

      return(list(chisq=chisq,p.value=P))

}

I found this routine in the internet.

Thank you for your cooperation in advance. Any suggestion and/or solution
will be greatly appreciated.

With regards
Endy

        [[alternative HTML version deleted]]

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Logistic regression

Reply via email to