Dear colleagues I have a couple of problems related with binary logistic regression.
The first problem is how to compute Pearson and Likelihood chi-squeared tests for grouped data. For the same form of data set how to compute sensitivity, specificity and related measures. When I speak about grouped data I mean data of the following form Alcohol.Consum Malformed NoMalformed 0 48 17066 <1 38 14464 1-2 5 788 3-5 1 126 >=6 1 37 (This data set has been taken from the book Categorical data analysis by Agresti) The second question is the following:is it correct to upload a grouped data set to an ungrouped one? The upload is achieved with the aid of the following routine Createdataframe <- function(d){ f1 = f2 = numeric(0) v = d[,1] for(j in 2:3){ for(i in 1:dim(d)[1]){ f1 = c( f1, rep( levels(v)[v[i]], d[i,j] ) ) } f2 = c( f2, rep( 3-j, sum(d[,j]) ) ) } df=data.frame(f1,f2) return(df) } My finally question is why SPSS computes the Hosmer-Lemeshow test while the routine listed below gives NaN. (One has to put g=8 in order to get a numeric value) The data set is (it is also has been taken from the same book mentioned above) FlightNo Temp ThermalDisast 1 66 0 2 70 1 3 69 0 4 68 0 5 67 0 6 72 0 7 73 0 8 70 0 9 57 1 10 63 1 11 70 1 12 78 0 13 67 0 14 53 1 15 67 0 16 75 0 17 70 0 18 81 0 19 76 0 20 79 0 21 75 1 22 76 0 23 58 1 and the routine used is hosmerlemeshow = function(obj, g=10) { # first, check to see if we fed in the right kind of object stopifnot(family(obj)$family=="binomial" && family(obj)$link=="logit") y = obj$model[[1]] # the double bracket (above) gets the index of items within an object if (is.factor(y)) y = as.numeric(y)==2 yhat = obj$fitted.values cutyhat=cut(yhat,quantile(yhat,0:g/g),include.lowest=TRUE) obs = xtabs(cbind(1 - y, y) ~ cutyhat) expect = xtabs(cbind(1 - yhat, yhat) ~ cutyhat) if (any(expect < 5)) # warning("Some expected counts are less than 5. Use smaller number of groups") chisq = sum((obs - expect)^2/expect) P = 1 - pchisq(chisq, g - 2) cat("H-L2 statistic",round(chisq,4)," and its p value",P,"\n") # by returning an object of class "htest", the function will perform like the # built-in hypothesis tests return(structure(list( method = c(paste("Hosmer and Lemeshow goodness-of-fit test with", g, "bins", sep=" ")), data.name = deparse(substitute(obj)), statistic = c(X2=chisq), parameter = c(df=g-2), p.value = P ), class='htest')) return(list(chisq=chisq,p.value=P)) } Thank you for your cooperation in advance. Any suggestion and/or solution will be greatly appreciated. With regards Endy [[alternative HTML version deleted]]
______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.