Hi all,

Could somebody be so kind as to explain to me what the saturated model is, relative to which the deviance and degrees of freedom are calculated, when fitting a binomial glm?
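For reference, the grouped form of such a fit can be sketched as follows, using the UCBAdmissions data. This is only a sketch: the object names (UCB, adm, rej, grouped, fit_grouped) are made up for illustration. With one row per Gender x Dept cell, the saturated model has a separate admission probability for each of the 12 cells, so a model with an intercept, one Gender term, and five Dept terms should leave 5 residual degrees of freedom.

```r
# Long format: one row per Admit x Gender x Dept combination, counts in Freq.
UCB <- as.data.frame(UCBAdmissions)

# Split the counts by outcome and put them side by side, one row per cell.
adm <- subset(UCB, Admit == "Admitted", select = c(Gender, Dept, Freq))
rej <- subset(UCB, Admit == "Rejected", select = c(Gender, Dept, Freq))
grouped <- merge(adm, rej, by = c("Gender", "Dept"),
                 suffixes = c(".adm", ".rej"))

# Two-column response: cbind(successes, failures).
fit_grouped <- glm(cbind(Freq.adm, Freq.rej) ~ Gender + Dept,
                   family = binomial, data = grouped)

nrow(grouped)            # number of Gender x Dept cells
fit_grouped$df.residual  # cells minus fitted parameters
fit_grouped$deviance     # deviance relative to the cell-wise saturated model
```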
Everything makes sense if I fit the model using as response a vector of proportions or a two-column matrix. But when the response is a factor and the counts are specified via the "weights" argument, I am lost as to what the implied saturated model is.

Here is a simple example, based on the UCBAdmissions data.

> UCBAd <- as.data.frame(UCBAdmissions)
> UCBAd <- glm(Admit ~ Gender + Dept, family = binomial,
+              weights = Freq, data = UCBAd)
> UCBAd$deviance
[1] 5187.488
> UCBAd$df.residual
[1] 17

I can see that the original data frame has 24 rows, and that using 1+1+5 parameters to fit the model leaves me with 17 degrees of freedom. What is not clear to me is: what is the saturated model? Is it the model that fits a probability of zero to each row corresponding to failures and a probability of one to each row corresponding to successes? If so, it seems to me that looking at the deviance as a goodness-of-fit statistic does not make much sense in this case.

Am I missing something?

Thank you in advance,
Giovanni

--
Giovanni Petris <gpet...@uark.edu>
Associate Professor
Department of Mathematical Sciences
University of Arkansas - Fayetteville, AR 72701
Ph: (479) 575-6324, 575-8630 (fax)
http://definetti.uark.edu/~gpetris/