On Dec 21, 2010, at 14:22 , S Ellison wrote:

> A possible caveat here.
>
> Traditionally, logistic regression was performed on the
> logit-transformed proportions, with the standard errors based on the
> residuals for the resulting linear fit. This accommodates overdispersion
> naturally, but without telling you that you have any.
>
> glm with a binomial family does not allow for overdispersion unless
> you use the quasibinomial family. If you have overdispersion, standard
> errors from glm will be unrealistically small. Make sure your model fits
> in glm before you believe the standard errors, or use the quasibinomial
> family.

...and before you believe in overdispersion, make sure you have a credible explanation for it. All too often, what you really have is a model that doesn't fit your data properly.

>
> Steve Ellison
> LGC
>
>>>> Ben Bolker <bbol...@gmail.com> 21/12/2010 13:08:34 >>>
> array chip <arrayprofile <at> yahoo.com> writes:
>
> [snip]
>
>> I can think of analyzing this data using glm() with the attached dataset:
>>
>> test<-read.table('test.txt',sep='\t')
>> fit<-glm(cbind(positive,total-positive)~treatment,test,family=binomial)
>> summary(fit)
>> anova(fit, test='Chisq')
>
>> First, is this still called logistic regression or something else? I thought
>> with logistic regression, the response variable is a binary factor?
>
> Sometimes I've seen it called "binomial regression", or just
> "a binomial generalized linear model"
>
>> Second, then summary(fit) and anova(fit, test='Chisq') gave me different p
>> values, why is that? which one should I use?
>
> summary(fit) gives you p-values from a Wald test.
> anova() gives you tests based on the Likelihood Ratio Test.
> In general the LRT is more accurate.
>
>> Third, is there an equivalent model where I can use variable "percentage"
>> instead of "positive" & "total"?
>
> glm(percentage~treatment,weights=total,data=test,family=binomial)
>
> is equivalent to the model you fitted above.
>
>> Finally, what is the best way to analyze this kind of dataset
>> where it's almost the same as ANOVA except that the response variable
>> is a proportion (or success and failure)?
>
> Don't quite know what you mean here. How is the situation "almost
> the same as ANOVA" different from the situation you described above?
> Do you mean when there are multiple factors? or ???
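
For what it's worth, the points quoted above can be tried out with a small sketch along these lines. The data frame is simulated and purely hypothetical, standing in for test.txt, which is not reproduced in this thread; the names p_true, fit_counts, fit_prop, fit_quasi and dispersion are illustrative only. It also assumes that "percentage" is stored as a proportion between 0 and 1 (a 0-100 percentage would need dividing by 100 first).

```r
# Hypothetical data standing in for the poster's test.txt (not the real data).
set.seed(1)
test <- data.frame(
  treatment = factor(rep(c("A", "B", "C"), each = 4)),
  total     = rep(c(20, 25, 30, 35), times = 3)
)
p_true <- c(A = 0.2, B = 0.4, C = 0.6)   # assumed true success probabilities
test$positive   <- rbinom(nrow(test), size = test$total,
                          prob = p_true[as.character(test$treatment)])
test$percentage <- test$positive / test$total   # proportion in [0, 1]

# Two specifications of the same binomial GLM: counts vs. weighted proportions.
fit_counts <- glm(cbind(positive, total - positive) ~ treatment,
                  data = test, family = binomial)
fit_prop   <- glm(percentage ~ treatment, weights = total,
                  data = test, family = binomial)
all.equal(coef(fit_counts), coef(fit_prop))

# Wald tests (one per coefficient) vs. likelihood ratio test (one per term).
summary(fit_counts)$coefficients
anova(fit_counts, test = "Chisq")

# Rough overdispersion check: Pearson chi-square over residual df.
dispersion <- sum(residuals(fit_counts, type = "pearson")^2) /
  df.residual(fit_counts)
dispersion

# Quasibinomial keeps the same estimates and rescales the standard errors
# by the square root of the estimated dispersion.
fit_quasi <- glm(cbind(positive, total - positive) ~ treatment,
                 data = test, family = quasibinomial)
summary(fit_quasi)$dispersion
```

A dispersion estimate well above 1 is only a hint; as said above, it is worth checking whether the model is missing something before reaching for quasibinomial.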

-- 
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd....@cbs.dk  Priv: pda...@gmail.com

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.