[R] Mixed-effects model for overdispersed count data?
Hi,

I have to analyse the number of provisioning trips to nestlings according to a number of biological and environmental factors. I was thinking of building a mixed-effects model with species and nestid as random effects, using a Poisson distribution, but the data are overdispersed (variance/mean = 5). I then thought of using a mixed-effects model with a negative binomial distribution, but I have 2 problems:

1- The only package I found that builds mixed models with a neg. bin. distribution is glmmADMB, but I have a hard time understanding the output. Does anyone know of an R package whose output gives p values?

2- Two people I asked for advice told me that I should use either a mixed-effects model with a Poisson distribution (the random effects will take care of the overdispersion) OR a glm with a neg. bin. distribution, but not both at the same time, which would be unnecessary.

Any advice is welcome!

Thank you
Marie-Helene Hachey
M.Sc. student
Universite Laval, Quebec

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
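For readers following the thread, the variance/mean ratio quoted in the question can be checked roughly as follows. This is only a sketch on simulated data: the data frame `dat` and the columns `trips` and `species` are hypothetical stand-ins, not the poster's data, and the negative binomial parameters were chosen so that the theoretical variance/mean ratio is 5.

```r
# Simulated stand-in data; every name here is hypothetical
set.seed(1)
n <- 200
dat <- data.frame(species = factor(sample(letters[1:4], n, replace = TRUE)))

# Negative binomial counts: variance = mu + mu^2 / size = 8 + 64 / 2 = 40,
# so the theoretical variance/mean ratio is 40 / 8 = 5
dat$trips <- rnbinom(n, mu = 8, size = 2)

# Raw variance-to-mean ratio (close to 1 for Poisson data)
vm_ratio <- var(dat$trips) / mean(dat$trips)

# Pearson dispersion statistic from a Poisson GLM: sum of squared
# Pearson residuals divided by the residual degrees of freedom
fit_pois <- glm(trips ~ species, family = poisson, data = dat)
pearson_disp <- sum(residuals(fit_pois, type = "pearson")^2) / df.residual(fit_pois)

print(c(var_mean = vm_ratio, pearson = pearson_disp))
```

The Pearson statistic is usually the more relevant of the two, since it measures the dispersion left over after the fixed effects have been accounted for.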
Re: [R] Mixed-effects model for overdispersed count data?
Marie-Hélène Hachey marie_helene48 at hotmail.com writes:

> Hi, I have to analyse the number of provisioning trips to nestlings according to a number of biological and environmental factors. I was thinking of building a mixed-effects model with species and nestid as random effects, using a Poisson distribution, but the data are overdispersed (variance/mean = 5). I then thought of using a mixed-effects model with negative binomial distribution, but I have 2 problems:
> 1- The only package building mixed models with neg. bin. distribution I found is the package glmmADMB but I have a hard time understanding the output. Anyone knows of a R package with an output that gives p values?
> 2- Two people I asked advice to told me that I should use either a mixed-effect model with a Poisson distribution (the random effects will take care of the overdispersion) OR a glm using neg. bin. distribution but not both at the same time, which would be unnecessary.

Several pieces of advice:

* this question is probably most appropriate for r-sig-mixed-models (or perhaps r-sig-ecology)

* glmmADMB is admittedly a bit scratchy at the moment, but you may not find a package that gives much easier-to-understand output -- almost all packages will give output in terms of fixed effect coefficients, standard errors, and variances/covariances/standard deviations of random effects.

* you might want to consider Poisson-lognormal models instead, which allow for overdispersion and are a bit easier to fit in the context of mixed models, by defining an individual-level random effect: see e.g. Elston, D. A., R. Moss, T. Boulinier, C. Arrowsmith, and X. Lambin. 2001. Analysis of Aggregation, a Worked Example: Numbers of Ticks on Red Grouse Chicks. Parasitology 122, no. 05: 563-569. doi:10.1017/S0031182001007740. http://journals.cambridge.org/action/displayAbstract?fromPage=onlineaid=82701. Such models can be fitted in (at least) MCMCglmm and recent versions of glmer.

* p values will be tricky indeed. sorry about that.

* as to the advice about using either mixed models or NB models but not both -- that's an empirical question. It may indeed be the case that one or the other takes care of the overdispersion, but you won't know until you try. It is certainly possible to have overdispersion even within a species/nestid combination.

I would suggest http://glmm.wikidot.com/faq as a starting point for further reading ...

good luck
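The Poisson-lognormal suggestion above can be sketched like this. It is a minimal illustration on simulated data, not the poster's analysis: the names `dat`, `trips`, `nestid` and `obs` are hypothetical, the effect sizes are arbitrary, and the fit only runs if the lme4 package happens to be installed. The key step is creating a factor with one level per observation and adding it as a random intercept.

```r
# Simulated stand-in data; all names and effect sizes are hypothetical
set.seed(42)
n_nest <- 40   # number of nests
n_per  <- 5    # observations per nest
dat <- data.frame(nestid = factor(rep(seq_len(n_nest), each = n_per)))

# Individual-level (observation-level) factor: one level per row
dat$obs <- factor(seq_len(nrow(dat)))

nest_eff <- rnorm(n_nest, sd = 0.5)        # nest-level random intercepts
obs_eff  <- rnorm(nrow(dat), sd = 0.7)     # per-observation lognormal noise
eta <- log(5) + nest_eff[as.integer(dat$nestid)] + obs_eff
dat$trips <- rpois(nrow(dat), lambda = exp(eta))

if (requireNamespace("lme4", quietly = TRUE)) {
  # Poisson GLMM with nest and observation-level random intercepts;
  # the (1 | obs) term absorbs the extra-Poisson variation
  fit <- lme4::glmer(trips ~ 1 + (1 | nestid) + (1 | obs),
                     family = poisson, data = dat)
  print(lme4::VarCorr(fit))
} else {
  message("lme4 not installed; skipping the fit")
}
```

If the estimated variance of the `obs` term comes out near zero, the grouping random effects alone are probably handling the overdispersion, which is one way of trying out the "empirical question" raised in the last point.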
Re: [R] Mixed-effects model for overdispersed count data?
According to the documentation for glmmADMB, if you fit your model with a statement like

fit =glmm.admb(y~Base*trt+Age+Visit, ... data=epil2,family=nbinom)

then the parameter estimates are in fit$b and their estimated standard deviations are in fit$stdbeta, so presumably p values can be constructed from the quotient fit$b/fit$stdbeta by assuming a t distribution with (somehow) the correct degrees of freedom.
Re: [R] Mixed-effects model for overdispersed count data?
dave fournier otter at otter-rsch.com writes:

> According to the documentation for glmmADMB if you fit your model with a statment like fit =glmm.admb(y~Base*trt+Age+Visit, ... data=epil2,family=nbinom) and that the parameter estimates are in fit$b while their estimated standard deviations are in fit$stdbeta so presumably p values can be constructed from the quotient fit$b/fit$stdbeta by assuming a t distribution with (somehow) the correct degrees of freedom.

As I commented elsewhere (for the record in this group), you would do that in R via

2*pnorm(-abs(fit$b/fit$stdbeta))

for a 2-tailed test, but these values should be taken as order-of-magnitude estimates of the 'true' (???) p-value at best, because they are Wald tests (not score or likelihood ratio tests, both of which are more reliable) and because they assume infinite 'denominator degrees of freedom' (i.e. the equivalent of a Z/chi-squared test rather than a t/F test). Probably reliable only for a large, well-behaved data set (e.g., 40 random-effects levels (species or nests)) ...
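To make the caveat about 'denominator degrees of freedom' concrete, here is a small base-R sketch. The vectors `b` and `stdbeta` are made-up numbers standing in for `fit$b` and `fit$stdbeta`, and `df_assumed = 10` is an arbitrary illustrative value, not a recommendation for any real model.

```r
# Hypothetical estimates and standard errors (stand-ins for fit$b, fit$stdbeta)
b       <- c(intercept = 1.20, trt = -0.45, age = 0.02)
stdbeta <- c(intercept = 0.30, trt =  0.20, age = 0.015)

# Wald statistic: estimate divided by its standard error
z <- b / stdbeta

# Two-tailed p-value from the normal distribution (infinite denominator df)
p_norm <- 2 * pnorm(-abs(z))

# Same statistic referred to a t distribution with an assumed, finite df;
# the value 10 is purely illustrative
df_assumed <- 10
p_t <- 2 * pt(-abs(z), df = df_assumed)

print(round(cbind(estimate = b, se = stdbeta, z = z,
                  p_norm = p_norm, p_t = p_t), 4))
```

The t-based p-values are never smaller than the normal-based ones, so the Z version is the optimistic end of the range; with few random-effects levels the gap can matter.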