On 10-10-26 05:27 AM, Karen Kotschy wrote:
> Dear list
>
> This seems like something I really should know by now, but I'm getting so
> confused, I'd really appreciate a little help!
>
> I am trying to model the relationship between relative abundance (%) and
> relative cover (%) data for plant species. I want to know to
> what extent the 2 measures correlate, and to compare the extent of this
> correlation at different sites. Obviously, both sets of data are
> zero-inflated and highly skewed.
>
> The "traditional" thing to do would be to log-transform both of them and
> use lm(). However, a recent paper (O'Hara & Kotze, 2010) argues that a
> much better approach is to use glm() and to specify Poisson or negative
> binomial models, rather than using transformations. This does make a lot
> of sense, I think!
>
> I have tried using "quasipoisson" and "quasibinomial" families in glm(),
> but I am left with a number of questions:
>
> 1) Should relative abundance and relative cover be treated as "count"
> data, given that the values are not actually integers but rather
> percentages?
>
> 2) Which parts of the output of glm(...family=quasipoisson(link=log)) do I
> use to evaluate the fit? Just residual deviance and the p value?
>
> 3) How do I plot the data so as to graphically represent the model? If I
> am using a log link should I use log axes for x and y?
>
> Thanks so much for any help!
> Karen

Interesting paper by O'Hara and Kotze, but it does not refer to cover (compositional) data, but rather to count data. Cover data is actually a considerably harder problem to handle in the generalized linear model case (alas), *unless* the data come from a point count of some sort (i.e., where you know the 'denominator', or the total number of counts that would correspond to 100% cover), in which case you can use a binomial GLM: see e.g. [Seavy, N. E., S. Quader, John D. Alexander, and C. John Ralph. 2002. Generalized linear models and point count data: statistical considerations for the design and analysis of monitoring studies. In Bird Conservation Implementation and Integration in the Americas: Proceedings of the Third International Partners in Flight Conference, ed. C. John Ralph and Terrell D. Rich, 2:744-753. Asilomar, CA: U.S. Dept. of Agriculture, Forest Service, Pacific Southwest Research Station, March 20. http://www.fs.fed.us/psw/publications/documents/psw_gtr191/psw_gtr191_0744-0753_seavy.pdf].

The natural (to a statistician) way to deal with this would be via beta regression [Smithson, Michael, and Jay Verkuilen. 2006. A better lemon squeezer? Maximum-likelihood regression with beta-distributed dependent variables. Psychological Methods 11, no. 1 (March): 54-71. doi:2006-03820-004]. Beta distributions are a natural description of cover -- they are distributions defined on [0,1] with a simple mathematical description, and they can be fitted similarly to GLMs [see the 'betareg' package on CRAN]. I think I heard a talk at ESA a few years ago that used beta regression (or maybe I just thought it should have used beta regression).

There's one big problem, though -- zeros do not naturally fit into the statistical framework, so you have to do some kind of ad hoc fix for this (this is discussed, briefly, in Smithson and Verkuilen 2006). I looked for ecology papers that used beta regression or cited SV2006, but didn't find very many (see below).

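
For what it's worth, here is a minimal sketch of the two routes in R. Everything in it is an assumption made for illustration: the data are simulated, and the column names (`hits`, `n_points`, `rel_abund`, `site`) and the helper `squeeze()` are made up. The `squeeze()` rescaling is the ad hoc adjustment suggested by Smithson and Verkuilen (2006) for pulling exact 0s and 1s into the open interval; it is only one possible fix, not necessarily the best one for your data. The beta regression part runs only if the 'betareg' package is installed.

```r
## Simulated stand-in for real data: 40 plots at two sites (all hypothetical).
set.seed(1)
d <- data.frame(
  site      = factor(rep(c("A", "B"), each = 20)),
  rel_abund = runif(40, 0, 0.6)   # relative abundance as a proportion, not %
)
d$n_points <- 100                 # known denominator (points sampled per plot)
d$hits     <- rbinom(40, size = d$n_points,
                     prob = plogis(-3 + 4 * d$rel_abund))

## Route 1: known denominator -> (quasi)binomial GLM on hits out of n_points.
## The rel_abund:site interaction asks whether the relationship differs by site.
fit_bin <- glm(cbind(hits, n_points - hits) ~ rel_abund * site,
               family = quasibinomial(link = "logit"), data = d)
print(summary(fit_bin))

## Route 2: no natural denominator -> beta regression on the proportion.
## Exact 0s and 1s are not allowed, so squeeze them into (0, 1) first:
## y' = (y * (n - 1) + 0.5) / n, with n the sample size.
squeeze <- function(y, n = length(y)) (y * (n - 1) + 0.5) / n

d$cover    <- d$hits / d$n_points  # cover as a proportion in [0, 1]
d$cover_sq <- squeeze(d$cover)     # now strictly inside (0, 1)

if (requireNamespace("betareg", quietly = TRUE)) {
  fit_beta <- betareg::betareg(cover_sq ~ rel_abund * site, data = d)
  print(summary(fit_beta))
}
```

With real data you would replace the simulated columns with your own and convert percentages to proportions (divide by 100) before fitting either model.
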
If you have a point count statistic, I would analyze your data in terms of 'number of points occupied out of total census points', with a binomial or quasibinomial model. If they are assessed in some other way where there is no natural denominator, I would either (sigh) use transformations or look into beta regression.

I'd be interested to hear other opinions.

good luck,
Ben Bolker

Boughton, Elizabeth H., Pedro F. Quintana-Ascencio, and Patrick J. Bohlen. 2010. Refuge effects of Juncus effusus in grazed, subtropical wetland plant communities. Plant Ecology (9). doi:10.1007/s11258-010-9836-4. http://www.springerlink.com/content/u18v4526k10uw2p1/.

Irvine, Kathryn M., and Thomas J. Rodhouse. 2010. Power analysis for trend in ordinal cover classes: implications for long-term vegetation monitoring. Journal of Vegetation Science (8): no-no. doi:10.1111/j.1654-1103.2010.01214.x. http://onlinelibrary.wiley.com/doi/10.1111/j.1654-1103.2010.01214.x/full.

Royo, Alejandro A., Ramona Bates, and Elizabeth P. Lacey. 2008. Demographic constraints in three populations of Lobelia boykinii: a rare wetland endemic. The Journal of the Torrey Botanical Society 135, no. 2 (4): 189-199. doi:10.3159/07-RA-039.1. http://www.bioone.org/doi/abs/10.3159/07-RA-039.1?cookieSet=1&prevSearch=.

_______________________________________________
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology