Not sure if it's relevant, but this paper and the associated function *beta.div.comp* {adespatial} may be of interest for disentangling richness differences from turnover in composition data: https://doi.org/10.1111/geb.12207
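In case it helps, here is a rough sketch of the idea for a single pair of sites with presence-absence data. It follows my reading of the Jaccard-based, Podani-family indices summarized in that paper: with a shared species, b species only at site 1 and c species only at site 2, replacement is 2*min(b, c)/(a+b+c) and richness difference is |b-c|/(a+b+c). It is written in plain Python only to keep it dependency-free; the function name `decompose_jaccard` and the toy species sets are made up for illustration, and this is not the code behind beta.div.comp.

```python
def decompose_jaccard(site1, site2):
    """Split the Jaccard dissimilarity between two sites into a
    replacement (turnover) part and a richness-difference part.

    site1, site2: sets of species names present at each site.
    """
    a = len(site1 & site2)  # species shared by both sites
    b = len(site1 - site2)  # species found only at site 1
    c = len(site2 - site1)  # species found only at site 2
    n = a + b + c
    if n == 0:
        raise ValueError("both sites are empty, dissimilarity is undefined")
    return {
        "replacement": 2 * min(b, c) / n,
        "richness_difference": abs(b - c) / n,
        "dissimilarity": (b + c) / n,
    }


# Hypothetical toy data: two sites sharing two species.
site1 = {"sp1", "sp2", "sp3", "sp4"}
site2 = {"sp3", "sp4", "sp5"}
print(decompose_jaccard(site1, site2))
```

The two components add up to the Jaccard dissimilarity (up to floating-point rounding), so a fully nested pair of sites has zero replacement and a pair with equal richness has zero richness difference. For real data, see ?beta.div.comp for the variants the function offers.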
Tania Bird MSc
*"There is a sufficiency in the world for man's need but not for man's greed" ~ Mahatma Gandhi*
https://www.linkedin.com/in/taniabird

On Thu, 4 Apr 2019 at 10:28, Torsten Hauffe <torsten.hau...@gmail.com> wrote:

> Great point David!
>
> Since Tim was referring to microbial communities, the gjam package is
> similar to mvabund, boral etc. and the microbial example discussed in the
> following paper might be of interest.
>
> https://esajournals.onlinelibrary.wiley.com/doi/full/10.1002/ecm.1241
>
> With that being about R itself, I may go a bit off topic:
> In all those multivariate GLM approaches, is there a way to disentangle
> richness differences (or nestedness) and turnover like we can do with
> pairwise distances?
> (See the inspiring discussion between Carvalho et al. and Baselga et al.;
> summarized in
> http://onlinelibrary.wiley.com/doi/10.1111/geb.12207/abstract
> )
> Since different biological processes may cause these patterns, separating
> richness differences and species turnover is of interest. Maybe the row
> effect in those multivariate GLMs could be estimated as a response to
> environmental predictors?
>
> Cheers,
> Torsten
>
> On Thu, 4 Apr 2019 at 01:19, David Warton <david.war...@unsw.edu.au>
> wrote:
>
> > Hi Tim,
> > Yes, you are right, this is an issue: BC (and other distance metrics) are
> > sensitive to sampling intensity, which is often an artefact of the
> > sampling technique. Transformation is not a great solution to the
> > problem - it works imperfectly and will have different effects depending
> > on the properties of your data. There are lots of different types of
> > datasets out there, each with different properties, and different
> > behaviours under different transformation/standardisation strategies, so
> > there is no one-transformation-suits-all solution.
> > An illustration of this (in the case of row standardisation) is in the
> > paper below:
> >
> > https://besjournals.onlinelibrary.wiley.com/doi/10.1111/2041-210X.12843
> >
> > The strategy I would advise here is to go a very different route and
> > build a statistical model for the data. You can then include row effects
> > in the model to handle variation in sampling intensity across rows of
> > data (along the lines of equation 2 of the above paper). Or if the
> > magnitude of the variation in sampling intensity is known (e.g. it is
> > due to changes in sizes of quadrats used for sampling, and quadrat size
> > has been recorded), then the standard approach to handle this is to add
> > an offset to the model. There is plenty of software out there that can
> > fit suitable statistical models with row effects (and offsets) for this
> > sort of data, including the mvabund, HMSC, boral, and gllvm packages in
> > R. Importantly, these packages come with diagnostic tools to check that
> > the analysis approach adequately captures key properties of your data -
> > an essential step in any analysis.
> >
> > All the best
> > David
> >
> > Professor David Warton
> > School of Mathematics and Statistics, Evolution & Ecology Research Centre,
> > Centre for Ecosystem Science
> > UNSW Sydney
> > NSW 2052 AUSTRALIA
> > phone +61(2) 9385 7031
> > fax +61(2) 9385 7123
> >
> > http://www.eco-stats.unsw.edu.au
> >
> > ----------------------------------------------------------------------
> >
> > Date: Tue, 2 Apr 2019 17:15:45 +0200
> > From: Tim Richter-Heitmann <trich...@uni-bremen.de>
> > To: r-sig-ecology@r-project.org
> > Subject: [R-sig-eco] interpreting ecological distance approaches (Bray
> > Curtis after various data transformation)
> > Message-ID: <3834fea1-040a-12b5-c3a3-633e68dc6...@uni-bremen.de>
> > Content-Type: text/plain; charset="utf-8"; Format="flowed"
> >
> > Dear list,
> >
> > I am not an ecologist by training, so please bear with me.
> >
> > It is my understanding that Bray Curtis distances seem to be sensitive to
> > different community sizes. Thus, they seem to deliver inadequate results
> > when the different community sizes are the result of technical artifacts
> > rather than biology (see e.g. Weiss et al, 2017 on microbiome data).
> >
> > Therefore, I often see BC distances made on relative data (which seems to
> > be equivalent to the Manhattan distance) or on data which has been
> > subsampled to even sizes (e.g. rarefying). Sometimes I also see Bray
> > Curtis distances calculated on Hellinger-transformed data, which is the
> > square root of relative data. This again makes sample sizes unequal (but
> > only to a small degree), so I wondered if this is a valid approach,
> > especially considering that the "natural" distance choice for Hellinger
> > transformed data is Euclidean (to obtain, well, the Hellinger distance).
> >
> > Another question is what different sizes (i.e. the sums) of Hellinger
> > transformed communities represent? I tested some datasets, and couldn't
> > find a correlation between original sample sizes and their Hellinger
> > transformed counterparts.
> >
> > Any advice is very much welcome. Thank you.
> >
> > --
> > Dr. Tim Richter-Heitmann
> >
> > University of Bremen
> > Microbial Ecophysiology Group (AG Friedrich)
> > FB02 - Biologie/Chemie
> > Leobener Straße (NW2 A2130)
> > D-28359 Bremen
> > Tel.: 0049(0)421 218-63062
> > Fax: 0049(0)421 218-63069
> >
> > _______________________________________________
> > R-sig-ecology mailing list
> > R-sig-ecology@r-project.org
> > https://stat.ethz.ch/mailman/listinfo/r-sig-ecology