On Tue, 30 Sep 2014, separ...@yahoo.com wrote:
See Filzmoser et al. (2009) http://www.statistik.tuwien.ac.at/forschung/SM/SM-2009-2.pdf
Serge-Étienne, I'll read this carefully; I had not found it in my literature searches.
Filzmoser et al. (2009) wrote that "Some measures like the standard deviation (or the variance) make no statistical sense with closed data [...]".
It is also difficult to make ecological sense from these measures, particularly when communicating with non-technical decision-makers.
They also wrote that "If Euclidean geometry is not valid, the arithmetic mean is quite likely to be a poor estimate of the data center." As Euclidean geometry is not valid for compositions, you have to compute the mean in the ilr or clr space (both are Euclidean; alr is not). The mean.acomp function computes the mean in Euclidean space, then back-transforms the result to the compositional space.
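To make the "transform, average, back-transform" idea concrete, here is a minimal sketch in plain Python. It is not the compositions package and not the code behind mean.acomp; the helper names (closure, clr, clr_inv, compositional_mean) and the count values are made up for illustration. It assumes every part is strictly positive, so zero counts would need separate treatment before taking logs.

```python
import math

def closure(x):
    """Rescale a vector of positive parts so that it sums to 1."""
    total = sum(x)
    return [v / total for v in x]

def clr(x):
    """Centred log-ratio transform: log of each part minus the mean log."""
    logs = [math.log(v) for v in x]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]

def clr_inv(z):
    """Back-transform clr coordinates to a composition summing to 1."""
    return closure([math.exp(v) for v in z])

def compositional_mean(rows):
    """Average the rows in clr space, then back-transform the result."""
    transformed = [clr(closure(r)) for r in rows]
    n = len(transformed)
    centre = [sum(col) / n for col in zip(*transformed)]
    return clr_inv(centre)

def arithmetic_mean(rows):
    """Plain column-wise mean of the closed rows, for comparison."""
    closed = [closure(r) for r in rows]
    n = len(closed)
    return [sum(col) / n for col in zip(*closed)]

# Hypothetical counts: three samples (rows) by three taxa (columns).
counts = [[10, 30, 60], [40, 40, 20], [5, 15, 80]]

print("clr-based centre:", compositional_mean(counts))
print("arithmetic mean: ", arithmetic_mean(counts))
```

The clr-based centre is the closed geometric mean of the columns, so it generally differs from the arithmetic mean of the proportions even though both sum to 1.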
My developing understanding of log-ratio transformations is that they only apply to data such as chemical concentrations. The transformation serves to close the data so the row totals are 1.0 or 100. With count data, the proportions represent the column count as a proportion of the row total, so conversion to the compositional space and closure are not necessary, as the proportions already sum to 1.0. Perhaps the Filzmoser et al. paper will clear up and correct my understanding of how count data should be handled.
What I need to do is calculate a valid measure of the variability of these count data in each data set, compare the data sets (perhaps by clustering), identify explanatory geomorphic, hydrologic, and/or chemical data via regression analysis, and examine temporal changes within each data set. I'm certainly open to suggestions and advice, as CoDA is new to me.
Thanks again,
Rich