Thank you Juan. This does indeed look promising.

Cheers,
Marc

On Thu, Jul 18, 2019 at 4:08 PM Juan Antonio Balbuena <j.a.balbu...@uv.es> wrote:
> You may have a look at the RRPP package (Collyer & Adams 2018, Methods
> Ecol Evol, doi: 10.1111/2041-210X.13029).
>
> Best
>
> Juan
>
> On 18/07/2019 at 14:48, Marc Taylor wrote:
> > Dear all,
> >
> > I am working with a rather large community dataset (~30 spp and ~20,000
> > samples), which I am trying to relate to several environmental variables.
> > One of the environmental variables that I am specifically interested in
> > is a factor.
> >
> > I have conducted a CCA (vegan::cca) relating the env variables to the
> > community, and have found significant effects. I am now interested in
> > understanding how to characterize my community data with respect to the
> > categorical env factor mentioned earlier, i.e. is a given spp more
> > associated with a given level of the env variable?
> >
> > In the past, I have used vegan::simper for such explorations, but am
> > having difficulty applying it here due to the size of the data (i.e. a
> > distance matrix for the biological community would be 20,000 x 20,000).
> > I'm wondering if there are any approaches that I might apply that are
> > less computationally intensive.
> >
> > One approach that I explored was to look for the closest env level for
> > each species based on CCA coordinates (from significant axes) using a
> > nearest neighbor search (FNN::get.knnx). The results look promising (see
> > example below), but I haven't come across the approach anywhere else. It
> > is also unfortunately not a statistical test, so I lose any quantitative
> > measure of how likely these associations are.
> >
> > Does anyone have any other suggestions for how I might do such an
> > analysis with this large dataset?
> >
> > Many thanks in advance,
> > Marc
> >
> >
> > #### EXAMPLE ####
> >
> > set.seed(1)
> >
> > ### required packages and data
> > library(FNN)
> > library(vegan)
> >
> > data(dune)
> > data(dune.env)
> > str(dune.env)
> >
> > ### fit model
> > mod <- cca(dune ~ A1 + Moisture + Management, data = dune.env)
> > # visualize
> > plot(mod, display = c("sp", "bp", "cn"))
> >
> > ### Permutation Test for CCA
> > # terms sig. test (tested sequentially - i.e. order matters)
> > (At <- anova(mod, by = "terms", permutations = 499))
> > # cca axes sig. test
> > (Ax <- anova(mod, by = "axis", permutations = 499, cutoff = 0.1))
> >
> > ### Determine nearest neighbor of Moisture level for each species
> > # number of significant CCA axes
> > n <- 2
> > # retrieve un-scaled CCA coordinates (species scores and factor centroids)
> > res <- scores(mod, choices = seq_len(n), scaling = 0,
> >   display = c("sp", "cn"))
> > # get CCA indices for Moisture levels
> > lev <- paste0("Moisture", levels(dune.env$Moisture))
> > mat <- match(lev, rownames(res$centroids))
> >
> > # nearest neighbor (first column holds the closest centroid)
> > pred <- get.knnx(data = res$centroids[mat, ], query = res$species,
> >   k = length(lev))$nn.index[, 1]
> > # return results
> > tmp <- data.frame(spp = rownames(res$species),
> >   nearest_level = lev[pred])
> > tmp
> >
> > # colour species labels by nearest Moisture level
> > plot(mod, display = "cn")
> > text(mod, display = "sp",
> >   col = rainbow(length(lev))[pred])
> >
> > _______________________________________________
> > R-sig-ecology mailing list
> > R-sig-ecology@r-project.org
> > https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>
> --
>
> Dr. Juan A. Balbuena
> Cavanilles Institute of Biodiversity and Evolutionary Biology
> Symbiont Ecology and Evolution Lab
> University of Valencia
> http://www.uv.es/~balbuena
> http://www.uv.es/cophylpaco
> P.O. Box 22085, 46071 Valencia, Spain
> e-mail: j.a.balbu...@uv.es
> tel. +34 963 543 658, fax +34 963 543 733
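
For readers following the thread: the nearest-level step described in the original message amounts to computing the distance from each species to each factor-level centroid in ordination space and keeping the closest one. The sketch below shows that idea in base R only. The coordinates and the names `centroids`, `species`, `d2` and `nearest` are made up for illustration and are not taken from the dune data or from the FNN package.

```r
# Hypothetical coordinates on two ordination axes (illustrative values only)
centroids <- rbind(Moisture1 = c(-1.0,  0.5),
                   Moisture2 = c(-0.2, -0.8),
                   Moisture5 = c( 1.2,  0.1))
species   <- rbind(sppA = c(-0.9,  0.4),
                   sppB = c( 1.0,  0.3),
                   sppC = c( 0.0, -0.6))

# Squared Euclidean distance from every species (rows) to every centroid (columns)
d2 <- sapply(seq_len(nrow(centroids)), function(j) {
  rowSums(sweep(species, 2, centroids[j, ])^2)
})
colnames(d2) <- rownames(centroids)

# Closest centroid per species, and the distance to it
nearest <- colnames(d2)[apply(d2, 1, which.min)]
data.frame(spp = rownames(species),
           nearest_level = nearest,
           distance = sqrt(apply(d2, 1, min)))
```

The distance matrix here is only species x levels (about 30 x a handful in the dataset described above), so it avoids the 20,000 x 20,000 sample-by-sample matrix. As noted in the original message, this gives an assignment but no significance test.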