Hello,

A few days ago I posted the following question and received the answers below:

Dear friends,

I would like to ask for some advice.

I am embarking on the analysis of occurrence data for 3,000 plant species across biogeographic scales in South America. I would like to try moving from more traditional distance-based multivariate analysis (e.g., RDA on Hellinger-transformed abundance data) to multivariate GLMs as proposed by you (mvabund package) and also by Yee (VGAM package).

However, distance-based methods have grown to incorporate spatial dependency through the development of MEM and AEM techniques, which model symmetric and asymmetric spatial relationships and can be included on the explanatory side of the analysis.

Reading the multivariate GLM papers, however, I have not found exactly how to control for or include spatial autocorrelation. I am thinking of including MEM and perhaps AEM variables simply as covariables added to the explanatory environmental variables in the multivariate GLM.

Is this a step I will regret later on? Is this OK?

A second quick question: common GLM analyses are carried out as a series of nested models in which we exclude variables from an initial full model based on ANOVAs/AIC. I suppose this is also true for multivariate GLM. Is it?
Can I compare successive models using the same approach used in common GLM?

Thanks in advance for any thoughts,

All the best,

Alexandre

Replies
**************************

David Warton

Hi Alex,

Thanks for the e-mail, sounds like interesting stuff!

Yes, you could, as you say, use the MEM and AEM techniques with manyglm. While this is not the best approach for handling spatial data, it is the simplest, and currently the best one available given the lack of code for an alternative.

And yes, you could use an AIC approach for model selection.

*****

Hi,

The only thing I am aware of is the spatial autocorrelation structures available in the nlme package. For example:

    null.model <- lme(fixed = A ~ B, data = data, random = ~ 1 | dummy, method = "ML")
    cor.model <- update(null.model, correlation = corExp(form = ~ x + y), method = "ML")

The "correlation" argument accepts several forms of spatial models based on the variogram (here exponential, based on x-y coordinates). One can extract model goodness of fit with AIC() or just summary().

However, this is a univariate GLM (though it can be extended to interactions), and as far as I was told these procedures only exist for Gaussian distributions, not for Poisson/negative binomial, which are better for species data most of the time.

I was looking for the same, but in the end I went back to RDA with dbMEMs and used the aforementioned procedure only for highly correlated univariate pairs in the dataset.

Please let me know if you are more successful.

*****

Hi Alexandre,

Not sure what the best solution is, but a few hacker ideas come to mind. First, you could create a spatially lagged variable from scratch. This would be done by deciding on a neighborhood size, say first-order neighbors, and then creating a variable that is the average response (Y) value of the first-order neighbors. Neighborhood size could be guesstimated by looking at residual maps.
This is similar to what happens in simultaneous autoregressive (SAR) lagged models. Then this lagged variable could be a fixed covariate in your model. You could test the residuals from the lagged model to see if this removed your spatial autocorrelation.

Since you mentioned a GAM approach, you could also do a spatial GAM, where Lat and Long variables are specified as smooth covariates with lots of knots to account for short-range spatial structure. Again, you could test your residuals to see if this removed your spatial autocorrelation.

If you are comfortable with Bayesian modeling, Banerjee et al. (2015, ‘Hierarchical modeling and analysis for spatial data’) have a chapter on multivariate spatial modeling, with a brief mention of generalized linear models.

Some food for thought.

*****

Alexander,

Any chance you might include spatial dependency (however you may choose to do it) as a random effect in a mixed-model structure? This way you can either run the model with the spatial dependency to test this explicitly or remove this effect from the model structure.

And yes, you can use AIC to rank multivariate models.

Just a quick note.

*****

Furthermore, I received the suggestion to read the following papers:

Spatial factor analysis: a new tool for estimating joint species distributions and correlations in species range. James T. Thorson, Mark D. Scheuerell, Andrew O. Shelton, Kevin E. See, Hans J. Skaug, and Kasper Kristensen. Methods in Ecology and Evolution, 2015.

Geostatistical delta-generalized linear mixed models improve precision for estimated abundance indices for West Coast groundfishes. James T. Thorson, Andrew O. Shelton, Eric J. Ward, and Hans J. Skaug. ICES Journal of Marine Science; doi:10.1093/icesjms/fsu243

The importance of spatial models for estimating the strength of density dependence. James T. Thorson, Hans J. Skaug, Kasper Kristensen, Andrew O. Shelton, Eric J. Ward, John H. Harms, and James A. Benante. Ecology, 96(5), 2015, pp. 1202–1212.

--
Dr. Alexandre F. Souza
Professor Adjunto III
Universidade Federal do Rio Grande do Norte
CB, Departamento de Ecologia
Campus Universitário - Lagoa Nova
59072-970 - Natal, RN - Brasil
lattes: lattes.cnpq.br/7844758818522706
http://www.docente.ufrn.br/alexsouza

_______________________________________________
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology