Re: [R] NA in cca (vegan)
Gavin Simpson ucl.ac.uk> writes:
>
> On Fri, 2009-09-04 at 17:15 +0200, Kim Vanselow wrote:
> > Dear all,
> > I would like to calculate a cca (package vegan) with species and
> > environmental data. One of these environmental variables is
> > cos(EXPOSURE).
> > The problem: for flat releves there is no exposure. The value is
> > missing and I can't call it 0 as 0 stands for east and west.
> > The cca does not run with missing values. What can I do to make vegan
> > cca ignoring these missing values?
> > Thanks a lot,
> > Kim
>
> This is timely as Jari Oksanen (lead developer on vegan) has been
> looking into making this happen automatically in vegan ordination
> functions. The solution for something like cca is very simple but it
> gets more complicated when you might like to allow features like
> na.exclude etc and have all the functions that operate on objects of
> class "cca" work nicely.
>
> For the moment, you should just process your data before it goes into
> cca. Here I assume that you have two data frames; i) Y is the species
> data, and ii) X the environmental data. Further I assume that only one
> variable in X has missings, lets call this Exposure:
>

Kim,

A test version of NA handling in cca is now in the development version of vegan at http://vegan.r-forge.r-project.org/. You can get the current source code or slightly stale packages from that address (at the time of writing, the packages are two to three days behind the current devel version). Instructions for downloading the working version of vegan can be found on the same web site.

Basically, the development version does exactly the same thing as Gavin showed you in his response: it does a "listwise" elimination of missing values. Indeed, it may be better to do that manually and knowingly than to rely on perhaps surprising automatic handling of missing values within the function.
Your missing values are somewhat weird, as they are not really missing values (= unknown and unobserved): you just decided to use a coding system that does not cope with your well known and measured values. I would prefer to find a coding that puts flat ground together with the exposures that give similar conditions. In no case should these values be regarded as NA, since they are available and known, and censoring them from your data may distort your analysis.

Perhaps having a new variable (hasExposure, TRUE/FALSE) and coding flat ground as east/west (= 0) in Exposure could make more sense. Indeed, the model term hasExposure*Exposure would make sense, as this would separate flat ground from slopes of different Exposures. The interaction term and aliasing would take care of having flat ground with known values but separate from exposed slopes.

Cheers, Jari Oksanen

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
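The coding Jari suggests can be sketched in base R as follows. This is only an illustration under stated assumptions: the aspect values are made up, `aspect` and `env` are hypothetical names, and aspect is assumed to be measured in degrees from north so that cos() gives 0 for east and west.

```r
## Hypothetical aspect values in degrees from north; NA marks flat releves
aspect <- c(0, 90, 180, 270, NA, 45, NA, 135)

## hasExposure flags slopes; flat ground gets Exposure = 0 (same as east/west)
env <- data.frame(
  hasExposure = !is.na(aspect),
  Exposure    = ifelse(is.na(aspect), 0, cos(aspect * pi / 180))
)

## Design matrix for the model term hasExposure * Exposure
mm <- model.matrix(~ hasExposure * Exposure, data = env)
colnames(mm)

## Exposure and the interaction column are identical here (both are 0 on
## flat ground), so one of them is aliased: the rank is lower than ncol(mm)
qr(mm)$rank
ncol(mm)
```

With real data, the same two variables could then be used on the right-hand side of the ordination, along the lines of `cca(Y ~ hasExposure * Exposure, data = env)`, where `Y` is the species data frame.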
Re: [R] NA in cca (vegan)
On Fri, 2009-09-04 at 17:15 +0200, Kim Vanselow wrote:
> Dear all,
> I would like to calculate a cca (package vegan) with species and
> environmental data. One of these environmental variables is
> cos(EXPOSURE).
> The problem: for flat releves there is no exposure. The value is
> missing and I can't call it 0 as 0 stands for east and west.
> The cca does not run with missing values. What can I do to make vegan
> cca ignoring these missing values?
> Thanks a lot,
> Kim

Hi Kim,

This is timely as Jari Oksanen (lead developer on vegan) has been looking into making this happen automatically in vegan ordination functions. The solution for something like cca is very simple but it gets more complicated when you might like to allow features like na.exclude etc and have all the functions that operate on objects of class "cca" work nicely.

For the moment, you should just process your data before it goes into cca. Here I assume that you have two data frames; i) Y is the species data, and ii) X the environmental data.
Further I assume that only one variable in X has missings, let's call this Exposure:

## load vegan for cca()
library(vegan)

## dummy data
set.seed(1234)
## 20 samples of 10 species
Y <- data.frame(matrix(rpois(20*10, 2), ncol = 10))
## 20 samples and 5 env variables
X <- data.frame(matrix(rnorm(20*5), ncol = 5))
names(X) <- c(paste("Var", 1:4, sep = ""), "Exposure")
## simulate some NAs in Exposure
X$Exposure[sample(1:20, 3)] <- NA
## show X
X

## Now create a new variable indicating which are missing
miss <- with(X, is.na(Exposure))
## now create new X and Y omitting these rows
Y2 <- Y[!miss, ]
X2 <- X[!miss, ]

## Now submit to CCA
mod <- cca(Y2 ~ ., data = X2)
mod
## plot it
plot(mod, display = c("sites","bp"), scaling = 3)

## It'd be nice to get predictions for the 3 samples we missed out
pred <- predict(mod, newdata = Y[miss, ], type = "wa", scaling = 3)
## add these points to the ordination:
points(pred[, 1:2], col = "red", cex = 1.5)

HTH

G
--
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Dr. Gavin Simpson             [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,          [f] +44 (0)20 7679 0565
 Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT.                 [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
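The example above assumes that only Exposure has missing values. If several environmental variables contain NAs, the same listwise deletion could be sketched with base R's complete.cases(). The dummy data, the NA positions, and the name `keep` are assumptions made for illustration only.

```r
## Hypothetical dummy data with the same layout as in the example above
set.seed(1234)
Y <- data.frame(matrix(rpois(20 * 10, 2), ncol = 10))
X <- data.frame(matrix(rnorm(20 * 5), ncol = 5))
names(X) <- c(paste("Var", 1:4, sep = ""), "Exposure")

## simulate NAs in two variables (positions chosen arbitrarily)
X$Exposure[c(3, 11, 17)] <- NA
X$Var2[c(3, 8)] <- NA

## TRUE for rows of X without any NA
keep <- complete.cases(X)

## drop the incomplete rows from both data frames so they stay aligned
Y2 <- Y[keep, ]
X2 <- X[keep, ]

## number of samples left for the ordination
nrow(X2)
```

Y2 and X2 can then go into cca() exactly as before. Note that this drops a sample if any variable is missing, so it is worth checking how many rows are lost before fitting.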