Re: [R-sig-eco] clustering communities by an environmental variable

2020-04-25 Thread Pierre Thiriet
Hi
You might have a look to multivariate regression tree, formerly mvpart packagé
Cheers

Le 24 avr. 2020 à 18:30, à 18:30, Salvador SANCHEZ COLON 
 a écrit:
>Hi Irene,
>
>
>The approach that you are using -i.e., calculating the between-sample
>dissimilarity with the Bray-Curtis distance, followed by the UPGMA
>clustering technique- will group your samples based on the values of
>all the variables contained in your data matrix (which, in ecological
>studies, usually are species abundance or presence/absence data). 
>
>
>I hope this helps but feel free to ask further details. 
>
>
>Salvador SÁNCHEZ-COLÓN
>Independent consultant
>
>
>En Vie, 24 Abril, 2020 en 10:49, Irene Adamo 
>escribió:
> 
>
>Para: r-sig-ecology@r-project.org
>Dear all,
>I am trying to do a dendrogram by trying to see if my communities by
>altitude but I am not sure on how to do it.
>lets say I have this example
>
># calculate Bray-Curtis distance among samples comm.bc.dist <-
>vegdist(comm,
>method = "bray") # cluster communities using average-linkage algorithm
>comm.bc.clust <- hclust(comm.bc.dist, method = "average") # plot
>cluster
>diagram plot(comm.bc.clust, ylab = "Bray-Curtis dissimilarity")
>
>Now, I would like to cluster my communities by Altitude always using
>the
>distance matrix and the dendrogram.
>Is it possible?
>
>Thanks a lot for any help!
>
>[[alternative HTML version deleted]]
>
>___
>R-sig-ecology mailing list
>R-sig-ecology@r-project.org
>https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>.
>
>   [[alternative HTML version deleted]]
>
>___
>R-sig-ecology mailing list
>R-sig-ecology@r-project.org
>https://stat.ethz.ch/mailman/listinfo/r-sig-ecology

[[alternative HTML version deleted]]

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] Which R package for Second-Stage nMDS ?

2019-03-22 Thread Pierre THIRIET

Thanks a lot dear Philip. I will try this.

all the best

Pierre


Le 22/03/2019 à 17:07, Dixon, Philip M [STAT] a écrit :

Pierre,

I don't know a function that does this, but it is extremely easy to code.

Dist objects are vectors containing the 1st stage pairwise dissimilarities.   Call 
those dist1, dist2, dist3, ...  So alldist <- cbind(dist1=dist1, dist2=dist2, ...) 
will assemble the matrix of dissimilarities with useful column names.  stage2 <- 
as.dist(1-cor(alldist)) will compute the matrix of correlations, convert from 
similarity (the correlation) to distance (1-correlation) and convert to a distance 
object.  Then just run your favorite MDS on stage2.

Note: sometimes folks prefer sqrt(1-cor) as the "correlation distance", instead 
of 1-cor.  I don't know which Clarke prefers.

Best,
Philip Dixon

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


[R-sig-eco] Which R package for Second-Stage nMDS ?

2019-03-22 Thread Pierre THIRIET
Dear useRs,

I want to perform 2nd stage nMDS, as described in Clarke, K.R., et al 
(2006). Exploring interactions by second-stage community analyses. 
Journal of Experimental Marine Biology and Ecology 338, 179-192. See 
Abstract below

Do you know a package in R for that ? Or would you have home-made 
scripts, at least a function for computing the distance matrix of 
pair-wise correlations among dissimilarity matrices ?

Thank you,

Pierre


Abstract of Clarke et al 2006 :

Many biological data sets, from field observations and manipulative 
experiments, involve crossed factor designs, analysed in a univariate 
context by higher-way analyses of variance which partition out ‘main’ 
and ‘interaction’ effects. Indeed, tests for significance of 
interactions among factors, such as differing Before–After responses at 
Control and Impact sites, are the basis of the widely used BACI strategy 
for detecting impacts in the environment. There are difficulties, 
however, in generalising simple univariate definitions of interaction, 
from classic linear models, to the robust, non-parametric multivariate 
methods that are commonly required in handling assemblage data. The size 
of an interaction term, and even its existence at all, depends crucially 
on the measurement scale, so it is fundamentally a parametric construct. 
Despite this, certain forms of interaction can be examined using 
non-parametric methods, namely those evidenced by changing assemblage 
patterns over many time periods, for replicate sites from different 
experimental conditions (types of ‘Beyond BACI’ design) – or changing 
multivariate structure over space, at many observed times. *Second-stage 
MDS, which can be thought of as an MDS plot of the pairwise similarities 
between MDS plots (e.g. of assemblage time trajectories), can be used to 
illustrate such interactions, and they can be formally tested by 
second-stage ANOSIM permutation tests. Similarities between 
(first-stage) multivariate patterns are assessed by rank-based matrix 
correlations, preserving the fully non-parametric approach common in 
marine community studies. *The method is exemplified using time-series 
data on corals from Thailand, macrobenthos from Tees Bay, UK, and 
macroalgae from a complex recolonisation experiment carried out in the 
Ligurian Sea, Italy. The latter data set is also used to demonstrate how 
the analysis copes straightforwardly with certain repeated-measures designs.


[[alternative HTML version deleted]]

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] Adonis for significance of clusteredness from hclust (vegan package)

2016-10-05 Thread Pierre THIRIET
Dear Ansley,

I agree with Zoltan.

I suggest SIMPROF test in package "clustsig". A complimentary package 
for use with hclust; simprof tests to see which (if any) clusters are 
statistically different. The null hypothesis is that there is no a 
priori group structure.
See Clarke, K.R., Somerfield, P.J., and Gorley R.N. 2008. Testing of 
null hypothesis in exploratory community analyses: similarity profiles 
and biota-environment linkage. J. Exp. Mar. Biol. Ecol. 366, 56-69

HTH
Pierre

Le 04/10/2016 à 14:14, Zoltan Botta-Dukat a écrit :
> Dear Ansley,
>
> I cannot answer your question, I hope someone else will answer. I'd
> rather point out a problem in your approach. Statistical tests were
> developed for testing difference between a priori groups, thus estimated
> Type I error rate is valid only for this situation. When you calculates
> Type I error rate for comparison of groups created by cluster analysis
> of the SAME data, the calculated error rate will be lower than the valid
> error rate. So you cannot use the term "significant" in this situation.
>
> Sorry for making you sadden by this information.
>
> Zoltan
>
> 2016.10.03. 21:52 keltez�ssel, Ansley Silva �rta:
>> Hello:
>>
>> I have created a dendrograms using hierarchical cluster analysis with the
>> vegan package (function: hclust).
>>
>> By visually observing the dendrogram, I have determined that there are 3
>> main clusters if I "cut" the tree at the height 0.25  (please see the
>> dendrogram from the code).
>> I then created a new dataset, which is essentially the same as the
>> original, but I have added the categorical variable Group to represent
>> these 3 main clusters.
>> ST0 is group a, AP0 and AP100 is group b, and AP200 AP300 ST100 ST200 ST
>> 300 is group c.
>> I want to now if they are significantly different from each other.  I
>> understand, from the output pasted below, that I can accept that there is a
>> significant effect of Group.  Is this the only thing I can say from
>> Permanova?  What would be the code for a follow up test to look at
>> pair-wise significant differences?
>> Thanks very much.
>>
>> Call:
>> adonis(formula = species ~ Group, data = environ, permutations = 999)
>>
>> Permutation: free
>> Number of permutations: 999
>>
>> Terms added sequentially (first to last)
>>
>> Df SumsOfSqs  MeanSqs F.Model  R2 Pr(>F)
>> Group  2   0.40244 0.201219   4.969 0.66528  0.007 **
>> Residuals  5   0.20248 0.040495 0.33472
>> Total  7   0.60492  1.0
>> ---
>> Signif. codes:  0 �***� 0.001 �**� 0.01 �*� 0.05 �.� 0.1 � � 1
>>
>>
>>
>>
>> ___
>> R-sig-ecology mailing list
>> R-sig-ecology@r-project.org
>> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>
>
>
> ___
> R-sig-ecology mailing list
> R-sig-ecology@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


[[alternative HTML version deleted]]

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology

[R-sig-eco] vegdist: "binomial", is it the scale-invariant version?

2014-03-26 Thread Pierre THIRIET

Dear all,

Please, could someone confirm that vegdist(X, meth="binomial") is the 
binomial deviance *scaled* (i.e. the scale-invarient version of 
"binomial deviance")


I want to use the dissimilarity "binomial deviance *scaled*", described 
by Anderson and Millar (2004) as

where

"scaled" since it is the scale-invarient version of the "binomial 
deviance" (by dividing by n/k/), also described by Anderson and Millar 
(2004)



vegdist implements the "binomial index" as
d[jk] = sum*(*x[ij]*log(x[ij]/n[i]) + x[ik]*log(x[ik]/n[i]) - 
n[i]*log(1/2)*)*/n[i], where n[i] = x[ij] + x[ik]


It seems therefore that is the "binomial deviance scaled", whright ?

I thank you in advance,

cheers,
Pierre



Anderson, M.J. and Millar, R.B. (2004). Spatial variation and effects of 
habitat on temperate reef fish assemblages in northeastern New Zealand. 
/Journal of Experimental Marine Biology and Ecology/ 305, 191--221.

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] Extract residuals from adonis function in vegan package

2014-03-18 Thread Pierre THIRIET
Dear Alicia,
the mvabund package enables to fit Multivariate Generalized Linear Model 
for presence/abscence data. I guess you could extract residuals.
HTH,
Pierre


*Pierre THIRIET*

Doctorant en écologie marine - Marine Ecology PhD student

EA 4228 - ECOMERS - /Ecosystèmes Côtiers Marins et Réponses aux Stress/
Université de Nice - Sophia Antipolis, Faculté des Sciences, Parc 
Valrose, 06108 Nice Cedex 2, France
More about my work and my lab. 
<http://www.unice.fr/ecomers/index.php?option=com_comprofiler&task=userProfile&user=101&Itemid=111&lang=fr>

Cell: +336 79 44 91 90
Office: +334 92 07 68 33
Skype: pierre.d.thiriet
Mail: pierre.thir...@unice.fr
Mail(bis): pierre.d.thir...@gmail.com


Le 18/03/2014 15:43, Cade, Brian a écrit :
> You ought to be very careful about using residuals from one analysis as the
> response variable in another analysis as the inferences about the second
> analysis will almost certainly be flawed.  Best to try and do this another
> way if at all possible.
>
> Brian
>
> Brian S. Cade, PhD
>
> U. S. Geological Survey
> Fort Collins Science Center
> 2150 Centre Ave., Bldg. C
> Fort Collins, CO  80526-8818
>
> email:ca...@usgs.gov  
> tel:  970 226-9326
>
>
>
> On Tue, Mar 18, 2014 at 5:53 AM, Alicia Valdés
> wrote:
>
>> Hi,
>>
>> I am using the adonis function in the vegan package to determine effects of
>> different environmental factors in forest plant community composition in
>> different regions. I would like to first use adonis to remove the region
>> effect, this is, to fit a model like
>>
>> adonis_region<- adonis(community ~ region,permutations=999,method="bray")
>>
>>   Where community is a presence/absence data matrix of species in forest
>> patches belonging to different regions.
>>
>> Then I would like to use the residuals of this model as the response
>> variable for some other analyses. I know I could use strata to model region
>> as a block variable, but this does not interest me, as afterwards I want to
>> perform another kind of analysis where I would like the region effect to be
>> removed.
>>
>> My problem is that I cannot figure out how to get residual values from the
>> adonis model.
>>
>> Any hint of some other kind of multivariate analysis that could solve this
>> problem is very welcome.
>>
>> Thanks in advance,
>>
>> --
>> Alicia Valdés
>> Université de Picardie Jules Verne
>> Unité EDYSAN (Ecologie et Dynamique des Systèmes Anthropisés)
>> 1, rue des Louvels
>> F-80037 Amiens Cedex
>> France
>> tel:+33 322825775
>> alicia.val...@u-picardie.fr
>> http://www.u-picardie.fr/edysan/
>>
>>  [[alternative HTML version deleted]]
>>
>>
>> ___
>> R-sig-ecology mailing list
>> R-sig-ecology@r-project.org
>> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>>
>>
>   [[alternative HTML version deleted]]
>
>
>
> ___
> R-sig-ecology mailing list
> R-sig-ecology@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology




[[alternative HTML version deleted]]

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


[R-sig-eco] summing up two matrix of distances among centroids

2014-01-21 Thread Pierre THIRIET
Dear all,

I sampled within 24 sites, by using 2 distinct methods, fish assemblages 
(6 replicates) and macro-invertebrates assemblages (3 replicates).
I want to compute the pairwise distances among sites considering both 
assemblages(using braycurtis dissimilarity), let's say the overall 
distances among sites
I see 2 options:
option 1, averaging at the site scale fishes and macro-invertebrates 
abundances , cbind the two mean abundances matrix and calculating 
bray-curtis
option 2, for each assemblage, computing bray-curtis and distances among 
sites centroids, and then summing up the 2 distance matrix.

Please, could tell me if the option 2 is correct (and more appropriate 
than option 1) and if my code is correct

thank you in advance.

library(plyr)
library(vegan)

#assemblage 1: 15 fish species, 6 replicates per site
a1.env=data.frame(
   Habitat=paste("H",gl(2,12*6),sep=""),
   Site=paste("S",gl(24,6),sep=""),
   Replicate=rep(paste("R",1:6,sep=""),24))
summary(a1.env)

a1.bio=as.data.frame(replicate(15,rpois(144,sample(1:10,1
names(a1.bio)=paste("F",1:15,sep="")
a1.bio[1:72,]=2*a1.bio[1:72,]

#assemblage 2: 10 taxa of macro-invertebrates, 3 replicates per site
a2.env=a1.env[a1.env$Replicate%in%c("R1","R2","R3"),]
summary(a2.env)
a2.bio=as.data.frame(replicate(10,rpois(72,sample(10:100,1
names(a2.bio)=paste("I",1:10,sep="")
a2.bio[1:36,]=0.5*a2.bio[1:36,]


# "environmental" data at the sit scale
env=unique(a1.env[,c("Habitat","Site")])
env=env[order(env$Site),]

 OPTION 1, averaging abundances and cbind

a1.bio.mean=ddply(cbind(a1.bio,a1.env),.(Habitat,Site),numcolwise(mean))
a1.bio.mean=a1.bio.mean[order(a1.bio.mean$Site),]

a2.bio.mean=ddply(cbind(a2.bio,a2.env),.(Habitat,Site),numcolwise(mean))
a2.bio.mean=a2.bio.mean[order(a2.bio.mean$Site),]

bio.mean=cbind(a1.bio.mean[,-c(1:2)],a2.bio.mean[,-c(1:2)])

dist.mean=vegdist(sqrt(bio.mean),"bray")


###OPTION 2, computing for each assemblage distance among centroids 
and summing the 2 distances matrix
a1.dist=vegdist(sqrt(a1.bio),"bray")
a1.coord.centroid=betadisper(a1.dist,a1.env$Site)$centroids
a1.dist.centroid=vegdist(a1.coord.centroid,"eucl")

a2.dist=vegdist(sqrt(a2.bio),"bray")
a2.coord.centroid=betadisper(a2.dist,a2.env$Site)$centroids
a2.dist.centroid=vegdist(a2.coord.centroid,"eucl")

dist.centroid=a1.dist.centroid+a2.dist.centroid
coord.centroid=cmdscale(dist.centroid,k=23) # for further distance-based 
analysis (is it correct ?)


### COMPARING OPTION 1 AND 2
pco.mean=cmdscale(vegdist(sqrt(bio.mean),"bray"))
pco.centroid=cmdscale(dist.centroid)

comparison=procrustes(pco.centroid,pco.mean)
protest(pco.centroid,pco.mean)


-- 
*Pierre THIRIET*

Doctorant en écologie marine - Marine Ecology PhD student

EA 4228 - ECOMERS - /Ecosystèmes Côtiers Marins et Réponses aux Stress/
Université de Nice - Sophia Antipolis, Faculté des Sciences, Parc 
Valrose, 06108 Nice Cedex 2, France
More about my work and my lab. 
<http://www.unice.fr/ecomers/index.php?option=com_comprofiler&task=userProfile&user=101&Itemid=111&lang=fr>

Cell: +336 79 44 91 90
Office: +334 92 07 68 33
Skype: pierre.d.thiriet
Mail: pierre.thir...@unice.fr
Mail(bis): pierre.d.thir...@gmail.com

[[alternative HTML version deleted]]

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] Fitting logistic function to data that crosses the origin

2013-06-19 Thread Pierre THIRIET

Dear Phil,

I did not understand why you prefer working on logged data, if 
untransformed data are already nicely S-shaped. Is it for fitting the 
classical methods applied in all the previous studies?


Anyway, I thik the nls self-start SSfpl() might be usefull. It fits a 
four parameter sigmoid : inflection point, "slope", upper asymptote, and 
lower asymptote. Thanks to this fourth parameter, you could deal with 
negative y values without any rescaling.


Maybe SSasympOff() might be also of interest, it is an Asymptotic 
Regression Model with an Offset. And also all other self-start nlm:
?selfStart()


HTH
Pierre

Le 18/06/2013 23:28, Novack-Gottshall, Philip M. a écrit :

Greetings,

I am looking for advice on fitting a logistic function to logged data that 
includes negative values.

Background: The data is body-size data, which is conventionally analyzed in 
logarithmic form. Because the dependent variables cross the value of 1, the 
logged data includes both positive and negative values. I'm comparing the 
standard power function and the logistic function because I have reason to 
believe there is allometry, which approximately follows a logistic curve during 
ontogeny. And I want to use AICc model comparison in my model selection.

Logistic functions, by definition, asymptotically approach (and never reach) 
zero, making it impossible to fit data that crosses the origin. But the data 
(see sample below), does have a sigmoidal appearance (both when plotted in 
linear and logged axes.) So I would like to fit a logistic model to it.

I have no problem fitting a logistic model to the linear form. And I can shift 
the logged data upward by an arbitrary value so that all response values are 
positive. But because I want to use the log-likelihoods for the model fits to 
conduct model selection, I don't like this solution. This is because it's 
arbitrary, because the value of logLik() can change depending on how much I 
shift the dependent variables, and because I'd need to shift different samples 
by different amounts.

Can anyone point me in a good direction on how to better fit logistic models to 
this data? (Or to calculate the correct logLik and residual sum-of-squares.) 
Should I just compare logistic in linear space to the power function in log-log 
space? Or how about rescaling the data between 0 and 1, which is arbitrary, but 
at least consistent among samples?

Thanks,
Phil


# Basic example using sample data in linear-space
y <- c(0.01, 0.10, 0.17, 0.69, 1.35, 3.45, 4.07, 4.12)
x <- c(0.28, 0.64, 0.76, 1.43, 1.98, 2.99, 3.97, 4.43)
data <- data.frame(x=x, y=y)
plot(data)
# Estimate starting parameters
K <- max(data$y)  # Carrying capacity
x.0 <- min(data$y)# Starting size
r <- 2  # Rate
nlsfit <- nls(y ~ K / (1 + ((K - x.0)/x.0) * exp(-r * x)), data=data, 
start=list(K=K, r=r, x.0=x.0))
nlsfit
logLik(nlsfit)
xfit <- seq(min(data$x), max(data$x), (max(data$x)- min(data$x))/100)
yfit <- coef(nlsfit)[1] / (1 + ((coef(nlsfit)[1] - coef(nlsfit)[3]) / 
coef(nlsfit)[3]) * exp(-coef(nlsfit)[2] * xfit))
lines(spline(xfit, yfit), col="red")
# K = 4.185, r = 2.117, x.0 = 0.0325
# residual sum-of-squares: 0.0228
# logLik = 12.094

# Now re-do, taking the log of both variables:
data <- data.frame(x=log(x), y=log(y))
plot(data)
K <- max(data$y) ; x.0 <- min(data$y); r <- .7
nlsfit <- nls(y ~ K / (1 + ((K - x.0)/x.0) * exp(-r * x)), data=data, 
start=list(K=K, r=r, x.0=x.0))
nlsfit
logLik(nlsfit)
xfit <- seq(min(data$x), max(data$x), (max(data$x)- min(data$x))/100)
yfit <- coef(nlsfit)[1] / (1 + ((coef(nlsfit)[1] - coef(nlsfit)[3]) / 
coef(nlsfit)[3]) * exp(-coef(nlsfit)[2] * xfit))
lines(spline(xfit, yfit), col="red")
# Line reaches asymptote as y approaches zero. (This is also how the logistic 
function works.)
# K = -4.738, r = -4.066, x.0 = -0.694
# residual sum-of-squares: 5.741
# logLik = -10.024

# Now re-do, taking the log of both variables, then vertically shift by adding 
value of 5 to y variable to ensure all positive values:
data <- data.frame(x=log(x), y=log(y) + 5)
plot(data)
K <- max(data$y) ; x.0 <- min(data$y) ; r <- 2
nlsfit <- nls(y ~ K / (1 + ((K - x.0)/x.0) * exp(-r * x)), data=data, 
start=list(K=K, r=r, x.0=x.0))
nlsfit
logLik(nlsfit)
xfit <- seq(min(data$x), max(data$x), (max(data$x)- min(data$x))/100)
yfit <- coef(nlsfit)[1] / (1 + ((coef(nlsfit)[1] - coef(nlsfit)[3]) / 
coef(nlsfit)[3]) * exp(-coef(nlsfit)[2] * xfit))
lines(spline(xfit, yfit), col="red")
# K = 6.727, r = 1.810, x.0 = 3.843
# residual sum-of-squares: 0.354
# logLik =  1.120 # But this and residual sum-of-squares varies depending on 
what value data were shifted by.

# Now take logged data and use vertically shifted model and down-shift it:
data <- data.frame(x=log(x), y=log(y))
plot(data)
xfit <- seq(min(data$x), max(data$x), (max(data$x)- min(data$x))/100)
yfit <- coef(nlsfit)[1] / (1 + ((coef(nlsfit)[1] - coef(nlsfit)[3]) / 
coef(nlsfit)[3]) * exp(-coef(nlsfit)[2] * xfit)) - 5
lin

Re: [R-sig-eco] Pariwise tests in ANOSIM (and/or ADONIS)

2013-06-04 Thread Pierre THIRIET

Dear Jari,

from this perspective, is it correct to run multiple pairwise tests (1 
factor 2 levels) using adonis, and then adjust p-values. If yes, which 
adjustement method would you suggest, Bonferroni correction ?

Thanks in advance,
All the best,
Pierre

Le 05/06/2013 07:00, Jari Oksanen a écrit :

On 05/06/2013, at 03:25 AM, Simon de Lestang wrote:


I have what I am sure is a very simple question relating to the package vegan, 
although I have not been able to solve it via Google.
Using the function anosim I am only able to produce a global result in the 
output.
Is it possible to produce all pairwise comparisons (corrected) in a single 
output or are these to be conducted individually? (individual pairwise tests).
I guess my question would also relate to the function adonis.


Dear Simon,

Pairwise tests are not possible in vegan. My understanding is that the non-R 
software with such tests makes separate pairwise tests using subsets of data 
with only two levels of a factor in one tests. We don't provide that in vegan 
and have no plans to provide this in the future.

Cheers, Jari Oksanen


___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] Functional response type II question, curve fitting and

2013-05-29 Thread Pierre THIRIET
Dear Fernando,

I would suggest the function nls() that fit non-linear model.
You can specify by hand your equation and starting values of your 
parameters (your guessed value)
R include some Self-Starting functions for the model the most used, this 
could be easier for you to work with it. check ?selfStart()
According to wikipedia, Type II functional response is similar to the 
Michaelis--Menten equation, which is implemented in R as self starting 
function:http://stat.ethz.ch/R-manual/R-patched/library/stats/html/SSmicmen.html
 

By adpting examples, you could easily fit your model, if the choosen 
equation is suitable.

About statistical comparisons of estimated parameters among curves, I 
have no idea but I found, among other this post:
http://stats.stackexchange.com/questions/26611/how-to-test-the-effect-of-a-grouping-variable-with-a-non-linear-model

Good luck,
Pierre

Le 30/05/2013 07:49, Luis Fernando García Hernández a écrit :
> Dear all,
>
> I am new on the list and on the more complex applications of R, so I ask
> you to excuse me if my mail is too long.
>
> This time I have to do several tasks related tu functional reponse. I want
> to evaluate the relationship between prey density (x) vs prey consumed(y)
> the idea is to fit the data set to the functional reponse equation:
>   Na/TP=a/(1+aThN). I have idea about the value of most of the equation
> parameters, but I am completely lost about how to fit the data points to
> this equation.
>
> On another hand I have to graph and compare if there are significative
> differences on estimated parameters of the curves for lepidopterans and
> ants, the final data should look like the attacher picture.If any of you
> have some idea about how to do this, would be really appreciated.
>
> Below you  can find the data and the pictures.
>
> Thanks in advance
>
>  Prey Density Eaten  Lepidopteran 1 2  Lepidopteran 1 3  Lepidopteran 1 3
> Lepidopteran 3 4  Lepidopteran 3 5  Lepidopteran 3 4  Lepidopteran 5 7
> Lepidopteran 5 5  Lepidopteran 5 6  Lepidopteran 10 8  Lepidopteran 10 10
> Lepidopteran 10 7  Ant 1 3  Ant 1 1  Ant 3 4  Ant 3 2  Ant 5 4  Ant 5 6  Ant
> 10 5  Ant 10 6
>
>
> ___
> R-sig-ecology mailing list
> R-sig-ecology@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


[[alternative HTML version deleted]]

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] CCA vs NMDS and ordisurf

2013-04-19 Thread Pierre THIRIET
Dear Francois and Jari, thank you for correcting me.
Dear Aurélie, sorry for all these wrong stuffs.
All the best,
Pierre


Le 19/04/2013 08:46, François Gillet a écrit :
>
> 2013/4/18 Pierre THIRIET  <mailto:pierre.d.thir...@gmail.com>>
>
> Bray-curtis is usually the most appropriate, on raw
> abundance/biomass/cover data, or square root/log transformed. So
> why do you Hellinger transform before? This transformation is
> dedicated to be used with euclidean distance, and resulted
> ordinations (PCA or RDA) have a distinct meaning than PCoA or
> CAP/db-RDA (with bray-curtis) because joint abscence are included
> in first cases and excluded in the latter.
>
>
> Dear Pierre and Aurélie,
>
> Just another point: after Hellinger standardization (site profiles), 
> joint absences (double zeros) are ignored in PCA and RDA since 
> Hellinger distance is an asymmetric measure of resemblance (see 
> Legendre and Gallagher 2001, Oecologia).
>
> All the best,
>
> François
>
> ---
>
> Prof. *François Gillet*
> Université de Franche-Comté - CNRS
> UMR 6249 Chrono-environnement
> UFR Sciences et Techniques
> 16, Route de Gray
> F-25030 Besançon cedex
> France
> http://chrono-environnement.univ-fcomte.fr/
> http://chrono-environnement.univ-fcomte.fr/spip.php?article530
> Phone: +33 (0)3 81 66 62 81
> iPhone: +33 (0)7 88 37 07 76
> Location: La Bouloie, Bât. Propédeutique, *-114L*
> ---
> Editor of*/Plant Ecology and Evolution/*
> http://www.plecevo.eu
> ---*
> ***


[[alternative HTML version deleted]]

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] CCA vs NMDS and ordisurf

2013-04-18 Thread Pierre THIRIET

Dear Aurélie,

About the dissimilarity measures and data you used:
Bray-curtis is usually the most appropriate, on raw 
abundance/biomass/cover data, or square root/log transformed. So why do 
you Hellinger transform before? This transformation is dedicated to be 
used with euclidean distance, and resulted ordinations (PCA or RDA) have 
a distinct meaning than PCoA or CAP/db-RDA (with bray-curtis) because 
joint abscence are included in first cases and excluded in the latter. 
See picture below from Anderson et al 2011 Navigating the multiple 
meanings of b diversity: a roadmap for the practicing ecologist




So, if you want do constrained ordinations (constrained by "drought 
disturbance gradient", I guess), I would suggest dbRDA (vegan::capscale) 
with bray curtis, or RDA on Hellinger transformed data, depending on 
what you want to emphasis.

For unconstrained ordinations, this will be respectively PCoA and PCA.

Pay attention in using NMDS. As you said,  it is rank-based, this is why 
fitting environmental vectors to NMDS biplot is not so appropriate, 
despite widely done. I don't see the problem about ordisurf and PCoA or 
CAP: Ordisurf enables you to fit environnemental variables that have 
non-linear relationships with PC of distance based ordinations.


If you use bray-curtis, I would suggest to use distance among group 
centroids instead of computing averages over groups followed by bray-curtis


About hypotheses testing (in capscale or adonis for instance), pay 
attention to the longitudinal nature of your data. Some questions about 
repeated measure and adonis are already in R-SIG-ECO archives, have a alook.


I guess you are interested in identifying the species which are the most 
responsible of community change over drought disturbance gradien?!
If yes, I think an appropriate way could be: a dbRDA (capscale) with 
bray curtis on square root transformed cover data (or not, depends if 
you have few predominant species that might mask the others) , and 
"drought disturbance gradient" as a continuous constraint. Then, you 
could overlay vectors of correlations between species cover and CAP1 axe 
(i.e. in vegan: scores(your.capscale, dis="sp", scaling=-2, const = 
sqrt(nrow(your.cover.data.matrix)-1),choices=1).


I hope my english is at least understandable, and that my answer helped you.

Cheers,
Pierre



Le 18/04/2013 13:31, Aurélie Boissezon a écrit :

Hi everybody,

I have some questions about ordination analysis and interpretation of 
ordisurf() output. So huge thanks to people who will help me to clean up my 
confused brain.
So I am working on cover data of aquatic plants (%). I made 7 quadrat sampling between 
2009 and 2012 in a semi permanent shallow pond (n=1200 approximately without empty 
quadrat). Due to fluctuating water regime and small topographic variations, my sampling 
units are distributed along a gradient of inundation conditions from permanently wet to 
frequently dry. Clearly the vegetation responded to water level condition occurring the 
previous year. Community following several years of high levels was very different from 
the one occuring the year after a severe drought of the waterbody (a lot of charophytes, 
pionneer species). I quantified this "drought disturbance gradient" by 
calculating when (which season?), and for how many days each quadrat dried before each 
field sampling.
My purpose is to explore the relationship between the composition of the community and 
those "drought indexes". And in particular to highlight the succession of 
species along the gradients.
My first reflex was to implement a CCA but someone tell me to explore 
unconstrained approach and in particular NMDS.
The CCA ordination shows a strong arch effect but is highly significant and 
perfectly ecologically interpretable and congruent with my field observations. 
To summarize submerged species are separated from helophytes species by 
duration of drought during growing season (submerged species need water from 
winter to summer). And submerged species succeeded each other along a gradient 
of duration of drought at the end of the growth season, in autumn.
But to see if I had similar results when looking at the whole variation of the 
community data set and when using a more suitable distance measure, I run a 
NMDS on Hellinger-transformed data based on Bray-Curtis distances.
With NMDS I didn't reach a "convergent solution" even after setting stricter 
criteria maxit and sratmax. Nevertheless the stress is acceptable (8 with k=3 ) and the 
species are ordinated similarly to the CCA. I implement the same analysis on a simplified 
version of my data set by averaging the cover of species by date, by depth clusters (10 
centiles) and by area of the lake leading to 131 observations instead of 1200 quadrats 
initially (which is very large). Here the nmds reached quickly a convergent solution 
(after 20 or 50 runs) and gave always a similar ordination of species.
So is it important not to reach a 

Re: [R-sig-eco] How to remove space among columns

2013-03-27 Thread Pierre THIRIET
Hi Manuel,

try this combination of paste(), which collapse your variables into a 
single one, and cbind() for binding this new variable with the others of 
your initial data frame

mat=as.data.frame(matrix(1:18,3,6))
mat2=cbind(newV=with(mat,paste(V1,V2,V3,V4,sep="")),mat[,5:6])

is it what you wanted?

cheers,
Pierre

PS: this kind of questions should go into general R help list

Le 27/03/2013 14:15, Manuel Spínola a écrit :
> Thank you very much to all that answered my question, some one of you asked
> me to be more specific, here is my question again:
>
> I hava a data frame:
>
> col1  col2 col3 col4 col5 col6
> 01  1 0 ac
> 10  0 0 ad
> 01  1 1 bd
>
> I want to end with a data frame like this (the first 4 columns without
> space):
>
> 0110 ac
> 1000 ad
> 0111 bd
>
>
> Best,
>
> Manuel
>
>
>
> 2013/3/26 Roman Luštrik 
>
>> How do you want to remove the space, so that when you print a data frame
>> that the columns are closer together? Pr perhaps you're trying to merge
>> column names into a single string? Perhaps something third? Can you expand
>> your answer?
>>
>> Cheers,
>> Roman
>>
>>
>>
>> 2013/3/27 Manuel Spínola 
>>
>>> Dera list members,
>>>
>>> How to remove spaces among columns in a data frame.
>>>
>>> 1 2 3 4  to become 1234
>>>
>>> Best,
>>>
>>> Manuel
>>>
>>> --
>>> *Manuel Spínola, Ph.D.*
>>>
>>> Instituto Internacional en Conservación y Manejo de Vida Silvestre
>>> Universidad Nacional
>>> Apartado 1350-3000
>>> Heredia
>>> COSTA RICA
>>> mspin...@una.ac.cr
>>> mspinol...@gmail.com
>>> Teléfono: (506) 2277-3598
>>> Fax: (506) 2237-7036
>>> Personal website: Lobito de río <
>>> https://sites.google.com/site/lobitoderio/>
>>> Institutional website: ICOMVIS 
>>>
>>>  [[alternative HTML version deleted]]
>>>
>>>
>>> ___
>>> R-sig-ecology mailing list
>>> R-sig-ecology@r-project.org
>>> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>>>
>>>
>>
>> --
>> In God we trust, all others bring data.
>>
>
>
>
>
> ___
> R-sig-ecology mailing list
> R-sig-ecology@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


[[alternative HTML version deleted]]

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


[R-sig-eco] Mean survival time (and possibly its confidence interval) for interval-censored data (using "interval" R package)

2013-03-13 Thread Pierre THIRIET

Dear all,

I would like to get mean survival time (and possibly its confidence 
interval) for interval-censored data.
Using "interval package", I am fitting the non-parametric maximum 
likelihood estimate for the distribution (i.e. generalization of the 
Kaplan-Meier estimate). But then, I don't know how to compute the mean 
survival time from the icfit object.
For getting a raw estimate of the mean survival times, I plot the NPLME 
and look at the time interval where survival porbability equals 0.5. Is 
there a better way for getting mean survival time, and is there a way 
for getting its confidence interval?


library(interval)
data(bcos)
icout<-icfit(Surv(left,right,type="interval2")~treatment, data=bcos)
plot(icout)
abline(h=0.5)

I thank you in advance,
Pierre

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] question for the R community : Plot RDA biplot without axis ?

2013-02-26 Thread Pierre Thiriet
Hi Sarah,

Here is Gavin's solution.

You have to modify the text.cca function. (at least) two ways:

# First
#without paranthèses after function name, you get the source codes
vegan:::text.cca
#then copy/paste this source codes, modify, save, and assign name space

#Second
# you can modify directly the function using  fixInNamespace()
fixInNamespace("text.cca",ns="vegan")  #this function opens a script editor
(in RStudio, I don't know in other config), you have to remove/modify lines
32 to 35.

#then you just have to run your codes, but just precise in your calls
"vegan:::text.cca" instead" of "text".

### data
data(dune)
data(dune.env)

### Constrained ordination
dune.hel<-decostand(dune, "hellinger")
dune.cca <- cca(dune ~ A1 + Manure, data=dune.env)

### Plot with 4 panels
par(mfrow=c(2,2))
par(mar=c(0.3,0.3,0.3,0.3))


### plot 1
plot(dune.cca, type = "n", scaling = 2, col.axis="white")
with(dune.env, points(dune.cca, display = "sites", scaling = 2, cex=1.3,
  col=2))
### When I add the next line, it adds env. variables as arrows but also
vegan:::text.cca(dune.cca, display="bp", col=1, cex=1.1)

###plot 2
plot(dune.cca, type = "n", scaling = 2, col.axis="white", col="grey")
with(dune.env, points(dune.cca, display = "sites", scaling = 2, cex=1.3,
  col=2))
vegan:::text.cca(dune.cca, display="bp", col=1, cex=1.1)

###plot 3
plot(dune.cca, type = "n", scaling = 2, col.axis="white")
with(dune.env, points(dune.cca, display = "sites", scaling = 2, cex=1.3,
  col=2))
vegan:::text.cca(dune.cca, display="bp", col=1, cex=1.1)

###plot 4
plot(dune.cca, type = "n", scaling = 2, col.axis="white")
with(dune.env, points(dune.cca, display = "sites", scaling = 2, cex=1.3,
  col=2))
vegan:::text.cca(dune.cca, display="bp", col=1, cex=1.1)



I hope it helps
Pierre

2013/2/26 Sarah Loboda 

> Hi,
> Here's the reproducible example that I made with dune data. When you do the
> 4 graphs, you can see that because of the text () function, there is an
> axis on the right and values appear in the plots on the right side. I
> understand that it is because of my text () function, but is there a way to
> delete that axis in the text funtion? if not, is there another way to plot
> my data on 4 panels without axis?
>
> I don't know what you mean by "body of vegan text.cca". You mean in the
> vegan tutorial ?
> I used col.axis because ann=FALSE as an argument in plot function does not
> work and col.axis seems fine...
>
> Thank you very much for your time. I really appreciate your help :D
>
> library(vegan)
> library(MASS)
>
> ### data
> data(dune)
> data(dune.env)
>
> ### Constrained ordination
> dune.hel<-decostand(dune, "hellinger")
> dune.cca <- cca(dune ~ A1 + Manure, data=dune.env)
>
> ### Plot with 4 panels
> par(mfrow=c(2,2))
> par(mar=c(0.3,0.3,0.3,0.3))
>
>
> ### plot 1
> plot(dune.cca, type = "n", scaling = 2, col.axis="white")
> with(dune.env, points(dune.cca, display = "sites", scaling = 2, cex=1.3,
> col=2))
> ### When I add the next line, it adds env. variables as arrows but also
> adds an axis on the right
> text(dune.cca, display="bp", col=1, cex=1.1)
>
> ###plot 2
> plot(dune.cca, type = "n", scaling = 2, col.axis="white", col="grey")
> with(dune.env, points(dune.cca, display = "sites", scaling = 2, cex=1.3,
> col=2))
> text(dune.cca, display="bp", col=1, cex=1.1)
>
> ###plot 3
> plot(dune.cca, type = "n", scaling = 2, col.axis="white")
> with(dune.env, points(dune.cca, display = "sites", scaling = 2, cex=1.3,
> col=2))
> text(dune.cca, display="bp", col=1, cex=1.1)
>
> ###plot 4
> plot(dune.cca, type = "n", scaling = 2, col.axis="white")
> with(dune.env, points(dune.cca, display = "sites", scaling = 2, cex=1.3,
> col=2))
> text(dune.cca, display="bp", col=1, cex=1.1)
>
> 2013/2/25 Gavin Simpson 
>
> > On Mon, 2013-02-25 at 13:18 -0500, Sarah Loboda wrote:
> > > Hi,
> > > I have trouble to obtain the ordination graph I want. I want to have 4
> > RDA
> > > biplot on the same page and I don't want to have (or I want to modify)
> > the
> > > axis numbers. I want the marks on the axis without numbers to maximize
> > the
> > > space for each RDA plot.
> >
> > A problem is the call to text() ( which calls text.cca() ). It doesn't
> > pass on arguments to the underlying axis() calls and hence you can't do
> > what you are trying to do with that function directly.
> >
> > Not sure why you want the axis to be white - that draws an axis so it
> > will obscure anything drawn before it with white paint.
> >
> > The only solution at the moment will be to modify the vegan:::text.cca()
> > function to change the two calls to axis() at the end of the function
> > definition. I suspect you could just copy the body of vegan:::text.cca
> > and put it into your own function, but I haven't tried it. If that fails
> > due to namespace issues, then use assignInNamespace() to assign your
> > function to the text.cca function in vegans namespace.
> >
> > See th

Re: [R-sig-eco] adonis and temporal changes

2013-02-18 Thread Pierre THIRIET

Dear Valérie,

If I remember well, your design includes:
Isolation categories: 3 levels
Sites: nested within Isolation categories (10 levels, a total of 30 sites)
How many replicates per site and time?
Time:? how many years you have? Only one sampling per year? Within sites 
and years, samples were random or it is always exactly the same area you 
sample (e.g. permanent quadrats)?


for adonis, consider that strata is for constraining permutations, which 
is different than terms in the formulae.


Chears,
Pierre

Le 18/02/2013 11:17, v_coudr...@voila.fr a écrit :

Dear all,
I would like to test changes in species dissimilarity matrices over time, 
taking into account that the measurement are repeated in each site over years. 
I used the
adonis function: adonis(diss.matrix~year, strata=site). However if I do the 
same model entering site as an additional fixed effect (this was applied this 
way in a
paperI read): adonis(diss.matrix~site+year, strata=site), I get exactly the 
same estimate for year, but the variance explained is much higher. I am a bit 
lost regarding
how much of the variance in dissimilarity is really explained by temporal 
changes.
Thank you very much
___
C'est l'année du Serpent ! Connaissez-vous votre signe astral chinois ? 
Découvrez-le ici 
http://astrocenter.voila.fr/voila/Presentation.aspx?product=StEdCH2K2&Af=-3000

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] binomial regression with non integers

2013-02-15 Thread Pierre THIRIET

Hi Valérie,

If I well understood, you want to test differences in the ratio 
(multiple-site turnover component/multiple-site total dissimilarity) 
between groups of sites. I think that beta.sample() function could help 
you in comparing between groups of sites their multiple-site total 
dissimilarity and the respective contributions of the 2 components.  
With this function, you can get for each group of sites the boostraped 
confidence intervals of the 3 values(2 components and total 
dissimilarity). You could also plot the distributions of resampled 
dissimilarities .This is what Baselga, A. and C. D. L. Orme (2012) did 
in their exemple (see pasted code below).
I imagine that if you are interested in the ratio (turnover/total 
dissimilarity), this is because total multiple-site dissimilarities are 
really different between groups of sites. You could compute for each 
group (and so compare among groups) the bootstraped confidence interval 
of this ratio by combining the use of boot() and beta.multi(). However, 
the ecological meaning of this comparison of ratio is confused for me. 
What do you want to test?
If you are interested in caracterizing differences in assemblage 
composition between group of sites, I am wondering if another approach 
like dbRDA (with group as constraint) on each of the 3 dissimilarity 
matrices returned by beta.pair(feeded by the whole site x species matrix 
including every groups of sites) might give you better insights about 
how are differences between groups (turnover and/or nestedness). 
However, I read in Baselga and Orme 2012 that  multiple-site measures 
are different than averaged pairwise dissimilarities, this is why I am 
not sure on the suitability of this variation partionning approach. Does 
any one know if it is allowed and make sense?


Cheers,
Pierre


Baselga, A. and C. D. L. Orme (2012). "betapart: an R package for the 
study of beta diversity." Methods in Ecology and Evolution 3(5): 808-812.


library(betapart)
data(ceram.s)
data(ceram.n)
# get betapart objects
ceram.s.core <- betapart.core(ceram.s)
ceram.n.core <- betapart.core(ceram.n)
# multiple site measures
ceram.s.multi <- beta.multi(ceram.s.core)
ceram.n.multi <- beta.multi(ceram.n.core)
# sampling across equal sites
ceram.s.samp <- beta.sample(ceram.s.core,
sites=10, samples=100)
ceram.n.samp <- beta.sample(ceram.n.core,
sites=10, samples=100)
# plotting the distributions of components
dist.s <- ceram.s.samp$sampled.values
dist.n <- ceram.n.samp$sampled.values
plot(density(dist.s$beta.SOR),
xlim=c(0,0Æ8), ylim=c(0, 19), xlab='Beta
diversity', main='', lwd=3)
lines(density(dist.s$beta.SNE), lty=1, lwd=2)
lines(density(dist.s$beta.SIM), lty=2, lwd=2)
lines(density(dist.n$beta.SOR), col='grey60',
lwd=3)
lines(density(dist.n$beta.SNE), col='grey60',
lty=1, lwd=2)
lines(density(dist.n$beta.SIM), col='grey60',
lty=2, lwd=2)



Le 15/02/2013 13:37, v_coudr...@voila.fr a écrit :

Dear all,

I am investigating diversity in different sites. I partitioned my measure of 
diversity into to additive components (Baselga 2012) and get for each site a 
value of overal
diversity change (between 0 and 1) and a value for each additive component, 
such that for each site beta1+beta2=beta_total. I would like to make a 
regression
model to test if the proportion of diversity due to beta1 (beta1/beta_total) is 
signifcantly different according to an explanatory factor. If beta1 had been an 
integer
value, I would have used a binomial model. However, since beta1 is not an 
integer I don't think that I am allowed to use the formel
glm(cbind(beta1,beta2)~x,family=binomial)? What alternative method could I use?
I hope that my question is not too confuse.
Thank you very much.
Valérie
___
C'est l'année du Serpent ! Connaissez-vous votre signe astral chinois ? 
Découvrez-le ici 
http://astrocenter.voila.fr/voila/Presentation.aspx?product=StEdCH2K2&Af=-3000

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


[R-sig-eco] Species Scaling methods in PCoA or CAP using vegan::capscale

2013-02-07 Thread Pierre THIRIET

Dear R user,

I want to study relations between sites categories and species 
abundances through PCoA or CAP using vegan::capscale. For doing so, I 
overlay species scores on my ordination. However, I am getting confused 
with the different scaling options and their interpretation. From what I 
understood:
With scaling=1, arrow shows the direction from the origin for which 
sites have larger abundances for this species.
With scaling=2, we want to analyze the correlations among species. 
Species that have a small angle between their arrows are expected to be 
strongly positively correlated.
With scaling=-2, const = sqrt(nrow(dune)-1), we get correlations between 
species and axes. This come from Jari Oksanen at 
https://stat.ethz.ch/pipermail/r-sig-ecology/2010-August/001448.html


I did compare the 3 different options (see codes below), differences 
seem to be only a matter of arrow length. Hence, by considering that the 
most important are the relative length of arrows (relative to each 
other), am I allowed to use scaling=-2 (species axes correlations) for 
both analyzing site-species associations and species-species correlations?
One more questions, is there a well admitted threshold in the value of 
species-axes correlations, below which we consider that species are not 
correlated (I mean species don't differ in abundance across the sites, 
excluding the cases of non linear relationships).
If yes, do this threshold change depending on the scaling method used 
and on either the ordination is constrained or not.
Morover, if I want to overlay a vector for a quantitative 
environnemental variable, may I use also scaling=-2, from which 
correlation threshold?


I thank you in advance,
Pierre

library(vegan)
library(ggplot2)
library(grid)
data(dune)
data(dune.env)

dune=sqrt(dune)
mcap=capscale(dune~1,dist="bray") #PCoA

#sites scores
dims=c(1,2)
site=scores(mcap,display="wa",choices=dims)
site.env=cbind(site,dune.env)

#spider for management levels
dev.new();plot(mcap);coord_spider=with(dune.env,ordispider(mcap,Management,col="blue",label=F,choices=dims));dev.off()
coord_spider=as.data.frame(cbind(coord_spider[,],site.env))
names(coord_spider)[1:4]=c("MDS1","MDS2","MDS1end","MDS2end")

#species scores
#scaling1
cor.min=0.6 #below this threshold, arrows will be not plotted because 
correlation is considered too much week

cor_sp=as.data.frame(scores(mcap, dis="sp", scaling=1,choices=dims))
cor_sp$cor=with(cor_sp,sqrt(MDS1^2+MDS2^2))
cor_sp$sup=FALSE;cor_sp$sup[cor_sp$cor>=cor.min]<-TRUE
cor_sp$lab=row.names(cor_sp)
cor_sp=cor_sp[cor_sp$sup==TRUE,]
cor_sp_s1=cor_sp

#scaling2
cor.min=0.6 #below this threshold, arrows will be not plotted because 
correlation is considered too much week

cor_sp=as.data.frame(scores(mcap, dis="sp", scaling=2,choices=dims))
cor_sp$cor=with(cor_sp,sqrt(MDS1^2+MDS2^2))
cor_sp$sup=FALSE;cor_sp$sup[cor_sp$cor>=cor.min]<-TRUE
cor_sp$lab=row.names(cor_sp)
cor_sp=cor_sp[cor_sp$sup==TRUE,]
cor_sp_s2=cor_sp

#scaling -2, correlations between species and axes
#from Jari Oksanen at 
https://stat.ethz.ch/pipermail/r-sig-ecology/2010-August/001448.html
cor.min=0.6 #below this threshold, arrows will be not plotted because 
correlation is considered too much week
cor_sp=as.data.frame(scores(mcap, dis="sp", scaling=-2, const = 
sqrt(nrow(dune)-1),choices=dims))

cor_sp$cor=with(cor_sp,sqrt(MDS1^2+MDS2^2))
cor_sp$sup=FALSE;cor_sp$sup[cor_sp$cor>=cor.min]<-TRUE
cor_sp$lab=row.names(cor_sp)
cor_sp=cor_sp[cor_sp$sup==TRUE,]
cor_sp_s3=cor_sp

#plot
mon.plot1=ggplot(data=site.env)+theme_bw()+
  geom_point(aes(x=MDS1,y=MDS2,color=Management))# les sites

#add spider
mon.plot2=mon.plot1+
geom_segment(data=coord_spider,aes(x=MDS1,y=MDS2,xend=MDS1end,yend=MDS2end,colour=Management),lwd=0.5,alpha=1/3)+
geom_point(data=coord_spider,aes(x=MDS1,y=MDS2,colour=Management),cex=3,shape=19)

#add species scores as vector
#scaling1
mon.plot_s1=mon.plot2+ggtitle("Scaling 1")+
geom_point(aes(x=0,y=0),shape=21,fill="black",color="black",size=3)+#central 
point
  geom_segment(data=cor_sp_s1,aes(x=0,y=0,xend=MDS1,yend=MDS2),arrow = 
arrow(length = unit(0.3,"cm")))+#arrows
  geom_text(data = cor_sp_s1, aes(x = MDS1, y = MDS2, label = lab), 
size = 3)#labels


#scaling2
mon.plot_s2=mon.plot2+ggtitle("Scaling 2")+
geom_point(aes(x=0,y=0),shape=21,fill="black",color="black",size=3)+#central 
point
  geom_segment(data=cor_sp_s2,aes(x=0,y=0,xend=MDS1,yend=MDS2),arrow = 
arrow(length = unit(0.3,"cm")))+#arrows
  geom_text(data = cor_sp_s2, aes(x = MDS1, y = MDS2, label = lab), 
size = 3)#labels


#scaling -2, correlations between species and axes
mon.plot_s3=mon.plot2+ggtitle("Scaling -2")+
geom_point(aes(x=0,y=0),shape=21,fill="black",color="black",size=3)+#central 
point
  geom_segment(data=cor_sp_s3,aes(x=0,y=0,xend=MDS1,yend=MDS2),arrow = 
arrow(length = unit(0.3,"cm")))+#arrows
  geom_text(data = cor_sp_s3, aes(x = MDS1, y = MDS2, label = lab), 
size = 3)#labels


#plot everything
grid.newpage()
pushViewport(viewport(la