Re: [R-sig-eco] How are the species scores of PCA computed in the vegan package?

2014-05-27 Thread claire della vedova
Dear Jari,

You're right, I missed the Design decisions vignette; sorry for that.
Thank you so much for your reply.
All the best

Claire Della Vedova


-----Original Message-----
From: Jari Oksanen [mailto:jari.oksa...@oulu.fi]
Sent: Tuesday, 27 May 2014 10:43
To: claire della vedova; r-sig-ecology@r-project.org
Subject: RE: [R-sig-eco] How are the species scores of PCA computed in the
vegan package?

Dear Claire Della Vedova,

It seems that you have searched in many places, except in vegan
documentation. Look at the vignette on Design decisions, section "Scaling in
redundancy analysis".

Cheers, Jari Oksanen

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology



[R-sig-eco] How are the species scores of PCA computed in the vegan package?

2014-05-27 Thread claire della vedova
Hi everybody,

I'm working on PCA and comparing the outputs of the ade4 and vegan
packages.
I understand the normalization of the variable coordinates in the ade4
output
(with $co, coordinates are scaled to the eigenvalues; with $c1, coordinates
are scaled to 1).

But I have difficulty understanding how the species scores are computed in
vegan's output with the scaling 1 or 2 options, and what the message about
scaling means, especially the 'General scaling constant of scores'.
For example:

'Scaling 2 for species and site scores
* Species are scaled proportional to eigenvalues
* Sites are unscaled: weighted dispersion equal on all dimensions
* General scaling constant of scores:  4.226177 '

I've searched the R-sig-ecology archives, Cross Validated and
Stack Overflow, and found nothing that helped me. If somebody has some
information about it, I would greatly appreciate some help.
All the best.

Claire Della Vedova

Here are some parts of my code:


library(ade4)
library(vegan)

doubs.env <- read.csv('http://www.sci.muni.cz/botany/zeleny/wiki/anadat-r/data-download/DoubsEnv.csv',
                      row.names = 1)

## with ade4 ##
pca.ad <- dudi.pca(doubs.env, scale = TRUE, center = TRUE, scannf = FALSE, nf = 3)

# eigenvalue of the first axis
pca.ad$eig[1]
[1] 5.968749

# variable coordinates on the first axis
pca.ad$co[,1]
 [1]  0.85280863 -0.81918008 -0.4528  0.75214647 -0.04996375  0.70722171
 [7]  0.83048310  0.90260821  0.79011263 -0.76485397  0.76373149


# check the normalization of the loadings
sum(pca.ad$co[,1]^2)
[1] 5.968749
#=> coordinates scaled to the eigenvalue

# normed variable scores on the first axis
pca.ad$c1[,1]
 [1]  0.34906791 -0.33530322 -0.18535177  0.30786532 -0.02045094  0.28947691
 [7]  0.33992972  0.36945166  0.32340546 -0.31306670  0.31260725


# check the normalization
sum(pca.ad$c1[,1]^2)
[1] 1
#=> coordinates scaled to 1

## with vegan ##
pca.veg <- rda(doubs.env, scale = TRUE)

# species scores on the first axis, with scaling 1
summary(pca.veg, scaling=1)
       das        alt        pen        deb         pH        dur        pho
 1.4752228 -1.4170507 -0.7833294  1.3010933 -0.0864293  1.2233806  1.4366031
       nit        amm        oxy        dbo
 1.5613681  1.3667687 -1.3230752  1.3211335

'Scaling 1 for species and site scores
* Sites are scaled proportional to eigenvalues
* Species are unscaled: weighted dispersion equal on all dimensions
* General scaling constant of scores:  4.226177 '



# species scores on the first axis, with scaling 2
summary(pca.veg, scaling=2)[1][[1]][,1]
        das         alt         pen         deb          pH         dur
 1.08668311 -1.04383225 -0.57701847  0.95841533 -0.06366582  0.90117039
        pho         nit         amm         oxy         dbo
 1.05823501  1.15013974  1.00679334 -0.97460773  0.97317743

'Scaling 2 for species and site scores
* Species are scaled proportional to eigenvalues
* Sites are unscaled: weighted dispersion equal on all dimensions
* General scaling constant of scores:  4.226177 '
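
For what it's worth, the printed constant and both sets of species scores can be reproduced from the ade4 numbers above. The sketch below is base R only and rests on two assumptions that are not stated in the post: the table has 30 sites, and the total inertia equals the number of variables (11) because the variables were standardised. Under those assumptions the general scaling constant is the fourth root of (n - 1) times the total inertia, scaling 1 multiplies the unit-length eigenvector ($c1) by that constant, and scaling 2 additionally multiplies by sqrt(eigenvalue / total inertia).

```r
# Values copied from the ade4 output in the post (first axis only).
vars <- c("das", "alt", "pen", "deb", "pH", "dur", "pho", "nit", "amm", "oxy", "dbo")
c1 <- c(0.34906791, -0.33530322, -0.18535177, 0.30786532, -0.02045094,
        0.28947691, 0.33992972, 0.36945166, 0.32340546, -0.31306670, 0.31260725)
names(c1) <- vars
lambda1 <- 5.968749   # first eigenvalue, pca.ad$eig[1]

# ASSUMPTIONS (not stated in the post): 30 sites, and total inertia equal to
# the number of standardised variables.
n_sites <- 30
total_inertia <- length(c1)

# General scaling constant: fourth root of (n - 1) * total inertia.
scaling_const <- ((n_sites - 1) * total_inertia)^(1 / 4)

# Scaling 1: unit-length eigenvector times the constant.
species_scaling1 <- c1 * scaling_const

# Scaling 2: additionally multiplied by sqrt(eigenvalue / total inertia).
species_scaling2 <- species_scaling1 * sqrt(lambda1 / total_inertia)

print(round(scaling_const, 6))
print(round(species_scaling1, 4))
print(round(species_scaling2, 4))
```

With these inputs the constant comes out at about 4.2262 and the first (das) scores at about 1.4752 and 1.0867, in line with the vegan output above.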




--
View this message in context: 
http://r-sig-ecology.471788.n2.nabble.com/how-are-compuetd-the-species-scores-of-pca-from-vegan-package-tp7578918.html
Sent from the r-sig-ecology mailing list archive at Nabble.com.



[R-sig-eco] troubles with global test of rda from vegan

2014-03-22 Thread claire della vedova
Hi everybody,

I'm having trouble with results obtained with the rda function of the vegan
package, and I would greatly appreciate some help.
I ran an RDA to assess whether my matrix of species abundances (18 sites and
34 species) can be explained by my environmental matrix (18 sites and 5
variables). Abundances were Hellinger-transformed.
First I ran an RDA with all my environmental variables, and then the
overall test. It was not significant.

myrda1 <- rda(decostand(abund, "hellinger") ~ ., VarEnv)
anova(myrda1)
Permutation test for rda under reduced model

Model: rda(formula = decostand(abund, "hellinger") ~ VAR1 + VAR2 + Var3 +
Var4 + VAR5, data = VarEnv)
         Df      Var     F N.Perm Pr(>F)
Model     5 0.062863 1.025     99   0.43
Residual 12 0.147195

I also ran the test by margin (no p-value was significant) and by
axis (first axis significant):
anova(myrda1, by="axis")

Model: rda(formula = decostand(abund, "hellinger") ~ VAR1 + VAR2 + Var3 +
Var4 + VAR5, data = VarEnv)
         Df      Var      F N.Perm Pr(>F)
RDA1      1 0.030016 2.4470    199   0.01 **
RDA2      1 0.013816 1.1263     99   0.29
RDA3      1 0.009770 0.7965     99   0.68
RDA4      1 0.006273 0.5114     99   0.84
RDA5      1 0.002989 0.2437     99   1.00
Residual 12 0.147195

On the plot, the first axis is explained by Var1 and Var4.


Since I was surprised by the result of the global test, I tried a forward
selection. Only Var4 was kept in the final model, and the test was now
significant. I also tried a backward selection; this time Var1 was kept
in the final model, and the test was significant too.

So my question is: why is the global test of the RDA with all the
environmental variables not significant, while the test by "axis" is
significant for the first axis (explained by Var1 and Var4) and model
selection leads to a significant test for Var1 or Var4?
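
One way to look at this question is to recompute the pseudo-F statistics from the variances printed in the two tables. The sketch below uses only those printed numbers (base R, no vegan call) and assumes the usual pseudo-F form: variance per degree of freedom of the term, divided by residual variance per residual degree of freedom. It illustrates that the global statistic spreads the constrained variance over all five model degrees of freedom, while the first axis carries nearly half of it on a single one. It is not a substitute for the permutation tests themselves.

```r
# Variances copied from the anova() tables in the post.
axis_var <- c(RDA1 = 0.030016, RDA2 = 0.013816, RDA3 = 0.009770,
              RDA4 = 0.006273, RDA5 = 0.002989)
resid_var <- 0.147195
resid_df  <- 12

model_var <- sum(axis_var)     # constrained variance of the full model
model_df  <- length(axis_var)  # five constraining variables, five axes

# Residual variance per residual degree of freedom.
resid_ms <- resid_var / resid_df

# Pseudo-F for the whole model and for each axis (1 df per axis).
global_F <- (model_var / model_df) / resid_ms
axis_F   <- axis_var / resid_ms

print(round(global_F, 3))
print(round(axis_F, 4))
print(round(axis_var / model_var, 2))  # share of constrained variance per axis
```

The recomputed values are about 1.025 for the full model and about 2.447 for RDA1, matching the tables above.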

I checked the VIFs of the full model, and all were lower than 3:
vif.cca(myrda1)
    VAR1     VAR2     Var3     Var4     VAR5
2.573506 2.949139 2.209569 2.023914 1.854133

Thanks in advance for your help.

All the best.
Claire Della Vedova




--
View this message in context: 
http://r-sig-ecology.471788.n2.nabble.com/troubles-with-global-test-of-rda-from-vegan-tp7578754.html
Sent from the r-sig-ecology mailing list archive at Nabble.com.



[R-sig-eco] pca or nmds (with which normalization and distance ) for abundance data ?

2012-12-13 Thread claire della vedova
Dear all,

I'm a biostatistician working for a French institute involved in
environmental risk assessment, and I need help understanding the
results I obtained from several ordination analyses.

I have a dataset of 25 sites. For these 25 sites I have abundance data of 38
species and also the measurement of 5 environmental variables.

Here is an extract of my abundance data for the first 5 sites:

Anguinidae.ditylenchus Aphelenchidae Aphelenchoididae Aporcelaimidae
1218  184  0
 014  154  0
45 0  101  6
20 0  148  0
 0 0  118  0

 

Here are the environmental data for the first 5 sites:

ExtPond  moist   Corg   pH   DV50
  0.946  9.086  4.269 5.24 171.33
  0.682 27.139 23.813 3.82  75.45
  2.480 14.322  7.191 4.48 230.90
  3.069 18.380 11.404 3.58 211.19
  2.615 16.693  7.128 4.12 224.45

 

My aim was to study how the distribution of species is linked to the
environmental data.

First, I ran a PCA (with the vegan package) on Hellinger-transformed data,
with a command like this:

acp1 <- rda(decostand(myDataSpec[, c(25:62)], "hellinger"))
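
As a reminder of what the transformation does, here is a minimal base-R sketch of the Hellinger transformation: each abundance is divided by its site (row) total and the square root is taken. The small matrix is made up for illustration and is not the data from this post; in vegan the same step is done by decostand(x, "hellinger").

```r
# Made-up abundances: 2 sites (rows) x 4 taxa (columns). Not the poster's data.
abund <- matrix(c(10, 0, 40, 0,
                   0, 5, 20, 2),
                nrow = 2, byrow = TRUE,
                dimnames = list(c("site1", "site2"), paste0("taxon", 1:4)))

# Hellinger transformation: square root of the row-wise relative abundances.
# Dividing a matrix by a vector of length nrow recycles down the columns,
# so every value is divided by the total of its own row.
hellinger <- sqrt(abund / rowSums(abund))

print(round(hellinger, 3))
print(rowSums(hellinger^2))  # each transformed row has unit sum of squares
```

A site with a total abundance of zero would produce NaN here, so such rows need to be handled before the transformation.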

 

The first axis represents 19.5% of the inertia and the second one 16.3%. A
colleague of mine said that this is not so bad for abundance data, but it
seems quite poor to me. What do you think?

Then I fitted environmental vectors with the envfit function (of the vegan
package), with a command like this:

physCInd.fit3 <- envfit(acp1, MyDataEnv[, c(13, 18, 20, 21, 23)],
                        permutations = 4999, na.rm = TRUE)

It appeared that pH is significantly linked with the ordination, and the
p-value of ExtPond is 0.1.

Next I ran an RDA, which was not significant.

Finally, I ran two NMDS. For the first one I used the Hellinger
transformation and the Bray-Curtis distance. The stress value obtained is
0.22, the non-metric fit R² is 0.952 and the linear fit R² is 0.777. When I
fitted the environmental vectors, ExtPond was correlated with the ordination
(p-value = 0.02), while the p-value of pH was 0.23.

But then I read in "Numerical Ecology", page 449, that it is better to
standardize the data by dividing each value by the maximum abundance of its
species and then to use the Kulczynski distance. The stress value was 0.23,
the non-metric fit R² was 0.948 and the linear fit R² was 0.69. These values
are slightly worse than those of the first NMDS, but the stressplot looks
more homogeneous to me.
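
The standardization and distance described above can be sketched in base R as follows. The abundances are made up, and the kulczynski() helper is a hypothetical name written for this illustration; in vegan the corresponding calls would be decostand(x, "max") and vegdist(x, method = "kulczynski"), which this sketch assumes use the same definition.

```r
# Made-up abundances: 3 sites (rows) x 4 taxa (columns). Not the poster's data.
abund <- matrix(c(10, 0, 40, 0,
                   0, 5, 20, 2,
                  30, 0, 10, 6),
                nrow = 3, byrow = TRUE,
                dimnames = list(paste0("site", 1:3), paste0("taxon", 1:4)))

# Standardise each taxon (column) by its maximum abundance, so that every
# column ranges from 0 to 1.
abund_max <- sweep(abund, 2, apply(abund, 2, max), "/")

# Kulczynski dissimilarity between two sites x and y (hypothetical helper):
# 1 - 0.5 * (sum(min(x, y)) / sum(x) + sum(min(x, y)) / sum(y))
kulczynski <- function(x, y) {
  shared <- sum(pmin(x, y))
  1 - 0.5 * (shared / sum(x) + shared / sum(y))
}

# Fill the full site-by-site dissimilarity matrix.
n <- nrow(abund_max)
d <- matrix(0, n, n, dimnames = list(rownames(abund), rownames(abund)))
for (i in seq_len(n)) {
  for (j in seq_len(n)) {
    d[i, j] <- kulczynski(abund_max[i, ], abund_max[j, ])
  }
}

print(round(d, 3))
```

Because every taxon is rescaled to a maximum of 1, rare and abundant taxa get the same weight, which is one plausible reason why the two NMDS solutions can relate differently to the environmental variables.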

Nevertheless, the results I obtained are very different. When I fitted
the environmental data, ExtPond was not correlated with this ordination
(p-value = 0.82), and the p-value of pH was 0.06. And obviously ExtPond is
the most important variable for us ;-)

With all these results I'm quite confused, and I don't know what to think.
So if someone can help me, I would appreciate it very much. All comments
are welcome.

To summarize, my questions are:

a)  Which ordination method would be better for my data: PCA, knowing
that the represented inertia is 35.62%, or NMDS with a stress value of about
0.22?

b)  If NMDS is more suitable, which one is better: with the Hellinger
transformation and Bray-Curtis distance, or with the standardization
recommended by Legendre and Legendre and the Kulczynski distance?

c)  Is there another method to apply? I'm going to try co-inertia analysis
with the ade4 package.

 

Thanks in advance.

Cheers.

Claire Della Vedova

 


