Hi Gavin and Hi all, I will not go in front of a bus for sure, I not mad, at least I am not still mad... :)
I would like to tell you that I am a Ph.D. student, and for what I know, Ph.D. student still have to understand things studing those from whom wrote before them... Isac Newton became famous not only for his science but also for a famous phrase that, if I don't remember it bad, act like this :" If I have seen so much far away is because I stand on shoulders of Giants"... I think that it needs any comment, and express itself the concept... So, I am so sorry, I also don't like the "me to" attitude, but you don't know how is my reality here, and I can assure you that also If I am still a "student", I am alone in my research, and If have a tutor and boss for italian rules I don't have a boss for statistics, couse none could help me on that... So what could I do if I don't take models in already published literature? Anyway, I don't want to seem like the victim, I have a brain that works and I am doing my best to understand and improve my knowledge and at least lean and grow, for sure, step by step, and with a big humility, in science and in this case in statistics... Anyway... For continuing the brainstorm if I can...The Host effect is what I think is more interesting for the ecological point of view of my trials also becasue the 4 communities have two by two the same host, I mean A and B, Corylus, while B and C, Ostrya... If I plot the factors of the envifit into the graph and the evidence of separation seems clear... That's are my metaMDS with 2 and 3 dimensions: > NMS.1 Call: metaMDS(comm = sqrtABCD, distance = "bray", k = 2, trymax = 100, autotransform = F) Nonmetric Multidimensional Scaling using isoMDS (MASS package) Data: sqrtABCD Distance: bray shortest Dimensions: 2 Stress: 24.54342 Two convergent solutions found after 18 tries Scaling: centring, PC rotation, halfchange scaling Species: expanded scores based on sqrtABCD > NMS.ABCD.2ef ***FACTORS: Centroids: NMDS1 NMDS2 CommunityA -0.3271 0.1984 CommunityB -0.1956 0.1768 CommunityC 0.2520 -0.2847 CommunityD 0.2706 -0.0905 HostCorylus -0.2613 0.1876 HostOstrya 0.2613 -0.1876 Goodness of fit: r2 Pr(>r) Community 0.1897 0.017982 * Host 0.1778 0.001998 ** --- Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1 P values based on 1000 permutations. > NMS.1.3 Call: metaMDS(comm = sqrtABCD, distance = "bray", k = 3, trymax = 100, autotransform = F) Nonmetric Multidimensional Scaling using isoMDS (MASS package) Data: sqrtABCD Distance: bray shortest Dimensions: 3 Stress: 16.29226 Two convergent solutions found after 6 tries Scaling: centring, PC rotation, halfchange scaling Species: expanded scores based on sqrtABCD > NMS.ABCD.3ef ***FACTORS: Centroids: NMDS1 NMDS2 NMDS3 CommunityA 0.3881 -0.2702 0.1536 CommunityB 0.1407 -0.2344 0.0197 CommunityC -0.2053 0.3566 -0.0219 CommunityD -0.3235 0.1480 -0.1514 HostCorylus 0.2644 -0.2523 0.0866 HostOstrya -0.2644 0.2523 -0.0866 Goodness of fit: r2 Pr(>r) Community 0.1798 0.005994 ** Host 0.1581 0.000999 *** --- Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1 P values based on 1000 permutations. > I got 10 sample units for each community data (40 in total) You said at the end : "*I do wonder if you are not hitting the curse of dimensionality here?"* Can you explain me what do you mean for "hitting the curse of dimensionality" if I am not so demanding... ... and then: *"it would be nice to look at the ordination but how you do that I don't know".* I would be glad if you see the graphs of my ordinations, Can I send them to you? That would be great... let me know about that. I used to plot in this way: > plot(NMS.1, type="n", dis= "sp") > ordisymbol(NMS.1, env.table, "Host", legend=T) Anyway I have to admit that with 2 and at least 3 dimensions the points into the ordinantion plot are better separated in reasons to the data matrix, so what to do? better fittind of points ant bigger stress or the contrary? I think is enough, thank you so much for your help, I'll appreciate any comments! :) Gian > And thank you all for the kind responses... > > > > I do not want to torture myself for sure... :) I red (lot of) > publications > > about fungal community ecology studies (soil fungi), my research field > > indeed, and all uses NMDS or DCA as ordination techniques... So, I am > only > > trying to do my best useing R for calculating them... > > Would you walk in front of a bus if you saw lots of other people doing > it? I doubt it. This kind of "me to" attitude to science is quite > demoralising when reviewing manuscripts and reading the literature. > > DCA was invented to solve a specific problem with CA - namely the arch > artefact. I forget whether this is in Jari's public lecture notes, vegan > vignettes/tutorials or in one of his lectures on a course we taught > together, but DCA replaces the arch artefact with other artefacts that > make the points look like a trumpet or a diamond in ordination space. > > Why DCA is used as a default instead of a special case escapes me. You > really shouldn't use DCA at all if you can get away with it as it is > doing some nasty things to your data. > > Alternatives; i) NMDS ii) PCA after application of a transformation > (Legendre & Gallagher 2001, Oecologia). And there are probably others... > > > > > What I need now is a good environmental interpretation of my work... > > > > Then I found the fantastic Jari's pdf about "Multivariate Analysis of > > Ecological Communities in R: vegan tutorial" and I went to the passage > about > > factors and vectors fitting... > > That's my R code: > > > > > NMS.ABCDsqrt > > > > Call: > > metaMDS(comm = sqrtABCD, distance = "bray", k = 4, trymax = 100, > > autotransform = F) > > > > Nonmetric Multidimensional Scaling using isoMDS (MASS package) > > > > Data: sqrtABCD > > Distance: bray shortest > > > > Dimensions: 4 > > Stress: 11.68632 > > What was the stress with k = 2 and k = 3. As Jari has already mentioned, > how are you going to interpret and visualise this 4D configuration of > points (you can't plot NMDS1 vs NMDS2, NMDS1 vs NMDS3 etc. for reasons > explained to you earlier in this thread). > > > Two convergent solutions found after 2 tries > > Scaling: centring, PC rotation, halfchange scaling > > Species: expanded scores based on sqrtABCD > > > > > envfit(NMS.ABCDsqrt, env.table, permu=1000) ->NMS.ABCDsqrtef > > > NMS.ABCDsqrtef > > > > ***FACTORS: > > > > Centroids: > > NMDS1 NMDS2 NMDS3 NMDS4 > > CommunityA -0.3821 0.3822 -0.1173 -0.1232 > > CommunityB -0.1849 0.2748 0.0076 -0.0720 > > CommunityC 0.2206 -0.4261 -0.0505 0.1197 > > CommunityD 0.3465 -0.2310 0.1603 0.0756 > > HostCorylus -0.2835 0.3285 -0.0549 -0.0976 > > HostOstrya 0.2835 -0.3285 0.0549 0.0976 > > > > Goodness of fit: > > r2 Pr(>r) > > Community 0.2009 0.001998 ** > > Host 0.1818 0.000999 *** > > --- > > Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1 > > P values based on 1000 permutations. > > > > > names(NMS.ABCDsqrtef) > > [1] "vectors" "factors" > > > NMS.ABCDsqrtef$vectors > > NULL > > > > > > > I have an enviromental matrix (env.table) that contains only area > (A,B,C,D) > > and tree species (Corylus sp. and Ostrya sp. ) differentiation, so I have > > the area and the tree species of each samples is related to... > > I could plot factors on the graph but not vectors because there aren't > > vectors in reason to the absence of numerical data in the env. matrix... > > isn't it right? Aren't the R2 values too low? > > Did you read ?envfit ? It states: > .... > (r^2). For factors this is defined as r^2 = 1 - ss_w/ss_t, where > ss_w and ss_t are within-group and total sums of squares. > > So this statistic here is looking at how constrained within the 4D space > the levels of each factor are in relation to the overall spread of the > points. This looks to me like some evidence for grouping of your sites > on basis of Community and stronger evidence for Host. The "effect" is > small but significant. I do wonder if you are not hitting the curse of > dimensionality here? > > The interpretation will depend on the number of samples. It would be > nice to look at the ordination but how you do that I don't know. > > G > > > > > Many many thank you for answering... > > > > Gian > > > > > > > > > > > > > > > > > > > > > > > Jari, > > > > > > > > I am here again ... :) > > > > So, to try having a comparison of the real goodness of my metaMDS > data I > > > > tried to perform a DCA (with same input table) > > > > Then please forgive me if I do somethign wrong with it... That's my R > > > code: > > > > > > Why DCA? What lead you to torture your data so? > > > > > > > >decorana(sqrtABCD, iweigh=0, ira=0) -> DCA.1 > > > > > DCA.1 > > > > > > > > Call: > > > > decorana(veg = sqrtABCD, iweigh = 0, ira = 0) > > > > > > > > Detrended correspondence analysis with 26 segments. > > > > Rescaling of axes with 4 iterations. > > > > > > > > DCA1 DCA2 DCA3 DCA4 > > > > Eigenvalues 0.6688 0.5387 0.4822 0.3752 > > > > Decorana values 0.7912 0.5795 0.4145 0.2931 > > > > Axis lengths 5.9974 3.7036 3.6121 3.3802 > > > > > > > > > > > > > > > > > In that situation the graph is still good but the differences between > the > > > > two clades are little more confused, maybe in the axe II (I mean the > > > > vertical one) in this case there is a better separation. > > > > What do the "Decorana values" really mean? > > > > > > ?decorana > > > > > > Basically, in the original DECORANA code the Eigenvalues reported were > > > computed at the wrong stage of the "detrending" processes. Jari > realised > > > this when interfacing the old DECORANA code with R. Jari altered the > > > code to compute the correct Eigenvalues, but chose to also report the > > > values you'd get from DECORANA or Canoco to stop people complaining > that > > > vegan was doing DCA incorrectly. > > > > > > > And how about the segments? > > > > > > What about them? Do you know how DCA works? The standard detrending > > > breaks the first (D)CA axis into 26 sequential chunks or segements. the > > > 26 is the default, but it can be changed. Within each chunk, the mean > > > trial site score for axis 2 for sites in that chunk is subtracted from > > > the trial axis 2 site scores of the sites in the chunk. This detrending > > > is what gets rid of the "arch" found in some CA plots and is the reason > > > DCA was invented. > > > > > > > > > > > How can I do something better? > > > > > > Are you trying to separate the two clades? Do you know a priori which > > > samples belong to which clade? If so, one of the many classification > > > methods in R would be more useful as they look to separate the a priori > > > defined groups best. The methods you have been using thus far aim to > > > represent the dissimilarities between samples best in a low dimensional > > > space. > > > > > > HTH > > > > > > G > > > > > > > > > > > Many thank you in advance, > > > > > > > > G. > > > > > > > [[alternative HTML version deleted]] > > > > _______________________________________________ > > R-sig-ecology mailing list > > R-sig-ecology@r-project.org > > https://stat.ethz.ch/mailman/listinfo/r-sig-ecology > -- > %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% > Dr. Gavin Simpson [t] +44 (0)20 7679 0522 > ECRC, UCL Geography, [f] +44 (0)20 7679 0565 > Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk > Gower Street, London [w] > http://www.ucl.ac.uk/~ucfagls/<http://www.ucl.ac.uk/%7Eucfagls/> > UK. WC1E 6BT. [w] http://www.freshwaters.org.uk > %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~% > > > -- Dr. Gian Maria Niccolò Benucci Department of Applied Biology - University of Perugia Borgo XX Giugno, 74 I-06121 - Perugia, ITALY Email: gian.benu...@gmail.com Tel: +39.075.5856433 [[alternative HTML version deleted]]
_______________________________________________ R-sig-ecology mailing list R-sig-ecology@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-ecology