Re: [R-sig-eco] pollination experiment with missing value
Dear Humberto, I suppose you are interessed in the significance of the treatment factor. You can test that effect by comparing models with and without the term. You can get the multiple comparisons with the multcomp package. Here is an example using the Pastes dataset. library(lme4) library(multcomp) data(Pastes) Model - lmer(strength ~ cask + (1|batch), data = Pastes) Model2 - lmer(strength ~ (1|batch), data = Pastes) anova(Model, Model2) glht(Model, linfct = mcp(cask = Tukey)) confint(glht(Model, linfct = mcp(cask = Tukey))) PS You have a lot of treatments to test with few replicates. Testing less treatment with more replicates would be a better design. I'm affraid that the power of your current design will be rather low. HTH, Thierry ir. Thierry Onkelinx Instituut voor natuur- en bosonderzoek team Biometrie Kwaliteitszorg Gaverstraat 4 9500 Geraardsbergen Belgium Research Institute for Nature and Forest team Biometrics Quality Assurance Gaverstraat 4 9500 Geraardsbergen Belgium tel. + 32 54/436 185 thierry.onkel...@inbo.be www.inbo.be To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher The plural of anecdote is not data. ~ Roger Brinner The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. ~ John Tukey -Oorspronkelijk bericht- Van: r-sig-ecology-boun...@r-project.org [mailto:r-sig-ecology-boun...@r-project.org] Namens Humberto Dutra Verzonden: dinsdag 8 december 2009 0:25 Aan: r-sig-ecology@r-project.org Onderwerp: [R-sig-eco] pollination experiment with missing value Dear subscribers I performed a quite simple pollination experiment: 12 plants 12 pollination treatments per plant, plant 1 had treatments a, b, c, d... etc plant 2 had treatments a, b, c, d... The idea was to run a simple anova using plants as blocks to see the effects of treatments on fruit production, but I lost some a few treatments in some plants, thus I have to deal with missing values First I tried the obvious, arc transformed fruits fruset.aov-aov(arcfruit~treatment+Error(Plant), na.action=na.omit, fruset) but I get the wrong DF because of the missing values, and I cannot perform the multiple comparisons test. Is there a better way to deal with this unbalanced design? I did a little bit more research and I decided to run a GLM for mixed effects using a binomial distribution y-cbind(fruits,nofruits) model-glmer(y ~ treatment+ (1|Plant), binomial,data=fruset) but them I cannot get the anova table anova(model,test=F) Error in anova(model, test = F) : single argument anova for GLMMs not yet implemented Some people suggested using Anova function on the package car, but I don't see how can I get it to work with a mixed effects model like this. Any suggestions are appreciated. Are there other straight forward ways to analyze such data, given the missing values, and multiple comparisons follow up? Thank you Humberto == 'Discipline - Success doesn't just happen. You have to be intentional about it, and that takes discipline.' - John Maxwell [[alternative HTML version deleted]] ___ R-sig-ecology mailing list R-sig-ecology@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-ecology Druk dit bericht a.u.b. niet onnodig af. Please do not print this message unnecessarily. Dit bericht en eventuele bijlagen geven enkel de visie van de schrijver weer en binden het INBO onder geen enkel beding, zolang dit bericht niet bevestigd is door een geldig ondertekend document. The views expressed in this message and any annex are purely those of the writer and may not be regarded as stating an official position of INBO, as long as the message is not confirmed by a duly signed document. ___ R-sig-ecology mailing list R-sig-ecology@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
Re: [R-sig-eco] Fwd: how to calculate axis variance in metaMDS, pakage vegan?
Gian,, I applaud your continued struggle to understand the best route of analysis. If I were you, I would ignore the rude comments made by some and focus on the constructive replies from others. Good luck! Chris Habeck Ph.D. Candidate Department of Zoology University of Wisconsin, Madison http://habeckecology.wikispaces.com/ 2009/12/7 Gian Maria Niccolò Benucci gian.benu...@gmail.com Hi Gavin and Hi all, I will not go in front of a bus for sure, I not mad, at least I am not still mad... :) I would like to tell you that I am a Ph.D. student, and for what I know, Ph.D. student still have to understand things studing those from whom wrote before them... Isac Newton became famous not only for his science but also for a famous phrase that, if I don't remember it bad, act like this : If I have seen so much far away is because I stand on shoulders of Giants... I think that it needs any comment, and express itself the concept... So, I am so sorry, I also don't like the me to attitude, but you don't know how is my reality here, and I can assure you that also If I am still a student, I am alone in my research, and If have a tutor and boss for italian rules I don't have a boss for statistics, couse none could help me on that... So what could I do if I don't take models in already published literature? Anyway, I don't want to seem like the victim, I have a brain that works and I am doing my best to understand and improve my knowledge and at least lean and grow, for sure, step by step, and with a big humility, in science and in this case in statistics... Anyway... For continuing the brainstorm if I can...The Host effect is what I think is more interesting for the ecological point of view of my trials also becasue the 4 communities have two by two the same host, I mean A and B, Corylus, while B and C, Ostrya... If I plot the factors of the envifit into the graph and the evidence of separation seems clear... That's are my metaMDS with 2 and 3 dimensions: NMS.1 Call: metaMDS(comm = sqrtABCD, distance = bray, k = 2, trymax = 100, autotransform = F) Nonmetric Multidimensional Scaling using isoMDS (MASS package) Data: sqrtABCD Distance: bray shortest Dimensions: 2 Stress: 24.54342 Two convergent solutions found after 18 tries Scaling: centring, PC rotation, halfchange scaling Species: expanded scores based on sqrtABCD NMS.ABCD.2ef ***FACTORS: Centroids: NMDS1 NMDS2 CommunityA -0.3271 0.1984 CommunityB -0.1956 0.1768 CommunityC 0.2520 -0.2847 CommunityD 0.2706 -0.0905 HostCorylus -0.2613 0.1876 HostOstrya 0.2613 -0.1876 Goodness of fit: r2 Pr(r) Community 0.1897 0.017982 * Host 0.1778 0.001998 ** --- Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1 P values based on 1000 permutations. NMS.1.3 Call: metaMDS(comm = sqrtABCD, distance = bray, k = 3, trymax = 100, autotransform = F) Nonmetric Multidimensional Scaling using isoMDS (MASS package) Data: sqrtABCD Distance: bray shortest Dimensions: 3 Stress: 16.29226 Two convergent solutions found after 6 tries Scaling: centring, PC rotation, halfchange scaling Species: expanded scores based on sqrtABCD NMS.ABCD.3ef ***FACTORS: Centroids: NMDS1 NMDS2 NMDS3 CommunityA 0.3881 -0.2702 0.1536 CommunityB 0.1407 -0.2344 0.0197 CommunityC -0.2053 0.3566 -0.0219 CommunityD -0.3235 0.1480 -0.1514 HostCorylus 0.2644 -0.2523 0.0866 HostOstrya -0.2644 0.2523 -0.0866 Goodness of fit: r2 Pr(r) Community 0.1798 0.005994 ** Host 0.1581 0.000999 *** --- Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1 P values based on 1000 permutations. I got 10 sample units for each community data (40 in total) You said at the end : *I do wonder if you are not hitting the curse of dimensionality here?* Can you explain me what do you mean for hitting the curse of dimensionality if I am not so demanding... ... and then: *it would be nice to look at the ordination but how you do that I don't know.* I would be glad if you see the graphs of my ordinations, Can I send them to you? That would be great... let me know about that. I used to plot in this way: plot(NMS.1, type=n, dis= sp) ordisymbol(NMS.1, env.table, Host, legend=T) Anyway I have to admit that with 2 and at least 3 dimensions the points into the ordinantion plot are better separated in reasons to the data matrix, so what to do? better fittind of points ant bigger stress or the contrary? I think is enough, thank you so much for your help, I'll appreciate any comments! :) Gian And thank you all for the kind responses... I do not want to torture myself for sure... :) I red (lot of) publications about fungal community ecology studies (soil fungi), my research field indeed, and all uses NMDS or DCA as ordination techniques... So, I am only
Re: [R-sig-eco] pollination experiment with missing value
Hi Thierry Thanks for the reply, you are right about lack of power. I appreciated the suggestion of doing lmer, however I was wondering if there is book out there explaining how to present the results of such analysis. I have Pinheiros and Bates and Crawle's R book, and althought they do a good job explaining how to perform these tests they don't explain how to present them on a scientific publication. I am more interested though in comparing some specific treatments, thus planned contrasts are also a possiblity. Does anyone know if reducing the number of post hoc comparisons by using planned constrasts will increase my power and if yes, how do I get to do these planed comparisons? Thanks Humberto ONKELINX, Thierry wrote: Dear Humberto, I suppose you are interessed in the significance of the treatment factor. You can test that effect by comparing models with and without the term. You can get the multiple comparisons with the multcomp package. Here is an example using the Pastes dataset. library(lme4) library(multcomp) data(Pastes) Model - lmer(strength ~ cask + (1|batch), data = Pastes) Model2 - lmer(strength ~ (1|batch), data = Pastes) anova(Model, Model2) glht(Model, linfct = mcp(cask = Tukey)) confint(glht(Model, linfct = mcp(cask = Tukey))) PS You have a lot of treatments to test with few replicates. Testing less treatment with more replicates would be a better design. I'm affraid that the power of your current design will be rather low. HTH, Thierry ir. Thierry Onkelinx Instituut voor natuur- en bosonderzoek team Biometrie Kwaliteitszorg Gaverstraat 4 9500 Geraardsbergen Belgium Research Institute for Nature and Forest team Biometrics Quality Assurance Gaverstraat 4 9500 Geraardsbergen Belgium tel. + 32 54/436 185 thierry.onkel...@inbo.be www.inbo.be To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher The plural of anecdote is not data. ~ Roger Brinner The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. ~ John Tukey -Oorspronkelijk bericht- Van: r-sig-ecology-boun...@r-project.org [mailto:r-sig-ecology-boun...@r-project.org] Namens Humberto Dutra Verzonden: dinsdag 8 december 2009 0:25 Aan: r-sig-ecology@r-project.org Onderwerp: [R-sig-eco] pollination experiment with missing value Dear subscribers I performed a quite simple pollination experiment: 12 plants 12 pollination treatments per plant, plant 1 had treatments a, b, c, d... etc plant 2 had treatments a, b, c, d... The idea was to run a simple anova using plants as blocks to see the effects of treatments on fruit production, but I lost some a few treatments in some plants, thus I have to deal with missing values First I tried the obvious, arc transformed fruits fruset.aov-aov(arcfruit~treatment+Error(Plant), na.action=na.omit, fruset) but I get the wrong DF because of the missing values, and I cannot perform the multiple comparisons test. Is there a better way to deal with this unbalanced design? I did a little bit more research and I decided to run a GLM for mixed effects using a binomial distribution y-cbind(fruits,nofruits) model-glmer(y ~ treatment+ (1|Plant), binomial,data=fruset) but them I cannot get the anova table anova(model,test=F) Error in anova(model, test = F) : single argument anova for GLMMs not yet implemented Some people suggested using Anova function on the package car, but I don't see how can I get it to work with a mixed effects model like this. Any suggestions are appreciated. Are there other straight forward ways to analyze such data, given the missing values, and multiple comparisons follow up? Thank you Humberto == 'Discipline - Success doesn't just happen. You have to be intentional about it, and that takes discipline.' - John Maxwell [[alternative HTML version deleted]] ___ R-sig-ecology mailing list R-sig-ecology@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-ecology Druk dit bericht a.u.b. niet onnodig af. Please do not print this message unnecessarily. Dit bericht en eventuele bijlagen geven enkel de visie van de schrijver weer en binden het INBO onder geen enkel beding, zolang dit bericht niet bevestigd is door een geldig ondertekend document. The views expressed in this message and any annex are purely those of the writer and may not be regarded as stating an official position of INBO, as long as the message is not confirmed by a duly signed document.
Re: [R-sig-eco] Fwd: how to calculate axis variance in metaMDS, pakage vegan?
On Mon, 2009-12-07 at 20:10 +0100, Gian Maria Niccolò Benucci wrote: Hi Gavin and Hi all, I will not go in front of a bus for sure, I not mad, at least I am not still mad... :) I would like to tell you that I am a Ph.D. student, and for what I know, Ph.D. student still have to understand things studing those from whom wrote before them... That's fine, but temper that with a realisation that not everyone knows what they are doing numerically. So be critical about what you read, learn about the methods and what they do. snip / Anyway... For continuing the brainstorm if I can...The Host effect is what I think is more interesting for the ecological point of view of my trials also becasue the 4 communities have two by two the same host, I mean A and B, Corylus, while B and C, Ostrya... If I plot the factors of the envifit into the graph and the evidence of separation seems clear... That's are my metaMDS with 2 and 3 dimensions: Thanks for these: one way of trying to choose a dimensionality for the solution is to plot the stress as a function of k (k on the x-axis, stress on the y) - this is often called a screeplot as you are looking for a dramatic change in slope. I took your stresses and plotted them against k (crudely): plot(2:4, c(24.54342, 16.29226, 11.68632), type = b) and doesn't seem to be any noticeable change here, so not much help there. Looking at the goodness of fit stats, the story they tell doesn't really change much depending on whether you use 2,3, or 4 dimensions. So perhaps stick with 2 in that case. Also, try: stressplot(MOD) where mod is the object returned by metaMDS. The stressplot plots your original dissimilarities against dissimilarities derived from the nMDS configuration. It also shows the monotonic regression fit and a few goodness of fit criteria. You could evaluate the models with different k using these plots. NMS.1 Call: metaMDS(comm = sqrtABCD, distance = bray, k = 2, trymax = 100, autotransform = F) snip / I got 10 sample units for each community data (40 in total) You said at the end : *I do wonder if you are not hitting the curse of dimensionality here?* Can you explain me what do you mean for hitting the curse of dimensionality if I am not so demanding... Yep, sorry, that was a bit cryptic. Curse of dimensionality is a phrase coined by Belman (1961) and refers to the problem of defining localness in high dimensions; neighbourhoods with a fixed number of samples become less local as the number of dimensions in creases. basically, if you have a number of dimensions, the more dimensions you have the easier it is for a sample to lie a long way from the rest of the data along a single dimension and thus have large dissimilarity. This doesn't appear to be the case here though; 4 is low dimensionality (hence my wondering if this was or wasn't a problem), but when you'd only shown the k=4 data, I did wonder if the low r2 was due to you points being widely spread along one of the 4D; i.e. was the more complex solution leading to the low r2? By looks of things, the low r2 is probably more to do with the small, but significant, effects of your two covariates. ... and then: *it would be nice to look at the ordination but how you do that I don't know.* I would be glad if you see the graphs of my ordinations, Can I send them to you? That would be great... let me know about that. I used to plot in this way: plot(NMS.1, type=n, dis= sp) ordisymbol(NMS.1, env.table, Host, legend=T) Anyway I have to admit that with 2 and at least 3 dimensions the points into the ordinantion plot are better separated in reasons to the data matrix, so what to do? better fittind of points ant bigger stress or the contrary? If this were me, seeing as the interpretation/results don't change, I'd probably stick with k=2 so you can easily draw the ordination for presentation in your phd work or future papers. HTH G I think is enough, thank you so much for your help, I'll appreciate any comments! :) Gian And thank you all for the kind responses... I do not want to torture myself for sure... :) I red (lot of) publications about fungal community ecology studies (soil fungi), my research field indeed, and all uses NMDS or DCA as ordination techniques... So, I am only trying to do my best useing R for calculating them... Would you walk in front of a bus if you saw lots of other people doing it? I doubt it. This kind of me to attitude to science is quite demoralising when reviewing manuscripts and reading the literature. DCA was invented to solve a specific problem with CA - namely the arch artefact. I forget whether this is in Jari's public lecture notes, vegan vignettes/tutorials or in one of his lectures on a course we taught together, but DCA replaces the arch artefact with other artefacts that make the points look like a trumpet or a diamond in ordination space. Why DCA is used
Re: [R-sig-eco] Fwd: how to calculate axis variance in metaMDS, pakage vegan?
Hi Gian and others, I think we better stop worrying about subjective interpretations of emotional backgrounds of what in other aspects are absolutely helpful discussion threads... I guess part of the challenge on this mailing list is to span the whole range of expertise with useful discussion/output/help for everyone, be it a student or an expert. I found this mailing list very helpful many times for my own questions, but also very informative when just following the threads on other questions... Gian, in my opinion, 2 dimensions are absolutely ok, especially if they do visualize an (obvious) effect in your study. In other words, if 2 dimensions show you an effect of Host but not of Area, the effect is obviously strong enough. Then I would not worry about stress too much. However, there may still be an effect of Area, maybe visible in more dimensions, but it´s obviously of minor importance. I personally like a combination of NMDS with the permutational MANOVA approach (by Marti Anderson) implemented in the function adonis() in vegan. You can use the same dissimilarity measure (Bray-Curtis) used for the NMDS and can test the Area vs. the Host effect on parasite (was it?) composition. I think that could be a very useful complement to an NMDS-derived ordination plot and then you may also regard high-stress representations (and that´s what all the low-dimensional ordination plots really ARE!) in a different light. Complementations like the permanova are in my opinion better than trying the full spectrum of ordination methods until finally some kind of pattern gets uncovered (comes quite close to the much too often encountered data-fishing expeditions). And though copying analysis strategies is probably not quite like throwing yourself in front of a bus, there is some benefit in using what people working in a specific field regard their standard methods (wait for the reviews to discover this). In any case, a responsible choice for a type of analysis is oriented along the study design and the data at hand. cheers, gabriel -- Dr. Gabriel Singer Department of Freshwater Ecology - University of Vienna and Wassercluster Lunz Biologische Station GmbH +43-(0)664-1266747 gabriel.sin...@univie.ac.at Gian Maria Niccolò Benucci wrote: Hi Gavin and Hi all, I will not go in front of a bus for sure, I not mad, at least I am not still mad... :) I would like to tell you that I am a Ph.D. student, and for what I know, Ph.D. student still have to understand things studing those from whom wrote before them... Isac Newton became famous not only for his science but also for a famous phrase that, if I don't remember it bad, act like this : If I have seen so much far away is because I stand on shoulders of Giants... I think that it needs any comment, and express itself the concept... So, I am so sorry, I also don't like the me to attitude, but you don't know how is my reality here, and I can assure you that also If I am still a student, I am alone in my research, and If have a tutor and boss for italian rules I don't have a boss for statistics, couse none could help me on that... So what could I do if I don't take models in already published literature? Anyway, I don't want to seem like the victim, I have a brain that works and I am doing my best to understand and improve my knowledge and at least lean and grow, for sure, step by step, and with a big humility, in science and in this case in statistics... Anyway... For continuing the brainstorm if I can...The Host effect is what I think is more interesting for the ecological point of view of my trials also becasue the 4 communities have two by two the same host, I mean A and B, Corylus, while B and C, Ostrya... If I plot the factors of the envifit into the graph and the evidence of separation seems clear... That's are my metaMDS with 2 and 3 dimensions: NMS.1 Call: metaMDS(comm = sqrtABCD, distance = bray, k = 2, trymax = 100, autotransform = F) Nonmetric Multidimensional Scaling using isoMDS (MASS package) Data: sqrtABCD Distance: bray shortest Dimensions: 2 Stress: 24.54342 Two convergent solutions found after 18 tries Scaling: centring, PC rotation, halfchange scaling Species: expanded scores based on ‘sqrtABCD’ NMS.ABCD.2ef ***FACTORS: Centroids: NMDS1 NMDS2 CommunityA -0.3271 0.1984 CommunityB -0.1956 0.1768 CommunityC 0.2520 -0.2847 CommunityD 0.2706 -0.0905 HostCorylus -0.2613 0.1876 HostOstrya 0.2613 -0.1876 Goodness of fit: r2 Pr(r) Community 0.1897 0.017982 * Host 0.1778 0.001998 ** --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 P values based on 1000 permutations. NMS.1.3 Call: metaMDS(comm = sqrtABCD, distance = bray, k = 3, trymax = 100, autotransform = F) Nonmetric Multidimensional Scaling using isoMDS (MASS package) Data: sqrtABCD Distance: bray shortest Dimensions: 3 Stress: 16.29226 Two