Re: [R-sig-eco] pollination experiment with missing value

2009-12-08 Thread ONKELINX, Thierry
Dear Humberto,

I suppose you are interessed in the significance of the treatment
factor. You can test that effect by comparing models with and without
the term. You can get the multiple comparisons with the multcomp
package. Here is an example using the Pastes dataset.

library(lme4)
library(multcomp)
data(Pastes)
Model - lmer(strength ~ cask + (1|batch), data = Pastes)
Model2 - lmer(strength ~ (1|batch), data = Pastes)
anova(Model, Model2)
glht(Model, linfct = mcp(cask = Tukey))
confint(glht(Model, linfct = mcp(cask = Tukey)))

PS You have a lot of treatments to test with few replicates. Testing
less treatment with more replicates would be a better design. I'm
affraid that the power of your current design will be rather low.

HTH,

Thierry



ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek
team Biometrie  Kwaliteitszorg
Gaverstraat 4
9500 Geraardsbergen
Belgium

Research Institute for Nature and Forest
team Biometrics  Quality Assurance
Gaverstraat 4
9500 Geraardsbergen
Belgium

tel. + 32 54/436 185
thierry.onkel...@inbo.be
www.inbo.be

To call in the statistician after the experiment is done may be no more
than asking him to perform a post-mortem examination: he may be able to
say what the experiment died of.
~ Sir Ronald Aylmer Fisher

The plural of anecdote is not data.
~ Roger Brinner

The combination of some data and an aching desire for an answer does not
ensure that a reasonable answer can be extracted from a given body of
data.
~ John Tukey

-Oorspronkelijk bericht-
Van: r-sig-ecology-boun...@r-project.org
[mailto:r-sig-ecology-boun...@r-project.org] Namens Humberto Dutra
Verzonden: dinsdag 8 december 2009 0:25
Aan: r-sig-ecology@r-project.org
Onderwerp: [R-sig-eco] pollination experiment with missing value

Dear subscribers
I performed a  quite simple pollination experiment: 
12 plants
12 pollination treatments per plant,
plant 1 had treatments a, b, c,
d... etc plant 2 had treatments a, b, c, d... 
The idea was to run a simple anova using plants as blocks to see the
effects of treatments on fruit production, but I lost some a few
treatments in some plants, thus I have to deal with missing values First
I tried the obvious, arc transformed fruits 
 fruset.aov-aov(arcfruit~treatment+Error(Plant),
na.action=na.omit, fruset)

but I get the wrong DF because of the missing values, and I cannot
perform the multiple comparisons test. Is there a better way to deal
with this unbalanced design?

I did a little bit more research and  I decided to run a GLM for mixed
effects using a binomial distribution
y-cbind(fruits,nofruits)
 model-glmer(y ~ treatment+ (1|Plant), binomial,data=fruset)

but them I cannot get the anova table 
 anova(model,test=F)
Error in anova(model, test = F) : 
  single argument anova for GLMMs not yet implemented

Some people suggested using Anova function on the package car, but I
don't see how can I get it to work with a mixed effects model like this.


Any suggestions are appreciated. Are there other straight forward ways
to analyze such data, given the missing values, and multiple comparisons
follow up?
Thank you

Humberto





==
'Discipline - Success doesn't just happen. You have to be intentional
about it, and that takes discipline.' - John Maxwell



  
[[alternative HTML version deleted]]

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology

Druk dit bericht a.u.b. niet onnodig af.
Please do not print this message unnecessarily.

Dit bericht en eventuele bijlagen geven enkel de visie van de schrijver weer 
en binden het INBO onder geen enkel beding, zolang dit bericht niet bevestigd is
door een geldig ondertekend document. The views expressed in  this message 
and any annex are purely those of the writer and may not be regarded as stating 
an official position of INBO, as long as the message is not confirmed by a duly 
signed document.

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] Fwd: how to calculate axis variance in metaMDS, pakage vegan?

2009-12-08 Thread Chris Habeck
Gian,,

I applaud your continued struggle to understand the best route of analysis.
 If I were you, I would ignore the rude comments made by some and focus on
the constructive replies from others.  Good luck!

Chris Habeck
Ph.D. Candidate
Department of Zoology
University of Wisconsin, Madison
http://habeckecology.wikispaces.com/


2009/12/7 Gian Maria Niccolò Benucci gian.benu...@gmail.com

 Hi Gavin and Hi all,

 I will not go in front of a bus for sure, I not mad, at least I am not
 still
 mad... :)

 I would like to tell you that I am a Ph.D. student, and for what I know,
 Ph.D. student still have to understand things studing those from whom wrote
 before them...

 Isac Newton became famous not only for his science but also for a famous
 phrase that, if I don't remember it bad, act like this : If I have seen so
 much far away is because I stand on shoulders of Giants... I think that it
 needs any comment, and express itself the concept...

 So, I am so sorry, I also don't like the me to attitude, but you don't
 know how is my reality here, and I can assure you that also If I am still a
 student, I am alone in my research, and If have a tutor and boss for
 italian rules I don't have a boss for statistics, couse none could help me
 on that...
 So what could I do if I don't take models in already published literature?

 Anyway, I don't want to seem like the victim, I have a brain that works and
 I am doing my best to understand and improve my knowledge and at least lean
 and grow, for sure, step by step, and with a big humility, in science and
 in
 this case in statistics...

 Anyway... For continuing the brainstorm if I can...The Host effect is what
 I
 think is more interesting for the ecological point of view of my trials
 also
 becasue the 4 communities have two by two the same host, I mean A and B,
 Corylus, while B and C, Ostrya...
 If I plot the factors of the envifit into the graph and the evidence of
 separation seems clear...

 That's are my metaMDS with 2 and 3 dimensions:

  NMS.1

 Call:
 metaMDS(comm = sqrtABCD, distance = bray, k = 2, trymax = 100,
 autotransform = F)

 Nonmetric Multidimensional Scaling using isoMDS (MASS package)

 Data: sqrtABCD
 Distance: bray shortest

 Dimensions: 2
 Stress: 24.54342
 Two convergent solutions found after 18 tries
 Scaling: centring, PC rotation, halfchange scaling
 Species: expanded scores based on ‘sqrtABCD’

  NMS.ABCD.2ef

 ***FACTORS:

 Centroids:
  NMDS1   NMDS2
 CommunityA  -0.3271  0.1984
 CommunityB  -0.1956  0.1768
 CommunityC   0.2520 -0.2847
 CommunityD   0.2706 -0.0905
 HostCorylus -0.2613  0.1876
 HostOstrya   0.2613 -0.1876

 Goodness of fit:
  r2   Pr(r)
 Community 0.1897 0.017982 *
 Host  0.1778 0.001998 **
 ---
 Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
 P values based on 1000 permutations.
  NMS.1.3

 Call:
 metaMDS(comm = sqrtABCD, distance = bray, k = 3, trymax = 100,
 autotransform = F)

 Nonmetric Multidimensional Scaling using isoMDS (MASS package)

 Data: sqrtABCD
 Distance: bray shortest

 Dimensions: 3
 Stress: 16.29226
 Two convergent solutions found after 6 tries
 Scaling: centring, PC rotation, halfchange scaling
 Species: expanded scores based on ‘sqrtABCD’

  NMS.ABCD.3ef

 ***FACTORS:

 Centroids:
  NMDS1   NMDS2   NMDS3
 CommunityA   0.3881 -0.2702  0.1536
 CommunityB   0.1407 -0.2344  0.0197
 CommunityC  -0.2053  0.3566 -0.0219
 CommunityD  -0.3235  0.1480 -0.1514
 HostCorylus  0.2644 -0.2523  0.0866
 HostOstrya  -0.2644  0.2523 -0.0866

 Goodness of fit:
  r2   Pr(r)
 Community 0.1798 0.005994 **
 Host  0.1581 0.000999 ***
 ---
 Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
 P values based on 1000 permutations.
 

 I got 10 sample units for each community data (40 in total)

 You said at the end : *I do wonder if you are not hitting the curse of
 dimensionality here?*

 Can you explain me what do you mean for hitting the curse of
 dimensionality if I am not so demanding...

 ... and then: *it would be nice to look at the ordination but how you do
 that I don't know.*

 I would be glad if you see the graphs of my ordinations, Can I send them to
 you? That would be great... let me know about that. I used to plot in this
 way:

  plot(NMS.1, type=n, dis= sp)
  ordisymbol(NMS.1, env.table, Host, legend=T)

 Anyway I have to admit that with 2 and at least 3 dimensions the points
 into
 the ordinantion plot are better separated in reasons to the data matrix, so
 what to do? better fittind of points ant bigger stress or the contrary?

 I think is enough, thank you so much for your help, I'll appreciate any
 comments! :)

 Gian






  And thank you all for the kind responses...
  
   I do not want to torture myself for sure... :) I red (lot of)
  publications
   about fungal community ecology studies (soil fungi), my research field
   indeed, and all uses NMDS or DCA as ordination techniques... So, I am
  only
   

Re: [R-sig-eco] pollination experiment with missing value

2009-12-08 Thread hpdutra

Hi Thierry
Thanks for the reply, you are right about lack of power. I appreciated the
suggestion of doing lmer, however I was wondering if there is book out there
explaining how to present the results of such analysis. I have Pinheiros and
Bates and Crawle's R book, and althought they do a good job explaining how
to perform these tests they don't explain how to present them on a
scientific publication.
I am more interested though in comparing some specific treatments, thus
planned contrasts are also a possiblity. Does anyone know if  reducing the
number of post hoc comparisons by using planned constrasts will increase my
power and if yes, how do I get to do these planed comparisons?
Thanks
Humberto


ONKELINX, Thierry wrote:
 
 Dear Humberto,
 
 I suppose you are interessed in the significance of the treatment
 factor. You can test that effect by comparing models with and without
 the term. You can get the multiple comparisons with the multcomp
 package. Here is an example using the Pastes dataset.
 
 library(lme4)
 library(multcomp)
 data(Pastes)
 Model - lmer(strength ~ cask + (1|batch), data = Pastes)
 Model2 - lmer(strength ~ (1|batch), data = Pastes)
 anova(Model, Model2)
 glht(Model, linfct = mcp(cask = Tukey))
 confint(glht(Model, linfct = mcp(cask = Tukey)))
 
 PS You have a lot of treatments to test with few replicates. Testing
 less treatment with more replicates would be a better design. I'm
 affraid that the power of your current design will be rather low.
 
 HTH,
 
 Thierry
 
 
 
 ir. Thierry Onkelinx
 Instituut voor natuur- en bosonderzoek
 team Biometrie  Kwaliteitszorg
 Gaverstraat 4
 9500 Geraardsbergen
 Belgium
 
 Research Institute for Nature and Forest
 team Biometrics  Quality Assurance
 Gaverstraat 4
 9500 Geraardsbergen
 Belgium
 
 tel. + 32 54/436 185
 thierry.onkel...@inbo.be
 www.inbo.be
 
 To call in the statistician after the experiment is done may be no more
 than asking him to perform a post-mortem examination: he may be able to
 say what the experiment died of.
 ~ Sir Ronald Aylmer Fisher
 
 The plural of anecdote is not data.
 ~ Roger Brinner
 
 The combination of some data and an aching desire for an answer does not
 ensure that a reasonable answer can be extracted from a given body of
 data.
 ~ John Tukey
 
 -Oorspronkelijk bericht-
 Van: r-sig-ecology-boun...@r-project.org
 [mailto:r-sig-ecology-boun...@r-project.org] Namens Humberto Dutra
 Verzonden: dinsdag 8 december 2009 0:25
 Aan: r-sig-ecology@r-project.org
 Onderwerp: [R-sig-eco] pollination experiment with missing value
 
 Dear subscribers
 I performed a  quite simple pollination experiment: 
 12 plants
 12 pollination treatments per plant,
 plant 1 had treatments a, b, c,
 d... etc plant 2 had treatments a, b, c, d... 
 The idea was to run a simple anova using plants as blocks to see the
 effects of treatments on fruit production, but I lost some a few
 treatments in some plants, thus I have to deal with missing values First
 I tried the obvious, arc transformed fruits 
 fruset.aov-aov(arcfruit~treatment+Error(Plant),
 na.action=na.omit, fruset)
 
 but I get the wrong DF because of the missing values, and I cannot
 perform the multiple comparisons test. Is there a better way to deal
 with this unbalanced design?
 
 I did a little bit more research and  I decided to run a GLM for mixed
 effects using a binomial distribution
y-cbind(fruits,nofruits)
 model-glmer(y ~ treatment+ (1|Plant), binomial,data=fruset)
 
 but them I cannot get the anova table 
 anova(model,test=F)
 Error in anova(model, test = F) : 
   single argument anova for GLMMs not yet implemented
 
 Some people suggested using Anova function on the package car, but I
 don't see how can I get it to work with a mixed effects model like this.
 
 
 Any suggestions are appreciated. Are there other straight forward ways
 to analyze such data, given the missing values, and multiple comparisons
 follow up?
 Thank you
 
 Humberto
 
 
 
 
 
 ==
 'Discipline - Success doesn't just happen. You have to be intentional
 about it, and that takes discipline.' - John Maxwell
 
 
 
   
   [[alternative HTML version deleted]]
 
 ___
 R-sig-ecology mailing list
 R-sig-ecology@r-project.org
 https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
 
 Druk dit bericht a.u.b. niet onnodig af.
 Please do not print this message unnecessarily.
 
 Dit bericht en eventuele bijlagen geven enkel de visie van de schrijver
 weer 
 en binden het INBO onder geen enkel beding, zolang dit bericht niet
 bevestigd is
 door een geldig ondertekend document. The views expressed in  this message 
 and any annex are purely those of the writer and may not be regarded as
 stating 
 an official position of INBO, as long as the message is not confirmed by a
 duly 
 signed document.
 
 

Re: [R-sig-eco] Fwd: how to calculate axis variance in metaMDS, pakage vegan?

2009-12-08 Thread Gavin Simpson
On Mon, 2009-12-07 at 20:10 +0100, Gian Maria Niccolò Benucci wrote:
 Hi Gavin and Hi all,
 
 I will not go in front of a bus for sure, I not mad, at least I am not still
 mad... :)
 
 I would like to tell you that I am a Ph.D. student, and for what I know,
 Ph.D. student still have to understand things studing those from whom wrote
 before them...

That's fine, but temper that with a realisation that not everyone knows
what they are doing numerically. So be critical about what you read,
learn about the methods and what they do.

snip /

 Anyway... For continuing the brainstorm if I can...The Host effect is what I
 think is more interesting for the ecological point of view of my trials also
 becasue the 4 communities have two by two the same host, I mean A and B,
 Corylus, while B and C, Ostrya...
 If I plot the factors of the envifit into the graph and the evidence of
 separation seems clear...
 
 That's are my metaMDS with 2 and 3 dimensions:

Thanks for these: one way of trying to choose a dimensionality for the
solution is to plot the stress as a function of k (k on the x-axis,
stress on the y) - this is often called a screeplot as you are looking
for a dramatic change in slope. I took your stresses and plotted them
against k (crudely):

plot(2:4, c(24.54342, 16.29226, 11.68632), type = b)

and doesn't seem to be any noticeable change here, so not much help
there.

Looking at the goodness of fit stats, the story they tell doesn't really
change much depending on whether you use 2,3, or 4 dimensions. So
perhaps stick with 2 in that case.

Also, try:

stressplot(MOD)

where mod is the object returned by metaMDS. The stressplot plots your
original dissimilarities against dissimilarities derived from the nMDS
configuration. It also shows the monotonic regression fit and a few
goodness of fit criteria. You could evaluate the models with different k
using these plots.

 
  NMS.1
 
 Call:
 metaMDS(comm = sqrtABCD, distance = bray, k = 2, trymax = 100,
 autotransform = F)
snip /
 I got 10 sample units for each community data (40 in total)
 
 You said at the end : *I do wonder if you are not hitting the curse of
 dimensionality here?*
 
 Can you explain me what do you mean for hitting the curse of
 dimensionality if I am not so demanding...

Yep, sorry, that was a bit cryptic. Curse of dimensionality is a phrase
coined by Belman (1961) and refers to the problem of defining
localness in high dimensions; neighbourhoods with a fixed number of
samples become less local as the number of dimensions in creases.
basically, if you have a number of dimensions, the more dimensions you
have the easier it is for a sample to lie a long way from the rest of
the data along a single dimension and thus have large dissimilarity.

This doesn't appear to be the case here though; 4 is low dimensionality
(hence my wondering if this was or wasn't a problem), but when you'd
only shown the k=4 data, I did wonder if the low r2 was due to you
points being widely spread along one of the 4D; i.e. was the more
complex solution leading to the low r2?

By looks of things, the low r2 is probably more to do with the small,
but significant, effects of your two covariates.

 
 ... and then: *it would be nice to look at the ordination but how you do
 that I don't know.*
 
 I would be glad if you see the graphs of my ordinations, Can I send them to
 you? That would be great... let me know about that. I used to plot in this
 way:
 
  plot(NMS.1, type=n, dis= sp)
  ordisymbol(NMS.1, env.table, Host, legend=T)
 
 Anyway I have to admit that with 2 and at least 3 dimensions the points into
 the ordinantion plot are better separated in reasons to the data matrix, so
 what to do? better fittind of points ant bigger stress or the contrary?

If this were me, seeing as the interpretation/results don't change, I'd
probably stick with k=2 so you can easily draw the ordination for
presentation in your phd work or future papers.

HTH

G

 
 I think is enough, thank you so much for your help, I'll appreciate any
 comments! :)
 
 Gian
 
 
 
 
 
 
  And thank you all for the kind responses...
  
   I do not want to torture myself for sure... :) I red (lot of)
  publications
   about fungal community ecology studies (soil fungi), my research field
   indeed, and all uses NMDS or DCA as ordination techniques... So, I am
  only
   trying to do my best useing R for calculating them...
 
  Would you walk in front of a bus if you saw lots of other people doing
  it? I doubt it. This kind of me to attitude to science is quite
  demoralising when reviewing manuscripts and reading the literature.
 
  DCA was invented to solve a specific problem with CA - namely the arch
  artefact. I forget whether this is in Jari's public lecture notes, vegan
  vignettes/tutorials or in one of his lectures on a course we taught
  together, but DCA replaces the arch artefact with other artefacts that
  make the points look like a trumpet or a diamond in ordination space.
 
  Why DCA is used 

Re: [R-sig-eco] Fwd: how to calculate axis variance in metaMDS, pakage vegan?

2009-12-08 Thread gabriel singer

Hi Gian and others,

I think we better stop worrying about subjective interpretations of 
emotional backgrounds of what in other aspects are absolutely helpful 
discussion threads... I guess part of the challenge on this mailing list 
is to span the whole range of expertise with useful 
discussion/output/help for everyone, be it a student or an expert. I 
found this mailing list very helpful many times for my own questions, 
but also very informative when just following the threads on other 
questions...


Gian, in my opinion, 2 dimensions are absolutely ok, especially if they 
do visualize an (obvious) effect in your study. In other words, if 2 
dimensions show you an effect of Host but not of Area, the effect is 
obviously strong enough. Then I would not worry about stress too much. 
However, there may still be an effect of Area, maybe visible in more 
dimensions, but it´s obviously of minor importance.


I personally like a combination of NMDS with the permutational MANOVA 
approach (by Marti Anderson) implemented in the function adonis() in 
vegan. You can use the same dissimilarity measure (Bray-Curtis) used for 
the NMDS and can test the Area vs. the Host effect on parasite (was 
it?) composition. I think that could be a very useful complement to an 
NMDS-derived ordination plot and then you may also regard high-stress 
representations (and that´s what all the low-dimensional ordination 
plots really ARE!) in a different light.


Complementations like the permanova are in my opinion better than trying 
the full spectrum of ordination methods until finally some kind of 
pattern gets uncovered (comes quite close to the much too often 
encountered data-fishing expeditions). And though copying analysis 
strategies is probably not quite like throwing yourself in front of a 
bus, there is some benefit in using what people working in a specific 
field regard their standard methods (wait for the reviews to discover 
this). In any case, a responsible choice for a type of analysis is 
oriented along the study design and the data at hand.


cheers, gabriel

--
Dr. Gabriel Singer
Department of Freshwater Ecology - University of Vienna
and Wassercluster Lunz Biologische Station GmbH
+43-(0)664-1266747
gabriel.sin...@univie.ac.at



Gian Maria Niccolò Benucci wrote:

Hi Gavin and Hi all,

I will not go in front of a bus for sure, I not mad, at least I am not still
mad... :)

I would like to tell you that I am a Ph.D. student, and for what I know,
Ph.D. student still have to understand things studing those from whom wrote
before them...

Isac Newton became famous not only for his science but also for a famous
phrase that, if I don't remember it bad, act like this : If I have seen so
much far away is because I stand on shoulders of Giants... I think that it
needs any comment, and express itself the concept...

So, I am so sorry, I also don't like the me to attitude, but you don't
know how is my reality here, and I can assure you that also If I am still a
student, I am alone in my research, and If have a tutor and boss for
italian rules I don't have a boss for statistics, couse none could help me
on that...
So what could I do if I don't take models in already published literature?

Anyway, I don't want to seem like the victim, I have a brain that works and
I am doing my best to understand and improve my knowledge and at least lean
and grow, for sure, step by step, and with a big humility, in science and in
this case in statistics...

Anyway... For continuing the brainstorm if I can...The Host effect is what I
think is more interesting for the ecological point of view of my trials also
becasue the 4 communities have two by two the same host, I mean A and B,
Corylus, while B and C, Ostrya...
If I plot the factors of the envifit into the graph and the evidence of
separation seems clear...

That's are my metaMDS with 2 and 3 dimensions:

  

NMS.1



Call:
metaMDS(comm = sqrtABCD, distance = bray, k = 2, trymax = 100,
autotransform = F)

Nonmetric Multidimensional Scaling using isoMDS (MASS package)

Data: sqrtABCD
Distance: bray shortest

Dimensions: 2
Stress: 24.54342
Two convergent solutions found after 18 tries
Scaling: centring, PC rotation, halfchange scaling
Species: expanded scores based on ‘sqrtABCD’

  

NMS.ABCD.2ef



***FACTORS:

Centroids:
  NMDS1   NMDS2
CommunityA  -0.3271  0.1984
CommunityB  -0.1956  0.1768
CommunityC   0.2520 -0.2847
CommunityD   0.2706 -0.0905
HostCorylus -0.2613  0.1876
HostOstrya   0.2613 -0.1876

Goodness of fit:
  r2   Pr(r)
Community 0.1897 0.017982 *
Host  0.1778 0.001998 **
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
P values based on 1000 permutations.
  

NMS.1.3



Call:
metaMDS(comm = sqrtABCD, distance = bray, k = 3, trymax = 100,
autotransform = F)

Nonmetric Multidimensional Scaling using isoMDS (MASS package)

Data: sqrtABCD
Distance: bray shortest

Dimensions: 3
Stress: 16.29226
Two