Re: [R-sig-phylo] PGLS vs lm

Theodore Garland Jr Fri, 26 Jul 2013 15:43:57 -0700

Hi Tom,

So far I have resisted jumping in here, but maybe this will help.
Come up with a model for how you think your traits of interest might evolve 
together in a correlated fashion along a phylogenetic tree.
Now implement it in a computer simulation along a phylogenetic tree.
Also implement the model with no correlation between the traits.
Analyze the data with whatever methods you choose.
Check the Type I error rate and then the power of each method.  Also check the 
bias and means squared error for the parameter you are trying to estimate.
See what method works best.
Use that method for your data if you have some confidence that the model you 
used to simulate trait evolution is reasonable, based on your understanding 
(and intuition) about the biology involved.


Lots of us have done this sort of thing, e.g., check this:

Martins, E. P., and T. Garland, Jr. 1991. Phylogenetic analyses of the 
correlated evolution of continuous characters: a simulation study. Evolution 
45:534-557.

Cheers,
Ted

Theodore Garland, Jr., Professor
Department of Biology
University of California, Riverside
Riverside, CA 92521
Office Phone:  (951) 827-3524
Wet Lab Phone:  (951) 827-5724
Dry Lab Phone:  (951) 827-4026
Home Phone:  (951) 328-0820
Skype:  theodoregarland
Facsimile:  (951) 827-4286 = Dept. office (not confidential)
Email:  tgarl...@ucr.edu
http://www.biology.ucr.edu/people/faculty/Garland.html
http://scholar.google.com/citations?hl=en&user=iSSbrhwAAAAJ

Inquiry-based Middle School Lesson Plan:
"Born to Run: Artificial Selection Lab"
http://www.indiana.edu/~ensiweb/lessons/BornToRun.html

________________________________
From: r-sig-phylo-boun...@r-project.org [r-sig-phylo-boun...@r-project.org] on 
behalf of Tom Schoenemann [t...@indiana.edu]
Sent: Friday, July 26, 2013 3:21 PM
To: Tom Schoenemann
Cc: r-sig-phylo@r-project.org
Subject: Re: [R-sig-phylo] PGLS vs lm

OK, so I haven't gotten any responses that convince me that PGLS isn't 
biologically suspect. At the risk of thinking out loud to myself here, I wonder 
if my finding might have to do with the method detecting phylogenetic signal in 
the error (residuals?):

From:
Revell, L. J. (2010). Phylogenetic signal and linear regression on species 
data. Methods in Ecology and Evolution, 1(4), 319-329.

I note the following: "...the suitability of a phylogenetic regression should 
actually be diagnosed by estimating phylogenetic signal in the residual 
deviations of Y given our predictors (X1, X2, etc.)."

Let's say one variable, "A", has a strong evolutionary signal, but the other, 
variable "B", does not. Would we expect this to affect a PGLS differently if we 
use A to predict B, vs. using B to predict A?

If so, it would explain my findings. However, given the difference, I can have 
no confidence that there is, or is not, a significant covariance between A and 
B independent of phylogeny. Doesn't this finding call into question the method 
itself?

More directly, how is one to interpret such a finding? Is there, or is there 
not, a significant biological association?

-Tom


On Jul 21, 2013, at 11:47 PM, Tom Schoenemann <t...@indiana.edu> wrote:

> Thanks Liam,
>
> A couple of questions:
>
> How does one do a hypothesis test on a regression, controlling for phylogeny, 
> if not using PGLS as I am doing?  I realize one could use independent 
> contrasts, though I was led to believe that is equivalent to a PGLS with 
> lambda = 1.
>
> I take it from what you wrote that the PGLS in caper does a ML of lambda only 
> on y, when doing the regression? Isn't this patently wrong, biologically 
> speaking? Phylogenetic effects could have been operating on both x and y - we 
> can't assume that it would only be relevant to y. Shouldn't phylogenetic 
> methods account for both?
>
> You say you aren't sure it is a good idea to jointly optimize lambda for x & 
> y.  Can you expand on this?  What would be a better solution (if there is 
> one)?
>
> Am I wrong that it makes no evolutionary biological sense to use a method 
> that gives different estimates of the probability of a relationship based on 
> the direction in which one looks at the relationship? Doesn't the fact that 
> the method gives different answers in this way invalidate the method for 
> taking phylogeny into account when assessing relationships among biological 
> taxa?  How could it be biologically meaningful for phylogeny to have a 
> greater influence when x is predicting y, than when y is predicting x?  Maybe 
> I'm missing something here.
>
> -Tom
>
>
> On Jul 21, 2013, at 8:59 PM, Liam J. Revell <liam.rev...@umb.edu> wrote:
>
>> Hi Tom.
>>
>> Joe pointed out that if we assume that our variables are multivariate 
>> normal, then a hypothesis test on the regression is the same as a test that 
>> cov(x,y) is different from zero.
>>
>> If you insist on using lambda, one logical extension to this might be to 
>> jointly optimize lambda for x & y (following Freckleton et al. 2002) and 
>> then fix the value of lambda at its joint MLE during GLS. This would at 
>> least have the property of guaranteeing that the P-values for y~x and x~y 
>> are the same....
>>
>> I previously posted code for joint estimation of lambda on my blog here: 
>> http://blog.phytools.org/2012/09/joint-estimation-of-pagels-for-multiple.html.
>>
>> With this code to fit joint lambda, our analysis would then look something 
>> like this:
>>
>> require(phytools)
>> require(nlme)
>> lambda<-joint.lambda(tree,cbind(x,y))$lambda
>> fit1<-gls(y~x,data=data.frame(x,y),correlation=corPagel(lambda,tree,fixed=TRUE))
>> fit2<-gls(x~y,data=data.frame(x,y),correlation=corPagel(lambda,tree,fixed=TRUE))
>>
>> I'm not sure that this is a good idea - but it is possible....
>>
>> - Liam
>>
>> Liam J. Revell, Assistant Professor of Biology
>> University of Massachusetts Boston
>> web: http://faculty.umb.edu/liam.revell/
>> email: liam.rev...@umb.edu
>> blog: http://blog.phytools.org
>>
>> On 7/21/2013 6:15 PM, Tom Schoenemann wrote:
>>> Hi all,
>>>
>>> I'm still unsure of how I should interpret results given that using PGLS
>>> to predict group size from brain size gives different significance
>>> levels and lambda estimates than when I do the reverse (i.e., predict
>>> brain size from group size).  Biologically, I don't think this makes any
>>> sense.  If lambda is an estimate of the phylogenetic signal, what
>>> possible evolutionary and biological sense are we to make if the
>>> estimates of lambda are significantly different depending on which way
>>> the association is assessed? I understand the mathematics may allow
>>> this, but if I can't make sense of this biologically, then doesn't it
>>> call into question the use of this method for these kinds of questions
>>> in the first place?  What am I missing here?
>>>
>>> Here is some results from data I have that illustrate this (notice that
>>> the lambda values are significantly different from each other):
>>>
>>> Group size predicted by brain size:
>>>
>>>> model.group.by.brain<-pgls(log(GroupSize) ~ log(AvgBrainWt), data = 
>>>> primate_tom, lambda='ML')
>>>> summary(model.group.by.brain)
>>>
>>> Call:
>>> pgls(formula = log(GroupSize) ~ log(AvgBrainWt), data = primate_tom,
>>>    lambda = "ML")
>>>
>>> Residuals:
>>>     Min       1Q   Median       3Q      Max
>>> -0.27196 -0.07638  0.00399  0.10107  0.43852
>>>
>>> Branch length transformations:
>>>
>>> kappa  [Fix]  : 1.000
>>> lambda [ ML]  : 0.759
>>>   lower bound : 0.000, p = 4.6524e-08
>>>   upper bound : 1.000, p = 2.5566e-10
>>>   95.0% CI   : (0.485, 0.904)
>>> delta  [Fix]  : 1.000
>>>
>>> Coefficients:
>>>                 Estimate Std. Error t value Pr(>|t|)
>>> (Intercept)     -0.080099   0.610151 -0.1313 0.895825
>>> log(AvgBrainWt)  0.483366   0.136694  3.5361 0.000622 ***
>>> ---
>>> Signif. codes:  0 ï¿½***ï¿½ 0.001 ï¿½**ï¿½ 0.01 ï¿½*ï¿½ 0.05 ï¿½.ï¿½ 0.1 
>>> ï¿½ ï¿½ 1
>>>
>>> Residual standard error: 0.1433 on 98 degrees of freedom
>>> Multiple R-squared: 0.1132,     Adjusted R-squared: 0.1041
>>> F-statistic:  12.5 on 2 and 98 DF,  p-value: 1.457e-05
>>>
>>>
>>> Brain size predicted by group size:
>>>
>>>> model.brain.by.group<-pgls(log(AvgBrainWt) ~ log(GroupSize), data = 
>>>> primate_tom, lambda='ML')
>>>> summary(model.brain.by.group)
>>>
>>> Call:
>>> pgls(formula = log(AvgBrainWt) ~ log(GroupSize), data = primate_tom,
>>>    lambda = "ML")
>>>
>>> Residuals:
>>>     Min       1Q   Median       3Q      Max
>>> -0.38359 -0.08216  0.00902  0.05609  0.27443
>>>
>>> Branch length transformations:
>>>
>>> kappa  [Fix]  : 1.000
>>> lambda [ ML]  : 1.000
>>>   lower bound : 0.000, p = < 2.22e-16
>>>   upper bound : 1.000, p = 1
>>>   95.0% CI   : (0.992, NA)
>>> delta  [Fix]  : 1.000
>>>
>>> Coefficients:
>>>               Estimate Std. Error t value  Pr(>|t|)
>>> (Intercept)    2.740932   0.446943  6.1326 1.824e-08 ***
>>> log(GroupSize) 0.050780   0.043363  1.1710    0.2444
>>> ---
>>> Signif. codes:  0 ï¿½***ï¿½ 0.001 ï¿½**ï¿½ 0.01 ï¿½*ï¿½ 0.05 ï¿½.ï¿½ 0.1 
>>> ï¿½ ï¿½ 1
>>>
>>> Residual standard error: 0.122 on 98 degrees of freedom
>>> Multiple R-squared: 0.0138,     Adjusted R-squared: 0.003737
>>> F-statistic: 1.371 on 2 and 98 DF,  p-value: 0.2586
>>>
>>>
>>> On Jul 14, 2013, at 6:18 AM, Emmanuel Paradis <emmanuel.para...@ird.fr>
>>> wrote:
>>>
>>>> Hi all,
>>>>
>>>> I would like to react a bit on this issue.
>>>>
>>>> Probably one problem is that the distinction "correlation vs. regression" 
>>>> is not the same for independent data and for phylogenetic data.
>>>>
>>>> Consider the case of independent observations first. Suppose we are 
>>>> interested in the relationship y = b x + a, where x is an environmental 
>>>> variable, say latitude. We can get estimates of b and a by moving to 10 
>>>> well-chosen locations, sampling 10 observations  of y (they are 
>>>> independent) and analyse the 100 data points with OLS.
>>> Here we cannot say anything about the correlation between x and y
>>> because we controlled the distribution of x. In practice, even if x is
>>> not controlled, this approach is still valid as long as the observations
>>> are independent.
>>>>
>>>> With phylogenetic data, x is not controlled if it is measured "on the 
>>>> species" -- in other words it's an evolving trait (or intrinsic variable). 
>>>> x may be controlled if it is measured "outside the species" (extrinsic 
>>>> variable) such as latitude. So the case  of using regression or 
>>>> correlation is not the same than above.
>>> Combining intrinsic and extinsic variables has generated a lot of debate
>>> in the literature.
>>>>
>>>> I don't think it's a problem of using a method and not another, but rather 
>>>> to use a method keeping in mind what it does (and its assumptions). 
>>>> Apparently, Hansen and Bartoszek consider a range of models including 
>>>> regression models where, by contrast to GLS,  the evolution of the 
>>>> predictors is modelled explicitly.
>>>>
>>>> If we want to progress in our knowledge on how evolution works, I think we 
>>>> have to not limit ourselves to assess whether there is a relationship, but 
>>>> to test more complex models. The case presented by Tom is particularly 
>>>> relevant here (at least to me): testing  whether group size affects brain 
>>>> size or the opposite (or both) is an
>>> important question. There's been also a lot of debate whether
>>> comparative data can answer this question. Maybe what we need here is an
>>> approach based on simultaneous equations (aka structural equation
>>> models), but I'm not aware whether this exists in a phylogenetic
>>> framework. The approach by Hansen and Bartoszek could be a step in this
>>> direction.
>>>>
>>>> Best,
>>>>
>>>> Emmanuel
>>>>
>>>> Le 13/07/2013 02:59, Joe Felsenstein a ï¿½crit :
>>>>>
>>>>> Tom Schoenemann asked me:
>>>>>
>>>>>> With respect to your crankiness, is this the paper by Hansen that you 
>>>>>> are referring to?:
>>>>>>
>>>>>> Bartoszek, K., Pienaar, J., Mostad, P., Andersson, S., & Hansen, T. F. 
>>>>>> (2012). A phylogenetic comparative method for studying multivariate 
>>>>>> adaptation. Journal of Theoretical Biology, 314(0), 204-215.
>>>>>>
>>>>>> I wrote Bartoszek to see if I could get his R code to try the method 
>>>>>> mentioned in there. If I can figure out how to apply it to my data, that 
>>>>>> will be great. I agree that it is clearly a mistake to assume one 
>>>>>> variable is responding evolutionarily only to  the current value of the 
>>>>>> other (predictor variables).
>>>>>
>>>>> I'm glad to hear that *somebody* here thinks it is a mistake (because it 
>>>>> really is).  I keep mentioning it here, and Hansen has published 
>>>>> extensively on it, but everyone keeps saying "Well, my friend used it, 
>>>>> and he got tenure, so it must be OK".
>>>>>
>>>>> The paper I saw was this one:
>>>>>
>>>>> Hansen, Thomas F & Bartoszek, Krzysztof (2012). Interpreting the 
>>>>> evolutionary regression: The interplay between observational and 
>>>>> biological errors in phylogenetic comparative studies. Systematic Biology 
>>>>>  61 (3): 413-425.  ISSN 1063-5157.
>>>>>
>>>>> J.F.
>>>>> ----
>>>>> Joe Felsenstein         j...@gs.washington.edu
>>>>> Department of Genome Sciences and Department of Biology,
>>>>> University of Washington, Box 355065, Seattle, WA 98195-5065 USA
>>>>>
>>>>> _______________________________________________
>>>>> R-sig-phylo mailing list - R-sig-phylo@r-project.org
>>>>> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
>>>>> Searchable archive 
>>>>> athttp://www.mail-archive.com/r-sig-phylo@r-project.org/
>>>>>
>>>
>>> _________________________________________________
>>> P. Thomas Schoenemann
>>>
>>> Associate Professor
>>> Department of Anthropology
>>> Cognitive Science Program
>>> Indiana University
>>> Bloomington, IN  47405
>>> Phone: 812-855-8800
>>> E-mail: t...@indiana.edu
>>>
>>> Open Research Scan Archive (ORSA) Co-Director
>>> Consulting Scholar
>>> Museum of Archaeology and Anthropology
>>> University of Pennsylvania
>>>
>>> http://www.indiana.edu/~brainevo
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>        [[alternative HTML version deleted]]
>>>
>>> _______________________________________________
>>> R-sig-phylo mailing list - R-sig-phylo@r-project.org
>>> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
>>> Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
>>
>
> _________________________________________________
> P. Thomas Schoenemann
>
> Associate Professor
> Department of Anthropology
> Cognitive Science Program
> Indiana University
> Bloomington, IN  47405
> Phone: 812-855-8800
> E-mail: t...@indiana.edu
>
> Open Research Scan Archive (ORSA) Co-Director
> Consulting Scholar
> Museum of Archaeology and Anthropology
> University of Pennsylvania
>
> http://www.indiana.edu/~brainevo
>
>
>
>
>
>
>
>
>
>
>
>        [[alternative HTML version deleted]]
>
> _______________________________________________
> R-sig-phylo mailing list - R-sig-phylo@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
> Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/

_________________________________________________
P. Thomas Schoenemann

Associate Professor
Department of Anthropology
Cognitive Science Program
Indiana University
Bloomington, IN  47405
Phone: 812-855-8800
E-mail: t...@indiana.edu

Open Research Scan Archive (ORSA) Co-Director
Consulting Scholar
Museum of Archaeology and Anthropology
University of Pennsylvania

http://www.indiana.edu/~brainevo











        [[alternative HTML version deleted]]

_______________________________________________
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/

        [[alternative HTML version deleted]]

_______________________________________________
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/

Re: [R-sig-phylo] PGLS vs lm

Reply via email to