Re: [R-sig-phylo] PGLS vs lm

2013-07-26 Thread Tom Schoenemann
Thanks for the suggestions. I'll see if I can implement them.

However, I'm curious if anyone can address my specific questions: Does it make 
biological sense for one variable "A" to predict another "B" significantly, but 
for "B" not to predict "A" significantly?

-Tom
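
On the narrower statistical point, a short derivation may help. It is a sketch under the assumption that the among-species error covariance is known up to a constant, \sigma^2 V with V = L L^T, and that the same V is used in both directions. The symbols L, u, and r are introduced here for illustration and do not come from the thread.

```latex
\tilde{y} = L^{-1} y, \qquad \tilde{x} = L^{-1} x, \qquad u = L^{-1}\mathbf{1}

% r: partial correlation of \tilde{x} and \tilde{y} after projecting out u
t_{y \sim x} \;=\; \frac{r\,\sqrt{n-2}}{\sqrt{1-r^{2}}} \;=\; t_{x \sim y}
```

Because r is symmetric in x and y, the two slope tests give the same P-value whenever V is shared. If V is instead estimated from the residuals of each regression separately (for example, a different lambda for B ~ A than for A ~ B), the transformed variables differ between directions and the two tests need not agree.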


Re: [R-sig-phylo] PGLS vs lm

2013-07-26 Thread Theodore Garland Jr
Hi Tom,

So far I have resisted jumping in here, but maybe this will help.
Come up with a model for how you think your traits of interest might evolve 
together in a correlated fashion along a phylogenetic tree.
Now implement it in a computer simulation along a phylogenetic tree.
Also implement the model with no correlation between the traits.
Analyze the data with whatever methods you choose.
Check the Type I error rate and then the power of each method.  Also check the 
bias and mean squared error for the parameter you are trying to estimate.
See what method works best.
Use that method for your data if you have some confidence that the model you 
used to simulate trait evolution is reasonable, based on your understanding 
(and intuition) about the biology involved.

Lots of us have done this sort of thing, e.g., check this:

Martins, E. P., and T. Garland, Jr. 1991. Phylogenetic analyses of the 
correlated evolution of continuous characters: a simulation study. Evolution 
45:534-557.
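
The Type I error step of the recipe above might be sketched as follows. This is a minimal illustration, not the procedure of Martins and Garland (1991): it assumes a random coalescent tree, two traits evolving independently by Brownian motion, and a comparison of ordinary lm() against PGLS with a Brownian-motion correlation structure. The names n_tips, n_sims, and sim_once() are made up for the example.

```r
library(ape)   # rcoal(), vcv(), corBrownian()
library(nlme)  # gls()

set.seed(1)
n_tips <- 30
n_sims <- 200
tree <- rcoal(n_tips)
C <- vcv(tree)        # expected covariance among tips under Brownian motion
L <- t(chol(C))       # C = L %*% t(L); rows follow tree$tip.label

# One replicate under the null: x and y evolve independently along the tree
sim_once <- function() {
  dat <- data.frame(x = as.vector(L %*% rnorm(n_tips)),
                    y = as.vector(L %*% rnorm(n_tips)),
                    row.names = tree$tip.label)   # rows in tip-label order
  p_ols <- summary(lm(y ~ x, data = dat))$coefficients["x", "Pr(>|t|)"]
  fit <- gls(y ~ x, data = dat, correlation = corBrownian(1, tree))
  p_pgls <- summary(fit)$tTable["x", "p-value"]
  c(ols = p_ols, pgls = p_pgls)
}

p <- replicate(n_sims, sim_once())   # 2 x n_sims matrix of P-values
type1 <- rowMeans(p < 0.05)          # share of false positives per method
print(type1)
```

Power, bias, and mean squared error can be checked the same way by simulating y with a known nonzero slope on x and summarizing the estimated slopes across replicates.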

Cheers,
Ted

Theodore Garland, Jr., Professor
Department of Biology
University of California, Riverside
Riverside, CA 92521
Office Phone:  (951) 827-3524
Wet Lab Phone:  (951) 827-5724
Dry Lab Phone:  (951) 827-4026
Home Phone:  (951) 328-0820
Skype:  theodoregarland
Facsimile:  (951) 827-4286 = Dept. office (not confidential)
Email:  tgarl...@ucr.edu
http://www.biology.ucr.edu/people/faculty/Garland.html
http://scholar.google.com/citations?hl=en&user=iSSbrhwJ

Inquiry-based Middle School Lesson Plan:
"Born to Run: Artificial Selection Lab"
http://www.indiana.edu/~ensiweb/lessons/BornToRun.html



Re: [R-sig-phylo] PGLS vs lm

2013-07-26 Thread Tom Schoenemann
OK, so I haven't gotten any responses that convince me that PGLS isn't 
biologically suspect. At the risk of thinking out loud to myself here, I wonder 
if my finding might have to do with the method detecting phylogenetic signal in 
the error (residuals?):

From:
Revell, L. J. (2010). Phylogenetic signal and linear regression on species 
data. Methods in Ecology and Evolution, 1(4), 319-329.

I note the following: "...the suitability of a phylogenetic regression should 
actually be diagnosed by estimating phylogenetic signal in the residual 
deviations of Y given our predictors (X1, X2, etc.)."

Let's say one variable, "A", has a strong evolutionary signal, but the other, 
variable "B", does not. Would we expect this to affect a PGLS differently if we 
use A to predict B, vs. using B to predict A?  

If so, it would explain my findings. However, given the difference, I can have 
no confidence that there is, or is not, a significant covariance between A and 
B independent of phylogeny. Doesn't this finding call into question the method 
itself?
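
One way to see how direction can matter is a small simulation. This is a minimal sketch, not anyone's analysis from the thread: the tree, the traits A and B, the helper slope_t(), and the lambda values 0.1, 0.5, and 0.9 are all made up for illustration, and lambda is fixed by hand rather than estimated.

```r
library(ape)   # rcoal(), vcv(), corPagel()
library(nlme)  # gls()

set.seed(42)
n <- 40
tree <- rcoal(n)
L <- t(chol(vcv(tree)))   # vcv(tree) = L %*% t(L); rows follow tree$tip.label

# A evolves by Brownian motion (strong signal); B is mostly unstructured noise
A <- as.vector(L %*% rnorm(n))
B <- 0.3 * A + rnorm(n)
dat <- data.frame(A = A, B = B, row.names = tree$tip.label)

# t-value of the slope from a PGLS with lambda held fixed (hypothetical helper)
slope_t <- function(model, lambda) {
  fit <- gls(model, data = dat,
             correlation = corPagel(lambda, tree, fixed = TRUE))
  unname(summary(fit)$tTable[2, "t-value"])
}

# Same lambda in both directions: the two t-values agree
t_shared <- c(B_on_A = slope_t(B ~ A, 0.5), A_on_B = slope_t(A ~ B, 0.5))

# A different lambda in each direction: the t-values need not agree
t_split <- c(B_on_A = slope_t(B ~ A, 0.1), A_on_B = slope_t(A ~ B, 0.9))

print(rbind(t_shared, t_split))
```

In this sketch the asymmetry comes entirely from using a different error correlation structure in each direction, which is what happens when lambda is estimated from the residuals of each regression separately.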

More directly, how is one to interpret such a finding? Is there, or is there 
not, a significant biological association?

-Tom


On Jul 21, 2013, at 11:47 PM, Tom Schoenemann wrote:

> Thanks Liam,
> 
> A couple of questions: 
> 
> How does one do a hypothesis test on a regression, controlling for phylogeny, 
> if not using PGLS as I am doing?  I realize one could use independent 
> contrasts, though I was led to believe that is equivalent to a PGLS with 
> lambda = 1.  
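
The equivalence mentioned above can be checked numerically. The following is a minimal sketch on simulated data (the tree and the traits x and y are invented for illustration): contrasts regressed through the origin and a PGLS with a Brownian-motion correlation structure should give the same slope.

```r
library(ape)   # rcoal(), vcv(), pic(), corBrownian()
library(nlme)  # gls()

set.seed(7)
n <- 25
tree <- rcoal(n)          # random binary tree, as pic() requires
L <- t(chol(vcv(tree)))   # rows follow tree$tip.label

# Simulated traits, named by tip label so pic() can match them to the tree
x <- setNames(as.vector(L %*% rnorm(n)), tree$tip.label)
y <- setNames(0.5 * x + as.vector(L %*% rnorm(n)), tree$tip.label)

# Independent contrasts, regressed through the origin
fit_pic <- lm(pic(y, tree) ~ pic(x, tree) - 1)

# PGLS with a Brownian-motion correlation structure (i.e. lambda = 1)
dat <- data.frame(x = x, y = y)   # rows in tip-label order
fit_gls <- gls(y ~ x, data = dat, correlation = corBrownian(1, tree))

slopes <- c(pic = unname(coef(fit_pic)[1]), pgls = unname(coef(fit_gls)["x"]))
print(slopes)
```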
> 
> I take it from what you wrote that the PGLS in caper does an ML estimate of 
> lambda only on y when doing the regression? Isn't this patently wrong, 
> biologically speaking? Phylogenetic effects could have been operating on both 
> x and y - we can't assume that they would only be relevant to y. Shouldn't 
> phylogenetic methods account for both?
> 
> You say you aren't sure it is a good idea to jointly optimize lambda for x & 
> y.  Can you expand on this?  What would be a better solution (if there is 
> one)?
> 
> Am I wrong that it makes no evolutionary biological sense to use a method 
> that gives different estimates of the probability of a relationship based on 
> the direction in which one looks at the relationship? Doesn't the fact that 
> the method gives different answers in this way invalidate the method for 
> taking phylogeny into account when assessing relationships among biological 
> taxa?  How could it be biologically meaningful for phylogeny to have a 
> greater influence when x is predicting y, than when y is predicting x?  Maybe 
> I'm missing something here.
> 
> -Tom 
> 
> 
> On Jul 21, 2013, at 8:59 PM, Liam J. Revell wrote:
> 
>> Hi Tom.
>> 
>> Joe pointed out that if we assume that our variables are multivariate 
>> normal, then a hypothesis test on the regression is the same as a test that 
>> cov(x,y) is different from zero.
>> 
>> If you insist on using lambda, one logical extension to this might be to 
>> jointly optimize lambda for x & y (following Freckleton et al. 2002) and 
>> then fix the value of lambda at its joint MLE during GLS. This would at 
>> least have the property of guaranteeing that the P-values for y~x and x~y 
>> are the same.
>> 
>> I previously posted code for joint estimation of lambda on my blog here: 
>> http://blog.phytools.org/2012/09/joint-estimation-of-pagels-for-multiple.html.
>> 
>> With this code to fit joint lambda, our analysis would then look something 
>> like this:
>> 
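>> library(phytools)
>> library(nlme)
>> # joint.lambda() is the function from the blog post linked above;
>> # x and y are assumed to be ordered to match tree$tip.label
>> lambda<-joint.lambda(tree,cbind(x,y))$lambda
>> # fix lambda at its joint MLE so both regressions share one correlation structure
>> fit1<-gls(y~x,data=data.frame(x,y),correlation=corPagel(lambda,tree,fixed=TRUE))
>> fit2<-gls(x~y,data=data.frame(x,y),correlation=corPagel(lambda,tree,fixed=TRUE))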
>> 
>> I'm not sure that this is a good idea - but it is possible.
>> 
>> - Liam
>> 
>> Liam J. Revell, Assistant Professor of Biology
>> University of Massachusetts Boston
>> web: http://faculty.umb.edu/liam.revell/
>> email: liam.rev...@umb.edu
>> blog: http://blog.phytools.org
>> 
>> On 7/21/2013 6:15 PM, Tom Schoenemann wrote:
>>> Hi all,
>>> 
>>> I'm still unsure of how I should interpret results given that using PGLS
>>> to predict group size from brain size gives different significance
>>> levels and lambda estimates than when I do the reverse (i.e., predict
>>> brain size from group size).  Biologically, I don't think this makes any
>>> sense.  If lambda is an estimate of the phylogenetic signal, what
>>> possible evolutionary and biological sense are we to make if the
>>> estimates of lambda are significantly different depending on which way
>>> the association is assessed? I understand the mathematics may allow
>>> this, but if I can't make sense of this biologically, then doesn't it
>>> call into question the use of this method for these kinds of questions
>>> in the first place?  What am I missing here?
>>> 
>>> Here are some results from data I have that illustrate this (notice that
>>> the lambda values