Re: [R-sig-phylo] PGLS vs lm

Liam J. Revell Sun, 21 Jul 2013 18:01:27 -0700

Hi Tom.

Joe pointed out that if we assume that our variables are multivariatenormal, then a hypothesis test on the regression is the same as a testthat cov(x,y) is different from zero.

If you insist on using lambda, one logical extension to this might be tojointly optimize lambda for x & y (following Freckleton et al. 2002) andthen fix the value of lambda at its joint MLE during GLS. This would atleast have the property of guaranteeing that the P-values for y~x andx~y are the same....

I previously posted code for joint estimation of lambda on my blog here:http://blog.phytools.org/2012/09/joint-estimation-of-pagels-for-multiple.html.

With this code to fit joint lambda, our analysis would then looksomething like this:


require(phytools)
require(nlme)
lambda<-joint.lambda(tree,cbind(x,y))$lambda
fit1<-gls(y~x,data=data.frame(x,y),correlation=corPagel(lambda,tree,fixed=TRUE))
fit2<-gls(x~y,data=data.frame(x,y),correlation=corPagel(lambda,tree,fixed=TRUE))

I'm not sure that this is a good idea - but it is possible....

- Liam

Liam J. Revell, Assistant Professor of Biology
University of Massachusetts Boston
web: http://faculty.umb.edu/liam.revell/
email: liam.rev...@umb.edu
blog: http://blog.phytools.org

On 7/21/2013 6:15 PM, Tom Schoenemann wrote:

Hi all,

I'm still unsure of how I should interpret results given that using PGLS
to predict group size from brain size gives different significance
levels and lambda estimates than when I do the reverse (i.e., predict
brain size from group size).  Biologically, I don't think this makes any
sense.  If lambda is an estimate of the phylogenetic signal, what
possible evolutionary and biological sense are we to make if the
estimates of lambda are significantly different depending on which way
the association is assessed? I understand the mathematics may allow
this, but if I can't make sense of this biologically, then doesn't it
call into question the use of this method for these kinds of questions
in the first place?  What am I missing here?

Here is some results from data I have that illustrate this (notice that
the lambda values are significantly different from each other):

Group size predicted by brain size:

model.group.by.brain<-pgls(log(GroupSize) ~ log(AvgBrainWt), data = 
primate_tom, lambda='ML')
summary(model.group.by.brain)


Call:
pgls(formula = log(GroupSize) ~ log(AvgBrainWt), data = primate_tom,
     lambda = "ML")

Residuals:
      Min       1Q   Median       3Q      Max
-0.27196 -0.07638  0.00399  0.10107  0.43852

Branch length transformations:

kappa  [Fix]  : 1.000
lambda [ ML]  : 0.759
    lower bound : 0.000, p = 4.6524e-08
    upper bound : 1.000, p = 2.5566e-10
    95.0% CI   : (0.485, 0.904)
delta  [Fix]  : 1.000

Coefficients:
                  Estimate Std. Error t value Pr(>|t|)
(Intercept)     -0.080099   0.610151 -0.1313 0.895825
log(AvgBrainWt)  0.483366   0.136694  3.5361 0.000622 ***
---
Signif. codes:  0 �***� 0.001 �**� 0.01 �*� 0.05 �.� 0.1 � � 1

Residual standard error: 0.1433 on 98 degrees of freedom
Multiple R-squared: 0.1132,     Adjusted R-squared: 0.1041
F-statistic:  12.5 on 2 and 98 DF,  p-value: 1.457e-05


Brain size predicted by group size:

model.brain.by.group<-pgls(log(AvgBrainWt) ~ log(GroupSize), data = 
primate_tom, lambda='ML')
summary(model.brain.by.group)


Call:
pgls(formula = log(AvgBrainWt) ~ log(GroupSize), data = primate_tom,
     lambda = "ML")

Residuals:
      Min       1Q   Median       3Q      Max
-0.38359 -0.08216  0.00902  0.05609  0.27443

Branch length transformations:

kappa  [Fix]  : 1.000
lambda [ ML]  : 1.000
    lower bound : 0.000, p = < 2.22e-16
    upper bound : 1.000, p = 1
    95.0% CI   : (0.992, NA)
delta  [Fix]  : 1.000

Coefficients:
                Estimate Std. Error t value  Pr(>|t|)
(Intercept)    2.740932   0.446943  6.1326 1.824e-08 ***
log(GroupSize) 0.050780   0.043363  1.1710    0.2444
---
Signif. codes:  0 �***� 0.001 �**� 0.01 �*� 0.05 �.� 0.1 � � 1

Residual standard error: 0.122 on 98 degrees of freedom
Multiple R-squared: 0.0138,     Adjusted R-squared: 0.003737
F-statistic: 1.371 on 2 and 98 DF,  p-value: 0.2586


On Jul 14, 2013, at 6:18 AM, Emmanuel Paradis <emmanuel.para...@ird.fr>
wrote:

Hi all,

I would like to react a bit on this issue.

Probably one problem is that the distinction "correlation vs. regression" is 
not the same for independent data and for phylogenetic data.

Consider the case of independent observations first. Suppose we are interested 
in the relationship y = b x + a, where x is an environmental variable, say 
latitude. We can get estimates of b and a by moving to 10 well-chosen 
locations, sampling 10 observations  of y (they are independent) and analyse 
the 100 data points with OLS.

Here we cannot say anything about the correlation between x and y
because we controlled the distribution of x. In practice, even if x is
not controlled, this approach is still valid as long as the observations
are independent.


With phylogenetic data, x is not controlled if it is measured "on the species" -- in 
other words it's an evolving trait (or intrinsic variable). x may be controlled if it is measured 
"outside the species" (extrinsic variable) such as latitude. So the case  of using 
regression or correlation is not the same than above.

Combining intrinsic and extinsic variables has generated a lot of debate
in the literature.


I don't think it's a problem of using a method and not another, but rather to 
use a method keeping in mind what it does (and its assumptions). Apparently, 
Hansen and Bartoszek consider a range of models including regression models 
where, by contrast to GLS,  the evolution of the predictors is modelled 
explicitly.

If we want to progress in our knowledge on how evolution works, I think we have 
to not limit ourselves to assess whether there is a relationship, but to test 
more complex models. The case presented by Tom is particularly relevant here 
(at least to me): testing  whether group size affects brain size or the 
opposite (or both) is an

important question. There's been also a lot of debate whether
comparative data can answer this question. Maybe what we need here is an
approach based on simultaneous equations (aka structural equation
models), but I'm not aware whether this exists in a phylogenetic
framework. The approach by Hansen and Bartoszek could be a step in this
direction.


Best,

Emmanuel

Le 13/07/2013 02:59, Joe Felsenstein a �crit :


Tom Schoenemann asked me:

With respect to your crankiness, is this the paper by Hansen that you are 
referring to?:

Bartoszek, K., Pienaar, J., Mostad, P., Andersson, S., & Hansen, T. F. (2012). 
A phylogenetic comparative method for studying multivariate adaptation. Journal of 
Theoretical Biology, 314(0), 204-215.

I wrote Bartoszek to see if I could get his R code to try the method mentioned 
in there. If I can figure out how to apply it to my data, that will be great. I 
agree that it is clearly a mistake to assume one variable is responding 
evolutionarily only to  the current value of the other (predictor variables).


I'm glad to hear that *somebody* here thinks it is a mistake (because it really is).  I 
keep mentioning it here, and Hansen has published extensively on it, but everyone keeps 
saying "Well, my friend used it, and he got tenure, so it must be OK".

The paper I saw was this one:

Hansen, Thomas F & Bartoszek, Krzysztof (2012). Interpreting the evolutionary 
regression: The interplay between observational and biological errors in 
phylogenetic comparative studies. Systematic Biology  61 (3): 413-425.  ISSN 
1063-5157.

J.F.
----
Joe Felsenstein         j...@gs.washington.edu
 Department of Genome Sciences and Department of Biology,
 University of Washington, Box 355065, Seattle, WA 98195-5065 USA

_______________________________________________
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive athttp://www.mail-archive.com/r-sig-phylo@r-project.org/


_________________________________________________
P. Thomas Schoenemann

Associate Professor
Department of Anthropology
Cognitive Science Program
Indiana University
Bloomington, IN  47405
Phone: 812-855-8800
E-mail: t...@indiana.edu

Open Research Scan Archive (ORSA) Co-Director
Consulting Scholar
Museum of Archaeology and Anthropology
University of Pennsylvania

http://www.indiana.edu/~brainevo











         [[alternative HTML version deleted]]

_______________________________________________
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/


_______________________________________________
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/

Re: [R-sig-phylo] PGLS vs lm

Reply via email to