Hello All,
I am having a bit of trouble determining the significance of a term in
a phylogenetic GLS using ape and nlme. I specified a model with a
continuous normal dependent variable and a single categorical factor
with 3 unranked levels as the independent variable. The model is quite
simple but the tree in question is large (894 tips). I want to know if
the three different categories in question have significantly
different means. I have no branch length data and am using ultrametric
and branch length=1 trees for these analyses. I have tried all the
available phylogenetic correlation structures (rho, lambda, OU) to
find the best fitting branch length transformation. They generally
lead to similar results with respect to effect size, and the rho,
lambda, and OU models with the additional parameter have little
difference between them, but fit significantly better than the BM model.
If you use summary() to describe the gls output, level 3 is highly
significantly different from the intercept (level 1), whereas level 2
is not for all models tested. Using the command intervals() from the
nlme package also shows that the effect of level three is
significantly different from 0. This leads me to believe that category
3 of the factor is significantly different from the other two.
However, if you specify a model that does not include the independent
factor variable in question, and is instead of the format -- model<-
gls(Y~1, correlation=corStruct(...)) -- the AIC and BIC scores are
better (lower) than the model of the same correlation structure with
the factor included. Likelihood ratio tests also show that the factor
term does not significantly improve the fit of the model. My
understanding is that model<-gls(Y~1, correlation=corStruct(...)) will
simply fit an intercept based on the mean of the response variable,
the variance-covariance matrix of the tip data, and the evolution
model given. This model will also be nested within the model with the
factor. If the intercept model is shows an equivalent fit based on an
lrtest that means the factor in question is not significantly
improving the model, correct? And if so, why do the summary and
intervals commands show highly significant effects?
Thank you in advance for your time. For clarity some input/output data
for Pagel's lambda are shown below where summary() and anova() show a
significant effect, but the lrtest does not.
Cheers,
Patrick
> Factor.model<-gls(Y~Factor, correlation=corPagel(1, tree))
> summary(Factor.model)
AIC = -1599.295
BIC = -1575.334
logLik=804.6476
lambda=0.658666
Coefficients:
Intercept=0.2518, se=0.0396, t-value=6.3587, p-value=0.000
Level2=-0.00344, se=0.0107, t-value=-0.3222, p-value=0.7474
Level3=-0.03966, se=0.0096, t-value=-4.1334, p-value=0.0000
> anova(Factor.model)
(Intercept) numDF=1, F-value=38.61981, p-value<0.0001
Factor numDF=2, F-value=8.67266, p-value=0.0002
> Intercept.model<-gls(Y~1, correlation=corPagel(1, tree))
> summary(Intercept.model)
AIC=-1600.819
BIC=-1586.435
logLik=803.4093
lambda=0.6562463
> lrtest(Intercept.model, Factor.model)
Intercept.model = 3 Df, 803.41 LogLik
Factor.model= 5 Df, 804.65 LogLik
ChiSq=2.4767, Df=2, p-value=0.2899
Patrick Flight
Doctoral Candidate
Department of Ecology and Evolutionary Biology
Brown University
80 Waterman St. Box G-W
Providence, RI 02912
patrickflight.net
_______________________________________________
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo