Hello All,

I am having a bit of trouble determining the significance of a term in a phylogenetic GLS using ape and nlme. I specified a model with a continuous normal dependent variable and a single categorical factor with 3 unranked levels as the independent variable. The model is quite simple but the tree in question is large (894 tips). I want to know if the three different categories in question have significantly different means. I have no branch length data and am using ultrametric and branch length=1 trees for these analyses. I have tried all the available phylogenetic correlation structures (rho, lambda, OU) to find the best fitting branch length transformation. They generally lead to similar results with respect to effect size, and the rho, lambda, and OU models with the additional parameter have little difference between them, but fit significantly better than the BM model.

If you use summary() to describe the gls output, level 3 is highly significantly different from the intercept (level 1), whereas level 2 is not for all models tested. Using the command intervals() from the nlme package also shows that the effect of level three is significantly different from 0. This leads me to believe that category 3 of the factor is significantly different from the other two. However, if you specify a model that does not include the independent factor variable in question, and is instead of the format -- model<- gls(Y~1, correlation=corStruct(...)) -- the AIC and BIC scores are better (lower) than the model of the same correlation structure with the factor included. Likelihood ratio tests also show that the factor term does not significantly improve the fit of the model. My understanding is that model<-gls(Y~1, correlation=corStruct(...)) will simply fit an intercept based on the mean of the response variable, the variance-covariance matrix of the tip data, and the evolution model given. This model will also be nested within the model with the factor. If the intercept model is shows an equivalent fit based on an lrtest that means the factor in question is not significantly improving the model, correct? And if so, why do the summary and intervals commands show highly significant effects?

Thank you in advance for your time. For clarity some input/output data for Pagel's lambda are shown below where summary() and anova() show a significant effect, but the lrtest does not.

Cheers,
Patrick

> Factor.model<-gls(Y~Factor, correlation=corPagel(1, tree))

> summary(Factor.model)
AIC = -1599.295
BIC = -1575.334
logLik=804.6476

lambda=0.658666

Coefficients:
Intercept=0.2518, se=0.0396, t-value=6.3587, p-value=0.000
Level2=-0.00344, se=0.0107, t-value=-0.3222, p-value=0.7474
Level3=-0.03966, se=0.0096, t-value=-4.1334, p-value=0.0000

> anova(Factor.model)

(Intercept) numDF=1, F-value=38.61981, p-value<0.0001
Factor numDF=2, F-value=8.67266, p-value=0.0002

> Intercept.model<-gls(Y~1, correlation=corPagel(1, tree))

> summary(Intercept.model)
AIC=-1600.819
BIC=-1586.435
logLik=803.4093

lambda=0.6562463

> lrtest(Intercept.model,  Factor.model)
Intercept.model = 3 Df, 803.41 LogLik
Factor.model= 5 Df, 804.65 LogLik
ChiSq=2.4767, Df=2, p-value=0.2899





Patrick Flight
Doctoral Candidate
Department of Ecology and Evolutionary Biology
Brown University
80 Waterman St. Box G-W
Providence, RI 02912
patrickflight.net

_______________________________________________
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo

Reply via email to