Andrew Robinson wrote:
Dear R colleagues,
a friend and I are trying to develop a modest workflow for the problem
of decomposing tests of higher-order terms into interpretable sets of
tests of lower order terms with conditioning.
For example, if the interaction between A (3 levels) and C (2 levels)
is significant, it may be of interest to ask whether or not A is
significant at level 1 of C and level 2 of C.
The textbook response seems to be to subset the data and perform the
tests on the two subsets, but using the RSS and DF from the original
fit.
We're trying to answer the question using new explanatory variables.
This approach (seems to) work just fine in aov, but not lme.
For example,
##########################################################################
# Build the example dataset
set.seed(101)
Block <- gl(6, 6, 36, lab = paste("B", 1:6, sep = ""))
A <- gl(3, 2, 36, lab = paste("A", 1:3, sep = ""))
C <- gl(2, 1, 36, lab = paste("C", 1:2, sep = ""))
example <- data.frame(Block, A, C)
example$Y <- rnorm(36)*2 + as.numeric(example$A)*as.numeric(example$C)^2 +
3 * rep(rnorm(6), each=6)
with(example, interaction.plot(A, C, Y))
# lme
require(nlme)
anova(lme(Y~A*C, random = ~1|Block, data = example))
summary(aov(Y ~ Error(Block) + A*C, data = example))
# Significant 2-way interaction. Now we would like to test the effect
# of A at C1 and the effect of A at C2. Construct two new variables
# that decompose the interaction, so an LRT is possible.
example$AC <- example$AC1 <- example$AC2 <- interaction(example$A, example$C)
example$AC1[example$C == "C1"] <- "A1.C1" # A is constant at C1
example$AC2[example$C == "C2"] <- "A1.C2" # A is constant at C2
# lme fails (as does lmer)
lme(Y ~ AC1 + AC, random = ~ 1 | Block, data = example)
# but aov seems just fine.
summary(aov(Y ~ Error(Block) + AC1 + AC, data = example))
## AC was not significant, so A is not significant at C1
summary(aov(Y ~ Error(Block) + AC2 + AC, data = example))
## AC was significant, so A is significant at C2
## Some experimentation suggests that lme doesn't like the 'partial
## confounding' approach that we are using, rather than the variables
## that we have constructed.
lme(Y ~ AC, random = ~ 1 | Block, data = example)
lme(Y ~ A + AC, random = ~ 1 | Block, data = example)
##########################################################################
Are we doing anything obviously wrong? Is there another approach to
the goal that we are trying to achieve?
This looks like a generic issue with lme/lmer, in that they are not
happy with rank deficiencies in the design matrix.
Here's a "dirty" trick which you are not really supposed to know about
because it is hidden inside the "stats" namespace:
M <- model.matrix(~AC1+AC, data=example)
M2 <- stats:::Thin.col(M)
summary(lme(Y ~ M2 - 1, random = ~ 1 | Block, data = example)
(Thin.col(), Thin.row(), and Rank() are support functions for
anova.mlm() et al., but maybe we should document them and put them out
in the open.)
--
O__ ---- Peter Dalgaard Ă˜ster Farimagsgade 5, Entr.B
c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
(*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918
~~~~~~~~~~ - ([EMAIL PROTECTED]) FAX: (+45) 35327907
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.