Well put, Donald. The only additional points I wish to make are that in
my career I've never seen balanced factorial data with normal errors.
Effects are orthogonal only in the case where the study was done in a
balanced way (i.e., experimental study, no missing data, etc.) AND where
the model is a regression model with normal errors [even with perfect
balance, nonlinear models such as logistic models do not yield
orthogonal estimates]. Even then, I'm not often interested in "main
effect" tests, which are averages of stratified estimates (stratified by
the other factor).
Even when I want to average the stratified effects, I do it by getting
differences in predicted values; therefore coding is of no concern to
me. In S-Plus I have a contrast function that uses this method, e.g.

    contrast(fit.result,
             list(age=65, sex=c('male','female')),
             list(age=21, sex=c('male','female')),
             type='average', weights=table(sex))

This does a Type II contrast where weights are the marginal frequencies
of male and female. If I want a Type III contrast (seldom sensible) I
would use weights='equal'. The contrast is for age 65 vs. age 21, no
matter how nonlinear the age effect is in the model.
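
For readers without S-Plus, the idea of averaging stratified differences
in predicted values can be sketched in Python with numpy. This is only
an illustration under stated assumptions, not the contrast function
itself: the simulated data, the quadratic age model, and the helper
names design() and predict() are all hypothetical.

```python
import numpy as np

def design(age, female):
    """Design matrix: intercept, age, age^2, sex, and age-by-sex terms (illustrative model)."""
    a = np.asarray(age, dtype=float) / 10.0   # rescale age to keep columns well conditioned
    f = np.asarray(female, dtype=float)
    return np.column_stack([np.ones_like(a), a, a**2, f, a * f, a**2 * f])

# Simulated, deliberately unbalanced data (about 30% female); purely hypothetical.
rng = np.random.default_rng(0)
n = 400
age = rng.uniform(20, 80, n)
female = (rng.uniform(size=n) < 0.3).astype(float)
y = (1.0 + 0.002 * (age - 50) ** 2 + 0.5 * female
     + 0.03 * female * age + rng.normal(scale=1.0, size=n))

# Ordinary least squares fit of the nonlinear-in-age model.
beta, *_ = np.linalg.lstsq(design(age, female), y, rcond=None)

def predict(age_value, female_value):
    """Predicted mean response at one age and one sex."""
    return (design([age_value], [female_value]) @ beta)[0]

# Stratified contrasts: age 65 vs. age 21 within each sex.
diffs = np.array([
    predict(65, 0) - predict(21, 0),   # males
    predict(65, 1) - predict(21, 1),   # females
])

# Average over sex with marginal-frequency weights, then with equal weights.
freq = np.array([np.sum(female == 0), np.sum(female == 1)], dtype=float)
avg_marginal = np.average(diffs, weights=freq)
avg_equal = np.average(diffs, weights=[1.0, 1.0])

print("stratified differences (male, female):", diffs)
print("frequency-weighted average:", avg_marginal)
print("equally weighted average:  ", avg_equal)
```

Because both averages are built from predicted values, recoding sex
(0/1, -1/+1, or anything else that spans the same model) would leave
them unchanged; only the choice of weights matters.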
-Frank
"Donald F. Burrill" wrote:
> In response to a comment of mine:
>
> > Incidentally, I'd strongly recommend constructing interaction variables
> > that are orthogonal at least to their own main effects (and lower-order
> > interactions, when there are any), and possibly orthogonal to some or all
> > of the apparently irrelevant other predictors. Else correlations between
> > the interaction variables and other variables can, sometimes, be horribly
> > confusing; especially with the "quantitative" (non-categorical)
> > variables, whose products with other such variables are likely to be
> > strongly (positively) correlated with the original variables merely
> > because the original variables tend to be always positive and sometimes
> > far from zero -- thus inducing what I've elsewhere called "spurious
> > multicollinearity".
>
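
The "spurious multicollinearity" described above is easy to reproduce.
The following is a minimal sketch in Python with numpy, using simulated
predictors that are independent, always positive, and far from zero; the
variable names and the ranges chosen are assumptions made only for
illustration. It compares the raw product, the product of centered
variables, and a product made exactly orthogonal to its main effects by
taking least-squares residuals.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1000
# Two independent predictors, always positive and far from zero (hypothetical ranges).
x1 = rng.uniform(50, 60, n)
x2 = rng.uniform(100, 120, n)

def corr(a, b):
    """Pearson correlation between two 1-D arrays."""
    return np.corrcoef(a, b)[0, 1]

# Raw product: strongly correlated with x1 although x1 and x2 are independent.
raw_product = x1 * x2
r_raw = corr(x1, raw_product)

# Product of centered variables: the correlation largely disappears here.
centered_product = (x1 - x1.mean()) * (x2 - x2.mean())
r_centered = corr(x1, centered_product)

# Exact orthogonalization: residual of the product after regressing it
# on the intercept and both main effects.
X = np.column_stack([np.ones(n), x1, x2])
coef, *_ = np.linalg.lstsq(X, raw_product, rcond=None)
resid_product = raw_product - X @ coef
r_resid = corr(x1, resid_product)

print("corr(x1, x1*x2)             :", r_raw)
print("corr(x1, centered product)  :", r_centered)
print("corr(x1, residualized prod.):", r_resid)
```

Centering removes the correlation only approximately, and how well it
works depends on the joint distribution of the predictors; the
residualized product is uncorrelated with both main effects by
construction.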
> Frank E Harrell Jr wrote:
>
> > This I do not understand. I don't see the point in testing main
> > effects in the presence of interaction effects (unlike the pooled main
> > effect + interaction effect tests which are completely invariant to
> > coding). So I don't see why coding matters. -Frank Harrell
>
> Sorry if I have confused two issues. The remark quoted is not related to
> the coding of variables; it applies generally. As to "testing main
> effects in the presence of interactions", in a factorial analysis of
> variance one tests main effects and all possible interactions in the
> presence of each other; and it is standard advice not to attempt to
> interpret main effects (or for that matter lower-order interactions) in
> the presence of significant interaction(s), at least until one has made
> some sense out of the interaction(s) (or, better, out of the pattern of
> main effects & interactions).
> But in a balanced factorial ANOVA things are unambiguous in two
> ways: (1) the apparent significance of individual sources of variation
> does not depend on the order of their entry into the model; (2) the
> significance of any particular source does not depend on the presence or
> absence of other sources. Both of these are due to the orthogonality
> inherent in a balanced design. When the predictors are correlated, as is
> usual in regression and in unbalanced ANOVAs, neither of these is true.
> Constructing interactions to be orthogonal to their main effects and to
> lower-order interactions, as recommended above, means at least that one's
> ability to detect main effects is not bollixed up by including the
> interaction terms in the analysis. It also means that if any interaction
> term is significant, one can believe that one is indeed looking at an
> interaction effect, and not at an artifact arising from inadvertent
> correlation between the interaction variable and its main effects.
> I take it that one first looks for the patterns of main effects
> and interactions that must be taken into account in the eventual
> restricted model; then one attempts to interpret the model. At this
> point coding matters, because the meaning one can attribute to any
> particular coefficient will depend on the coding of the variable. It
> follows that one may choose to revise the coding, to facilitate or
> simplify the interpretation.
>
> There is one other sense in which "coding matters", although this
> may be a bit off-topic from the original thread. Consider an experiment
> in which the subjects are of two sexes, and the experimental treatments
> are mediated by experimenters, who also are of two sexes. Whatever else
> is going on, there is a 2x2 subdesign representing (sex of Subject) by
> (sex of Experimenter). One may code both variables, for example, so that
> 0 = male and 1 = female. Then if the data show a difference between
> cases where subject and experimenter are of the same sex and cases where
> they are of opposite sexes, that's an interaction effect. But if one had
> coded (0 = male and 1 = female) for Subjects, and (0 = same sex as
> Subject and 1 = opposite sex from Subject) for Experimenters, then the
> effect just described is a main effect of the Experimenter sex variable.
>
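
The Subject-by-Experimenter example can be checked with a few lines of
Python and numpy. The cell means below (10 for same-sex pairs, 14 for
opposite-sex pairs) are invented for illustration, and fit() is a
hypothetical helper that solves the saturated 2x2 model exactly.

```python
import numpy as np

# One row per cell of the 2x2 subdesign.
subject = np.array([0, 0, 1, 1])        # 0 = male, 1 = female
experimenter = np.array([0, 1, 0, 1])   # 0 = male, 1 = female
# Alternative coding: 0 = same sex as Subject, 1 = opposite sex from Subject.
opposite = (subject != experimenter).astype(int)

# Hypothetical cell means: same-sex pairs 10, opposite-sex pairs 14.
y = np.where(opposite == 1, 14.0, 10.0)

def fit(a, b):
    """Coefficients (intercept, a, b, a*b) of the saturated model for the four cells."""
    X = np.column_stack([np.ones(4), a, b, a * b])
    return np.linalg.solve(X, y)

coef_sex_coding = fit(subject, experimenter)
coef_match_coding = fit(subject, opposite)

print("sex/sex coding   (b0, S, E, S*E):", coef_sex_coding)
print("sex/match coding (b0, S, O, S*O):", coef_match_coding)
```

With these cell means the first coding gives coefficients of about
(10, 4, 4, -8), so the pattern needs a product term, while the second
gives about (10, 0, 4, 0), so the same pattern is carried entirely by
the recoded Experimenter variable. Note that with 0/1 coding the
non-product coefficients are effects at the reference level of the other
variable rather than averaged main effects.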
> ------------------------------------------------------------------------
> Donald F. Burrill [EMAIL PROTECTED]
> 348 Hyde Hall, Plymouth State College, [EMAIL PROTECTED]
> MSC #29, Plymouth, NH 03264 603-535-2597
> 184 Nashua Road, Bedford, NH 03110 603-471-7128
--
Frank E Harrell Jr
Professor of Biostatistics and Statistics
Division of Biostatistics and Epidemiology
Department of Health Evaluation Sciences
University of Virginia School of Medicine
http://hesweb1.med.virginia.edu/biostat