This seems to me a rather serious issue, but one that comes up more frequently 
than it should.  Let's assume the treatments were applied at random to the 
plots.  There are two options with regard to pre-existing conditions.  One is 
to apply the treatments at random and simply remain blind to any existing 
variation among the plots.  This approach relies on randomization to disperse 
the treatments across the existing variation, so that existing variation and 
applied treatment are not confounded.  This is statistically valid and quite 
common, but it does have its risks.  The risk is that things can go "wrong" in 
the randomization and confounding does occur; you simply don't know it 
and must interpret your results under the assumption that treatment was the 
only factor that varied systematically.  Treatment effects can be washed out, 
or the observed differences may be driven by unmeasured variation in 
pre-existing conditions.  However, you are statistically justified in accepting 
the results as they present themselves, even though they may simply be wrong.  
Prior knowledge of variation among plots is important in the decision to go 
this route.
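
To get a feel for how often simple randomization goes "wrong", one could 
simulate it.  The sketch below is only an illustration: the health scores are 
made up (32 plots scored 1 to 5), and the function names and the 1-point 
threshold are my own assumptions, not anything taken from the experiment.

```python
import random

def max_gap_in_mean_health(scores, n_treatments, rng):
    """Randomly split the plots into equal groups and return the largest
    difference in mean health score between any two groups."""
    shuffled = list(scores)
    rng.shuffle(shuffled)
    size = len(shuffled) // n_treatments
    means = [sum(shuffled[i * size:(i + 1) * size]) / size
             for i in range(n_treatments)]
    return max(means) - min(means)

def share_of_unbalanced_draws(scores, n_treatments=4, gap=1.0,
                              n_draws=10000, seed=0):
    """Fraction of random assignments in which two treatment groups
    differ by at least `gap` in mean pre-treatment health."""
    rng = random.Random(seed)
    hits = sum(max_gap_in_mean_health(scores, n_treatments, rng) >= gap
               for _ in range(n_draws))
    return hits / n_draws

# Hypothetical pre-treatment scores for 32 plots (NOT the real data).
example_scores = [1] * 4 + [2] * 7 + [3] * 10 + [4] * 7 + [5] * 4
print(share_of_unbalanced_draws(example_scores))
```

The printed share depends entirely on the assumed scores and threshold, so it 
should be read as a rough gauge of the risk rather than as a result.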

The second approach is the one that you took, which is to pre-sample the plots 
to assess the variation in conditions.  The value here is that you have a lot 
more information you can potentially bring to bear.  The point that is most 
often missed is that it also allows you, and I would suggest requires you, to 
carefully tune your design to account for the existing variation.  The "risk" 
in taking pre-samples is that if you don't fine-tune your design, you are still 
"stuck" with the information from the pre-sample.  It can't be ignored or simply 
made to go away.  Thus, once you make the decision to pre-sample plots, it is 
CRITICAL to use that variation in the assignment of treatments to ensure 
adequate dispersion of treatments across the existing variation.  I think the 
proper approach here would have been to block the plots by seedling health and 
then randomly apply the treatments within blocks.  Alternatively, one could use 
a stratified random assignment of plots, though this limits the ability to 
extract information.  The covariate approach ONLY works 
if there is no confounding of pre existing variation and treatment (and no 
interaction).  I hate to be dour, but I'm not sure I see a way out of this 
situation.  Can you really hope to determine whether it is treatment or initial 
seedling health that is driving the results?  One would have to know more of 
the details, but either way the robustness that results from an experiment 
typically have is seriously compromised.  Had you blocked by seedling 
condition you could look at the effect of seedling health, treatment and their 
interaction.

I think the most frustrating thing in such situations is that one ends up 
thwarted by one's own best intentions.


On 11/11/10 3:04 PM, "Jing Luo" <luoj...@gmail.com> wrote:

Dear All,

I have a question about including covariates in the ANOVA analysis.

We grew corn seedlings in about 32 field plots and then applied 4 different
treatments to study their responses (plot is the experiment unit). However,
we noticed quite big variation of seedling healthiness from plot to plot
BEFORE the treatments were applied. So we scored the healthiness from 1 to 5
(least healthy to most healthy) and planned to include this as a
covariate in the model.

During data analysis, I noticed that the healthiness was confounded with
treatments, with some treatments applied to most of the healthy plots, and
other treatments applied to most of the unhealthy plots (we could not
control that because the treatment for each plot was pre-determined). As a
result, the analysis of some of the variables showed some strange patterns,
especially when the healthiness covariate was significant in the model. For
one variable, for example, the least-square mean estimates of the four
treatments were A=B=C<D if the covariate was NOT included, but became A=B=D<C
if the covariate was included in the model.

I acknowledge that covariates serve an important role in controlling for
factors that were not imposed by the treatment. However, I am just wondering:
when the covariate is confounded with treatment, and has a significant effect
on the results, can we argue that the covariate could be excluded from
the model? Have you ever had to deal with a similar situation before?

Any thoughts will be appreciated. Thanks.

Jing Luo
