I sent the message below to sci.stat.edu on February 3.  How-
ever, presumably due to the recent EdStat technical problems, 
the message wasn't received by at least some EdStat subscrib-
ers.  So I've sent the message again.  My apologies if you've 
received it twice.


"Joe" <[EMAIL PROTECTED]> wrote in message
news:[EMAIL PROTECTED]

> Don,
>
> Thanks for replying.
>
> I have a few further questions ....
>
> Let's say I am going by the HTO method and I wish to calculate
> the SS for factor A.  I understand that I should not use the
> "whole" model.

A more standard name for the "whole model" is the "saturated
model".

The following discussion assumes that we have no "empty" cells in
the analysis.  Empty cells make the analysis more complicated.
Most real experiments don't have empty cells because the re-
searcher takes steps to prevent their occurrence.  The idea of an
empty cell is discussed further in the paper referenced at the
end of this post.


> My question is then, what is the "other" model in this case.

The two models whose residual sums of squares must be "differ-
enced" to compute the SAS Type 2 main effect sum of squares for
factor A in a three-way analysis of variance are given in lines
1226 and 1227 in the computer program at

              http://www.matstat.com/ss/pr0165.htm

The two models for the HTO sums of squares for the same effect
are given in lines 1371 and 1372 of the same program.


> And, also is this "other" model unique and will it lead to
> unique value of SS.

Interestingly, the parameter estimates of the models under dis-
cussion aren't unique.  (The so-called sigma-restrictions are of-
ten used to generate unique solutions for the parameter esti-
mates.)  Perhaps somewhat surprisingly, despite the fact that the
parameter estimates aren't unique, all the models give unique re-
sidual sums of squares.  Thus, when we subtract one residual sum
of squares from another we get a unique sum of squares for the
associated effect.
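
Here is a minimal sketch in Python (with NumPy) of the point
above.  It is not taken from the programs on the web site; the
data and the one-way layout are made up for illustration.  It
fits an overparameterized model (an intercept plus one indicator
per group), produces two different sets of parameter estimates,
and shows that both give the same residual sum of squares.

```python
import numpy as np

# Hypothetical one-way data: three groups with unequal sizes.
group = np.array([0, 0, 1, 1, 1, 2])
y = np.array([3.0, 5.0, 6.0, 7.0, 8.0, 12.0])

# Overparameterized design: intercept plus one indicator per group.
# The four columns have rank 3, so the estimates are not unique.
indicators = [(group == g).astype(float) for g in range(3)]
X = np.column_stack([np.ones(len(y))] + indicators)

# One solution: the minimum-norm least-squares solution.
b1, *_ = np.linalg.lstsq(X, y, rcond=None)

# Another solution: shift b1 along the null space of X.
# X @ [1, -1, -1, -1] is zero because every row has the intercept
# and exactly one indicator switched on.
b2 = b1 + 5.0 * np.array([1.0, -1.0, -1.0, -1.0])

def rss(coef):
    """Residual sum of squares for a given coefficient vector."""
    r = y - X @ coef
    return float(r @ r)

rss1, rss2 = rss(b1), rss(b2)
print("estimates 1:", b1)
print("estimates 2:", b2)
print("residual SS:", rss1, rss2)
```

The two coefficient vectors differ, but the fitted values (the
group means) are the same, so the residual sums of squares agree.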

However, as can be seen from the computer program, we have many
options for specifying the two models in three-way and higher
cases, and in the unbalanced case these pairs of models will
generally give different values for "the" sum of squares for an
effect.  Thus, for example, in the unbalanced case the SAS Type 2
sum of squares for the A effect will be "unique", but it will
generally be different from the HTO sum of squares for the same
effect, which will also be "unique".
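
The following Python sketch illustrates how two different pairs
of models can give two different sums of squares for the same
effect.  To keep it short it uses a made-up unbalanced two-way
(2 x 2) layout rather than the three-way case, and the two pairs
of models are my own illustrative choice (interaction term
omitted from both models versus included in both); they are not
copied from the program referenced above.  The factors are coded
-1/+1, which is an assumed sigma-restricted coding.

```python
import numpy as np

# Hypothetical unbalanced 2x2 data; cell sizes are 1, 2, 3, 1.
a = np.array([-1, -1, -1, 1, 1, 1, 1], dtype=float)
b = np.array([-1, 1, 1, -1, -1, -1, 1], dtype=float)
y = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 10.0])
one = np.ones_like(y)

def rss(*cols):
    """Residual sum of squares of the least-squares fit on cols."""
    X = np.column_stack(cols)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ coef
    return float(r @ r)

# Pair 1: the AB term is omitted from both models.
ss_a_omitted = rss(one, b) - rss(one, a, b)

# Pair 2: the AB term is included in both models.
ss_a_included = rss(one, b, a * b) - rss(one, a, b, a * b)

print("SS(A), AB omitted: ", ss_a_omitted)
print("SS(A), AB included:", ss_a_included)
```

Each value is well defined for its pair of models, but with these
unequal cell sizes the two values do not agree.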

>
> How about when we calculate the SS for factor B

The two models for the SAS Type 2 B effect are given in lines
1233 and 1234 of the above program and the two models for the HTO
B effect are given in lines 1378 and 1379.

>
> Another issue that is troubling me is that of term "ABC".  Why
> is it essential to include it in case of HTI?

The HTI (i.e., SAS Type 3) sums of squares have (by definition)
Higher-level Terms Included (i.e., HTI) in both models.  Here the
sum of squares for the A effect is being computed, and the ABC
term is a higher-level term, so it must be included (along with
the three other higher-level terms) in both models.

The two models whose residual sums of squares must be differenced
to compute the HTI (SAS Type 3) main effect for factor A in a
three-way analysis of variance are given in lines 1283 and 1284
of the above program.
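
As a rough sketch of this differencing in Python (not a copy of
the models in the program above), the following computes an HTI
sum of squares for A in a made-up unbalanced 2 x 2 x 2 layout.
The cell sizes, the simulated responses, and the -1/+1 coding of
the factors are all assumptions made for the illustration.  Both
models contain AB, AC, BC, and ABC; only the A term is dropped
from the second model.

```python
import itertools
import numpy as np

# Hypothetical unbalanced 2x2x2 layout with made-up cell sizes.
rng = np.random.default_rng(0)
cells = list(itertools.product([-1.0, 1.0], repeat=3))
sizes = [1, 2, 3, 2, 1, 3, 2, 2]
rows = [cell for cell, n in zip(cells, sizes) for _ in range(n)]
a, b, c = np.array(rows).T

# Simulated response (illustrative only).
y = 2.0 * a + b - c + 1.5 * a * b * c + rng.normal(size=len(a))
one = np.ones_like(y)

def rss(cols):
    """Residual sum of squares of the least-squares fit on cols."""
    X = np.column_stack(cols)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ coef
    return float(r @ r)

# Higher-level terms, kept in BOTH models.
higher = [a * b, a * c, b * c, a * b * c]

full = [one, a, b, c] + higher      # model with the A term
reduced = [one, b, c] + higher      # same model without the A term

ss_a_hti = rss(reduced) - rss(full)
print("HTI sum of squares for A:", ss_a_hti)
```

With this coding the result matches the usual contrast of
unweighted cell means for A, which is one way to check the
output against a standard analysis of variance program.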


> What if I increase my number of experiments and try to fit a
> quadratic model with second order terms?

I suspect that you mean to say that you will increase the number
of VALUES of one of the predictor variables (i.e., A, B, or C)
from two values to three or more values.  Then you can (if that
predictor variable is continuous and if a curved line is needed)
model the relationship between the response variable (i.e., y)
and that predictor variable with a curved line -- i.e., with a
quadratic or higher-order model.

I'm sorry, but I can't answer the question, partly because I'm
not sure what you're asking, but mainly because I haven't studied
computing sums of squares for curvilinear effects.  You might be
able to answer the question by studying the two computer programs
on the web site and by experimenting with modified versions of
the programs.  (They're designed to be easy to modify.)  By a
process of trial and error and by comparing the output from the
programs with output from standard analysis of variance programs
you can discover which models are (in principle) having their re-
sidual sums of squares "differenced" to compute various analysis
of variance sums of squares.


WHICH WAY OF COMPUTING SUMS OF SQUARES IS "BEST"?

Yates' approach to computing analysis of variance sums of squares
by computing the difference between the residual sums of squares
of two models shows us that we can compute analysis of variance
sums of squares for unbalanced experiments in many different
ways.  And (generally) for any given effect (e.g., the main ef-
fect for factor A in a three-way analysis) each way will give us
a different value for the sum of squares.  This raises the
question of which way of computing sums of squares is "best".  I
discuss this question in the paper "Which Sums of Squares Are
Best In Unbalanced Analysis of Variance?", which is available at

                    http://www.matstat.com/ss

Don Macnaughton

------------------------------------------------------- 
Donald B. Macnaughton   MatStat Research Consulting Inc
[EMAIL PROTECTED]      Toronto, Canada
------------------------------------------------------- 






