I sent the message below to sci.stat.edu on February 3. However, presumably due to the recent EdStat technical problems, the message wasn't received by at least some EdStat subscribers. So I've sent the message again. My apologies if you've received it twice.
"Joe" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED] > Don, > > Thanks for replying. > > I have a few further questions .... > > Let's say I am going by the HTO method and I wish to calculate > the SS for factor A. I understand that I should not use the > "whole" model. A more standard name for the "whole model" is the "saturated model". The following discussion assumes that we have no "empty" cells in the analysis. Empty cells make the analysis more complicated. Most real experiments don't have empty cells because the re- searcher takes steps to prevent their occurrence. The idea of an empty cell is discussed further in the paper referenced at the end of this post. > My question is then, what is the "other" model in this case. The two models whose residual sums of squares must be "differ- enced" to compute the SAS Type 2 main effect sum of squares for factor A in a three-way analysis of variance are given in lines 1226 and 1227 in the computer program at http://www.matstat.com/ss/pr0165.htm The two models for the HTO sums of squares for the same effect are given in lines 1371 and 1372 of the same program. > And, also is this "other" model unique and will it lead to > unique value of SS. Interestingly, the parameter estimates of the models under dis- cussion aren't unique. (The so-called sigma-restrictions are of- ten used to generate unique solutions for the parameter esti- mates.) Perhaps somewhat surprisingly, despite the fact that the parameter estimates aren't unique, all the models give unique re- sidual sums of squares. Thus, when we subtract one residual sum of squares from another we get a unique sum of squares for the associated effect. However, as can be seen from the computer program, we have many options for specifying the two models in three-way and higher cases and these pairs of models will all tend to give (in the un- balanced case) different values for "the" sum of squares for an effect. 
Thus, for example, in the unbalanced case the SAS Type 2 sum of squares for the A effect will be "unique", but it will generally be different from the HTO sum of squares for the same effect, which will also be "unique".

>
> How about when we calculate the SS for factor B

The two models for the SAS Type 2 B effect are given in lines 1233 and 1234 of the above program and the two models for the HTO B effect are given in lines 1378 and 1379.

>
> Another issue that is troubling me is that of term "ABC". Why
> is it essential to include it in case of HTI?

The HTI (i.e., SAS Type 3) sums of squares have (by definition) Higher-level Terms Included (i.e., HTI) in both models. In the case in question the sum of squares for the A effect is being computed. In this case the ABC term is a higher-level term so it must be included (along with the three other higher-level terms) in both models.

The two models whose residual sums of squares must be differenced to compute the HTI (SAS Type 3) main effect for factor A in a three-way analysis of variance are given in lines 1283 and 1284 of the above program.

> What if I increase my number of experiments and try to fit a
> quadratic model with second order terms?

I suspect that you mean to say that you will increase the number of VALUES of one of the predictor variables (i.e., A, B, or C) from two values to three or more values. Then you can (if that predictor variable is continuous and if a curved line is needed) model the relationship between the response variable (i.e., y) and that predictor variable with a curved line -- i.e., with a quadratic or higher-order model.

I'm sorry, but I can't answer the question, partly because I'm not sure what you're asking, but mainly because I haven't studied computing sums of squares for curvilinear effects.

You might be able to answer the question by studying the two computer programs on the web site and by experimenting with modified versions of the programs.
(They're designed to be easy to modify.) By a process of trial and error and by comparing the output from the programs with output from standard analysis of variance programs you can discover which models are (in principle) having their residual sums of squares "differenced" to compute various analysis of variance sums of squares.

WHICH WAY OF COMPUTING SUMS OF SQUARES IS "BEST"?

Yates' approach to computing analysis of variance sums of squares by computing the difference between the residual sums of squares of two models shows us that we can compute analysis of variance sums of squares for unbalanced experiments in many different ways. And (generally) for any given effect (e.g., the main effect for factor A in a three-way analysis) each way will give us a different value for the sum of squares. This raises the question of which way of computing sums of squares is "best". I discuss this question in a paper. The title is "Which Sums of Squares Are Best In Unbalanced Analysis of Variance?" The paper is available at

http://www.matstat.com/ss

Don Macnaughton

-------------------------------------------------------
Donald B. Macnaughton   MatStat Research Consulting Inc
[EMAIL PROTECTED]       Toronto, Canada
-------------------------------------------------------
