Well, I appreciate Rich's concern about doing someone else's homework for them. Perhaps I should have identified myself as a professor of human factors and systems in Florida. I often wished I had gone the route of getting my Ph.D. in statistics, but gee, I thought of that too late. Furthermore, after taking my comps, I swore that I would never get another Ph.D.!
My take on things is that the unbalanced design results in messed-up SS values. BUT, the SPSS documentation indicates that the Type III SS terms they compute handle that. OK. So maybe the result of non-random assignment is correlation among the factors, producing overlapping SS terms? After all, ANOVA models require group independence. One might argue that when only two factors are modeled and one of them involves true random assignment, you won't necessarily end up with correlated factors...perhaps I'm wrong on that?
Finally, most folks who do the stuff I mention below treat the participant factor as a fixed factor, when indeed it's more of a random factor. Some may counter that this is not the case because "all" potential levels are being represented (e.g., male, female; low, medium, high IQ). BUT, something tells me that, at a minimum, it is a violation of the ANOVA assumptions to NOT randomly assign individuals to "treatment groups". Indeed, I would argue that participant factors are not "treatment groups" at all and should not be treated as such, but this is done all of the time in the behavioral sciences. To be sure, the lack of random assignment to a factor certainly places great restrictions on one's ability to establish causality, but I see that as more of a research methods issue than a mathematical/statistical issue.
I personally approach such situations with multiple regression, although I can't provide a mathematical proof of why this resolves any of these concerns. I do know that MR is designed to be used with observational data and can certainly accommodate experimental data. Many folks in my discipline prefer ANOVA, presumably because it is easier. They also prefer significance testing to confidence intervals, but that's another story. Regardless, my position is that using personal variables as fixed factors in an ANOVA model violates the assumptions and shouldn't be done.
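
A minimal sketch of the correlated-factors worry above, in plain Python. Everything in it is an assumption made for illustration: the cell counts, the 0/1 coding of a randomly assigned factor A and a participant factor B, and the made-up additive response. None of it comes from real data or from SPSS. It compares the SS for A ignoring B with the SS for A adjusted for B, which is the kind of quantity Type III SS reports in a main-effects-only model.

```python
from math import sqrt

def make_design(cell_counts):
    """Expand {(a, b): n} cell counts into 0/1 code lists for factors A and B."""
    a, b = [], []
    for (ai, bi), n in sorted(cell_counts.items()):
        a += [ai] * n
        b += [bi] * n
    return a, b

def corr(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    return sxy / sqrt(sxx * syy)

def rss(columns, y):
    """Residual SS from least squares of y on an intercept plus the given columns."""
    X = [[1.0] + [float(c[i]) for c in columns] for i in range(len(y))]
    p = len(X[0])
    # Augmented normal equations: (X'X | X'y)
    M = [[sum(r[j] * r[k] for r in X) for k in range(p)]
         + [sum(r[j] * yi for r, yi in zip(X, y))] for j in range(p)]
    # Gauss-Jordan elimination with partial pivoting
    for j in range(p):
        piv = max(range(j, p), key=lambda r: abs(M[r][j]))
        M[j], M[piv] = M[piv], M[j]
        for r in range(p):
            if r != j:
                f = M[r][j] / M[j][j]
                M[r] = [mr - f * mj for mr, mj in zip(M[r], M[j])]
    beta = [M[j][p] / M[j][j] for j in range(p)]
    fitted = [sum(bj * xj for bj, xj in zip(beta, r)) for r in X]
    return sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))

def compare(cell_counts):
    """Return (corr(A, B), SS for A ignoring B, SS for A adjusted for B)."""
    a, b = make_design(cell_counts)
    # Hypothetical response: additive effects of A and B plus a small
    # deterministic wiggle standing in for noise.
    y = [1.0 * ai + 2.0 * bi + 0.1 * ((i % 5) - 2)
         for i, (ai, bi) in enumerate(zip(a, b))]
    ss_a_alone = rss([], y) - rss([a], y)        # A entered first
    ss_a_given_b = rss([b], y) - rss([a, b], y)  # A entered after B
    return corr(a, b), ss_a_alone, ss_a_given_b

# Assumed cell counts: equal cells versus cells that track the participant factor.
balanced = compare({(0, 0): 10, (0, 1): 10, (1, 0): 10, (1, 1): 10})
unbalanced = compare({(0, 0): 15, (0, 1): 5, (1, 0): 5, (1, 1): 15})

for label, (r, ss_alone, ss_adj) in (("balanced", balanced), ("unbalanced", unbalanced)):
    print(label, "corr(A,B) =", round(r, 3),
          "SS(A) =", round(ss_alone, 3), "SS(A|B) =", round(ss_adj, 3))
```

With equal cells the two factors are uncorrelated and both SS values for A agree. With the unequal cells assumed here the factors correlate at 0.5, and the unadjusted SS for A (about 40) is much larger than the adjusted one (about 7.5), because the unadjusted term absorbs part of B's effect. Fitting the same columns as predictors in a multiple regression gives the adjusted figure, which is one way to see why regression and Type III SS line up in this simple case.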
The problem is that I don't understand what the mathematical implications of violating the random assignment assumption are. Perhaps we can start a discussion on this topic and you folks can school a psychology professor? If not, could you point me to a text (that can be read by someone without extensive training in mathematics or statistics) that might address the scenario described below? I've looked in the most thorough texts that I know of (hey, they're all stats books for the behavioral sciences) and have not been able to figure out the answer, but I am sure that there are texts geared towards other disciplines that have the answer in them.

Regards,
Steve

Steve Hall, Ph.D.
Department of Human Factors and Systems
Embry-Riddle Aeronautical University
Daytona Beach, FL 32114-3900
V:386.226.7106 F:386.226.7050
[EMAIL PROTECTED]

"Rich Ulrich" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED]
> On Wed, 10 Mar 2004 15:46:31 -0500, "Steve" <[EMAIL PROTECTED]>
> wrote:
>
> > Consider the case where an experimental variable along with a "participant
> > variable" are used in a study. That is, levels of one variable are
> > manipulated by the experimenter and levels of another variable are based on
> > some inherent participant attribute (i.e. IQ, hair color, race, gender,
> > etc.). Let's consider one of three situations: 1) the number of participants
> > per cell varies in proportion to the respective populations 2) the
> > experimenter uses quota sampling to ensure equal cell sample sizes 3) when
> > applicable, cut scores are chosen in order to produce equal cell-sizes.
> >
> > Such situations are common in the behavioral sciences. Furthermore, ANOVA is
> > commonly used to analyze data collected using such paradigms.
> >
> > So here are three questions:
>
> If you don't get an answer, it is probably (it seems to me)
> because this reads entirely as if you are passing along to us
> your homework assignment.
>
> Unless a question is controversial or difficult, the folks who
> answer tend to find 'homework' boring. Then there's the
> moral problem.
>
> >
> > 1. From an analysis point of view, what problems in data analysis are likely
> > to arise?
> > 2. IF these problems are not addressed, what are the implications?
> > 3. What is the proper way to analyze such data?
>
> If you lay out *your* answers, someone might offer
> observations that hint at other things.
>
> --
> Rich Ulrich, [EMAIL PROTECTED]
> http://www.pitt.edu/~wpilib/index.html
> - I need a new job, after March 31. Openings? -

=================================================================
Instructions for joining and leaving this list, remarks about the
problem of INAPPROPRIATE MESSAGES, and archives are available at:
http://jse.stat.ncsu.edu/
=================================================================
