Re: ANCOVA vs. sequential regression
Paul Swank <[EMAIL PROTECTED]> wrote:

> An interaction is always a test of parallel lines whether it is
> factorial anova, ancova, regression, or profile analysis.

Not really. Interaction was invented by R. A. Fisher for ANOVA, where there are no lines. That's like saying that ANOVA is regression. It isn't, and many people have screwed up by thinking it is.

= Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: ANCOVA vs. sequential regression
On Mon, 23 Apr 2001, jim clark wrote:

> On 22 Apr 2001, Donald Burrill wrote:
> > If I were doing it, I'd begin with a "full model" (or "augmented model",
> > in Judd & McClelland's terms) containing three predictors:
> >     y = b0 + b1*X + b2*A + b3*(AX) + error        [1]
> > where A had been recoded to (0,1) and (AX) = A*X.
>
> A number of sources (e.g., Aiken & West's Multiple regression:
> testing and interpreting interactions) would recommend centering X
> first (i.e., subtracting out its mean to produce deviation scores).

Yes, this is always an option. Usually recommended to avoid certain computational problems that may arise if the distribution of X has a particularly low coefficient of variation, for example, and if the model contains many variables (and in particular interactions among them). Such problems are unlikely to arise in so simple a model as [1], and are more effectively dealt with when they do arise by deliberately orthogonalizing the predictors.

I've never quite understood why deviations from a sample mean, which is after all a random function of the particular sample one has, should be preferred either to the original values of X (unless there ARE distributional problems) or to deviations from some value inherently more meaningful than a sample mean.

> You might also consider whether dummy coding (0,1), as recommended by
> Donald, would be best or perhaps effect coding (-1, 1).

Also a possibility, of course. Note that the interpretations of the several coefficients (b0, b2, and b3 in particular) change with changes in coding of the dichotomy A.

-- DFB.

Donald F. Burrill                          [EMAIL PROTECTED]
348 Hyde Hall, Plymouth State College,     [EMAIL PROTECTED]
MSC #29, Plymouth, NH 03264                603-535-2597
184 Nashua Road, Bedford, NH 03110         603-472-3742
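The remark that the coefficients change meaning with the coding of A can be checked on a toy case. The sketch below is only an illustration: the data are hypothetical and noise-free (group 0 lies exactly on y = 1 + 2x, group 1 on y = 4 + 5x), and the helper name fit_full_model is made up for this example. It fits model [1] twice, once with (0,1) coding and once with (-1,1) coding.

```python
import numpy as np

def fit_full_model(y, x, a):
    """Least-squares fit of y = b0 + b1*x + b2*a + b3*(a*x); returns [b0, b1, b2, b3]."""
    design = np.column_stack([np.ones_like(x), x, a, a * x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

# Hypothetical noise-free data (an assumption for illustration only):
# group 0 follows y = 1 + 2x, group 1 follows y = 4 + 5x.
x = np.array([0.0, 1.0, 2.0, 3.0, 0.0, 1.0, 2.0, 3.0])
group = np.array([0, 0, 0, 0, 1, 1, 1, 1])
y = np.where(group == 0, 1 + 2 * x, 4 + 5 * x)

dummy = group.astype(float)     # (0, 1) coding
effect = 2.0 * group - 1.0      # (-1, 1) coding

b_dummy = fit_full_model(y, x, dummy)
b_effect = fit_full_model(y, x, effect)

print("dummy coding :", np.round(b_dummy, 6))
print("effect coding:", np.round(b_effect, 6))
```

With (0,1) coding, b0 and b1 are the intercept and slope of group 0, while b2 and b3 are the differences in intercept and slope between the groups. With (-1,1) coding, b0 and b1 are the averages of the two intercepts and slopes, and b2 and b3 are half the differences. Both fits describe the same two lines; in this sketch b1 changes along with b0, b2, and b3.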
Re: ANCOVA vs. sequential regression
Hi

On 22 Apr 2001, Donald Burrill wrote:
> If I were doing it, I'd begin with a "full model" (or "augmented model",
> in Judd & McClelland's terms) containing three predictors:
>     y = b0 + b1*X + b2*A + b3*(AX) + error        [1]
> where A had been recoded to (0,1) and (AX) = A*X.

A number of sources (e.g., Aiken & West's Multiple regression: testing and interpreting interactions) would recommend centering X first (i.e., subtracting out its mean to produce deviation scores).

You might also consider whether dummy coding (0,1), as recommended by Donald, would be best or perhaps effect coding (-1, 1).

Best wishes
Jim

James M. Clark                  (204) 786-9757
Department of Psychology        (204) 774-4134 Fax
University of Winnipeg          4L05D
Winnipeg, Manitoba R3B 2E9      [EMAIL PROTECTED]
CANADA                          http://www.uwinnipeg.ca/~clark
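What centering X does to model [1] can be seen in a small sketch. The data are again hypothetical and noise-free (group 0 on y = 1 + 2x, group 1 on y = 4 + 5x, with X observed only from 10 to 13), and the sample mean is used as the centering value purely as an assumption; any other reference value could be passed instead.

```python
import numpy as np

def fit_full_model(y, x, a):
    """Least-squares fit of y = b0 + b1*x + b2*a + b3*(a*x); returns [b0, b1, b2, b3]."""
    design = np.column_stack([np.ones_like(x), x, a, a * x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

# Hypothetical noise-free data (illustration only); X = 0 lies far outside the observed range.
x = np.array([10.0, 11.0, 12.0, 13.0, 10.0, 11.0, 12.0, 13.0])
a = np.array([0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0])   # (0, 1) coding
y = np.where(a == 0, 1 + 2 * x, 4 + 5 * x)

center = x.mean()               # assumed centering value; here 11.5
b_raw = fit_full_model(y, x, a)
b_centered = fit_full_model(y, x - center, a)

print("raw X     :", np.round(b_raw, 6))
print("centered X:", np.round(b_centered, 6))
```

The slope coefficients b1 and b3 are the same in both fits. What changes is b0 and b2: with raw X they describe group 0 and the group difference at X = 0, and with centered X they describe them at X = center. That is why the choice of centering value matters for interpretation more than for the fit itself.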
Re: ANCOVA vs. sequential regression
The clearest (and withal concise) exposition of ANCOVA that I ever encountered is at the beginning of the third chapter of Tatsuoka's book on multivariate analysis. If you can find a copy, it would both explain what ANCOVA is all about, and illuminate the more cryptic responses you've already had.

As has already been agreed, you are correct that the "second analysis option" is identical to the first, the only difference being whether one uses a program labelled "ANCOVA" or one labelled "regression" to carry out the work. FWIW, I'd be inclined to use a regression program, for a couple of reasons:
 (1) one can investigate directly whether A and X interact, which is to say whether the slope of the regression of Y upon X differs in the groups represented by A, and some ANCOVA programs do not permit this even to be thought of;
 (2) one can control the order in which the predictors are considered, which (in a system like MINITAB that reports sequential sums of squares) can be informative.

I would agree with the respondent that urged you to plot Y vs X using different symbols for the two levels of A. In MINITAB's character graphics, for instance,
    LPLOT 'Y' 'X' 'A'

If I were doing it, I'd begin with a "full model" (or "augmented model", in Judd & McClelland's terms) containing three predictors:
    y = b0 + b1*X + b2*A + b3*(AX) + error        [1]
where A had been recoded to (0,1) and (AX) = A*X.

If b3 is close enough to 0 to disregard, one would be interested in a "reduced" (or "compact") model
    y = b0 + b1*X + b2*A + error                  [2]
which fits a common slope to the regression of Y upon X in both groups. Otherwise, b3 represents the difference between the slope in group "1" and the slope in group "0", and the differences in values of Y between groups depend on the value of X. Whether this is much of a complication or not depends on such things as whether the _direction_ of that difference differs within the range of X observed ... which pretty well requires that one examine the letter-plot described above.

In model [2], one is most likely to be interested in whether b2 is close enough to 0 to disregard: that is, whether the data really require two parallel lines in the model, or whether one line suffices, in which case one wants to fit the model
    y = b0 + b1*X + error.                        [3]

(Of course, the models [1] [2] and [3] above are not exhaustive. But discussing others would require speculating even more egregiously than I've already done about possible shapes of the data...)

On Fri, 20 Apr 2001, William Levine wrote:

> Here is a statistical issue that I have been pondering for a few weeks
> now, and I am hoping someone can help set me straight.
>
> A study was conducted to assess whether there were age differences in
> memory for order independent of memory for items. Two preexisting
> groups (younger and older adults - let's call this variable A) were
> tested for memory for order information (Y). These groups were also
> tested for item memory (X).

One respondent complained that the two groups appeared not to be randomly selected. I can't tell from this description whether that be true or not; but the first question, I should think, is whether in these data there appears to be any effect of Age at all. If there is, one can then worry about whether the effect is properly _attributable_ to Age, or to any of the variables with which Age is doubtless confounded -- history, for one obvious example -- and try to devise a research design that will help these potential sources of variability to be disentangled in future research.

Also, if there is an "Age" effect, it may be worthwhile (depending partly on how much data one has) fitting a model in which Age is allowed to vary on a [quasi-]continuum. (In model [1] above, use something like raw Age rather than the dichotomy A, or possibly Age expressed as a deviation from some middling value; in that case, I'd want to express (AX) as the part of the product Age*X that is orthogonal both to Age and to X.)

> Two ways of analyzing these data came to mind. One was to perform an
> ANCOVA treating X as a covariate. But the two groups differ with
> respect to X, which would make interpretation of the ANCOVA difficult.

That might depend on the form of the ANCOVA program output; another reason I prefer using a regression program. But even in ANCOVA, there are only the two predictors, and if Y and X are correlated (as I gather they must be, reading between the lines), the only question is whether there are different regression lines for each group, or whether one line suffices for both: presuming, of course, that the regression slopes ARE parallel, which may not be possible to examine except via a regression program.

> Thus, an ANCOVA did not seem like the correct analysis.
>
> A second analysis option (suggested by a friend) is to perform a
> sequential
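The suggestion to use "the part of the product Age*X that is orthogonal both to Age and to X" can be sketched as a residualization step: regress the raw product on a constant, Age, and X, and keep the residual as the interaction predictor. Everything below is an assumption made for illustration: the data are simulated, 50 is a hypothetical "middling value" for Age, and the helper ols is a made-up name.

```python
import numpy as np

def ols(design, y):
    """Ordinary least-squares coefficients for y regressed on the columns of design."""
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

rng = np.random.default_rng(0)            # simulated data, purely illustrative
n = 40
age = rng.uniform(20, 80, n) - 50.0       # Age as a deviation from 50 (hypothetical value)
x = rng.normal(10, 2, n)
y = 2 + 0.05 * age + 1.0 * x + 0.02 * age * x + rng.normal(0, 1, n)

main = np.column_stack([np.ones(n), age, x])   # constant, Age, X
product = age * x                              # raw product term
ortho = product - main @ ols(main, product)    # part of Age*X orthogonal to 1, Age, X

b_raw = ols(np.column_stack([main, product]), y)    # model with the raw product
b_ortho = ols(np.column_stack([main, ortho]), y)    # model with the orthogonalized product
b_main = ols(main, y)                               # model with no interaction term

print("interaction coefficient, raw vs orthogonalized:", b_raw[3], b_ortho[3])
print("main-effect coefficients with orthogonalized term:", b_ortho[:3])
print("main-effect coefficients with no interaction term:", b_main)
```

The interaction coefficient is the same whichever form of the product is used. The difference is that, with the orthogonalized term, the coefficients for the constant, Age, and X are exactly those of the model without any interaction, so adding the term does not disturb them.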
Re: ANCOVA vs. sequential regression
William B. Ware <[EMAIL PROTECTED]> wrote:

: sequential/hierarchical regression as you note below... however, ANCOVA
: has at least two assumptions that your situation does not meet. First, it
: assumes that assignment to treatment condition is random. Second, it
: assumes that the measurement on the covariate is independent of
: treatment. That is, the covariate should be measured before the treatment
: is implemented. Thus, I believe that you should implement the
: hierarchical regression... but I'm not certain what question you are

They aren't assumptions, but they do affect interpretations. Either way is ANCOVA, which will answer a question. Write the model comparison and you'll see that. Whether it's the question you want to answer is another
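Writing out the model comparison might look like the sketch below. It compares a compact model (X only) with an augmented model (X plus A) by an F test, which is the test obtained from entering X first and A second, and it checks that the same number comes out as the squared t statistic for A in the augmented model. The data are simulated under assumed values and parallel slopes are assumed; the helper fit is a made-up name.

```python
import numpy as np

def fit(design, y):
    """Return (sum of squared residuals, coefficients) for a least-squares fit."""
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    return float(resid @ resid), coef

rng = np.random.default_rng(1)                 # simulated data, purely illustrative
n = 30
a = np.repeat([0.0, 1.0], n // 2)              # two preexisting groups, coded (0, 1)
x = rng.normal(10 + 2 * a, 2.0, size=n)        # the groups differ on the covariate X
y = 1 + 0.8 * x + 1.5 * a + rng.normal(0, 1, size=n)

compact = np.column_stack([np.ones(n), x])         # y = b0 + b1*X + error
augmented = np.column_stack([np.ones(n), x, a])    # y = b0 + b1*X + b2*A + error

sse_c, _ = fit(compact, y)
sse_a, b = fit(augmented, y)
df_error = n - augmented.shape[1]

# F test for adding A after X (1 numerator degree of freedom)
f_stat = (sse_c - sse_a) / (sse_a / df_error)

# The same test read off the augmented model alone: t statistic for b2
cov = (sse_a / df_error) * np.linalg.inv(augmented.T @ augmented)
t_a = b[2] / np.sqrt(cov[2, 2])

print("F for A after X:", f_stat)
print("t^2 for b2     :", t_a ** 2)
```

Under the parallel-slopes model this F is the adjusted group effect that an ANCOVA with X as covariate tests, so the two labels describe one computation. What the number means when the groups were not randomly assigned is a separate matter of interpretation.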
Re: ANCOVA vs. sequential regression
On Fri, 20 Apr 2001 13:11:02 -0400, "William Levine" <[EMAIL PROTECTED]> wrote:

...
> A study was conducted to assess whether there were age differences in memory
> for order independent of memory for items. Two preexisting groups (younger
> and older adults - let's call this variable A) were tested for memory for
> order information (Y). These groups were also tested for item memory (X).
>
> Two ways of analyzing these data came to mind. One was to perform an ANCOVA
> treating X as a covariate. But the two groups differ with respect to X,
> which would make interpretation of the ANCOVA difficult. Thus, an ANCOVA did
> not seem like the correct analysis.

 - "potentially problematic" - but not always wrong.

> A second analysis option (suggested by a friend) is to perform a sequential
> regression, entering X first and A second to
> test if there is significant leftover variance explained by A.

[ snip ... suggestions? ]

Yes, you are right, that is exactly the same as the ANCOVA.

What can you do? What can you conclude? That depends on
 - how much you know and trust the *scaling* of the X measure,
 - how much overlap there is between the groups, and
 - how much correlation there is between X and Y.

You probably want to start by plotting the data. When you use different symbols for Age, what do you see about X and Y? And Age?

Here's a quick example of hard choices when groups don't match. Assume group A improves, on the average, from a mean score of 4 to 2. Assume group B improves from 10 to 5. Then:
 a) A is definitely better in "simple outcome" at 2 vs. 5;
 b) B is definitely better in "points of improvement" at 5 vs. 2;
 c) A and B fared exactly as well, in terms of "50% improvement" (dropping towards a 0 that is apparently meaningful).

I would probably opt for that 3rd interpretation, given this set of numbers, since the 3rd answer preserves a null hypothesis. With another single set of numbers in hand, I would lean towards *whatever* preserves the null.

But here is where experience is supposed to be a teacher -- if you have dozens of numbers, eventually you have to read them with consistency, instead of bending an interpretation to fit the latest set. But if you do have masses of data on hand, then you should have extra evidence about correlations, and about additive or multiplicative scaling.

--
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html
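The three readings of the 4-to-2 versus 10-to-5 example can be laid out as a few lines of arithmetic. This is just the example's own numbers worked through; the dictionary keys are names chosen for this sketch, and lower scores are taken to be better, as in the example.

```python
# (before, after) mean scores from the example; lower is better
scores = {"A": (4.0, 2.0), "B": (10.0, 5.0)}

def summarize(before, after):
    """Three ways of reading the same pair of means."""
    return {
        "outcome": after,                                        # a) simple outcome
        "points_improved": before - after,                       # b) points of improvement
        "percent_improved": 100.0 * (before - after) / before,   # c) proportional improvement
    }

summary = {name: summarize(*pair) for name, pair in scores.items()}

for name, stats in summary.items():
    print(name, stats)
```

Group A has the better outcome (2 vs. 5), group B the larger raw gain (5 vs. 2 points), and both improve by 50%. Which comparison is appropriate depends on whether the scale is taken to be additive or multiplicative.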
Re: ANCOVA vs. sequential regression
On Fri, 20 Apr 2001, William Levine wrote:

> A study was conducted to assess whether there were age differences in memory
> for order independent of memory for items. Two preexisting groups (younger
> and older adults - let's call this variable A) were tested for memory for
> order information (Y). These groups were also tested for item memory (X).
>
> Two ways of analyzing these data came to mind. One was to perform an ANCOVA
> treating X as a covariate. But the two groups differ with respect to X,
> which would make interpretation of the ANCOVA difficult. Thus, an ANCOVA did
> not seem like the correct analysis.

Here's my take on it... The ANCOVA model can be implemented with sequential/hierarchical regression as you note below... however, ANCOVA has at least two assumptions that your situation does not meet. First, it assumes that assignment to treatment condition is random. Second, it assumes that the measurement on the covariate is independent of treatment. That is, the covariate should be measured before the treatment is implemented. Thus, I believe that you should implement the hierarchical regression... but I'm not certain what question you are really answering... I guess it is whether there is variability in memory for order that is related to age, that is independent of variability in memory for items... So, I would not call it an ANCOVA... You might also consider the possibility of interaction... That is, is the relationship between memory for order and memory for items the same for younger and older participants...

WBW

__
William B. Ware, Professor and Chair
Educational Psychology, Measurement, and Evaluation
CB# 3500, University of North Carolina
Chapel Hill, NC 27599-3500
PHONE (919)-962-7848    FAX: (919)-962-1533
http://www.unc.edu/~wbware/    EMAIL: [EMAIL PROTECTED]
__

>
> A second analysis option (suggested by a friend) is to perform a sequential
> regression, entering X first and A second to
> test if there is significant leftover variance explained by A.
>
> This second option sounds to me like the same thing as the first option. In
> an ANCOVA, variance in Y that is predictable by X is removed from the total
> variance, and then variance due to A (adjusted) is tested against variance
> due to S/A (adjusted). In
> the sequential regression, variance in the Y that is predictable by X is
> removed from the total variance, and then the leftover variance in Y is
> regressed on A. Aren't these two analyses identical? If not, what is it that
> differs? Finally, does anyone have any suggestions?
>
> Many thanks!
> --
> William Levine
> Department of Psychology
> University of North Carolina-Chapel Hill
> Chapel Hill, NC 27599-3270
> [EMAIL PROTECTED]
> http://www.unc.edu/~whlevine