Re: ANCOVA vs. sequential regression

2001-04-23 Thread Elliot Cramer

Paul Swank <[EMAIL PROTECTED]> wrote:
 An interaction is always a test of parallel
lines whether it is factorial ANOVA, ANCOVA, regression, or profile analysis.

Not really.  Interaction was invented by R. A. Fisher for ANOVA, where there
are no lines.  That's like saying that ANOVA is regression.  It isn't, and
many people have screwed up by thinking it is.



=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=



Re: ANCOVA vs. sequential regression

2001-04-23 Thread Donald Burrill

On Mon, 23 Apr 2001, jim clark wrote:

> On 22 Apr 2001, Donald Burrill wrote:
> > If I were doing it, I'd begin with a "full model" (or "augmented model", 
> > in Judd & McClelland's terms) containing three predictors:
> > y  =  b0 + b1*X + b2*A + b3*(AX) + error   [1]
> > where A had been recoded to (0,1) and (AX) = A*X.
> 
> A number of sources (e.g., Aiken & West's Multiple regression:
> testing and interpreting interactions) would recommend centering X 
> first (i.e., subtracting out its mean to produce deviation scores). 

Yes, this is always an option.  It is usually recommended to avoid certain 
computational problems that may arise if the distribution of X has a 
particularly low coefficient of variation, for example, and if the model 
contains many variables (and in particular interactions among them).  
Such problems are unlikely to arise in so simple a model as [1], and are 
more effectively dealt with when they do arise by deliberately
orthogonalizing the predictors.  I've never quite understood why 
deviations from a sample mean, which is after all a random function of 
the particular sample one has, should be preferred either to the original 
values of X (unless there ARE distributional problems) or to deviations 
from some value inherently more meaningful than a sample mean.
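
A minimal sketch of what centering does and does not change in the full model quoted above, using NumPy and made-up numbers. The data, the helper name fit, and the variable names are assumptions for illustration only; nothing here comes from the original poster's data. Under those assumptions, centering X leaves the fit, b1 and b3 untouched, and only b0 and b2 move, because they now describe the groups at the mean of X rather than at X = 0.

```python
import numpy as np

def fit(X, A, y):
    """Least-squares fit of y = b0 + b1*X + b2*A + b3*(A*X).

    Returns the coefficient vector (b0, b1, b2, b3) and the residual SS.
    """
    D = np.column_stack([np.ones_like(X), X, A, A * X])
    b, *_ = np.linalg.lstsq(D, y, rcond=None)
    resid = y - D @ b
    return b, float(resid @ resid)

# Hypothetical data: six cases in each of two groups (A = 0, 1).
X = np.array([10., 12., 14., 16., 18., 20., 11., 13., 15., 17., 19., 21.])
A = np.repeat([0., 1.], 6)
noise = np.array([0.3, -0.2, 0.1, -0.4, 0.2, 0.0,
                  -0.1, 0.4, -0.3, 0.2, 0.0, -0.2])
y = 2.0 + 0.5 * X + 1.0 * A + 0.25 * A * X + noise

b_raw, sse_raw = fit(X, A, y)
b_ctr, sse_ctr = fit(X - X.mean(), A, y)

print("raw X:     ", np.round(b_raw, 3), "SSE =", round(sse_raw, 4))
print("centered X:", np.round(b_ctr, 3), "SSE =", round(sse_ctr, 4))
# Under centering, b2 is the group difference evaluated at X = mean(X):
print("b2_raw + b3_raw * mean(X) =", round(b_raw[2] + b_raw[3] * X.mean(), 3))
```

Replacing X.mean() with any other reference value gives the same algebra, which is the sense in which the sample mean is only one possible choice of origin.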

> You might also consider whether dummy coding (0,1), as recommended by 
> Donald, would be best or perhaps effect coding (-1, 1).

Also a possibility, of course.  Note that the interpretations of the 
several coefficients (b0, b2, and b3 in particular) change with changes 
in coding of the dichotomy A.
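
To make the change of interpretation concrete, here is a small sketch (NumPy, invented data, hypothetical helper name fit) that fits the same model under (0,1) and (-1,1) coding. With E = 2A - 1, the algebra gives c0 = b0 + b2/2, c1 = b1 + b3/2, c2 = b2/2 and c3 = b3/2, so in this parameterization the X coefficient shifts as well: it is the slope in group "0" under dummy coding and the average of the two group slopes under effect coding.

```python
import numpy as np

def fit(X, G, y):
    """Least-squares coefficients of y = c0 + c1*X + c2*G + c3*(G*X)."""
    D = np.column_stack([np.ones_like(X), X, G, G * X])
    coef, *_ = np.linalg.lstsq(D, y, rcond=None)
    return coef

# Hypothetical data: six cases per group.
X = np.array([10., 12., 14., 16., 18., 20., 11., 13., 15., 17., 19., 21.])
A = np.repeat([0., 1.], 6)      # dummy coding (0, 1)
E = 2.0 * A - 1.0               # effect coding (-1, +1)
noise = np.array([0.3, -0.2, 0.1, -0.4, 0.2, 0.0,
                  -0.1, 0.4, -0.3, 0.2, 0.0, -0.2])
y = 2.0 + 0.5 * X + 1.0 * A + 0.25 * A * X + noise

b = fit(X, A, y)   # b0, b1, b2, b3 under dummy coding
c = fit(X, E, y)   # c0, c1, c2, c3 under effect coding

print("dummy  (0,1): ", np.round(b, 3))
print("effect (-1,1):", np.round(c, 3))
# The effect-coded coefficients are simple functions of the dummy-coded ones:
expected_c = np.array([b[0] + b[2] / 2, b[1] + b[3] / 2, b[2] / 2, b[3] / 2])
print("predicted from dummy fit:", np.round(expected_c, 3))
```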
-- DFB.
 
 Donald F. Burrill [EMAIL PROTECTED]
 348 Hyde Hall, Plymouth State College,  [EMAIL PROTECTED]
 MSC #29, Plymouth, NH 03264 603-535-2597
 184 Nashua Road, Bedford, NH 03110  603-472-3742  






Re: ANCOVA vs. sequential regression

2001-04-23 Thread jim clark

Hi

On 22 Apr 2001, Donald Burrill wrote:
> If I were doing it, I'd begin with a "full model" (or "augmented model", 
> in Judd & McClelland's terms) containing three predictors:
>   y  =  b0 + b1*X + b2*A + b3*(AX) + error   [1]
> where A had been recoded to (0,1) and (AX) = A*X.

A number of sources (e.g., Aiken & West's Multiple regression:
testing and interpreting interactions) would recommend centering
X first (i.e., subtracting out its mean to produce deviation
scores).  You might also consider whether dummy coding (0,1), as
recommended by Donald, would be best or perhaps effect coding
(-1, 1).

Best wishes
Jim


James M. Clark                  (204) 786-9757
Department of Psychology        (204) 774-4134 Fax
University of Winnipeg          4L05D
Winnipeg, Manitoba  R3B 2E9     [EMAIL PROTECTED]
CANADA                          http://www.uwinnipeg.ca/~clark







Re: ANCOVA vs. sequential regression

2001-04-22 Thread Donald Burrill

The clearest (and withal concise) exposition of ANCOVA that I ever 
encountered is at the beginning of the third chapter of Tatsuoka's book 
on multivariate analysis.  If you can find a copy, it would both explain 
what ANCOVA is all about, and illuminate the more cryptic responses 
you've already had.

As has already been agreed, you are correct that the "second analysis 
option" is identical to the first, the only difference being whether one 
uses a program labelled "ANCOVA" or one labelled "regression" to carry 
out the work.

FWIW, I'd be inclined to use a regression program, for a couple of
reasons:  (1) one can investigate directly whether A and X interact, 
which is to say whether the slope of the regression of Y upon X differs 
in the groups represented by A, and some ANCOVA programs do not permit 
this even to be thought of;  (2) one can control the order in which the 
predictors are considered, which (in a system like MINITAB that reports 
sequential sums of squares) can be informative. 
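
As a rough illustration of point (2), the sketch below computes sequential sums of squares by hand with NumPy for both orders of entry. This is not MINITAB output; the data are invented so that the groups differ on X and A has no effect of its own, and the helper name sse is hypothetical. The two orders account for the same total, but split it very differently.

```python
import numpy as np

def sse(columns, y):
    """Residual SS after regressing y on an intercept plus the given columns."""
    D = np.column_stack([np.ones_like(y)] + list(columns))
    b, *_ = np.linalg.lstsq(D, y, rcond=None)
    r = y - D @ b
    return float(r @ r)

# Hypothetical data: group 1 scores higher on X; y depends on X only.
X = np.array([4., 5., 6., 7., 8., 9., 8., 9., 10., 11., 12., 13.])
A = np.repeat([0., 1.], 6)
noise = np.array([0.3, -0.2, 0.1, -0.4, 0.2, 0.0,
                  -0.1, 0.4, -0.3, 0.2, 0.0, -0.2])
y = 1.0 + 0.8 * X + noise

ss_total = sse([], y)

# Order 1: X first, then A given X.
ss_X = ss_total - sse([X], y)
ss_A_after_X = sse([X], y) - sse([X, A], y)

# Order 2: A first, then X given A.
ss_A = ss_total - sse([A], y)
ss_X_after_A = sse([A], y) - sse([A, X], y)

print("X, then A|X:", round(ss_X, 3), round(ss_A_after_X, 3))
print("A, then X|A:", round(ss_A, 3), round(ss_X_after_A, 3))
```

In this made-up case A entered first picks up a large sum of squares simply because the groups differ on X, while A entered after X accounts for very little.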

I would agree with the respondent that urged you to plot Y vs X using 
different symbols for the two levels of A.  In MINITAB's character 
graphics, for instance, 
LPLOT 'Y' 'X' 'A'

If I were doing it, I'd begin with a "full model" (or "augmented model", 
in Judd & McClelland's terms) containing three predictors:
y  =  b0 + b1*X + b2*A + b3*(AX) + error   [1]
where A had been recoded to (0,1) and (AX) = A*X.

If b3 is close enough to 0 to disregard, one would be interested in a 
"reduced" (or "compact") model
y  =  b0 + b1*X + b2*A + error   [2]
which fits a common slope to the regression of Y upon X in both groups. 
Otherwise, b3 represents the difference between the slope in group "1" 
and the slope in group "0", and the differences in values of Y between 
groups depend on the value of X.  Whether this is much of a complication 
or not depends on such things as whether the _direction_ of that 
difference differs within the range of X observed ... which pretty well 
requires that one examine the letter-plot described above.

In model [2], one is most likely to be interested in whether b2 is close 
enough to 0 to disregard:  that is, whether the data really require two 
parallel lines in the model, or whether one line suffices, in which case 
one wants to fit the model
y  =  b0 + b1*X + error.   [3]

(Of course, the models [1] [2] and [3] above are not exhaustive.  But 
discussing others would require speculating even more egregiously than 
I've already done about possible shapes of the data...)
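
The chain of comparisons [1] vs. [2] vs. [3] can be sketched as a pair of extra-sum-of-squares F statistics. This is a minimal NumPy sketch on invented data (parallel lines with a real offset between groups); the helper name sse and all numbers are assumptions, and no p-values are computed.

```python
import numpy as np

def sse(columns, y):
    """Residual SS after regressing y on an intercept plus the given columns."""
    D = np.column_stack([np.ones_like(y)] + list(columns))
    b, *_ = np.linalg.lstsq(D, y, rcond=None)
    r = y - D @ b
    return float(r @ r)

# Hypothetical data: common slope, group "1" shifted upward by 2.
X = np.array([10., 12., 14., 16., 18., 20., 11., 13., 15., 17., 19., 21.])
A = np.repeat([0., 1.], 6)
noise = np.array([0.3, -0.2, 0.1, -0.4, 0.2, 0.0,
                  -0.1, 0.4, -0.3, 0.2, 0.0, -0.2])
y = 3.0 + 0.5 * X + 2.0 * A + noise

n = len(y)
sse1 = sse([X, A, A * X], y)   # model [1]: separate slopes
sse2 = sse([X, A], y)          # model [2]: parallel lines
sse3 = sse([X], y)             # model [3]: one line

# Is b3 needed?  Compare [2] against [1]: 1 and n-4 degrees of freedom.
F_b3 = (sse2 - sse1) / (sse1 / (n - 4))
# Is b2 needed?  Compare [3] against [2]: 1 and n-3 degrees of freedom.
F_b2 = (sse3 - sse2) / (sse2 / (n - 3))

print("SSE [1], [2], [3]:", round(sse1, 4), round(sse2, 4), round(sse3, 4))
print("F for b3 (1, %d df): %.3f" % (n - 4, F_b3))
print("F for b2 (1, %d df): %.3f" % (n - 3, F_b2))
```

Each F would be referred to an F table with the stated degrees of freedom, or to scipy.stats.f.sf if SciPy happens to be available.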

On Fri, 20 Apr 2001, William Levine wrote:

> Here is a statistical issue that I have been pondering for a few weeks 
> now, and I am hoping someone can help set me straight.
> 
> A study was conducted to assess whether there were age differences in 
> memory for order independent of memory for items.  Two preexisting 
> groups (younger and older adults - let's call this variable A) were 
> tested for memory for order information (Y).  These groups were also 
> tested for item memory (X). 

One respondent complained that the two groups appeared not to be randomly 
selected.  I can't tell from this description whether that be true or 
not;  but the first question, I should think, is whether in these data 
there appears to be any effect of Age at all.  If there is, one can then 
worry about whether the effect is properly _attributable_ to Age, or to 
any of the variables with which Age is doubtless confounded -- history, 
for one obvious example -- and try to devise a research design that will 
help these potential sources of variability to be disentangled in future 
research.

Also, if there is an "Age" effect, it may be worthwhile (depending partly 
on how much data one has) fitting a model in which Age is allowed to vary 
on a [quasi-]continuum.  (In model [1] above, use something like raw Age 
rather than the dichotomy A, or possibly Age expressed as a deviation 
from some middling value;  in that case, I'd want to express (AX) as the 
part of the product Age*X that is orthogonal both to Age and to X.)
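
One way to obtain "the part of the product Age*X that is orthogonal both to Age and to X" is to regress the product on an intercept, Age and X and keep the residual. The sketch below assumes NumPy and entirely hypothetical ages and scores; it also checks that the coefficient of the interaction term is the same whether the raw or the orthogonalized product is used, since the two designs span the same column space.

```python
import numpy as np

# Hypothetical data: raw age in years and item-memory score X.
age = np.array([21., 24., 30., 35., 42., 50., 58., 63., 70., 77.])
X = np.array([18., 15., 17., 14., 16., 12., 13., 10., 11., 9.])
noise = np.array([0.2, -0.3, 0.1, 0.4, -0.2, -0.1, 0.3, -0.4, 0.1, -0.1])
y = 20.0 - 0.10 * age + 0.60 * X - 0.004 * age * X + noise

# Residualize the product on (1, Age, X).
base = np.column_stack([np.ones_like(age), age, X])
product = age * X
g, *_ = np.linalg.lstsq(base, product, rcond=None)
ax_orth = product - base @ g

# Fit the full model with the raw product and with the orthogonalized one.
b_raw, *_ = np.linalg.lstsq(np.column_stack([base, product]), y, rcond=None)
b_orth, *_ = np.linalg.lstsq(np.column_stack([base, ax_orth]), y, rcond=None)

print("cross-products with (1, Age, X):", np.round(base.T @ ax_orth, 8))
print("interaction coefficient, raw product:   ", round(b_raw[3], 5))
print("interaction coefficient, orthogonalized:", round(b_orth[3], 5))
```

What changes is the meaning of the lower-order coefficients: with the orthogonalized product they match those of the model fitted without any interaction term.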

> Two ways of analyzing these data came to mind. One was to perform an 
> ANCOVA treating X as a covariate. But the two groups differ with 
> respect to X, which would make interpretation of the ANCOVA difficult. 

That might depend on the form of the ANCOVA program output;  another 
reason I prefer using a regression program.  But even in ANCOVA, there 
are only the two predictors, and if Y and X are correlated (as I gather 
they must be, reading between the lines), the only question is whether 
there are different regression lines for each group, or whether one line 
suffices for both:  presuming, of course, that the regression slopes ARE 
parallel, which may not be possible to examine except via a regression 
program.

> Thus, an ANCOVA did not seem like the correct analysis.
> 
> A second analysis option (suggested by a friend) is to perform a 
> sequential

Re: ANCOVA vs. sequential regression

2001-04-21 Thread Elliot Cramer

William B. Ware <[EMAIL PROTECTED]> wrote:
: sequential/hierarchical regression as you note below... however, ANCOVA
: has at least two assumptions that your situation does not meet.  First, it
: assumes that assignment to treatment condition is random.  Second, it
: assumes that the measurement on the covariate is independent of
: treatment.  That is, the covariate should be measured before the treatment
: is implemented.  Thus, I believe that you should implement the
: hierarchical regression... but I'm not certain what question you are

They aren't assumptions, but they do affect interpretations.  Either way is
ANCOVA, which will answer a question.  Write the model comparison and
you'll see that.
  Whether it's the question you want to answer is another





Re: ANCOVA vs. sequential regression

2001-04-20 Thread Rich Ulrich

On Fri, 20 Apr 2001 13:11:02 -0400, "William Levine"
<[EMAIL PROTECTED]> wrote:
 ...
> A study was conducted to assess whether there were age differences in memory
> for order independent of memory for items. Two preexisting groups (younger
> and older adults - let's call this variable A) were tested for memory for
> order information (Y). These groups were also tested for item memory (X).
> 
> Two ways of analyzing these data came to mind. One was to perform an ANCOVA
> treating X as a covariate. But the two groups differ with respect to X,
> which would make interpretation of the ANCOVA difficult. Thus, an ANCOVA did
> not seem like the correct analysis.

 - "potentially problematic" - but not always wrong.

> A second analysis option (suggested by a friend) is to perform a sequential
> regression, entering X first and A second to
> test if there is significant leftover variance explained by A.
 [ snip ...  suggestions? ]

Yes, you are right, that is exactly the same as the ANCOVA.
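
A quick numerical check of that equivalence, sketched with NumPy on invented data (all names and numbers are assumptions): the F for A entered after X in a sequential regression equals the square of the t statistic for the A coefficient in the model y = b0 + b1*X + b2*A, which is the test of adjusted group means in a one-covariate ANCOVA with a common slope.

```python
import numpy as np

# Hypothetical data: two groups that differ on the covariate X.
X = np.array([4., 5., 6., 7., 8., 9., 8., 9., 10., 11., 12., 13.])
A = np.repeat([0., 1.], 6)
noise = np.array([0.3, -0.2, 0.1, -0.4, 0.2, 0.0,
                  -0.1, 0.4, -0.3, 0.2, 0.0, -0.2])
y = 1.0 + 0.8 * X + 0.5 * A + noise
n = len(y)

def fit(D, y):
    """Return least-squares coefficients and residual SS for design matrix D."""
    b, *_ = np.linalg.lstsq(D, y, rcond=None)
    r = y - D @ b
    return b, float(r @ r)

# Sequential regression: X first, then A.
D_x = np.column_stack([np.ones(n), X])
D_xa = np.column_stack([np.ones(n), X, A])
_, sse_x = fit(D_x, y)
b, sse_xa = fit(D_xa, y)
df_error = n - D_xa.shape[1]
F_seq = (sse_x - sse_xa) / (sse_xa / df_error)

# ANCOVA view: t test on the adjusted group difference b2.
mse = sse_xa / df_error
cov_b = mse * np.linalg.inv(D_xa.T @ D_xa)
t_A = b[2] / np.sqrt(cov_b[2, 2])

print("adjusted group difference b2:", round(b[2], 4))
print("F for A after X:", round(F_seq, 4), " t^2 for b2:", round(t_A ** 2, 4))
```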

What can you do?  What can you conclude?  
That depends on  
 - how much you know and trust the *scaling*  of the X measure,
 - how much overlap there is between the groups, and 
 - how much correlation there is between X and Y.

You probably want to start by plotting the data.  When you use
different symbols for Age, what do you see about X and Y? and Age?

Here's a quick example of hard choices when groups don't match.

Assume:
group A improves, on the average, from a mean
score of 4 to 2.  Assume group B improves from 10 to 5.

Then:  
 a) A is definitely better in "simple outcome" at 2 vs. 5;
 b) B is definitely better in "points of improvement" at 5 vs. 2;
 c) A and B fared exactly as well, in terms of "50% improvement"
(dropping towards a 0 that is apparently meaningful).

I would probably opt for that 3rd interpretation, given this set of
numbers, since the 3rd answer preserves a null hypothesis.
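
The three readings of the same four numbers can be laid out in a few lines of Python. The dictionary layout and key names are just an illustrative assumption; the scores are the ones given above, with lower scores taken to be better.

```python
# Mean scores (before, after) for the two groups in the example above.
groups = {"A": (4.0, 2.0), "B": (10.0, 5.0)}

summary = {}
for name, (before, after) in groups.items():
    summary[name] = {
        "outcome": after,                                # a) simple outcome
        "points": before - after,                        # b) points of improvement
        "percent": 100.0 * (before - after) / before,    # c) percent improvement
    }
    print(name, summary[name])

# a) A ends lower (2 vs. 5); b) B drops more points (5 vs. 2);
# c) both improve by the same 50 percent.
```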

With another single set of numbers in hand, I would lean towards 
*whatever*  preserves the null.  But here is where experience is
supposed to be a teacher -- If you have dozens of numbers, 
eventually you have to read them with consistency, instead of
bending an interpretation to fit the latest set.  But if you do have
masses of data on hand, then you should have extra evidence 
about correlations, and about additive or multiplicative scaling.

-- 
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html





Re: ANCOVA vs. sequential regression

2001-04-20 Thread William B. Ware

On Fri, 20 Apr 2001, William Levine wrote:

> A study was conducted to assess whether there were age differences in memory
> for order independent of memory for items. Two preexisting groups (younger
> and older adults - let's call this variable A) were tested for memory for
> order information (Y). These groups were also tested for item memory (X).
> 
> Two ways of analyzing these data came to mind. One was to perform an ANCOVA
> treating X as a covariate. But the two groups differ with respect to X,
> which would make interpretation of the ANCOVA difficult. Thus, an ANCOVA did
> not seem like the correct analysis.

Here's my take on it... The ANCOVA model can be implemented with
sequential/hierarchical regression as you note below... however, ANCOVA
has at least two assumptions that your situation does not meet.  First, it
assumes that assignment to treatment condition is random.  Second, it
assumes that the measurement on the covariate is independent of
treatment.  That is, the covariate should be measured before the treatment
is implemented.  Thus, I believe that you should implement the
hierarchical regression... but I'm not certain what question you are
really answering...

I guess it is whether there is variability in memory for order that is
related to age, that is independent of variability in memory for
items... So, I would not call it an ANCOVA... You might also consider the
possibility of interaction... That is, is the relationship between memory
for order and memory for items the same for younger and older
participants...

WBW

__
William B. Ware, Professor and Chair   Educational Psychology,
CB# 3500                               Measurement, and Evaluation
University of North Carolina           PHONE  (919)-962-7848
Chapel Hill, NC  27599-3500            FAX:   (919)-962-1533
http://www.unc.edu/~wbware/            EMAIL: [EMAIL PROTECTED]
__

> 
> A second analysis option (suggested by a friend) is to perform a sequential
> regression, entering X first and A second to
> test if there is significant leftover variance explained by A.
> 
> This second option sounds to me like the same thing as the first option. In
> an ANCOVA, variance in Y that is predictable by X is removed from the total
> variance, and then variance due to A (adjusted) is tested against variance
> due to S/A (adjusted). In
> the sequential regression, variance in the Y that is predictable by X is
> removed from the total variance, and then the leftover variance in Y is
> regressed on A. Aren't these two analyses identical? If not, what is it that
> differs? Finally, does anyone have any suggestions?
> 
> Many thanks!
> --
> William Levine
> Department of Psychology
> University of North Carolina-Chapel Hill
> Chapel Hill, NC 27599-3270
> [EMAIL PROTECTED]
> http://www.unc.edu/~whlevine


