Several matters are not clear.  Comments embedded in the query, below.

On Wed, 9 Jul 2003, Dennis wrote:

> I would like to remove outliers from my repetitive measures design,
> however making it missing removes the whole case of the subject.
This suggests that you're using a software routine that insists on
complete data for each case, and automatically discards the whole case
(as you describe) when any value is <missing>.  Supposing that treating
what you choose to call "outliers" as though they were <missing> is
sensible, choose a routine that accepts missing-data values.
For example, if you're in MINITAB trying to use ANOVA, use GLM instead.
Or if you're using ANOVA in another package, try attacking the problem
with multiple linear regression instead.  That is messier, but sometimes
more informative, and it usually permits pairwise (instead of listwise)
deletion when <missing> values are involved.
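
To make the listwise/pairwise distinction concrete, here is a minimal
sketch in Python.  The data, the subject labels and the function names
are all invented for illustration; this is not how any particular
package implements deletion, only what the two rules amount to.

```python
from itertools import combinations

# Hypothetical repeated-measures data: one row per subject, one column
# per occasion.  None marks a value being treated as <missing>.
data = {
    "s1": [10.0, 12.0, 11.0],
    "s2": [9.0, None, 10.0],
    "s3": [11.0, 13.0, None],
    "s4": [10.0, 11.0, 12.0],
}

def listwise(rows):
    """Keep only subjects with no missing value on any occasion."""
    return {k: v for k, v in rows.items() if all(x is not None for x in v)}

def pairwise_counts(rows):
    """For each pair of occasions, count subjects observed on both."""
    n_cols = len(next(iter(rows.values())))
    return {
        (i, j): sum(
            1 for v in rows.values()
            if v[i] is not None and v[j] is not None
        )
        for i, j in combinations(range(n_cols), 2)
    }

complete = listwise(data)
counts = pairwise_counts(data)
print(sorted(complete))  # ['s1', 's4']
print(counts)            # {(0, 1): 3, (0, 2): 3, (1, 2): 2}
```

Listwise deletion leaves two of the four subjects, whereas each pair of
occasions still has two or three subjects available under pairwise
deletion.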

But as to the alleged "outliers":  how do you know they're outliers,
rather than legitimate, if somewhat extreme, members of your data set?
E.g., is an "outlier" apparently outlying with respect to the entire
data set, or with respect to the particular cell of the design it
happens to inhabit?  Or are you finding an entire CELL to be an
"outlier" cell?  If the latter be the case, you have a problem to
address in terms of the model you're trying to fit, not with finding
substitutes for individual values;  and for such a problem I might have
some useful advice, given details as yet unsupplied.
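
The difference between "outlying with respect to the entire data set"
and "outlying with respect to its cell" can be sketched in Python as
follows.  The screening rule (distance from the median in units of the
scaled MAD, cutoff 3) is an assumption chosen only for illustration, not
a recommendation, and the two cells of data are made up.

```python
from statistics import median

def mad_outliers(values, cutoff=3.0):
    """Return values farther than cutoff * scaled MAD from the median.

    The 1.4826 factor is the usual scaling that makes the MAD comparable
    to a standard deviation for normal data.  The rule breaks down if
    the MAD is zero; that case is not handled here.
    """
    med = median(values)
    mad = 1.4826 * median([abs(x - med) for x in values])
    return [x for x in values if abs(x - med) > cutoff * mad]

# Hypothetical design with two cells whose centres differ.
cells = {
    "A": [10, 11, 9, 10, 12, 10, 11, 16],
    "B": [20, 21, 19, 20, 22, 20, 21, 19],
}

# Screen each cell separately, then screen the pooled data.
within = {name: mad_outliers(vals) for name, vals in cells.items()}
pooled = mad_outliers([x for vals in cells.values() for x in vals])

print(within)  # {'A': [16], 'B': []}
print(pooled)  # []
```

Here the value 16 is extreme within cell A but sits unremarkably between
the two cells when the data are pooled, so which values get called
"outliers" depends on the reference set.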

> I've heard that it's possible to replace outliers with the mean of
> the group.
Yes, well, ANYthing is possible.  Whether it be
advisable is another question.  As Rich Ulrich said (or at least
implied) in his reply, you need to worry about the degree to which
substituting a fictitious value [of whatever kind] for an observed value
causes you to lose some (or even all) of the useful information the
observed value (or the "outlying" values collectively) was trying to
tell you.  Rich gave you advice about your last two questions (below),
including how to find references.
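
One concrete cost of substituting the group mean can be shown with a
small Python sketch.  The numbers are invented, and "mean of the group"
is taken here to be the mean of the remaining (non-outlying) values,
which is an assumption; the query does not say how the mean would be
computed.

```python
from statistics import mean, variance

# Hypothetical group of six scores; 30 is the value labelled an "outlier".
group = [10, 11, 9, 10, 12, 30]
outlier = 30

# Option 1: drop the value (treat it as <missing>).
remaining = [x for x in group if x != outlier]

# Option 2: replace it with the mean of the remaining values (assumed rule).
fill = mean(remaining)
imputed = [fill if x == outlier else x for x in group]

var_dropped = variance(remaining)  # sample variance, n = 5
var_imputed = variance(imputed)    # sample variance, n = 6

print(round(var_dropped, 2), round(var_imputed, 2))  # 1.3 1.04
```

The substituted value sits exactly at the mean, so it adds nothing to
the sum of squares while still being counted as an observation.  The
within-group variance is therefore understated, and any test built on
it will look more precise than the data warrant.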

> I am wondering if it's a standard practice (to use for my thesis),
> and are there any good references?
> If it is acceptable, how should I compute the mean if there are
> several outliers in one group/DV, (or variable in SPSS)?

If there are several "outliers" in one group, and if "outlier" is
defined w.r.t. the entire data set and not w.r.t. this group, you may
not in fact be observing outliers:  you may be observing a group that is
interestingly discrepant from the other groups in your design.
OTOH, if this group's within-group variance is noticeably higher than
that of the other groups, the several outliers may be trying to tell
you something else.  Rich suggested you write an essay on each
outlier:  excellent advice.  When you think you know why an observation
is (or seems to be) an outlier, you may have the beginnings of a handle
on how you might deal with the (possibly spurious) problem(s) raised by
the fact of the "outlier".

 -----------------------------------------------------------------------
 Donald F. Burrill                                         [EMAIL PROTECTED]
 56 Sebbins Pond Drive, Bedford, NH 03110                 (603) 626-0816
