Alex Yu wrote:
>
> Disadvantages of non-parametric tests:
>
> Losing precision: Edgington (1995) asserted that when more precise
> measurements are available, it is unwise to degrade the precision by
> transforming the measurements into ranked data.
So this is an argument against rank-based nonparametric tests
rather than nonparametric tests in general. In fact, I think
you'll find Edgington highly supportive of randomization procedures,
which are nonparametric.
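(As a quick illustration of the randomization idea - a minimal sketch
in Python, assuming a two-sample setting and using the difference in
means as the test statistic; the function name is mine:

    import numpy as np

    def randomization_pvalue(x, y, n_perm=10000, seed=0):
        # Under H0 the group labels are exchangeable, so compare the
        # observed statistic with its distribution over random
        # relabelings of the pooled data - no distributional
        # assumptions at all.
        rng = np.random.default_rng(seed)
        pooled = np.concatenate([x, y])
        observed = np.mean(x) - np.mean(y)
        exceed = 0
        for _ in range(n_perm):
            perm = rng.permutation(pooled)
            stat = np.mean(perm[:len(x)]) - np.mean(perm[len(x):])
            if abs(stat) >= abs(observed):
                exceed += 1
        return (exceed + 1) / (n_perm + 1)  # two-sided p-value

Note that nothing there ranks the data - the precision of the original
measurements is retained, which is exactly Edgington's concern.)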
Indeed, surprising as it may seem, a lot of the location
information in a two-sample problem is in the ranks. Where
you really start to lose information is in ignoring ordering
when it is present.
> Low power: Generally speaking, the statistical power of non-parametric
> tests is lower than that of their parametric counterparts except on a few
> occasions (Hodges & Lehmann, 1956; Tanizaki, 1997).
When the parametric assumptions hold, yes - e.g. if you assume normality
and the data really *are* normal. When the parametric assumptions are
violated, it isn't hard to beat the standard parametric techniques.
On the other hand, the loss is frequently remarkably small when the
parametric assumption holds exactly. In cases where they both do
badly, the parametric test may outperform the nonparametric one by a
more substantial margin (that is, precisely when you should use
something else anyway - for example, a t-test outperforms a WMW when
the distributions are uniform).
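(If you want to see the trade-off for yourself, here is a rough
simulation sketch - sample sizes and shift made up, the power helper
is mine, the tests are scipy's ttest_ind and mannwhitneyu:

    import numpy as np
    from scipy import stats

    def power(sampler, test, n=30, shift=0.5, n_sim=2000, alpha=0.05):
        # Estimated rejection rate for a pure location shift.
        rng = np.random.default_rng(0)
        rej = 0
        for _ in range(n_sim):
            x = sampler(rng, n)
            y = sampler(rng, n) + shift
            if test(x, y).pvalue < alpha:
                rej += 1
        return rej / n_sim

    wmw = lambda x, y: stats.mannwhitneyu(x, y, alternative="two-sided")
    samplers = {"normal": lambda rng, n: rng.normal(size=n),
                "t(2)":   lambda rng, n: rng.standard_t(2, size=n)}
    for name, s in samplers.items():
        print(name, power(s, stats.ttest_ind), power(s, wmw))

You should typically see the t-test slightly ahead under normality
and WMW well ahead under the heavy-tailed t(2).)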
> Inaccuracy in multiple violations: Non-parametric tests tend to produce
> biased results when multiple assumptions are violated (Glass, 1996;
> Zimmerman, 1998).
Sometimes you only need one violation: some nonparametric procedures
are even more badly affected by certain forms of non-independence
than their parametric equivalents.
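(A quick way to see the non-independence problem - a sketch where H0
is true but the observations within each sample follow an AR(1)
process; the details are made up for illustration:

    import numpy as np
    from scipy import stats

    def ar1(rng, n, rho=0.7):
        # Positively autocorrelated series, standard normal innovations.
        x = np.empty(n)
        x[0] = rng.normal()
        for t in range(1, n):
            x[t] = rho * x[t - 1] + rng.normal()
        return x

    rng = np.random.default_rng(1)
    rej = sum(stats.mannwhitneyu(ar1(rng, 30), ar1(rng, 30),
                                 alternative="two-sided").pvalue < 0.05
              for _ in range(2000)) / 2000
    print(rej)  # nominal level is 0.05; expect something well above it

WMW assumes exchangeability just as surely as the t-test assumes
independence, and dependence can wreck either.)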
> Testing distributions only: Further, non-parametric tests are criticized
> for being incapable of answering focused questions. For example, the
> WMW procedure tests whether the two distributions are different in some
> way but does not show how they differ in mean, variance, or shape. Based
> on this limitation, Johnson (1995) preferred robust procedures and data
> transformation to non-parametric tests.
But since WMW is completely insensitive to a change in spread without
an accompanying change in location, even if either kind of difference
were possible, a rejection would imply that there was indeed a
location difference of some kind. This objection strikes me as strange
indeed. Does Johnson not understand what WMW is doing? Why on earth
does he think that a t-test suffers any less from these problems than
WMW?
Similarly, a change in shape sufficient to get a rejection of a WMW
test would imply a change in location (in the sense that the "middle"
had moved, though the term 'location' becomes somewhat harder to pin
down precisely in this case). e.g. (use a monospaced font to see this):
    :.                      .:
    ::.        =>          .::
    ::::...             ...::::
    a     b             a     b
would imply a different 'location' in some sense, which WMW will
pick up. I don't understand the problem - a t-test will also reject
in this case; it suffers from this drawback as well (i.e. they are
*both* tests that are sensitive to location differences, insensitive
to spread differences without a corresponding location change, and
both pick up a shape change that moves the "middle" of the data).
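(To check that both tests react to the pictured shape flip, here is a
toy version using mirrored beta distributions as stand-ins for the
two panels - an illustration only:

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(2)
    a = rng.beta(2, 5, size=100)  # mass piled up on the left
    b = rng.beta(5, 2, size=100)  # mirror image: mass on the right

    print(stats.ttest_ind(a, b).pvalue)
    print(stats.mannwhitneyu(a, b, alternative="two-sided").pvalue)

Both p-values come out tiny, because the "middle" has moved - exactly
the point above.)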
However, if such a change in shape were anticipated, simply testing
for a location difference (whether by t-test or not) would be silly.
Nonparametric (notably rank-based) tests do have some problems,
but making progress on understanding just what they are is
difficult when such seemingly spurious objections are thrown in.
His preference for robust procedures makes some sense, but the
preference for (presumably monotonic) transformation I would
see as an argument for a rank-based procedure. e.g. let's say
we are in a two-sample situation, and we decide to use a t-test
after taking logs, because the data are then reasonably normal...
in that situation, the WMW procedure gives the same p-value as
for the untransformed data. However, let's assume that the
log-transform wasn't quite right... maybe not strong enough. Only
when you finally find the "right" transformation to normality do
you gain an extra 5% (roughly) efficiency over the WMW you
started with. Except of course, you never know you have the right
transformation - and if the distribution the data come from is
still skewed/heavy-tailed after transformation (maybe the data were
log-gamma to begin with, or something), then you may still be better
off using WMW.
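(The invariance is easy to verify - a rank test sees only the
ordering, and any strictly increasing transformation preserves it.
A small check, with made-up data:

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(3)
    x = rng.lognormal(0.0, 1.0, size=25)
    y = rng.lognormal(0.5, 1.0, size=25)

    raw = stats.mannwhitneyu(x, y, alternative="two-sided").pvalue
    logged = stats.mannwhitneyu(np.log(x), np.log(y),
                                alternative="two-sided").pvalue
    print(raw == logged)  # True: the ranks are unchanged by log

    # The t-test, by contrast, is not invariant:
    print(stats.ttest_ind(x, y).pvalue,
          stats.ttest_ind(np.log(x), np.log(y)).pvalue)
)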
Do you have a full reference for Johnson? I'd like to read what
the reference actually says.
Glen