here is an example to ponder ...
let's say that you are an instructor in a course and have decided to
administer a 100 point final exam ... the very first day of class ... and
then some alternate form of that 100 item test the very last day of class
... in general, to see what people "gain"
now, scores are pretty low on the first day ... and since kids learn a lot
... the scores go up a lot by the end of the course .... have a look at
these data
[character plot: post (vertical axis, ticks at 40, 60, 80) vs pre (horizontal axis, 10.0 to 35.0)]
positive r between pre and post ... makes sense
MTB > desc c16 c17
Descriptive Statistics: pre, post
Variable      N     Mean   Median   TrMean    StDev  SE Mean
pre          30   25.200   25.500   25.615    5.006    0.914
post         30    71.53    76.50    72.23    14.30     2.61

Variable    Minimum  Maximum       Q1       Q3
pre          11.000   34.000   22.000   29.000
post          39.00    95.00    61.50    81.50
MTB > corr c16 c17
Correlations: pre, post
Pearson correlation of pre and post = 0.604
P-Value = 0.000
now, what if you look at the gain ... post minus pre ... then plot the
pre scores against the gain
MTB > plot c30 c16
[character plot: gain (vertical axis, ticks at 20, 40, 60) vs pre (horizontal axis, 10.0 to 35.0)]
MTB > corr c16 c30
Correlations: pre, gain
Pearson correlation of pre and gain = 0.303
P-Value = 0.104
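
for anyone who wants to see that the .303 hangs together with the earlier output ... here is a rough python sketch (not part of the Minitab session) that rebuilds the pre/gain correlation from nothing but the two standard deviations and the pre/post r printed above ... it ASSUMES that gain is simply post minus pre, and the variable names are made up for illustration

```python
import math

# summary values copied from the Minitab output above
sd_pre = 5.006
sd_post = 14.30
r_pre_post = 0.604

# assumption: gain = post - pre, which gives
#   cov(pre, gain) = cov(pre, post) - var(pre)
#   var(gain)      = var(post) + var(pre) - 2 * cov(pre, post)
cov_pre_post = r_pre_post * sd_pre * sd_post
cov_pre_gain = cov_pre_post - sd_pre ** 2
var_gain = sd_post ** 2 + sd_pre ** 2 - 2 * cov_pre_post
sd_gain = math.sqrt(var_gain)

# correlation of pre with gain, from the summary numbers alone
r_pre_gain = cov_pre_gain / (sd_pre * sd_gain)
print(round(r_pre_gain, 3))
```

the result lands within rounding error of the .303 that Minitab reports, since the inputs are themselves rounded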
the correlation between pre and gain is POSITIVE .3 ... not high of course
but, it is POSITIVE
this means that the ones who scored highest on the pre TENDED TO GAIN THE MOST
the ones who scored lowest on the pre ... TENDED TO GAIN THE LEAST
MTB > sort c16(c30), c31(c32);
SUBC> desc c16.
MTB > prin c31 c32
if i sort the pre from high to low and then list the gain ... we can see
easily that the high pres gain more ... in fact, the top 6 gain about 51 points
on average ... while the low 6 gain only about 35 points on average
Row sortpre samegain
1 34 53
2 31 54
3 31 46
4 30 53
5 30 47
6 30 55
7 29 41
8 29 61
9 28 67
10 28 49
11 28 29
12 28 62
13 27 12
14 27 41
15 26 54
16 25 54
17 25 54
18 24 57
19 24 46
20 24 44
21 24 41
22 22 54
23 22 50
24 22 55
25 22 31
26 21 42
27 20 32
28 20 24
29 14 39
30 11 43
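
the listing above can be checked with a few lines of python ... a minimal sketch, not the Minitab session itself ... the (pre, gain) pairs are typed in from the listing, post is rebuilt on the ASSUMPTION that gain = post - pre, and the helper names (mean, pearson_r) are just illustrative

```python
import math

# (pre, gain) pairs copied from the listing above, sorted by pre, high to low
rows = [
    (34, 53), (31, 54), (31, 46), (30, 53), (30, 47), (30, 55),
    (29, 41), (29, 61), (28, 67), (28, 49), (28, 29), (28, 62),
    (27, 12), (27, 41), (26, 54), (25, 54), (25, 54), (24, 57),
    (24, 46), (24, 44), (24, 41), (22, 54), (22, 50), (22, 55),
    (22, 31), (21, 42), (20, 32), (20, 24), (14, 39), (11, 43),
]

def mean(values):
    return sum(values) / len(values)

def pearson_r(xs, ys):
    # plain Pearson correlation from sums of squares and cross products
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

pre = [p for p, _ in rows]
gain = [g for _, g in rows]
post = [p + g for p, g in rows]  # assumption: gain = post - pre

top6_gain = mean(gain[:6])    # six highest pre scores
low6_gain = mean(gain[-6:])   # six lowest pre scores
r_pre_gain = pearson_r(pre, gain)
r_pre_post = pearson_r(pre, post)

print(round(top6_gain, 1), round(low6_gain, 1))
print(round(r_pre_gain, 3), round(r_pre_post, 3))
```

the two averages come out at roughly 51 and 35 points, and the two correlations agree with the Minitab values of .303 and .604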
if you are thinking about regression to the mean in the typical way ... how
come this "regression reversal" seems to have occurred?