First, let's consider the two-observation case. I have two assessments of a
behavior rating taken 20 minutes apart, and I wish to know how reliable the
assessments are. There are two potential sources of error: the relative
error over time, in which the order of scores for subject a and subject b on
the two assessments may be the same or different, and the absolute error, in
which all subjects may be lower on the second assessment. If I compute a
Pearson correlation between the two, I find a correlation of .78097 (n = 313,
p < .0001). I then do an analysis of variance with repeated measures on time
(the equivalent of the paired t-test) and find a significant difference
between the means (time 1: mean = 3.377, sd = 1.10; time 2: mean = 3.291,
sd = 1.16; F(1, 312) = 4.16; p = .0422). Now I do a generalizability
analysis and find the following variance components:

Subjects                                .99269
Time                                    .00300
Subjects by Time                        .27842

The generalizability coefficient (or ICC) considering only the relative
error (the subjects-by-time interaction) is

.99269 / (.99269 + .27842) = .99269 / 1.27111 = .78096

which is the Pearson correlation within rounding. I then compute the
coefficient taking the mean difference (the time component) into account as
well:

.99269 / (.99269 + .003 + .27842) = .99269 / 1.27411 = .779.

The mean difference has had a minimal effect on the reliability, as should
be obvious from the variance component for time, which is very small
relative to the other variance components.
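
The arithmetic above can be reproduced with a few lines of Python. This is
only a sketch: the variance components are the ones reported in this
message, while the function and variable names are made up for illustration.

```python
# Variance components as reported in this message.
var_subjects = 0.99269
var_time = 0.00300
var_subjects_by_time = 0.27842


def relative_coefficient(var_p, var_pt):
    """Coefficient counting only the subjects-by-time (relative) error."""
    return var_p / (var_p + var_pt)


def absolute_coefficient(var_p, var_t, var_pt):
    """Coefficient that also counts the time main effect (mean shift) as error."""
    return var_p / (var_p + var_t + var_pt)


rel = relative_coefficient(var_subjects, var_subjects_by_time)
absolute = absolute_coefficient(var_subjects, var_time, var_subjects_by_time)

print(f"relative coefficient: {rel:.5f}")
print(f"absolute coefficient: {absolute:.5f}")
print(f"drop due to the time effect: {rel - absolute:.5f}")
```

The only difference between the two functions is whether the time component
appears in the denominator, so the gap between the two coefficients is driven
entirely by how large that component is relative to the others.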

Thus, even though the difference between time 1 and time 2 is significant
(due in part to the large sample and the strong correlation between two
observations taken 20 minutes apart), the effect on the reliability is
small. I could observe that in the means as well, since they are very close,
but when people see two means, many of them want to know whether they are
statistically different.

Add to this the fact that in reality I have 5 assessments of the observed
variable over an hour's time. The generalizability result is much easier to
deal with than 10 unique Pearson correlations and an ANOVA (hopefully not
10 paired t-tests), and it becomes clear that the generalizability analysis
is cleaner than breaking the analysis into two parts.
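
For readers who want to see where such variance components can come from,
here is a minimal sketch for a subjects-by-occasions table with one score
per cell. It assumes the usual expected-mean-squares (ANOVA) estimators for
a fully crossed design; the message does not say which estimation method was
actually used, and the scores below are made up purely for illustration.

```python
def variance_components(scores):
    """Estimate (subjects, time, subjects-by-time) variance components.

    scores: list of rows, one row per subject, one column per occasion.
    Uses mean squares from a two-way crossed design with one score per cell;
    negative estimates are truncated at zero.
    """
    n_p = len(scores)      # number of subjects
    n_t = len(scores[0])   # number of occasions
    grand = sum(sum(row) for row in scores) / (n_p * n_t)
    subj_means = [sum(row) / n_t for row in scores]
    time_means = [sum(row[j] for row in scores) / n_p for j in range(n_t)]

    ss_p = n_t * sum((m - grand) ** 2 for m in subj_means)
    ss_t = n_p * sum((m - grand) ** 2 for m in time_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_res = ss_total - ss_p - ss_t

    ms_p = ss_p / (n_p - 1)
    ms_t = ss_t / (n_t - 1)
    ms_res = ss_res / ((n_p - 1) * (n_t - 1))

    var_pt = ms_res  # interaction is confounded with residual error
    var_p = max((ms_p - ms_res) / n_t, 0.0)
    var_t = max((ms_t - ms_res) / n_p, 0.0)
    return var_p, var_t, var_pt


# Hypothetical ratings: 4 subjects, 5 occasions (not the data from this message).
example_scores = [
    [3, 3, 4, 3, 3],
    [5, 4, 5, 5, 4],
    [2, 2, 1, 2, 2],
    [4, 4, 4, 3, 4],
]
var_p, var_t, var_pt = variance_components(example_scores)
print(var_p, var_t, var_pt)

# Number of distinct pairwise correlations among 5 occasions.
n_occasions = 5
n_pairs = n_occasions * (n_occasions - 1) // 2
print(n_pairs)
```

One call covers all five occasions at once, whereas the correlation
approach needs a separate coefficient for every pair of occasions.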

Paul R. Swank, Ph.D. 
Professor, Developmental Pediatrics
Medical School
UT Health Science Center at Houston 


-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On
Behalf Of Richard Ulrich
Sent: Wednesday, May 12, 2004 2:52 PM
To: [EMAIL PROTECTED]
Subject: Re: [edstat] paired t-test for test-retest reliability reference?


On 12 May 2004 06:37:30 -0700, [EMAIL PROTECTED] (Paul R Swank)
wrote:

> And doing a Pearson correlation and a t-test doesn't tell you the 
> overall impact of the error.

If the t-test is *not* relevant, which may be true for test-retest, the
Pearson can be a more proper measure of impact than the ICC, which slightly
decreases the reported score.
 - There are extra issues for your study if the variances should not be
pooled, for any choice of coefficient.

If the t-test *is* relevant, it can be a warning of a grievous impact, all
by itself; and that warning is generally masked by reporting an ICC which
may be only slightly less than the Pearson r.

Those are two reasons why the two tests together are better for
*examining*  your data, than looking at ICCs.

Yes, it is the overall impact, and that can be useful for the *final*
statement, especially when a very precise statement of overall impact is
warranted -- because, for instance, power analyses are being based on the
exact value of the exact form of ICC that is needed: same versus different
raters; single versus multiple scorers.

And I think it is an over-generalization to prefer an ICC when the issue is
the cruder one of apparent adequacy.  The ICC is less informative (about
means) and less transparent (multiple versions available to select, all of
them burying the means).

[snip, rest]

-- 
Rich Ulrich, [EMAIL PROTECTED] http://www.pitt.edu/~wpilib/index.html
.
. =================================================================
Instructions for joining and leaving this list, remarks about the problem of
INAPPROPRIATE MESSAGES, and archives are available at:
.                  http://jse.stat.ncsu.edu/                    .
=================================================================
