Galen Wilkerson wrote:
>
> Hi,
>
> I'm trying to get a definitive answer about this:
>
> Suppose you have two binary random variables, and you get some data
> from them:
>
> X Y
> ----
>
> 0 0
> 1 0
> 1 1
> 1 1
> 1 1
> 1 1
> 1 1
> 0 0
>
> When you are calculating the correlation coefficient of these two, do
> you count the multiple occurrences of the (1,1) and (0,0) pairs, or
> just use them once?
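Yes, you count multiple occurrences: every observation enters the
calculation, however often its pair of values repeats.
Otherwise, with (say) a million observations, there would be no
difference between the table of counts

            0        1
   0   499999        1
   1        1   499999

and

            0        1
   0   250000   250000
   1   250000   250000

because both contain the same four distinct pairs. The first shows
near-perfect agreement and the second none at all, so a measure that
could not tell them apart would be a poor measure of "correlation",
and clearly the Wrong Thing.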
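
To make this concrete, here is a minimal Python sketch (not part of the original exchange) that computes the ordinary Pearson correlation from pair counts. The helper name pearson_from_counts and the variable names are made up for illustration. It is applied to the eight observations in the question, once with every occurrence counted and once with each distinct pair used only once, and then to the two million-observation tables.

```python
from collections import Counter
from math import sqrt


def pearson_from_counts(counts):
    """Pearson correlation from a mapping {(x, y): number of occurrences}."""
    n = sum(counts.values())
    mean_x = sum(x * c for (x, _), c in counts.items()) / n
    mean_y = sum(y * c for (_, y), c in counts.items()) / n
    cov = sum(c * (x - mean_x) * (y - mean_y) for (x, y), c in counts.items()) / n
    var_x = sum(c * (x - mean_x) ** 2 for (x, _), c in counts.items()) / n
    var_y = sum(c * (y - mean_y) ** 2 for (_, y), c in counts.items()) / n
    return cov / sqrt(var_x * var_y)


# The eight (X, Y) observations from the question.
pairs = [(0, 0), (1, 0), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (0, 0)]

# Counting every occurrence: this is the usual correlation coefficient.
r_all = pearson_from_counts(Counter(pairs))        # about 0.745

# Using each distinct pair once: a different (and misleading) number.
r_distinct = pearson_from_counts({p: 1 for p in set(pairs)})  # 0.5

# The two million-observation tables from the reply.
strong = {(0, 0): 499999, (0, 1): 1, (1, 0): 1, (1, 1): 499999}
none = {(0, 0): 250000, (0, 1): 250000, (1, 0): 250000, (1, 1): 250000}
r_strong = pearson_from_counts(strong)             # very close to 1
r_none = pearson_from_counts(none)                 # 0

# Dropping the counts makes both tables identical: four pairs, once each.
r_strong_distinct = pearson_from_counts({p: 1 for p in strong})
r_none_distinct = pearson_from_counts({p: 1 for p in none})

print(r_all, r_distinct)
print(r_strong, r_none, r_strong_distinct, r_none_distinct)
```

With the counts kept, the two tables give very different correlations; with the counts thrown away, both collapse to the same value of zero.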
-Robert Dawson