Galen Wilkerson wrote:
> 
> Hi,
> 
> I'm trying to get a definitive answer about this:
> 
> Suppose you have two binary random variables, and you get some data
> from them:
> 
> X  Y
> ----
> 
> 0  0
> 1  0
> 1  1
> 1  1
> 1  1
> 1  1
> 1  1
> 0  0
> 
> When you are calculating the correlation coefficient of these two, do
> you count the multiple occurrences of the (1,1) and (0,0) pairs, or
> just use them once?

        Yes, you count multiple occurrences.

        Otherwise, with (say) a million data points, there would be no
difference between

        0       1

0    499999     1
1       1    499999


and 

        0       1

0    250000   250000
1    250000   250000

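since both tables contain the same four distinct (X,Y) pairs, even
though X and Y agree almost perfectly in the first and are independent
in the second. That would be a poor measure of "correlation", and
clearly the Wrong Thing.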

        -Robert Dawson
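
For anyone who wants to check the point numerically, here is a minimal
sketch in Python. It is an illustration only, not part of the original
reply: the helper name pearson_from_counts and the variable names are
made up for this example, and it uses the population form of the
Pearson coefficient, with each distinct (X,Y) pair weighted by its
count.

```python
from collections import Counter
from math import sqrt


def pearson_from_counts(counts):
    """Pearson r for (x, y) pairs, each weighted by how often it occurs.

    `counts` maps an (x, y) pair to its number of occurrences.
    (Hypothetical helper, written for this illustration.)
    """
    n = sum(counts.values())
    mean_x = sum(x * c for (x, _), c in counts.items()) / n
    mean_y = sum(y * c for (_, y), c in counts.items()) / n
    cov = sum(c * (x - mean_x) * (y - mean_y) for (x, y), c in counts.items()) / n
    var_x = sum(c * (x - mean_x) ** 2 for (x, _), c in counts.items()) / n
    var_y = sum(c * (y - mean_y) ** 2 for (_, y), c in counts.items()) / n
    return cov / sqrt(var_x * var_y)


# The eight observations from the question, in the order given.
pairs = [(0, 0), (1, 0), (1, 1), (1, 1), (1, 1), (1, 1), (1, 1), (0, 0)]

# Counting every occurrence: about 0.745.
r_all = pearson_from_counts(Counter(pairs))

# Counting each distinct pair once: about 0.5, a different answer.
r_once = pearson_from_counts({p: 1 for p in set(pairs)})

# The two million-observation tables from the reply, keyed as (row, column).
table_a = {(0, 0): 499999, (0, 1): 1, (1, 0): 1, (1, 1): 499999}
table_b = {(0, 0): 250000, (0, 1): 250000, (1, 0): 250000, (1, 1): 250000}

r_a = pearson_from_counts(table_a)  # very close to 1
r_b = pearson_from_counts(table_b)  # 0

# With each distinct pair counted once, the two tables become identical.
r_a_once = pearson_from_counts({p: 1 for p in table_a})
r_b_once = pearson_from_counts({p: 1 for p in table_b})

print(r_all, r_once, r_a, r_b, r_a_once, r_b_once)
```

With all occurrences counted, the two large tables give coefficients
near 1 and exactly 0; with each pair counted once, both collapse to 0,
which is the loss of information described above.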
