On Sat, 13 Mar 2004, John Gregory wrote:

> In Pearson's Correlation Coefficient formula, the Greek symbol for Sum
> (backward E) has a small "n" preceding it at half height.

The mathematical symbol for "Sum" is the Greek letter sigma (a capital
sigma), corresponding to the capital S in the word Sum (which is why it
was chosen in the first place;  the corresponding symbol for "Product"
in multiplication is a capital pi).  If there is a lower-case "n"
preceding it, the value of "n" is to be multiplied by the sum, once the
sum has been obtained.

The standard methods of indicating the range of summation, which is what
you described subsequently, are:
 (1) when each individual value in the formula is indicated or implied
by a subscript (say "i"), a line of small type under the capital Sigma
reads "i = 1" and above the capital Sigma "n";  this would be read "Sum
<the following quantities> for i = 1 to n".  This usage permits one to
indicate partial summations (e.g., from 7 to 31 instead of from 1 to n).
 (2) when the particular quantities to be summed are obvious from the
context (e.g., for all the data in hand), the capital Sigma may be
unadorned, the implication being "for all the data" or "from 1 to n".
 (3) sometimes, as shorthand for (1) above, a subscript "i" is attached
to the Sigma.  This form is more commonly used when the items to be
summed are indexed in two dimensions (say "i" and "j") but the summing
is to be carried out only with respect to one of them.
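
For reference, the notations described above might be typeset as
follows.  This is only a sketch:  the names x_i and x_{ij} are
placeholders for whatever is being summed, not symbols taken from your
formula.

```latex
% (1) explicit range, and a partial summation from 7 to 31
\[ \sum_{i=1}^{n} x_i = x_1 + x_2 + \cdots + x_n ,
   \qquad \sum_{i=7}^{31} x_i \]

% (2) unadorned Sigma: "for all the data"
\[ \sum x \]

% (3) subscript shorthand: sum over i only, leaving j fixed
\[ \sum_{i} x_{ij} \]

% a lower-case n written before the Sigma multiplies the finished sum
\[ n \sum xy = n \times \Bigl( \sum xy \Bigr) \]
```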

> I think it's to be read something to the effect "Sum the following and
> perform the other operations in sequence for each set of variable in
> the list you're working from; or in other words... row by row from the
> 1st to as many as there are (this "n")."

Close.  Depends on whether "the other operations in sequence" are
intended to be performed before summing or afterward.  This is made
explicit by parentheses in the formula.  In particular, in one of the
numerous formulas for the Pearson correlation coefficient (no need for
capital C's, it's not a title of nobility) you may have an expression
like
  SUM((X_i - Xbar)(Y_i - Ybar))
 (using "_i" for the subscript "i", and "Xbar" for the sample mean of X)
in which case the algorithm implied is
 subtract the mean of X from the current value of X;
 subtract the mean of Y from the current value of Y;
 multiply these two differences together;
 and sum all such products.

Or, if you have a formula part of which reads
  n SUM XY - (SUM X)(SUM Y)
the algorithm implied is
 add up all the X values (= SUM X);
 add up all the Y values (= SUM Y);
 count the number of values (= n);
 for each case, multiply X by Y (= XY) and sum all these products;
 multiply this last by "n";
 subtract from this the product of the first two sums.
 (This quantity will be n times as large as the result of the previous
formula, by the way.)
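
If it helps to see the two algorithms side by side, here is a minimal
Python sketch of them.  The function names and the five data pairs are
made up for illustration;  nothing here comes from your formula sheet.
It also checks the "n times as large" remark for that one data set.

```python
def sum_of_cross_deviations(xs, ys):
    """SUM((X_i - Xbar)(Y_i - Ybar)): subtract the means, multiply, then sum."""
    n = len(xs)
    xbar = sum(xs) / n  # sample mean of X
    ybar = sum(ys) / n  # sample mean of Y
    return sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))


def n_sum_xy_minus_product_of_sums(xs, ys):
    """n SUM XY - (SUM X)(SUM Y), following the steps in the order listed."""
    sum_x = sum(xs)                              # add up all the X values
    sum_y = sum(ys)                              # add up all the Y values
    n = len(xs)                                  # count the number of values
    sum_xy = sum(x * y for x, y in zip(xs, ys))  # sum the products XY
    return n * sum_xy - sum_x * sum_y            # multiply by n, then subtract


# Hypothetical data, chosen only so the arithmetic is easy to check by hand.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]

a = sum_of_cross_deviations(xs, ys)
b = n_sum_xy_minus_product_of_sums(xs, ys)
print(a, b, len(xs) * a)
```

For these data the first quantity is 6 and the second is 30, which is
n = 5 times the first.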

> Is that how I'd express this verbally? Been a long, long time since
> I've had to deal with the formula.

 ------------------------------------------------------------
 Donald F. Burrill                              [EMAIL PROTECTED]
 56 Sebbins Pond Drive, Bedford, NH 03110      (603) 626-0816