Re: stan error of r

James H. Steiger Tue, 03 Apr 2001 15:36:10 -0700
David and others, 

Some comments follow which I hope will be of some interest.


On 28 Mar 2001 15:52:49 -0800, [EMAIL PROTECTED] (David C. Howell)
wrote:

>
>Dennis,
>
>The closest answer that I can find as an answer to your question is that "it is
>very complicated."

The exact distribution under bivariate normality is given in Kendall
and Stuart, "The Advanced Theory of Statistics". You are absolutely
right: it is very complicated. Computing the cumulative distribution
accurately is tricky in C or Fortran and requires extensive,
careful programming. My computer program, Statistica Power Analysis,
computes the exact cumulative distribution of r for any population
correlation rho (under the traditional, unrealistic assumption of
bivariate normality).

> But the first question is what you would do with the answer
>if you had it? Because the distribution of r is skewed when rho is not equal to
>0, you wouldn't want to use that standard error to create a confidence
>interval--it would be too wide on one side, and not wide enough on the other.

In a sense this is right, but in another sense it really isn't.

The exact confidence interval can be computed, and is, by Statistica
Power Analysis. If you mean that you cannot compute a simple
confidence interval, using a formula like

     r  (plus or minus) 1.96 * SE(r)

where SE(r) is the standard error of r,
of course you are correct. But this method for computing confidence
intervals is only a special case of a much more general method,
called "the inversion method,"  discussed in great detail by Steiger
and Fouladi (1997) in the Erlbaum book, "What if there were no
significance tests?"  This "inversion" method, using advanced
software, can compute such an exact confidence interval in many
cases where simpler approaches cannot. The use of this method
opens up many new opportunities for using confidence intervals
in place of hypothesis tests.
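
To make the inversion idea concrete, here is a minimal Python sketch.
It is not the exact computation performed by Statistica Power Analysis:
as a stand-in for the exact cumulative distribution of r it assumes the
Fisher z normal approximation, and the names approx_cdf_r, solve_rho,
and inversion_ci are hypothetical. The confidence limits are the two
values of rho at which the observed r sits at the upper and lower
alpha/2 points of its sampling distribution.

```python
from math import atanh, sqrt
from statistics import NormalDist


def approx_cdf_r(r_obs, rho, n):
    """Approximate P(r <= r_obs | rho, n) via the Fisher z transform.

    ASSUMPTION: this is a stand-in for the exact distribution of r,
    which is much harder to compute accurately.
    """
    z = (atanh(r_obs) - atanh(rho)) * sqrt(n - 3)
    return NormalDist().cdf(z)


def solve_rho(r_obs, n, target, cdf, tol=1e-10):
    """Bisect for the rho at which cdf(r_obs, rho, n) equals target.

    Relies on the CDF evaluated at r_obs decreasing as rho increases.
    """
    lo, hi = -0.999999, 0.999999
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if cdf(r_obs, mid, n) > target:
            lo = mid  # CDF still too large, so rho must be larger
        else:
            hi = mid
    return (lo + hi) / 2.0


def inversion_ci(r_obs, n, alpha=0.05, cdf=approx_cdf_r):
    """Confidence interval for rho obtained by inverting the CDF of r."""
    lower = solve_rho(r_obs, n, 1.0 - alpha / 2.0, cdf)
    upper = solve_rho(r_obs, n, alpha / 2.0, cdf)
    return lower, upper


if __name__ == "__main__":
    low, high = inversion_ci(0.5, 50)
    print(f"95% interval for rho: ({low:.3f}, {high:.3f})")
```

Because the CDF is passed in as an argument, a routine for the exact
distribution of r could be substituted without changing the
root-finding step. With the Fisher z stand-in, the result simply
matches the familiar back-transformed z interval.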

The Steiger and Fouladi (1997) article not only describes how to
compute exact confidence intervals on rho, the population correlation,
it also describes how to do so for the squared multiple correlation,
and many other interesting quantities, such as the root mean
square standardized effect in ANOVA. This latter confidence
interval replaces the F test, including all the information
available in an ANOVA F-test, and more. BTW, the
ANOVA confidence intervals are also computed in Statistica
Power Analysis. 

>
>There are two other answers. I'm sure that you are aware that we can convert r
>to r' (or z' as Fisher called it), and that its standard error is estimated
>well by 1/sqrt(N-3). The other approach would be to estimate the standard error
>by bootstrapping. That is actually a relatively simple process, but, again, I
>don't know what I would do with the answer once I found it.

The exact standard error of r is not very valuable, although it can be
computed directly using advanced symbolic software like Mathematica.
Using Mathematica, you can compute it in a few lines of code.


The standard error of r can be approximated by the square root of
the asymptotic formula for the variance of r, which is

   Var(r) = (1/N) * (1 - rho^2)^2

This formula, by the way, is not always given correctly. For example,
the classic book by Glass and Stanley has it wrong in several places.
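
As a small illustration, the asymptotic standard error can be computed
as below. The function name approx_se_r is hypothetical. In practice
rho is unknown, so plugging in the sample r for rho is a further
approximation and is labeled as such here.

```python
from math import sqrt


def approx_se_r(rho, n):
    """Large-sample standard error of r under bivariate normality.

    Square root of Var(r) = (1/N) * (1 - rho^2)^2,
    which simplifies to (1 - rho^2) / sqrt(N).
    """
    return (1.0 - rho ** 2) / sqrt(n)


if __name__ == "__main__":
    # ASSUMPTION: rho = 0.5 and N = 100 are illustrative values only.
    print(approx_se_r(0.5, 100))
```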

Interestingly, when rho is zero, the above formula reduces to 1/N, and
leads to a very simple, oft-forgotten formula for a "quick and dirty"
significance test for a single correlation.

By standard asymptotic theory, the test statistic 

             Z = r / Sqrt(1/N) = Sqrt(N) r

has an asymptotically N(0,1) distribution if rho=0. Moreover, the
distribution of r is rather symmetric and close to normal when rho=0,
so this formula is quite accurate.

Consequently, as a quick and dirty test at the .05 level, simply
examine whether the absolute value of the above statistic exceeds
1.96:

             |Z| > 1.96

This, in turn, leads to "quick and dirty" "significance points"
for r, because

             |Z| > 1.96

is the same as |r| > 1.96/Sqrt(N), or, roughly,
|r| > 2/Sqrt(N) = Sqrt(4/N).

I call the formula Sqrt(4/N) the "blunt method."


Compare 1.96/Sqrt(N) with the tabled "critical values"
of r at the .05 level.

          N     Exact     Quick     Blunt
       ------------------------------------
        200     .139      .139      .141
        100     .197      .196      .200
         50     .278      .277      .283
         25     .396      .392      .400
       ------------------------------------
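
The Quick and Blunt columns can be reproduced with a few lines of
Python. This is only a sketch: the function names are hypothetical,
and the Exact column is copied from the table rather than recomputed,
since that would need a t-distribution quantile routine.

```python
from math import sqrt

# "Exact" column copied from the table in this post, not recomputed here.
TABLED_EXACT = {200: 0.139, 100: 0.197, 50: 0.278, 25: 0.396}


def quick_critical_r(n):
    """Quick and dirty .05-level significance point: 1.96 / Sqrt(N)."""
    return 1.96 / sqrt(n)


def blunt_critical_r(n):
    """Blunt method: Sqrt(4 / N)."""
    return sqrt(4.0 / n)


def comparison_rows():
    """Return (N, exact, quick, blunt) tuples rounded to three decimals."""
    return [
        (n, exact, round(quick_critical_r(n), 3), round(blunt_critical_r(n), 3))
        for n, exact in TABLED_EXACT.items()
    ]


if __name__ == "__main__":
    print(f"{'N':>5} {'Exact':>7} {'Quick':>7} {'Blunt':>7}")
    for n, exact, quick, blunt in comparison_rows():
        print(f"{n:>5} {exact:>7.3f} {quick:>7.3f} {blunt:>7.3f}")
```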

Having the quick and dirty formula or blunt formula in the back of
one's mind allows one to have an intuitive appreciation for
sample sizes necessary for reasonable correlational
analysis, without having to carry around a textbook.

I quickly add that all the above formulas depend, more
or less, on the highly questionable assumption
of bivariate normality.  So bootstrapping may
be an excellent idea in any case.
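
For readers who want to try the bootstrap, a minimal sketch of a
bootstrap standard error for r follows. It assumes simple resampling
of (x, y) pairs with replacement. The function names, the number of
resamples, and the simulated data are all illustrative assumptions,
not anything from the thread.

```python
import random
from math import sqrt


def pearson_r(xs, ys):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def bootstrap_se_r(xs, ys, n_boot=2000, seed=0):
    """Bootstrap standard error of r by resampling pairs with replacement.

    Note: a resample with no variation in x or y would divide by zero;
    with continuous data and moderate N this is vanishingly unlikely.
    """
    rng = random.Random(seed)
    n = len(xs)
    reps = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        reps.append(pearson_r([xs[i] for i in idx], [ys[i] for i in idx]))
    mean = sum(reps) / n_boot
    return sqrt(sum((r - mean) ** 2 for r in reps) / (n_boot - 1))


if __name__ == "__main__":
    # ASSUMPTION: simulated data, used only to exercise the functions.
    data_rng = random.Random(1)
    x = [data_rng.gauss(0, 1) for _ in range(100)]
    y = [0.5 * xi + data_rng.gauss(0, 1) for xi in x]
    print("r =", pearson_r(x, y))
    print("bootstrap SE =", bootstrap_se_r(x, y))
```

The standard deviation of the resampled correlations serves as the
standard error estimate. Percentile-type intervals from the same
resamples are a common alternative to r plus or minus a multiple of
that value when the distribution of r is skewed.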

--Jim



James H. Steiger, Professor
Dept. of Psychology
University of British Columbia
Vancouver, B.C., Canada V6T 1Z4 
email:  [EMAIL PROTECTED]





Reference


Steiger, J.H., & Fouladi, R.T. (1997). Noncentrality
interval estimation and the evaluation of statistical
models. In L. Harlow, S.A. Mulaik, & J.H. Steiger (Eds.),
What if there were no significance tests? Mahwah, NJ: Erlbaum.




>
>Dave
>
>At 04:18 PM 3/28/01 -0500, dennis roberts wrote:
>>anyone know off hand quickly ... what the formula might be for the standard 
>>error for r would be IF the population rho value is something OTHER than zero?
>>
>>_________________________________________________________
>>dennis roberts, educational psychology, penn state university
>>208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED]
>>http://roberts.ed.psu.edu/users/droberts/drober~1.htm
>>
>>
>>
>>=================================================================
>>Instructions for joining and leaving this list and remarks about
>>the problem of INAPPROPRIATE MESSAGES are available at
>>                  http://jse.stat.ncsu.edu/
>>=================================================================
>
>________________________________________________________________
>----------------------------------------------------------------------------
>------------------------------------
>
>David C. Howell                                         Phone: (802) 656-2670
>Dept of Psychology                              Fax:   (802) 656-8783
>University of Vermont                           email: [EMAIL PROTECTED]
>Burlington, VT 05405 
>
>
>
>http://www.uvm.edu/~dhowell/StatPages/StatHomePage.html
>
>http://www.uvm.edu/~dhowell/gradstat/index.html


