In article <[EMAIL PROTECTED]>,
Alex Yu <[EMAIL PROTECTED]> wrote:

>John Tukey differentiates "data analysis" and "statistics." The former may
>or may not employ probability while the latter is based upon probability. 

>Resampling techniques use "empirical probability." In the Fisherian sense,
>probability is based upon infinite hypothetical distributions. But for
>Reichenbach and von Mises, probability is empirically based on limited
>cases that generate relative frequency. 

I consider the Fisherian one to be the only relevant one.
In fact, I do not think it goes far enough; at best,
probability is a property of the real world like length
and mass.  Basing it empirically runs into paradoxes.
For one thing, both they and Fisher make the fantastically
strong assumption of independent, identically distributed
observations; this is certainly false, and it is hard even
to come close to it when trying.  I believe the only tenable
approach to probability is that it exists, and that the
assumptions made are not too far from the truth.

>It seems to me that resampling qualifies as a probabilistic model 
>in Reichenbach and von Mises' view, but not in the Fisherian tradition. My 
>question is: Should resampling be counted as a probabilistic model? 
>What is the nature of inference resulting from bootstrapping? Is it a 
>probabilistic inference?  

In resampling, the probability distribution is a KNOWN
distribution.  The only reason for using resampling rather
than computing the exact distribution is that of cost.
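
A minimal Python sketch of this point, assuming a tiny
hypothetical sample and the sample mean as the statistic (both
are illustrative choices, as are the function names).  With n
observations there are n**n equally likely resamples, so the
bootstrap distribution can be enumerated exactly; random
resampling only approximates that same known distribution at
lower cost when n**n is too large to enumerate.

```python
import itertools
import random
from collections import Counter
from fractions import Fraction


def exact_bootstrap_mean_dist(sample):
    """Enumerate all n**n equally likely resamples of an integer sample.

    Returns a dict mapping each possible resample mean to its exact
    probability under the bootstrap (empirical) distribution.
    """
    n = len(sample)
    counts = Counter()
    for resample in itertools.product(sample, repeat=n):
        counts[Fraction(sum(resample), n)] += 1
    total = n ** n
    return {mean: Fraction(c, total) for mean, c in counts.items()}


def monte_carlo_bootstrap_mean_dist(sample, draws, seed=0):
    """Approximate the same distribution by drawing random resamples."""
    rng = random.Random(seed)
    n = len(sample)
    counts = Counter()
    for _ in range(draws):
        resample = rng.choices(sample, k=n)  # sampling with replacement
        counts[Fraction(sum(resample), n)] += 1
    return {mean: c / draws for mean, c in counts.items()}


sample = [1, 2, 4, 7]  # hypothetical data, integers so Fraction is exact
exact = exact_bootstrap_mean_dist(sample)        # 4**4 = 256 resamples
approx = monte_carlo_bootstrap_mean_dist(sample, draws=20000)

# Largest discrepancy between the exact and simulated probabilities.
max_error = max(abs(float(p) - approx.get(m, 0.0)) for m, p in exact.items())
```

The enumeration is feasible only for very small n; for n = 20
there are already 20**20 resamples, which is where the cost
argument comes in.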

On the other hand, the conclusions drawn from resampling
are random because the original sample is random.  There is
ONE sample only.  There are no repeated samples.

>As I recall, Philip Good said that permutation tests are still subject 
>to the Behrens-Fisher problem (unknown population variance). If 
>resampling is based on empirical probability within the reference set, then 
>why do we care about the population variance? 

Some of them are and some of them are not.  In most of
them, the permutations yield different points, and the
test is about where the sample point lies among them,
so the variance (sample or population) is irrelevant
under the point null hypothesis, which is false anyhow.
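
A rough Python sketch of such a test, assuming a two-group
comparison with the difference in means as the statistic and a
one-sided alternative (the data, the statistic, and the function
name are all hypothetical choices for illustration).  Every
relabeling of the pooled observations yields one point, and the
p-value is just the fraction of points at least as extreme as
the observed one; no variance estimate enters anywhere.

```python
from fractions import Fraction
from itertools import combinations


def exact_permutation_pvalue(treated, control):
    """One-sided exact permutation p-value for a difference in means.

    Assumes integer observations so that Fraction arithmetic is exact.
    Under the point null (no treatment effect at all), every split of
    the pooled data into groups of these sizes is equally likely.
    """
    pooled = list(treated) + list(control)
    k = len(treated)
    m = len(pooled) - k
    total = sum(pooled)

    def mean_difference(indices):
        s = sum(pooled[i] for i in indices)
        return Fraction(s, k) - Fraction(total - s, m)

    observed = mean_difference(range(k))  # first k entries are the treated
    extreme = 0
    splits = 0
    for indices in combinations(range(len(pooled)), k):
        splits += 1
        if mean_difference(indices) >= observed:
            extreme += 1
    return Fraction(extreme, splits)


# Hypothetical data: 3 treated and 3 control units, C(6, 3) = 20 splits.
p_value = exact_permutation_pvalue([12, 14, 15], [8, 9, 10])
```

Here the treated group holds the three largest values, so only
the observed split reaches the observed difference and the
p-value is 1/20.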

I would not believe that the treatment has exactly NO
effect, no matter how much data is presented.  This is
not a reasonable question.

-- 
This address is for information only.  I do not claim that these views
are those of the Statistics Department or of Purdue University.
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN 47907-1399
[EMAIL PROTECTED]         Phone: (765)494-6054   FAX: (765)494-0558


