In article <000001c055d0$5408df40$6f38de9e@daheiser>,
David Heiser <[EMAIL PROTECTED]> wrote:

>----- Original Message -----
>From: Herman Rubin <[EMAIL PROTECTED]>
>To: <[EMAIL PROTECTED]>
>Sent: Thursday, November 23, 2000 4:55 PM
>Subject: Re: stat question


>> >Herman Rubin wrote:

>> >>anyone wanting to learn good statistics should not even
>> >>consider taking an "undergraduate" statistics course

>> >Nonsense. (Reply by this anonymous, lion soaking in oil)

>> Not only is that not nonsense, but it is quite difficult
>> to get students who have learned techniques to consider
>> what, if any, basis was behind those techniques.
>> Meaningful statistics is based on the concept of
>> probability, not the computation of probabilities, and
>> consideration of the totality of consequences.

>Wow, Herman, this is deep stuff. There is a huge literature on the attempt
>to understand what probability is. Even Fisher had problems trying to
>understand it outside of the frequentist viewpoint. There is a lot of stat
>work involving maximum likelihood estimates, where there is no probability
>support unless you take a Bayesian approach. (Which is infrequent.)

It has long been known that maximum likelihood is, at best,
a method which may or may not be good.  In fact, there are
many reasonable problems where we know that maximum
likelihood is not good, and in which we can do better,
even asymptotically.
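
For concreteness, here is a minimal sketch (in Python, with
made-up numbers) of one standard example of this, the
Neyman-Scott problem, where the MLE of a variance is
inconsistent and a trivially corrected estimator does better:

    # Neyman-Scott: X_ij ~ N(mu_i, sigma^2) for i = 1..n groups, j = 1, 2.
    # With one nuisance mean per group, the MLE of sigma^2 converges to
    # sigma^2 / 2 as n grows; doubling it restores consistency.
    import numpy as np

    rng = np.random.default_rng(0)
    sigma2, n = 4.0, 200_000
    mu = rng.normal(0.0, 10.0, n)             # arbitrary nuisance means
    x1 = rng.normal(mu, np.sqrt(sigma2))
    x2 = rng.normal(mu, np.sqrt(sigma2))

    # Profiling out mu_i gives mu_i_hat = (x_i1 + x_i2) / 2, so
    # sigma2_hat_MLE = mean of all 2n squared residuals
    #                = mean((x1 - x2)^2) / 4.
    mle = np.mean((x1 - x2) ** 2) / 4.0
    print("true sigma^2 :", sigma2)              # 4.0
    print("MLE          :", round(mle, 3))       # ~2.0, off by a factor of 2
    print("2 * MLE      :", round(2 * mle, 3))   # ~4.0, consistent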

My last clause in what you quoted above is what I consider
the key point: consider the TOTALITY of consequences.  This,
without elaboration, rules out classical hypothesis testing,
and both classical and Bayesian confidence regions.

There is a huge literature attempting to DEFINE
probability, which is quite different from trying to
understand what it is.  The same problem occurs in trying
to define length, or even to define the integers.  Integers
have properties, and probabilities have properties, and
it is a mistake to fix on one definition of either.

>Just look at the extent of the literature on the 2X2 table, and the
>difficulty there is in understanding the concepts behind an analysis for
>effects.

>I have been reading the absolutely wonderful discussion and raging arguments
>on SEMNET between Mulaik, Pearl, Hayduk, Shipley and others on the meaning
>of b in the simple linear equation Y=bX+e1, where X is one variable and e1
>is a combination of the effects of all other variables and random effects.
>When e1 is large with respect to Y, it becomes very difficult to define a
>simple meaning of b in terms of a quantitative causality. This is deep
>stuff, that even the professors have difficulty in understanding.

One cannot define causality by looking at observed relations.
This is a place where the philosophers overreached; models do
not come from data, but from the mind, and objectivity is
just plain impossible.  But enough mathematics had to be
developed before this could be seen, although examples of the
contradictions which readily occur had long been available.
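
As one illustration (a Python sketch, with numbers chosen only
to make the algebra come out evenly), here are two mechanisms
that produce the same joint distribution of (X, Y), and hence
the same b, while the causal effect of X on Y is 0.5 in one
and exactly 0 in the other:

    # Two mechanisms, the SAME observational distribution of (X, Y):
    # A: X -> Y with causal effect 0.5;  B: a hidden U drives both,
    # and intervening on X would change nothing.  The regression
    # coefficient b is 0.5 either way.
    import numpy as np

    rng = np.random.default_rng(1)
    n = 1_000_000

    xa = rng.normal(0, np.sqrt(2), n)                 # mechanism A
    ya = 0.5 * xa + rng.normal(0, 1, n)

    u  = rng.normal(0, 1, n)                          # mechanism B
    xb = u + rng.normal(0, 1, n)
    yb = u + rng.normal(0, np.sqrt(0.5), n)

    for tag, x, y in (("A", xa, ya), ("B", xb, yb)):
        b = np.cov(x, y)[0, 1] / np.var(x)
        print(tag, "Var(X)=%.2f Var(Y)=%.2f b=%.3f"
              % (np.var(x), np.var(y), b))
    # Both lines print Var(X)=2.00, Var(Y)=1.50, b=0.500 (up to noise);
    # no amount of observational (X, Y) data can tell the two apart.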

>Considering most of the important work involving statistics is in
>psychology, marketing, medicine, economics, physics, social studies, and
>every other hard or soft science out there, we cannot assume that all these
>PhD practitioners understand probability or really understand the nuances of
>the models, equations and conclusions they arrive at. (I did it. One
>sentence per paragraph. Does this put me on Fisher's level?)

Those in economics, although they have not managed to do
too much with statistics, seem to understand the problems,
as do SOME in biology.  Marketing people go by results.
Usually, physicists have sufficiently accurate data that
there is not too much of a problem; however, the first
failure of what is now called meta-analysis which I heard
about came from a physicist.

What is done in the other fields is to use statistical
methods as a RELIGION, nothing more and nothing less.
There is now some progress in getting Bayesian methods into
medicine, and realizing that the protocols and rules in use
can increase suffering and death.  The psychologists and
social scientists, with their forcing of normality, are a
real danger to all.

>It would be nice if all these practitioners had graduate courses in stat,

I did not even hint at that.  They need CONCEPTUAL courses in
statistics, not courses in how to carry out religious rituals.
For those who can understand mathematics, there is no point,
in any field, in taking a course below the mathematical level
they can understand.  The essentials of measure theory and
integration, not antidifferentiation, belong in high school at
the latest.  The Greeks understood integration, although they
could not calculate many integrals.  Those coming out of
calculus can calculate many, but with no understanding.
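
The distinction is easy to make concrete (a trivial Python
sketch): exp(-x^2) has no elementary antiderivative, yet its
integral is computed directly from the definition of
integration as a limit of sums:

    # Integration as a limit of sums: integrate exp(-x^2) over [0, 1],
    # a function with no elementary antiderivative, straight from the
    # definition.  No antidifferentiation anywhere.
    import math

    def midpoint_sum(f, a, b, n):
        h = (b - a) / n
        return h * sum(f(a + (i + 0.5) * h) for i in range(n))

    f = lambda x: math.exp(-x * x)
    for n in (10, 100, 1000):
        print(n, midpoint_sum(f, 0.0, 1.0, n))
    # Converges to ~0.7468241; cross-check via the error function:
    print("check:", math.sqrt(math.pi) / 2 * math.erf(1.0))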

>but more than likely it is an undergraduate level course taught at the
>graduate level to a student say in psychology, or medicine or.......(e.g.,
>Abelson in his book, "Statistics as Principled Argument", in the first
>sentence of the introduction says, "This book arises from 35 years of
>teaching a first-year graduate statistics course in the Yale Psychology
>Department."

It would make more sense for mathematicians and
statisticians to teach psychology based on logical
principles than for those using the statistical religion
in their fields to preach it.

>This is typical of most graduate schools, where the first year
>stat course is all that the student gets.) People in these fields will get
>exposed to huge data bases with large numbers of variables and find the
>impossibility of assessing all the implications of any model, hypothesis or
>set of conclusion made.

And they will come out believing that a p-value of .049
means a real difference, and a p-value of .051 means no
difference.  The first one may or may not be correct, but
the second one is rarely even something to consider.  The
psychologist who takes the raw data and converts it to a
normal distribution does not have any idea what a scale is
good for.
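
Both points are a few lines of Python to check (the numbers
are only illustrative):

    # 1. Two z statistics straddling the p = .05 line carry essentially
    #    the same evidence; the sharp dichotomy is in the ritual, not
    #    in the data.
    from math import erf, sqrt
    from statistics import NormalDist

    def two_sided_p(z):
        return 2 * (1 - 0.5 * (1 + erf(z / sqrt(2))))

    print("z=1.95 p=%.4f" % two_sided_p(1.95))  # 0.0512 -- "no difference"?
    print("z=1.97 p=%.4f" % two_sided_p(1.97))  # 0.0488 -- "real difference"?

    # 2. Converting raw data to normal scores (rank -> inverse normal
    #    CDF) erases the magnitudes -- the scale -- entirely.
    def normal_scores(data):
        nd, n = NormalDist(), len(data)
        order = sorted(range(n), key=data.__getitem__)
        out = [0.0] * n
        for rank, i in enumerate(order):
            out[i] = nd.inv_cdf((rank + 0.5) / n)
        return out

    a = [1.0, 2.0, 3.0, 4.0, 5.0]        # small, even steps
    b = [1.0, 2.0, 3.0, 4.0, 500.0]      # one enormous effect
    print(normal_scores(a) == normal_scores(b))   # True: scale erased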

>It is very clear that an education in statistics never stops. The
>undergraduate level exposes you to the concepts, and the understanding comes
>with continued education and experience.

My complaint is that the current undergraduate courses do
not teach any of the concepts.  The idea of the balancing 
of errors and costs does not even get mentioned.
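
A minimal sketch of what balancing errors and costs means
(Python; the two-point testing problem and the cost numbers
are purely for illustration):

    # Test theta = 0 vs theta = 1 with one X ~ N(theta, 1).  The Bayes
    # rule rejects when the likelihood ratio f1(x)/f0(x) = exp(x - 1/2)
    # exceeds (pi0 * cost_I) / (pi1 * cost_II).  The implied type I
    # error rate follows from the costs; .05 has no special status.
    from math import erf, log, sqrt

    def upper_tail(z):                   # P(Z > z), Z standard normal
        return 1 - 0.5 * (1 + erf(z / sqrt(2)))

    pi0 = pi1 = 0.5
    for cost_I, cost_II in ((1, 1), (10, 1), (1, 10)):
        cutoff = log((pi0 * cost_I) / (pi1 * cost_II)) + 0.5
        print("costs %2d:%-2d -> reject if x > %5.2f, implied alpha = %.4f"
              % (cost_I, cost_II, cutoff, upper_tail(cutoff)))
    # costs 1:1 -> alpha ~ .31;  10:1 -> ~ .0026;  1:10 -> ~ .96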

>There has been a long discussion previously on EDSTAT about the 0.05
>probability value and the use of it.  There was no common agreement, which is
>typical of most of the basic fundamental things we use in statistics.

Those who use the .05 (or any other) value in this way are
the ones who have no idea of the basics.  That null
hypothesis, as tested (not as stated), is necessarily FALSE,
and I do not need any data for this.  The question is what
action to take.  It MAY be the case that one can ignore the
difference, and treat the problem as a point null, or it
may not; this can only be found out computationally, or
using good enough estimates based on probability.  But in no
case is fixing a significance level, as such, reasonable.
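
A sketch of the computation (Python; the flat prior and the
round numbers are purely for illustration): with
xbar ~ N(theta, se^2), the posterior for theta is
N(xbar, se^2), and whether the difference is ignorable is a
question about P(|theta| < eps) for a tolerance eps that the
consequences of the problem must supply, not about a
significance level:

    # Same z = 2 (p ~ .046) in both cases; whether the difference can
    # be IGNORED depends on the posterior mass inside the tolerance.
    from math import erf, sqrt

    def norm_cdf(z):
        return 0.5 * (1 + erf(z / sqrt(2)))

    def prob_negligible(xbar, se, eps):
        """P(|theta| < eps) under the flat-prior posterior N(xbar, se^2)."""
        return norm_cdf((eps - xbar) / se) - norm_cdf((-eps - xbar) / se)

    print(prob_negligible(xbar=0.02, se=0.01, eps=0.1))  # ~1.00: ignore it
    print(prob_negligible(xbar=2.00, se=1.00, eps=0.1))  # ~0.01: do not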

>Since
>we as statisticians can't agree on what is significant (in terms of
>probability), how can we expect practitioners to fully understand what
>probability is?

See the above.  The use of the word "significant" was a
complete mistake; how significant a difference is depends
on the magnitude of the difference, and is only indirectly
related to the probability that this large an effect will
be found.
-- 
This address is for information only.  I do not claim that these views
are those of the Statistics Department or of Purdue University.
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN 47907-1399
[EMAIL PROTECTED]         Phone: (765)494-6054   FAX: (765)494-0558

