(no subject)

2000-04-10 Thread Rex Boggs

unsubscribe edstat-l


===
This list is open to everyone.  Occasionally, less thoughtful
people send inappropriate messages.  Please DO NOT COMPLAIN TO
THE POSTMASTER about these messages because the postmaster has no
way of controlling them, and excessive complaints will result in
termination of the list.

For information about this list, including information about the
problem of inappropriate messages and information about how to
unsubscribe, please see the web page at
http://jse.stat.ncsu.edu/
===



Re: hyp testing

2000-04-10 Thread Robert Dawson

Dennis Roberts asked, imagining a testing-free universe:

 what would the vast majority of folks who either do inferential work and/or
 teach it ... DO
 what analyses would they be doing? what would they be teaching?

I wrote:
 *  students would be told in their compulsory intro stats that
 "a posterior probability of 95% or greater is called
  "statistically significant", and we say 'we believe
  the hypothesis'. Anything less than that is called
 "not statistically significant", and we say 'we disbelieve
  the hypothesis'".

and Herman Rubin responded:

 Why?  What should be done is to use the risk of the procedure,
 not the posterior probability.  The term "statistically significant"
 needs abandoning; it is whether the effect is important enough
 that it pays to take it into account.

Dennis asked what _would_ happen, not what _should_.  Most of the abuses we
see around us are not the fault of hypothesis testing _per_se_, but of
statistics users who believe:

(a) that their discipline ought to be a science;
(b) that statistics must be used to make this so;
(c) and that it is unreasonable to expect them to _understand_
statistics just because of (a) and (b).

Granted, if they did understand statistics, they would not test hypotheses
nearly as often as they do. However, that said, I am not entirely persuaded
that risk calculation is the whole story, either. In many pure research
situations, "risk" is just not well defined. What is the risk involved in
believing (say) that the universe is closed rather than open?

Moreover, suppose we elected Herman to the post of Emperor of Inference
(with the power of the "Bars and the Axes"?) to enforce a risk-based
approach to statistics (not that he'd take it, but bear with me...): would
the situation really improve?

My own feeling is that, in many "soft" science papers of the sort where
the research is not immediately applied to the real world, but may affect
public policy and personal politics, a "risk" approach would be disastrous.
If the researcher had to assign "risks" to outcomes that were merely a
matter of correct or incorrect belief, it would be all too tempting to
assign a large risk to an outcome that "would set back the cause of X fifty
years" and conversely a small risk to accepting a belief that might be
considered "if not true, at least a useful myth." (Exercise: provide your
own examples.)  Everything would be lowered to the level of Pascal's Wager -
surely the canonical example of the limitations of a risk-based approach?

One might argue that in such a situation the rare reader who intends to
take action, and not the writer, should do the statistics. Unfortunately, in
the real world, that won't wash. People want simple answers, and with the
flood of information that we have to deal with in keeping up with the
literature in any subject today, this is not entirely a foolish or lazy
desire. It is considered the author's responsibility to reach a conclusion,
not just to present a mass of undigested data for posterity to analyze.
Thus, it would be unrealistic to expect any discipline, forced to use
risk-based inference, to do other than have the author guess at risks (and
work with those guesses) in situations where objective measurements of risk
don't exist.

-Robert Dawson








Re: hyp testing

2000-04-10 Thread Michael Granaas

On Fri, 7 Apr 2000, Chris Mecklin wrote:

Among other things
 
 My point is that I want to show my class an example where they can see the
 pitfalls of making a decision based solely on a p-value.  I don't want

My favorite (and not contrived) example has to do with vocational advice and
gender.  It is well known that in high school boys do better on
standardized measures of mathematics and girls do better on verbal
measures.  This led to the "obvious" conclusion that girls should avoid
anything mathy in their career choices while boys should avoid the
humanities.

But, it turns out that the effect sizes for these results are typically
around d=0.1 with individual studies maxing out at about 0.2.  (I can't
lay my hands on my notes right now, but these convert to an R^2 that is
fairly small.  Somewhere below .05.)

The significant findings all have large sample sizes in common.  Typically
1000+ students, so the results are all p < .01.

If you think about these results in terms of variance accounted for,
individual variability NOT associated with gender clearly overwhelms any
gender effects by about 19:1.  If you think about these in terms of
overlapping distributions, the top 48-49% of the girls are scoring in
roughly the same range as the top 50% of the boys for math, with a similar
gender-reversed result for verbal skills.

In other words, this real relation between gender and ability tests is an
extremely poor substitute for individual information.  Many girls have the
ability to do well in mathy areas and many boys lack the ability.
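
A quick back-of-envelope check of Michael's numbers, as a Python sketch (the
exact d, n, and normality assumptions here are mine, not from his notes):

# Back-of-envelope check (assumed values, not the original studies'):
# effect size d, its R^2 equivalent, significance at large n, and overlap.
import numpy as np
from scipy import stats

d = 0.1                        # typical standardized gender difference
r = d / np.sqrt(d**2 + 4)      # point-biserial r for equal group sizes
print(f"R^2 = {r**2:.4f}")     # ~0.0025; the 19:1 ratio corresponds to the
                               # upper bound R^2 = .05, since .95/.05 = 19

n = 1000                       # per group ("1000+ students")
t = d * np.sqrt(n / 2)         # two-sample t statistic for difference d
p = 2 * stats.t.sf(t, df=2 * n - 2)
print(f"t = {t:.2f}, p = {p:.4f}")   # significant despite the tiny R^2

# Overlap: if girls' mean sits d SDs below boys' on a math measure, the
# fraction of girls scoring above the boys' median is
print(f"{stats.norm.sf(d):.3f}")     # ~0.46, i.e. nearly half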

Michael

 them going "Ok, the p-value is .04 in this problem, so I don't reject, no
 wait, I reject, I think, Ok, yeah I reject, so whatever the treatment is
 must be good."
 
 ___
 Christopher Mecklin
 Doctoral Student, Department of Applied Statistics
 University of Northern Colorado
 Greeley, CO 80631
 (970) 304-1352 or (970) 351-1684
 
 
 
 
 
 

***
Michael M. Granaas
Associate Professor[EMAIL PROTECTED]
Department of Psychology
University of South Dakota Phone: (605) 677-5295
Vermillion, SD  57069  FAX:   (605) 677-6604
***
All views expressed are those of the author and do not necessarily
reflect those of the University of South Dakota, or the South
Dakota Board of Regents.






Re: hyp testing

2000-04-10 Thread Bruce Weaver

On 7 Apr 2000, dennis roberts wrote:

 i was not suggesting taking away from our arsenal of tricks ... but, since 
 i was one of those old guys too ... i am wondering if we were mostly led 
 astray ...?
 
 the more i work with statistical methods, the less i see any meaningful (at 
 the level of dominance that we see it) applications of hypothesis testing ...
 
 here is a typical problem ... and we teach students this!
 
 1. we design a new treatment
 2. we do an experiment
 3. our null hypothesis is that both 'methods', new and old, produce the 
 same results
 4. we WANT to reject the null (especially if OUR method is better!)
 5. we DO a two sample t test (our t was 2.98 with 60 df)  and reject the 
 null ... and in our favor!
 6. what has this told us?
 
 if this is ALL you do ... what it has told you AT BEST is that ... the 
 methods probably are not the same ... but, is that the question of interest 
 to us?
 
 no ... the real question is: how much difference is there in the two methods?
--- 8< ---

In one of his papers, Bob Frick has argued very persuasively that very
often (in experimental psychology, at least), this is NOT the real
question at all.  I think that is especially the case when you are testing
theories.  Suppose, for example, that my theory of selective attention
posits that inhibition of the internal representations of distracting
items is an important mechanism of selection.  This idea has been tested
in so-called "negative priming" experiments.  (Negative priming refers to
the fact that subjects respond more slowly to an item that was previously
ignored, or is semantically related to a previously ignored item, than
they do to a novel item.) Negative priming is measured as a response time
difference between 2 conditions in an experiment.  The difference is
typically between about 20 and 40 milliseconds.  I think the important
thing to remember about this is that the researcher is not trying to
account for variability in response time per se, even though response time
is the dependent variable:  He or she is just using response time to
indirectly measure the object of real interest.  If one was trying to
account for overall variability in response time, the conditions of this
experiment would almost certainly not make the list of important
variables.  The researcher KNOWS that a lot of other things affect
response time, and some of them a LOT more than his experimental
conditions do.  However, because one is interested in testing a theory of
selective attention, this small difference between conditions is VERY
important, provided it is statistically significant (and there is
sufficient power);  and measures of effect size are not all that relevant. 
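
A small simulation sketch of this point in Python; every number here (a 30 ms
effect, the RT means and spreads, 50 subjects) is invented for illustration:

# A ~30 ms effect is tiny relative to overall RT variability, yet easily
# significant in a within-subject design with adequate power.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n = 50                                     # subjects (assumed)
base = rng.normal(600, 100, n)             # each subject's baseline RT, ms
control = base + rng.normal(0, 40, n)      # novel-item condition
primed = base + 30 + rng.normal(0, 40, n)  # previously-ignored condition

t, p = stats.ttest_rel(primed, control)
print(f"t = {t:.2f}, p = {p:.4g}")         # comfortably significant

# ...yet condition explains almost none of the overall RT variance:
rt = np.concatenate([control, primed])
cond = np.r_[np.zeros(n), np.ones(n)]
r = np.corrcoef(cond, rt)[0, 1]
print(f"R^2 = {r**2:.3f}")                 # tiny, exactly Bruce's point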

Just my 2 cents.
-- 
Bruce Weaver
[EMAIL PROTECTED]
http://www.angelfire.com/wv/bwhomedir/







Re: hyp testing

2000-04-10 Thread Michael Granaas

On Fri, 7 Apr 2000, dennis roberts wrote:

 At 04:00 PM 4/7/00 -0500, Michael Granaas wrote:
 
 But whatever form hypothesis testing takes it must first and foremost be
 viewed in the context of the question being asked.
 
 
 this seems to be the key to REinventing ourselves ... make sure the focus 
 is on the question ... AND, to REshape the question FROM what we 
 traditionally do in hyp test ...

If you look at Psychology you might well see two traditions, one in which
the zero valued null is used in a rather automatic and mindless fashion
and another in which researchers work very hard setting up experiments
where rejection of the zero valued null does provide some information.

 
 set up the null, etc. etc
 
 to ... ask the question of real interest ...
 
 what effect DOES this new treatment have?
 what kind of correlation IS there between X and Y?

In the second tradition I spoke of you find people asking exactly these
types of questions once they have established that their experimental
results are not due to chance.  They use the hypothesis test as a step on
the road to understanding, not as an end in and of itself.

To me this second group acts more like model fitters (emphasis on
prediction) than they do like hypothesis testers (emphasis on rejecting
nil effects).  Even though this second group rejects some nil valued
hypothesis they, unlike the first, ask questions about things like effect
size or functional form of an effect rather than simply declaring the
effect is not zero and drawing some final conclusion.

For myself I try to get students at all levels asking the types of
questions that Dennis suggests as being obvious follow-ups to rejecting
some nil hypothesis.  I cannot claim a great deal of success, but I am
trying.

 what IS the difference between the smartness of democrats and republicans?
 
 if you ask questions that way ... they do not naturally or sensibly lead to 
 our testing the typical null hypotheses we set up

Yes.  There are a variety of answers to this problem, but rejecting the
no-difference hypothesis when it is a priori false is not among them.

Michael


 

***
Michael M. Granaas
Associate Professor[EMAIL PROTECTED]
Department of Psychology
University of South Dakota Phone: (605) 677-5295
Vermillion, SD  57069  FAX:   (605) 677-6604
***
All views expressed are those of the author and do not necessarily
reflect those of the University of South Dakota, or the South
Dakota Board of Regents.






Re: hyp testing

2000-04-10 Thread Robert Dawson

Bruce Weaver wrote (in part):


...Negative priming is measured as a response time
 difference between 2 conditions in an experiment.  The difference is
 typically between about 20 and 40 milliseconds...
   The researcher KNOWS that a lot of other things affect
 response time, and some of them a LOT more than his experimental
 conditions do. However, because one is interested in testing a theory of
 selective attention, this small difference between conditions is VERY
 important, provided it is statistically significant (and there is
 sufficient power);  and measures of effect size are not all that relevant.

Where the measure of effect size is relevant here is in answer to the
question: Can we rule out all other plausible causes for what we observe? No
experimental design is perfect, and in real life one may be forced to work
with some that are very imperfect indeed. The experimenter may be able to
eliminate some major confounding variables by careful design; and while
there are always huge numbers of minor effects that *might* be confounded
with what one wants to observe, it is true that most of them are small in
size.

If one can conclude that the effect size is on the order of 20ms, one
can then ask oneself "is there anything else, not controlled for in the
experiment, that could cause an effect of that magnitude?" and with luck and
good management the answer would be "no".  Whereas, if one just rejected the
null hypothesis, the corresponding question would be "is there anything
else, not controlled for in the experiment, that could cause an effect?"
and the answer, if given honestly, would be "yes".

In the case of negative priming, had the effect been of the order of 1ms
(and the sample size correspondingly vast), I would conjecture that many
other plausible causes (lengthened time between trials? more opportunity to
become curious about the experiment?) for the difference could be dreamed up
that would be difficult to eliminate.

-Robert Dawson






Re: hyp testing

2000-04-10 Thread dennis roberts

the term 'null' does NOT mean 0 (zero) ... though it is misconstrued that way

the term 'null' means a hypothesis that is the straw dog case ... for which 
we are hoping that sample data will allow us to NULLIFY ...

in some cases, the null happens to be 0 ... but in many cases, it does not

cases in point:

1. null hypothesis is that the population variance for IQ is 225
2. null hypothesis is that the population mean for IQ is 100
3. to test the variance of a population ... the null is that the chi-square 
value will equal its degrees of freedom

and on and on and on
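
a sketch of case 1 in Python, with invented IQ scores; it also shows the
point behind case 3, that the statistic's null expectation equals its
degrees of freedom:

# H0: the population variance of IQ is 225 (sigma = 15).
# The IQ sample is invented for illustration.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
iq = rng.normal(100, 17, 40)               # pretend sample of 40 IQs

n = len(iq)
chi2 = (n - 1) * iq.var(ddof=1) / 225      # test statistic
p = 2 * min(stats.chi2.cdf(chi2, n - 1),   # two-sided p-value
            stats.chi2.sf(chi2, n - 1))
print(f"chi2 = {chi2:.1f} (null expectation = df = {n - 1}), p = {p:.3f}")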


At 10:04 AM 4/10/00 -0500, Michael Granaas wrote:
[... Michael's message of 10:04 AM, quoted in full, snipped ...]







Re: hyp testing

2000-04-10 Thread Robert Dawson

Dennis Roberts wrote:

 the term 'null' does NOT mean 0 (zero) ... though it is misconstrued that way
 the term 'null' means a hypothesis that is the straw dog case ... for which
 we are hoping that sample data will allow us to NULLIFY ...
 in some cases, the null happens to be 0 ... but in many cases, it does not

It always means that _something_ is zero - as does just about any other
algebraic or mathematical expression, after a little rearrangement into
something logically equivalent. Moreover, in cases in which the null
hypothesis has any prior credibility - as should always be the case - that
"something" (e.g., the amount by which the IQ of the subject population
differs from the standardized value of 100) is usually a sensible thing to
study.  And that thing can usually be thought of as "effect size".

In the classic student blooper cases, "H0: mu = x bar" and "H0: mu
equals the nearest round number to x bar", it isn't; and those tests should
not be done.

-Robert






Re: hyp testing

2000-04-10 Thread Michael Granaas

On Mon, 10 Apr 2000, Robert Dawson wrote:

 Dennis Roberts wrote:
 
  the term 'null' does NOT mean 0 (zero) ... though it is misconstrued that way
  the term 'null' means a hypothesis that is the straw dog case ... for which
  we are hoping that sample data will allow us to NULLIFY ...
  in some cases, the null happens to be 0 ... but in many cases, it does not
 
 It always means that _something_ is zero - as does just about any other
 algebraic or mathematical expression, after a little rearrangement into
 something logically equivalent. Moreover, in cases in which the null

My grandmother could have told me that the mean height for men and women
was not the same (zero difference).  So based on prior evidence I
hypothesize that the actual difference is 3 inches (mu1 - mu2 = 3) and use
that for my null hypothesis.  True, I can reduce this to a zero difference
version by using (mu1 - mu2) - 3 = 0 but do I really want to?
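
For what it's worth, the non-zero form is no harder to compute; a Python
sketch, with the heights invented:

# Testing H0: mu1 - mu2 = 3 directly -- the same arithmetic as the
# (mu1 - mu2) - 3 = 0 form, but stated in meaningful units (inches).
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
men = rng.normal(69.5, 3.0, 60)            # invented heights
women = rng.normal(64.5, 2.8, 60)

diff = men.mean() - women.mean()
se = np.sqrt(men.var(ddof=1) / men.size + women.var(ddof=1) / women.size)
t = (diff - 3.0) / se                      # null value is 3, not 0
df = men.size + women.size - 2             # simple df; Welch's would differ
p = 2 * stats.t.sf(abs(t), df)
print(f"observed diff = {diff:.2f}, t = {t:.2f}, p = {p:.3f}")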

The problem is that Fisher meant "hypothesis to be nullified" and chose
the term "null", which has a mathematical meaning of "zero".  This might
have been sensible in Fisher's applications, where you wouldn't use a new
fertilizer unless it was different from nothing or some other
treatment.  So "null" met both meanings.

But, I would argue that the height difference hypothesis is better
understood and more meaningful in its non-zero form.  Perhaps we need to
refer to this hypothesis as the "test hypothesis".

 hypothesis has any prior credibility - as should always be the case - that
 "something" (eg, amount by which the IQ of the subject population differs
 from the standardized value of 100) is usually a sensible thing to study.
 And that thing can usually be thought of as "effect size".
 
 In the classic student blooper cases: "H0: mu = x bar"  and "H0: mu
 equals the nearest round number to x bar" it isn't: and those tests should
 not be done.
 
 -Robert
 
 

***
Michael M. Granaas
Associate Professor[EMAIL PROTECTED]
Department of Psychology
University of South Dakota Phone: (605) 677-5295
Vermillion, SD  57069  FAX:   (605) 677-6604
***
All views expressed are those of the author and do not necessarily
reflect those of the University of South Dakota, or the South
Dakota Board of Regents.






Re: hyp testing

2000-04-10 Thread Robert Dawson

Dennis Roberts wrote:

 if you are interested in the relationship between heights and weights of
 people, in the larger population ... the notion that we test this against a
 null of rho=0 is not credible ... in fact, it is rather stupid ... a more
 sensible null would be perhaps a rho of .5 ...

No; if you have to start with "a more sensible null would be perhaps," you
almost surely do not have a hypothesis worth testing. Put it this way:

If your possible outcomes are
 "The correlation is not zero"
   or "The correlation may be zero"

these are both weak statements but you might not feel silly making either of
them. (If you would, then use an interval estimate instead!)

 "The correlation is not 0.5"
   or "The correlation may be 0.5"

both leave the listener wondering "why 0.5?"  If the only answer is "well,
it was a round number close enough to x bar [or "to my guesstimate before
the experiment"] not to seem silly, but far enough away that I thought I
could reject it." then the test is pointless.

-Robert Dawson








Re: hyp testing

2000-04-10 Thread Robert Dawson

Michael Granaas wrote:

 My grandmother could have told me that the mean height for men and women
 was not the same (zero difference).  So based on prior evidence I
 hypothesize that the actual difference is 3 inches (mu1 - mu2 = 3) and use
 that for my null hypothesis.  True, I can reduce this to a zero difference
 version by using (mu1 - mu2) - 3 = 0 but do I really want to?

If the 3" figure is credible enough to be worth testing, then yes, you do.
Example: 3" is determined from historical large-sample measurements to be
the difference for the population overall, and you want to determine whether
there is a larger difference in heights within male and female members of a
certain ethnic group; or you want to see if the height difference is
decreasing over time; or whether it is larger for the armed forces. In these
cases (mu1-mu2-3) represents "change in height difference explained by..."
and it is indeed the effect size.

However, if you are simply studying height differences, and do not have
any real source for that 3" figure, you would not be justified in pulling it
out of thin air just to permit you to do a hypothesis test. I would suggest,
in fact, that a very good rule of thumb would be that if you _can't_ cast a
null hypothesis meaningfully in the form "effect size = 0" you should be
very, very wary of doing the test at all.

-Robert Dawson







Re: What to do about simple techniques

2000-04-10 Thread Paul R Swank

I am new to the list so I am jumping into the middle of this. However, we
have to start teaching hypothesis testing somewhere. Even if it goes the
way of the Edsel, it will be a slow death, because many of us will continue
to use it when we feel it is appropriate to the question.

However, I tell my students that there are always more complicated ways to
do things. In many instances these take more math ability, computer skills,
or time than I have to explain them. So I am going to show them a method
that will work, although it won't necessarily be the most powerful or
efficient. The key is to first understand inference thoroughly before
jumping into more complicated things. You don't teach a first grader
multiplication before they understand addition, and I don't teach a nursing
student logistic regression before they can understand a chi-square
goodness of fit.

If we had the ability to teach all the students what we thought they needed
to know, we might do things a little differently. Someone mentioned Joe
Ward. His colleague Earl Jennings once told me that the biggest impediment
to understanding linear models was learning the t test, anova, and
regression techniques separately. When I teach linear models I try to get
the students to unlearn a lot of what they know. It seems a waste of time,
but we are not always in charge of the curriculum. When you are given 3
hours to teach a student something about statistics, do you start with
linear models? Probably not. Well, I did not intend this to be quite so
long so I'll shut up.
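
On Chris's third example in the message below (the sign test versus
randomization tests), a minimal sketch of a paired sign-flip randomization
test in Python; the difference scores are invented:

# Under H0 the sign of each paired difference is arbitrary, so compare the
# observed mean difference with its distribution over random sign flips.
import numpy as np

rng = np.random.default_rng(0)
diffs = np.array([1.2, 0.4, -0.3, 2.1, 0.8, 1.5, -0.1, 0.9])  # invented
observed = diffs.mean()

n_perm = 20000
signs = rng.choice([-1.0, 1.0], size=(n_perm, diffs.size))
perm_means = (signs * diffs).mean(axis=1)
p = (np.abs(perm_means) >= abs(observed)).mean()
print(f"mean diff = {observed:.2f}, two-sided p = {p:.4f}")
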
At 06:05 PM 4/7/00 -0600, you wrote:
Dear all,

I am interested in what others are doing when faced with techniques that
appear in standard textbooks that are "simpler" (either computationally
and/or conceptually) than better (but more difficult) techniques.  My
concern is when the "superior" technique is either inaccessible to the
audience (for instance, a "stat" 1011 class) or would take considerably
longer to teach (and the semester isn't long enough now) or requires use
of the computer for almost any sample.  Some examples of techniques that I
see in lots of stat textbooks but would rarely be used by a
statistician are: 1) chi-square goodness of fit to test for normality
(when Shapiro-Wilk is much better for the univariate case and the
Henze-Zirkler for the multivariate case);  2) paired sample t-tests
(usually better options here such as ANCOVA); 3) sign test (randomization
tests are much superior).  I'm sure I left out/didn't think of plenty of
other cases.  My question to the group, as someone at the beginning of a
career teaching statistics, is what to do?  Should some of these tests be
left out (knowing the students may run into the tests in future course work
or in some research)?  Should the better procedures always be taught,
knowing that the additional difficulty due to level of
mathematics/concepts/computational load may well lose many students?  I
don't know yet; what do you think?


___
Christopher Mecklin
Doctoral Student, Department of Applied Statistics
University of Northern Colorado
Greeley, CO 80631
(970) 304-1352 or (970) 351-1684






Paul R. Swank, PhD.
Advanced Quantitative Methodologist
UT-Houston School of Nursing
Center for Nursing Research
Phone (713)500-2031
Fax (713) 500-2033





Re: hyp testing

2000-04-10 Thread dennis roberts

At 01:16 PM 4/10/00 -0300, Robert Dawson wrote:

 No if you have to start "a more sensible null would be perhaps" you
almost surely do not have a hypothesis worth testing.


now we get to the crux of the matter ... WHY do we need a null ... or any 
hypothesis ... (credible and/or sensible) to test??? what is the scientific 
need for this? what is the rationale within statistical exploration for this?

i am not suggesting that we don't need or must not deal with inferences 
from sample data to what parameters might be ... but, i fail to see WHY 
that necessarily means that one has to have a null hypothesis of any kind

perhaps this is what needs to be debated more ... what function does having 
a hypothesis really have? if any ...

it would be useful if we could have some short listing of reasons why ...
and, some examples where WITHOUT a hypothesis, we are unable to make any
scientific progress








Re: hyp testing

2000-04-10 Thread dennis roberts

At 01:16 PM 4/10/00 -0300, Robert Dawson wrote:

both leave the listener wondering "why 0.5?"  If the only answer is "well,
it was a round number close enough to x bar [or "to my guesstimate before
the experiment"] not to seem silly, but far enough away that I thought I
could reject it." then the test is pointless.

 -Robert Dawson


YOU HAVE made my case perfectly!  ... this is why the notion of hypothesis 
testing is outmoded, no longer useful ... not worth the time we put into 
teaching it ...
in the case above ... i would ask:

what is the population rho value ... THAT is the important inferential 
issue ...

there is no reason why we would have to say: i wonder if it is .5 ... let's 
TEST that, or ... i wonder if it is .7 ... let's TEST that ...

we can simply ask the question and try to get an answer to that ... and 
there is no need to test a pre formulated null to get some sensible answer 
to the question

no need for ANY null ... therefore no need for any hypothesis test

if 0 is absurd ... and, if i hypothesized .5 and you ask why .5??? then we 
could have asked anywhere from 0 to .5 ... and they would have been just as 
non functional ...

that's it ... hypothesis testing is non functional











Sensible nulls

2000-04-10 Thread Michael Granaas


I am arguing, and I think Dennis is too, that when we test a hypothesis we
should have a null hypothesis that is plausibly true: a hypothesis that
reflects some sort of an effect size estimate, where such an estimate is
meaningful.

If I understand correctly Robert is arguing that we should always phrase
the null so that it becomes a hypothesis of no effect.

In one case we can do that mathematically by rearranging a hypothesis of
(mu1 - mu2 = 3) to the form ((mu1 - mu2) - 3 = 0).  If this is what Robert
means by saying that only no effect hypotheses are meaningful I think we
are in partial agreement.  I personally shudder to think of trying to
teach the second form to my students.  I think that they will have a much
easier time understanding that I am predicting a difference between two
groups of 3 units using the first.  And they will have an easier time
understanding any implications of rejecting/not rejecting a hypothesis in
the first case.

If Robert is saying it is not sensible to test (mu1 - mu2 = 3) under any
circumstances, I disagree.

My reading of another message is that he thinks there should be some prior
evidence for testing a hypothesis of 3 units of difference.  If my reading
here is correct I think that we may be differing in what we consider
adequate prior evidence, but otherwise are close.

I guess I don't wish to argue all three possibilities if only one of them
is an actual point of disagreement.

For reasons I am willing to develop fully later I think that specifying a
plausibly true value for a null hypothesis (test hypothesis) is more
valuable than a null hypothesis where the specified value is not plausibly
true.

In psychology, and I think education, we see the zero value specified when
it is not even remotely plausible way too often.  This plausibility
judgement is informed by at least some prior evidence.

By plausibly true I am willing to concede some reasonable interval around
the tested null value, where the interval size is informed by content-area
knowledge.  (I am willing to say that some small effects should be treated
as if they were zero. I am willing to say that true values only slightly
different from the hypothesized value should be treated as if they were
the hypothesized value.)

Michael

***
Michael M. Granaas
Associate Professor[EMAIL PROTECTED]
Department of Psychology
University of South Dakota Phone: (605) 677-5295
Vermillion, SD  57069  FAX:   (605) 677-6604
***
All views expressed are those of the author and do not necessarily
reflect those of the University of South Dakota, or the South
Dakota Board of Regents.






Re: Sensible nulls

2000-04-10 Thread Michael Granaas

On Mon, 10 Apr 2000, Robert Dawson wrote:

 Michael Granaas wrote:
 
 H0: being in the target population has no effect on sexual dimorphism in
 height
 Ha: being in the target population does affect sexual dimorphism in
 height

I want to see if I am interpreting your meaning correctly.  If some value
such as "3" comes from some place sensible then your null here would
represent the same idea that I have been expressing as (mu1 - mu2) - 3 = 0?

Michael

 
 which gets to the real heart of the matter.
 



Michael M. Granaas
Associate Professor[EMAIL PROTECTED]
Department of Psychology
University of South Dakota Phone: (605) 677-5295
Vermillion, SD  57069  FAX:   (605) 677-6604
***
All views expressed are those of the author and do not necessarily
reflect those of the University of South Dakota, or the South
Dakota Board of Regents.






Re: hyp testing

2000-04-10 Thread Donald F. Burrill

On Mon, 10 Apr 2000, dennis roberts wrote in part:

  .. the fact that we create a null and test a null does NOT imply that 
 we are therefore testing some effect size ...

Of course not.  One does not TEST an effect size, one ESTIMATES it.
And it is useful to do so only if one has found it not equal to some 
value (possibly, but not necessarily, zero) that would imply the effect 
size to be uninteresting.  (Sometimes it is convenient to do both of 
these things at once, as in constructing a confidence interval.  But 
if the interval includes values that are, a priori, uninteresting, there 
is little utility to pursuing the current estimate of the effect size.)
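
Don's "both at once" in miniature, as a Python sketch with invented data:

# A 95% confidence interval both estimates the effect and tests it: any
# value outside the interval is rejected at the .05 level. If the interval
# contains only a priori uninteresting values (say, a neighborhood of 0),
# pursuing the estimate further settles little.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
effect = rng.normal(0.4, 1.0, 25)          # invented per-subject effects

mean, se = effect.mean(), stats.sem(effect)
lo, hi = stats.t.interval(0.95, df=effect.size - 1, loc=mean, scale=se)
print(f"estimate = {mean:.2f}, 95% CI = ({lo:.2f}, {hi:.2f})")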

 and, if we were interested in an effect size, then we don't have to 
 test  for it ... but we could ask the question: how large is it? that is 
 NOT a test of a hypothesis
 
 we don't need ANY null to find answers to questions of import that we 
 might have

I beg to differ.  Strenuously.  The whole point of a point null 
hypothesis is to be able to specify a probability distribution against 
which one can assert, more or less credibly, that one's conclusion is 
supported with a suitably limited probability of error.  (Error, that 
is, in drawing the conclusion.)  For this reason Lumsden used to call 
this the "model-distributional" hypothesis, which had the virtue of 
describing its proper purpose moderately clearly, and had the obvious 
defect of being too much of a mouthful for us ordinary folks to use in 
conversation (or in classrooms, or in other contexts beginning with "c"). 

So long as the logical style of scientific argumentation is argument by 
elimination, one needs a set of propositions about the world  [Note:  not 
about the present sample, nor about statistics measured on the sample.]
that are both exhaustive and mutually exclusive (viz., the null and 
alternative hypotheses), and a means of determining either that one 
proposition is false, or that one's decision that it is false (should 
one come to that decision) has a low probability of being wrong.

THAT is what hypothesis testing is about;  and it follows that sensible 
discussions of the form and/or values associated with a 
model-distributional hypothesis cannot take place in the absence of the 
alternative hypothesis(-es) that are to be considered simultaneously.
-- Don.
 
 Donald F. Burrill [EMAIL PROTECTED]
 348 Hyde Hall, Plymouth State College,  [EMAIL PROTECTED]
 MSC #29, Plymouth, NH 03264 603-535-2597
 184 Nashua Road, Bedford, NH 03110  603-471-7128  






Re: 3-D regression planes graph,

2000-04-10 Thread Jon Cryer

The free ARC software from the University of Minnesota will do some of this.
Look at

http://stat.umn.edu/ARCHIVES/archives.html

Jon Cryer

At 01:59 PM 4/10/00 -0500, you wrote:

Hello all,

I'm looking for software that can display a 3-D regression environment (x,
y, and z variables) and draw a regression plane for each of two subgroups.

So far, Minitab does a good job of the 3-D scatterplots (regular,
wireframe, and surface (plane) plots), but there's no option (as in the
regular scatterplots) to either code data points according to categorical
variables or to overlay two graphs on the same set of axes.

I'm saving the data in both Minitab and SPSS files, and I can easily
convert to Excel (as a standard go-between spreadsheet file).

Any help will be greatly appreciated.  The effect in my research that I'm
finding so far is that my two groups look similar in univariate and
bivariate settings, but the trivariate regression planes are different.  I
know I could do what I needed to with regression equations (and will do
so), but I'd l-o-v-e to have some graphs to go with it.  SPSS will be fine
for the actual regression equations-- it can deal with subgroups like
that.

Thank you very much in advance,

Cherilyn Young
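
Not Minitab or SPSS, but for what it's worth, a rough sketch of the overlay
in Python/matplotlib; the data and group structure are invented stand-ins
for the real x, y, z, and grouping variables:

# Two group-wise least-squares regression planes on one set of 3-D axes.
import numpy as np
import matplotlib.pyplot as plt

def fit_plane(x, y, z):
    A = np.column_stack([np.ones_like(x), x, y])
    b, *_ = np.linalg.lstsq(A, z, rcond=None)
    return b                               # intercept, x-slope, y-slope

rng = np.random.default_rng(0)
fig = plt.figure()
ax = fig.add_subplot(projection="3d")
for label, (bx, by), color in [("group 1", (0.5, 1.0), "C0"),
                               ("group 2", (1.5, -0.5), "C1")]:
    x, y = rng.uniform(0, 10, 30), rng.uniform(0, 10, 30)
    z = 2 + bx * x + by * y + rng.normal(0, 2, 30)   # invented data
    ax.scatter(x, y, z, color=color, label=label)
    b = fit_plane(x, y, z)
    gx, gy = np.meshgrid(np.linspace(0, 10, 10), np.linspace(0, 10, 10))
    ax.plot_surface(gx, gy, b[0] + b[1] * gx + b[2] * gy,
                    color=color, alpha=0.3)
ax.set_xlabel("x"); ax.set_ylabel("y"); ax.set_zlabel("z")
ax.legend()
plt.show()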







Re: hyp testing

2000-04-10 Thread Michael Granaas


My comments are about half-way down.

Michael

On Mon, 10 Apr 2000, Robert Dawson wrote:

 Dennis Roberts wrote:
 
  now we get to the crux of the matter ... WHY do we need a null ... or any
  hypothesis ... (credible and/or sensible) to test??? what is the
 scientific
  need for this? what is the rationale within statistical exploration for
 this?
 
 My understanding, not perhaps parallelling the historical development
 very closely, is that the answer is something like this. I'm sure somebody
 will correct me...
 
   (0)  People want to be able to make qualitative statements: "Manure makes
 the roses grow." "Electric shocks make mice do what you want them to."  "If
 I buy kippers it will not rain."
 
   (1)   In an attempt to be more scientific, instead of making absolute
 statements, people decide to use the idea of probability. They would _like_
 to be able to say "There is a 99% probability that if you put manure on
 roses they will grow better."  However, that does not fit in with
 traditional frequency-based probability, which only officially assigns
 probabilities to events which are "random", a phrase usually undefined or
 defined only inductively "You know, dice, urns, all that stuff."  Roses are
 not random because you do not get to bet on them at Las Vegas. (Horses are
 dubious. Few people recommend using the outcome of the 2:30 to assign mice
 to treatment groups. )
 
  (2)  If there is going to be a probability involved, then it has to
 involve the sampling technique, as that is the only place where the
 experimenter can introduce (or pretend to) an urn or pair of dice.
 
  (3)  Even given randomization via sampling, we need to know how things
 really are to compute a probability. If we *did* know this we wouldn't be
 doing statistics. But we can make a "conditional" statement that _if_
 something were true then the probability of observing something would be
 
 (4) In order to avoid circular logic, we *cannot* assume what we want to
 prove, in order to compute the probability. We can however assume it for a
 contradiction. Therefore:
 
This (point 4) is certainly what we have been led to believe, but I
question the assumption.  Do we not in fact teach that we are to act as if
the null is true until we can demonstrate otherwise?  I'm not sure where
this assumption came from (Fisher, someone's interpretation of Fisher,
someone other than Fisher) but there is no logical problem with assuming
something that might plausibly be true and using it as our null.

Isn't that what we do in our experiments all the time?  We assume that our
experimental manipulation has no effect, which is plausibly true at least
for some time, and then we try to disprove that estimate of the effect.
Failing to do so we act as if the effect were absent (or so small as to be
absent for all practical purposes).

In many cases, however, we are only interested in the presence/absence of
an effect and this is plenty good enough.  But, sometimes we want to
estimate, model, an effect.  In this case we want a parameter estimate
that is reasonably close to the population value for that effect.  In this
case we might prefer confidence intervals or some such, but we could
certainly adopt our best estimate of the parameter and try to disprove it
by using it as the value tested in the null hypothesis.

There is no logical problem with adopting a plausibly true value for the
null and accepting it if it survives efforts to discredit it.  That is,
there is no logical problem with using a prediction approach.

 (5) There is some set of observations that will lead us to declare that
 that contradiction is reached, and others that won't. Hence the rejection
 region.
 
 (6) The only definite outcome is rejecting the hypothesis; the only
 situation in which we can compute the probability is when the hypothesis is
 true. Hence alpha.

Yes.  But, we can assume a genuinely true hypothesis as well as one that
is in all likelihood false.  That does not pose any problem for the
computation of alpha levels.

The problem is that in too many cases our predictions of what is true are
too weak to allow point/narrow interval predictions of what is true.  We
can only predict that the effect is not 0.  In this case it seems that we
are stuck with rejecting a zero-valued null in the correct direction as the
strongest form of theory confirmation that we have.  Robert is correct:
predicting any particular value in this situation is arbitrary and pointless.

But, if the theory is strong enough to make narrow predictions those
should be used as the null and either disproved (rejected) or
corroborated (accepted).

 
 (7) Back at the beginning we wanted a yes-or-no answer. Hence fixed
 alpha testing and the pretence that we "accept" null hypotheses.

If the null is plausibly true we need no pretense.  We accept the null as
true until something better comes along.  I personally have accepted the
notion that psi powers do not exist despite the fact ...

scientific method

2000-04-10 Thread dennis roberts

here are a few (quickly found, i admit) urls about scientific method ... some 
are quite interesting

http://dharma-haven.org/science/myth-of-scientific-method.htm#Overview

http://teacher.nsrl.rochester.edu/phy_labs/AppendixE/AppendixE.html

http://idt.net/~nelsonb/bridgman.html

http://www.brint.com/papers/science.htm

http://koning.ecsu.ctstateu.edu/Plants_Human/scimeth.html

http://ldolphin.org/SciMeth2.html

http://www.wsu.edu:8080/~meinert/SH.html

http://www.phys.tcu.edu/~dingram/edu/pine.html

now, i know there are tons more ... and, i offer no guarantees about the 
above ...






Re: hyp testing

2000-04-10 Thread Rich Ulrich

Just because Dennis has trouble with null hypotheses, that does
not mean that it is a bad idea to use them.

On 10 Apr 2000 08:41:06 -0700, [EMAIL PROTECTED] (dennis roberts)
wrote:

 the term 'null' does NOT mean 0 (zero) ... though it is misconstrued that way
 
 the term 'null' means a hypothesis that is the straw dog case ... for which 
 we are hoping that sample data will allow us to NULLIFY ...

 - this seemed okay in the first sentence.  However, I think that
"straw dog case" is what I would call "straw man argument", and that
is *not* the quality of argument of the null.  The point-null is
always false, but we state the null so that it is "reasonable" to
accept it, or to require data in order to reject it.

 in some cases, the null happens to be 0 ... but in many cases, it does not
 
 cases in point:
 
 1. null hypothesis is that the population variance for IQ is 225
  snip, similar stuff, and other stuff

 -- I think I reject the null when it is stated, " ... is 225".  Sure,
the point-null is false.  But state it, "The difference between the
Variance and 225 is zero" -- and then you require data to show that there
is evidence that the difference should be accepted as other than 0.

We either *accept*  that the difference is (may be) zero, or we
*reject* and have some other difference.  We do not *conclude* that
the difference is zero.

There have been posts in the last two weeks on sci.stat.consult
concerning the testing of bioequivalence -- and that is a case where
the null is more complicated.  Generally, the ALTERNATIVE is that the
new drug falls between 80% and 125% of the potency of the old drug.
(Those are the limits that the FDA cares about.)  The null is that the
new is Greater than 125% or Less than 80% of the old.

If we have rotten evidence, with bad means, or huge standard
deviations, then we have to accept the null; the new may be unlike the
old.  That is not a weak "strawman" - that is a reasonable, default
alternative.  The standard testing is called Two One-Sided Tests, to
show that the amount is definitely less than the Upper limit, and
definitely greater than the Lower.  Basically, you need to construct a
confidence interval on the difference and have it fall completely
within the limits.
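
A sketch of that logic in Python: a 90% confidence interval (equivalent to
two one-sided 5% tests) checked against the limits on the log scale, where
the 80%-125% bounds are symmetric; the potency ratios are invented:

# TOST for bioequivalence: declare equivalence only if the 90% CI for the
# mean log potency ratio lies entirely inside (log 0.80, log 1.25).
import numpy as np
from scipy import stats

log_ratio = np.log([0.95, 1.10, 0.88, 1.02, 1.15,
                    0.97, 1.05, 0.92, 1.08, 1.00])   # invented data
mean, se = log_ratio.mean(), stats.sem(log_ratio)
lo, hi = stats.t.interval(0.90, df=log_ratio.size - 1, loc=mean, scale=se)

lower, upper = np.log(0.80), np.log(1.25)
print(f"90% CI for mean log ratio: ({lo:.3f}, {hi:.3f})")
print("equivalent" if lower < lo and hi < upper
      else "accept the null: not shown equivalent")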

-- 
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html





Re: hyp testing

2000-04-10 Thread dennis roberts

the logic behind the null hypothesis method is flawed ... IF you are 
looking for truth AND you keep following the logic of testing AGAINST a 
null ...

first, say you reject the null of rho = 0 ...

then, LOGICALLY ... this says that since we don't know what truth is ... 
just what we think it isn't ... we go

second, make the null as rho = .05 ... then .1, then .15 ... on and on

UNTIL we reach that magical spot (if ever) ... when we had the null of rho 
= .65 ... and we suddenly RETAINED the null!

i guess we know what the truth is now, or, do we?
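
(dennis's sweep, done in one step: the set of rho values a .05-level test
would retain is exactly the 95% confidence interval; a python sketch using
the Fisher z transformation, with r and n assumed for illustration)

import numpy as np

r, n = 0.65, 50                   # assumed sample correlation and size
z = np.arctanh(r)                 # Fisher z transform
se = 1 / np.sqrt(n - 3)
lo, hi = np.tanh(z - 1.96 * se), np.tanh(z + 1.96 * se)
# every null inside this interval would be "retained" at alpha = .05
print(f"nulls retained: rho in ({lo:.2f}, {hi:.2f})")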

At 05:20 PM 4/10/00 -0400, Rich Ulrich wrote:
Just because Dennis has trouble with the null hypothesis, that does
not mean that it is a bad idea to use them.

maybe not ... but i don't see that many if any reasons why and the 
discussions are not swaying me ... (of course, that is not the posters 
faults ... maybe just mine)









Re: hyp testing

2000-04-10 Thread David A. Heiser




- Original Message -
From: Michael Granaas [EMAIL PROTECTED]

 Our current verbal labels leave much to be desired.  Depending on who you
 ask, the "null hypothesis" is
 a) a hypothesis of no effect (nil hypothesis);
 b) an a priori false hypothesis to be rejected (straw dog hypothesis);
 c) an a priori plausible hypothesis to be tested and falsified or
 corroborated (wish I had a term for this usage/real null?)

--
The concept of a hypothesis is important. It can be used to teach an
important statistical concept.

Let us suppose there are many plausible hypotheses. These include the "nil
hypothesis", any a priori hypotheses, any idea at all that may be
considered. Refer to these in terms of the set of all plausible hypotheses
(including that of no effect) that are to be tested.

The process is to pick each hypothesis and test it. The outcome of the test
is not only a probability, but a reality check (the investigator's belief
system). THE OUTCOME CAN ONLY BE BINARY, REJECTION OR NON-REJECTION.
Non-rejection is not acceptance. It just means that under non-rejection,
the hypothesis is in the set of all hypotheses that were not rejected. The
process does not pick out the true hypothesis; it never can do that. It can
only reject those hypotheses that have little chance of fitting the data.
You can ignore them then. You have to use other techniques to pick the
acceptable hypothesis out of all those in the "not rejected" set. Any
"verbal or mathematical summary" is acceptable (that is in the set of
non-rejected hypotheses) (Pearson 1892, p22).

As R.A. Fisher said (re a 0.05 level of significance in testing
hypotheses), it "does not mean that he allows himself to be deceived once
in twenty experiments. The test of significance only tells him what to
ignore, namely all experiments in which significant test results are not
obtained" (Fisher 1929b, p 191). Fisher also said "a test of significance
contains no criteria for 'accepting' a hypothesis" (Fisher 1937, p 45).

DAHeiser


cluster analysis

2000-04-10 Thread Elisa Wood

Can anyone help with good resources on the web, journals, books, etc. on
cluster analysis - similarity and ordination?  Any recommended programs
for this type of analysis too.

Cheers
Elisa Wood
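
Not a web resource, but as a free starting point, a minimal
hierarchical-clustering sketch in Python with scipy; the site-by-measurement
data are invented:

# Agglomerative clustering from a pairwise dissimilarity matrix -- the usual
# starting point for similarity work; ordination would follow separately.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import pdist

rng = np.random.default_rng(0)
X = rng.normal(size=(12, 4))            # 12 sites x 4 measurements, invented
d = pdist(X, metric="euclidean")        # pairwise dissimilarities
Z = linkage(d, method="average")        # UPGMA, common in ecology
print(fcluster(Z, t=3, criterion="maxclust"))   # cut into 3 clusters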


