Re: Interpreting p-value = .99

2001-11-29 Thread Stan Brown

Gus Gassmann <[EMAIL PROTECTED]> wrote in sci.stat.edu:
>Stan Brown wrote:
>> "The manufacturer of a patent medicine claims that it is 90%
>> effective(*) in relieving an allergy for a period of 8 hours. In a
>> sample of 200 people who had the allergy, the medicine provided
>> relief for 170 people. Determine whether the manufacturer's claim
>> was legitimate, to the 0.01 significance level."

>> But -- and in retrospect I should have seen it coming -- some
>> students framed the hypotheses so that the alternative hypothesis
>> was "the drug is effective as claimed." They had
>> Ho: p <= .9; Ha: p > .9; p-value = .9908.
>
>I don't understand where they get the .9908 from. 

x=170, n=200, p'=.85, Ha: p>.9, alpha=.01
z = -2.357
On TI-83, normalcdf(-2.357,1E99) = .9908; i.e., 99.08% of the area 
is above z = -2.357.
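
For anyone without a TI-83, the same arithmetic in a few lines of R
(a sketch, using the normal approximation the problem assumes):

  phat <- 170/200                # observed proportion, .85
  se   <- sqrt(.9 * .1 / 200)    # SE under Ho: p = .9, about .0212
  z    <- (phat - .9) / se       # about -2.357
  pnorm(z)                       # lower tail, .0092  (Ha: p < .9)
  1 - pnorm(z)                   # upper tail, .9908  (Ha: p > .9)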


-- 
Stan Brown, Oak Road Systems, Cortland County, New York, USA
  http://oakroadsystems.com
My reply address is correct as is. The courtesy of providing a correct
reply address is more important to me than time spent deleting spam.


=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=



Optimal filtering question

2001-11-29 Thread Alex Zhu

Hi All, 

Suppose we have a stochastic process
with an unknown parameter (the parameter
is meant in a general sense: it may be a stochastic
mean of the process, in which case its current value
is also a parameter).
We observe the dynamics of this process 
and update our estimate of this parameter. 

It may be the case that our estimate of
this parameter will always be imprecise,
in the sense that the variance of the estimator
is greater than zero and does not converge
to zero (as in the case of learning
about a stochastic mean).

However, it seems that if we start from
different priors about this parameter,
then the estimates x1(t) and x2(t)
obtained with priors x1(0) and x2(0) respectively
always converge to each other as time t goes
to infinity.
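
To make the question concrete, here is a minimal sketch in R of the
kind of case I mean: a random-walk mean observed with noise, tracked
by a scalar Kalman filter started from two different priors. (The
variances q and r and the priors are invented for illustration.)

  set.seed(1)
  n <- 500; q <- 0.01; r <- 1           # assumed state/observation variances
  m <- cumsum(rnorm(n, 0, sqrt(q)))     # latent stochastic mean
  y <- m + rnorm(n, 0, sqrt(r))         # observations
  kalman <- function(x0, P0) {          # scalar Kalman filter
    est <- numeric(n); x <- x0; P <- P0
    for (t in 1:n) {
      P <- P + q                        # predict
      K <- P / (P + r)                  # gain
      x <- x + K * (y[t] - x)           # update
      P <- (1 - K) * P
      est[t] <- x
    }
    est
  }
  x1 <- kalman(-10, 1); x2 <- kalman(10, 1)
  max(abs(x1 - x2)[401:500])            # negligible: the estimates have merged

Here the filtering variance P stays bounded away from zero, yet the
two estimates converge to each other, which is exactly the behaviour
I am asking about in general.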

Is this always true?
If yes, is there a theorem stating this?
If not, is there a counterexample?

Many thanks 

Alex





Re: t vs. z - recapitulation

2001-11-29 Thread Rich Ulrich

 - I am just taking a couple of questions in this note -

On Thu, 29 Nov 2001 13:16:24 +0100, "Gaj Vidmar"
<[EMAIL PROTECTED]> wrote:
[ ... ]

I saw some decent comments about the table;  the table
was not very useful.

z is used with large N as a 'sufficiently good' approximation to t.

z is used when "variance is known" -- in particular, when the
statistics are computed from dichotomies or from ranks (if there
were no ties).  Those are basically the cases where variances are
known.  With big samples and ties, you are probably better off
doing your rank transformation and then using the F-test or t-test.
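
A throwaway illustration of that rank-then-test route in R, with
invented data -- the point is only the mechanics:

  set.seed(2)
  x <- rexp(60); y <- rexp(80, rate = 1.3)    # made-up skewed samples
  g <- factor(rep(c("x", "y"), c(60, 80)))
  r <- rank(c(x, y))                          # mid-ranks absorb any ties
  t.test(r ~ g)                               # ordinary t-test on the ranks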

Depending on the outliers, ANOVA (t-tests) might be useless,
regardless of how big the sample *might* get -- that happens
more often than careless analysts expect, when they don't
watch for outliers.  If you can't opt for a parametric transform,
you might need to test after rescoring as 'ranks' or into categories
(two or more).


> 
> Note 2: t-test is very robust (BTW, is Boneau, 1960, Psychological Bulletin
 - not in *my* opinion (see below) -

> vol. 57, referenced and summarised in Quinn and McNemar, Psychological
> Statistics, 4th ed. 1969, with the nice introduction "Boneau, with the
> indispensable help of an electronic computer, ...", still an adequate
> reference?), whereby:
> - skewness, even extreme, is not a big problem
> - two-tailed testing increases robustness

 - I was annoyed when I learned that those old-line authors
would decide that a test was 'robust with two tails' when it
rejected 9% in one tail and 1% in the other.  It felt somewhat
like having been lied to.  I still disagree with that opinion.

Fortunately, the *problem*  of really-bad-p-values (in both
directions)  does not exist for equal N.  Unfortunately, even
for equal N, there *can*  be a great loss in statistical power.
So, you should be unhappy to see great skewness.

But for either equal or unequal N, I am much happier if I can
trust that the variable has been transformed to its proper
metric, and that the metric shows no skewness or
heterogeneity of variance.

If a variable *needs*  to be transformed, please transform it.
(But the 'need' issue is worth its own discussion.)

> - unequal variances are a serious problem with unequal N's with larger
> variance of smaller sample

Oh, a problem with larger variance in EITHER sample.
A different problem, one way or the other.

Outliers cause loss of power for ANOVA, just as much as
outliers screw up a mean.  If you see outliers, ask: are you
sure ANOVA is the right tool?

> 
> Now, what to do if t is inadequate? - This is a whole complex issue in
> itself, so just a few thoughts:
> - in case of extreme skewness, Mann-Whitney is not a good alternative
> (assumes symmetric distrib.), right?
 [ ... ]

It assumes the *same* distribution in the two samples, not necessarily
a symmetric one.  What is your hypothesis going to be?  What can you
fairly conclude, if one sample occupies both ends of the distribution?

-- 
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html





Re: Normal distribution

2001-11-29 Thread Rich Ulrich

On Thu, 29 Nov 2001 14:37:14 -0400, Gus Gassmann
<[EMAIL PROTECTED]> wrote:

> Rich Ulrich wrote:
> 
> > On Thu, 29 Nov 2001 15:48:48 +0300, Ludovic Duponchel
> > <[EMAIL PROTECTED]> wrote:
> >
> > > If x values have a normal distribution, is there a normal distribution
> > > for x^2 ?
> >
> > If z is standard normal [ that is, mean 0, variance 1.0 ]
> > then z^2  is chi squared with 1 degree of freedom.
> >
> > And the sum of S   independent  z  variates
> > is chi squared with S degrees of freedom.
> 
> Hold it! The sum of S independent z variates is normal.
> The sum of the _squares_ of S independent z variates is
> chi squared with S degrees of freedom.
> 
> (But I am sure you knew that.)

Oops -- make that <z^2> for <z>, of course.


-- 
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html





Re: Normal distribution

2001-11-29 Thread Dick Startz

And to add on to Rich Ulrich's note, if the mean isn't zero, then z^2
is non-central chi-square.
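
A quick simulation check, for anyone curious (the noncentrality
parameter is the squared mean; a sketch in R):

  set.seed(7)
  mu <- 2                                  # arbitrary nonzero mean
  z2 <- (rnorm(1e5) + mu)^2                # squares of N(mu, 1) draws
  ks.test(z2, pchisq, df = 1, ncp = mu^2)  # should NOT reject
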
-Dick Startz

On Thu, 29 Nov 2001 12:29:47 -0500, Rich Ulrich <[EMAIL PROTECTED]>
wrote:

>On Thu, 29 Nov 2001 15:48:48 +0300, Ludovic Duponchel
><[EMAIL PROTECTED]> wrote:
>
>> If x values have a normal distribution, is there a normal distribution
>> for x^2 ?
>
>If z is standard normal [ that is, mean 0, variance 1.0 ]
>then z^2  is chi squared with 1 degree of freedom.
>
>And the sum of S   independent  z  variates
>is chi squared with S degrees of freedom.

--
Richard Startz  [EMAIL PROTECTED]
Lundberg Startz Associates





Re: Interpreting p-value = .99

2001-11-29 Thread Alan McLean

Gus, 

Stan's two alternatives were correct as stated - they were two one-sided
tests, not a one-sided and a two-sided test.

Stan, in practical terms, the conclusion 'fail to reject the null' is
simply not true. You do in reality 'accept the null'. The catch is that
this is, in the research situation, a tentative acceptance - you
recognise that you may be wrong, so you carry forward the idea that the
null may be 'true' but - on the sample evidence - probably is not.

On the other hand, this should also be the case when you 'reject the
null' - the rejection may be wrong, so the rejection is also tentative.
The difference is that the null has this privileged position.

In areas like quality control, of course, it is quite clear that you
decide, and act as if, the null is true or is not true.

Regards,
Alan



Gus Gassmann wrote:
> 
> Stan Brown wrote:
> 
> > On a quiz, I set the following problem to my statistics class:
> >
> > "The manufacturer of a patent medicine claims that it is 90%
> > effective(*) in relieving an allergy for a period of 8 hours. In a
> > sample of 200 people who had the allergy, the medicine provided
> > relief for 170 people. Determine whether the manufacturer's claim
> > was legitimate, to the 0.01 significance level."
> >
> > (The problem was adapted from Spiegel and Stevens, /Schaum's
> > Outline: Statistics/, problem 10.6.)
> >
> > I believe a one-tailed test, not a two-tailed test, is appropriate.
> > (It would be silly to test for "effectiveness differs from 90%" since
> > no one would object if the medicine helps more than 90% of
> > patients.)
> >
> > Framing the alternative hypothesis as "the manufacturer's claim is
> > not legitimate" gives
> > Ho: p >= .9; Ha: p < .9; p-value = .0092
> > on a one-tailed z-test. Therefore we reject Ho and conclude that the
> > drug is less than 90% effective.
> >
> > But -- and in retrospect I should have seen it coming -- some
> > students framed the hypotheses so that the alternative hypothesis
> > was "the drug is effective as claimed." They had
> > Ho: p <= .9; Ha: p > .9; p-value = .9908.
> 
> I don't understand where they get the .9908 from. Whether you test a
> one-or a two-sided alternative, the test statistic is the same. So the
> p-value for the two-sided version of the test should be simply twice
> the p-value for the one-sided alternative, 0.0184. Hence the paradox
> you speak of is an illusion.
> 
> Unfortunately for you, the two versions of the test lead to different
> conclusions. If the correct p-value is given, I would give full marks
> (perhaps, depending on how much the problem is worth overall,
> subtracting 1 out of 10 marks for the nonsensical form of Ha).
> 

-- 
Alan McLean ([EMAIL PROTECTED])
Department of Econometrics and Business Statistics
Monash University, Caulfield Campus, Melbourne
Tel:  +61 03 9903 2102Fax: +61 03 9903 2007





Re: Interpreting p-value = .99

2001-11-29 Thread Dennis Roberts

forget the statement of the null

build a CI ... perhaps 99% (which would correspond to your .01 sig. test) ...

let that help to determine if the claim seems reasonable or not

in this case ... p hat = .85 ... thus q hat = .15

the standard error of a proportion (given SRS was done) is about

standard error of p hat = sqrt ((p hat * q hat) / n) = sqrt (.85 * .15 / 200)
= about .025

approximate 99% CI would be about p hat +/-  2.58 * .025 = .85 +/- .06

CI would be about .79 to .91 ... so, IF you insist on a hypothesis test ... 
retain the null

personally, i would rather say that the pop. proportion might be between 
(about) .79 to .91 ...

doesn't hold me to .9

problem here is that if you have opted for .05 ... you would have rejected 
... (just barely)
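
in R, the same arithmetic (a sketch, same normal approximation) ...

  phat <- 170/200; n <- 200
  se <- sqrt(phat * (1 - phat) / n)       # about .025
  phat + c(-1, 1) * qnorm(.995) * se      # 99% CI ... about .785 to .915
  phat + c(-1, 1) * qnorm(.975) * se      # 95% CI ... about .800 to .900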

At 02:39 PM 11/29/01 -0500, you wrote:
>On a quiz, I set the following problem to my statistics class:
>
>"The manufacturer of a patent medicine claims that it is 90%
>effective(*) in relieving an allergy for a period of 8 hours. In a
>sample of 200 people who had the allergy, the medicine provided
>relief for 170 people. Determine whether the manufacturer's claim
>was legitimate, to the 0.01 significance level."
>
>(The problem was adapted from Spiegel and Stevens, /Schaum's
>Outline: Statistics/, problem 10.6.)
>
>
>I believe a one-tailed test, not a two-tailed test, is appropriate.
>(It would be silly to test for "effectiveness differs from 90%" since
>no one would object if the medicine helps more than 90% of
>patients.)
>
>Framing the alternative hypothesis as "the manufacturer's claim is
>not legitimate" gives
> Ho: p >= .9; Ha: p < .9; p-value = .0092
>on a one-tailed z-test. Therefore we reject Ho and conclude that the
>drug is less than 90% effective.
>
>But -- and in retrospect I should have seen it coming -- some
>students framed the hypotheses so that the alternative hypothesis
>was "the drug is effective as claimed." They had
> Ho: p <= .9; Ha: p > .9; p-value = .9908.
>
>Now as I understand things it is not formally legitimate to accept
>the null hypothesis: we can only either reject it (and accept Ha) or
>fail to reject it (and draw no conclusion). What I would tell my
>class is this: the best we can say is that p = .9908 is a very
>strong statement that rejecting the null hypothesis would be a Type
>I error. But I'm not completely easy in my mind about that, when
>simply reversing the hypotheses gives p = .0092 and lets us conclude
>that the drug is not 90% effective.
>
>There seems to be a paradox: The very same data lead either to the
>conclusion "the drug is not effective as claimed" or to no
>conclusion. I could certainly tell my class: "if it makes sense in
>the particular situation, reverse the hypotheses and recompute the
>p-value." Am I being over-formal here, or am I being horribly stupid
>and missing some reason why it _would_ be legitimate to draw a
>conclusion from p=.9908?
>
>--
>Stan Brown, Oak Road Systems, Cortland County, New York, USA
>   http://oakroadsystems.com
>My reply address is correct as is. The courtesy of providing a correct
>reply address is more important to me than time spent deleting spam.
>
>

_
dennis roberts, educational psychology, penn state university
208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED]
http://roberts.ed.psu.edu/users/droberts/drober~1.htm






Re: Interpreting p-value = .99

2001-11-29 Thread Gus Gassmann

Stan Brown wrote:

> On a quiz, I set the following problem to my statistics class:
>
> "The manufacturer of a patent medicine claims that it is 90%
> effective(*) in relieving an allergy for a period of 8 hours. In a
> sample of 200 people who had the allergy, the medicine provided
> relief for 170 people. Determine whether the manufacturer's claim
> was legitimate, to the 0.01 significance level."
>
> (The problem was adapted from Spiegel and Stevens, /Schaum's
> Outline: Statistics/, problem 10.6.)
>
> I believe a one-tailed test, not a two-tailed test, is appropriate.
> (It would be silly to test for "effectiveness differs from 90%" since
> no one would object if the medicine helps more than 90% of
> patients.)
>
> Framing the alternative hypothesis as "the manufacturer's claim is
> not legitimate" gives
> Ho: p >= .9; Ha: p < .9; p-value = .0092
> on a one-tailed z-test. Therefore we reject Ho and conclude that the
> drug is less than 90% effective.
>
> But -- and in retrospect I should have seen it coming -- some
> students framed the hypotheses so that the alternative hypothesis
> was "the drug is effective as claimed." They had
> Ho: p <= .9; Ha: p > .9; p-value = .9908.

I don't understand where they get the .9908 from. Whether you test a
one-or a two-sided alternative, the test statistic is the same. So the
p-value for the two-sided version of the test should be simply twice
the p-value for the one-sided alternative, 0.0184. Hence the paradox
you speak of is an illusion.

Unfortunately for you, the two versions of the test lead to different
conclusions. If the correct p-value is given, I would give full marks
(perhaps, depending on how much the problem is worth overall,
subtracting 1 out of 10 marks for the nonsensical form of Ha).








Interpreting p-value = .99

2001-11-29 Thread Stan Brown

On a quiz, I set the following problem to my statistics class:

"The manufacturer of a patent medicine claims that it is 90% 
effective(*) in relieving an allergy for a period of 8 hours. In a 
sample of 200 people who had the allergy, the medicine provided 
relief for 170 people. Determine whether the manufacturer's claim 
was legitimate, to the 0.01 significance level."

(The problem was adapted from Spiegel and Stevens, /Schaum's
Outline: Statistics/, problem 10.6.)


I believe a one-tailed test, not a two-tailed test, is appropriate.
(It would be silly to test for "effectiveness differs from 90%" since
no one would object if the medicine helps more than 90% of
patients.)

Framing the alternative hypothesis as "the manufacturer's claim is 
not legitimate" gives
Ho: p >= .9; Ha: p < .9; p-value = .0092
on a one-tailed z-test. Therefore we reject Ho and conclude that the
drug is less than 90% effective.

But -- and in retrospect I should have seen it coming -- some 
students framed the hypotheses so that the alternative hypothesis 
was "the drug is effective as claimed." They had
Ho: p <= .9; Ha: p > .9; p-value = .9908.

Now as I understand things it is not formally legitimate to accept 
the null hypothesis: we can only either reject it (and accept Ha) or 
fail to reject it (and draw no conclusion). What I would tell my 
class is this: the best we can say is that p = .9908 is a very 
strong statement that rejecting the null hypothesis would be a Type 
I error. But I'm not completely easy in my mind about that, when 
simply reversing the hypotheses gives p = .0092 and lets us conclude 
that the drug is not 90% effective.

There seems to be a paradox: The very same data lead either to the 
conclusion "the drug is not effective as claimed" or to no 
conclusion. I could certainly tell my class: "if it makes sense in 
the particular situation, reverse the hypotheses and recompute the 
p-value." Am I being over-formal here, or am I being horribly stupid 
and missing some reason why it _would_ be legitimate to draw a 
conclusion from p=.9908?

-- 
Stan Brown, Oak Road Systems, Cortland County, New York, USA
  http://oakroadsystems.com
My reply address is correct as is. The courtesy of providing a correct
reply address is more important to me than time spent deleting spam.





Re: Normal distribution

2001-11-29 Thread Gus Gassmann

Rich Ulrich wrote:

> On Thu, 29 Nov 2001 15:48:48 +0300, Ludovic Duponchel
> <[EMAIL PROTECTED]> wrote:
>
> > If x values have a normal distribution, is there a normal distribution
> > for x^2 ?
>
> If z is standard normal [ that is, mean 0, variance 1.0 ]
> then z^2  is chi squared with 1 degree of freedom.
>
> And the sum of S   independent  z  variates
> is chi squared with S degrees of freedom.

Hold it! The sum of S independent z variates is normal.
The sum of the _squares_ of S independent z variates is
chi squared with S degrees of freedom.

(But I am sure you knew that.)
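
For the record, a two-line simulation in R separates the two claims
(neither test should reject):

  set.seed(3)
  ks.test(rnorm(1e5)^2, pchisq, df = 1)        # z^2 is chi-square(1)
  ks.test(colSums(matrix(rnorm(5e5), nrow = 5)^2),
          pchisq, df = 5)                      # sum of 5 squares is chi-square(5)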






Re: N.Y. Times: Statistics, a Tool for Life, Is Getting Short Shrift

2001-11-29 Thread Jerry Dallal

I didn't think you had.  I thought your response was more along the
lines of, "Speaking of disease clusters...".  Actually, Robert
Dawson noted "a normal distribution would be unlikely to apply",
which is along the lines of my "I *think* there's an unfortunate
use of the word "normal" here, but I can't be sure."

I think it's great that the activities described in the original
article are taking place.  I had to smile when the article stated
that one result would be that students would be able to explain
something, but used language that left it unclear what they might be
explaining! But this is a minor quibble that should not be taken as
any way diminishing the enormous value of the work.

--Jerry

Rich Strauss wrote:
> 
> This has nothing to do with normal distributions, as Robert Dawson noted
> yesterday.  The article I cited makes no mention of normal distributions,
> and I didn't mean to imply that it did.
> 
> Rich Strauss
> 
> At 04:29 AM 11/29/01 +, Jerry Dallal wrote:
> >Rich Strauss <[EMAIL PROTECTED]> wrote:
> >:>If the trend continues nationwide, this newspaper could someday report
> >:>that an apparently alarming cluster of cancer cases has arisen in an
> >:>innocuous normal distribution, and students will be able to explain to
> >:>their parents what that means.
> >
> >: The reporting of cancer clusters already happens on a regular basis,
> >: including in the NYTimes.  An excellent article on "The Cancer-Cluster
> >: Myth" by Atul Gawande was published in The New Yorker, 8 Feb 99.  It was
> >: reprinted in "The Best American Science and Nature Writing" last year
> >: (2000, Houghton Mifflin).
> >
> >I'd be happy if *anyone* could explain to me what "an apparently
> >alarming cluster of cancer cases has arisen in an innocuous normal
> >distribution" means!  I *think* there's an unfortunate use of the word
> >"normal" here, but I can't be sure.
> 
> ===
> Richard E. Strauss  (806) 742-2719
> Biological Sciences (806) 742-2963 Fax
> Texas Tech University   [EMAIL PROTECTED]
> Lubbock, TX  79409-3131
> http://www.biol.ttu.edu/Faculty/FacPages/Strauss/Strauss.html
> ===
> 





Re: Normal distribution

2001-11-29 Thread Rich Ulrich

On Thu, 29 Nov 2001 15:48:48 +0300, Ludovic Duponchel
<[EMAIL PROTECTED]> wrote:

> If x values have a normal distribution, is there a normal distribution
> for x^2 ?

If z is standard normal [ that is, mean 0, variance 1.0 ]
then z^2  is chi squared with 1 degree of freedom.

And the sum of S   independent  z  variates
is chi squared with S degrees of freedom.

-- 
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html





Re: N.Y. Times: Statistics, a Tool for Life, Is Getting Short Shrift

2001-11-29 Thread Jim Eales

I believe I have seen a reference posted here to a teacher who would
challenge his students as follows:

Do one and only one of the following:
1.  flip a coin 200 times and record the outcomes
2.  make up the outcomes of 200 coin tosses without ever flipping a coin

Turn in your record of the actual or fictitious tosses and I will tell
you whether you flipped the coin or made it up.

The key was that the made-up flips never included enough runs of 6 or
more in a row.
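
A quick simulation shows why that works (a sketch in R):

  set.seed(4)
  longest <- replicate(10000,
    max(rle(sample(c("H", "T"), 200, replace = TRUE))$lengths))
  mean(longest >= 6)   # well over .9: some run of 6+ is close to certain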


--

Jim Eales Tel: 765 494-4212
Ag Econ/Purdue Univ   FAX: 765 494-9176
W. Lafayette, IN 47907-1145   [EMAIL PROTECTED]









Normal distribution

2001-11-29 Thread Ludovic Duponchel

If x values have a normal distribution, is there a normal distribution
for x^2 ?

Thanks a lot for your help.
Best regards.

Dr. Ludovic DUPONCHEL
UNIVERSITE DES SCIENCES DE LILLE
LASIR - Bât. C5
59655 Villeneuve d'Ascq
FRANCE.

Phone : 0033 3 20436661
Fax :   0033 3 20434902













Re: fractional factorial design / DOE

2001-11-29 Thread John Fava

[EMAIL PROTECTED] (Ken K.) wrote in message 
news:<[EMAIL PROTECTED]>...
> Focusing on designs that have resolution V or higher, your only two
> options are a full factorial (you don't want that) and a half fraction
> (2^{6-1}). If you six factors are A B C D E F, your design would have
> 32 runs and it will look like this:
> 
> StdOrdr   A   B   C   D   E   F
> 1 -1  -1  -1  -1  -1  -1
> 2 1   -1  -1  -1  -1  1
> 3 -1  1   -1  -1  -1  1
> 4 1   1   -1  -1  -1  -1
> 5 -1  -1  1   -1  -1  1
> 6 1   -1  1   -1  -1  -1
> 7 -1  1   1   -1  -1  -1
> 8 1   1   1   -1  -1  1
> 9 -1  -1  -1  1   -1  1
> 101   -1  -1  1   -1  -1
> 11-1  1   -1  1   -1  -1
> 121   1   -1  1   -1  1
> 13-1  -1  1   1   -1  -1
> 141   -1  1   1   -1  1
> 15-1  1   1   1   -1  1
> 161   1   1   1   -1  -1
> 17-1  -1  -1  -1  1   1
> 181   -1  -1  -1  1   -1
> 19-1  1   -1  -1  1   -1
> 201   1   -1  -1  1   1
> 21-1  -1  1   -1  1   -1
> 221   -1  1   -1  1   1
> 23-1  1   1   -1  1   1
> 241   1   1   -1  1   -1
> 25-1  -1  -1  1   1   -1
> 261   -1  -1  1   1   1
> 27-1  1   -1  1   1   1
> 281   1   -1  1   1   -1
> 29-1  -1  1   1   1   1
> 301   -1  1   1   1   -1
> 31-1  1   1   1   1   -1
> 321   1   1   1   1   1
> 
> Another possibility is to run a resolution IV design, which will take
> 16 runs, banking on the hope that all two-way interactions will turn
> out to be insignificant. In a resolution IV design, two-way
> interactions are confounded with each other, so, if ANY two-way
> interactions from the 16 run study are significant, you will have to
> "fold over" the design, which simply means to run the "other" 16 runs.
> 
> Depending on your testing environment, you may be able to simply run
> the other 16 runs for a total of 32 runs, but you'll be left assuming
> that the experimental conditions were equivalent between the first and
> second set of runs.
> 
> Hardliners might say that after running the 16-run experiment and
> finding a significant interaction, you need to fold the design
> over and rerun ALL 32 treatment combinations.
> 
> 
> RE: need to create an experimental design which is smaller (cheaper)
> than the full factorial design.
>   Thanks -- Eric



Eric,

Shouldn't the design matrix be orthogonal?  That is, no factor should
be correlated with any other factor in the design, right?  A
correlation matrix of the factors shows that the 32-trial design does
have correlations (some very high) between factors.  Is this correct?
Here's the correlation matrix below - it should be an identity matrix,
with all "1's" on the diagonal and all "0's" everywhere else (to
indicate zero correlation).


      A     B     C     D     E     F
A  1.00  0.05  0.11  0.22  0.43  0.87
B  0.05  1.00  0.00  0.00  0.00  0.00
C  0.11  0.00  1.00  0.00  0.00  0.00
D  0.22  0.00  0.00  1.00  0.00  0.00
E  0.43  0.00  0.00  0.00  1.00  0.00
F  0.87  0.00  0.00  0.00  0.00  1.00


John





Re: fractional factorial design / DOE

2001-11-29 Thread Allen McIntosh

In article <[EMAIL PROTECTED]>,
John Fava <[EMAIL PROTECTED]> wrote:
>Shouldn't the design matrix be orthogonal?
It is.

>       A     B     C     D     E     F
> A  1.00  0.05  0.11  0.22  0.43  0.87
> B  0.05  1.00  0.00  0.00  0.00  0.00
> C  0.11  0.00  1.00  0.00  0.00  0.00
> D  0.22  0.00  0.00  1.00  0.00  0.00
> E  0.43  0.00  0.00  0.00  1.00  0.00
> F  0.87  0.00  0.00  0.00  0.00  1.00

This is the correlation matrix of "order" with the first 5 factors.





Re: fractional factorial design / DOE

2001-11-29 Thread John Fava

Whoops!  Sorry, Ken.  Made a mistake.  Your design is orthogonal
(see below).

A   B   C   D   E   F
A   1.0 0.0 0.0 0.0 0.0 0.0
B   0.0 1.0 0.0 0.0 0.0 0.0
C   0.0 0.0 1.0 0.0 0.0 0.0
D   0.0 0.0 0.0 1.0 0.0 0.0
E   0.0 0.0 0.0 0.0 1.0 0.0
F   0.0 0.0 0.0 0.0 0.0 1.0
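
For what it's worth, the whole check fits in a few lines of R (a
sketch, assuming the generator F = ABCDE, which Ken's 32-run layout
appears to use):

  d <- expand.grid(A = c(-1, 1), B = c(-1, 1), C = c(-1, 1),
                   D = c(-1, 1), E = c(-1, 1))   # full 2^5, standard order
  d$F <- d$A * d$B * d$C * d$D * d$E             # half-fraction generator
  round(cor(d), 2)                               # identity: factors orthogonal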

I am also going to check the design I suggested, which has 25 trials,
to be sure that is also orthogonal.  Of course, as you said, this
would be a main-effects only design.

John






Re: N.Y. Times: Statistics, a Tool for Life, Is Getting Short Shrift

2001-11-29 Thread Robert J. MacG. Dawson


Speaking of normal distributions and cancer clusters, does anybody (a)
agree with me that the human race in general has a better "feel" for the
normal distribution than the binomial distribution, and the Poisson is
still worse - and (b) know of any experimental evidence for this?

That is, my conjecture is that if an untrained human thinks that there
is an unusually large collection of tall people, or larger-than-usual
apples, or whatever, in a collection, they are probably right; but there
is a tendency to expect more uniformity in Bernoulli and Poisson
processes than should be there.  People tend to see clusters of things
and streaks of events when they are not really there.

There is probably a reverse trend in the extreme tail; people probably
overestimate the probability of getting (say) red fifty times in a row
at Roulette simply because we don't have a good feel for really large
and small numbers. 
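
For scale, assuming an American wheel with 18 reds among 38 slots,
the numbers really are tiny (in R):

  (18/38)^50   # about 6e-17
  2^-50        # the fair-coin analogue, about 9e-16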


-Robert Dawson





Re: t vs. z - recapitulation

2001-11-29 Thread Robert J. MacG. Dawson



Gaj Vidmar wrote:


> 
> sample size       | distribution(s) | population var | appropriate test
> ------------------+-----------------+----------------+--------------------
> large (say, N>30) | normal          | known          | z (obvious)

No, here "large" is irrelevant. N=1 can be used...

> large             | not normal      | known          | z (CLT takes care of numerator)

N>30 may not be enough; or N=10 may be fine. 

> small             | not normal      | known          | still z, right??

You may need a made-to-order test here, and *will* if you define
"small" as "too small for the CLT to help".

> large             | normal          | estimated      | t (note 1 below)
> small             | normal          | estimated      | t (the case of Student)
> small             | not normal      | estimated      | mostly t (note 2 below)
> 
> Note 1: z before computer era and also OK due to Slutsky's theorem

Not necessary as a separate test even then, as interpolating between
the n=100 and n=infinity rows of the t table is intuitive and avoids an
unnecessary choice.
Traditional t tables did not have enough alpha levels, but that's
another story (see
http://www.amstat.org/publications/jse/v5n2/dawson.html for an
alternative; some books now use a similar table, e.g., Devore & Peck).


> 
> Note 2: t-test is very robust (BTW, is Boneau, 1960, Psychological Bulletin
> vol. 57, referenced and summarised in Quinn and McNemar, Psychological
> Statistics, 4th ed. 1969, with the nice introduction "Boneau, with the
> indispensable help of an electronic computer, ...", still an adequate
> reference?), whereby:
> - skewness, even extreme, is not a big problem
> - two-tailed testing increases robustness
> - unequal variances are a serious problem with unequal N's with larger
> variance of smaller sample
> 
> Now, what to do if t is inadequate? - This is a whole complex issue in
> itself, so just a few thoughts:
> - in case of extreme skewness, Mann-Whitney is not a good alternative
> (assumes symmetric distrib.), right?

For two samples, Wilcoxon-Mann-Whitney assumes symmetry *or* shift
alternative *or* transformability to one of these (eg, lognormal).

For paired and one-sample data, there's always the sign test; it
assumes practically nothing. Actually, differences of paired data are a
pretty good bet for symmetry if the populations are at all similar in
shape. (If they are not, one should think long and hard about whether
the comparison means anything anyway.) Thus one can usually use
signed-ranks on the differences.
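
A sketch in R of both fallbacks, on invented paired data:

  set.seed(5)
  before <- rexp(20); after <- before + rnorm(20, mean = 0.3)
  d <- after - before
  binom.test(sum(d > 0), length(d))   # sign test: uses only the signs
  wilcox.test(d)                      # signed-ranks: assumes d is symmetric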

-Robert Dawson





t vs. z - recapitulation

2001-11-29 Thread Gaj Vidmar

During years of passionate practice and round-the-clock chaotic
learning in the field of applied statistics, I have been desperately longing
to learn the fundamentals of mathematical statistics, as well as to start
working as a statistician. As the latter recently came true, I simply had to
make some notable progress in the former as well. Not to extend this
unnecessary introduction any further, let me just state that I simply cannot
find adequate words of praise for the role and value of the sci.stat
newsgroups in the whole story.

Now, to be even more lucky, this week I've been ill and thus found some
peace for studying, while at the same time the discussion on CLT and t vs. z
popped up. As a consequence, please allow me to ask for critiques of this
brief recapitulation of the issue.***

(please view in nonproportional font)

sample size       | distribution(s) | population var | appropriate test
------------------+-----------------+----------------+---------------------------------
large (say, N>30) | normal          | known          | z (obvious)
large             | not normal      | known          | z (CLT takes care of numerator)
small             | not normal      | known          | still z, right??
large             | normal          | estimated      | t (note 1 below)
small             | normal          | estimated      | t (the case of Student)
small             | not normal      | estimated      | mostly t (note 2 below)

Note 1: z before computer era and also OK due to Slutsky's theorem

Note 2: the t-test is very robust (BTW, is Boneau, 1960, Psychological Bulletin
vol. 57, referenced and summarised in Quinn and McNemar, Psychological
Statistics, 4th ed. 1969, with the nice introduction "Boneau, with the
indispensable help of an electronic computer, ...", still an adequate
reference?), whereby:
- skewness, even extreme, is not a big problem
- two-tailed testing increases robustness
- unequal variances are a serious problem with unequal N's when the larger
variance goes with the smaller sample

Now, what to do if t is inadequate? - This is a whole complex issue in
itself, so just a few thoughts:
- in case of extreme skewness, Mann-Whitney is not a good alternative
(assumes symmetric distrib.), right?
- so Kolmogorov-Smirnov? But where to find truly continuous variables,
especially in social sciences? Plus not very powerful with small N, right?
- so an exact permutation test, right? (Permutation Test with General Scores
in StatXact - the manual says in this special case it's called Pitman's
test; a small R sketch follows below)
- solution for the problematic unequal-variances case: take a random subsample
of the larger sample of the size of the smaller sample?? Or do a kinda
bootstrap - do it, say, 1000 times and take the average obtained p??? - Figured
these two out by myself, so surely they are utterly wrong. So transformation
(in real-life cases usually log, or the Box-Cox, which I am yet to
understand)?
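
For concreteness, here is the kind of permutation test I mean, in R
(a sketch with invented data; the p-value is approximate):

  set.seed(6)
  x <- rexp(15); y <- rexp(20, rate = 0.7)   # made-up skewed samples
  obs  <- mean(x) - mean(y)
  pool <- c(x, y)
  perm <- replicate(9999, {
    i <- sample(length(pool), length(x))     # relabel the pooled data
    mean(pool[i]) - mean(pool[-i])
  })
  mean(abs(perm) >= abs(obs))                # two-sided permutation p-value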

A big thanks for any comment,

Gaj Vidmar
University of Ljubljana, Faculty of Medicine
Institute of biomedical informatics

*** I try to be fully aware of how fundamentally wrong the quest for, and
view of, statistics as a collection of recipes is, no matter how diverse and
advanced the recipes may be; but the fact stays that masses of people still
encounter and/or are taught statistics precisely in this manner, preferably
with the collection being very limited, extremely outdated and mainly
faulty. And I speak from personal experience in its most extreme form here,
but in spite of having graduated in psychology, I dare at the same time most
strongly oppose any authority whatsoever and wherever who claims that this
is mostly due to, or the case in, the social sciences! - But fortunately, if
there is any real benefit of the Internet to humanity, it is the wealth of
statistics-related resources ... Anyhow, putting aside nonproductive
debates, let me just do my best to make the living case that the possibility
that the aforementioned approach and circumstances do not always lead to
their replication and proliferation is not zero. - Or, at least, since we
all know that even events with zero probability can happen, ... :) - Yes, to
exaggerate just a little, you can start by mastering hand-computed
point-biserial correlation, point&click transformations in SPSS after
regression diagnostics the next year, automate logistic regression with
interaction terms the following year, then speak about somebody named Tufte
to your new girlfriend all night long, and so forth and so forth, and
wouldya believe it, last month came the derivation of the distribution of
the minimum of n samples taken from an exponential distribution! And I'll be
damned if in 2002 some S-Plus or R simulations as part of serious research
in statistics don't happen! And yes, I can foresee and promise that - health
and means permitting - by retirement (i.e., after a few decades) even this
person doomed to subnormality by the psych degree will learn enough
mathematics to become a Bayesian :)







Re: Need help with a probability problem

2001-11-29 Thread Thom Baguley

Donald Burrill wrote:
> Then, again, you are asserting that this is not a probability problem but
> a measuring-skill problem.  Your postulate that the subsequent
> executioners must have reduced "probabilities" (or success rates) only
> makes sense if all executioners use the same method of execution:  a
> condition you have not heretofore required.  Surviving a fencing match
> with the first executioner needn't imply anything about one's ability to
> survive hand-to-hand combat with the second;  except insofar as the

Yes. There is no reason to suppose that such fencing ability is
strictly monotonic. In fact, anecdotal evidence suggests otherwise.
For example, the best executioner might be left-handed, but have his
handedness advantage removed when fighting a left-handed prisoner, etc.

Thom





Re: N.Y. Times: Statistics, a Tool for Life, Is Getting Short Shrift

2001-11-29 Thread Rich Strauss

This has nothing to do with normal distributions, as Robert Dawson noted
yesterday.  The article I cited makes no mention of normal distributions,
and I didn't mean to imply that it did.

Rich Strauss

At 04:29 AM 11/29/01 +, Jerry Dallal wrote:
>Rich Strauss <[EMAIL PROTECTED]> wrote:
>:>If the trend continues nationwide, this newspaper could someday report
>:>that an apparently alarming cluster of cancer cases has arisen in an
>:>innocuous normal distribution, and students will be able to explain to
>:>their parents what that means.
>
>: The reporting of cancer clusters already happens on a regular basis,
>: including in the NYTimes.  An excellent article on "The Cancer-Cluster
>: Myth" by Atul Gawande was published in The New Yorker, 8 Feb 99.  It was
>: reprinted in "The Best American Science and Nature Writing" last year
>: (2000, Houghton Mifflin).
>
>I'd be happy if *anyone* could explain to me what "an apparently 
>alarming cluster of cancer cases has arisen in an innocuous normal 
>distribution" means!  I *think* there's an unfortunate use of the word 
>"normal" here, but I can't be sure.

===
Richard E. Strauss  (806) 742-2719
Biological Sciences (806) 742-2963 Fax
Texas Tech University   [EMAIL PROTECTED]
Lubbock, TX  79409-3131 
http://www.biol.ttu.edu/Faculty/FacPages/Strauss/Strauss.html
===






www.counton.org : shortlisted for BETT 2002 Award for "Best Free On-line Learning Resource"

2001-11-29 Thread John Bibby


I have received the following from Will Bulman, webmaster of www.counton.org

=
Dear All

STOP PRESS  :  PLEASE COPY THIS EMAIL TO ALL WHO MIGHT BE INTERESTED

The Count On website has been shortlisted for the BETT 2002 Award for
"Best Free On-line Learning Resource". Last year the Maths Year 2000
website won this award, and we want to maintain the high profile for
mathematics.

The BETT Awards are based upon votes cast.

We therefore invite all teachers and supporters of maths to vote for our
website, and to encourage others to do likewise by copying this email.

The simplest way to vote is to visit www.counton.org and click on the link
to BETT. Count On is listed in the first category, and is the last website
mentioned on a shortlist of 6. (You do not have to vote in all categories;
your name and some details are asked for at the bottom of the form so you
can enter the prize draw for "£100s worth of educational software".)

Will Bulman

===


It would be good to keep the profile of mathematics high by voting for one
or other of the maths items featured on the shortlist.

If you think that www.counton.org is the best of the nominated websites,
then please take part in the voting process as indicated above.

JOHN BIBBY






Matrixer 4.4 is released

2001-11-29 Thread Alexander Tsyplakov

Dear Colleagues,

 Version 4.4 of the Matrixer econometric program was recently released.

 The following features were added:
# Dynamic forecast for the Box-Jenkins model (since version 4.3.1)
# Powerful capabilities for importing data from text files
There are also several other improvements and bug fixes.

 Note that there was a bug in version 4.3 related to
translation of the program captions from Russian to English. The
bug was fixed by translating the program completely into
English.

 For further details, and to download the program, please visit
one of the Matrixer sites
http://matrixer.narod.ru/
http://mtxr.hypermart.net/

E-mail: [EMAIL PROTECTED]

-
Alexander Tsyplakov
Novosibirsk State University
http://www.nsu.ru/ef/tsy/about_en.htm



