Re: normal approx. to binomial

2001-04-09 Thread Gary Carson

It's the proportion of successes (x/n) which has approximately a normal
distribution for large n, not the number of successes (x).
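A quick numerical check (a sketch; n and p below are arbitrary illustration values, not from the thread): compare an exact binomial probability with what the normal curve gives for the same event.

```python
# Sketch, not from the original post: compare an exact binomial probability
# with its normal approximation.  n and p are arbitrary illustration values.
import math

def binom_pmf(k, n, p):
    """Exact P(X = k) for X ~ Binomial(n, p)."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def norm_cdf(x):
    """Standard normal CDF, via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

n, p = 1000, 0.3
mu, sd = n * p, math.sqrt(n * p * (1 - p))
lo, hi = int(mu - sd), int(mu + sd)

# Exact probability that the proportion x/n lies in [lo/n, hi/n] ...
exact = sum(binom_pmf(k, n, p) for k in range(lo, hi + 1))
# ... and the normal approximation, with a continuity correction.
approx = norm_cdf((hi + 0.5 - mu) / sd) - norm_cdf((lo - 0.5 - mu) / sd)

print(round(exact, 4), round(approx, 4))
assert abs(exact - approx) < 0.01
```

(Scaling by the constant 1/n turns a statement about the count x into the corresponding statement about the proportion x/n, so the quality of the approximation is the same either way.)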

 
Gary Carson
http://www.garycarson.com


=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=



Re: In relation to t-tests

2001-04-09 Thread Donald Burrill

On Mon, 9 Apr 2001, Rich Ulrich wrote:

> On Mon, 09 Apr 2001 10:44:40 -0400, Paige Miller
> <[EMAIL PROTECTED]> wrote:
> 
> > "Andrew L." wrote:
>
AL> I am trying to learn what a t-test will actually tell me, in 
> simple terms.  <  snip  >, but I still don't quite
> understand the significance.
> 
PM> A t-test compares a mean to a specific value...or two means to each
> other...
>  [ ... ]
> 
RU> I remember my estimation classes, where the comparison was
RU> always to ZERO for means. 

Yes, that's what Paige said:  here the mean (mean difference, to be 
precise) is being compared to the specific value zero.
OR "two means to each other", since the allegation "X1 = X2" is 
equivalent to the allegation "(X1-X2) = 0".
That the hypothetical expectation is often zero (that is, null) is the 
reason why that hypothesis is colloquially called "the null hypothesis"; 
Lumsden argued that it were better called "the model-distributional 
hypothesis", but that apparently is too much of a mouthful for most 
folks.  There is, however, no formal or logical REQUIREMENT that the 
value expected under the model-distributional hypothesis be zero.

RU> To ONE, I guess, for ratios.
RU> Technically  speaking, or writing.

Someone else pointed out that if the ratio were of interest, one should 
probably be taking logarithms;  in which case the comparison of interest 
would be to log(1) = 0.
(Unless the ratio of interest were a ratio of variances;  but in that 
case the relevant distribution would not be a t distribution.)
 
RU> For instance, if the difference in averages X1, X2  is expected to 
RU> be zero, then  "{(X1-X2) -0 }"  ... is distributed as t . 

This is, I believe, technically inaccurate.  "{(X1-X2) - 0}" is 
distributed normally, or approximately so under a central limit theorem; 
in which case  "{(X1-X2) - 0}" divided by its estimated standard error is 
distributed as t .  Again technically, as the standard central t . 
("Standard", implying that the mean and standard deviation of the 
sampling distribution are 0 and 1 respectively;  "central", implying that 
the non-centrality parameter is zero.)
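The construction described here can be sketched numerically (invented data; the pooled, equal-variance form is assumed):

```python
# Sketch with invented data: build "{(X1-X2) - 0} divided by its estimated
# standard error" by hand, using the pooled (equal-variance) form.
import math

x1 = [5.1, 4.9, 6.2, 5.8, 5.5, 6.0]   # made-up sample from group 1
x2 = [4.2, 4.8, 5.0, 4.4, 4.6, 5.1]   # made-up sample from group 2

def mean(xs):
    return sum(xs) / len(xs)

def var(xs):
    """Unbiased sample variance."""
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

n1, n2 = len(x1), len(x2)
sp2 = ((n1 - 1) * var(x1) + (n2 - 1) * var(x2)) / (n1 + n2 - 2)  # pooled
se = math.sqrt(sp2 * (1 / n1 + 1 / n2))   # estimated SE of (X1bar - X2bar)

t = (mean(x1) - mean(x2) - 0) / se        # the 'minus zero' made explicit
df = n1 + n2 - 2
print(round(t, 3), df)                    # about 3.56 on 10 df here
```

The difference in means alone is (approximately) normal; only after dividing by its estimated standard error does the t distribution, on n1 + n2 - 2 degrees of freedom, apply.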

RU> It might look like a lot of equations with the 'minus zero'  
RU> seemingly tacked on, but  I consider this to be good form. 

No argument with that.  Nor with this:

RU> It formalizes as minus 
-- DFB.
 
 Donald F. Burrill [EMAIL PROTECTED]
 348 Hyde Hall, Plymouth State College,  [EMAIL PROTECTED]
 MSC #29, Plymouth, NH 03264 603-535-2597
 184 Nashua Road, Bedford, NH 03110  603-472-3742  






US government grants and scholarships for International students.

2001-04-09 Thread sonit

I would like some information on US government grants and scholarships for
International students for 8th grade.









Re: normal approx. to binomial

2001-04-09 Thread Jay Warner

one tech issue, one thinking issue, I believe.

1)   Tech:   if np _and_  n(1-p) are > 5, the distribution of binomial 
observations is considered 'close enough' to Normal.  So 'large n' is 
OK as far as it goes, but the rule fails when p, the p(event), gets 
very small.

Most examples you see in the books use p = .1 or .25 or so.  Modern 
industrial situations usually have p(flaw) around 0.01 and less.  Good 
production will  run under 0.001.  To reach the 'Normal approximation' 
level with p = 0.001, you have to have n = 5000.  Not particularly 
reasonable, in most cases.

If you generate the distribution for the situation with np = 5 and n = 
20 or more, you will see that it is still rather 'pushed' (tech term) up 
against the left side - your eye will balk at calling it normal.  But 
that's the 'rule of thumb.'  I have worked with cases, pushing it down 
to np = 4, and even 3.  However, I wouldn't want to put 3 decimal 
precision on the calculations at that point.
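One hedged way to see the 'pushed against the left side' effect without plotting: the closed-form binomial skewness, (1-2p)/sqrt(np(1-p)), for two cases that both sit right at the np = 5 boundary.

```python
# Sketch: binomial skewness at the np = 5 rule-of-thumb boundary.
# The two (n, p) pairs are illustration values in the spirit of the post.
import math

def binom_skew(n, p):
    """Skewness of a Binomial(n, p) distribution: (1-2p)/sqrt(np(1-p))."""
    return (1 - 2 * p) / math.sqrt(n * p * (1 - p))

print(binom_skew(20, 0.25))     # textbook-style case, np = 5
print(binom_skew(5000, 0.001))  # industrial case, also np = 5
```

A normal distribution has skewness 0; neither case is close, and the tiny-p industrial case is actually the more skewed of the two. For small p the skewness is roughly 1/sqrt(np), which is why pushing np well past 5 is what really matters.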

My personal suggestion: if you believe you have a binomial 
distribution, and you need confidence intervals or other 
applications of the distribution, then why not simply compute them 
directly from the binomial formulas?  Unless n is quite large, you will 
have to adjust the limits to suit the achievable observations anyway.  
For example, if n = 10, there is no sense in computing a 3-sigma limit 
of np = 3.678 - the observed count can only ever be 3, and then 4.  But 
that's the application level speaking here.
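That suggestion, sketched with made-up n and p: work a limit out from the binomial distribution itself, so it lands on an achievable integer count rather than on a value like 3.678.

```python
# Sketch of working limits directly from the binomial distribution.
# n and p are made-up illustration values, not from the post.
import math

def binom_cdf(k, n, p):
    """Exact P(X <= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p)**(n - i)
               for i in range(k + 1))

n, p = 10, 0.3
# Smallest integer limit covering at least 99.73% (the '3 sigma' level):
limit = next(k for k in range(n + 1) if binom_cdf(k, n, p) >= 0.9973)
print(limit)   # 7 for these values - an integer, as any observed count is
```

The exact tail probability attached to the limit can be read straight off the same CDF, with no normal approximation anywhere.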

2)  I think your books are saying that, when n is very large (or, I 
would say, when np > 5), the binomial measurement will fit a Normal 
distribution.  It will be discrete, of course, so it will look like a 
histogram, not a continuous density curve.  But you knew that.  I think 
your book is calling the binomial rv a single measurement, and it is the 
collection of repeated measurements that forms the distribution, no?  I 
explain a binomial measurement as: n pieces touched/inspected, x contain 
the 'flaw' in question, so p = x/n.  That p is now a single measurement 
in subsequent calculations.  To get a distribution of 100 proportion 
values, I would have to 'touch' 100*n pieces.  I guess that's OK, if you 
are paying the inspector.

Clearly, one of the drawbacks of a dichotomous measurement (either OK 
or not-OK) is that we have to measure a heck of a lot of them to start 
getting decent results.  The better the product (fewer flaws), the worse 
it gets.  See the situation for p = 0.001 above.  Eventually we don't 
bother inspecting, or we automate and do 100% inspection.  So the next 
paragraph had better explain the improved information from a 
continuous measure...

Sorry, I got up on my soap box by mistake.

Is this enough explanation?

Jay

James Ankeny wrote:

>   Hello,
> I have a question regarding the so-called normal approx. to the binomial
> distribution. According to most textbooks I have looked at (these are
> undergraduate stats books), there is some talk of how a binomial random
> variable is approximately normal for large n, and may be approximated by the
> normal distribution. My question is, are they saying that the sampling
> distribution of a binomial rv is approximately normal for large n?
> Typically, a binomial rv is not thought of as a statistic, at least in these
> books, but this is the only way that the approximation makes sense to me.
> Perhaps, the sampling distribution of a binomial rv may be normal, kind of
> like the sampling distribution of x-bar may be normal? This way, one could
> calculate a statistic from a sample, like the number of successes, and form
> a confidence interval. Please tell me if this is way off, but when they say
> that a binomial rv may be normal for large n, it seems like this would only
> be true if they were talking about a sampling distribution where repeated
> samples are selected and the number of successes calculated.

-- 
Jay Warner
Principal Scientist
Warner Consulting, Inc.
 North Green Bay Road
Racine, WI 53404-1216
USA

Ph: (262) 634-9100
FAX:(262) 681-1133
email:  [EMAIL PROTECTED]
web:http://www.a2q.com

The A2Q Method (tm) -- What do you want to improve today?







Re: normal approx. to binomial

2001-04-09 Thread Elliot Cramer

James Ankeny <[EMAIL PROTECTED]> wrote:
:  My question is, are they saying that the sampling
: distribution of a binomial rv is approximately normal for large n?
: 
It's a special case of the CLT for a binary variable with success
probability p: the binomial count is the sum of n such 0/1 observations.
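That point can be made concrete (p and n below are arbitrary): convolving the Bernoulli distribution with itself n times reproduces the binomial pmf exactly, which is why the CLT for sums applies to a binomial count.

```python
# Sketch: a Binomial(n, p) count is literally the sum of n independent
# 0/1 observations.  Convolve the Bernoulli pmf with itself n times and
# check the result against the closed-form binomial pmf.
import math

p = 0.3
bernoulli = [1 - p, p]           # pmf over {0, 1}

def convolve(a, b):
    """Distribution of the sum of two independent integer variables."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, pa in enumerate(a):
        for j, pb in enumerate(b):
            out[i + j] += pa * pb
    return out

n = 8
dist = [1.0]                      # pmf of an empty sum: always 0
for _ in range(n):
    dist = convolve(dist, bernoulli)

closed_form = [math.comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(n + 1)]
assert all(abs(a - b) < 1e-12 for a, b in zip(dist, closed_form))
```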






Re: In relation to t-tests

2001-04-09 Thread Rich Ulrich

On Mon, 09 Apr 2001 10:44:40 -0400, Paige Miller
<[EMAIL PROTECTED]> wrote:

> "Andrew L." wrote:
> > 
> > I am trying to learn what a t-test will actually tell me, in simple terms.
> > Dennis Roberts and Paige Miller have helped a lot, but I still don't quite
> > understand the significance.
> > 
> > Andy L
> 
> A t-test compares a mean to a specific value...or two means to each
> other...
 [ ... ]

I remember my estimation classes, where the comparison was
always to ZERO for means.  To ONE, I guess, for ratios.
Technically  speaking, or writing.

For instance,   if the difference in averages X1, X2  is expected to
be zero, then  "{(X1-X2) -0 }"  ... is distributed as t .   It might
look like a lot of equations with the 'minus zero'  seemingly tacked
on, but  I consider this to be good form.  It formalizes as
   minus 

-- 
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html





Re: Logistic regression advice

2001-04-09 Thread Rich Ulrich

On 6 Apr 2001 13:15:34 -0700, [EMAIL PROTECTED] (Zina Taran)
wrote:

 [ ... on logistic regression ]
ZT: "1). The 'omnibus' chi-squared for the equation.  Is it 
accurate to say that I can interpret individual significant
coefficients if (and only if) the equation itself is significant? "

Confused question.  Why do you label it the omnibus test?
'Omnibus' is (mainly) a ROLE: it describes the overall test, or a test 
that subsumes a coherent set of several tests; sometimes you use a 
test in that role, and sometimes you don't.

ZT: "2) A few times I added interaction terms and some things 
became significant.  Can I interpret these even if the interaction
variable itself (such as 'age') is not  significant?  Can I interpret
an interaction term if neither variable has a significant beta?"

Probably not.  Assuredly not, unless someone has used care and
attention (and knowledge) in the exact dummy-coding of the effects.


[ ... snip, 'Nagelkerke' that I don't recall; 'massive' regression
which is a term that escapes me, but I think it means, 'no hypotheses,
test everything'; and so I disapprove. ] 
ZT: "5) I know the general rule is 'just the facts'  in the results
section, meaning that there should be no explanation or 
interpretation regarding the results.  When writing the results
section do I specifically draw conclusions as to whether a 
hypothesis is supported or does that get left to the discussion?"

Do you have an Example that is difficult?  - It seems to me that 
if the analyses are straightforward, there should be little question
about what the 'results'  mean, when you lay them out in their own,
minimalist section.  In other words, leave discussion to the 
discussion; but that should be a re-cap of what's apparent.  
You hope.

-- 
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html





Re: Reverse of Fisher's r to z

2001-04-09 Thread Jerrold Zar

Yes, there are reasons for using the transformation from z to r.

And, there are published tables of this.  For example, Appendix Table
B.19 of Zar, Biostatistical Analysis, 4th ed., 1999.

Jerrold H. Zar, Professor
Department of Biological Sciences
Northern Illinois University
DeKalb, IL 60115
[EMAIL PROTECTED]
===
>>> Will Hopkins <[EMAIL PROTECTED]> 04/09/01 04:29AM >>>
It's elementary algebra, Cherilyn.  BTW, it's z = 0.5log..., not sqrt.

So r = (e^2z - 1)/(e^2z + 1).

Will






Re: Reverse of Fisher's r to z

2001-04-09 Thread Cherilyn Young


Thanks-- my algebra (and apparently my eyesight too) has gotten a bit
creaky around the edges, so I didn't trust it for something this
important.  Truly appreciate it!!!

Best,

Cherilyn

On Mon, 9 Apr 2001, Will Hopkins wrote:

> It's elementary algebra, Cherilyn.  BTW, it's z = 0.5log..., not sqrt.
> 
> So r = (e^2z - 1)/(e^2z + 1).
> 
> Will
> 
> 






Re: In relation to t-tests

2001-04-09 Thread Paige Miller

"Andrew L." wrote:
> 
> I am trying to learn what a t-test will actually tell me, in simple terms.
> Dennis Roberts and Paige Miller have helped a lot, but I still don't quite
> understand the significance.
> 
> Andy L

A t-test compares a mean to a specific value...or two means to each
other...

The reason we do this, if you are comparing two means, for example, is
that if the mean of group A is 6.75 and the mean of group B is 6.8, we
really want to know whether this difference is significant -- which is
another way of asking: how likely is it that a difference this large
arises purely out of random chance? This is a "layman's" way of
describing the t-test. The actual proper statistical description can be
found in numerous textbooks.

There are some assumptions being made in order to do this ... usually
that the data are normally, identically, and independently distributed
... approximately normal is usually okay

-- 
Paige Miller
Eastman Kodak Company
[EMAIL PROTECTED]

"It's nothing until I call it!" -- Bill Klem, NL Umpire
"Those black-eyed peas tasted all right to me" -- Dixie Chicks





Time Series Data. Significant movement

2001-04-09 Thread Philip Bouchier

I have a set of measurements (e.g. number of errors, faults, etc.) over a
period of time (e.g. 9 months), with measurements taken weekly. These
measurements are graphed on a spreadsheet. I need to select a small number
of measurements and graphs, then display the measurements and the graphs to
my audience.

My question is: is there a statistical way of selecting the set of
measurements that show movement up or down, other than just eyeballing the
graphs?
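One standard, hedged option (the weekly series below are invented): fit a least-squares line to each series and flag the ones whose slope is large relative to its standard error, say |t| > 2.

```python
# Sketch: screen weekly series for up/down movement via the t statistic
# of a least-squares slope.  Both series below are invented examples.
import math

def slope_t(y):
    """Return (slope per week, t statistic for slope = 0)."""
    n = len(y)
    x = list(range(n))
    xbar, ybar = (n - 1) / 2, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    a = ybar - b * xbar
    resid_ss = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(resid_ss / (n - 2) / sxx)   # SE of the slope
    return b, b / se

trending = [10, 12, 11, 14, 15, 17, 16, 19, 20, 22]  # drifts upward
flat     = [10, 12, 9, 11, 10, 12, 11, 10, 12, 11]   # no drift

b1, t1 = slope_t(trending)
b2, t2 = slope_t(flat)
print(round(t1, 2), round(t2, 2))
# Flag a series as 'moving' when |t| clearly exceeds ~2.
```

The |t| > 2 cutoff is only a screening rule of thumb: weekly counts are often autocorrelated, which inflates these t values, so the flagged series still deserve a look by eye.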

--
Philip
[EMAIL PROTECTED]









Re: Reverse of Fisher's r to z

2001-04-09 Thread Chuck Cleland

Cherilyn Young wrote:
> I have an itchy little question about the familiar Fisher's r to z
> transformation:  The formula, expressed as z= sqrt (log e ( (1+r)/(1-r))),
> is in pretty much any older stats textbook.  Does anyone know of a source
> where the equation is written to solve for r?  I know it's a very uncommon
> use (if used at all in this way ), but I've got a very legitimate research
> need (and my brain's doing odd things when I'm trying to rewrite the
> equation).

r.back <- function(x)
{
# back-transform z to r: (exp(2z) - 1)/(exp(2z) + 1), i.e. tanh(z)
(exp(2 * x) - 1)/(exp(2 * x) + 1)
}

fish.z <- function(x)
{
# Fisher's r-to-z: 0.5 * log((1 + r)/(1 - r)), i.e. atanh(r); the
# abs()/sign arithmetic keeps the transform odd, with 0 mapped to 0
ifelse(x == 0, 0, 0.5 * log((1 + abs(x))/(1 - abs(x))) * (x/abs(x)))
}

Examples:

> fish.z(.45)
[1] 0.4847003

> r.back(.4847003)
[1] 0.45

> r.back(fish.z(.45))
[1] 0.45

HTH,

Chuck

-<>-<>-<>-<>-<>-<>-<>-<>-<>-<>-<>-<>-<>-<>-<>-<>-
 Chuck Cleland 
 Institute for the Study of Child Development  
 UMDNJ--Robert Wood Johnson Medical School 
 97 Paterson Street
 New Brunswick, NJ 08903   
 phone: (732) 235-7699 
   fax: (732) 235-6189 
  http://www2.umdnj.edu/iscdweb/   
  http://members.nbci.com/cmcleland/ 
-<>-<>-<>-<>-<>-<>-<>-<>-<>-<>-<>-<>-<>-<>-<>-<>-





Re: Regression toward the Mean - search question

2001-04-09 Thread Gene Gallagher

>A few weeks ago, I believe on this list, a quick discussion of Galton's 
>regression to the mean popped up.  I downloaded some of Galton's data, 
>generated my own, and found some ways to express the effect in ways my 
>non-statistian education friends might understand.  Still working on 
>that part.
>
>In addition, there was a reference to a wonderful article, which I read, 
>and which explained the whole thing in excellent terms and clarity for 
>me.  The author is clearly an expert on the subject of detecting change 
>in things.  He (I think) even listed people who had fallen into the 
>regression toward the mean fallacy, including himself.
>
>Problem:  Now of course I really want that article again, and 
>reference.  I cannot find it on my hard drive.  Maybe I didn't download 
>it - it was large.  But I can't find the reference to it, either.  Bummer!
>
>Can anyone figure out who and what article I'm referring to, and 
>re-point me to it?
>
>Very much obliged to you all,
>Jay
>
>-- 
>Jay Warner
>Principal Scientist
>Warner Consulting, Inc.
> North Green Bay Road
>Racine, WI 53404-1216
>USA
>
Trochim's page has a nice description of the problem but with few historical
references:
http://trochim.human.cornell.edu/kb/regrmean.htm

Campbell, D. T. and D. A. Kenny 1999.  A primer on regression artifacts.
Guilford Press.  This book is devoted almost entirely to regression to the mean
and what to do about it.

Stigler, S. M. 1999. Statistics on the table. Harvard University Press. 
[Stigler has several essays on the discovery of RTM under the heading
"Galtonian Ideas".  He also presents a sobering case study of poor Horace
Secrist, whose 1933 magnum opus in econometrics is a classic RTM artifact.]
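The effect itself is easy to demonstrate by simulation (all numbers below are invented): score = true ability + independent noise on two occasions, then watch how the top scorers on occasion 1 fare on occasion 2.

```python
# Sketch of regression toward the mean: nothing changes between the two
# tests, yet the top group's mean falls back toward the overall average.
import random

random.seed(1)
true_scores = [random.gauss(100, 10) for _ in range(5000)]
test1 = [t + random.gauss(0, 10) for t in true_scores]  # ability + noise
test2 = [t + random.gauss(0, 10) for t in true_scores]  # same ability, new noise

# Take the top 10% on the first test...
cut = sorted(test1)[int(0.9 * len(test1))]
top = [i for i, s in enumerate(test1) if s >= cut]

m1 = sum(test1[i] for i in top) / len(top)
m2 = sum(test2[i] for i in top) / len(top)
print(round(m1, 1), round(m2, 1))
# ...their second-test mean sits closer to 100, with no real change at all.
assert m2 < m1
```

Selecting on a noisy measure guarantees apparent movement toward the mean on remeasurement, which is exactly the trap described in the thread.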
Eugene Gallagher
ECOS
UMASS/Boston





Re: rotations and PCA

2001-04-09 Thread Robert J. MacG. Dawson



Eric Bohlman wrote:
> In science, it's not enough to
> say that you have data that's consistent with your hypothesis; you also
> need to show a) that you don't have data that's inconsistent with your
> hypothesis and b) that your data is *not* consistent with competing
> hypotheses.  And there's absolutely nothing controversial about that last
> sentence [...]

Well, I'd want to modify it a little. On the one hand, a certain amount
of inconsistency can be (and sometimes must be) dealt with by saying
"every so often something unexpected happens"; otherwise it would only
take two researchers making inconsistent observations to bring the whole
structure of science crashing down.  And on the other hand there are
_always_ competing hypotheses. [Consider Jaynes' example of the
policeman seeing one who appears to be a masked burglar exiting from the
broken window of a jewellery store with a bag of jewellery; he (the
policeman) does *not* draw the perfectly logical conclusion that this
might be the owner, returning from a costume party, and, having noticed
that the window was broken, collecting his stock for safekeeping.] It is
sufficient to show that your data are not consistent with hypotheses
that are simpler or more plausible, or at least not much less simple or
plausible.

-Robert Dawson





No Subject

2001-04-09 Thread NEUMA TERESINHA NADAL







Funding for Posters: European Nutrition and Cancer Conference

2001-04-09 Thread Julie Dechy

Dear All,

The European Conference on Nutrition and Cancer will take place in Lyon, 
France on 21-24 June 2001.

An important feature of the conference is 2 large poster sessions on days 2 
and 3.  As indicated on the programme web site, in the GENERAL INFORMATION 
section, funds have been set aside to pay for travel expenses and lodging 
for 
up to 50 participants presenting posters.

Poster abstracts must be submitted by the 30th April, 2001.  Posters 
concerning studies of diet, nutrition, genetics, hormones, epidemiologic and 
statistical methods or other related areas of research are welcome.

The form for submitting abstracts is available on the Conference web site:

http://www.nutrition-cancer2001.com

For further information send email to:

[EMAIL PROTECTED]






Re: Reverse of Fisher's r to z

2001-04-09 Thread Will Hopkins

It's elementary algebra, Cherilyn.  BTW, it's z = 0.5log..., not sqrt.

So r = (e^2z - 1)/(e^2z + 1).
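In code terms (a sketch; note that Fisher's z is exactly atanh(r), so the reverse transform is just tanh(z)):

```python
# Sketch of Fisher's r-to-z and its reverse, as given in the thread.
import math

def r_to_z(r):
    """Fisher's transform: z = 0.5 * log((1 + r)/(1 - r)) = atanh(r)."""
    return 0.5 * math.log((1 + r) / (1 - r))

def z_to_r(z):
    """Reverse transform: r = (e^(2z) - 1)/(e^(2z) + 1) = tanh(z)."""
    return (math.exp(2 * z) - 1) / (math.exp(2 * z) + 1)

z = r_to_z(0.45)
print(round(z, 7))                         # approx. 0.4847003
assert abs(z_to_r(z) - 0.45) < 1e-12       # round trip recovers r
assert abs(z_to_r(z) - math.tanh(z)) < 1e-15
```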

Will






Reverse of Fisher's r to z

2001-04-09 Thread Cherilyn Young


Hi everyone,

I have an itchy little question about the familiar Fisher's r to z
transformation:  The formula, expressed as z= sqrt (log e ( (1+r)/(1-r))),
is in pretty much any older stats textbook.  Does anyone know of a source
where the equation is written to solve for r?  I know it's a very uncommon
use (if used at all in this way ), but I've got a very legitimate research
need (and my brain's doing odd things when I'm trying to rewrite the
equation).

Thanks in advance,

Cherilyn


