Who said "Correlation does not imply causation"?

2001-12-01 Thread Andrew Morse

Who was the first to say "Correlation does not imply causation" in so many
words?  I know that the idea dates back to David Hume, but Hume did his
work about a century before the term "correlation" acquired its modern
statistical meaning.  I've seen many sources that credit Karl Pearson with
banishing the idea of causation from modern statistical theory, but none
that attribute the quote directly to him.  Sewall Wright?  Francis Galton?
Ronald Fisher?  Any other candidates?

-- Andrew


=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=



What usually should be done with missing values when I am conducting a t test or other tests?

2001-12-01 Thread jenny

What should I do with the missing values in my data?  I need to perform
a t test of two samples to test the mean difference between them.

How should I handle them in S-Plus or SAS?

Thanks.
JJ
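
One common default, sketched here in Python rather than S-Plus or SAS, is to drop the missing observations from each sample separately and run the two-sample test on what remains. That is only reasonable if the values are missing for reasons unrelated to the measurement itself, which is an assumption and not something the question tells us. The data, the use of None as the missing-value marker, and the helper name welch_t are all made up for illustration.

```python
import math
from statistics import mean, variance

def welch_t(sample_a, sample_b):
    """Welch two-sample t statistic and approximate df, skipping missing values (None)."""
    a = [x for x in sample_a if x is not None]
    b = [x for x in sample_b if x is not None]
    # Squared standard errors of the two sample means
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation to the degrees of freedom
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df, len(a), len(b)

# Hypothetical data; None marks a missing value
group_1 = [5.1, 4.8, None, 5.6, 5.0, None, 5.3]
group_2 = [4.2, None, 4.6, 4.4, 4.9, 4.1]

t, df, n1, n2 = welch_t(group_1, group_2)
print(f"n1={n1}, n2={n2}, t={t:.3f}, df={df:.1f}")
```

It is worth reporting how many values were dropped from each sample alongside the test result, since the sample sizes actually used are smaller than the ones collected.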





Re: experimental design(doubt)?

2001-12-01 Thread dennis roberts

i think this points out that it is hard to really give good responses 
sometimes when all the details are not known ... in this case, we really 
don't have sufficient information on HOW samples were selected and 
assigned, METHODS and orders that items were heated and then porcelain 
applied, and on and on

for it to be totally randomized, we would have to have something akin to 
... having 72 samples ... assigning them to temp and porcelain type first 
... then executing this "design" in that order ...

could be that sample 1 got 430 degrees and type b, sample 2 might have 
gotten temp 700 with porcelain type b, and so on
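
The fully randomized plan described above can be sketched in a few lines of Python. The temperature labels are placeholders (the actual set of temperatures is not given in this message), the seed is arbitrary, and the function name is made up; this illustrates the idea and is not a record of what was done in the lab.

```python
import random

def randomized_plan(n_samples=72, temps=("temp 1", "temp 2", "temp 3"),
                    types=("A", "B"), seed=2001):
    """Assign samples to temp/porcelain cells at random, then randomize run order."""
    rng = random.Random(seed)
    cells = [(t, p) for t in temps for p in types]
    per_cell = n_samples // len(cells)            # 72 / 6 = 12 per combination
    treatments = cells * per_cell                 # balanced list of 72 treatments
    rng.shuffle(treatments)                       # random assignment to samples 1..72
    plan = list(enumerate(treatments, start=1))   # (sample id, (temp, type))
    rng.shuffle(plan)                             # random order of execution
    return plan

plan = randomized_plan()
for sample_id, (temp, ptype) in plan[:5]:
    print(f"sample {sample_id:2d}: heat to {temp}, porcelain type {ptype}")
```

Running the samples one at a time in this order is what would make furnace-to-furnace variation part of the random error instead of part of the temperature effect.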

but, we probably know that is NOT what happened ... because that would have 
created implementation problems

we know in factories ... there are runs of different items at different 
times ... they might have a run of X from 8AM to NOON, then there is a 
transition period before Y gets done from 1PM to 5PM ...

in this instance, it probably was the case that all 24 were heated to the 
first temp ... then when that was all done, the oven was revved up to the 
next higher temp and then the next 24 were heated ... and a final heat up 
to the last temp saw the final 24 done

so, this is not exactly a totally randomized plan ... since there could 
have been some systematic difference from one batch to the other

we also don't know how the porcelain was applied ... it might have been 
that after all were heated to temp 1 ... then the first 12 that came out of 
the oven were given porcelain type A ... since this was easier to do ... 
then the last 12 got (after the change over) porcelain type B

if either temp or type of porcelain made a BIG difference, these procedural 
details probably don't make a hill of beans of difference but, of course, 
if the impacts (though maybe real) were very small, then some systematic 
error might make a difference

as i said ... we just don't know

however, i think trying to give the "design" the proper NAME is really not 
that important ... the really important matter is whether the implementation 
of the plan was close enough to a fully randomized design that 
... an analysis according to that design would be satisfactory in this case

bottom line is: snippets of information given to the list ... do not 
necessarily allow us to field the ensuing inquiries ... and, it is better 
to probe more about methods and procedures first ... than to rush off with 
some analysis conclusion

BUT WE DO IT ANYWAY!





>If the samples have been treated independently, that is each sample is
>individually raised to the randomly assigned temperature and
>subsequently treated with the assigned porcelain type, the design is a
>completely randomized design. Any application effects (including
>possible deviations of supposed temperatures and irregularities during
>the whole process of heating and subsequent cooling) will contribute to
>the random error of the observations. But when all samples of the same
>temperature treatment are simultaneously put in the furnace and
>treated as one batch the situation is different. In that case
>application effects (whose existence and magnitude are not known
>generally) are confounded with the effects of the temperature
>treatment. In my comment I supposed and stated that probably this was
>the situation at hand. I have to admit that the original message is
>not entirely clear on this point.

==
dennis roberts, penn state university
educational psychology, 8148632401
http://roberts.ed.psu.edu/users/droberts/drober~1.htm






Re: Stat question

2001-12-01 Thread dennis roberts

At 06:13 PM 12/1/01 -0500, Stan Brown wrote:
>Jon Miller <[EMAIL PROTECTED]> wrote in sci.stat.edu:
> >
> >Stan Brown wrote:
> >
> >> I would respectfully suggest that the OP _first_ carefully study the
> >> textbook sections that correspond to the missed lectures, get notes from
> >> a classmate
> >
> >This part is of doubtful usefulness.
>
>Doubtful? It is "of doubtful usefulness" to get notes from a
>classmate and study the covered section of the textbook? Huh?

perhaps doubtful IF the students OP asked to look at were terrible students 
who took terrible notes ... and/or ... OP when reading the text could not 
make anything of it ...

but, those are two big ifs

usually, students won't ask to see the notes of students whom they know are 
"not too swift" ... and, also ... usually students who read the book do get 
something out of it ... maybe not enough

the issue here is ... it appeared (though we have no proof of this) that 
the original poster did little, if anything, on his/her own ... prior to 
posting a HELP to the list

stan seemed to be reacting to that assumption and, i don't blame him


>--
>Stan Brown, Oak Road Systems, Cortland County, New York, USA
>   http://oakroadsystems.com/
>"My theory was a perfectly good one. The facts were misleading."
>-- /The Lady Vanishes/ (1938)
>
>

==
dennis roberts, penn state university
educational psychology, 8148632401
http://roberts.ed.psu.edu/users/droberts/drober~1.htm






Re: survival curves and life expectancy

2001-12-01 Thread Karp

Beth Clarkson <[EMAIL PROTECTED]> wrote in message 
news:<[EMAIL PROTECTED]>...
> I'm looking for some sources on how to compute survival rates and
> remaining life determinations.  In particular, I'd like to find
> information on the Iowa type survivor curves and the retirement rate
> method.  This is in regard to computations on tangible and intangible
> assets.  Can anyone help me out with some useful references?  Thanks.
> 
> Beth Clarkson

Beth, you can do a search in www.google.com, for example, with
"survivor curve Iowa" and get a lot of references. For instance:

http://www.willamette.com/pubs/insights/99/intangibleassetsandintellectualproperties.html

seems to be one of them. Good luck! :-)





Re: Stat question

2001-12-01 Thread Stan Brown

Jon Miller <[EMAIL PROTECTED]> wrote in sci.stat.edu:
>
>Stan Brown wrote:
>
>> I would respectfully suggest that the OP _first_ carefully study the
>> textbook sections that correspond to the missed lectures, get notes from
>> a classmate
>
>This part is of doubtful usefulness.

Doubtful? It is "of doubtful usefulness" to get notes from a 
classmate and study the covered section of the textbook? Huh?

-- 
Stan Brown, Oak Road Systems, Cortland County, New York, USA
  http://oakroadsystems.com/
"My theory was a perfectly good one. The facts were misleading."
   -- /The Lady Vanishes/ (1938)





Re: Stat question

2001-12-01 Thread Jon Miller

Stan Brown wrote:

> I would respectfully suggest that the OP _first_ carefully study the
> textbook sections that correspond to the missed lectures, get notes from
> a classmate

This part is of doubtful usefulness.

> , and _then_ contact the instructor to fill in any remaining gaps or
> answer any questions.

Jon Miller






Re: Interpreting p-value = .99

2001-12-01 Thread Dennis Roberts

At 08:29 AM 12/1/01 -0500, Stan Brown wrote:


>How I would analyze this claim is that, when the advertiser says
>"90% of people will be helped", that means 90% or more. Surely if we
>did a large controlled study and found 93% were helped, we would not
>turn around and say the advertiser was wrong! But I think that's
>what would happen with a two-tailed test.
>
>Can you explain a bit further?

would the advertiser feel he/she was wrong if the 90% value was a little 
less too ... within some margin of error from 90? probably not

perhaps you want to say that the advertiser is claiming around 90% or MORE, 
or at LEAST 90% ...

again ... we are getting far too hung up in how some hypothesis is stated 
... is not the more important matter ... what sort of impact is there? if 
that is the case ... testing a null ... ANY null ... is really not going to 
help you

you need to look at the SAMPLE data ... then ask yourself ... what sort of 
a real effect might there be if i got the sample results that i did? if you 
then want to superimpose on this a question ... i wonder if 90 or more 
could have been the truth ... fine

but that is an afterthought

this does not call for a hypothesis test
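
Looking at the sample data first, as suggested above, might amount to an interval estimate for the problem under discussion (170 of 200 people relieved). The sketch below uses the simple normal-approximation (Wald) interval; that choice and the function name are assumptions made for illustration, and other intervals (Wilson, exact) would give slightly different limits.

```python
import math
from statistics import NormalDist

def wald_interval(successes, n, confidence=0.99):
    """Normal-approximation confidence interval for a proportion."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(0.5 + confidence / 2)   # two-sided critical value
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width

low, high = wald_interval(170, 200, confidence=0.99)
print(f"99% interval: {low:.3f} to {high:.3f}")
```

With these numbers the 99% interval runs from about 0.785 to 0.915, so a true rate of 0.90 is not ruled out, although most of the interval lies below it.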


>--
>Stan Brown, Oak Road Systems, Cortland County, New York, USA
>   http://oakroadsystems.com
>My reply address is correct as is. The courtesy of providing a correct
>reply address is more important to me than time spent deleting spam.
>
>

_
dennis roberts, educational psychology, penn state university
208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED]
http://roberts.ed.psu.edu/users/droberts/drober~1.htm






Re: Interpreting p-value = .99

2001-12-01 Thread Stan Brown

[cc'd to previous poster]

Rich Ulrich <[EMAIL PROTECTED]> wrote in sci.stat.edu:
>I think I could not blame students for floundering about on this one.
>
>On Thu, 29 Nov 2001 14:39:35 -0500, [EMAIL PROTECTED] (Stan Brown)
>wrote:
>> "The manufacturer of a patent medicine claims that it is 90% 
>> effective(*) in relieving an allergy for a period of 8 hours. In a 
>> sample of 200 people who had the allergy, the medicine provided 
>> relief for 170 people. Determine whether the manufacturer's claim 
>> was legitimate, to the 0.01 significance level."

>I have never asked that as a question in statistics, and 
>it does not have an automatic, idiomatic translation to what I ask.

How would you have phrased the question, then? Though I took this 
one from a book, I'm always looking to improve the phrasing of 
questions I set in quizzes and exams.

>I can expect that it means, "Use a 1% test."  But, for what?

>That claim could NEVER, legitimately, have been *based*  
>on these data.   That is an idea that tries to intrude itself,
>to me, and makes it difficult to address the intended question.

Agreed! My idea, in reading that problem, was that the manufacturer 
claimed something for a product that has been on the market for some 
time, and some independent group, such as a newspaper or TV network, 
did a study to test the claim.

> - By the way, it also bothers me that "90% effective"  is 
>apparently translated as "effective for 90% of the people."
>I wondered if the asterisk was supposed to represent "[sic]".

The asterisk led to my note defining it as relieving symptoms for 
90% of people who use it, and asking students to think whether the 
claim would also be true if it relieved symptoms for more than 90%. 
(I think the real-world answer is clearly Yes: If a product is 
claimed to help 90% of people and it actually helps 93%, we do not 
say the claim was false.)
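
For reference, the numbers in the quoted problem can be run through a one-sample z test for a proportion. The sketch below uses the normal approximation (an assumption; an exact binomial test would differ slightly) and reports the lower-tail, upper-tail, and two-tailed p-values so the effect of the choice of alternative is visible. The function name is illustrative.

```python
import math
from statistics import NormalDist

def proportion_z_test(successes, n, p0):
    """z statistic and p-values for H0: p = p0 (normal approximation)."""
    p_hat = successes / n
    z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)
    lower = NormalDist().cdf(z)          # Ha: p < p0
    upper = 1 - lower                    # Ha: p > p0
    two_sided = 2 * min(lower, upper)    # Ha: p != p0
    return z, lower, upper, two_sided

z, lower, upper, two_sided = proportion_z_test(170, 200, 0.90)
print(f"z = {z:.3f}")
print(f"p (Ha: p < 0.90)  = {lower:.4f}")
print(f"p (Ha: p > 0.90)  = {upper:.4f}")
print(f"p (Ha: p != 0.90) = {two_sided:.4f}")
```

Here z is about -2.36. The lower-tailed p-value is about 0.009, the two-tailed p-value about 0.018, and the upper-tailed p-value about 0.991, which is close to the .99 of the subject line. At the 0.01 level only the lower-tailed test rejects the claim.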

-- 
Stan Brown, Oak Road Systems, Cortland County, New York, USA
  http://oakroadsystems.com
My reply address is correct as is. The courtesy of providing a correct
reply address is more important to me than time spent deleting spam.





Re: Interpreting p-value = .99

2001-12-01 Thread Stan Brown

[cc'd to previous poster; please follow up in newsgroup]

Robert J. MacG. Dawson <[EMAIL PROTECTED]> wrote in 
sci.stat.edu:
>Stan Brown wrote:
>> "The manufacturer of a patent medicine claims that it is 90%
>> effective(*) in relieving an allergy for a period of 8 hours. In a
>> sample of 200 people who had the allergy, the medicine provided
>> relief for 170 people. Determine whether the manufacturer's claim
>> was legitimate, to the 0.01 significance level."

>   A hypothesis test is set up ahead of time so that it can only 
>give a definite answer of one sort. In this case, we have (at least)
>three distinct possibilities.

I really like your presentation of the three possible tests as 
"advertiser's test", "consumer advocate's test", and "quality 
controller's test". I see why the quality controller would want to 
do a two-tailed test: the product should not be outside 
manufacturing parameters in either direction. (Presumably the QC 
person would be testing the pills themselves, not patients taking 
the pills.)

But I don't see why either the advertiser or the consumer advocate 
would, or should, do a two-tailed test. Alan McLean seemed to agree 
that both would be one-tailed, if I understand him correctly.

>   (1) The "consumer advocate's test": we want a definite result that
>makes the manufacturer look bad, so H0 is the manufacturer's
>claim, Ha is that the claim is wrong, and the p-value is to be used 
>as an indication of reason to believe H0 wrong (if so).  Using a
>one-sided test here is akin to saying "I want all my type I errors to be
>ones that make the manufacturer look bad".  Ethical behaviour here is to
>do a two-sided test and report a result in either direction.  

I don't get this. Why is that ethical behavior?

How I would analyze this claim is that, when the advertiser says 
"90% of people will be helped", that means 90% or more. Surely if we 
did a large controlled study and found 93% were helped, we would not 
turn around and say the advertiser was wrong! But I think that's 
what would happen with a two-tailed test.

Can you explain a bit further?
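
The 93% scenario can be made concrete. Assume, purely for illustration, a study of 2000 people in which 1860 (93%) are helped; the sample size is invented, since the post only says "large". Under the normal approximation, a two-tailed test of p = 0.90 rejects, while a lower-tailed test (alternative: fewer than 90% are helped) does not.

```python
import math
from statistics import NormalDist

n, helped, p0, alpha = 2000, 1860, 0.90, 0.01   # n and helped are hypothetical

p_hat = helped / n
z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)

p_lower = NormalDist().cdf(z)                   # Ha: p < 0.90
p_two_sided = 2 * min(p_lower, 1 - p_lower)     # Ha: p != 0.90

print(f"z = {z:.2f}")
print(f"two-tailed:   p = {p_two_sided:.2g}, reject H0: {p_two_sided < alpha}")
print(f"lower-tailed: p = {p_lower:.4f}, reject H0: {p_lower < alpha}")
```

A two-tailed rejection here says only that the rate differs from 90%; the direction of the difference (above 90%) still has to be read from the data, which is the point under debate.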

-- 
Stan Brown, Oak Road Systems, Cortland County, New York, USA
  http://oakroadsystems.com
My reply address is correct as is. The courtesy of providing a correct
reply address is more important to me than time spent deleting spam.





Re: Interpreting p-value = .99

2001-12-01 Thread Stan Brown

Alan McLean <[EMAIL PROTECTED]> wrote in 
sci.stat.edu:
>Stan, in practical terms, the conclusion 'fail to reject the null' is
>simply not true. You do in reality 'accept the null'. The catch is that
>this is, in the research situation, a tentative acceptance - you
>recognise that you may be wrong, so you carry forward the idea that the
>null may be 'true' but - on the sample evidence - probably is not.
>
>On the other hand, this should also be the case when you 'reject the
>null' - the rejection may be wrong, so the rejection is also tentative.
>The difference is that the null has this privileged position.

Thanks -- that makes some sense.

-- 
Stan Brown, Oak Road Systems, Cortland County, New York, USA
  http://oakroadsystems.com
My reply address is correct as is. The courtesy of providing a correct
reply address is more important to me than time spent deleting spam.





Re: experimental design(doubt)?

2001-12-01 Thread Jos Jansen


"Dennis Roberts" <[EMAIL PROTECTED]> wrote in message
news:[EMAIL PROTECTED]...
> At 03:11 PM 11/30/01 -0200, Ivan Balducci wrote:
> >Hi, members:
> >Please, I am hoping to get some clarification on this problem:
> >In my University, (I work in Dental School, in Brazil),
> >a dentist brought me her data:
> >Experimental Unit:
> >cylinders of pure titanium (72 samples): diameter: 4mm;
> >height: 5mm
> >submitted to Shear Test (Instron)
> >..
> >The Ti samples were heated in a furnace:
> >24 samples to 430ºC;
...
> >...
> >12 samples (from 430ºC) received porcelain type A
> >12 samples (from 430ºC) received porcelain type B
.
> >
> >Objectives:
> >Effect Interaction between the variables: Temperature and Porcelain
> >on Shear Data;
...
> >My question is: Did she make a split-plot design?
> >Whole plot: Temperature
> >Split: Porcelain.
>
> looks like a simple randomized design to me ... in effect, you have
> selected 12 at random and raised to 430 AND gave porcelain type A ...
>
> and the other 5 combinations...
>
> 3 by 2 design ... fully randomized ...
>
> unless i am missing something
>

Yes, you are missing the application error that should be assigned to each
temperature treatment, which (probably) is applied to all samples
simultaneously. Because of this application error, the design is of
split-plot type; however, the variance of the main-plot error is not known
and cannot be estimated.
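
The point about the main-plot error can be checked by counting degrees of freedom. The sketch below assumes each furnace run (batch) is one whole plot, the samples within a batch are the sub-plot units, and porcelain type is assigned to individual samples; none of this is confirmed by the original message. The function and argument names are illustrative.

```python
def split_plot_df(n_temps, batches_per_temp, n_types, samples_per_cell):
    """Degrees of freedom when temperature is applied to whole batches."""
    n_batches = n_temps * batches_per_temp
    n_total = n_batches * n_types * samples_per_cell
    df = {
        "temperature": n_temps - 1,
        "whole_plot_error": n_temps * (batches_per_temp - 1),
        "porcelain": n_types - 1,
        "temp_x_porcelain": (n_temps - 1) * (n_types - 1),
    }
    # Whatever is left over within batches is the sub-plot error.
    df["sub_plot_error"] = (n_total - 1) - sum(df.values())
    df["total"] = n_total - 1
    return df

# 3 temperatures, one furnace run each, 2 porcelain types, 12 samples per cell
for source, value in split_plot_df(3, 1, 2, 12).items():
    print(f"{source:18s} {value}")
```

With one run per temperature the whole-plot error has 0 degrees of freedom, so temperature cannot be tested against run-to-run variation. Under the same assumptions, two runs per temperature would give it 3.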

Jos Jansen







Re: Stat question

2001-12-01 Thread Stan Brown

Elliot Cramer <[EMAIL PROTECTED]> wrote in sci.stat.edu:
>Sima <[EMAIL PROTECTED]> wrote:
>: I have missed some lectures on statistics due to heavy illness
>: and now i got an assignment which i cannot solve.
>
>We all feel sorry for you Sima, but perhaps you should talk to your
>instructor about it.  He undoubtedly has office hours.

While that's the conventional advice, speaking as an instructor I do 
get tired of students who miss class for whatever reason, don't 
crack the textbook, and expect me to give them a private lesson that 
duplicates what was done in class. I don't know what if anything the 
OP has done about making up the missed material.

I would respectfully suggest that the OP _first_ carefully study the 
textbook sections that correspond to the missed lectures, get notes 
from a classmate, and _then_ contact the instructor to fill in any 
remaining gaps or answer any questions.

-- 
Stan Brown, Oak Road Systems, Cortland County, New York, USA
  http://oakroadsystems.com
My reply address is correct as is. The courtesy of providing a correct
reply address is more important to me than time spent deleting spam.

