as a start, you could relate everyday examples where the notion of CI seems
to make sense
A. you observe a friend in terms of his/her lateness when planning to meet
you somewhere ... over time, you take 'samples' of lateness values ... in a
sense you have means ... and then you form a rule of thumb like ... for sam ... if
we plan on meeting at noon ... you can expect him at noon + or - 10 minutes
... you won't always be right but, maybe about 95% of the time you will?
B. from real estate ads in a community, looking at sunday newspapers, you
find that several samples of average house prices for a 3 bedroom, 2 bath
place are certain values ... so, again, this is like having a bunch of means
... then, if someone asks you (visitor) about average prices of a 3 bedroom,
2 bath house ... you might say ... 134,000 +/- 21,000 ... of course, you
won't always be right but .... perhaps about 95% of the time?
but, more specifically, there are a number of things you can do
1. students certainly have to know something about sampling error ... and
the notion of a sampling distribution
2. they have to realize that when taking a sample, say using the sample
mean, that the mean they get could fall anywhere within that sampling
distribution
3. if we know something about #1 AND we have a sample mean ... then #1
tells us how far away the truth is likely to be GIVEN that sample
mean or statistic ...
4. thus, we take the statistic (i.e., the sample mean) and add and subtract some
error (based on #1) ... in such a way that we will be correct (in saying
that the parameter falls within the CI) some % of the time ... say, 95%?
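as a rough sketch of step 4 (in python rather than minitab, and assuming sigma is known) ... the function name z_interval and the multiplier z=2.0 are my own illustrative choices, with 2 standing in for the more exact 1.96 ... the sample mean 497.365 is just the midpoint of the first interval in the minitab output further down

```python
import math

def z_interval(sample_mean, sigma, n, z=2.0):
    """Return (lower, upper) = sample_mean -/+ z standard errors.

    Assumes the population sigma is known; z=2.0 is a rough
    stand-in for the 1.96 used in an exact 95% interval.
    """
    se = sigma / math.sqrt(n)  # standard error of the mean
    return sample_mean - z * se, sample_mean + z * se

# hypothetical usage: one sample of n = 100 with sigma = 100, so se = 10
lower, upper = z_interval(497.365, sigma=100, n=100)
print(lower, upper)
```

with sigma = 100 and n = 100 the standard error is 10, so the interval is the sample mean + or - 20.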
it is easy to show this via simulation ... minitab for example can help you
do this
here is an example ... let's say we are taking samples of size 100 from a
population of SAT M scores ... where we assume mu is 500 and sigma is
100 ... i will take 1000 simple random samples ... and summarize the results of
building 1000 CIs
MTB > rand 1000 c1-c100; <<< made 1000 rows ... and 100 columns ... each
ROW will be a sample
SUBC> norm 500 100. <<< sampled from population with mu = 500 and sigma = 100
MTB > rmean c1-c100 c101 <<< got means for 1000 samples and put in c101
MTB > name c101='sampmean'
MTB > let c102=c101-2*10 <<<< found lower point of 95% CI ... the 10 is the
standard error, 100/sqrt(100), and 2 approximates 1.96
MTB > let c103=c101+2*10 <<<< found upper point of 95% CI
MTB > name c102='lowerpt' c103='upperpt'
MTB > let c104=(c102 lt 500) and (c103 gt 500) <<< this evaluates if the
intervals capture 500 or not
MTB > sum c104
Sum of C104
Sum of C104 = 954.00 <<<< 954 of the 1000 intervals captured 500
MTB > let k1=954/1000
MTB > prin k1
Data Display
K1 0.954000 <<<< pretty close to 95%
MTB > prin c102 c103 c104 <<< a few of the 1000 intervals are shown below
Data Display
Row  lowerpt  upperpt  C104
  1  477.365  517.365     1
  2  500.448  540.448     0   <<< here is one that missed 500 ... the other 9 captured 500
  3  480.304  520.304     1
  4  480.457  520.457     1
  5  485.006  525.006     1
  6  479.585  519.585     1
  7  480.382  520.382     1
  8  481.189  521.189     1
  9  486.166  526.166     1
 10  494.388  534.388     1
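
if you don't have minitab handy, the same kind of simulation can be sketched in python using only the standard library ... this is an approximation of the session above, not a translation of it ... the function name coverage, its parameters, and the seed are my own assumptions, and a different seed will give a somewhat different count

```python
import random
import statistics

def coverage(n_samples=1000, n=100, mu=500.0, sigma=100.0, z=2.0, seed=1):
    """Fraction of mean +/- z*SE intervals that capture mu."""
    rng = random.Random(seed)       # fixed seed so a rerun gives the same answer
    se = sigma / n ** 0.5           # standard error of the mean (10 here)
    hits = 0
    for _ in range(n_samples):
        # one simple random sample of size n from a normal population
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        m = statistics.fmean(sample)
        # same check as the minitab line: lower < mu and upper > mu
        if m - z * se < mu < m + z * se:
            hits += 1
    return hits / n_samples

print(coverage())
```

the proportion printed should land near 95%, in the same neighborhood as the 954 out of 1000 above.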
_________________________________________________________
dennis roberts, educational psychology, penn state university
208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED]
http://roberts.ed.psu.edu/users/droberts/drober~1.htm