as a start, you could relate everyday examples where the notion of CI seems 
to make sense

A. you observe a friend in terms of his/her lateness when planning to meet 
you somewhere ... over time, you take 'samples' of late values ... in a 
sense you have means ... and then you form a rubric like ... for sam ... if 
we plan on meeting at noon ... you can expect him at noon + or - 10 minutes 
... you won't always be right but, maybe about 95% of the time you will?

B. from real estate ads in a community, looking at sunday newspapers, you 
find that several samples of average house prices for a 3 bedroom, 2 bath 
place are certain values ... so, again, this is like having a bunch of means 
... then, if a visitor asks you about average prices of a 3 bedroom, 
2 bath house ... you might say ... 134,000 +/- 21,000 ... of course, you 
won't always be right but .... perhaps about 95% of the time?

but, more specifically, there are a number of things you can do

1. students certainly have to know something about sampling error ... and 
the notion of a sampling distribution

2. they have to realize that when taking a sample, say using the sample 
mean, that the mean they get could fall anywhere within that sampling 
distribution

3. if we know something about #1 AND, we have a sample mean ... then, #1 
sets sort of a (probabilistic) limit on how far away the truth is likely to 
be GIVEN that sample mean or statistic ...

4. thus, we use the statistic (ie, sample mean) and add and subtract some 
error (based on #1) ... in such a way that we will be correct (in saying 
that the parameter will fall within the CI) some % of the time ... say, 95%?
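
a rough sketch of points 1-4 in python, for anyone who wants to try it ... this is only an illustration under stated assumptions: the population is normal, sigma is known (so the standard error is sigma/sqrt(n)), and 1.96 is used as the 95% multiplier ... the function names, the seed, and the values mu = 500, sigma = 100, n = 100 are made up for the example

```python
import math
import random
import statistics

def standard_error(sigma, n):
    """SD of the sampling distribution of the mean when sigma is known."""
    return sigma / math.sqrt(n)

def confidence_interval(sample, sigma, z=1.96):
    """Return (lower, upper): the sample mean plus and minus z standard errors."""
    mean = statistics.fmean(sample)
    margin = z * standard_error(sigma, len(sample))
    return mean - margin, mean + margin

# assumed illustration values, not taken from any real data
MU, SIGMA, N = 500, 100, 100
rng = random.Random(1)  # arbitrary seed, only so a run can be repeated

# points 1 and 2: means of repeated samples scatter around mu,
# with a spread close to sigma / sqrt(n)
means = [statistics.fmean([rng.gauss(MU, SIGMA) for _ in range(N)])
         for _ in range(2000)]
print("theoretical standard error:", standard_error(SIGMA, N))
print("SD of 2000 simulated means:", round(statistics.stdev(means), 2))

# points 3 and 4: take one sample mean and add and subtract some error
sample = [rng.gauss(MU, SIGMA) for _ in range(N)]
low, high = confidence_interval(sample, SIGMA)
print("95% CI from one sample:", round(low, 1), "to", round(high, 1))
```

the two printed spread values should land close to each other ... which is the link between #1 and the +/- part of #4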

it is easy to show this via simulation ... minitab for example can help you 
do this

here is an example ... let's say we are taking samples of size 100 from a 
population of SAT M scores ... where we assume the mu is 500 and sigma is 
100 ... i will take 1000 SRSs ... and summarize the results of 
building 1000 CIs

MTB > rand 1000 c1-c100; <<< made 1000 rows ... and 100 columns ... each 
ROW will be a sample
SUBC> norm 500 100. <<< sampled from population with mu = 500 and sigma = 100
MTB > rmean c1-c100 c101 <<< got means for 1000 samples and put in c101
MTB > name c101='sampmean'
MTB > let c102=c101-2*10  <<<< found lower point of 95% CI ... std error = 100/sqrt(100) = 10
MTB > let c103=c101+2*10  <<<< found upper point of 95% CI ... using 2 as a round stand-in for 1.96
MTB > name c102='lowerpt' c103='upperpt'
MTB > let c104=(c102 lt 500) and (c103 gt 500)  <<< this evaluates if the 
intervals capture 500 or not
MTB > sum c104

Sum of C104

    Sum of C104 = 954.00   <<<< 954 of the 1000 intervals captured 500
MTB > let k1=954/1000
MTB > prin k1

Data Display

K1    0.954000  <<<< pretty close to 95%
MTB > prin c102 c103 c104 <<<  a few of the 1000 intervals are shown below

Data Display


  Row   lowerpt   upperpt   C104

    1   477.365   517.365      1
    2   500.448   540.448      0  <<< this one missed 500 ... the other 9 captured it
    3   480.304   520.304      1
    4   480.457   520.457      1
    5   485.006   525.006      1
    6   479.585   519.585      1
    7   480.382   520.382      1
    8   481.189   521.189      1
    9   486.166   526.166      1
   10   494.388   534.388      1
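
for those without minitab, here is a rough python sketch of the same simulation ... it is not a translation of the session command for command, just the same idea: draw 1000 samples of n = 100 from a normal population with mu = 500 and sigma = 100, build mean +/- 2*10 around each sample mean, and count how many intervals capture 500 ... the function name, its parameters, and the seed are assumptions made for the example

```python
import math
import random
import statistics

def coverage(mu=500.0, sigma=100.0, n=100, reps=1000, mult=2.0, seed=None):
    """Fraction of intervals (mean +/- mult * sigma/sqrt(n)) that capture mu."""
    rng = random.Random(seed)
    se = sigma / math.sqrt(n)  # 100 / sqrt(100) = 10, as in the session
    hits = 0
    for _ in range(reps):
        # each pass through the loop plays the role of one ROW in the worksheet
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        mean = statistics.fmean(sample)
        lower, upper = mean - mult * se, mean + mult * se
        if lower < mu < upper:  # same check as (c102 lt 500) and (c103 gt 500)
            hits += 1
    return hits / reps

# the seed is arbitrary; expect a proportion somewhere near 0.95
print(coverage(seed=2024))
```

with a multiplier of 2 rather than 1.96 the long run capture rate is about 95.4% ... and any one run of 1000 intervals will bounce around that value a bit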





_________________________________________________________
dennis roberts, educational psychology, penn state university
208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED]
http://roberts.ed.psu.edu/users/droberts/drober~1.htm


