Re: Counting Techniques

2001-10-04 Thread dennis roberts

At 12:41 PM 10/4/01 -0500, Edwina Chappell wrote:
>Permutations

say you have a speech class and, there are 5 students who have to give a 
short speech one day ... how many different ORDERS can they go in?

>versus Combinations.

what if on a test, of 30 mc items ... the instructor lets you pick any 10 
and work them ... and you don't have to work the others ... if you consider 
a collection of 10 items a mini test ... how many different mini tests are 
possible in this situation?

or, the famous pizza example ... what if you go into pizza hut and, find 
out there are 12 things on the menu to pick from ... and, the special of 
the night is "any 3 ingredients on a large pizza for $9.99" ... how many 
different pizzas are possible?
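the rule of thumb: if ORDER matters it's a permutation problem ... if only the collection matters it's a combination problem ... the three examples above work out like this (a quick python sketch; math.factorial and math.comb are standard library, 3.8+):

```python
import math

# 5 speakers, order matters: permutations of all 5
speech_orders = math.factorial(5)      # 5! = 120

# pick 10 of 30 mc items, order irrelevant: combinations
mini_tests = math.comb(30, 10)         # "30 choose 10" = 30045015

# pick 3 of 12 toppings, order irrelevant: combinations
pizzas = math.comb(12, 3)              # "12 choose 3" = 220

print(speech_orders, mini_tests, pizzas)
```

note how fast the combination counts grow ... 220 possible pizzas but over 30 million possible mini tests.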

>  Easy ways to understand the concepts and distinguish when to use?
>
>
>
>=
>Instructions for joining and leaving this list and remarks about
>the problem of INAPPROPRIATE MESSAGES are available at
>   http://jse.stat.ncsu.edu/
>=

==========
dennis roberts, penn state university
educational psychology, 8148632401
http://roberts.ed.psu.edu/users/droberts/drober~1.htm






tacoma narrows bridge

2001-10-02 Thread Dennis Roberts

At 02:36 PM 10/2/01 -0500, Olsen, Chris wrote:
>Hello All --
>
>   Not only that, I have an old Tacoma Narrows Bridge I'd like to sell
>someone.


some interesting urls about this

http://www.enm.bris.ac.uk/research/nonlinear/tacoma/tacoma.html

http://www.civeng.carleton.ca/Exhibits/Tacoma_Narrows/DSmith/photos.html

http://www.civeng.carleton.ca/Exhibits/Tacoma_Narrows/ ... a little qt 
movie is here too


>   -- Chris
>
>Chris Olsen
>George Washington High School
>2205 Forest Drive SE
>Cedar Rapids, IA

_________
dennis roberts, educational psychology, penn state university
208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED]
http://roberts.ed.psu.edu/users/droberts/drober~1.htm






Re: Help for DL students in doing assignments

2001-10-02 Thread Dennis Roberts

At 06:07 PM 10/2/01 +, Jon Miller wrote:


>The neat thing about math is the numerical answer doesn't matter, just the
>method.
>
>Jon Miller

gee ... i hope you don't really mean that ... if so, that will take your 
bank off the hook IF they royally mess up your bank statement and interest 
calculations ...





_________
dennis roberts, educational psychology, penn state university
208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED]
http://roberts.ed.psu.edu/users/droberts/drober~1.htm






Re: They look different; are they really?

2001-10-01 Thread dennis roberts

were these two different sections at the same class time? that is ... 10AM 
on mwf?
if not ... then there can be all kinds of reasons why means would be this 
different ... not least one or two real deviant scores in either 
section ...

could also be different quality in the instruction ...

all kinds of things

of course, if you opted for 95 or 99% CIs, the intervals would be wider and 
the non overlap would shrink ...

what is the purpose of doing this in the first place? do not the mean 
differences really suggest that there is SOMEthing different about the two 
groups ... ? or ... at least something different in the overall operation 
of the course in these two sections?

At 02:33 PM 10/1/01 -0300, Gus Gassmann wrote:
>Stan Brown wrote:
>
> > Another instructor and I gave the same exam to our sections of a
> > course. Here's a summary of the results:
> >
> > Section A: n=20, mean=56.1, median=52.5, standard dev=20.1
> > Section B: n=23  mean=73.0, median=70.0, standard dev=21.6
> >
> > Now, they certainly _look_ different. (If it's of any valid I can
> > post the 20+23 raw data.) If I treat them as samples of two
> > populations -- which I'm not at all sure is valid -- I can compute
> > 90% confidence intervals as follows:
> >
> > Class A: 48.3 < mu < 63.8
> > Class B: 65.4 < mu < 80.9
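for the record ... those 90% intervals can be recomputed from the quoted summary statistics alone ... a python sketch (stdlib only; the t critical values 1.729 for df=19 and 1.717 for df=22 are read off a t table, so treat the last decimal as approximate):

```python
import math

def ci(mean, sd, n, t_crit):
    """two-sided CI for mu from summary stats: mean +/- t * sd/sqrt(n)"""
    moe = t_crit * sd / math.sqrt(n)   # margin of error
    return mean - moe, mean + moe

# section A: n=20, mean=56.1, sd=20.1 ... t(.95, df=19) ~ 1.729
lo_a, hi_a = ci(56.1, 20.1, 20, 1.729)
# section B: n=23, mean=73.0, sd=21.6 ... t(.95, df=22) ~ 1.717
lo_b, hi_b = ci(73.0, 21.6, 23, 1.717)

print(round(lo_a, 1), round(hi_a, 1))   # about 48.3 and 63.9
print(round(lo_b, 1), round(hi_b, 1))   # about 65.3 and 80.7
```

close to the quoted endpoints (small differences come down to which t values were used) ... and either way the two 90% intervals do not overlap.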

==========
dennis roberts, penn state university
educational psychology, 8148632401
http://roberts.ed.psu.edu/users/droberts/drober~1.htm






Re: optimal sample size

2001-10-01 Thread Dennis Roberts

of course. the most important issue is ... what do you mean by optimal? if 
you can specify what the purpose of the sampling project is ... the 
parameter to be estimated, within what margin of error, etc. ... then you 
might be able to answer the question ... "what is the MINIMAL n needed to 
accomplish these ends" ... that might be optimal if you are looking for the 
smallest n you can get by with ... but, optimal does not have to be defined 
as such ...

At 12:23 PM 10/1/01 +0200, Bernhard Kuster wrote:
>Hi
>
>I am interessted in the question of optimal sample size in general, not for
>a special statistical technique.
>
>My questions: (1) What do I have to keep in mind if I compute optimal sample
>size, what is relevant? (2) What are the classic studies and who has highly
>influenced the subject? (3) What are the problems discussed right now by the
>scientific community? (4) What are the relevant journals and is there some
>information on the web?
>
>Can anybody advise on one or more of these questions? Thanks a lot!
>
>Bernhard
>
>
>

_________
dennis roberts, educational psychology, penn state university
208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED]
http://roberts.ed.psu.edu/users/droberts/drober~1.htm






Re: What is a confidence interval?

2001-09-29 Thread dennis roberts

At 02:16 AM 9/29/01 +, John Jackson wrote:

>For any random inverval selected, there is a .05% probability that the
>sample will NOT yield an interval that yields the parameter being estimated
>and additonally such interval will not include any values in area
>represented by the left tail.  Can you make different statements about the
>left and right tail?

unless CIs work differently than i think ... about 1/2 the time the CI will 
miss to the right ... and 1/2 the time they will miss to the left ... thus, 
what if we labelled EACH CI with a tag called HIT ... or MISSleft ... or 
MISSright ... for 95% CIs ... the p of grabbing a CI that is HIT from all 
possible is about .95 ... the p for getting MISSleft PLUS MISSright is 
about .05 ... thus, about 1/2 of the .05 will be MISSleft and about 1/2 of 
the .05 will be MISSright

so, i don't see that you can say anything differentially important about 
one end or the other
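that half-and-half split is easy to check by simulation ... a python sketch (hypothetical numbers: normal population, z intervals, seeded so the run is repeatable):

```python
import random, math

random.seed(1)
MU, SIGMA, N, REPS = 100.0, 15.0, 25, 10_000
se = SIGMA / math.sqrt(N)          # standard error of the mean
z = 1.96                           # 95% multiplier

hit = miss_left = miss_right = 0
for _ in range(REPS):
    xbar = sum(random.gauss(MU, SIGMA) for _ in range(N)) / N
    lo, hi = xbar - z * se, xbar + z * se
    if hi < MU:
        miss_left += 1             # whole interval fell below mu
    elif lo > MU:
        miss_right += 1            # whole interval fell above mu
    else:
        hit += 1

print(hit / REPS, miss_left / REPS, miss_right / REPS)
```

hits run about .95 ... and the .05 of misses splits roughly evenly between the two sides, as argued above.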




>"Michael F." <[EMAIL PROTECTED]> wrote in message
>[EMAIL PROTECTED]">news:[EMAIL PROTECTED]...
> > (Warren) wrote in message:
> >
> > > So, what is your best way to explain a CI?  How do you explain it
> > > without using some esoteric discussion of probability?
> >
> > I prefer to focus on the reliability of the estimate and say it is:
> >
> > "A range of values for an estimate that reflect its unreliability and
> > which contain the parameter of interest 95% of the time in the long run."
>
>
>
>

==========
dennis roberts, penn state university
educational psychology, 8148632401
http://roberts.ed.psu.edu/users/droberts/drober~1.htm






Re: Help with Minitab Problem?

2001-09-28 Thread dennis roberts

unless you have a million rows ... seems like using the data window and 
just sliding over each row to highlight and delete ... would be easy

by the way, why do you want to get rid of entire rows just because 
(perhaps) one value is missing? are you not wasting a lot of useful data?

At 06:16 PM 9/28/01 -0400, John Spitzer wrote:
>I have a dataset which has about 35 column.  Many of the cells have
>missing values.  Since MINITAB recognizes the missing values, I can
>perform the statistical work I need to do and don't need to worry about
>the missing values.  However, I would like to be able to obtain the
>subset of observations which MINITAB used for its calculations.  I would
>like to be able to create a worksheet with only the rows from my dataset
>which do NOT contain any missing values.
>
>I cannot seem to find any way to do this (except manually, YUCK). Does
>anyone have any ideas? suggestions?
>
>Here is an greatly simplified version of my problem:
>
>   C1   C2 C3  C4
>1 1   2  1  2
>2 2   *  *  3
>3 3   3  4  2
>4  0  1  *  5
>5  1  *   1 8
>6  2  2   3 7
>
>
>I am looking for a command, or set of commands which will provide me
>with the subset of the above data consisting of:
>
>   C1   C2 C3  C4
>1 1   2  1  2
>2 3   3  4  2
>3 2  2   3 7
>
>Many thanks, in advance
>
>
>
>
>
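for what it's worth ... the row filtering the poster wants (listwise deletion) is a one liner in most packages ... a hypothetical python sketch, with '*' standing for a missing cell as in the posted example:

```python
# toy version of the worksheet, '*' = missing, as in the posted example
rows = [
    [1, 2, 1, 2],
    [2, '*', '*', 3],
    [3, 3, 4, 2],
    [0, 1, '*', 5],
    [1, '*', 1, 8],
    [2, 2, 3, 7],
]

# keep only the rows with no missing cells (listwise deletion)
complete = [r for r in rows if '*' not in r]
print(complete)   # [[1, 2, 1, 2], [3, 3, 4, 2], [2, 2, 3, 7]]
```

which is exactly the 3-row subset shown in the question ... though the caveat above still stands: every dropped row throws away the values that WERE observed.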

==========
dennis roberts, penn state university
educational psychology, 8148632401
http://roberts.ed.psu.edu/users/droberts/drober~1.htm






Re: E as a % of a standard deviation

2001-09-28 Thread Dennis Roberts

this is the typical margin of error formula for building a confidence 
interval were the sample mean is desired to be within a certain distance of 
the population mean

n = sample size
z = z score from nd that will produce desired confidence level (usually 
1.96 for 95% CI)
e = margin of error

so, typical CI for mu would be:

samp mean +/- z times standard error of mean

  e or the margin of error here is z * stan error of the mean (let me 
symbolize se)

 e = z * se

for 95% CI .. e = 1.96 * se

e = 1.96 * (sigma / sqrt n)

now, what n might it take to produce some e? we can rearrange the formula ...

sqrt n = (1.96 * sigma) / e

but, we don't want sqrt n ... we WANT n!

n = ((1.96 * sigma)/ e) ^2

so, what if we wanted to be within 3 points of mu with our sample mean and 
the population standard deviation or sigma were 5?

   n = ((1.96 * 5) / 3)^2 = about 11 ...

only would take a SRS of about 11 to be within 3 points of the true mu 
value in your 95% confidence interval

unless i made a mistake someplace
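that rearrangement, as a python sketch (rounding n UP to the next whole case, since a fractional case isn't available):

```python
import math

def n_needed(sigma, e, z=1.96):
    """minimum SRS size so the 95% margin of error is at most e:
       n = (z * sigma / e)^2, rounded up"""
    return math.ceil((z * sigma / e) ** 2)

print(n_needed(sigma=5, e=3))    # 11, as in the worked example
print(n_needed(sigma=15, e=3))   # triple sigma and n jumps to 97
```

note that n grows with the SQUARE of sigma/e ... tripling sigma (or wanting a third of the error) multiplies the required n by about nine.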


At 09:54 AM 9/28/01 -0400, Randy Poe wrote:
>John Jackson wrote:
>
> > the forumla I was using was n = (Z?/e)^2  and attempting to express .05 
> as a
> > fraction of a std dev.
>
>I think you posted that before, and it's still getting
>garbled. We see a Z followed by a question mark, and
>have no idea what was actually intended.
>
>  - Randy
>
>

_________
dennis roberts, educational psychology, penn state university
208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED]
http://roberts.ed.psu.edu/users/droberts/drober~1.htm






Re: What is a confidence interval?

2001-09-28 Thread Dennis Roberts

At 01:23 AM 9/28/01 +, Radford Neal wrote:


radford makes a nice quick summary of the basic differences between 
bayesian and frequentist positions, which is helpful. these distinctions 
are important IF one is seriously studying statistical ideas

personally, i think that trying to make these distinction for introductory 
students however is a waste of time ... these are things for "majors" in 
statistics or "statisticians" to discuss and battle over

in reference to a CI, the critical issue is CAN it be said that ... in the 
long run, there is a certain probability of producing CIs (using some CI 
construction procedure) that ... contain the parameter value ... that is, 
how FREQUENTLY we expect the CIs to contain the true value ... well, yes we can

THAT is the important idea and, i think that if we try (for the sake of 
edification of the intro student)to defend it or reject it according to 
being proper bayesian/frequentist or improper ... is totally irrelevant to 
the basic concept

but, that is just my opinion



>In article <yyPs7.55095$[EMAIL PROTECTED]>,
>John Jackson <[EMAIL PROTECTED]> wrote:
>
> >this is the second time I have seen this word used: "frequentist"? What does
> >it mean?
>
>It's the philosophy of statistics that holds that probability can
>meaningfully be applied only to repeatable phenomena, and that the
>meaning of a probability is the frequency with which something happens
>in the long run, when the phenomenon is repeated.  This rules out
>using probability to describe uncertainty about a parameter value,
>such as the mass of the hydrogen atom, since there's just one true
>value for the parameter, not a sequence of values.

_________
dennis roberts, educational psychology, penn state university
208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED]
http://roberts.ed.psu.edu/users/droberts/drober~1.htm






Re: What is a confidence interval?

2001-09-27 Thread Dennis Roberts

At 07:33 AM 9/27/01 -0700, Warren wrote:


>Now, we take our sample mean and s.d. and we compute a CI.  We know
>we can't say anything about a probability for this single CI...it
>either
>contains the mean or it doesn't.  So, what DOES a CI tell us?  Does it
>really give you a range of values where you think the parameter is?


most disciplines have various models for how they describe/explain/predict 
events and behaviors

in each of these, there are assumptions ... i don't see how we can get 
around that ... one must start from SOME point of reference (of course, 
some models make us start from much more stringent starting points than 
others)

to me, in statistics, particularly of the inferential type, the biggest 
assumption that we make that is suspect is the one of SRS ... taking random 
samples ...

however, if we did NOT make some assumption about the data 
being representative of the overall population ... which SRSing helps to 
ensure ... what can we do? what inferences could we possibly make?

in the case of CIs ... no, you are not sure at all that the range you got 
in your CI encompasses the parameter but, what are the odds that it does 
NOT? generally, fairly small. (well, all bets are off if you like to build 
25% CIs!) so, under these conditions, is it not reasonably assured that the 
parameter IS inside there someplace? this does not pinpoint WHERE within it 
is but, it does tend to eliminate from the long number line on which the CI 
rests ... what values do NOT seem to be too feasible for the parameter

unfortunately, if you are interested in knowing something about some 
parameter and, have no way to identify each and every population element 
and "measure" it (of course, even then, how do you know that your measure 
is "pure"?)... you are necessarily left with making this inference based on 
the data you have ... i don't see any way out of this bind ... OTHER than 
trying as best possible to take a good sample ... of decent size (to reduce 
sampling error) ... and then trusting the results that you find

if there is another way, i would certainly like to know it






_________
dennis roberts, educational psychology, penn state university
208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED]
http://roberts.ed.psu.edu/users/droberts/drober~1.htm






p value

2001-09-27 Thread Dennis Roberts

let's say that you do a simple (well executed) 2 group study ... 
treatment/control ... and, are interested in the mean difference ... and 
find that a simple t test shows a p value (with mean in favor of treatment) 
of .009

while it generally seems to be held that such a p value would suggest that 
our null model is not likely to be correct (ie, some other alternative 
model might make more sense), does it say ANYthing more than that?

specifically, does the p value in and of itself impute ANY information 
about the non null possibilities being in the direction favoring the 
treatment group?

or, just that the null model is not very plausible

bottom line: is there any value added information imparted from the p value 
other than a statement about the null?
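one way to look at it ... the two sided p by itself throws away the sign ... the directional statement comes from the sign of the observed statistic, not from p ... a sketch under a normal approximation to the null (the z of 2.61 is hypothetical, chosen to land near p = .009):

```python
import math

def two_sided_p(z):
    # P(|Z| >= |z|) under a standard normal null
    return math.erfc(abs(z) / math.sqrt(2))

z_obs = 2.61                       # hypothetical ... mean favors treatment
p_two = two_sided_p(z_obs)         # about .009, and sign-blind
# the directional statement needs the sign of z_obs as well:
p_toward_treatment = p_two / 2 if z_obs > 0 else 1 - p_two / 2
print(round(p_two, 3), round(p_toward_treatment, 4))
```

so the p value alone says the null model is implausible ... the extra information about DIRECTION rides along with the observed mean difference, not with p itself.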

_________
dennis roberts, educational psychology, penn state university
208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED]
http://roberts.ed.psu.edu/users/droberts/drober~1.htm






CIs

2001-09-27 Thread Dennis Roberts

it seems to me that the notion of a confidence interval is a general 
concept ... having to do with estimating some unknown quantity in which 
errors are known to occur or be present in that estimation process

in general, the generic version of a CI is:

  statistic/estimator +/- (multiplier) * error

the multiplier will drive the amount of confidence and, error will be 
estimated by varying processes depending upon the parameter or thing you 
are estimating

what we might want to estimate in a regression setting is what one 
particular person might do on a future outcome variable, like college gpa, 
given that we know what THAT person has achieved on some current variable 
(high school gpa) ... if we are interested in this specific person, then 
error will be estimated by some function of HIS/HER variation and that will 
be factored into the above generic equation as error ... this would be what 
jon cryer rightfully called a prediction interval ... BUT, it still fits 
within the realm of the concept of a CI

in other regression cases, we might not be interested in estimation for one 
specific individual on the criterion given that individual's score on the 
current variable, but rather what is the expected MEAN criterion value for 
a group of people who all got the same current variable value ... in this 
case, error is estimated by some function of the group on the current 
variable ... and this is what in regression terms is called a confidence 
band or interval ... but, the concept itself is no different than the 
prediction interval ... what IS different is what is considered error and 
how we estimate it

when we use a sample mean to estimate some population mean, we have the 
same identical general problem ... since we use the sample mean as the 
estimator and, we have a way of conceptualizing and estimating error 
(sampling error of the mean) in that case BUT, we still use the generic 
formula above ... to build our CI

in all of these cases, there is a concept of what error is and, some method 
by which we estimate it and, in all these cases we use some given quantity 
(statistic/estimator) to take a stab at an unknown quantity (parameter/true 
criterion)  and we use the estimated error around the known quantity as 
a fudge factor, tolerance factor, a margin of error factor ... when making 
our estimate of the unknown quantity of interest

all of these represent the same basic idea ... only the details of what is 
used as the point estimate and what is used as the estimate of ERROR of the 
point estimate ... change

also, in all of these cases whether it be in regression work or  sampling 
error (of means for example) work ... we still attach a quantity ... a 
percentage value ... to the intervals we have created when estimating
the unknown and, as far as i can tell, we interpret that percentage in the 
same identical way in all of these cases ... with respect to the long run 
average number or percentage of "hits" that our intervals have of capturing 
the true value (parameter  or true criterion value)

i am more than willing to use different terms to differentiate amongst 
these different settings  ... such as in regression when you are inferring 
something about an individual ... or a group of individuals (though even 
here, i think we could select better differentiators than we currently use 
... like personal interval versus group interval) ... but overall, all of 
these are variations of the same notion and fundamental idea

IMHO of course
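to make the point concrete ... the two regression intervals differ ONLY in the error term ... a python sketch on toy (hypothetical) gpa-like data, showing that at the same predictor value the interval for ONE person is always wider than the interval for the group MEAN:

```python
import math

# toy (x, y) data standing in for (HS gpa, college gpa) ... purely illustrative
x = [2.0, 2.5, 3.0, 3.2, 3.5, 3.8, 4.0]
y = [2.1, 2.4, 2.9, 3.0, 3.3, 3.4, 3.9]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
sxx = sum((xi - xbar) ** 2 for xi in x)
b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
b0 = ybar - b1 * xbar
s = math.sqrt(sum((yi - (b0 + b1 * xi)) ** 2
                  for xi, yi in zip(x, y)) / (n - 2))

x0 = 3.0        # the current-variable value of interest
t = 2.571       # t(.975, df=5), from a t table

# error for the MEAN criterion value at x0 (the "confidence band")
se_mean = s * math.sqrt(1 / n + (x0 - xbar) ** 2 / sxx)
# error for ONE person's criterion value at x0 (the "prediction interval")
se_pred = s * math.sqrt(1 + 1 / n + (x0 - xbar) ** 2 / sxx)

yhat = b0 + b1 * x0
print(yhat - t * se_mean, yhat + t * se_mean)
print(yhat - t * se_pred, yhat + t * se_pred)   # always the wider pair
```

same estimator, same multiplier, same generic statistic +/- multiplier * error form ... only the error estimate changes.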





_________
dennis roberts, educational psychology, penn state university
208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED]
http://roberts.ed.psu.edu/users/droberts/drober~1.htm






Re: What is a confidence interval?

2001-09-26 Thread dennis roberts

some people are sure picky ...

given the context in which the original post was made ... it seems like the 
audience that the poster was hoping to be able to talk to about CIs was not 
very likely to understand them very well ... thus, it is not unreasonable 
to proffer examples to get one into having some sense of the notion

the examples below ... were only meant to portray ... the idea that 
observations have error ... and, over time and over samples ... one gets 
some idea about what the size of that error might be ... thus, when 
projecting about behavior ... we have a tool to know a bit about some 
underlying value ... say the parameter for a person ... by using AN
observation, and factoring in the error you have observed over time or 
samples ...

in essence, CIs are + and - around some observation where ... you 
conjecture within some range what the "truth" might be ... and, if you have 
evidence about size of error ... then, these CIs can say something about 
the parameter (again, within some range) in face of only seeing a limited 
sample of behavior

At 09:30 PM 9/26/01 +, Radford Neal wrote:
>In article <[EMAIL PROTECTED]>,
>Dennis Roberts <[EMAIL PROTECTED]> wrote:
>
> >as a start, you could relate everyday examples where the notion of CI seems
> >to make sense
> >
> >A. you observe a friend in terms of his/her lateness when planning to meet
> >you somewhere ... over time, you take 'samples' of late values ... in a
> >sense you have means ... and then you form a rubric like ... for sam ... if
> >we plan on meeting at noon ... you can expect him at noon + or - 10 minutes
> >... you won't always be right but, maybe about 95% of the time you will?
> >
> >B. from real estate ads in a community, looking at sunday newspapers, you
> >find that several samples of average house prices for a 3 bedroom, 2 bath
> >place are certain values ... so, again, this is like have a bunch of means
> >... then, if someone asks you (visitor) about average prices of a bedroom,
> >2 bath house ... you might say ... 134,000 +/- 21,000 ... of course, you
> >won't always be right but  perhaps about 95% of the time?
>
>These examples are NOT analogous to confidence intervals.  In both
>examples, a distribution of values is inferred from a sample, and
>based on this distribution, a PROBABILITY statement is made concerning
>a future observation.  But a confidence interval is NOT a probability
>statement concerning the unknown parameter.  In the frequentist
>statistical framework in which confidence intervals exists,
>probability statements about unknown parameters are not considered to
>be meaningful.

you are clearly misinterpreting, for whatever purpose, what i have said

i certainly have NOT said that a CI is a probability statement about any 
specific parameter or, being able to attach some probability value to some 
certain value as BEING the parameter

the p or confidence associated with CIs only makes sense in terms of 
dumping all possible CIs into a hat ... and, asking  what is the 
probability of pulling one out at random that captures the parameter 
(whatever the parameter might be) ...

the example i gave with some minitab work clearly showed that ... and made 
no other interpretation about p values in connection with CIs

perhaps some of you who seem to object so much to things i offer ... might 
offer some posts of your own in response to requests from those seeking 
help ... to make sure that they get the right message ...


>Radford Neal
>
>
>Radford M. Neal   [EMAIL PROTECTED]
>Dept. of Statistics and Dept. of Computer Science [EMAIL PROTECTED]
>University of Toronto http://www.cs.utoronto.ca/~radford
>
>
>

==========
dennis roberts, penn state university
educational psychology, 8148632401
http://roberts.ed.psu.edu/users/droberts/drober~1.htm






Re: Analysis of covariance

2001-09-26 Thread Dennis Roberts

At 02:26 PM 9/26/01 -0500, Burke Johnson wrote:
> >From my understanding, there are three popular ways to analyze the 
> following design (let's call it the pretest-posttest control-group design):
>
>R Pretest   Treatment   Posttest
>R PretestControl   Posttest

if random assignment has occurred ... then, we assume and we had better 
find that the means on the pretest are close to being the same ... if we 
don't, then we wonder about random assignment (which creates a mess)

anyway, i digress ...

what i would do is to do a simple t test on the difference in posttest 
means and, if you find something here ... then that means that treatment 
"changed" differentially compared to control

if that happens, why do anything more complicated? has not the answer to 
your main question been found?

now, what if you don't ... then, maybe something a bit more complex is 
appropriate

IMHO

_________
dennis roberts, educational psychology, penn state university
208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED]
http://roberts.ed.psu.edu/users/droberts/drober~1.htm






Re: E as a % of a standard deviation

2001-09-26 Thread Dennis Roberts

At 04:49 PM 9/26/01 +, John Jackson wrote:
>re: the formula:
>
>   n   = (Z?/e)2
>
>
>could you express E as a  % of a standard deviation .
>
>In other words does a .02 error translate into .02/1 standard deviations,
>assuming you are dealing w/a normal distribution?


well, let's see ... e is the margin of error ... using the formula for a CI 
for a population mean ..

   X bar +/- z * stan error of the mean

so, the margin of error or e ... is z * standard error of the mean

now, let's assume that we stick to 95% CIs ... so the z will be about 2 ... 
that leaves us with the standard error of the mean ... or, sigma / sqrt n

let's say that we were estimating SAT M scores and assumed a sigma of about 
100 and were taking a sample size of n=100 (to make my figuring simple) ... 
this would give us a standard error of 100/10 = 10 so, the margin of error 
would be:

   e = 2 * 10 or about 20

so, 20/100 = .2 ... that is, the e or margin of error is about .2 of the 
population sd

if we had used a sample size of 400 ... then the standard error would have 
been: 100/20 = 5

and our e or margin of error would be 2 * 5 = 10

so, the margin of error is now 10/100 or .1 of a sigma unit OR 1/2 the size 
it was before

but, i don't see what you have accomplished by doing this ... rather than 
just reporting the margin of error ... 10 versus 20 ... which is also 1/2 
the size

since z * stan error is really score UNITS ... and, the way you've done it ... 
.2 or .1 would represent fractions of sigma ... which still amounts to 
score UNITS ... i don't think anything new has been done ... certainly, no 
new information has been created
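the two worked cases above, as a python sketch ... note that sigma cancels out of the ratio, so e/sigma is just z/sqrt(n) ... which is another way of seeing why nothing new is created:

```python
import math

def e_over_sigma(sigma, n, z=2.0):
    """margin of error as a fraction of sigma: (z * sigma/sqrt(n)) / sigma"""
    se = sigma / math.sqrt(n)        # standard error of the mean
    return (z * se) / sigma          # sigma cancels ... this is z/sqrt(n)

print(e_over_sigma(100, 100))   # 0.2 of a sigma unit
print(e_over_sigma(100, 400))   # 0.1 ... half the size, as in the post
```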








_________
dennis roberts, educational psychology, penn state university
208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED]
http://roberts.ed.psu.edu/users/droberts/drober~1.htm






Re: What is a confidence interval?

2001-09-26 Thread Dennis Roberts

as a start, you could relate everyday examples where the notion of CI seems 
to make sense

A. you observe a friend in terms of his/her lateness when planning to meet 
you somewhere ... over time, you take 'samples' of late values ... in a 
sense you have means ... and then you form a rubric like ... for sam ... if 
we plan on meeting at noon ... you can expect him at noon + or - 10 minutes 
... you won't always be right but, maybe about 95% of the time you will?

B. from real estate ads in a community, looking at sunday newspapers, you 
find that several samples of average house prices for a 3 bedroom, 2 bath 
place are certain values ... so, again, this is like have a bunch of means 
... then, if someone asks you (visitor) about average prices of a bedroom, 
2 bath house ... you might say ... 134,000 +/- 21,000 ... of course, you 
won't always be right but  perhaps about 95% of the time?

but, more specifically, there are a number of things you can do

1. students certainly have to know something about sampling error ... and 
the notion of a sampling distribution

2. they have to realize that when taking a sample, say using the sample 
mean, that the mean they get could fall anywhere within that sampling 
distribution

3. if we know something about #1 AND, we have a sample mean ... then, #1 
sets sort of a limit on how far away the truth can be GIVEN that sample 
mean or statistic ...

4. thus, we use the statistics (ie, sample mean) and add and subtract some 
error (based on #1) ... in such a way that we will be correct (in saying 
that the parameter will fall within the CI) some % of the time ... say, 95%?

it is easy to show this via simulation ... minitab for example can help you 
do this

here is an example ... let's say we are taking samples of size 100 from a 
population of SAT M scores ... where we assume the mu is 500 and sigma is 
100 ... i will take 1000 SRS samples ... and summarize the results of 
building 1000 CIs

MTB > rand 1000 c1-c100; <<< made 1000 rows ... and 100 columns ... each 
ROW will be a sample
SUBC> norm 500 100. <<< sampled from population with mu = 500 and sigma = 100
MTB > rmean c1-c100 c101 <<< got means for 1000 samples and put in c101
MTB > name c101='sampmean'
MTB > let c102=c101-2*10  <<<< found lower point of 95% CI
MTB > let c103=c101+2*10  <<<< found upper point of 95% CI
MTB > name c102='lowerpt' c103='upperpt'
MTB > let c104=(c102 lt 500) and (c103 gt 500)  <<< this evaluates if the 
intervals capture 500 or not
MTB > sum c104

Sum of C104

Sum of C104 = 954.00   <<<< 954 of the 1000 intervals captured 500
MTB > let k1=954/1000
MTB > prin k1

Data Display

K1    0.954000  <<<< pretty close to 95%
MTB > prin c102 c103 c104 <<<  a few of the 1000 intervals are shown below

Data Display


  Row   lowerpt   upperpt   C104

1   477.365   517.365  1
2   500.448   540.448  0  <<< here is one that missed 500 ...the 
other 9 captured 500
3   480.304   520.304  1
4   480.457   520.457  1
5   485.006   525.006  1
6   479.585   519.585  1
    7   480.382   520.382  1
8   481.189   521.189  1
9   486.166   526.166  1
   10   494.388   534.388  1
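the same demo is easy to redo outside minitab ... here is a rough python sketch of the identical setup (mu = 500, sigma = 100, n = 100, so the standard error is 10) ... the seed and the counts are arbitrary choices for illustration, not part of the original session

```python
import random
import statistics

random.seed(1)  # arbitrary seed, just for repeatability
mu, sigma, n, reps = 500, 100, 100, 1000
se = sigma / n ** 0.5  # known-sigma standard error = 10

hits = 0
for _ in range(reps):
    xbar = statistics.mean(random.gauss(mu, sigma) for _ in range(n))
    # the same "mean +/- 2 standard errors" style interval as the session
    if xbar - 2 * se < mu < xbar + 2 * se:
        hits += 1

print(hits / reps)  # typically lands near .95
```

the count of intervals capturing mu will bounce around from run to run, but it should hover near 95% just as the minitab run did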





_
dennis roberts, educational psychology, penn state university
208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED]
http://roberts.ed.psu.edu/users/droberts/drober~1.htm



=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
  http://jse.stat.ncsu.edu/
=



Re: Analysis of covariance

2001-09-25 Thread Dennis Roberts

At 03:19 PM 9/25/01 +, Radford Neal wrote:

>Neither the question nor the response are all that clearly phrased, but
>when I interpret them according to my reading, I don't agree.  For instance,
>if you're measuring pain levels, I don't see anything wrong with measuring
>pain before treatment, randomly assigning patients to treatment and control
>groups, doing a regression for pain level afterwards with the pain level
>before and a treatment/control indicator as explanatory variables, and
>judging the effectiveness of the treatment by looking at the coefficient for
>the treatment/control variable.  Or is the actual proposal something else?

IMHO, it seems like removing the variance from post pain ... using pre pain 
variance ... is a no brainer ... since the r between the two pain readings 
will necessarily be high (unless there is something really screwy about the 
data like severe restriction of range on the post measure) ... what has 
been explained in the post pain variance? pain?

the basic idea is to be able to "explain" the post score variance in terms 
of something ELSE ... that is, for example ... we know that some of the 
variance in pain is due to one's TOLERANCE for PAIN ... thus, if we can 
remove the part of pain variance that is due to TOLERANCE FOR pain ... then 
the leftover variance on pain is a purer measure in its own right ..

if you do as suggested ... remove the pre from the post ... say pre pain 
from post pain ... what is left over? it is not pain anymore but rather, 
some OTHER variable ... which is not what the purpose of the study was ... 
to investigate (i assume anyway)

i do most certainly agree with radford that ... random assignment is still 
essential in this design ... unfortunately, far too many folks use ANCOVA 
to somehow make up for the fact that NON random assignment happened and, 
they think ANCOVA will solve that problem ...

it won't




>Radford Neal
>
>
>Radford M. Neal   [EMAIL PROTECTED]
>Dept. of Statistics and Dept. of Computer Science [EMAIL PROTECTED]
>University of Toronto http://www.cs.utoronto.ca/~radford
>
>
>

_
dennis roberts, educational psychology, penn state university
208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED]
http://roberts.ed.psu.edu/users/droberts/drober~1.htm






Re: Analysis of covariance

2001-09-25 Thread Dennis Roberts

At 10:26 AM 9/25/01 +, Morelli Paolo wrote:
>HI all,
>I have to analyse some clinical data. In particular the analysis is a
>comparison between two groups of the mean change baseline to endpoint of a
>score. The statistician who planned the analysis used the ANCOVA on the mean
>change, using as covariate the baseline values of the scores.
>Do you think this analysis is correct?

NO! ... this is not a legitimate covariate ... a pre measure of the same 
thing you are measuring later as evidence of effectiveness

the notion of a covariate is to have previously collected data ... on a 
variable that rationally should explain some of the variance in the 
criterion ... and the idea is to "remove" that part of the criterion 
variance that can be accounted for by the co-linearity with the covariate

in situations where the treatment effect is likely to be small ... 
especially if error variance is large ... using an appropriate covariate 
(assuming of course that Ss were randomly assigned to the different 
conditions) is a good way to reduce the error term and hence, increase your 
chances for finding "significance" (if that is your goal)

>I thing that in this way we are correcting twice. I think that the right
>analysis is an ANOVA on the mean change.
>Please let me know your opinion
>thanks
>Paolo
>
>
>
>

_
dennis roberts, educational psychology, penn state university
208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED]
http://roberts.ed.psu.edu/users/droberts/drober~1.htm






Re: Simple Median Question

2001-09-24 Thread Dennis Roberts

At 12:01 PM 9/24/01 -0500, you wrote:
>  I have a question about "averaging medians."  My dataset consists of 
> median values for a variable of interest. To find the average, do I 
> average the medians and get a mean median, or do I find the median of the 
> median values?


since we don't know how many of these medians you have ... or anything 
about the shapes of the distributions on which you have (only) median 
values ... we don't know if it really makes much of or any difference BUT, 
to be consistent ... if you have collected medians ... ie, Q2 values ... 
then, it makes most consistent sense (to me anyway) if you need an average 
of these ... to take the median of these ...

by the way ... why would you have the medians of this variable ... and not 
the means? was there some important reason?
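just to make the point concrete, here is a tiny python illustration with made-up Q2 values ... when one of the collected medians is extreme, the two ways of "averaging" them part company

```python
from statistics import mean, median

# hypothetical Q2 values collected from five distributions (made up)
meds = [12, 13, 14, 15, 40]

print(mean(meds))    # 18.8 ... dragged up by the one large median
print(median(meds))  # 14 ... stays with the bulk of the collected Q2 values
```

so with skewed collections of medians, taking the median of the medians is the more consistent summary ... with roughly symmetric collections it makes little difference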







_________
dennis roberts, educational psychology, penn state university
208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED]
http://roberts.ed.psu.edu/users/droberts/drober~1.htm






strapping boots

2001-09-21 Thread Dennis Roberts

At 02:12 PM 9/21/01 -0500, Jon Cryer wrote:
>I wouldn't call bootstrapping "sampling from a population."
>Would you?
>
>Jon Cryer

however, we should perhaps not make too light of this method ... if 
bootstrapping or resampling ... will produce accurate estimates of standard 
errors (for example)

in stat classes, we typically will let "software" do the many many 
samplings from some population and then plot the statistics that occur 
across all those samples

what if we could show that taking ONE SRS of decent size ... and beating 
the dickens out of it (ie, resampling) ... would produce a sigma sub x bar 
... that is essentially the same that we would find across say ... 5000 
separate samples and then looking at the SD of those 5000 sample means?

thus, if the question is ... what is the standard error likely to be ... 
then perhaps we can arrive at that answer from bootstrapping or resampling 
... just as well ... and in a sense more efficiently ... than our normal 
strategy of generating new SRSes from a defined population
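a quick sketch of that comparison in python ... one SRS beaten to death by resampling with replacement, versus 5000 fresh SRSes from the defined population ... the population N(50, 10), n = 100, and the seed are all made up for illustration

```python
import random
import statistics

random.seed(2)  # arbitrary
n = 100

# "truth": SD of the means across 5000 separate SRSes from N(50, 10)
many_means = [statistics.mean(random.gauss(50, 10) for _ in range(n))
              for _ in range(5000)]
print(statistics.stdev(many_means))  # close to sigma/sqrt(n) = 1.0

# bootstrap: ONE SRS, resampled with replacement 5000 times
one_srs = [random.gauss(50, 10) for _ in range(n)]
boot_means = [statistics.mean(random.choices(one_srs, k=n))
              for _ in range(5000)]
print(statistics.stdev(boot_means))  # also close to 1.0
```

the two printed values will usually agree quite well ... which is exactly the point ... the bootstrap estimate of sigma sub x bar came from a single sample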


_________
dennis roberts, educational psychology, penn state university
208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED]
http://roberts.ed.psu.edu/users/droberts/drober~1.htm






Re: Free program to generate random samples

2001-09-21 Thread Dennis Roberts

At 02:12 PM 9/21/01 -0500, Jon Cryer wrote:
>I wouldn't call bootstrapping "sampling from a population."
>Would you?

well, getting the first boot ... to do the strapping ... might be ... but, 
after that ... then REsampling from the first SAMPLE (boot) ... would be a 
better way to describe it

similarly ... let's say we do a two stage survey project ... where the 
first phase is to circulate a survey form to n=1000 folks (from a large 
population) ... and after we get back their responses (let's assume for 
fantasy sake that we get all 1000 back) ... we pick n=25 from that 1000 to 
do extensive follow up interviews with ... can we assume that the n=25 
sample is a SRS of n=25 from the larger population? i don't think so


>Jon Cryer
>
>At 06:03 PM 9/21/01 GMT, you wrote:
> >Jon Cryer wrote:
> >>
> >> But it would be bad statistics to sample with replacement.
> >
> >Whew!  saves me from having to learn about all that bootstrap
> >stuff!  :-)
> >
> >
> >
>
>

_
dennis roberts, educational psychology, penn state university
208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED]
http://roberts.ed.psu.edu/users/droberts/drober~1.htm






Re: for students (biology et al.) that hate numbers

2001-09-21 Thread Dennis Roberts

At 06:14 PM 9/21/01 +, Jerry Dallal wrote:
>I wrote:
>
> > Does anybody really care about the proportions of different colors
> > in bags of M&Ms?
>
>because I surely didn't, but perhaps I should.  Since the % blues
>differ among plain and peanut (10 v 30, says WBW) there's probably a
>good medical/epidemiology exercise to be had by relabeling
>plain/peanut, blue/non-blue and handing out a bunch of bags to the
>class. (Thanks also to LD).

this may be true but ... in the overall scheme of things ... this has to be 
of trivial importance and interest ... there must be better "exemplars" to 
use that have more real life applications

when m and ms go off the market, does this sort of problem have ANY 
relevance to anything?

if we have to rely on m and m examples to capture their attention ... then 
no wonder we don't make much headway in our pursuit of teaching "fearful" 
students about statistics ...



_
dennis roberts, educational psychology, penn state university
208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED]
http://roberts.ed.psu.edu/users/droberts/drober~1.htm






Re: what type of distribution on this sampling

2001-09-21 Thread Dennis Roberts

At 12:47 PM 9/21/01 -0400, Joe Galenko wrote:

>The mean of a random sample of size 81 from a population of size 1 billion
>is going to be Normally distributed regardless of the distribution of the
>overall population (i.e., the 1 billion).

i don't think so ... check out the central limit theorem
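one way to see the limit of the "regardless of the distribution" claim ... take a population with truly extreme skew and look at the means of samples of n = 81 ... a hedged python sketch (the lognormal choice, the seed, and the counts are just illustration, not from the original exchange)

```python
import random
import statistics

random.seed(3)  # arbitrary
n, reps = 81, 4000

# a deliberately extreme population: lognormal with sigma = 3 (huge right skew)
means = [statistics.mean(random.lognormvariate(0, 3) for _ in range(n))
         for _ in range(reps)]

# standardized third moment of the 4000 sample means
m = statistics.mean(means)
s = statistics.stdev(means)
skew = statistics.mean(((x - m) / s) ** 3 for x in means)
print(skew)  # far above 0 ... the histogram of means is still badly skewed
```

for most well-behaved populations n = 81 gets you very close to normal ... but the CLT only promises normality in the limit, and for wild enough populations the approximation at n = 81 can still be poor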

_____
dennis roberts, educational psychology, penn state university
208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED]
http://roberts.ed.psu.edu/users/droberts/drober~1.htm






samp. w or w/o replacement

2001-09-21 Thread Dennis Roberts



seems to me that if you are talking about, for example, generating a 
sampling distribution of means, ... then each and every SRS is assumed to 
be randomly and INDEPENDENTLY drawn from said population ... thus, sampling 
with replacement is assumed

if not, each NEXT sample is not being drawn from the defined population

at the individual CASE level ... or individual EXPERIMENT level ...  seems 
like it makes no sense to talk about sampling with or without replacement ...

so, sampling with or without replacement seems like it only makes sense 
when you are talking about taking multiple samples ... and whether the 
population stays constant when each new sample is taken






Re: what type of distribution on this sampling

2001-09-20 Thread dennis roberts
onditions precisely.
> >
> > >Is it possible to translate it into a z score without any addtional data.
> >
> > If the population mean and standard deviation are known, that's all
> > you need for a z score. The formula is
> > z = [ xbar - mu ] / [ SEM ]
> > For your scenario,
> > z = (xbar-78)/3
> >
> > A sample mean of 60 has a z score of -6, so it is quite unlikely
> > that you'd draw a sample with a mean of 60. (My TI-83 says that the
> > area in the tail past z=-6 is just under 10^-9.)
> >
> > >In other words is the std deve of 27 and mean of 81 in any way predictive
>of
> > >what a histogram of a distribution would look like?
> >
> > I assume you meant to say "mean of 78 and sample size of 81"?
> > Assuming that, the histogram of sample means should be normal or
> > nearly so, with mean (mu-sub-xbar) 78 (same as population mean) and
> > standard deviation (standard error of the mean, sigma-sub-xbar) 3.
> >
> > >Finally what difference does it make how many random samples you take
>(ie.
> > >100 or 1000). What statistic or parameter does this speak to?
> >
> > None that I know, in a formal sense. If you take 100 random samples
> > of size 81, or 100,000 random samples of size 81, your histogram of
> > sample means will have the same shape, though the curve will be a
> > bit smoother with 100,000 samples.
> >
> > --
> > Stan Brown, Oak Road Systems, Cortland County, New York, USA
> >   http://oakroadsystems.com
> > My reply address is correct as is. The courtesy of providing a correct
> > reply address is more important to me than time spent deleting spam.
>
>
>
>

==
dennis roberts, penn state university
educational psychology, 8148632401
http://roberts.ed.psu.edu/users/droberts/drober~1.htm






Re: what type of distribution on this sampling

2001-09-20 Thread dennis roberts

At 06:28 PM 9/20/01 -0400, Stan Brown wrote:

>None that I know, in a formal sense. If you take 100 random samples
>of size 81, or 100,000 random samples of size 81, your histogram of
>sample means will have the same shape, though the curve will be a
>bit smoother with 100,000 samples.

this is for sure ... here i generated 100 samples of n=25 from a normal 
distribution with mu = 50 and sigma = 10 ... and then 10 samples of the same 
size ... here are the dotplots and desc stats

Dotplot: 100, 10

[two Minitab character dotplots of the two columns of sample means appeared 
here; the alignment did not survive plain text ... both distributions pile 
up near 50 on a scale running from about 42.0 to 59.5]


MTB > desc c27 c26

Descriptive Statistics: 100, 10


Variable N   Mean Median TrMean  StDevSE Mean
100100 50.224 50.036 50.190  1.941  0.194
10  10 50.006 50.017 50.007  2.003  0.006

Variable   MinimumMaximum Q1 Q3
100 46.283 55.307 49.042 51.504
10  40.947 58.238 48.653 51.355




>--
>Stan Brown, Oak Road Systems, Cortland County, New York, USA
>   http://oakroadsystems.com
>My reply address is correct as is. The courtesy of providing a correct
>reply address is more important to me than time spent deleting spam.
>
>

======
dennis roberts, penn state university
educational psychology, 8148632401
http://roberts.ed.psu.edu/users/droberts/drober~1.htm






Re: Free program to generate random samples

2001-09-20 Thread dennis roberts

why does everything have to be free???

but ... go here first ... http://members.aol.com/johnp71/javastat.html

scroll down to random number generators ...

one example is http://ebook.stat.ucla.edu/calculators/cdf/index.phtml

regular stat software does these sorts of things easily ... in minitab for 
example ... let's say i wanted to generate a sample of n=100 from a chis 
dist. with 3 degrees of freedom


—   9/20/01 5:50:36 PM   

Welcome to Minitab, press F1 for help.
MTB > rand 100 c1;
SUBC> chis 3.
MTB > dotp c1

Dotplot: C1


   .
   :  :.: ..:  .  ..
   :  :::.:: .::  ::..   .
   :.:::  .. :  .. .  ..  . .
  +-+-+-+-+-+---C1
0.0   2.5   5.0   7.5  10.0  12.5

MTB > desc c1

Descriptive Statistics: C1


Variable N   Mean Median TrMean  StDevSE Mean
C1 100  3.399  2.785  3.219  2.404  0.240

Variable   MinimumMaximum Q1 Q3
C1   0.144 11.483  1.549  5.048

or, 100 samples of n=100 from the same chisquare distribution

MTB > rand 100 c1-c100;
SUBC> chis 3.
MTB > rmean c1-c100 c101
MTB > dotp c101

Dotplot: C101


 .
 ::
 ::   .
. :
 :.. .  :.:  ::: .
.  :  .:.:::.::..: ... : .  . .
   -+-+-+-+-+-+-C101
 2.40  2.70  3.00  3.30  3.60  3.90

hmmm ... looks like the sampling distribution of the mean ... when the 
population is shaped like a chisquare 3 distribution ... has that funny 
looking normal-like shape

MTB > desc c101

Descriptive Statistics: C101


Variable N   Mean Median TrMean  StDevSE Mean
C101   100 2.9789 2.9525 2.9733 0.2393 0.0239

Variable   MinimumMaximum Q1 Q3
C1012.3895 3.7689 2.8695 3.1187

MTB >
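for anyone without minitab, the same exercise fits in a few lines of python (standard library only) ... the chi-square with 3 df is built here as a sum of three squared standard normals, and the seed is an arbitrary choice

```python
import random
import statistics

random.seed(5)  # arbitrary

def chisq3():
    # chi-square with 3 df = sum of three squared standard normals
    return sum(random.gauss(0, 1) ** 2 for _ in range(3))

# one sample of n=100, like "rand 100 c1; chis 3."
one_sample = [chisq3() for _ in range(100)]
print(statistics.mean(one_sample))  # near 3, the df

# 100 samples of n=100, like "rand 100 c1-c100" followed by "rmean"
means = [statistics.mean(chisq3() for _ in range(100)) for _ in range(100)]
print(statistics.mean(means), statistics.stdev(means))
# mean of means near 3 ... SD of means near sqrt(2*3)/sqrt(100), about 0.245
```

same story as the minitab session ... the parent distribution is badly skewed, but the means of samples of n=100 pile up in a roughly normal heap around 3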


At 06:09 PM 9/20/01 -0300, Voltolini wrote:
>I am interested in the same programs and if possible,
>one that can generate normal, binomial, etc distributions 
>
>Thanks for any suggestions !!!
>
>_
>Prof. J. C. Voltolini
>Grupo de Estudos em Ecologia de Mamiferos - ECOMAM
>Universidade de Taubate - Depto. Biologia
>Praca Marcellino Monteiro 63, Bom Conselho,
>Taubate, SP - BRASIL. 12030-010
>
>TEL: 0XX12-2254165 (lab.), 2254277 (depto.)
>FAX: 0XX12-2322947
>E-Mail: [EMAIL PROTECTED]
>
>- Original Message -
>From: @Home <[EMAIL PROTECTED]>
>To: <[EMAIL PROTECTED]>
>Sent: Thursday, September 20, 2001 2:50 PM
>Subject: Free program to generate random samples
>
>
> > Is there any downloadable freeware that can generate let's say 2000 random
> > samples of size n=100 from a population of 100 numbers.
> >
> > Is this conceivable? for excel etc.
> >
> >
> >
> >
> >
>
>
>

==
dennis roberts, penn state university
educational psychology, 8148632401
http://roberts.ed.psu.edu/users/droberts/drober~1.htm






Re: Definitions of Likert scale, Likert item, etc.

2001-09-19 Thread dennis roberts

At 05:14 PM 9/19/01 -0400, Rich Ulrich wrote:

>It has Likert's original observations on writing
>an attitude scale (1932, which I had not seen elsewhere).

likert's work appeared in the archives of psychology ... #141 i think ... 
in 1932 ... it was his dissertation work ... under the direction i think of 
gardner murphy

the intention of likert's work was NOT to validate in any way ... the 3 
scales he used in that dissertation ... but, to show that a simpler method 
of attitude item scaling would be about as useful as the much harder to do 
... thurstonian scaling ... equal appearing intervals i think

for sure, it is simpler

however, we have to keep in mind that this was 70 years ago ... i hope we 
have learned a few things since then ... but, sometimes i wonder


>This Likert article has 4 items of example which are
>surprising to me in a couple of respects.  Two of the items
>are scaled categories, instead of being symmetrical
>ratings around "Undecided" or indifferent.  But the content
>is more surprising.  Race relations have surely shifted --

==========
dennis roberts, penn state university
educational psychology, 8148632401
http://roberts.ed.psu.edu/users/droberts/drober~1.htm






for number haters

2001-09-19 Thread Dennis Roberts

At 12:36 PM 9/19/01 -0500, jeff rasmussen wrote:
> >
> One thing I recently did was divide the class into 6 groups of ~5 
> each.
>Each group got a baggy with different stuff:  one was multicolored
>confetti, another was different types of pasta, another was different
>lengths of twine that had an inverse relationship between length and color
>saturation.  Their task was to Organize, Summarize, Describe, Graph and
>Present the results & also to  make the results attractive via Graphic
>Design considerations.  The class voted on who did the best job (I was
>quite surprised that the confetti group won since they had little to work
>with in terms of the complexity of the data set).  At the end of the year
>after a couple such contests, I'll give the winners a prize... usually
>Godiva Chocolate.

one thing that is a given here ... for most groups taking statistics for 
the first time ... is that, since most do NOT like math ... and numbers ... 
that transfers to statistics ... the logic to them goes like:

1. i hate numbers
2. statistics has numbers in it
3. therefore, i WILL hate statistics

now, the reality is that there will NOT be much you can do about that ... 
but, what you can do is to make your class or course ... data driven ... 
that is, have students work with data ... (i can't attest to whether the 
above "sets" of data are the right ones to use)

but, what i do say about using data is that the data should be 
UNDERSTANDABLE to the students ...

note that i did not say "meaningful" or "relevant" to them ... ie, 
tailoring the data you use to the INTERESTS of students ... unfortunately, 
probably every student would be interested in something different but, it 
is critical that they understand the data they are using or playing around 
with ... things they are familiar with

of course, talking about descriptive oriented topics within statistics is 
rather easy to do ... but, when it comes to inference ... that is a tougher 
nut to crack since there are many, many assumptions that have to be made ABOUT 
the data ... about the population ... about the methods by which you GOT 
the data ... and, in addition, the logic about inference is not that simple 
to follow

IMHO of course


_
dennis roberts, educational psychology, penn state university
208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED]
http://roberts.ed.psu.edu/users/droberts/drober~1.htm






Re: sig. testing articles

2001-09-19 Thread Dennis Roberts

At 02:11 PM 9/19/01 -0400, Rich Ulrich wrote:

>Or download the APA  conclusions on the general topic.
>This paper also provided a rare and valuable overview on
>social science research:
>
>http://www.apa.org/journals/amp/amp548594.html
>
>American psychologist
>August 1999, Vol. 54, No. 8, 594-604.
>
>"Statistical Methods in Psychology Journals: Guidelines and
>Explanations"
>
>Leland Wilkinson and Task Force on Statistical Inference
>APA Board of Scientific Affairs

actually, there is precious little in this paper about significance testing 
... though, there is much more in the bibliographic references

the general link rich refers to is much more general ... ie, about an 
entire journal article and sections in it ... what should be presented and 
what should not be

actually, one of the statements within the conclusion section is:
=
Some had hoped that this task force would vote to recommend an outright ban 
on the use of significance tests in psychology journals. Although this 
might eliminate some abuses, the committee thought that there were enough 
counterexamples (e.g., Abelson, 1997) to justify forbearance. Furthermore, 
the committee believed that the problems raised in its charge went beyond 
the simple question of whether to ban significance tests.


one might get the impression from this that there was a major discussion of 
the pros and cons of continuing "significance testing" and, the above was 
the conclusion drawn ... but actually, the paper in the url above ... 
really says very little about this specific issue

finally, i did NOT send the note about the sig. test. articles ... to make 
a case for or agin (though i have my own opinion) ... i just sent it as 
reading that some might find useful



_________
dennis roberts, educational psychology, penn state university
208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED]
http://roberts.ed.psu.edu/users/droberts/drober~1.htm






sig. testing articles

2001-09-18 Thread Dennis Roberts

some time ago ... i posted a site that had a series of articles about 
significance testing ...

The Fall 1998 Issue of Research in the Schools, was a special full issue on 
Statistical Significance Testing. This issue contained 6 primary papers and 
3 follow up comments. The Editors and Publishers of Research in the Schools 
agreed to have this issue put in a web format.

here is the link

http://roberts.ed.psu.edu/users/droberts/sigtest.htm

these are in pdf format ... if you have any problems, let me know

_
dennis roberts, educational psychology, penn state university
208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED]
http://roberts.ed.psu.edu/users/droberts/drober~1.htm






Re: Regression to the mean, Barry Bonds & HRs

2001-09-17 Thread Dennis Roberts


>
>My main point was not about baseball or Bonds. It was about the
>cavalier way that people toss around the phrase, "regression to
>the mean," as if it were an immutable law that trumped all other
>differences in conditions.
>
>--Robert Chung

right ... reg. to the mean is not a cause of anything ... but, a 
description of the relationship between variables ... and how the RELATIVE 
POSITIONS on one tend to go along with the RELATIVE POSITIONS on the other

it's not less than that and certainly no more than that

if bonds breaks the record ... reg. to the mean will not be flawed ... and if 
he does not break the record it is not because reg. to the mean works




_________
dennis roberts, educational psychology, penn state university
208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED]
http://roberts.ed.psu.edu/users/droberts/drober~1.htm






Re: effect size/significance

2001-09-13 Thread Dennis Roberts

here are some data ... say we randomly assigned 30 Ss ... 15 to each 
condition and found the following:

MTB > desc c1 c2

Descriptive Statistics: exp, cont


Variable N   Mean Median TrMean  StDevSE Mean
exp 15  26.13  27.00  26.00   4.96   1.28
cont15  21.73  22.00  21.85   3.95   1.02

MTB > twos c1 c2

Two-sample T for exp vs cont

N  Mean StDev   SE Mean
exp   15 26.13  4.96   1.3
cont  15 21.73  3.95   1.0

Difference = mu exp - mu cont
Estimate for difference:  4.40
95% CI for difference: (1.04, 7.76)
T-Test of difference = 0 (vs not =): T-Value = 2.69  P-Value = 0.012 <<<<< 
p value

MTB > let k1=(26.13-21.73)/3.95
MTB > prin k1

Data Display

K1    1.11392  <<<< simple effect size calculation

other than being able to say that the experimental group ... ON AVERAGE ... 
had a mean about 1.11 control group sd units larger than the control group 
mean, which is purely DESCRIPTIVE ... what can you say that is important?
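as a side note, the two key numbers in the session reproduce directly from the reported summary statistics ... a small python check (an unpooled standard error reproduces the reported T-Value, and the effect size is in control-group SD units, as in the session)

```python
# summary stats reported above
n1, m1, s1 = 15, 26.13, 4.96   # experimental
n2, m2, s2 = 15, 21.73, 3.95   # control

diff = m1 - m2                           # 4.40
se = (s1**2 / n1 + s2**2 / n2) ** 0.5    # unpooled standard error of the diff
t = diff / se
d = diff / s2                            # effect size in control-SD units

print(round(t, 2))  # 2.69, matching the session output
print(round(d, 2))  # 1.11, matching the K1 calculation
```

which is only to say the arithmetic checks out ... it still leaves open the substantive question of whether a 1.11-SD difference matters in practice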




_________
dennis roberts, educational psychology, penn state university
208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED]
http://roberts.ed.psu.edu/users/droberts/drober~1.htm






Re: effect size/significance

2001-09-13 Thread Dennis Roberts

At 02:33 PM 9/13/01 +0100, Thom Baguley wrote:
>Rolf Dalin wrote:
> > Yes it would be the same debate. No matter how small the p-value it
> > gives very little information about the effect size or its practical
> > importance.
>
>Neither do standardized effect sizes.

agreed ... of course, we would all be a whole lot better off if a 
researcher DEFINED for us ...  a priori ... what size of effect he/she 
considered to be important ... and why that "amount" has practical benefit

then we could evaluate (approximately) if he/she found something of 
practical importance (at least according to the researcher)

but, what we get is an after the fact description of this ... which by the 
very nature of its post hocness ... is not really that helpful

bringing into this discussion statistical significance only has relevance 
IF you believe that null hypothesis testing is of real value

bringing  effect size into the discussion only has relevance IF you know 
what effect it takes to have practical importance ... (and i have yet to 
see the article that focuses on this even if they do report effect sizes ... )

what we need in all of this is REPLICATION ... and, the accumulation of 
evidence about the impact of independent variables that we consider to have 
important potential ... and not to waste our time and money on so many 
piddly manipulations ... just for the sake of "getting stuff published"

the recent push by apa journals and others to make submitters supply 
information on effect sizes is, in my view, misplaced effort ... what 
should be insisted upon in studies where the impacts of variables is being 
investigated ... is a clear statement and rationale BY the researcher as to 
WHAT size impact it would take to make a practical and important difference 
...

if that ONE piece of information were insisted upon ... then all of us 
would be in a much better position to evaluate results that are presented

reporting effect sizes (greater than 0) does nothing to help readers 
understand the functional benefit that might result from such treatments

until we tackle that issue directly, we are more or less going around in 
circles


>Thom
>
>
>=
>Instructions for joining and leaving this list and remarks about
>the problem of INAPPROPRIATE MESSAGES are available at
>   http://jse.stat.ncsu.edu/
>=====

_
dennis roberts, educational psychology, penn state university
208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED]
http://roberts.ed.psu.edu/users/droberts/drober~1.htm






Re: not significant

2001-09-12 Thread dennis roberts

At 10:10 PM 9/12/01 -0400, Stan Brown wrote:
>[cc'd to previous poster; please follow up in newsgroup]
>
>Jerry Dallal <[EMAIL PROTECTED]> wrote in sci.stat.edu:
>
>One suggestion, if I may: I scratched my head for a moment over
>"SEM". At least in my course, I don't believe the textbook ever uses
>that abbreviation (and I know I don't). Perhaps you might want to
>define it the first time on that page: SEM = standard error of the
>mean.


this is a good point ... in fact, sem is more often seen in measurement as an 
abbreviation for standard error of measurement ... and SEM is often used to 
refer to the analytical method ... structural equation modeling ...


==========
dennis roberts, penn state university
educational psychology, 8148632401
http://roberts.ed.psu.edu/users/droberts/drober~1.htm






Re: effect size/significance

2001-09-12 Thread dennis roberts

At 07:23 PM 9/12/01 -0500, jim clark wrote:
>Hi
>
>
>What your table shows is that _both_ dimensions are informative.
>That is, you cannot derive effect size from significance, nor
>significance from effect size.  To illustrate why you need both,
>consider a study with small n that happened to get a large effect
>that was not significant.  The large effect should be "ignored"
>as being due to chance.  Only having the effect size would likely
>lead to the error of treating it as real (i.e., non-chance).



or, another way to view it is that neither of the dimensions is very 
informative

of course we know that significance does not mean "real" and non 
significance does not mean "chance alone" ... there is no way to decipher 
one from the other based on our significance tests

the distinction between significant or not ... is based on an arbitrary 
cutoff point ... which has on one side ... the notion that the null seems 
as though it might be tenable ... and the other side ... the notion that 
the null does not seem to be tenable ... but this is not an either/or deal ... 
it is only a matter of degree

what if we juxtapose ... non significant findings with a large effect size 
... with significant results with a small effect size ... which of these 
two would we feel carries the most import?

surely, if both dimensions are seen as being equal parts of the overall 
puzzle, then both outcomes should seem to matter more or less equally

but, if one opts for large effect size when results are not significant ... 
then, this suggests that significance adds little if anything to the mix ...

however, if we opt for significance along with a small effect size, then 
this suggests that significance is playing a more important role in one's eyes

the reality is too that effect sizes, when categorized as small, medium, and 
large ... are again totally arbitrary ... which makes it harder still to 
make much sense of these ... just as it is difficult to make much sense out 
of significance

and finally, effect sizes do NOT say anything about the importance OF the 
effect ... merely something about the SIZE of the effect ... so, it could 
very well be that for many independent variables ... a small effect size has 
a much greater practical consequence than a large effect size ... even when 
both have significance on their side

just to muddy the waters more


==
dennis roberts, penn state university
educational psychology, 8148632401
http://roberts.ed.psu.edu/users/droberts/drober~1.htm






effect size/significance

2001-09-12 Thread Dennis Roberts

given a simple effect size calculation ... some mean difference compared to 
some pooled group or group standard deviation ... is it not possible to 
obtain the following combinations (assuming some significance test is done)

                           effect size
                 small     medium     large

   res NS

   res sig

that is ... can we not get both NS or sig results ... when calculated 
effect sizes are small, medium, or large?

if that is true ... then what benefit is there to look at significance AT ALL?
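to see that every cell of that table is attainable, here is a minimal sketch 
in python (my own illustration, not from the thread) ... for equal group 
sizes, the two-sample t statistic implied by a standardized effect d is 
t = d * sqrt(n/2), and z = 1.96 serves as a rough two-sided 5% cutoff (a 
large-sample normal approximation):

```python
import math

def t_from_effect(d, n_per_group):
    """Two-sample t statistic implied by Cohen's d with equal group sizes:
    t = d * sqrt(n / 2)."""
    return d * math.sqrt(n_per_group / 2)

# A "small" effect (d = 0.2) reaches significance with a big n, while a
# "large" effect (d = 0.8) does not with a tiny n -- so effect size and
# significance can pair up in any combination.
for d, n in [(0.2, 400), (0.2, 20), (0.8, 8), (0.8, 100)]:
    t = t_from_effect(d, n)
    verdict = "sig" if t > 1.96 else "NS"
    print(f"d = {d}, n per group = {n:3d}: t = {t:.2f} -> {verdict}")
```

both rows of the table can be filled in for any column, simply by varying n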

_
dennis roberts, educational psychology, penn state university
208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED]
http://roberts.ed.psu.edu/users/droberts/drober~1.htm






R values

2001-09-11 Thread dennis roberts

some of you chuckled a while back when i suggested that our R values had a 
different meaning ...

have a look at ... http://www.alumni.psu.edu/vrpennstate/HistMrkr/Index.html

down the page a bit there is R ... along with some other interesting 
"markers" and their stories

======
dennis roberts, penn state university
educational psychology, 8148632401
http://roberts.ed.psu.edu/users/droberts/drober~1.htm






Re: Definitions of Likert scale, Likert item, etc.

2001-09-10 Thread dennis roberts

At 11:29 AM 9/11/01 +1200, Magenta wrote:
>I incorporate a separate "N/A" option. This could be included in an earlier
>question that would ensure respondents who should not answer the questions
>were skipped over those questions.  This is standard practice, e.g. in CATI
>situations.
>CATI = computer assisted telephone interviewing.
>
>IMO the problem has become incorrect survey question pattern design in this
>case, rather than incorrect response design.

i will try one more shot at this

again, say the item is ... "i like statistics"

SCENARIO A

and, 2 people using your response scale ... respond:

"i don't agree" 0 <-> 5 "i strongly agree"

person 1 responds at 0
person 2 responds at 0

SCENARIO B

now, what if the same two people are presented with the following 
additional item:

  "i DON'T like statistics"

"i don't agree" 0 <-> 5 "i strongly agree"

person 1 responds at 0
person 2 responds at 5

the combination of SCENARIO A plus SCENARIO B ... suggests that person 1 IS 
more or less neutral ... BUT, person 2 is really ANTI statistics ... 
perhaps even HATES it

but, with ONLY SCENARIO A ... you CANNOT know that these two persons are 
different ... and to assume that they are both at the 0 point on your scale 
... is a mistake

thus, the problem here is NOT with the item or stem design ... it is most 
surely with the response options given to the S

you seem to be forgetting ... or wanting to bypass ... the notion that for 
attitudes anyway ... there is an OBJECT that we have some valence toward ... 
or not ... one dimension is of course the STRENGTH of that valence ... but 
the other ... which your approach misses ... is the DIRECTION of that valence 
... ie, the tendency to want to approach it or avoid it ...

when you create an item ... that in itself has some direction to it ... 
agreement with the statement provides information about the S's strength AND 
direction ... but, for a person who is inclined in the opposite direction of 
the way the item is stated ... an "i don't agree" or 0 ... does not provide 
that S with ANY response that fits his/her attitudinal pattern ...

thus, without the other end of the continuum being one of the RESPONSE 
OPTIONS for the S ... the 0 point "i don't agree" simply provides ambiguous 
data to the data collector


>cheers
>Michelle
>
>"Dennis Roberts" <[EMAIL PROTECTED]> wrote in message
>[EMAIL PROTECTED]">news:[EMAIL PROTECTED]...
> > At 01:17 PM 9/9/01 +1200, Magenta wrote:
> > >It would treat "don't agree" as the zero point.  So an answer at the 100%
> > >point would be interpreted as twice as strong as an answer at the 50%
>point.
> >
> >
> > let's say the item is
> >
> > "i like statistics"
> >
> > and, we have two people ... PERSON 1 who HATES statistics ... and PERSON 2
> > one who really has had no exposure to statistics and therefore, really has
> > "no opinion" at this point in time
> >
> > and the response options are:
> >
> > I DON'T AGREE WITH THIS |__________| I STRONGLY AGREE WITH THIS
> >          0          5          10
> >
> > NOW, both person A and person B ... respond "I DON'T AGREE WITH THIS"
> > (which is dictated by the item response possibilities)
> >
> > are you trying to tell us that you would consider both of these responses
> > as reflecting the same degree of "agreement" and/or ... compared to
>someone
> > who might have responded 5 ... equally different than the person who said
> > "5"???
> >
> >
> >
> >
> >
> >
> > _
> > dennis roberts, educational psychology, penn state university
> > 208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED]
> > http://roberts.ed.psu.edu/users/droberts/drober~1.htm
> >
> >
> >
>
>
>
>

==
dennis roberts, penn state university
educational psychology, 8148632401
http://roberts.ed.psu.edu/users/droberts/drober~1.htm






Re: Definitions of Likert scale, Likert item, etc.

2001-09-10 Thread Dennis Roberts

At 01:17 PM 9/9/01 +1200, Magenta wrote:
>It would treat "don't agree" as the zero point.  So an answer at the 100%
>point would be interpreted as twice as strong as an answer at the 50% point.


let's say the item is

"i like statistics"

and, we have two people ... PERSON 1 who HATES statistics ... and PERSON 2 
one who really has had no exposure to statistics and therefore, really has 
"no opinion" at this point in time

and the response options are:

I DON'T AGREE WITH THIS |__________| I STRONGLY AGREE WITH THIS
         0          5          10

NOW, both person 1 and person 2 ... respond "I DON'T AGREE WITH THIS" 
(which is dictated by the item response possibilities)

are you trying to tell us that you would consider both of these responses 
as reflecting the same degree of "agreement" and/or ... compared to someone 
who might have responded 5 ... equally different than the person who said 
"5"???






_
dennis roberts, educational psychology, penn state university
208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED]
http://roberts.ed.psu.edu/users/droberts/drober~1.htm






Re: Definitions of Likert scale, Likert item, etc.

2001-09-10 Thread Dennis Roberts

At 01:17 PM 9/9/01 +1200, Magenta wrote:
>It would treat "don't agree" as the zero point.  So an answer at the 100%
>point would be interpreted as twice as strong as an answer at the 50% point.

again ... one (of many) problems with this notion is that it assumes that a 
person who opts for this choice ... has NO degree of feeling about the 
statement at all ... when in fact ... since you have given this person NO 
other way to respond ... it could be because the person DISagrees with the 
statement ... or, rather than having zero opinion about it ... does NOT want 
to respond or thinks the item is ambiguous ... and hence uses "don't agree" 
as a way of NOT really responding ... while still marking something


_________
dennis roberts, educational psychology, penn state university
208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED]
http://roberts.ed.psu.edu/users/droberts/drober~1.htm






Re: Definitions of Likert scale, Likert item, etc.

2001-09-07 Thread dennis roberts

At 11:06 AM 9/8/01 +1200, Magenta wrote:

>Sure do, I think that if you redid it so that the scale was now:
>
>don't agree
>strongly agree
>  |___|
>
>that would give you a ratio scale between no agreement and strong agreement.
>You would then be able to use, e.g. ANOVA, on your test results, which would
>be numeric in millimeters.

TO TALK about these things as ratio scales is downright silly

look at the item:

stat will help me in my professional work

don't agree |(0)__(5)__| agree

you aren't going to claim that "agree" means a view 5 times stronger than 
"don't agree" ... are you???

does "don't agree" mean you are thinking that it will HURT your professional 
work? not necessarily ... it could mean you have a view that it will not 
help ... but won't put you in some disadvantaged position either ... or it 
COULD mean that you don't agree because you think it will harm you

thus, responding "don't agree" is like the ? or neutral position ... which 
is essentially impossible to evaluate


>cheers
>Michelle 
>
>
>
>

==
dennis roberts, penn state university
educational psychology, 8148632401
http://roberts.ed.psu.edu/users/droberts/drober~1.htm






Re: Definitions of Likert scale, Likert item, etc.

2001-09-07 Thread Dennis Roberts

At 11:22 AM 9/7/01 -0400, Rich Ulrich wrote:

>I agree with Mike's opinion, above, that "Likert Scale" does
>not need to refer to attitudes, and that it still ought to imply
>that some amount of reliability testing has been performed.


well, i happen to take a different view ... and that is ... to honor the 
context in which likert did his work ... and NOT confuse all kinds of other 
uses of such response categories ... agree and disagree ... with his name

he worked with what he considered attitudes ... affective dimensions ...

if we want to take it out of that realm ... then i think we should leave 
likert's name out of it

for example ... what if we had an item on a stat test like:

the mean of the set of data ... 10, 9, 8, 7, 6 is ... 8.

A. I strongly agree with this
B. I agree with this
C. I am undecided on this
D. I disagree with this
E. I strongly disagree with this

or forgetting the problematic (as with attitude scales too) ? category

A. I strongly agree with this
B. I agree with this
C. I disagree with this
D. I strongly disagree with this

we could, i guess ... decide some way to score this item ... say ... either 
by counting correct A and B and counting wrong C and D ... or, 
differentially weighting A more than B ... and D less than C

and come up with an item score ... and sum across items to get a total test 
score

but i certainly would NOT call this either a likert type item ... or a 
likert scale ... that would be absurd

this is the problem ... we have so mixed up our use of the term likert ... 
with dozens and dozens of things we do ... that the term likert has taken 
on meanings that were NOT intended by likert himself ... especially if you 
look at HIS work

why should we mangle HIS usage so?


_________
dennis roberts, educational psychology, penn state university
208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED]
http://roberts.ed.psu.edu/users/droberts/drober~1.htm






Re: Definitions of Likert scale, Likert item, etc.

2001-09-07 Thread Dennis Roberts

At 06:28 PM 9/7/01 +1200, Magenta wrote:

>"John Uebersax" <[EMAIL PROTECTED]> wrote in message
>[EMAIL PROTECTED]">news:[EMAIL PROTECTED]...
> > A recent question made me realize the extent of ambiguity in the use
> > of "Likert scale" and related terms.  I'd like to see things be more
> > clear.  Here are my thoughts (I don't claim they are correct; they're
> > just a starting point for discussion).  Concise responses are
> > encouraged.  If there are enough, I'll post a summary.
> >
> > 3.  I do not know if Likert also used a visual analog format such as:
> >
> >  neither
> > strongly   mildly   agree normildly   strongly
> > disagree  disagree  disagree agree agree
> >
> >1 2  3  4 5
> >+-+--+--+-+
>
>My understanding of the use of visual analog scales is that only the anchors
>are labelled - so that you have a line like so:
>
>strongly disagree
>strongly agree
>     |___|

in any case ... likert did not use this visual method


_
dennis roberts, educational psychology, penn state university
208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED]
http://roberts.ed.psu.edu/users/droberts/drober~1.htm






Re: Definitions of Likert scale, Likert item, etc.

2001-09-06 Thread Dennis Roberts

we do have a semantics problem with terms like this ... scale ... and 
sometimes confuse the actual physical paper and pencil instrument with the 
underlying continuum on which we are trying to place people

so, even in likert's work ... he refers to THE attitude scales ... and then 
lists the items on each ... thus, it is easy to see an equating made 
between the collection of items ... nicely printed ... BEING the scale ...

but really, the scale is not that ... one has to think about the  SCORE 
value range ... that is possible ... when this physical "thing" (nicely 
printed collection of items) is administered to Ss ...

thus ... for 10 typically response worded likert items with SA to SD ... 
the range of scores on the scale might be 10 to 50 ... of which any 
particular S might get any one of those values somewhere along the continuum
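as a trivial arithmetic check of that range (a sketch assuming 10 items, 
each scored 1 = SD through 5 = SA):

```python
# 10 items, each scored 1 (SD) through 5 (SA): totals run from 10*1 to 10*5.
n_items, lo, hi = 10, 1, 5
print(n_items * lo, n_items * hi)  # prints: 10 50
```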

but of course, scale is even "deeper" than that since, what we really have 
is a psychophysical problem ... that is, what is the functional 
relationship that links the physical scale ... 10 to 50 ... to  the 
(assumed to exist) underlying psychological continuum ...

PHYSICAL SCALE:           10 (NEGATIVE) <---------------> 50 (POSITIVE)

PSYCHOLOGICAL CONTINUUM:  MOST NEGATIVE <---------------> MOST POSITIVE

problems like ... do equal distances along the physical scale ... equate to 
the same and equal distances along the psychological continuum? is there a 
linear relationship between these two? curvilinear?

so, i think what we really mean by scale is  this construct ... ie, the 
psychological continuum ... and a scale value would be where a S is along 
it ... but, about the best we can do to "assess" this is to see where the S 
is along the physical scale ... ie, where from 10 to 50 ... and use this as 
our PROXY measure ...

BUT IN any case ... i think it is helpful NOT to call the actual instrument 
... the paper and pencil collection of items ... THE scale ...




_________
dennis roberts, educational psychology, penn state university
208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED]
http://roberts.ed.psu.edu/users/droberts/drober~1.htm






Re: Definitions of Likert scale, Likert item, etc.

2001-09-06 Thread Dennis Roberts

again ... the best place to read about what rensis likert did ... is to 
read his work:

a technique for the measurement of attitudes, archives of psychology, #140, 
New York, June 1932

to the best of my knowledge, this document is not online in any form (not 
that it should be) even though it is a sort of famous document (but, unread 
far too often)

here are examples of each type of item that likert used in this work ... 
his dissertation  under the guidance of gardner murphy

likert used 3 different "attitude" scales ... each consisting of multiple items

one was the internationalism scale, one was the negro scale, the final one 
was an imperialism scale

below, i will give ONE example from each type of item used (not necessarily 
from each scale)

QUESTION ASKED WITH A YES/NO KIND OF RESPONSE SCALE

Do you favor the early entrance of the United States into the League of 
Nations?

     YES (4)     ? (3)     NO (2)        (#s = scale weights)

[THERE WERE 19 ITEMS LIKE THIS ACROSS THE 3 SCALES]

QUESTION ASKED WITH VERBAL OPTIONS FOR RESPONSES

How much military training should we have?

   (a) We need universal compulsory military training  (1)
   (b) We need Citizens Military Training Camps and Reserve Officers
       Training Corps, but not universal military training  (2)
   (c) We need some facilities for training reserve officers but not as
       much as at present  (3)
   (d) We need only such military training as is required to maintain our
       regular army  (4)
   (e) All military training should be abolished  (5)

[THERE WERE 8 ITEMS LIKE THIS (hard to make) ACROSS THE 3 SCALES]

STATEMENT PRESENTED WITH TYPICAL LIKERT RESPONSE SCALE

All men who have the opportunity should enlist in the Citizens Military 
Training Camps

Strongly Approve (1)   Approve (2)   Undecided (3)   Disapprove (4)   
Strongly Disapprove (5)

[THERE WERE 31 ITEMS LIKE THIS ACROSS THE 3 SCALES]

now, NO OTHER FORMS OF ITEMS WERE USED BY LIKERT IN THIS 1932 WORK ...

for each attitude scale, a summed value was obtained ...

At 10:44 AM 9/6/01 -0700, John Uebersax wrote:
>A recent question made me realize the extent of ambiguity in the use
>of "Likert scale" and related terms.  I'd like to see things be more
>clear.  Here are my thoughts (I don't claim they are correct; they're
>just a starting point for discussion).  Concise responses are
>encouraged.  If there are enough, I'll post a summary.

_________
dennis roberts, educational psychology, penn state university
208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED]
http://roberts.ed.psu.edu/users/droberts/drober~1.htm






program listings

2001-09-05 Thread Dennis Roberts

once or twice a year ... i post a note reminding people that i have a link 
on my website to edpsy types of programs ...

http://roberts.ed.psu.edu/users/edpsy/links.htm

and mention that if your program is NOT listed and you would like it to be 
... OR, there is some updated url that works better or a problem with a 
link i have ... please let me know

i make no claim that this list is exhaustive (certainly it is not) ... but, 
such a list might be helpful to those looking for web information related 
to edpsy programs of study

social science oriented programs focusing primarily on learning and 
cognition ... and/or research methodology (statistics and measurement) ... 
are the main ones i am trying to list

thanks

_
dennis roberts, educational psychology, penn state university
208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED]
http://roberts.ed.psu.edu/users/droberts/drober~1.htm






RE: Bimodal distributions

2001-08-30 Thread Dennis Roberts

At 01:22 PM 8/30/01 -0500, Paul R. Swank wrote:
>A bimodal distribution is often thought to be a mixture of two other
>distributions with different modes. If the distributions have different sizes,
>then it is possible to have two or more "humps". I once read somewhere (and
>now can't remember where) that this may be referred to as bimodal (or
>multimodal). In the bimodal case, some refer to the higher "hump" as the
>major mode and the other as the minor mode.

this is an interesting point but, one we have to be careful about ... in 
the minitab pulse data set ... c6 is heights of 92 college students ... a 
mixture of males and females ...



[Minitab dotplot of Height for all 92 students: roughly bimodal, with 
peaks near 68-69 inches and 72-73 inches]

now, if we were to 'roughly' see the 'peaks' ... around 68/69 ... and 72/73 
... one might say that THIS is because of the gender differences (ie, where 
the modes or averages BY sex were)... but look at the separate dotplots

Dotplot: Height by Sex


[Minitab dotplots of Height separately by Sex: Sex = 1 (males) is unimodal 
around 71 inches; Sex = 2 (females) is unimodal around 65.5 inches]

but look at the desc. stats ...

Descriptive Statistics: Height by Sex


Variable    Sex     N     Mean    Median    TrMean    StDev
Height        1    57   70.754    71.000    70.784    2.583
              2    35   65.400    65.500    65.395    2.563

using the modes ... as approximations for the averages ... means or medians 
... might not be a good idea ...

in this case ... we get a 'peak' around 68/69 not because of ONE gender 
concentrating there ... but, OVERLAPPING between the sexes ... at this 
approximate location of heights

modes are tricky
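a small simulation of the same point (my own sketch ... the means, SDs, and 
group sizes are taken from the Minitab output above; the normal shapes are 
an assumption):

```python
import random
from collections import Counter

random.seed(1)
# Hypothetical re-creation of the pulse-data mixture: male heights from
# N(70.8, 2.6) and female heights from N(65.4, 2.6), with the same group
# sizes (57 and 35) as the Minitab output above.
males = [random.gauss(70.8, 2.6) for _ in range(57)]
females = [random.gauss(65.4, 2.6) for _ in range(35)]
combined = males + females

# Frequency per rounded inch: a bump near 68-69 in the combined data comes
# largely from the OVERLAP of the two groups, not from either group's center.
freq = Counter(round(h) for h in combined)
for inch in sorted(freq):
    print(f"{inch:3d} | {'*' * freq[inch]}")
```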



_
dennis roberts, educational psychology, penn state university
208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED]
http://roberts.ed.psu.edu/users/droberts/drober~1.htm






Re: Bimodal distributions

2001-08-30 Thread Dennis Roberts

At 02:04 PM 8/30/01 -0400, David C. Howell wrote:
>Karl Wuensch asks an interesting question, though I would phrase it 
>somewhat more generally. "At what point does a bimodal distribution become 
>just a distribution with two peaks?"


or allow me to rephrase as ... when are there enough frequencies at a 
location or several rather distinct locations ... to identify it/them as 
MODAL locations?

the issue here is really not about equality of peaks ... but, when IS it a 
peak that warrants special mention

for example ... you might have a class of intro stat students ... some of 
which have had 3 courses before ... and, most of which who have had no stat 
classes before ... and you give a final exam the first day of class ... and 
see a low peak at the high end ... and a big peak down at the low end of 
the score scale ...

now, the heights of the peaks will surely be different but, there is 
explanatory importance to these two peaks ... that can be explained 
(primarily) by the amount of pre work that has been done

since i don't think there are any technical definitions of what exactly a 
bimodal ... or trimodal distribution is ... we have to take any 
representation OF distributions AS SUCH with a  grain of salt ... and of 
course, insist on actually SEEING the distribution ... as our own personal 
check

for example ... here are 5 randomly generated patterns of data (n=100 each 
time) using minitab ... from an integer distribution ... which assumes 
equal probability across the numbers ... 10 to 20

would anyone want to take a stab in some definitive way ... and 
characterize the modality of these?

[Five Minitab dotplots of C1 through C5, each a random sample of n = 100 
from the integer distribution on 10 to 20: although every value is equally 
likely, each sample shows accidental peaks at different values]
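the same experiment can be rerun in a few lines (a sketch ... python's 
randint stands in for Minitab's integer distribution):

```python
import random
from collections import Counter

random.seed(0)
# Five samples of n = 100 from an integer distribution on 10..20, where
# every value is equally likely.  Each sample still shows accidental
# "modes" in different places.
for i in range(5):
    sample = [random.randint(10, 20) for _ in range(100)]
    counts = Counter(sample)
    top = max(counts.values())
    modes = sorted(v for v, c in counts.items() if c == top)
    print(f"sample {i + 1}: modal value(s) {modes}, frequency {top}")
```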





_________
dennis roberts, educational psychology, penn state university
208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED]
http://roberts.ed.psu.edu/users/droberts/drober~1.htm






Re: Bimodal distributions

2001-08-30 Thread Dennis Roberts

hi karl ... i think the answer is yes ... if you want it to have 2 modes

the mode is a problematical statistic ... since there is no good definition 
for it and ... a few frequencies shifting around ... could radically change 
the "mode" or "modes"

in minitab, there is no place where ANY mode is even identified ...
i have heard about some software that reports modes ... but, how it handles 
multiple peaks with differing ns ... i have no idea

in the example you cite ... what if there were a spike at 12 hundredths ... 
with frequency equal to that at 10 ... would you call it bimodal ... or 
unimodal??? that is ... is there something significant about the difference 
between 10 and 12 ... that we would want to separate them out as REALLY 
different values? ... maybe it is still unimodal

a former student and now an academic vice president ... i know, a demotion! 
... coined a new term for when you have two adjacent values ... each with 
the highest frequency in the distribution ... he said take the median of 
the two modes ... and call it the ...

MODIAN
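a hypothetical helper capturing the anecdote (the name and the midpoint rule 
are the joke above, not a standard statistic):

```python
from collections import Counter

def modian(data):
    """If exactly two ADJACENT values tie for the highest frequency,
    return their midpoint (the 'modian'); otherwise return the list of
    modal values."""
    counts = Counter(data)
    top = max(counts.values())
    modes = sorted(v for v, c in counts.items() if c == top)
    if len(modes) == 2 and modes[1] - modes[0] == 1:
        return (modes[0] + modes[1]) / 2
    return modes

print(modian([3, 4, 4, 5, 5, 6]))  # 4 and 5 tie and are adjacent -> 4.5
```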

At 12:54 PM 8/30/01 -0400, Wuensch, Karl L. wrote:
> Does a bimodal distribution necessarily have two modes?  This might
>seem like a silly question, but in my experience many folks apply the term
>"bimodal" whenever the PDF has two peaks that are not very close to one
>another, even if the one peak is much lower than the other.  For example,
>David Howell (Statistical Methods for Psychology, 5th, p. 29) presents
>Bradley's (1963) reaction time data as an example of a bimodal distribution.
>The frequency distribution shows a peak at about 10 hundredths of a second
>(freq about 520), no observations between about 18 and 33 hundredths, and
>then a second (much lower) peak at about 50 hundredths (freq about 25).
>
>+
>Karl L. Wuensch, Department of Psychology,
>East Carolina University, Greenville NC  27858-4353
>Voice:  252-328-4102 Fax:  252-328-6283
>[EMAIL PROTECTED]
>http://core.ecu.edu/psyc/wuenschk/klw.htm
>
>
>

_
dennis roberts, educational psychology, penn state university
208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED]
http://roberts.ed.psu.edu/users/droberts/drober~1.htm






doxplots

2001-08-30 Thread Dennis Roberts

speaking of combining info from a dotplot and a boxplot ... which i want to 
dub the ... DOXPLOT ... minitab does have a macro file ... called %describe 
... that shows the histogram of a distribution and below it, the boxplot ... 
one example is at

http://roberts.ed.psu.edu/users/droberts/introstat/desc.png

of course, there really is no importance to the BOX part of the boxplot 
... and if one could indicate along the baseline of the histogram ... or 
dotplot ... the same points indicated on the boxplot ... Q1, median, Q3 ... 
it seems like that would do the trick ...

SO, AGAIN, DOES ANYONE KNOW OF A REGULAR GRAPH ROUTINE ... IN ANY PACKAGE 
... THAT DOES JUST THAT?? IN one GRAPH ... show both the frequency 
distribution ... and, the summary Q points along the baseline?

as for flagging extreme values ... which the boxplot above shows an example 
of ... it becomes rather visually obvious ... in looking at the histogram 
graph ... independent OF the dot in the boxplot ... out past the upper whisker

doxplots would kill two birds with one graphic stone
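for what it's worth ... the doxplot idea can be sketched in a few lines of plain python ... the layout and names below are mine and purely illustrative (no package calls this a doxplot):

```python
# A text-mode sketch of the "doxplot" idea: a dotplot with the
# quartile points (Q1, median, Q3) flagged along the baseline.
# "doxplot" is a coinage from this thread; nothing standard assumed.
import random
import statistics

random.seed(1)
data = [random.gauss(100, 10) for _ in range(200)]

q1, med, q3 = statistics.quantiles(data, n=4)  # returns Q1, median, Q3

lo, hi, width = min(data), max(data), 60

def col(v):
    # map a data value to a character column on the baseline
    return round((v - lo) / (hi - lo) * (width - 1))

# stack the dots column by column, tallest stack printed first
counts = [0] * width
for v in data:
    counts[col(v)] += 1
for level in range(max(counts), 0, -1):
    print("".join("." if c >= level else " " for c in counts))

# baseline with Q1 / median / Q3 marked
base = [" "] * width
for q in (q1, med, q3):
    base[col(q)] = "|"
print("".join(base))
print(f"Q1={q1:.1f}  median={med:.1f}  Q3={q3:.1f}")
```

no box needed ... the frequency shape and the summary points sit in one display.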

_
dennis roberts, educational psychology, penn state university
208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED]
http://roberts.ed.psu.edu/users/droberts/drober~1.htm






Re: Boston Globe: MCAS results show weakness in teens' grasp of

2001-08-30 Thread Dennis Roberts

all of this is assuming of course, that some extreme value ... by ANY 
definition ... is "bad" in some way ... that is, worthy of special 
attention for fear that it got there by some nefarious method

i am not sure the flagging of extreme values has any particular value ... 
certainly, to flag and look at these ... makes no more sense to me than 
examining all the data points ... to make sure that all seem legitimate ... 
and accounted for ...

actually, the more i think about boxplots ... since one can't get any 
notion of frequency PER score value ... i like the notion of just drawing 
the whisker to the endpoints ... and be done with it ...

boxplots give you limited data in any case ... why make them work more than 
they are worth?

what would be better would be to have a dotplot formation on top of the 
boxplot ... or, a dotplot with ... Q1, Q2, and Q3 ... indicated by some 
notch technique ...

hmmm ... anyone know of a package that does this kind of a diagram???

doxplot

At 09:45 AM 8/30/01 -0300, Robert J. MacG. Dawson wrote:


>I wrote:
>
> > Er, no.
> >
> > Q1 ~ mu - 2/3 sigma
> > Q3 ~ mu + 2/3 sigma
> > 1 IQR ~ 4/3 sigma
> > 1.5 IQR ~ 2 sigma
> >
> > inner fence ~ mu +- 2 2/3 sigma which is about the 0.5 percentile.
>
> -right so far -
>
>and then burbled
>
> > The inner fences are selected to give a false positive rate of about 1
> > in 1000.
> >
> > I suppose that if we take into account the Unwritten Rule of 
> Antique
> > Statistics that all data sets have 30 elements, this *does* give
> > a "p-value" of (1-e)*30*0.001 = 5% 
>
>which is obviously wrong. The false positive rate is about 1 in 100
>and my fanciful 5% calculation is unsalvageable.
>
> -Robert Dawson
>
>
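dawson's normal-theory arithmetic above is easy to verify numerically ... a sketch using python's stdlib NormalDist (my own illustrative code, not from the thread):

```python
# Checking the normal-theory values behind the 1.5*IQR inner fences
# (a sketch; stdlib only).
from statistics import NormalDist

z = NormalDist()          # standard normal
q3 = z.inv_cdf(0.75)      # ~ 0.674 sigma, so Q1 ~ -0.674 sigma
iqr = 2 * q3              # ~ 1.349 sigma
fence = q3 + 1.5 * iqr    # inner fence ~ 2.698 sigma above the mean

# two-sided probability that a normal observation lands past a fence
p = 2 * (1 - z.cdf(fence))
print(f"Q3 = {q3:.4f} sigma, IQR = {iqr:.4f} sigma, fence = {fence:.4f} sigma")
print(f"false positive rate per observation = {p:.4%}")
```

the per-observation rate comes out near 0.7% ... i.e., roughly 1 in 100, as the correction says, not 1 in 1000.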

_
dennis roberts, educational psychology, penn state university
208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED]
http://roberts.ed.psu.edu/users/droberts/drober~1.htm






Re: Boston Globe: MCAS results show weakness in teens' grasp of

2001-08-28 Thread dennis roberts

At 11:30 PM 8/28/01 +, Jim Callahan wrote:
>Eric Bohlman wrote:
>
> >And furthermore, not all the wrong answers are equally "bad."  Someone who
> >would answer A or B must know quite a bit less than someone who would
> >answer C (in fact, it would tend to indicate that they had no concept at
> >all of what the boxplot represented).
>
>I don't believe that making all distractors "equally bad" is a test writing
>criterion.  In preparing multiple choice distractors the goal is to create
>distractors that are plausible.  For example in a quantitative question, the
>distractors might be the results obtained by dividing by n rather than (n - 1)
>or selecting the incorrect number of degrees of freedom.  The question writer
>is usually expected to be able to explain why the distractors were chosen.

which, clearly, this item writer ... or (i would find it hard to 
believe) an item review committee??? ... could not do here ...









>Jim
>
>Stamp out fuzzy thinking.
>
>

==
dennis roberts, penn state university
educational psychology, 8148632401
http://roberts.ed.psu.edu/users/droberts/drober~1.htm






Re: Boston Globe: MCAS results show weakness in teens' grasp of

2001-08-28 Thread dennis roberts

At 10:43 PM 8/28/01 +, EugeneGall wrote:
>I got an email from Anand Vaishnav, the Globe reporter who did Friday's 
>article
>on the math and stats problems in MCAS.  Only about 50% of the 63000 10th
>graders in MA got median and range.  I suspect that mean and range 
>probably was
>the most popular incorrect answer (according to the MCAS graders)


i would say then ... according to my hypothesis that options A and B ... 
mean only and median only ... got even LESS than chance % ... even if total 
guessing were going on ... since, given that the diagram shows a variety of 
information ... most should know (even if they really don't know 
much/anything about boxplots) that choices like ONLY ... would be rarely 
correct ...

thus ... if 50% selected range and median ... i bet that range and mean had 
a % very close to that

anyone selecting A or B ... i bet it was more a clerical answer-sheet 
coding error ... than a choice that someone really picked




======
dennis roberts, penn state university
educational psychology, 8148632401
http://roberts.ed.psu.edu/users/droberts/drober~1.htm






Re: Boston Globe: MCAS results show weakness in teens' grasp of

2001-08-28 Thread Dennis Roberts

At 01:33 PM 8/28/01 -0500, Jay Warner wrote:
>Suggest we step back a minute.

by de facto definition ... the MCAS tests ... are intended to convey ... 
MINIMUM skills/knowledge that they expect all high school GRADUATES to have 
... they certainly cannot purport to test and/or represent anything higher 
than that

actually, even LESS than that ... because, there is some 'cut' score ... 
and, we all know that (given a 40 item test, say) if the cut 
score were 30 ... there are an enormous number of ways one could get 
thirty correct (40 choose 30 ... over 800 million ... even by knowledge, 
forgetting totally about guessing) ... in fact, they could have missed 
each and every one of the 6 stat items

thus ... i kind of think it is pretty hard to argue that any tests like 
these ... are measuring or can be assuming to measure ... much in the way 
of higher level skills

if we take the infamous #39 item ... where the options were (if i recall)...

A. mean only
B. median only
C. range and mean
D. range and median

well, even if we accepted this item as "fair" ...

a student looks at the graph ... sees that there is a bunch of stuff in the 
boxplot ... and then sees A and B ... without too much thought ... they 
could say, it can't be mean or median ONLY ... there must be more you can 
do with that boxplot than that ... so, without knowing anything of 
consequence about boxplots ... you are down to a two choice item

thus, for even the uninformed ... we have essentially  a T and F item

but, if you did happen to KNOW that the boxplot uses the median ... then 
the fact that the RANGE is part of the C and D options ... means the term 
RANGE is irrelevant to the item ... given that A and B are just too 
restrictive (because of the use of the term ONLY) and C and D have RANGE as 
a common element ...  so, if you did pick D ... then it has to be because 
you knew that isolated "fact" ... which certainly is about as low level as 
you can get
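the elimination argument above is easy to quantify ... a hypothetical simulation (python, my own illustrative code ... the keyed answer D is assumed from the item as recalled):

```python
# Illustrating the point above: an examinee who can eliminate the two
# "only" options (A, B) but otherwise guesses sits at 50% on this item,
# versus 25% for blind guessing.  Hypothetical simulation, not MCAS data.
import random

random.seed(0)
TRIALS = 100_000
KEY = "D"  # assumed keyed answer, per the item as recalled in the thread

blind = sum(random.choice("ABCD") == KEY for _ in range(TRIALS)) / TRIALS
informed = sum(random.choice("CD") == KEY for _ in range(TRIALS)) / TRIALS

print(f"blind guessing:        {blind:.3f}")
print(f"after eliminating A,B: {informed:.3f}")
```

in other words ... the item behaves like a true-false question for anyone who merely notices the word ONLY.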




_________
dennis roberts, educational psychology, penn state university
208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED]
http://roberts.ed.psu.edu/users/droberts/drober~1.htm






RE: Boston Globe: MCAS results show weakness in teens' grasp of

2001-08-28 Thread Dennis Roberts

At 02:21 PM 8/28/01 +, NoSpam54 wrote:

>   If there were an AP stats course, they would probably be using a
>college-level text that would be using a true Tukey boxplot, not the
>Harcourt-Brace/NCTS boxplot.  I don't think it fair to students that
>the NCTS and the K-12 textbook writers (and the MCAS test writers) have 
>adopted
>one feature of the Tukey boxplot, but not the most important feature: the
>ability to flag outliers.
>Eugene Gallagher

however ... the "flagging" of "outliers" is totally arbitrary ... i see no 
rationale for saying that if a data point is more than 1.5 IQRs beyond a 
quartile ... there is something significant about that

there may be ... there may not be

just as ANY value in the entire data set could be flawed ...

one could have selected 1.75 IQRs or 1.25 IQRs or some function of a 
standard deviation ...  or any other value

so, whether we think that the tukey rule for indication is good or not ... 
it really  boils down to if there should be ANY indication of values ... 
ie, highlighted to the boxplot observer ... that are extreme by any 
definition ... or not

overall, perhaps that is a good idea ... but only as long as the boxplot 
user clearly knows that this does NOT mean 'significant' or 'important' or 
some other verbiage implying that outliers are BAD deviants in some sense 
of the word

as for the item #39 on the MCAS test ... i really don't care much if the 
boxplot is technically drawn correctly or not ... again, THE important 
matter is that the "thing" being assessed by the item is 
essentially  irrelevant to general statistical knowledge, especially at the 
level of a 10th grade student who is learning something about statistics in 
the context of their mathematics work ... and again, there is the matter 
that the item itself is very poorly constructed ...








_
dennis roberts, educational psychology, penn state university
208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED]
http://roberts.ed.psu.edu/users/droberts/drober~1.htm






Re: Boston Globe: MCAS results show weakness in teens' grasp of

2001-08-28 Thread Dennis Roberts

At 11:13 AM 8/28/01 -0300, Robert J. MacG. Dawson wrote:


> If indeed the scores are being reduced by hiding the easy questions
>among the harder ones, then I would say yes, this is a defect of the
>current system, and should be changed. It may be that the questions
>themselves ought to be more difficult; but the difficulty ought to be
>intrinsic to the questions, not an artifact of the test format. What is
>at issue here is essentially signal-to-noise ratio.
>
> -Robert Dawson

however, we have to consider a countering factor too ... content

many tests sort of group their items in clusters ... that is ... items 
about stat together ... general math together ... etc. and, there is some 
usefulness to this FOR the examinees ... rather than mixing everything up

in that format ... difficulty probably spirals ... that is, within a 
cluster ... you will have easier and harder items ... next cluster ... same 
thing

so, in this case ... content is more or less constant (across n items) but, 
difficulty varies

however, when we order items OVERALL from easy to hard (assuming you have 
good and sufficient data to be ABLE to do this) ... the content keeps 
mixing up ... so, you have the back and forth phenomenon of having to 
constantly switch gears ... which could defeat the purpose of ordering by 
difficulty

here, difficulty is relatively constant across groups of n items ... but 
content varies

much of this depends on the TIME you have to work on the test ... if the 
time limit is generous ... so everyone has sufficient time ... these 
factors play much less (if any) of a role ... but, when the time is tight 
... then anything we can do to get examinees through all the items ... to 
at least have a look and see if they have any idea about how to answer them 
... the better

the main thing we have to guard against is ... having examinees start the 
test with a couple of lulu items ... and then think ... well, if 
these  first couple are doozies ... the rest of the items must be IMpossible!

the problem we face is helping examinees adopt good test taking strategies 
... but, we know that once we let em loose on the test ... we have no 
control over how they move through the test ... we would like to believe 
that our instruction to "not spend too much time on any one item" will be 
heeded but, we know that it won't (some examinees will get "stuck" on an 
item and NOT move on) ... so, if we can ARRANGE items in a way to help 
optimize their performance ... we are getting better estimates of what they 
know ...

and that should be our main goal


_________
dennis roberts, educational psychology, penn state university
208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED]
http://roberts.ed.psu.edu/users/droberts/drober~1.htm






Re: Excel for simulating normal distribution

2001-08-28 Thread Dennis Roberts

plus ... many good REAL stat packages do this so easily

MTB > rand 5000 c1;
SUBC> norm 100 10.  <<<<< mean and sigma
MTB > dotp c1

Dotplot: C1


[character dotplot omitted: a roughly bell-shaped pile of points centered 
near 100, spanning about 60 to 135 on the C1 axis]

MTB > desc c1

Descriptive Statistics: C1


Variable        N     Mean   Median   TrMean    StDev  SE Mean
C1           5000   100.09   100.17   100.12     9.91     0.14

Variable   Minimum   Maximum       Q1       Q3
C1           64.00    135.34    93.37   106.77
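the same exercise can be done in plain python (stdlib only ... a sketch, not a substitute for a real stat package):

```python
# The same exercise as the Minitab session above, sketched in plain
# Python: draw 5000 values from N(100, 10) and summarize them.
import random
import statistics

random.seed(42)
sample = [random.gauss(mu=100, sigma=10) for _ in range(5000)]

mean = statistics.fmean(sample)
median = statistics.median(sample)
stdev = statistics.stdev(sample)
q1, _, q3 = statistics.quantiles(sample, n=4)

print(f"n={len(sample)}  mean={mean:.2f}  median={median:.2f}  "
      f"stdev={stdev:.2f}  Q1={q1:.2f}  Q3={q3:.2f}")
```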


At 11:55 AM 8/28/01 +, DELOMBA wrote:
>Please do NOT rely on Excel's random number generator. It is an old one and is
>not powerful at all (1 000 000 numbers). Efficient routines can be found
>at CERN.
>
>Y.

_________
dennis roberts, educational psychology, penn state university
208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED]
http://roberts.ed.psu.edu/users/droberts/drober~1.htm






Re: Boston Globe: MCAS results show weakness in teens' grasp of

2001-08-28 Thread Dennis Roberts

At 09:23 AM 8/28/01 -0300, Robert J. MacG. Dawson wrote:
>I wrote:
> >
>
> > >  An obvious approach that would seem to give the advantages hoped for
> > >from the focussed test without the disadvantages would be just to group
> > >questions in the original test in roughly increasing order of
> > >difficulty.  

of course, research eons ago has shown that test performance is optimized 
... by having items in the order of easy to difficult ... IF there is a 
time limit where some examinees have to push to get finished

now that's a thought ... maybe if the items WERE ordered that way ... some 
of that large % that seem to be failing ... would gain an item or two in 
their score and pass!!! what a simple thing to do to make the students in 
mass. look better! and mass. education!

> -Robert Dawson

_________
dennis roberts, educational psychology, penn state university
208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED]
http://roberts.ed.psu.edu/users/droberts/drober~1.htm






Re: Regression to the mean,Barry Bonds & HRs

2001-08-28 Thread Dennis Roberts

SO, when bonds hits 73 ... what will people say vis a vis regression to the 
mean?

At 11:40 PM 8/27/01 -0400, Stan Brown wrote:
>Rich Ulrich <[EMAIL PROTECTED]> wrote in sci.stat.edu:
> >This was a topic a month ago.  Just to bring things up to date

_________
dennis roberts, educational psychology, penn state university
208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED]
http://roberts.ed.psu.edu/users/droberts/drober~1.htm






Re: Boston Globe: MCAS results show weakness in teens' grasp of

2001-08-27 Thread Dennis Roberts

At 01:57 PM 8/27/01 -0300, Robert J. MacG. Dawson wrote:

> The focussed test isn't an entirely bad idea; it does allow a 
> genuine D
>student to avoid getting blown out of the water by questions intended to
>discriminate between A and B students. However, it seems like a very
>poor way to carry the idea out.
>
>
> An obvious approach that would seem to give the advantages hoped for
>from the focussed test without the disadvantages would be just to group
>questions in the original test in roughly increasing order of
>difficulty. One might (I'm not so sure that this would be a good idea)
>put between each group a rubric along the lines of
>
>===
> PROGRESS CHECKPOINT 1
> If you have got the right answers to 15 of the preceding 20 questions
>you have already passed (50%).
> If you have got the right answers to 18 of the preceding 20 questions
>you have already got at least a C- (60%).
> The questions below are mostly more advanced. Right answers to any of
>them will raise your numerical mark further and may raise your letter
>grade.
> KEEP ON GOING!
>===

yeah but ... they could do that from day 1 ... in a CAT format ... the 
first time 10th graders take the test ... why wait till retake 3 or retake 4?

why waste 10th graders' time the first go round ...

all of this skirts the larger problem though ... that this item ... and 
maybe more than one ... is a terribly bad item to be putting on high 
stakes tests like these ... even if they have 100 retakes

someone in mass. ... or the company that mass. contracts with to do the 
tests ... is NOT doing their homework ... PRIOR to having these tests 
finalized ... printed ... and administered



> Alternatively one could use icons - say Happy Faces - to identify 
> a set
>of recommended easier questions that nobody should quit without
>attempting, if there was a reason to use the order to encode something
>else.

or ... sad faces that mean ... unless you are really smart ... DON'T TRY ME!


> -Robert Dawson
>
>

_
dennis roberts, educational psychology, penn state university
208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED]
http://roberts.ed.psu.edu/users/droberts/drober~1.htm






Re: Boston Globe: MCAS results show weakness in teens' grasp

2001-08-27 Thread Dennis Roberts

At 12:31 PM 8/27/01 +, EugeneGall wrote:
>  The Harcourt-Brace description of the boxplot, which is now being taught
>to MA students, isn't a proper boxplot (maybe the Harcourt-Brace K-12 boxplot
>is different from the Tukey boxplot), because it doesn't properly plot 
>outliers
>and extreme outliers.


of course, boxplots do NOT have to show outliers if there are none, 
correct? so, what you are saying is that in THIS case ... since you have 
done more digging into the boxplot put on item 39 ... and have concluded 
that it is not properly designated ... that it is wrong

now, i made a data set ... here is the old time boxplot ... that you can 
get from minitab ... which is perfectly legit ...

Boxplot


[character boxplot omitted: box with + marking the median, whiskers 
extending to the data endpoints with no outlier symbols; C1 scale from 
20.0 to 30.0]

there is no designation for outliers since, by definition in this data set 
... there are none

in the better graphic version minitab has ... all you see are single lines 
extending from the hinges ... with NO other symbols (dots, etc.)

thus, in the item 39 example ... while i assume that the dots at both ends 
are just meant to show where the data stop ... this is not how (i dare 
say?) ANY software would show it ...

however, if the boxplot were done by hand ... you might see the texts that 
they are using depict it this way ... draw a box ... extend lines ... put 
dots at the end to signify more clearly ... the ends ... and the notion of 
"outliers" MAY NOT enter into the discussions carried on in mass. class 
curricula ...

in any case ... we have a messy item ... by ANY criterion used
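the two whisker conventions at issue ... tukey's 1.5 IQR fences with flagged outliers, versus whiskers drawn straight to the endpoints ... differ only in where the whisker stops ... a small stdlib python sketch (my own illustrative data and code):

```python
# The two whisker conventions: Tukey-style whiskers stop at the most
# extreme points inside the 1.5*IQR fences and flag the rest as
# outliers; the "plain" style just runs to min and max.  Illustrative only.
import statistics

data = [20, 21, 22, 22, 23, 24, 24, 25, 26, 27, 28, 30, 45]

q1, med, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1
lo_fence, hi_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr

inside = [x for x in data if lo_fence <= x <= hi_fence]
outliers = [x for x in data if x < lo_fence or x > hi_fence]

print("plain whiskers:", min(data), "to", max(data))
print("tukey whiskers:", min(inside), "to", max(inside), "outliers:", outliers)
```

with no point past the fences, the two drawings coincide ... which is exactly why an item hinging on the end symbols is shaky.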



_________
dennis roberts, educational psychology, penn state university
208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED]
http://roberts.ed.psu.edu/users/droberts/drober~1.htm






Re: Boston Globe: MCAS results show weakness in teens' grasp of

2001-08-27 Thread Dennis Roberts

eugene ... first of all ... how come your email ID always gets bounced back???

now, certainly, you are not saying, are you, that if they "fail" this test 
the first time they take it in the 10th grade ... when it is given ... they 
cannot graduate??? if that were the case ... these %ages below 
would pack up and quit attending school

so, there must be more than one time they can take the test ... right? if 
so ... how many times? this doesn't make it any less of a high stakes test 
but, to be fair, it is not all or nothing ONCE ... without any other chance 
of taking the test again

and, if there are multiple opportunities for passing the test ... what MUST 
the items be on the alternate forms if they are just as "interesting" 
as the couple that are here being discussed ... this really is an amazing 
assessment strategy!



>As I've mentioned in other posts, this 10th grade MCAS test is a make-or-break
>high stakes test.  A straight-A average won't get you a diploma if you fail
>this test, and the majority of 10th graders in MA fail this test, as it is now
>being scored.  80% of hispanic students, 76% of black students, and 36% of
>white students in MA failed the 2000 10th grade math test.  This despite the
>fact that MA has among the highest NAEP math scores in the nation.
>
>

_
dennis roberts, educational psychology, penn state university
208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED]
http://roberts.ed.psu.edu/users/droberts/drober~1.htm






RE: Boston Globe: MCAS results show weakness in teens' grasp

2001-08-26 Thread dennis roberts

Which statistics—mean, median, mode,
range—can be determined from this
graph?
A. mean only
B. median only
C. range and mean
D. range and median

here is the item ... without the boxplot ... now, the lower point
is roughly 15 ... the spot above the | is about 26 ... and the upper end
point is about .. 41

of course, though the stem mentions the mode ... it is NOT an option in
the choices ... so, what the heck is it doing in the stem??? 

the reality is ... NONE of these things (the mode certainly) can be known
for sure from the boxplot ... even if we know that the | is where the
median is shown ... because, all you can do is to "approximate"
the values ... 

so, as it stands ... there is no correct answer to this bad question ...
given that the stem says "determined"

without the actual data ... and knowing the actual lower and upper
scores, you can't determine the range ... without knowing the data ...
you can't know the middle score

in this item ... it IS possible to "approximate" the range, to
"approximate" the median, and believe it or not ... to
"approximate" the mean ... since there is some skew indicated
by the box part ... and, given that AND an approximation for the median
... one could thus "approximate" the mean [of course, the stem
does not say the "approximate" values ... it wants you to know
them for SURE]

what you CAN'T do is know anything about the mode ... of course, that is
not really important here since (though in the stem) it is not in the
answer possibilities

thus ... for this item you are either stuck with:

1. there are two answers that should be keyed as correct ... C and D ...
allowing that "approximate" is = to "determined"


 or

2. there really is NO correct answer

therefore: even if we accepted that this were an important item .. the
item itself is seriously flawed ... 

toss the sucker!!! 




==
dennis roberts, penn state university
educational psychology, 8148632401
http://roberts.ed.psu.edu/users/droberts/drober~1.htm





RE: Boston Globe: MCAS results show weakness in teens' grasp of

2001-08-26 Thread dennis roberts

the first hurdle an item has to pass is ... a content one ... is the 
content that we are asking about ... in the item ... sensible ... important 
enough ... to spend 1 of the 6 items worth on the test ... given all the 
concepts that could be tested on the test

if the answer to this is yes ... then we can move forward and think about 
how difficult should the item be

only after this crucial first screening ...

if the answer is no ... then, it matters not if the item happens to be 
difficult or easy ... it should not be used

i am going to go out on a limb here and say ... that, no matter what is 
done in mass. ... in the statistics part of the curriculum ... this 
particular item fails the first screening test

now, if it were part of a CLASSROOM test in a statistics course ... maybe 
(???)  it might be ok ... but certainly not on a statewide test ... where 
scores have more import to the examinees

the notion that the little doohickey in the box part of a boxplot is the 
median .. and not the mean ... just does not rise to a sufficient level of 
importance ...

At 03:57 PM 8/26/01 -0500, Olsen, Chris wrote:
>Hello Dennis and All --
>
>   Please pardon the formatting of my response here -- apparently I cannot
>choose a different font, so I will bracket my comments by "-->" and "<---."

==========
dennis roberts, penn state university
educational psychology, 8148632401
http://roberts.ed.psu.edu/users/droberts/drober~1.htm






RE: Boston Globe: MCAS results show weakness in teens' grasp of

2001-08-26 Thread dennis roberts

At 10:40 AM 8/26/01 -0500, Olsen, Chris wrote:
>Dear Sir or Madam:
>
>   I read with interest your posting on the issue of the so-called Tukey
>boxplot.  I would like to make a few observations, if you will forgive the
>temerity of a high school teacher.

since i was not the person posting the original item on this matter ... i do 
not know in fact what the real purpose was for this particular item and, i 
do not know what the mass. objectives are ... and the material presented in 
typical classes ... that then finds its way on to the assessment test

i would say however ... that IF the test included only 6 items related to 
statistics ... out of the larger test ... then the issue of whether in a 
boxplot ... the vertical bar ... or .. in old minitab a + ...

Boxplot


[character boxplot omitted: box with + marking the center and whiskers 
extending to either side; C1 scale from -1.80 to 1.20]

  is the mean or median ... is trivial no matter what state the test is for ...

IF indeed there is an item on the test related to a boxplot ... then it 
should be about interpreting data using the boxplot ... not about some 
ditzy little tidbit that the line in the center part ... or the + sign 
above ... is the mean or the median ...

by the way ... i think the original item used vertical bar or line ... in 
the stem ... but, if one were using the graphic display above ... which is 
perfectly legit ... then there is no vertical line or bar ... what would an 
examinee do in that case???








Re: Boston Globe: MCAS results show weakness in teens' grasp

2001-08-25 Thread dennis roberts

At 07:35 PM 8/25/01 +, EugeneGall wrote:

>  The answer hinges only on a trivial choice made by Tukey
>when he described the boxplot.  Incidentally, Tufte criticized the lack of
>information in the box width in the Tukey boxplot and proposed an alternative.

i don't think this is relevant to the issue you raise ... there are lots of 
trivial things in statistics ... and every other discipline too ...


>The 10th grade math test, which will be a graduation requirement in
>Massachusetts, contained only 40 questions.  Six of those questions could be
>considered to deal with probability or statistics (question 9, 26, 36, 37, 39
>and 40).  This one boxplot question (question 39)  constitutes 1/6th of the
>score in the area of statistics and probability.

now ... this is the crux of the problem ... given a highly limited sample 
... then items should really be important ... after all ... if you consider 
this to be a 6 item test about stat ... an item like this one that either 
adds to or subtracts from your "out of 6" score ... certainly adds 
precious little information about the person and their level of 
understanding of statistics ... IF IT HAD BEEN ME ... i would have found a 
better and more important question on which to spend that 1/6th of the weight

>  I don't think it is a fair
>question, even if box and whisker plots are listed on the Dept of Education's
>document of what MUST be taught in a K-12 curriculum.


of course, we do this on typical classroom tests too ... i bet that a good 
analysis by an objective reviewer would turn up plenty of trivial 
items ... but usually, the high stakes nature of a classroom test is 
not nearly as great




==
dennis roberts, penn state university
educational psychology, 8148632401
http://roberts.ed.psu.edu/users/droberts/drober~1.htm






Re: Boston Globe: MCAS results show weakness in teens' grasp of

2001-08-25 Thread dennis roberts

conceptually yes ... since some items are clearly more important than 
others ...

empirically, no ... this has been explored many times ... back in the 70s 
and 80s ... and differential weighting seems to have no impact on 
reliability and validity ...

who decides? well, one way is to say ... experts ... have them rate items 
in terms of importance

but, it takes lots of time and ... where do we get the experts from?

obviously, the notion of unit weighting does not make "importance" sense 
but, it is easy ... that's why we do it
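the contrast between unit weighting and importance weighting can be made concrete with a tiny scoring sketch (all numbers here are invented for illustration):

```python
# Hypothetical 6-item test: 1 = correct, 0 = wrong.
responses = [1, 0, 1, 1, 0, 1]

# Invented expert "importance" ratings for each item
# (unit weighting would set all of these to 1).
weights = [3, 1, 2, 3, 1, 2]

unit_score = sum(responses)  # every item counts equally: 4 of 6
weighted_score = sum(r * w for r, w in zip(responses, weights))
weighted_pct = weighted_score / sum(weights)  # share of available weight earned
```

with unit weights the examinee gets 4 of 6 ... with these importance weights, 10 of 12 ... the two schemes can rank examinees differently, which is the whole argument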



At 01:33 PM 8/25/01 -0400, [EMAIL PROTECTED] wrote:
>In a message dated 8/25/01 9:06:18 AM Pacific Daylight Time, [EMAIL PROTECTED]
>writes:
>
><< whether the item you talk about rises to the level of being important
>  enough ... i am not sure ... certainly, in the overall scheme of things ...
>  IF it is included ... it would have to be considered to be of trivial value
>  ... and if someone misses it ... they should not be docked as much as many
>  other more fundamentally important items ... that hopefully WILL be put on
>  the test >>
>
>
>Dennis:
>
>Are you saying that when we make up and give a test we should weight the
>point score for each question?  Exactly who will decide on the importance of
>each item?
>
>Just a question as I sit here preparing the final exam for my students.
>
>Dr. Robert C. Knodt
>4949 Samish Way, #31
>Bellingham, WA 98226
>[EMAIL PROTECTED]
>
>"The man who does not read good books has no advantage over the man who can't
>read them." - Mark Twain

==
dennis roberts, penn state university
educational psychology, 8148632401
http://roberts.ed.psu.edu/users/droberts/drober~1.htm






Re: Boston Globe: MCAS results show weakness in teens' grasp of

2001-08-25 Thread dennis roberts

the only purpose it can serve ... and i am not saying this is important ... 
is to know that the median is part of a boxplot ... just like you might 
want them to identify the box edges ... the hinges ... or the | marks at the 
ends of the whiskers ... just as you might want them to know that each 
whisker covers about 25% of the data ...

now, what might be important from my view:

1. be able to differentiate it from, say, a histogram ... or other graphic 
display
2. that the distance from the end of one whisker to the end of the other ... 
represents the range of the data
3. that the bar in the middle indicates something about "average"
4. that a short whisker at one end and a long whisker at the other end ... 
tells you the shape is not symmetrical
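the same points can be seen numerically ... a small sketch (sample data invented) computing the quantities a Tukey boxplot draws:

```python
import statistics

# Invented right-skewed sample
data = [2, 3, 3, 4, 4, 5, 5, 6, 7, 9, 12, 18]

# hinges (box ends) and the median bar in the middle
q1, median, q3 = statistics.quantiles(data, n=4)

data_range = max(data) - min(data)            # whisker tip to whisker tip (no outlier rule here)
right_skewed = (q3 - median) > (median - q1)  # longer upper box half -> asymmetry
```

for this sample the median bar sits at 5 while the upper half of the box is much longer than the lower half ... exactly the asymmetry signal in point 4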

bottom line:

there are thousands of items one can create and ask ...
lots of these are totally unimportant
each item that goes on a test that has some import ... should be seen as 
being an important fact/idea to be testing

whether the item you talk about rises to the level of being important 
enough ... i am not sure ... certainly, in the overall scheme of things ... 
IF it is included ... it would have to be considered to be of trivial value 
... and if someone misses it ... they should not be docked as much as many 
other more fundamentally important items ... that hopefully WILL be put on 
the test



At 02:17 PM 8/25/01 +, EugeneGall wrote:
>Some of the MCAS stats and probability questions were tough but fair.  I
>disagreed vehemently with one question:
>Question 39 on the 10th grade Math 2001 test.
>It showed a Tukey boxplot and asked whether the graph represented a mean and
>range or a median and range.
>   Now, this question will do one thing.  It will separate the rich school
>districts that have purchased the latest texts which include a description of
>the boxplot from older texts (listed on the MCAS curriculum guideline)
>   As a college prof, I can state that none of my texts circa 1976-1980 
> included
>descriptions of the boxplot.  Many still do not (e.g., Larsen & Marx Intro to
>Math stats 2001)
>   I think Tukey could just as well have chosen the mean for his display 
> method.
>  In scientific talks, even today, one must tell the audience that the bar
>indicates the median not the mean.
>   So if the problem isn't in the texts, probably wasn't taught to the 
> teachers,
>and never appears in the popular press and is even rare in the scientific
>literature, why put it as one of the handful of test items?!!!
>   What purpose does a question like this serve
>
>http://www.doe.mass.edu/mcas/01release/
>
>

==
dennis roberts, penn state university
educational psychology, 8148632401
http://roberts.ed.psu.edu/users/droberts/drober~1.htm






Re: definition of " metric" as a noun

2001-08-23 Thread Dennis Roberts

i think some of the posts about what "metric" as a noun means are going a 
bit beyond reality ... reading in nuances that just don't have to be there

sure, there is the metric system ... where metric is an adjective modifying 
system ... but, the KEY term there is system

which does not have to be defined as metric ... or even anything related to 
distance

my old dictionary gives one definition of metric as: of, involving, or 
used in measurement

another version says: in mathematics, the theory of measurement

so, it seems to me that metric, as a noun, SIMPLY means that there is SOME 
system for measuring something ... could be the way we represent volumes, or 
lengths, or pressures, or the hardness of rocks ... just something 
systematic applied to measuring some object or phenomenon

the idea that it has to meet some highly rigid set of constraints seems rather 
irrelevant to me






_________
dennis roberts, educational psychology, penn state university
208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED]
http://roberts.ed.psu.edu/users/droberts/drober~1.htm






Re: SD is Useful to Normal Distribution Only ?

2001-08-21 Thread Dennis Roberts

At 06:16 AM 8/21/01 -0700, RFerreira wrote:
>The formula wich gives the Standard Deviation ,
>SD=((x-mean)^2/(n-1))^0.5 ,can be applied to Any data set. When we
>have that value we know two things about the set: The Mean and the SD.
>With this two values We can have one powerful intuitive use to them:
>The "centre" of the set is the mean and 68% of values are in the
>interval [mean-SD to mean+SD], IF the set have Normal Distribution. If
>we forecast the set distribution is Not Normal What intuitive use have
>the values?

well, maybe the 68% figure is no longer relevant but, remember, the 
SD is not just that ... it is also a relative measure of spread ... so, even for two 
distributions that are not normal ... if one has an SD of 6 and the other 
has an SD of 15 ... the SD values still tell you something about the 
relative spread and variability of the scores
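one hedged way to make this concrete: for ANY distribution, Chebyshev's inequality guarantees at least 1 - 1/k^2 of the data lies within k SDs of the mean ... so the SD keeps an interpretation even when the 68% rule fails ... a quick simulation sketch:

```python
import random
import statistics

random.seed(0)
# A clearly non-normal (exponential) sample: the normal 68%/95% rules
# need not hold here, but the SD still measures spread.
data = [random.expovariate(1.0) for _ in range(10_000)]

mean = statistics.fmean(data)
sd = statistics.pstdev(data)

within_2sd = sum(abs(x - mean) <= 2 * sd for x in data) / len(data)
chebyshev_bound = 1 - 1 / 2 ** 2  # 0.75 ... holds for ANY distribution
```

for this exponential sample the observed fraction within 2 SDs comes out around 0.95 ... comfortably above the distribution-free 0.75 floor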


_________
dennis roberts, educational psychology, penn state university
208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED]
http://roberts.ed.psu.edu/users/droberts/drober~1.htm






groundbreak

2001-08-20 Thread Dennis Roberts

some might be interested in this ...

http://web.centredaily.com/content/centredaily/2001/08/18/news_local/18minitab.htm

_
dennis roberts, educational psychology, penn state university
208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED]
http://roberts.ed.psu.edu/users/droberts/drober~1.htm






Re: Venn diagram program?

2001-08-16 Thread dennis roberts

At 10:40 AM 8/17/01 +1000, Alan McLean wrote:
>You can draw Venn diagrams very easily in Powerpoint using the
>ellipse/circle and box/rectangle tools. Draw the diagram, group all the
>bits together, and copy it into Word or whatever.
>
>Whether it is 'publication quality' depends on your definition of htis
>term.
>
>Alan


actually, don's idea of overlapping squares seems to make the most sense 
... you can easily do this in one of the windows accessories ... paint 
... squares are much easier to control as far as overlap goes ...

nice thing here too is that you can "fill" the overlapping area with color, 
etc.

when you save it, i think it is a bmp file ... the quality is good




==========
dennis roberts, penn state university
educational psychology, 8148632401
http://roberts.ed.psu.edu/users/droberts/drober~1.htm






Re: Presenting results of categorical data?

2001-08-16 Thread Dennis Roberts

At 12:39 PM 8/16/01 +0100, Thom Baguley wrote:
>  For example, if a new drug is administered to a
>treatment group made up of serious cases and compared to a control
>group of mild cases obtaining more "cures" for the treatment group
>might be considered better evidence than a random sample.
>
>Thom

sorry ... i can't agree with this ...

it could be that in the "serious" cases ... there is an unidentified gene 
factor that INTERACTS with the treatment ... one that is not present in the 
"mild" cases group (that's why you have serious and mild cases) ... so, it 
is not the treatment that is doing this ... it is the presence or absence of 
the gene factor

in the above ... you are trying to identify ... IF there is an effect ... WHAT 
it is due to ... and the design tendered above will not do that






_
dennis roberts, educational psychology, penn state university
208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED]
http://roberts.ed.psu.edu/users/droberts/drober~1.htm






Re: Excel for simulating normal distribution

2001-07-29 Thread dennis roberts

again ... in something like minitab ...

MTB > rand 10000 c1;    the command is random ... say how many ... and where 
to put them
SUBC> norm 50 10.       the subcommand just tells minitab which distribution to 
sample from ... there are dozens

that's all there is to it ... THEN you can do what you want with the data

you can generate data from the unit normal distribution too ... 
that is the default

MTB > rand 10000 c1

that's it!

MTB > dotp c1

Dotplot: C1

[ascii dotplot of C1 omitted: roughly bell-shaped, centered near 50, each dot 
representing up to 68 points, axis running 15 to 90 in steps of 15]

MTB > desc c1

Descriptive Statistics: C1

Variable        N      Mean    Median    TrMean     StDev   SE Mean
C1          10000    50.123    50.242    50.122     9.932     0.099

Variable   Minimum   Maximum        Q1        Q3
C1          13.465    84.291    43.416    56.764

MTB >


At 11:05 PM 7/29/01 +, David Winsemius wrote:
>I quite agree with those who said it would be easier in a real stats
>package. However, if you want a start, and feel that Excel is familiar
>ground, here goes. The rand() function will generate random numbers from a
>uniform distribution on the interval [0,1]. You can convert that to a
>normally distributed set of numbers using the inverse normal function,
>=NORMSINV(.) .
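for completeness, the same simulation in plain Python (a sketch using only the standard library, rather than Minitab or Excel):

```python
import random
import statistics

random.seed(1)  # reproducible
# Draw 10,000 values from a normal distribution with mean 50, SD 10
sample = [random.gauss(50, 10) for _ in range(10_000)]

mean = statistics.fmean(sample)
sd = statistics.stdev(sample)
# mean should come out near 50 and sd near 10
```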

==========
dennis roberts, penn state university
educational psychology, 8148632401
http://roberts.ed.psu.edu/users/droberts/drober~1.htm






R values

2001-07-29 Thread dennis roberts

bet you think you know what these are ... do you?

http://community.webshots.com/photo/6073830/6073886GMgEftrGUI

==
dennis roberts, penn state university
educational psychology, 8148632401
http://roberts.ed.psu.edu/users/droberts/drober~1.htm






Re: confidence interval

2001-07-28 Thread dennis roberts

one way is:

1. convert sample r to Fisher's BIG Z (consult conversion table)
2. find standard error of Fisher's Z ... (find formula in good stat book)
3. for 95% CI ... go 1.96 standard error (from #2) units on either side of 
Z (from #1)
4. convert EACH end of the CI in Fisher Z units back to r values (use table 
from #1 in reverse)
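the table lookups in steps 1 and 4 are just the hyperbolic arctangent and tangent, so the whole recipe fits in a few lines (a sketch; assumes bivariate normality and n > 3; the function name is mine):

```python
import math

def pearson_r_ci(r, n, z_crit=1.96):
    """95% CI for a Pearson r via Fisher's z transform (steps 1-4 above)."""
    fisher_z = math.atanh(r)             # step 1: r -> Fisher's Z
    se = 1 / math.sqrt(n - 3)            # step 2: standard error of Fisher's Z
    lo = fisher_z - z_crit * se          # step 3: CI in Z units
    hi = fisher_z + z_crit * se
    return math.tanh(lo), math.tanh(hi)  # step 4: back to r units

# e.g. r = .50 from n = 103 pairs gives roughly (.34, .63)
```

note the interval is NOT symmetric around r ... a reminder of why the transform is needed in the first place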

At 05:28 AM 10/22/99 -0200, Alexandre Moura wrote:
>Dear members,
>
>how can I construct a confidence interval for a Pearson correlation?
>
>Thanks in advance.
>
>Alexandre Moura.
>
>
>

======
dennis roberts, penn state university
educational psychology, 8148632401
http://roberts.ed.psu.edu/users/droberts/drober~1.htm






Re: EXCEL

2001-07-27 Thread dennis roberts

At 03:41 PM 7/27/01 -0700, David Heiser wrote:
>Lets not knock EXCEL for statistics.
>
>Most of the responses are biased, because they don't have to pay the
>excessive cost of the software they are recommending. The EXCEL stat package
>comes with Microsoft OFFICE, so in many, many situations, the stat software
>is FREE. You can't beat that.

sure, but somebody IS paying for that ... it's a hidden fee called tuition 
... or a new computer fee ... or something ... microsoft is not exactly 
giving this away at no cost to users ...

now, let me take the CON side of EXCEL ... and sure, i admit i am biased 
... but, it is not because of the excessive cost of software ... decent 
stat packages are available to students ... either via student editions 
or other programs from the companies ... making an exceedingly better 
product obtainable for about the price of the text used in ONE accounting 
course ...

example: you can download minitab ... the full package ... free for 30 days 
... and rent it for another 6 months for 26 dollars ... now, you might 
say well ... THEN what? but, over those 7 months ... a student can easily see 
that a product designed specifically for statistical analysis ... is much 
better ... in all ways ... than this free (?) product ... that was never 
designed for statistical analysis in the first place

in addition ... there are better tools ONline ... FREE ... for doing all 
kinds of analyses ... than EXCEL

COST is a false issue!

in addition, the message this sends to students is bad ... and that is ... 
it is not important AS A PROFESSIONAL ... to attempt to get the best tools 
you can ... for your occupation ... personally i find this approach to be 
detrimental to the students' BEST long run interest

if a student from a business program were to land a job where statistical 
analysis was a major part of the work ... EXCEL will not cut it ... 
the potential employer will ask "... can you use SAS or SPSS ... or 
something comparable?" ... you canNOT say yes if all you have done is 
use EXCEL

i don't deny that EXCEL is popular ... and, it comes built in at many 
schools ... or computers ... and, access is given to faculty and students 
"free" ... but, this is still not a sufficient reason to say that it is 
good enough to pass off as a full featured ... time tested ... professional 
product ... THAT we should be encouraging our students to learn to use AND 
adopt as part of their professional set of tools ...

of course, this is just my opinion ...

i liken using EXCEL for statistical analysis to teaching "notepad" just 
because it is built into windows 98 etc. ... while notepad is easy to use 
... and works fine for rather informal written communication ... it is 
highly limited when it comes to professional word processing ... notepad is 
to good word processing capabilities as EXCEL is to good statistical 
processing capabilities

1. graphics in EXCEL are poor and highly limited
2. data management is primitive
3. the integration of related techniques is highly limited
4. working with many variables and doing interrelated analyses is very 
cumbersome
5. the number of analytical routines is sparse

and the list goes on




dennis roberts, penn state university
educational psychology, 8148632401
http://roberts.ed.psu.edu/users/droberts/drober~1.htm






Re: web page to help use normal table

2001-07-27 Thread Dennis Roberts

i visited this page ... here are a few comments

1. you have 9 scenarios on the page but, really ... you don't need that many

area above/below a point
area between or beyond 2 points

so ... i would reduce the number of basic graphs that you present

2. if you reduce the number of scenarios as in 1 above ... then you can put 
BOTH the graphs AND the dialogs that visitors fill in on the same page ...

let's say the problem is ... area above or below a point ...

you show one graph ... with a mythical point marked ... the visitor enters 
their OWN z in a box ... and you give back BOTH the area below AND the area 
above ... kill two birds with one stone

for area between or beyond two points ... same idea ... one graph ... they 
input 2 zs ... and you return the area BETWEEN and the area BEYOND ...

3. also ... you need a few more WORDS telling people what to do ... most 
will figure out that you ENTER some z into the input box ... but it is not 
obviously clear ... so TELL them how to use it ... can't hurt
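the two basic quantities in 1 above can be computed directly from the error function, which is one way a page like this could generate its answers (a sketch; the function names are mine):

```python
import math

def phi(z):
    """Standard normal cumulative probability P(Z <= z)."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def above_below(z):
    """Area below AND above a single point -- two birds, one stone."""
    below = phi(z)
    return below, 1 - below

def between_beyond(z1, z2):
    """Area between AND beyond two points."""
    lo, hi = sorted((z1, z2))
    between = phi(hi) - phi(lo)
    return between, 1 - between
```

e.g. above_below(1.96) returns roughly (.975, .025) and between_beyond(-1, 1) returns roughly (.68, .32)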

there are a number of these routines available in other places such as ...

http://members.aol.com/johnp71/javastat.html#Tables

At 01:30 PM 7/27/01 -0500, EAKIN MARK E wrote:


>I have just finished creating an ASP web page that will help students
>use a normal table that gives probabilities for ranges of the standard
>normal that start at 0 up to a Z value. If you wish to try it, go to
>
>http://www2.uta.edu/eakin/busa3321/normaltable/p2.asp
>
>I would be interested in feedback.
>
>Dr. Mark Eakin
>Informations Systems and Operations Management
>University of Texas at Arlington
>Arlington, Texas 76019
>email: [EMAIL PROTECTED]
>
>
>

_
dennis roberts, educational psychology, penn state university
208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED]
http://roberts.ed.psu.edu/users/droberts/drober~1.htm






Re: independent, identically distributed

2001-07-26 Thread Dennis Roberts

At 09:27 AM 7/26/01 -0700, Philip Ackerman wrote:
>  Hello,
>
>What I do not understand, is why >>>>>>> individuals' heights<<<<<<< is a 
>random variable


instead of "individuals' heights" ... think of a single individual's height ...

if you change the wording a bit above ... it might make more sense





>___
>Send a cool gift with your E-Card
>http://www.bluemountain.com/giftcenter/
>
>
>
>

_
dennis roberts, educational psychology, penn state university
208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED]
http://roberts.ed.psu.edu/users/droberts/drober~1.htm






Re: Excel

2001-07-26 Thread Dennis Roberts

At 12:51 PM 7/26/01 +0200, you wrote:
>Hi,
>
>For some stats explaining I need an excel sheet that calculates and graphs
>the normal(Bell)-shaped curve of some data set with a given average and a
>known stdev.
>It would be nice to demonstrate the calculation the chance of occurance of
>some given value in relation to the data set.
>
>Does any one have such an examplesheet? or knows how it is easely done?

try a good stat package ... like minitab ... these sorts of things are a 
"snap" (well, maybe a mouse click) with the proper tools

here are a few quickie examples ... not all in the best graphics but, they 
could be

say you want to simulate a ND with mu = 100 and sigma = 16

NO problem

MTB > rand 10000 c1;
SUBC> norm 100 16.
MTB > dotplot c1

and you get



[ascii dotplot of C1 omitted: roughly bell-shaped, centered near 100, axis 
running 25 to 150 in steps of 25]

here are the desc stats on this batch of 10000 cases

MTB > desc c1

Descriptive Statistics: C1

Variable        N      Mean    Median    TrMean    StDev   SE Mean
C1          10000    100.15    100.11    100.15    15.97      0.16

Variable   Minimum   Maximum        Q1        Q3
C1           43.02    155.20     89.20    110.89

if you want the density (pdf) values associated with different possible score 
values for the theoretical ND with mu = 100 and sigma = 16 ... easy to do ...

MTB > pdf c2 c3;
SUBC> norm 100 16.
MTB > prin c2 c3

Data Display


  Row     IQ        prob

1 52   0.0002770
2 53   0.0003335
3 54   0.0003999
4 55   0.0004777
5 56   0.0005683
6 57   0.0006736
7 58   0.0007953
8 59   0.0009352
9 60   0.0010955
   10 61   0.0012783
   11 62   0.0014857
   12 63   0.0017201
   13 64   0.0019837
   14 65   0.0022788
   15 66   0.0026076
   16 67   0.0029721
   17 68   0.0033744
   18 69   0.0038163
   19 70   0.0042991
   20 71   0.0048242
   21 72   0.0053923
   22 73   0.0060038
   23 74   0.0066586
   24 75   0.0073561
   25 76   0.0080948
   26 77   0.0088731
   27 78   0.0096883
   28 79   0.0105371
   29 80   0.0114156
   30 81   0.0123191
   31 82   0.0132423
   32 83   0.0141792
   33 84   0.0151232
   34 85   0.0160671
   35 86   0.0170034
   36 87   0.0179242
   37 88   0.0188211
   38 89   0.0196858
   39 90   0.0205101
   40 91   0.0212855
   41 92   0.0220041
   42 93   0.0226583
   43 94   0.0232409
   44 95   0.0237457
   45 96   0.0241668
   46 97   0.0244994
   47 98   0.0247399
   48 99   0.0248852
   49100   0.0249339
   50101   0.0248852
   51102   0.0247399
   52103   0.0244994
   53104   0.0241668
   54105   0.0237457
   55106   0.0232409
   56107   0.0226583
   57108   0.0220041
   58109   0.0212855
   59110   0.0205101
   60111   0.0196858
   61112   0.0188211
   62113   0.0179242
   63114   0.0170034
   64115   0.0160671
   65116   0.0151232
   66117   0.0141792
   67118   0.0132423
   68119   0.0123191
   69120   0.0114156
   70121   0.0105371
   71122   0.0096883
   72123   0.0088731
   73124   0.0080948
   74125   0.0073561
   75126   0.0066586
   76127   0.0060038
   77128   0.0053923
   78129   0.0048242
   79130   0.0042991
   80131   0.0038163
   81132   0.0033744
   82133   0.0029721
   83134   0.0026076
   84135   0.0022788
   85136   0.0019837
   86137   0.0017201
   87138   0.0014857
   88139   0.0012783
   89140   0.0010955
   90141   0.0009352
   91142   0.0007953
   92143   0.0006736
   93144   0.0005683
   94145   0.0004777
   95146   0.0003999
   96147   0.0003335
   97148   0.0002770

MTB >
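the values Minitab's PDF command returns are just the normal density formula evaluated at each score ... a quick Python check (a sketch; the function name is mine) reproduces the table:

```python
import math

def normal_pdf(x, mu=100.0, sigma=16.0):
    """Density of N(mu, sigma) at x -- the quantity the PDF command tabulates."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

# matches the table above, e.g. normal_pdf(100) ~ 0.0249339
# and normal_pdf(52) ~ 0.0002770
```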



>With regards
>
>Wim
>
>
>
>

_
dennis roberts, educational psychology, penn state university
208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED]
http://roberts.ed.psu.edu/users/droberts/drober~1.htm






Re: likert scale items

2001-07-26 Thread Dennis Roberts


>
>1) Responses in the middle of the scale represent something
>*qualitatively* different from responses near the ends.  For example, if a
>particular issue isn't relevant or applicable to some of the
>subjects, they're likely to respond in the middle, but this doesn't mean
>the same thing as someone to whom the issue *is* relevant taking a
>neutral position.

bob frary wrote a nice paper many moons ago about developing simple surveys 
... and, had a nice section on the elusive ? or Neutral category ...

here is the link ... http://www.testscoring.vt.edu/fraryquest.html

there is other good stuff in the write up too


dennis roberts, educational psychology, penn state university
208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED]
http://roberts.ed.psu.edu/users/droberts/drober~1.htm






Re: likert scale items

2001-07-25 Thread Dennis Roberts


>here are a few videos of likert ...


http://ollie.dcccd.edu/mgmt1374/book_contents/3organizing/org_process/Likert.htm


_____
dennis roberts, educational psychology, penn state university
208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED]
http://roberts.ed.psu.edu/users/droberts/drober~1.htm






Re: likert scale items

2001-07-25 Thread Dennis Roberts

At 11:45 AM 7/25/01 -0700, John Uebersax wrote:
>If your items are visually anchored so as to imply equal spacing,
>like:
>
> +++++
> 01234
>   leastmost
>  possiblepossible


of course, likert did not use a numerical scale like this ... his were 
always like:

YES ... ? ... NO     or     SD ... ? ... SA

to which one could respond to an item like:

i think that the fed needs to reduce the prime rate ...

SD to SA ... makes sense as possible responses ... but, least possible to 
most possible does not ...

unless we rephrase the item to something like:

to what extent do you think the fed should reduce the prime rate???

very little ... very much ...

but, then ... is it a real likert type of item? ... i don't think so


_________
dennis roberts, educational psychology, penn state university
208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED]
http://roberts.ed.psu.edu/users/droberts/drober~1.htm






Re: likert scale items

2001-07-25 Thread Dennis Roberts

inherent problems related to LICKert items and level of measurement that 
create problems would be these too

1. how many response categories are there for AN item??? by the way ... 
likert used many types ... including YES ? NO

at THIS level ... i think it a bit presumptuous to think that we are 
working with interval level measurement

2. what the labelling is FOR points ON an item ... i think it is easier to 
pretend the item level measurement is interval IF the scale is in terms of 
% agreement terms ... rather than SA ... SD kinds of response points

3. how MANY items there are ...

now, if you have FEW items ... with FEW points ... that are like SA ? SD 
... then at the item or total score level ... i think it is hard to assume 
interval level data ... if you have MANY items that each have NUMEROUS 
scale points ... framed differently than SA to SD ... then 
assuming interval level data is much more tenable ...






_
dennis roberts, educational psychology, penn state university
208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED]
http://roberts.ed.psu.edu/users/droberts/drober~1.htm






Re: likert scale items

2001-07-25 Thread Dennis Roberts

for a good treatment of this issue ... levels of measurement and statistics 
to use ... though, it is not real simple ...

see

ftp://ftp.sas.com/pub/neural/measurement.html

warren sarle of SAS wrote this and, it is excellent

forget about scales and statistics for a moment ... what kinds of 
STATEMENTS do you want to be able to make about your measured variables 
... THAT is the real issue ... and it determines whether you should pay 
attention to levels of measurement and statistics ...

At 09:18 AM 7/25/01 -0700, Alex Yu wrote:

>The following is extracted from one of my webpage. Hope it can help:






Re: output

2001-07-25 Thread Dennis Roberts


>perhaps we need for software to have 2 overall options ... show me all the 
>output


or, in the case of some interaction plots ... find a graphing method ... 
using different symbols ... that represents ON the graph which pairs differ 
from the others (ie, any pair of DARK dots means different ... 
whereas a DARK dot and a LIGHT dot means not different) ... if we have adopted 
some preset alpha ...

or, a little table FIRST in the output ... that simply lists the 
combinations ... and says next to each one YES ... or NO ... without all the 
other "peripherals" included ...

_________
dennis roberts, educational psychology, penn state university
208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED]
http://roberts.ed.psu.edu/users/droberts/drober~1.htm






Re: output

2001-07-25 Thread Dennis Roberts

At 12:04 PM 7/25/01 -0400, Donald Burrill wrote:
>On Wed, 25 Jul 2001, Dennis Roberts wrote (edited):
>
> > For a class I used an example from moore and mccabe, a 2 factor anova
> > with 4 levels of factor A, 4 levels of factor B, a completely
> > randomized design, n=10 in each of the 16 cells.
> > Now, after the data are [conveniently arrayed], it is easy to get a
> > nice graph and do the anova, [finding] one main effect and a
> > significant interaction:  graph = 1 page, anova output = part of 1
> > page.
> >
> > Now, if you wanted to do some multiple comparisons (say, the tukey
> > test) there is an option in the minitab glm command to do this
>
>But why would you WANT to?  Looks like a knee-jerk reaction to me.
>Surely the graphical output (although you haven't described it), if
>it's useful at all, shows the shape(s) of the main effect and of the
>interaction, and would lead you to want to make certain comparisons --
>and not others.


sure ... but, graphs can be deceiving ... when ns are small in some cases 
... thus, "speculating" on where differences of interest might be ... is 
not THAT easy to do ...

and of course, we have "taught" them that it is statistical SIGNIFICANCE 
that is key, right???

i don't think mine was a knee jerk reaction ... if you find some 
significant Fs ... which might call for either some planned comparisons or 
multiple comparisons ... there still is a lot of MORE work to do ...

my ONLY point was ... that, as packages have become newer and fancier ... 
the output gets progressively (maybe AGGRESSIVELY would be a better term) 
more voluminous ... and, that makes sifting the wheat from the chaff harder 
for folks to do ... especially students who are being encouraged to use 
software for their analyses

really, that was the only point i was making ... nothing DEEPER than that






output

2001-07-25 Thread Dennis Roberts

for a class ... i used an example from moore and mccabe ... a 2 factor 
anova case ...

4 levels of factor A ... 4 levels of factor B ... completely randomized 
design ... n=10 in each of the 16 cells

now, after the data are stacked so that the scores are in one column and 
codes for the two independent variables are in TWO other columns ... it is 
easy to get a nice graph ... and do the anova ... which yielded one main 
effect and a significant interaction

graph = 1 page ... anova output = part of 1 page ...

now, if you wanted to do some multiple comparisons ... say, the tukey test 
... there is an option in the minitab glm command to do this 

think of it ... 16 means ... all possible comparisons ... and minitab not 
only produces (which i like) confidence intervals but ... all possible t 
test statistics ...

THAT TOOK AND YIELDED ... 12 pages of output!

reading statistical output these days is really complicated due (partly) to 
THAT ... the volume of possible output becomes huge ... hence, the 
confusion factor of reading (heaven forbid ... understanding!) what is 
there drastically increases
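the arithmetic behind that flood of output is easy to sketch ... with 16 cell means, a tukey-style "all possible comparisons" run looks at every pair ... a minimal python sketch (only the 4 x 4 = 16 cells come from the example above):

```python
from math import comb

# a 4 x 4 completely randomized design has 16 cell means
cells = 4 * 4

# "all possible pairwise comparisons" among the cell means
pairs = comb(cells, 2)
print(pairs)  # → 120 pairwise comparisons
```

with a confidence interval AND a t test statistic printed for each of the 120 pairs ... a dozen pages of output is no surprise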



_
dennis roberts, educational psychology, penn state university
208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED]
http://roberts.ed.psu.edu/users/droberts/drober~1.htm






Re: vote counting

2001-07-25 Thread Dennis Roberts

At 09:33 AM 7/25/01 -0400, Sanford Lefkowitz wrote:
>In a certain process, there are millions of people voting for thousands
>of candidates. The top N will be declared "winners". But the counting
>process is flawed and with probability 'p', a vote will be miscounted.
>(it might be counted for the wrong candidate or it might be counted for
>a non-existent candidate.)


could you elaborate on a real context for something like this? sure, in 
elections, millions of people vote for thousands of candidates BUT ... 
winners are not determined by the top N # of votes across the millions ... 
for example ... in utah ... the winner might have a very SMALL 
fraction of the millions ... but, in ny state ... a LOSER might have a very 
LARGE fraction of the millions

so, a little more detail about a real context might be helpful






Re: likert scale items

2001-07-25 Thread Dennis Roberts

At 07:26 AM 7/25/01 -0400, Teen Assessment Project wrote:
>I am using a measure with likert scale items.  Original psychometrics
>for the measure
>included factor analysis to reduce the 100 variables to 20 composites.
>However, since the variables are not interval,  shouldn't non-parametic
>tests be done to determine group differences (by gender, age, income) on
>the variables?

what were you assuming about the variables when you did a factor analysis 
on them???

>  Can I still use the composites...was it appropriate to
>do the original factor analysis on ordinal data?
>
>
>

_________
dennis roberts, educational psychology, penn state university
208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED]
http://roberts.ed.psu.edu/users/droberts/drober~1.htm






RE: SRSes

2001-07-24 Thread dennis roberts

my hypothesis of course is that ... in data collection 
problems where sampling is involved AND inferences are desired ... we goof 
far more often than we do a better-than-SRS job of sampling

1. i wonder if anyone has really taken a SRS of the literature ... maybe 
stratified by journals or disciplines ... and tried to see to what extent 
sampling in the investigations was done via SRS ... better than that ... or 
worse than that??? of course, i would expect even if this is done ... we 
would have a positively biased figure ... since, the notion is that only the 
better/best of the submitted stuff gets published ... so, the figures for all 
stuff that is done (ie, the day in day out batch), published or not ... 
would have to look worse off ...

2. can worse than SRS ... be as MUCH worse ... as complex sampling plans 
can be better than SRS??? that is ... could a standard error for a bad 
sampling plan (if we could even estimate it) ... be proportionately as much 
LARGER than the standard error for SRS samples ... as complex sampling 
plans can produce standard errors that are as proportionately SMALLER than 
SRS samples? are there ANY data that exist on this matter?
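simulated data on this matter are easy to generate ... a small python sketch (all numbers wholly made up) of how badly a worse-than-SRS plan can inflate the standard error: sampling whole clusters from a population with big between-cluster differences, versus an SRS of the same total n:

```python
import random
import statistics

random.seed(1)

# hypothetical population: 100 clusters of 20, with LARGE between-cluster variation
clusters = []
for _ in range(100):
    mu = random.gauss(0, 3)                                  # cluster effect (sd 3)
    clusters.append([mu + random.gauss(0, 1) for _ in range(20)])
population = [x for members in clusters for x in members]

def srs_mean(n=100):
    # simple random sample of 100 people
    return statistics.mean(random.sample(population, n))

def cluster_mean(k=5):
    # 5 whole clusters of 20 ... the same total n = 100
    chosen = random.sample(clusters, k)
    return statistics.mean(x for members in chosen for x in members)

# empirical standard error of each plan's sample mean
reps = 2000
se_srs = statistics.stdev(srs_mean() for _ in range(reps))
se_clu = statistics.stdev(cluster_mean() for _ in range(reps))

# "design effect": variance of the cluster-plan mean relative to the SRS mean
print(round(se_srs, 2), round(se_clu, 2), round((se_clu / se_srs) ** 2, 1))
```

with between-cluster variation this strong the cluster plan's standard error comes out several times the SRS standard error ... so yes, a bad plan can be every bit as much worse than SRS as a good stratified plan is better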


==
dennis roberts, penn state university
educational psychology, 8148632401
http://roberts.ed.psu.edu/users/droberts/drober~1.htm






Re: SRSes

2001-07-24 Thread Dennis Roberts

At 03:55 PM 7/24/01 -0400, Donald Burrill wrote:
>Hi, Dennis!
> Yes, as you point out, most elementary textbooks treat only SRS
>types of samples.  But while (as you also point out) some more realistic
>sampling methods entail larger sampling variance than SRS, some of them
>have _smaller_ variance -- notably, stratified designs when the strata
>differ between themselves on the quantity being measured.

sure ... i know that

(then i said) ... but, we KNOW that most samples are drawn in a way that is 
WORSE than SRS



and you responded

>I don't think _I_ know this.  I know that SOME samples are so drawn;
>but (see above) I also know that SOME samples are drawn in a way that
>is BETTER than SRS (where I assume by "worse" you meant "with larger
>sampling variance", so by "better" I mean "with smaller sampling
>variance").

i think we do know this ... if you enumerate all the situations you know of 
where sampling from some larger population has been done ... i would bet a 
dollar to a penny that ... the sampling plan is WORSE than SRS  ... so, i 
would suggest that the NORM is worse ... the exception is SRS or better

i don't think books spend nearly enough time ... on the fact that most day 
in day out samples are taken in a pretty pathetic way ...


>I perceive the "basic problem" as the fact that sampling variance is
>(relatively) easily calculated for a SRS, while it is more difficult
>to calculate under almost _any_ other type of sampling.

sure ... but, books ONLY seem to discuss the easy way ... and i do too ... 
because it seems rather straightforward ... but, given time constraints 
... it never goes further than that ...

>  Whether it is enough more difficult that one would REALLY like to avoid
>it in an elementary course is a judgement call;  but for the less
>quantitatively-oriented students with whom many of us have to deal, we
>_would_ often like to avoid those complications.  Certainly dealing with
>the completely _general_ case is beyond the scope of a first course, so
>it's just a matter of deciding how many, and which, specific types of
>cases one is willing to shoehorn into the semester (and what "previews
>of coming attractions" one wishes to allude to in higher-level courses).

however, we do become sticklers for details ... and force students to use 
the correct CVs, make the right CIs ... do the t tests correctly ... and 
heaven forbid if you get off a line or two when reading off the values from 
the t table ...


>Seems to me the most sensible "adjustment" (and of a type we make at
>least implicitly in a lot of other areas too) is
>  = to acknowledge that the calculations for SRS are presented
>(a) for a somewhat unrealistic "ideal" kind of case,

i would stress ... really unrealistic ...

>(b) to give the neophyte _some_ experience in playing this game,

and then leave them hanging

>Some textbooks I have used (cf. Moore, "Statistics:  Concepts &
>Controversies" (4th ed.), Table 1.1, page 40) present a table giving the
>margin of error for the Gallup poll sampling procedure, as a function of
>population percentage and sample size.  Such a table permits one to show
>how Gallup's precision varies from what one would calculate for a SRS,
>thus providing some small emphasis for the cautionary tale one wishes to
>convey.

but ... in moore and mccabe ... the stress throughout the book ... is on 
SRSes ... and no real mention is made of (nor solution offered for) the 
problem that it will be a rare day in analysis land ... for the typical 
person working with data ... to be doing SRS sampling ...
it's just not going to happen

the bottom line, IMHO, is that we glide over this like it is not a problem 
at all ... when we know it is


>  ------------
>  Donald F. Burrill [EMAIL PROTECTED]
>  184 Nashua Road, Bedford, NH 03110  603-471-7128

_
dennis roberts, educational psychology, penn state university
208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED]
http://roberts.ed.psu.edu/users/droberts/drober~1.htm






SRSes

2001-07-24 Thread Dennis Roberts

most books talk about inferential statistics ... particularly those where 
you take a sample ... find some statistic ... estimate some error term ... 
then build a CI or test some null hypothesis ...

error in these cases is always assumed to be based on taking AT LEAST a 
simple random sample ... or SRS as some books like to say ...

but, we KNOW that most samples are drawn in a way that is WORSE than SRS ...

thus, essentially every CI ... is too narrow ... or, every test statistic 
... t or F or whatever ... has a p value that is too LOW ...

what adjustment do we make for this basic problem?
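one crude adjustment, IF you can put a number on how bad the plan is, is a design-effect (deff) correction ... multiply the SRS variance by deff, i.e. widen the interval by sqrt(deff) ... a python sketch with entirely made-up numbers (the deff = 2 is assumed, not estimated):

```python
import math

mean, sd, n = 52.0, 10.0, 400   # hypothetical survey result
deff = 2.0                      # assumed design effect for a worse-than-SRS plan

se_srs = sd / math.sqrt(n)           # naive SRS standard error
se_adj = se_srs * math.sqrt(deff)    # inflated by sqrt(design effect)

naive = (mean - 1.96 * se_srs, mean + 1.96 * se_srs)
adjusted = (mean - 1.96 * se_adj, mean + 1.96 * se_adj)
print(naive, adjusted)  # the naive SRS interval is too narrow
```

the hard part, of course, is that for a sloppy day-in-day-out sampling plan nobody actually knows what deff is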

_
dennis roberts, educational psychology, penn state university
208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED]
http://roberts.ed.psu.edu/users/droberts/drober~1.htm






Re: Confidence interval for Pearson Correlation

2001-07-20 Thread Dennis Roberts

one way ... it is done by transforming the r value to Fisher's z ... 
then building the CI around that (there is a standard error of z) ... 
then, finally converting the endpoint z values back to r values ...
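a minimal python sketch of that recipe (the r = .5, n = 50 numbers are just for illustration):

```python
import math

def pearson_ci(r, n, z_crit=1.96):
    """approximate 95% CI for a Pearson r via the Fisher z transformation"""
    z = math.atanh(r)                    # r -> Fisher's z
    se = 1.0 / math.sqrt(n - 3)          # standard error of z
    lo, hi = z - z_crit * se, z + z_crit * se
    return math.tanh(lo), math.tanh(hi)  # back-transform the endpoints to r

lo, hi = pearson_ci(0.5, 50)
print(round(lo, 2), round(hi, 2))  # → 0.26 0.68
```

note the interval is NOT symmetric around r = .5 ... which is exactly why the transformation is used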

At 02:02 AM 10/22/99 -0200, Alexandre Moura wrote:
>Dear Members,
>
>How can I construct a confidence interval about Pearson correlation using
>standard error and t value? What is the formula?
>
>Regards,
>
>Alexandre Moura.
>
>
>
>

_________
dennis roberts, educational psychology, penn state university
208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED]
http://roberts.ed.psu.edu/users/droberts/drober~1.htm






Re: statistical similarity of two text

2001-07-19 Thread Dennis Roberts

At 04:21 PM 7/19/01 -0400, Rich Ulrich wrote:
>On 18 Jul 2001 03:41:57 -0700, [EMAIL PROTECTED] (Donald Burrill)
>wrote:
>
> > On Tue, 17 Jul 2001, Cantor wrote:
> >
> > > Does anybody know where I can find program on the website which [can]
> > > compare two texts/articles and settle whether or not they are similar
> > > assuming any significant level.
>DB >
> > Sorry, Cantor:  this is not possible, in general.


some try however ... see http://www.turnitin.com ... a plagiarism detection 
company


_____
dennis roberts, educational psychology, penn state university
208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED]
http://roberts.ed.psu.edu/users/droberts/drober~1.htm






Re: [Q] testing the (bio)statistics minor

2001-07-19 Thread Dennis Roberts

to the extent that the "minor" is prescribed, such an exam should not 
be that difficult to create ... but, to the extent that a minor can be 
any number of combinations of courses ... in stat OR biostat ... with NO 
fixed core ... then, i think it becomes next to impossible ...

now, for sure ... if there is an exam ... even if there is no fixed core of 
courses ... sooner or later there WILL be a fixed core ... since, the TEST 
then will be driving the selection of courses on the part of the students



At 01:45 PM 7/19/01 +, you wrote:
>Some graduate programs in the (social) sciences require their
>students to take a minor in statistics or biostatistics.






Re: Regression to the mean,Barry Bonds & HRs

2001-07-17 Thread dennis roberts

At 04:08 PM 7/17/01 -0400, Rich Ulrich wrote:

>But, so far as I have heard,  the league MEANS stay the same.
>The SDs are the same.  There is no preference, that I have ever
>heard, for records to be set by half-season, early or late, team
>or individual.  My guess is that association between "talent"
>and "winning"  (or hitting, or pitching, etc.) remains the same.

but, individuals are not half tests ... and, the mean # of homers in the 
first half is not the same (except by coincidence) as the mean # of home runs 
in the second half ... the original post was not about TEAM stats ... which 
are fixed in the sense that if one team loses a lot ... another team wins 
a lot ...

every individual player could get better the 2nd half of the season ... or, 
get worse ... in terms of batting ... that is not necessarily true of 
pitchers ... pitchers can't ALL improve their w/l records ...

thus ... it depends on the stat you are interested in ...



==========
dennis roberts, penn state university
educational psychology, 8148632401
http://roberts.ed.psu.edu/users/droberts/drober~1.htm






Re: Regression to the mean,Barry Bonds & HRs

2001-07-16 Thread dennis roberts

regression to the mean has NOTHING to do with raw numbers ... it ONLY has 
to do with relative location withIN a distribution

example: i give a course final exam the first day ... and get scores (on 
100 item test) from 10 to 40 ... and an alternate form of the final on the 
last day ... and get scores 50 to 95 ... now, the LOWest score on the final 
final is higher than the highest score on the first final ... ie, everyone 
does at least better than the BEST first go round. the regression to the 
mean issue is ... where are the ones who got close to 40 on the first test 
... withIN the posttest distribution??? generally, we see some relative 
dropping of POSITION ... overall ... on average ... but not in every single 
case .. for example the one who got 40 on the first final could have 
obtained THE 95 on the post final

the issue that has to be raised with respect to the baseball example is ... 
are the two halves PARALLEL HALVES? ... like, parallel tests given at 
essentially the same time? well, no, they are not. for parallel tests ... 
if i give form A now ... and an hour later ... form B ... then there is no 
reason to expect the distributions of A and B to be much different ... nor, 
the individual examinees to earn much different scores on form A and form B

this is NOT the case in baseball ... or any sport where there is some 
arbitrary division of 1/2 of the season versus the other half of the season ...

these are NOT like parallel tests given to the same examinees ... because 
of many factors ...

1. weather is different in second 1/2 than first 1/2
2. players get injured differentially across the halves ... may not be in 
first half but is in the second half (or vice versa)
3. the TEAMS you play may not balance out during the second half the same 
way they did in the first half (better competition or worse competition)

4. etc. etc.

too many people forget that regression to the mean is in terms of position 
... NOT actual score values ... and only in the case where the 2
distributions are the same ... AND there is less than perfect r between the 
two sets ... will there generally be not only a drop in relative position 
(if you were at the top on first measurement) but, there will be a downward 
change in score too ... and the reverse at the bottom ... BUT, unless both 
conditions have been met ... then downward shifting in position could just 
as likely be connected with an INCREASE in score ...

and in any case ... regression to the mean is just (as was said) a 
description of what happens ... and not some explanation for a root cause ...
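a quick python simulation (all numbers made up) of the "parallel halves" point: same talent in both halves, independent luck, identical distributions ... yet the first-half leaders fall back as a group:

```python
import random
import statistics

random.seed(7)

players = 500
talent = [random.gauss(30, 5) for _ in range(players)]   # hypothetical true HR pace
half1 = [t + random.gauss(0, 5) for t in talent]         # observed, first half
half2 = [t + random.gauss(0, 5) for t in talent]         # observed, second half

# the 25 players who led the FIRST half ...
top = sorted(range(players), key=lambda i: half1[i], reverse=True)[:25]

h1_top = statistics.mean(half1[i] for i in top)
h2_top = statistics.mean(half2[i] for i in top)
print(round(h1_top, 1), round(h2_top, 1))
# as a group they drop back toward the overall mean of 30 in the second half
# ... even though some individuals among them improve
```

no decline in anyone's ability is needed ... selecting on a noisy first-half number guarantees the fall-back on average ... which is the "description, not root cause" point exactly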

At 08:26 AM 7/16/01 -0400, Paige Miller wrote:
>EugeneGall wrote:
> >
> > Jordan Ellenberg, in today's Slate, PROVES that Bonds won't break the
> > HR record because of regression to the mean.  The argument is a
> > little sloppy, but there is definitely some RTM involved:
> >   "If our discussion above is correct, then hitters who
> >lead the major leagues in home runs at the All-Star break should
> >tend to decline in the second half of the season. ...
> >Of the  74 hitters involved (there are more hitters than years 
> because of
> >ties) only 12 equaled their pre-break production in the second
> >half...
> > I'd be curious if reduction in the 1st half leaders was comparable to the
> > improvement in the 2nd half leaders.
> > The link:
> > http://slate.msn.com/math/01-07-12/math.asp
>
>This hardly "PROVES" anything. It is more a statement about what has
>happened in the past. Most people, including myself and probably the
>author you quote, believe it is likely to happen in the future for the
>exact same reasons as it did in the past.
>
>So Bonds will "tend to decline" in the 2nd half...and lets see, the
>record is 70, so if Bonds declines from 38 in the first half to 33 in
>the second half, well...there's a new record. I didn't read the article
>at SLATE, but based upon the quote you provide, you have gone way beyond
>what the author intended with that quote.
>
>--
>Paige Miller
>Eastman Kodak Company
>[EMAIL PROTECTED]
>
>"It's nothing until I call it!" -- Bill Klem, NL Umpire
>"When you get the choice to sit it out or dance,
>I hope you dance" -- Lee Ann Womack
>
>

_
dennis roberts, educational psychology, penn state university
208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED]
http://roberts.ed.psu.edu/users/droberts/drober~1.htm






Re: Regression to the mean,Barry Bonds & HRs

2001-07-13 Thread dennis roberts

the real question is ... which ONES???

At 12:26 AM 7/14/01 +, EugeneGall wrote:
>Jordan Ellenberg, in today's Slate, PROVES that Bonds won't break the
>HR record because of regression to the mean.  The argument is a
>little sloppy, but there is definitely some RTM involved:
>   "If our discussion above is correct, then hitters who
>lead the major leagues in home runs at the All-Star break should
>tend to decline in the second half of the season. ...
>Of the  74 hitters involved (there are more hitters than years because of
>ties) only 12 equaled their pre-break production in the second
>half...
>I'd be curious if reduction in the 1st half leaders was comparable to the
>improvement in the 2nd half leaders.
>The link:
>http://slate.msn.com/math/01-07-12/math.asp
>
>

==
dennis roberts, penn state university
educational psychology, 8148632401
http://roberts.ed.psu.edu/users/droberts/drober~1.htm





