Re: What is standard deviation exactly?

2000-05-22 Thread Paul Gardner

Glen Barnett wrote:
 
  In article [EMAIL PROTECTED],
  Neil  [EMAIL PROTECTED] wrote:
  I was wondering what the standard deviation means exactly?
 
 I've seen the equation, etc., but I don't really understand
 what st dev is and what it is for.
 
 I'm going to take a different tack from the one Herman has taken.
 If I tell you what you already know, my apologies.
 
 I assume you're talking about sample standard deviations,
 not population standard deviations (though interpretation
 of what it represents is similar).
 
 Standard deviation is an attempt to measure how "spread out"
 the values are - big standard deviation means more spread out,
 small standard deviation means closer together. A standard
 deviation of zero means all the values are the same.
 
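 Note that the standard deviation can't exceed half the range
 (largest value minus smallest value). Strictly, this bound holds for
 the n-denominator version; the n-1 version can exceed it by a factor
 of up to sqrt(n/(n-1)).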
 
 Standard deviation is measured in the original units. For example,
 if you record a set of lengths in mm, their standard deviation is in mm.
 
 There is a huge variety of reasonable measures of spread.
 Standard deviation is the most used. You will get more of
 a feel for the standard deviation if you compare what it
 does to some other measures of spread.
 
 For example, another common measure is the mean deviation -
 the average distance of observations from the mean. By contrast,
 standard deviation is the root-mean-square distance from the mean
 (as you can see from the formula**).
 
 ** At least the n-denominator (maximum likelihood) version is the
 root-mean-square deviation; the n-1 denominator is just a constant
 times that.
 
 This squaring puts relatively more weight on the larger deviations,
 and less weight on the smaller deviations than the mean deviation,
 but it is still a kind of weighted average of the deviations from the
 mean.
 
 Here's a quick (tiny) example to help illustrate some of the points
 (I am using the n-1 version of the standard deviation here):
 
 Sample 1: 4, 6, 7, 7, 8, 10
 Mean = 7, mean deviation = 4/3 = 1.333..., std deviation=2
 
 Sample 2:   1, 5, 7, 7, 9, 13
 Mean = 7, mean deviation = 8/3 = 2.666..., std deviation =4
 
 Note that Sample 2's values are more 'spread out' than sample 1's,
 and both of the measures of spread tell us that.
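
 The two samples above can be checked with a few lines of Python. This is
 only a sketch: the function names mean_deviation and sample_sd are
 made up for illustration, and the n-1 denominator is used as in the
 example.

```python
import math

def mean_deviation(values):
    """Average absolute distance of the observations from their mean."""
    mean = sum(values) / len(values)
    return sum(abs(x - mean) for x in values) / len(values)

def sample_sd(values):
    """Standard deviation with the n-1 denominator."""
    mean = sum(values) / len(values)
    sum_of_squares = sum((x - mean) ** 2 for x in values)
    return math.sqrt(sum_of_squares / (len(values) - 1))

sample_1 = [4, 6, 7, 7, 8, 10]
sample_2 = [1, 5, 7, 7, 9, 13]

for name, sample in (("Sample 1", sample_1), ("Sample 2", sample_2)):
    print(name, round(mean_deviation(sample), 3), round(sample_sd(sample), 3))
```

 Doubling every deviation from the mean (as Sample 2 does relative to
 Sample 1) doubles both measures, which is what the figures show.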
 
 Standard deviation is used for a variety of reasons - including the
 fact that it is the square root of the variance, and variance has
 some nice properties, both in general and also particularly for
 normal r.v.'s, but s.d. is measured in original units.
 
 Glen
 
This is a useful summary; I'd just like to add one point to it.  People
sometimes ask which measure of spread is "best", or why we use standard
deviation when it seems more complicated than simpler statistics such as
the mean deviation.  Various measures of spread are useful for
different purposes, but the real strength of s.d. is that many other
statistical concepts are built upon it.  Thus s.d. underpins the notion
of a standard (z) score; the z score underpins the definition of the
Pearson product-moment correlation, and hence linear regression; s.d.
squared is variance, and this underpins the variance theorem, analysis
of variance, the F-ratio, etc.  Thus it's a "big idea", a substantive
concept in the structure of statistics, in a way that other measures of
spread aren't.
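
To make the chain from s.d. to z score to correlation concrete, here is a
minimal Python sketch. The data and the function names are invented for
illustration; it assumes the n-1 version of the s.d. and computes
Pearson's r as the sum of the products of paired z scores divided by n-1.

```python
import math

def z_scores(values):
    """Convert raw scores to standard scores using the n-1 s.d."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))
    return [(x - mean) / sd for x in values]

def pearson_r(xs, ys):
    """Pearson correlation as the averaged product of paired z scores."""
    zx, zy = z_scores(xs), z_scores(ys)
    return sum(a * b for a, b in zip(zx, zy)) / (len(xs) - 1)

# Hypothetical paired observations.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

print(round(pearson_r(xs, ys), 4))
```

The same r results from the usual covariance formula; writing it in terms
of z scores simply shows where the standard deviation enters.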

There are parallels to this in other branches of science and
mathematics.  Mass times velocity (momentum) is a useful concept,
because it enters into relationships with other concepts.  So does
(1/2)m v-squared (kinetic energy).  But no one uses mass per unit
velocity, or mass times the square root of velocity, or m v-cubed,
because (as far as I know) these concepts don't enter into any
relationships which are useful for describing aspects of the world.

Paul Gardner


begin:vcard 
n:Gardner;Dr Paul
tel;cell:0412 275 623
tel;fax:Int + 61 3 9905 2779 (Faculty office)
tel;home:Int + 61 3 9578 4724
tel;work:Int + 61 3 9905 2854
x-mozilla-html:FALSE
adr:;;
version:2.1
email;internet:[EMAIL PROTECTED]
x-mozilla-cpt:;-29488
fn:Dr Paul Gardner, Reader in Education and Director, Research Degrees, Faculty of Education, Monash University,  Vic. Australia 3800
end:vcard



Re: obsolete methods?

2000-05-17 Thread Paul Gardner

[EMAIL PROTECTED] wrote:
 
 I have been looking for resources on attitude scale construction. The
 methods I have been looking at are things like paired comparisons and
 successive intervals. The strange thing about finding descriptions of
 these methods is that the only book I can find in print is *Techniques
 of Attitude Scale Construction* by Edwards (1957?). In fact, it seems
 that nearly all the standard references on these statistical methods
 were published in the fifties or before.


Summated rating scales (e.g. Likert scales with Strongly Agree/Agree/Not
Sure/Disagree/Strongly Disagree responses to opinion statements)
analysed by conventional item analysis and factor analysis methods
remain in common use for situations where "objective" data is to be
obtained from large samples.  However, such scales are limited in that
they cannot be used to probe individuals' meanings, perceptions,
personal experiences etc.  I advise my students to use a combination of
methods in order to get various lines of evidence about people's
attitudes.  For example, one of my students, a nurse educator, developed
a four-scale Likert instrument on nursing students' attitudes to the
elderly, and also used interviews and participant observation during
field placement.

References in the area didn't stop with Edwards 1957!  Some later texts
that I can recommend are Robert DeVellis, Scale Development: Theory and
Applications (Sage, 1991, strong on the psychometrics), Robert Gable,
Instrument Development in the Affective Domain (Kluwer-Nijhoff, 1986,
good on both psychometrics and scale development methods) and the
revised edition of the classic A.N. Oppenheim, Questionnaire Design,
Interviewing and Attitude Measurement (Pinter, 1992, emphasis on various
qualitative and quantitative data-gathering techniques, not on the
psychometrics).

Hope this is helpful,

Paul Gardner 
 Does anyone know what happened? Did these methods go out of style
 because they were superseded?
 
 Regards,
 Tom
 
 
 ===
 This list is open to everyone.  Occasionally, less thoughtful
 people send inappropriate messages.  Please DO NOT COMPLAIN TO
 THE POSTMASTER about these messages because the postmaster has no
 way of controlling them, and excessive complaints will result in
 termination of the list.
 
 For information about this list, including information about the
 problem of inappropriate messages and information about how to
 unsubscribe, please see the web page at
 http://jse.stat.ncsu.edu/
 ===





Re: Number of factors to be extracted

2000-05-02 Thread Paul Gardner

I would add another criterion, which is qualitative, and therefore not
reducible to a quantitative rule:

3. Use your professional judgement.  Does the pattern of factor loadings
make sense?  For example, if the variables are item scores on a
multi-dimensional instrument, can you see a meaningful connection among
the items which load highly on a particular factor?

The "eigen-value greater than 1" criterion is very arbitrary, and in
interpreting a factor analysis matrix of item scores, I often discard
numerous factors which meet the eigen-value criterion but fail to make
any sense when I apply my judgement to the pattern of loadings.

I can reduce all this to a single maxim: Factor analysis is an art as
well as a science.
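
For readers who want to see what the eigenvalue-greater-than-1 rule
actually counts, here is a rough Python sketch. Everything in it is an
assumption for illustration: the correlation matrix is invented (two
clusters of three items, correlating 0.6 within a cluster and 0.1 between
clusters), and the eigenvalues are found with a plain Jacobi rotation
routine rather than any particular statistics package.

```python
import math

def symmetric_eigenvalues(matrix, max_sweeps=50, tol=1e-18):
    """Eigenvalues of a symmetric matrix by cyclic Jacobi rotations."""
    a = [row[:] for row in matrix]  # work on a copy
    n = len(a)
    for _ in range(max_sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                # Rotation angle that zeroes the (p, q) entry.
                if a[p][p] == a[q][q]:
                    theta = math.pi / 4
                else:
                    theta = 0.5 * math.atan(2 * a[p][q] / (a[q][q] - a[p][p]))
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # update columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p] = c * akp - s * akq
                    a[k][q] = s * akp + c * akq
                for k in range(n):  # update rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k] = c * apk - s * aqk
                    a[q][k] = s * apk + c * aqk
    return sorted((a[i][i] for i in range(n)), reverse=True)

# Hypothetical 6-item correlation matrix: items 0-2 and items 3-5 form clusters.
corr = [[1.0 if i == j else (0.6 if i // 3 == j // 3 else 0.1)
         for j in range(6)] for i in range(6)]

eigenvalues = symmetric_eigenvalues(corr)
retained = sum(1 for ev in eigenvalues if ev > 1)  # eigenvalue > 1 rule

print([round(ev, 3) for ev in eigenvalues])
print("factors retained by the rule:", retained)
```

The rule only counts eigenvalues; whether the retained factors make sense
still has to be judged from the pattern of loadings.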

Paul Gardner


Alex Yu wrote:
 
 There are several rules. The most popular two are:
 
 1. Kaiser criterion: retain a factor when its eigenvalue is larger than 1
 2. Scree plot: Basically, it is eyeballing. Plot the eigenvalues against
 the factor number and see where the sharp turn is.
 
 Hope it helps.

 Chong-ho (Alex) Yu, Ph.D., CNE, MCSE

 
 On Tue, 2 May 2000 [EMAIL PROTECTED] wrote:
 
  Would any of you know a rule of thumb for selecting the proper (or
  optimal) number of factors to be extracted from a factor analysis?
  Also, how many variables can there be in such a factor (are two variables
  in one factor not enough)?





Re: split half reliability

2000-04-17 Thread Paul Gardner

Paul R Swank wrote:
 
 I disagree with the statement that the split-half reliability coefficient
 is of no use anymore. Coefficient alpha, while being an excellent estimator
 of reliability, does have one rather stringent requirement. The items must
 be homogeneous. This is not always the case with many kinds of scales, nor
 should it be. In many cases homogeneity of item content may lead to reduced
 validity if the construct is too narrowly defined. Screening measures often
 have this problem. They need to be short but they also need to be broad in
 scope. Internal consistency for such scales would suffer but a split half
 procedure, which is much less sensitive to item homogeneity, would fit the
 bill nicely.

I have four responses to this:

1. Split-half requires the items to be divided into two "equal" halves. 
How is this to be done?  Odd/even?  First half/second half?  Randomly?
Cronbach's alpha does not depend on this arbitrary division into halves.

2. Stanley and Hopkins (1972) demonstrated that Cronbach's alpha was
essentially equivalent to the "mean of all possible split-half
reliability estimates". DeVellis (1991) demonstrates that if the items
in a scale have similar variances (a condition frequently met in
well-designed scales), it can be shown that the value of alpha (called
standardised alpha) is algebraically equivalent to the Spearman-Brown
formula for estimating split-half.  In other words, there is no great
difference conceptually between the two.

3.  Many writers use the term 'homogeneity' to bolster arguments in
discussions of reliability and validity.  In a paper I have completed
recently which is currently under review for publication, I show that
the term has about six different meanings in the literature.  Whenever I
read the word now, I respond, What exactly does the writer mean by
homogeneity here?

4.  If, by homogeneity, you mean all the items are measuring a similar
construct, i.e. the item scores all inter-correlate with each other
because they are indicators of a unidimensional construct, then the
assertion that Cronbach's alpha depends on this being the case is
demonstrably untrue.  Cronbach's alpha will be high as long as every
item in a scale correlates well with at least some other items, but not
necessarily all of them. Homogeneity is not a "stringent requirement"
for a high Cronbach alpha level at all.  Cronbach's alpha is simply a
measure of reliability;  it is not an indicator of unidimensionality, a
point widely misunderstood in the literature.
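
Point 4 can be illustrated numerically. The Python sketch below is an
illustration under invented assumptions, not a result from any real
scale: it builds a hypothetical 10-item correlation matrix made of two
clusters of five items that correlate 0.7 within a cluster and 0 between
clusters, and computes standardised alpha from the mean inter-item
correlation.

```python
def standardized_alpha(corr):
    """Standardised alpha from a correlation matrix:
    k * mean_r / (1 + (k - 1) * mean_r)."""
    k = len(corr)
    off_diagonal = [corr[i][j] for i in range(k) for j in range(k) if i != j]
    mean_r = sum(off_diagonal) / len(off_diagonal)
    return k * mean_r / (1 + (k - 1) * mean_r)

def two_cluster_matrix(cluster_size, within_r):
    """Correlation matrix for two clusters of items that are
    correlated within a cluster and uncorrelated between clusters."""
    k = 2 * cluster_size
    return [[1.0 if i == j
             else (within_r if i // cluster_size == j // cluster_size else 0.0)
             for j in range(k)] for i in range(k)]

corr = two_cluster_matrix(cluster_size=5, within_r=0.7)
alpha = standardized_alpha(corr)
print(round(alpha, 3))
```

With these made-up figures alpha comes out above 0.8 even though the items
plainly fall into two unrelated dimensions.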

Paul Gardner





Re: Split half coefficient?

2000-04-16 Thread Paul Gardner

busker wrote:
 
 I'm completely new to statistics but am putting together a
 customer satisfaction survey, thanks to which I am daily
 becoming fascinated by my whole new world of Means and
 Medians and Variabilities and Variances, and so forth. I am
 told that certain "duplicate" questions are sometimes put
 in to test the consistency/'truthfulness' of a respondent's
 answers, and that these 'check' questions are called split
 half coefficients (or thereabouts). But I find no reference
 in the text books I'm poring over. Can anyone enlighten me?
 I hope I've explained myself correctly and, if not, that I
 can be pointed on the right track: I know how vital it is to
 have the correct terms in this business.
 
Chris:
The split-half coefficient was invented in the early years of the 20th
century as a way of checking the internal consistency of a measurement
scale. One takes half the items in a scale (say the odd numbered items)
and scores their total, and then correlates this with the score on the
other half of the scale.  An adjustment is then made to correct for the
shortening of the scale that results from taking only half the items.  Nobody
bothers with this any more;  the procedure has been superseded by the
more convenient Cronbach's alpha coefficient.  Neither of these
statistics is directly concerned with the issue you raise, namely that
of having repeated items in order to check whether an individual
respondent is answering the same question consistently.
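
The two procedures can be sketched in Python as follows. The response
data are invented (six respondents, four items), the function names are
made up, and the length adjustment is assumed to be the Spearman-Brown
formula, 2r/(1+r), applied to an odd/even split.

```python
import math

def pearson_r(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def variance(values):
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)

def split_half_reliability(scores):
    """Correlate odd-item and even-item totals, then adjust for length."""
    odd_totals = [sum(row[0::2]) for row in scores]   # items 1, 3, ...
    even_totals = [sum(row[1::2]) for row in scores]  # items 2, 4, ...
    r_half = pearson_r(odd_totals, even_totals)
    return 2 * r_half / (1 + r_half)  # Spearman-Brown adjustment

def cronbach_alpha(scores):
    """k/(k-1) * (1 - sum of item variances / variance of total score)."""
    k = len(scores[0])
    item_variances = [variance([row[j] for row in scores]) for j in range(k)]
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)

# Hypothetical responses: one row per respondent, one column per item.
scores = [
    [4, 5, 4, 5],
    [2, 2, 3, 2],
    [3, 4, 3, 3],
    [5, 4, 5, 5],
    [1, 2, 1, 2],
    [3, 3, 4, 3],
]

print(round(split_half_reliability(scores), 3))
print(round(cronbach_alpha(scores), 3))
```

A different split (first half/second half, say) would generally give a
different split-half value, whereas alpha does not depend on any split.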

You won't find these concepts discussed in books on basic statistics. 
Look instead for books on educational and psychological measurement. 
Your local university library should be able to help.

Paul Gardner





Re: What to study next

2000-03-01 Thread Paul Gardner

I agree with Rich Ulrich's comments, but bear in mind I was only
answering the original query, which was for a good text.  I find
Diekhoff useful as an additional reference in my introductory stats
course (it's not the text I use, which is Runyon, Haber, Pittenger and
Coleman.)

Diekhoff's actual decision trees occupy less than four pages of a 23
page chapter.  The decision trees are elaborated with extensive
discussion of the purpose of an analysis, the nature of the research
question, the number of variables involved, the kind of data
collected...  Very few statistics texts contain this kind of material.

As I teach my course (one semester, 13 3-hour sessions) I continually
link new statistical tests to the Diekhoff decision tree.  I also give
homework and run a workshop in which students are given a wide variety
of research scenarios (sample data, explicit or implicit research
questions) and asked to consider which statistical test or tests
would be appropriate.  Obviously this doesn't instantly turn all my
students into expert designers and statisticians, but they certainly
display good competence in getting the majority of tasks right.  This is
in marked contrast to students who have never been asked to do such
things, and learn stats by simply doing exercises from textbooks and who
have never been asked to decide on appropriate procedures when given an
unfamiliar scenario.

Since my course is only an introduction, we cover only a limited number
of statistical procedures, and obviously there are dozens or hundreds of
others.  But I think the procedure I use encourages the students to read
and reflect on research situations, and frame the question, "How might
this research question be answered?"

Paul Gardner


Rich Ulrich wrote:
 
 [rearranging this note, to put the posts into order, earliest first. ]
 
   At 07:44 AM 02/29/2000 -0800, Ward Soper wrote:
   After one learns to do the textbook problems, as in Freund's
   Mathematical Statistics, where should one turn to learn what tests to
   use in various situations and how to design studies?  Can anyone suggest
   some good texts or other resources?
 
 ===
  dennis roberts wrote:
  
   william trochim's research methods knowledge base is a good place to start
   ... to get ideas
  
   http://trochim.human.cornell.edu/kb/
 
 
 On 29 Feb 2000 17:48:07 -0800, [EMAIL PROTECTED]
 (Paul Gardner) wrote:
 
  George. M. Diekhoff, Basic Statistics for the Social and Behavioral
  Sciences, Prentice Hall, 1996, has an excellent chapter at the end which
  presents a decision tree.  This summarises the various statistical
  procedures in the text and helps learners to determine which statistics
  are appropriate under various conditions.
 
 =
  - Pardon; I haven't seen Diekhoff, but 'decision tree' sounds too
 cheap.  There is certainly a place for a mechanical framework of tests
 and procedures;  but I read the original question as less particular
 than that, and more general ("how to design studies"); and the first
 answer, that way, too.
 
 An enormous decision tree may give the right technical answer to 100%
 of the narrow questions, but -- since it takes knowledge to frame the
 right question -- that will be a misleading answer, I would guess, for
 1/3 of the naive questioners, at least.  People just can't tell you
 what they never thought to ask, concerning
   'reliability' (of  various kinds);
   'dependence' (ditto);
   'shape of the distribution';
   'outliers'; and
   'What numbers are meaningful when we use this measurement?' or,
 'What transformations might be useful?'
 
 (I am still answering the big question, Why can't a computer give us
 all the stats advice that we need?  So far, no one has programmed a
 computer with 10,000 well-classified examples)
 
 If they have not learned the whole statistical vocabulary, they won't
 be able to argue persuasively that their own answers are correct.  And
 you can't thoroughly learn the vocabulary until you are expert enough
 to know something about all the available techniques.
 
 In addition to the statistics, there are particular problems in each
 area about their own sorts of statistical designs.  To learn what to
 do in various situations, I think you have to *read*, you have to be
 exposed to a large number of various situations.  You have to read
 some good examples, and you have to read criticisms which include
 examples that were not-so-good.
 
 --
 Rich Ulrich, [EMAIL PROTECTED]
 http://www.pitt.edu/~wpilib/index.html
 

Bonferroni

2000-02-27 Thread Paul Gardner

Bonferroni, a technique for dealing with the problem of increasing the
chance of making Type I errors when multiple comparisons are made, works
by changing the alpha-level.  I'll use the symbol α for alpha.

Step 1.  Find n, the number of possible comparisons when the means of k
groups are to be compared.   n = k(k-1)/2
For example, if 4 groups are to be compared, n = 6

Step 2.  Find α*, the adjusted alpha level:
α* = α/n
For example, if α = .05 and n = 6, α* = .008

Then make multiple comparisons using the t-test, look up tables of
significance at the α* level (you may need to interpolate or
approximate) and then claim significance only at the α level.  In this
example, .008 is approximately equal to .01.
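
The two steps amount to very little arithmetic, as this small Python
sketch shows. The function names are made up for illustration; the
figures are the ones used in the example (4 groups, alpha of .05).

```python
def number_of_comparisons(k):
    """Step 1: number of pairwise comparisons among k group means."""
    return k * (k - 1) // 2

def bonferroni_alpha(alpha, k):
    """Step 2: adjusted alpha level for all pairwise comparisons."""
    return alpha / number_of_comparisons(k)

groups = 4
print(number_of_comparisons(groups))             # 6
print(round(bonferroni_alpha(0.05, groups), 4))  # 0.0083
```

Each individual t-test is then judged against the adjusted level rather
than against .05.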

Other techniques for dealing with the multiple comparisons problem are
the Scheffe procedure and the Tukey HSD (Honestly Significant
Difference) test.

Paul Gardner





Re: 50 random envelopes/people

2000-02-16 Thread Paul Gardner

Pez Boy wrote:
 
 My statistics textbook mentioned the following problem: "A secretary addresses
 50 different letters and envelopes to 50 different people, but the letters are
 randomly mixed before being put into envelopes. What is the probability that at
 least one letter gets into the correct envelope?" It said that the probability
 was 0.632 but simply said that the solution was "way beyond" the scope of the
 text and did not give a place to look for further information. Could someone
 explain how to find this result or point me to a web site that explains it?
 
A wise Indian mathematician who visited our faculty a few years ago gave
me some very useful advice, a good principle for problem-solving and
lateral thinking.  "To solve a complicated problem, try solving a simple
problem first."

So, a hint:  Try calculating the probability that every letter finishes
up in a WRONG envelope, by finding (a) the number of ways this can
happen and (b) the total number of different ways all the letters could
be placed in all the envelopes.  Call this probability pw.

(1 - pw) must therefore be the probability of every other combination,
i.e. the probability that at least one letter is in its correct
envelope.
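
For anyone who wants to check their answer after working through the
hint, here is a Python sketch. It assumes the standard recurrence for the
number of all-wrong arrangements (derangements),
D(n) = (n-1)(D(n-1) + D(n-2)) with D(0) = 1 and D(1) = 0; the function
names are invented for illustration.

```python
from fractions import Fraction
from math import e, factorial

def derangements(n):
    """Count the arrangements of n letters with every letter in a wrong envelope."""
    if n == 0:
        return 1
    d_prev, d_curr = 1, 0  # D(0) = 1, D(1) = 0
    for i in range(2, n + 1):
        # D(i) = (i - 1) * (D(i - 1) + D(i - 2))
        d_prev, d_curr = d_curr, (i - 1) * (d_curr + d_prev)
    return d_curr

def prob_at_least_one_match(n):
    """1 - pw, where pw = (all-wrong arrangements) / (all arrangements)."""
    p_all_wrong = Fraction(derangements(n), factorial(n))
    return float(1 - p_all_wrong)

# Start with the simple problems, then the 50-letter case.
for n in (1, 2, 3, 4, 50):
    print(n, round(prob_at_least_one_match(n), 3))

print(round(1 - 1 / e, 3))  # the limiting value for large n
```

The probability settles very quickly as n grows, which is why the
textbook's 0.632 for 50 letters matches 1 - 1/e to three decimals.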

(Dr) Paul Gardner,
Director,
Research Degrees





Re: Scale Reliability

1999-12-16 Thread Paul Gardner

[EMAIL PROTECTED] wrote:
 
 The fact that the shorter scale has low internal consistency doesn't necessarily
 mean that the 4 items in question are not unidimensional.  It may just be that
 the measurement error is large relative to their covariance.  Given that the
 four items in question are drawn from a scale with established internal
 consistency, I'd suspect they probably are measuring the same thing - only not
 measuring it very well.
 
 purnima.
 
No, there is a flaw in the logic here.  If a scale has "established
internal consistency" (usually based on a high Cronbach alpha value), a
researcher CANNOT conclude that the items are "measuring the same
thing".  All it takes for alpha to be high is that each item correlates
well with at least some other items, but not necessarily with all of
them.  Alpha is a good indicator of the relative freedom of the items in
a scale from random measurement error.  It is NOT a sound indicator of
unidimensionality.  The misconception that it is such an indicator is
widespread.

Paul Gardner

