Re: What is standard deviation exactly?
Glen Barnett wrote:

In article [EMAIL PROTECTED], Neil [EMAIL PROTECTED] wrote: I was wondering what the standard deviation means exactly? I've seen the equation, etc., but I don't really understand what st dev is and what it is for.

I'm going to take a different tack to the one Herman has taken. If I tell you what you already know, my apologies. I assume you're talking about sample standard deviations, not population standard deviations (though the interpretation of what it represents is similar).

Standard deviation is an attempt to measure how "spread out" the values are - a big standard deviation means more spread out, a small standard deviation means closer together. A standard deviation of zero means all the values are the same. Note that the n-denominator standard deviation can't exceed half the range (largest value minus smallest value); the n-1 version can exceed half the range slightly in small samples.

Standard deviation is measured in the original units. For example, if you record a set of lengths in mm, their standard deviation is in mm.

There is a huge variety of reasonable measures of spread; standard deviation is the most widely used. You will get more of a feel for the standard deviation if you compare what it does with some other measures of spread. For example, another common measure is the mean deviation - the average distance of the observations from the mean. By contrast, standard deviation is the root-mean-square distance from the mean (as you can see from the formula**).

** At least the n-denominator (maximum likelihood) version is the root-mean-square deviation; the n-1 denominator version is just a constant times that.

The squaring puts relatively more weight on the larger deviations, and less weight on the smaller deviations, than the mean deviation does, but it is still a kind of weighted average of the deviations from the mean.
Here's a quick (tiny) example to help illustrate some of the points (I am using the n-1 version of the standard deviation here):

Sample 1: 4, 6, 7, 7, 8, 10. Mean = 7, mean deviation = 4/3 = 1.333..., std deviation = 2
Sample 2: 1, 5, 7, 7, 9, 13. Mean = 7, mean deviation = 8/3 = 2.666..., std deviation = 4

Note that Sample 2's values are more 'spread out' than Sample 1's, and both measures of spread tell us that.

Standard deviation is used for a variety of reasons - including the fact that it is the square root of the variance, and variance has some nice properties, both in general and also particularly for normal r.v.'s - but s.d. is measured in the original units.

Glen

This is a useful summary; I'd just like to add one point to it. People sometimes ask: which measure of spread is "best"? Or: why use the standard deviation, when it seems more complicated than simpler statistics such as the mean deviation?

Various measures of spread are useful for different purposes, but the real strength of s.d. is that many other statistical concepts are built upon it. Thus s.d. underpins the notion of a standard (z) score; the z score underpins the definition of the Pearson product-moment correlation, and hence linear regression; s.d. squared is the variance, and this underpins the variance theorem, analysis of variance, the F-ratio, etc.

Thus it's a "big idea", a substantive concept in the structure of statistics, in a way that other measures of spread aren't. There are parallels to this in other branches of science and mathematics. Mass times velocity (momentum) is a useful concept because it enters into relationships with other concepts. So does (1/2)m v-squared (kinetic energy). But no one uses mass per unit velocity, or mass times the square root of velocity, or m v-cubed, because (as far as I know) these concepts don't enter into any relationships which are useful for describing aspects of the world.
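[Editor's note: the two tiny samples above can be checked with a short script. This is a minimal sketch using Python's standard library; the mean_deviation helper is written here for illustration, not a library function. statistics.stdev uses the n-1 denominator, matching the example.]

```python
import statistics

def mean_deviation(xs):
    """Average absolute distance of the observations from their mean."""
    m = statistics.mean(xs)
    return sum(abs(x - m) for x in xs) / len(xs)

sample1 = [4, 6, 7, 7, 8, 10]
sample2 = [1, 5, 7, 7, 9, 13]

for s in (sample1, sample2):
    # mean, mean deviation, and n-1 standard deviation for each sample
    print(statistics.mean(s), round(mean_deviation(s), 3), statistics.stdev(s))
# sample1: mean 7, mean deviation 1.333, s.d. 2
# sample2: mean 7, mean deviation 2.667, s.d. 4
```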
Paul Gardner begin:vcard n:Gardner;Dr Paul tel;cell:0412 275 623 tel;fax:Int + 61 3 9905 2779 (Faculty office) tel;home:Int + 61 3 9578 4724 tel;work:Int + 61 3 9905 2854 x-mozilla-html:FALSE adr:;; version:2.1 email;internet:[EMAIL PROTECTED] x-mozilla-cpt:;-29488 fn:Dr Paul Gardner, Reader in Education and Director, Research Degrees, Faculty of Education, Monash University, Vic. Australia 3800 end:vcard
Re: obsolete methods?
[EMAIL PROTECTED] wrote: I have been looking for resources on attitude scale construction. The methods I have been looking at are things like paired comparisons and successive intervals. The strange thing about finding descriptions of these methods is that the only book I can find in print is *Techniques of Attitude Scale Construction* by Edwards (1957?). In fact, it seems that nearly all the standard references on these statistical methods were published in the fifties or before. Does anyone know what happened? Did these methods go out of style because they were superseded?

Summated rating scales (e.g. Likert scales with Strongly Agree/Agree/Not Sure/Disagree/Strongly Disagree responses to opinion statements), analysed by conventional item analysis and factor analysis methods, remain in common use for situations where "objective" data are to be obtained from large samples. However, such scales are limited in that they cannot be used to probe individuals' meanings, perceptions, personal experiences etc. I advise my students to use a combination of methods in order to get various lines of evidence about people's attitudes. For example, one of my students, a nurse educator, developed a four-scale Likert instrument on nursing students' attitudes to the elderly, and also used interviews and participant observation during field placement.

References in the area didn't stop with Edwards 1957! Some later texts that I can recommend are Robert DeVellis, Scale Development: Theory and Applications (Sage, 1991, strong on the psychometrics); Robert Gable, Instrument Development in the Affective Domain (Kluwer-Nijhoff, 1986, good on both psychometrics and scale development methods); and the revised edition of the classic A.N. Oppenheim, Questionnaire Design, Interviewing and Attitude Measurement (Pinter, 1992, emphasis on various qualitative and quantitative data-gathering techniques, not on the psychometrics). Hope this is helpful, Paul Gardner
Regards, Tom

Sent via Deja.com http://www.deja.com/ Before you buy.

=== This list is open to everyone. Occasionally, less thoughtful people send inappropriate messages. Please DO NOT COMPLAIN TO THE POSTMASTER about these messages because the postmaster has no way of controlling them, and excessive complaints will result in termination of the list. For information about this list, including information about the problem of inappropriate messages and information about how to unsubscribe, please see the web page at http://jse.stat.ncsu.edu/ ===
Re: Number of factors to be extracted
I would add another criterion, which is qualitative, and therefore not reducible to a quantitative rule:

3. Use your professional judgement. Does the pattern of factor loadings make sense? For example, if the variables are item scores on a multi-dimensional instrument, can you see a meaningful connection among the items which load highly on a particular factor? The "eigenvalue greater than 1" criterion is very arbitrary, and in interpreting a factor analysis matrix of item scores, I often discard numerous factors which meet the eigenvalue criterion but fail to make any sense when I apply my judgement to the pattern of loadings. I can reduce all this to a single maxim: factor analysis is an art as well as a science.

Paul Gardner

Alex Yu wrote: There are several rules. The most popular two are:

1. Kaiser criterion: retain the factor when its eigenvalue is larger than 1.
2. Scree plot: basically, it is eyeballing. Plot the number of factors against the eigenvalues and see where the sharp turn is.

Hope it helps. Chong-ho (Alex) Yu, Ph.D., CNE, MCSE

On Tue, 2 May 2000 [EMAIL PROTECTED] wrote: Would any of you know a rule of thumb for selecting the proper (or optimal) number of factors to be extracted from a factor analysis? Also, how many variables can there be in such a factor (is two variables in one factor not enough?).
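[Editor's note: the quantitative Kaiser criterion is easy to sketch in code. This is an illustrative example only: the 5x5 inter-item correlation matrix R below is invented for the demonstration, and whether the retained factors make sense still requires the judgement argued for above.]

```python
import numpy as np

# Invented correlation matrix: items 1-3 hang together strongly,
# items 4-5 form a weaker second cluster.
R = np.array([
    [1.0, 0.6, 0.5, 0.1, 0.1],
    [0.6, 1.0, 0.5, 0.1, 0.1],
    [0.5, 0.5, 1.0, 0.1, 0.1],
    [0.1, 0.1, 0.1, 1.0, 0.4],
    [0.1, 0.1, 0.1, 0.4, 1.0],
])

eigenvalues = np.linalg.eigvalsh(R)[::-1]   # sorted largest first
n_kaiser = int((eigenvalues > 1).sum())     # Kaiser: retain eigenvalues > 1
print(eigenvalues.round(2), "-> retain", n_kaiser, "factors by the Kaiser rule")
```

(A scree plot is simply these eigenvalues plotted against their rank, eyeballed for the sharp turn.)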
Re: split half reliability
Paul R Swank wrote: I disagree with the statement that the split-half reliability coefficient is of no use anymore. Coefficient alpha, while being an excellent estimator of reliability, does have one rather stringent requirement: the items must be homogeneous. This is not always the case with many kinds of scales, nor should it be. In many cases homogeneity of item content may lead to reduced validity if the construct is too narrowly defined. Screening measures often have this problem. They need to be short but they also need to be broad in scope. Internal consistency for such scales would suffer, but a split-half procedure, which is much less sensitive to item homogeneity, would fit the bill nicely.

I have four responses to this:

1. Split-half requires the items to be divided into two "equal" halves. How is this to be done? Odd/even? First half/second half? Randomly? Cronbach's alpha does not depend on this arbitrary division into halves.

2. Stanley and Hopkins (1972) demonstrated that Cronbach's alpha is essentially equivalent to the mean of all possible split-half reliability estimates. DeVellis (1991) demonstrates that if the items in a scale have similar variances (a condition frequently met in well-designed scales), the value of alpha (called standardised alpha) is algebraically equivalent to the Spearman-Brown formula for estimating split-half reliability. In other words, there is no great difference conceptually between the two.

3. Many writers use the term 'homogeneity' to bolster arguments in discussions of reliability and validity. In a paper I have completed recently, which is currently under review for publication, I show that the term has about six different meanings in the literature. Whenever I read the word now, I ask: what exactly does the writer mean by homogeneity here?

4. If, by homogeneity, you mean that all the items are measuring a similar construct, i.e. the item scores all inter-correlate with each other because they are indicators of a unidimensional construct, then the assertion that Cronbach's alpha depends on this being the case is demonstrably untrue. Cronbach's alpha will be high as long as every item in a scale correlates well with at least some other items, but not necessarily with all of them. Homogeneity is not a "stringent requirement" for a high Cronbach alpha level at all. Cronbach's alpha is simply a measure of reliability; it is not an indicator of unidimensionality, a point widely misunderstood in the literature.

Paul Gardner
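[Editor's note: Cronbach's alpha can be computed directly from its usual definition, as a rough companion to the discussion above. A minimal sketch; the item scores are invented for the example, and the cronbach_alpha helper is not a library function.]

```python
import statistics

def cronbach_alpha(items):
    """Cronbach's alpha from raw scores.
    items: one inner list of scores per item (same respondents in each).
    alpha = k/(k-1) * (1 - sum of item variances / variance of total score)"""
    k = len(items)
    n = len(items[0])
    totals = [sum(item[i] for item in items) for i in range(n)]
    sum_item_vars = sum(statistics.variance(item) for item in items)
    return k / (k - 1) * (1 - sum_item_vars / statistics.variance(totals))

# Invented scores: three items, five respondents.
data = [[1, 2, 3, 4, 5],
        [2, 2, 3, 4, 5],
        [1, 3, 3, 4, 4]]
print(round(cronbach_alpha(data), 3))  # 0.955
```

(With three identical items, alpha is exactly 1, as expected for a perfectly consistent scale.)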
Re: Split half coefficient?
busker wrote: I'm completely new to statistics but am putting together a customer satisfaction survey, thanks to which I am daily becoming fascinated by my whole new world of Means and Medians and Variabilities and Variances, and so forth. I am told that certain "duplicate" questions are sometimes put in to test the consistency/'truthfulness' of a respondent's answers, and that these 'check' questions are called split half coefficients (or thereabouts). But I find no reference in the textbooks I'm poring over. Can anyone enlighten me? I hope I've explained myself correctly and, if not, that I can be pointed on the right track: I know how vital it is to have the correct terms in this business.

Chris: The split-half coefficient was invented in the early years of the 20th century as a way of checking the internal consistency of a measurement scale. One takes half the items in a scale (say the odd-numbered items) and scores their total, and then correlates this with the score on the other half of the scale. An adjustment is then made to correct for the shortened length of the scale caused by taking only half the items. Nobody bothers with this any more; the procedure has been superseded by the more convenient Cronbach's alpha coefficient.

Neither of these statistics is directly concerned with the issue you raise, namely that of having repeated items in order to check whether an individual respondent is answering the same question consistently. You won't find these concepts discussed in books on basic statistics. Look instead for books on educational and psychological measurement. Your local university library should be able to help.
Paul Gardner
Re: What to study next
I agree with Rich Ulrich's comments, but bear in mind I was only answering the original query, which was for a good text. I find Diekhoff useful as an additional reference in my introductory stats course (it's not the text I use, which is Runyon, Haber, Pittenger and Coleman). Diekhoff's actual decision trees occupy less than four pages of a 23-page chapter. The decision trees are elaborated with extensive discussion of the purpose of an analysis, the nature of the research question, the number of variables involved, the kind of data collected... Very few statistics texts contain this kind of material.

As I teach my course (one semester, 13 3-hour sessions) I continually link new statistical tests to the Diekhoff decision tree. I also give homework and run a workshop in which students are given a wide variety of research scenarios (sample data, explicit or implicit research questions) and are asked to consider which statistical test or tests would be appropriate. Obviously this doesn't instantly turn all my students into expert designers and statisticians, but they certainly display good competence in getting the majority of tasks right. This is in marked contrast to students who learn stats simply by doing exercises from textbooks and have never been asked to decide on appropriate procedures when given an unfamiliar scenario.

Since my course is only an introduction, we cover only a limited number of statistical procedures, and obviously there are dozens or hundreds of others. But I think the procedure I use encourages the students to read and reflect on research situations, and frame the question, "How might this research question be answered?"

Paul Gardner

Rich Ulrich wrote: [rearranging this note, to put the posts into order, earliest first.
] At 07:44 AM 02/29/2000 -0800, Ward Soper wrote: After one learns to do the textbook problems, as in Freund's Mathematical Statistics, where should one turn to learn what tests to use in various situations and how to design studies? Can anyone suggest some good texts or other resources?

=== dennis roberts wrote: william trochim's research methods knowledge base is a good place to start ... to get ideas http://trochim.human.cornell.edu/kb/

On 29 Feb 2000 17:48:07 -0800, [EMAIL PROTECTED] (Paul Gardner) wrote: George M. Diekhoff, Basic Statistics for the Social and Behavioral Sciences, Prentice Hall, 1996, has an excellent chapter at the end which presents a decision tree. This summarises the various statistical procedures in the text and helps learners to determine which statistics are appropriate under various conditions.

= - Pardon; I haven't seen Diekhoff, but 'decision tree' sounds too cheap. There is certainly a place for a mechanical framework of tests and procedures; but I read the original question as less particular than that, and more general ("how to design studies"); and the first answer that way, too. An enormous decision tree may give the right technical answer to 100% of the narrow questions, but -- since it takes knowledge to frame the right question -- that will be a misleading answer, I would guess, for at least 1/3 of the naive questioners. People just can't tell you what they never thought to ask, concerning 'reliability' (of various kinds); 'dependence' (ditto); 'shape of the distribution'; 'outliers'; 'What numbers are meaningful when we use this measurement?'; or 'What transformations might be useful?' (I am still answering the big question: why can't a computer give us all the stats advice that we need? So far, no one has programmed a computer with 10,000 well-classified examples.) If they have not learned the whole statistical vocabulary, they won't be able to argue persuasively that their own answers are correct.
And you can't thoroughly learn the vocabulary until you are expert enough to know something about all the available techniques. In addition to the statistics, there are particular problems in each area with their own sorts of statistical designs.

To learn what to do in various situations, I think you have to *read*; you have to be exposed to a large number of various situations. You have to read some good examples, and you have to read criticisms which include examples that were not so good.

-- Rich Ulrich, [EMAIL PROTECTED] http://www.pitt.edu/~wpilib/index.html
Bonferroni
Bonferroni, a technique for dealing with the problem of the increasing chance of making a Type I error when multiple comparisons are made, works by adjusting the alpha level. I'll write alpha* for the adjusted level.

Step 1. Find n, the number of possible comparisons when the means of k groups are to be compared: n = k(k-1)/2. For example, if 4 groups are to be compared, n = 6.

Step 2. Find alpha*, the adjusted alpha level: alpha* = alpha/n. For example, if alpha = .05 and n = 6, alpha* = .0083.

Then make the multiple comparisons using the t-test, looking up tables of significance at the alpha* level (you may need to interpolate or approximate); a comparison significant at the alpha* level can then be claimed significant at the overall alpha level. In this example, .0083 is approximately equal to .01.

Other techniques for dealing with the multiple comparisons problem are the Scheffe procedure and the Tukey HSD (Honestly Significant Difference) test.

Paul Gardner
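[Editor's note: the two steps amount to a one-line calculation. A minimal sketch; the function name is invented for illustration, not a library routine.]

```python
def bonferroni(k, alpha=0.05):
    """Number of pairwise comparisons among k group means, and the
    Bonferroni-adjusted per-comparison significance level alpha* = alpha/n."""
    n = k * (k - 1) // 2          # Step 1: n = k(k-1)/2 pairwise comparisons
    return n, alpha / n           # Step 2: adjusted alpha level

n, alpha_star = bonferroni(4)
print(n, round(alpha_star, 4))    # 6 comparisons, alpha* ~= 0.0083
```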
Re: 50 random envelopes/people
Pez Boy wrote: My statistics textbook mentioned the following problem: "A secretary addresses 50 different letters and envelopes to 50 different people, but the letters are randomly mixed before being put into envelopes. What is the probability that at least one letter gets into the correct envelope?" It said that the probability was 0.632 but simply said that the solution was "way beyond" the scope of the text and did not give a place to look for further information. Could someone explain how to find this result or point me to a web site that explains it?

A wise Indian mathematician who visited our faculty a few years ago gave me some very useful advice, a good principle for problem-solving and lateral thinking: "To solve a complicated problem, try solving a simple problem first."

So, a hint: Try calculating the probability that every letter finishes up in a WRONG envelope, by finding (a) the number of ways this can happen and (b) the total number of different ways all the letters could be placed in all the envelopes. Call this probability pw. (1 - pw) must therefore be the probability of every other combination, i.e. the probability that at least one letter is in its correct envelope.

(Dr) Paul Gardner, Director, Research Degrees
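[Editor's note: for readers who want to verify the textbook's 0.632 after working through the hint - the count in (a) is the number of derangements D(n), permutations with no letter in its own envelope, which satisfies the recurrence D(n) = (n-1)(D(n-1) + D(n-2)); the count in (b) is n!. A sketch:]

```python
from math import factorial

def p_at_least_one_match(n):
    """P(at least one of n letters lands in its own envelope) = 1 - D(n)/n!,
    where D(i) counts derangements via D(i) = (i-1) * (D(i-1) + D(i-2))."""
    d = [1, 0]                           # D(0) = 1, D(1) = 0
    for i in range(2, n + 1):
        d.append((i - 1) * (d[i - 1] + d[i - 2]))
    pw = d[n] / factorial(n)             # P(every letter in a wrong envelope)
    return 1 - pw

print(round(p_at_least_one_match(50), 3))  # 0.632, essentially 1 - 1/e
```

(The answer converges to 1 - 1/e extremely quickly, which is why 50 letters give the same three decimal places as 10 would.)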
Re: Scale Reliability
[EMAIL PROTECTED] wrote: The fact that the shorter scale has low internal consistency doesn't necessarily mean that the 4 items in question are not unidimensional. It may just be that the measurement error is large relative to their covariance. Given that the four items in question are drawn from a scale with established internal consistency, I'd suspect they probably are measuring the same thing - only not measuring it very well. Purnima.

No, there is a flaw in the logic here. If a scale has "established internal consistency" (usually based on a high Cronbach alpha value), a researcher CANNOT conclude that the items are "measuring the same thing". All it takes for alpha to be high is that each item correlates well with at least some other items, but not necessarily with all of them. Alpha is a good indicator of the relative freedom of the items in a scale from random measurement error. It is NOT a sound indicator of unidimensionality. The misconception that it is such an indicator is widespread.

Paul Gardner
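[Editor's note: the claim that a high alpha need not mean unidimensionality can be illustrated numerically. In this sketch (the covariance matrix is invented), ten items form two clusters of five that are completely uncorrelated with each other, yet alpha still comes out around 0.85.]

```python
def alpha_from_cov(C):
    """Cronbach's alpha from a k x k item covariance matrix:
    alpha = k/(k-1) * (1 - trace(C) / sum of all entries of C)."""
    k = len(C)
    total = sum(sum(row) for row in C)
    trace = sum(C[i][i] for i in range(k))
    return k / (k - 1) * (1 - trace / total)

# Ten items in two clusters of five: correlation 0.8 within a cluster,
# 0.0 between clusters (unit item variances, so covariance = correlation).
k, half = 10, 5
C = [[1.0 if i == j else (0.8 if (i < half) == (j < half) else 0.0)
      for j in range(k)]
     for i in range(k)]

print(round(alpha_from_cov(C), 3))  # 0.847 -- high alpha, two unrelated dimensions
```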