Re: [tips] What's normal about the normal curve?
On Mon, 20 Jun 2011 22:45:03 -0500, Jim Clark wrote:
>Hi
>Thanks for the quotes from Hays (and for correcting my
>use of "observations").

You're welcome.

>I do wonder, however, about Hays' use of "error". Is it
>really the case that someone with an IQ of 120 deviates
>from the norm of 100 because of "error" (which I normally
>interpret to mean random influences) rather than the cumulation
>of multiple factors that systematically (causally?) contribute
>to a higher IQ?

I think that Hays is just presenting the traditional line about the measurement of psychological attributes, which makes an analogy to physical measurement. One classic example of the normal distribution, or distribution of errors, is Gauss' analysis of astronomical data. The positions of distant heavenly bodies like stars are stable in location, but the same astronomer looking at a star across many nights will record locations that differ from night to night. Often the errors are small but sometimes the errors are large. What causes these errors? Some of it may be due to human error, some to atmospheric conditions (e.g., the amount of humidity or water vapor in the air, the amount of particulates and light pollution in the sky, etc.). But as long as the conditions are temporary, they can be treated as random effects. So, the position of the star is best modeled by the classical test theory model

Y = True + Error

What you are suggesting is that the measurement of IQ is better modelled by something like

Y = T + SystematicFactors + Error

I'll have more to say about this below (and a small syntax sketch of this model appears at the end of this post).

Galton did similar types of analyses for his "biometric" measurements; an example of one such analysis is provided on the Galton Institute's page, which can be accessed here:
http://www.galtoninstitute.org.uk/Newsletters/GINL9912/francis_galton.htm

Today, I have problems presenting IQ or standardized test results according to the traditional "law of errors/temporary systematic effects" because such variables do not exist in a vacuum. A person's IQ or intelligence score will depend upon our "measurement model", but it is also true that over the course of one's life systematic factors have operated either to suppress one's intelligence or to facilitate it -- factors such as gender, race, socioeconomic status, one's level of education, the education levels of one's parents, and so on will have had an influence. A better model for the IQ or intelligence score of any given person may be:

Y = T + GenderEffect + SESEffect + EducSelfEffect + EducParentEffect + ... + Error

Some of these issues are addressed in the Handbook of Psychological Assessment, which is available on books.google.com; here is a link to one section on predicting what a person's WAIS-III score would be after one has taken into account demographic and other variables; see:
http://tinyurl.com/GooglePsychAssess

Quoting from page 123:
|In the WAIS-III - WMS-III Technical Manual, it was emphasized that
|good practice means that all scores should be evaluated in light of
|someone's life history, socioeconomic status, and medical and
|psychosocial history.

>I wonder if Hays is using "error" as a substitute for something more
>general than just random variation?

My reading of Hays is that he is using the traditional descriptions of normal curves (systematic factors would be covered in the sections on ANOVA and regression). YMMV.

>When I get a chance I will modify the simulation to incorporate correlation
>between random influences (Hays' error) to see what impact it has on
>the final distribution.
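For anyone who wants to see what the systematic-factors model looks like in practice, here is a minimal SPSS sketch in the spirit of Jim's earlier simulation. Everything in it is a hypothetical illustration value: the two factors, their weights, and the error SD are invented, not estimates of real demographic effects.

* Sketch of Y = T + systematic factors + error.
* gender, ses, and all weights are hypothetical illustration values.
input program.
loop o = 1 to 10000.
end case.
end loop.
end file.
end input program.
compute gender = rv.bernoulli(.5).
compute ses = rv.normal(0, 1).
compute error = rv.normal(0, 5).
compute iq = 100 + 3*gender + 4*ses + error.
frequencies iq /format = notable /histogram = normal.

Because each systematic term is still just another additive influence, the histogram will still look normal; the difference from the pure error model is interpretive, since the deviations from 100 are no longer all "error".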
-Mike Palij
New York University
m...@nyu.edu
Re: [tips] What's normal about the normal curve?
Hi

Thanks for the quotes from Hays (and for correcting my use of "observations"). I do wonder, however, about Hays' use of "error". Is it really the case that someone with an IQ of 120 deviates from the norm of 100 because of "error" (which I normally interpret to mean random influences) rather than the cumulation of multiple factors that systematically (causally?) contribute to a higher IQ? I wonder if Hays is using "error" as a substitute for something more general than just random variation?

When I get a chance I will modify the simulation to incorporate correlation between random influences (Hays' error) to see what impact it has on the final distribution.

Take care
Jim

James M. Clark
Professor of Psychology
204-786-9757
204-774-4134 Fax
j.cl...@uwinnipeg.ca

>>> "Mike Palij" 21-Jun-11 1:53 AM >>>
On Mon, 20 Jun 2011 09:25:58 -0700, Jim Clark wrote:
>>Mike Palij wrote:
>>(4) Somebody should mention that what John Kulig and Jim Clark
>>allude to below relates to the "central limit theorem". The Wikipedia
>>entry on the normal distribution also covers this topic; see:
>>http://en.wikipedia.org/wiki/Normal_distribution#Central_limit_theorem
>>and there is a separate entry on it (yadda-yadda) as well:
>>http://en.wikipedia.org/wiki/Central_limit_theorem
>JC:
>The central limit theorem, which concerns the sampling distribution of means based
>on n observations, is one specific application of the fact that sums (means if
>divided by n) of scores increasingly approximate a normal distribution, but as
>John pointed out any score that is dependent on multiple independent
>contributing factors will increasingly approximate a normal distribution.

Actually, I prefer how Hays (1973) describes it on pages 309-310. Paraphrasing his presentation:

Consider that one has a random variable Y which is the sum of two independent parts. This is represented in the equation:

(1) Y = TrueScore + Error

TrueScore is a constant, but Error is a random variable that represents the additive effects of independent influences or "errors". One can represent Error in the following equation:

Error = g(E1 + E2 + E3 + ... + EN)

where g is a constant that reflects the "weight" of the error component, E1 is the contribution of some factor designated Factor01, E2 is the contribution of some factor designated Factor02, and so on for each of the relevant error factors. If all of the Es are dichotomous variables, then Error can be considered to reflect the number of "successes" in N independent trials (i.e., does the factor make a contribution or does it fail to make a contribution). If N is very large, then the distribution of Error must approach a normal distribution. If each factor has a probability of making a contribution of p = .50, then the mean [i.e., E(Error)] will be zero. That is, E(Error) = 0.00 because positive and negative errors cancel each other out in the long run. If these conditions are met, then the mean of the random variable Y will be the true score:

E(Y) = E(T) + E(Error) = T

In the example that Jim Clark provides below, one can think of IQ as being synonymous with Y. NOTE: making the errors dichotomous simplifies matters, but the errors can be defined in different ways.

>Hence, if IQs depend on multiple discrete observations the IQs of individuals
>(not means of individual IQs) will be normally distributed.

I think Jim might mean "multiple discrete influences" instead of observations.
With the definitions given above, a single IQ score will represent the sum of influences (this is an assumption; errors can have multiplicative relationships) and the distribution of IQs or any standardized test score will form a normal distribution if the number of influences is large and other conditions are met. Quoting Hays on this issue:

|...[T]he same kind of reasoning is sometimes used to explain why distributions
|of natural traits, such as height, weight, and size of head, follow a more or
|less normal rule. Here, the mean of some population is thought of as the
|"true" value, or the "norm". However, associated with each individual
|is some departure from the norm, or error, representing the culmination
|of all of the billions of chance factors that operate on him, quite independently
|of other individuals. Then by regarding these factors as generating a
|binomial distribution, we can deduce that the whole population should
|take on a form like the hypothetical normal distribution. However, this
|is only a theory about how errors might operate and there is no reason
|at all why errors must behave in the simple additive way assumed here.
|If they do not, then the distribution need not be normal in form at all.
(Hays, 1973, p. 310)

>The same holds for
>any variable (score) with multiple contributing factors. In the simulation,
>for example, the central limit theorem would strictly apply if individuals had
>dichotomous scores (e.g., dying or not, passing or not, ...) and the
>distribution represented the sampling distribution of the dichotomous
>observations for n individuals.
Re: [tips] What's normal about the normal curve?
On Mon, 20 Jun 2011 09:25:58 -0700, Jim Clark wrote:
>>Mike Palij wrote:
>>(4) Somebody should mention that what John Kulig and Jim Clark
>>allude to below relates to the "central limit theorem". The Wikipedia
>>entry on the normal distribution also covers this topic; see:
>>http://en.wikipedia.org/wiki/Normal_distribution#Central_limit_theorem
>>and there is a separate entry on it (yadda-yadda) as well:
>>http://en.wikipedia.org/wiki/Central_limit_theorem
>JC:
>The central limit theorem, which concerns the sampling distribution of means based
>on n observations, is one specific application of the fact that sums (means if
>divided by n) of scores increasingly approximate a normal distribution, but as
>John pointed out any score that is dependent on multiple independent
>contributing factors will increasingly approximate a normal distribution.

Actually, I prefer how Hays (1973) describes it on pages 309-310. Paraphrasing his presentation:

Consider that one has a random variable Y which is the sum of two independent parts. This is represented in the equation:

(1) Y = TrueScore + Error

TrueScore is a constant, but Error is a random variable that represents the additive effects of independent influences or "errors". One can represent Error in the following equation:

Error = g(E1 + E2 + E3 + ... + EN)

where g is a constant that reflects the "weight" of the error component, E1 is the contribution of some factor designated Factor01, E2 is the contribution of some factor designated Factor02, and so on for each of the relevant error factors. If all of the Es are dichotomous variables, then Error can be considered to reflect the number of "successes" in N independent trials (i.e., does the factor make a contribution or does it fail to make a contribution). If N is very large, then the distribution of Error must approach a normal distribution. If each factor has a probability of making a contribution of p = .50, then the mean [i.e., E(Error)] will be zero. That is, E(Error) = 0.00 because positive and negative errors cancel each other out in the long run. If these conditions are met, then the mean of the random variable Y will be the true score:

E(Y) = E(T) + E(Error) = T

In the example that Jim Clark provides below, one can think of IQ as being synonymous with Y. NOTE: making the errors dichotomous simplifies matters, but the errors can be defined in different ways. (A syntax sketch of this dichotomous-error setup appears after this post.)

>Hence, if IQs depend on multiple discrete observations the IQs of individuals
>(not means of individual IQs) will be normally distributed.

I think Jim might mean "multiple discrete influences" instead of observations.

With the definitions given above, a single IQ score will represent the sum of influences (this is an assumption; errors can have multiplicative relationships) and the distribution of IQs or any standardized test score will form a normal distribution if the number of influences is large and other conditions are met. Quoting Hays on this issue:

|...[T]he same kind of reasoning is sometimes used to explain why distributions
|of natural traits, such as height, weight, and size of head, follow a more or
|less normal rule. Here, the mean of some population is thought of as the
|"true" value, or the "norm". However, associated with each individual
|is some departure from the norm, or error, representing the culmination
|of all of the billions of chance factors that operate on him, quite independently
|of other individuals.
|Then by regarding these factors as generating a
|binomial distribution, we can deduce that the whole population should
|take on a form like the hypothetical normal distribution. However, this
|is only a theory about how errors might operate and there is no reason
|at all why errors must behave in the simple additive way assumed here.
|If they do not, then the distribution need not be normal in form at all.
(Hays, 1973, p. 310)

>The same holds for
>any variable (score) with multiple contributing factors. In the simulation,
>for example, the central limit theorem would strictly apply if individuals had
>dichotomous scores (e.g., dying or not, passing or not, ...) and the
>distribution represented the sampling distribution of the dichotomous
>observations for n individuals, either as sums as in the simulation or as means
>(equivalently proportions for dichotomous 0-1 scores) if divided by n. If,
>however, the dichotomous 0-1 numbers represent some underlying contributing
>factor to the individual scores represented by the sums (or means), then the
>results represent individual scores, which if averaged together for samples
>would have a sampling distribution of the means of the scores for n
>individuals.
>
>Perhaps just a subtle and esoteric distinction, but isn't that what academics
>specialize in?

I often wonder what academics specialize in.

-Mike Palij
New York University
m...@nyu.edu
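As promised above, a minimal SPSS sketch of Hays' dichotomous-error model. The number of error factors (48), the weight g = .5, and the true score of 100 are arbitrary illustration values, not anything from Hays.

* Sketch of Y = TrueScore + g(E1 + E2 + ... + EN) with dichotomous errors.
* N = 48, g = .5, and TrueScore = 100 are hypothetical illustration values.
input program.
loop o = 1 to 5000.
end case.
end loop.
end file.
end input program.
* Each factor contributes +1 or -1 with p = .50, so E(Error) = 0
* and E(Y) = TrueScore in the long run.
compute error = 0.
loop #k = 1 to 48.
compute error = error + 2*rv.bernoulli(.5) - 1.
end loop.
compute y = 100 + .5*error.
frequencies y /format = notable /histogram = normal.

The histogram should pile up symmetrically around 100. If the errors are made correlated or multiplicative (as in Jim's proposed modification), the shape need not stay normal, which is exactly Hays' caveat.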
Re: [tips] What's normal about the normal curve?
Hi

>>> "Mike Palij" <m...@nyu.edu> 20-Jun-11 4:42 PM >>>
(4) Somebody should mention that what John Kulig and Jim Clark allude to below relates to the "central limit theorem". The Wikipedia entry on the normal distribution also covers this topic; see:
http://en.wikipedia.org/wiki/Normal_distribution#Central_limit_theorem
and there is a separate entry on it (yadda-yadda) as well:
http://en.wikipedia.org/wiki/Central_limit_theorem

JC:
The central limit theorem, which concerns the sampling distribution of means based on n observations, is one specific application of the fact that sums (means if divided by n) of scores increasingly approximate a normal distribution, but as John pointed out any score that is dependent on multiple independent contributing factors will increasingly approximate a normal distribution. Hence, if IQs depend on multiple discrete observations the IQs of individuals (not means of individual IQs) will be normally distributed. The same holds for any variable (score) with multiple contributing factors.

In the simulation, for example, the central limit theorem would strictly apply if individuals had dichotomous scores (e.g., dying or not, passing or not, ...) and the distribution represented the sampling distribution of the dichotomous observations for n individuals, either as sums as in the simulation or as means (equivalently proportions for dichotomous 0-1 scores) if divided by n. If, however, the dichotomous 0-1 numbers represent some underlying contributing factor to the individual scores represented by the sums (or means), then the results represent individual scores, which if averaged together for samples would have a sampling distribution of the means of the scores for n individuals.

Perhaps just a subtle and esoteric distinction, but isn't that what academics specialize in?

Take care
Jim

James M. Clark
Professor of Psychology
204-786-9757
204-774-4134 Fax
j.cl...@uwinnipeg.ca
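Jim's two readings of the same 0/1 numbers can be made concrete in syntax. In this sketch (the group size n = 25 and all counts are invented for illustration), each "person's" score is the sum of 25 dichotomous factors (an individual score), and those scores are then averaged over blocks of 25 persons to give a sampling distribution of means:

* Individual scores: each case's score is the sum of 25 dichotomous
* factors (rv.binom does the summing in one step).
input program.
loop person = 1 to 10000.
end case.
end loop.
end file.
end input program.
compute score = rv.binom(25, .5).
* Sampling distribution: mean score over blocks of n = 25 cases.
compute block = trunc((person - 1)/25) + 1.
aggregate outfile = * mode = addvariables
  /break = block
  /meanscore = mean(score).
frequencies score meanscore /format = notable /histogram = normal.

Both histograms are roughly normal, but meanscore is narrower by a factor of 1/sqrt(25) = 1/5; the individual-score distribution and the sampling distribution of the mean are different objects, which is the distinction Jim is drawing.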
Re: [tips] What's normal about the normal curve?
Just to dot some i's and cross some t's, a couple of points:

(1) As far as I can tell, it is unclear why the "normal" distribution is called the normal distribution. Wikipedia has an entry on the normal distribution (yadda-yadda) and, in the history section, it is pointed out that de Moivre in 1733 first suggested the possibility of what we call the normal distribution. It was Gauss (1809) and Laplace (1774) who would provide a basis for thinking about the normal distribution as a probability distribution, and this is why "really old school" statisticians refer to the normal distribution as a Gaussian distribution or Laplace-Gaussian distribution. See:
http://en.wikipedia.org/wiki/Normal_distribution#Development

(2) The Wikipedia entry also provides some of the history for the different names that have been used for this distribution. Quoting from the entry:

|Naming
|
|Since its introduction, the normal distribution has been known
|by many different names: the law of error, the law of facility of
|errors, Laplace's second law, Gaussian law, etc. By the end of
|the 19th century some authors[nb 6] had started using the name
|normal distribution, where the word "normal" was used as an
|adjective -- the term was derived from the fact that this distribution
|was seen as typical, common, normal. Peirce (one of those authors)
|once defined "normal" thus: "...the 'normal' is not the average
|(or any other kind of mean) of what actually occurs, but of what
|would, in the long run, occur under certain circumstances."[49]
|Around the turn of the 20th century Pearson popularized the term
|normal as a designation for this distribution.[50]
http://en.wikipedia.org/wiki/Normal_distribution#Naming

Why it should continue to be called the "normal distribution" is a puzzlement, especially since we know that there are many different probability distributions and these may describe behavioral and/or psychological variables better than the normal/Gaussian/whatever.

(3) I think that the use of the "normal distribution" in psychological testing, especially for IQ and other intelligence tests, promotes the use of the word "normal" because it is expected that "normal" intelligence is variable but extreme values in either direction are rare. So, 2 standard deviations above and below the mean in the normal distribution contain slightly more than 95% of the values in the distribution (thus "normal" or commonly expected), but values beyond this range have less than a 5% chance of occurring (thus "exceptional" or "special" or whatever euphemism we're using today to refer to people with intelligence way below "normal" or way above "normal"). Similar points can be made for other psychological variables (such as continuous measures of psychopathology). (A quick syntax check of these percentages appears after this list.)

(4) Somebody should mention that what John Kulig and Jim Clark allude to below relates to the "central limit theorem". The Wikipedia entry on the normal distribution also covers this topic; see:
http://en.wikipedia.org/wiki/Normal_distribution#Central_limit_theorem
and there is a separate entry on it (yadda-yadda) as well:
http://en.wikipedia.org/wiki/Central_limit_theorem

(5) I'm surprised no one asked Prof. Sylvester what he meant by "normal" or "normal distribution". It is possible that his definition is not consistent with common or "normal" definitions of those terms.
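The percentages in point (3) are easy to check with SPSS's CDF.NORMAL function. This is a minimal sketch assuming the usual IQ scaling of mean 100 and SD 15 (illustration values; some tests use SD 16):

* Proportion of a normal distribution within, and beyond, 2 SDs of
* the mean, on an assumed IQ scale with mean 100 and SD 15.
data list free / dummy.
begin data
1
end data.
compute p_within = cdf.normal(130, 100, 15) - cdf.normal(70, 100, 15).
compute p_beyond = 1 - p_within.
formats p_within p_beyond (f8.4).
list p_within p_beyond.

p_within comes out to about .9545 (slightly more than 95%, as stated above) and p_beyond to about .0455, split evenly between the two tails.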
-Mike Palij
New York University
m...@nyu.edu

--- Original Message ---
On Sun, 19 Jun 2011 21:12:41 -0700, Jim Clark wrote:

Hi

Here's a simple spss simulation of John's point about sums of multiple discrete factors being normally distributed. Just cut and paste into the spss syntax window.

input program.
loop o = 1 to 1000.
end case.
end loop.
end file.
end input program.
compute score = 0.
do repeat v = v1 to v25.
compute v = rv.uniform(0,1) > .5.
end repeat.
compute score1 = v1.
compute score2 = sum(v1 to v2).
compute score3 = sum(v1 to v3).
compute score4 = sum(v1 to v4).
compute score9 = sum(v1 to v9).
compute score16 = sum(v1 to v16).
compute score25 = sum(v1 to v25).
freq score1 to score25 /form = notable /hist norm.

Take care
Jim

James M. Clark
Professor of Psychology
204-786-9757
204-774-4134 Fax
j.cl...@uwinnipeg.ca

>>> John Kulig 20-Jun-11 4:38 AM >>>
Well, most things in psychology have numerous independent causes. Height is caused by at least several genes, your score on an exam is caused by answers to many individual questions, etc. The sum (i.e., adding) of independent events gets "normal" -- faster the more things you add (among other things). Example: Toss ONE coin, and record 0 for tails and 1 for heads. If this experiment - tossing one coin - is repeated long enough, you get about 50% heads and 50% tails (a flat or uniform distribution). Next toss two coins and record the total number of heads - it will be either 0, 1, or 2 heads. Repeat this experiment - two coins in one toss - and 25% of the time you'll get 0 heads (TT), 50% of the time you'll get one head (since 1 head can be either TH or HT), and 25% of the time you'll get 2 heads (HH).
Re: [tips] What's normal about the normal curve?
Hi

Here's a simple spss simulation of John's point about sums of multiple discrete factors being normally distributed. Just cut and paste into the spss syntax window.

input program.
loop o = 1 to 1000.
end case.
end loop.
end file.
end input program.
compute score = 0.
do repeat v = v1 to v25.
compute v = rv.uniform(0,1) > .5.
end repeat.
compute score1 = v1.
compute score2 = sum(v1 to v2).
compute score3 = sum(v1 to v3).
compute score4 = sum(v1 to v4).
compute score9 = sum(v1 to v9).
compute score16 = sum(v1 to v16).
compute score25 = sum(v1 to v25).
freq score1 to score25 /form = notable /hist norm.

Take care
Jim

James M. Clark
Professor of Psychology
204-786-9757
204-774-4134 Fax
j.cl...@uwinnipeg.ca

>>> John Kulig 20-Jun-11 4:38 AM >>>
Well, most things in psychology have numerous independent causes. Height is caused by at least several genes, your score on an exam is caused by answers to many individual questions, etc. The sum (i.e., adding) of independent events gets "normal" -- faster the more things you add (among other things).

Example: Toss ONE coin, and record 0 for tails and 1 for heads. If this experiment - tossing one coin - is repeated long enough, you get about 50% heads and 50% tails (a flat or uniform distribution). Next toss two coins and record the total number of heads - it will be either 0, 1, or 2 heads. Repeat this experiment - two coins in one toss - and 25% of the time you'll get 0 heads (TT), 50% of the time you'll get one head (since 1 head can be either TH or HT), and 25% of the time you'll get 2 heads (HH). With 0, 1, and 2 heads on the X axis, it's not exactly a normal distribution but it is peaked at 1 head. When this is done by summing, say, the number of heads when 5 or 6 coins are tossed in a single experiment, the resultant distribution (number of heads in one experiment) gets "normal" very fast (slower if the probability of a "heads" is different from the probability of "tails", but it will still get to normal with enough coins in the experiment).

Life is like coin tosses, no? Most everything we measure has multiple causes, so it should be no surprise that many things in the natural world are distributed "normally" ... though sometimes when a distribution has deviations from normality it's a clue about different underlying processes. IQ is somewhat normally distributed, though there is a little hump at the lower end (single gene effects?) and a slight bulge in the upper half (high IQ marrying other high IQ people?).

Even when you measure the same exact thing over and over - like having all your students measure your height - their measurements will look normal ... classic psychological measurement theory would say that any measurement is the result of your "true" score added to an error component, and in many situations they assume "error" is unbiased, itself normally distributed, yaddy yaddy yaddy ... it gets complicated quickly, but the bottom line is that many things in the real world simply ARE normally distributed, or at least close enough to assume normality. A google search of the "central limit theorem" will give more precise information than this.

On the other hand, I always tell my students never to take normality for granted, and merely LOOKING at data is the first step in determining if we can assume normality. Or as Yogi Berra put it, "you can observe a lot by looking".

==
John W. Kulig, Ph.D.
Professor of Psychology
Director, Psychology Honors
Plymouth State University
Plymouth NH 03264
==

--- Original Message ---
From: "michael sylvester"
To: "Teaching in the Psychological Sciences (TIPS)"
Sent: Sunday, June 19, 2011 10:48:16 AM
Subject: [tips] What's normal about the normal curve?

Michael
Re: [tips] What's normal about the normal curve?
Well, most things in psychology have numerous independent causes. Height is caused by at least several genes, your score on an exam is caused by answers to many individual questions, etc. The sum (i.e., adding) of independent events gets "normal" -- faster the more things you add (among other things).

Example: Toss ONE coin, and record 0 for tails and 1 for heads. If this experiment - tossing one coin - is repeated long enough, you get about 50% heads and 50% tails (a flat or uniform distribution). Next toss two coins and record the total number of heads - it will be either 0, 1, or 2 heads. Repeat this experiment - two coins in one toss - and 25% of the time you'll get 0 heads (TT), 50% of the time you'll get one head (since 1 head can be either TH or HT), and 25% of the time you'll get 2 heads (HH). With 0, 1, and 2 heads on the X axis, it's not exactly a normal distribution but it is peaked at 1 head. When this is done by summing, say, the number of heads when 5 or 6 coins are tossed in a single experiment, the resultant distribution (number of heads in one experiment) gets "normal" very fast (slower if the probability of a "heads" is different from the probability of "tails", but it will still get to normal with enough coins in the experiment).

Life is like coin tosses, no? Most everything we measure has multiple causes, so it should be no surprise that many things in the natural world are distributed "normally" ... though sometimes when a distribution has deviations from normality it's a clue about different underlying processes. IQ is somewhat normally distributed, though there is a little hump at the lower end (single gene effects?) and a slight bulge in the upper half (high IQ marrying other high IQ people?).

Even when you measure the same exact thing over and over - like having all your students measure your height - their measurements will look normal ... classic psychological measurement theory would say that any measurement is the result of your "true" score added to an error component, and in many situations they assume "error" is unbiased, itself normally distributed, yaddy yaddy yaddy ... it gets complicated quickly, but the bottom line is that many things in the real world simply ARE normally distributed, or at least close enough to assume normality. A google search of the "central limit theorem" will give more precise information than this.

On the other hand, I always tell my students never to take normality for granted, and merely LOOKING at data is the first step in determining if we can assume normality. Or as Yogi Berra put it, "you can observe a lot by looking".

==
John W. Kulig, Ph.D.
Professor of Psychology
Director, Psychology Honors
Plymouth State University
Plymouth NH 03264
==

--- Original Message ---
From: "michael sylvester"
To: "Teaching in the Psychological Sciences (TIPS)"
Sent: Sunday, June 19, 2011 10:48:16 AM
Subject: [tips] What's normal about the normal curve?

Michael
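John's coin-toss progression is easy to reproduce in syntax. Here is a minimal sketch (the trial count and coin counts are arbitrary illustration values) of the number of heads per trial for 1, 2, and 6 fair coins:

* Number of heads per trial when tossing 1, 2, and 6 fair coins.
input program.
loop o = 1 to 5000.
end case.
end loop.
end file.
end input program.
compute heads1 = rv.binom(1, .5).
compute heads2 = rv.binom(2, .5).
compute heads6 = rv.binom(6, .5).
frequencies heads1 heads2 heads6 /histogram = normal.

The heads1 histogram is flat, heads2 is peaked at 1, and heads6 is already roughly bell-shaped, the same progression John describes above.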