Re: Means of semantic differential scales
At 09:51 AM 2/28/02 -0800, Jay Tanzman wrote:
>I partially did this, insofar as I ran Pearson and Spearman correlations
>between several of the scales and, not surprisingly, the two correlation
>coefficients and their p-values were similar.

<<<<< that issue is entirely a separate one, since the rank order FORMULA was derived from the pearson ...

> Dr. Kim was not impressed.
>
>-Jay

i hate to ask this question but, what the heck, spring break is near so i will: if your boss, dr. kim ... seems so knowledgeable about what the data are and what is and is not appropriate to do with the data, why is dr. kim not doing the analysis?

this reminds me of assigning a task to someone and then doing so much micro-managing that ... one would have been better off doing it oneself ...

Dennis Roberts, 208 Cedar Bldg., University Park PA 16802
WWW: http://roberts.ed.psu.edu/users/droberts/drober~1.htm
AC 8148632401

=================================================================
Instructions for joining and leaving this list, remarks about the
problem of INAPPROPRIATE MESSAGES, and archives are available at
http://jse.stat.ncsu.edu/
=================================================================
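The point about the rank-order formula being derived from the Pearson can be shown directly: Spearman's rho is just Pearson's r computed on the ranks. A small sketch (assuming SciPy is available; the data are made up for illustration):

```python
# Spearman's rho equals Pearson's r applied to rank-transformed data,
# which is why the two coefficients so often come out similar.
from scipy import stats

x = [2.0, 4.0, 5.0, 7.0, 9.0, 11.0]
y = [1.0, 3.0, 2.0, 6.0, 8.0, 20.0]

r_pearson, _ = stats.pearsonr(x, y)
rho_spearman, _ = stats.spearmanr(x, y)

# Computing Pearson's r on the ranks reproduces rho exactly
ranks_x = stats.rankdata(x)
ranks_y = stats.rankdata(y)
r_on_ranks, _ = stats.pearsonr(ranks_x, ranks_y)
```

Because `y` has one large value (20), the raw Pearson and the rank-based rho differ a bit here; on well-behaved data they track each other closely.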
Re: Applied analysis question
At 07:37 AM 2/28/02 -0800, Brad Anderson wrote:
>I think a lot of folks just run standard analyses or arbitrarily apply
>some "normalizing" transformation because that's what's done in their
>field, then report the results without really examining the
>underlying distributions. I'm curious how folks proceed when they
>encounter very goofy distributions. Thanks for your comments.

i think the lesson to be gained from this is that we seem to be focusing on getting the analysis DONE and summarized (or that is the message that students and others get) ... and with most standard packages, that is relatively easy to do

for example, you talk about a simple regression analysis and then show them in minitab that you can do that like:

MTB > regr 'weight' 1 'height'

and, when they do it, lots of output comes out BUT, the first thing is the best fitting straight line equation like:

The regression equation is
Weight = - 205 + 5.09 Height

and THAT's where they start AND stop (more or less)

while software makes it rather easy to do lots of prelim inspection of data, it also makes it very easy to SKIP all that too

before we do any serious analysis, we need to LOOK at the data ... carefully ... make some scatterplots (to check for outliers, etc.), look at some frequency distributions ON the variables, even just look at the means and sds to see if some serious restriction of range issue pops up ...

THEN and ONLY then, after we get a feel for what we have, should we be doing the main part of our analysis ... ie, testing some hypothesis or notion WITH the data (actually, i might call the prelims the MAIN part but others might disagree)

we put the cart before the horse ... in fact, we don't even pay any attention to the horse

unfortunately, far too much of this is "caused" by the dominant preoccupation with doing "significance tests" ... so we run routines that give us these "p values" and are done with it ...
without paying ANY attention to just looking at the data

my 2 cents worth
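The "look first, fit second" workflow described above can be sketched with nothing but the standard library. The heights and weights here are fabricated stand-ins for the Minitab example, not real data:

```python
# Look at marginals and a crude scatterplot BEFORE fitting the line.
import statistics

height = [63, 66, 68, 70, 72, 75, 61, 69, 74, 67]
weight = [120, 140, 155, 170, 185, 200, 112, 160, 195, 150]

# Step 1: inspect the marginals (means, sds, ranges)
print("height: mean=%.1f sd=%.1f" % (statistics.mean(height), statistics.stdev(height)))
print("weight: mean=%.1f sd=%.1f" % (statistics.mean(weight), statistics.stdev(weight)))

# Step 2: a crude text scatterplot to eyeball outliers / range restriction
for h, w in sorted(zip(height, weight)):
    print("%3d | %s*" % (h, " " * ((w - 100) // 5)))

# Step 3: only now fit the least-squares line weight = a + b * height
mx, my = statistics.mean(height), statistics.mean(weight)
b = sum((h - mx) * (w - my) for h, w in zip(height, weight)) / \
    sum((h - mx) ** 2 for h in height)
a = my - b * mx
```

The fitted slope and intercept come last, after the data have been looked at, which is the whole point of the post.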
Re: Applied analysis question
i thought of a related data situation ... but at the opposite end

what if you were interested in the relationship between the time it takes students to take a test AND their test score

so, you have maybe 35 students in your 1 hour class that starts at 9AM ... you decide to note (by your watch) the time they turn in the test ... and about 9:20 the first person turns it in ... then 9:35 the second, 9:45 the 3rd, 9:47 the 4th ... and then, as you get to 10, when the time limit is up, the rest sort of come up to the desk at the same time

for about 1/2 of the students, you can pretty accurately write down the time ... but, as it gets closer to the time limit, you have more of a (literal) rush and, at the end, you probably put down the same time for the last 8 students

you could decide just to use the order of the answer sheets as they sit in the pile ... or, you might collapse the set into 3 groupings ... quick turner-iners, middle-time turner-iners, and slow turner-iners ... BUT, this clouds the data

here we have a situation where the BIG times have lots of the n ... where there are widely scattered (but infrequent) short times ... if you have time on the baseline, it is radically NEG skewed

better ways to record the times do not really solve this ... even if you have a time stamper like i used to have to use when punching my time card coming into and leaving work

it's a conundrum for sure

At 10:17 AM 2/28/02 +1100, Glen Barnett wrote:
>Brad Anderson wrote:
> >
> > I have a continuous response variable that ranges from 0 to 750. I
> > only have 90 observations and 26 are at the lower limit of 0, which is
> > the modal category.
>
>If it's continuous, it can't really have categories (apart from those
>induced by recording the variable to some limited precision, but people
>don't generally call those categories).
>
>The fact that you have a whole pile of zeros makes it mixed rather than
>continuous, and the fact that you say "category" makes it sound purely
>discrete.
>Glen
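The pile-up-at-the-limit problem Dennis describes can be illustrated with hypothetical finishing times (assuming SciPy; the numbers mimic his 9:20-to-10:00 example). Midranks at least handle the indistinguishable last-minute group honestly:

```python
# Minutes into a 60-minute test; the ties at 60 mimic the end-of-period rush.
from scipy import stats

times = [20, 35, 45, 47, 50, 52, 55, 58] + [60] * 8  # last 8 indistinguishable

# Midranks give every tied 60 the same rank instead of inventing an order
ranks = stats.rankdata(times)

# The distribution is strongly negatively skewed, as the post says
skewness = stats.skew(times)
```

Note this only describes the data; it does not fix the underlying censoring, which is exactly the conundrum.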
Re: Means of semantic differential scales
At 01:39 PM 2/27/02 -0600, Jay Warner wrote:
> > > > > >Not stressful 1__ 2__ 3__ 4__ 5__ 6__ 7__ Very stressful

just out of curiosity ... how many consider the above to be an example of a bipolar scale? i don't

now, if we had an item like:

sad 1 . . . . . 7 happy

THEN the mid point becomes much more problematic ... since being a 4 is neither a downer nor an upper

now, a quick search found info from NCS about the 16PF personality scale ... it shows 16 BIpolar dimensions as:

Bipolar Dimensions of Personality
Factor A  Warmth (Cool vs Warm)
Factor B  Intelligence (Concrete Thinking vs Abstract Thinking)
Factor C  Emotional Stability (Easily Upset vs Calm)
Factor E  Dominance (Not Assertive vs Dominant)
Factor F  Impulsiveness (Sober vs Enthusiastic)
Factor G  Conformity (Expedient vs Conscientious)
Factor H  Boldness (Shy vs Venturesome)
Factor I  Sensitivity (Tough-Minded vs Sensitive)
Factor L  Suspiciousness (Trusting vs Suspicious)
Factor M  Imagination (Practical vs Imaginative)
Factor N  Shrewdness (Forthright vs Shrewd)
Factor O  Insecurity (Self-Assured vs Self-Doubting)
Factor Q1 Radicalism (Conservative vs Experimenting)
Factor Q2 Self-Sufficiency (Group-Oriented vs Self-Sufficient)
Factor Q3 Self-Discipline (Undisciplined vs Self-Disciplined)
Factor Q4 Tension (Relaxed vs Tense)

let's take one ... shy versus venturesome ... now, we could make a venturesome scale by itself ... 0 = no venturesomeness ... (up to) 7 = very venturesome

does 0 = shy? seems like if the answer is no, then we might have a bipolar scale ... if the answer is yes, then we don't

> It could be the use of the particular bipolars
> "not stressful" and "very stressful."
Re: Applied analysis question
At 04:11 PM 2/27/02 -0500, Rich Ulrich wrote:
>Categorizing the values into a few categories labeled,
>"none, almost none, " is one way to convert your scores.
>If those labels do make sense.

well, if 750 has the same numerical sort of meaning as 0 (unit wise) in terms of what is being measured, then i would personally not think so, SINCE the categories above 0 will encompass very wide ranges of possible values

if the scale was # of emails you look at in a day ... and 1/3 said none or 0 ... we could rename the scale 0 = not any, 1 to 50 = some, and 51 to 750 = many (and recode as 1, 2, and 3) ... i don't think anyone who just saw the labels, and was then asked to give some extemporaneous 'values' for each of the categories, would have any clue what to put in for the some and many categories ... but i would predict they would seriously UNderestimate the values compared to the ACTUAL responses

this just highlights that for some scales, we have almost no differentiation at one end where they pile up ... perhaps (not saying one could have in this case) we could have anticipated this ahead of time and put in scale categories that allowed for it ... after the fact, we are more or less dead ducks

i would say this though ... treating the data only in terms of ranks does not really solve anything ... and clearly represents being able to say LESS about your data or interrelationships (even if the rank order r is .3 compared to the regular pearson of about 0) ...
than if you did not resort to only thinking about the data in rank terms
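The "none / some / many" recode discussed above can be sketched directly; the cut points (0, 1-50, 51-750) are the hypothetical ones from the post, not recommendations:

```python
# Recode a zero-inflated 0-750 count into 3 ordered categories and show
# how much the top category lumps together.
def recode(x):
    if x == 0:
        return 1  # "not any"
    elif x <= 50:
        return 2  # "some"
    else:
        return 3  # "many"

values = [0, 0, 0, 5, 12, 30, 48, 60, 150, 420, 750]  # fabricated responses
codes = [recode(v) for v in values]

# The "many" bucket spans 60 to 750 -- a 12.5-fold range hidden by one label
many = [v for v, c in zip(values, codes) if c == 3]
spread = max(many) / min(many)
```

This is exactly the loss of differentiation the post warns about: the code 3 tells you nothing about whether the answer was 60 or 750.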
Re: Means of semantic differential scales
At 08:18 AM 2/26/02 -0800, Jay Tanzman wrote:
> > > > > > Not stressful 1__ 2__ 3__ 4__ 5__ 6__ 7__ Very stressful

these contain more information than simply ordinality ... they give you some indication of amount of stress too

differentiate this sort of item and response from:

rank order your preferences for the following foods:
steak ... 1
veal ... 2
chicken ... 4
fish ... 5
pork ... 3

and, assume it says to put 1 for the top one and 5 for the low one ... so, i do as above

both CAN be thought of as ordering scales ... but, there is definitely MORE information in the not stressful to very stressful item and responses

the end points of the 1 to 7 scale DO have meaning in terms of ABSOLUTE quantities ... that is not so for the food orderings ... can we infer that i don't like fish since i ranked it 5, and DO like steak since i ranked it 1??? NOT necessarily

there is a fundamental difference in the information you can extract from each of the examples above

i see nothing inherently wrong with finding means on items like the stress item ... since means close to 1 or 7 do have some underlying referent to quantity of stress ... one cannot say that about the food preferences in terms of some underlying absolute liking or disliking of the foods
Re: Means of semantic differential scales
i think we are all missing the main point

if you have a number of these items where your goal (perhaps) is to SUM them together in some way ... where one end represents low amounts of the "thing" presented and the other end represents large amounts ... then ACROSS items, the issue is: do Ss tend to respond at the low end or the high end?

i really don't care if the exact scale IS interval or interpreted by Ss as such ... the main thing is how do they respond across a set of items?

whether or not these data or scales are interval, the MEAN has meaning ... excuse the pun ... i am willing to bet that those Ss who produce mean values close to 1 below are not experiencing any serious stress ... whereas those Ss whose means are close to 6 or 7 are

now, does that mean i know precisely what they are thinking/feeling? of course not ... but, it is plenty good enough to get a good idea of variation across Ss on these items or dimensions

i really don't see what the big fuss is

At 08:10 AM 2/26/02 -0800, Jay Tanzman wrote:
>Jay Warner wrote:
> >
> > Jay Tanzman wrote:
> >
> > > I just got chewed out by my boss for modelling the means of some 7-point
> > > semantic differential scales. The scales were part of a written,
> > > self-administered questionnaire, and were laid out like this:
> > >
> > > Not stressful 1__ 2__ 3__ 4__ 5__ 6__ 7__ Very stressful
Re: Means of semantic differential scales
of course, to be fair to the first jay ... it could be simply that his boss did not like semantic diff. scales, AND for none of the reasons the second jay below said ... it would be helpful if the first jay could give us some further info on why his boss was so ticked off ...

At 09:39 PM 2/25/02 -0600, Jay Warner wrote:
>Jay Tanzman wrote:
>
> > I just got chewed out by my boss for modelling the means of some 7-point
> > semantic differential scales. The scales were part of a written,
> > self-administered questionnaire, and were laid out like this:
> >
> > Not stressful 1__ 2__ 3__ 4__ 5__ 6__ 7__ Very stressful
> >
> > So, why or why not is it kosher to model the means of scales like this?
Re: What is an outlier ?
of course, if one has control over the data, checking the coding and making sure it is correct is a good thing to do ... if you do not have control over that, then there may be very little you can do with it, and in fact, you may be totally UNaware of an outlier problem

a potentially MUCH larger problem, as i see it, is when ONLY certain summary statistics are shown without any basic tallies/graphs displayed ... so, IF there are some really strange outlier values, it usually will go undetected

correlations are ONE good case in point ... have a look at the following scatterplot ... height in inches and weight in pounds ... from the pulse data set in minitab:

[Minitab character scatterplot, Weight (y axis, up to 300+) against Height (x axis, 32.0 to 72.0): the mass of points sits at heights of roughly 60-75 inches and weights of roughly 110-200 pounds, with a single stray point at a height of about 32 inches and a weight of about 200 pounds]

now, the actual r between the X and Y is -.075 ... and of course, this seems strange ... but, IF you had only seen this in a matrix of r values, you might say that perhaps there was serious range restriction that more or less wiped out the r ... in this case, even the desc. stats might not adequately tell you of this problem

IF you had the scatterplot, you probably would figure out REAL quick that there is a PROBLEM with one of the data points ... in fact, without that one weird data point, the r is about .8 ... which makes a lot better sense when correlating heights and weights of college students

At 09:06 PM 2/25/02 +0000, Art Kendall wrote:
>An "outlier" is any value for a variable that is suspect given the
>measurement system, "common sense", other values for the variable in
>the data set, or the values a case has on other variables.
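The effect in that scatterplot is easy to reproduce: one miscoded point (a 32-inch "height" on a 200-pound student) can swamp an otherwise strong r. A sketch assuming SciPy, with fabricated heights and weights:

```python
# One bad data point takes r from ~1 down to roughly zero.
from scipy import stats

height = [62, 64, 66, 68, 70, 72, 74, 76]
weight = [115, 130, 140, 155, 165, 180, 190, 205]

r_clean, _ = stats.pearsonr(height, weight)

# Append a single miscoded point and recompute
r_dirty, _ = stats.pearsonr(height + [32], weight + [200])
```

With only the correlation matrix in hand, `r_dirty` looks like "no relationship"; the scatterplot makes the miscoding obvious in seconds, which is the post's whole argument.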
RE: Chi-square chart in Excel
sure is easy in minitab ... one can draw a very nice curve (it's easy but hard to post here) ... but, to make a distribution easy for viewing we can:

MTB > rand 100000 c1;    <<< generate 100000 values from
SUBC> chis 4.            <<< a chi square distribution with 4 degrees of freedom
MTB > dotp c1

[dotplot of C1: a sharply right-skewed pile of dots peaking near 2-3 and tailing off toward 30; each dot represents up to 778 points]

MTB > desc c1

Descriptive Statistics: C1

Variable       N     Mean   Median   TrMean    StDev  SE Mean
C1        100000   4.0123   3.3729   3.7727   2.8350   0.0090

Variable Minimum  Maximum       Q1       Q3
C1        0.0080  29.4143   1.9236   5.4110

not quite as fancy as the professional graph but will do in a pinch

At 06:27 PM 2/21/02 -0800, David Heiser wrote:
>-----Original Message-----
>From: [EMAIL PROTECTED]
>[mailto:[EMAIL PROTECTED]] On Behalf Of Ronny Richardson
>Sent: Wednesday, February 20, 2002 7:29 PM
>To: [EMAIL PROTECTED]
>Subject: Chi-square chart in Excel
>
>Can anyone tell me how to produce a chart of the chi-square distribution in
>Excel? (I know how to find chi-square values but not how to turn those into
>a chart of the chi-square curve.)
>
>Ronny Richardson
>---
>Excel does not have a function that gives the chi-square density.
>
>The following might be helpful regarding future graphs. It is a fraction of
>a larger "package" I am preparing. It is awkward to present it in .txt
>format.
>
>DISTRIBUTION              DENSITY/CUMULATIVE   INVERSE
>Beta                      BETADIST             BETAINV
>Binomial                  BINOMDIST            CRITBINOM
>Chi-Square                CHIDIST              CHINV
>Exponential               EXPONDIST
>F                         FDIST                FINV
>Gamma                     GAMMADIST            GAMMAINV
>Hypergeometric            HYPGEOMDIST
>Log Normal                LOGNORMDIST          LOGINV
>Negative Binomial         NEGBINOMDIST
>Normal (with parameters)  NORMDIST             NORMINV
>Normal (z values)         NORMSDIST            NORMSINV
>Poisson                   POISSON
>t                         TDIST                TINV
>Weibull                   WEIBULL
>
>You have to build a column (say B) of X values.
>
>Build an expression for column C calculating the chi-square density, given
>the x value in col B and the df value in A1.
>It would be "=EXP(-($A$1/2)*LN(2) - GAMMALN($A$1/2) + (($A$1/2)-1)*LN(B1) - B1/2)"
>without the quotes.
>You can equation-drag this cell down column C for each X value.
>
>Now build a smoothed scatter plot graph as series 1 with the X value column
>B and the Y value as column C.
>
>DAHeiser
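The same log-scale trick used in the spreadsheet formula, sketched in Python with only the standard library. The density is f(x; k) = x^(k/2-1) e^(-x/2) / (2^(k/2) Gamma(k/2)), computed as exp(log f) for numerical stability:

```python
# Chi-square density on a grid of x values, ready for any plotting tool.
import math

def chi2_pdf(x, df):
    """Chi-square density via exp(log f) to avoid overflow for large df."""
    if x <= 0:
        return 0.0
    log_f = ((df / 2 - 1) * math.log(x) - x / 2
             - (df / 2) * math.log(2) - math.lgamma(df / 2))
    return math.exp(log_f)

# Densities for df = 4 over x in (0, 30], mirroring the Minitab dotplot above
xs = [0.5 * i for i in range(1, 61)]
ys = [chi2_pdf(x, df=4) for x in xs]
```

For df = 4 the density simplifies to (x/4) e^(-x/2), which gives a quick hand check of the code.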
Re: What is an experiment ?
At 03:59 PM 2/20/02 -0300, Voltolini wrote:
>Hi,
>
>I was reading a definition of "experiment" in science to be used in a
>lecture, and the use of treatments and controls is an important feature of
>an experiment. But my doubt is ... is it possible to plan an experiment
>without a control and still call it an "experiment"?
>
>For example, in a polluted river basin there is a gradient of contamination,
>and someone is interested in comparing the fish diversity in ten rivers of
>this basin. Then "pollution level" is the treatment (with ten levels),
>but if there is not a clean river in the basin, I cannot use a control!
>
>Is this an experiment anyway?

the main issue is the CONTROL that the "experimenter" exerts over an independent variable

if you want to compare diversity of fish ACROSS rivers, and you find a difference, what does this necessarily have to do with contamination?

i see three variables here ... diversity of fish ... rivers ... level of contamination (ie, where the gradients are different)

what are you trying to show impacts on what?
Re: Correlations-statistics
well, one simple way would be to add B and C ... then correlate with A ... if these are radically different scales, convert to z scores first

At 02:05 AM 2/20/02 -0800, Holger Boehm wrote:
>Hi,
>
>I have calculated correlation coefficients between sets of parameters
>(A) and (B) and between (A) and (C).
>Now I would like to determine the correlation between (A) and (B
>combined with C). How can I combine the two parameters (B) and (C);
>what kind of statistical method has to be applied?
>
>Thanks for your tips,
>
>Holger Boehm
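The "convert to z scores, sum, then correlate with A" suggestion looks like this in practice (pure stdlib; the three variables are fabricated, with C on a deliberately different scale):

```python
# Standardize B and C, sum the z scores, correlate the sum with A.
import statistics

def zscores(xs):
    m, s = statistics.mean(xs), statistics.stdev(xs)
    return [(x - m) / s for x in xs]

def pearson(xs, ys):
    zx, zy = zscores(xs), zscores(ys)
    return sum(a * b for a, b in zip(zx, zy)) / (len(xs) - 1)

A = [10, 12, 15, 11, 18, 20, 14, 16]
B = [3, 4, 6, 3, 7, 9, 5, 6]
C = [60, 70, 85, 65, 100, 110, 80, 90]   # same trend as B, much larger scale

BC = [zb + zc for zb, zc in zip(zscores(B), zscores(C))]
r_A_BC = pearson(A, BC)
```

Standardizing first matters: summing raw B and C would let C's larger scale dominate the composite.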
Re: Statistical Distributions
At 07:34 AM 2/19/02 -0500, Herman Rubin wrote:
>I do not see this. The binomial distribution is a natural
>one; the normal distribution, while it has lots of mathematical
>properties, is not.

i don't know of any "distribution" that is natural ... what does that mean? inherent in the universe?

all distributions are human made ... in the sense that WE observe events and find some function that links events to probabilities

all of mathematics ... and statistics too, as an offshoot ... is made up
Re: Numerical recipes in statistics ???
what is it you wanted to cook?

At 01:35 PM 2/18/02 -0800, The Truth wrote:
>Are there any "Numerical Recipes" like textbooks on statistics and
>probability?
>Just wondering.
>
>Thanks.
Re: Which is faster? ziggurat or Monty Python (or maybe something else?)
At 09:50 AM 2/18/02 +0000, Ian Buckner wrote:
>We generate pairs of properly distributed Gaussian variables at
>down to 10nsec intervals, essential in the application. Speed can
>be an issue, particularly in real time situations.
>
>Ian

wow ... how our perspectives have changed!

back in grad school, we had some CDC mainframe over in the next building and we would go over and stick our "stack of cards" through the slot ... and then come back the NEXT DAY to pick them up ... we just hoped we hadn't put a , or other bad character someplace and would have to wait another day
Re: Statistical Distributions
addendum

if one manipulates n and p in a binomial and gets to a point where a person would say (or we would say as the instructor) that what you see is very similar to ... and might even be approximated well by ... the nd ... this MEANS that the nd came first, in the sense that one would have to be familiar with that before one could draw the parallel

At 06:36 PM 2/17/02 -0500, Timothy W. Victor wrote:
>I also think Alan's idea is sound. I start my students off with some
>binomial expansion theory.
>
>Alan McLean wrote:
> >
> > This is a good idea, Dennis. I would like to see the sequence start with
> > the binomial - in a very real way, the normal occurs naturally as an
> > 'approximation' to the binomial.
Re: Statistical Distributions
not to disagree with alan but, my goal was to parallel what glass and stanley did and that is all ... seems like there are all kinds of distributions one might discuss AND there may be more than one order that is acceptable ... most books of recent vintage (and g and s was 1970) don't even discuss what g and s did

but, just for clarity's sake ... are you saying that the nd is a logical SECOND step TO the binomial? or, that if you look at the binomial, one could (in many circumstances of n and p) say that the binomial is essentially a nd (very good approximation)?

the order i had for the nd, chi square, F and t seemed to make sense ... but, i don't necessarily buy that one NEEDS to START with the binomial ... certainly, however, if one talks about the binomial, then the link to the nd is a must

At 06:36 PM 2/17/02 -0500, Timothy W. Victor wrote:
>I also think Alan's idea is sound. I start my students off with some
>binomial expansion theory.
>
>Alan McLean wrote:
> >
> > This is a good idea, Dennis. I would like to see the sequence start with
> > the binomial - in a very real way, the normal occurs naturally as an
> > 'approximation' to the binomial.
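The binomial-to-normal link being debated can be made concrete with a few lines of stdlib Python: for moderate n and p, the Bin(n, p) probabilities sit very close to a N(np, np(1-p)) density evaluated at the same points:

```python
# Compare Bin(100, 0.5) point probabilities to the matching normal density.
import math

def binom_pmf(k, n, p):
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def normal_pdf(x, mu, sigma):
    return math.exp(-((x - mu) ** 2) / (2 * sigma**2)) / (sigma * math.sqrt(2 * math.pi))

n, p = 100, 0.5
mu, sigma = n * p, math.sqrt(n * p * (1 - p))

# Worst absolute gap between the pmf and the normal density across all k
gap = max(abs(binom_pmf(k, n, p) - normal_pdf(k, mu, sigma)) for k in range(n + 1))
```

With n = 100 the two curves are visually indistinguishable, which is exactly the "very good approximation" reading of Alan's point.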
Statistical Distributions
Back in 1970, Glass and Stanley, in their excellent Statistical Methods in Education and Psychology (Prentice-Hall), had an excellent chapter on several of the more important distributions used in statistical work (normal, chi square, F, and t) and developed how each was derived from the other(s). Most recent books do not develop distributions in this fashion anymore: they tend to discuss distributions ONLY when a specific test is discussed. I have found this to be a more disjointed treatment.

Anyway, I have developed a handout that parallels their chapter, and have used Minitab to do simulation work that supplements what they have presented. The first form of this can be found in a PDF file at:

http://roberts.ed.psu.edu/users/droberts/papers/statdist2.PDF

Now, there is still some editing work to do AND work on the spacing of text. Acrobat does not allow too much in the way of EDITING features, and trying to edit the original document and then convert to PDF is also somewhat of a hit and miss operation. When I get an improved version with better spacing, I will simply copy over the file above.

In the meantime, I would appreciate any feedback about this document and the general thrust of it. Feel free to pass the URL along to students and others; copy freely and use if you find this helpful.
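The derivation chain the handout follows (chi square as a sum of squared standard normals, t as a normal over the root of a scaled chi square, F as a ratio of scaled chi squares) can be simulated in a few lines. This is a seeded Monte Carlo sketch in the spirit of the Minitab supplement, not taken from the handout itself:

```python
# Build chi-square, t, and F draws from nothing but standard normal draws.
import random
import statistics

random.seed(42)

def chi2_draw(df):
    # chi-square(df) = sum of df squared independent N(0,1) draws
    return sum(random.gauss(0, 1) ** 2 for _ in range(df))

def t_draw(df):
    # t(df) = N(0,1) / sqrt(chi-square(df) / df), numerator independent
    return random.gauss(0, 1) / (chi2_draw(df) / df) ** 0.5

def f_draw(df1, df2):
    # F(df1, df2) = (chi-square(df1)/df1) / (chi-square(df2)/df2)
    return (chi2_draw(df1) / df1) / (chi2_draw(df2) / df2)

chi2_sample = [chi2_draw(4) for _ in range(20000)]
t_sample = [t_draw(30) for _ in range(20000)]
f_sample = [f_draw(3, 40) for _ in range(20000)]
```

The sample moments then line up with theory: the chi-square(4) draws average about 4 (the df), and the t draws center on 0.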
Re: If T-Test can not be applied
it's called the behrens-fisher problem ... there is nothing that says that population variances HAVE to be equal

essentially what you do is to be a bit more conservative in your degrees of freedom ... most software packages do this as the default ... or at least give you the choice between making an assumption of equal population variances ... or not

At 11:48 PM 2/14/02 +0100, Matthias wrote:
>Hello,
>
>would be nice if someone can give me some advice with regard to the
>following problem:
>
>I would like to compare the means of two independent numerical sets of data,
>whether they are significantly different from each other or not. One of the
>two underlying assumptions for the t-test is not given (the variances
>are assumed to be NOT equal; but the data are normally
>distributed). What kind of (non?)parametric test exists - instead of the
>t-test - to calculate possible differences in the two means?
>I'm using SPSS for further calculations.
>
>Thank you for your time and help,
>
>Matthias
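The unequal-variance choice Dennis describes is Welch's t-test, which adjusts the degrees of freedom downward via the Welch-Satterthwaite formula. A sketch assuming SciPy, with fabricated groups whose spreads clearly differ:

```python
# Welch (unequal variances) vs. the classic pooled t-test on the same data.
from scipy import stats

a = [23.1, 21.4, 25.0, 22.8, 24.5, 23.9, 22.2, 24.8]   # tight group
b = [27.0, 31.5, 24.2, 33.1, 29.8, 26.4, 35.0, 28.7]   # much more spread out

t_welch, p_welch = stats.ttest_ind(a, b, equal_var=False)   # Welch's t
t_pooled, p_pooled = stats.ttest_ind(a, b, equal_var=True)  # pooled-variance t
```

The Welch version uses fewer (fractional) degrees of freedom, which is the "bit more conservative" adjustment mentioned above; SPSS reports both rows of this same comparison by default.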
Re: F-test
can you be a bit more specific here? F tests AND t tests are used for a variety of things ... give us some context and perhaps we can help

at a minimum, of course, one is calling for using a test that involves looking at the F distribution for critical values ... the other calls for using a t distribution ... but, that still does not really tell you what is going on

At 04:41 AM 2/14/02 -0800, Jan wrote:
>The question is how do I see the difference when it's asked for an
>F-test or a t-test?
>
>Jan
Re: one-way ANOVA question
At 09:21 AM 2/13/02 -0600, Mike Granaas wrote:
>On Fri, 8 Feb 2002, Thomas Souers wrote:
>
> > 2) Secondly, are contrasts used primarily as planned comparisons? If
> > so, why?
>
>I would second those who've already indicated that planned comparisons are
>superior in answering theoretical questions and add a couple of comments:

another way to think about this issue is: what IF we never had ... nor will in the future ... the overall omnibus F test? would this help us or hurt us in the exploration of the experimental/research questions of primary interest?

i really don't see ANY case where it would hurt us ... and, i can't really think of cases where doing the overall F test helps us ...

i think mike's point about planned comparisons making us THINK about what is important to explore in a given study is really important ... because we have gotten lazy when it comes to this ... we take the easy way out of testing all possible paired comparisons when it MIGHT be that NONE of these are really the crucial things to be examined
Re: one-way ANOVA question
At 10:37 AM 2/8/02 -0800, Thomas Souers wrote:
>2) Secondly, are contrasts used primarily as planned comparisons? If so, why?

well, in the typical rather complex study ... all pairs of possible mean differences (as one example) are NOT equally important to the testing of your theory or notions

so, why not set up ahead of time ... THOSE that are (not necessarily restricted to pairs) ... follow those up ... and let the other ones alone

no law says that if you had a 3 by 4 by 3 design, that the 3 * 4 * 3 = 36 cell means all need pairwise testing ... in fact, some combinations may not even make a whole lot of sense EVEN if it is easy to work them into your design

>I would very much appreciate it if someone could take the time to explain
>this to me. Many thanks.
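just to put a number on that 3 by 4 by 3 example: testing every possible pair of the 36 cell means is a LOT of tests ...

```python
# How many pairwise comparisons does "test all pairs" imply for the
# 3 x 4 x 3 factorial mentioned above?
from math import comb

cells = 3 * 4 * 3          # 36 cell means
pairs = comb(cells, 2)     # number of distinct pairs of cell means
print(cells, pairs)        # 36 cells give 630 possible pairwise tests
```

630 tests ... most of which your theory probably says nothing about ... which is exactly the argument for a handful of planned contrasts instead.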
Re: 'Distance' between two normal distributions
seems, as you have said, it depends on what you want to do with it

if there is considerable overlap, then whatever distance you use will have some of both distributions included ... if there is essentially no overlap ... then any pair of values ... one from each ... will reflect a real difference

of course, if there is a small difference in means but very large sds ... that is one thing ... whereas ... if there were the same small difference in means but minuscule sds ... that would be another thing

the simple thing would be to use the mean difference but, that really does not reflect whether there is any overlap between the two and, that seems to be part of the issue

At 07:28 PM 2/6/02 +, Francis Dermot Sweeney wrote:
>If I have two normal distributions N(m1, s1) and N(m2, s2), what is a
>good measure of the distance between them? I was thinking of something
>like a K-S distance like max|phi1-phi2|. I know it probably depends on
>what I want it for, or what exactly I mean by distance, but any ideas
>would be helpful.
>
>Thanks,
>Francis.
>
>Francis Sweeney
>Dept. of Aero/Astro
>Stanford U.
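the K-S-style distance the poster suggested, max|phi1 - phi2|, is easy enough to sketch numerically (my code, with arbitrary example parameters; the grid approximation is an assumption, not an exact formula):

```python
# Approximate sup-distance max|Phi1 - Phi2| between two normal cdfs,
# evaluated on a dense grid covering both distributions.
import numpy as np
from scipy.stats import norm

def ks_distance(m1, s1, m2, s2):
    lo = min(m1 - 6 * s1, m2 - 6 * s2)
    hi = max(m1 + 6 * s1, m2 + 6 * s2)
    x = np.linspace(lo, hi, 20001)
    return np.max(np.abs(norm.cdf(x, m1, s1) - norm.cdf(x, m2, s2)))

d = ks_distance(0, 1, 1, 1)   # equal sds: the sup occurs at the midpoint
print(round(d, 4))            # 2*Phi(0.5) - 1, about 0.3829
```

notice this measure behaves the way the post wants: for a fixed mean difference it grows as the sds shrink (less overlap) and shrinks as the sds grow (more overlap), which the raw mean difference alone does not capture.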
Re: COV
here is one VERY simple example

the COV is the AVERAGE of the PRODUCTS of the deviations around the means ... of two variables ... if the cov is +, there is a + relationship between X and Y ... if it is -, there is a - relationship between X and Y

 X   Y   devX  devY  (devX)(devY)
10  20     1    -2       -2
 9  24     0     2        0
 8  22    -1     0        0

the sum of the products of the deviations around the two means is -2 ... the average is -2 / 3 = -.67 = the covariance

now, some books will divide by n-1, or 2 in this case ... which would give -2/2 = -1 for the covariance

here is the minitab output ... note, mtb and most software packages will divide by n-1

[minitab character plot of Y against X omitted]

MTB > prin c1-c5

Data Display

Row   X   Y  devX  devY  product
  1  10  20     1    -2      -2
  2   9  24     0     2       0
  3   8  22    -1     0       0

MTB > sum c5

Sum of product = -2

MTB > cova c1 c2

Covariances: X, Y

        X     Y
X     1.0
Y    -1.0   4.0

NOTE ... THE INTERSECTION OF THE X AND Y OUTPUT ABOVE ... -1 ... IS THE COVARIANCE IN THIS PROBLEM

by the way, the CORRELATION is

MTB > corr c1 c2

Correlations: X, Y

Pearson correlation of X and Y = -0.500

At 06:06 PM 2/2/02 +, Maja wrote:
>Hi everyone,
>
>we just learned the cov, and I'm trying to apply the formula
>cov(X,Y)=E(XY)-E(X)E(Y) to the following question
>f(x,y)=(0.6)^x(0.4)^1-x(0.3)^y(0.52)^1-y(2)^xy where poss. values for X and
>Y are x=0,1 and y=0,1.
>The prof. never gave us any examples for cov and neither does the text book.
>Now I don't know what values to plug in for X and what values to plug in for
>Y.
>
>Could someone PLEASE explain to me how am I supposed to know what values to
>plug into the equation???
>
>TNX
>Maja
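the same hand calculation can be checked with numpy (my addition; same three (X, Y) pairs as above):

```python
# Verify the covariance example: np.cov divides by n-1 by default
# (matching minitab's cova output), while bias=True divides by n
# (matching the "average of the products of deviations" definition).
import numpy as np

x = np.array([10, 9, 8])
y = np.array([20, 24, 22])

cov_n1 = np.cov(x, y)[0, 1]             # n-1 denominator: -1.0
cov_n  = np.cov(x, y, bias=True)[0, 1]  # n denominator:   -2/3
r = np.corrcoef(x, y)[0, 1]             # Pearson r:       -0.5
print(cov_n1, cov_n, r)
```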
Re: area under the curve
unless you had a table comparable to the z table for area under the normal distribution ... for EACH different level of skewness ... an exact answer is not possible ... but here is one example that may help give some approximate idea as to what might happen

[minitab character histogram of C1, an approximately normal set of z scores, omitted]

the above is a normal distribution of z scores ... where 1/2 the data are above 0 and 1/2 are below

[minitab character histogram of C2, a radically positively skewed set of z scores, omitted]

here is a set of z score data that is radically + skewed ... and even though it has 0 as its mean, 50% of the area is NOT above the mean of 0 ... note the median is down at about -.3 ... so, there is LESS than 50% above the mean of 0 ... this means that for z scores above 0 ... there is not as much area beyond (to the right) ... as you would expect if the distribution had been normal ... so, we can have some approximate idea of what might happen but the exact amount of this clearly depends on how much skew you have

MTB > desc c1 c2

Descriptive Statistics: C1, C2

Variable      N     Mean   Median   TrMean    StDev  SE Mean
C1        10000   0.0008   0.0185   0.0029   1.0008   0.0100
C2        10000   0.0000  -0.3098  -0.1117   1.0000   0.0100

Variable  Minimum  Maximum       Q1       Q3
C1        -3.9468   3.9996  -0.6811   0.6754
C2        -0.9946   8.0984  -0.7127   0.3921

At 07:27 AM 1/30/02 -0800, Melady Preece wrote:
>A student wants to know how one can calculate the area under the curve for
>skewed distributions. Can someone give me an answer about when a
>distribution is too skewed to use the z table?
>
>Melady
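the point about area above the mean can also be shown analytically (my sketch; the exponential distribution here stands in for "a radically + skewed distribution", it is not the distribution from the minitab example):

```python
# For a normal distribution, exactly half the area lies above the mean.
# For a strongly right-skewed distribution (a standard exponential),
# the area above its own mean is well under one half.
from scipy.stats import norm, expon

above_mean_normal = norm.sf(norm.mean())    # P(Z > 0) = 0.5
above_mean_expon = expon.sf(expon.mean())   # P(X > 1) = e^-1, about 0.368
print(above_mean_normal, round(above_mean_expon, 4))
```

so the z table's "50% above the mean" logic breaks down as soon as skew pulls the median away from the mean, and how badly it breaks down depends on how much skew there is.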
Re: Definition of "Relationship Between Variables" (was Re: Eight
experimental design, whose basic principles are rather simple, is elegant and, if applied in good ways, can be very informative as to data, variables and their impact, etc. but, please hold on for a moment

when it comes to humans, we have developed some social policies that say:

1. experimental procedures should not harm Ss ... nor pose any unnecessary risk
2. Ss should be informed about what it is they are being asked to participate in as far as experiments are concerned
3. Ss are THE ones to make the decision about participation or not (except in cases of minors ... we allocate that decision to guardians/parents)

you aren't suggesting, are you, that for the sake of knowledge, we should abandon these principles?

so you see, there are serious restrictions (and valid ones at that) when it comes to doing experiments with humans ... for so much of human behavior and things we would like to explore, we are unable to "do experiments" ... it is that plain and simple

yes, while we may be able to gather data on many of these variables of interest in other ways ... we are rarely if EVER in the position of being able to say with any causative assurance (since we did not experimentally manipulate things) that when we do X, Y happens
Re: transformation of dependent variable in regression
there is nothing stopping you (is there?) from trying several methods that are seen as sensible possibilities ... and seeing what happens?

of course, you might find a transformational algorithm that works BEST (of those you try) with the data you have but ... A) that still might NOT be an "optimal" solution and/or B) it might be "best" with THIS data set but ... for other similar data sets it might not be

i think the first hurdle you have to hop over is ... does it make ANY sense WHATSOEVER to take the data you have collected (or received) and change the numbers from what they were in the first place? if the answer is YES to that, then my A and B comments seem to apply but, if the answer is NO ... then neither A nor B seems justifiable

with 2 independent and 1 dependent variables ... you have possibilities for transforming 0 of them ... 1 of them ... 2 of them ... or all of them and, these various combinations of what you do might clearly produce varying results
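"trying several methods and seeing what happens" can be sketched like this (my code; the data are synthetic, deliberately built so a log transform of the dependent variable linearizes the relation exactly):

```python
# Compare the fit of a straight line to raw y versus log-transformed y.
# The y values follow an exact exponential trend, so log(y) is exactly
# linear in x; real data would of course be noisier.
import numpy as np

def r_squared(x, y):
    yhat = np.polyval(np.polyfit(x, y, 1), x)
    ss_res = np.sum((y - yhat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1 - ss_res / ss_tot

x = np.arange(1.0, 11.0)
y = np.exp(0.5 * x)                 # exponential growth on the raw scale

r2_raw = r_squared(x, y)            # straight line on raw y: mediocre
r2_log = r_squared(x, np.log(y))    # straight line on log y: essentially perfect
print(round(r2_raw, 4), round(r2_log, 4))
```

which is exactly point B above: the "winning" transformation here wins because of how THIS data set was generated ... another data set could easily favor a different transformation, or none at all.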
Re: random versus fixed factor
gee ... just a short question to answer! here is one part of it

say you were interested in whether different teaching methods impacted how well students learned intro statistics ... now, if we put our minds to it, there probably are 50 or more different ways we could teach a course like this but, there is no way you would be able to run an experiment trying out all 50

1. you could take several methods AT random (after you list out all 50) ... maybe 3 ... and try them ... and, if you find some differences amongst these 3 ... then you might be able to generalize back to the entire set of 50 ... ie, there are differences amongst the others too

2. perhaps there are only 3 specific methods you have any interest in (out of the 50) ... and you are not interested in any of the other 47. so, these specific 3 you use in your experiment and perhaps you find some differences. well, in this case ... you might be able to say that there ARE differences in these 3 but, you could not generalize to the larger set of the other 47 (since you did not sample representatively from all 50)

#1 is called a random (independent variable) factor
#2 is called a fixed (independent variable) factor

At 02:39 PM 1/15/02 -0800, Elias wrote:
>hi
>i am a little confused about this topic
Re: SAT Question Selection
for the SAT ... which is still paper and pencil ... you will find multiple sections ... math and verbal ... as far as i know ... there usually are 3 of one and 2 of the other ... the one with 3 has A section that does NOT count toward your score ... it is used for trialing new items ... revised items ... etc. don't expect them to tell you which one that is however ... in a sense ... they are making YOU pay for THEIR pilot work ... and, of course, if you happen to really get fouled up on the section that does not count ... it could carry over "emotionally" to another section ... and have some (maybe not much) impact on your motivation to do well on that next section

unless it has changed ...

At 05:19 PM 1/14/02 -0500, you wrote:
>[cc'd to previous poster; please follow up in newsgroup]
>
>L.C. <[EMAIL PROTECTED]> wrote in sci.stat.edu:
>>Back in my day (did we have days back then?) I recall
>>talk of test questions on the SAT. That is, these questions
>>were not counted; they were being tested for (I presume)
>>some sort of statistical validity.
>>
>>Does anyone have any statistical insight into the SAT question
>>selection process. Does anyone have a specific lead? I can
>>find virtually nothing.
>
>I remember reading a good book about the inner operation of ETS
>(administers the SATs), with some bits about the "test" questions
>you refer to, but I can't quite remember the title. I've searched
>the catalog of my old library, and this _may_ be it:
>
>Lemann, Nicholas.
> The big test : the secret history of the American meritocracy
> New York : Farrar, Straus and Giroux, 1999.
>
>--
>Stan Brown, Oak Road Systems, Cortland County, New York, USA
> http://oakroadsystems.com/
>"What in heaven's name brought you to Casablanca?"
>"My health. I came to Casablanca for the waters."
>"The waters? What waters? We're in the desert."
>"I was misinformed."
Re: Proportionate vs. disproportionate
1. well, one can consider proportionate to mean ... equal change in % POINTS ... and i think that is one legitimate way to view it ... which is how the video guy was talking about it ...

2. one could consider proportionate to mean ... equal change from the BASE ... and i think that is legitimate too ... though this is clearly NOT how the video guy was referring to it

sure, if one group goes from 60 to 90 ... this is a change of 30 % points ... and, if another group goes from 30 to 60 ... this is a change of 30 % points ... i think it is fair to say that the amount of change in % points is the same ... thus proportional

now, another way we can view the data is to say that in the first group ... since the base is 60 ... the change of 30 is a 50% change compared to the base ... whereas in the second group ... the change from 30 to 60 represents a 100% change from the base ...

now, if the base N is the same ... say 600 people ... 30% of 600 = 180 people ... no matter if a group changes from 60 to 90 OR from 30 to 60 ... thus, if the BASE n is the same ... then both the % point change AND the volume of n ... mean the same thing ... if one group's n = 600 and another group's n = 100 ... these are not the same

but in any case ... the way we usually look at these poll %ages ... is in terms of the absolute value of the % values ... so, in THAT context and the way the public usually views these things ... scenario #1 above is how the video person presented the data ... and in that context i don't think his presentation tried to "snow" the video viewers

i don't think there is any natural law that says that proportionate or disproportionate has to be interpreted in terms of scenario #2 above ...

finally ... i think we are making a mountain out of a molehill in this ... to me ... the most important "fact" from the video was that (regardless of change and how you define it) ... whites approved of the president to a FAR greater extent than blacks ...
and, the second most important "fact" was that AFTER the event ... the approval ratings for BOTH groups went up dramatically

if the video guy had made the distinctions in scenarios 1 and 2 above ... and had then interpreted the data under both cases ... i think this would have helped NONE in conveying to the public the information that (IMHO) was most important ...

we seem to be trying to find something that the fellow was hiding FROM the public when, i don't really think he (nor gallup) was trying to hide anything ... he was presenting some data results ... giving one interpretation of the results ... and, if WE want to interpret them differently ... we can ... is that not true for any set of results?

At 10:58 AM 1/11/02 -0500, you wrote:
>On Fri, 11 Jan 2002, Dennis Roberts wrote:
>
>> if the polls used similar ns in the samples ... i disagree
>>
>> now, if the white sample was say 600 and the black sample was 100 ... i
>> MIGHT be more likely to agree with the comment below
>
>consider: white goes 10% to 15% ... up 50%, 5 %pts
>          black goes 66.7% to 100% ... up 50%, 33.3 %pts
>
>These are proportionate but hardly equivalent
>
>white goes 0 to 50% ... up an infinite %, 50 %pts
>black goes 50% to 100% ... up 100%, 50 %pts
>same %pts but black is more striking
>
>I can't see any kind of equivalence in either case
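the two readings of "proportionate" in this thread are easy to separate in a few lines of code (my sketch, using the approval numbers quoted in the thread: whites 60% to 90%, blacks 33% to 68%):

```python
# Two different change measures that the thread keeps circling around:
# change in percentage points vs. percent change relative to the base.
def point_change(before, after):
    return after - before                    # percentage points

def relative_change(before, after):
    return 100 * (after - before) / before   # % change from the base

print(point_change(60, 90), round(relative_change(60, 90), 1))  # whites
print(point_change(33, 68), round(relative_change(33, 68), 1))  # blacks
```

whites: up 30 points but only 50% relative to their base; blacks: up 35 points and over 100% relative to their base ... "roughly proportionate" on one definition, quite disproportionate on the other, which is the whole dispute.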
Re: Proportionate vs. disproportionate
if the polls used similar ns in the samples ... i disagree

now, if the white sample was say 600 and the black sample was 100 ... i MIGHT be more likely to agree with the comment below

At 06:12 PM 1/10/02 -0500, Elliot Cramer wrote:
>EugeneGall <[EMAIL PROTECTED]> wrote:
>: The Gallup organization posted a video to explain why the increase in
>: black's job approval for Bush is 'proportionate' to the increase among
>: whites.
>
>It makes no sense to talk of "proportionate" increases in percentages
>
>Suppose you start at zero or 99% ...
Re: Proportionate vs. disproportionate
definition: proportionate = equal % change

IF we agree on this ... and maybe we don't ... then, since the % changes are almost never exactly equal, all changes are DISproportionate ... but, given margins of error and the like ... and just the practical interpretation of the data ... i would say that we could have a pragmatic agreement that if the changes were within P %, then we might "call" the changes equal

the fact is, if the data are accurate, for both whites and blacks, the after ratings jumped dramatically ... compared to the before ratings ... now we are just quibbling over whether those dramatic jumps should be called equal or not

thus, the issue in the video and the information that was presented is ... are the changes SO large as to make even tolerant people say that they are different

in the case of george w ... and the white and black change from pre to post ... i am MORE than willing to concede that they look about the same ... for the elder bush ... in terms of the gulf war pre and post ... the changes in approval ratings between whites and blacks ... i would be less willing to argue that way

but, unless we have STANDARD ERRORS OF DIFFERENCES IN PROPORTIONS FOR CORRELATED SAMPLES ... to make the comparisons with, i think it is just an exercise in "whatever you think" about the data

i still don't think the person in the video made any egregious misstatements of how the data looked ... and in addition ... if you view the data while watching the video, which is very clear, you could make up your own mind anyway

perhaps you could elaborate on why you think he should have been saying DISproportionate all the time ... what threshold "change" value would have to be evidenced in the data for you to think he should have been speaking in opposite terms?

At 09:59 PM 1/10/02 +, EugeneGall wrote:
>His definition of proportionate would mean that if a group's approval of Bush
>went from 1% to 31%, that too would be proportionate.
>The relative odds would be one way of expressing the changes in proportions,
>but the absolute difference (60% to 90% is roughly proportionate to an
>increase from 33% to 68%) seems quite wrong.
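on the standard-error point raised above: the correlated-samples version needs the before/after covariance, which poll summaries never report ... but the independent-samples approximation (an assumption on my part, not something the thread computes) is a few lines:

```python
# Approximate standard error of a difference between two proportions,
# ASSUMING independent samples (the correlated before/after case would
# need the covariance between the two measurements, which isn't published).
from math import sqrt

def se_diff_props(p1, n1, p2, n2):
    return sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)

# e.g. 60% approval vs. 90% approval, two samples of 600 each
se = se_diff_props(0.60, 600, 0.90, 600)
print(round(se, 4))
```

against an SE of a couple of percentage points, a 30-point jump is obviously real ... but a 30-point jump versus a 35-point jump is much closer to the "whatever you think" zone the post describes.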
Re: Proportionate vs. disproportionate
there are two sets of data ... one for george dubya ... and one for the elder george bush

here is what i glean from the charts ... for george w ... the EVENT was sept 11 ... for the elder george bush ... the EVENT was the gulf war ... and both were before and after ratings

1. whites' approval rating for BOTH ... was much higher than blacks'

2. both whites and blacks jumped rather dramatically (in the 30 percent range) AFTER compared to BEFORE

3. to me, proportionate would be "both increasing" the same approximate % ... disproportionate would imply large differentials in % changes ... in neither case were the % jumps the same ... for each bush ... before and after ... comparing whites and blacks (assuming the data reported in the video are correct) ... so TECHNICALLY ... it is disproportionate ... but ... what about "approximately"???

i think it is a matter of practical differences and semantics ... not really statistically significant differences ... given the ns ... it is possible that the difference in THE differences MIGHT have been significant ... here are the values

george w        pre   post   change
  WHITES         60     90       30
  BLACKS         33     68       35

elder george
  WHITES         64     90       26
  BLACKS         33     70       37

changes for w/b for george w = 30 versus 35 ... changes for w/b for the elder george = 26 versus 37

now, i would be willing to say that there is less difference in change for george w than for the elder george ... in viewing the video ... i did not see that the person really said anything categorical about this ... he used the term "roughly" ... it just depends whether the VIEWER of the video and data wants to think that a 5 point gap versus an 11 point gap means "roughly" the same change ... or not

thus, i don't think the moderator said anything really wrong ...

At 04:27 PM 1/10/02 +, EugeneGall wrote:
>The Gallup organization posted a video to explain why the increase in
>black's job approval for Bush is 'proportionate' to the increase among
>whites.
>Both increased by about 30% (60 to 90 for whites, mid thirties to roughly 70%
>for blacks), so the increase is proportionate, not disproportionate, since
>both increases were about 30%. Unless I'm missing something, and I don't
>think I am, this proportionate - disproportionate error is repeated and
>emphasized several times in the video.
>
>http://www.gallup.com/poll/Multimedia/video/archived/2002/01/vr020108b.ram
>
>Gene Gallagher
>UMASS/Boston
Re: Excel vs Quattro Pro
this is about the most irrelevant argument i have heard ... as though the only stat package is SAS ... there are many excellent stat packages ... even their "student" trimmed-down versions are better than excel add-ons ... and, hundreds of institutions have cheap software purchase options ... at penn state for example ... the full package of minitab is about 90 bucks ... that's not bad for an excellent tool that will serve one's analysis needs well

in addition, students could go to http://www.e-academy.com ... and find that they could lease minitab for 6 months for 26 bucks ... or a year for 50 bucks ... i challenge any person to try a real package (doesn't have to be minitab), see what you can do, and THEN gravitate back to excel's add-ons ...

finally, i find the implied notion below ... that what we need are "free" things ... and that's the way to operate ... to be professionally appalling ... most institutions SHOULD have a good statistical package on their lab systems ... so students can learn with a good tool

then, when and if they decide that they would like that tool (or another) in their professional array of tools ... THEN they could shop around and look for some stat package that is within their own or their employer's reach

the bottom line here seems to be: since excel is "free" ... and around ... use it ... even though we know that it was never designed to be a full-service package for statistics and graphics ... we do our students a huge DISservice when we knowingly push tools (just because they are cheap or free) RATHER than introduce them to better, more useful resources

serious companies and institutions and agencies ... DON'T use excel to do their mainline statistical work ...

At 03:58 AM 1/9/02 +, you wrote:
>Why bother teaching students SAS if nobody can afford their annual license
>fee?
>Spreadsheets work because many people own MS Office and the chances of their
>using skills learned in class are greater.
>Ken
Measurement Position
Now that the holidays are behind (most of) us, I would like to remind possibly interested persons of the Asst./Assoc. Professor position opening in Measurement at Penn State ... in the Educational Psychology program.

We are hoping to find someone with SOME experience beyond the doctorate (but new doctoral recipients will be considered), with the primary instructional focus being some combination of IRT, HLM, and SEM.

The description of the position can be found at:

http://www.ed.psu.edu/employment/edpsymeas.asp

If you have ANY questions about this position, please contact ME directly at: <mailto:[EMAIL PROTECTED]>

dennis roberts, educational psychology, penn state university
208 cedar, AC 8148632401
Re: Excel vs. Specialized stats packages (was: Excel vs Quattro Pro)
>This is an interesting discussion, but the line between a spreadsheet and
>stats package is not so clear-cut these days. If you look at how the major
>stats packages have developed over the last decade, you can see how they
>have copied more and more features from Excel. In fact almost all stats
>packages now boast of containing a fully featured built-in spreadsheet for
>data entry.

certainly minitab makes no such claim ... their worksheet is NOT a spreadsheet

>Looking at the situation from another angle, why can't a spreadsheet be used
>for statistical analysis? Granted, some of Excel's built-in statistical
>functions leave a lot to be desired and should be used with care. But the
>Excel spreadsheet package is still head-and-shoulders above any other
>similar product in terms of ease of use, data entry and collection,
>presentation, programming interfaces, and its excellent integration with
>the other Office applications.

so, i am not sure this has anything to do with statistical analysis

>So if the basic spreadsheet component is sound, and almost all computer and
>non-computer literate users can use Excel without problems, why not just
>extend Excel's statistical capabilities with reliable accurate statistical
>add-ons? Many exist, and we develop a product called "Analyse-it" for this
>very purpose.

i have looked at analyse-it and one other plug-in (plus what comes with excel) ... and, there just is no comparison between them and most of the popular stat packages (well, there is ... and it is not a favorable one)

>A reliable low-cost statistics add-on for Excel can easily bypass these
>problems.

unfortunately though, it does not exist ... here are the major problems with using excel as a stat package, including 3rd party plug-ins (off the top of my head):

1. poor data MANAGEMENT capabilities
2. poor and HIGHLY LIMITED graphics
3. highly limited set of routines to select from
4. inability to work with many random generation functions (for distributions)
5. limited access to important statistical tables

from discussions like this on several lists, it is clear that no argument pro or con will sway those who have opted for or agin using excel as the statistical analysis tool ... but, each side keeps trying

this kind of discussion, though interesting, pales in comparison to a discussion we should be having about the over-reliance on and importance we place in statistical analysis in the first place ... even though i have been in this sort of enterprise for more years than you can shake a stick at ... the reality is that the typical analysis we do has limited practical uses and benefits ... the entire area of statistical significance testing is just one case in point

>James Huntington,
>Analyse-it Software, Ltd.
>
>Analyse-it! accurate low-cost statistical software for
>Microsoft Excel. For more information & to download a
>free evaluation, visit us: http://www.analyse-it.com
Re: Excel vs Quattro Pro
most stat packages have nothing to do with programming anything ... you either use simple commands to do things you want done (like in minitab ... mtb> correlation 'height' 'weight') or select procedures from menus and dialog boxes

At 12:27 AM 1/8/02 +, Kenmlin wrote:
> >i don't know the answer to this but ... i have a general question with
> >regards to using spreadsheets for stat analysis
>
>Many students are computer illiterate and it might be easier to teach them how
>to use the spreadsheet than a formal programming language.
Re: Excel vs Quattro Pro
i don't know the answer to this but ... i have a general question with regards to using spreadsheets for stat analysis

why? ... why do we not help and encourage our students to use tools designed for a task ... rather than substituting something that may just barely get us by? we don't ask stat packages to do what spreadsheets were designed to do ... why the reverse?

just because packages like excel are popular and readily available ... does not mean that we should be recommending it (or them) to people for statistical analysis ... it's like telling people that notepad will be sufficient for all their word processing needs

At 04:56 PM 1/7/02 -0600, Edward Dreyer wrote:
>Does anyone know if Quattro Pro suffers the same statistical problems as
>Excel?
>
>Cheers. ECD
>
>Edward C. Dreyer
>Political Science
>The University of Tulsa
Re: Looking for some datasets
some minitab files and other things are here

http://roberts.ed.psu.edu/users/droberts/datasets.htm
Re: Standardizing evaluation scores
sorry for the late reply

ranking is the LEAST useful thing you can do ... so i would never START with simple ranks ... any sort of an absolute kind of scale ... imperfect as it is ... would generally be better ... one can always convert more detailed scale values INTO ranks at the end if necessary BUT you cannot go the reverse route

say we have 10 people measured on variable X ... and we end up with no ties ... so we get ranks of 1 to 10 ... but these values give NO idea whatsoever as to the differences amongst the 10

if i had a 3 person senior high school class with cumulative gpas of 4.00, 3.97, and 2.38 ... the ranks would be 1, 2, and 3 ... but clearly there is a huge difference between either of the top 2 and the bottom ... and ranks give no clue to this at all

so, my message is ... DON'T START WITH RANKS

At 02:11 AM 12/19/01 +, Doug Federman wrote:
>I have a dilemma which I haven't found a good solution for. I work with
>students who rotate with different preceptors on a monthly basis. A
>student will have at least 12 evaluations over a year's time. A
>preceptor usually will evaluate several students over the same year.
>Unfortunately, the preceptors rarely agree on the grades. One preceptor
>is biased towards the middle of the 1-9 likert scale and another may be
>biased towards the upper end. Rarely does a given preceptor use the 1-9
>range completely. I suspect that a 6 from an "easy" grader is equivalent
>to a 3 from a "tough" grader.
>
>I have considered using ranks to give a better evaluation for a given
>student, but I have a serious constraint. At the end of each year, I
>must submit to another body their evaluation on the original 1-9 scale,
>which is lost when using ranks.
>
>Any suggestions?
>
>--
>"It has often been remarked that an educated man has probably forgotten
>most of the facts he acquired in school and university. Education is what
>survives when what has been learned has been forgotten."
>- B.F. Skinner, New Scientist, 31 May 1964, p. 484
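dennis's gpa example can be checked in a few lines of python ... the gpas are the three from his post; the ranking logic is just an illustrative sketch:

```python
# the gpas are taken from the post above; ranks preserve order
# but throw away the spacing between the scores
gpas = [4.00, 3.97, 2.38]

# rank 1 = highest gpa (no ties here, so a simple sort suffices)
order = sorted(gpas, reverse=True)
ranks = [order.index(g) + 1 for g in gpas]

print(ranks)                        # [1, 2, 3]
print(round(gpas[0] - gpas[1], 2))  # 0.03 -- top two nearly tied
print(round(gpas[1] - gpas[2], 2))  # 1.59 -- huge gap the ranks hide
```

the 0.03 gap and the 1.59 gap both collapse to "one rank apart," which is the information loss being warned about.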
Re: Correlation problem
sum of deviations around a mean always = 0 ... for r you need the sum of the PRODUCTS of the paired deviations, not the sum of either set of deviations alone

>X-X
>1-3.5=-2.5
>2-3.5=-1.5
>3-3.5=-0.5
>3-3.5=-0.5
>4-3.5=0.5
>4-3.5=0.5
>5-3.5=1.5
>6-3.5=2.5
>
>0
>
>As you see the answer is zero. What do I do wrong? and the same with
>Y-Y(with a line above). It turns out to be zero. Please help me to tell
>how I should do.
>
>Janne
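a sketch of the point in python ... the X values are the ones in Janne's post; the Y values are made up for illustration:

```python
import math

# x matches the deviations listed in the post; y is invented here
x = [1, 2, 3, 3, 4, 4, 5, 6]
y = [2, 1, 4, 3, 5, 4, 7, 6]

mx = sum(x) / len(x)   # 3.5
my = sum(y) / len(y)   # 4.0
dx = [xi - mx for xi in x]
dy = [yi - my for yi in y]

# deviations on either variable alone ALWAYS cancel out
print(round(sum(dx), 10))   # 0.0

# the numerator of r is the sum of PRODUCTS of paired deviations
num = sum(a * b for a, b in zip(dx, dy))
r = num / math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
print(round(r, 3))
```

summing each column of deviations separately must give zero by definition of the mean; multiplying the paired deviations first is what makes the numerator informative.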
RE: Statistical illiteracy
At 02:20 PM 12/14/01 -0500, Wuensch, Karl L wrote:
>I came across a table of costume jewelry at a department store with a sign
>that said "150% off." I asked them how much they would pay me to take it
>all off of their hands. I had to explain to them what 150% meant, and
>they then explained to me how percentages are computed in the retail
>trade: first we cut the price in half. Then we cut it in half again. Now we
>have cut it in half a third time. 50% + 50% + 50% = 150% off.

well, if the item was originally $100 ... and you really like this item ... it still means it would be $12.50 ... sounds like a good deal to me!! they might be ILLITERATE statistically but you are still making out like a bandit!

this is sort of like some statistical TEXTbooks that no one really wants ... that end up on bargain shelves in some clearance kinds of bookstores
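the store's arithmetic can be checked directly ... a sketch with a hypothetical $100 item, as in the reply above:

```python
price = 100.00

# "150% off" computed the store's way: three successive 50% cuts
for _ in range(3):
    price *= 0.5
print(price)   # 12.5

# the cuts compound rather than add: the true discount is
# 1 - 0.5**3 = 87.5% off, not 150%
total_off = 1 - 0.5 ** 3
print(total_off)   # 0.875
```

successive percentage cuts multiply, so three 50% cuts never reach a 150% discount no matter what the sign says.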
Re: used books
try ... http://www.bookfinder.com ... you might have luck there

At 11:51 AM 12/12/01 -0700, IPEK wrote:
>Do you know any online used bookstore other than Amazon? I need to find some
>old stat and OR books.
Re: ANOVA = Regression
of course, the typical software program, when doing regression analysis ... prints out a summary ANOVA table ... so there is one place to start ...

At 10:52 AM 12/11/01 -0500, Wuensch, Karl L wrote:
>For a demonstration of the equivalence of regression and traditional ANOVA,
>just point your browser to
>http://core.ecu.edu/psyc/wuenschk/StatHelp/ANOVA=reg.rtf
>
>-Original Message-
>From: Stephen Levine
>Sent: Tuesday, December 11, 2001 3:47 AM
>To: Karl L. Wuensch
>Subject: Re: When does correlation imply causation?
>
>Hi
>You wrote
>
>>Several times I have had to explain to my colleagues that two-group t
>>tests and ANOVA are just special cases of correlation/regression analysis.
>
>I can see what you mean - could you please prove it - I read, in a pretty
>good text, that the results are not necessarily the same!
>Cheers
>S.
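the equivalence being discussed can be demonstrated numerically ... a sketch with made-up data (not from Karl's handout): a pooled-variance two-group t test gives exactly the same t and p as regressing the outcome on a 0/1 group indicator.

```python
from scipy import stats

# two made-up groups of five scores each
group0 = [4.1, 5.2, 3.8, 4.9, 5.0]
group1 = [6.3, 5.9, 7.1, 6.0, 6.8]

# classical two-sample t test (pooled variance)
t, p_t = stats.ttest_ind(group1, group0, equal_var=True)

# same data as a regression on a dummy-coded (0/1) predictor
x = [0] * 5 + [1] * 5
y = group0 + group1
fit = stats.linregress(x, y)

# the slope is the mean difference; the p-values are identical
print(round(p_t, 6), round(fit.pvalue, 6))
```

the slope of the dummy-variable regression equals the difference between the group means, and the test of that slope IS the two-group t test, which is the sense in which ANOVA/t tests are special cases of regression.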
Re: When to Use t and When to Use z Revisited
At 03:42 PM 12/10/01 +, Jerry Dallal wrote:
>Dennis Roberts wrote:
>
> > this is pure speculation ... i have yet to hear of any convincing case
> > where the variance is known but, the mean is not
>
>A scale (weighing device) with known precision.

as far as i know ... the precision of a scale is expressed in terms of 'accurate to within' ... and if there is ANY 'within' attached ... then the accuracy for SURE is not known
Re: When to Use t and When to Use z Revisited
At 04:14 AM 12/10/01 +, Jim Snow wrote:
>"Ronny Richardson" <[EMAIL PROTECTED]> wrote:
>
> > A few weeks ago, I posted a message about when to use t and when to use z.
>
>I did not see the earlier postings, so forgive me if I repeat advice already
>given. :-)
>
> 1. The consequences of using the t distribution instead of the normal
>distribution for sample sizes greater than 30 are of no importance in
>practice.

what's magical about 30? i say 33 ... no, actually, i amend that to 28

> 2. There is no good reason for statistical tables for use in practical
>analysis of data to give figures for t on numbers of degrees of freedom over
>30 except that it makes it simple to routinely use one set of tables when
>the variance is estimated from the sample.

with software, there is no need for tables ... period!

> 3. There are situations where the error variance is known. They
>generally arise when the errors in the data arise from the use of a
>measuring instrument with known accuracy or when the figures available are
>known to be truncated to a certain number of decimal places. For example:
> Several drivers use cars in a car pool. The distance travelled on each
>trip by a driver is recorded, based on the odometer reading. Each
>observation has an error which is uniformly distributed in (0,0.2). The
>variance of this error is (0.2)^2/12 = 0.0033 and the standard deviation is
>0.0577. To calculate confidence limits for the average distance travelled
>by each driver, the z statistic should be used.

this is pure speculation ... i have yet to hear of any convincing case where the variance is known but, the mean is not
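two of the numbers in this exchange can be verified in a few lines ... a sketch (scipy is assumed) checking the quoted uniform-error variance and the size of the t-versus-z gap past 30 degrees of freedom:

```python
from scipy import stats

# variance of an error uniform on (a, b) is (b - a)^2 / 12
var = (0.2 - 0.0) ** 2 / 12
print(round(var, 5), round(var ** 0.5, 4))   # 0.00333 0.0577

# how different are the two-sided 95% critical values past 30 df?
z = stats.norm.ppf(0.975)
t30 = stats.t.ppf(0.975, df=30)
print(round(z, 3), round(t30, 3))   # 1.96 2.042
```

the t critical value at 30 df is still about 4% larger than z, which is the practical content of the "greater than 30" rule of thumb being debated.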
Re: Evaluating students: A Statistical Perspective
At 08:08 PM 12/7/01 +, J. Williams wrote:
>On 6 Dec 2001 11:34:20 -0800, [EMAIL PROTECTED] (Dennis Roberts) wrote:
>
> >if anything, selectivity has decreased at some of these top schools due to
> >the fact that given their extremely high tuition ...

i was just saying that IF anything had happened ... it might have gone down ... i was certainly not saying that it had ... but i do think that it could probably not get much more selective ... so it has probably sort of stayed where it has been over the decades ... so if grade inflation has occurred there, it would not likely be due to an increasingly smarter incoming class
Re: Experimental Correlation Coefficients
i would say that karl has demonstrated that IF we know the conditions of manipulation or not ... we can have a better or lesser idea of what (if anything) impacted (caused?) what ... that i will grant him

but to argue that r or eta has anything to do with this ... i would respectfully disagree ... they are just byproducts of our manipulations ... one cannot equate the byproducts with THE manipulations ...

At 03:23 PM 12/6/01 -0500, Wuensch, Karl L wrote:
> My experimental units are 100 classrooms on campus. As I walk into
>each room I flip a perfectly fair coin in a perfectly fair way to determine
>whether I turn the room lights on (X = 1) or off (X = 0). I then determine
>whether or not I can read the fine print on my bottle of smart pills (Y = 0
>for no, Y = 1 for yes). From the resulting pairs of scores (one for each
>classroom), I compute the phi coefficient (which is a Pearson r computed
>with dichotomous data). Phi = .5. I test and reject the null hypothesis
>that phi is zero in the population (using chi-square as the test statistic).
>Does correlation (phi is not equal to zero) imply causation in this case?
>That is, can I conclude that turning the lights on affects my ability to
>read fine print?

could be here that if you did not have glasses ... you could not have read anything with or without light ... and since you did have glasses ... the r you get is because of the implicit interaction between light or not and glasses or not

> ~~~
>Karl L. Wuensch, Department of Psychology,
>East Carolina University, Greenville NC 27858-4353
>Voice: 252-328-4102 Fax: 252-328-6283
>mailto:[EMAIL PROTECTED]
>http://core.ecu.edu/psyc/wuenschk/klw.htm
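karl's parenthetical (phi is just a pearson r on dichotomous data, tested via chi-square) can be illustrated numerically ... a sketch with made-up 2x2 counts, NOT his actual classroom data:

```python
import numpy as np
from scipy.stats import chi2_contingency

# hypothetical 2x2 table with a strong association
#                 y=0  y=1
table = np.array([[40, 10],    # x = 0
                  [10, 40]])   # x = 1

# lay the table out as 100 pairs of 0/1 scores; phi is the ordinary pearson r
x = np.array([0] * 50 + [1] * 50)
y = np.array([0] * 40 + [1] * 10 + [0] * 10 + [1] * 40)
phi = np.corrcoef(x, y)[0, 1]

# the same number via the chi-square statistic: phi = sqrt(chi2 / n)
chi2 = chi2_contingency(table, correction=False)[0]
phi_from_chi2 = (chi2 / table.sum()) ** 0.5

print(round(phi, 3), round(phi_from_chi2, 3))   # 0.6 0.6
```

the identity chi2 = n * phi^2 is why the chi-square test of the 2x2 table is simultaneously a test that phi (the pearson r on the 0/1 scores) is zero.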
Re: Evaluating students: A Statistical Perspective
generally speaking, it is kind of difficult to muster sufficient evidence that the amount of grade inflation that is observed ... within and across schools or colleges ... is due to an increase in student ability

i find it difficult to believe that the average ability at a place like harvard has gone up ... or if so, not by very much over the years ... if anything, selectivity has decreased at some of these top schools due to the fact that, given their extremely high tuition ... they need to keep their dorms full ... and making standards higher and higher would have the opposite effect on keeping dorms filled

At 11:58 AM 12/6/01 -0500, Rich Ulrich wrote:
>Just in case someone is interested in the Harvard instance
>that I mentioned -- while you might get the article from a newsstand
>or a friend --
>
>On Sun, 02 Dec 2001 19:19:38 -0500, Rich Ulrich <[EMAIL PROTECTED]>
>wrote:
Re: When does correlation imply causation?
i repeat ... the r value shows the extent to which a straight line (in a 2 variable problem) can pass through a scatterplot and be close TO the data points

in that sense, r is an index value for the extent to which a straight line MODEL fits the data ... knowing how the dots on the scatterplot got to be there ... is totally outside the realm of what r can know

At 10:06 AM 12/6/01 -0700, Alex Yu wrote:
>Whether we can get causal inferences out of correlations and equations has
>been a dispute between two camps:
>
>For causation: Clark Glymour (philosopher), Pearl (computer scientist),
>James Woodward (philosopher)
>
>Against: Nancy Cartwright (economist and philosopher), David Freedman
>(mathematician)
>
>One comment from this list is that causal inferences cannot be
>drawn from non-experimental designs. Clark Glymour asserts that using
>the Causal Markov condition and the faithfulness assumption, we can make
>causal interpretations of non-experimental data.
>
>Chong-ho (Alex) Yu, Ph.D., MCSE, CNE, CCNA
>Psychometrician and Data Analyst
>Assessment, Research and Evaluation
>Cisco Systems, Inc.
>Email: [EMAIL PROTECTED]
>URL: http://seamonkey.ed.asu.edu/~alex/
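the claim that r indexes straight-line fit has an exact algebraic form ... a sketch with invented data: for the least-squares line, r^2 equals 1 - SSE/SST, the proportion of variance the line accounts for.

```python
import numpy as np

# made-up scatter (not from the thread)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.1, 2.9, 4.2, 4.8, 6.1, 6.9])

r = np.corrcoef(x, y)[0, 1]

# fit the best straight line and compare r^2 with 1 - SSE/SST
slope, intercept = np.polyfit(x, y, 1)
pred = slope * x + intercept
sse = np.sum((y - pred) ** 2)   # error around the line
sst = np.sum((y - y.mean()) ** 2)   # total variation around the mean
print(round(r ** 2, 6), round(1 - sse / sst, 6))   # the two agree
```

r says only how close the dots sit to the best straight line; nothing in sse or sst records how the dots came to be where they are, which is the point being made about causation.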
Re: Stat question
the reality of this is ... sometimes getting notes from other students is helpful ... sometimes it is not ... there is no generalization one can make about this

most students who NEED notes are not likely to ask people other than their friends ... and, in doing so, probably know which of their friends they have the best chance of getting good notes from ... (at least READABLE notes!) ... even lazy students are not likely to ask for notes from people that even THEY know are not going to do them any good

but i don't think we can say anything really systematic about this activity other than, sometimes it helps ... sometimes it does not

At 06:24 PM 12/5/01 -0800, Glen wrote:
>Jon Miller <[EMAIL PROTECTED]> wrote in message
>
> > You can ask the top students to look at their notes, but you should be
> > prepared to find that their notes are highly idiosyncratic. Maybe even
> > unusable.
>
>Having seen notes of some top students on a variety of occasions
>(as a student and as a lecturer), that certainly does happen
>sometimes. But just about as likely is to find a set of notes that
>are actually better than the lecturer would prepare themselves.
>
>Glen
RE: When does correlation imply causation?
what if we see, from the side, one person on the right swing his/her fist ... and, as the fist arrives at what appears to be the impact point on the face of a person on the left ... IMMEDIATELY the person on the left falls backwards and down

now, we do this over and over again ... and observe the same result ... we want to say that the impact to the FACE ... CAUSED the person on the left to fall down ... but did it?

in a sense, this is like a perfect correlation in that ... when the person swung to the RIGHT ... the person on the left NEVER fell backwards and down ... with an r = 1, karl would say that this has removed ALL extraneous factors from the possibility of an explanation for why the person fell ... or did not fall

however, UNBEKNOWNST to the viewer ... there was a clear panel between the two people ... the thrower of the punch and the one on the left ... and each time the person threw a punch ... and it LOOKED like the punch TO the person "caused" a fall ... in actuality there was an electronic triggering mechanism ... when the panel was activated ... an electrical shock was applied to the person on the left ... that made the person fall down

it could be that if you viewed this from the angle of both people ... 90 degrees rotated ... you could see that there was a clear separation of 2 or more feet between where the punch hit the clear panel and ... where the face was leaning up against the panel ... so the punch did NOT directly touch the face ... and in that sense could not have caused the person to fall backwards and down

even though these may be the "facts" ... the sideways view has clearly missed that there were OTHER things in the chain of events ... that LED to the person falling backwards and down

r values ... even perfect ones ... are in an impossible statistical position to say anything about cause ... or, more importantly, what causes what

the r that goes along with some manipulative procedure that varies X and produces some corresponding Y change ... is just a tag-along statistic ... and may not really (even when perfect) suggest anything about HOW X, when manipulated, PRODUCED the Y change that we see ... thus, the use of r in this case as an index of how MUCH X CAUSES Y ... is a statistical stretch
RE: When does correlation imply causation?
perhaps the problem here is with the word ... "cause"

say i put in a column some temps in F ... then use the F to C formula ... and get the corresponding C values ... then i do an r between the two and find 1.0 ... now, is the formula the "cause" of the r of 1? maybe we might see it as a cause but ... then again ... if i had Cs and changed to Fs ... i get the same thing ...

now, what if we have 3 dosage levels (we manipulate) of 10, 20, and 30 ... and we find that on the criterion, the 3 points fall exactly on a straight line, and an r = 1:

MTB > corr c1 c2
Correlations: dosage, res1
Pearson correlation of dosage and res1 = 1.000
P-Value = *

but for another set of data, the 3 points follow a curve rather than a straight line:

MTB > corr c1 c3
Correlations: dosage, res2
Pearson correlation of dosage and res2 = 0.982
P-Value = 0.121

and an r = .982

the fact is that we could find an equation (not linear) where the dots above would be "hit" perfectly ... ie, find another model where there is no error ... but the r is not 1 ... so does that mean that some other factor is detracting from our "cause" of Y? the fact that the pattern is NOT linear (of which r is a function) ... is not a detraction ... if the model fails to find an r = 1, that does not mean there is a lack of perfect "cause", whatever that means ... only that the model does not detect it

so, in that sense, lower rs do not necessarily mean that there are other "errors" or "extraneous" factors that enter ... i would not call using the wrong model an "extraneous" factor

At 02:12 PM 12/5/01 -0500, Wuensch, Karl L wrote:
>Dennis warns "the problem with this is ... does higher correlation mean MORE
>cause? lower r mean LESS cause?
>in what sense can one think of cause being more or less? you HAVE to think
>that way IF you want to use the r value AS an INDEX MEASURE of cause ..."
>
>Dennis is not going to like this, since he has already expressed a disdain
>for r-square, omega-square, and eta-square like measures of the strength of
>effect of one variable on another, but here is my brief reply:
>
>R-square tells us to what extent we have been able to eliminate, in our data
>collection procedures, the contribution of other factors which influence the
>dependent variable.
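both halves of this argument can be reproduced in a few lines ... a sketch with invented numbers: a linear formula (F to C) forces r = 1, while a perfectly deterministic but NONlinear rule gives r < 1 even though there is no "error" at all.

```python
import numpy as np

# the F-to-C example: an exact linear transform gives r = 1 either direction
f = np.array([32.0, 50.0, 68.0, 86.0, 104.0])
c = (f - 32) * 5 / 9
r_linear = np.corrcoef(f, c)[0, 1]
print(round(r_linear, 6))   # 1.0

# a deterministic but nonlinear rule: y is EXACTLY x squared, no error,
# yet the straight-line index r falls short of 1
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = x ** 2
r_curve = np.corrcoef(x, y)[0, 1]
print(round(r_curve, 3))   # 0.981
```

the second r (about .98) mirrors the minitab output in the post: the relationship is perfectly lawful, but the linear model simply fails to detect that, which is the argument against reading r as an amount of "cause."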
Re: When does correlation imply causation?
At 07:36 AM 12/5/01 -0500, Karl L. Wuensch wrote:
> Accordingly, I argue that correlation is a necessary but not a
> sufficient condition to make causal inferences with reasonable
> confidence. Also necessary is an appropriate method of data
> collection. To make such causal inferences one must gather the data by
> experimental means, controlling extraneous variables which might confound
> the results. Having gathered the data in this fashion, if one can
> establish that the experimentally manipulated variable is correlated with
> the dependent variable (and that correlation does not need to be linear),
> then one should be (somewhat) comfortable in making a causal
> inference. That is, when the data have been gathered by experimental
> means and confounds have been eliminated, correlation does imply causation.

the problem with this is ... does higher correlation mean MORE cause? a lower r, LESS cause? in what sense can one think of cause as being more or less? you HAVE to think that way IF you want to use the r value AS an INDEX MEASURE of cause ...

personally, i think it is dangerous in ANY case to say that r = cause ... if you can establish that as A goes up ... so does B ... where you manipulated A and measured B ... (or vice versa) ... then it is fair to say that the causal connection THAT IS IMPLIED BECAUSE OF THE WAY THE DATA WERE MANIPULATED/COLLECTED also has a concomitant r ... BUT i think one still needs to be cautious when then claiming that the r value itself is an indicant OF cause

> So why is it that many persons believe that one can make causal
> inferences with confidence from the results of two-group t tests and
> ANOVA but not with the results of correlation/regression techniques? I
> believe that this delusion stems from the fact that experimental research
> typically involves a small number of experimental treatments and that data
> from such research are conveniently evaluated with two-group t tests and
> ANOVA.
> Accordingly, t tests and ANOVA are covered when students are
> learning about experimental research. Students then confuse the
> statistical technique with the experimental method. I also feel that the
> use of the term "correlational design" contributes to the problem. When
> students are taught to use the term "correlational design" to describe
> nonexperimental methods of collecting data, and cautioned regarding the
> problems associated with inferring causality from such data, the students
> mistake correlational statistical techniques with "correlational" data
> collection methods. I refuse to use the word "correlational" when
> describing a design. I much prefer "nonexperimental" or "observational."
>
> In closing, let me be a bit picky about the meaning of the word
> "imply." Today this word is used most often to mean "to hint" or "to
> suggest" rather than "to have as a necessary part." Accordingly, I argue
> that correlation does imply (hint at) causation, even when the
> correlation is observed in data not collected by experimental means. Of
> course, with nonexperimental models, the potential causal explanations of
> the observed correlation between X and Y must include models that involve
> additional variables and which differ with respect to which events are
> causes and which effects.
>
>--
>Karl L. Wuensch, Department of Psychology,
>East Carolina University, Greenville NC 27858-4353
>Voice: 252-328-4102 Fax: 252-328-6283
>mailto:[EMAIL PROTECTED]
>http://core.ecu.edu/psyc/wuenschk/klw.htm
Re: When does correlation imply causation?
correlation NEVER implies causation ... and i agree with mike totally

At 09:01 AM 12/5/01 -0600, Mike Granaas wrote:
>We really need to emphasize over and over that it is the manner in which
>you collect the data and not the statistical technique that allows one to
>make causal inferences.
>
>Michael
>
>***
>Michael M. Granaas
>Associate Professor            [EMAIL PROTECTED]
>Department of Psychology
>University of South Dakota     Phone: (605) 677-5295
>Vermillion, SD 57069           FAX: (605) 677-6604
>***
>All views expressed are those of the author and do not necessarily
>reflect those of the University of South Dakota, or the South
>Dakota Board of Regents.
glass and stanley
does anyone know where i might get a copy of the 1970 book, stat. methods in ed. and psy. ... by gene glass and julian stanley? mine seems to have disappeared and i would like to retrieve a copy

thanks for any leads ... PLEASE SEND ME A PERSONAL NOTE IF YOU HAVE ANY INFO ... and not to the list ... mailto:[EMAIL PROTECTED]
Re: experimental design(doubt)?
i think this points out that it is hard to give really good responses sometimes when all the details are not known ... in this case, we really don't have sufficient information on HOW samples were selected and assigned, the METHODS and order in which items were heated and then porcelain applied, and so on

for it to be totally randomized, we would have to have something akin to ... taking the 72 samples ... assigning them to temp and porcelain type first ... then executing this "design" in that order ... it could be that sample 1 got 430 degrees and type B, sample 2 got 700 degrees with porcelain type B, and so on

but we can be fairly sure that is NOT what happened ... because that would have created implementation problems ... we know that in factories there are runs of different items at different times ... they might have a run of X from 8AM to NOON, then a transition period before Y gets done from 1PM to 5PM ... in this instance, it probably was the case that all 24 were heated to the first temp ... then, when that was all done, the oven was revved up to the next higher temp and the next 24 were heated ... and a final heat-up to the last temp saw the final 24 done

so, this is not exactly a totally randomized plan ... since there could have been some systematic difference from one batch to the next

we also don't know how the porcelain was applied ... it might have been that after all were heated to temp 1 ... the first 12 that came out of the oven were given porcelain type A ... since this was easier to do ... and the last 12 got (after the changeover) porcelain type B

if either temp or type of porcelain made a BIG difference, these procedural details probably don't amount to a hill of beans ... but, of course, if the impacts (though maybe real) were very small, then some systematic error might make a difference ... as i said ... we just don't know

however, i think trying to give the "design" the proper NAME is really not that important ... the really important matter is whether the implementation of the plan was sufficiently close to a fully randomized design that an analysis according to that design would be satisfactory

bottom line: the snippets of information given to the list do not necessarily allow us to field the ensuing inquiries ... it is better to probe more about methods and procedures first ... than to rush off with some analysis conclusion ... BUT WE DO IT ANYWAY!

>If the samples have been treated independently, that is each sample is
>individually raised to the randomly assigned temperature and
>subsequently treated with the assigned porcelain type, the design is a
>completely randomized design. Any application effects (including
>possible deviations of supposed temperatures and irregularities during
>the whole process of heating and subsequent cooling) will contribute to
>the random error of the observations. But when all samples of the same
>temperature treatment are simultaneously put in the furnace and
>treated as one batch the situation is different. In that case
>application effects (whose existence and magnitude is not known
>generally) are confounded with the effects of the temperature
>treatment. In my comment I supposed and stated that probably this was
>the situation at hand. I have to admit that the original message is
>not entirely clear on this point.
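to make the contrast concrete, here is a little sketch (python, with made-up sample IDs ... everything in it is hypothetical, nothing comes from the actual study) of what a fully randomized execution of the 3 by 2 plan would look like ... every sample gets its temp/porcelain combination up front, and the processing order itself is shuffled, so batch-to-batch oven drift can't line up with temperature:

```python
import random

random.seed(1)

# the six treatment combinations of the 3 x 2 design
temps = [430, 700, 900]
porcelains = ["A", "B"]
cells = [(t, p) for t in temps for p in porcelains]

# 72 samples, 12 per cell, assigned at random up front
samples = list(range(1, 73))
random.shuffle(samples)
assignment = {}
for i, sample in enumerate(samples):
    assignment[sample] = cells[i // 12]

# in a truly randomized run, the processing ORDER is also random,
# so any systematic oven/batch effect spreads over all six cells
run_order = list(assignment.items())
random.shuffle(run_order)

counts = {}
for cell in assignment.values():
    counts[cell] = counts.get(cell, 0) + 1
print(counts)  # every cell should hold exactly 12 samples
```

contrast this with heating all 24 samples of one temperature as a single batch ... there, whatever is peculiar to that batch rides along with the temperature effect, which is the confounding described above.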
Re: Stat question
At 06:13 PM 12/1/01 -0500, Stan Brown wrote:
>Jon Miller <[EMAIL PROTECTED]> wrote in sci.stat.edu:
> >Stan Brown wrote:
> >> I would respectfully suggest that the OP _first_ carefully study the
> >> textbook sections that correspond to the missed lectures, get notes from
> >> a classmate
> >
> >This part is of doubtful usefulness.
>
>Doubtful? It is "of doubtful usefulness" to get notes from a
>classmate and study the covered section of the textbook? Huh?

perhaps doubtful IF the students the OP asked to borrow notes from were terrible students who took terrible notes ... and/or the OP could not make anything of the text when reading it ... but those are two big ifs

usually, students won't ask to see the notes of students whom they know are "not too swift" ... and, also ... students who read the book usually do get something out of it ... maybe not enough

the issue here is ... it appeared (though we have no proof of this) that the original poster did little, if anything, on his/her own prior to posting a HELP to the list ... stan seemed to be reacting to that assumption, and i don't blame him

>--
>Stan Brown, Oak Road Systems, Cortland County, New York, USA
> http://oakroadsystems.com/
>"My theory was a perfectly good one. The facts were misleading."
>-- /The Lady Vanishes/ (1938)
Re: Interpreting p-value = .99
At 08:29 AM 12/1/01 -0500, Stan Brown wrote:
>How I would analyze this claim is that, when the advertiser says
>"90% of people will be helped", that means 90% or more. Surely if we
>did a large controlled study and found 93% were helped, we would not
>turn around and say the advertiser was wrong! But I think that's
>what would happen with a two-tailed test.
>
>Can you explain a bit further?

would the advertiser feel he/she was wrong if the 90% value was a little less too ... within some margin of error of 90? probably not

perhaps you want to say that the advertiser is claiming around 90% or MORE, or at LEAST 90% ... but again ... we are getting far too hung up on how some hypothesis is stated ... is not the more important matter ... what sort of impact is there? if that is the question ... testing a null ... ANY null ... is really not going to help you

you need to look at the SAMPLE data ... then ask yourself ... what sort of real effect might there be, given the sample results i got? if you then want to superimpose on this the question ... i wonder if 90 or more could have been the truth ... fine, but that is an afterthought ... this does not call for a hypothesis test

>--
>Stan Brown, Oak Road Systems, Cortland County, New York, USA
> http://oakroadsystems.com
>My reply address is correct as is. The courtesy of providing a correct
>reply address is more important to me than time spent deleting spam.
Re: experimental design(doubt)?
At 03:11 PM 11/30/01 -0200, Ivan Balducci wrote:
>Hi, members:
>Please, I am hoping to get some clarification on this problem:
>In my University (I work in a Dental School, in Brazil),
>a dentist brought me her data:
>Experimental unit: cylindrical samples of pure titanium (72 samples): diameter 4mm; height 5mm,
>submitted to a shear test (Instron).
>The Ti samples were heated in a furnace:
>24 samples to 430ºC;
>24 samples to 700ºC;
>24 samples to 900ºC.
>Afterward:
>12 samples (from 430ºC) received porcelain type A
>12 samples (from 430ºC) received porcelain type B
>12 samples (from 700ºC) received porcelain type A
>12 samples (from 700ºC) received porcelain type B
>12 samples (from 900ºC) received porcelain type A
>12 samples (from 900ºC) received porcelain type B
>
>Objective: the interaction effect between the variables Temperature and
>Porcelain on the shear data.
>TIA,
>ivan
>My question is: did she use a split-plot design?
>Whole plot: Temperature
>Split: Porcelain.

looks like a simple randomized design to me ... in effect, you have selected 12 at random, raised them to 430 AND given porcelain type A ... and likewise for the other 5 combinations ... a 3 by 2 design ... fully randomized ... unless i am missing something
Re: Interpreting p-value = .99
forget the statement of the null ... build a CI ... perhaps a 99% one (which would correspond to your .01 sig. test) ... and let that help determine whether the claim seems reasonable or not

in this case ... p hat = .85 ... thus q hat = .15

the standard error of a proportion (given SRS was done) is about

stan error of p hat = sqrt ((p hat * q hat) / n) = sqrt (.85 * .15 / 200) = about .025

the approximate 99% CI would be about p hat +/- 2.58 * .025 = .85 +/- .06

so the CI would run from about .79 to .91 ... thus, IF you insist on a hypothesis test ... retain the null

personally, i would rather say that the pop. proportion might be between (about) .79 and .91 ... that doesn't hold me to .9

the problem here is that if you had opted for .05 ... you would have rejected ... (just barely)

At 02:39 PM 11/29/01 -0500, you wrote:
>On a quiz, I set the following problem to my statistics class:
>
>"The manufacturer of a patent medicine claims that it is 90%
>effective(*) in relieving an allergy for a period of 8 hours. In a
>sample of 200 people who had the allergy, the medicine provided
>relief for 170 people. Determine whether the manufacturer's claim
>was legitimate, to the 0.01 significance level."
>
>(The problem was adapted from Spiegel and Stevens, /Schaum's
>Outline: Statistics/, problem 10.6.)
>
>I believe a one-tailed test, not a two-tailed test, is appropriate.
>(It would be silly to test for "effectiveness differs from 90%" since
>no one would object if the medicine helps more than 90% of
>patients.)
>
>Framing the alternative hypothesis as "the manufacturer's claim is
>not legitimate" gives
> Ho: p >= .9; Ha: p < .9; p-value = .0092
>on a one-tailed test. Therefore we reject Ho and conclude that the
>drug is less than 90% effective.
>
>But -- and in retrospect I should have seen it coming -- some
>students framed the hypotheses so that the alternative hypothesis
>was "the drug is effective as claimed." They had
> Ho: p <= .9; Ha: p > .9; p-value = .9908.
>
>Now as I understand things it is not formally legitimate to accept
>the null hypothesis: we can only either reject it (and accept Ha) or
>fail to reject it (and draw no conclusion). What I would tell my
>class is this: the best we can say is that p = .9908 is a very
>strong statement that rejecting the null hypothesis would be a Type
>I error. But I'm not completely easy in my mind about that, when
>simply reversing the hypotheses gives p = .0092 and lets us conclude
>that the drug is not 90% effective.
>
>There seems to be a paradox: The very same data lead either to the
>conclusion "the drug is not effective as claimed" or to no
>conclusion. I could certainly tell my class: "if it makes sense in
>the particular situation, reverse the hypotheses and recompute the
>p-value." Am I being over-formal here, or am I being horribly stupid
>and missing some reason why it _would_ be legitimate to draw a
>conclusion from p=.9908?
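just to tie the numbers together, here is a quick sketch (python ... i am assuming scipy is at hand for the normal distribution) that reproduces both the quoted one-tailed p-value of .0092 and the approximate 99% CI above:

```python
import math
from scipy.stats import norm

n, x = 200, 170
p_hat = x / n          # 0.85
p0 = 0.90              # the manufacturer's claimed proportion

# one-tailed z test of Ho: p >= .9 vs Ha: p < .9
# (normal approximation, SE computed under the null)
se_null = math.sqrt(p0 * (1 - p0) / n)
z = (p_hat - p0) / se_null
p_value = norm.cdf(z)
print(round(p_value, 4))   # about .0092, matching the quoted value

# approximate 99% CI for the population proportion,
# with the SE computed from the sample itself
se_hat = math.sqrt(p_hat * (1 - p_hat) / n)
half = norm.ppf(0.995) * se_hat
print(round(p_hat - half, 3), round(p_hat + half, 3))  # about .785 to .915
```

note that the CI comfortably covers .90 (so the .01-level test retains), while the one-tailed p-value sits just under .01 ... which is exactly the "just barely" tension described above.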
Re: Evaluating students: A Statistical Perspective
At 01:35 PM 11/28/01 -0600, jim clark wrote:
>Hi
>
>On Tue, 27 Nov 2001, Thom Baguley wrote:
> > I'd argue that they probably aren't that independent. If I ask three
> > questions all involving simple algebra and a student doesn't
> > understand simple algebra they'll probably get all three wrong. In
> > my experience most statistics exams are better represented by a
> > bimodal (possibly a mix of two skewed normals) than a normal
> > distribution. Essay based exams tend to end up with a more unimodal
> > distribution (though usually still skewed).
>
>The distribution of grades will depend on the distribution of
>difficulties of the items, one of the elements examined by
>psychometricians in the development of professional-quality
>assessments.

well, not exactly ... it depends jointly on how hard the items turn OUT to be AND on where i set the cut scores for grades

items can be really difficult ... but the scores can still exhibit some spread ... hence my distribution of grades may or may not exhibit spread, depending on where i set the A, B, etc. points

item difficulties will (usually) determine the general SHAPE of the distribution of SCORES ... but grades sit on top of scores and do NOT have to conform to the shape of the distribution of scores ... unless your semantics equate the term grades with the term scores ...
Q and t
in what way are the critical Q values and t connected? someone pointed out that Q = t * sqrt(2)

for example, doing a tukey follow-up test ... with 3 means and 27 df for MSerror ... the critical (.95, one tail) value for Q is about 3.51 (note: the mtb output shows this cv of 3.51)

now, that means that t * sqrt(2) = 3.51 ... or, t will equal about 2.48 in this case

in minitab, doing a oneway anova ... or glm ... with tukey as the paired comparison method ... the 95% CI for the paired difference in means (for the problem i was working with) did go from (mean difference) +/- 2.48 * the standard error of the difference

my question is ... how would we go to the standard t table ... and find a t value like 2.48 ... given the df value we have above (27) ... or knowing that we have 3 means to compare? that is ... how would someone know how to construct this 95% CI that mtb prints out when doing tukey follow-up tests?
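here is a numerical check of the Q = t * sqrt(2) relation for this case (a python sketch; i'm assuming a scipy recent enough, 1.7 or later, to carry the studentized range distribution):

```python
import math
from scipy.stats import studentized_range, t

k, df = 3, 27                      # 3 means, 27 error df

# upper .95 critical value of the studentized range
q_crit = studentized_range.ppf(0.95, k, df)
print(round(q_crit, 2))            # about 3.51, as in the mtb output

# the "t-like" multiplier in the tukey CIs
t_equiv = q_crit / math.sqrt(2)
print(round(t_equiv, 2))           # about 2.48

# note this 2.48 is NOT findable in an ordinary t table for 27 df:
# the plain two-tailed 95% t value is different
print(round(t.ppf(0.975, df), 2))  # about 2.05
```

the point of the last line: the 2.48 multiplier comes from the studentized range (which accounts for comparing 3 means), not from any tail area of the ordinary t distribution.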
Meas. Position
The Educational Measurement position (Assistant/Associate Professor) at Penn State University has been officially posted at the College of Education website. If you have any interest in this, or know of someone who might, please pass along the link below. Thanks.

http://www.ed.psu.edu/employment/edpsymeas.asp

This should also be posted at AERA.NET shortly.
Re: anyone using datadesk ?
datadesk is now linked on minitab's homepage ... as minitab is linked on datadesk's homepage ...

At 11:36 AM 11/23/01 +, Frank Mattes wrote:
>Hi
>I discovered datadesk (www.datadesk.com), an application which allows
>graphical exploration of your data.
>I'm wondering if anyone is using this software
>
>Thanks
>
>Frank Mattes
Re: When Can We Really Use CLT & Student t
At 12:49 PM 11/21/01 -0500, Ronny Richardson wrote:
>As I understand it, the Central Limit Theorem (CLT) guarantees that the
>distribution of sample means is normally distributed regardless of the
>distribution of the underlying data as long as the sample size is large
>enough and the population standard deviation is known.

nope ... the clt says nothing of the kind

it says that regardless of the shape of the target population ... as n increases, the shape of the sampling distribution of means is better and better APPROXIMATED by the normal distribution

that is, even if the target population is quite different from normal ... if we take decent sized samples ... we can say, and not be TOO wrong, that the sampling distribution of means looks something like a normal

here is a quick simulation taking samples of n=50 from a chi square distribution with 1 df [character dotplot of the resulting sample means omitted ... the means ranged from about 0.30 to 1.80, piling up roughly symmetrically around 1]

even though the chi square distribution is radically + skewed, the sampling distribution looks pretty darn close to a normal distribution ... but it never will be exactly one ... the clt does NOT say that the sampling distribution will GET to and BECOME a normal distribution ... if the population is not normal, the sampling distribution will not be normal regardless of n ... but it could be that your EYES could not tell the difference

>It seems to me that most statistics books I see over optimistically invoke
>the CLT not when n is over 30 and the population standard deviation is
>known but anytime n is over 30. This seems inappropriate to me or am I
>overlooking something?

you are mixing two metaphors ... if we know the sd of the population ... then we know the real sampling error ... ie, the standard error of the mean ... if we do NOT know the population sd, and substitute our estimate of it from the sample, then we are only ESTIMATING the standard error of the mean

thus ... knowing or not knowing the population sd determines whether we know, or can only estimate, the real standard error ... but this is unconnected with the shape of the sampling distribution ... the shape of the sampling distribution is partly a function of the shape of the population AND the random sample size

>When the population standard deviation is not known (which is almost all the
>time) it seems to me that the Student t (t) distribution is more
>appropriate. However, t requires that the underlying data be normal, or at
>least not too non-normal. My expectation is that most data sets are not
>nearly "normal enough" to make using t appropriate.
>
>So, if we do not know the population standard deviation and we cannot
>assume a normal population, what should we be doing - as opposed to just
>using the CLT as most business statistics books do?
>
>Ronny Richardson
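a sketch of the same sort of simulation (python with numpy, which i'm assuming is at hand) ... means of n=50 draws from a chi square with 1 df:

```python
import numpy as np

rng = np.random.default_rng(0)
reps, n = 10000, 50

# 10000 sample means, each based on n=50 draws from chi-square(1)
means = rng.chisquare(df=1, size=(reps, n)).mean(axis=1)

# chi-square(1) itself is radically + skewed (skewness about 2.83),
# but the sampling distribution of the mean is much more symmetric
centered = means - means.mean()
skew = (centered ** 3).mean() / means.std() ** 3

print(round(means.mean(), 2))  # near 1, the chi-square(1) mean
print(round(skew, 2))          # small but NOT zero: close to normal, never exactly normal
```

the residual skewness (roughly 2.83/sqrt(50), or about 0.4) is exactly the point above: the sampling distribution approaches normality but never becomes normal for any finite n when the parent population is non-normal.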
best inference
on this near holiday ... at least in the usa ... i wonder if you might consider for a moment:

what is the SINGLE most valuable concept/procedure/skill (just one!) that you think is most important to pass along to students studying "inferential statistics"?

what i am mainly looking for would be answers like: the notion of being able to do __ ... that sort of thing

something that, if ANY instructor in stat (say at the introductory level) failed to discuss and emphasize ... he/she would really be missing the boat and doing a disservice to students
Re: CIs for adjusted rates
At 08:35 AM 11/21/01 -0600, Scheltema, Karen wrote:
>I have complication rates for a given procedure. I was thinking of using
>indirect standardization as a method of risk adjustment given that some
>doctors see more complex patients. What I can't figure out is how I would
>go about calculating a 95% CI after the risk adjustment. Any pointers would
>be greatly appreciated.

why not define groups of doctors or cases as ... less complex patients ... and more complex patients ... and do the confidence intervals separately (you might do one overall interval to see how it compares to the two separate ones ... perhaps there is little difference)

one problem i do see is that some doctors deal with both ... so the two classes above (less complex, more complex) are not totally independent

>Karen Scheltema, M.A., M.S.
>Statistician
>HealthEast, Research and Education
>1700 University Ave W
>St. Paul, MN 55104
>(651) 232-5212 (phone)
>(651) 641-0683 (fax)
>[EMAIL PROTECTED]
Meas. Position
The EdPsy program at Penn State is now recruiting a faculty member strong in quantitative skills, particularly in Measurement, to begin Fall Semester 2002. This is a regular tenure-track line at the Assistant or Associate Professor level. Until our College of Education webpage posts the official job advertisement (and it is posted to locations like AERA.NET, etc.), I have TEMPORARILY posted the job notice at the following site:

http://roberts.ed.psu.edu/users/droberts/Meas1.htm

If you have any questions, please do NOT address them to the list ... send me a note instead at: <mailto:[EMAIL PROTECTED]>
Re: Maximized lambda4
At 01:49 PM 11/19/01 -0500, Wuensch, Karl L wrote:
> Callender and Osburn (Educational and Psychological Measurement,
>1977, 37, 819-825) developed a method for estimating maximized lambda4, the
>greatest split-half reliability coefficient among all possible split halves
>for a scale. The method is quite tedious to do by hand, and the authors
>provided a FORTRAN program for accomplishing it. Not having a FORTRAN
>compiler, I'm considering writing an SAS program (IML) to do this, but don't
>want to waste my time reinventing the wheel if someone else has already
>written an SAS or SPSS program for this purpose. If you are aware of any
>such program, please advise me. Thanks.
>
> By the way, Callender and Osburn's work indicates that maximized
>lambda4 is a much better estimate of a test's true reliability than is
>lambda3 (coefficient alpha).

how do we know what a test's true reliability is?

>Karl L. Wuensch, Department of Psychology,
>East Carolina University, Greenville NC 27858-4353
>Voice: 252-328-4102 Fax: 252-328-6283
>mailto:[EMAIL PROTECTED]
>http://core.ecu.edu/psyc/wuenschk/klw.htm
Re: biostatistics careers
At 03:08 PM 11/19/01 +, A.J. Rossini wrote:
>>>>> "BW" == Bruce Weaver <[EMAIL PROTECTED]> writes:
>
> BW> On Sun, 18 Nov 2001, Stan Brown wrote:
> >> What _is_ "biostatistics", anyway? A student asked me, and I
> >> realized I have only a vague idea.
>
> BW> There was a thread on "biostatistics versus statistics" a
> BW> couple years ago, I think, but I was unable to find it at
> BW> google groups. Maybe someone out there saved some of it.
>
>But it's much easier than that. Biostatistics is simply statistics
>(design, descriptive, and inferential) applied to medical, basic
>biology, and public health problems.

well, one difference in biostat is a strong emphasis on probability sorts of problems ...
part. SS
i have put up a little worked-out example of the partitioning of SS in a simple 3 group situation ... for ANOVA ... with a diagram of where the components come from ...

http://roberts.ed.psu.edu/users/droberts/introstat/sspart1.png

this might be helpful to some ... if you want to print it, you should do so in landscape (horizontal) mode
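the arithmetic behind that sort of diagram can be checked in a few lines ... a python/numpy sketch with made-up data for 3 groups of 4 (the numbers are purely illustrative, not from the worked example):

```python
import numpy as np

# hypothetical data: three groups of four observations each
groups = [np.array([4.0, 5, 6, 7]),
          np.array([6.0, 7, 8, 9]),
          np.array([8.0, 9, 10, 11])]

allx = np.concatenate(groups)
grand = allx.mean()

# total SS: squared deviations of every score from the grand mean
ss_total = ((allx - grand) ** 2).sum()

# within SS: squared deviations of scores from their own group mean
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)

# between SS: squared deviations of group means from the grand mean,
# weighted by group size
ss_between = sum(len(g) * (g.mean() - grand) ** 2 for g in groups)

print(ss_total, ss_between + ss_within)  # the partition: SStotal = SSbetween + SSwithin
```

for these data the split is 47 = 32 + 15, and the identity holds for any data: every total squared deviation decomposes into a between-groups piece and a within-groups piece.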
Re: Evaluating students
the general problems in evaluating students are: how much time you have for (say) exams, what can reasonably be expected of students in that amount of time, what content you can examine on, and ... what sort of formats you opt for with your exams

in statistics, let's say you have covered several inferential topics from midterm to the end ... maybe some difference-in-means situations using t tests and building CIs, perhaps something about tests and CIs for proportions, and then a unit on ANOVA (note: the above are just examples)

how much can you have them do ... in an open-ended way ... in an hour? for example, i can easily visualize having very small sets of data that they work with for each of the above ... with appropriate distribution tables at hand ... giving written explanations of their results ... but is it realistic to expect them to do something from all of these in one hour? i really doubt it

so, in the final analysis, ANY test (no matter what the format) is just a sample of all the things you could ask them to do ... thus, no matter what you do and what they show ... you are still left with many unanswered questions about their knowledge and skill

i think it would be fair to say that in having students work problems and show their work ... you do get a better idea of their knowledge of THAT ... but, generally, you are not able to get as widespread coverage of the material since the last test as you might with some recognition-item approach ... where you can cover more but, clearly, get less information about any specific piece of knowledge or skill

ultimately, it is a tradeoff ... and, as i think someone else mentioned, if the instructor is also strapped with large classes (which is so common these days) ... practical considerations enter that weigh perhaps more heavily than pedagogic best practice
Re: Introductory Statistics text
At 07:34 PM 11/18/01 -0800, Melady Preece wrote:
>I am looking for a new and improved Statistics text for an introductory (3rd
>year) stats course for psychology majors...I would welcome any
>suggestions/reviews, etc.
>
>Melady Preece

improved over what? what are you using now? what don't you like about it? is software used, and if so, does the book you are using (or would like to use) help out with that?
Re: Evaluating students
... for much simpler problems ... even shown work could have been lucked into ...

i would also like to again make a push for correct answers rising to approximately the same level of importance as process ... we just cannot take lightly the fact that someone gets the wrong answer, saying that this is not THAT important ... there are many, many situations where a mistake in the answer can be critical ... to some decision that is being made, to some NEXT action that is taken, etc. ... we need to stress both process and accuracy ... i sure hope that the bombardier is not just graded on process

>--
>Roy St. Laurent
>Mathematics & Statistics
>Northern Arizona University
>http://odin.math.nau.edu/~rts
Re: Evaluating students
At 08:56 AM 11/16/01 -0700, Roy St Laurent wrote:
>It's not clear to me whether recent posters are serious about these
>examples, but I will reiterate my previous post:
>
>For most mathematics / statistics examinations, the "answer" to a
>question is the *process* by which the student obtains the incidental
>final number or result. The result itself is most often just not that
>important to evaluating students' understanding or knowledge of the
>subject. And therefore an unsupported or lucky answer is worth nothing.

the problems with the above are twofold:

1. it assumes that correct answers are NOT important ... (which, believe me, if you are getting change from a cashier, etc. etc. ... they ARE ... we just cannot say that knowing the process but not being able to come up with a correct answer = good performance)

2. it assumes that answers without any OTHER supplied information on the part of the examinee can't be taken as "knowledge" ... when they (sometimes) can

what if you asked the following on an exam:

1. what is the mean of 10, 9, 8, 8 and 7? _
2. what is the mean of 27, 23, 19, 17 and 16? _
3. what is the mean of 332, 234, 198, 239, and 200? _
4. what is the mean of 23.4, 19.8, 23.1, 19.0, and 26.4? _

and, for each of 1 to 4 ... they put down the correct answers in the blanks ... would you be willing to say that they know how to calculate the mean ... ie, that they know the process that is needed (and can implement it)? i think you would, EVEN though no other supporting process information is given by the examinee

so, the statement that no credit should be given when there is no supporting evidence (ie, when the process is not shown) can't be considered necessarily valid ... the problem is NOT that no supporting evidence is given ... the problem is that with ONLY ONE instance of some given concept/skill being assessed, you are not nearly as sure, given only ONE CORRECT RESPONSE to one item, whether it came from real knowledge or just happens to be the right answer arrived at (luckily for the examinee) through some faulty process
Re: Evaluating students
would we give full credit for 87/18 = 7/1 ... 8's cancel?

>Full marks. As Napoleon used to ask, "Is he lucky?". :) He/she deserves it!
>
>--
>John Kane
>The Rideau Lakes, Ontario Canada

==========
dennis roberts, penn state university
educational psychology, 8148632401
http://roberts.ed.psu.edu/users/droberts/drober~1.htm
Re: diff in proportions
At 08:03 PM 11/15/01 +, Radford Neal wrote:
>Radford Neal:
>
>>> The difference is that when dealing with real data, it is possible for
>>> two populations to have the same mean (as assumed by the null), but
>>> different variances. In contrast, when dealing with binary data, if
>>> the means are the same in the two populations, the variances must
>>> necessarily be the same as well. So one can argue on this basis that
>>> the distribution of the p-values if the null is true will be close to
>>> correct when using the pooled estimate (apart from the use of a normal
>>> approximation, etc.)
>
>Jerry Dallal:
>
>>But, if the null hypothesis is that the means are the same, why
>>isn't (aren't) the sample variance(s) calculated about a pooled
>>estimate of the common mean?
>
>An interesting question.

i think what this shows (ie, these small highly technical distinctions) is that most null hypotheses that we use for our array of significance tests have rather little meaning ... null hypothesis testing is a highly overrated activity in statistical work

in the case of differences between two proportions ... the useful question is: i wonder how much difference (since i know there is bound to be some [even though it could be trivial]) there is between the proportions of population A versus population B?

to seek an answer to the real question ... no notion of a null even has to be entertained

==========
dennis roberts, penn state university
educational psychology, 8148632401
http://roberts.ed.psu.edu/users/droberts/drober~1.htm
Re: diff in proportions
At 08:51 AM 11/15/01 -0600, jim clark wrote:
>The Ho in the case of means is NOT about the variances, so the
>analogy breaks down. That is, we are not hypothesizing
>Ho: sig1^2 = sig2^2, but rather Ho: mu1 = mu2. So there is no
>direct link between Ho and the SE, unlike the proportions
>example.

would it be correct then to say ... that the test of differences in proportions is REALLY a test about the differences between two population variances?

>Best wishes
>Jim
>
>James M. Clark                (204) 786-9757
>Department of Psychology      (204) 774-4134 Fax
>University of Winnipeg        4L05D
>Winnipeg, Manitoba R3B 2E9    [EMAIL PROTECTED]
>CANADA                        http://www.uwinnipeg.ca/~clark

_______
dennis roberts, educational psychology, penn state university
208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED]
http://roberts.ed.psu.edu/users/droberts/drober~1.htm
Re: diff in proportions
At 04:26 PM 11/15/01 +0100, Rolf Dalin wrote:
>The significance test produces a p-value UNDER THE CONDITION
>that the null is true. In my opinion it does not matter whether we
>know it isn't true. It is just an assumption for the calculations. And
>these calculations do not produce exactly the same information as
>the CI for the difference. They state in some sense, if the procedure
>was repeated, how probable it would be to ... etc.

this might make sense if the sample p*q values were the same for BOTH samples ... but if they are not (which will almost always be the case in real data) ... then you already have SOME evidence that the null is perhaps not true (of course, we know that it is not exactly true anyway ... so that sort of tosses out the notion of pooling so as to get a better estimate of a COMMON variance)

earlier in their presentation, moore and mccabe say that they prefer to use a CI to test some null in this case ... but, if one did a z test with the unpooled estimator for standard error, this would lead to a "valid" significance test ... HOWEVER ... then they go on to say that INSTEAD, they will adopt the pooled standard error approach since it is the "... more common practice"

that logic escapes me ... if we can build a CI using the unpooled standard error formula, and find that to be ok to see if some null value like 0 difference in population proportions is inside or outside of the CI, i don't see any need to switch the denominator formula in the z test JUST because we want to use the z test STATISTIC to test the null

a little more consistency in logic would seem to be in the best interests of students trying to learn this ... i would still argue that the extent to which you would not be willing to use the pooled standard error formula in the case of differences in means would be the same extent to which you would not be willing to use the pooled standard error formula when it comes to differences in proportions ...
i don't see that the logic really is any different ... but, this is just my opinion

_________
dennis roberts, educational psychology, penn state university
208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED]
http://roberts.ed.psu.edu/users/droberts/drober~1.htm
diff in proportions
in the moore and mccabe book (IPS), in the section on testing for differences in population proportions, when it comes to doing a 'z' test for significance, they argue for (and say this is commonly done) a POOLED standard error for the difference in proportions ... since if one is testing the null of equal proportions, then your null is assuming that the p*q combinations are the SAME for both populations ... thus, this is a case of pooling sample variances to estimate a single common population variance

but since this is just a null ... and we have no way of knowing if the null is true (not that we can in any case) ... i don't see any logical progression that would then lead one to also assume that the p*q combinations are the same in the two populations ... hence, i don't see why the pooled variance version of the standard error of a difference in proportions formula would be the recommended way to go

in their discussion of differences in means ... they present FIRST the NON pooled version of the standard error, and that is their preferred way to build CIs and do t tests ... though they also bring in the pooled version as a later topic (and of course if we KNEW that the populations had the same variances, then the pooled version would be useful)

it seems to me that this same logic should hold in the case of differences in proportions

comments?

==========
dennis roberts, penn state university
educational psychology, 8148632401
http://roberts.ed.psu.edu/users/droberts/drober~1.htm
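To make the pooled-vs-unpooled distinction concrete, here is a minimal sketch. The function name and the 40/100 vs 30/100 example counts are mine, not from the book:

```python
# Z statistic for a difference in two sample proportions, with either
# the pooled SE (assumes a common p under the null) or the unpooled SE.
import math

def diff_prop_z(x1, n1, x2, n2, pooled=True):
    p1, p2 = x1 / n1, x2 / n2
    if pooled:
        p = (x1 + x2) / (n1 + n2)  # single common p, as the null assumes
        se = math.sqrt(p * (1 - p) * (1 / n1 + 1 / n2))
    else:
        se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return (p1 - p2) / se

# Hypothetical data: 40/100 successes vs 30/100 successes
z_pooled = diff_prop_z(40, 100, 30, 100, pooled=True)
z_unpooled = diff_prop_z(40, 100, 30, 100, pooled=False)
print(z_pooled, z_unpooled)
```

With these numbers the two z values differ only slightly, which is typical unless the sample proportions are far apart.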
Re:
At 07:42 AM 11/14/01 -0800, Carl Huberty wrote: >I, too, prefer closed-book tests in statistical methods courses. I also >like short-answer items, some of which may be multiple-choice >items. [Please don't gripe that all multiple-choice items assess only >memory recall; such items, if constructed well, may be very helpful in >assessing learning!] I think that a very important aspect of evaluation >of student performance and knowledge pertains to variability; variability >in the sense of class performance. If assessment of student learning does >not reflect some variability in student performance, there is a very >serious problem with the assessment process used! good point ... and, EVEN if every student came in with the same identical ability (say math skill) ... there is no way that whatever happens in the course will equally impact all of the students so ... even then, course performance measures (tests, projects, etc.) should reflect some non trivial variation of course, if the above were true, then there would be LESS variation than when (as is typical) students have a fairly wide range (even in upper level courses) of ability AND added to that impacting on variation in course performance measures, will be the differential impact of the course itself ... in NO case that i can think of, no realistic case that is, would we have any expectation that there would be 0 variance in course performance measures now, some might say ... well, what if we were "mastery" oriented in the course could it not be true that at the end of the course ... everyone has mastered all the required skills? the simple answer to this is NO ... if you find that all the scores at the end are the same ... then your measures did NOT have adequate ceiling ... 
and, you are missing the "more mastery" that some students had over others ... you might think they are equal but, clearly, they are not

_________
dennis roberts, educational psychology, penn state university
208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED]
http://roberts.ed.psu.edu/users/droberts/drober~1.htm
Re: Evaluating students
the problem with any exam ... given in any format ... is the extent to which you can INFER what the examinee knows or does not know from their responses in the case of recognition tests ... where precreated answers are given and you make a choice ... it is very difficult to infer anything BUT what the choice was ... if that choice was keyed as correct and the examinee selected it ... first, you have to give credit for it and second ... about all you can do is to assume the person understood the item (which may or may not be correct of course) in open ended problems ... where the person has to SUPPLY responses ... the instructor is in somewhat of a better position to assess knowledge ... assuming that the directions to the examinee were clear as to what he/she was to put down ... however, even here, there are limits what if i ask the ? ... what is the mean of 34, 56, 29, and 32? and i leave space ... and your directions say to round to 1 place and he/she puts 37.8 in the space ... sort of looks like he/she knew what to do (but you can't be positive since, he/she could have added up slightly incorrectly but still when rounding, it = 37.8) but, what if they put 37.2 or 38.1 ... what do you make of it? now, if WORK were required to be displayed, you might be able to see what happened and then "grade" accordingly ... if the process was totally messed up ... no credit but, if you see they added up the 4 values incorrectly but, had the right process ... almost full credit but, not total ... they DID make some mistake what if they added correctly but for some unknown reason ... had in the back of their mind n-1 and divided by 3? is that = to a mistake of dividing by 4 but having added up the 4 numbers wrong? now, in a mc format ... we might have 37.8 as one of the answers but, other options that incorporate some adding up the numbers INcorrectly but dividing by 4, adding up the 4 numbers correctly but dividing by n-1, etc. so, if they don't select the keyed answer ... 
BUT, they select some other option ... then you MIGHT be able to "infer" some partial knowledge ... just as if you see an incorrect answer and see that they divided by 3 rather than 4 ... BUT YOU HAVE TO BE CAREFUL DOING THIS ... since you still see no worked out evidence of where they might have gone astray

generally, open ended ?s that require physical responses to be made offer a better opportunity to "gauge" what the examinee knows but ... again ... there are real limits

the bottom line here is that no matter how you examine the student, unless you do lots of followup probing ... you will be in a difficult position to know very precisely what the person knows or does not know ... an exam is a very limited sample of behavior, from which you are not in a very good position to extend your "inference" about what the examinee was "really thinking" when faced with that item or test ... to attempt to infer more is playing a sheer folly game

_____
dennis roberts, educational psychology, penn state university
208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED]
http://roberts.ed.psu.edu/users/droberts/drober~1.htm
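As a quick check of the arithmetic in the worked example above (the mean of 34, 56, 29, and 32, rounded to 1 place), with the "divide by n-1" slip computed alongside:

```python
# The example from the post: the correct mean vs the n-1 slip.
data = [34, 56, 29, 32]
correct = round(sum(data) / len(data), 1)               # divide by n = 4
n_minus_1_slip = round(sum(data) / (len(data) - 1), 1)  # divide by 3 by mistake
print(correct, n_minus_1_slip)  # 37.8 vs 50.3
```

Distractors built this way (n-1 division, an addition error, etc.) are exactly the kind of options an mc item could use to try to recover partial information about the examinee's process.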
Re: Standard Deviation!
i think you are asking the wrong question ... because, as far as i know, there is only really one standard deviation concept: the square root of the variance (the average of the squared deviations around the mean in a set of data) ... perhaps what you are really interested in is HOW should VARIABILITY be measured in your context? perhaps the variance (or its square root, the standard deviation) is not what you are after ...

_______
dennis roberts, educational psychology, penn state university
208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED]
http://roberts.ed.psu.edu/users/droberts/drober~1.htm
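That one concept, spelled out as a minimal sketch (population form; a sample SD would divide by n - 1; the data values are made up for illustration):

```python
import math

def pop_sd(xs):
    """Square root of the average squared deviation around the mean."""
    m = sum(xs) / len(xs)
    return math.sqrt(sum((x - m) ** 2 for x in xs) / len(xs))

print(pop_sd([2, 4, 4, 4, 5, 5, 7, 9]))  # -> 2.0
```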
RE: p value
At 05:06 PM 11/2/01 -0500, Wuensch, Karl L wrote:
>        Dennis wrote: "it is NOT correct to say that the p value (as
>traditionally calculated) represents the probability of finding a result
>LIKE WE FOUND ... if the null were true? that p would be ½ of what is
>calculated."
>
>        Jones and Tukey (A sensible formulation of the significance test,
>Psychological Methods, 2000, 5, 411-414) recently suggested that the p
>which should be reported is "the area of the t distribution more positive or
>more negative (but not both) than the value of t obtained," just as Dennis
>suggests in his post.

i would not disagree with this ... but, we have to realize that software (most, i think) does NOT do it that way ... if we did adopt the position of just reporting the p beyond the point you got ... either to the right side or left side but not both ... then, what will we use as the cut value for rejecting the null ... .025??? or .05 as a 1 tail test? we certainly will have a problem continuing to say that we "set" alpha at .05 ... in the usual two tailed sense

>~~~
>Karl L. Wuensch, Department of Psychology,
>East Carolina University, Greenville NC 27858-4353
>Voice: 252-328-4102 Fax: 252-328-6283
>mailto:[EMAIL PROTECTED]
>http://core.ecu.edu/psyc/wuenschk/klw.htm

==========
dennis roberts, penn state university
educational psychology, 8148632401
http://roberts.ed.psu.edu/users/droberts/drober~1.htm
p value
most software will compute p values (say for a typical two sample t test of means) by taking the obtained t test statistic ... making it both + and - ... finding the two end tail areas in the relevant t distribution ... and report that as p

for example ... what if we have output like:

        N   Mean  StDev  SE Mean
exp    20  30.80   5.20      1.2
cont   20  27.84   3.95     0.88

Difference = mu exp - mu cont
Estimate for difference: 2.95
95% CI for difference: (-0.01, 5.92)
T-Test of difference = 0 (vs not =): T-Value = 2.02  P-Value = 0.051  DF = 35

for 35 df ... minitab finds the areas beyond -2.02 and +2.02 ... adds them together ... and this value in the present case is .051

now, traditionally, we would retain the null with this p value ... and, we generally say that the p value means: this is the probability of obtaining a result (like we got) IF the null were true

but, the result WE got was finding a mean difference in FAVOR of the exp group ... however, the p value does NOT mean that the probability of finding a difference IN FAVOR of the exp group, if the null were true, is .051 ... right? since the p value has been calculated based on BOTH ends of the t distribution ... it includes both extremes: where the exp is better than the control AND where the cont is better than the exp

thus, would it be fair to say that it is NOT correct to say that the p value (as traditionally calculated) represents the probability of finding a result LIKE WE FOUND if the null were true? that p would be 1/2 of what is calculated

this brings up another point ... in the above case ... typically we would retain the null ... but, the p of finding the result LIKE WE DID, if the null were true, is only 1/2 of .051 ... less than the alpha of .05 that we have used ... thus, what alpha are we really using when we do this?

this is just a query about my continuing concern of what useful information p values give us ...
and, if the p value provides NO information (given the results we see) as to the direction of the effect ... then, again ... all it suggests to us (as p gets smaller) is that the null is more likely not to be true ... given that it might not be true in either direction from the null ... how is this really helping us when we are interested in the "treatment" effect? [given that we have the direction of the results AND the p value ... nothing else]

======
dennis roberts, penn state university
educational psychology, 8148632401
http://roberts.ed.psu.edu/users/droberts/drober~1.htm
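The halving point can be checked numerically. A sketch using a normal approximation to the t(35) reference distribution (an assumption for simplicity; Minitab's .051 comes from the exact t distribution, but the two-tailed/one-tailed relationship is identical):

```python
from statistics import NormalDist

t_value = 2.02  # from the Minitab output quoted above
one_tail = 1 - NormalDist().cdf(t_value)  # area beyond +2.02 only
two_tail = 2 * one_tail                   # what software reports as "p"
print(round(one_tail, 4), round(two_tail, 4))
```

Whatever the reference distribution, as long as it is symmetric, the "probability of a result like we found" in one direction is half of the reported two-tailed p.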
Re: Recommend Masters level math stats text
At 02:08 PM 10/29/01 +, Jason Owen wrote:
>Hi -- you might consider looking at John Rice's text "Mathematical
>Statistics and Data Analysis" (2nd ed.), Duxbury Press. I would
>consider it an ideal Master's level text.

what assumptions are being made ... from the original post ... about the math background or what statistics courses (if any) students would have under their belts ... when taking this masters level course? this is important

_________
dennis roberts, educational psychology, penn state university
208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED]
http://roberts.ed.psu.edu/users/droberts/drober~1.htm
Re: Help for DL students in doing assignments
a prime # is a natural number GREATER than 1 that can be divided ONLY by 1 and itself ... a prime number has NO factors other than 1 and itself

i think 2 qualifies ... and is not 2 ... even? send check to bob ASAP

At 10:40 PM 10/15/01 +0200, you wrote:
>Mr. Dawson wrote:
>
>>Well, they do say what goes around comes around; I'd love to see what
>>mark the dishonest DL student gets having had his homework done for him
>>by somebody who:
>>
>>(a) believes all primes to be odd;
>>...
>### Let's assume that any prime is NOT odd
>### It means that it is even (no other way among integers!)
>### So that prime has 3 divisors: "1", this prime and "2"
>### which contradicts with the prime definition:
>### ("a prime is an integer that has only two divisors: 1 and this prime
>itself")
>### Dear Mr. Dawson, please send me at least ONE even prime
>### and i shall give you $1,000,000.
>
>>-Robert Dawson

==========
dennis roberts, penn state university
educational psychology, 8148632401
http://roberts.ed.psu.edu/users/droberts/drober~1.htm
Re: Mean and Standard Deviation
At 04:32 PM 10/12/01 -0500, you wrote:
>A colleague of mine - not a subscriber to this helpful list - asked me if
>it is possible for the standard deviation to be larger than the mean.
>If so, under what conditions?

what about z scores??? mean = 0 and sd = 1

>At first blush I do not think so - but then I believe I have seen
>some research results in which standard deviation was larger than the mean.
>
>Any help will be greatly appreciated.
>cheers ... ECD
>
>___
>
>Edward C. Dreyer
>Political Science
>The University of Tulsa

==========
dennis roberts, penn state university
educational psychology, 8148632401
http://roberts.ed.psu.edu/users/droberts/drober~1.htm
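Besides z scores, any data set with values near or below zero will do. A quick sketch with made-up numbers:

```python
from statistics import mean, pstdev

data = [-5, 0, 5, 10]  # made-up values, just for illustration
print(mean(data), pstdev(data))  # the sd comfortably exceeds the mean
```

In general the sd is unaffected by shifting all the values, while the mean shifts with them, so the sd can exceed the mean whenever the data sit close enough to zero.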
Re: Are parametric assumptions importat ?
At 01:44 PM 10/12/01 -0400, Lise DeShea wrote:
>I tell my students that the ANOVA is not robust to violation of the equal
>variances assumption, but that it's a stupid statistic anyway. All it can
>say is either, "These means are equal," or "There's a difference somewhere
>among these means, but I can't tell you where it is."

i don't see that this is any more stupid than many other null hypothesis tests we do ... if you want to think "stupid" ... then think that it is stupid to think that the null can REALLY be exactly true ... so, the notion of doing a TEST to see if you retain or reject is rather stupid TOO, since we know that the null is NOT exactly true ... before we even do the test

_________
dennis roberts, educational psychology, penn state university
208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED]
http://roberts.ed.psu.edu/users/droberts/drober~1.htm
Re: Are parametric assumptions importat ?
At 12:59 PM 10/12/01 -0300, you wrote:
>While consulting people from depts of statistics about this, a few of them
>were arguing that these assumption testing are just a "legend" and that
>there is no problem in not respecting them !

note: you should NOT respect any stat expert who says that there is no problem ... and not to worry about the so called "classic" assumptions ... all they are doing is making their consultation with you EASIER for them!

every test you might want to do has 1 or more assumptions about how the samples were taken and/or about parameters (and other things) of the population ... in some cases, violations of one or more of these make little difference in the "validity" of the tests (simulation studies can verify this) ... but, in other cases, violations can lead to serious consequences (yielding, for example, a much larger type I error rate than you thought you were working with) ... there is no easy way to make some blanket statement as to which assumptions are important and which are not, because this depends on the specific test (or family of similar tests)

usually, for a particular test ... "good" texts will enumerate the assumptions that are made AND will give you some mini capsule of the impact of violations OF those assumptions

_________
dennis roberts, educational psychology, penn state university
208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED]
http://roberts.ed.psu.edu/users/droberts/drober~1.htm
RE: Standardized Confidence Intervals
At 03:04 PM 10/9/01 -0700, Dale Glaser wrote:
>It would seem that by standardizing the CI, as Karl suggests, then we
>may be able to get a better grasp of the dimensions of error ... at
>least I know the differences between .25 SD vs. 1.00 SD in terms of magnitude

well, yes, 1 sd means about 4 times as much "spread" (in sd units, that is) as .25 sd (whether it be error or anything else) ... but, UNLESS you know what the underlying scale is ... what the raw units mean ... have some feel for the "metric" you started with ... i don't see that this really makes it "instantaneously" more understandable

i would like to see a fully worked out example ... where we have, say, regular effect sizes next to standardized effect sizes ... and/or regular CIs next to standardized CIs ... and then try to make the case that standardized values HELP one to UNDERSTAND the data better ... or the inference under examination

we might even do some study on this ... experimental ... where we vary the type of information ... then either ask Ss to elaborate on what they think the data mean ... or answer some mc items about WHAT IS POSSIBLE to infer from the data ... and see if standardizing really makes a difference ... my hunch is that it will not

>... or is this just a stretch?!!!
>
>Dale N. Glaser, Ph.D.
>Pacific Science & Engineering Group
>6310 Greenwich Drive; Suite 200
>San Diego, CA 92122
>Phone: (858) 535-1661 Fax: (858) 535-1665
>http://www.pacific-science.com

==========
dennis roberts, penn state university
educational psychology, 8148632401
http://roberts.ed.psu.edu/users/droberts/drober~1.htm
RE: Standardized Confidence Intervals
At 03:04 PM 10/9/01 -0700, Dale Glaser wrote: >Dennis..yes, the effect size index may be arbitrary, but for argument >sake, say I have a measure of 'self-esteem', a 10 item measure (each item >a 5-pt. Likert scale) that has a range of 10-50; sample1 has a 95% CI of >[23, 27] whereas a comparison sample2 has CI of [22, 29]. Thus, by >maintaining the CI in its own unit of measurement, we can observe that >there is more error/wider interval for sample1 than sample2 (for now >assuming equal 'n' for each sample). >However, it is problematic, given the inherent subjectivity of measuring >self-esteem, to claim what is too wide of an interval for this type of >phenomenon. i did not know that CIs could tell you this ... under any circumstance ... i don't see that standardizing it will solve this problem ... supposedly, CIs tell you something about the parameter values ... and nothing else ... i don't think it is within the capacity of ANY statistic ... to tell you if some CI is too wide or too narrow ... WE have to judge that ... given what we consider in our heads ... is too much error or what we are willing to tolerate as precision of our estimates > How do we know, especially with self-report measures, where indeed the > scaling may be arbitrary, if the margin of error is of concern? It would > seem that by standardizing the CI, as Karl suggests, then we may be able > to get a better grasp of the dimensions of error...at least I know > the differences between .25 SD vs. 1.00 SD in terms of > magnitude..or is this just a stretch?!!! you do this ahead of time ... BEFORE data are collected ... perhaps with some pilot work as a guide to what sds you might get ... and then you design it so you try to work withIN some margin of error ... i think the underlying problem here is trying to make sense of things AFTER the fact ... 
without sufficient PREplanning to achieve some approximate desired result ... after the fact musings will not solve what should have been dealt with ahead of time ... and certainly, IMHO of course, standardizing things won't solve this either

karl was putting regular CIs (and effect sizes) and standardized CIs (or effect sizes) in juxtaposition for those not liking null hypothesis testing ... but, to me, these are two different issues ... i think that CIs and/or effect sizes are inherently more useful than ANY null hypothesis test ... again, IMHO ... thus, bringing null hypothesis testing into this discussion seems not to be of value ... of course, i suppose that debating to standardize or not standardize effect sizes and/or CIs IS a legitimate matter to deal with ... even though i am not convinced that standardizing "these things" will really gain you anything of value

we might draw some parallel between covariance and correlation ... where putting the linear relationship measure on a 'standardized' dimension IS useful ... so that the boundaries have some fixed limits ... which covariances do not ... but, i am not sure that the analog for effect sizes and/or CIs is equally beneficial

>Dale N. Glaser, Ph.D.
>Pacific Science & Engineering Group
>6310 Greenwich Drive; Suite 200
>San Diego, CA 92122
>Phone: (858) 535-1661 Fax: (858) 535-1665
>http://www.pacific-science.com

==========
dennis roberts, penn state university
educational psychology, 8148632401
http://roberts.ed.psu.edu/users/droberts/drober~1.htm
Re: Standardized Confidence Intervals
At 03:45 PM 10/9/01 -0400, Wuensch, Karl L wrote:
>Some of those who think that estimation of the size of effects is more
>important than the testing of a nil hypothesis of no effect argue that we
>would be better served by reporting a confidence interval for the size of
>the effect. Such confidence intervals are, in my experience, most often
>reported in terms of the original unit of measure for the variable involved.
>When the unit of measure is arbitrary, those who are interested in
>estimating the size of effects suggest that we do so with standardized
>estimates. It seems to me that it would be useful to present confidence
>intervals in standardized units.

why? you only get further away from the original data scale/units you are working with ... in what sense is ANY effect size indicator anything BUT arbitrary? i don't see how trying to standardize it ... or any confidence interval ... makes it anything other than still being in arbitrary units

i would argue that whatever the scale is you start off using ... that is as CLOSE as you can get to the real data ... even if the scale does not have any "natural" or "intuitive" kind of meaning ... standardizing an arbitrary variable does NOT make it more meaningful ... just like converting raw data to a z score scale does NOT make the data more meaningful ... standardizing a variable may have useful properties but, imputing more meaning into the raw data is not one of them

==========
dennis roberts, penn state university
educational psychology, 8148632401
http://roberts.ed.psu.edu/users/droberts/drober~1.htm
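For concreteness, here is a sketch of what the raw-vs-standardized juxtaposition might look like, using numbers loosely modeled on the two-sample Minitab output quoted earlier in this digest. The helper name, the equal-n pooled-SD shortcut, and the rounded t critical value are my assumptions for illustration, not Karl's proposal verbatim:

```python
import math

def ci_raw_and_standardized(diff, se, sd1, sd2, t_crit):
    """Raw-unit CI for a mean difference, plus the same interval
    divided by a pooled SD (one common standardization choice)."""
    pooled_sd = math.sqrt((sd1 ** 2 + sd2 ** 2) / 2)  # equal-n shortcut
    lo, hi = diff - t_crit * se, diff + t_crit * se
    return (lo, hi), (lo / pooled_sd, hi / pooled_sd)

raw, std = ci_raw_and_standardized(diff=2.95, se=1.46,
                                   sd1=5.20, sd2=3.95, t_crit=2.03)
print(raw)  # roughly the (-0.01, 5.92) interval in raw units
print(std)  # the same interval in pooled-sd units
```

Whether the second line is "more understandable" than the first is exactly the empirical question raised in the reply above.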
Re: MCAS, statistics and other math problems
At 12:41 PM 10/5/01 -0500, Christopher J. Mecklin wrote:
>(4) If the Massachusetts Department of Education really wants to include a
>boxplot item on the test, it should either be a multiple choice question
>written so that the correct answer is the same no matter which type of
>boxplot one was taught, or an open-ended question where the students
>actually create boxplots for 2 data sets and compare/contrast the 2
>distributions. The readers then should be aware of both types of boxplots
>when assessing the question.
>
>That's my two cents, anyway

actually, i think the above is worth at least 3 cents ... but, the main issue re: boxplots is this: is the fact that a boxplot indicates a median in the "box", rather than (say) a mean, really important ENOUGH to spend an entire question (1 in 6 about statistical things) on a test that is such high stakes?

it seems like, IF you wanted to use a boxplot as a data reporting tool within the context of an item on a test like this, you would focus on something important like: spread of scores, or what an approximate average value is, or whether the distribution seems to be symmetrical or skewed ...

==========
dennis roberts, penn state university
educational psychology, 8148632401
http://roberts.ed.psu.edu/users/droberts/drober~1.htm
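For reference, the quantities a boxplot's box encodes can be computed directly; a sketch using `statistics.quantiles` (the data values are made up, and note that different quartile conventions, like different boxplot styles, can give slightly different cut points):

```python
from statistics import median, quantiles

data = [2, 4, 4, 5, 6, 7, 9, 10, 23]   # made-up scores
q1, q2, q3 = quantiles(data, n=4)      # quartile cut points
print(q1, q2, q3)   # the box runs q1..q3, with the median (q2) inside
```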
Re: MCAS, statistics and other math problems
gene ... we have been through this sort of discussion before and i, for one, totally sympathize with you in this situation ... but, it is difficult for outsiders ... outside of mass. (not being residents or parents of impacted kids) ... to really know how to respond to this and what to do about it

clearly, there are serious problems with some test items and hence, there are serious problems with the SCORES examinees make ON these tests ... there also appear to be clear problems with the item/test review process that has been used and, with the reasons for and methods of legitimate appeal

when tests count this much, then all benefits of the doubt need to go TO the students ... not to some rigidly formulated process

what LEGAL challenges have been organized and made within mass.? it seems to me that the only real way to make progress on this problem is to fight fire with fire ... and that usually means well organized legal efforts ...

what can WE do, as individuals who read this list ... who are interested in this problem? probably, very little but, i think most of us give you our "moral" support ... and wish you well ...

At 02:33 AM 10/5/01 +, EugeneGall wrote:
>During the last week in August, there was a lengthy thread on sci.stat.edu
>about problems with the probability and statistics questions in MCAS, the high
>stakes test required for graduating from a MA public high school.

_________
dennis roberts, educational psychology, penn state university
208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED]
http://roberts.ed.psu.edu/users/droberts/drober~1.htm
RE: MCAS, statistics and other math problems
At 07:03 AM 10/5/01 -0500, Olsen, Chris wrote:
>Professor Gallagher and All --
>
>It would appear that neither the "appeal systems" nor a claim of
>"technical adequacy" would be a response to your concern about bad
>questions. The claim of technical adequacy, i.e. "that good students tend
>to answer them correctly anyway, but poor students don't" does not, to my
>mind, constitute technical adequacy.

this is absolutely correct ... all they have to go on is the score on the test ... if we toss in "ability" on something else as a defense ... then, why give them THIS test in the first place?

_____
dennis roberts, educational psychology, penn state university
208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED]
http://roberts.ed.psu.edu/users/droberts/drober~1.htm