Re: One-tailed, two-tailed
On Sun, 30 Dec 2001, Stan Brown wrote in part: A. G. McDowell [EMAIL PROTECTED] wrote: The significance value associated with the one-tailed test will always be half the significance value associated with the two-tailed test, For means, yes. Not for proportions, I think. Oh? Why not? Is there something about proportions that militates against assigning 1/2 alpha to each tail of the sampling distribution? snip, the rest -- DFB. Donald F. Burrill [EMAIL PROTECTED] 184 Nashua Road, Bedford, NH 03110 603-471-7128 = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
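A small numerical illustration of the point at issue, in Python (standard library only; the counts are invented): for a skewed binomial, the "two-tailed" p-value obtained by doubling the one-tailed p-value need not equal the p-value obtained by summing both tails, precisely because the two tails are not mirror images.

```python
from math import comb

def binom_sf(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

n, p0, k = 20, 0.3, 10            # invented: 10 successes in 20 trials, H0: p = 0.3
upper = binom_sf(k, n, p0)        # one-tailed p-value (upper tail)
doubled = 2 * upper               # "two-tailed" by doubling
# Two-tailed by adding the opposite tail at the mirror-image distance
# from the null mean (one of several conventions for a skewed binomial):
mirror = int(2 * n * p0 - k)      # 2*6 - 10 = 2
two_sided = upper + binom_cdf(mirror, n, p0)
print(round(upper, 4), round(doubled, 4), round(two_sided, 4))
```

For the symmetric normal sampling distribution of a mean the two conventions coincide; for the skewed binomial they do not.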
Re: Missing data cell problem
In trying to clear out my e-mail inbox, I came across this post, for which there seemed not to have been any responses. On Fri, 2 Feb 2001, Caroline Brown wrote: I have an analysis problem, which I am researching solutions to, and David Howell of UVM suggested I mail the query to you. My problem is how to deal with a two-way repeated-measures design, in which one cell could not be measured:

          A1   A2   A3
     B1   ok   ok   ok
     B2   --   ok   ok
     B3   ok   ok   ok
     B4   ok   ok   ok

There is a good theoretical reason for this absence, as levels of factor A are set sizes, and A1 is one item; Factor B is cueing to spatial location, and in the 1-item set size there are no other items competing for 'encoding' resources (thus there can be no INVALID cue). If you know of any texts or papers on this issue, or have any thoughts as to its solution, I would be most grateful.

One approach is to estimate the cell mean in the A1B2 cell, under the constraint that it not contribute to the AxB interaction; and then carry out the usual 2-way ANOVA (but with one fewer d.f. for interaction). If we use the following two contrasts, one for main effects in A and one for main effects in B, their product represents a contrast involving the 12 cells. Set that contrast equal to zero (so it doesn't contribute to the interaction SS). (All other interaction contrasts orthogonal to this one will not involve the missing cell.)

     For A:   2A1 - A2 - A3.
     For B:   -B1 + 3B2 - B3 - B4.

Product contrast:

     -2A1B1 + A2B1 + A3B1 + 6A1B2 - 3A2B2 - 3A3B2
     - 2A1B3 + A2B3 + A3B3 - 2A1B4 + A2B4 + A3B4  =  0,

whence

     A1B2 = (2A1B1 - A2B1 - A3B1 + 3A2B2 + 3A3B2 + 2A1B3 - A2B3 - A3B3
             + 2A1B4 - A2B4 - A3B4)/6

(where 2A1B3 = twice the cell mean in the (A1,B3) cell, etc.) You now have cell means for each cell and can carry out the usual ANOVA.
Because the estimated value of A1B2 infects your A1 average and your B2 average, the row and column effects (sources A and B in the ANOVA) are not, strictly speaking, independent; although the A2:A3 contrast is independent of contrasts involving only B1, B3, B4. Hope this helps (if belatedly!). -- DFB.
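The estimation rule above is easy to check numerically. A stdlib-Python sketch, with invented cell means (the AiBj names follow the post's notation), computes A1B2 from the formula and verifies that the product contrast then vanishes:

```python
# Invented cell means for the 11 observed cells; (A1, B2) is missing.
m = {
    ("A1", "B1"): 5.0, ("A2", "B1"): 6.0, ("A3", "B1"): 7.0,
                       ("A2", "B2"): 6.5, ("A3", "B2"): 7.5,
    ("A1", "B3"): 5.5, ("A2", "B3"): 6.2, ("A3", "B3"): 7.1,
    ("A1", "B4"): 5.2, ("A2", "B4"): 6.4, ("A3", "B4"): 7.3,
}

# The post's formula for the missing cell mean:
m[("A1", "B2")] = (
    2*m[("A1", "B1")] - m[("A2", "B1")] - m[("A3", "B1")]
    + 3*m[("A2", "B2")] + 3*m[("A3", "B2")]
    + 2*m[("A1", "B3")] - m[("A2", "B3")] - m[("A3", "B3")]
    + 2*m[("A1", "B4")] - m[("A2", "B4")] - m[("A3", "B4")]
) / 6

# Check: the product of the A contrast (2, -1, -1) and the
# B contrast (-1, 3, -1, -1) applied to the 12 cells is zero.
ca = {"A1": 2, "A2": -1, "A3": -1}
cb = {"B1": -1, "B2": 3, "B3": -1, "B4": -1}
contrast = sum(ca[a] * cb[b] * m[(a, b)] for a in ca for b in cb)
print(round(contrast, 10))
```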
Re: Statistical illiteracy
On Wed, 26 Dec 2001 [EMAIL PROTECTED] wrote (edited): I came across a table of costume jewelry at a department store with a sign that said 150% off. I asked them how much they would pay me to take it all off of their hands. I had to explain to them what 150% meant, and they then explained to me how percentages are computed in the retail trade: first we cut the price in half (50%). Then we cut it in half again. Now we have cut it in half a third time. 50% + 50% + 50% = 150% off. ... ... if they advertise a 150% discount directly, without referring to the sequence of three 50% discounts, might they not be liable to legal action for misrepresentation? I would tell the clerk in the store, Ah, you get 150% off by taking 75%-off of 75%-off. I'll take it. (1/16 price vs. 50%-off 50%-off 50%-off = 1/8 price). Why settle for 1/16? Take 60% off after 90% off. Or 55% after 95%. Or 50% after 100%, which ought to underline the illogic even for arithmetically illiterate retailers. -- DFB.
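The arithmetic of the exchange is quickly verified. A few lines of Python (prices normalized to 1.0) show what fraction of the original price survives each scheme for "150% off":

```python
def successive(*discounts):
    """Fraction of the original price left after applying each discount in turn."""
    remaining = 1.0
    for d in discounts:
        remaining *= (1 - d)
    return remaining

print(successive(0.5, 0.5, 0.5))   # 0.125  -> the store's "150% off" leaves 1/8
print(successive(0.75, 0.75))      # 0.0625 -> 75%-off of 75%-off leaves 1/16
print(successive(1.0, 0.5))        # 0.0    -> 100% then 50%: the goods are simply free
# Note 0.5 + 0.5 + 0.5 = 1.5, yet at no point is anyone paying the customer.
```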
Re: Logarithms (was: When to Use t and When to Use z Revisited)
On Tue, 11 Dec 2001, Vadim and Oxana Marmer wrote: besides, who needs those tables? we have computers now, don't we? I was told that there were tables for logarithms once. I have not seen one in my life. Is not it the same kind of stuff? If you _want_ to see one, you have no farther to go than to Sterling Library and look up what there is under mathematical tables. (Unless, in the years since I worked there as an undergraduate, they've thrown them all out, which I would hope to be unlikely.) -- DFB. Donald F. Burrill [EMAIL PROTECTED] 184 Nashua Road, Bedford, NH 03110 603-471-7128 = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: analysis of criterion test data
On 24 Dec 2001, Carol Burris wrote: I am a doctoral student who wants to use student performance on a criterion test, a state Regents exam, as a dependent variable in a quasi-experimental study. The effects of previous achievement can be controlled for using a standardized test, the Iowa test of Basic skills. What kind of an analysis can I use to determine the effects of a particular treatment on Regents exam scores? The Regents exam produces a percentage correct score, not a standardized score, therefore it is not interval data, Non sequitur, and probably not true. Percentage correct, if it means what it says, is the same variable as number of items correct (merely reduced to a percentage by dividing by the total number of items and multiplying by 100%), which is about as interval as you can expect to get in this business. (Even a standardized score is often but a linear transformation of the number of items right.) Of course, if you have mis-stated things (I don't have personal knowledge of the Regents exams and the marking thereof) and what is produced is a set of percentiles rather than percentage correct, THAT variable is not interval (although it can be converted to an interval score fairly readily, by making some assumptions about the form of the distribution). and I can not use analysis of covariance (or at least that is what I surmise from my reading). Any suggestions? The phrase can not should not even be in your vocabulary in this context. You can ALWAYS carry out an analysis of covariance (or a multiple regression analysis, or an analysis of variance; and any of these in their univariate or multivariate forms). Whether the results mean what you would like them to mean is another matter, of course, and that depends to some degree on what assumptions you are willing (and what assumptions you are UNwilling!) to make about the variables you have and about the models you are entertaining. 
First carry out your analyses (several of them, if you're unsure, as most of us are at the outset, which one is best in some useful sense); then look for ways in which the universe may be misleading you (or ways in which you may be deceiving yourself). If several analyses seem to be telling you much the same thing (at least in a general way), then that thing is probably both believable and reliable. If they tell you different things, you know the data isn't different, so the differences must be reflecting differences (possibly subtle ones) in the questions being addressed by the several analyses: which in turn means that something interesting is going on, and it may repay you well to find out what that something is. However, if the analysis you think you want is analysis of covariance, I'd strongly urge you to carry it out as a multiple regression problem, or as a general linear model problem; analysis of covariance programs often do not permit the user to examine whether the slope of the dependent variable on the covariate interacts with the treatment variable (that is, whether the slopes are different in different groups, thus contradicting the too-facile assumption of homogeneity of regression). Such an interaction does not invalidate the analysis; it merely makes the interpretation more challenging. And if such an interaction is visibly present, the analysis that assumes its absence will in general have less power to detect _other_ interesting things. -- DFB.
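Carrying out ANCOVA as a regression, with the slope interaction as an explicit term, can be sketched in a few lines of Python with NumPy. The data here are simulated (treatment indicator, covariate, outcome; all numbers invented), and the test for homogeneity of regression is just the extra-sum-of-squares F comparison of the two models:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data: the outcome's slope on the covariate differs by group.
n = 50
group = np.repeat([0.0, 1.0], n)                 # treatment indicator
x = rng.normal(50, 10, 2 * n)                    # covariate (e.g., a pretest score)
y = 10 + 5 * group + 0.8 * x + 0.4 * group * (x - 50) + rng.normal(0, 3, 2 * n)

ones = np.ones_like(x)
X_common = np.column_stack([ones, group, x])             # ANCOVA: one common slope
X_inter = np.column_stack([ones, group, x, group * x])   # slopes free to differ

def rss(X, y):
    """Residual sum of squares from an ordinary least-squares fit."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    e = y - X @ beta
    return float(e @ e)

rss0, rss1 = rss(X_common, y), rss(X_inter, y)
# F test for the single interaction parameter:
F = (rss0 - rss1) / (rss1 / (len(y) - X_inter.shape[1]))
print(round(F, 1))   # large F => homogeneity of regression is contradicted
```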
Re: Is this how you would have done it?
On Sat, 22 Dec 2001, Ralph Noble asked: How would you have done this? A local newspaper asked its readers to rank the year's Top 10 news stories by completing a ballot form. There were 10 choices on all but one ballot (i.e. local news, sports news, business news, etc.), and you had to rank those from 1 to 10 without duplicating any of your choices. One was their top pick, 10 their lowest. Only one ballot had more than 10 choices, because of the large number of local news stories you could choose from. I would have thought if you only had 10 choices and had to rank from 1 to 10, then you'd count up all the stories that got the readers' Number One vote and which ever story got the most Number One votes would have been declared the winner. That is certainly one way of determining a winner. But if one were going to do this in the end, there is not much point to asking for ranks other than 1, because that information is not going to be used at all. (Unless, of course, one uses a variant of this method for the breaking of ties, or for obtaining a majority of votes cast for the winner.) Not so in the case of this newspaper. So maybe I do not understand statistics. Non sequitur. You are not discussing statistics, you are discussing the choice of methods of counting votes. The newspaper told the readers there were several ways it could have tallied the rankings. This is true. Several may be an understatement. The newspaper decided to weight everybody's responses and gave each first place vote a value of 10, each second place nine, each third place eight, and so on. They then added together the values for each story and then ranked the stories by point totals. So is this an accurate way to have tallied the votes? Why not, assuming they didn't err in their arithmetic? In what sense do you want to mean accurate? I would use the word to describe the care with which the chosen method was carried out, not the choice of method, as you appear to mean. 
Accurate ordinarily refers, at least by implication, to how closely some standard or other is being met: what standard did you have in mind? And why weight them since the pool in all but one category only had 10 items to choose from? One answer is, precisely because all categories (but the one, and you haven't quite described what happened to the one, but I'll assume that only the ranks 1 to 10 were used in that case) had 10 items. If you add up all the 1st, 2nd, 3rd, etc. votes _without_ weighting them (that is to say, weighting them equally instead of unequally), you get the same total for each item, and have no way of declaring a winner. (This may not be true for the one category, since there are more than 10 items but only 10 ranks to be apportioned among them.) One could, of course, have weighted them according to their ranks (1st = 1, 2nd = 2, etc.) and chosen the one with the _lowest_ point total. (This of course is equivalent to what the newspaper actually did: this point total equals 11 minus the newspaper's point total, and you get the same winners this way.) Or according to the reciprocal of their ranks (1st = 1, 2nd = 1/2, 3rd = 1/3, etc.) and added those up, and taken the highest score. This is not equivalent to the method actually used, although sometimes the results are not different. Etc. If you conclude from all this that the choice of counting method for tallying votes is an arbitrary one, you are quite right. It is. -- DFB.
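The three counting methods above can be compared directly. A stdlib-Python sketch with four invented ballots over four stories (scaled down from ten for brevity) shows the newspaper's weighting and the lowest-rank-sum method always agree, while the reciprocal-rank method can crown a different winner:

```python
# Four invented ballots, each a full ranking of four stories (index 0 = rank 1):
ballots = [
    ["A", "B", "C", "D"],
    ["A", "B", "D", "C"],
    ["C", "B", "D", "A"],
    ["D", "B", "C", "A"],
]
items = "ABCD"
n = len(items)

# Newspaper's method: rank 1 earns n points, rank 2 earns n-1, and so on.
newspaper = {it: sum(n - b.index(it) for b in ballots) for it in items}
# Lowest-rank-sum method (the post's "11 minus" complement, scaled to 4 items):
ranksum = {it: sum(b.index(it) + 1 for b in ballots) for it in items}
# Reciprocal-of-rank method (NOT equivalent in general):
reciprocal = {it: sum(1 / (b.index(it) + 1) for b in ballots) for it in items}

print(max(newspaper, key=newspaper.get))     # B
print(min(ranksum, key=ranksum.get))         # B (same winner, as the post notes)
print(max(reciprocal, key=reciprocal.get))   # A (a different winner!)
```

Story A here is polarizing (twice first, twice last) while B is everyone's steady second; the reciprocal scheme rewards first-place votes heavily enough to flip the result.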
Re: s-function in SPSS curve estimation
On Thu, 20 Dec 2001, Johannes Hartig wrote: Does anyone know the original applications or the meaning of the S-function in SPSS? I know the function itself: Y = e**(b0 + (b1/t)) or ln(Y) = b0 + (b1/t) and I know what the curve looks like, but I am wondering in which fields of research this function is typically used and which empirical relations it describes? You may find it looks a little more like other functions you have seen somewhere if you rewrite it as Y = a*e**(b1/t), or equivalently ln(Y) = ln(a) + b1/t. When it is desired to find the value of a, it is simply e**(b0), from your equation above. In biological contexts, this describes an exponential growth curve (which applies to some period of almost any organism's life, usually its extreme youth, before environmental constraints restrict its growth rate). Then the parameter b1 is positive and is intimately connected to doubling time, the length of time during which the organism doubles in size. I suspect that this is why your original formulation had b1/t in the exponent. If b1 is negative, then the equation models exponential decay, and the parameter b1 is connected (in exactly the same way as above) to half-life. Applications include (perhaps obviously) the diminution over time of the radioactivity of a radioactive substance. - DFB.
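A quick stdlib-Python sketch of the function (parameter values invented) confirms the algebraic rewrite and shows the behavior that gives the "S" model its character: with b1 negative, Y rises from near zero toward the asymptote a = e**b0 as t grows.

```python
from math import exp

b0, b1 = 2.0, -3.0      # invented parameters; b1 < 0 gives the rising S shape

def s_curve(t):
    """SPSS 'S' model: Y = exp(b0 + b1/t)."""
    return exp(b0 + b1 / t)

# The equivalent form Y = a * exp(b1/t) with a = exp(b0):
a = exp(b0)
assert abs(s_curve(5) - a * exp(b1 / 5)) < 1e-12

# As t grows, Y climbs toward (but never reaches) the asymptote a:
print(round(s_curve(1), 4), round(s_curve(100), 4), round(a, 4))
```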
RE: Statistical illiteracy
On Fri, 14 Dec 2001, Wuensch, Karl L wrote: I came across a table of costume jewelry at a department store with a sign that said 150% off. I asked them how much they would pay me to take it all off of their hands. I had to explain to them what 150% meant, and they then explained to me how percentages are computed in the retail trade: first we cut the price in half (50%). Then we cut it in half again. Now we have cut it in half a third time. 50% + 50% + 50% = 150% off. Interesting. Not altogether surprising, though. In a conversation with a local bank mortgage person, I explained that part of my income is in Canadian funds, deposited into my bank in Toronto, and the current exchange rate is (approximately) 1.50 (Canadian $ for each US $). She then wanted to calculate the equivalent US income by discounting the Canadian value by 50%. I pointed out that this was incorrect: one would discount the Canadian value by 33%. She said I hear what you're saying, but went on to indicate that it somehow wasn't relevant. I could not tell whether (a) she didn't believe me, (b) she didn't know how to deal with the arithmetic of exchange rates, (c) this is the way we do it here, (d) something else, or (e) a combination of the above. Whatever the case, I decided it would be the better part of valor to deal with another bank. But back to your retail trade: if they advertise a 150% discount directly, without referring to the sequence of three 50% discounts, might they not be liable to legal action for misrepresentation? -- DFB.
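The exchange-rate arithmetic takes three lines to check (the income figure below is invented): dividing by 1.50 is the same as discounting by one third, not by one half.

```python
cad_income = 60_000          # invented Canadian-dollar income
rate = 1.50                  # CAD per USD

usd_correct = cad_income / rate          # divide by the exchange rate
usd_banker = cad_income * (1 - 0.50)     # the banker's "discount by 50%"

print(usd_correct, usd_banker)   # 40000.0 vs 30000.0: a costly difference
# Discounting by one third reproduces the correct conversion:
print(abs(cad_income * (1 - 1/3) - usd_correct) < 1e-6)   # True
```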
Re: When to Use t and When to Use z Revisited
On Sun, 9 Dec 2001, Ronny Richardson wrote in part: Bluman has a figure (2, page 333) that is supposed to show the student When to Use the z or t Distribution. I have seen a similar figure in several different textbooks. So have I, sometimes as a diagram or flow chart, sometimes in paragraph or outline form. The figure is a logic diagram and the first question is Is sigma known? If the answer is yes, the diagram says to use z. I do not question this; however, I doubt that sigma is ever known in a business situation and I only have experience with business statistics books. Depends partly on what parameter one is addressing (either as a hypothesis test or as a confidence interval). For the mean of an unknown empirical distribution, I expect you're right. But for the proportion of persons in a population who would want to purchase (for a currently topical example) a Segway, the population variance is a known function of the proportion (the underlying distribution being, presumably, binomial), and for this case the t distribution is simply inappropriate, and one ought to use either the proper binomial distribution function, or else the normal approximation to the binomial (perhaps after satisfying oneself that N is sufficiently large for the approximation to be credible with the hypothesized (or observed) value of the proportion; various textbook authors offer assorted recipes for this purpose). { Snip, discourse on N = 30, although I'd think it were rather on df = 30. } However, other authors go well beyond 30. Aczel (3, inside cover) has values for 29, 30, 40, 60, and 120, in addition to infinity. Levine (4, pages E7-E8) has values for 29-100 and then 110 and 112, along with infinity. I could go on, but you get the point. If you always switch to z at 30, then why have t tables that go above 28? Again, the infinity entry I understand, just not the others. { Snip, assorted quotes ... 
} So, Berenson seems to me to be saying that you always use t when you must estimate sigma using s. Levine (4, page 424) says roughly the same thing, ... So, I conclude {slightly edited -- DB} 1) we use z when we know the sigma and either the data are normally distributed or the sample size is greater than 30 so we can use the central limit theorem. I would amend this to the sample size is large enough that we can... Whether 30 is in fact large enough or not depends rather heavily on what the true shape of the parent population actually is. (If it's roughly symmetrical and bell-shaped, 30 may be O.K.) 2) When n < 30 and the data are normally distributed, we use t. 3) When n is greater than 30 and we do not know sigma, we must estimate sigma using s so we really should be using t rather than z. Now, every single business statistics book I have examined, including the four referenced below, uses z values when performing hypothesis testing or computing confidence intervals when n > 30. Are they 1. Wrong 2. Just oversimplifying it without telling the reader or am I overlooking something? I vote for both 1. and 2., since 2. is in my view a subset of 1, although others may not share this opinion. I would add 3. Outdated. on the grounds that when sigma is unknown, the proper distribution is t (unless N is small and the parent population is screwy) regardless how large the sample size may be. The main (if not the only) reason for the apparent logical bifurcation at N = 30 or thereabouts was that, when one's only sources of information about critical values were printed tables, 30 lines was about what fit on one page (plus maybe a few extra lines for 40, 60, 120 d.f.) and one could not (or at any rate did not) expect one's business students to have convenient access to more extensive tables of the t distribution. And, one suspects latterly, authors were skeptical that students would pay attention to (or perhaps be able to master?)
the technique of interpolating by reciprocals between 30 df and larger numbers of df (particularly including infinity). But currently, _I_ would not expect business students to carry out the calculations for hypothesis tests, or confidence intervals, by hand, except maybe half a dozen times in class for the good of their souls: I'd expect them to learn to invoke a statistical package, or else something like Excel that pretends to supply adequate statistical routines. And for all the packages I know of, there is a built-in function for calculating, or approximating, the cumulative distribution of t for ANY number of df. The advice in any _current_ business-statistics text ought to be, therefore, to use t _whenever_ sigma is not known. And if the textbook isn't up to that standard, the instructor jolly well should be. { Snip, references. See the original post for more details. } -- DFB.
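Interpolating by reciprocals is itself only a few lines of arithmetic. A stdlib-Python sketch, using the standard published two-tailed 5% critical values of t, recovers the df = 60 entry from the 40 and 120 lines and shows that even at df = 120 the t value still exceeds z = 1.960:

```python
# Standard tabled two-tailed 5% critical values of t:
t_crit = {30: 2.042, 40: 2.021, 60: 2.000, 120: 1.980, float("inf"): 1.960}

# Interpolate by reciprocals of df between the 40 and 120 lines
# to approximate the df = 60 entry:
lo, hi = 40, 120
frac = (1 / lo - 1 / 60) / (1 / lo - 1 / hi)
approx_60 = t_crit[lo] + frac * (t_crit[hi] - t_crit[lo])
print(round(approx_60, 4))   # lands within a few ten-thousandths of 2.000

# Why "switch to z at 30" understates the critical value: t is still
# above z at every finite df in the table.
print(all(t_crit[df] > t_crit[float("inf")] for df in (30, 40, 60, 120)))   # True
```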
Re: What usually should be done with missing values ...
On 1 Dec 2001, jenny wrote: What should I do with the missing values in my data. I need to perform a t test of two samples to test the mean difference between them. How should I handle them in S-Plus or SAS? 1. What do S-Plus and/or SAS do with missing values by default? (All packages have defaults, and sometimes they're even sensible ones. If your package(s) do what you want done, or at least do something you can live with, that's probably the most comfortable resolution of your question.) 2. Why are there missing values? And what do these reasons imply (if anything) about the values themselves? There are essentially two choices available: (a) treat the values as missing, that is, discard each of the cases for which the variable in question is missing for the duration of the analysis of that variable, and retrieve those cases again when dealing with some other variable for which their value is not missing. This is the default in MINITAB and SPSS, although for some analyses (in both packages) the missing cases are deleted listwise (in multiple regression, for example, if any of the variables in the model be missing, the whole case is deleted from the analysis) and for some the missing cases are deleted pairwise (in reporting a correlation matrix, for example, a case is deleted from the computation of a correlation coefficient if either of the two variables is missing, but is retained for other correlation coefficients for which both variables are non-missing in this case). (b) Impute some value to the missing variable for this case. There are a great variety of imputation schemes, all of them (so far as I know) suffering from the logical defect that one must assume something about the missing value, and the assumption may not only be untrue, it may be wildly in error.
One approach is to substitute the mean of this variable for the missing value; but if the _reason_ the value is missing implies that the actual value is likely to be extremely high or extremely low, this is evidently not a good strategy. Another approach is to use some variant of multiple regression to predict the missing value from the existing values of other variables; again, this assumes that the missing value would be close to the regression line (or surface), and if the _reason_ implies an extreme value or outlier, this is not particularly likely to yield a realistic value. This is of course a simplified account (some might say oversimplified) of the problem of missing-ness, but may suggest some useful ideas. Personally, I generally prefer to acknowledge that I don't know the value that's missing, and let the case be temporarily discarded, at least for a first run at an analysis (or series of analyses); most of the time. And if I chose to use a method of imputation, I'd usually want to report results both of analyses in which the missing data are honestly missing, and analyses in which imputed values are used, so that I (and my readers) could see the effect(s) of the imputation. And since you want to test for differences between means, you almost certainly should NOT substitute a _mean_ for any missing value. If you substitute the overall mean, you will tend to diminish the real difference, if any, between the two sample means, and if there's a lot of missing data you could end up not finding differences where they would have been evident if you'd permitted the missing cases to be discarded. If you substitute the mean of this subgroup, you will not change the apparent difference between the means, but you WILL reduce the within-group (pooled or not) variance, so that you will have spuriously high sensitivity to differences between the means. Whether there is an argument that would support any other method of imputation in your case, I cannot tell.
I'm inclined to doubt it, but that may be merely a reflection of my usual skepticism (or, perhaps, curmudgeonliness). -- DFB.
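The variance-shrinking effect of subgroup-mean imputation is easy to demonstrate. A stdlib-Python sketch with invented data: substituting the group mean for ten "missing" values leaves the mean untouched while the sum of squares stays fixed and the divisor grows, so the sample variance must fall.

```python
import random

random.seed(1)

# Thirty invented observations; suppose the last ten went missing and
# we "fill them in" with the mean of the twenty that were observed.
observed = [random.gauss(10, 2) for _ in range(20)]
imputed = observed + [sum(observed) / len(observed)] * 10

def var(xs):
    """Unbiased sample variance."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

# Mean unchanged, variance shrunk: any later t statistic computed from
# the imputed sample is spuriously inflated.
print(round(var(observed), 3), round(var(imputed), 3))
```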
Re: Evaluating students: A Statistical Perspective
On Tue, 27 Nov 2001, Thom Baguley wrote in part: Donald Burrill wrote: On Fri, 23 Nov 2001, L.C. wrote: The question got me thinking about this problem as a multiple comparison problem. Exam scores are typically sums of problem scores. The problem scores may be thought of as random variables. By the central limit theorem, the distribution of a large number of test scores should look like a Normal distribution, Provided, of course, that the test scores in question are iid. Now it is possible to imagine that test scores for different persons are measured independently (although I am aware of skepticism in the ranks on this point!), but that they are identically distributed seems unlikely at best. I'd argue that they probably aren't that independent. If I ask three questions all involving simple algebra and a student doesn't understand simple algebra they'll probably get all three wrong. True. But this does not seem to me to speak to the issue of independence, which as I understand it is an assumption that responses made by student A to items on a test are unrelated to (i.e., do not affect and are not affected by) the responses made by student B to those items. Surely student A, who has not (let us suppose) adequately remembered what s/he needs to know of simple algebra, is not to be held responsible for the fact that student B doesn't remember any either? In my experience most statistics exams are better represented by a bimodal (possibly a mix of two skewed normals) than a normal distribution. Essay based exams tend to end up with a more unimodal distribution (though usually still skewed). Interesting. 
Scores on my exams tend to be negatively skewed in general, and to show evidence of several clusters (that may or may not show up as apparent modes): the several persons at the bottom, often clustered at some little distance from their nearest neighbor(s), who almost seem determined to fail; and two to four clusters moving up the scale from there, which sometimes fall into ranges useful for grades of D, C, B. Sometimes, but not always, there are another few students clustered at the top. -- Don.
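A quick simulation (stdlib Python, all parameters invented) illustrates why exam-score distributions so often come out bimodal rather than normal: a class that mixes two kinds of student produces a mixture of two humps, even though each student's score is a sum of item scores.

```python
import random

random.seed(4)

# An invented 50-item exam: 60% of students answer each item correctly
# with probability 0.8, the other 40% with probability 0.35. Scores are
# sums of item scores, but not iid across students.
def score():
    p = 0.8 if random.random() < 0.6 else 0.35
    return sum(random.random() < p for _ in range(50))

scores = [score() for _ in range(2000)]

# Coarse histogram (bins of width 5): two humps with a trough between,
# roughly at 15-19 and 40-44.
bins = [0] * 10
for s in scores:
    bins[min(s // 5, 9)] += 1
print(bins)
```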
Re: Evaluating students: A Statistical Perspective
On Sat, 24 Nov 2001, L.C. wrote: Thanks for the reply! As for the iid, it's reasonable to believe the questions could be drawn from some population. Why not the answers? If the questions are selected in accordance with some table of specifications, they are not from _a_ population, but from many; and there is no _a priori_ reason I can think of to suppose that their item characteristics are iid. As for the answers, the usual reason for wanting to evaluate students is precisely because they are (or one hopes they are!) different in their levels of skill (or whatever): the task is to assess these skill levels, and it is nonsense to assume that all the persons are id on the measure on which one hopes to identify differences. (Hey! I've heard much worse justifications for statistical assumptions! :) At any rate, bell curves do arise often enough in this context to be written about. Of course, bell curve does not necessarily imply normal distribution. You can get quite nice bell curves from binomial distributions, e.g. Also of course, any real data must be discrete, not continuous, so cannot technically be normally distributed anyway. (It is possible that the distribution may be more or less well approximated by a normal distribution with the same mean and variance, but that's not the same thing.) As for wanting gaps in the resulting distribution... That was my point. When you do have a bell curve, it shouldn't be satisfying; it should be disturbing. Depends on how bell-like the curve is. For almost any interesting variable that can be measured on humans, one expects rather a lot of people in the middle, and progressively fewer toward the extremes, of the distribution; doesn't one? (And if not, why not?) This is the maddening aspect of psychometry - they engineer these nice normal distributions on which to base their diagnoses. You'd think they'd *want* bimodal, discrete, or mixed continuous/discrete distributions, but no.
They diagnose by Z scores (thereby defining their own prevalences :) and assert that they are discovering diseases, and not punishing unusual people. (And they get to testify in court.) Best Regards, -Larry C. Hmm. This thread started out as evaluating students, in the context of classes and teacher-made tests, as I recall. Not exactly the same thing as diagnosing (in a quasi-medical sense) or discovering diseases, I shouldn't think. One wonders, then, why you aren't posting these complaints in a newsgroup of psychometricians, rather than one of statistics teachers? -- DFB.
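The "nice bell curves from binomial distributions" remark is easily checked. A stdlib-Python sketch with a fair-coin binomial (n = 40, invented for illustration) shows a perfectly discrete distribution that is nonetheless unimodal and bell-shaped, rising to a single mode at n*p and falling away on both sides:

```python
from math import comb

n, p = 40, 0.5
pmf = [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

mode = max(range(n + 1), key=lambda k: pmf[k])
print(mode)   # 20, i.e. n*p

# Probabilities rise monotonically to the mode and fall away after it:
print(all(pmf[k] <= pmf[k + 1] for k in range(mode)))      # True
print(all(pmf[k] >= pmf[k + 1] for k in range(mode, n)))   # True
```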
Re: Need help with a probability problem
On 20 Nov 2001, J. Peter Leeds wrote: I'm working on a formula for measuring decision making skill and am trying to estimate the probability that a person of known skill can distinguish among different response option contrasts and avoid a type II error. One effective way of avoiding a Type II error is to reject the hypothesis being tested. Of course, this entails a non-zero probability of making a Type I error... :-) Seriously, though, I believe it is not possible to _avoid_ a Type II error in the process of accepting the hypothesis being tested; one can only [attempt to] control the probability of such an error. Perhaps this is what you meant, but it isn't exactly what you wrote. The problem actually breaks down to a rather simple analogy: Imagine that a man has been sentenced by court to run a gauntlet composed of four club-wielding executioners. The court places the best execution You mean executioner, surely? at the beginning of the gauntlet followed by the second, third and fourth best. Based on past performance the first executioner has a .90 probability of striking the man, while the remaining executioners have .50, .30, and .20 respectively. What is the man's probability of being struck by at least one of the executioners and how is this calculated? Notice that the events are not independent because if the man is fast (or lucky, or skillful?) enough to make it past the first executioner his odds of making it past the rest are improved since he will have survived the best executioner. In other words, the probabilities associated with the other three executioners are NOT .50, .30, and .20 as advertised, but some (presumably) smaller values? In other words, the probability of being struck by the second executioner is .50 only if one has already been struck by the first executioner? This doesn't seem very sensible...
And what model have you (if any) for recalculating the other three probabilities for those who manage to escape the first (and then the second, and then the third) executioner? I do not see why you quote values of alleged probabilities, only to say in the next breath that those probabilities are false. Nor do I quite believe your assertion of non-independence: seems to me they might very well BE independent, if only one knew what the REAL probabilities were. No? What is this sort of problem called? (e.g., conditional probability, joint probability, Bayesian probability, etc.). Please excuse the inanity of the example but it is much easier than trying to explain my research. Easier it may be, but one can't help suspecting that some aspects of the inanities evident are not paralleled by structures or relationships in whatever your real problem is; which rather vitiates the underlying (if unstated) assumption that analysis of the inane example will be in some way helpful in analyzing the real circumstances. Or, to put it another way, the inane example may be wholly inadequate as a model for whatever phenomenon you're really trying to deal with. -- DFB.
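For what it is worth, IF the four strike probabilities are taken at face value and treated as independent (the reading questioned above), the calculation the poster asked about is the complement rule, sketched here in a few lines of Python:

```python
p_strike = [0.90, 0.50, 0.30, 0.20]   # the four executioners, as quoted

# Under independence (each probability unaffected by earlier escapes),
# P(struck at least once) = 1 - P(missed by all four).
p_miss_all = 1.0
for p in p_strike:
    p_miss_all *= 1 - p
p_at_least_one = 1 - p_miss_all
print(round(p_at_least_one, 3))   # 0.972
```

Any non-independence model would replace the four marginal probabilities with conditional ones, which is exactly the missing ingredient the reply points to.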
Re: F distribution
On 17 Nov 2001, Myles Gartland wrote: In an F distribution, the critical value for the lower tail is the reciprocal of the critical value of the upper tail (with the degrees of freedom switched). Why? I understand how to calculate it, but do not get why the math works. Essentially for the same reason that in the normal distribution the critical value for the lower tail is the negative of the critical value for the upper tail. Think about it. For F = V1/V2, where V1 and V2 are two variance estimates with numbers of degrees of freedom n1 and n2 respectively, the relevant F distribution is said to have n1 and n2 degrees of freedom, naming the numerator first and then the denominator. For F = V2/V1, the relevant F distribution has n2 and n1 d.f. (hence the interchange of the numbers of degrees of freedom to which you allude). Notice that V2/V1 is the reciprocal of V1/V2. If V1/V2 is sufficiently larger than 1 that the hypothesis of equal variances in the populations can be rejected, then V2/V1 must be sufficiently smaller than 1 to permit rejection. Hence the critical value for V2/V1 must be the reciprocal of the critical value for V1/V2, and the d.f. are interchanged simply by the choice of which direction to divide. Donald F. Burrill [EMAIL PROTECTED]
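The reciprocal property can be checked numerically by simulating variance ratios; a sketch (the d.f. of 5 and 10 and the replication count are arbitrary choices, and the two quantiles are estimated from independent simulations, so they agree only approximately):

```python
import random

# Monte Carlo check of the property discussed above: the lower 5%
# critical value of F(n1, n2) is the reciprocal of the upper 5%
# critical value of F(n2, n1).
random.seed(0)
n1, n2 = 5, 10
reps = 50_000

def var_est(df):
    # A variance estimate with df degrees of freedom: the mean of df
    # squared standard-normal draws (a chi-square over its d.f.).
    return sum(random.gauss(0, 1) ** 2 for _ in range(df)) / df

f_12 = sorted(var_est(n1) / var_est(n2) for _ in range(reps))
f_21 = sorted(var_est(n2) / var_est(n1) for _ in range(reps))

lower_05 = f_12[int(0.05 * reps)]   # lower-tail critical value of F(n1, n2)
upper_95 = f_21[int(0.95 * reps)]   # upper-tail critical value of F(n2, n1)

print(round(lower_05, 3), round(1 / upper_95, 3))  # approximately equal
```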
Re: regression on timeseries data: differencing ?
On Tue, 13 Nov 2001, Wendy (alias Eric Duton?) wrote: When applying multiple regression on timeseries data, should I check (similarly to ARIMA-models) for unit roots in the dependent variable and the predictor variables and perform the necessary differencing OR could I simply start the multiple regression analysis on the pure timeseries and check the residuals on the general assumptions of regression analysis (esp. autocorrelation) ? and I replied: 1. Why do you write as though these were mutually exclusive options? to which Eric Duton responded: Actually I'm a bit confused. When looking at a timeseries course they stress the need for stationarity of the series. Courses always simplify, and sometimes oversimplify. On the other hand in Multiple regression theory they stress the errors should be iid N(0,constant var). I don't know about should. It is often convenient if this is true, and in the nature of things the observed residuals (errors) always have mean 0 anyway. So strictly speaking it seemed to me I shouldn't worry about preliminary stationarity tests in multiple regression between timeseries and just check the residuals afterwards. But then I saw a paper where they did check for stationarity before estimating the parameters ... And of course another where they didn't ... Therefore I'm totally lost whether I should or should not carry over the preliminary stationarity testing into multiple regression theory when confronted with timeseries for Y and X's. Should and should not have no meaning to me in the absence of any context that would indicate the value system, or perhaps the theology, that specifies the nature of should. I do not understand why you waffle around worrying about should when you could have been carrying out BOTH analyses, after which you would know if the difference in analytical approach entails any difference(s) in results, and whether any such difference(s) be interesting enough to pursue and attempt to explain (via future research). 
Donald F. Burrill [EMAIL PROTECTED]
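The advice to carry out BOTH analyses can be motivated by a classic demonstration: two *independent* random walks (unit-root series) typically show a large correlation in levels — the spurious-regression phenomenon that stationarity checks guard against — but not in differences. A sketch, with all settings (series length, replication count) chosen purely for illustration:

```python
import random

random.seed(1)

def corr(x, y):
    # Pearson correlation, written out for self-containment.
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def random_walk(n):
    level, out = 0.0, []
    for _ in range(n):
        level += random.gauss(0, 1)
        out.append(level)
    return out

reps, n = 200, 100
levels_corr, diffs_corr = [], []
for _ in range(reps):
    y, x = random_walk(n), random_walk(n)
    dy = [b - a for a, b in zip(y, y[1:])]   # first differences
    dx = [b - a for a, b in zip(x, x[1:])]
    levels_corr.append(abs(corr(y, x)))
    diffs_corr.append(abs(corr(dy, dx)))

# Mean |correlation|: large for levels, near zero for differences.
print(sum(levels_corr) / reps, sum(diffs_corr) / reps)
```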
Re: When one should stop sampling?
On 12 Nov 2001, Niko Tiliopoulos wrote: I am acting as the stats advisor for my unit in the psychology department of the University of Edinburgh, UK. Last week a colleague of mine presented me with the following issue, and I am not quite sure how to respond: She is running a psychological experiment, in which she a priori specified her sample size as 200 people. _Intended_ sample size, surely? Where was this specified? In a proposal for funding the research in question? There may be a non-statistical question lurking in the underbrush here, about the degree to which she is (or thinks she is, or may be made to appear as though she is) committed to actually _using_ 200 people, and what the penalty (if any) is for not using that many. She has already sampled 40 participants and a preliminary effect size (ES) analysis suggests an almost zero effect. Based on previous research, she was expecting a detectable effect even with 40 subjects - though I suspect she was not expecting enough power to get a significant result at that stage. In addition, it appears that the reason the ES she gets is nowhere close to the expected figure may be because of a design flaw. Do you mean a flaw in the original design (that is, in the logic) of the study, or a glitch in carrying out some intended aspect of the design (that is, in the implementation of a perfectly adequate design)? And is there evidence on the alleged flaw, persuasive enough to convince T.C. Mits (or equivalent!) that this isn't just a weak excuse invented to conceal some more culpable defect? (Although I have difficulty imagining what I might have meant by that phrase!) So she asked me whether it is justified to go up to, say, 100 participants, check again her ES and if it's still near zero, stop sampling, Why waste the time and energy of _another_ 60 subjects, if one really believes [part of] the problem is a design flaw? The more obvious approach would be to fix the (newly-discerned?) 
problem, and start over with a new batch of Ss. Is there a justifiable reason for NOT fixing defects when they've been found? or whether she had to sample all 200 people because she had said so in her protocol? Her protocol (by which I suppose you mean her research proposal?) surely embodied, at least implicitly, the conditions that the research design was competently done, and that the procedures were being (or, were to be) carried out consistently with the design. The presence of a design flaw, even if it's only a putative one, denies one or both of these conditions, and therefore logically revokes any implicit responsibilities to carry out the entire protocol as originally specified. However, I can imagine scenarios in which a legal (as distinct from logical) responsibility may exist that would need to be addressed in legal terms. (I can't tell whether any such thing applies in this case, of course.) I do think it would be foolish to keep sampling when one has grounds to believe that there is no effect or that there is a flaw in the study. Right. That's called sending good money after bad, and (at least according to North American folklore) Scots are noted for their antipathy toward any such activity. I believe that if the plot of subjects versus power suggested that the power curve levelled after a given sample size, that would be enough justification to stop sampling (needless to say that participants that satisfy her protocol are precious and hard to find). Probably overkill, and quite possibly impossible. (After all, we keep being reminded that the null hypothesis is never actually true, which implies that the ES is not exactly zero, which implies that with a sufficient sample size (maybe ten million or so?) the power curve would indeed level out -- near power = 1.0.) 
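The subjects-versus-power curve mentioned above can be sketched for a one-sided z-test at alpha = .05. The two effect sizes below are illustrative only (a respectable effect versus a near-zero one, not estimates from the study under discussion); the point is that for a near-zero ES the curve stays low over any practical range of n:

```python
from statistics import NormalDist

nd = NormalDist()
z_crit = nd.inv_cdf(0.95)            # one-sided critical value, alpha = .05

def power(d, n):
    # P(reject H0) when the true standardized effect size is d
    # and the sample size is n, for a one-sided z-test.
    return 1 - nd.cdf(z_crit - d * n ** 0.5)

for d in (0.5, 0.05):                # illustrative effect sizes
    print(d, [round(power(d, n), 3) for n in (40, 100, 200, 1000)])
```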
If one wanted to invoke a statistical argument (in the face of whatever logical argument and/or evidence exists of a design flaw and/or of an ES an order of magnitude smaller than one had reason to expect in the beginning), it might be more persuasive to show that an upper bound on ES (say, the top of a 95% confidence interval) would imply no practical value whatever for so small an ES. (Presumably the presence of an interestingly large ES would have implied some change, or recommendation for change, in practice somewhere.) Her query though sounds to me more like a methodological (if not ethical) one, rather than a true statistical problem, and thus this bottom-up justification may not suffice. Dunno. Attempts to identify the pure effect of any problem or condition (e.g., to distinguish between inherited and environmental influences in the presence of both) are usually doomed to failure by their very nature. And are you suggesting that there IS, or could be, some justification for imposing a failed experimental protocol on a group of innocent bystanders (the additional Ss whose time
Re: regression on timeseries data: differencing ?
On Tue, 13 Nov 2001, Wendy (alias Eric Duton?) wrote: When applying multiple regression on timeseries data, should I check (similarly to ARIMA-models) for unit roots in the dependent variable and the predictor variables and perform the necessary differencing OR could I simply start the multiple regression analysis on the pure timeseries and check the residuals on the general assumptions of regression analysis (esp. autocorrelation) ? Wendy 1. Why do you write as though these were mutually exclusive options? 2. Why did you send three (!) copies to the list? -- DFB. Donald F. Burrill [EMAIL PROTECTED]
Re: Evaluating students
On Wed, 14 Nov 2001, Alan McLean wrote in part: Herman Rubin wrote: A good exam would be one which someone who has merely memorized the book would fail, and one who understands the concepts but has forgotten all the formulas would do extremely well on. Since to understand the concepts almost always means understanding (and hence knowing) the formulas, I would interpret someone who has 'forgotten all the formulas' as understanding the concepts only in the most superficial manner, and so should do badly! Non sequitur. To know formulas (in a deep sense of understanding them) is one thing; to be able to write them verbatim is another thing altogether (and something that xerographic copiers do better than people do, by and large). Of course, it is easier to ask questions about the details of formulas than to probe a student's deeper understandings... Overall, the evaluation of students is driven mostly by budget, (lecturers') time, lecturers' interest, the number of students, politics - the best one can do is to assess students as honestly as possible within the range allowed by these factors! Sadly, this is true; and not infrequently exacerbated by administrative rulings (not to say interference!). At the university where I teach part-time, for example, course marks are to be submitted within 72 hours of the final examination. Not a circumstance that encourages (let alone rewards) setting the kinds of exams that Herman describes. -- Don. Donald F. Burrill [EMAIL PROTECTED]
Re: Z Scores and stuff
You persist in repeating your original request in your original phrasing, with no elaboration(s) that might resolve the ambiguities therein. On Sat, 10 Nov 2001, Mark T wrote: On Fri, 09 Nov 2001 Rich Ulrich [EMAIL PROTECTED] wrote: On Thu, 8 Nov 2001 Mark T [EMAIL PROTECTED] wrote: What are the formulae for calculating the mean to z, larger proportion and smaller proportion of a z-score (standardised score) on a standard normal distribution? I know about tables listing them all, but I want to know how to work it out for myself :o) Do you want the calculus, or just a numerical approximation? For starters, in my stats-FAQ, see http://www.pitt.edu/~wpilib/statfaq/gaussfaq.html Thanks for your reply. Ummm, unfortunately I don't understand this :o) Not surprising. I am by no means a mathematician. I am studying psychology and 1/4 of my course is statistics *for psychology*, ie it's pretty basic without any of the advanced stuff (I hope!). A pity, if true. Adequate practice of psychology requires considerably more than a minimum knowledge -- and understanding! -- of statistics. All I want to know, for interest's sake, is how one calculates the mean to z, Yes, you said that before. In the same words. For the sake of (possibly) furthering the conversation, I will assume that what you meant was something like Given a value x of a variable X, which has a known mean, how does one convert x to z? (Your language admits of several other possible meanings, but I'll leave it to you to clarify what you intended, if it wasn't what I've conjectured (and if you can).) The formula you request, for this purpose, converts x to z: z = (x - mean)/sd where sd is the known standard deviation of the variable X. Now, I'm sure your statistics instruction includes this equation; it follows that the question you really want to ask is (probably) something else. In which case we all await with interest your clarification. 
larger proportion and smaller proportion of a standardised score, without having to read through a long list of numbers. Hmm. Numbers scare you, do they? There are essentially three ways of going about this part: 1. Look the proportions up in a table of the standard normal distribution, which by your account you are apparently too lazy to do. Sounds as though you're being inefficient, by the way: there's no need to read through a long list of numbers, only to look up a single number in the table (the other proportion you can get by subtracting from 1.) 2. Use convenient statistical software (MINITAB, SAS, SPSS, a TI-83 calculator, etc.) to calculate the proportions by numerical approximation. This of course does not satisfy your request for the formulae. 3. Start with the mathematical expression for the density function of a standard normal distribution, and integrate it from minus infinity to z. Which is what Rich was referring to when he asked if you wanted the calculus. Again, by your account you haven't the mathematics for this; especially as the integral in question does not exist in closed form. (Which, of course, is precisely the reason why tables were constructed in the first place, to avoid a _very_ tedious computational chore every time one had a value of z for which proportions, or probabilities, were needed.) Forgive me if that was covered in your FAQ, but I couldn't see it! Perhaps you could point me in the direction of the formulae? Forgive me if my candour is uncomfortable, but this sounds to me very like asking a sorcerer for the spell(s) you think he uses. Do you want a magic wand also, and perhaps a cloak of invisibility? I am reminded of the time, years ago, when the mother of a high-school student telephoned me for assistance in a problem the boy had been set by his math teacher. (I noticed at the time that it wasn't the _boy_ who called me.) 
He'd been asked to figure out the possible scores one could get in a hand at cribbage (or perhaps to explain why a score of 19 is not possible -- I don't remember precisely). Mother was sure there must be a formula for doing this (she evidently looked on mathematics as you do, as a domain wholly of magic and populated by sorcerers), and was audibly disappointed to be told The only way to do this is to enumerate the possible hands. -- DFB. Donald F. Burrill [EMAIL PROTECTED]
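For the record, the quantities Mark asked about (mean-to-z area, larger and smaller proportion) can be computed in a few lines. The normal CDF has no elementary closed form — hence the tables — but `math.erf` supplies it via Phi(z) = (1 + erf(z/sqrt(2)))/2. The score, mean, and sd below are invented for illustration:

```python
from math import erf, sqrt

def phi(z):
    # Standard normal CDF via the error function.
    return (1 + erf(z / sqrt(2))) / 2

x, mu, sd = 130.0, 100.0, 15.0       # illustrative values
z = (x - mu) / sd                    # z = (x - mean)/sd, as in the reply
mean_to_z = abs(phi(z) - 0.5)        # "mean to z" area
larger = max(phi(z), 1 - phi(z))     # "larger proportion"
smaller = 1 - larger                 # "smaller proportion"

print(round(z, 2), round(mean_to_z, 4), round(larger, 4), round(smaller, 4))
```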
Re: Can I Use Wilcoxon Rank Sum Test for Correlated Clustered Data??
On Thu, 1 Nov 2001, Chia C Chong wrote: I am a beginner in statistical analysis and hypothesis testing. I have 2 variables (A and B) from an experiment that was observed for a certain period of time. I need to form a statistical model that will model these two variables. Seems to me you're asking in the wrong place. The _model_ cannot be determined statistically, nor (in general) by statisticians. It arises from the investigator's knowledge of the substantive area in which the experiment was carried out, and of the reasons why the experiment was designed and conducted in the first place. Given a model, or, better, a series of more or less complex models, a statistician can help you decide among them, and can help you arrive at numerical values for (at least some of) the parameters of the models. As an initial step, I plot the histograms of A and B separately to see how the data were distributed. How would you (or the investigator) expect them to be distributed? In particular, why would you think they might follow any of the usual theoretical distributions? (In other words, what's the theory behind your expectations -- or your lack of expectations?) However, it seems that both A and B can't be easily described by a simple statistical distribution like Gaussian, uniform etc. via visualisation. Hence, I proceeded to plot the Quantile-Quantile plot (Q-Q plot) What did you think this would tell you? and to fit both A and B with some theoretical distributions (all distributions available in Matlab!!). Again, none of the distributions seems to describe them completely. Then I was trying to perform the Wilcoxon Rank Sum test. What hypothesis were you testing, and why was the Wilcoxon test relevant to it? From the data, it seems that A and B might be correlated in some sense. You have not described a scatterplot of A vs. B (or B vs. A, whichever pleases you). Why not? 
My question is, can I rely purely on the Wilcoxon Rank Sum Test to find the parameters of the distributions that can describe A and B?? Since the Wilcoxon is allegedly a distribution-free test, I'm quite bemused by the idea that it might help one _find_ parameters... How do I perform a test to see whether A and B are really correlated?? Practically all pairs of variables are correlated, to one degree or another. What will it signify to you if A and B are (or are not) really correlated (whatever really is intended to mean)? What if A and/or B are an overlay of two or more distributions?? Hmm. By overlay, do you mean mixture, perhaps? Can this test tell me?? What makes things more tricky is that clustering was also observed in both A and B. At the same times, or in the same places? I really hope to get an idea how to start with the statistical analysis for this kind of problem... I'm sorry, but I don't yet perceive precisely what the problem is that the data were intended (or designed?) to address. -- DFB. Donald F. Burrill [EMAIL PROTECTED]
Re: What is a confidence interval?
In reviewing some not-yet-deleted email, I came across this one, and have no record of its error(s) having been corrected. On Sat, 29 Sep 2001, John Jackson wrote: How do you describe the data that does not reside in the area described by the confidence interval? For example, you have a two-tailed situation, with a left tail of .1, a middle of .8 and a right tail of .1; the confidence interval for the middle is 90%. Well, no. You describe an 80% C.I., not a 90% C.I. Is it correct to say with respect to a value falling outside of the interval in the right tail: For any random interval selected, there is a .05% probability that the sample will NOT yield an interval that yields the parameter being estimated and additionally such interval will not include any values in the area represented by the left tail. If you're still referring to the 80% C.I. introduced above, .05% probability is not applicable. [Not even if you had stated it correctly, either as .05 probability or as 5% probability. ;-) ] Can you make different statements about the left and right tail? Not for the case you have described. Had you chosen to compute an asymmetric C.I. (perfectly possible in theory, hardly ever done, so far as I am aware, in practice) it would be otherwise. -- DFB. Donald F. Burrill [EMAIL PROTECTED]
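The arithmetic of the correction above, as a sketch: tails of .1 on each side leave a central area of .8, i.e. an 80% interval. The sample figures (mean 50, sd 10, n = 25) are invented purely for illustration:

```python
from statistics import NormalDist

left_tail = right_tail = 0.10
confidence = 1 - left_tail - right_tail        # 0.80, not 0.90
z = NormalDist().inv_cdf(1 - right_tail)       # ~1.2816 for an 80% C.I.

mean, sd, n = 50.0, 10.0, 25                   # hypothetical sample
half_width = z * sd / n ** 0.5
print(round(confidence, 2),
      (round(mean - half_width, 2), round(mean + half_width, 2)))
```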
Re: Comparing percent correct to correct by chance
On Sun, 28 Oct 2001, Melady Preece wrote: Hi. I want to compare the percentage of correct identifications (taste test) to the percentage that would be correct by chance (50%? only two items being tasted). Can I use a t-test to compare the percentages? What would I use for the s.d. for the by chance percentage? (0?) Standard comparison would be the formal Z-test for a proportion; see any elementary stats text. If you have a reasonably large sample size, use the normal approximation to the binomial; if you have a small sample, it may be necessary to use the binomial distribution itself, which is considerably more tedious unless you have comprehensive tables. Sounds as though you'd wish to test H0: P = .50 vs. H1: P ≠ .50. For the Z-test, use the S.D. of a proportion associated with the hypothesized value (.5): SD = SQRT(pq/n) where p = the hyp. value (.5 in this case), q = 1-p, n = sample size. You may want to examine the translation of chance into a proportion of .5. I don't think I know what by chance means in the context of your investigation; certainly .5 is a possible interpretation, but I can imagine situations where it would be incorrect. (For example, if the two items are always presented in the same order, and there is a predilection in your population to identify the first correctly more frequently than the second, just because they're first and second, the chance hypothesis might be more properly represented by a number other than .5. This problem might be countered if the items were presented in counterbalanced order.) Also, if the respondents know beforehand what the two items are (just not which one is which), the situation is different from one in which the two items might (so far as the respondents know) come from a long-ish array of items. Thus if the task were to decide between chocolate and strawberry, the latter might be mis-identified more often if raspberry were [thought to be] a possible alternative. -- DFB. Donald F. 
Burrill [EMAIL PROTECTED]
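The Z-test described above, sketched with made-up data (say 32 correct identifications out of 50 tastings, testing H0: P = .5):

```python
from math import erf, sqrt

correct, n, p0 = 32, 50, 0.5         # hypothetical taste-test results

p_hat = correct / n
sd = sqrt(p0 * (1 - p0) / n)         # SD of a proportion under the null
z = (p_hat - p0) / sd

def phi(x):
    # Standard normal CDF via the error function.
    return (1 + erf(x / sqrt(2))) / 2

p_two_sided = 2 * (1 - phi(abs(z)))  # two-sided p-value for H1: P != .5
print(round(z, 3), round(p_two_sided, 4))
```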
Re: Graphics CORRESPONDENCE ANALYSIS
On Wed, 24 Oct 2001, Rich Ulrich wrote in part: It has been my impression (from google) that CA is more popular in European journals than in the US, so there might be better sites out there in a language I don't read. (CA = correspondence analysis, ou en francais analyse des correspondances) In Canada, and to a lesser extent in the U.S., correspondence analysis is also known under the name dual scaling. For references consult Professor Emeritus Shizuhiko Nishisato of the University of Toronto: Shizuhiko Nishisato [EMAIL PROTECTED]. -- Don. Donald F. Burrill [EMAIL PROTECTED]
Re: Final Exam story
The story is about six students who ... The instructor ... tells them to report the next day for an exam with only one question. If they all get it right they all pass. They were seated at corners of the room and could not communicate. Must have been an interesting room, with six corners :) The one question was, Which tire? I remember that the likelihood of all four picking the same tire was quite small, but I forgot how to calculate it explicitly. Assuming an ordinary vehicle with 4 tires, and that the students' responses are independent, the probability that all six name one particular tire is (1/4)^6 = 1/4096; if agreement on any common tire suffices, multiply by 4, giving (1/4)^5 = 1/1024. I would particularly appreciate a general solution (N students, M tires). For a room with N corners? The generalization ought to be obvious. On 12 Oct 2001, Dubinse wrote: I had promised a colleague a story that illustrates probability and now I forgot how to solve it formally. The story is about six students who go off on a trip and get drunk the weekend before their statistics final. They return a few days late and beg for a second chance to take the final exam. They tell a story about how they were caught in a storm and their car blew a tire and ended up in a ditch and they needed brief hospitalization etc. The instructor seems very easy going about the whole thing and tells them to report the next day for an exam with only one question. If they all get it right they all pass. They were seated at corners of the room and could not communicate. The one question was, Which tire? I remember that the likelihood of all four picking the same tire was quite small, but I forgot how to calculate it explicitly (except for listing all the possible outcomes). I would particularly appreciate a general solution (N students, M tires). Thanks. Stephen Dubin VMD http://www.hometown.aol.com/dubinse [EMAIL PROTECTED] Donald F. 
Burrill [EMAIL PROTECTED]
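The general solution the poster asked for is (1/M)^N for a particular tire, or M·(1/M)^N = (1/M)^(N-1) for agreement on any tire. A sketch that checks the N = 6, M = 4 case by brute-force enumeration of the 4^6 equally likely response patterns:

```python
from itertools import product

students, tires = 6, 4

# Enumerate every possible pattern of answers and count the patterns
# in which all students name the same tire.
outcomes = list(product(range(tires), repeat=students))
agree = sum(1 for o in outcomes if len(set(o)) == 1)

p_any_common_tire = agree / len(outcomes)   # M * (1/M)^N = (1/M)^(N-1)
p_specific_tire = (1 / tires) ** students   # (1/M)^N

print(p_any_common_tire)   # 4/4096 = 1/1024
print(p_specific_tire)     # 1/4096
```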
Re: Bimodal distribution
On Fri, 12 Oct 2001, Desmond Cheung (of Simon Fraser University, Vancouver, BC) wrote: Is there any mathematical analysis to find how much the two peaks stand out from the other data? Hard to answer, not knowing where you're coming from with the question. Any answer depends on the model(s) you wish to entertain that would generate a bimodal distribution. The more usual question, I believe, is how much separation there is between the modes (peaks), which is a horizontal distance, rather than how much the modes stand out from the other data, which rather sounds like a vertical distance. One suspects that you might usefully begin by consulting the literature on mixtures of normal distributions, or perhaps on mixtures more generally. Are there any formulas to find the variance/deviation/etc. that are similar to the unimodal distribution case? Formulas for variance, std. deviation, etc., do not depend on the shape of the distribution, except insofar as the functional form of the distribution may lead to a simpler formula, as in the case of a binomial distribution. Otherwise, if you want/need the variance (etc.) of a bimodal distribution, use the same formulas you use for any other empirical distribution. Incidentally, you write the unimodal distribution case as though there were only one unimodal distribution. There are lots. Donald F. Burrill [EMAIL PROTECTED]
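The point above, numerically: the variance formula does not care about shape. A 50/50 mixture of N(-3, 1) and N(3, 1) — chosen arbitrarily for illustration — is clearly bimodal, and the usual formulas apply unchanged (the mixture's theoretical mean is 0 and its variance is 1 + 3^2 = 10):

```python
import random
from statistics import mean, pvariance

random.seed(2)

# Sample from a bimodal 50/50 mixture of N(-3, 1) and N(3, 1).
data = [random.gauss(-3.0 if random.random() < 0.5 else 3.0, 1.0)
        for _ in range(20_000)]

m, v = mean(data), pvariance(data)   # same formulas as for any sample
print(round(m, 3), round(v, 2))      # near 0 and near 10
```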
Re: ranging opines about the range
William B. Ware [EMAIL PROTECTED] wrote: Anyway, more to the point... the add one is an old argument based on the notion of real limits. Suppose the range of scores is 50 to 89. It was argued that 50 really goes down to 49.5 and 89 really goes up to 89.5. Thus the range was defined as 89.5 - 49.5... thus the additional one unit... I recall textbooks (in the late 1960s and 1970s) that defined both an exclusive range (= max - min) and an inclusive range (= max - min + 1), the latter invariably being illustrated with examples of data that came in integers. (In fact, the examples _may_ always have been of variables that were counts.) On Sat, 6 Oct 2001, Stan Brown replied: Perhaps a better argument is that if you count the numbers you get forty of them: 50, 51, 52, ..., 59 makes ten, and similarly for the 60s, 70s, and 80s. I see the argument, but I don't know as I'd call it better. Seems to be confusing apples with oranges. By the idea of range, does one want to mean the _distance_ between the largest and smallest values in the data, or the _number_of_different_values_ between those two extremes? (These are NOT equivalent concepts!) And if the latter is of interest, does one want the number of different values _in_this_data_set_, or the number of _possible_ different values that might have been observed (under what hypothetical conditions?)? The inclusive range rule supplies the latter (under the assumption that the possible values can only be integers, which is an interesting restriction in itself) -- but not for all imaginable variables. [Counterexample: What's the range of possible values of a hand in cribbage? The smallest possible value is 0, the largest is 29. The exclusive range (in a possibly artificial data set that includes all possible hands, or at least all possible values) is 29-0 = 29. The inclusive range is 30, which is the number of integers between 0 and 29 inclusive. 
The number of _actual_values_ that can possibly be observed is 29 (of the integers from 0 to 29, 19 is not a possible value for a cribbage hand).] Anyway: one justification for arguing about how to calculate the range lies in not having decided whether one wants to mean range in the sense of distance in the measured variable, or range in the sense of number of [possible?] different values of the measured variable, and indeed in not having perceived that there _is_ such a distinction to be made. As William Ware reminds us, in the idea of range as distance, there may still be a distinction to be made based on the size of the units of measurement to which the measured variable is reported, and on whether one wishes to include the (presumed) half-units at either end of the empirical distribution (or, for variables like age that are customarily truncated rather than rounded, the (presumed) whole unit at the right end). The inclusive argument seems essentially to require (i) that the latent variable being measured be continuous, (ii) that one knows the precision of measurement to which the measured variable is being reported, and (iii) that one wishes not so much to describe the (empirical) sample in hand as to make inferences to the population from which one conceives it to have been drawn, under a specific (but usually only IMplicit) model under which the observed values are thought to have been derived from the latent values. Hmph. Didn't intend to be quite so long-winded. -- DFB. Donald F. Burrill [EMAIL PROTECTED]
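The cribbage counterexample, enumerated: the three notions of range discussed above give three different numbers for the same set of values (scores run 0 to 29, with 19 impossible):

```python
# Possible scores of a cribbage hand: the integers 0..29, except 19.
possible_scores = [s for s in range(30) if s != 19]

exclusive_range = max(possible_scores) - min(possible_scores)   # distance: 29
inclusive_range = exclusive_range + 1                           # max-min+1: 30
distinct_values = len(set(possible_scores))                     # values: 29

print(exclusive_range, inclusive_range, distinct_values)
```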
Re: Help with Minitab Problem?
Turns out the method I originally suggested is unnecessarily cumbersome. A more elegant method is described below. On Sat, 29 Sep 2001, Donald Burrill wrote in part: COPY c1-c35 to c41-c75; # Always retain the original data OMIT c1 = '*'; OMIT c2 = '*'; . . . ; OMIT c35 = '*'. There is probably a limit on the number of subcommands that MINITAB can handle (or on the number of OMIT subcommands that COPY can handle), but I don't know offhand what it is. Well, the limit is one: only one OMIT subcommand per COPY command. That makes this procedure distinctly tedious, for 35 columns. A more efficient method: ADD c1-c35 c36 This puts the sum of c1-c35 in c36, but if any one (or more) of c1-c35 are missing, the result is missing: so c36 has '*' for every row where there is a missing datum in some column(s). A reasonable next step is to see how much data is left: N c36 reports the number of non-missing values in c36. If that value is zero, or some other very small number, you might want to re-think your strategy before proceeding: COPY c1-c35 c41-c75; OMIT c36 = '*'. Columns c41-c75 now contain only rows of the original c1-c35 for which all of the values are NON-missing. snip, the rest -- DFB. Donald F. Burrill [EMAIL PROTECTED]
Re: E as a % of a standard deviation
On Sun, 30 Sep 2001, John Jackson wrote: Here is my solution using figures which are self-explanatory:

   Sample Size Determination
   pi = 50%                  central area   0.99
   confid level = 99%        2-tail area    0.01
   sampling error = 2%       1-tail area    0.005
   z = 2.58
   n1 = 4,146.82
   n = 4,147
   Excel function for determining central interval: NORMSINV($B$10+(1-$B$10)/2)

 The algebraic formula for n was: n = pi(1-pi)*(z/e)^2. It is simply amazing to me that you can do a random sample of 4,147 people out of 50 million and get a valid answer.
It is not clear what part of this you find amazing. (Would you otherwise expect an INvalid answer, in some sense?) The hard part, of course, is taking the random sample in the first place. The equation you used, I believe, assumes a simple random sample, sometimes known in the trade as an SRS; but it seems to me VERY unlikely that any real sampling among the ballots cast in a national election would be done that way. I'd expect it to involve stratifying on (e.g.) states, and possibly clustering within states; both of which would affect the precision of the estimate, and therefore the minimum sample size desired. As to what may be your concern, that 4,000 looks like a small part of 50 million: the precision of an estimate depends principally on the amount of information available -- that is, on the size of the sample -- not on the proportion that amount bears to the total amount of information that may be of interest. Rather like a hologram, in some respects; and very like the resolving power of an optical instrument (e.g., a telescope), which is a function of the amount of information the instrument can receive (the area of the primary lens or reflector), not of how far away the object in view may be nor what its absolute magnitude may be.
 What is the reason for taking multiple samples of the same n - to achieve more accuracy?
I, for one, don't understand the point of this question at all. Multiple samples? Who takes them, or advocates taking them? snip, the rest -- DFB.
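The worksheet's arithmetic can be checked with a short sketch (the variable names are mine; z = 2.5758 is the two-sided 99% normal critical value quoted as 2.58 above):

```python
from math import ceil

# Conservative sample size for estimating a proportion:
#   n = p(1-p) * (z/e)^2
p = 0.5          # worst-case proportion: maximizes p(1-p)
z = 2.5758       # two-sided 99% critical value of the standard normal
e = 0.02         # desired sampling error (+/- 2 percentage points)

n_exact = p * (1 - p) * (z / e) ** 2
n = ceil(n_exact)    # round up: a sample size can't be fractional
```

This reproduces the worksheet's n1 of about 4,146.8 and n = 4,147.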
Re: Help with Minitab Problem?
I second Dennis' question. While indeed MINITAB recognizes the missing values, what it does with them depends on the procedure being used: e.g., for CORRelation it uses all cases for which each pair of variables is complete (pairwise deletion of missing data), and therefore, for a data set like yours, the numbers of cases (as well as the particular set of cases) used for each correlation coefficient are possibly different; whereas for REGRession, where any of the variables named on the REGRession command is missing, the case is deleted (listwise deletion). Whether it is even useful to construct a subset of the data for which all variables are non-missing depends on how badly infected the variables are with missing data, and on whether the missing data occur in (useful?) patterns. If you have about 10% missing in each column, unsystematically spread through the set of columns, you could end up with a subset containing zero cases. To answer your question, however, on the (possibly unjustified) assumption that it's a useful thing to do:
   COPY c1-c35 to c41-c75;   # Always retain the original data
   OMIT c1 = '*';
   OMIT c2 = '*';
   . . . ;
   OMIT c35 = '*'.
There is probably a limit on the number of subcommands that MINITAB can handle (or on the number of OMIT subcommands that COPY can handle), but I don't know offhand what it is. (It is also imaginable that the OMIT subcommand permits naming more than one column, which would greatly simplify things, but I am inclined to suspect not.) If 35 subcommands are too many, proceed in batches of, say, 10 (or whatever): copy c1-c35 to c41-c75, omitting '*' in c1-c10; then copy c41-c75 to c81-c115, omitting '*' in c51-c60; then copy c81-c115 back to c41-c75, omitting '*' in c101-c110; then copy c41-c75 to c81-c115, omitting '*' in c71-c75. Finally, to check that no missing values have been retained, count the number of missing values in that set of columns:
   NMISS c81
   NMISS c82
   . . .
   NMISS c115
To avoid having to inspect the result for each column, store the NMISSes in 35 constants:
   NMISS c81 k1
   NMISS c82 k2
   . . .
   NMISS c115 k35
copy them into an unused column somewhere (e.g., c116):
   COPY k1-k35 c116
and verify that they're all zero by
   SSQ c116
which will return 0 iff all values in the column are 0. An easier way of verifying that there are no missing values in c81-c115 is to call for the INFO window (or give the INFO command: INFO c81-c115 ), which will report, inter alia, the number of missing values in each column. (I prefer the command in this situation, to avoid being confused by information about columns not relevant to the question.)
On Fri, 28 Sep 2001, John Spitzer wrote: I have a dataset which has about 35 columns. Many of the cells have missing values. Since MINITAB recognizes the missing values, I can perform the statistical work I need to do and don't need to worry about the missing values.
Perhaps you don't need to, but you probably should.
 However, I would like to be able to obtain the subset of observations which MINITAB used for its calculations.
As remarked above, this subset may vary from one pair of columns to another, or from one list of columns to another, depending on the calculations being performed. Yes, you definitely should worry about the missing values.
 I would like to be able to create a worksheet with only the rows from my dataset which do NOT contain any missing values.
Which may or may not correspond to any particular subset of the data that MINITAB defined for its work. snip, hypothetical example
Re: E as a % of a standard deviation
On Fri, 28 Sep 2001, John Jackson wrote in part: My formula is a rearrangement of the confidence interval formula shown below for ascertaining the maximum error. E = Z(a/2) x SD/SQRT(N). The issue is you want to solve for N, but you have no standard deviation value.
Oh, but you do. In the problem you formulated, unless I misunderstood egregiously, you are seeking to estimate the proportion of defective (or pirated, or whatever) CDs in a universe of 10,000 CDs. There is then a maximum value for the SD of a proportion: SD = SQRT[p(1-p)/n], where p is the proportion in question and n is the sample size. This value is maximized for p = 0.5 (and it doesn't change much between p = 0.3 and p = 0.7). If you have a guess as to the value of p, you can get a smaller value of SD, but using p = 0.5 will give you a conservative estimate. You then have to figure out what that 5% error means: it might mean +/- 0.05 on the estimated proportion p (but this is probably not a useful error bound if, say, p = 0.03), or it might mean 5% of the estimated proportion (which would mean +/- 0.0015 if p = 0.03). (In the latter case, E is a function of p, so the formula for n can be solved without using a guesstimated value for p until the last step.) Notice that throughout this analysis you're using the normal distribution as an approximation to the binomial b(n,p;k) distribution that presumably really applies. That's probably reasonable; but the approximation may be quite lousy if p is very close to 0 (or 1). The thing is, of course, that if there is NO pirating of the CDs, p = 0, and this is a desirable state of affairs from your clients' perspective. So you might want to be in the business of expressing the minimum p that you could expect to detect with, say, 80% probability, using the sample size eventually chosen: that is, to report a power analysis. The formula then translates into n = (Z(a/2)*SD/E)^2. Note: ^2 stands for squared.
 You have only the confidence interval, let's say 95%, and E of 1%. Let's say that you want to find out how many people in the US have fake driver's licenses using these numbers. How large (N) must your sample be?
Again, you're essentially trying to estimate a proportion. (If it is the number of instances that is of interest, the distribution is still inherently binomial; but instead of p you're estimating np, with SD = SQRT[np(1-p)], and you still have to decide whether that 1% means +/- 0.01 on the proportion p or 1% of the value of np.) -- DFB.
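A small sketch of the point that p = 0.5 is the conservative choice (n = 1000 here is an arbitrary illustrative sample size):

```python
from math import sqrt

# The standard error sqrt(p(1-p)/n) is maximized at p = 0.5 and
# changes relatively little for p between 0.3 and 0.7.
def se_proportion(p, n):
    return sqrt(p * (1 - p) / n)

n = 1000
se_half = se_proportion(0.5, n)   # conservative (largest) value
se_03 = se_proportion(0.3, n)     # noticeably smaller p, similar SE
```

Using p = 0.5 therefore never understates the required sample size.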
Re: One more time--Two Factor Kruskal-Wallis
Hi, Carol. I'm taking the liberty of posting this to the Edstat (statistical education) list as well as the Minitab list.
On Fri, 21 Sep 2001, Carol DiGiorgio wrote: My question is: I would like to run 2-way ANOVA on my data. Unfortunately it doesn't meet the assumptions of normality or homogeneity of variance. I've worked with the data to find a transformation, but have been unable to find one.
1. Which assumption of normality? The only one that comes close to being _required_ is the assumption that the _residuals_ from the model are normally distributed. (I ask, because it seems often to be believed that the raw variable itself, infected by possible effects of the design factors, should be normally distributed; this is not the case.)
2. How badly unequal are your cell variances? Unless they vary by at least an order of magnitude, unequal variances won't much affect your conclusions; and if your cell n's are equal (or, if not, if the cells with the larger variances have the larger n's), the size of the test (that is, the empirical P-level) will be not far from the nominal value.
3. Unequal variances will affect the sensitivity of post hoc comparisons, however.
 I want to run a non-parametric 2-way ANOVA using Minitab, and determine whether the factors or the interaction are significant (I'm guessing a 2 factor Kruskal-Wallis, but I don't know what tests exist). If any of the factors were significant I would like to run a non-parametric multiple comparison test to determine where there are significant differences. Is it possible to do this in Minitab (or any other statistical program)?
If I were doing it, I'd run an ordinary two-way ANOVA, using either TWOWAY or ANOVA; or, if the design were unbalanced, using GLM (since neither TWOWAY nor ANOVA will handle unbalanced data). Then inspect the pattern(s) among the means, probably displaying them graphically, with an eye toward possible useful interpretations.
If I were really concerned that the unequal variances might represent something real in the population of interest (rather than an inconvenience of sampling, in this particular sample), I'd convert the dependent variable to ranks (in another column of the worksheet!) and repeat the two-way analysis on the ranks. This would give you the equivalent of a two-way Kruskal-Wallis, or a Friedman, test. You haven't described your data well enough for me to tell whether a Friedman test is appropriate (see FRIEDMAN in the MINITAB Reference Manual). If it is not, you can ALWAYS simulate a two-way analysis in the framework of a one-way analysis by identifying each cell separately: e.g., a 3x4 two-way ANOVA can be analyzed as a one-way ANOVA with 12 levels. (This would apply to KRUSKAL-WALLIS (q.v.) as well as to ONEWAY.) You just have to be clever, afterwards, in defining the particular contrasts (or sets of contrasts) that identify what a two-way analysis would have reported as main effects and interactions -- but, again, that's just a matter of displaying the cell means (or medians) in the form of a two-way layout.
 Thank you in advance. Carol
HTH. -- DFB.
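The rank-conversion step suggested above can be sketched in Python (an illustration of midranking only; the two-way ANOVA itself would still be run in Minitab or elsewhere, and the function name is mine):

```python
# Replace the dependent variable by its ranks (midranks for ties),
# then run the usual two-way ANOVA on the ranked column.
def midranks(values):
    """Return the rank of each value, averaging ranks over ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # extend j over the block of values tied with values[order[i]]
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1          # midrank for the tied block
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

y = [3.2, 5.1, 3.2, 7.4]
r = midranks(y)   # the ANOVA would then be run on r instead of y
```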
RE: effect size/significance
On Thu, 13 Sep 2001, Paul R. Swank wrote in part: Dennis said other than being able to say that the experimental group ... ON AVERAGE ... had a mean that was about 1.11 times (control group sd units) larger than the control group mean, which is purely DESCRIPTIVE ... what can you say that is important? However, can you say even that unless it is ratio scale?
Yes, well, Dennis was referring to a _difference_. When the underlying scale is interval, differences ARE ratio scale: zero means zero. -- Don.
Re: Definitions of Likert scale, Likert item, etc.
On Sat, 8 Sep 2001, Magenta wrote in part: (responding to Rich Ulrich's remark:) Michelle, I hope that you now know that you got tangled up in hypothetical illustrations which you now regret.
 Sure do. I think that if you redid it so that the scale was now:
    don't agree |__________| strongly agree
 that would give you a ratio scale between no agreement and strong agreement.
Well, in SOME circumstances, perhaps it might; but I don't see a persuasive rationale for it WOULD give you a ratio scale [emphasis, obviously, added].
 You would then be able to use, e.g. ANOVA, on your test results, which would be numeric in millimeters.
Or other units of length -- sixteenth-inches, micro-furlongs, etc. But really, you don't need a ratio scale for ANOVA, you know. At most you need an interval scale, and even then approximately (that is, approximately interval) works very well much of the time.
RE: Boston Globe: MCAS results show weakness in teens' grasp of
On Tue, 28 Aug 2001, Dennis Roberts wrote in part: however ... the flagging of outliers is totally arbitrary ... i see no rationale for saying that if a data point is 1.5 IQRs away from some point ... that there is something significant about that
If the data are normally distributed (or even approximately so, what seems to be called empirically distributed these days), the 3rd quartile + 1.5 IQR locates a point about 2.7 std. devs. above the mean (the IQR is about 1.35 SD, so the fence lies 1.5 x 1.35, about 2.0 SD, beyond a quartile that is itself about 0.67 SD from the mean); symmetrically, the 1st quartile minus 1.5 IQR gets you about 2.7 SDs below the mean. These fences enclose roughly the central 99.3% of a normal distribution, so only about 0.7% of well-behaved data get flagged -- rare enough that flagged points deserve a second look. This was, I believe, the underlying rationale for Tukey's choice of the region box +/- 1.5 IQR as a rule-of-thumb (or convention) for initial identification of potential outliers. On the question of whether the whiskers of a box-and-whisker plot should be made to cease at box +/- 1.5 IQR, note that some current undergraduate textbooks distinguish between a quick boxplot, which shows the range but not outliers, and a full boxplot, which uses the box +/- 1.5 IQR rule. (Of course, if there are no outliers -- by that definition -- the two are identical.)
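The figures for normal data can be verified numerically (a sketch using Python's standard normal; nothing here is specific to any dataset):

```python
from statistics import NormalDist

# For a standard normal, the 1.5-IQR fences sit about 2.70 standard
# deviations from the mean and flag roughly 0.7% of observations.
nd = NormalDist()                 # standard normal: mean 0, sd 1
q1 = nd.inv_cdf(0.25)
q3 = nd.inv_cdf(0.75)
iqr = q3 - q1                     # about 1.349 SD
upper_fence = q3 + 1.5 * iqr      # about 2.698 SD above the mean
frac_flagged = 2 * (1 - nd.cdf(upper_fence))   # both tails, by symmetry
```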
Re: a problem.
On Sun, 26 Aug 2001 [EMAIL PROTECTED] wrote: I have trouble solving this probability problem; I hope to get help here. There are N balls. Pick M1 balls with replacement from them. What is the expected value of different balls we pick up?
Expected value of what characteristic of the balls?
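If (one plausible reading of the question) what is wanted is the expected _number of distinct balls_ seen in M1 draws with replacement from N balls, linearity of expectation gives a closed form; a hedged sketch, since the poster's intent is not certain:

```python
# Hypothetical reading: expected number of DISTINCT balls observed in
# m draws with replacement from n balls. Each ball is seen with
# probability 1 - (1 - 1/n)^m, so by linearity of expectation
#   E[distinct] = n * (1 - (1 - 1/n)^m).
def expected_distinct(n, m):
    return n * (1 - (1 - 1 / n) ** m)
```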
Re: adjusted r-square
On 21 Aug 2001, Atul wrote: How do we calculate the adjusted r-square when the error degrees of freedom are zero? (Or, in other words, the number of samples is equal to the number of regression terms including the constant.) Such a situation leads to a zero in the denominator in the expression for calculating adjusted r-square.
Depends in part on the expression you use, but in any case you also get a zero in the numerator. Cf. Draper & Smith, Eq. (2.6.11b): the right-hand expression indeed contains (n-p) in the denominator, but it also includes (1-R^2) in the numerator, which produces the indeterminate quotient 0/0. In the middle expression of that equation, the quotient (residual SS)/(n-p) appears, which is also 0/0. All of which only emphasizes that the result of any analysis for which the error d.f. = 0 is meaningless: whether r-square, or regression coefficients, or error mean square, ... . Statistical conclusions cannot, in general, be drawn from such an analysis.
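For concreteness, a sketch of the usual adjusted-R^2 expression and its degenerate case (the function and example values are mine, not from the thread):

```python
# Standard adjusted R^2:  1 - (1 - R^2) * (n - 1) / (n - p),
# where p counts regression terms including the constant.
# When n == p the fit is exact, R^2 = 1, and the expression is the
# indeterminate form 0/0, as discussed above.
def adjusted_r2(r2, n, p):
    if n == p:
        raise ZeroDivisionError("zero error degrees of freedom: 0/0")
    return 1 - (1 - r2) * (n - 1) / (n - p)

value = adjusted_r2(0.90, n=20, p=3)   # a well-defined case
```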
Re: large N, categorical outcomes, significance?
One approach: (I assume that by residual you mean (O-E)/sqrt(E) for each cell of a two-way frequency table, where O=observed frequency and E=expected frequency under the null hypothesis). For the several (or the single) largest residual(s), report O and E as proportions (of total N). Express the residual in terms of proportions, which will turn out to include N (or its square root) as a factor. Show that the residual can be whatever it was (105.6, say) only if N is as large as it is in your dataset, and that the same proportions for some smaller (more reasonable?) N would _not_ produce a significant residual. For purposes of this exercise, you could express the total chi-square in terms of proportions and N, and show that for the observed proportions only values of N larger than some value would produce a significant result; or you could take, for any single cell, a critical value for chi-square with one d.f. (One could argue for d.f. = (r-1)(c-1)/(rc), since the table has rc cells but only (r-1)(c-1) d.f., but 1 d.f. is arguably conservative, and finding critical values for fractional d.f. may be difficult.)
On 17 Aug 2001, JDriscoll wrote: I have a large dataset (N can be 2,000-9,000) with mostly categorical outcome variables. Any chi square is significant with residuals of 100+ for tiny differences. I know one can determine effect size for continuous variables and show result is sign only due to size of the N, but...how do I do this for categorical outcome variables? Thanks!
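The scaling argument can be sketched directly: writing the Pearson statistic in terms of proportions makes N an explicit multiplicative factor (the proportions below are illustrative, not from the poster's data):

```python
# For fixed observed/expected proportions, Pearson chi-square is
#   chi2 = N * sum((p_obs - p_exp)^2 / p_exp),
# so it grows linearly with total N: a tiny departure becomes
# "significant" once N is large enough.
def chi_square(p_obs, p_exp, n):
    return n * sum((po - pe) ** 2 / pe for po, pe in zip(p_obs, p_exp))

p_obs = [0.52, 0.48]       # observed cell proportions (made up)
p_exp = [0.50, 0.50]       # proportions expected under H0
small = chi_square(p_obs, p_exp, 100)      # far below 3.84 (1 d.f., 5%)
large = chi_square(p_obs, p_exp, 10000)    # far above it
```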
Re: Presenting results of categorical data?
On 14 Aug 2001, Nolan Madson wrote: I have a data set of answers to questions on employee performance. The answers available are: Exceeded Expectations Met Expectations Did Not Meet Expectations The answers can be assigned weights [that is, scores -- DFB] of 3,2,1 (Exceeded, Met, Did Not Meet). Our client wants to see the results averaged, so, for example, we see that all employees in all Ohio offices for the year 2001 have an average performance rating of 1.75 while all employees in all Illinois offices have an average performance rating of 2.28. One of my colleagues says that it is not valid to average categorical data such as this. His contention is that the only valid form of representation is to say that 75% of all respondents ranked Ohio employees as having Met Expectations or Exceeded Expectations. Your colleague is correct about categorical data. It is not clear whether he be correct about data such as this. Your responses are clearly at least ordinal (in the order you gave them, from most effective to least effective). The question is whether the differences between adjacent values are both approximately equal: that is, whether Exceeded Expectations is roughly the same distance (in some conceptual sense) from Met Expectations as Did Not Meet Expectations is. (And whether this be the case for all the variables in question.) These are difficult questions to argue in the abstract, either on theoretical or empirical grounds -- although for empirical data you could always carry out a scaling analysis and see if the scale values thus derived are approximately equidistant. Probably more important than arguing about whether your data are only nominal (i.e., categorical), or only ordinal or of interval quality is, what do your clients (and/or the publics to whom they report) understand of various styles of reportage? 
I suspect that some folks would be much happier with 75% of respondents in Ohio met or exceeded expectations, while only 60% of respondents in Illinois did so, together with a statement that the difference is significant (or not), than with a statement like all employees in all Ohio offices ... had an average performance rating of 1.75 while all employees in all Illinois offices had an average performance rating of 2.28, also with a statement about the statistical value of the distinction. OTOH, some people prefer the latter. No good reason not to report in both styles, in fact.
 Can anyone comment on the validity of using averages to report on categorical data?
Well, now, as the question is put, the answer is (of course!) that averages are NOT valid for categorical data (unless the categories are at least ordinal and more or less equally spaced). But that begs the question of whether categorical data be an adequate description of YOUR data. I'd judge it is not: it appears to be at least ordinal. The question whether it be also interval, at least approximately, depends on the internal representations your respondents made of the questions and the possible responses, which is a little hard to find out at this point. However, if (as is often the case) the response medium depicted the three possible responses on a linear dimension and at equal intervals, it's a reasonably good bet that most of your respondents internalized that dimension accordingly.
 Or point me to reference sources which would help clarify the issue? -- Nolan Madson
I doubt that references would help much in dealing with the facts of the matter, although they might provide you some information and help you to sound more erudite to your clients... This is essentially a measurement issue, so appropriate places to look are in textbooks on educational or psychological measurement. -- DFB.
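The two reporting styles can be computed side by side from hypothetical ratings (all numbers below are made up for illustration):

```python
# Ratings scored 3 = Exceeded, 2 = Met, 1 = Did Not Meet Expectations.
ratings = [3, 2, 2, 1, 3, 2, 2, 2]          # invented responses

# Style 1: the mean score (defensible if the scale is roughly interval)
mean_score = sum(ratings) / len(ratings)

# Style 2: percent "Met or Exceeded" (needs only the ordinal property)
pct_met_or_exceeded = 100 * sum(r >= 2 for r in ratings) / len(ratings)
```

As the post suggests, nothing prevents reporting both.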
Re: sampling/subsample query
Some clarification would help. See below. On Wed, 1 Aug 2001, Teen Assessment Project wrote: I have an overall sample of 5000+ from 40+ different towns and 6 different grades.
In approximately equal numbers per town/grade, or not? Are all 6 grades (which grades?) represented in each town? Do they always coexist within schools, or are they divided (e.g., between junior-high schools and high schools)? (Etc.) How were these cases sampled from the population? (And possibly relevant: how large is the population?) [Bluntly: How do you know the overall sample is worth using as a standard of comparison, as appears to be desired?]
 One person wants to look at a subsample of 200 from specific towns and grades and compare this subsample with the rest of the group on outcome variables. Advice appreciated here.
Why 200? (Arbitrary round number? Result of power calculation? Maximum size dictated by constraints you haven't bothered to mention? Outcome of consulting the entrails of a NH Red chicken?) Why not _all_ the respondents from the specified towns?
 The only demographics are self-reported family structure and maternal/paternal education.
If you know the names of the towns (which seems to be implied by the description of the desired subsample), you also know the population of the towns and some (admittedly rather general) other demographic information: e.g., whether the school is located in a community (and what kind of community, e.g., Manchester HS West) or in a wilderness (Vox clamantis in deserto, as they say at Dartmouth, which would also be appropriate for John Stark Regional HS, in the wilds of western Weare). In the larger towns, do you also know the names of the schools containing the members of your sample? That might provide additional detail. (Or not, for city schools that draw students from outlying more rural or suburban areas.)
 Ideas: 1) I could try to match the demographics/grades of the selected 200 with 200/5000 other subjects.
Why would you wish to do that?
You write above, One person wants to look at a subsample of 200 ... and compare this subsample with the rest of the group on outcome variables. The matching you propose would seem, on the face of it, to invalidate any comparison between the groups to be compared.
 2) I could randomly select 200/5000 other subjects and test to see if there is a sign difference in the demographics.
[One presumes sign here is a contraction of significant, and does not (necessarily) imply a sign test.] True. This does not appear to be what One person wants to do, though, which is to compare (= test?) for differences in the outcome variables. Something's missing here. What does One person really want to do (or say s/he wants to do), when not constrained to speak in a kind of pseudo-statistish language? What theory informs the intent of the proposed study (or, if no theory, what kinds of practical decisions might it reasonably be expected to lead to)?
 3)?? 4)??
Alternative sampling procedures aren't useful to contemplate in the absence of design or purpose information.
 Outcome variables are all categorical --
By this do you mean that they are all of yes/no or true/false form (or equivalent)? Or are some of them a choice of one from among several named categories? Or multiple choices among multiple categories? Are any of these sets of ordered categories (such as one might elicit from Likert-type items)? Do the variables come in sets (or dimensions) that lend themselves to any kind of summary scoring? (E.g., total # of categories of this kind that are true or yes (or whatever) for this particular case.) Ought you to be doing some sort of scaling analysis on the categories, to produce interval-level scaled variables? (Search on dual scaling and correspondence analysis.)
 assuming chi-squares testing here.
There are a variety of kinds of chi-square tests.
If you are (as one suspects) referring to two-dimensional cross-classification tables, and testing the independence of classifications, this is of course possible. It may not be optimal: depending in large part on what the _real_ questions are that One person wants to address, and on the nature(s) of the variables of interest. Scaling of the category systems would yield variables you could subject to various linear models -- multiple regression, analysis of variance/covariance, Hotelling's T-square, etc.
Re: log
On 31 Jul 2001, ToM wrote: what is the opposite of a log? [logarithm]
An antilog [properly, antilogarithm]. Equivalently, 10 to that power (if, as in your example, you are taking logarithms to the base 10); or e to that power (if you are taking natural logarithms), which is also called the exponential function, exp( ).
 If you do lg10 of 3 in spss, it gives you a number. how can i take this number and have as a solution the initial one (3)?
lg10(3) = 0.47712. In SPSS-speak, 10**(0.47712) = 2.99999, i.e. 3 to within the rounding of the logarithm. For natural logarithms, ln(3) = 1.0986, and exp(1.0986) = 2.99996, again 3 to within rounding.
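The round trip can be sketched in Python (equivalent to the SPSS expressions above):

```python
from math import exp, log, log10

# A logarithm is undone by its antilog: raise the base to that power.
x = 3.0
common = log10(x)        # 0.47712..., the base-10 logarithm
natural = log(x)         # 1.09861..., the natural logarithm

back_common = 10 ** common    # antilog base 10: recovers 3
back_natural = exp(natural)   # antilog base e:  recovers 3
```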
Re: web page to help use normal table
Use the table twice -- for P(0 <= Z <= z1) and P(0 <= Z <= z2) -- and then subtract or add, depending on whether the desired signs of z1 and z2 are the same or different. -- DFB.
On Sat, 28 Jul 2001, Cantor wrote: I did not try to examine your work thoroughly, but at the very beginning I tried to compute P(z1 <= Z <= z2), and only z1 can be changed. What about z2?
in response to EAKIN MARK E [EMAIL PROTECTED], who had written: I have just finished creating an ASP web page that will help students use a normal table that gives probabilities for ranges of the standard normal that start at 0 up to a Z value. If you wish to try it, go to http://www2.uta.edu/eakin/busa3321/normaltable/p2.asp
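The use-the-table-twice rule can be sketched as follows (the `table` function stands in for the printed one-sided table; `NormalDist` supplies the true CDF):

```python
from statistics import NormalDist

phi = NormalDist().cdf              # standard normal CDF

def table(z):
    """P(0 <= Z <= z) for z >= 0, as a one-sided normal table gives it."""
    return phi(z) - 0.5

# z1, z2 with the same sign: SUBTRACT the two table entries.
p_same = table(2.0) - table(1.0)        # P(1 <= Z <= 2)

# z1, z2 with opposite signs: ADD the entries (table of |z1| and z2).
p_diff = table(1.0) + table(2.0)        # P(-1 <= Z <= 2)
```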
Re: confidence interval
If you don't happen to have a convenient r -- Z conversion table handy, it may be helpful to know, for step 1. below, that Z = 0.5 log((1+r)/(1-r)) or, equivalently, Z = tanh^(-1)(r) = the hyperbolic arctangent of r. (log is the natural logarithm.) It follows that, given a value of Z (for step 4.), r = (exp(2Z)-1)/(exp(2Z)+1), where exp(2Z) is e to the power 2Z, or, equivalently, r = tanh(Z) = the hyperbolic tangent of Z. The standard error of Z is 1/sqrt(n-3) (for step 2.), where sqrt(n-3) is the square root of (n-3).
On Sat, 28 Jul 2001, dennis roberts wrote: one way is:
 1. convert sample r to Fisher's BIG Z (consult conversion table)
 2. find standard error of Fisher's Z ... (find formula in good stat book)
 3. for 95% CI ... go 1.96 standard error (from #2) units on either side of Z (from #1)
 4. convert EACH end of the CI in Fisher Z units back to r values (use table from #1 in reverse)
At 05:28 AM 10/22/99 -0200, Alexandre Moura wrote: how can I construct a confidence interval for a Pearson correlation? Thanks in advance. Alexandre Moura.
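The four steps can be sketched in Python (r = 0.60 and n = 50 are made-up illustrative values):

```python
from math import atanh, sqrt, tanh

# 95% confidence interval for a Pearson r via Fisher's Z transform.
r, n = 0.60, 50

z = atanh(r)                   # 1. Fisher's Z = (1/2) log((1+r)/(1-r))
se = 1 / sqrt(n - 3)           # 2. standard error of Z
lo = z - 1.96 * se             # 3. CI on the Z scale...
hi = z + 1.96 * se
ci = (tanh(lo), tanh(hi))      # 4. ...back-transform each end to r
```

Note the interval is not symmetric about r on the r scale, which is exactly why the transform is used.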
Re: need help with SAS
On Fri, 27 Jul 2001, Nadine Wells wrote in part: Does anyone know what the power link function does in SAS? [...] when I plot the equation based on the parameter estimates, the model doesn't seem to look like I want it to. [...] I am trying to get SAS to run a model that resembles exponential growth. Any suggestions would be greatly appreciated.
Nadine, if you want to model exponential growth, why are you trying to use a power function? For simple exponential growth, Y = a e^(bT), where T is time (years, in your case?) and Y is the variable whose growth is modelled. Then log(Y) = log(a) + bT, which is a simple linear regression. It is easy to show that the doubling time is log(2)/b (all logs are natural logs, of course). It then remains to adapt the model to include your habitat complexity. How best to do this is not clear to me; but at least you could start by breaking that variable into (probably ordered?) categories, and try fitting a separate exponential function for each such category (like an ANOVA), perhaps subject to one or more constraints (e.g., a common value of b for all habitats -- that analysis would resemble, formally, an analysis of covariance, but you might well prefer to model it in multiple regression terms). Hope this helps, some. -- Don.
Nadine's complete post: Does anyone know what the power link function does in SAS? I have to provide a parameter estimate in parentheses after the link=power command. I've been using -1 but when I plot the equation based on the parameter estimates, the model doesn't seem to look like I want it to. Does anyone know exactly what the power link function does? More specifically, I am trying to get SAS to run a model based on proportion data. That is, my dependent variable is a proportion (# of beach seine hauls that catch fish over total # of beach seines hauled), my explanatory variable is a measure of habitat complexity. I am also using year as a categorical variable.
I am trying to get SAS to run a model that resembles exponential growth. Any suggestions would be greatly appreciated. Nadine
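Since nothing in the log-linear trick described above depends on SAS specifically, here is a minimal Python sketch of it on made-up data; the function name and data are mine, not SAS's:

```python
import math

def fit_exponential(t, y):
    """Least-squares fit of log(y) = log(a) + b*t, i.e. y = a*e^(b*t).
    Returns (a, b, doubling_time), with doubling_time = log(2)/b as above."""
    n = len(t)
    ly = [math.log(v) for v in y]
    tbar, lbar = sum(t) / n, sum(ly) / n
    sxx = sum((ti - tbar) ** 2 for ti in t)
    sxy = sum((ti - tbar) * (li - lbar) for ti, li in zip(t, ly))
    b = sxy / sxx                      # slope of the log-linear regression
    a = math.exp(lbar - b * tbar)      # back-transform the intercept
    return a, b, math.log(2) / b
```

Fitting a separate (a, b) per habitat-complexity category, as suggested above, is just a matter of calling this once per category.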
Re: vote counting
The answers to your questions depend heavily on structural information that you almost certainly don't have, else one would not bother to have arranged a voting process. But consider two very different cases: A. Voters are absolutely indifferent to candidates: that is, all the candidates are equally attractive, or equally preferred by the voters. Then the identity of the candidate with the most votes is purely random, and the probability that the counted top N will correspond to the real top N will be very low indeed (in part because there IS no real top N; but even in the sense that another vote taken tomorrow would be very unlikely to reproduce the same set of top N, let alone in the same order). B. Some candidates are strongly preferred to others (by the voters as a whole, that is, as a population), and exactly N such candidates are so preferred. About the rest the voters are indifferent, on the whole. In these circumstances, one would expect a large difference between the number of votes cast for the least of the N and the number of votes cast for the greatest of the remaining candidates, and the probability that the counted top N will correspond to the real top N would be rather high (depending in part on how large 'p' is). I do not see how to estimate such a probability in the absence of any information about the distribution of preferences. I've assumed that by counting votes you mean that each voter casts exactly one ballot for (at most?) one candidate. For other voting schemes (e.g., vote for K candidates, K .LE. N, and specify one's preferences among them by assigning each candidate a preference from 1 (most favored) to K (least favored)) it is imaginable that answers to your questions might not differ, but showing that to be the case (or not) is another matter entirely. It also occurs to me that a single probability 'p' of error in voting must be a global average and is an oversimplification almost certainly. 
In case A above, the results of an election might be dominated by voters whose personal 'p' is large; although, again, it is not clear to me how one might show such a thing formally. -- DFB. On Wed, 25 Jul 2001, Sanford Lefkowitz wrote: In a certain process, there are millions of people voting for thousands of candidates. The top N will be declared winners. But the counting process is flawed and with probability 'p', a vote will be miscounted. (it might be counted for the wrong candidate or it might be counted for a non-existent candidate.) The latter would constitute a spoiled ballot, or not? What is the probability that the counted top N will correspond to the real top N? (there are actually two cases here: 1 where I want the order of the top N to be in the correct order and the other where I don't care if the order is correct) Thanks for any ideas, Sanford Lefkowitz
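No closed form is offered above, but the question invites simulation. A hedged Monte Carlo sketch (the vote totals and the uniform-miscount rule below are illustrative assumptions of mine, not part of Sanford's problem statement):

```python
import random

def top_n_preserved(true_votes, p, n_top, trials=2000, seed=1):
    """true_votes[i] = ballots truly cast for candidate i.  Each ballot is
    independently miscounted with probability p, landing on a uniformly
    random candidate (possibly even the right one).  Returns the estimated
    probability that the counted top-n SET equals the true top-n set
    (order ignored -- Sanford's second case)."""
    rng = random.Random(seed)
    k = len(true_votes)
    true_top = set(sorted(range(k), key=lambda i: -true_votes[i])[:n_top])
    hits = 0
    for _ in range(trials):
        counted = [0] * k
        for i, v in enumerate(true_votes):
            for _ in range(v):
                if rng.random() < p:
                    counted[rng.randrange(k)] += 1  # miscounted somewhere
                else:
                    counted[i] += 1
        top = set(sorted(range(k), key=lambda j: -counted[j])[:n_top])
        hits += top == true_top
    return hits / trials
```

With well-separated support (case B above) the preserved-probability is near 1; with near-ties (case A) it collapses, exactly as argued.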
Re: SRSes
Hi, Dennis! Yes, as you point out, most elementary textbooks treat only SRS types of samples. But while (as you also point out) some more realistic sampling methods entail larger sampling variance than SRS, some of them have _smaller_ variance -- notably, stratified designs when the strata differ between themselves on the quantity being measured. On Tue, 24 Jul 2001, Dennis Roberts wrote: most books talk about inferential statistics ... particularly those where you take a sample ... find some statistic ... estimate some error term ... then build a CI or test some null hypothesis ... error in these cases is always assumed to be based on taking AT LEAST a simple random sample ... or SRS as some books like to say ... but, we KNOW that most samples are drawn in a way that is WORSE than SRS I don't think _I_ know this. I know that SOME samples are so drawn; but (see above) I also know that SOME samples are drawn in a way that is BETTER than SRS (where I assume by worse you meant with larger sampling variance, so by better I mean with smaller sampling variance). thus, essentially every CI ... is too narrow ... or, every test statistic ... t or F or whatever ... has a p value that is too LOW what adjustment do we make for this basic problem? I perceive the basic problem as the fact that sampling variance is (relatively) easily calculated for a SRS, while it is more difficult to calculate under almost _any_ other type of sampling. Whether it is enough more difficult that one would REALLY like to avoid it in an elementary course is a judgement call; but for the less quantitatively-oriented students with whom many of us have to deal, we _would_ often like to avoid those complications. 
Certainly dealing with the completely _general_ case is beyond the scope of a first course, so it's just a matter of deciding how many, and which, specific types of cases one is willing to shoehorn into the semester (and what previews of coming attractions one wishes to allude to in higher-level courses). Seems to me the most sensible adjustment (and of a type we make at least implicitly in a lot of other areas too) is = to acknowledge that the calculations for SRS are presented (a) for a somewhat unrealistic ideal kind of case, (b) to give the neophyte _some_ experience in playing this game, (c) to see how the variance depends (apart from the sampling scheme) on the sample size (and on the estimated value, if one is estimating proportions or percentages), (d) despite the fact that most real sampling is carried out under distinctly non-SRS conditions, and therefore entails variances for which SRS calculations may be quite awry; and = to have yet another situation for which one can point out that for actually DOING anything like this one should first consult a competent statistician (or, perhaps, _become_ one!). Some textbooks I have used (cf. Moore, Statistics: Concepts and Controversies (4th ed.), Table 1.1, page 40) present a table giving the margin of error for the Gallup poll sampling procedure, as a function of population percentage and sample size. Such a table permits one to show how Gallup's precision varies from what one would calculate for a SRS, thus providing some small emphasis for the cautionary tale one wishes to convey.
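A small simulation illustrates the point made above, that stratified sampling can beat SRS when the strata differ on the measured quantity. The population and the proportional allocation below are invented purely for illustration:

```python
import random, statistics

# Invented population: two equal-sized strata that differ strongly
# on the quantity being measured.
stratum_a = [10.0 + 0.1 * i for i in range(500)]
stratum_b = [60.0 + 0.1 * i for i in range(500)]
population = stratum_a + stratum_b

def srs_mean(rng, n=20):
    """Mean of a simple random sample of size n."""
    return statistics.mean(rng.sample(population, n))

def stratified_mean(rng, n=20):
    """Proportional allocation: n/2 drawn from each (equal-sized) stratum."""
    return statistics.mean(rng.sample(stratum_a, n // 2) +
                           rng.sample(stratum_b, n // 2))

def sampling_sd(sampler, trials=2000, seed=2):
    """Empirical standard deviation of the estimator over repeated samples."""
    rng = random.Random(seed)
    return statistics.pstdev([sampler(rng) for _ in range(trials)])
```

Here stratification eliminates the between-stratum component of the variance, so the stratified estimator's sampling s.d. comes out well below the SRS one.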
Re: Multiple measurements
Hi, Ivan. I think your problem may not be so simple as you've described it. But to begin with the simplest: In terms of area in mm^2, simply multiplying length x width, all of the ultrasound (US) samples except one have smaller areas than any of the high-speed drill (AR) samples; 6 of the 10 AR samples have larger areas than the largest US sample, and if that sample were ignored (3.5 x 2.2 = 7.70 mm^2) ALL of the AR samples have larger areas than the remaining nine US samples. This pattern would be significant (p < .001) by Tukey's Compact test (1959). A similar pattern is true of the widths; the pattern for the lengths is less compelling, but would still be significant (p < .01). Similar results would be expected from the parametric methods you mention in your message (quoted below). But do you really desire to compare AR with US only on the raw dimensions of the cavities? One could define some degree of departure from the nominal dimensions (2.0 mm x 3.0 mm), and one might even specify acceptable and unacceptable ranges of values for this measure. I do not know what would be unacceptable for this exercise. But when one is preparing a cavity in a person's tooth, the prepared cavity would be unacceptably small if some of the decayed matter remained in the tooth; and the cavity would be unacceptably large if so much of the tooth had been removed that what remained was too weak to hold the dental filling. You might also ask how far each prepared cavity departed from the intended rectangular shape. But this may not be a realistic question. (I've had dentists working on my teeth since about 1945, and I think that _none_ of the fillings they prepared were rectangular in shape!) In quoting your original message below, I have taken the liberty of supplying corrected English, in [square brackets]. On Fri, 20 Jul 2001, Ivan Balducci wrote: Dear members, I am an engineering brazilian. My job is to help researches in Dental [ I am a Brazilian engineer.
My job is to help researchers ... ] School about Statistics. My doubt is... [ My concern is: ] How can I to comaparing two instruments: Ultra Som ...versus...Alta Rotação (High Sound High Rotation) [ How can I compare two instruments: Ultra Som versus Alta Rotação (ultrasound vs. high-speed drill) ] Theses instruments are used in Operative Dentistry to perform preparos cavitarios (cavity prepair) [(cavity preparation)] The shape of the prepair is rectangule [ The shape of the cavity is rectangular. ] Well... The situation is... The specificated area = 6mm2 (= 2mm x 3mm) [ ... The specified area = ... ] width = 2mm; length = 3mm Two samples... size sample is 10 (n = 10) for each instrument How can I aproach this problem? I can to do an Analysis Multivariate (T2 Hotteling) : instrument US x instrument AR ? [ I can do a multivariate analysis (Hotelling's T^2) ... ] Yes, this is possible. I can to do a IC (95%), or t-test, separately for each variable (width and length) and instrument ? [ I can do a confidence interval (CI), or t-test ... ] These are also possible. I can to compare the areas (width x length)...for instrument US against instrument AR ? [ I can compare the areas ... ] And so is this. Well... Which is the best, the correct way to approach a problem of this kind? Any of the ways mentioned above are possible and correct. It is not clear whether any of them is best, because it is not clear how best may usefully be defined. It is also not clear what the specific questions are that you really desire to address. I have tried to indicate some of the range of interesting questions that you might be interested in. Data: [In the data below, I think you have interchanged the labels width and length.]
US: width: 2.8 2.9 2.9 3.0 3.0 3.1 2.7 2.5 3.5 3.2 length: 1.9 2.0 1.9 1.9 2.0 2.0 2.0 1.9 2.2 2.0 AR: width: 3.2 3.3 3.5 3.2 3.5 3.6 3.5 3.7 4.1 3.4 length: 2.3 2.1 2.1 2.2 2.7 2.6 2.5 2.4 2.0 2.5 very thanks for the attention and sorry my english [ Thank you very much for your attention to my problem. ] (Alternatively, you could simply write TIA, for Thanks in advance.) -- Don.
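For what it is worth, Don's pattern can be checked directly from the data above. The Welch t on the areas is shown only as one of the possible analyses Ivan lists, not as a recommendation, and the list names follow Ivan's labels (which Don believes are interchanged):

```python
import math, statistics

# Ivan's data, under his own labels.
us_width  = [2.8, 2.9, 2.9, 3.0, 3.0, 3.1, 2.7, 2.5, 3.5, 3.2]
us_length = [1.9, 2.0, 1.9, 1.9, 2.0, 2.0, 2.0, 1.9, 2.2, 2.0]
ar_width  = [3.2, 3.3, 3.5, 3.2, 3.5, 3.6, 3.5, 3.7, 4.1, 3.4]
ar_length = [2.3, 2.1, 2.1, 2.2, 2.7, 2.6, 2.5, 2.4, 2.0, 2.5]

us_area = [w * l for w, l in zip(us_width, us_length)]
ar_area = [w * l for w, l in zip(ar_width, ar_length)]

def welch_t(x, y):
    """Two-sample t statistic with separate variances (Welch)."""
    vx, vy = statistics.variance(x), statistics.variance(y)
    return (statistics.mean(x) - statistics.mean(y)) / \
           math.sqrt(vx / len(x) + vy / len(y))
```

The largest US area is indeed the 3.5 x 2.2 = 7.70 mm^2 sample, all nine remaining US areas fall below every AR area, and exactly 6 of the 10 AR areas exceed 7.70, as Don states.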
Re: statistical similarity of two text
On Tue, 17 Jul 2001, Cantor wrote: Does anybody know where I can find program on the website which [can] compare two texts/articles and settle whether or not they are similar assuming any significant level. Sorry, Cantor: this is not possible, in general. One can discover whether two (or more) things _differ_ (on some quantitative measure) at a specified significance level (when this is a reasonable thing to do -- it isn't always reasonable), but the formal definition of significant in statistical analysis does not permit discovering whether two (or more) things are _similar_. However, it may suffice for your purposes to discover that two things are not different enough that you can tell them apart (which is not the same thing as discovering that they are the same), on whatever measure (or set of measures) you choose to analyze. Whether this be a useful outcome or not depends heavily on how much information you have (that is, on the size of the sample available) on the things being compared. In any case, the hard part is defining the characteristics, or properties, or measures, on which the two texts/articles are to be compared.
Re: Interpreting effect size.
On Sun, 15 Jul 2001, Melady Preece wrote: I have done a paired t-test on a measure of self-esteem before and after a six-week group intervention. There is a significant difference (in the right direction!) between the means using a paired t-test, p=.009. The effect size is .29 if I divide by the standard deviation of the pre-test mean, and .33 if I divide by the pooled standard deviation. This implies that the effect size would be larger than .33 if you were to divide by the s.d. of the post-test mean: which is evidently smaller (although probably not significantly so?) than the s.d. of the pre-test mean. But if you have paired pre/post values, you are essentially calculating the difference score (post minus pre), and constructing a t ratio using the s.d. of those differences. This would ordinarily be expected to be noticeably smaller than the s.d. of either pre-test or post-test means. Do you have a reason for not using _that_ s.d.? Question 1: Which is the correct standard deviation to use? Well, you have a choice of four: the s.d. of the pre-test mean, the s.d. of the post-test mean, the s.d. of the difference, and the pooled s.d. (resulting from pooling together the variances pre and post). The pooled s.d. would be (at least possibly) appropriate if you were performing a t-test for independent groups, but I cannot see how it could be thought suitable for paired differences (unless, perhaps, you and I mean different things by pooled s.d.). Of the other three, and in the absence of other considerations which may apply to your situation that you haven't told us about, I'd be inclined to report all three; unless circumstances (among the other considerations) led me to prefer one of them in particular. Using the pre-test s.d. may make it possible for your readers to estimate what differences they might expect to find, based on pre-test information, before getting to the post-test stage; this might be of value to some readers. 
Similar interpretations can be made of effect sizes calculated from the other s.d.s. I would also want to report the raw difference in means, if the raw scores are (as I assume to be the case) values that are more or less understood (e.g., number of right answers out of the number of items), since it provides something like a common-sensical measure... I'd also be interested (as a potential reader) in some summary information about the difference scores, like what proportion were negative... Question 2: Can an effect size of .29 (or .33) be considered clinically significant? Not enough information for me to tell. (And I just discovered my watch had stopped -- forgot to wind it this morning -- and am in danger of being late for today's next agendum. Good luck!) -- DFB.
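The four candidate denominators discussed above can be laid out explicitly. The paired scores below are invented solely to show how the choice of s.d. changes the effect size:

```python
import statistics

def effect_sizes(pre, post):
    """Mean change divided by each of the four candidate s.d.s:
    pre-test s.d., post-test s.d., pooled s.d., and s.d. of the differences."""
    diffs = [b - a for a, b in zip(pre, post)]
    md = statistics.mean(diffs)
    return {
        "pre":    md / statistics.stdev(pre),
        "post":   md / statistics.stdev(post),
        "pooled": md / ((statistics.variance(pre)
                         + statistics.variance(post)) / 2) ** 0.5,
        "diff":   md / statistics.stdev(diffs),
    }

# Invented pre/post scores for six... er, five subjects:
es = effect_sizes([10, 12, 14, 16, 18], [13, 14, 17, 18, 20])
```

With correlated pre/post scores, the difference-score s.d. is much the smallest, so the "diff" effect size is much the largest; and, as in Melady's data, a pooled s.d. smaller than the pre-test s.d. yields a larger effect size than the pre-test s.d. does.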
Re: EdStat: Triangular coordinates
On Tue, 10 Jul 2001, Alex Yu wrote: I am trying to understand Triangular coordinates -- a kind of graph which combines four dimensions into 2D You meant, condenses four dimensions into 3D, didn't you? Your subsequent description indicates three dimensions all together, two of them used to represent 3 variables: by joining three axes to form a triangle while the Y axis stands up. The Y axis can be hidden if the plot is depicted as a contour plot or a mosaic plot rather than a surface plot. I have a hard time to follow how a point is determined with the three axes as a triangle. There must be constraints on the values of the three variables. Commonly used for situations like a chemical mixture of 3 components. Each component can have a relative concentration between 0% and 100%, but if component A is at 100%, components B and C must both be at 0%, and the point (100%, 0%, 0%) falls at one apex of the triangle. The formal restriction, of course, is that the sum of all three concentrations equals 100%, so that there are really only two dimensions' worth of information available: (A, B, (100%-A-B)), (A, (100%-A-C), C), or ((100%-B-C), B, C). Since there is usually no reason to treat any component as more (or less) important than any other, triangular coordinates are often displayed on an equilateral triangle, and special graph paper can be purchased that has such a grid. In the absence of such paper, one can plot, say, A and B at right angles to each other and let the 45-degree line from (100,0) to (0,100) represent the C axis (and the upper boundary of the space of possible points). When there is not some such constraint on the values of the three variables, triangular coordinates don't make a whole lot of sense and may be extremely misleading. -- DFB.
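The sum-to-100% constraint makes the mapping to the plane concrete. A sketch that projects a three-component mixture onto an equilateral triangle; the vertex placement below is my own choice for illustration:

```python
import math

def ternary_to_xy(a, b, c):
    """Map a composition (a + b + c = 100) into the plane, using an
    equilateral triangle with vertices A=(0,0), B=(1,0), C=(0.5, sqrt(3)/2).
    A point is the barycentric combination of the vertices it is 'made of'."""
    if abs(a + b + c - 100.0) > 1e-9:
        raise ValueError("components must sum to 100%")
    fb, fc = b / 100.0, c / 100.0
    return fb + 0.5 * fc, fc * math.sqrt(3) / 2.0
```

Each pure component (100%, 0%, 0%) lands on its own apex, exactly as described above, and every admissible mixture lands inside the triangle.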
Re: SPSS
On Sat, 7 Jul 2001, David Schaefer wrote: My Stats professor is having us run some correlations and what not through SPSS. She has asked us to transform some raw scores to z-scores for a reading achievement test. The commands she has asked us to type in the syntax editor is: COMPUTE zread = (reading-52.23)/10.25. EXECUTE. The 52.23 and 10.25 are the mean and standard deviation of the data, respectively. Absolutely nothing happens when I highlight and Run these commands. What were you expecting to happen? If these are the only commands that you asked to be carried out, there would be no visible happening, because no output has been called for. There will have been a variable named zread created by the COMPUTE/EXECUTE sequence and stored in the active data file, but if you have not asked for output you won't get any. You might have asked, for example, for the mean(s) and standard deviation(s) of this new variable (and perhaps other extant variables); or for a correlation matrix among several variables, including this variable; or for a listing of the values of this variable (if the number of cases is not prohibitively large). Any slight alterations of them result in a variety of error messages. Yes, that sounds reasonable, since any alteration would probably result in misspelling one or more command name(s) or variable name(s).
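For readers without SPSS, the same point in Python: defining the standardized variable, like SPSS's COMPUTE/EXECUTE pair, prints nothing by itself; output appears only when you ask for some. The raw scores below are invented:

```python
import statistics

def compute_zread(reading, mean=52.23, sd=10.25):
    """Analogue of: COMPUTE zread = (reading-52.23)/10.25.  Creating the
    variable is silent, just as in SPSS."""
    return [(r - mean) / sd for r in reading]

scores = [41.98, 52.23, 62.48]   # invented raw reading scores
zread = compute_zread(scores)
# No output so far.  To see anything, request it explicitly, e.g.:
# print(statistics.mean(zread), statistics.stdev(zread))
```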
Re: Help with stats please
On Sun, 24 Jun 2001, Melady Preece wrote in part: I am teaching educational statistics for the first time, and although I can go on at length about complex statistical techniques, I find myself at a loss with this multiple choice question in my test bank. I understand why the range of (b) is smaller than (a) and (c), but I can't figure out how to prove that it is smaller than (d). 1. Which of the following classes had the smallest range in IQ scores? A) Class A has a mean IQ of 106 and a standard deviation of 11. B) Class B has an IQ range from 93 to 119. C) Class C has a mean IQ of 110 with a variance of 200. D) Class D has a median IQ of 100 with Q1 = 90 and Q3 = 110. The test bank says the answer is b. Right. Since you're happy that range(B) < range(A) and range(B) < range(C), I'll focus on (B) vs. (D). In (B), the entire _range_ is from 93 to 119: 26 (or 27, depending on how you choose to define range) points. In (D), the central half of the distribution is from 90 to 110: the interquartile range (IQR) is 20 points, symmetric about the median; the full range must therefore be greater than 20. Now, _if_ the distribution is normal (which may be what we were to assume from the allegation that these are IQ scores; although as Dennis has pointed out, ille non sequitur -- unless these are rather large classes AND NOT SELECTED BY I.Q. (or by any variable strongly related to I.Q.)), then 10 points from Q1 to median (or from median to Q3) represents 0.67 standard deviation, which implies a standard deviation of about 15, which is larger than the standard deviation in (A) and slightly larger than that in (C). However, we need not invoke the normal distribution. We observe that the distribution in (D) is at least approximately symmetric (insofar as the quartiles are equidistant from the median).
If we may assume also that the distribution is unimodal (which I should think reasonable), it then follows (from the tailing off of distributions as one approaches the extremes) that the distance from minimum to Q1 (and the distance from Q3 to maximum) is greater than the distance from Q1 to median (or median to Q3). This implies that the range of the distribution exceeds twice the interquartile range: that is, range(D) > 2*20 = 40. Since the range in (B) is only 26, clearly the range of (B) is less than the range of (D). If any part of this argument remains unclear, I'd be happy to attack it again. A rough sketch should make things pretty obvious, but it's a bit of a nuisance to draw pictures in ASCII characters! --DFB.
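The argument can be spot-checked by simulation: for a roughly normal distribution with median 100, Q1 = 90, and Q3 = 110 (s.d. about 15, as deduced above), the sample range in any large sample far exceeds twice the IQR. The sample size and seed below are arbitrary:

```python
import random

def five_number(xs):
    """Crude order-statistic min/quartiles/max; fine for a large sample."""
    xs = sorted(xs)
    n = len(xs)
    return xs[0], xs[n // 4], xs[n // 2], xs[(3 * n) // 4], xs[-1]

rng = random.Random(3)
sample = [rng.gauss(100, 15) for _ in range(20000)]
mn, q1, med, q3, mx = five_number(sample)
```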
Re: meta-analysis
On Fri, 22 Jun 2001, Marc Esser wrote: After a closer look at the trials which I want to summarize, I noticed that not the means are reported, but the medians. Do you have an idea how to calculate an effect size with this information, e.g. median change of hospitalization time. The p-values reported in the trials are derived from Mann-Whitney-Tests. Sorry, Marc; I don't know offhand how to deal with this situation. Perhaps someone else on the list can help. -- Don. On 17 Jun 2001, Marc wrote (edited): I have to summarize the results of some clinical trials. The information given in the trials contain: Mean effects (days of hospitalization) in treatment and control groups; numbers of patients in the groups; p-values of a t-test (of the difference between treatment and control). My question: How can I calculate the variance of the treatment difference, which I need to perform meta-analysis? Note that the numbers of patients in the groups are not equal. Is it possible to do it like this: s^2 = (difference between contr and treatm)^2 / ((1/n1+1/n2)*t^2) How exact would such an approximation be?
Re: meta-analysis
On 17 Jun 2001, Marc wrote (edited): I have to summarize the results of some clinical trials. The information given in the trials contain: Mean effects (days of hospitalization) in treatment and control groups; numbers of patients in the groups; p-values of a t-test (of the difference between treatment and control). My question: How can I calculate the variance of the treatment difference, which I need to perform meta-analysis? Note that the numbers of patients in the groups are not equal. Is it possible to do it like this: s^2 = (difference between contr and treatm)^2 / ((1/n1+1/n2)*t^2) Yes, if you know t. If all you know is that p < alpha for some alpha, you then know only that t > the t corresponding to alpha (AND you need to know whether the test had been one-sided or two-sided -- of course, you need to know that in any case); you can substitute that corresponding t to obtain an upper bound on s^2 -- ASSUMING that the t was calculated using a pooled variance (your s^2), not using the expression for separate variances in the denominator: (s1^2/n1 + s2^2/n2). Note that this s^2 is NOT the variance of the treatment difference, which you said you wanted to know; it is the pooled variance estimate of the variance within each group. The variance of the difference in treatment means, which _may_ be what you are interested in, would be (difference)^2 / t^2 with the same caveats concerning what you know about t. How exact would such an approximation be? Depends on the precision with which p was reported.
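The algebra above inverts easily. A sketch, assuming t is known exactly and was computed with a pooled variance (the function names are mine):

```python
import math

def pooled_within_variance(diff, t, n1, n2):
    """s^2 = diff^2 / ((1/n1 + 1/n2) * t^2): the pooled within-group
    variance implied by a reported mean difference and a pooled-variance t."""
    return diff ** 2 / ((1 / n1 + 1 / n2) * t ** 2)

def variance_of_mean_difference(diff, t):
    """Var(mean difference) = diff^2 / t^2, with the same caveats about t."""
    return diff ** 2 / t ** 2
```

Round-tripping a t built from known ingredients recovers those ingredients, which is a convenient sanity check before trusting the inversion on published numbers.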
Re: individual item analysis
In response to Doug Sawyer's post: I am trying to locate a journal article or textbook that addresses whether or not exam questions can be normalized, when the questions are grouped differently. For example, could a question bank be developed where any subset of questions could be selected, and the assembled exam is normalized? on Fri, 15 Jun 2001, dennis roberts wrote in part: also, you can normalize a distribution that is not so normal but, i would ask ... how come you want to do that? which is a good question. But I would ask the prior question: What, precisely, does Doug want to mean by normalize? And is that meaning congruent with Dennis's understanding of the word? (THEN I would ask Dennis's question!) (I note in passing that Doug is in a department of physical science, and in physical sciences normal often has the meaning perpendicular to a (line or) plane; while Dennis is in a department of educational psychology, where normal nearly always refers to the probability distribution that in physics is often called Gaussian. I can't tell from the rather spare context whether any such misunderstanding or miscommunication applies to the conversation, but if it does, 'twere better sorted out sooner than later.) -- Don.
Re: multivariate techniques for large datasets
On 11 Jun 2001, srinivas wrote: I have a problem in identifying the right multivariate tools to handle a dataset of dimension 1,00,000*500. The problem is still complicated with lot of missing data. So far, you have not described the problem you want to address, nor the models you think may be appropriate to the situation. Consequently, no-one will be able to offer you much assistance. Can anyone suggest a way out to reduce the data set and also to estimate the missing value. There are a variety of ways of estimating missing values, all of which depend on the model you have in mind for the data, and the reason(s) you think you have for substituting estimates for the missing data. I need to know which clustering tool is appropriate for grouping the observations ( based on 500 variables ). No answer is possible without context. No context has been supplied.
Re: Correction procedure
On 3 Jun 2001, Bekir wrote, in part: My aim was to compare groups 2, 3, 4, 5 with control (group 1). ... The reviewer had written me: Accordingly, a statistical penalty needs to be paid in order to account for the increased risk of a Type 1 error due to multiple comparisons. The easiest way to achieve this goal is to adjust the P value required to declare significance using the Bonferroni correction. 1. What is the correct meaning of the last sentence? What must I do? As you wrote, must I find the adjusted p values or declare the adjusted significance level alpha? Your choice: the two ways of approaching the problem are equivalent. Either divide the criterion significance level (alpha) by the number of comparisons, as Duncan Smith recommended, and compare the p-values reported by your statistical routine to this adjusted value; or adjust the p-values by multiplying the reported values by the number of comparisons. Thus p = 0.02 > adjusted alpha = 0.0125 for one of your comparisons, or adjusted p = 0.08 > nominal alpha = 0.05. If I understand your reviewer correctly, (s)he seems to be requesting the latter: adjusting the p-value. 2. There are apparently and exactly three groups; groups 1, 3, 5 that had the same proportions of translocation. Wasn't it groups 1, 4, 5 that had the same proportions? Therefore, to compare only the group 2 and 3 with the control can be appropriate, can it be? Such a comparison may be appropriate (but see below); but this does not change the situation. Had groups 4 and 5 NOT had proportion equal to group 1, you would surely have wanted to make those two comparisons also. The question is not, how many comparisons were useful or significant; but how many comparisons would you have chosen to consider before you observed the results of this particular experiment. By your description, you certainly considered AT LEAST the four comparisons mentioned in your first paragraph above.
Thus, there would be two comparisons and the p values 0.008 (0.008 x 2 = 0.016) and 0.02 (0.02 x 2 = 0.04) would be significant. Is it right? As explained above, and as Duncan Smith responded, No. Duncan mentioned Dunnett's test. This might indeed be appropriate for your design, but not for the analyses you have so far done. Dunnett's test would normally follow the finding of a significant F value in a one-way analysis of variance (testing the formal hypothesis that the true (population) proportions in the five groups are all identical). Such an analysis could be undertaken with your data, but some persons (possibly including your reviewer? I don't know) would object to carrying out an analysis of variance (ANOVA) with dichotomous data. One advantage to ANOVA is the possibility of drawing conclusions more complex, and possibly more interesting, than the pairwise comparisons that you had originally envisioned. In particular, you could test the contrast between Groups 2 and 3 combined, with Groups 1, 4, and 5 combined; since it seems clear that this is the only thing that is going on in your data. Testing that contrast by the Scheffe' method, which offers experimentwise protection against the null hypotheses for any imaginable contrast, might be useful: that contrast, involving all 100 cases, is more powerfully tested than the series of pairwise comparisons, and may well be significant even against the conservative Scheffe' criterion. Whether that is useful _for_your_purposes_ is another matter entirely. If there is some useful meaning and interpretation to be gained in observing that only groups 2 and 3 differ from the control group and that groups 4 and 5 are indistinguishable from the control group, then this contrast would be useful to test formally. If that outcome does not lend itself to useful interpretation (and the advance of knowledge in the field), you would probably be better off staying with the four pairwise comparisons you started with. Donald F.
Burrill [EMAIL PROTECTED] 348 Hyde Hall, Plymouth State College, [EMAIL PROTECTED] MSC #29, Plymouth, NH 03264 603-535-2597 184 Nashua Road, Bedford, NH 03110 603-471-7128
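The two equivalent framings in the reply to question 1 (divide alpha by the number of comparisons m, or multiply each p by m) can be sketched in a few lines; the vector of raw p-values below is hypothetical:

```python
def bonferroni(p_values, alpha=0.05):
    """Returns (adjusted p-values, significance flags).  The two framings
    are equivalent: compare p*m with alpha, or compare p with alpha/m."""
    m = len(p_values)
    adjusted = [min(1.0, p * m) for p in p_values]
    flags = [p < alpha / m for p in p_values]
    return adjusted, flags

# Hypothetical raw p-values for four comparisons against the control:
adj, sig = bonferroni([0.02, 0.008, 0.30, 0.45])
```

As in the reply above, a raw p of 0.02 among four comparisons adjusts to 0.08 and so fails at the nominal 0.05 level, while a raw 0.008 (adjusted 0.032) survives.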
Re: correction procedures
On 2 Jun 2001, Bekir wrote in part: I performed a study on different enteral nutrients and bacterial translocation in experimental obstructive jaundice. There were 5 groups of rats. Each group consists of 20 rats. The translocation incidences that occurred in mesenteric lymph nodes were shown in the following table. My aim was to compare groups 2, 3, 4 with control (group 1) Data table deleted; see the original posting. Summary of group definitions and comparison results: Group 1 sham ligation of bile duct (fed rat chow) Group 2 bile duct ligated (fed rat chow) *p = 0.008 Group 3 bile duct ligated (fed enteral diet) ** p = 0.02 Group 4 bile duct ligated (fed enteral diet 2) Group 5 bile duct ligated (fed enteral diet 3) By chi squared test I calculated these p values. You did not specify, but presumably the chi-square test in question was of a series of 2x2 tables, comparing the numbers of translocations that occurred (vs. the numbers that didn't) in Group 1 (your control group) with each of the other groups. The reviewer commented that I should do Bonferroni correction, find the adjusted p value and, according to this adjusted value, say significant or not. However, in no study have I read that the authors had written that they had applied a Bonferroni correction, especially in a comparison by chi square test. The 1-degree-of-freedom chi-square test described above is exactly equivalent to a z-test comparing the proportion of translocations in Group 1 with the proportion of translocations in the other group, for each comparison of interest. You may perhaps find references to Bonferroni adjustments in studies where z- or t-tests were used. If Bonferroni was performed then the adjusted p value [Here you must mean the adjusted significance level alpha, not the p-value? -- DFB.] would be 0.05/10 = 0.005, 10 = n x (n-1)/2 in our study. I do not think so. The number of comparisons you say you were interested in is three, not ten: Group 2 vs. Group 1, Group 3 vs. Group 1, and Group 4 vs.
Group 1. If indeed these are the only comparisons of interest, and in the sense that these comparisons (and no others!) were planned from the beginning, then the adjusted p-values would be 0.02*3 = 0.06 and 0.008*3 = 0.024. But I do not believe this, either. If these were the only three comparisons of interest, you would not have bothered to include Group 5 in the experiment. It looks to me as though the original design had envisioned comparisons of Groups 2, 3, 4, 5 vs. Group 1, and may also have intended comparisons of Groups 3, 4, 5 vs. Group 2; so that the number of comparisons for the Bonferroni correction would be either 4, or 4+3 = 7. The corresponding adjusted p-values would be 0.02*4 = 0.08 and 0.008*4 = 0.032; or 0.02*7 = 0.14 and 0.008*7 = 0.056. Thus our results would not be significant. Is it appropriate to make Bonferroni correction or Simes correction in this situation? Indeed I want to compare groups 2, 3, 4 with group 1. So there would be 4 comparisons. Then you must mean compare groups 2, 3, 4, 5 with group 1? Is the Simes procedure correct? How can I make the Simes correction? Sorry, I'm not familiar with this procedure, at least not by that name. I hope this has been helpful. -- Donald F. Burrill
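[Editor's sketch.] The arithmetic of the Bonferroni correction discussed above is simple enough to sketch directly; the raw p-values (0.02 and 0.008) and the candidate numbers of comparisons (3, 4, or 7) are the ones under discussion in the thread.

```python
def bonferroni(pvals, m):
    """Bonferroni-adjust each p-value for m comparisons (capped at 1)."""
    return [min(1.0, p * m) for p in pvals]

raw = [0.02, 0.008]
for m in (3, 4, 7):
    print(m, bonferroni(raw, m))
```

Equivalently, one can leave the p-values alone and compare each of them to alpha/m; the two presentations lead to identical accept/reject decisions.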
Re: Ninety Percent above Median
On Thu, 31 May 2001, W. D. Allen Sr. wrote: Only from the education field do we hear the statement that over ninety percent of students ranked above the median! The statement was made on TV. (1) I take it that it was the keyword students that led you to suppose that the statement had anything to do with the education field (rather than, say, the field of study the students were pursuing). (2) The statement appears, however, not to have been made by any agency of the education field, but on TV -- by which one supposes you mean broadcast television. That's not education: that's entertainment. Or, possibly, news, or the deliberate distortion thereof. (3) A couple of colleagues have already pointed out how the statement you so scornfully cite might in fact be true; although whether in fact any such interpretation can be believed is impossible to tell, in the absence of any context. -- Donald F. Burrill
Re: Variance in z test comparing percentages
On Sat, 12 May 2001, RD wrote, inter alia: The only approach to deal with z test for means that I have seen so far was using s^2 = s1^2/n1 + s2^2/n2 formula. t test is always using pooled variance. I think not _always_. _Usually_, because (i) there is seldom a strong need to insist that the two [sub]population variances be different, (ii) the distribution of the t statistic is easier to find (no fractional numbers of degrees of freedom, e.g.), and (iii) the computations are easier. But if one were concerned about (i), as for instance when the two sample variances are quite different, one might take the alternative approach. (But see below.) Both z test and percentages comparison test are using normal distribution. Thus, intuitively I was considering them as basically the same with only difference in variance calculations. My problem is that using weighted p for one and not using pooled s^2 for another seemed inconsistent with that idea. This is where you begin to go astray. In the z test for means, the sampling distribution of the sample means (or of their mean difference) is (at least approximately) normal with mean mu and standard deviation sigma; and mu and sigma are mutually independent, either because that's true of normal distributions or because that tends to be true of empirical data (more or less regardless of the empirical distribution). But in the case of proportions (or, equivalently, of percentages) the underlying distribution is binomial: and the mean and standard deviation of a binomial distribution are NOT independent, being (for the simple count of the event in question) np and SQRT(np(1-p)), or (for the proportion) p and SQRT(p(1-p)/n). The fact that for n large enough the binomial distribution may be well approximated by a normal distribution with the same mean and variance does not alter the fact that the true distribution IS binomial, and thus has this direct connection between mean and standard deviation.
It follows that in an ordinary z-test (or t-test), one can make whatever assumption one finds useful, desirable, or convenient with respect to the variance of the difference, without affecting the truth value of the null hypothesis about the mean (or the difference in means, etc.). But in dealing with proportions, if the null hypothesis specifies that P = a given value, that hypothesis ALSO specifies what the variance must be. Hence a null hypothesis that P1 = P2, or equivalently that P1-P2 = 0, specifies that the variance of the observed difference must be based on the assumed common P in the population. And the best estimate available for that common P is the usual weighted P, as you put it. Now you are saying that pooled variance may be used in z test. Sometimes, anyway. Admittedly, the point is debatable: if one is using a z test at all, one is implicitly claiming to know what the corresponding variances are, and if they're different, they're different. But if one is skeptical about the state of one's knowledge (as one probably ought to be, else why test an hypothesis about means at all?), one may suspect that one's knowledge of variances is imperfect in some degree. Then if the variances in question are not very far apart, it may be desirable to average them in some way, such as the usual pooling (or equivalently weighting by numbers of degrees of freedom). But this does not really change anything except the particular mechanics of finding an average variance. Summing the two sampling variances of the respective means and taking the square root of the sum produces an averaged standard error of the mean difference. Pooling the two variances to obtain an average variance, then multiplying by the sum (1/n1 + 1/n2) and taking the square root of that sum, produces another averaged standard error of the mean difference. 
The two averages are unlikely to differ much (except in pathological circumstances, perhaps), so it's rather splitting hairs to argue which one is proper. (And there's always the question, proper for what purpose or circumstances?) When would you use pooled variance in z test instead of sum and vice versa? I wouldn't bother to prescribe. If the separate variances were different enough to worry about, I'd probably want to use both a standard formula (pooled or sum, I don't care which) AND a test using the LARGER variance, to be able to assert (if it be true) that the null hypothesis can be rejected even under quite conservative assumptions. I can imagine wanting also to use the SMALLER variance, so as to produce a range of standardized effect sizes that one might reasonably believe to cover the true effect size. What are we really testing: just two means or whether those two samples come from the same population? Precisely. What we are really testing, if we are testing at all, may very well differ from
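[Editor's sketch.] The pooled-P logic described above (the null hypothesis P1 = P2 itself fixes the variance, so the standard error is built from the weighted common proportion) can be sketched as follows; the counts are hypothetical.

```python
from math import sqrt

def two_proportion_z(x1, n1, x2, n2):
    """z statistic for H0: p1 = p2, using the pooled (weighted) estimate
    of the common proportion to compute the standard error, since the
    null hypothesis itself fixes what the variance must be."""
    p1, p2 = x1 / n1, x2 / n2
    p_pool = (x1 + x2) / (n1 + n2)      # the usual "weighted P"
    se = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

# Hypothetical counts: 30/100 successes vs. 20/100.
print(two_proportion_z(30, 100, 20, 100))
```

Squaring this z statistic reproduces the 1-degree-of-freedom chi-square from the corresponding 2x2 table, which is the equivalence mentioned in the correction-procedures thread above.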
Re: Interpreting MANOVA and legitimacy of ANOVA
On Fri, 18 May 2001, auda wrote (slightly edited): In my experiment, [when] two dependent variables DV1 and DV2 [were] analyzed separately with ANOVA, the independent variable [IV (with] two levels IV_1 and IV_2) modulated DV1 and DV2 differentially: mean DV1 in IV_1 > mean DV1 in IV_2, but mean DV2 in IV_1 < mean DV2 in IV_2. If analyzed with MANOVA, the effect of IV was significant, Rao R(2,14) = 112.60, p < 0.001. How to interpret this result of MANOVA? Not enough information to tell. If, for example, DV2 = -DV1 + C, C a constant, you would get results of the kind you describe above. The question unanswered as yet is whether the second DV adds any information to the system. It's been a longish while since I did any MANOVAs, but I seem to recall a section of output showing step-down analyses for each formal effect of the ANOVA structure, in which each DV was reported in the order in which it had been considered, and a test reported as to the degree to which the effect on this DV was implied by the effect on previous DVs. You haven't mentioned anything about interpreting the significant univariate effects, which leads one to suspect that they are interpretable enough. What more do you think you want from MANOVA? Can I go ahead to claim IV modulated DV1 and DV2 differentially based [on] the result from MANOVA? Or do I have to do other tests? THAT you can claim based on the univariate results, unless DV1 and DV2 are so closely (if negatively!) related that there is only one phenomenon occurring, rather than two: which would be one possible reason for carrying out a MANOVA. Moreover, can I treat DV1 and DV2 as two levels of a factor, say, type of dependent variable, and then go ahead to test the data with repeated-measures ANOVA and see if there is an interaction between IV and type of dependent variable? Certainly.
Of course, this is not testing the same set of hypotheses as MANOVA, so the results might be somewhat different; and you have (as you have in any case) the problem of explaining (if it needs explaining) why it is reasonable for the effect of IV to be in opposite directions on the two DVs. It might be informative to repeat some of your analyses after transforming one of the DVs to (constant - old DV). Then a repeated-measures ANOVA would tell you whether the interaction effect, present with the original DVs, involved a difference in magnitude as well as a difference in sign. -- DFB. Donald F. Burrill
Re: A regressive question
If the mean of the predictor X is zero, the intercept is equal to the mean of the dependent variable Y, however steep or shallow the slope may be. And as Jim pointed out, the standard error of a predicted value depends on its distance from the mean of X (being larger the farther away it is from the mean, the confidence band being described by a hyperbola). It would seem to follow that a test such as Alan asks about would be unusable if the mean of X is too close to 0, and would be (too?) insensitive if the mean of X is too far from 0. An intermediate region, where a test of intercept vs. mean Y might be useful, might perhaps be defined in terms of the coefficient of variation of X (or perhaps its reciprocal, if the mean of X were in danger of actually BEING zero). One rather suspects that any such test would be less powerful than the usual test of the hypothesis that the true slope is zero, which might be an interesting proposition (for someone else!) to pursue. -- Don. On Wed, 16 May 2001, Alan McLean wrote: The usual test for a simple linear regression model is to test whether the slope coefficient is zero or not. However, if the slope is very close to zero, the intercept will be very close to the dependent variable mean, which suggests that a test could be based on the difference between the estimated intercept and the sample mean. Does anybody know of a test of this sort?
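[Editor's sketch.] The opening observation, that when mean(X) = 0 the least-squares intercept equals mean(Y) regardless of slope, is easy to verify numerically; the data below are simulated purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=50)
x -= x.mean()                        # force mean(X) = 0 exactly
y = 2.0 + 0.5 * x + rng.normal(scale=0.3, size=50)

slope, intercept = np.polyfit(x, y, 1)
# Since intercept = mean(Y) - slope * mean(X) and mean(X) = 0,
# the intercept reproduces mean(Y) whatever the slope turns out to be.
print(intercept, y.mean())
```

The identity intercept = mean(Y) - slope * mean(X) also makes the point in the post explicit: testing "intercept vs. mean Y" amounts to testing slope * mean(X) against zero, which collapses when mean(X) is near zero.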
Re: Question
On Thu, 10 May 2001, Magill, Brett wrote, inter alia: How should these data be analyzed? The difficulty is that the data are cross level. Not the traditional multi-level model however. Hi, Brett. I don't understand this statement. Looks to me like an obvious place to apply multilevel (aka hierarchical) modelling. (Have you read Harvey Goldstein's text on the method?) You have persons within organizations (just as, in educational applications of ML models, one has pupils within schools for a two-level model, and pupils within schools within districts for a three-level model), and apparently want to carry out some estimation or other analysis while taking into account the (possible) covariances between levels. If you want a simpler method than ML modelling, the method Dennis proposed at least lets you see some aggregate effects. (This does, however, put me in mind of a paper of (I think) Brian Joiner's whose temporary working title was To aggregate is to aggravate -- though it was published under another title.) ;-) Along the lines of Dennis' suggestion, you could plot Y vs X2 (or X2 vs Y) directly, which would give you the visual effect Dennis showed while at the same time showing the scatter in the X2 dimension around the organization average. For larger data sets with more organizations in them (so that perhaps several organizations would have the same (or at any rate indistinguishable, at the resolution of the plotting device used) turnover rate), you could generate a letter-plot (MINITAB command: LPLOT), using the organization ID in X1 as a labelling variable. Brett's original post presented this data structure:

  ID  X1  X2    Y
   1   1  0.70  0.40
   2   1  0.80  0.40
   3   1  0.65  0.40
   4   2  1.20  0.25
   5   2  1.10  0.25
   6   3  0.90  0.30
   7   4  0.50  0.50
   8   4  0.60  0.50
   9   4  0.70  0.50

Where X1 is the organization. X2 is the percent of market salary an employee within the organization is paid -- i.e.
ID 1 makes 70% of the market salary for their position and the local economy. And Y is the annual overall turnover rate in the organization, so it is constant across individuals within the organization. There are different numbers of employee salaries measured within each organization. The goal is to assess the relationship between employee salary (as percent of market salary for their position and location) and overall organizational turnover rates. How should these data be analyzed? The difficulty is that the data are cross level. Not the traditional multi-level model however. That there is no variance across individuals within an organization on the outcome is problematic. Of course, so is aggregating the individual results. How can this be modeled both preserving the fact that there is variance within organizations and between organizations? As I understand it (as implied above), this is exactly the kind of structure for which multilevel methods were invented. I suggested that this was a repeated measures problem, with repeated measurements within the organization, my colleague argued it was not. This strikes me as a possible approach (repeated measures can be treated as a special case of multilevel modelling). But most software that I know of that would handle repeated-measures ANOVA would tend to insist that there be equal numbers of levels of the repeated-measures factor throughout the design, and this appears not to be the case (your sample data, at any rate, have different numbers of individuals in the several organizations). Can this be modeled appropriately with traditional regression models at the individual level? That is, ignoring X1 and regressing Y ~ X2. That was, after a fashion, what Dennis illustrated. In a formal regression analysis, I should think it unnecessary to ignore X1; although it would doubtless be necessary to recode it into a series of indicator-variable dichotomies, or something equivalent.
It seems to me that this violates the assumption of independence. Not altogether clear. By this do you mean regression analysis? Or, perhaps, the particular analysis you suggested, ignoring X1? Or...? And what assumption of independence are you referring to? (At any rate, what such assumption that would not be violated in other formal analyses, e.g. repeated-measures ANOVA?) Certainly, the percent of market salary that an employee is paid is correlated between employees within an organization (taking into account things like tenure, previous experience, etc.). Well, would the desired model take such things into account? (If not, why not? If so, where is the problem that I rather vaguely sense lurking between the lines here?) -- Don.
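[Editor's sketch.] The aggregation approach mentioned above can be sketched with the sample data from the post; whether aggregating is wise is exactly the point under debate ("to aggregate is to aggravate"), so this shows only the mechanics.

```python
import pandas as pd

# The sample data structure from the post: X1 = organization,
# X2 = percent of market salary, Y = organizational turnover rate.
df = pd.DataFrame({
    "X1": [1, 1, 1, 2, 2, 3, 4, 4, 4],
    "X2": [0.70, 0.80, 0.65, 1.20, 1.10, 0.90, 0.50, 0.60, 0.50],
    "Y":  [0.40, 0.40, 0.40, 0.25, 0.25, 0.30, 0.50, 0.50, 0.50],
})

# Average X2 within each organization; Y is constant within X1,
# so "first" recovers the organizational turnover rate.
agg = df.groupby("X1").agg(mean_X2=("X2", "mean"), Y=("Y", "first"))
print(agg)
print(agg["mean_X2"].corr(agg["Y"]))   # strongly negative in this sample
```

With only four organizations the aggregate correlation is fragile, which is one face of the within/between problem the thread is circling around.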
Re: A question
On Fri, 4 May 2001, Alan McLean wrote: Can anyone tell me what is the distribution of the ratio of sample variances when the ratio of population variances is not 1, but some specified other number? Depends. If the two samples on which the variances are based are _independent_, s^2(1)/s^2(2) is distributed as (Var(1)/Var(2)) times the usual F distribution. (My reference for this is Glass & Stanley (1970), pp 303-306.) If the sample variances are based on so-called dependent (= correlated) samples, the problem is, apparently, much more difficult (beyond the scope of this textbook, G&S write). I want to be able to calculate the probability of getting a sample ratio of 1 when the population ratio is, say, 2. As the above remarks imply, if the samples are independent, that probability is the same as the probability of getting a sample ratio of 0.5 when the population variances are equal (population ratio = 1). (Since the distribution is continuous, the probability that the sample ratio _equals_ 1 -- or 0.5 -- is zero; but presumably your interest would actually be in, e.g., the probability that the sample ratio lies in the interval from 0 to 1 (or its complement, the interval from 1 to infinity); or in some other interval with 1 at one end.) Actually doing the calculation would require either F tables rather more extensive than the usual abbreviated versions that have only six to ten cumulative relative frequencies, or software like Minitab that can calculate probabilities for the standard F distribution. (Take your sample ratio, divide it by the hypothesized population ratio, and ask Minitab to evaluate the quotient as an F value.) -- Don.
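[Editor's sketch.] The calculation described in the last paragraph can be done with scipy in place of Minitab; the sample sizes below are hypothetical.

```python
from scipy import stats

n1, n2 = 16, 16                  # hypothetical sample sizes
df1, df2 = n1 - 1, n2 - 1

# P(s1^2/s2^2 <= 1) when Var(1)/Var(2) = 2: divide the sample ratio
# by the hypothesized population ratio and evaluate the standard F cdf.
p = stats.f.cdf(1.0 / 2.0, df1, df2)

# Equivalently: P(sample ratio <= 0.5) when the population ratio is 1.
p_equiv = stats.f.cdf(0.5, df1, df2)
print(p, p_equiv)
```

The two probabilities agree exactly, which is the equivalence stated in the reply.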
Re: Please help
I rather think the problem is not adequately defined; but that may merely reflect the fact that it's a homework problem, and homework problems often require highly simplifying assumptions in order to be addressed at all. See comments below. On Fri, 4 May 2001, Adil Abubakar wrote: My name is Adil Abubakar and I am a student and seek help. snip if anyone can help, please respond to [EMAIL PROTECTED] Person A did research on a total of 4500 people and got the following results: Q. 1. How many hours do you spend on the web? 0-7 8-15 15+ 18% 48% 34% Q. 2. Do you read a privacy policy before signing on to a web site? 1=Strongly Agree 2=Agree 3=Neutral 4=disagree 5=strongly disagree 9% 17% 20% 32% 22% If this were a research situation, or intended to reflect practical realities, there would also be information about the relationship between the answers to Q. 1 and the answers to Q. 2. This information might be in the form of a two-way table of relative frequencies, or (with suitable simplifying assumptions on the variables represented by Q.1 and Q.2) as a correlation coefficient. Without _some_ information about the joint distribution, I do not see how one can hope to address the questions posed below. Another person asked the same questions of 100 people and got the same results in % terms. Can it be shown via CI that the result is consistent with the expectations created by the previous survey? If the % results were indeed the same (so that all differences in corresponding %s were zero), it would not be necessary to use a CI (by which I presume you mean confidence interval) to show consistency. (HOWEVER, even identical % results do not imply consistency, unless at the same time the joint distribution were ALSO identical; and you do not report information on this point.) OTOH, if the results were merely similar but not identical, you would want some means of assessing the strength of evidence that resides in the empirical differences.
That in turn depends on the assumptions you're willing to make about the two variables: do you insist on treating the responses as (ordered) categories, or would you be willing, at least pro tempore, to assign (e.g.) codes 1, 2, 3 to the responses to Q. 1, use the codes 1, 2, 3, 4, 5 supplied for Q. 2, and treat those values as though they represented approximately equal intervals? Also can it be argued that the subjects have been subjected to the questions before? Not sure what you mean by this question. If you know that the Ss have indeed been asked these questions previously (are they perhaps a subset of the original 4500?), no arguing is needed; although what this would imply about the results is unclear. If you mean, do the identical (or at least consistent) results imply that the Ss must have encountered these same questions previously, I do not see how that can be argued, at least not without more information than you've so far provided. Perhaps more to the point, why would such an argument be of interest? Can it be asserted with statistical significance, that if the survey is repeated on at least 100 people the result will [be] in the same proximity of the above survey?? No. I suggest you look closely at the definition of statistical significance: the term is quite incompatible with the assertion you propose. (If you don't see that, you might bring a focussed version of the question back to the list. If you do see that, you may still have some question that is more or less in the same ball-park as the question you've asked here, and you may wish to bring the revised question to our attention.) any help ... will be appreciated. Just need the different methodologies. Yes; but for which questions, exactly? -- DFB. Donald F. 
Burrill
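[Editor's sketch.] One way to quantify "consistency" between the two surveys, under the simplifying assumption that only the marginal distribution of one question is compared (the joint-distribution caveat in the reply still applies): treat the 4500-person percentages as expected proportions and run a goodness-of-fit chi-square on the 100-person counts.

```python
from scipy import stats

# Q. 2 proportions from the 4500-person survey, treated as "expected".
big_props = [0.09, 0.17, 0.20, 0.32, 0.22]
n_small = 100
observed = [9, 17, 20, 32, 22]       # the identical % results reported
expected = [p * n_small for p in big_props]

chi2, pval = stats.chisquare(observed, expected)
print(chi2, pval)     # chi2 = 0, p = 1 when the results match exactly
```

With identical percentages the statistic is exactly zero, which is the reply's point that no formal test is needed in that case; the test only earns its keep when the small-sample results are similar but not identical.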
Re: Orthogonality of Designs for Experiments
Short answers below; which may or may not adequately address the lurking questions you had in mind. On Fri, 4 May 2001, Jeff wrote: Would like to ask [for] help with the following questions: 1. why designs for experiments should be orthogonal ? So that results for each factor, and each interaction between factors, will be mutually independent. 2. which problems may I encounter if I use non-orthogonal design ? Same kinds of problems you encounter in the general multiple regression situation: apparent size of effect of any predictor (or factor) will depend on the presence or absence of other predictors in the model, and also on the sequence in which the several predictors (factors and their interactions) are considered in the statistical model. -- DFB. Donald F. Burrill
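[Editor's sketch.] The order-dependence described in answer 2 is easy to demonstrate with simulated correlated (non-orthogonal) predictors; the coefficients and seed below are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
x1 = rng.normal(size=n)
x2 = 0.8 * x1 + 0.6 * rng.normal(size=n)     # correlated with x1
y = x1 + x2 + rng.normal(size=n)

def coefs(y, X):
    """Least-squares coefficients of y on the columns of X."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

b_alone = coefs(y, x1[:, None])[0]                    # x1 by itself
b_joint = coefs(y, np.column_stack([x1, x2]))[0]      # x1 alongside x2
print(b_alone, b_joint)   # visibly different: alone, x1 absorbs x2's effect
```

With orthogonal predictors (cov(x1, x2) = 0) the two estimates would coincide, which is exactly why orthogonal designs make factor effects interpretable one at a time.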
Re: (none)
Thanks, Rich. My semi-automatic crap detector hits DELETE when it sees things like this anyway; but... did you notice that although SamFaz (or whoever, really) claims to cite a bill passed by the U.S. Congress he she or it is actually writing from Canada? I'm not quite sure what to make of that... On Wed, 2 May 2001, Rich Ulrich wrote: On 1 May 2001 16:14:28 -0700, [EMAIL PROTECTED] (SamFaz Consulting) wrote: Under the Bill s. 1618 title III passed by the 105th US congress this letter cannot be considered SPAM as long as the sender includes contact information and a method of removal. To be removed, hit reply and type 'remove' in the subject line. Here was a message posted, that my reader saw as an attachment. The lines above were at the start of the SPAM. Ahem. I am about 100% sure that the above is a lie. In multiple ways. For instance, Is there a legal definition of SPAM? snip, useful advice, because you've all already read it -- Don.
Re: probability and repeats
On Tue, 1 May 2001, Dale Glaser wrote in part: a colleague just approached me with the following problem at work: he wants to know the number of possible combinations of boxes, with repeats being viable...so, e.g., if there are 3 boxes then what he wants to get at is the following answer (i.e., c = 10): Let me try to rephrase this. We have a store of boxes. There are k (say; here k = 3) different kinds of boxes, and we have a sufficiently large supply of each kind that when we amass m (say) different boxes, they may all be of the same kind. In the example, m = k = 3. It is not clear whether m = k is necessary, or whether (e.g.) one might have 3 types of boxes, but be selecting groups of 4 boxes (or some other number). As you point out, for m = k = 3, the number n (say) of different collections (or combinations?) of boxes is 10; when m = k = 4, n = 35. Since you specify that the collection 331 is the same as 133, I'd want to report each collection in monotonic increasing order of box number, and list the collections in lexicographic order (second column below); this way I'd be less likely to omit one or more collections inadvertently. (I think.)

   as given   lexicographic
     111         111
     222         112
     333         113
     112         122
     221         123
     332         133
     113         222
     223         223
     331         233
     123         333

If k = 3 and m = 4, we have n = 15:

   1111  1112  1113  1122  1123
   1133  1222  1223  1233  1333
   2222  2223  2233  2333  3333

...so there are 10 possible combinations (not permutations, since 331 = 133)...however, when I started playing around with various combinations/factorial equations, I realized that there really isn't a pool of 3 boxes ... there has to be a pool of 9 numbers, in order to arrive at combinations such as 111 or 333 so any assistance would be most appreciated as I can't seem to find an algorithm in any of my texts..thank you. dale glaser I can't offer you a convenient algorithm for calculating n for given m and k, but the following line of thought may perhaps suggest something to you.
For m = k = 3, we have 1 combination [123] with no repeats, 6 combinations with one pair [112, 113, 122, 133, 223, 233], and 3 triplets [111, 222, 333]. The 6 can be arrived at by taking the k = 3 pairs and multiplying by the 2 possible odd singles, and of course the number of triplets (or in general m-tuplets) is k = 3. For m = k = 4, there is again 1 combination with no repeats (because m = k); and now 12 combinations involving 1 pair (there are k = 4 pairs, and for each pair there are 3C2 = 3 pairs of odd singletons [e.g., 1123, 1124, 1134]); 12 combinations involving 1 triplet (k = 4 triplets, and for each there are k-1 = 3 odd singletons); 6 combinations involving 2 pairs (4C2 = 6); and k = 4 quadruplets. In lexicographical order, these 35 combinations are:

   1111  1112  1113  1114  1122  1123  1124
   1133  1134  1144  1222  1223  1224  1233
   1234  1244  1333  1334  1344  1444  2222
   2223  2224  2233  2234  2244  2333  2334
   2344  2444  3333  3334  3344  3444  4444

but they may make better sense in an order that emphasizes the repeats:

   no repeats:   1234
   one pair:     1123 1124 1134  1223 1224 2234  1233 1334 2334  1244 1344 2344
   one triplet:  1112 1113 1114  1222 2223 2224  1333 2333 3334  1444 2444 3444
   two pairs:    1122 1133 1144 2233 2244 3344
   quadruplets:  1111 2222 3333 4444

Anyway, good luck in finding an appropriate algorithm or formula for n in terms of m and k (or just in terms of k, if the conditions of the problem require that m = k). -- DFB. Donald F. Burrill
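[Editor's note.] For reference, the count sought here has a standard closed form: choosing m boxes from k kinds with repetition allowed is the "combinations with replacement" (stars and bars) problem, so n = C(m + k - 1, m). A quick sketch checking this against the cases worked out above:

```python
from itertools import combinations_with_replacement
from math import comb

def n_collections(k, m):
    """Multisets of size m drawn from k kinds of boxes: C(m + k - 1, m)."""
    return comb(m + k - 1, m)

# Enumerate the collections directly for the cases in the post
# and compare the count with the formula.
for k, m in [(3, 3), (4, 4), (3, 4)]:
    listed = list(combinations_with_replacement(range(1, k + 1), m))
    print(k, m, len(listed), n_collections(k, m))   # 10, 35, 15
```

`combinations_with_replacement` also emits each collection in exactly the monotonic, lexicographic order recommended in the reply.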
Re: Joining edstat
On Sat, 28 Apr 2001 [EMAIL PROTECTED] wrote: I just joined the listserv. Our professor is giving us extra credit if we join an email list re: stats. I was able to pull up one of his messages from last year. Pretty cool. Have a great day! You might ask him whether additional extra credit is awarded if you also learn about the usual rules of conduct, sometimes called netiquette. One of them is that persons posting to the listserv are expected to include their proper names at least, preferably accompanied by their affiliations (e.g., college or place of employment or home address, or combinations of these). You might start by visiting the web site mentioned in the trailer automatically appended to this message by edstat. Your e-mail program almost certainly has the facility to include a signature file (sometimes called a .sig) automatically; and even if you think you have valid reason(s) for not doing that as a routine courtesy for all your e-mail, you can easily import such a file into your message for polite communication with listservs and other correspondents, and ought to do so. -- DFB. Donald F. Burrill
Re: Help me an idiot
On Sat, 28 Apr 2001, Abdul Rahman wrote: Please help me with my statistics. If you order a burger from McDonald's you have a choice of the following condiments: ketchup, mustard, lettuce, pickles, and mayonnaise. A customer can ask for all these condiments or any subset of them when he or she orders a burger. How many different combinations of condiments can be ordered? No condiment at all counts as one combination. Your help is badly needed. Why? All you have to do is construct all the possibilities and count them. Shouldn't be that hard. If you want a method for dealing with more general cases, that might be another matter, of course. But even that would yield to the same procedure, if you went about it in a systematic enough fashion. So how have you approached the problem so far? (I'm a New Englander, and we tend to disapprove of laziness. If you haven't even tried to solve it yourself [and problems like this are almost certainly dealt with in your textbook!], I'm not interested in providing any help at all.) -- DFB. Donald F. Burrill [EMAIL PROTECTED] 348 Hyde Hall, Plymouth State College, [EMAIL PROTECTED] MSC #29, Plymouth, NH 03264 603-535-2597 184 Nashua Road, Bedford, NH 03110 603-472-3742
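For the record, the systematic construct-and-count procedure is short to write down: each condiment is independently in or out, so enumerating every subset (the empty one included) does the job:

```python
from itertools import combinations

condiments = ["ketchup", "mustard", "lettuce", "pickles", "mayonnaise"]
# Construct every subset of the 5 condiments; the empty subset is the
# "no condiment at all" combination.
subsets = [combo for r in range(len(condiments) + 1)
                 for combo in combinations(condiments, r)]
print(len(subsets))   # 32, i.e. 2**5
```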
Re: A disarmingly simple conjecture
On Wed, 18 Apr 2001, Giuseppe Andrea Paleologo wrote: I am dealing with a simple conjecture. Given two generic positive random variables, is it always true that the sum of the quantiles (for a given value p) is greater or equal than the quantile of the sum? snip, technical translation of the question into algebra Any insight or counterexample is greatly appreciated. I am sure this is proved in some textbook, but independently from that, I think this should be doable via elementary methods... If this were a theorem, perhaps it should be. But it does not seem inherently reasonable to me. (Herman Rubin has provided a mathematical response denying the conjecture; but I'd like to look at it from a different perspective. I'd be interested in opinions whether this line of reasoning is valid.) If I understand you correctly, you conjecture that for two random variables (X and Y, say) and their sum (Z, say, = X + Y), the sum of the third quartile of X and the third quartile of Y would be greater than or equal to the third quartile of Z. But this would seem to imply, by symmetry, that the sum of the _first_ quartile of X and the first quartile of Y should be LESS than or equal to the first quartile of Z. There being nothing especially magical about quartiles (whether first, second, or third), these two statements together would imply that the sum of a quantile of X and the corresponding quantile of Y must be BOTH less than or equal to, AND greater than or equal to, the corresponding quantile of Z: that is, the sum of the quantiles must always EQUAL the corresponding quantile of the sum. But for this proposition, I believe there exist lots of counterexamples. -- DFB. Donald F. 
Burrill [EMAIL PROTECTED] 348 Hyde Hall, Plymouth State College, [EMAIL PROTECTED] MSC #29, Plymouth, NH 03264 603-535-2597 184 Nashua Road, Bedford, NH 03110 603-472-3742
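Herman Rubin's denial of the conjecture is referred to above but not reproduced in this archive; a small discrete construction of my own (hypothetical two-point distributions, not from the thread) shows the sum of the p-quantiles falling BELOW the p-quantile of the sum:

```python
import numpy as np

def quantile(values, probs, p):
    # Left-continuous inverse CDF: smallest value v with F(v) >= p.
    cdf = np.cumsum(probs)
    return values[np.searchsorted(cdf, p)]

# X and Y i.i.d. and positive: value 1 with prob 0.75, value 2 with prob 0.25.
p = 0.70
q_x = quantile(np.array([1, 2]), np.array([0.75, 0.25]), p)        # = 1
# Distribution of the sum Z = X + Y under independence:
z_vals = np.array([2, 3, 4])
z_probs = np.array([0.75 * 0.75, 2 * 0.75 * 0.25, 0.25 * 0.25])
q_z = quantile(z_vals, z_probs, p)                                 # = 3
print(q_x + q_x, q_z)   # 2 3 -> quantiles sum to less than the quantile of the sum
```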
Re: errors in journal articles
On Fri, 27 Apr 2001, Lise DeShea wrote in part: I teach statistics and experimental design at the University of Kentucky, and I give journal articles to my students occasionally with instructions to identify what kind of research was conducted, what the independent and dependent variables were, etc. For my advanced class, I ask them to identify anything that the researcher did incorrectly. snip, description of defective article One of my students wrote on her homework, "It is especially hard to know when you are doing something wrong when journals allow bad examples of research to be published on a regular basis." Mmmm. It isn't really any harder to _know_ when you're doing something wrong; it may be somewhat more disheartening to realize that there may be no adequate check on one's own silly mistakes, later. I'd have pointed out to your student that one instance (possibly selected by her professor with malice aforethought? -- and even if not, the student wouldn't necessarily know that) hardly supports the phrase "published on a regular basis". Just emphasizes the need to maintain a healthy skepticism, and to be prepared to proofread with a critical eye. (Just 'cause it's printed doesn't mean it's true...) I'd like to hear what other list members think about this problem and whether there are solutions that would not alienate journal editors. Not to mention one's (you should pardon the expression) colleagues. Depends partly on sensitivity of editors and/or authors to criticism. Mainly, as TR once put it, speak softly (i.e., politely) and carry a big stick (i.e., evidence that, even if politely phrased, clearly illuminates the fact of an error). But it is worth remembering that journal editors (at least, the ones I've known) are editors only for limited terms: three years is not unusual, I think, and while an editor may be reappointed for a subsequent (second, third, ...) term, it seems to be more usual to serve for two terms and then let somebody else do it.
So even if you get off on a wrong foot with one editor, that misfortune needn't carry over to the next editor. Some years back I encountered a systematic error in a journal article. The author had reported total scores from a series of Likert-like items, and showed a histogram. The histogram displayed decided spikes, about twice as high as the surrounding landscape, at regular intervals: scores of 20, 25, 30, 35, apparently. (Maximum score was 40, minimum 10.) These were so interesting that the author spent a page or more interpreting them (as the results of patterned responses by the respondents, by which was meant responding with all 3's (e.g.) to all items). And indeed, if such patterning were present to any great degree, it would have showed up in just this way. Only thing was, the histogram program used had been allowed to set its own parameters, and in the range of, say, 20 to 30, where there should have been ten scores, there were only eight histogram bars. The spikes were of course the bars that contained two scores: 20 and 21, 25 and 26, 30 and 31, etc. First thing I did was write to the author. Wasn't polite enough, I guess (although I was trying to be), because he never acknowledged my letter. Then I e-mailed the editor, who wanted a response from the author before he took any action (which I thought reasonable enough), and suggested that I write a letter to the editor identifying the problem, which he'd then ask the author to reply to. Various things intervened about then, and I never got that letter written, I'm afraid. But I've frequently used that article as an example in class (usually presenting it as a puzzle, to see if anyone is sharp-eyed enough to see what's wrong, and usually presenting only the histogram and the relevant paragraph or two in the article). Helps to illustrate the points reported above: be skeptical, and sharp-eyed.
And I take the opportunity to point out that this error, obvious as it is once one has seen it, eluded the author, the audience at the AERA session where the paper was presented, the audience at a European meeting where it was presented, at least two associate editors (that journal routinely farms papers out to at least two readers before publishing), and the journal editor himself. (And, presumably, most of the journal readership -- I never saw a critical letter from anyone else on this point.) snip, various economic concerns (Of course, you could always suggest that your _student_ write a naive little letter to the author, asking naive little questions...) -- DFB. Donald F. Burrill [EMAIL PROTECTED] 348 Hyde Hall, Plymouth State College, [EMAIL PROTECTED] MSC #29, Plymouth, NH 03264
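The binning artifact in that article is easy to reproduce. The totals take 31 distinct integer values (10 through 40); if a histogram routine picks fewer bins than that, some bins are wide enough to swallow two adjacent scores and stand out as spikes. A sketch with hypothetical, perfectly uniform scores, so that any spikes are entirely the histogram's doing:

```python
import numpy as np

# 50 copies of every total from 10 to 40: perfectly flat, no "patterned
# responses" whatsoever.
scores = np.repeat(np.arange(10, 41), 50)
counts, edges = np.histogram(scores, bins=25)   # let the routine choose 25 bins
# Bin width is 30/25 = 1.2, so periodically a bin captures TWO integer scores
# and shows double the count: spikes with no substantive meaning at all.
print(counts)
```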
Re: ANCOVA vs. sequential regression
On Mon, 23 Apr 2001, jim clark wrote: On 22 Apr 2001, Donald Burrill wrote: If I were doing it, I'd begin with a full model (or augmented model, in Judd & McClelland's terms) containing three predictors: y = b0 + b1*X + b2*A + b3*(AX) + error [1] where A had been recoded to (0,1) and (AX) = A*X. A number of sources (e.g., Aiken & West's "Multiple regression: Testing and interpreting interactions") would recommend centering X first (i.e., subtracting out its mean to produce deviation scores). Yes, this is always an option. Usually recommended to avoid certain computational problems that may arise if the distribution of X has a particularly low coefficient of variation, for example, and if the model contains many variables (and in particular interactions among them). Such problems are unlikely to arise in so simple a model as [1], and are more effectively dealt with when they do arise by deliberately orthogonalizing the predictors. I've never quite understood why deviations from a sample mean, which is after all a random function of the particular sample one has, should be preferred either to the original values of X (unless there ARE distributional problems) or to deviations from some value inherently more meaningful than a sample mean. You might also consider whether dummy coding (0,1), as recommended by Donald, would be best or perhaps effect coding (-1, 1). Also a possibility, of course. Note that the interpretations of the several coefficients (b0, b2, and b3 in particular) change with changes in coding of the dichotomy A. -- DFB. Donald F. Burrill [EMAIL PROTECTED] 348 Hyde Hall, Plymouth State College, [EMAIL PROTECTED] MSC #29, Plymouth, NH 03264 603-535-2597 184 Nashua Road, Bedford, NH 03110 603-472-3742
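A minimal numerical sketch of model [1] with (0,1) coding; the data are simulated here purely for illustration, with coefficient names following the post:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
X = rng.normal(50.0, 10.0, n)          # continuous predictor (hypothetical)
A = rng.integers(0, 2, n)              # dichotomy, dummy-coded (0, 1)
y = 2.0 + 0.5 * X + 3.0 * A + 0.25 * A * X + rng.normal(0.0, 1.0, n)

# Full (augmented) model [1]: y = b0 + b1*X + b2*A + b3*(A*X) + error
design = np.column_stack([np.ones(n), X, A, A * X])
b, *_ = np.linalg.lstsq(design, y, rcond=None)
# With (0,1) coding, b0 and b1 describe the A = 0 group, while b2 and b3 are
# the intercept and slope DIFFERENCES for the A = 1 group; this is why the
# interpretation of the coefficients changes if the coding of A changes.
print(np.round(b, 2))
```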
Re: normal approx. to binomial
On Tue, 10 Apr 2001, Gary Carson wrote: It's the proportion of successes (x/n) which has approximately a normal distribution for large n, not the number of successes (x). Both are approximately normal. (If the r.v. W = (x/n) is (approximately) normally distributed, then the r.v. V = x = n*W must also be; only with a mean and standard deviation each n times as large as for W.) -- DFB. Donald F. Burrill [EMAIL PROTECTED] 348 Hyde Hall, Plymouth State College, [EMAIL PROTECTED] MSC #29, Plymouth, NH 03264 603-535-2597 184 Nashua Road, Bedford, NH 03110 603-472-3742
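The scaling argument in a few lines of simulation (the binomial parameters here are arbitrary):

```python
import numpy as np

# If W = x/n is approximately normal, then V = x = n*W is too: a linear
# rescaling cannot change the shape, only the mean and sd (each by a factor n).
n, p, reps = 400, 0.3, 100_000
rng = np.random.default_rng(42)
x = rng.binomial(n, p, reps)   # counts
w = x / n                      # proportions
print(x.mean(), n * w.mean())  # identical up to rounding error
print(x.std(),  n * w.std())   # identical up to rounding error
```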
SAT z >= 3 (Was: Re: (no subject))
Everything you need is in what you wrote. You do understand that "z" is the usual shorthand for "a standard score", and that a standard score is the representation of a given raw score as its deviation from the population mean in standard-deviation units? The rest is merely a lookup in a table of the standard normal distribution. (I find it to be somewhat less than 0.15%, though.) -- DFB. On Mon, 2 Apr 2001, Jan Sjogren wrote: SAT scores are approximately normal with mean 500 and a standard deviation of 100. Scores of 800 or higher are reported as 800, so a perfect paper is not required to score 800 on the SAT. What percent of students who take the SAT score 800? The answer to this question shall be: SAT scores of 800+ correspond to z >= 3; this is 0.15%. Please help me understand this. I don't understand how I get that z >= 3??? and that it is 0.15%? Donald F. Burrill [EMAIL PROTECTED] 348 Hyde Hall, Plymouth State College, [EMAIL PROTECTED] MSC #29, Plymouth, NH 03264 603-535-2597 184 Nashua Road, Bedford, NH 03110 603-471-7128
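The arithmetic behind the lookup; the exact tail comes out a bit under the rounded 0.15% in the answer key, which is Burrill's parenthetical point:

```python
from math import erf, sqrt

# z = (score - mean) / sd = (800 - 500) / 100 = 3
z = (800 - 500) / 100
# Upper-tail area of the standard normal, P(Z >= 3):
tail = 0.5 * (1.0 - erf(z / sqrt(2.0)))
print(round(100 * tail, 3))   # about 0.135 (percent), i.e. somewhat less than 0.15%
```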
Re: convergent validity
On Fri, 30 Mar 2001 [EMAIL PROTECTED] wrote: Donald Burrill writes: On Thu, 29 Mar 2001, H.Goudriaan wrote in part: - my questionnaire items are measured on 5- and 7-point Likert scales, and consequently not (bivariate) normally distributed; Real data hardly ever is. Do you need it to be? Usually the question of interest is whether it's close enough to be an adequate approximation for guv'mint work. Ok, I understand and agree. But isn't it a bit naive to think that a group of variables with 5 categories may result in a good factor analysis (or whatever other parametric analyses)? I frankly don't see the relevance of naivete to the question at hand. It isn't, one gathers, as though you had any choice in the matter: either in the number of points on each item scale (since this is all, as you told Dennis, an existing scale) nor in the bivariate distribution of the two constructs in which (one gathers) you are interested. (And you haven't said why you think you want these two constructs to be bivariate normal -- rather than, say, linearly related and unimodal. Nor, for that matter, have you indicated whether you have examined the bivariate distribution in question and actually found it to depart worrisomely from a reasonable distribution.) You also replied to Dennis that you have 16 items, 11 of which are alleged to measure one construct and 5 measure another. That sounds to me like two variables, one with a potential range of 11 to 55 and the other with a potential range of 5 to 25 (for the 5-point scales; where you have 7-point scales the potential range will be somewhat wider). I should think that your interest would then lie in the validity of these two variables, not in the individual items that contribute to them; unless you want to do an item analysis of one kind or another. You write also, "with 5 categories". 
If you insist that the item responses must be treated as _categories_, rather than ordered points on a scale, then you ought, one would think, to be applying the methods of dual scaling (also known as correspondence analysis). Or, if you allow that the responses are ordered, use the variation of dual scaling that applies to ordered categories. (All this for dealing with data at the item level, of course.) You haven't explicitly said (that I recall), but you seem to be unwilling to treat the item responses as of approximately interval scale. Why not? Do you have evidence that the scale intervals are grossly unequal? (That seems to me unlikely.) Or are the distributions of responses for some items so peculiar as to generate serious doubt about the intervals? (If so, you might wish to convert any such item to a series of 0/1 categories -- which brings us back to dual scaling.) -- DFB. Donald F. Burrill [EMAIL PROTECTED] 348 Hyde Hall, Plymouth State College, [EMAIL PROTECTED] MSC #29, Plymouth, NH 03264 603-535-2597 184 Nashua Road, Bedford, NH 03110 603-471-7128 = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: convergent validity
On Thu, 29 Mar 2001, H.Goudriaan wrote in part: - my questionnaire items are measured on 5- and 7-point Likert scales, so they're not measured on an interval level Non sequitur. and consequently not (bivariate) normally distributed; Real data hardly ever is. Do you need it to be? Usually the question of interest is whether it's close enough to be an adequate approximation for guv'mint work. -- DFB. Donald F. Burrill [EMAIL PROTECTED] 348 Hyde Hall, Plymouth State College, [EMAIL PROTECTED] MSC #29, Plymouth, NH 03264 603-535-2597 184 Nashua Road, Bedford, NH 03110 603-471-7128 = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: statistical errors
On Thu, 22 Mar 2001, Paul R Swank wrote: I prefer the ocular test myself. Were you referring to the intraocular traumatic test? (It strikes you between the eyes.) -- Don. Donald F. Burrill [EMAIL PROTECTED] 348 Hyde Hall, Plymouth State College, [EMAIL PROTECTED] MSC #29, Plymouth, NH 03264 603-535-2597 184 Nashua Road, Bedford, NH 03110 603-471-7128 = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: Was: MIT Sexism statistical bunk
On Thu, 15 Mar 2001, dennis roberts wrote in part: ps ... a conclusion that lots of people don't agree with one another will not be too helpful Maybe not, but it sure would be realistic -- which might be reassuring to some of our students who have their own doubts on that score about our discipline. -- Don. -- Donald F. Burrill[EMAIL PROTECTED] 348 Hyde Hall, Plymouth State College, [EMAIL PROTECTED] MSC #29, Plymouth, NH 03264 (603) 535-2597 184 Nashua Road, Bedford, NH 03110 (603) 471-7128 = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: One tailed vs. Two tailed test
On Tue, 13 Mar 2001, Will Hopkins wrote in part: Example: you observe an effect of +5.3 units, one-tailed p = 0.04. Therefore there is a probability of 0.04 that the true value is less than zero. Sorry, that's incorrect. The probability is 0.04 that you would find an effect as large as +5.3 units (or more), if (a) the true value is zero and (b) the sampling distribution of the test statistic is what you think it is. (The probability of finding an effect this large, in this direction, is less than 0.04 if the true value is less than zero (and your sampling distribution is correct).) snip But why test at all? Just show the 95% confidence limits for your effects, and interpret them: "The effect could be as big as [upper confidence limit], which would mean... Or it could be as low as [lower confidence limit], which would represent... Therefore..." Doing it in this way automatically addresses the question of the power of your study, which reviewers are starting to ask about. If your study turns out to be underpowered, you can really impress the reviewers by estimating the sample size you would (probably) need to get a clear-cut effect. I can explain, if anyone is listening... You had in mind, I trust, the _two-sided_ 95% confidence interval! -- Don. -- Donald F. Burrill [EMAIL PROTECTED] 348 Hyde Hall, Plymouth State College, [EMAIL PROTECTED] MSC #29, Plymouth, NH 03264 (603) 535-2597 184 Nashua Road, Bedford, NH 03110 (603) 471-7128
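Burrill's correction can be checked by simulation: with the true effect held at exactly zero, the one-tailed p = 0.04 criterion is met in about 4% of samples. That is what the 0.04 measures; it says nothing about the probability that the true value is negative. (All numbers below are hypothetical.)

```python
import numpy as np

rng = np.random.default_rng(7)
n, sigma, reps = 25, 10.0, 200_000
se = sigma / np.sqrt(n)
z_upper_04 = 1.7506860712521692       # upper 4% point of the standard normal

# Sample means when the TRUE effect is exactly zero:
means = rng.normal(0.0, se, reps)
frac = np.mean(means >= z_upper_04 * se)
print(frac)   # close to 0.04: the chance of data this extreme GIVEN a true zero
```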
Re: Regression with repeated measures
Hi, Rich. The only answer I recall having seen on the listserv was one suggesting multilevel (aka "hierarchical") modelling. If one wanted to address the problem without ML modelling, I'd be inclined to proceed as follows: (1) I assume, in the absence of commentary to the contrary, that the "strong spatial correlations" among the values in the 6x6 grids have much the same structure from grid to grid and from respondent to respondent. (Even if this is an oversimplification, it's a starting point.) (2) Use the 6 rows and the 6 columns of the grid as categorical variables in an ANOVA-like approach; the contents of each cell being, as you write, the dependent variable. You don't mention what variable(s), nor how many of them, you're using as predictor(s); but specify the analysis as an ANCOVA, in a GLM routine if you're using MINITAB, with the predictor(s) as covariates and the rows & columns as ANOVA factors. You'll get a 5-df measure of "row" effect, a 5-df "column" effect, and a 25-df "interaction" effect; if these are large enough to be interesting, you can try your hand at fitting various models to the pattern of results using whatever you think is going on in the "spatial correlations". If the structure of the spatial correlations is not replicated across grids, or at least across SOME grids, this approach may not be fruitful; but it can't hurt to try it, in any case. I'm not sure what to make of the 14 grids. They might represent another (14-level) ANOVA factor, but I can't tell from your description. Hope this is helpful. As you probably recognized, it's essentially the same kind of approach I suggested to Mike Granaas for his repeated measures problem. -- Don. On Wed, 28 Feb 2001, Rich Strauss wrote: I don't have an answer, but I'm very glad this question was asked because I'm having a similar problem. I have 14 grids, values from which are to be used as the dependent variable in a regression. Each 6x6 grid consists of 36 observation points.
There are some fairly strong spatial correlations among the values at each grid, so I certainly can't treat them as if they were independent, yet reducing each grid to a single mean value (the other extreme) seems like a foolish waste of power. I'm trying to figure out how to use all of the observations, but also use the estimated spatial autocorrelations to weight them in the regression. (The design was originally created to answer a very different question, which is how I got into this mess.) -- Donald F. Burrill [EMAIL PROTECTED] 348 Hyde Hall, Plymouth State College, [EMAIL PROTECTED] MSC #29, Plymouth, NH 03264 (603) 535-2597 184 Nashua Road, Bedford, NH 03110 (603) 471-7128
Re: Patenting a statistical innovation
In response to dennis roberts, who wrote in part: i see "inventing" some algorithm as snip not quite in the same genre of developing a process for extracting some enzyme from a substance ... using a particular piece of equipment specially developed for that purpose i hope we don't see a trend IN this direction ... On Wed, 7 Mar 2001, Paige Miller replied: If it so happens that while I am in the employ of a certain company, I invent some new algorithm, then my company has a vested interest in making sure that the algorithm remains its property and that no one else uses it, especially a competitor. Thus, it is advantageous for my employer to patent such inventions. In this view, mathematical inventions are no different than mechanical, chemical or other inventions. Yes. And in another domain of discourse, statistical methods invented by statisticians like Abraham Wald, who worked on military problems during WWII, were military secrets until the war ended. "Official secret" is the governmental/military equivalent of "patent". -- Don. -- Donald F. Burrill[EMAIL PROTECTED] 348 Hyde Hall, Plymouth State College, [EMAIL PROTECTED] MSC #29, Plymouth, NH 03264 (603) 535-2597 184 Nashua Road, Bedford, NH 03110 (603) 471-7128 = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: norm curve template
Dennis also included [EMAIL PROTECTED] among his addressees, but I am not on that list and therefore cannot reply to them... On Tue, 6 Mar 2001, dennis roberts wrote: many eons ago ... 1974 to be precise ... i had this idea of making a small plastic normal and skewed curve template ... that would help students draw both types ... with information about the distributions on the template ... that would help them work with problems by being able to make a nice sketch ... Yes. I had such a template for years, and found it very useful, both for preparing handouts and overheads. I don't know where it is now -- it got lost at some point -- and I don't remember where I came across it in the first place, nor exactly when (about 1980, I would guess). Both curves side by side comprising most of the long edge of a template about 7 inches long, an internal straight line in two segments representing the usual X-axis, with printed marks for mean and +/- 1, 2, 3 s.d.'s for both distributions. if anyone is interested in a historical artifact (relic?) ... have a look at http://roberts.ed.psu.edu/users/droberts/statmat.jpg i still think it WAS a good idea ... just didn't have the right "marketing" team in place Yes, it was. But somebody evidently did. -- Don. -- Donald F. Burrill [EMAIL PROTECTED] 348 Hyde Hall, Plymouth State College, [EMAIL PROTECTED] MSC #29, Plymouth, NH 03264 (603) 535-2597 Department of Mathematics, Boston University [EMAIL PROTECTED] 111 Cummington Street, room 261, Boston, MA 02215 (617) 353-5288 184 Nashua Road, Bedford, NH 03110 (603) 471-7128
Re: power,beta, etc.
In response to Dennis's earlier statement, "that is ... power in many cases is a highly overrated CORRECT decision" I wrote: Well, no. Overrated it may be (that lies, I think, in the eye of the beholder); but a _decision_ it is definitely not. Power is the _probability_ of making a particular decision -- which, of course, like all decisions, may or may not be correct. On Mon, 5 Mar 2001, dennis roberts replied: sorry we don't MAKE this decision ... the only decision we make in this case is to reject the null ... Precisely. We DECIDE to reject the null hypothesis. Why do you say, of this decision, "we don't MAKE this decision" ??? it is only the statisticians who overlay on TOP of this ... the consequence OF that reject decision ... saying that IF the null had been false (of which the S has no clue about) ... THEN the consequence of that reject decision is called power Sorry, Dennis. Power is defined AS A PROBABILITY. In particular, it is the probability of rejecting the [null] hypothesis being tested. Some people prefer to define it as a conditional probability (we've been over this once already), the condition being that the hypothesis being tested is false. But power is not a _consequence_ of a decision, in any sense of "consequence" that I know about. If you know a sense in which "consequence" applies, I'd be interested in the formal definition, and where that definition can be found in a standard reference. As you pointed out in your earlier note, power is sometimes defined as 1 - beta. This definition, when used, can make sense only if power is understood as a probability, since beta is a probability. (At least, I understand beta to be a probability. Are you going to argue the contrary?) this is one reason i raised this issue ... because, we only make 2 possible decisions with respect to our investigation ... we retain ... we reject ... we DON'T determine the consequence of that decision ... so, in this sense ... 
saying that there is a consequence associated with a particular act ... retaining or rejecting ... "power is the probability of MAKING (emphasis added from don's comment) ... a particular decision ... " ... sounds like WE did this ... when we did NOT DO this Dennis: the decision in question was to reject the null hypothesis. If "we did NOT DO this" (your emphasis), who, pray tell, did?? And if the decision you're trying to talk about is not "to reject", what IS that decision that, you claim, "we did NOT DO"? all we did was to reject the null Yes. Precisely. That was a decision, and we made it. And the probability of our making that decision, like all probabilities, depends on the state of nature -- in particular, on the value of the parameter in question -- when we made it. That probability is called "power". i still think there would be value ... in: 1. making it clear that the S only makes decisions of the retain kind ... and reject kind ... that's it! But is this somehow not clear from the beginning? Who alleges that any other kind of decision is made? If one is testing an hypothesis, the result of the test is a decision to reject or to retain (or, in the classical mode, to fail to reject). ... (I do hope that your "S" is not one of the experimental subjects in the study whose data are being analyzed (although that's what "S" usually refers to). Those decisions are made by the investigator (or analyst), not by any participating Subject.) -- Don. -- Donald F. Burrill[EMAIL PROTECTED] 348 Hyde Hall, Plymouth State College, [EMAIL PROTECTED] MSC #29, Plymouth, NH 03264 (603) 535-2597 184 Nashua Road, Bedford, NH 03110 (603) 471-7128 = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: power,beta, etc.
On Sun, 4 Mar 2001, dennis roberts wrote in part: i know that sometimes power is "defined" as 1 - beta ... but, beta could therefore (algebraically and logically) be defined as 1 - power Only for the conditional definition of power; I would wish to add the conditional clause "when the null hypothesis is false". so, these are circular in a way Yes, of course: in the same sense that a glass may be said to be one-third full or two-thirds empty. that is ... power in many cases is a highly overrated CORRECT decision Well, no. Overrated it may be (that lies, I think, in the eye of the beholder); but a _decision_ it is definitely not. Power is the _probability_ of making a particular decision -- which, of course, like all decisions, may or may not be correct. -- Don. -- Donald F. Burrill[EMAIL PROTECTED] 348 Hyde Hall, Plymouth State College, [EMAIL PROTECTED] MSC #29, Plymouth, NH 03264 (603) 535-2597 Department of Mathematics, Boston University[EMAIL PROTECTED] 111 Cummington Street, room 261, Boston, MA 02215 (617) 353-5288 184 Nashua Road, Bedford, NH 03110 (603) 471-7128 = Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
Re: Fisher's z-transformation
On Sat, 3 Mar 2001, Arenson, Ethan wrote: Would someone please remind me of the formula for Fisher's z-transformation of correlation coefficients? Z = 0.5 log[(1 + r)/(1 - r)] (using the natural logarithm). Its standard error is 1/sqrt(n - 3) ("sqrt" = "square root of"). To convert back: r = (exp(2Z) - 1)/(exp(2Z) + 1) ("exp(2Z)" is the natural antilogarithm of 2Z, aka e to the power 2Z). Equivalently, Z = inverse tanh(r) and r = tanh(Z) ("tanh" = hyperbolic tangent). -- Don. -- Donald F. Burrill [EMAIL PROTECTED] 348 Hyde Hall, Plymouth State College, [EMAIL PROTECTED] MSC #29, Plymouth, NH 03264 (603) 535-2597 Department of Mathematics, Boston University [EMAIL PROTECTED] 111 Cummington Street, room 261, Boston, MA 02215 (617) 353-5288 184 Nashua Road, Bedford, NH 03110 (603) 471-7128
Re: Trend analysis question
On Sun, 4 Mar 2001, Philip Cozzolino wrote in part: However, after the cubic non-significant finding, the 4th and 5th order trends are significant. Intuitively, it seems that if there is no cubic trend of significance, there will not be any higher order trend, but this is relatively new to me. Your intuition is, in this case, incorrect. The five trends are mutually independent in the sense that any combination of them may be operating. (I am for the moment accepting the implied premise that a power function of the IV is a reasonable function to try to fit to your data. In most instances I know of, this is not "really" the case, and the power function is more usefully thought of as an approximation to whatever the "real" functionality is.) This may be seen by considering the following relationships between Y and X (think of them as DV and IV if you wish):

[Sketch I: an ASCII plot of Y against X showing a purely quadratic (parabolic) pattern.]

[Sketch II: an ASCII plot of Y against X showing a purely cubic (S-shaped) pattern.]

In I. above, the linear trend is approximately zero, and the quadratic component of X accounts for nearly all the variation in Y. A "rule" that claimed "If the linear trend is insignificant there can be no significant quadratic trend" is clearly false in this case. In II. above, both the linear and quadratic components of trend are virtually zero -- certainly insignificant -- and the cubic component accounts for nearly all the variation in Y. Similar situations can be imagined, where only the quartic, or only the quintic, or only the linear, quadratic, and quartic, or any other arbitrary combination of the basic trends are significant, and other components are not.
If you are carrying out your trend analysis by using orthogonal polynomials (as you probably should be), try constructing the model derived from your linear + quadratic fit only, and plot those as predicted values against X; then construct the model derived from linear + quadratic + quartic + quintic, and plot those predicted values against X. You may find it illuminating also to plot the residuals in each case against X, especially if you force the same vertical scale on the two sets of residuals. I note in passing that you haven't stated how much of the variance of Y is accounted for by each of the significant components, nor how much residual variance there is after each component is entered. That also might be illuminating. -- DFB.
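The point that trend components are mutually independent can be checked numerically. A sketch in Python (my own construction, not from the thread): orthogonal polynomial contrasts are obtained from a QR decomposition of the Vandermonde matrix, and a purely quadratic pattern of means then shows zero cubic and higher components while the linear + quadratic fit reproduces the means exactly.

```python
import numpy as np

def orthogonal_poly_contrasts(k):
    """Orthonormal polynomial contrast vectors for k equally spaced levels,
    via QR decomposition of the Vandermonde matrix (columns 1, x, x^2, ...)."""
    x = np.arange(1, k + 1, dtype=float)
    V = np.vander(x, k, increasing=True)
    Q, _ = np.linalg.qr(V)
    return Q[:, 1:]          # drop the constant column; cols are linear, quadratic, ...

k = 6
C = orthogonal_poly_contrasts(k)
means = np.arange(1, k + 1, dtype=float) ** 2   # a purely quadratic pattern: 1, 4, ..., 36

proj = C.T @ means           # trend components: linear, quadratic, cubic, quartic, quintic
fitted = means.mean() + C[:, :2] @ proj[:2]     # linear + quadratic model only
```

Here `proj[2:]` (cubic and above) is numerically zero even though the linear and quadratic components are both large, and `fitted` matches `means` exactly.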
Re: power,beta, etc.
On Sat, 3 Mar 2001, dennis roberts wrote: when we discuss things like power, beta, type I error, etc. ... we often show a 2 by 2 table ... similar to

            null true        null false
  retain    correct          type II, beta
  reject    type I, alpha    power

Similar, but not the same. I usually present a table of "states of affairs", without probabilities; see the table below:

            null true        null false
  retain    correct          error: Type II
  reject    error: Type I    correct

(And usually with the rows interchanged, so that "Type I error" LOOKS like the first kind of error one encounters.) It seems to me that to include the probabilities in the same 2x2 table as the "states of affairs" would be actively to invite rampant (or at least, and more alliteratively, couchant) confusion of the concepts. I have another problem with writing "power" in the lower right cell, apart from the fact that it's a probability and not a state of affairs. I'm aware that many people think of power as a conditional probability (of rejecting the null when it's false); but I came to understand it as an UNconditional probability (of rejecting the null, period). This definition permits drawing power curves that include the parameter value specified by the null hypothesis: the power at that point (or, in that case) is alpha. For a symmetric two-sided alternative, this is also the minimum value of power. Since the value of power approaches alpha as the parameter value approaches the value specified in the null hypothesis, it seems a little silly to omit that one point from the continuous curve. i think that we need a bit of overhaul to this typical way of doing things ... 1. each cell needs to have a name ... label ... that reflects the consequence of the decision (retain, reject) that was made i propose something along the lines of

            null true             null false
  retain    type I correct, 1C    type II error, 2E
  reject    type I error, 1E      type II correct, 2C

I've long been persuaded of the need to distinguish between the two different kinds of errors. 
That there are two distinct kinds is not at all obvious, evidently; some folks seem never to master the distinction. But I am not convinced that we need to distinguish between two kinds of correct decision. After all, the decisions themselves are different: to reject, or to retain (though some folks prefer "accept" to "retain"). Knowing the decision, and that it is (at least hypothetically) correct, is surely all one needs to know. "Correct rejection" or "correct retention" (or "acceptance") of the hypothesis being tested seems to me easier to handle and apprehend than "a Type I correct decision" or "a Type II correct decision". then, we have names or symbols for probabilities attached to each cell

            null true                      null false
  retain    WHAT NAME/SYMBOL FOR THIS??    beta
  reject    alpha                          power

If you want to construct such a table, I'd recommend including the marginal row, showing the column totals to be 1 (or, if one prefers, 100%). That helps to emphasize the conditional nature of the probabilities being displayed: conditional on the state of nature, not on the decision. And consistent with my understanding of power, I'd present such a table thus:

                         State of nature
                       null true    null false
  P{retain}            1 - alpha    beta
  P{reject} (= power)  alpha        1 - beta
                       ---------    ---------
  Total                1            1

Sometime along about now one really ought to point out that a 2x2 table like this is grossly oversimplified. Beta (and therefore power) cannot be evaluated for "null false". It can be evaluated only for a specified particular value of the parameter that is different from the value specified in the null hypothesis. And, ceteris paribus, the farther that parameter value is from the null-hypothetical value, the smaller is beta (and the larger is power). This leads more or less directly to the idea of a power curve, and then to the variations in such a curve as a function of alpha and sample size. DOES ANYONE HAVE SOME SUGGESTION AS TO HOW THE UPPER LEFT CELL MIGHT BE REFERRED TO via A SYMBOL??? OR, SOME NAME THAT IS DIFFERENT FROM POWER BUT ... 
STILL GIVES THE FLAVOR THAT A CORRECT DECISION HAS BEEN MADE (better than making an error)? Do you have a reasoned objection to "1 - alpha"? In other contexts we routinely use, e.g., "1 - Rsq" for the proportion of variance unexplained by the model being considered. The "1 minus" construction shows the logical and arithmetical connection between two quantities, which can easily get lost if one uses very different-looking terms for those quantities. 2. i think it would be helpful to first identify each cell with a
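Power in the unconditional sense described above can be computed directly; the power curve then passes through alpha at the null-hypothetical parameter value. A sketch in Python for a two-sided z test of a mean (the function and its defaults are mine, for illustration):

```python
from statistics import NormalDist

def power_two_sided_z(delta, n, sigma=1.0, alpha=0.05):
    """P{reject H0: mu = 0} for a two-sided z test, at true mean shift delta.
    Defined unconditionally: at delta = 0 this equals alpha."""
    nd = NormalDist()
    z_crit = nd.inv_cdf(1 - alpha / 2)
    shift = delta / (sigma / n ** 0.5)       # true mean in standard-error units
    # reject if Z < -z_crit or Z > +z_crit; sum the two tail probabilities
    return nd.cdf(-z_crit - shift) + 1 - nd.cdf(z_crit - shift)

p0 = power_two_sided_z(0.0, 25)   # the null point on the power curve: alpha
p1 = power_two_sided_z(0.2, 25)
p2 = power_two_sided_z(0.5, 25)
```

Here `p0` is exactly alpha (0.05), and power rises monotonically as the true parameter moves away from the null value, exactly as described in the text.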
Re: Post-hoc comparisons
Hi, Esa! You've had a couple of responses; here's another. You state "pairwise comparisons"; but it strikes me as at least possible that you might want (or might _also_ want) to consider more complex comparisons if any such comparisons seemed to offer a more parsimonious (or, perhaps, more theory-related?) explanation of the differences among the four conditions. (E.g., conditions A B vs. conditions C D; or, condition B vs. conditions A C D; or, condition A vs. conditions B D and condition C vs. conditions B D.) I would ordinarily think of using the Scheffe' method (or the Tukey method, if the sample sizes were equal in each condition and one's interest really were _only_ in pairwise comparisons): its experimentwise Type I error rate means no need for Bonferroni or similar calculations; just convert your binary response to a proportion passed (or proportion failed, if that be easier to interpret) and do a one-way ANOVA on that proportion in the four treatments. -- Don. On Fri, 2 Mar 2001, Esa M. Rantanen wrote: I have a question concerning pairwise comparisons between four treatment conditions. snip I have a single factor experiment with four levels of the factor (treatment conditions) and a discrete dependent measure (pass/fail), resulting in a 2 x 4 contingency table. ... Chi-Sq. analysis [has found] a statistically significant difference between the (treatment) groups (all 4!). snip I would appreciate [it] if anyone would confirm my reasoning above and offer any advice on how to proceed with the analysis of pairwise differences in the case of categorical (dichotomous) data. References to relevant literature would also be welcome!
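The suggestion above -- code pass/fail as 0/1 and test contrasts with Scheffe's experimentwise criterion -- can be sketched as follows. This is my own illustration on made-up data (scipy assumed for the F quantile); the function name and the data are hypothetical, not from the thread:

```python
import numpy as np
from scipy.stats import f as f_dist

def scheffe_contrast_test(groups, weights, alpha=0.05):
    """Scheffe test of one contrast among group means (here, 0/1 pass rates).
    groups: list of 0/1 arrays; weights: contrast coefficients summing to 0."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    means = [np.mean(g) for g in groups]
    # pooled within-group mean square (the one-way ANOVA error term)
    sse = sum(((np.asarray(g) - m) ** 2).sum() for g, m in zip(groups, means))
    mse = sse / (n - k)
    est = sum(w * m for w, m in zip(weights, means))
    var_est = mse * sum(w ** 2 / len(g) for w, g in zip(weights, groups))
    F = est ** 2 / ((k - 1) * var_est)           # Scheffe's F for the contrast
    crit = f_dist.ppf(1 - alpha, k - 1, n - k)   # same critical F for ALL contrasts
    return F, crit, F > crit

# hypothetical pass/fail data for four conditions, 10 Ss each
groups = [np.array([1] * 2 + [0] * 8),   # A: 2/10 pass
          np.array([1] * 5 + [0] * 5),   # B: 5/10
          np.array([1] * 5 + [0] * 5),   # C: 5/10
          np.array([1] * 9 + [0] * 1)]   # D: 9/10
F, crit, sig = scheffe_contrast_test(groups, [1, 0, 0, -1])   # pairwise: A vs D
```

Because every contrast (pairwise or complex) is referred to the same critical value, no Bonferroni-style adjustment is needed, which is the point made in the reply.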
Re: Regression with repeated measures
On Wed, 28 Feb 2001, Mike Granaas wrote in part (and 2 paragraphs of descriptive prose quoted at the end): ... is there some method that will allow him to get the prediction equation he wants? Probably the best approach is the multilevel (aka hierarchical) modelling advocated by previous respondents. Possible problems with that approach: (1) you'll need purpose-built software, which may not be conveniently available at USD; (2) the user is usually required (as I rather vaguely recall from a brush with Goldstein's ML3 a decade ago) to specify which (co)variances are to be estimated in the model, both within and between levels, and if your student isn't up to this degree of technical skill, (s)he may not have a clue as to what the output will be trying to say. For a conceptually simpler, if less rigorous, approach, the problem could be addressed as an analysis of covariance (to use the now old-fashioned language), using the intended predictor as the covariate and the 10 (or whatever number of) trials for each S as a blocking variable (as in randomized blocks in ANOVA). This would at least bleed off (so to write) some of the excess number of degrees of freedom; especially if one also modelled interaction between predictor and blocking variable (which might well require a GLM program, rather than an ANCOVA program), as in testing homogeneity of regression. The blocking variable itself might be interpretable (if one were interested) as an (idiosyncratic?) amalgam of practice/learning and fatigue. -- Don. -- The situation as Mike described it: I have a student coming in later to talk about a regression problem. 
Based on what he's told me so far he is going to be using inter-response intervals to predict inter-stimulus intervals (or vice versa). What bothers me is that he will be collecting data from multiple trials for each subject and then treating the trials as independent replicates. That is, assuming 10 trials/S and 10 S he will act as if he has 100 independent data points for calculating a bivariate regression.
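The simpler covariance-analysis route suggested above -- regress on the predictor with trial as a blocking factor -- can be sketched with an ordinary least-squares fit. This is my own illustration on simulated data (10 Ss x 10 trials, with a built-in trial-to-trial drift standing in for practice/fatigue); the numbers are made up:

```python
import numpy as np

rng = np.random.default_rng(42)
n_subj, n_trial = 10, 10
trial = np.tile(np.arange(n_trial), n_subj)                # trial index for each observation
x = rng.normal(size=n_subj * n_trial)                      # predictor (e.g. inter-response interval)
trial_effect = 0.3 * trial                                 # practice/fatigue drift across trials
y = 2.0 + 1.5 * x + trial_effect + rng.normal(0.2, 0.5, size=x.size)

# Design matrix: intercept, predictor, and trial dummies (the blocking factor).
# This "bleeds off" trial-to-trial variation before estimating the slope.
D = np.column_stack([np.ones_like(x), x] +
                    [(trial == t).astype(float) for t in range(1, n_trial)])
beta, *_ = np.linalg.lstsq(D, y, rcond=None)
slope = beta[1]    # slope of y on x, adjusted for the blocking variable
```

Note the hedge in the original reply still applies: this adjusts for trial effects but does not model the within-subject correlation; the multilevel approach handles that properly (subject dummies could also be added to the design matrix here).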
Re: basic stats question
Perhaps this is too superficial -- no time to think more deeply just now. But I suspect the difference between your two scenarios below is that with exactly 5 computers to deal with (i.e., population size = 5) you are sampling without replacement (which is only sensible, for the background scenario!); whereas with the textbook problem you are assuming that the probabilities do not change (and in any case they aren't the probabilities that correspond to your N=5 situation!), which is equivalent to sampling _with_ replacement (or, what is much the same thing, assuming the number of entities available to sample from is infinite -- which is probably _not_ sensible for any real-life scenario!). -- Don. On Mon, 26 Feb 2001, James Ankeny wrote in part: ... consider a problem where a manufacturer has five seemingly identical computers, though two are really defective and three are good. ... we want the probability of the event A="order is filled with two good computers." ... then S={D1D2,D1G1,D1G2,D1G3,D2G1,D2G2,D2G3,G1G2,G1G3,G2G3}. Thus, P(A)= 0.30. snip ... Yet, another similar problem in my textbook states that the probabilities of a computer being good and defective (from a particular manufacturer) are 0.90 and 0.10, respectively. Then, if we want to test five computers, we may construct the sample space S=S1xS2xS3xS4xS5, where Si={G,D} for i=1,...,5. Hence, if A="all five computers tested are good," P(A)=(0.90)^5. Why is it that we can use the Cartesian product in this case but not in the other case?
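The two calculations contrasted above can be written out directly: the first is a hypergeometric (without-replacement) count, the second a product of independent per-unit probabilities. A minimal Python sketch:

```python
from math import comb

# Without replacement: 5 computers, 2 defective, draw 2.
# Ways to pick 2 good out of 3, over ways to pick any 2 of 5.
p_two_good = comb(3, 2) / comb(5, 2)      # = 3/10 = 0.30, as in the first problem

# Fixed per-unit probability (equivalent to sampling WITH replacement,
# or from an effectively infinite population): independent trials.
p_all_five_good = 0.90 ** 5               # = 0.59049, as in the textbook problem
```

The Cartesian-product sample space is legitimate in the second case precisely because the trials are independent with unchanging probabilities; in the first case each draw changes the composition of the remaining pool.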
Re: pizza
On Sat, 24 Feb 2001, Mike Granaas wrote: Interesting point. Yes, if the Ss do something other than a random guess the binomial model would be violated. The question then becomes what would they do if they are uncertain? I suspect that they would fall back on visual inspection...which piece appears to be different than the others (less green pepper, more browned, etc) Such information is probably relevant often enough that "guessing" would be well above 1/3. So what you would then have is evidence that Ss can in fact do better than "chance", but you might NOT know whether that improvement is due to their actually being able to perform as claimed, or to some other factor(s) relevant to identifying the "odd pizza out": a human-cum-pizza version of "Clever Hans", perhaps? Using blindfolded Ss will deal with that problem, and gets us back to the question that Dennis is asking. I'm guessing that rather than going through some sort of a systematic process (e.g. binary decision for the first piece, progress to second piece only if first piece was judged "same".) Umm: Logical problems here. (1) How can _first_ piece be judged "same"? Same as what? (2) Why would Ss not taste all three pizzas, given the ground rules Dennis specified (or implied) at the outset? ... Ss will in fact do something more like guessing. Only they will condition their guesses such that if they picked slice A as different on the previous trial they will first consider slices B and C on the current trial (they will actually avoid selecting the same slice position on sequential trials). How did "sequential trials" get into the scenario? As I read Dennis' description, each S was to taste the three pizzas presented (perhaps tasting each more than once, but not attacking a whole 'nother SET of pizzas). Furthermore they will try to equalize the number of position choices they make across the experiment so that they choose each of A, B, and C three times and one of those a fourth time. 
This sounds as though you thought each S were going to have ten separate trials at identifying the "odd pizza out", with a different set of three pizzas each time. I don't see what else "choosing each of A, B, and C three times and one of those a fourth time" could mean; but if I've misunderstood, doubtless your reply will explain. However interesting such an experiment might be, it's not the experiment that I thought Dennis described. snip, the rest -- Don.
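Under the binomial model the thread keeps returning to (each blindfolded taster picks the odd pizza with probability 1/3 when guessing), one can compute how many correct picks out of, say, 10 tasters would be surprising. A sketch of that calculation (the setup of 10 tasters is my illustration, not a detail from the thread):

```python
from math import comb

def binom_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, j) * p ** j * (1 - p) ** (n - j) for j in range(k, n + 1))

# 10 tasters, one "odd pizza out" judgment each, guessing rate 1/3:
# find the smallest number of correct picks with tail probability <= 0.05.
for k in range(11):
    if binom_tail(10, k, 1 / 3) <= 0.05:
        break
# k correct picks (or more) would be evidence of better-than-chance performance
```

As the exchange notes, rejecting "guessing at 1/3" only shows the tasters beat chance; it does not by itself say whether taste discrimination or some visual cue is responsible.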
Re: intermediate stats textbook?
A quick reply. Looks somewhat like the second course ("Intermediate Statistics and Research Design") I taught for some years at OISE, Toronto, which was (and is) the Graduate Department of Education for the University of Toronto. Ask for more later if you want... On Tue, 20 Feb 2001, Lise DeShea wrote: I am looking for a textbook to use in a second-semester stats course in a College of Education. ... Material covered in the class includes: -- Review of kinds of research; kinds of variables; hypothesis testing; errors; one- and two-sample stats; and simple regression -- one-way and two-way fixed-effects ANOVA -- Multiple comparison procedures (usually I provide supplemental material instead of relying on the text) -- Intro to power -- simple repeated measures design -- split-plot design -- Intro to multiple regression I'd summarize this as ANOVA, multiple comparisons, and multiple regression. I could usually find a textbook suitable for ANOVA (more on that below), which I could supplement with an intro to MR I'd compiled that was rather heavily based on Bottenberg & Ward; or a textbook good for MR, which I could supplement with the "Rules of Thumb for Writing the ANOVA Table" (originally Millman & Glass, J. Educ. Meas. 1967, and reproduced in Glass & Stanley 1971 and Glass & Hopkins 1984: "Stat'l Methods in Educ. & Psychology"). Textbooks for ANOVA: Glass & Stanley (later Glass & Hopkins, 2nd ed.), esp. when I'd used it for the first course ("Elements of Statistics"). Keppel: Design & Analysis: A Researcher's Handbook. Textbooks for MR: Darlington: Regression & Linear Models. Judd & McClelland: Data Analysis/A Model-Comparison Approach. Pedhazur: Multiple Regression in Behavioral Research. Never did find a text that combined BOTH multiple regression & ANOVA with enough depth for my purposes. (Good luck!) 
I am looking for a book that is conceptual so that my generally math-phobic grad students in Education don't freak out with "symbol-shock," yet I want careful coverage of assumptions and robustness. Your "yet" suggests you're aware of the inherent logical contradiction in that sentence :-). In my view it is not useful to pander to math-phobia; one must deal with it, but in a strategy that helps the feckless student to cope with the phobia and, just maybe, eventually overcome it. One of my strategies was to emphasize at the outset that this is NOT a course in mathematics (nor, really, a course in statistics, though I didn't usually say THAT), but a course in several foreign languages (algebra, computing, statistics, research come to mind) and in a quite foreign (or at least unaccustomed) style of thinking. (Just for one example: for students like yours, it's virtually certain that no one has ever told them that algebra is mostly about using pronouns instead of proper nouns for talking about numbers; so that treating the particular forms of algebra used in statistics as a kind of grammarian's approach to quantitation is wholly new, and might get their minds off the phobic stuff for a minute or two. Thus "X" is a pronoun for "a particular value of a variable", where "variable" itself is a pronoun for "performance on the English test" (or whatever!); the subscript "i" hung onto the "X" is a pronoun for "the individual who supplied that particular datum"; so "X_i" is "that particular datum whose value is 17.5 when the individual referred to is no. 4, who is Mary Smith". Similarly with symbols for operations: you want to add up X_1 and X_2 and ... and X_n to get the Sum of all the X_i, or perhaps more simply "Sum(X_i)", but "Sum" is a long word (3 letters!), so we substitute its initial "S", but spell it in Greek (Sigma producing the same sound in Greek as S does in English) in our ongoing campaign to help convince the uninitiated that, really, it's all Greek to us...) 
Incidentally, on ANOVA: I've never been convinced that all that agricultural terminology was much help to anyone except agronomists and the like, so I _never_ used terms like "split plot". By starting off with a generalizable symbology, one can focus on the _ideas_ of the details, and the carrying out of the details, without having to know rather artificial labels just to be able to look up the relevant design. Using AxB for two crossed factors, C(D) for a factor C nested in another factor D, superscripts "r" and "f" for "random" and "fixed" (or just an asterisk * for "random", if you prefer), and subscripts for the number of levels of a factor, one can represent any complete balanced design in a single formula like R*(S*(CxG)xM) for "Replications" (what someone has called "the ubiquitous nested factor", which in a design like this one might only have one level, so it has zero SS and zero df, but its presence helps one see what the proper denominator mean squares would be if one had them available) nested within "Subjects", which
Re: citations & journals (satire)
I note that in the literature cited, the word "nauseam" (in the Latin phrase "ad nauseam") is misspelled both times it appears. -- DFB. On Sat, 17 Feb 2001, Jeff Rasmussen wrote: a spoof on the glut of journals: http://psychology.iupui.edu/skew/milestn.htm "Writing a scientific paper and expecting an effect is like dropping a lotus petal into the Grand Canyon and waiting to hear an echo"
Re: Two sided test with the chi-square distribution?
On Thu, 8 Feb 2001, jim clark wrote in part: We all agree that it is confusing, but I do believe that the use of one-tailed and two-tailed to refer to directional vs. non-directional hypotheses (rather than uniquely to one or two tails of a distribution) is very wide-spread and quite common. There would not be a problem if the hypotheses in question were STATED. It's this sloppy habit of saying "F test" or "chi square test", with no hint of WHICH "F test" or "chi square test" one is talking about, that impedes communication. That is probably what led to the posting that initiated this thread. Yes. "I thought the chi-square test was always two-sided", or words to that effect, the querent wrote. He, she, or they have not, in all the correspondence since, said what the hypothesis being tested was. I had written: It is still possible to use the F _statistic_ to test the null hypothesis that Var1 = Var2, in circumstances where it is entirely possible that Var1 < Var2, Var1 = Var2, or Var1 > Var2. In such cases _both_ tails of the F distribution are of interest, not just the upper tail. --- and Jim replied: Right, but if one calculates F_larger/F_smaller, then one is only looking at the upper tail of the F distribution even though one is doing a non-directional test (i.e., two-tailed in the vernacular). The appropriate critical value for a non-directional test would be F_.05. Whoops! Not if you want to test at the usual 5% level! For a non-directional test of the null hypothesis that two variances are equal, the critical value would be F_(alpha/2). If you made a directional hypothesis and predicted which variance was going to be larger (as implied in F's use for anova and regression), then you would compare the obtained value of F to F_.10, not F_.05. I'll agree with you if you halve those subscripts! (Or acknowledge that you wanted to test at the 10% level...) You state that using F in ANOVA and regression implies that one had a _prediction_ of which variance would be larger. 
This is not how I understand the idea of "predicting", which I take to imply that one could have predicted something in the opposite direction. In ANOVA the null hypothesis _of interest_ is commonly expressed as "all the means are equal" (in some language or other), vs. "some of the means differ", and the alternative hypothesis is indeed non-directional -- in the metric of the subgroup means. But the hypothesis actually _tested_ (using F) is the null hypothesis that a particular variance component is zero, vs. the alternative that it isn't, and since a variance component cannot be negative, the alternative really is that the variance component in question is positive: thus in the metric of variances the alternative hypothesis is one-sided. This is a matter of algebra, not of "predicting" the direction of an effect. However, perhaps others are more willing to use "predict" in this rather sloppy (from my point of view ;-) fashion. You are using the upper tail (i.e., one-tail) of the distribution to test a directional (i.e., "one-tailed") hypothesis. Yes. Because a result in the _lower_ tail would tend to confirm the null hypothesis, not reject it. Like Don, I hope that language can become clearer on these issues, but my suspicion is that it will be a long, long time before one- vs. two-tailed stops meaning directional vs. non-directional alternative hypotheses for most people. I have no problem with that. I just wish that people would say what they're talking about: if it's a hypothesis test that is of concern, what is the hypothesis and what is the test statistic, for example. To say only "chi-square test" or "F test" or "z test" is simply insufficient. -- Don.
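The non-directional variance-ratio test discussed above -- where both tails of the F distribution matter and each gets alpha/2 -- can be sketched as follows (my own illustration; scipy assumed for the F quantiles):

```python
import numpy as np
from scipy.stats import f as f_dist

def variance_ratio_test(x, y, alpha=0.05):
    """Non-directional F test of H0: Var(x) = Var(y).
    Because either variance could be the larger, BOTH tails are rejection
    regions, so each critical value uses alpha/2 -- the point made above."""
    n1, n2 = len(x), len(y)
    F = np.var(x, ddof=1) / np.var(y, ddof=1)
    lo = f_dist.ppf(alpha / 2, n1 - 1, n2 - 1)       # lower-tail critical value
    hi = f_dist.ppf(1 - alpha / 2, n1 - 1, n2 - 1)   # upper-tail critical value
    return F, (lo, hi), (F < lo or F > hi)

# toy data: y has 1/100 the variance of x, so F = 100 with df (9, 9)
F, (lo, hi), reject = variance_ratio_test(np.arange(10.0), np.arange(10.0) * 0.1)
```

Note the contrast with the ANOVA use of F, where only the upper tail rejects: there the tested alternative (a positive variance component) is one-sided by algebra, so the full alpha sits in one tail.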
Re: ANOVA : Repeated Measures?
If for each Subject you have 4 Measures in each of the 3 Conditions, then both Conditions and Measures are repeated-measures factors: your design may be symbolized as S x C x M -- that is, Subjects (5 levels) are crossed with both Conditions and Measures. This design is equivalent to R(SxCxM) where R (Replications) is "the ubiquitous nested factor" as one author has put it, random with one level. (And since it only has one level, it has zero degrees of freedom and zero sum of squares; but using it formally often helps one to see what the proper error mean square would be for each effect modelled in the design, even if no such mean square is actually available in the data.) Your choices then are, for each of the three factors, whether to treat it as fixed or random. Conditions are presumably fixed -- they usually are, because they usually represent all the conditions one is interested in considering. (I can imagine wanting to treat them as a random sample of 3 drawn randomly from a population of possible experimental conditions, but that seems to me very unlikely.) Measures might go either way. If what they represent is a series of opportunities to observe the subjects' response to each condition, one might treat the factor as fixed, the levels representing the sequence (1st, 2nd, 3rd, 4th) in which the opportunities are presented. This would permit examining differences among the 4 levels as possibly reflecting learning (one becomes a little more skilled each time one is asked to respond to a condition, perhaps?), or fatigue (after one has done it once, the action starts to become boring or otherwise wearisome), or a kind of resultant between learning and fatigue. Or, if you really think it reasonable to model each encounter as equivalent to each other encounter (in the same Condition), and the only variation among levels of Measure is random replication variance, Measure might be treated as random. 
Subjects are usually treated as random, because one usually wants to generalize to a population of subjects "like these", and one may even have selected the Ss randomly from a pool of potential Ss for the experiment. But you haven't very many Subjects, and perhaps you want to model individual differences between them of some kind or other; or, for some as yet unspecified reason, you are interested only in these particular Ss and not in a population of Ss which they might be argued to represent; in either of which cases you may wish to treat Ss as fixed. Of course, to carry out _any_ tests of hypotheses, at least one of the three factors must be declared random, or you will have no legitimate error mean square against which to test the hypothesis mean square for any of the possible effects. In terms of your three possibilities: (a) has C and S fixed, M random; (b) has C and M fixed, S random (although I don't think it correct to describe S as a "repeated-measure" factor: in my lexicon, a "repeated measure" factor is any factor in a design that is _crossed with_ S); (c) has C fixed, S and M random. It may be informative to carry out more than one formal analysis, using different fixed/random choices. This would tell you what results are robust with respect to those choices, and what results depend on how you choose to treat one or another of the formal factors. In case it's useful, here is a table of the proper error mean squares for each effect:

              Error mean square under
  Source    (a)      (b)      (c)
  C         CM       CS       (CS + CM - CSM)
  S         SM       --       SM
  M         --       SM       SM
  CS        CSM      --       CSM
  CM        --       CSM      CSM
  SM        --       --       CSM
  CSM       --       --       --

(Where the entry is "--", the proper error mean square would be R(SCM), if it were available. In its absence, one could use the mean square for CSM, making the assumption that there is no 3-way interaction -- that may or may not be a reasonable assumption to make.) -- DFB. On Fri, 9 Feb 2001, Sylvain Clément wrote: We have data from an experiment in psychology of hearing. 
There are 3 experimental conditions (factor C). We have collected data from 5 subjects (factor S). For each subject we get 4 measures of performance (M for Measure factor) in each condition. What is the best way to analyse these data? We've seen these possibilities : a) ANOVA with repeated measures with 2 fixed factors : subjects & conditions, and the different measures as the repeated measure factor (random factor). b) ANOVA with two fixed factors (condition & measure) and a random factor (repeated measure - subject factor). c) ANOVA with one fixed factor (condition) and the other two as random. snip, arguments in favor of one or another of these
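The S x C x M decomposition described above can be computed by hand; with one observation per (Subject, Condition, Measure) cell, the CSM term doubles as the residual. A numpy sketch on made-up data (the data, seed, and the choice of error term are illustrative; the F shown is for choice (b), where F for C uses MS_CS as its error, per the table):

```python
import numpy as np

rng = np.random.default_rng(1)
S, C, M = 5, 3, 4
y = rng.normal(size=(S, C, M))   # one observation per (subject, condition, measure)
g = y.mean()

def ss_main(keep):
    """Main-effect sum of squares for the factor on axis `keep`."""
    means = y.mean(axis=tuple(i for i in range(3) if i != keep))
    return (y.size // means.size) * ((means - g) ** 2).sum()

def ss_two_way(a, b):
    """Two-way interaction SS for factors on axes a < b."""
    cell = y.mean(axis=tuple(i for i in range(3) if i not in (a, b)))
    ma = y.mean(axis=tuple(i for i in range(3) if i != a))
    mb = y.mean(axis=tuple(i for i in range(3) if i != b))
    dev = cell - ma[:, None] - mb[None, :] + g   # cell deviations net of main effects
    return (y.size // cell.size) * (dev ** 2).sum()

ss_S, ss_C, ss_M = ss_main(0), ss_main(1), ss_main(2)
ss_SC, ss_SM, ss_CM = ss_two_way(0, 1), ss_two_way(0, 2), ss_two_way(1, 2)
ss_total = ((y - g) ** 2).sum()
ss_SCM = ss_total - (ss_S + ss_C + ss_M + ss_SC + ss_SM + ss_CM)

# Choice (b): S random, C and M fixed; the table says F for C uses CS as error.
ms_C, ms_SC = ss_C / (C - 1), ss_SC / ((S - 1) * (C - 1))
F_C = ms_C / ms_SC
```

Because the balanced decomposition is orthogonal, the component sums of squares add exactly to the total, which is the check the test below performs.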
Re: Two sided test with the chi-square distribution?
On Tue, 6 Feb 2001, jim clark wrote in part: The problem is that one-tailed test is taken as synonymous with directional hypothesis (e.g., Ha: Mu1 > Mu2). This causes no confusion with distributions such as the t-test, because directional implies one-tailed. This correspondence does not hold for other statistics, such as the F and Chi2. The statement is not correct. The correspondence certainly holds for F and chi-square _statistics_. What it seems not to hold for is certain particular hypothesis tests for which those statistics are the commonly used test statistics. The "large F" Jim speaks of below clearly refers to an analysis of variance (and one with only two groups, at that!). In that context, while the hypotheses _of interest_ are the null hypothesis that the several means Mu_j are identical, vs. the two-sided alternative hypothesis that some of them are different, the formal hypothesis tested by the F statistic is the null hypothesis that a certain variance component equals zero, vs. the alternative hypothesis that it does not equal zero; and since a variance component cannot be negative, the _test_ is one-sided, in the metric of variances: one rejects only for F sufficiently greater than 1 for the result to be improbable under the null hypothesis. It is still possible to use the F _statistic_ to test the null hypothesis that Var1 = Var2, in circumstances where it is entirely possible that Var1 < Var2, Var1 = Var2, or Var1 > Var2. In such cases _both_ tails of the F distribution are of interest, not just the upper tail. Similarly, one may use Chi-square to test the null hypothesis that a variance has a specified value, and wish to reject if the evidence leads one to believe that the true value is LESS, OR if the true value is GREATER, than the value specified. One can get a large F by either Mu1 > Mu2 or Mu1 < Mu2 (or by positive or negative R, ...). Therefore the one-tail of the distribution corresponds (normally) to a two-tailed or non-directional test. 
However, there is absolutely nothing wrong with making the necessary adjustment to make the test directional (i.e., equivalent to the one-tailed t-test), and therefore referring to it (confusingly, of course) as a one-tailed test. On this point, one must agree with Thom: such a use of language can only be confusing, as you acknowledge. "Newspeak", it was called in "1984". -- Don.
Re: unequal n's: quadratic weights
On Tue, 30 Jan 2001, Kathleen Bloom wrote: If you have unequal n's, and want to determine linear parameters, you can develop new coefficients by taking the normal unweighted coefficients (e.g., -1, 0, +1, for a three-group design) and the formula: (n1*X1 + n2*X2 + n3*X3) / (n1 + n2 + n3), where the X's are 1, 2, and 3 because you have 3 groups. This gives you a new mean of the X's (i.e., no longer (1+2+3)/3 = 2), and from there you calculate the new coefficients (e.g., 1 - ?, 2 - ?, 3 - ? gives you the new linear coefficients) for the 3-group design with unequal n's.

You get the same results if you use X's corresponding to the unweighted coefficients (-1, 0, +1). I should suppose that for quadratic estimates you'd play the same game with the quadratic unweighted coefficients (+1, -2, +1). However, I've never played around much with weighted trend analyses, so my supposition may possibly be incorrect. It may be better to retain the grand mean calculated from (-1, 0, +1), equal to your "?" minus 2 (let's call that "d"), and generate coefficients from the unweighted quadratic coefficients as 1-d, -2-d, 1-d.

I note in passing that your decision to pursue weighting-by-sample-size implies that you have decided to assign equal weight to individual cases and NOT to assign equal weight to each subgroup. (Had you chosen equal weight for each subgroup, you'd use the "unweighted" (they're not really UNweighted, they're _equally_ weighted) coefficients directly.) I've not encountered situations where it seemed necessary to give equal importance to each individual case, enough to make it worth the extra effort to weight the coefficients -- and to think about what it means that the "grand mean" for the data depends on which trend you're currently pursuing, and that the several trends (linear, quadratic, cubic, ...) are explicitly NOT orthogonal.

From there you can do things like determine the weighted linear estimated parameters. They are given in the spss oneway printout...
as I understand it... i.e., the weighted (for sample size) beta for the linear contrast.

Notice that all this does is change the distance between each group mean and the grand mean; it does not change the relative distances between groups, which are still equally spaced. It has never been very clear to me what advantage one gets from the weighted parameters, especially as those estimates depend on the accident of how many observations you were able to find for each group. For this reason (among others) I am inclined to favor equal weighting in general. If it turns out that the choice of weighting influences the conclusion(s) to be drawn, one has compelling arguments for repeating the experiment, this time with a proper (equal-numbers-of-cases) design and carefully random selection of cases.

snip ... My means are 2.05, 6.38, and 12.08 for the three groups respectively. In other words.. what does one calculate and how?

You might reasonably try using the regression module (rather than one-way anova) to compare output: predict Y from X1 (X1 = 1, 2, 3 for the three groups, and X1 = -1, 0, +1 if you want to confirm that the results are the same for this coding); and for an alternate (quadratic) model, predict Y from X1 and X2 (X1 = -1, 0, +1; X2 = +1, -2, +1). You have not, by the way, said what you're doing this analysis FOR, so it's a bit difficult to know whether one is offering useful advice. Or not. -- Don.
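Both recipes above -- the sample-size-weighted linear coefficients and the regression comparison with linear and quadratic codings -- can be sketched as follows. The group sizes (4, 6, 10) are hypothetical; the three means 2.05, 6.38, 12.08 are those quoted in the post, treated here as single observations purely for illustration (real data would have one row per case).

```python
import numpy as np

def weighted_linear_coefficients(ns, xs=(1, 2, 3)):
    """The quoted recipe: subtract the n-weighted mean of X from each X."""
    xbar_w = sum(n * x for n, x in zip(ns, xs)) / sum(ns)
    return [x - xbar_w for x in xs]

coefs = weighted_linear_coefficients((4, 6, 10))   # hypothetical n's
# The result is a contrast under case-weighting: sum(n_j * c_j) = 0.

# Regression comparison suggested in the reply: predict Y from the
# linear coding, with and without the quadratic coding added.
y = np.array([2.05, 6.38, 12.08])      # group means quoted in the post
x1 = np.array([-1.0, 0.0, 1.0])        # linear coding
x2 = np.array([1.0, -2.0, 1.0])        # quadratic coding
X = np.column_stack([np.ones(3), x1, x2])
b, *_ = np.linalg.lstsq(X, y, rcond=None)
print(coefs, b)   # b = (intercept, linear, quadratic) estimates
```

Because the codings (-1, 0, +1) and (+1, -2, +1) are orthogonal when the n's are equal, the linear estimate is unchanged when the quadratic term is added; with unequal n's that orthogonality is lost, which is the point made above about the weighted trends being explicitly NOT orthogonal.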
Re: Margin Analysis Qstn
On Mon, 29 Jan 2001, Chris wrote in part: My current job requires me to analyze margins from the sales of various products and provide an average for each during the quarter. I am using a very large sample of all product sales by month. (Margin, i.e., not markup. For those not familiar, markup is what a business applies to obtain margin; margin is a measure of profitability. A typical calculation for margin is (Unit Resale Price - Unit Cost) / Unit Resale Price.) snip, sample data Assuming a normal distribution, what method should I use to calculate my averages? Should I simply take the sample mean? Should I remove anomalies like 112% margins? Should I calculate upper and lower control limits and place my data into a normal curve?

As Rich Ulrich implied, why would you wish to assume a normal distribution? As to what kind of average to compute, what will you (or your superior, or client, as the case may be) _do_ with the average once you have it? If it is to be related to anything like total profit (or revenue?) during the period represented in the data, you'd about have to multiply by sales volume before averaging, for instance. As to the 112% margin, I take it that you don't have the underlying resale value or cost, else you could calculate the margin directly. Basically, you have two choices: (1) discard the anomaly whenever you encounter one; (2) guess what error in logic or arithmetic led to the anomalous value, and correct it (112% might not be so unreasonable if the denominator had been unit cost instead of unit resale price, for example). Option (2) is _always_ chancy, but may be viable if you have something better than a guess to go on. -- DFB.
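The margin definition and the anomaly screen discussed above can be sketched as follows. The [0%, 100%] screening range and the sample figures are assumptions for illustration: with a non-negative unit cost, (resale - cost)/resale cannot exceed 100%, which is exactly what makes the 112% value anomalous.

```python
# Margin as defined in the post: (unit resale price - unit cost) / unit
# resale price.  Anomalous values are flagged for review rather than
# silently discarded or "corrected".
def margin(resale, cost):
    return (resale - cost) / resale

def flag_anomalies(margins, low=0.0, high=1.0):
    """Return margins outside the plausible range (assumed thresholds)."""
    return [m for m in margins if not (low <= m <= high)]

sample = [margin(10.0, 7.0), margin(25.0, 20.0), 1.12]  # 1.12 = the 112% case
print(flag_anomalies(sample))
```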
Re: change scores
On Fri, 26 Jan 2001, Rich Ulrich quoted me: DB: What most people who use "ordinal" and "disordinal" seem to mean is a plot of the cell means (or of regression lines), with no adjustment for main effects: so, a display that includes the interaction AND the main effects. I take it that's what you mean here. and replied: Yes. Just like "most people," I use the definition that draws a distinction, instead of the one that does not. Why do you prefer the one that does not?

Mostly because that was the formal definition given in the textbooks I learned from, donkey's years ago... and because I think it useful to be able to distinguish between main effects and interactions (an interaction being a systematic effect among cell means (in ANOVA) that is not accounted for by the main effects in the design; a corresponding definition can be written, mutatis mutandis, for regression contexts).

DB: Then: a disordinal display -- of what plot? (As remarked in a thread a year or two ago, an interaction (displayed as a plot of cell means or of regression lines) may appear ordinal from one direction and disordinal from the other.) - I remember someone claimed that. (Oui, moi. -- DB) I remember an example that failed to make the point. I don't remember a valid example, or that the point was generally accepted. - I hope this is not a failure of my memory. But if it's my problem, I hope you will reproduce the illustration, or cite it somewhere.

As requested. Consider the two-way table of cell means below:

          B1   B2
    A1    10   20
    A2    40   30

    40 |       1        40 |  2
    30 |       2        30 |       2
    20 |  2             20 |       1
    10 |  1             10 |  1
     0 +--+----+---      0 +--+----+---
          A1   A2            B1   B2

Plotting Y-bar vs. A, we have the left-hand diagram (plotting symbols are levels of B); plotting Y-bar vs. B, we have the right-hand diagram (symbols are levels of A). The left-hand plot is disordinal (B2 > B1 at A1, but B1 > B2 at A2); the right-hand plot is ordinal (A1 < A2 at both levels of B).
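The same point can be checked mechanically; a minimal sketch, using the 2x2 table of cell means above:

```python
# The cell means from the example: disordinal plotted against A (the
# B-lines cross) but ordinal plotted against B (the A-lines do not).
means = {("A1", "B1"): 10, ("A1", "B2"): 20,
         ("A2", "B1"): 40, ("A2", "B2"): 30}

def crosses(m, axis_levels, line_levels, axis_first=True):
    """True if the lines' ordering reverses across the axis (disordinal)."""
    def cell(a, l):
        return m[(a, l)] if axis_first else m[(l, a)]
    diffs = [cell(a, line_levels[0]) - cell(a, line_levels[1])
             for a in axis_levels]
    return diffs[0] * diffs[1] < 0   # sign change => crossover

print(crosses(means, ("A1", "A2"), ("B1", "B2"), axis_first=True))   # vs. A
print(crosses(means, ("B1", "B2"), ("A1", "A2"), axis_first=False))  # vs. B
```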
Rich continues: The only effect that is never potentially artifactual is the crossover of the means, the Disordinal interaction (as most of us define it).

I take it you must mean, whenever an interaction plot is disordinal in any orientation? I have some mild difficulty with this, since AFAIK the idea of "(dis)ordinal" has not been extended beyond two-way interactions, and more complex situations may well be of interest...

That is the one that can't be explained as measurement error (such as strong regression owing to poor reliability); scaling (such as ceiling effects); or "regression" towards the conditional expected values (such as the real-life example I just cited). -- Donald F. Burrill
Re: A-D in matlab
On Sun, 28 Jan 2001, Veeral Patel wrote in part: Out of curiosity I decided to write a small prog to perform the A-D test in matlab for the gumbel distribution. Obtaining the gumbel parameters is easy; however, the difficulty is in the actual A-D computation formula as stated by Stephens (1977): A2 = -[ Sum(i=1..n) (2i-1){log z_i + log(1 - z_{n+1-i})} ]/n - n. What's the point of /n and -n, since they both cancel out anyway?

Uh-huh. Perhaps you need to consider the difference between A2 = -[long expression]/n - n and A2 = -[long expression]/(n - n). The latter of course is infinitely large. In your program, you appear not to have divided by n, which surely means that your value of A2 will be n times too large, even if you have correctly computed [long expression]. -- DFB.
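Stephens' formula, with the division by n performed, can be sketched in Python rather than matlab as below. The Gumbel location and scale parameters and the data are hypothetical stand-ins for the poster's fitted values.

```python
# Anderson-Darling statistic as in Stephens (1977):
#   A^2 = -[ sum_{i=1..n} (2i-1){ln z_i + ln(1 - z_{n+1-i})} ]/n - n,
# where z_i are the fitted-CDF values of the sorted data.  Note the /n
# and the -n do NOT cancel.
import math

def anderson_darling(z):
    """A^2 from sorted CDF values z_1 <= ... <= z_n."""
    n = len(z)
    s = sum((2 * i - 1) * (math.log(z[i - 1]) + math.log(1 - z[n - i]))
            for i in range(1, n + 1))
    return -s / n - n

def gumbel_cdf(x, mu, beta):
    return math.exp(-math.exp(-(x - mu) / beta))

data = sorted([0.2, 0.5, 0.9, 1.4, 2.3])             # hypothetical sample
z = [gumbel_cdf(x, mu=1.0, beta=0.8) for x in data]  # hypothetical mu, beta
print(anderson_darling(z))
```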