Avoiding Linear Dependencies in Artificial Data Sets
Hi. I like to use small, artificially generated data sets with integer parameters to introduce analyses. Often, however, I find it difficult to avoid undesirable contingencies among the scores (e.g., linear dependencies in within-subject designs). Is there an algorithmic way to generate such scores and avoid such dependencies?

Here is a small example with 4 scores for each of 5 subjects. The following analysis reveals the undesirable linear dependencies. I'm assuming the dependencies arise from the noise vectors that I used to generate the cell scores by adding them to the main effect of the factor and the subject effects. Is there a systematic way to create such noise vectors to avoid linear dependencies?

data list free / subj vl lo hi vh
begin data
1 3 3 5 5
2 1 3 7 9
3 6 8 8 10
4 7 8 6 7
5 3 3 9 9
end data
manova vl lo hi vh
 /wsf = conc(4)
 /print = cell
 /contr(conc) = poly

Cell Means and Standard Deviations (entire sample, N = 5)
 Variable   Mean    Std. Dev.   95 percent Conf. Interval
 VL         4.000   2.449        .959 -  7.041
 LO         5.000   2.739       1.600 -  8.400
 HI         7.000   1.581       5.037 -  8.963
 VH         8.000   2.000       5.517 - 10.483

Tests of Between-Subjects Effects: Tests of Significance for T1 using UNIQUE sums of squares
 Source of Variation     SS      DF      MS      F      Sig of F
 WITHIN CELLS           40.00     4    10.00
 CONSTANT              720.00     1   720.00   72.00    .001

Estimates for T1 --- Individual univariate .9500 confidence intervals, CONSTANT
 Parameter 1: Coeff. = 12.00, Std. Err. = 1.41421, t-Value = 8.48528, Sig. t = .00106,
 Lower -95% CL = 8.07351, Upper = 15.92649

Analysis of Variance -- Design 1: Tests involving 'CONC' Within-Subject Effect
 Mauchly sphericity test, W = .0
 Chi-square approx. = . with 5 D.F., Significance = .
 Greenhouse-Geisser Epsilon = .40650
 Huynh-Feldt Epsilon = .49123
 Lower-bound Epsilon = .33333
 AVERAGED Tests of Significance that follow multivariate tests are equivalent to the univariate or split-plot or mixed-model approach to repeated measures. Epsilons may be used to adjust d.f. for the AVERAGED results.

WARNING: The WITHIN CELLS error matrix is SINGULAR. These variables are LINEARLY DEPENDENT on preceding ones: T3. Multivariate tests will be skipped.
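A brute-force sketch of one way to do this (Python rather than SPSS; the constant, subject effects, condition effects, and contrast matrix below are invented for illustration, not taken from the data above): draw small integer noise, form the within-subject contrast scores, and reject any draw whose error SSCP matrix is rank-deficient.

import numpy as np

# Sketch: generate integer scores for a one-way within-subject design and keep
# only noise matrices whose error SSCP matrix for the 3 contrasts is nonsingular.
# All numeric settings here are invented for illustration.

rng = np.random.default_rng(1)
n_subj, n_cond = 5, 4
cond_effect = np.array([-2, -1, 1, 2])        # hypothetical main effect of the factor
subj_effect = np.array([0, 1, 3, 2, 1])       # hypothetical subject effects

# Unnormalized orthogonal polynomial contrasts for 4 conditions
C = np.array([[-3.0, -1.0,  1.0, 3.0],
              [ 1.0, -1.0, -1.0, 1.0],
              [-1.0,  3.0, -3.0, 1.0]])

for attempt in range(1000):
    noise = rng.integers(-1, 2, size=(n_subj, n_cond))          # integers in {-1, 0, 1}
    scores = 6 + subj_effect[:, None] + cond_effect[None, :] + noise
    T = scores @ C.T                                            # contrast scores
    resid = T - T.mean(axis=0)                                  # within-cells deviations
    if np.linalg.matrix_rank(resid.T @ resid) == C.shape[0]:    # full rank: no dependency
        print("usable data set found on attempt", attempt)
        print(scores)
        break

With only 5 subjects the error matrix for 3 contrasts has just 4 degrees of freedom, so rejection sampling of this kind is about the simplest safeguard against an unlucky noise matrix.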
IRT/Rasch Modeling with SAS?
Hi all, I am working on a dissertation that analyzes some international tests of mathematics achievement. I need to use the responses (which can be considered "correct"/"incorrect") and estimate an IRT (Item Response Theory) model to describe the test. In a nutshell, assume the test measures a single trait in an individual. Then the IRT curve that I am looking for (something they call a 3-parameter logistic, which I think is not a 100% correct name) is described by the following function:

   P(T) = c + (1 - c) / (1 + e^(-1.7 a (T - b)))

This curve takes a person's ability T and produces the probability that a person with that trait will answer the question correctly. The main problem, of course, is that T is unknown, as are the three parameters a, b, and c, so the estimation problem is quite tricky. I can't find a reference that gives the exact recipe, but as best I can tell the algorithm would start with an initial guess for T, fit the curve parameters a, b, and c, then use this curve to re-estimate T. The process repeats until some convergence criterion is reached.

Does anyone know if SAS will do this? I have found a piece of software that claims to fit "Rasch models", but the classical Rasch model is a one-parameter version of what I'm looking for (fix a at a common value and set c to zero, leaving only the difficulty b, and you have a Rasch model). Plus, the software costs about $1000, and I don't have that to spare. The software (one called "BIGSTEPS" is the only one I can find that will handle the 89,000 students in my data) is not exactly "Microsoft Bob" in its ease of use.

This whole IRT/Rasch area is brand new to me, so I may be asking the wrong crowd, but if anyone has any SAS code or guidance, I'd sure like to hear it.

-- Lee Creighton, SAS Institute, [EMAIL PROTECTED], 5275R SAS Campus Drive, (919) 531-3755
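A rough sketch (in Python, not SAS, with invented function and variable names) of the alternating scheme described above, often called joint maximum likelihood: fit the item parameters with the abilities held fixed, then re-estimate the abilities with the items held fixed, and cycle. Real IRT programs generally use marginal maximum likelihood instead, and the c parameter is poorly determined this way, so treat this purely as an illustration of the recipe.

import numpy as np
from scipy.optimize import minimize

def p3pl(theta, a, b, c):
    # 3-parameter logistic: P(T) = c + (1 - c) / (1 + exp(-1.7 a (T - b)))
    return c + (1.0 - c) / (1.0 + np.exp(-1.7 * a * (theta - b)))

def item_nll(par, theta, y):
    a, b = par[0], par[1]
    c = 1.0 / (1.0 + np.exp(-par[2]))                  # logistic transform keeps c in (0, 1)
    p = np.clip(p3pl(theta, a, b, c), 1e-6, 1 - 1e-6)
    return -np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

def person_nll(par, items, y):
    p = np.clip(p3pl(par[0], items[:, 0], items[:, 1], items[:, 2]), 1e-6, 1 - 1e-6)
    return -np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

def joint_ml(resp, n_cycles=10):
    n_persons, n_items = resp.shape
    theta = resp.mean(axis=1)
    theta = (theta - theta.mean()) / (theta.std() + 1e-9)    # crude starting abilities
    items = np.tile([1.0, 0.0, 0.2], (n_items, 1))           # starting a, b, c per item
    for _ in range(n_cycles):
        for j in range(n_items):                             # step 1: items given abilities
            start = [items[j, 0], items[j, 1],
                     np.log(items[j, 2] / (1.0 - items[j, 2]))]
            fit = minimize(item_nll, start, args=(theta, resp[:, j]), method="Nelder-Mead")
            a, b, raw_c = fit.x
            items[j] = [a, b, 1.0 / (1.0 + np.exp(-raw_c))]
        for i in range(n_persons):                           # step 2: abilities given items
            fit = minimize(person_nll, [theta[i]], args=(items, resp[i]), method="Nelder-Mead")
            theta[i] = fit.x[0]
        theta = (theta - theta.mean()) / (theta.std() + 1e-9)  # pin down the metric
    return items, theta

# Tiny simulated check (invented numbers)
rng = np.random.default_rng(0)
true_theta = rng.normal(size=300)
a_true = rng.uniform(0.8, 2.0, 15)
b_true = rng.normal(0.0, 1.0, 15)
prob = p3pl(true_theta[:, None], a_true, b_true, 0.2)
resp = (rng.random(prob.shape) < prob).astype(float)
items_hat, theta_hat = joint_ml(resp, n_cycles=5)
print(np.round(items_hat[:5], 2))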
Re: On inappropriate hypothesis testing. Was: MIT Sexism statistical bunk
Hi

On 12 Mar 2001, Radford Neal wrote:

> Yes indeed. And the context in this case is the question of whether or not the difference in performance provides an alternative explanation for why the men were paid more (one supposes; no actual salary data has been released). In this context, all that matters is that there is a difference. As explained in many previous posts by myself and others, it is NOT appropriate in this context to do a significance test, and ignore the difference if you can't reject the null hypothesis of no difference in the populations from which these people were drawn (whatever one might think those populations are).

Personally, I am not interested in the question of statistical testing to dismiss the alternative explanation being proposed; indeed, I suspect that the original claim about gender being the cause of salary differences would not stand up very well to statistical tests either. But it does seem to me that there is more to it than just saying "see, there is a difference," and that statistical procedures would have a role to play. For example, wouldn't the strength and consistency of the differences influence your confidence that this was indeed the underlying factor? The same difference in means due to one or two outliers would surely not mean the same thing as a uniform pattern of productivity differences, would it? And wouldn't you want to demonstrate that there was a significant, and ideally strong, within-group relationship between productivity and salary before claiming that it is a reasonable alternative explanation for the between-group differences? Or at least, wouldn't that strengthen the case? I appreciate that in some domains (e.g., intelligence testing) people are reluctant to make inferences about between-group differences on the basis of within-group correlations, but that is the basic logic of ANCOVA and related methods.

Best wishes
Jim

James M. Clark, Department of Psychology, University of Winnipeg, 4L05D, Winnipeg, Manitoba R3B 2E9, CANADA; (204) 786-9757; fax (204) 774-4134; [EMAIL PROTECTED]; http://www.uwinnipeg.ca/~clark
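To make the ANCOVA-style logic concrete, here is a small sketch with invented numbers (Python; none of this comes from the MIT data): regress salary on a productivity measure plus a gender indicator and look at the size and precision of the gender coefficient once productivity is taken into account.

import numpy as np

# Invented data for illustration only
rng = np.random.default_rng(0)
n = 40
male = np.repeat([1.0, 0.0], n // 2)
citations = rng.gamma(shape=2.0, scale=1500.0, size=n) + 1000.0 * male
salary = 90000.0 + 4.0 * citations + 3000.0 * male + rng.normal(0.0, 8000.0, n)

# Ordinary least squares: salary ~ intercept + citations + male
X = np.column_stack([np.ones(n), citations, male])
beta, *_ = np.linalg.lstsq(X, salary, rcond=None)
resid = salary - X @ beta
sigma2 = resid @ resid / (n - X.shape[1])
se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))

print("adjusted gender gap:", round(beta[2], 1), "  t =", round(beta[2] / se[2], 2))

A strong within-group slope on citations together with a small adjusted gender coefficient is the pattern that would make the productivity explanation plausible; the reverse pattern would undercut it.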
Re: On inappropriate hypothesis testing. Was: MIT Sexism statistical bunk
At 02:25 PM 3/12/01, Radford Neal wrote:

> In this context, all that matters is that there is a difference. As explained in many previous posts by myself and others, it is NOT appropriate in this context to do a significance test, and ignore the difference if you can't reject the null hypothesis of no difference in the populations from which these people were drawn (whatever one might think those populations are).

the problem with your argument is this ... now, whether or not formal inferential statistical procedures are called for ... if there is a difference in salary ... and differences in any OTHER factor or factors ... one is in the realm of SPECULATION as to what may or may not be the "reason" or "reasons" for THAT difference

in other words ... any way you say that the difference "may be explained by" is a hypothesis you have formulated ... so, in this general context ... it still is a statistical issue ... that being, what (maybe) causes what ... and this calls for some model specification ... that links the difference in salaries TO differences in other factors/variables

if we do not view it as some kind of a statistical model ... then we are in no position to really talk about this case ... not in any causal or quasi-causal way ... and i thought that was the main purpose of this entire matter ... what LED to the gap in salaries?? ... was it something based on merit? or something based on bias?

i don't see how else we could check up on these kinds of issues other than by some statistical questions being asked ... then tested in SOME fashion (though i am not specifying exactly how)

_
dennis roberts, educational psychology, penn state university
208 cedar, AC 814-863-2401, mailto:[EMAIL PROTECTED]
http://roberts.ed.psu.edu/users/droberts/drober~1.htm
One tailed vs. Two tailed test
Hi, all,

We are testing a group of subjects on their performance in two different conditions (say, A and B), and we are testing them individually. We have an alternative hypothesis that reaction time in condition A should be longer than in condition B, so we perform a one-tailed t test. However, some subjects showed the pattern reverse to our alternative hypothesis (RT B > RT A), and the p value is significant under a one-tailed test. Could we claim that these "reversed" subjects showed "significant" results in the opposite direction, or should we treat them as non-significant results?

Thanks,
Erik
Re: Avoiding Linear Dependencies in Artificial Data Sets
It isn't actually that easy, in the sense that most data humans make up have low efficiency with respect to design criteria: the determinant of the cross-product matrix tends to be small. The simplest way is to use a computer program that calculates algorithmic designs.

jim clark wrote:

> Hi. I like to use small, artificially generated data sets with integer parameters to introduce analyses. Often, however, I find it difficult to avoid undesirable contingencies among the scores (e.g., linear dependencies in within-subject designs). Is there an algorithmic way to generate such scores and avoid such dependencies? [snip]

--
Bob Wheeler --- (Reply to: [EMAIL PROTECTED])
ECHIP, Inc.
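A concrete illustration of that criterion (made-up numbers, Python): compare the log-determinant of the centered cross-product matrix for a hand-made noise matrix whose rows all sum to zero, and which is therefore singular, with that of a small random integer draw.

import numpy as np

def log_det_crossprod(E):
    # log |D'D| for the column-centered matrix D; -inf signals a singular matrix
    D = E - E.mean(axis=0)
    sign, logdet = np.linalg.slogdet(D.T @ D)
    return logdet if sign > 0 else -np.inf

hand_made = np.array([[ 1, -1,  0,  0],     # every row sums to zero, so the
                      [ 0,  1, -1,  0],     # columns are linearly dependent
                      [ 0,  0,  1, -1],
                      [-1,  0,  0,  1],
                      [ 0,  0,  0,  0]])
random_noise = np.random.default_rng(2).integers(-2, 3, size=(5, 4))

print(log_det_crossprod(hand_made))      # -inf: singular
print(log_det_crossprod(random_noise))   # usually finite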
Re: One tailed vs. Two tailed test
auda wrote:

> Hi, all, We are testing a group of subjects on their performance in two different conditions (say, A and B), and we are testing them individually. We have an alternative hypothesis that reaction time in condition A should be longer than in condition B, so we perform a one-tailed t test. However, some subjects showed the pattern reverse to our alternative hypothesis (RT B > RT A), and the p value is significant under a one-tailed test. Could we claim that these "reversed" subjects showed "significant" results in the opposite direction, or should we treat them as non-significant results?

If you do a one-tailed test, no. The fact that you are entertaining this possibility suggests you should be using a two-tailed test. The one-tailed test has no power to detect differences in the discounted (non-predicted) direction, and hence should only be used when you would dismiss such a finding a priori.

I'm a bit puzzled as to why you test each participant individually. You'd expect (unless the effect is huge) some participants to go against the average pattern. If you do need to test each person individually, you need to use the two-tailed, non-directional test and apply a correction for multiple testing (e.g., Bonferroni or similar).

Thom
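A sketch of that suggestion with invented reaction times (Python; the subject counts, trial counts, and means are arbitrary): run a two-tailed test on each subject's trial-level RTs and judge it against a Bonferroni-adjusted alpha.

import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
n_subjects, n_trials = 8, 40
alpha = 0.05 / n_subjects                      # Bonferroni-adjusted per-subject alpha

for s in range(n_subjects):
    rt_a = rng.normal(520.0, 60.0, n_trials)   # condition A trials for this subject
    rt_b = rng.normal(500.0, 60.0, n_trials)   # condition B trials for this subject
    t, p = stats.ttest_ind(rt_a, rt_b)         # two-tailed, non-directional
    print(f"subject {s}: t = {t:5.2f}, p = {p:.4f}, significant = {p < alpha}")

The sign of t then tells you which direction each "significant" subject went, without having ruled that direction out in advance.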
Re: On inappropriate hypothesis testing. Was: MIT Sexism statistical bunk
Jim:

I agree with Radford Neal's comments, and urge careful reconsideration of the foundation behind some of the comments made. For example, suppose you had a department in which the citation data were

   Males   Females
   12220    1298
    2297    1102

The male with 12220 is, let's imagine, a Nobel Prize winner. The salaries for the 4 people are

   Males     Females
   156,880   121,176
   112,120   114,324

The females approach the dean of science and declare that there is discrimination against them. They've measured the labs, and the men have more space. Moreover, they feel marginalized and depressed, as their status has been slowly slipping in the department. Moreover, they are paid less than men of the same age. Careful examination of mean salary shows that the mean salaries are 134,500 for men and only 117,750 for women. With great brouhaha, the administration, without publishing the above data, declares that there was a discrimination problem, and it was addressed by giving both the women a 16,000 raise.

As Radford Neal has pointed out succinctly, the argument about outliers is irrelevant, and I want to emphasize with this example that it is irrelevant on numerous levels. First of all, it is not necessarily clear whether, and in which of several senses, our Nobel Prize winner is an outlier in his group. Second, even if he is -- so what? Surely you would not argue that this means he didn't deserve his salary! In fact, careful examination of the salary data [never made public by the administration] together with the performance data might well have led to the conclusion that it is the male faculty who are underpaid.

Although, as Dr. Neal pointed out, it is not logically relevant to the issue, I would like to explore your notion, echoed without justification by Rich Ulrich, that the huge difference in citation performance between MIT senior men and women might be due to "one or two outliers." Take a look at the data again, and tell me which male data you consider to be outliers within the male group, and why. For example, are the men with 2133 and 893 "outliers," or those with 12830 and 11313?

The data for the senior men and women, 12-year citation counts:

   Males   Females
   12830    2719
   11313    1690
   10628    1301
    4396    1051
    2133     935
     893     ---

As for the notion of exploring the relationship between salary, gender, and performance -- I'd be more than happy to examine any data that MIT would make available. They will, of course, not make such data available. It is too private, they say.

Best regards,
Jim Steiger

--
James H. Steiger, Professor
Dept. of Psychology
University of British Columbia
Vancouver, B.C., V6T 1Z4

Note: I urge all members of this list to read the following and inform themselves carefully of the truth about the MIT Report on the Status of Women Faculty.
Patricia Hausman and James Steiger, "Confession Without Guilt?": http://www.iwf.org/news/mitfinal.pdf
Judith Kleinfeld's article critiquing the MIT Report: http://www.uaf.edu/northern/mitstudy/#note9back
Original MIT Report on the Status of Women Faculty: http://mindit.netmind.com/proxy/http://web.mit.edu/fnl/

On Mon, 12 Mar 2001 08:55:17 -0600, jim clark [EMAIL PROTECTED] wrote:

> Hi. On 12 Mar 2001, Radford Neal wrote: Yes indeed. And the context in this case is the question of whether or not the difference in performance provides an alternative explanation for why the men were paid more ... [snip]
Re: On inappropriate hypothesis testing. Was: MIT Sexism statistical bunk
Hi

On Mon, 12 Mar 2001, Irving Scheffe wrote:

> Jim: For example, suppose you had a department in which the citation data were
>
>    Males   Females
>    12220    1298
>     2297    1102

When I said outlier, I had in mind hypothetical data of the following sort (it doesn't matter to me whether it is the salaries or the citation rates):

        Males   Females
        17000    1000
         1000    1000
         1000    1000
         1000    1000
   Avg   5000    1000

vs.

        Males   Females
         5000    1000
         5000    1000
         5000    1000
         5000    1000
   Avg   5000    1000

I would view the latter somewhat differently than the former with respect to differences between these samples of males and females, and with respect to the kinds of explanations I would seek (e.g., something general to males versus something specific to male 1).

> The male with 12220 is, let's imagine, a Nobel Prize winner. The salaries for the 4 people are
>
>    Males     Females
>    156,880   121,176
>    112,120   114,324

Of course, if the salaries were:

   Males     Females
   112,120   121,176
   156,880   114,324

you probably might not want to promote the hypothesis of productivity differences explaining the gender differences. That was the point of one of my later comments.

> As Radford Neal has pointed out succinctly, the argument about outliers is irrelevant, and I want to emphasize with this example that it is irrelevant on numerous levels. First of all, it is not necessarily clear whether, and in which of several senses, our Nobel Prize winner is an outlier in his group. Second, even if he is -- so what? Surely you would not argue that this means he didn't deserve his salary!

Assuming a correlation between productivity and salary (or winning of Nobel prizes).

> In fact, careful examination of the salary data [never made public by the administration] together with the performance data might well have led to the conclusion that it is the male faculty who are underpaid.

I'm in perfect agreement with this, although I still think that statistics would play a positive role in identifying the determinants of salary.

> Although, as Dr. Neal pointed out, it is not logically relevant to the issue, I would like to explore your notion, echoed without justification by Rich Ulrich, that the huge difference in citation performance between MIT senior men and women might be due to "one or two outliers."

I don't remember making any such attribution. I asked a question about whether detractors of statistical testing would view equivalently differences due to some outliers and more consistent results, in the sense I showed above. I'm not sure it is any more palatable to have one's motives misconstrued by people arguing against gender-related bias than to have them misconstrued by people arguing for gender-related bias.

> Take a look at the data again, and tell me which male data you consider to be outliers within the male group, and why. For example, are the men with 2133 and 893 "outliers," or those with 12830 and 11313?

Not having taken any position on it, I am not too sure I feel any compulsion to answer your question. I guess I would turn it around and say, would you interpret your results exactly the same as the modified results that I have presented below?

> The data for the senior men and women, 12-year citation counts:
>
>    Males   Females
>    12830    2719
>    11313    1690
>    10628    1301
>     4396    1051
>     2133     935
>      893     ---
>    Average
>     7032    1539

Modified (hypothetical, for pedagogical purposes only, no hidden agenda) results:

        Males   Females
        34500    1500
         1500    1500
         1500    1500
         1500    1500
         1500    1500
         1500
   Avg   7000    1500

To me, these data are much less suggestive of general differences in productivity between males and females, would not be an adequate account of widespread (i.e., consistent or uniform across individuals) differences in salaries, and so on. Am I correct to assume that for you the consistency of the differences between the groups (which is what a statistical test measures) is completely irrelevant? Or are you implicitly engaging in inferential-like thinking when you examine the actual distributions?

> As for the notion of exploring the relationship between salary, gender, and performance -- I'd be more than happy to examine any data that MIT would make available. They will, of course, not make such data available. It is too private, they say.

But were the data made available to you, would you use any statistical procedures in the examination? Would you care whether the differences in salary were significant? The differences in productivity? The differences in any number of potential confounding variables? What about the significance and strength of the relationships between predictors and salary? What about whether the gender difference was significant after productivity was controlled for?
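The point about consistency is easy to see numerically. A small sketch (Python) using the citation counts quoted above and the modified hypothetical values: the group means are nearly identical, but a two-sample test, which weighs the gap against the within-group variability, treats the two situations very differently.

import numpy as np
from scipy import stats

males       = np.array([12830, 11313, 10628, 4396, 2133, 893])   # quoted 12-year counts
females     = np.array([2719, 1690, 1301, 1051, 935])
males_hyp   = np.array([34500, 1500, 1500, 1500, 1500, 1500])    # hypothetical, outlier-driven
females_hyp = np.array([1500, 1500, 1500, 1500, 1500])

print(stats.ttest_ind(males, females, equal_var=False))          # fairly consistent difference
print(stats.ttest_ind(males_hyp, females_hyp, equal_var=False))  # same mean gap, one extreme value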
Re: One tailed vs. Two tailed test
auda wrote:

> [snip] Could we claim that these "reversed" subjects showed "significant" results in the opposite direction, or should we treat them as non-significant results?

Don't do one-tailed tests.
Re: Avoiding Linear Dependencies in Artificial Data Sets
I'm not clear on what your design is, but it seems that the problem is in the between-S effect, not the within. Note that you only have 4 df within cells and 4 dependent variables.
Speaking of ANOVA in SPSS...
I'm trying to reduce all stats to a few simple procedures that students can do EASILY with available stats packages. A two-way ANOVA or an ANCOVA is as complex as I want to go. I thought SPSS would do the trick, but I was amazed to discover that it can't. Here's the example.

I want students to convert repeated-measures data into unpaired t tests or non-repeated measures ANOVA, by using change scores between the time points of interest. That's no problem when there is just the group effect: the analysis becomes a simple unpaired t test. But when you have an extra between-subjects effect (e.g. males and females in the treatment and control groups) it becomes a two-way ANOVA. You make a column of change scores between the time points of interest (e.g., post and pre), and that's your dependent variable. The two independent effects are group (exptal and control, say) and sex (male and female). The group term gives the effect of the treatment averaged for males and females. Again, no problem there, but what I want is an appropriate customized contrast of the interaction term, which yields the difference in the overall effect between males and females. SPSS version 10 can't do it. I checked the on-line help, and it looks like you have to use the command language. Well really, what student is going to manage that? It's out of the question.

Sure, you can get a p value for the interaction, but I want confidence limits for the difference between males and females. I've got my students to convert the p value, the degrees of freedom, and the observed value of the effect into confidence limits, but I shouldn't have to resort to that. I'd also like SPSS to do an ANCOVA, but again I want to do contrasts for the interaction, and again, they ain't there.

Or did I miss something? If so, please let me know. And can you let me know of any simple, and preferably CHEAP or FREE, packages that will do what I want?

Will

--
Will G Hopkins, PhD FACSM
University of Otago, Dunedin NZ
Sportscience: http://sportsci.org
A New View of Statistics: http://newstats.org
Sportscience Mail List: http://sportsci.org/forum
ACSM Stats Mail List: http://sportsci.org/acsmstats
Be creative: break rules.
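The conversion mentioned near the end is easy to script. A small sketch of it (Python; the numbers in the call are made up): given the observed effect, its two-tailed p value, and the error degrees of freedom, back out the standard error and hence approximate confidence limits, assuming a t-distributed estimate.

from scipy import stats

def ci_from_p(effect, p_two_tailed, df, conf=0.95):
    # |t| implied by the reported p value, then the standard error it implies
    t_obs = stats.t.ppf(1.0 - p_two_tailed / 2.0, df)
    se = abs(effect) / t_obs
    t_crit = stats.t.ppf(1.0 - (1.0 - conf) / 2.0, df)
    return effect - t_crit * se, effect + t_crit * se

print(ci_from_p(effect=4.2, p_two_tailed=0.03, df=36))   # illustrative values only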
Re: One tailed vs. Two tailed test
At 7:34 PM 12/3/01, Jerry Dallal wrote:

> Don't do one-tailed tests.

If you are going to do any tests, it makes more sense to do one-tailed tests. The resulting p value actually means something that folks can understand: it's the probability that the true value of the effect is opposite to what you have observed. Example: you observe an effect of +5.3 units, one-tailed p = 0.04. Therefore there is a probability of 0.04 that the true value is less than zero. There was a discussion of this notion a month or so ago. A Bayesian on this list made the point that the one-tailed p has this meaning only if you have absolutely no prior knowledge of the true value. Sure, no problem.

But why test at all? Just show the 95% confidence limits for your effects, and interpret them: "The effect could be as big as [the upper confidence limit], which would mean ... Or it could be [the lower confidence limit], which would represent ... Therefore ..." Doing it in this way automatically addresses the question of the power of your study, which reviewers are starting to ask about. If your study turns out to be underpowered, you can really impress the reviewers by estimating the sample size you would (probably) need to get a clear-cut effect. I can explain, if anyone is listening...

Will

--
Will G Hopkins, PhD FACSM
University of Otago, Dunedin NZ
Sportscience: http://sportsci.org
A New View of Statistics: http://newstats.org
Sportscience Mail List: http://sportsci.org/forum
ACSM Stats Mail List: http://sportsci.org/acsmstats
Be creative: break rules.
Re: One tailed vs. Two tailed test
On Tue, 13 Mar 2001, Will Hopkins wrote in part:

> Example: you observe an effect of +5.3 units, one-tailed p = 0.04. Therefore there is a probability of 0.04 that the true value is less than zero.

Sorry, that's incorrect. The probability is 0.04 that you would find an effect as large as +5.3 units (or more), if (a) the true value is zero and (b) the sampling distribution of the test statistic is what you think it is. (The probability of finding an effect this large, in this direction, is less than 0.04 if the true value is less than zero (and your sampling distribution is correct).)

> [snip] But why test at all? Just show the 95% confidence limits for your effects, and interpret them: "The effect could be as big as [the upper confidence limit], which would mean ... Or it could be [the lower confidence limit], which would represent ... Therefore ..." Doing it in this way automatically addresses the question of the power of your study, which reviewers are starting to ask about. If your study turns out to be underpowered, you can really impress the reviewers by estimating the sample size you would (probably) need to get a clear-cut effect. I can explain, if anyone is listening...

You had in mind, I trust, the _two-sided_ 95% confidence interval!
-- Don.

--
Donald F. Burrill, [EMAIL PROTECTED]
348 Hyde Hall, Plymouth State College, MSC #29, Plymouth, NH 03264, (603) 535-2597
184 Nashua Road, Bedford, NH 03110, (603) 471-7128