Re: Student's t vs. z tests
In article [EMAIL PROTECTED], dennis roberts [EMAIL PROTECTED] wrote:

students have enough problems with all the stuff in stat as it is ... but, when we start some discussion about sampling error of means ... for use in building a confidence interval and/or testing some hypothesis ... the first thing observant students will ask when you say to them ... assume SRS of n=50 and THAT WE KNOW THAT THE POPULATION SD = 4 ... is: if we are trying to do some inferencing about the population mean ... how come we know the population sd but NOT the mean too? most find this notion highly illogical ... but we and books trudge on ... and they are correct of course in the NON logic of this scenario

thus, it makes a ton more sense to me to introduce at this point a t distribution ... this is NOT hard to do ... then get right on with the reality case

I don't find this persuasive. I think that any student who has the abstract reasoning ability needed to understand the concepts involved will not have any difficulty accepting a statement that "this situation doesn't come up often in practice, but we'll start with it because it's simpler".

I have my doubts that introducing the t distribution is "NOT hard", if by that you mean that it's not hard to get them to understand what's actually happening. Of course, it's not very hard to get them to understand how to plug the numbers into the formula.

I think one could argue that introducing the z test first is MORE realistic. The situation where there are "nuisance" parameters that affect the distribution of the test statistic but are in practice unknown is TYPICAL. It's just a lucky break that the distribution of the t statistic doesn't depend on sigma. After seeing the z test, students will realize how lucky one is to have such a statistic, and will realize that one shouldn't expect that to happen all the time. (Well, the really good ones might realize all this.)
Radford Neal

= Instructions for joining and leaving this list and remarks about the problem of INAPPROPRIATE MESSAGES are available at http://jse.stat.ncsu.edu/ =
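The remark in the message above, that the distribution of the t statistic does not depend on sigma while a z test needs sigma to be known, can be checked with a small simulation. What follows is only a sketch under stated assumptions: normal data, a true null of mean 0, n = 10, and the usual two-sided 5% critical values (2.262 for t with 9 degrees of freedom, 1.96 for z). The function names and the choice of sigma values are made up for illustration and do not come from the thread.

```python
import math
import random


def t_statistic(sample, mu0):
    """One-sample t statistic: (mean - mu0) / (s / sqrt(n)), s estimated from the data."""
    n = len(sample)
    mean = sum(sample) / n
    s2 = sum((x - mean) ** 2 for x in sample) / (n - 1)  # sample variance
    return (mean - mu0) / math.sqrt(s2 / n)


def z_statistic(sample, mu0, assumed_sigma):
    """One-sample z statistic; assumed_sigma is whatever population SD we claim to know."""
    n = len(sample)
    mean = sum(sample) / n
    return (mean - mu0) / (assumed_sigma / math.sqrt(n))


def t_stat_null(sample):
    return t_statistic(sample, 0.0)


def z_stat_assuming_4(sample):
    # Hypothetical analyst who believes the population SD is 4.
    return z_statistic(sample, 0.0, 4.0)


def rejection_rate(stat, crit, true_sigma, n=10, reps=20000, seed=1):
    """Fraction of N(0, true_sigma^2) samples for which |stat| exceeds crit."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        sample = [rng.gauss(0.0, true_sigma) for _ in range(n)]
        if abs(stat(sample)) > crit:
            hits += 1
    return hits / reps


if __name__ == "__main__":
    for true_sigma in (1.0, 4.0, 8.0):
        t_rate = rejection_rate(t_stat_null, 2.262, true_sigma)
        z_rate = rejection_rate(z_stat_assuming_4, 1.96, true_sigma)
        print(f"true sigma={true_sigma}: t rejects {t_rate:.3f}, "
              f"z (assuming sigma=4) rejects {z_rate:.3f}")
```

Under these assumptions the t test should reject at close to the nominal 5% rate whatever the true sigma is, whereas the z test holds its level only when the assumed sigma happens to match the true one. That is one way to see sigma acting as a nuisance parameter for the z statistic but not for the t statistic.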
Re: On inappropriate hypothesis testing. Was: MIT Sexism statistical bunk
In article [EMAIL PROTECTED], Thom Baguley [EMAIL PROTECTED] wrote:

Why not think of it in terms of "Could this difference be produced by 6 players of equal ability influenced by a large number of random factors". In that case a significance test might have some value in evaluating the hypothesis that one group was better.

Recall that this baseball example was intended to clarify how one should go about determining whether or not there is reason to think that MIT discriminated against women faculty.

From your comment, I'd guess that you think that MIT should not pay faculty based on their actual achievements, but rather on the basis of some estimate of their ability, disregarding "random factors". That's an interesting opinion, but would a policy of paying based on actual achievement (or a noisy estimate of actual achievement) constitute discrimination?

Radford Neal

Radford M. Neal [EMAIL PROTECTED]
Dept. of Statistics and Dept. of Computer Science [EMAIL PROTECTED]
University of Toronto http://www.cs.utoronto.ca/~radford
Re: On inappropriate hypothesis testing. Was: MIT Sexism statistical bunk
In article [EMAIL PROTECTED], Rich Ulrich [EMAIL PROTECTED] wrote:

I agree, if you don't have "statistical power," then you don't ask for a 5% test, or (maybe) any test at all. The JUSTIFICATION for having a test on the MIT data is that the power is sufficient to say something.

The reason why one should NOT do a significance test on this data, at any level, and regardless of how much power the test would have, was explained by me a while ago in the post I have repeated below. If you think there is something wrong with my reasoning, I suggest you explain the flaw.

Radford Neal

--

I think the statistical issue in this discussion can be boiled down to a question of how to calculate standard errors for regression coefficients.

What regression? Well, there isn't one, because there isn't any data, but the discussion seems to presuppose the possibility of data that for each faculty member gives their salary (the response variable, y), their gender (x1, coded as a dummy variable), and some indicator of performance (x2). The question is whether one has evidence that the regression coefficient for the dummy gender variable (x1) is non-zero. This will require computing the standard error for the estimate of this regression coefficient.

The accepted procedure for computing this standard error involves the sample correlation between the two predictors, x1 and x2. When the sample correlation is high, the standard errors for the regression coefficients will tend to be high, making it more difficult to conclude that the coefficient for gender is non-zero.

The procedure apparently being advocated by some posters is to perform a test of the null hypothesis that the correlation between x1 and x2 in the population is zero, and if there is not sufficient evidence to reject this null hypothesis, compute the standard errors for the regression coefficients as if the predictors were uncorrelated.

I believe that this procedure is not generally accepted, for very good reasons.

Radford M. Neal [EMAIL PROTECTED]
Dept. of Statistics and Dept. of Computer Science [EMAIL PROTECTED]
University of Toronto http://www.cs.utoronto.ca/~radford
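To make the role of the sample correlation in the message above concrete, here is a minimal sketch of the standard error of the x1 coefficient in a least-squares fit of y on an intercept, x1 and x2. It uses the standard two-predictor result SE(b1) = s / sqrt(S11 * (1 - r^2)), where s is the residual standard deviation, S11 is the sum of squared deviations of x1, and r is the sample correlation between x1 and x2. The data are invented purely for illustration (the thread itself notes there is no data), the function names are hypothetical, and treating "as if the predictors were uncorrelated" as setting r to zero in this formula is an assumption about what that procedure would amount to.

```python
import math


def sample_correlation(x1, x2):
    """Pearson sample correlation between two equal-length sequences."""
    n = len(x1)
    m1, m2 = sum(x1) / n, sum(x2) / n
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    return s12 / math.sqrt(s11 * s22)


def se_x1_coefficient(x1, x2, residual_sd, assume_uncorrelated=False):
    """Standard error of the x1 coefficient in y ~ 1 + x1 + x2.

    residual_sd is the estimated residual standard deviation, taken as given.
    With assume_uncorrelated=True the sample correlation is replaced by zero,
    which is one reading of the procedure criticised in the message (assumption).
    """
    n = len(x1)
    m1 = sum(x1) / n
    s11 = sum((a - m1) ** 2 for a in x1)
    r = 0.0 if assume_uncorrelated else sample_correlation(x1, x2)
    return residual_sd / math.sqrt(s11 * (1.0 - r * r))


# Invented toy data: x1 is a gender dummy, x2 a performance indicator.
toy_x1 = [0, 0, 0, 0, 1, 1, 1, 1]
toy_x2 = [1, 2, 3, 4, 3, 4, 5, 6]

if __name__ == "__main__":
    r = sample_correlation(toy_x1, toy_x2)
    accepted = se_x1_coefficient(toy_x1, toy_x2, residual_sd=1.0)
    shortcut = se_x1_coefficient(toy_x1, toy_x2, residual_sd=1.0,
                                 assume_uncorrelated=True)
    print(f"sample correlation: {r:.3f}")
    print(f"SE using the sample correlation: {accepted:.3f}")
    print(f"SE computed as if uncorrelated:  {shortcut:.3f}")
```

In this sketch the two standard errors differ by the factor 1 / sqrt(1 - r^2), which depends only on the correlation observed in the sample. Whether a test fails to reject zero correlation in the population does not change that factor, so under this reading the shortcut would understate the standard error whenever the sample correlation is non-zero.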