Fwd: Re: diff in proportions
This is true. I simulated the null distributions, that is, those obtained when the null hypothesis is true, which is what the central t-distribution represents. I didn't look at the sampling distributions for different effect sizes.

>Date: Sat, 17 Nov 2001 00:19:06 -0600
>From: jim clark <[EMAIL PROTECTED]>
>Subject: Re: diff in proportions
>To: [EMAIL PROTECTED]
>Organization: The University of Winnipeg
>
>Hi
>
>On 16 Nov 2001, Rich Strauss wrote:
>> I've just done some quick simulations in Matlab, constructing randomized
>> null distributions of the t-statistic under both scenarios: (1) sample
>> variances based on sample means vs. (2) variances about the pooled mean.
>> Assuming I've done everything correctly, the result is that the null
>> distribution of the t-statistic in the second case consistently
>> approximates the theoretical t-distribution more closely than that of the
>> first case. This seems to be true regardless of sample sizes and of
>> whether the two sample sizes are identical or different. This result
>> implies that the t-statistic should indeed be calculated about a pooled
>> estimate of the common mean, as Jerry Dallal suggested.
>>
>> I could pass on the details of my simulation if anyone is interested, but
>> mostly I'd appreciate it if someone could repeat this simulation
>> independently of mine to see whether it holds up.
>
>This simply cannot be generally true. It probably only applies
>when the null is in fact true, which may be the case for your
>simulations. To appreciate the illogical nature of this
>recommendation, consider creating a real difference of x between
>your population means, then 2x, then 3x, and so on. By the
>common mean approach, you are treating the variability between
>groups as though it were noise (i.e., a component in your
>estimate of sigma^2, the variance about the null-hypothesis of
>a common mean). It is critical to keep in mind that the null
>hypothesis is in fact just that, a hypothesis that may or may
>not be correct. Computing the within-group variance about the
>group means is the correct way to estimate sigma^2, however,
>irrespective of whether the Ho about the means is true or not.
>
>Best wishes
>Jim
>
>James M. Clark                  (204) 786-9757
>Department of Psychology        (204) 774-4134 Fax
>University of Winnipeg          4L05D
>Winnipeg, Manitoba  R3B 2E9     [EMAIL PROTECTED]
>CANADA                          http://www.uwinnipeg.ca/~clark

=
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
http://jse.stat.ncsu.edu/
=
Re: diff in proportions
Hi

On 16 Nov 2001, Rich Strauss wrote:
> I've just done some quick simulations in Matlab, constructing randomized
> null distributions of the t-statistic under both scenarios: (1) sample
> variances based on sample means vs. (2) variances about the pooled mean.
> Assuming I've done everything correctly, the result is that the null
> distribution of the t-statistic in the second case consistently
> approximates the theoretical t-distribution more closely than that of the
> first case. This seems to be true regardless of sample sizes and of
> whether the two sample sizes are identical or different. This result
> implies that the t-statistic should indeed be calculated about a pooled
> estimate of the common mean, as Jerry Dallal suggested.
>
> I could pass on the details of my simulation if anyone is interested, but
> mostly I'd appreciate it if someone could repeat this simulation
> independently of mine to see whether it holds up.

This simply cannot be generally true. It probably only applies when the null is in fact true, which may be the case for your simulations. To appreciate the illogical nature of this recommendation, consider creating a real difference of x between your population means, then 2x, then 3x, and so on. By the common mean approach, you are treating the variability between groups as though it were noise (i.e., a component in your estimate of sigma^2, the variance about the null-hypothesis of a common mean). It is critical to keep in mind that the null hypothesis is in fact just that, a hypothesis that may or may not be correct. Computing the within-group variance about the group means is the correct way to estimate sigma^2, however, irrespective of whether the Ho about the means is true or not.

Best wishes
Jim

James M. Clark                  (204) 786-9757
Department of Psychology        (204) 774-4134 Fax
University of Winnipeg          4L05D
Winnipeg, Manitoba  R3B 2E9     [EMAIL PROTECTED]
CANADA                          http://www.uwinnipeg.ca/~clark
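A minimal Python sketch of the thought experiment in the message above: shift one group by a growing amount and compare the two variance estimates. The data values and the function names `sd_within` and `sd_about_common_mean` are made up for illustration; the divisors (N-2 and N-1) are assumptions, not something stated in the thread.

```python
import math

def mean(xs):
    return sum(xs) / len(xs)

def sd_within(a, b):
    """SD estimated from deviations about each group's own mean (divisor N-2)."""
    ss = sum((x - mean(a)) ** 2 for x in a) + sum((x - mean(b)) ** 2 for x in b)
    return math.sqrt(ss / (len(a) + len(b) - 2))

def sd_about_common_mean(a, b):
    """SD estimated from deviations about the mean of the combined data (divisor N-1)."""
    both = a + b
    grand = mean(both)
    ss = sum((x - grand) ** 2 for x in both)
    return math.sqrt(ss / (len(both) - 1))

base = [-1.0, 0.0, 1.0]            # hypothetical group A
results = []
for shift in (0.0, 2.0, 4.0, 6.0):  # true difference of 0, x, 2x, 3x with x = 2
    group_b = [x + shift for x in base]
    results.append((shift, sd_within(base, group_b), sd_about_common_mean(base, group_b)))

for shift, within, common in results:
    print(f"shift={shift:4.1f}  within-group SD={within:.3f}  SD about common mean={common:.3f}")
```

The within-group estimate stays at 1 whatever the shift, while the estimate about the common mean keeps growing because it absorbs the between-group difference.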
Re: diff in proportions
At 05:12 PM 11/16/01 +, you wrote:
>>On Thu, 15 Nov 2001, Jerry Dallal wrote:
>>> But, if the null hypothesis is that the means are the same, why
>>> isn't(aren't) the sample variance(s) calculated about a pooled
>>> estimate of the common mean?

I've just done some quick simulations in Matlab, constructing randomized null distributions of the t-statistic under both scenarios: (1) sample variances based on sample means vs. (2) variances about the pooled mean. Assuming I've done everything correctly, the result is that the null distribution of the t-statistic in the second case consistently approximates the theoretical t-distribution more closely than that of the first case. This seems to be true regardless of sample sizes and of whether the two sample sizes are identical or different. This result implies that the t-statistic should indeed be calculated about a pooled estimate of the common mean, as Jerry Dallal suggested.

I could pass on the details of my simulation if anyone is interested, but mostly I'd appreciate it if someone could repeat this simulation independently of mine to see whether it holds up.

Rich Strauss
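For anyone wanting to repeat the experiment, here is one possible Python sketch. It is not the Matlab code referred to above, whose details are not given in the thread. It assumes normal data, n1 = n2 = 10, the same pooled-t formula in both cases with only the centering changed, and a hard-coded two-sided 5% critical value for 18 degrees of freedom; all function names are hypothetical.

```python
import math
import random

T_CRIT_DF18 = 2.101  # two-sided 5% critical value of Student's t, 18 df (valid only for n1 + n2 = 20)

def mean(xs):
    return sum(xs) / len(xs)

def t_group_means(a, b):
    """Usual pooled-variance t: deviations taken about each group's own mean."""
    ma, mb = mean(a), mean(b)
    ss = sum((x - ma) ** 2 for x in a) + sum((x - mb) ** 2 for x in b)
    s2 = ss / (len(a) + len(b) - 2)
    return (ma - mb) / math.sqrt(s2 * (1 / len(a) + 1 / len(b)))

def t_common_mean(a, b):
    """Same formula, but deviations taken about the mean of the combined data (an assumption)."""
    grand = mean(a + b)
    ss = sum((x - grand) ** 2 for x in a + b)
    s2 = ss / (len(a) + len(b) - 2)
    return (mean(a) - mean(b)) / math.sqrt(s2 * (1 / len(a) + 1 / len(b)))

def rejection_rates(n1=10, n2=10, reps=10000, seed=1):
    """Fraction of null samples in which each statistic exceeds the nominal critical value."""
    rng = random.Random(seed)
    hits_group = hits_common = 0
    for _ in range(reps):
        a = [rng.gauss(0.0, 1.0) for _ in range(n1)]
        b = [rng.gauss(0.0, 1.0) for _ in range(n2)]
        hits_group += abs(t_group_means(a, b)) > T_CRIT_DF18
        hits_common += abs(t_common_mean(a, b)) > T_CRIT_DF18
    return hits_group / reps, hits_common / reps

rate_group, rate_common = rejection_rates()
print("rejection rate, group means:", rate_group)
print("rejection rate, common mean:", rate_common)
```

Under this particular construction the two statistics are tied together algebraically: with N = n1 + n2, t_common^2 = t^2 / (1 + t^2/(N-2)), so the common-mean version is a monotone function of the usual t and is always smaller in magnitude. Referred to the ordinary t table it therefore rejects less often than the nominal rate; a simulation set up differently may behave differently.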
Re: diff in proportions
>On Thu, 15 Nov 2001, Jerry Dallal wrote:
>> But, if the null hypothesis is that the means are the same, why
>> isn't(aren't) the sample variance(s) calculated about a pooled
>> estimate of the common mean?

Another thought on this...

A simpler question is, for a one-sample test of the null hypothesis that the mean is zero, why don't we find a p-value based on something like a t statistic, but in which the variance is estimated by the average squared differences of the data points from zero, rather than from their sample mean?

I investigated this once, and came to the conclusion that the final result (after finding the distribution of the test statistic, and calculating p-values on that basis) is no different from the usual t test. Perhaps the same is the case for a two-sample test, which would explain why no one talks about the possibility of doing it this way.

Radford Neal

Radford M. Neal                                  [EMAIL PROTECTED]
Dept. of Statistics and Dept. of Computer Science [EMAIL PROTECTED]
University of Toronto                    http://www.cs.utoronto.ca/~radford
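The one-sample remark can be checked numerically. Writing u for the statistic whose variance is the average squared deviation from zero, the identity sum(x^2) = (n-1)s^2 + n*xbar^2 gives u = t * sqrt(n / (n - 1 + t^2)), a monotone function of the usual t, so a test based on the exact distribution of u yields the same p-values. The Python sketch below uses made-up data and hypothetical function names.

```python
import math

def usual_t(xs):
    """One-sample t for H0: mean = 0, variance taken about the sample mean."""
    n = len(xs)
    m = sum(xs) / n
    s2 = sum((x - m) ** 2 for x in xs) / (n - 1)
    return m / math.sqrt(s2 / n)

def zero_centered_stat(xs):
    """Same form, but the variance is the average squared deviation from zero."""
    n = len(xs)
    m = sum(xs) / n
    v0 = sum(x * x for x in xs) / n
    return m / math.sqrt(v0 / n)

xs = [0.8, -0.3, 1.9, 0.4, 1.1, -0.6, 0.7]  # illustrative data only
n = len(xs)
t = usual_t(xs)
u = zero_centered_stat(xs)
u_from_t = t * math.sqrt(n / (n - 1 + t * t))

print("t        =", t)
print("u        =", u)
print("u from t =", u_from_t)
```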
Re: diff in proportions
> Jerry Dallal wrote:
>
>But, if the null hypothesis is that the means are the same, why
>isn't(aren't) the sample variance(s) calculated about a pooled
>estimate of the common mean?

I looked at this some years ago. The answer is straightforward: it would be logically valid to do so but you would lose a *lot* of power.

A hypothesis test is essentially a proof by contradiction; in such an argument you are permitted to run with the hare and hunt with the hounds, changing sides as often as you like. Thus, at any stage, you may appeal to the null hypothesis or to the data; any inconsistency between the two, no matter how byzantine the argument, is evidence against the null.

If you think about the two-sample-T as a two-level ANOVA (a roughly correct idea), the pooled estimate of the mean gives you the SST; the usual method gives you the SSE. As you expect the SSTr to be nonzero, you have SSE < SST and substituting one for the other is a Bad Thing. In an extreme case:

     A     B
    10    20
    11    21
    12    22

one method estimates the SD as 1, the other as 5.55.

-Robert Dawson
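The two figures in the extreme case above can be reproduced in a few lines of Python. The divisors are assumptions chosen to match the quoted numbers: N-2 for the within-group estimate and N-1 for the estimate about the common mean.

```python
import math

A = [10, 11, 12]
B = [20, 21, 22]

def mean(xs):
    return sum(xs) / len(xs)

# SSE: squared deviations about each group's own mean
sse = sum((x - mean(A)) ** 2 for x in A) + sum((x - mean(B)) ** 2 for x in B)
sd_within = math.sqrt(sse / (len(A) + len(B) - 2))

# SST: squared deviations about the mean of the combined data
grand = mean(A + B)
sst = sum((x - grand) ** 2 for x in A + B)
sd_common = math.sqrt(sst / (len(A) + len(B) - 1))

print(round(sd_within, 2))  # 1.0
print(round(sd_common, 2))  # 5.55
```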
Re: diff in proportions
In article <[EMAIL PROTECTED]>,
Jerry Dallal <[EMAIL PROTECTED]> wrote:
>Radford Neal wrote:

>> The difference is that when dealing with real data, it is possible for
>> two populations to have the same mean (as assumed by the null), but
>> different variances. In contrast, when dealing with binary data, if
>> the means are the same in the two populations, the variances must
>> necessarily be the same as well. So one can argue on this basis that
>> the distribution of the p-values if the null is true will be close to
>> correct when using the pooled estimate (apart from the use of a normal
>> approximation, etc.)

>But, if the null hypothesis is that the means are the same, why
>isn't(aren't) the sample variance(s) calculated about a pooled
>estimate of the common mean?

I suspect that much of the confusion comes from the overuse of the normal distribution. With a normal distribution, each sample has a mean and variance, and these are the sufficient statistics. Now SOME of this may carry over to SOME other problems, but when one is doing statistical inference, the probability model for the actual situation should be used, and there should not be an attempt to connect the inference with that from a normal model.

In the case of the binomial, it is the case that the sample mean is a sufficient statistic. But it is not a "measure of central tendency"; the individual Bernoulli trials are all 0 or 1. Do the actual problem; do not force it into an inappropriate mold.

--
This address is for information only. I do not claim that these views are those of the Statistics Department or of Purdue University.
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907-1399
[EMAIL PROTECTED] Phone: (765)494-6054 FAX: (765)494-0558
Re: diff in proportions
In article <[EMAIL PROTECTED]>,
dennis roberts <[EMAIL PROTECTED]> wrote:
>At 08:03 PM 11/15/01 +, Radford Neal wrote:
>>Radford Neal:

>> >> The difference is that when dealing with real data, it is possible for
>> >> two populations to have the same mean (as assumed by the null), but
>> >> different variances. In contrast, when dealing with binary data, if
>> >> the means are the same in the two populations, the variances must
>> >> necessarily be the same as well. So one can argue on this basis that
>> >> the distribution of the p-values if the null is true will be close to
>> >> correct when using the pooled estimate (apart from the use of a normal
>> >> approximation, etc.)

>>Jerry Dallal:

>> >But, if the null hypothesis is that the means are the same, why
>> >isn't(aren't) the sample variance(s) calculated about a pooled
>> >estimate of the common mean?

>>An interesting question.

>i think what this shows (ie, these small highly technical distinctions) is
>that ... that most null hypotheses that we use for our array of
>significance tests ... have rather little meaning

>null hypothesis testing is a highly overrated activity in statistical work

Agreed. The question is how to act.

>in the case of differences between two proportions ... the useful question
>is: i wonder how much difference (since i know there is bound to be some
>[even though it could be trivial]) there is between the proportions of A
>population versus B population?

Now this is a difficult problem. It is only in translation parameter problems that it is even clear that this is what should be asked. Confidence intervals for a binomial proportion are a major headache, although for large samples, the usual asymptotic expressions give a good approximation.

>to seek an answer to the real question ... no notion of null has to even be
>entertained

If the means are far apart, one definitely should NOT use the pooled mean to estimate the precision; the estimate of precision from that is always too large. If the means are close, the difference might be unimportant.

--
This address is for information only. I do not claim that these views are those of the Statistics Department or of Purdue University.
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907-1399
[EMAIL PROTECTED] Phone: (765)494-6054 FAX: (765)494-0558
Re: diff in proportions
Hi

On Thu, 15 Nov 2001, Jerry Dallal wrote:
> But, if the null hypothesis is that the means are the same, why
> isn't(aren't) the sample variance(s) calculated about a pooled
> estimate of the common mean?

What you are testing is whether there is more variability between groups than you would expect by chance given the variability within groups. This is most clear with the F test, of course (i.e., F = n*Vmeans/Vwithin), but t is simply a variation of this: is the difference between X1 and X2 (i.e., variation in the Xjs) greater than expected given the variation within groups? Taking the common mean to calculate a variance would conflate the within- and between-group factors that you want to contrast.

Best wishes
Jim

James M. Clark                  (204) 786-9757
Department of Psychology        (204) 774-4134 Fax
University of Winnipeg          4L05D
Winnipeg, Manitoba  R3B 2E9     [EMAIL PROTECTED]
CANADA                          http://www.uwinnipeg.ca/~clark
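The relation F = n*Vmeans/Vwithin and its link to t can be checked on a small example. This Python sketch assumes two groups of equal size n and uses made-up data; `var` is the ordinary sample variance with divisor n-1, and with two equal-sized groups F equals the square of the pooled-variance t.

```python
import math

def mean(xs):
    return sum(xs) / len(xs)

def var(xs):
    """Sample variance with divisor n - 1."""
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

g1 = [4.1, 5.0, 6.2, 5.5]  # illustrative data only
g2 = [6.8, 7.1, 5.9, 8.0]
n = len(g1)                # equal group sizes assumed

v_means = var([mean(g1), mean(g2)])   # variance of the group means
v_within = (var(g1) + var(g2)) / 2    # average within-group variance
F = n * v_means / v_within

t = (mean(g1) - mean(g2)) / math.sqrt(v_within * (2 / n))

print("F   =", F)
print("t^2 =", t * t)
```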
Re: diff in proportions
At 08:03 PM 11/15/01 +, Radford Neal wrote:
>Radford Neal:
>
> >> The difference is that when dealing with real data, it is possible for
> >> two populations to have the same mean (as assumed by the null), but
> >> different variances. In contrast, when dealing with binary data, if
> >> the means are the same in the two populations, the variances must
> >> necessarily be the same as well. So one can argue on this basis that
> >> the distribution of the p-values if the null is true will be close to
> >> correct when using the pooled estimate (apart from the use of a normal
> >> approximation, etc.)
>
>Jerry Dallal:
>
> >But, if the null hypothesis is that the means are the same, why
> >isn't(aren't) the sample variance(s) calculated about a pooled
> >estimate of the common mean?
>
>An interesting question.

i think what this shows (ie, these small highly technical distinctions) is that ... that most null hypotheses that we use for our array of significance tests ... have rather little meaning

null hypothesis testing is a highly overrated activity in statistical work

in the case of differences between two proportions ... the useful question is: i wonder how much difference (since i know there is bound to be some [even though it could be trivial]) there is between the proportions of A population versus B population?

to seek an answer to the real question ... no notion of null has to even be entertained

==
dennis roberts, penn state university
educational psychology, 8148632401
http://roberts.ed.psu.edu/users/droberts/drober~1.htm
Re: diff in proportions
Radford Neal wrote:
>
> The difference is that when dealing with real data, it is possible for
> two populations to have the same mean (as assumed by the null), but
> different variances. In contrast, when dealing with binary data, if
> the means are the same in the two populations, the variances must
> necessarily be the same as well. So one can argue on this basis that
> the distribution of the p-values if the null is true will be close to
> correct when using the pooled estimate (apart from the use of a normal
> approximation, etc.)
>

But, if the null hypothesis is that the means are the same, why isn't(aren't) the sample variance(s) calculated about a pooled estimate of the common mean?
Re: diff in proportions
Dennis Roberts wrote:
>
> At 08:51 AM 11/15/01 -0600, jim clark wrote:
>
> >The Ho in the case of means is NOT about the variances, so the
> >analogy breaks down. That is, we are not hypothesizing
> >Ho: sig1^2 = sig2^2, but rather Ho: mu1 = mu2. So there is no
> >direct link between Ho and the SE, unlike the proportions
> >example.
>
> would it be correct then to say ... that the test of differences in
> proportions is REALLY a test about the differences between two population
> variances?

No, because it would reject the null (with large enough samples) when pi_1 = 1-pi_2, despite the fact that the variances would be equal!

-Robert Dawson
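A small Python illustration of the counterexample above, with made-up numbers: pi_1 = 0.3 and pi_2 = 0.7 have the same Bernoulli variance, yet a pooled z test on large samples with matching observed proportions rejects equality of proportions decisively. The sample sizes and function names are hypothetical.

```python
import math

def bernoulli_variance(p):
    return p * (1 - p)

def pooled_z(x1, n1, x2, n2):
    """z statistic for H0: pi_1 = pi_2 using the pooled standard error."""
    p_pool = (x1 + x2) / (n1 + n2)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    return (x1 / n1 - x2 / n2) / se

pi_1, pi_2 = 0.3, 0.7
same_variance = math.isclose(bernoulli_variance(pi_1), bernoulli_variance(pi_2))

# Hypothetical samples whose observed proportions equal pi_1 and pi_2
z = pooled_z(300, 1000, 700, 1000)

print("variances equal:", same_variance)
print("z =", round(z, 2))
```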
Re: diff in proportions
I'm not really arguing for using the pooled stdev in this case, I'm just trying to find out the reasons for significance testing procedures.

I think that what we're discussing here is whether we should use CIs BOTH for stating effect sizes with errors AND for hypothesis testing. I read a book by Michael Smithson called Statistics with Confidence (SAGE, 2000). He's using CIs through the whole book in formulations of hypothesis testing. It was really nice reading and I believe students would appreciate the clearness of using fewer formulae for SEs. But then I think we also have to kill darlings like Pearson's Chi Sq.

Rolf D

> At 04:26 PM 11/15/01 +0100, Rolf Dalin wrote:
>
> >The significance test produces a p-value UNDER THE CONDITION
> >that the null is true. In my opinion it does not matter whether we
> >know it isn't true. It is just an assumption for the calculations. And
> >these calculations do not produce exactly the same information as the CI
> >for the difference. They state in some sense, if the procedure was
> >repeated, how probable it would be to ... etc.
>
> this might make sense if the sample p*q values were the same for BOTH
> samples ... but if they are not (which will almost always be the case in
> real data) ... then you already have SOME evidence that the null is
> perhaps not true (of course, we know that it is not exactly true anyway
> ... so that sort of tosses out the notion of pooling so as to get a better
> estimate of a COMMON variance)
>
> earlier in their presentation, moore and mccabe say that they prefer to
> use a CI to test some null in this case ... but, if one did a z test with
> the unpooled estimator for standard error, this would lead to a "valid"
> significance test ... HOWEVER ... then they go on to say that INSTEAD,
> they will adopt the pooled standard error approach since it is the " ...
> more common practice"
>
> that logic escapes me
>
> if we can build a CI using the unpooled standard error formula and, find
> that to be ok to see if some null value like 0 difference in population
> proportions is inside or outside of the CI, i don't see any need to switch
> the denominator formula in the z test JUST because we want to use the z
> test STATISTIC to test the null
>
> a little more consistency in logic would seem to be in the best interests
> of students trying to learn this ...
>
> i would still argue that the extent to which you would not be willing to
> use the pooled standard error formula in the case of differences in means,
> would be the same extent to which you would not be willing to use the
> pooled standard error formula when it comes to differences in proportions
> ... i don't see that the logic really is any different
>
> but, this is just my opinion
>
> _
> dennis roberts, educational psychology, penn state university
> 208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED]
> http://roberts.ed.psu.edu/users/droberts/drober~1.htm
>

**
Rolf Dalin
Department of Information Technology and Media
Mid Sweden University
S-870 51 SUNDSVALL
Sweden
Phone: 060 148690, international: +46 60 148690
Fax: 060 148970, international: +46 60 148970
Mobile: 0705 947896, international: +46 70 5947896
mailto:[EMAIL PROTECTED]
http://www.itk.mh.se/~roldal/
**
Re: diff in proportions
In article <[EMAIL PROTECTED]>,
dennis roberts <[EMAIL PROTECTED]> wrote:

>in the moore and mccabe book (IPS), in the section on testing for
>differences in population proportions, when it comes to doing a 'z' test
>for significance, they argue for (and say this is commonly done) that the
>standard error for the difference in proportions formula should be a POOLED
>one ...
>
>in their discussion of differences in means ... they present FIRST the NON
>pooled version of the standard error and that is their preferred way to
>build CIs and do t tests ... though they also bring in later the pooled
>version as a later topic (and of course if we KNEW that populations had the
>same variances, then the pooled version would be useful)
>
>it seems to me that this same logic should hold in the case of differences
>in proportions

The difference is that when dealing with real data, it is possible for two populations to have the same mean (as assumed by the null), but different variances. In contrast, when dealing with binary data, if the means are the same in the two populations, the variances must necessarily be the same as well. So one can argue on this basis that the distribution of the p-values if the null is true will be close to correct when using the pooled estimate (apart from the use of a normal approximation, etc.)

Radford Neal

Radford M. Neal                                  [EMAIL PROTECTED]
Dept. of Statistics and Dept. of Computer Science [EMAIL PROTECTED]
University of Toronto                    http://www.cs.utoronto.ca/~radford
Re: diff in proportions
dennis roberts wrote:
>
> in the moore and mccabe book (IPS), in the section on testing for
> differences in population proportions, when it comes to doing a 'z' test
> for significance, they argue for (and say this is commonly done) that the
> standard error for the difference in proportions formula should be a POOLED
> one ... since if one is testing the null of equal proportions, then that
> means your null is assuming that the p*q combinations are the SAME for both
> populations thus, this is a case of pooling sample variances to estimate a
> single common population variance
>
> but since this is just a null ... and we have no way of knowing if the null
> is true (not that we can in any case) ... i don't see any logical
> progression that would then lead one to also assume that the p*q
> combinations are the same in the two populations ... hence, i don't see why
> the pooled variance version of the standard error of a difference in
> proportions formula would be the recommended way to go
>
> in their discussion of differences in means ... they present FIRST the NON
> pooled version of the standard error and that is their preferred way to
> build CIs and do t tests ... though they also bring in later the pooled
> version as a later topic (and of course if we KNEW that populations had the
> same variances, then the pooled version would be useful)
>
> it seems to me that this same logic should hold in the case of differences
> in proportions
>

Either form is valid, that is, either produces a test of the requisite size under the null. To my knowledge, neither test has been proven uniformly superior in terms of power. There are some alternatives where each is the better.

While I don't have the text and it may be using a version of the test that is different from the way I usually see it constructed, the way it's typically formulated, the square of the pooled statistic is equal to the usual Pearson chi-square statistic for homogeneity of proportions.
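The identity between the squared pooled z statistic and the Pearson chi-square statistic (without continuity correction) for a 2x2 table can be verified directly. In this Python sketch the counts are made up and the function names are hypothetical.

```python
import math

def pooled_z(x1, n1, x2, n2):
    """z for H0: equal proportions, using the pooled standard error."""
    p_pool = (x1 + x2) / (n1 + n2)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    return (x1 / n1 - x2 / n2) / se

def pearson_chi2(x1, n1, x2, n2):
    """Pearson chi-square for the 2x2 table of successes/failures, no continuity correction."""
    observed = [[x1, n1 - x1], [x2, n2 - x2]]
    row_totals = [n1, n2]
    col_totals = [x1 + x2, (n1 - x1) + (n2 - x2)]
    total = n1 + n2
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed[i][j] - expected) ** 2 / expected
    return chi2

z = pooled_z(30, 100, 45, 120)      # illustrative counts only
chi2 = pearson_chi2(30, 100, 45, 120)

print("z^2  =", z * z)
print("chi2 =", chi2)
```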
Re: diff in proportions
At 08:51 AM 11/15/01 -0600, jim clark wrote:

>The Ho in the case of means is NOT about the variances, so the
>analogy breaks down. That is, we are not hypothesizing
>Ho: sig1^2 = sig2^2, but rather Ho: mu1 = mu2. So there is no
>direct link between Ho and the SE, unlike the proportions
>example.

would it be correct then to say ... that the test of differences in proportions is REALLY a test about the differences between two population variances?

>Best wishes
>Jim
>
>James M. Clark                  (204) 786-9757
>Department of Psychology        (204) 774-4134 Fax
>University of Winnipeg          4L05D
>Winnipeg, Manitoba  R3B 2E9     [EMAIL PROTECTED]
>CANADA                          http://www.uwinnipeg.ca/~clark

_
dennis roberts, educational psychology, penn state university
208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED]
http://roberts.ed.psu.edu/users/droberts/drober~1.htm
Re: diff in proportions
At 04:26 PM 11/15/01 +0100, Rolf Dalin wrote:

>The significance test produces a p-value UNDER THE CONDITION
>that the null is true. In my opinion it does not matter whether we
>know it isn't true. It is just an assumption for the calculations. And
>these calculations do not produce exactly the same information as
>the CI for the difference. They state in some sense, if the procedure
>was repeated, how probable it would be to ... etc.

this might make sense if the sample p*q values were the same for BOTH samples ... but if they are not (which will almost always be the case in real data) ... then you already have SOME evidence that the null is perhaps not true (of course, we know that it is not exactly true anyway ... so that sort of tosses out the notion of pooling so as to get a better estimate of a COMMON variance)

earlier in their presentation, moore and mccabe say that they prefer to use a CI to test some null in this case ... but, if one did a z test with the unpooled estimator for standard error, this would lead to a "valid" significance test ... HOWEVER ... then they go on to say that INSTEAD, they will adopt the pooled standard error approach since it is the " ... more common practice"

that logic escapes me

if we can build a CI using the unpooled standard error formula and, find that to be ok to see if some null value like 0 difference in population proportions is inside or outside of the CI, i don't see any need to switch the denominator formula in the z test JUST because we want to use the z test STATISTIC to test the null

a little more consistency in logic would seem to be in the best interests of students trying to learn this ...

i would still argue that the extent to which you would not be willing to use the pooled standard error formula in the case of differences in means, would be the same extent to which you would not be willing to use the pooled standard error formula when it comes to differences in proportions ... i don't see that the logic really is any different

but, this is just my opinion

_
dennis roberts, educational psychology, penn state university
208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED]
http://roberts.ed.psu.edu/users/droberts/drober~1.htm
Re: diff in proportions
Hi

On 15 Nov 2001, dennis roberts wrote:
> in the moore and mccabe book (IPS), in the section on testing for
> differences in population proportions, when it comes to doing a 'z' test
> for significance, they argue for (and say this is commonly done) that the
> standard error for the difference in proportions formula should be a POOLED
> one ... since if one is testing the null of equal proportions, then that
> means your null is assuming that the p*q combinations are the SAME for both
> populations thus, this is a case of pooling sample variances to estimate a
> single common population variance
>
> but since this is just a null ... and we have no way of knowing if the null
> is true (not that we can in any case) ... i don't see any logical
> progression that would then lead one to also assume that the p*q
> combinations are the same in the two populations ... hence, i don't see why
> the pooled variance version of the standard error of a difference in
> proportions formula would be the recommended way to go

The p value that one is calculating assumes that the Ho is true, doesn't it? That is, what is p(zobt > zalpha | Ho true)? So assuming equality is correct assuming Ho true; that is, p1 = p2 in the population.

> in their discussion of differences in means ... they present FIRST the NON
> pooled version of the standard error and that is their preferred way to
> build CIs and do t tests ... though they also bring in later the pooled
> version as a later topic (and of course if we KNEW that populations had the
> same variances, then the pooled version would be useful)
>
> it seems to me that this same logic should hold in the case of differences
> in proportions

The Ho in the case of means is NOT about the variances, so the analogy breaks down. That is, we are not hypothesizing Ho: sig1^2 = sig2^2, but rather Ho: mu1 = mu2. So there is no direct link between Ho and the SE, unlike the proportions example.

Best wishes
Jim

James M. Clark                  (204) 786-9757
Department of Psychology        (204) 774-4134 Fax
University of Winnipeg          4L05D
Winnipeg, Manitoba  R3B 2E9     [EMAIL PROTECTED]
CANADA                          http://www.uwinnipeg.ca/~clark
RE: diff in proportions
Dennis,

I am not sure about this, but here goes anyway. Since the decision-making process is based on Type I error (critical point and p-value), and since Type I error is calculated under the assumption that the null hypothesis is true, the "pooled" formula is appropriate. However, when one is doing power calculations, one would not use the "pooled" formula (similar to using a non-central t with continuous data).

Howard Kaplon

-----Original Message-----
From: dennis roberts [mailto:[EMAIL PROTECTED]]
Sent: Thursday, November 15, 2001 8:30 AM
To: [EMAIL PROTECTED]
Subject: diff in proportions

in the moore and mccabe book (IPS), in the section on testing for differences in population proportions, when it comes to doing a 'z' test for significance, they argue for (and say this is commonly done) that the standard error for the difference in proportions formula should be a POOLED one ... since if one is testing the null of equal proportions, then that means your null is assuming that the p*q combinations are the SAME for both populations thus, this is a case of pooling sample variances to estimate a single common population variance

but since this is just a null ... and we have no way of knowing if the null is true (not that we can in any case) ... i don't see any logical progression that would then lead one to also assume that the p*q combinations are the same in the two populations ... hence, i don't see why the pooled variance version of the standard error of a difference in proportions formula would be the recommended way to go

in their discussion of differences in means ... they present FIRST the NON pooled version of the standard error and that is their preferred way to build CIs and do t tests ... though they also bring in later the pooled version as a later topic (and of course if we KNEW that populations had the same variances, then the pooled version would be useful)

it seems to me that this same logic should hold in the case of differences in proportions

comments?

==
dennis roberts, penn state university
educational psychology, 8148632401
http://roberts.ed.psu.edu/users/droberts/drober~1.htm
diff in proportions
in the moore and mccabe book (IPS), in the section on testing for differences in population proportions, when it comes to doing a 'z' test for significance, they argue for (and say this is commonly done) that the standard error for the difference in proportions formula should be a POOLED one ... since if one is testing the null of equal proportions, then that means your null is assuming that the p*q combinations are the SAME for both populations thus, this is a case of pooling sample variances to estimate a single common population variance

but since this is just a null ... and we have no way of knowing if the null is true (not that we can in any case) ... i don't see any logical progression that would then lead one to also assume that the p*q combinations are the same in the two populations ... hence, i don't see why the pooled variance version of the standard error of a difference in proportions formula would be the recommended way to go

in their discussion of differences in means ... they present FIRST the NON pooled version of the standard error and that is their preferred way to build CIs and do t tests ... though they also bring in later the pooled version as a later topic (and of course if we KNEW that populations had the same variances, then the pooled version would be useful)

it seems to me that this same logic should hold in the case of differences in proportions

comments?

==
dennis roberts, penn state university
educational psychology, 8148632401
http://roberts.ed.psu.edu/users/droberts/drober~1.htm
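For reference, the two standard errors being compared can be written out side by side. This Python sketch uses made-up counts and hypothetical function names, and the 1.96 multiplier assumes the usual large-sample normal approximation; it is not taken from the Moore and McCabe text.

```python
import math

def se_unpooled(x1, n1, x2, n2):
    """SE of p1 - p2 using each sample's own p*q (the form used for CIs)."""
    p1, p2 = x1 / n1, x2 / n2
    return math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)

def se_pooled(x1, n1, x2, n2):
    """SE of p1 - p2 using a single pooled proportion, as under H0: pi_1 = pi_2."""
    p = (x1 + x2) / (n1 + n2)
    return math.sqrt(p * (1 - p) * (1 / n1 + 1 / n2))

x1, n1, x2, n2 = 45, 100, 30, 100   # illustrative counts only
diff = x1 / n1 - x2 / n2

se_u = se_unpooled(x1, n1, x2, n2)
se_p = se_pooled(x1, n1, x2, n2)
z_u = diff / se_u
z_p = diff / se_p
ci_low, ci_high = diff - 1.96 * se_u, diff + 1.96 * se_u

print("unpooled SE:", round(se_u, 4), " z:", round(z_u, 3))
print("pooled SE:  ", round(se_p, 4), " z:", round(z_p, 3))
print("95% CI for the difference (unpooled):", round(ci_low, 3), round(ci_high, 3))
```

With these counts the two standard errors differ only slightly and both z statistics lead to the same conclusion; with equal sample sizes the pooled SE is never smaller than the unpooled one, since p*q is concave in p.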