I need Help!!
Please help me to solve this problem; I am stuck. An inspector inspects large truckloads of potatoes to determine the proportion p in the shipment with major defects prior to using the potatoes to make potato chips. Unless there is clear evidence that this proportion is less than 0.10, she will reject the shipment. To reach a decision she will test the hypotheses

    H0: p = 0.10,  Ha: p < 0.10

using the large-sample test for a population proportion. To do so, she selects an SRS of 50 potatoes from the more than 2000 potatoes on the truck. Suppose that only 2 of the potatoes sampled are found to have major defects. Determine the P-value of her test.

I really appreciate your help.

John Lexmark
Struggling Statistics Student

===
This list is open to everyone. Occasionally, less thoughtful people send inappropriate messages. Please DO NOT COMPLAIN TO THE POSTMASTER about these messages because the postmaster has no way of controlling them, and excessive complaints will result in termination of the list. For information about this list, including information about the problem of inappropriate messages and information about how to unsubscribe, please see the web page at http://jse.stat.ncsu.edu/
===
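A minimal sketch of the large-sample computation the poster describes (in Python; note that with only 2 successes in 50 trials the normal approximation is itself questionable, so treat this as an illustration of the formula rather than a definitive answer):

```python
import math

# Large-sample test for a proportion: H0: p = 0.10 vs Ha: p < 0.10
# (values taken from the problem statement above)
p0 = 0.10          # hypothesized proportion
n = 50             # SRS size
x = 2              # potatoes with major defects
p_hat = x / n      # sample proportion = 0.04

# z statistic uses the null standard error sqrt(p0 * (1 - p0) / n)
z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)

# one-sided (lower-tail) P-value via the standard normal CDF,
# Phi(z) = 0.5 * (1 + erf(z / sqrt(2)))
p_value = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

print(f"z = {z:.3f}, P-value = {p_value:.4f}")  # z ≈ -1.414, P ≈ 0.0786
```

So at the usual 0.05 level this sample would not give "clear evidence" that p < 0.10.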
Re: VIF
On 31 May 2000, Vmcw wrote:

>>> It is 10. I hope you are talking about the Variance Inflation Factor.
>>> More than 10 indicates severe multicollinearity.

Thus spake Jin Singh. And someone else (was it Dave Heiser?) retorted, sensibly I thought,

>> And where does this magic number come from? :)

To which Tom in PA replied (possibly tongue-in-cheek?),

> Neter, Wasserman, Nachtsheim, and Kutner, of course! (or is it Wasserman,
> Kutner, Neter, and Nachtsheim or one of the other 22 permutations?)

I've heard of a Wasserman (or Wassermann?) test, but didn't think it had to do with VIF. Dunno about all those other blokes. But apart from argument by Appeal to Irrelevant Authority at HeadQuarters, was there actually some _reasoning_ underlying the selection of VIF = 10, or was it just someone's arbitrary guess (like the 10 subjects per variable one is supposed to have before one dares essay a factor analysis)?

-- Don.

Donald F. Burrill [EMAIL PROTECTED]
348 Hyde Hall, Plymouth State College, MSC #29, Plymouth, NH 03264  603-535-2597
184 Nashua Road, Bedford, NH 03110  603-471-7128
Re: VIF
>> It is 10. I hope you are talking about the Variance Inflation Factor.
>> More than 10 indicates severe multicollinearity.
>
> And where does this magic number come from? :)

Neter, Wasserman, Nachtsheim, and Kutner, of course! (or is it Wasserman, Kutner, Neter, and Nachtsheim or one of the other 22 permutations?)

Tom in PA
Re: VIF
In article <000701bfca86$f831b9a0$047c6395@sprint>, [EMAIL PROTECTED] says...

> It is 10. I hope you are talking about the Variance Inflation Factor.
> More than 10 indicates severe multicollinearity.

And where does this magic number come from? :)

> Jin
>
> Jineshwar Singh, Coordinator, IDS
> Interdisciplinary Department
> George Brown College
> St. James campus
> [EMAIL PROTECTED]
>
> - Original Message -
> From: Karen Scheltema <[EMAIL PROTECTED]>
> Sent: Tuesday, May 30, 2000 4:51 PM
> Subject: VIF
>
>> What is the usual cutoff for saying the VIF is too high?

--
T.S. Lim
[EMAIL PROTECTED]
www.Recursive-Partitioning.com
Re: VIF
Karen Scheltema wrote in message <[EMAIL PROTECTED]>...

> What is the usual cutoff for saying the VIF is too high?

I don't see that there can be any general criterion for saying that a VIF is too large. A large value indicates collinearity between predictor variables. In some fields, this cannot be avoided. I have one set of data for which most of the VIFs are in excess of a million. The data are from NIR spectroscopy, where this is unavoidable.

If you do have large VIFs, then make sure that your least-squares software uses some form of orthogonal reduction. If it uses the normal equations, and hence squares the condition number, then you could be in trouble.

--
Alan Miller, Retired Scientist (Statistician)
CSIRO Mathematical & Information Sciences
Alan.Miller -at- vic.cmis.csiro.au
http://www.ozemail.com.au/~milleraj
http://users.bigpond.net.au/amiller/
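The point about the normal equations squaring the condition number can be seen numerically. A small sketch in Python with NumPy (the data are made up; `lstsq` stands in for any orthogonal-reduction solver):

```python
import numpy as np

# Two nearly collinear predictors plus an intercept column.
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + 1e-6 * rng.normal(size=n)   # almost an exact copy of x1
X = np.column_stack([np.ones(n), x1, x2])

# Forming X'X squares the condition number, so a normal-equations solver
# loses roughly twice as many significant digits as an orthogonal (QR/SVD)
# reduction working directly on X.
cond_X = np.linalg.cond(X)
cond_XtX = np.linalg.cond(X.T @ X)
print(f"cond(X)   = {cond_X:.3e}")
print(f"cond(X'X) = {cond_XtX:.3e}")   # roughly cond(X)**2

# An orthogonal-reduction solve (lstsq uses SVD) never forms X'X at all:
y = 1.0 + 2.0 * x1 + rng.normal(size=n)
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
```

In double precision (about 16 digits), a condition number near 1e6 for X becomes near 1e12 for X'X, which is exactly the "you could be in trouble" scenario.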
RE: VIF
On Tue, 30 May 2000, Dale Glaser wrote:

> Karen..off the top of my head, the VIF is the inverse of tolerance,
> hence, if tolerance = (1 - r^2j), then VIF = 1/(1 - r^2j)..

Yes, Dale is correct.

> ... r^2j would be the percentage of variation accounted for by the
> predictors in predicting the other predictor.. e.g., the linear
> combination of x1 and x2 in predicting x3;
> anyway, as with any cutoff value there can be an element of
> arbitrariness,

Indeed.

> though some have registered concern if VIF > 10.0; my personal opinion
> is that the aforementioned cutoff value is way too liberal;

I agree, if one is using the idea of "cutoff"; though possibly I am thinking of "conservative" rather than "liberal", since I have seen (and dealt with) VIFs exceeding several hundreds. They don't frighten me particularly, partly because by orthogonalizing they can be reduced to manageable levels. Even partly orthogonalizing can reduce VIFs to values like 2 or 1.5, at least in some circumstances.

> for VIF to equal 10.0 then 1/(1 - .9) entails a multiple R of
> .9487!!!; for me it is a stretch to conceive that collinearity only
> becomes problematic when R = .9487...I'll be interested to see what
> others think

Strictly speaking, "multicollinearity" implies R = 1.000, I believe. (I don't know why Dale calculates R; the effective information is that [with VIF = 10] R^2 = 0.9, and 10% of the original variance in the predictor remains unaccounted for. As one of our colleagues (Rich Ulrich, I think) recently remarked in another context, with R^2 values this large one may often usefully consider their complementary values (1 - R^2).)

Most computer regression programs have a control based on tolerance (the reciprocal of VIF); I believe Minitab's default tolerance threshold is around 0.0001 or 0.0002, implying VIFs of 10,000 or 5,000 respectively.
This of course is not to be taken as an indication of "good practice", but of where the systems analysts thought the algorithm was in danger of breaking down: "severe multicollinearity" indeed.

But a lurking question, as my earlier post may have suggested, is whether the multicollinearity apparently present is inherent in the nature of the variables, or an artifact of variable construction. The latter was the case in the problem addressed in the paper on the Minitab web site.

Karen's original question was:

> What is the usual cutoff for saying the VIF is too high?

Depends on the purpose for which you think you want a "cutoff", and whether you propose to implement it blindly and without further thought, or as a (very!) rough guideline regarding where the currents (and perhaps the undertow) may be dangerous and REQUIRE further thought; just for two examples.

-- Don.

Donald F. Burrill [EMAIL PROTECTED]
348 Hyde Hall, Plymouth State College, MSC #29, Plymouth, NH 03264  603-535-2597
184 Nashua Road, Bedford, NH 03110  603-471-7128
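To make the tolerance/VIF bookkeeping in this thread concrete, here is a sketch in Python with NumPy on synthetic data (`vifs` is a hypothetical helper, not from any package mentioned above): VIF_j = 1/(1 - R²_j), where R²_j comes from regressing predictor j on the remaining predictors, and tolerance_j = 1 - R²_j.

```python
import numpy as np

def vifs(X):
    """VIF_j = 1 / (1 - R^2_j), with R^2_j from regressing column j of X
    on the other columns plus an intercept (tolerance_j = 1 - R^2_j)."""
    n, k = X.shape
    result = []
    for j in range(k):
        target = X[:, j]
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        beta, *_ = np.linalg.lstsq(others, target, rcond=None)
        resid = target - others @ beta
        tss = np.sum((target - target.mean()) ** 2)
        r2 = 1.0 - np.sum(resid ** 2) / tss
        result.append(1.0 / (1.0 - r2))
    return result

rng = np.random.default_rng(0)
n = 500
x1, x2 = rng.normal(size=n), rng.normal(size=n)
x3 = x1 + x2 + 0.2 * rng.normal(size=n)     # built-in near-collinearity

print(vifs(np.column_stack([x1, x2])))      # near 1: no collinearity
print(vifs(np.column_stack([x1, x2, x3])))  # all well above the "10" cutoff
```

Whether VIFs above 10 (R² above 0.9) are alarming is exactly the judgment call being debated in this thread; the arithmetic itself is uncontroversial.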
Re: partial least squares regression
On Tue, 30 May 2000, Karen Scheltema wrote:

> Can someone enlighten me about how partial least squares regression
> works to handle multicollinearity.

Depends partly on whether you're looking at real, or spurious, multicollinearity; and may depend on where the multicollinearity actually arises. We may need more description of your particular problem for a useful conversation.

What I understand by "partial regression" is not really different from multiple regression: it involves "partialling" out the effects of various predictors both from other (in general subsequent) predictors and from the response variable, often either in a sequential manner or in a way that implies a sequence by assigning an order of hierarchical importance, if you will, to the several predictors.

If the multicollinearity arises from artificial variables computed as the product of two or more raw variables, usually in seeking evidence for or against the presence of interactions between those raw variables, there is an obvious sort of hierarchy: raw variables > 2-way interactions > 3-way interactions > ...

If the multicollinearity is "built in", so to speak, because there really exists a (near-)linear combination among the predictors (or even more than one such combination), one may need to decide which variable(s) to EXclude in order to avoid the multicollinearity: this is another way of saying that one needs to assign a hierarchy of importance to the predictors.

In any event, the main problem with multicollinearity is computational: in finite precision, estimation becomes unreliable as the apparent zero-order correlations become less distinguishable from 1.000... The most effective approach to the problem that I know of is to (begin to) orthogonalize at least some of the predictors with respect to some or all of the others.
For an example applying this idea in practice (where multicollinearity arose from computing raw interaction variables) see my paper on the Minitab web site (www.minitab.com and look for Resources, then White Papers).

> Can SPSS do partial least squares regression?

Yes, if we're talking on the same wavelength. It requires computing residual variables and adjoining them to the variables in the data set, then using them as predictors in place of the products (or raw variables) they're residuals from.

-- DFB.

Donald F. Burrill [EMAIL PROTECTED]
348 Hyde Hall, Plymouth State College, MSC #29, Plymouth, NH 03264  603-535-2597
184 Nashua Road, Bedford, NH 03110  603-471-7128
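A sketch of the residual-variable construction described above, in Python with NumPy (synthetic data; any package that saves regression residuals, SPSS included, can do the same): the product x1*x2 is replaced by its residual after regressing it on the raw variables, which makes the new "interaction" predictor orthogonal to them by construction.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 300
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
x1x2 = x1 * x2                     # raw interaction term

# Regress the product on an intercept and the raw variables...
Z = np.column_stack([np.ones(n), x1, x2])
beta, *_ = np.linalg.lstsq(Z, x1x2, rcond=None)

# ...and keep the residual as the new "interaction" predictor.
x1x2_resid = x1x2 - Z @ beta

# Least-squares residuals are orthogonal to the columns of Z:
print(abs(x1x2_resid @ x1), abs(x1x2_resid @ x2))  # both essentially zero
```

Using `x1x2_resid` in place of `x1x2` leaves the interaction test intact while removing the collinearity with the raw variables.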
Re: VIF
It is 10. I hope you are talking about the Variance Inflation Factor. More than 10 indicates severe multicollinearity.

Jin

Jineshwar Singh, Coordinator, IDS
Interdisciplinary Department
George Brown College
St. James campus
[EMAIL PROTECTED]
*
You cannot control how others act but you can control how you react.
416-415-2089
http://www.gbrownc.on.ca/~jsingh

- Original Message -
From: Karen Scheltema <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Tuesday, May 30, 2000 4:51 PM
Subject: VIF

> What is the usual cutoff for saying the VIF is too high?
>
> Karen Scheltema, M.A., M.S.
> Statistician
> HealthEast
> 1700 University Ave W
> St. Paul, MN 55104
> (651) 232-5212  fax: (651) 641-0683
RE: VIF
Karen..off the top of my head, the VIF is the inverse of tolerance; hence, if tolerance = (1 - r^2j), then VIF = 1/(1 - r^2j) [excuse the sloppiness of the notation, but r^2j would be the percentage of variation accounted for by the predictors in predicting the other predictor, i.e., the linear combination of x1 and x2 in predicting x3]; anyway, as with any cutoff value there can be an element of arbitrariness, though some have registered concern if VIF > 10.0; my personal (possibly misinformed!) opinion is that the aforementioned cutoff value is way too liberal; for VIF to equal 10.0, then 1/(1 - .9) entails a multiple R of .9487!!! For me it is a stretch to conceive that collinearity only becomes problematic when R = .9487...I'll be interested to see what others think.

Dale Glaser, Ph.D.
Senior Statistician, Pacific Science and Engineering Group
Adjunct faculty/lecturer, SDSU/USD/CSPP
San Diego, CA

-Original Message-
From: Karen Scheltema
Sent: Tuesday, May 30, 2000 1:52 PM
Subject: VIF

What is the usual cutoff for saying the VIF is too high?

Karen Scheltema, M.A., M.S.
Statistician
HealthEast
1700 University Ave W
St. Paul, MN 55104
(651) 232-5212  fax: (651) 641-0683
Re: sas vs s-plus for qc (fwd)
They do not appear to be moving in that direction with Minitab itself. They are teaming with Hertzler Systems (www.hertzlersystems.com) and Qualifine to jointly provide what looks to be a very nice real-time SPC system. Contact Qualifine (www.qualifine.com, I think) to find out more. I'm sure Minitab could also give you some info on that.

"Donald F. Burrill" wrote:

> Sorry, all; my attempt to mail this to "Ken K." directly failed.
> Presumably he reads the list, since he posted to it.
> -- DFB.
>
> -- Forwarded message --
> Date: Wed, 24 May 2000 13:25:43 -0400 (EDT)
> From: Donald F. Burrill <[EMAIL PROTECTED]>
> To: "Ken K." <[EMAIL PROTECTED]>
> Subject: Re: sas vs s-plus for qc
>
> On Wed, 24 May 2000, Ken K. wrote in part:
>
>> I should have mentioned that MINITAB does not provide, and does appear
>> to plan to offer, real-time data collection and SPC
>
> Did you mean "does appear", or "does not appear" ?
> -- DFB.
>
> Donald F. Burrill [EMAIL PROTECTED]
> 348 Hyde Hall, Plymouth State College, MSC #29, Plymouth, NH 03264  603-535-2597
> 184 Nashua Road, Bedford, NH 03110  603-471-7128
Re: Statistical Calculator for Palm OS
If you do some searching at http://www.palmgear.com you should be able to find some function sets that work with the RPN calculator. There is also a nice statistical analysis package called Palm Stat. PalmGear also has that - search for "palm stat".

Eric Turkheimer wrote:

> Is there a good statistical calculator for the Palm OS? Not just the
> usual mean and SD functions, it would be especially useful if some
> statistical distribution functions were included. Seems like a natural.
>
> Eric
> [EMAIL PROTECTED]
VIF
What is the usual cutoff for saying the VIF is too high?

Karen Scheltema, M.A., M.S.
Statistician
HealthEast
1700 University Ave W
St. Paul, MN 55104
(651) 232-5212  fax: (651) 641-0683
partial least squares regression
Can someone enlighten me about how partial least squares regression works to handle multicollinearity. Can SPSS do partial least squares regression?

Karen Scheltema, M.A., M.S.
Statistician
HealthEast
1700 University Ave W
St. Paul, MN 55104
(651) 232-5212  fax: (651) 641-0683
WHY is heteroscedasticity bad? (fwd)
This may not fully answer all your questions, but the various formulas for inference in regression have a place where you plug in THE variance of the points around the regression model, i.e., they treat this as a constant. If it is not a constant, the results of the formulas will not be correct. Specific ramifications would depend on what formula you are interested in, the exact manner in which the variance varies, etc.

- Forwarded message from Markus Quandt -

From: Markus Quandt <[EMAIL PROTECTED]>
Date: Tue, 30 May 2000 19:03:49 +0200
Subject: WHY is heteroscedasticity bad?

[The forwarded text is identical to Markus Quandt's original post, which appears in full below under "WHY is heteroscedasticity bad?".]

- End of forwarded message from Markus Quandt -

--
Robert W. Hayden
Department of Mathematics
Plymouth State College MSC#29
Plymouth, New Hampshire 03264 USA
82 River Street, Ashland, NH 03217-9702
(603) 968-9914 (home)  fax (
WHY is heteroscedasticity bad?
Hello all,

when discussing linear regression assumptions with a colleague, we noticed that we were unable to explain WHY heteroscedasticity has the well-known ill effects on the estimators' properties. I know WHAT the consequences are (loss of efficiency, tendency to underestimate the standard errors) and I also know why these consequences are undesirable. What I'm lacking is a substantial understanding of HOW the presence of inhomogeneous error variances increases the variability of the coefficients, and HOW the estimation of the standard errors fails to reflect this.

I consulted a number of (obviously too basic) textbooks; all but one only state the problems that arise from het.sc. The one that isn't a total blank (Kmenta's Elements of Econometrics, 1986) tries to give an intuitive explanation (along with a proof of the inefficiency of the β estimators with het.sc.), but I don't fully understand that. Kmenta writes:

"The standard least squares principle involves minimizing [equation: sum of squared errors], which means that each squared disturbance is given equal weight. This is justifiable when each disturbance comes from the same distribution. Under het.sc., however, different disturbances come from different distributions with different variances. Clearly, those disturbances that come from distributions with a smaller variance give more precise information about the regression line than those coming from distributions with a larger variance. To use sample information efficiently, one should give more weight to the observations with less dispersed disturbances than to those with more dispersed disturbances." (p. 272)

I see that the conditional distributions of the disturbances obviously differ if het.sc. is present (well, this is the definition of het.sc., right?), and that, IF I want to compensate for this, I can weight the data accordingly (Kmenta goes on to explain WLS estimation).
But firstly, I still don't see why standard errors are increased in the first place... And secondly, is it really legitimate to claim that OLS is 'wrong' if it treats differing conditional disturbances with equal weight?

Assume the simple case of increasing variances of Y with increasing values of X, and therefore het.sc. present. With differing precision of prediction for different X values, the standard error (SE) of the regression coefficient (b) should become conditional on the value of X: the higher X, the higher SE, with E(b) constant over all values of X - correct? Then, isn't the standard error as estimated by OLS implicitly an _average_ over all these conditional SEs (just following intuition here)? How can we claim that the specific SE at the X value with the lowest disturbance is the 'true' one? (Exception: het.sc. is due to uneven measurement error for Y - I can see that the respective data points are less reliable.)

Regarding the first question: Can this be answered at all without the formal proof?

Thanks for your patience, MQ

--
Markus Quandt
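A small Monte Carlo sketch of the point at issue (Python with NumPy, made-up data): when the error standard deviation grows with X, the usual OLS formula, which plugs in a single pooled variance estimate, reports a standard error for the slope that is systematically smaller than the slope's actual sampling variability, because the high-variance observations also sit at high-leverage X values.

```python
import numpy as np

rng = np.random.default_rng(42)
n, reps = 100, 2000
x = np.linspace(1.0, 10.0, n)
X = np.column_stack([np.ones(n), x])
XtX_inv = np.linalg.inv(X.T @ X)

slopes, reported_se = [], []
for _ in range(reps):
    e = rng.normal(scale=0.2 * x**2)       # heteroscedastic: sd grows with x
    y = 1.0 + 2.0 * x + e
    beta = XtX_inv @ (X.T @ y)
    resid = y - X @ beta
    s2 = resid @ resid / (n - 2)           # single pooled variance estimate
    slopes.append(beta[1])
    reported_se.append(np.sqrt(s2 * XtX_inv[1, 1]))

true_sd = np.std(slopes)          # actual sampling variability of the slope
avg_se = np.mean(reported_se)     # what the usual formula reports on average
print(f"true sd of slope: {true_sd:.3f}, average reported SE: {avg_se:.3f}")
```

Weighting each observation by 1/σ_i² (WLS) both restores efficiency and makes the reported standard errors match the true ones again, which is Kmenta's point about giving less dispersed disturbances more weight.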
Re: question about minitab
On Tue, 30 May 2000, Niklas Hansen wrote:

> I'm a statistics student in Sweden who needs some help.
> I'm running a best subsets (regression) and I get the following
> output: < output deleted >
> What I would like to know is what does the statistic C-p mean?
> I would be very happy if someone could explain it to me...

C-p is usually written as C with subscript p. I don't recall who invented it, but I remember encountering it in statistical journals several decades ago. Minitab's Reference Manual (1989, for Release 7; there are probably more modern references) has this to say:

[begin Minitab quote]
The C-p statistic is given by the formula

    C-p = (SSEp / MSEm) - (n - 2p)

where SSEp is SSE [sum of squares due to error] for the best model with p parameters (including the intercept, if it is in the equation), and MSEm is the mean square error for the model with all m predictors. In general, we look for models where C-p is small and is also close to p. If the model is adequate (i.e., fits the data well), then the expected value of C-p is approximately equal to p, the number of parameters in the model. A small value of C-p indicates that the model is relatively precise (has small variance) in estimating the true regression coefficients and predicting future responses. This precision will not improve much by adding more predictors. Models with considerable lack of fit have values of C-p larger than p. See [9] for more on C-p.
[end of Minitab quote]

The reference [9] cited is R.R. Hocking (1976). "A Biometrics Invited Paper: The Analysis and Selection of Variables in Linear Regression," Biometrics 32, pp. 1-49.

Donald F. Burrill [EMAIL PROTECTED]
348 Hyde Hall, Plymouth State College, MSC #29, Plymouth, NH 03264  603-535-2597
184 Nashua Road, Bedford, NH 03110  603-471-7128
RE: Ordinal log-linear model
I just downloaded LEM also, and the manual, which as you said is in POSTSCRIPT format, is readable by various viewers; the one I have is GHOSTSCRIPT. Do a search for GHOSTSCRIPT and you can download it for free. In the old days it was kind of a pain to read PS files, as you had to save font files in separate subdirectories, but now it's a lot easier.

dale glaser

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Buoy
Sent: Sunday, May 28, 2000 6:35 AM
To: [EMAIL PROTECTED]
Subject: Re: Ordinal log-linear model

It's me again. Following your suggestions I downloaded LEM. The program is working (I examined the featured examples, which I downloaded too). I also downloaded the zipped manual for LEM. The filename was MANUAL.PS, and README.TXT said that this is a "postscript" format (???) which can be printed with a "postscript" printer or viewed with a "postscript" viewer. What is that - a POSTSCRIPT viewer / printer? Or simply: how can I view or print the LEM manual?

Desperate,
Michal Bojanowski
question about minitab
I'm a statistics student in Sweden who needs some help. I'm running a best subsets (regression) and I get the following output:

Response is skad_tot

Vars   R-sq   Adj. R-sq    C-p        S
  1    83.8      82.8     42.6   179.36
  1    81.9      80.8     49.3   189.70
  2    89.8      88.4     23.9   147.55
  2    89.6      88.2     24.6   148.95
  3    94.2      92.9     10.4   115.14
  3    92.5      90.9     16.3   130.75
  4    95.3      93.8      8.5   107.63
  4    95.2      93.7      8.9   108.65
  5    96.9      95.6      4.9   90.937
  5    96.6      95.2      5.9   94.921
  6    97.4      95.9      5.2   87.205
  6    97.2      95.6      6.0   90.823
  7    97.4      95.6      7.1   90.850
  7    97.4      95.6      7.1   91.162
  8    97.4      95.1      9.0   95.406

[The X columns marking which of the eight predictors enter each model did not survive formatting and are omitted.]

What I would like to know is: what does the statistic C-p mean? I would be very happy if someone could explain it to me...

Regards,
Niklas Hansen
University of Mälardalen
Västerås, Sweden
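As a quick illustration of how C-p is typically read in output like this, here is a hedged sketch in plain Python. The pairs below are just the Vars and C-p columns copied from the table, and "take the smallest C-p, then check it is close to p" is one common rule of thumb rather than Minitab's own algorithm (p counts the intercept, so p = predictors + 1):

```python
# (number of predictors, C-p) pairs taken from the best-subsets output above
rows = [(1, 42.6), (1, 49.3), (2, 23.9), (2, 24.6), (3, 10.4), (3, 16.3),
        (4, 8.5), (4, 8.9), (5, 4.9), (5, 5.9), (6, 5.2), (6, 6.0),
        (7, 7.1), (7, 7.1), (8, 9.0)]

# rule of thumb: prefer the subset with the smallest C-p,
# then verify that C-p is also close to p (= predictors + 1)
best = min(rows, key=lambda r: r[1])
print(best)   # (5, 4.9): five predictors, with C-p below p = 6
```

Here the five-predictor model with C-p = 4.9 looks attractive: its C-p is the smallest in the table and sits just under p = 6, so by the Minitab manual's criterion it shows no evident lack of fit.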
dispersal distance
Hi All,

I am looking for a reference on statistical analysis (discrete probability distributions and survival functions) for random dispersal distances (animal movement).

Thank you.
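For what it's worth, the survival function of a discrete dispersal-distance distribution is just the upper tail of its probability mass function: S(d) = P(D > d), the probability of dispersing farther than d. A minimal sketch in plain Python - the distances and probabilities below are entirely made up for illustration:

```python
# hypothetical dispersal-distance pmf: P(D = d), d in arbitrary distance units
pmf = {0: 0.40, 1: 0.25, 2: 0.15, 3: 0.10, 4: 0.06, 5: 0.04}

# survival function S(d) = P(D > d): probability of dispersing farther than d
surv = {d: round(sum(p for dd, p in pmf.items() if dd > d), 2) for d in pmf}
print(surv)   # {0: 0.6, 1: 0.35, 2: 0.2, 3: 0.1, 4: 0.04, 5: 0}
```

Fitting such a distribution to observed movement data is where the references would come in; the computation above only shows the relationship between the two functions asked about.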
Statistical process control
Hi,

I'm looking for books or references on "statistical process control". Could anyone give me such references?

Thank you,
--
Franck GOLLIOT