First, thanks for your reply!

Bob Hayden wrote:

> This may not fully answer all your questions, but the various formulas
> for inference for regression have a place where you plug in THE
> variance of the points around the regression model, i.e., they treat
> this as a constant.  If it is not a constant, the results of the
> formulas will not be correct.  Specific ramifications would depend on
> what formula you are interested in, the exact manner in which the
> variance varies, etc.

Yes, I'm aware that non-constant variances make the usual computational rules break
down if one really wants to incorporate them in the model. But my problem is: why would
I want to consider them at all? As stated below, there are really two questions:
1) Why do the standard errors *increase* -- is there an intuitive explanation (more
intuitive than the proof that the estimate of b is not efficient)? So I think the
specific formula of interest is either the calculation of b or -- more probably -- of
its SE. Subquestion: why does OLS fail to reflect this tendency towards larger SEs?
(A small simulation sketch follows after question 2.)
2) Regarding Kmenta's argument: the inefficiency of OLS is proved by comparing it with
the better performance of WLS. This *assumes* that down-weighting the observations with
larger residuals is legitimate. How do I justify this strong assumption, apart from the
case where I know that the larger residuals result from larger *true* random
disturbances? In other words: if residual diagnostics show an uneven distribution of
residuals, shouldn't the first step be to search for unobserved predictors? And if I
cannot measure these, isn't OLS as good a guess at the true relationships among the
observed variables as WLS would be?
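To make these questions concrete for myself, here is a minimal Monte Carlo sketch in
Python/numpy. The setup (disturbance spread proportional to x, sigma_i = 0.5*x_i) is my
own illustrative assumption, not Kmenta's example. The idea: writing the OLS slope as
b = sum_i c_i*y_i with c_i = (x_i - x_bar)/sum_j (x_j - x_bar)^2, its true sampling
variance is sum_i c_i^2 * sigma_i^2, whereas the textbook SE plugs a single pooled s^2
into s^2 / sum_i (x_i - x_bar)^2; when the large sigma_i sit at x values that carry
large weight c_i^2, the reported SE tends to come out too small. The WLS column is
there for the efficiency comparison in question 2.

import numpy as np

rng = np.random.default_rng(0)
n, reps = 50, 20000
x = np.linspace(1.0, 10.0, n)      # fixed design
beta0, beta1 = 1.0, 2.0
sigma = 0.5 * x                    # heteroscedasticity: disturbance spread grows with x

ols_b, ols_se, wls_b = [], [], []
for _ in range(reps):
    y = beta0 + beta1 * x + rng.normal(0.0, sigma)

    # OLS slope and the textbook SE, which plugs in ONE pooled variance estimate
    xc = x - x.mean()
    b1 = np.sum(xc * (y - y.mean())) / np.sum(xc**2)
    b0 = y.mean() - b1 * x.mean()
    resid = y - b0 - b1 * x
    s2 = np.sum(resid**2) / (n - 2)
    ols_b.append(b1)
    ols_se.append(np.sqrt(s2 / np.sum(xc**2)))

    # WLS slope with the (here known) weights 1 / sigma_i^2
    w = 1.0 / sigma**2
    xw, yw = np.average(x, weights=w), np.average(y, weights=w)
    wls_b.append(np.sum(w * (x - xw) * (y - yw)) / np.sum(w * (x - xw)**2))

print("empirical SD of the OLS slope:", np.std(ols_b))    # the 'real' SE of b under OLS
print("average textbook OLS SE:      ", np.mean(ols_se))  # what the usual formula reports
print("empirical SD of the WLS slope:", np.std(wls_b))    # WLS uses the information more efficiently

If I have set this up correctly, the first printed number (the actual spread of the OLS
slope across samples) should come out larger than the second (what the usual formula
reports on average), and the third (WLS with the true weights) should be the smallest.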

(Because Mr. Hayden started a new thread, I have not deleted his quotation of my
original message, in order to retain the context.)

Regards, MQ

> ----- Forwarded message from Markus Quandt -----
>
> Hello all,
>
> when discussing linear regression assumptions with a colleague, we
> noticed that we were unable to explain WHY heteroscedasticity has
> the well known ill effects on the estimators' properties. I know
> WHAT the consequences are (loss of efficiency, tendency to
> underestimate the standard errors) and I also know why these
> consequences are undesirable. What I'm lacking is a substantial
> understanding of HOW the presence of inhomogeneous error variances
> increases the variability of the coefficients, and HOW the
> estimation of the standard errors fails to reflect this.
> I consulted a number of (obviously too basic) textbooks; all but
> one only state the problems that arise from het.sc. The one that
> isn't a total blank (Kmenta's Elements of Econometrics, 1986) tries
> to give an intuitive explanation (along with a proof of the
> inefficiency of the estimators with het.sc.), but I don't fully
> understand that.
> Kmenta writes:
> "The standard least squares principle involves minimizing
> [equation: sum of squared errors], which means that each squared
> disturbance is given equal weight. This is justifiable when each
> disturbance comes from the same distribution. Under het.sc.,
> however, different disturbances come from different distributions
> with different variances. Clearly, those disturbances that come
> from distributions with a smaller variance give more precise
> information about the regression line than those coming from
> distributions with a larger variance. To use sample information
> efficiently, one should give more weight to the observations with
> less dispersed disturbances than to those with more dispersed
> disturbances." p. 272
>
> I see that the conditional distributions of the disturbances
> obviously differ if het.sc. is present (well, this is the
> definition of het.sc., right?), and that, IF I want to compensate
> for this, I can weight the data accordingly (Kmenta goes on to
> explain WLS estimation). But firstly, I still don't see why
> standard errors increase in the first place... And secondly, is it
> really legitimate to claim that OLS is 'wrong', if it treats
> differing conditional disturbances with equal weight?
>
> Assume the simple case of increasing variances of Y with increasing
> values of X, and therefore het.sc. present. With differing
> precision of prediction for different X values, the standard error
> (SE) of the regression coefficient (b) should become conditional on
> the value of X, the higher X, the higher SE, with E(b) constant
> over all values of X - correct? Then, isn't the standard error as
> estimated by OLS implicitly an _average_ over all these conditional
> SEs (just following intuition here)? How can we claim that the
> specific SE at the X value with the lowest disturbance is the
> 'true' one? (Exception: het.sc. is due to uneven measurement error
> for Y - I can see that the respective data points are less
> reliable.)
>
> Regarding the first question: Can this be answered at all without
> the formal proof?
>
> ----- End of forwarded message from Markus Quandt -----

--
________________________________________________________________
 Markus Quandt



