In article <8am7d1$hqj$[EMAIL PROTECTED]>, 
[EMAIL PROTECTED] says...
> 
> I think I made the formulation too wordy in previous
> post.  
> 
> Let me try this simple question:
> 
> When one wishes to do a (multi)linear regression on a set of 
> observed data, and one is in the (unusual) position of possessing
> a set of sample standard deviations (of varying degrees of f.) 
> at each value of the "explanatory" variable, how does one
> determine whether one ought or ought not to solve the weighted
> least squares problem using those sample standard deviations?
> 
> What is the usual decision test for "heteroscedasticity" *before* one
> solves the regression system?  What do people do in practice?
> 
Most social scientists don't worry very much about the assumptions of OLS 
regression, noting that OLS estimates are fairly robust and can remain 
unbiased even if some of those assumptions aren't fulfilled. Exceptions 
are multilevel models and time series data, where the assumption of 
uncorrelated error terms is violated. But these require special programs, 
not weighted least squares.

There is also some debate on using weights for stratified sampling and/or 
to correct for sampling bias. Weighting leads to correct point estimates 
but incorrect standard errors. One solution is to include the design 
variables in the model instead of weighting. Stata and Wesvar are two 
programs that can take the weighting into account when calculating 
standard errors of estimates. A quite common approach, though, is to use 
weights for descriptive statistics but not in multivariate models.
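To make the two options concrete, here is a rough sketch in Python with 
statsmodels (not one of the programs mentioned above); the data, the 
sampling weight "wt", and the design variable "stratum" are all made up 
for illustration. Plain weighted least squares reproduces the weighted 
point estimates, but its default standard errors are not the right ones 
for sampling weights; a robust (sandwich) covariance is one common patch. 
The second fit drops the weights and puts the design variable in the 
model instead.

import numpy as np
import pandas as pd
import statsmodels.api as sm

# Made-up survey-style data: outcome y, predictor x, a sampling weight wt,
# and a design variable stratum.
rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({
    "x": rng.normal(size=n),
    "stratum": rng.integers(0, 3, size=n),
    "wt": rng.uniform(0.5, 2.0, size=n),
})
df["y"] = 1 + 0.5 * df["x"] + 0.3 * df["stratum"] + rng.normal(size=n)

# Option 1: weight the regression and use a robust (sandwich) covariance,
# because the default weighted-least-squares standard errors are not
# correct for sampling weights.
X = sm.add_constant(df[["x"]])
weighted = sm.WLS(df["y"], X, weights=df["wt"]).fit(cov_type="HC1")

# Option 2: skip the weights and include the design variable in the model.
Xd = sm.add_constant(df[["x", "stratum"]])
unweighted = sm.OLS(df["y"], Xd).fit()

print(weighted.params)
print(unweighted.params)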

Weights can also be used for certain dependent variables that will 
violate the assumption of homoscedasticity, e.g. a dichotomous 
dependent variable. I recently did a weighted least squares analysis for a co-
worker to replicate an analysis in another paper. The weight was 
groupn*pct*(1-pct), where groupn was the number of cases per group and 
pct was the proportion with a positive response within each group. But 
this basically amounts to a poor approximation of a logit model. Programs 
like GLIM that use iteratively reweighted least squares use pct*(1-pct) 
as the weight when estimating the model, but now pct is the predicted 
probability from the previous iteration.
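For what it's worth, here is a small sketch in Python with statsmodels 
(the original analysis was done elsewhere) of both fits on made-up 
grouped data. I'm assuming the weighted regression was run on the 
empirical logit of pct, which is where groupn*pct*(1-pct) comes from as 
an inverse-variance weight; a small continuity correction keeps the logit 
finite if a group happens to be all-positive or all-negative. The second 
fit is the logit model proper, a binomial GLM estimated by iteratively 
reweighted least squares.

import numpy as np
import pandas as pd
import statsmodels.api as sm

# Made-up grouped binary data: one row per group, with a predictor x, the
# group size groupn, and the number of positive responses pos.
rng = np.random.default_rng(1)
g = pd.DataFrame({"x": np.linspace(-1, 1, 20),
                  "groupn": rng.integers(30, 80, size=20)})
true_p = 1 / (1 + np.exp(-(0.5 + 1.2 * g["x"])))
g["pos"] = rng.binomial(g["groupn"], true_p)

X = sm.add_constant(g[["x"]])

# Weighted least squares on the empirical logit, with weight
# groupn * pct * (1 - pct).  The 0.5 continuity correction avoids an
# infinite logit when a group is all-positive or all-negative.
pct = (g["pos"] + 0.5) / (g["groupn"] + 1.0)
elogit = np.log(pct / (1 - pct))
w = g["groupn"] * pct * (1 - pct)
wls = sm.WLS(elogit, X, weights=w).fit()

# The logit model itself: a binomial GLM on (successes, failures),
# estimated by iteratively reweighted least squares.
glm = sm.GLM(np.column_stack([g["pos"], g["groupn"] - g["pos"]]), X,
             family=sm.families.Binomial()).fit()

print(wls.params)
print(glm.params)

The two slopes come out close here; the difference is just that the 
weighted least squares version fixes the weights at the observed 
proportions, while the GLM updates them at each iteration.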

As for a test for heteroscedasticity, Stata has a "hettest" command, which 
performs the Cook-Weisberg test and produces a chi-square statistic. Cook 
and Weisberg wrote a book in 1982, "Residuals and Influence in Regression". 
I've never used it though.
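If it helps, here is a rough sketch of that kind of check in Python with 
statsmodels rather than Stata; the Breusch-Pagan score test used below is 
essentially the same test. The made-up data mimic the original question, 
with a sample standard deviation s available at each x; if the test 
rejects constant variance, the usual weighted least squares choice is a 
weight of 1/s^2.

import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_breuschpagan

# Made-up data in the spirit of the original question: y observed at each
# x, with a sample standard deviation s for the replicates at that x.
rng = np.random.default_rng(2)
x = np.linspace(1, 10, 40)
s = 0.2 * x                          # noise grows with x: heteroscedastic
y = 2.0 + 0.7 * x + rng.normal(scale=s)

X = sm.add_constant(x)
ols = sm.OLS(y, X).fit()

# Score test for heteroscedasticity on the OLS residuals.  het_breuschpagan
# returns (LM statistic, LM p-value, F statistic, F p-value); the LM
# statistic is the chi-square value.
lm_stat, lm_pval, f_stat, f_pval = het_breuschpagan(ols.resid, X)
print(f"chi-square = {lm_stat:.2f}, p = {lm_pval:.4f}")

# If constant variance is rejected and sample standard deviations are
# available at each x, weight by their inverse squares.
if lm_pval < 0.05:
    wls = sm.WLS(y, X, weights=1.0 / s**2).fit()
    print(wls.params)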

Hope this helps,
John Hendrickx

