Re: [R] Discrepant lm() and survreg() standard errors with weighted fits

Therneau, Terry M., Ph.D. Tue, 25 Feb 2014 08:53:26 -0800


On 02/25/2014 05:00 AM, r-help-requ...@r-project.org wrote:

Hi,

I have some measurements and their uncertainties.  I'm using an
uncensored subset of the data for a weighted fit (for now---I'll do a
fit to the full, censored, dataset when I understand the results).

survreg() reports a much smaller standard error for the model
parameter than lm(), but only when I use weights.  Am I missing
something?  Here is what I'm doing:


survreg treats weights as case weights, and lm treats them as sampling weights.
Here is a simple example.  Data set test2 has two copies of every observation in
data set test.

test <- data.frame(x=1:6, y=c(1,3,2,4,6,5))
test2 <- test[c(1:6, 1:6),]

summary(lm( y ~ x, data=test))$coef

             Estimate Std. Error   t value   Pr(>|t|)
(Intercept) 0.4000000  0.9039595 0.4424977 0.68100354
x           0.8857143  0.2321154 3.8158362 0.01884548

summary(lm( y~x, data=test2))$coef

             Estimate Std. Error   t value    Pr(>|t|)
(Intercept) 0.4000000  0.5717142 0.6996503 0.500096805
x           0.8857143  0.1468027 6.0333668 0.000126369

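As expected, the standard errors have decreased.  The factor is sqrt(2.5) rather
than sqrt(2), because the residual degrees of freedom also change (from 4 to 10).
Now fit the original data with a weight of 2 on every observation: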

summary(lm( y~x, data=test, weights= rep(2,6)))$coef

             Estimate Std. Error   t value   Pr(>|t|)
(Intercept) 0.4000000  0.9039595 0.4424977 0.68100354
x           0.8857143  0.2321154 3.8158362 0.01884548

Notice that the answer matches the first run with data set test.  Repeat this
experiment with survreg, and you will find that the weighted run matches data
test2.  When using the robust variance, survreg treats weights as sampling
weights, not case weights.
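
The survreg half of the experiment can be sketched as follows.  This is only an
illustration under some assumptions of mine: a Gaussian distribution, an added
status column that marks every observation as uncensored, and an added weight
column w.  None of these appear in the original question.

```r
library(survival)

# Same toy data as above.  status = 1 marks every row as uncensored and
# w = 2 is the weight per row; both columns are assumptions for this sketch.
test  <- data.frame(x = 1:6, y = c(1, 3, 2, 4, 6, 5), status = 1, w = 2)
test2 <- test[c(1:6, 1:6), ]

# Gaussian survreg on the duplicated data, no weights
fit_dup <- survreg(Surv(y, status) ~ x, data = test2, dist = "gaussian")

# Gaussian survreg on the original data, weight 2 on every row
fit_wt <- survreg(Surv(y, status) ~ x, data = test, weights = w,
                  dist = "gaussian")

# Standard errors from the variance matrix stored in each fit
# (intercept, slope, and log(scale), in that order)
se_dup <- sqrt(diag(fit_dup$var))
se_wt  <- sqrt(diag(fit_wt$var))
print(rbind(duplicated = se_dup, weighted = se_wt))
```

The two rows printed should agree, which is the case-weight behavior described
above: a weight of 2 acts like a second copy of the observation.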

What is the "right" behavior?  Neither or both: the writer of the routine simply
makes a choice and sticks with it.  If you really care about this, read up on the
survey package, which cares about this type of issue, in detail, and does it
right.  An intermediate step is to use a software system (Stata, for example)
that explicitly supports more than one kind of weight.
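
To make the two conventions concrete, here is a minimal base R sketch for
linear least squares.  The helper wls_se() and its type argument are
hypothetical, written only for this illustration; they are not part of lm,
survival, or the survey package.  The labels "sampling" and "case" follow the
wording used in this message.  The only difference between the two branches is
the sample size used for the residual degrees of freedom.

```r
# Hypothetical helper (not from any package): weighted least squares
# standard errors under two conventions for the weights.
wls_se <- function(x, y, w, type = c("sampling", "case")) {
  type <- match.arg(type)
  X <- cbind("(Intercept)" = 1, x = x)       # design matrix with intercept
  XtWX_inv <- solve(t(X) %*% (w * X))        # (X'WX)^{-1}
  beta <- XtWX_inv %*% t(X) %*% (w * y)      # weighted LS coefficients
  rss <- sum(w * (y - X %*% beta)^2)         # weighted residual sum of squares
  # "case":     a weight of 2 counts as two observations, so n = sum(w)
  # "sampling": n is the number of rows, as in lm
  n_eff <- if (type == "case") sum(w) else length(y)
  sigma2 <- rss / (n_eff - ncol(X))
  sqrt(diag(sigma2 * XtWX_inv))
}

test <- data.frame(x = 1:6, y = c(1, 3, 2, 4, 6, 5))
w <- rep(2, 6)
se_sampling <- wls_se(test$x, test$y, w, "sampling")
se_case     <- wls_se(test$x, test$y, w, "case")
print(rbind(sampling = se_sampling, case = se_case))
```

With the toy data, the "sampling" row reproduces the lm standard errors for
data set test (0.9039595 and 0.2321154), and the "case" row reproduces those
for data set test2 (0.5717142 and 0.1468027).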


______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
