On 02/25/2014 05:00 AM, r-help-requ...@r-project.org wrote:
Hi,

I have some measurements and their uncertainties.  I'm using an
uncensored subset of the data for a weighted fit (for now---I'll do a
fit to the full, censored, dataset when I understand the results).

survreg() reports a much smaller standard error for the model
parameter than lm(), but only when I use weights.  Am I missing
something?  Here is what I'm doing:

Survreg treats weights as case weights, and lm treats them as sampling weights.
Here is a simple example.  Data set test2 has two copies of every obs in data 
set test.

test <- data.frame(x=1:6, y=c(1,3,2,4,6,5))
test2 <- test[c(1:6, 1:6),]

summary(lm( y ~ x, data=test))$coef
             Estimate Std. Error   t value   Pr(>|t|)
(Intercept) 0.4000000  0.9039595 0.4424977 0.68100354
x           0.8857143  0.2321154 3.8158362 0.01884548

summary(lm( y~x, data=test2))$coef
             Estimate Std. Error   t value    Pr(>|t|)
(Intercept) 0.4000000  0.5717142 0.6996503 0.500096805
x           0.8857143  0.1468027 6.0333668 0.000126369

As expected, the standard error has decreased by a factor of sqrt(2)
Now fit the model using case weights:

summary(lm( y~x, data=test, weight= rep(2,6)))$coef
             Estimate Std. Error   t value   Pr(>|t|)
(Intercept) 0.4000000  0.9039595 0.4424977 0.68100354

Notice that the answer matches the first run with data set test. Repeat this experiment with survreg, and you will find that the weighted run matches data test2. When using the robust variance, survreg treats weights as sampling weights, not case weights.

What is the "right" behavior? Neither or both: the writer of the routine simply makes a choice and sticks with it. If you really care about this read up on the survey package which cares about this type type of issue, in detail, and does it right. An intermediate step is to use a software system (stata for example) that explicitly supports more than one kind of weight.

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Reply via email to