[R] svy / weighted regression

Laust Fri, 09 Oct 2009 04:19:51 -0700

Dear list,

I am trying to set up a propensity-weighted regression using the
survey package. Most of my population is sampled with a sampling
probability of one (that is, I have the full population). However, for
a subset of the data I have only a 50% sample of the full population.
In previous work on the data, I analyzed these data using SAS and
STATA. In those packages I used a propensity weight of 1/[sampling
probability] in various generalized linear regression-procedures, but
I am having trouble setting this up. I bet the solution is simple, but
I’m a R newbie. Code to illustrate my problem below.


Thanks
Laust

# loading survey
library(survey)

# creating data
listc <- 
c("Denmark","Finland","Norway","Sweden","Denmark","Finland","Norway","Sweden")
listw <- c(1,2,1,1,1,1,1,1)
listd <- c(0,0,0,0,1000,1000,1000,2000)
listt <- c(750000,500000,900000,1900000,5000,5000,5000,10000)
list.cwdt <- c(listc, listw, listd, listt)
country <- data.frame(country=listc,weight=listw,deaths=listd,yrs_at_risk=listt)

# running a frequency weighted regression to get the correct point
estimates for comparison
glm <- glm(deaths ~ country + offset(log(yrs_at_risk)),
weights=weight, data=country, family=poisson())
summary(glm)
regTermTest(glm, ~ country)

# running survey weighted regression
svy <- svydesign(~0,,data=country, weight=~weight)
svyglm <- svyglm(deaths ~ country + offset(log(yrs_at_risk)),
design=svy, data=country, family=poisson())
summary(svyglm)
# point estimates are correct, but standard error is way too large
regTermTest(svyglm, ~ country)
# test indicates no country differences

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] svy / weighted regression

Reply via email to