I am soliciting any suggestions for published references or insight regarding 
the use of multiple imputation in complex probability samples (i.e. weighted 
databases).  The difficulty I'm encountering is how to build an imputation 
model that appropriately reflects the complex sample design, while allowing 
appropriate calculations for the variance.  Ideally, I'd like to use SAS to 
perform MI (PROC MI), then use SUDAAN to perform the analysis (accounting for 
the sample design), then combine the results with SAS (PROC MIANALYZE).  Many of 
the national complex samples have used hot deck imputation or simple imputation 
with a regression model to impute values, but I have yet to find a detailed 
description of MI in a complex sample. 
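
To make the combining step concrete, here is a minimal Python sketch of the standard pooling rules (Rubin's rules) for a single scalar estimate. It is only an illustration of the arithmetic, not a replacement for PROC MIANALYZE; the function name and the numbers are hypothetical, and it assumes the per-imputation variances passed in are already design-based (e.g. as produced by SUDAAN).

```python
from statistics import mean, variance

def rubin_combine(estimates, variances):
    """Pool m completed-data estimates and their variances (scalar case)."""
    m = len(estimates)
    q_bar = mean(estimates)        # pooled point estimate
    w_bar = mean(variances)        # average within-imputation variance
    b = variance(estimates)        # between-imputation variance (m - 1 denominator)
    t = w_bar + (1 + 1 / m) * b    # total variance of the pooled estimate
    return q_bar, t

# Hypothetical results from m = 3 imputed data sets
est = [10.0, 12.0, 11.0]           # point estimates, one per imputed data set
var = [4.0, 5.0, 3.0]              # design-based variances, one per imputed data set
q, t = rubin_combine(est, var)
```

The within-imputation term is where the complex design enters: if those variances ignore the design, the pooled variance will too.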

These are the ideas I've come up with so far:
1) Modified bootstrapping technique: expand the sample to its full weighted size, 
then draw repeated simple random samples (each equal in size to the original 
unweighted sample) and perform MI in the standard fashion on each of these 
samples until no original observation has missing values.  Benefit: still 
allows analysis with SUDAAN.  Potential problem: inappropriate variance 
calculation in the MI process?
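
A rough Python sketch of the resampling step in idea 1, under some assumptions: drawing with replacement with probability proportional to the weight stands in for physically expanding the file and sampling from it (the two match only for with-replacement sampling), and the records, weights, and function name are all hypothetical. The imputation itself is not shown, and the sketch ignores strata and PSUs entirely, which is part of my concern about the variance.

```python
import random

def weighted_resample(records, weights, rng):
    """Draw one replicate of the original (unweighted) sample size,
    selecting each record with probability proportional to its weight."""
    n = len(records)
    return rng.choices(records, weights=weights, k=n)

rng = random.Random(0)  # fixed seed so the replicates are reproducible

# Hypothetical records: every fourth observation has a missing y
records = [{"id": i, "y": (None if i % 4 == 0 else float(i))} for i in range(8)]
weights = [100, 50, 50, 25, 25, 10, 10, 5]  # hypothetical sampling weights

# Five replicate samples, each of which would then be passed to the MI step
replicates = [weighted_resample(records, weights, rng) for _ in range(5)]
```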

2) Separate the sample by primary sampling units (PSUs) and strata within each 
PSU, then perform MI individually within each of these separated strata.  
Benefit: accounts for the majority of the sample design.  Problem: may be 
limited by sample size within strata, and fails to use all of the data in the 
database for the MI process.
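
A small Python sketch of idea 2, mainly to show where the sample-size problem bites. As an assumption for illustration, a simple random donor draw within each stratum/PSU cell stands in for the imputation model (this is a hot-deck-style draw, not what PROC MI does, and it does not propagate parameter uncertainty). The field names, the `min_donors` threshold, and the data are hypothetical.

```python
import random

def impute_within_cells(rows, m, min_donors, rng):
    """Create m completed copies of rows, filling a missing y only from
    observed values in the same (stratum, psu) cell."""
    # Collect the observed values (donors) in each design cell
    donors = {}
    for row in rows:
        if row["y"] is not None:
            donors.setdefault((row["stratum"], row["psu"]), []).append(row["y"])

    completed = []
    for _ in range(m):
        data = []
        for row in rows:
            new = dict(row)
            if new["y"] is None:
                pool = donors.get((row["stratum"], row["psu"]), [])
                if len(pool) >= min_donors:
                    new["y"] = rng.choice(pool)
                # otherwise the value stays missing: the cell is too small
            data.append(new)
        completed.append(data)
    return completed

rows = [
    {"stratum": 1, "psu": 1, "y": 1.0},
    {"stratum": 1, "psu": 1, "y": 2.0},
    {"stratum": 1, "psu": 1, "y": None},
    {"stratum": 1, "psu": 2, "y": None},
    {"stratum": 1, "psu": 2, "y": 5.0},
]
completed = impute_within_cells(rows, m=3, min_donors=2, rng=random.Random(1))
```

In this toy data the second cell has only one observed value, so its missing value is never filled, which is exactly the limitation I'm worried about.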

Any thoughts?  Thanks.

Craig 

Craig D. Newgard, MD, MPH
Assistant Professor
Department of Emergency Medicine
Department of Public Health & Preventative Medicine
Oregon Health & Science University
3181 Sam Jackson Park Road
Mail Code CR-114
Portland, OR 97201-3098
(503) 494-1668 (Office)
(503) 494-4640 (Fax)
[email protected]

