extracting dataset with average imputed values from aregImpute()
  Dear all:
  After many trials, I am still quite lost on how to extract a dataset with 
average imputed values after running aregImpute().
  Take the example in the aregImpute (Hmisc) documentation, which also appeared in 
our R archive with Prof. Frank E Harrell Jr as the author:
  
--------The following shows how to get a completed dataset (but for 
only one draw of the k multiple imputations). By the way, I don't quite see what 
"fit.mult.impute" (mentioned below) is for, but it seems to have nothing to do with 
my question:
  aregImpute produces a list containing the multiple imputations:
  w <- aregImpute(. . .)
w$imputed$blood.pressure   # gets m by k matrix
  # m = number of subjects with blood pressure missing,
  # k = number of multiple imputations
  To get a completed dataset (but for only one draw of the k multiple 
imputations) see how fit.mult.impute does it.  I have just added the 
following example to the help file for aregImpute.
  set.seed(23)
x <- runif(200)
y <- x + runif(200, -.05, .05)
y[1:20] <- NA
d <- data.frame(x,y)
f <- aregImpute(~ x + y, n.impute=10, match='closest', data=d)
# Here is how to create a completed dataset for imputation
# number 3 as fit.mult.impute would do automatically.  In this
# degenerate case changing 3 to 1-2,4-10 will not alter the results.
completed <- d
imputed <- impute.transcan(f, imputation=3, data=d, list.out=TRUE,
                            pr=FALSE, check=FALSE)
completed[names(imputed)] <- imputed
completed  # 200 by 2 data frame
   
  -------------------However, how could one get a completed dataset that uses the 
average of the k multiple imputations? 
  Say, after running: 
  w <- aregImpute(. . .)
w$imputed$blood.pressure  
  we get an m by k matrix, where
 m = number of subjects with blood pressure missing,
 k = number of multiple imputations
  This m by k matrix has one row for each subject (record) with missing 
data. So for each row, I could average its k imputed values and store the 
result in a separate column. However, this would only 
give me a dataset with just the m rows (records) that have missing 
data. 
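
  The row-averaging step described above can be sketched in base R. The matrix 
below is a made-up stand-in for w$imputed$blood.pressure (m = 3, k = 4), not real 
aregImpute output, and the row names are only an assumption about how the 
subjects with missing values might be labelled.

```r
# Hypothetical stand-in for w$imputed$blood.pressure:
# m = 3 subjects with missing values, k = 4 imputations each.
# The row names ("2", "5", "9") are assumed labels for the original rows.
imp <- matrix(c(120, 124, 118, 122,
                135, 131, 133, 137,
                110, 112, 108, 110),
              nrow = 3, byrow = TRUE,
              dimnames = list(c("2", "5", "9"), NULL))

# One average per subject: a named vector of length m
avg <- rowMeans(imp)
avg
```

  This gives one value per subject with missing data, which is exactly the 
m-row result that still has to be merged back into the full dataset.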
  
What I really want is a COMPLETED dataset, with every non-missing value 
left just as it was in the original dataset, and with each 'NA' in the 
original dataset replaced by its average imputed value (the average of 
its k imputations). 
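
  One possible way to build such a completed dataset, as a minimal sketch and not 
an Hmisc-provided method: copy the original data frame and, for each variable, 
overwrite only the NA positions with the row means of the matching imputation 
matrix. The helper name fill_with_mean_imputations is hypothetical. The sketch 
assumes that the rows of each matrix in w$imputed are in the same order as the 
missing rows of that variable in the data, and that the variables are numeric 
(averaging the integer codes of a categorical variable would not be meaningful). 
The small `imputed` list is a hand-made stand-in for w$imputed so that the 
example runs without Hmisc.

```r
# Hypothetical helper (not part of Hmisc): replace each NA by the mean
# of its k imputations, leaving all observed values untouched.
#   d       : original data frame
#   imputed : named list of m-by-k matrices, e.g. w$imputed from aregImpute();
#             NULL entries (variables with no missing values) are skipped
fill_with_mean_imputations <- function(d, imputed) {
  completed <- d
  for (v in names(imputed)) {
    imp <- imputed[[v]]
    if (is.null(imp)) next
    miss <- is.na(d[[v]])
    # Assumption: rows of imp follow the order of the missing rows of d[[v]]
    stopifnot(nrow(imp) == sum(miss))
    completed[[v]][miss] <- rowMeans(imp)
  }
  completed
}

# Toy data with a hand-made stand-in for w$imputed (k = 3 imputations)
d <- data.frame(x = c(1, 2, 3, 4), y = c(10, NA, 30, NA))
imputed <- list(x = NULL,
                y = matrix(c(19, 21, 20,
                             41, 39, 40), nrow = 2, byrow = TRUE))
completed <- fill_with_mean_imputations(d, imputed)
completed  # same 4 rows as d, with the two NAs in y filled in
```

  Under the same assumptions, the call for the documentation example would 
presumably be fill_with_mean_imputations(d, f$imputed), with d and f as 
defined there.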
  
Many thanks! 


       

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
