Jean-Philippe: I had a similar question in an off-list message exchange with Rod. Here is the set of messages:
>>> Rod Little <[email protected]> 11/28/03 12:56PM >>>
Jonathan: MI methods are based on asymptotic theory, and transforming to
something more normal and then back-transforming is a good idea, since it
improves the validity of the asymptotic theory in moderate samples. The
method is still valid asymptotically without the transformation, which is
more a small-sample refinement. Rod

On Fri, 28 Nov 2003, Jonathan Mohr wrote:
> Rod: Thanks much for your response; I never thought I'd get advice from
> such an authoritative source!
>
> My confusion regarding this issue began after reading the PROC MIANALYZE
> documentation, which begins, "For some parameters of interest, it is not
> straightforward to compute estimates and associated covariance matrices
> with standard statistical SAS procedures. Examples include correlation
> coefficients between two variables and ratios of variable means. Special
> cases such as these are described in the 'Examples of the Complete-Data
> Inferences' section." The example for the correlation coefficient
> suggests that the proper procedure for combining multiply imputed
> bivariate correlations is to use the Fisher r-to-z transformation prior
> to averaging (and then back-transforming). Similarly, I've seen a few
> articles by Don Rubin and his colleagues in which a number of procedures
> were proposed for obtaining an accurate p-value for tests on an overall
> model (e.g., F tests for regression models).
>
> Perhaps such strategies are only necessary when one wants to obtain
> p-values or confidence intervals for the statistic. For example, if, for
> a regression analysis, one wants R-squared only as a measure of
> explained variance, then (as you suggest) it is fine to average the
> multiply imputed values of R-squared. Similarly, if, for a structural
> equation model, one wants the model chi-squared value only to calculate
> fit indices, then it may be fine to average the multiply imputed
> chi-squared values.
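[Editor's note: the Fisher r-to-z pooling discussed above can be sketched in a few lines of Python. This is an illustrative sketch, not SAS code: the five correlation estimates and the sample size n = 100 are made-up values, and the standard large-sample result Var(z) ≈ 1/(n − 3) is assumed for the within-imputation variance.]

```python
import math

def pool_correlations(rs, n):
    """Pool m multiply imputed correlation estimates via the Fisher
    r-to-z transformation, then back-transform the pooled estimate.

    rs : per-imputation correlation estimates
    n  : complete-data sample size; Var(z) is approx. 1/(n - 3)
    """
    m = len(rs)
    zs = [math.atanh(r) for r in rs]            # r -> z
    z_bar = sum(zs) / m                          # pooled estimate on z scale
    within = 1.0 / (n - 3)                       # within-imputation variance
    between = sum((z - z_bar) ** 2 for z in zs) / (m - 1)
    total_var = within + (1 + 1 / m) * between   # Rubin's total variance
    return math.tanh(z_bar), total_var           # back-transform the estimate

# Hypothetical correlations from m = 5 completed data sets, n = 100.
r_pooled, var_z = pool_correlations([0.42, 0.47, 0.44, 0.45, 0.43], n=100)
print(round(r_pooled, 3))
```

The point estimate is pooled on the z scale and back-transformed; a confidence interval would likewise be formed on the z scale from the total variance, with its endpoints back-transformed, which is what keeps the interval inside (−1, 1) in moderate samples.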
> I fear that my confusion about this may reveal my lack of statistical
> sophistication. However, I've been in touch with a number of similarly
> naive "users" of multiple imputation who are also confused about this
> issue. I know that I and others would be grateful for any clarification
> that you or anyone could provide on this topic.
>
> Thanks again for your kind attention!
> Best,
> Jon

>>> "Laurenceau, Jean-Philippe" <[email protected]> 11/28/03 02:35PM >>>
Rod--Would that also be the case even with a simple correlation
coefficient? If so, why wouldn't something like an r-to-z transformation
be involved in Rubin's rule aggregation? Thanks for your thoughts, J-P

-----Original Message-----
From: [email protected] on behalf of Rod Little
Sent: Thu 11/27/2003 9:19 PM
To: Jonathan Mohr
Cc: [email protected]
Subject: IMPUTE: Re: combining multiply imputed estimates of R-squared

Jonathan: R-squared is just another estimand, and the correct MI procedure
is to simply average the values from each MI data set. Rod Little

On Mon, 24 Nov 2003, Jonathan Mohr wrote:
> I am in the midst of using multiple imputation with multiple
> regression. The literature I've seen focuses on combining regression
> coefficients and corresponding standard errors. However, I've seen
> nothing on combining the estimates of R-squared. I would appreciate
> any guidance or leads that list members can offer. Best, Jon
>
> __________________________________
>
> Jonathan Mohr, Ph.D.
> Assistant Professor
> Department of Psychology
> Loyola College
> 4501 North Charles Street
> Baltimore, MD 21210-2699
>
> E-mail: [email protected]
> Phone: 410-617-2452
> Fax: 410-617-5341
> __________________________________

___________________________________________________________________________________
Roderick Little
Richard D. Remington Collegiate Professor     (734) 936-1003
Department of Biostatistics              Fax: (734) 763-2215
U-M School of Public Health
M4045 SPH II                                  [email protected]
1420 Washington Hgts                          http://www.sph.umich.edu/~rlittle/
Ann Arbor, MI 48109-2029

From wv <@t> isd.sdu.dk Sat Nov 29 14:18:56 2003
From: wv <@t> isd.sdu.dk (Werner Vach)
Date: Sun Jun 26 08:25:01 2005
Subject: IMPUTE: Re: combining multiply imputed estimates of R-squared
References: <pine.wnt.4.21.0311272118090.1112-100...@little-home>
Message-ID: <[email protected]>

Dear Jonathan and Rod,

in principle I agree with Rod. However, I think that in using MI one
should be aware that measures of predictive accuracy like R-squared
should be handled with greater care than regression parameter estimates.
The reason is that measures of predictive accuracy are more sensitive
than regression parameters to the choice of the model we (implicitly)
use in generating the imputations. When we apply MI to regression models
(with missing values in the covariates), many people use procedures
which assume that the regression model is correctly specified
(neglecting the general advice that the model used to generate the MIs
should be more general than the model we would like to analyse). So it
will frequently happen that, although the true model is, for example,
quadratic, we still draw imputations in a way that assumes a linear
model. This is no big problem as long as we look at the regression
parameters, as one does not introduce bias into the estimation this way
(although confidence intervals will be too optimistic). However, with
respect to measures of predictive accuracy we will introduce a bias,
because the imputations make the data look like the model.
So whenever one would like to use MI to measure predictive accuracy, I
recommend basing the generation of the MIs on models which are much more
general than the regression model to be analysed, e.g. including
quadratic terms, interactions, and perhaps heterogeneous variances.

Best,
Werner

Rod Little wrote:
> Jonathan: R-squared is just another estimand, and the correct MI procedure
> is to simply average the values from each MI data set. Rod Little
>
> On Mon, 24 Nov 2003, Jonathan Mohr wrote:
>> I am in the midst of using multiple imputation with multiple
>> regression. The literature I've seen focuses on combining regression
>> coefficients and corresponding standard errors. However, I've seen
>> nothing on combining the estimates of R-squared. I would appreciate
>> any guidance or leads that list members can offer. Best, Jon
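[Editor's note: Werner's caveat concerns how the imputations are generated; the pooling step itself remains what Rod describes: treat R-squared as just another estimand and average it across the completed data sets. A minimal Python sketch, with hypothetical per-imputation R-squared values:]

```python
def pool_point_estimates(estimates):
    """Rubin's rule for the point estimate: the mean of the
    complete-data estimates, one per imputed data set."""
    return sum(estimates) / len(estimates)

# Hypothetical R-squared values from m = 5 completed-data regressions.
r2_per_imputation = [0.31, 0.29, 0.33, 0.30, 0.32]
r2_pooled = pool_point_estimates(r2_per_imputation)
print(round(r2_pooled, 2))  # -> 0.31
```

If one instead wants interval estimates for a bounded quantity such as a correlation, the thread's earlier advice applies: pool on a transformed scale (e.g. Fisher's z) and back-transform.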
