IMPUTE: estimating R^2, etc.

Paul von Hippel Mon, 09 Jun 2003 16:36:15 -0700

A few months ago, I posted a note asking how to estimate R^2 (and other quantities) when values are multiply imputed. A respondent suggested that I use the same strategy as that used to estimate the regression coefficients: get a point estimate from each imputed data set, and average these.

Today I began to wonder about this. Consider the regression Y=rX+e where X and Y are standard normal variables. Then R^2 = r^2. It was suggested that R^2 could be estimated by averaging the estimates of R^2=r^2 across multiple imputations. Yet r is estimated by averaging the estimates of r across multiple imputations. In general, these estimates will not agree: the mean of the squared estimates is at least as large as the square of their mean, so whenever the estimates of r differ across imputations, the estimate of R^2 will be greater than the squared estimate of r. If the estimator of r is unbiased, then the proposed estimate of R^2 must be biased.
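To make the disagreement concrete, here is a small Python sketch. The per-imputation estimates of r are made-up numbers drawn around 0.5, not results from any real imputed data set, and the function name compare_pooled_estimates is only illustrative. It compares the average of the squared estimates with the square of the averaged estimate.

```python
import random
from statistics import fmean, pvariance


def compare_pooled_estimates(r_hats):
    """Return (mean of r_i^2, square of mean r_i) for per-imputation estimates."""
    mean_r = fmean(r_hats)
    pooled_r2 = fmean(r * r for r in r_hats)
    return pooled_r2, mean_r ** 2


random.seed(0)

# Hypothetical estimates of r from m = 5 imputed data sets (assumed values).
r_hats = [0.5 + random.gauss(0, 0.05) for _ in range(5)]

pooled_r2, squared_mean_r = compare_pooled_estimates(r_hats)

# The gap is the variance of the estimates across imputations (divisor m),
# so it is zero only when every imputation gives the same estimate of r.
gap = pooled_r2 - squared_mean_r
between_var = pvariance(r_hats)

print(f"average of r_i^2   : {pooled_r2:.6f}")
print(f"(average of r_i)^2 : {squared_mean_r:.6f}")
print(f"gap                : {gap:.6f}")
print(f"variance of r_i    : {between_var:.6f}")
```

The gap between the two pooled quantities equals the between-imputation variance of the estimates of r (computed with divisor m), so the two approaches agree only when the imputations all yield the same estimate.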

It strikes me there must be a lot of quantities for which we cannot obtain unbiased estimates using this procedure. Pertinent citations would be most appreciated.

Best wishes,
Paul von Hippel
Statistician
Ohio State University
