>>> Peter Hannan <[email protected]> 09/23/05 2:05 PM >>>
Paul, try a null model in GLM.  Peter


Does anyone have a neat recipe for estimating descriptive statistics 
(means and standard deviations) from multiply imputed data using SAS? 
I've done this a number of different ways, but they all seem like 
more trouble than they should be. It appears that getting means and 
standard deviations is substantially harder than getting regression estimates!

Best,
Paul

-----------------------------
Peter J Hannan
Senior Research Fellow
Division of Epidemiology and Community Health, SPH
University of Minnesota
1300 South 2nd St. #300
Minneapolis, MN. 55454-1015

email: [email protected] 
voice: 612-624-6542
FAX  : 612-624-0315





From Howells_W <@t> bmc.wustl.edu  Tue Sep 27 10:35:38 2005
From: Howells_W <@t> bmc.wustl.edu (Howells, William)
Date: Tue Sep 27 10:36:06 2005
Subject: [Impute] the PE statistic with imputed data
Message-ID: <2ada428b6944da4b8f8a2fdf4e60e52a197...@exchange.wusm-pcf.wustl.edu>

I'm interested in calculating what some have referred to as the
"proportion explained" (PE) statistic from two regression models with
imputed data.  This statistic comes up in the analysis of indirect
effects (or surrogate variables, or mediation effects, depending on the
literature; e.g., Freedman and Schatzkin, Am J Epi 1992).
PE = (C-C')/C, where C = the unadjusted effect of some independent
variable and C' = the same effect adjusted for the putative mediator.
PE quantifies the proportional reduction in the effect of the
independent variable due to mediation.  If PE = 1, there is complete
mediation.  If PE = 0, there is no mediation.  

Calculation of standard errors for PE is controversial and depends on the
outcome, in my case a time-to-event outcome.  I'm using the method due
to Lin, Fleming, and DeGruttola (Stats in Medicine 1997), but with 50
imputed datasets.  My question is whether I am combining the imputations
correctly.  I first impute my 50 datasets, n=600 each.  I run the two
regression models within each imputed dataset and obtain C and C', and
apply the Lin et al formula to obtain both PE and se(PE).  Then I use
the usual formulas as implemented in SAS PROC MIANALYZE to obtain the
combined PE over the m=50 imputations and the combined variance using
the within and between imputation variance.  All seems well.  I think
this is the right approach.  
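
For reference, the pooling step described above (Rubin's rules, which
PROC MIANALYZE applies to per-imputation estimates and standard errors)
can be sketched in Python.  This is only a minimal illustration, not SAS
output: the function name pool_rubin and the standard errors of 0.1 are
hypothetical, and the two PE values are taken from the m=2 toy example
later in this message.

```python
from statistics import mean, variance


def pool_rubin(estimates, std_errors):
    """Pool per-imputation estimates with Rubin's rules.

    Returns (pooled estimate, total variance).
    """
    m = len(estimates)
    q_bar = mean(estimates)                          # pooled point estimate
    within = mean([se ** 2 for se in std_errors])    # within-imputation variance W
    between = variance(estimates)                    # between-imputation variance B (divisor m-1)
    total = within + (1 + 1 / m) * between           # T = W + (1 + 1/m) * B
    return q_bar, total


# Hypothetical inputs: PE_1 = 1/3, PE_2 = 2/3, each with an assumed se(PE) of 0.1.
pe_hat, total_var = pool_rubin([1 / 3, 2 / 3], [0.1, 0.1])
print(pe_hat, total_var ** 0.5)
```

With m=50 the same function would take the 50 per-imputation PE values
and their Lin et al. standard errors as its two arguments.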

The other way of doing it is to first calculate C and C' separately by
averaging over the imputations and then find PE from these C and C'.
Note that mathematically this produces a different result than the above
method.  For example, with m=2 imputations that produced (C,C') = (6,4)
and (3,1) then PE_1 = (6-4)/6 = 1/3 and PE_2 = (3-1)/3 = 2/3.  The first
method produces (1/3 + 2/3)/2 = 1/2.  The second method produces
[(6+3)/2 - (4+1)/2] / [(6+3)/2] = (9/2 - 5/2)/(9/2) = 4/9.  I'm just
looking for confirmation that the second method is incorrect.  Thanks.  
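
The arithmetic in the m=2 example can be checked with exact fractions.
The short Python sketch below only reproduces the two calculations; it
does not settle which one is appropriate.  The helper name
proportion_explained is hypothetical.  The two results differ because PE
is a ratio, so the average of the ratios is not the ratio of the
averages.

```python
from fractions import Fraction


def proportion_explained(c, c_adj):
    """PE = (C - C') / C."""
    return (c - c_adj) / c


# (C, C') from the two imputations in the example above.
pairs = [(Fraction(6), Fraction(4)), (Fraction(3), Fraction(1))]
m = len(pairs)

# Method 1: compute PE within each imputation, then average the PEs.
pe_each = [proportion_explained(c, c_adj) for c, c_adj in pairs]
method1 = sum(pe_each) / m

# Method 2: average C and C' over imputations, then compute one PE.
c_bar = sum(c for c, _ in pairs) / m
c_adj_bar = sum(c_adj for _, c_adj in pairs) / m
method2 = proportion_explained(c_bar, c_adj_bar)

print(method1, method2)  # prints: 1/2 4/9
```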

Bill Howells, MS
Wash U Med School, St Louis
