[Impute] How to calculate likelihood ratio test for multiply-imputed data using proc surveylogis

allison Fri Mar 30 07:51:42 2007

Tom:

As Craig notes, the likelihood ratio test is not valid here. But the problem
has nothing to do with multiple imputation.  It's because SURVEYLOGISTIC
does not modify the likelihood to take clustering and other design factors
into account. SURVEYLOGISTIC does conventional ML estimation and then
adjusts the standard errors to account for the design factors. Methods for
combining log-likelihoods in the multiple imputation setting are, in fact,
well established (see, e.g., my 2001 book, Missing Data).


As Craig also notes, the solution here is to do Wald tests.  But it's
actually very easy to do this with PROC MIANALYZE using the TEST statement.
For example, to test model 4 against model 3, run model 4 and include the
statement:

test disease1=0, disease2=0 /mult;

To test model 3 against model 2, run model 3 and include the statement:

test rxuse=0, healthvisits=0 / mult;

The TEST statement in MIANALYZE is slightly different from the one in PROC
REG and PROC LOGISTIC.  Specifically, if you omit the MULT option, you only
get separate tests for each hypothesis. MULT gives the joint test.
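
Putting this together for the model 4 versus model 3 comparison, a minimal
sketch might look like the following. It reuses the dataset, design
variables, and outcome name from Tom's posted code; the output dataset names
gmparms4 and covmat4 are made up for illustration, and it assumes every
predictor enters as a single numeric term (no CLASS statement), so that each
coefficient carries the same name as its variable.

```sas
/* Fit Model 4 separately within each imputed dataset.            */
/* Dataset and variable names follow Tom's posted code;           */
/* gmparms4 and covmat4 are hypothetical output dataset names.    */
proc surveylogistic data=nhis_aa_recode3_imp;
   cluster h_psu;
   strata  h_stratum;
   weight  h_WTFA_SA;
   model dependent(descending) = Age Gender WorkingStatus Region
                                 HealthVisits RXuse Disease1 Disease2 / covb;
   by _imputation_;
   ods output ParameterEstimates=gmparms4 CovB=covmat4;
run;

/* Combine across imputations and request the joint Wald test.    */
/* Assumes each predictor is one numeric term, so the effect      */
/* names match the variable names.                                */
proc mianalyze parms=gmparms4 covb=covmat4;
   modeleffects Age Gender WorkingStatus Region
                HealthVisits RXuse Disease1 Disease2;
   test Disease1=0, Disease2=0 / mult;
run;
```

The model 3 versus model 2 comparison would follow the same pattern: drop
Disease1 and Disease2 from the MODEL and MODELEFFECTS statements and test
RXuse=0, HealthVisits=0 instead.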

-----------------------------------------------------------------
Paul D. Allison, Professor and Chair
Department of Sociology
University of Pennsylvania
3718 Locust Walk
Philadelphia, PA  19104-6299
215-898-6712, 215-898-6717
215-573-2081 (fax)
http://www.ssc.upenn.edu/~allison
 

-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Craig Newgard
Sent: Thursday, March 29, 2007 8:40 PM
To: [email protected]; [email protected]
Subject: Re: [Impute] How to calculate likelihood ratio test
for multiply-imputed data using proc surveylogis

Tom,
I have struggled with a similar issue recently.  It appears that you're
working with complex survey data (clusters, strata, weights), which adds
complexity both to the MI model and analysis.  For the MI model, you should
include features important to the sampling design (i.e., clusters and strata)
in order to minimize bias in the MI results (Raghunathan has a paper on
this).

For the analysis, the LR test can't be used with complex survey data.
Instead, the Wald statistic can be used.  I'd suggest running your MI model,
then analyzing each of the imputed datasets and saving the covariance matrices
and point estimates (I have used Stata for this part, as it was easier than in SAS),
which can then be used to compute the joint significance of multiple terms
(NORM has a fairly easy mechanism for employing this).  One other option is
to use the "test" command in Stata for each model (whichever joint terms
you'd like to test) - this will give you some idea as to joint significance,
though Stata micombine is not yet able to handle multiply imputed complex
survey data.
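
For reference, the combining step described above is usually written as
follows. This is only a sketch of the standard multiple-imputation rules for
a joint Wald test of k coefficients (all hypothesized to be zero) across m
imputed datasets; the symbols are notation introduced here rather than
anything taken from the software mentioned, and the denominator degrees of
freedom, which depend on m, k and r, are left out.

```latex
\begin{align*}
\bar{Q} &= \frac{1}{m}\sum_{i=1}^{m}\hat{Q}_i, &
\bar{U} &= \frac{1}{m}\sum_{i=1}^{m} U_i, &
B &= \frac{1}{m-1}\sum_{i=1}^{m}
     (\hat{Q}_i-\bar{Q})(\hat{Q}_i-\bar{Q})^{\top},\\
r &= \Bigl(1+\frac{1}{m}\Bigr)\frac{\operatorname{tr}(B\,\bar{U}^{-1})}{k}, &
D &= \frac{\bar{Q}^{\top}\,\bar{U}^{-1}\,\bar{Q}}{k\,(1+r)}.
\end{align*}
```

Here \hat{Q}_i is the vector of the k tested coefficients from imputed
dataset i and U_i is its design-adjusted covariance matrix. D is referred to
an F distribution with k numerator degrees of freedom.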

Hope this helps.  

Craig 

Craig D. Newgard, MD, MPH
Assistant Professor
Department of Emergency Medicine
Department of Public Health and Preventive Medicine
Center for Policy and Research in Emergency Medicine
Oregon Health & Science University
3181 SW Sam Jackson Park Road 
Mail Code CR-114
Portland, Oregon 97239-3098
phone (503) 494-1668
fax (503) 494-4640
[email protected] 



>>> "Bohman, Thomas M" <[email protected]> 03/29/07 10:08 AM >>>
Greetings,

 

I am using SAS proc surveylogistic to estimate four nested models. I've
presented simplified versions of each model below:

 

Predictor       Model1   Model2   Model3   Model4
Age               x        x        x        x
Gender            x        x        x        x
WorkingStatus              x        x        x
Region                     x        x        x
HealthVisits                        x        x
RXuse                               x        x
Disease1                                     x
Disease2                                     x

 

I would like to test the joint effect of adding each additional set of
variables to the predictors entered in previous models. I would normally
calculate the Likelihood Ratio (LR) test by multiplying -2 by the
difference in the log transformed likelihoods as shown below:

 

LR = -2*(lnL1-lnL2)

 

Where ln is the log transformation, L1 is likelihood for Model1 and L2
is likelihood for Model 2 with the LR value distributed as Chi-Square
and having degrees of freedom equal to the difference in number of
predictors between the two models. 

 

However, my question arises from using multiple imputation (proc mi) to
impute missing values in 10 different imputed datasets and then using
proc mianalyze to combine the results from these ten datasets and obtain
the correct test statistics. I'm not sure how to deal with the LR test
since there are 10 different values for the log-likelihoods for each
model. One simple strategy would be to average the log-likelihoods
across the 10 models and use the averaged results. However, I can't find
any literature that supports this approach.  I've included below the
basic code that I'm using to run one of the models.

 

 
**-----------------------------------------------------------------------------**;
      **-- Create Multiple Imputations Model with all predictors--**;
**-----------------------------------------------------------------------------**;

      proc mi data=nhis.nhis_aa_recode3 seed=21355417 nimpute=10
              out=nhis_aa_recode3_imp;
            mcmc chain=multiple displayinit initial=em(itprint);
            var Age Gender WorkingStatus Region HealthVisits RXuse
                Disease1 Disease2;
      run;

      ods output close;

 

 
**-----------------------------------------------------------------------------**;
      **-- Run Model 1  predictors--**;
**-----------------------------------------------------------------------------**;

      proc surveylogistic data=nhis_aa_recode3_imp;
            cluster h_psu;
            strata h_stratum;
            weight h_WTFA_SA;
            model dependent(descending) = Age Gender / COVB expb;
            by _imputation_;
            ods output Parameterestimates=gmparms1 COVB=COVMAT1;
            title3 'Survey logistic results for Model 1';
      run;

 
**-----------------------------------------------------------------------------**;
      **-- Combine Results for Model 1  predictors--**;
**-----------------------------------------------------------------------------**;

      proc mianalyze parms=gmparms1 COVB=COVMAT1 mult;
            modeleffects Age Gender;
            title3 'Proc MIanalyze results for Model 1';
      run;

 

Any feedback on how to accomplish this test would be greatly appreciated!
Any examples showing how to do so in SAS code would be doubly
appreciated!!

 

With best regards,

Tom

Tom Bohman, Ph.D.
Research Scientist
Addiction Research Institute/GCATTC
Center for Social Work Research
University of Texas at Austin
1 University Station R5000
Austin, TX 78712
(512) 232-0605
[email protected]

 




_______________________________________________
Impute mailing list
[email protected]
http://lists.utsouthwestern.edu/mailman/listinfo/impute
