Thanks, I thought a little about this. It's not obvious to me what the prior
would be. Any recommendations?

On Tue, Sep 20, 2011 at 9:41 AM, Juned Siddique
<[email protected]> wrote:

> Hi Paul,
>
> If you use a Bayesian approach like Proc MI for the problem below, the
> posterior correlation between waves 1 and 3 is just the prior correlation. So
> one approach might be to use an informative prior for the covariance matrix,
> which you can do in Proc MI.
>
> -Juned
>
> *From:* Impute -- Imputations in Data Analysis [mailto:
> [email protected]] *On Behalf Of* Paul von Hippel
> *Sent:* Tuesday, September 20, 2011 8:21 AM
> *To:* [email protected]
> *Subject:* Re: Imputing panel data, constraining correlations at long lags
>
> Thanks, Dave. You've come up with a nicely simplified version of my
> problem. Suppose I had only three waves of data, with every subject missing
> either wave 3 (your pattern A) or wave 1 (your pattern B). Ordinarily I
> would put the data in wide format --
>
> A O1 O2 M3
> B M1 O2 O3
>
> -- and impute using a multivariate normal model. However, I don't think
> that would work in this case because the MVN model would want to estimate
> the correlation between wave 1 and wave 3, and there are no cases where both
> wave 1 and wave 3 are observed.
>
> However, if I could tell the software that this was, say, an AR(1) process
> -- or, equivalently, that the partial correlation between waves 1 and 3,
> given wave 2, is zero -- I'd be in business.
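
Spelled out for the three-wave case, the zero-partial-correlation constraint pins down the one correlation the data cannot estimate. The notation below (\rho_{jk} for the correlation between waves j and k) is assumed here purely for illustration:

```latex
\rho_{13\cdot 2}
  = \frac{\rho_{13} - \rho_{12}\,\rho_{23}}
         {\sqrt{(1-\rho_{12}^{2})\,(1-\rho_{23}^{2})}}
  = 0
\quad\Longrightarrow\quad
\rho_{13} = \rho_{12}\,\rho_{23}
```

Pattern A identifies \rho_{12} and pattern B identifies \rho_{23}, so \rho_{13} follows from the two of them. If both were about .85, for example, the implied \rho_{13} would be about .72.
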
>
> This could be done using MVN software that allowed me to impose constraints
> on the covariance matrix, or imputation software for serially correlated
> data. Does such software exist?
>
> Best,
> Paul
>
>    ------------------------------
>
> *From:* David Judkins <[email protected]>
> *To:* [email protected]
> *Sent:* Tuesday, September 20, 2011 7:25 AM
> *Subject:* Re: Imputing panel data, constraining correlations at long lags
>
> Paul,
>
> This sounds pretty challenging. Reminds me of Andrew Gelman's JSM talk and
> 1998 JASA paper on imputation of questions not asked.
>
> It also reminds me of a remark some speaker made this year at JSM about
> almost all natural processes being Markov chains. Not sure I buy that, but I
> think he meant that if you have a rich enough state vector, then one past
> observation is all you need. Of course, that would be trivially true if the
> state vector contained lagged latent values. In this case, I doubt your
> state vector is rich enough to compensate for the brevity of the
> student-level time series, but I guess you have to work with what you have.
>
> Whatever you do I imagine will involve a lot of custom programming.
> However, you might be able to use Raghu's IVEware on a series of specially
> reshaped versions of your data. For example, to impute year 3 for subject A
> and year 1 for subject B, you might create a dataset with only A and B
> records in it, shaped like this:
>
> A O1 O2 M3
> B M1 O2 O3
>
> Once that was done, you could proceed to imputing year 4 on A and B records
> and year 2 on C records with a dataset built from A, B, and C records,
> shaped as
>
> A O2 I3 M4
> B O2 O3 M4
> C M2 O3 O4
>
> And so on. At the end of that, you would have 4 observed/imputed years per
> subject.
>
> There should then be a way to generalize to more than 4 per subject. Not
> very elegant, but it might work.
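
As a rough illustration of the first reshaping step above, here is a minimal Python sketch. It is not IVEware. It assumes simulated standardized scores, a single score per wave, and the zero-partial-correlation assumption discussed in this thread, under which wave 3 given waves 1 and 2 depends only on wave 2. Wave 3 for pattern A is then drawn from a regression on wave 2 fitted to pattern B, and symmetrically for wave 1. All names (`simulate_ar1`, `impute_from_neighbor`) and the simulated data are made up for the example. The draws ignore parameter uncertainty, so this is a single stochastic imputation, not proper multiple imputation.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_ar1(n, rho, rng):
    """Simulate n subjects x 3 waves of standardized AR(1) scores."""
    y = np.empty((n, 3))
    y[:, 0] = rng.standard_normal(n)
    for t in (1, 2):
        y[:, t] = rho * y[:, t - 1] + np.sqrt(1 - rho**2) * rng.standard_normal(n)
    return y

def impute_from_neighbor(x_donor, y_donor, x_target, rng):
    """Fit y ~ a + b*x on donor cases, then draw y for target cases.

    Uses point estimates only (no draw of a, b, or the residual variance),
    so it understates uncertainty relative to proper multiple imputation.
    """
    b, a = np.polyfit(x_donor, y_donor, 1)   # returns slope, then intercept
    resid = y_donor - (a + b * x_donor)
    sigma = resid.std(ddof=2)                # residual SD
    return a + b * x_target + sigma * rng.standard_normal(len(x_target))

rho = 0.85                                     # assumed lag-1 correlation
full = simulate_ar1(4000, rho, rng)
is_a = np.arange(len(full)) < len(full) // 2   # pattern A: O1 O2 M3
is_b = ~is_a                                   # pattern B: M1 O2 O3

data = full.copy()
data[is_a, 2] = np.nan   # wave 3 missing for A
data[is_b, 0] = np.nan   # wave 1 missing for B

# Wave 3 for A: regress wave 3 on wave 2 among B, apply to A's wave 2.
data[is_a, 2] = impute_from_neighbor(
    data[is_b, 1], data[is_b, 2], data[is_a, 1], rng)
# Wave 1 for B: regress wave 1 on wave 2 among A, apply to B's wave 2.
data[is_b, 0] = impute_from_neighbor(
    data[is_a, 1], data[is_a, 0], data[is_b, 1], rng)

corr = np.corrcoef(data, rowvar=False)
print(corr.round(2))   # the wave 1 / wave 3 entry should land near rho**2
```

Rolling this forward as described above would reuse the same function on the next pair of waves (wave 3 and wave 4, and so on), each time treating earlier imputations as if they were observed.
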
>
> --Dave
>   ------------------------------
>
> *From:* Impute -- Imputations in Data Analysis [
> [email protected]] on behalf of Paul von Hippel [
> [email protected]]
> *Sent:* Monday, September 19, 2011 5:58 PM
> *To:* [email protected]
> *Subject:* Imputing panel data, constraining correlations at long lags
>
> I have panel data where different students are tested for overlapping
> 2-year periods.
>
>    - Subject A is observed for years 1 & 2.
>    - Subject B is observed for years 2 & 3.
>    - Subject C is observed for years 3 & 4.
>    - etc. up to year 12 (of school)
>
> For each observed year there are three separate test occasions (fall,
> winter, spring) and two subjects (reading, math).
>
> It seems to me I can impute the missing test scores provided I am willing
> to assume something about lags that are 2 years or longer. For example, I
> could assume that the partial correlation at lags of 2 years or longer is
> zero. This is not an unreasonable assumption since the correlations at
> shorter lags are very strong (.8-.9).
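
To see what this assumption means for the covariance matrix, here is a small numerical check in Python. It simplifies to one score per year for 12 years and assumes a stationary AR(1) correlation of .85, a value picked from the .8-.9 range above and not estimated from any data. Under AR(1) the correlation at lag k is rho^k, and the inverse of that matrix is tridiagonal, which is the matrix form of "partial correlations (given the other years) at lags of 2 years or longer are zero".

```python
import numpy as np

n_years, rho = 12, 0.85   # rho is an assumed illustrative value

# AR(1) correlation matrix: corr(year i, year j) = rho ** |i - j|
lags = np.abs(np.subtract.outer(np.arange(n_years), np.arange(n_years)))
corr = rho ** lags

# Partial correlations given all other years come from the precision matrix.
precision = np.linalg.inv(corr)
d = np.sqrt(np.diag(precision))
partial = -precision / np.outer(d, d)
np.fill_diagonal(partial, 1.0)

# Largest partial correlation (in absolute value) at lags of 2 or more.
max_long_lag = np.abs(partial[lags >= 2]).max()
print(f"lag-1 marginal corr: {corr[0, 1]:.2f}")
print(f"lag-2 marginal corr: {corr[0, 2]:.4f}")
print(f"max |partial corr| at lag >= 2: {max_long_lag:.1e}")
```

The marginal correlations at long lags stay well above zero. Only the partial correlations vanish, so the constraint lives in the inverse covariance matrix and not in the covariance matrix itself.
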
>
> Is there software that will allow me to do this conveniently?
>
> My usual strategy is to reshape the data from long to wide and then impute
> using a multivariate normal model. There are several packages that will
> permit this; however, I am not aware of software that will let me constrain
> the covariance matrix in the way I have described.
>
> I have not used imputation software that is tailored for panel data --
> such as Schafer et al.'s PAN package, recently ported from S-Plus to R. I
> could try that, provided there is a convenient way to restrict the long
> lags.
>
> Thanks!
>
> --
> Best wishes,
> Paul von Hippel
> Assistant Professor
> LBJ School of Public Affairs
> Sid Richardson Hall 3.251
> University of Texas, Austin
> 2315 Red River, Box Y
> Austin, TX  78712
>
> mobile, preferred (614) 282-8963
> office (512) 232-3650
>



-- 
Best wishes,
Paul von Hippel
Assistant Professor
LBJ School of Public Affairs
Sid Richardson Hall 3.251
University of Texas, Austin
2315 Red River, Box Y
Austin, TX  78712

mobile, preferred (614) 282-8963
office (512) 232-3650
