Paul Miller wrote:
> Hi Rod,
>
> Thanks for your input. Since I wrote initially, I've had a couple of
> ideas. The first one is similar to the duration variable approach you
> suggested. The idea would be to introduce a duration variable into the
> imputation dataset that would be calculated using cases with complete
> data for start and stop. Then the stop date could be constrained to
> equal start + duration. Or possibly the stop could just be directly
> calculated as start + duration.
>
> The second idea involves creating a set of 4 variables: MIN_START,
> MAX_START, MIN_STOP, and MAX_STOP. These can generally be created using
> the limited date information that I have available. If, for example, I
> know that a person started taking a drug in 2004 but nothing else, I can
> calculate the minimum start as 01/01/04 and the maximum start as
> 12/31/04. Then I can tell IVEware to constrain the imputed value to be
> between these two dates. I've been playing with this approach a little
> earlier today, and, so far, it seems to be working quite well. So now
> I'm just hoping that the duration approach can also be successfully
> implemented.
>
> Thanks,
>
> Paul
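For readers following the thread, the two ideas quoted above can be illustrated with a minimal Python sketch. It is not IVEware syntax and not necessarily how Paul built his variables; the function names `date_bounds` and `stop_from_duration` are hypothetical, and the sketch assumes a partial date arrives as a year with an optional month and day.

```python
import calendar
from datetime import date, timedelta
from typing import Optional, Tuple


def date_bounds(year: int,
                month: Optional[int] = None,
                day: Optional[int] = None) -> Tuple[date, date]:
    """Return the earliest and latest dates consistent with a partial date.

    Hypothetical helper: only the year is required; month and day are
    used when they are known.
    """
    if month is None:
        # Only the year is known: the bounds span the whole year.
        return date(year, 1, 1), date(year, 12, 31)
    if day is None:
        # Year and month are known: the bounds span the whole month.
        last_day = calendar.monthrange(year, month)[1]
        return date(year, month, 1), date(year, month, last_day)
    # Fully known date: both bounds coincide.
    exact = date(year, month, day)
    return exact, exact


def stop_from_duration(start: date, duration_days: int) -> date:
    """Derive a stop date as start + duration (duration in whole days)."""
    return start + timedelta(days=duration_days)


# The example from the message: a drug started sometime in 2004.
min_start, max_start = date_bounds(2004)
print(min_start, max_start)  # 2004-01-01 2004-12-31
```

The resulting pair would play the role of MIN_START and MAX_START; the same helper applied to the stop information would give MIN_STOP and MAX_STOP.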

I'm curious how IVEware does such constraints. Does it result in
boundary problems, i.e., a mass of imputed values at the boundary of the
constraint?

Frank Harrell

> Paul J. Miller, Ph.D.
> Research Scientist and Statistician
> Ontario HIV Treatment Network
> 1300 Yonge St., Suite 308
> Toronto, Ontario M4T 1X3
> Phone: (416) 642-6486 ext 232
> Fax: (416) 640-4245
>
> -----Original Message-----
> From: Roderick A. Rose [mailto:[email protected]]
> Sent: Thursday, August 31, 2006 11:48 AM
> To: Paul Miller; [email protected]
> Subject: Re: [Impute] Imputing "Plausible" Start and Stop Dates for HIV
> Antiretroviral Drugs
>
> Paul,
>
> My recommended solution is made under the (perhaps incorrect) assumption
> that what you are mainly interested in is the interval between the start
> and stop dates and not the actual stop and start dates themselves. Let the
> start date equal zero in every case (so it doesn't have to be imputed) and
> the interval is a count of days (or another unit) between zero and the stop
> date. You impute this interval. I've not used IVEware, so I'm not sure this
> will completely eliminate the problem (e.g., you might end up with negative
> intervals if the bounds statement really doesn't work well).
>
> Regarding the second issue of plausibility, I am curious if it is necessary
> to have precision in days; if you know it happened in May 1998, you can err
> on the side of the least undesirable bias (by making it either May 31 or
> May 1). This is an alternative to ignoring the known value and letting it
> impute a completely new and possibly unrelated value. (Or do both and see
> what happens, as many of us probably do).
>
> Best,
>
> Rod

--
Frank E Harrell Jr
Professor and Chair
School of Medicine
Department of Biostatistics
Vanderbilt University
