On Mon, Aug 18, 2008 at 6:18 PM, Farley, Robert <[EMAIL PROTECTED]> wrote:
> My motivation is to try to correct for a "time on board" bias we see in
> our surveys. Not surprisingly, riders who are only on board a short
> time don't attempt/finish our survey forms. We're able to weight our
> survey to the "bus stop-on by bus run" level.
So is the problem catching the short rides in your sample, or getting
the riders on those short rides to complete the survey?

If the former, then all you have to do is weight by the inverse
probability of selection (the Horvitz-Thompson estimator). This
probability is probably roughly proportional to time on the bus, which
in turn might be proportional to the number of stops in the ride. You
may not need any raking for that; just do some algebra to compute those
probabilities of selection.

If the latter, then it is a problem of non-response. If you think that
the only thing that determines whether a person responds is the length
of the ride, then your data are "missing at random" (MAR), one of the
standard concepts in missing-data statistics
(http://www.citeulike.org/user/ctacmo/article/553290). You can correct
for that; in survey statistics, this is again done with weights. Here,
you would boost the weights of the respondents by the inverse of the
fraction of riders who completed the survey, computed separately for
each ride length.

In a more difficult situation, the response probability might also
depend on other factors, say the demographics of the passengers, the
time of day, etc. I would imagine you would still have MAR data, unless
you have some weird questions like "Do you carry firearms on the bus?",
which people who did have guns at the time of their ride would probably
decline to answer, making the data informatively missing, i.e. not
missing at random (NMAR).

-- 
Stas Kolenikov, also found at http://stas.kolenikov.name
Small print: I use this email account for mailing lists only.

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
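
For what it's worth, the two weighting steps described in this message
can be sketched in a few lines. This is only an illustration under
stated assumptions, written in Python rather than R: the function names,
the grouping into "short" and "long" rides, and all the numbers are
hypothetical, and the selection probability is simply assumed to be
proportional to the number of stops ridden, so the weights are only
determined up to a constant.

```python
# Hypothetical illustration; names and data are made up for this sketch.

def selection_weights(stops_ridden):
    """Inverse-probability (Horvitz-Thompson style) weights.

    Assumes the chance of being sampled is proportional to the number
    of stops ridden, so each weight is 1 / stops (up to a constant).
    """
    return [1.0 / s for s in stops_ridden]


def nonresponse_adjusted(weights, groups, responded):
    """Boost respondents' weights by the inverse response fraction.

    Within each group, respondents' weights are multiplied by
    (total weight in group) / (respondents' weight in group);
    non-respondents get weight 0. This relies on the MAR assumption
    that response depends only on the grouping variable.
    """
    total, resp = {}, {}
    for w, g, r in zip(weights, groups, responded):
        total[g] = total.get(g, 0.0) + w
        if r:
            resp[g] = resp.get(g, 0.0) + w
    return [w * total[g] / resp[g] if r else 0.0
            for w, g, r in zip(weights, groups, responded)]


# Six sampled riders: two short rides, four long rides.
stops = [2, 2, 10, 10, 10, 10]
groups = ["short", "short", "long", "long", "long", "long"]
responded = [True, False, True, True, True, False]

base = selection_weights(stops)
final = nonresponse_adjusted(base, groups, responded)
print(base)
print(final)
```

The adjustment keeps the total weight within each group unchanged, so
the respondents stand in for the non-respondents with the same ride
length. A group with no respondents at all cannot be adjusted this way
and would have to be merged with a neighbouring group first.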