I think I've figured out a way to do a bootstrap variance estimate of a quantile.  I 
need to work out the code, but this is the algorithm (for a stratified cluster sample):

   Make a list of the stratum values for the sample
   For each stratum value,
      Make a list of the PSU values within that stratum
      Sample n-1 PSU values with replacement (n = number of PSUs in the stratum)
      Get the frequency of PSU values selected 
      Attach the frequency to the sample elements within the stratum by PSU
      Construct a new weight within the stratum as the sample weight multiplied by
      the frequency

   Once the new weight has been generated in all strata, get the quantile estimate(s)
   from svyquantile using the new weight
   Repeat another 99 times to build 100 bootstrap replicates
   Take the standard deviation of the replicate estimates as the standard error
   (its square is the variance estimate)
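
A minimal sketch of the steps above, written in Python with only the standard library rather than in R, so a simple weighted quantile stands in for svyquantile. The row layout (stratum, psu, weight, y), the function names, the toy data, and the optional `rescale` argument (which multiplies the new weight by n/(n-1), as in the Rao-Wu rescaled bootstrap) are all assumptions made for illustration; none of this is the survey package's own code.

```python
import random
import statistics
from collections import Counter, defaultdict


def weighted_quantile(values, weights, q):
    """Smallest value whose cumulative weight share reaches q (0 < q <= 1)."""
    pairs = sorted((v, w) for v, w in zip(values, weights) if w > 0)
    total = sum(w for _, w in pairs)
    cum = 0.0
    for v, w in pairs:
        cum += w
        if cum >= q * total:
            return v
    return pairs[-1][0]  # guard against floating-point shortfall when q == 1


def bootstrap_quantile_se(rows, q=0.5, reps=100, seed=None, rescale=False):
    """rows: (stratum, psu, weight, y) tuples (hypothetical layout).

    Returns (standard error, list of replicate estimates).
    """
    rows = list(rows)
    rng = random.Random(seed)  # private generator, so the seed fixes the result

    # List the PSU values within each stratum.
    psus = defaultdict(set)
    for stratum, psu, _, _ in rows:
        psus[stratum].add(psu)
    psus = {s: sorted(ids) for s, ids in psus.items()}

    ys = [y for _, _, _, y in rows]
    estimates = []
    for _ in range(reps):
        freq, factor = {}, {}
        for s, ids in psus.items():
            n = len(ids)
            if n < 2:
                raise ValueError(f"stratum {s!r} has fewer than 2 PSUs")
            # Sample n-1 PSUs with replacement and count how often each appears.
            counts = Counter(rng.choices(ids, k=n - 1))
            for p in ids:
                freq[(s, p)] = counts[p]  # 0 for PSUs that were not drawn
            # Assumption: optional Rao-Wu style rescaling, off by default.
            factor[s] = n / (n - 1) if rescale else 1.0
        # New weight = sample weight * frequency (* optional rescaling factor).
        new_w = [w * freq[(s, p)] * factor[s] for s, p, w, _ in rows]
        estimates.append(weighted_quantile(ys, new_w, q))

    # Standard deviation of the replicate estimates = bootstrap standard error.
    return statistics.stdev(estimates), estimates


# Toy data, invented for illustration: 2 strata, 3 PSUs each, 2 elements per PSU.
sample_rows = [
    ("A", 1, 10.0, 3.1), ("A", 1, 10.0, 4.0),
    ("A", 2, 12.0, 2.5), ("A", 2, 12.0, 5.2),
    ("A", 3, 8.0, 6.0), ("A", 3, 8.0, 3.3),
    ("B", 1, 20.0, 7.5), ("B", 1, 20.0, 6.1),
    ("B", 2, 18.0, 5.9), ("B", 2, 18.0, 8.4),
    ("B", 3, 22.0, 4.7), ("B", 3, 22.0, 7.0),
]

se, replicates = bootstrap_quantile_se(sample_rows, q=0.5, reps=100, seed=42)
print("bootstrap SE of the median:", se)
```

Whether the n/(n-1) rescaling is wanted depends on the bootstrap variant; the sketch leaves it off so that it follows the steps exactly as listed.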

What do you think?  It's kind of general.  For stratified non-clustered samples, the 
selections would be done on sample elements, not on PSUs, and for non-stratified 
cluster designs, the PSU selections would be done across the whole sample, not 
by stratum.

I'm not that well up on bootstrapping, however.  I'm not sure how to set/save the seed 
values so running the procedure again on the same dataset will produce the same 
variance.
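
On reproducibility: in R, calling set.seed() with a fixed integer before the resampling is the usual way to get the same draws on a rerun. The small Python sketch below shows the same idea under stated assumptions (the function name, the PSU labels, and the seed value are arbitrary choices for illustration): give the procedure its own generator, seed it with a recorded value, and keep that value with the results.

```python
import random
from collections import Counter


def draw_psu_frequencies(psu_ids, seed):
    """Draw n-1 PSUs with replacement using a generator seeded with `seed`."""
    rng = random.Random(seed)  # independent of the global random state
    ids = sorted(psu_ids)      # fixed order, so the seed alone fixes the draws
    return Counter(rng.choices(ids, k=len(ids) - 1))


SEED = 20040412  # arbitrary; record it alongside the variance estimate

first = draw_psu_frequencies(["a", "b", "c", "d"], SEED)
second = draw_psu_frequencies(["a", "b", "c", "d"], SEED)
print(first == second)  # True: same seed, same draws
```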

Fred

Thomas Lumley <[EMAIL PROTECTED]> wrote:
On Mon, 12 Apr 2004, Fred Rohde wrote:

> Thanks. I'll update the survey package. Sudaan does the standard
> errors on quantiles using Taylor series. If I can hunt down the formula
> it uses, could you add that to svyquantile?

If I can bring myself to believe it. Computing standard errors for the
normal approximation to the median is not easy even in simple random
samples.

-thomas


> Fred
>
> Thomas Lumley wrote:
> On Mon, 12 Apr 2004, Fred Rohde wrote:
>
> > Hello,
> > Is there a way to get complex sample variances in the survey package on
> > summary statistics other than means? If not, can they be added to a
> > future version? It would be great to have them on totals, quantiles,
> > ratios, and tables (e.g. row percent, column percent, etc.).
> >
>
> svytotal() and svyratio() will do this for totals and ratios if you have a
> new enough version. At the moment the easiest way to get row or column
> percentages is to think of them as ratios of means of binary
> variables and use svyratio().
>
> Quantiles are more difficult, since neither Taylor series nor jackknife
> approaches work.
>
> -thomas
>
>


Thomas Lumley Assoc. Professor, Biostatistics
[EMAIL PROTECTED] University of Washington, Seattle

                

______________________________________________
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
