Re: [R] boot() with glm/gnm on a contingency table

Milan Bouchet-Valat Sun, 16 Sep 2012 12:05:06 -0700

On Wednesday, 12 September 2012 at 07:08 -0700, Tim Hesterberg wrote:
> One approach is to bootstrap the vector 1:n, where n is the number
> of individuals, with a function that does:
> f <- function(vectorOfIndices, theTable) {
>   (1) create a new table with the same dimensions, but with the counts
>   in the table based on vectorOfIndices.
>   (2) Calculate the statistics of interest on the new table.
> }
> 
> When f is called with 1:n, the table it creates should be the same
> as the original table.  When called with a bootstrap sample of
> values from 1:n, it should create a table corresponding to the
> bootstrap sample.
In case anybody is interested, I finally went with this approach; the
function described above is implemented below. The idea is to assign an
index to each observation, and to identify which cell the observation
comes from using the cumulative sum of the cell counts. Instead of
looping over all indices and incrementing the corresponding cell count
for each one, I decided to start from the original table, decrementing
the count for each index missing from the sample and incrementing it for
each duplicate. There are probably better implementations, but
performance seems good enough.


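# tab is a table object; indices is a vector of observation indices
# drawn from 1:sum(tab) (argument order matches what boot() expects)
f <- function(tab, indices) {
  # Cumulative counts of the ORIGINAL table: observation i belongs to
  # the first cell whose cumulative count is >= i
  cs <- cumsum(tab)

  # Remove observations that are missing from the sample
  for(i in setdiff(seq_len(sum(tab)), indices)) {
      index <- min(which(i <= cs))
      tab[index] <- tab[index] - 1
  }

  # Add one count for each extra copy of a duplicated observation
  for(i in indices[duplicated(indices)]) {
      index <- min(which(i <= cs))
      tab[index] <- tab[index] + 1
  }

  # Return the resampled table (a for loop on its own returns NULL)
  tab
}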
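
For comparison, the same table can probably be rebuilt in one vectorized
pass instead of two loops. The sketch below is not from the original
post: resample_table() is a hypothetical name, and it assumes a version
of R in which findInterval() accepts the left.open argument. It maps
each sampled index to its cell through the cumulative sum and then
counts the indices per cell.

```r
# Hypothetical vectorized variant (an assumption, not the code above):
# rebuild the table directly from the sampled observation indices.
resample_table <- function(tab, indices) {
  cs <- cumsum(tab)
  # Observation i belongs to cell k when cs[k-1] < i <= cs[k];
  # left.open = TRUE makes the intervals open on the left.
  cell <- findInterval(indices, c(0, cs), left.open = TRUE)
  # Count sampled observations per cell; tab[] keeps dim and dimnames
  tab[] <- tabulate(cell, nbins = length(tab))
  tab
}

# Small 2x2 example with one empty cell
tab <- as.table(matrix(c(3, 0, 2, 4), nrow = 2))
n <- sum(tab)

# Called with 1:n, the original table should come back unchanged
same_tab <- resample_table(tab, seq_len(n))

# Called with a bootstrap sample, the total count is preserved
set.seed(1)
boot_tab <- resample_table(tab, sample(n, replace = TRUE))
```

The first call is the sanity check suggested in the quoted message: with
the indices 1:n the result should equal the original table. Empty cells
stay empty, since no index can fall into a zero-width interval.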


Thanks for the pointers!

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
