Dear all,
Here is a little coding problem. It falls in the category of "how can I do this efficiently?" rather than "how can I do this?" (I know how to do it inefficiently). So, if you want to take the challenge, keep reading, otherwise just skip to the next post - I won't be offended by that ;-) I am coding a particle filter and I want to use systematic resampling to "regenerate" particles. Basically, what this means is the following. 1. Start with a vector of "labels", x, say, of length N and a vector of probabilities, w, say - w[i] is the probability attached to x[i]. 2. Draw N equally spaced random number in (0,1) as u = (1 : N + runif(1)) / N. 3. Define a new sample of "labels", y, say, by inverting the cdf defined by the probabilities w. That is, set y[i] = x[k] if cumsum(w)[k-1] <= u[i] < cumsum(w)[k], i=1, ..., N. The following is a piece of R code that implements the procedure. > ### Systematic resampling > set.seed(2) > x <- LETTERS[1 : 6] # labels > w <- rexp(6) > w <- w / sum(w) # probabilities > W <- c(0, cumsum(w)) # cdf > u <- (1 : 6 + runif(1)) / 6 > indNew <- sapply(seq_along(u), + function(i) (sum(W[i] <= u & u < W[i + 1]))) > (y <- rep(x, indNew)) [1] "A" "B" "D" "D" "F" > ## graphically... > plot(stepfun(seq_along(u), W), xaxt = "n") > axis(1, at = seq_along(u), x) > abline(h = u, lty = "dotted") The question is the following. I believe the way I compute "newInd" is of order N^2. Is there a smart way of doing it which is of order N? (I know there is one, I just cannot find it...) Thanks in advance, Giovanni ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.