Dear R-help list,

Part of a program I wrote seems to take a significant amount of time, therefore I am looking for an alternative approach. To explain what it does:
- the input is a sorted vector of integer numbers
- some higher numbers may be derived (using a mathematical formula) from lower numbers, therefore they should be eliminated
- at the end, the vector should contain only uniquely defined numbers

Pet hypothetical example, input vector:
- 2 3 4 5 6 7 8 9 10
- number 2 generates 4, 7, 10
- 2 3 5 6 8 9 (surviving vector)
- number 3 generates 5 and 9
- 2 3 6 8 (surviving vector)
- number 6 generates 8
- final surviving vector 2 3 6

Function foo(x, ...) generates the numbers, my current approach being:

####
index <- 0
# the last element is never used as a generator: nothing higher is left to eliminate
while ((index <- index + 1) < length(numbers)) {
    numbers <- setdiff(numbers, foo(numbers[index]))
}
####

This seems to take quite some time (but I don't know any other way of doing it), hence my question(s):
- would there be another (quicker) implementation in R?
- alternatively, should I go for a C implementation? (actually, I did create a C implementation, but it doesn't bring any more speed... it is actually a bit slower).

A real-life pet example, using the function findSubsets() from the QCA package (our foo function above):

####
library(QCA)

testfoo <- function(x, y) {
    index <- 0
    while ((index <- index + 1) < length(x)) {
        # drop every number that x[index] generates
        x <- setdiff(x, findSubsets(y, x[index], max(x)))
    }
    return(x)
}

nofl <- rep(3, 14)
set.seed(12345)
numbers <- sort(sample(seq(prod(nofl)), 1000000))
system.time(result <- testfoo(numbers, nofl))
####

   user  system elapsed
  8.168   2.049  10.148

Any hint will be highly appreciated, thanks in advance,
Adrian

--
Adrian Dusa
Romanian Social Data Archive
1, Schitu Magureanu Bd.
050025 Bucharest sector 5
Romania
Tel.:+40 21 3126618 \
     +40 21 3120210 / int.101
Fax: +40 21 3158391
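
One possible alternative to the setdiff() loop described in the post is a sieve-style pass that flags eliminated values in a logical lookup vector instead of rebuilding the whole vector at every step. The sketch below is only an illustration under stated assumptions: toy_foo() and sieve_eliminate() are hypothetical names, toy_foo() merely reproduces the pet example (2 generates 4, 7, 10; 3 generates 5, 9; 6 generates 8), the input is a non-empty vector of positive integers, and a number only ever generates higher numbers. Whether this is actually faster than the original loop with findSubsets() would have to be measured with system.time().

```r
# Hypothetical generator that reproduces the pet example from the post;
# a real generator (such as findSubsets() in the post) would take its place.
toy_foo <- function(n) {
  generated <- list("2" = c(4L, 7L, 10L), "3" = c(5L, 9L), "6" = 8L)
  out <- generated[[as.character(n)]]
  if (is.null(out)) integer(0) else out
}

# Sieve-style elimination (a sketch, not the original method).
# Assumes `numbers` is a non-empty vector of positive integers and that
# gen(n) is only used to eliminate values higher than n.
sieve_eliminate <- function(numbers, gen) {
  top <- max(numbers)
  alive <- logical(top)        # alive[v] is TRUE while value v survives
  alive[numbers] <- TRUE
  for (n in numbers) {
    if (!alive[n]) next        # eliminated values do not generate anything
    g <- gen(n)
    g <- g[g > n & g <= top]   # keep only higher values within range
    alive[g] <- FALSE          # flag them instead of calling setdiff()
  }
  numbers[alive[numbers]]
}

result <- sieve_eliminate(2:10, toy_foo)
print(result)                  # survivors for the pet example: 2 3 6
```

The lookup vector has one entry per possible value up to max(numbers), so each check or elimination is a direct index operation rather than a set comparison over the whole surviving vector.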