Re: [Rd] In C, a fast way to slice a vector?

Patrick Aboyoun Sun, 10 May 2009 21:39:25 -0700

Saptarshi,

I know of two alternatives you can use to do fast extraction of consecutive subsequences of a vector:


1) Fast copy:  The method you mentioned of creating a memcpy'd vector

2) Pointer management: Creating an externalptr object in R and managing the start and end of your data
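
To make the difference between the two alternatives concrete, here is a minimal sketch in plain C. It works on ordinary byte buffers rather than R objects, and the names slice_copy, slice_view and byte_view are made up for illustration; they are not part of R, IRanges or Biostrings. In an R extension one would presumably allocate the result with allocVector(RAWSXP, n) and copy from RAW(sr) + start instead of calling malloc.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Approach 1: fast copy. Allocate a new buffer and memcpy the tail
 * src[start .. n-1] into it. Cost is O(n - start) in time and memory.
 * Returns NULL if start is out of range or allocation fails;
 * the caller frees the result. */
unsigned char *slice_copy(const unsigned char *src, size_t n,
                          size_t start, size_t *out_len)
{
    if (start > n)
        return NULL;
    size_t len = n - start;
    unsigned char *dst = malloc(len ? len : 1);  /* avoid malloc(0) */
    if (dst == NULL)
        return NULL;
    memcpy(dst, src + start, len);
    *out_len = len;
    return dst;
}

/* Approach 2: pointer management. A view records where the slice starts
 * and how long it is; no bytes are copied, so creating it is O(1).
 * The view is only valid while the underlying buffer stays alive. */
typedef struct {
    const unsigned char *data;  /* first byte of the slice */
    size_t length;              /* number of bytes in the slice */
} byte_view;

byte_view slice_view(const unsigned char *src, size_t n, size_t start)
{
    assert(start <= n);
    byte_view v = { src + start, n - start };
    return v;
}
```

The copy owns its bytes and can outlive the source, at the price of touching every byte once. The view is essentially free to create but shares storage with the source, so whoever holds it has to keep the source alive for as long as the view is in use.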

If you are looking for a prototyping environment to try, I recommend using the IRanges and Biostrings packages from the Bioconductor project. The IRanges package contains a function called subseq for performing 1) on all basic vector types (raw, logical, integer, etc.), and the Biostrings package contains a subseq method on an externalptr-based class that implements 2).

I was going to lobby R core members quietly about adding something akin to subseq from IRanges into base R, since it is extremely useful for all long vectors and could replace all a:b calls with a <= b in R code, but this publicity can't hurt.


Here is an example:

source("http://bioconductor.org/biocLite.R")
biocLite(c("IRanges", "Biostrings"))

<< download output omitted >>

suppressMessages(library(Biostrings))
x <- rep(charToRaw("a"), 1e7)
y <- BString(rawToChar(x))
system.time(x[13:1e7])

   user  system elapsed
  0.304   0.073   0.378

system.time(subseq(x, 13))

   user  system elapsed
  0.011   0.007   0.019

system.time(subseq(y, 13))

   user  system elapsed
  0.003   0.000   0.004

identical(x[13:1e7], subseq(x, 13))

[1] TRUE

identical(x[13:1e7], charToRaw(as.character(subseq(y, 13))))

[1] TRUE

sessionInfo()

R version 2.10.0 Under development (unstable) (2009-05-08 r48504)
i386-apple-darwin9.6.0

locale:
[1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] Biostrings_2.13.5 IRanges_1.3.5

loaded via a namespace (and not attached):
[1] Biobase_2.5.2



Quoting Saptarshi Guha <saptarshi.g...@gmail.com>:

Hello,
Suppose in the following code,
PROTECT(sr = R_tryEval( .... ))

sr is a RAWSXP vector. I wish to return another RAWSXP starting at
position 13 onwards (base=0).

I could create another RAWSXP of the correct length and then memcpy
the required bytes and length to this new one.

However is there a more efficient method?

Regards
Saptarshi Guha

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
