Ah wait, my bad (as always T.T), I found a much simpler explanation:

colset <- sample(3e7-nr, 1e7)
storage.mode(colset)
[1] "integer"
storage.mode(colset-1)
[1] "double"

So when I was unwrapping colset, Rcpp allocated new memory to convert it from double to integer, and that memory was no longer valid once the converted vector went out of scope. I think it is a bit dangerous that you never know whether you are allocating memory or just wrapping R objects when parsing arguments in Rcpp. Is there a way of ensuring that NOTHING gets copied when parsing arguments? Can you throw an exception if the type you try to cast to is not the one you expect? As you might imagine, this is important with large datasets.

Sorry for bothering and thanks again,
Ale

On Wed, Feb 12, 2014 at 1:10 PM, Dirk Eddelbuettel <e...@debian.org> wrote:
>
> On 12 February 2014 at 11:47, Alessandro Mammana wrote:
> | Ok I was able to find the code causing the bug. So it looks like the
>
> Thanks for the added detail.
>
> | pointers you get from an Rcpp::Vector using .begin() become invalid
> | after that the Rcpp::Vector goes out of scope (and this makes sense),
> | what I do not understand is that this Rcpp::Vector was allocated in R
> | and should still be "living" during the execution of the Rcpp call
> | (that's why I wasn't expecting the pointer to be invalid).
> |
> | This is the exact code (the one above is probably fine):
> | @@@@@@@@@@@@@@ in CPP @@@@@@@@@@@@@@i
> |
> | struct GapMat {
> |     int* ptr;
> |     int* colset;
> |     int nrow;
> |     int ncol;
> |
> |
> |     inline int* colptr(int col){
> |         return ptr + colset[col];
> |     }
> |
> |     GapMat(){}
> |
> |     GapMat(int* _ptr, int* _colset, int _nrow, int _ncol):
> |         ptr(_ptr), colset(_colset), nrow(_nrow), ncol(_ncol){}
> | };
> |
> |
> | GapMat getGapMat(Rcpp::List gapmat){
> |     IntegerVector vec = gapmat["vec"];
> |     IntegerVector pos = gapmat["colset"];
> |     int nrow = gapmat["nrow"];
> |
> |     return GapMat(vec.begin(), pos.begin(), nrow, pos.length());
> | }
> |
> | // [[Rcpp::export]]
> | IntegerVector colSumsGapMat(Rcpp::List gapmat){
> |
> |     GapMat mat = getGapMat(gapmat);
> |     IntegerVector res(mat.ncol);
> |
> |     for (int i = 0; i < mat.ncol; ++i){
> |         for (int j = 0; j < mat.nrow; ++j){
> |             res[i] += mat.colptr(i)[j];
> |         }
> |     }
> |
> |     return res;
> | }
> |
> | @@@@@@@@@@@@@@ in R (with gdb debugger as suggested by Dirk) @@@@@@@@@@@@@@i
> | library(Rcpp)
> | sourceCpp("scratchpad.cpp")
> |
> | vec <- rnbinom(3e7, mu=0.1, size=1); storage.mode(vec) <- "integer"
> | nr <- 80
> |
> | colset <- sample(3e7-nr, 1e7)
> | foo <- vec[colset] #this is only to trigger some obscure garbage collection mechanisms...
> |
> | for (i in 1:10){
> |     colset <- sample(3e7-nr, 1e7)
> |     gapmat <- list(vec=vec, nrow=nr, colset=colset-1)
> |     cs <- colSumsGapMat(gapmat)
> |     print(sum(cs))
> | }
> |
> | [1] 80000000
> | [1] 80000000
> | [1] 80016890
> | [1] 80008144
> | [1] 80016022
> | [1] 80021609
> |
> | Program received signal SIGSEGV, Segmentation fault.
> | 0x00007ffff18a5455 in GapMat::colptr (this=0x7fffffffc120, col=0) at scratchpad.cpp:295
> | 295         return ptr + colset[col];
> |
> | @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
> |
> | Why did it happen? What should I do to make sure that my pointers
> | remain valid?
> | My goal is to convert safely some vectors or matrices
> | that "exist" in R to some pointers, how can I do that?
>
> Not sure. It looks fine at first instance. But then it's early in the morning
> and I had very little coffee yet...
>
> Maybe the fact that you tickle the gc() via vec[colset] has something to do
> with it, maybe it has not. Maybe I would try the decomposition of the List
> object inside the colSumsGapMat() function to keep it simpler. Or if you
> _really_ want an external object to iterate over, memcpy it out.
>
> With really large object, you may be stressing parts of the code that have
> not been stressed the same way. If it breaks, you do get to keep both pieces.
>
> Dirk
>
> --
> Dirk Eddelbuettel | e...@debian.org | http://dirk.eddelbuettel.com

--
Alessandro Mammana, PhD Student
Max Planck Institute for Molecular Genetics
Ihnestraße 63-73
D-14195 Berlin, Germany
_______________________________________________
Rcpp-devel mailing list
Rcpp-devel@lists.r-forge.r-project.org
https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
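
On the question raised at the top of this message (throwing instead of silently converting), here is a minimal standalone C++ sketch of the idea. It is a toy model and not Rcpp code: `ToyVector`, `LenientIntView`, `strictIntPointer`, and `exampleUsage` are hypothetical names invented for illustration. In real Rcpp code the analogous guard would presumably be a check of `TYPEOF(x)` against `INTSXP` before constructing the `IntegerVector`; that part is an assumption and is not shown or tested here.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Toy stand-in for an R vector (hypothetical, NOT Rcpp's representation):
// the data lives either in `ints` or in `reals`, as recorded by `kind`.
struct ToyVector {
    enum Kind { Integer, Double };
    Kind kind;
    std::vector<int> ints;
    std::vector<double> reals;
};

// Lenient view: always hands out an int*, converting if necessary.
// When the source is stored as double it has to allocate its own int buffer,
// so the pointer is only valid for as long as this view object is alive.
class LenientIntView {
public:
    explicit LenientIntView(ToyVector& v) : ptr_(0), copied_(false) {
        if (v.kind == ToyVector::Integer) {
            ptr_ = v.ints.data();      // wraps the caller's storage, no copy
        } else {
            owned_.reserve(v.reals.size());
            for (std::size_t i = 0; i < v.reals.size(); ++i)
                owned_.push_back(static_cast<int>(v.reals[i]));
            ptr_ = owned_.data();      // points into the view's own copy
            copied_ = true;
        }
    }
    int* begin() { return ptr_; }
    bool copied() const { return copied_; }

private:
    std::vector<int> owned_;
    int* ptr_;
    bool copied_;
};

// Strict accessor: never converts and never allocates. It either returns a
// pointer into the caller's own storage or throws.
inline int* strictIntPointer(ToyVector& v, const std::string& name) {
    if (v.kind != ToyVector::Integer)
        throw std::invalid_argument("'" + name + "' is not stored as integer");
    return v.ints.data();
}

// Small usage example with the two cases from the R session above.
inline void exampleUsage() {
    ToyVector colset;                  // models `colset` (integer storage)
    colset.kind = ToyVector::Integer;
    colset.ints.push_back(3);
    colset.ints.push_back(7);
    assert(strictIntPointer(colset, "colset") == colset.ints.data());

    ToyVector shifted;                 // models `colset-1` (double storage)
    shifted.kind = ToyVector::Double;
    shifted.reals.push_back(2.0);
    shifted.reals.push_back(6.0);

    LenientIntView view(shifted);      // silently allocates a converted copy
    assert(view.copied());
    assert(view.begin()[0] == 2);

    bool threw = false;                // the strict accessor refuses instead
    try {
        strictIntPointer(shifted, "colset");
    } catch (const std::invalid_argument&) {
        threw = true;
    }
    assert(threw);
}
```

In this model, a pointer taken from a `LenientIntView` that had to convert dangles as soon as the view is destroyed, which mirrors what happened with `pos.begin()` in `getGapMat`. On the R side, subtracting the integer literal `1L` instead of `1` should keep `colset` in integer storage, so no conversion is needed in the first place.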