[Rcpp-devel] Segfault, is it because of iterators/pointers?

2014-02-11 Thread Alessandro Mammana
Hi all,
I got another segfault using Rcpp. It is very difficult to understand
where it happens and to reduce it to a minimal example, so for now I
am not posting very precise code here, but I have a suspicion. Maybe
you could help by telling me whether my suspicion is right.

I am doing something similar:

in a .cpp file:
@@@
#include <Rcpp.h>
using namespace Rcpp;

// Non-owning view: column i starts at ptr + colset[i].
struct GapMat {
    int* ptr;     // start of the data vector
    int* colset;  // offset of each column into ptr
    int nrow;
    int ncol;

    inline int* colptr(int col){
        return ptr + colset[col];
    }

    GapMat(){}

    GapMat(int* _ptr, int* _colset, int _nrow, int _ncol):
        ptr(_ptr), colset(_colset), nrow(_nrow), ncol(_ncol){}
};

// [[Rcpp::export]]
IntegerVector colSumsGapMat(Rcpp::IntegerVector vec,
                            Rcpp::IntegerVector pos, int nrow){
    GapMat mat(vec.begin(), pos.begin(), nrow, pos.length());
    IntegerVector res(pos.length());

    // Sum the nrow entries of each column.
    for (int i = 0; i < pos.length(); ++i){
        for (int j = 0; j < nrow; ++j){
            res[i] += mat.colptr(i)[j];
        }
    }

    return res;
}
@@@

from R:

vec <- a very big integer vector
nrow <- 80
pos <- a very big subset of positions, such that max(pos) + nrow < length(vec)
colsums <- colSumsGapMat(vec, pos, nrow)


From time to time I get a segfault.
Note: this is not exactly the code that produces the segfault (because
that one is very complicated), so it might be that this code is
totally fine.

My suspicion:

I am using the pointer "vec.begin()", but then I am allocating new
memory in the R area of memory with "IntegerVector res(pos.length())"
and R decides to move the original values of "vec" to some other
place, making the pointer invalid.

Is that possible?

Sorry for being very vague and thanks in advance!
Ale

-- 
Alessandro Mammana, PhD Student
Max Planck Institute for Molecular Genetics
Ihnestraße 63-73
D-14195 Berlin, Germany
___
Rcpp-devel mailing list
Rcpp-devel@lists.r-forge.r-project.org
https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel


Re: [Rcpp-devel] Segfault, is it because of iterators/pointers?

2014-02-11 Thread Dirk Eddelbuettel

In essence: "Yes"

On 11 February 2014 at 15:18, Alessandro Mammana wrote:
| Hi all,
| I got another segfault using Rcpp. It is very difficult to understand
| where it happens and to reduce it to a minimal example, so for now I
| am not posting very precise code here, but I have a suspicion, maybe
| you could help me saying if my suspect is right.

Use a debugger:

R -d gdb

and then proceed as normal. 

Make sure your compiler flags include -g as well.

Dirk


-- 
Dirk Eddelbuettel | e...@debian.org | http://dirk.eddelbuettel.com


Re: [Rcpp-devel] Segfault, is it because of iterators/pointers?

2014-02-12 Thread Alessandro Mammana
Ok, I was able to find the code causing the bug. It looks like the
pointers you get from an Rcpp::Vector using .begin() become invalid
after the Rcpp::Vector goes out of scope (and this makes sense).
What I do not understand is that this Rcpp::Vector was allocated in R
and should still be "living" during the execution of the Rcpp call
(that's why I wasn't expecting the pointer to be invalid).

This is the exact code (the one above is probably fine):
@@@ in CPP @@@

#include <Rcpp.h>
using namespace Rcpp;

// Non-owning view: column i starts at ptr + colset[i].
struct GapMat {
    int* ptr;
    int* colset;
    int nrow;
    int ncol;

    inline int* colptr(int col){
        return ptr + colset[col];
    }

    GapMat(){}

    GapMat(int* _ptr, int* _colset, int _nrow, int _ncol):
        ptr(_ptr), colset(_colset), nrow(_nrow), ncol(_ncol){}
};

GapMat getGapMat(Rcpp::List gapmat){
    // vec and pos are local wrappers; the returned GapMat keeps only
    // the raw pointers obtained from them.
    IntegerVector vec = gapmat["vec"];
    IntegerVector pos = gapmat["colset"];
    int nrow = gapmat["nrow"];

    return GapMat(vec.begin(), pos.begin(), nrow, pos.length());
}

// [[Rcpp::export]]
IntegerVector colSumsGapMat(Rcpp::List gapmat){

    GapMat mat = getGapMat(gapmat);
    IntegerVector res(mat.ncol);

    for (int i = 0; i < mat.ncol; ++i){
        for (int j = 0; j < mat.nrow; ++j){
            res[i] += mat.colptr(i)[j];
        }
    }

    return res;
}

@@@ in R (with gdb debugger as suggested by Dirk) @@@
library(Rcpp)
sourceCpp("scratchpad.cpp")

vec <- rnbinom(3e7, mu=0.1, size=1); storage.mode(vec) <- "integer"
nr <- 80

colset <- sample(3e7-nr, 1e7)
foo <- vec[colset] # this is only to trigger some obscure garbage collection mechanisms...

for (i in 1:10){
    colset <- sample(3e7-nr, 1e7)
    gapmat <- list(vec=vec, nrow=nr, colset=colset-1)
    cs <- colSumsGapMat(gapmat)
    print(sum(cs))
}

[1] 8000
[1] 8000
[1] 80016890
[1] 80008144
[1] 80016022
[1] 80021609

Program received signal SIGSEGV, Segmentation fault.
0x718a5455 in GapMat::colptr (this=0x7fffc120, col=0) at scratchpad.cpp:295
295     return ptr + colset[col];

@@@

Why did it happen? What should I do to make sure that my pointers
remain valid? My goal is to convert safely some vectors or matrices
that "exist" in R to some pointers, how can I do that?

Thanks a lot for your help

Ale


Re: [Rcpp-devel] Segfault, is it because of iterators/pointers?

2014-02-12 Thread Dirk Eddelbuettel

On 12 February 2014 at 11:47, Alessandro Mammana wrote:
| Ok I was able to find the code causing the bug. So it looks like the

Thanks for the added detail.

| pointers you get from an Rcpp::Vector using .begin() become invalid
| after that the Rcpp::Vector goes out of scope (and this makes sense),
| what I do not understand is that this Rcpp::Vector was allocated in R
| and should still be "living" during the execution of the Rcpp call
| (that's why I wasn't expecting the pointer to be invalid).
| 
| Why did it happen? What should I do to make sure that my pointers
| remain valid? My goal is to convert safely some vectors or matrices
| that "exist" in R to some pointers, how can I do that?

Not sure. It looks fine at first glance. But then it's early in the morning
and I have had very little coffee yet...

Maybe the fact that you tickle the gc() via vec[colset] has something to do
with it, maybe it has not.  I would try decomposing the List object inside
the colSumsGapMat() function to keep it simpler.  Or if you _really_ want an
external object to iterate over, memcpy it out.

With really large objects, you may be stressing parts of the code that have
not been stressed the same way.  If it breaks, you do get to keep both pieces.

Dirk
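
The "memcpy it out" suggestion could look roughly like the sketch below. It
is plain standard C++ rather than Rcpp, so it compiles on its own, and the
names OwnedGapMat, colSums and makeFromTemporaries are hypothetical (they do
not appear in the thread). The idea is only that the struct owns copies of
both buffers, so it no longer matters when the source vectors go away.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical owning variant of GapMat: the data are copied in, so no
// pointer into memory managed by someone else is kept.
struct OwnedGapMat {
    std::vector<int> values;  // copy of the data vector
    std::vector<int> colset;  // copy of the 0-based column offsets
    int nrow;

    OwnedGapMat(const int* ptr, std::size_t len,
                const int* cols, std::size_t ncol, int nrow_)
        : values(len), colset(ncol), nrow(nrow_) {
        // int is trivially copyable, so memcpy is fine; skip empty inputs.
        if (len > 0)  std::memcpy(values.data(), ptr, len * sizeof(int));
        if (ncol > 0) std::memcpy(colset.data(), cols, ncol * sizeof(int));
    }

    const int* colptr(std::size_t col) const {
        assert(col < colset.size());
        return values.data() + colset[col];
    }
};

// Sum the nrow entries of every column, reading only the owned copies.
std::vector<long long> colSums(const OwnedGapMat& m) {
    std::vector<long long> res(m.colset.size(), 0);
    for (std::size_t i = 0; i < m.colset.size(); ++i)
        for (int j = 0; j < m.nrow; ++j)
            res[i] += m.colptr(i)[j];
    return res;
}

// Mirrors the shape of getGapMat(): the source buffers are locals that
// are destroyed at return, which is harmless because they were copied.
OwnedGapMat makeFromTemporaries() {
    std::vector<int> vec = {1, 2, 3, 4, 5, 6};
    std::vector<int> pos = {0, 2, 4};
    return OwnedGapMat(vec.data(), vec.size(), pos.data(), pos.size(), 2);
}
```

With Rcpp one would presumably pass vec.begin() and vec.size() of an
IntegerVector to such a constructor. The price is a full copy of the data,
which for vectors of the size used in this thread is not negligible.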



Re: [Rcpp-devel] Segfault, is it because of iterators/pointers?

2014-02-12 Thread Alessandro Mammana
Ah wait, my bad (as always T.T), I found a much simpler explanation:

colset <- sample(3e7-nr, 1e7)
storage.mode(colset)
[1] "integer"
storage.mode(colset-1)
[1] "double"

So when I was unwrapping colset I allocated new memory in Rcpp to
convert from double to integer, which was no longer valid when I went
out of scope.
I think it is a bit dangerous that you never know if you are
allocating memory or just wrapping R objects when parsing arguments in
Rcpp.
Is there a way of ensuring that NOTHING gets copied when parsing
arguments? Can you throw an exception if the type you try to cast to
is not the one you expect?
 You might imagine that with large datasets this is important.

Sorry for bothering and thanks again,
Ale
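
The mechanism described above can be modelled in plain standard C++, without
Rcpp. In the sketch below, IntView is a hypothetical stand-in for an integer
vector wrapper: given integer storage it shares it, given double storage it
allocates a converted copy that lives only as long as some IntView refers to
it. SafeGapMat, getSafeGapMat and sumColumn are likewise made-up names. The
point is that the struct stores the wrappers themselves instead of raw
pointers, so the converted copy stays alive after the helper returns.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

typedef std::shared_ptr<std::vector<int> >    IntBuf;
typedef std::shared_ptr<std::vector<double> > DblBuf;

// Hypothetical stand-in for an integer vector wrapper.
class IntView {
    IntBuf buf_;
    bool copied_;
public:
    // Same type: share the existing storage, nothing is copied.
    explicit IntView(const IntBuf& x) : buf_(x), copied_(false) {}

    // Different type: allocate new storage and convert element by element.
    explicit IntView(const DblBuf& x)
        : buf_(std::make_shared<std::vector<int> >()), copied_(true) {
        buf_->reserve(x->size());
        for (std::size_t i = 0; i < x->size(); ++i)
            buf_->push_back(static_cast<int>((*x)[i]));
    }

    const int* begin() const { return buf_->data(); }
    std::size_t size() const { return buf_->size(); }
    bool copied() const { return copied_; }
};

// Stores the views, not raw pointers, so both buffers outlive the helper.
struct SafeGapMat {
    IntView vec;
    IntView colset;
    int nrow;

    SafeGapMat(const IntView& v, const IntView& c, int n)
        : vec(v), colset(c), nrow(n) {}

    std::size_t ncol() const { return colset.size(); }

    const int* colptr(std::size_t col) const {
        assert(col < colset.size());
        return vec.begin() + colset.begin()[col];
    }
};

// Mirrors getGapMat(): colset arrives as double, as colset-1 does in R.
SafeGapMat getSafeGapMat(const IntBuf& vec, const DblBuf& colset, int nrow) {
    IntView v(vec);       // shares storage
    IntView pos(colset);  // converted copy, owned by the view
    return SafeGapMat(v, pos, nrow);
}

long long sumColumn(const SafeGapMat& m, std::size_t col) {
    long long s = 0;
    for (int j = 0; j < m.nrow; ++j) s += m.colptr(col)[j];
    return s;
}
```

In Rcpp terms the analogous change would presumably be to give GapMat
IntegerVector members instead of int* members; that is an assumption about
how one might apply the idea, not something proposed in the thread.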




Re: [Rcpp-devel] Segfault, is it because of iterators/pointers?

2014-02-12 Thread Dirk Eddelbuettel

On 12 February 2014 at 13:36, Alessandro Mammana wrote:
| Ah wait, my bad (as always T.T), I found a much simpler explanation:

Isn't it lovely when persistence pays off?  ;-)
 
| colset <- sample(3e7-nr, 1e7)
| storage.mode(colset)
| [1] "integer"
| storage.mode(colset-1)
| [1] "double"
| 
| So when I was unwrapping colset I allocated new memory in Rcpp to
| convert from double to integer, which was no longer valid when I went
| out of scope.

Well that is sort-of a known issue. Look for discussions of clone() in the
archive.

| I think it is a bit dangerous that you never know if you are
| allocating memory or just wrapping R objects when parsing arguments in
| Rcpp.
| Is there a way of ensuring that NOTHING gets copied when parsing
| arguments? Can you throw an exception if the type you try to cast to
| is not the one you expect?

If you don't require an (implicit) cast and you don't use clone(), nothing
gets copied.  That's how proxy objects around SEXP work.

|  You might imagine that with large datasets this is important.

You can also use XPtr, and XPtr in combination with bigmemory's big.matrix,
to keep data away from R.

Dirk
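
The proxy-versus-clone distinction can be illustrated with a small model in
plain standard C++. Proxy below is a hypothetical class, not Rcpp's actual
implementation: copying it shares the underlying data, and only its clone()
member makes a deep copy, which is similar in spirit to what Rcpp::clone()
is for.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical model of a proxy object: a thin handle onto shared data.
class Proxy {
    std::shared_ptr<std::vector<int> > data_;
public:
    explicit Proxy(const std::shared_ptr<std::vector<int> >& d) : data_(d) {}

    int& operator[](std::size_t i) { return (*data_)[i]; }
    int operator[](std::size_t i) const { return (*data_)[i]; }
    const int* begin() const { return data_->data(); }

    // Deep copy: the result refers to freshly allocated storage.
    Proxy clone() const {
        return Proxy(std::make_shared<std::vector<int> >(*data_));
    }
};

// Writes through a copy of the handle; the caller's data change too,
// because copying a Proxy copies only the handle.
void setFirst(Proxy p, int value) {
    p[0] = value;
}
```

Calling setFirst(a, 10) on a Proxy a changes a[0], while a.clone() taken
beforehand keeps the old value and has a different begin() address.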



Re: [Rcpp-devel] Segfault, is it because of iterators/pointers?

2014-02-12 Thread Romain Francois

On 12 Feb 2014, at 13:36, Alessandro Mammana wrote:

> Ah wait, my bad (as always T.T), I found a much simpler explanation:
> 
> colset <- sample(3e7-nr, 1e7)
> storage.mode(colset)
> [1] "integer"
> storage.mode(colset-1)
> [1] "double"
> 
> So when I was unwrapping colset I allocated new memory in Rcpp to
> convert from double to integer, which was no longer valid when I went
> out of scope.
> I think it is a bit dangerous that you never know if you are
> allocating memory or just wrapping R objects when parsing arguments in
> Rcpp.
> Is there a way of ensuring that NOTHING gets copied when parsing
> arguments? Can you throw an exception if the type you try to cast to
> is not the one you expect?
> You might imagine that with large datasets this is important.

Silent coercion was added by design. Rcpp does not give you a « strict » mode. 

One thing you can do is something like this: 

#include <Rcpp.h>
using namespace Rcpp ;

// Wrapper that refuses to coerce: it stops unless x already has T's R type.
template <typename T>
class Strict : public T {
public:
  Strict( SEXP x ) {
    if( TYPEOF(x) != T::r_type::value )
      stop( "not compatible" ) ;
    T::Storage::set__(x) ;
  }

} ;

// [[Rcpp::export]]
int foo( Strict<NumericVector> v ){
  return v.size() ;
}

You’d get e.g. 

> foo(rnorm(10))
[1] 10

> foo(1:10)
Error in eval(expr, envir, enclos) : not compatible
Calls: sourceCpp ... source -> withVisible -> eval -> eval -> foo -> 
Execution halted




Re: [Rcpp-devel] Segfault, is it because of iterators/pointers?

2014-02-12 Thread Alessandro Mammana
I like the "Strict mode" idea, I will use it, thanks!
