Dear All,

I am confused about creating Rcpp Numeric Matrices larger than .Machine$integer.max. The code below illustrates some of the points (probably with too much detail ;-).

These are some things that puzzle me:

1. For some values of the number of rows and columns, creating the matrix is not allowed, with the message "negative length vectors are not allowed", but with other values the creation of the matrix proceeds without (apparent) trouble, even when the total size is >> 2^31 - 1.

   1.a. Is this intended?

   1.b. I understand the error message is coming from R (not Rcpp), so this is not something that can be made easier to understand?

2. The part I found confusing is that the same problem (number of cells > 2^32 - 1) is sometimes caught at object creation, but sometimes manifests itself much later (either in the C++ code or later in R). I was expecting (maybe the problem is my expectations) an error early on, when creating the matrix; if the creation proceeds without trouble, I was not expecting a segfault (as I think all cells are initialized to zero).

Is the recommended procedure to check if the product of dimensions is < 2^31 - 1 before creation? (But then, this will change in R-3.0 on 64-bit systems?)

Best,

R.


// Beginning of file max-size.cpp

#include <Rcpp.h>

using namespace Rcpp;

// [[Rcpp::export]]
NumericMatrix f1(IntegerVector nr, IntegerVector nc, IntegerVector sf = 0) {
  int nrow = as<int>(nr);
  int ncol = as<int>(nc);
  int segf = as<int>(sf);  // non-zero: also run the extra writes at the end

  NumericMatrix outM(nrow, ncol);
  std::cout << " After creating outM" << std::endl;

  outM(nrow - 1, 0) = 1;
  std::cout << " After assigning to last row, first column" << std::endl;

  std::cout << " Some other value: 1, 0: " << outM(1, 0) << std::endl;

  if( (nrow > 1) && (ncol > 3) )
    std::cout << " Some other value: nrow - 1, ncol - 3: "
              << outM(nrow - 1, ncol - 3) << std::endl;

  outM(nrow - 1, ncol - 1) = 1;
  std::cout << " After assigning something to last cell" << std::endl;

  std::cout << " Try to return the last assignment: "
            << outM(nrow - 1, ncol - 1) << std::endl;

  if((nrow >= 500000) && segf) {
    std::cout << "\n Assign a few around/beyond 2^32 - 1. Should segfault\n";
    for(int i = 4290; i < 4300; ++i) {
      std::cout << " i = " << i << std::endl;
      outM(nrow - 1, i) = 0;
    }
  }

  return wrap(outM);
}

// End of file max-size.cpp

################################################

library(Rcpp)
sourceCpp("max-size.cpp", verbose = TRUE)

(tmp <- f1(4, 5))

4294967 * 500 > .Machine$integer.max
tmp <- f1(4294967, 500)
object.size(tmp)/(4294967 * 500) ## ~ 8

4294967 * 501 > .Machine$integer.max
tmp <- f1(4294967, 501) ## negative length vectors

500000 * 9000 > .Machine$integer.max
tmp <- f1(500000, 9000) ## sometimes segfaults
tmp[500000, 9000]
object.size(tmp) ## things are missing
prod(dim(tmp)) > .Machine$integer.max

## using either of these usually leads to segfault
for(i in (4290:4300)) print(tmp[500000, i])
f1(500000, 9000, 1)

#####################################################

-- 
Ramon Diaz-Uriarte
Department of Biochemistry, Lab B-25
Facultad de Medicina
Universidad Autónoma de Madrid
Arzobispo Morcillo, 4
28029 Madrid
Spain

Phone: +34-91-497-2412

Email: rdia...@gmail.com
       ramon.d...@iib.uam.es

http://ligarto.org/rdiaz
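
P.S. In case it helps the discussion, here is a small plain C++ sketch (no Rcpp needed) of the arithmetic behind the three calls, and of the kind of pre-creation check I am asking about. It rests on an assumption I have not verified in the Rcpp or R sources: that somewhere the number of cells is computed as a 32-bit signed int product that wraps modulo 2^32. The function names (exact_cells, wrapped_cells, fits_in_int) are just made up for this illustration; they are not part of Rcpp or R.

```cpp
#include <cstdint>
#include <limits>

// 2^32, the modulus of a 32-bit wrap-around.
constexpr std::int64_t two32 = 4294967296LL;

// Exact number of cells, computed in 64 bits so that the product of two
// 32-bit dimensions cannot overflow.
constexpr std::int64_t exact_cells(std::int32_t nrow, std::int32_t ncol) {
    return static_cast<std::int64_t>(nrow) * static_cast<std::int64_t>(ncol);
}

// Low 32 bits of a 64-bit value, as a number in [0, 2^32).
constexpr std::int64_t low32(std::int64_t x) {
    return x & (two32 - 1);
}

// ASSUMPTION (not verified against Rcpp/R): the value a 32-bit signed int
// would hold if the product wrapped modulo 2^32 (two's complement).
constexpr std::int64_t wrapped_cells(std::int32_t nrow, std::int32_t ncol) {
    return low32(exact_cells(nrow, ncol)) >= two32 / 2
               ? low32(exact_cells(nrow, ncol)) - two32
               : low32(exact_cells(nrow, ncol));
}

// Hypothetical pre-creation check: true only if the exact cell count is
// non-negative and no larger than 2^31 - 1 (.Machine$integer.max in R).
constexpr bool fits_in_int(std::int32_t nrow, std::int32_t ncol) {
    return exact_cells(nrow, ncol) >= 0 &&
           exact_cells(nrow, ncol) <=
               static_cast<std::int64_t>(std::numeric_limits<std::int32_t>::max());
}
```

Under that assumption, 4294967 x 500 gives 2147483500 cells, which still fits; 4294967 x 501 wraps to -2143188829, which would match the "negative length vectors" message; and 500000 x 9000 wraps to the positive value 205032704, which would be consistent with creation appearing to succeed while far fewer cells than prod(dim) are actually available. A check like fits_in_int(nrow, ncol) before constructing the matrix would reject the last two cases up front.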