Re: [Rcpp-devel] Parallel random numbers using Rcpp and OpenMP
My two cents worth: For the microsimulation package, we needed uniform random number streams and sub-streams at the C++ level, while supporting R's non-uniform random number distributions[*]. For this, we used the C++ RngStreams library and provided double *user_unif_rand () for user-defined RNGs. Essentially, this reproduces the L'Ecuyer-CMRG RNG provided by R, with seed manipulation at the C++ level (cf. using .Random.seed and R's C machinery). We also allow for the C++ seeds to be set to and from R. See: https://github.com/mclements/microsimulation/blob/master/src/microsimulation.cc https://github.com/mclements/microsimulation/blob/master/src/microsimulation.h (circa lines 290-355) https://github.com/mclements/microsimulation/blob/master/R/rcpp_hello_world.R (circa lines 6-36) This is not implemented using OpenMP. For our purpose, where we are doing 10^7 small simulations, we chunk the simulations at the R level and use the parallel package to call the C++ code on each chunk. This approaches scales well with more processors. We have also looked at using the C++ RngStream library with Boost.Random's and C++11's non-uniform random number distributions. Again, this has not been implemented using OpenMP (todo?). For a simple wrapper for Boost, see: https://github.com/mclements/microsimulation/blob/master/src/rngstream-boost.hpp with an example: https://github.com/mclements/microsimulation/blob/master/src/rngstream-example.cpp Two brief notes: first, we have put the RngStream library in the ssim namespace for use with the microsimulation package. Other uses of the Boost wrapper would probably want to change the namespace. Second, the implementation for C++11 random number generators and distributions required a small change to the RngStream library - specifically, to output the random uniform as an unsigned long. Interestingly, the resulting C++11 random numbers drop every second value compared with R and Boost.Random. For the C+++11 wrapper, see the code RngStream* and rngstream* in: https://github.com/mclements/microsimulation/tree/master/test Sincerely, Mark. [*] The BH package was not available, C++11 compiler requirements were not accepted on CRAN, reimplementing non-uniform random number distributions using UNURAN would have taken too long, and the R non-uniform random number distributions are extensive and well tested. ___ Rcpp-devel mailing list Rcpp-devel@lists.r-forge.r-project.org https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
Re: [Rcpp-devel] Parallel random numbers using Rcpp and OpenMP
Hi Mark, many thanks for all the info, I will certainly have a detailed look at what you are doing. I think it would be nice to have a package that uses C++ level parallelism (OpenMP or not) to speed up random number generation at R level. For example: library(microbenchmark) library(mvtnorm) library(mvnfast) library(MASS) # Generating 1e4 50-dimensional multivariate normal r.v. N - 1 d - 50 mu - 1:d tmp - matrix(rnorm(d^2), d, d) mcov - tcrossprod(tmp, tmp) myChol - chol(mcov) microbenchmark(rmvn(N, mu, mcov, ncores = 4), +rmvn(N, mu, mcov), +rmvnorm(N, mu, mcov, method = chol), +mvrnorm(N, mu, mcov)) Unit: milliseconds expr min lq median uq max neval rmvn(N, mu, mcov, ncores = 4) 8.339534 12.85869 13.23915 13.82640 38.04688 100 rmvn(N, mu, mcov) 31.645092 32.32220 33.70345 34.03561 57.78102 100 rmvnorm(N, mu, mcov, method = chol) 57.956183 58.99730 59.84668 82.11610 90.71133 100 mvrnorm(N, mu, mcov) 60.621391 62.06400 82.86829 85.42350 90.50662 100 Here I followed Matt's suggestion and used C++11 mt19937_64 rng, with seeds generated by R::runif. Obviously what you have done by using RngStreams with Boost's distributions is much more rigorous, so probably that the approach rights to provide faster version of rnorm(), rpois() etc. Thanks, Matteo On Wed, Apr 30, 2014 at 9:10 AM, Mark Clements mark.cleme...@ki.se wrote: My two cents worth: For the microsimulation package, we needed uniform random number streams and sub-streams at the C++ level, while supporting R's non-uniform random number distributions[*]. For this, we used the C++ RngStreams library and provided double *user_unif_rand () for user-defined RNGs. Essentially, this reproduces the L'Ecuyer-CMRG RNG provided by R, with seed manipulation at the C++ level (cf. using .Random.seed and R's C machinery). We also allow for the C++ seeds to be set to and from R. See: https://github.com/mclements/microsimulation/blob/master/src/microsimulation.cc https://github.com/mclements/microsimulation/blob/master/src/microsimulation.h (circa lines 290-355) https://github.com/mclements/microsimulation/blob/master/R/rcpp_hello_world.R (circa lines 6-36) This is not implemented using OpenMP. For our purpose, where we are doing 10^7 small simulations, we chunk the simulations at the R level and use the parallel package to call the C++ code on each chunk. This approaches scales well with more processors. We have also looked at using the C++ RngStream library with Boost.Random's and C++11's non-uniform random number distributions. Again, this has not been implemented using OpenMP (todo?). For a simple wrapper for Boost, see: https://github.com/mclements/microsimulation/blob/master/src/rngstream-boost.hpp with an example: https://github.com/mclements/microsimulation/blob/master/src/rngstream-example.cpp Two brief notes: first, we have put the RngStream library in the ssim namespace for use with the microsimulation package. Other uses of the Boost wrapper would probably want to change the namespace. Second, the implementation for C++11 random number generators and distributions required a small change to the RngStream library - specifically, to output the random uniform as an unsigned long. Interestingly, the resulting C++11 random numbers drop every second value compared with R and Boost.Random. For the C+++11 wrapper, see the code RngStream* and rngstream* in: https://github.com/mclements/microsimulation/tree/master/test Sincerely, Mark. [*] The BH package was not available, C++11 compiler requirements were not accepted on CRAN, reimplementing non-uniform random number distributions using UNURAN would have taken too long, and the R non-uniform random number distributions are extensive and well tested. ___ Rcpp-devel mailing list Rcpp-devel@lists.r-forge.r-project.org https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
Re: [Rcpp-devel] Parallel random numbers using Rcpp and OpenMP
On 4/28/2014 01:30, Matteo Fasiolo wrote: [...] As I understand, doing things such as: NumericMatrix out(n, d); #pragma omp for schedule(static) for(int kk = 0; kk d; kk++) out( _, kk) = rnorm(n); is not going to work, because rnorm() is not thread safe (in fact this code crashed R). On the other hand R level parallelism using clusterApply, mclapply etc appears to be too slow to be of any use for this purpose. Is anybody aware of any package providing a parallel C++ rng which my package might link to? Your first choice can be to go with the C++ Standard Library -- header random -- (importantly) using one object per thread: http://stackoverflow.com/questions/8813592/c11-thread-safety-of-random-number-generators See: http://en.cppreference.com/w/cpp/numeric/random http://isocpp.org/blog/2013/03/n3551-random-number-generation Since you're using OpenMP, you can get the distinct thread ID by using the `omp_get_thread_num` function: http://stackoverflow.com/questions/15918758/how-to-make-each-thread-use-its-own-rng-in-c11 Note that if you're doing large scale parallel statistics (say, Monte Carlo) you may also want a PRNG with statistical properties which don't deteriorate (e.g., no spurious correlation, let alone overlapping) for very large samples; preferably, also one that doesn't maintain any kind of global state (so-called stateless) is going to be easier to use, too (nothing to synchronize/lock across threads). A relatively recent work that's particularly relevant is Random123: https://www.deshawresearch.com/resources_random123.html http://www.thesalmons.org/john/random123/ MIT-licensed C++ version is available here: http://www.sitmo.com/article/parallel-random-number-generator-in-c/ With a simple example: http://www.sitmo.com/article/multi-threaded-random-number-generation-in-c11/ // The author is currently working on getting it into Boost.Random; discussion: http://www.wilmott.com/messageview.cfm?catid=44threadid=95882STARTPAGE=2#710955 HTH, Best, Matt ___ Rcpp-devel mailing list Rcpp-devel@lists.r-forge.r-project.org https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
Re: [Rcpp-devel] Parallel random numbers using Rcpp and OpenMP
Hi all, many thanks for your detailed replies, I see that there are many options out there. As a first step I think I will try out the C++11 option that Matt suggested, as it seems relatively simple. In particular I might do something like: #pragma omp parallel { std::mt19937_64 engine( static_castuint64_t(seeds[omp_get_thread_num()] ) ); std::uniform_real_distributiondouble zeroToOne(0.0, 1.0); #pragma omp for for (int i = 0; i 1000; i++) { double a = zeroToOne(engine); } } were seeds is a NumericVector coming from R. That's probably not ideal, but it might give reasonable and reproducible results. Thanks, Matteo On Mon, Apr 28, 2014 at 9:39 AM, Matt D. mat...@gmail.com wrote: On 4/28/2014 01:30, Matteo Fasiolo wrote: [...] As I understand, doing things such as: NumericMatrix out(n, d); #pragma omp for schedule(static) for(int kk = 0; kk d; kk++) out( _, kk) = rnorm(n); is not going to work, because rnorm() is not thread safe (in fact this code crashed R). On the other hand R level parallelism using clusterApply, mclapply etc appears to be too slow to be of any use for this purpose. Is anybody aware of any package providing a parallel C++ rng which my package might link to? Your first choice can be to go with the C++ Standard Library -- header random -- (importantly) using one object per thread: http://stackoverflow.com/questions/8813592/c11-thread- safety-of-random-number-generators See: http://en.cppreference.com/w/cpp/numeric/random http://isocpp.org/blog/2013/03/n3551-random-number-generation Since you're using OpenMP, you can get the distinct thread ID by using the `omp_get_thread_num` function: http://stackoverflow.com/questions/15918758/how-to- make-each-thread-use-its-own-rng-in-c11 Note that if you're doing large scale parallel statistics (say, Monte Carlo) you may also want a PRNG with statistical properties which don't deteriorate (e.g., no spurious correlation, let alone overlapping) for very large samples; preferably, also one that doesn't maintain any kind of global state (so-called stateless) is going to be easier to use, too (nothing to synchronize/lock across threads). A relatively recent work that's particularly relevant is Random123: https://www.deshawresearch.com/resources_random123.html http://www.thesalmons.org/john/random123/ MIT-licensed C++ version is available here: http://www.sitmo.com/article/parallel-random-number-generator-in-c/ With a simple example: http://www.sitmo.com/article/multi-threaded-random-number- generation-in-c11/ // The author is currently working on getting it into Boost.Random; discussion: http://www.wilmott.com/messageview.cfm?catid=44; threadid=95882STARTPAGE=2#710955 HTH, Best, Matt ___ Rcpp-devel mailing list Rcpp-devel@lists.r-forge.r-project.org https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel ___ Rcpp-devel mailing list Rcpp-devel@lists.r-forge.r-project.org https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
Re: [Rcpp-devel] Parallel random numbers using Rcpp and OpenMP
Hi, If you can assume c++11, you might bot need openmp, as you can just use standard support for threads, etc ... Although the compiler suite used on windows (if you care about that) does not support c++11 threads. This might be available later. Romain Le 28 avr. 2014 à 12:51, Matteo Fasiolo matteo.fasi...@gmail.com a écrit : Hi all, many thanks for your detailed replies, I see that there are many options out there. As a first step I think I will try out the C++11 option that Matt suggested, as it seems relatively simple. In particular I might do something like: #pragma omp parallel { std::mt19937_64 engine( static_castuint64_t(seeds[omp_get_thread_num()] ) ); std::uniform_real_distributiondouble zeroToOne(0.0, 1.0); #pragma omp for for (int i = 0; i 1000; i++) { double a = zeroToOne(engine); } } were seeds is a NumericVector coming from R. That's probably not ideal, but it might give reasonable and reproducible results. Thanks, Matteo On Mon, Apr 28, 2014 at 9:39 AM, Matt D. mat...@gmail.com wrote: On 4/28/2014 01:30, Matteo Fasiolo wrote: [...] As I understand, doing things such as: NumericMatrix out(n, d); #pragma omp for schedule(static) for(int kk = 0; kk d; kk++) out( _, kk) = rnorm(n); is not going to work, because rnorm() is not thread safe (in fact this code crashed R). On the other hand R level parallelism using clusterApply, mclapply etc appears to be too slow to be of any use for this purpose. Is anybody aware of any package providing a parallel C++ rng which my package might link to? Your first choice can be to go with the C++ Standard Library -- header random -- (importantly) using one object per thread: http://stackoverflow.com/questions/8813592/c11-thread-safety-of-random-number-generators See: http://en.cppreference.com/w/cpp/numeric/random http://isocpp.org/blog/2013/03/n3551-random-number-generation Since you're using OpenMP, you can get the distinct thread ID by using the `omp_get_thread_num` function: http://stackoverflow.com/questions/15918758/how-to-make-each-thread-use-its-own-rng-in-c11 Note that if you're doing large scale parallel statistics (say, Monte Carlo) you may also want a PRNG with statistical properties which don't deteriorate (e.g., no spurious correlation, let alone overlapping) for very large samples; preferably, also one that doesn't maintain any kind of global state (so-called stateless) is going to be easier to use, too (nothing to synchronize/lock across threads). A relatively recent work that's particularly relevant is Random123: https://www.deshawresearch.com/resources_random123.html http://www.thesalmons.org/john/random123/ MIT-licensed C++ version is available here: http://www.sitmo.com/article/parallel-random-number-generator-in-c/ With a simple example: http://www.sitmo.com/article/multi-threaded-random-number-generation-in-c11/ // The author is currently working on getting it into Boost.Random; discussion: http://www.wilmott.com/messageview.cfm?catid=44threadid=95882STARTPAGE=2#710955 HTH, Best, Matt ___ Rcpp-devel mailing list Rcpp-devel@lists.r-forge.r-project.org https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel ___ Rcpp-devel mailing list Rcpp-devel@lists.r-forge.r-project.org https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel ___ Rcpp-devel mailing list Rcpp-devel@lists.r-forge.r-project.org https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
[Rcpp-devel] Parallel random numbers using Rcpp and OpenMP
Dear Rcpp developers, I am working on a small package (here https://github.com/mfasiolo/mvn) that should provide efficient tools for the multivariate normal distribution using Rcpp/RcppArmadillo and OpenMP. Creating functions that evaluate the multivariate normal density efficiently was fairly straightforward, but I am struggling with parallel random number generation with OpenMP. As I understand, doing things such as: NumericMatrix out(n, d); #pragma omp for schedule(static) for(int kk = 0; kk d; kk++) out( _, kk) = rnorm(n); is not going to work, because rnorm() is not thread safe (in fact this code crashed R). On the other hand R level parallelism using clusterApply, mclapply etc appears to be too slow to be of any use for this purpose. Is anybody aware of any package providing a parallel C++ rng which my package might link to? I have read this posthttp://www.lindonslog.com/programming/parallel-random-number-generation-trng/ about Tina's rng, which seems to work with OpenMP and Rcpp. How hard would it be to have such a library included in my package (if at all possible)? Sorry for the possibly silly questions and thanks for any suggestion. Matteo ___ Rcpp-devel mailing list Rcpp-devel@lists.r-forge.r-project.org https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
Re: [Rcpp-devel] Parallel random numbers using Rcpp and OpenMP
Matteo, On 28 April 2014 at 01:30, Matteo Fasiolo wrote: | Dear Rcpp developers, | | I am working on a small package (here) that should provide efficient tools | for the multivariate normal distribution using Rcpp/RcppArmadillo and OpenMP. | | Creating functions that evaluate the multivariate normal density efficiently | was fairly straightforward, Yes, and there is also a Rcpp Gallery post on this topic: http://gallery.rcpp.org/articles/dmvnorm_arma/ | but I am struggling with parallel random number | generation with OpenMP. | | As I understand, doing things such as: | | NumericMatrix out(n, d); | #pragma omp for schedule(static) | for(int kk = 0; kk d; kk++) out( _, kk) = rnorm(n); | | is not going to work, because rnorm() is not thread safe | (in fact this code crashed R). On the other hand R level parallelism | using clusterApply, mclapply etc appears to be too slow to be of any | use for this purpose. You cannot 'touch R' at all if you are in an OpenMP context as R is nowhere near thread safe. Use R as a 'shell' to launch a job, keep data in C++ containers and go crazy in OpenMP. Then collect results and report back to R. | | Is anybody aware of any package providing a parallel C++ rng which my | package might link to? I have read this post about Tina's rng, which | seems to work with OpenMP and Rcpp. How hard would it be to have | such a library included in my package (if at all possible)? A number of people have worked on this. I used this to try to make the 'port' of DEoptim to C++ / Rcpp work in parallel. I got it partially working but didn't get to a point where I liked the reproducibility. Around that time, Petr Savicky had already done some work, and I presume continued with that. See his page on this research which you can find at http://www2.cs.cas.cz/~savicky/randomNumbers/ -- I not sure if he has code at R-Forge or GitHub. From a glance at the page, it looks like he as a package for the SFMT generator (by the Mersenne Twister generators) for OpenMP. And at the same time Karl Forner started to some work, and also exchanged notes with Petr. Karl does have a repo at R-Forge, it has several related package, and he also had done work on SFMT via OpenMP (and Rcpp). See eg here: https://r-forge.r-project.org/scm/viewvc.php/pkg/?root=gwas-bin-tests I am going to CC Petr and Karl. It would be great to settle on a reference implementation or two, and I am sure between the two of them we will get further pointers as to what works. Cheers, Dirk | | Sorry for the possibly silly questions and thanks for any suggestion. | | Matteo | | -- | ___ | Rcpp-devel mailing list | Rcpp-devel@lists.r-forge.r-project.org | https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel -- Dirk Eddelbuettel | e...@debian.org | http://dirk.eddelbuettel.com ___ Rcpp-devel mailing list Rcpp-devel@lists.r-forge.r-project.org https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel