Hi Dirk, thanks for all this information. I will work through it over the next few days.
Your point about the rnorm function is indeed an essential one. It takes most of the time, so the remaining time is almost negligible. Timing a loop that does a lot of memory writing makes more sense here. Thanks for making this clear!

Best

Simon

On May 26, 2013, at 7:28 PM, Dirk Eddelbuettel <[email protected]> wrote:

>
> On 26 May 2013 at 18:41, Simon Zehnder wrote:
> | Apologize for being imprecise in my second question. I try to rephrase it here, so more members can understand:
> |
> | 2'. As performance is always a matter in statistical computations,
> | especially in simulations, it is of special interest to know, how storage
> | (access to memory, writing to memory) of results in a loop works fastest.
>
> We have a rather detailed (if dated) benchmark in the Rcpp package, see
> examples/ConvolveBenchmarks/
>
> | Very often we have in programming the problem, that if we have to work via
> | an interface (here e.g. to R or to a database) it becomes slow, when
> | opening and closing this interface very often. My question was towards the
> | interface between C++ and R. As a very simple example for my question,
> | consider the following,
> |
> | EX1
> |
> | library(Rcpp)
> | library(inline)
> | src <- 'NumericMatrix numM(m);
> |     for(unsigned int i = 0; i < numM.nrow(); ++i) {
> |         Rcpp::RNGScope scope;
>
> That RNGScope instance _obviously_ belongs outside the loop.
>
> |         numM.row(i) = Rcpp::rnorm(numM.ncol());
> |     }
> |     return numM;'
> | randomN <- cxxfunction(signature(m = "matrix"), body = src, plugin = "Rcpp", verbose = TRUE)
> | m <- matrix(0, 100000, 10000)
> | system.time(m <- randomN(m))
>
> Don't use system.time(), use either the rbenchmark package or the
> microbenchmark package which both _sample over repeated calls_.
>
> Numerous examples for either are in the list archives.
>
> | vs.
> |
> | EX 2
> |
> | library(Rcpp)
> | library(inline)
> | src <- 'int n = as<int>(N);
> |     int k = as<int>(K);
> |     NumericMatrix numM(n,k);
>
> That is a new allocation that the code above does not do. How could this be
> faster?
>
> |     for(unsigned int i = 0; i < numM.nrow(); ++i) {
> |         Rcpp::RNGScope scope;
>
> As above.
>
> |         numM.row(i) = Rcpp::rnorm(numM.ncol());
> |     }
> |     return numM;'
> | randomN2 <- cxxfunction(signature(N = "numeric", K = "numeric"), body = src, plugin = "Rcpp", verbose = TRUE)
> | system.time(m <- randomN2(100000,10000))
>
> As above.
>
> In either case I would expect the _fixed cost of the rnorm() call_ to dominate
> your timings.
>
> If you want to time matrix access, time matrix access. Do not confound it
> with a second, expensive operation.
>
> | In the first function we allocate memory in R and fill it in C++ inside the loop. In the second example we allocate memory in C++, fill it inside the loop and assign the matrix to an R object.
> |
> | My question was now, if there will be a performance difference in using R memory or in using C++ memory and assigning it then to R. My concerns were about the interface between C++ and R, when used inside the loop of EX1 very often. But it seems, that both versions perform quite similar. The only difference we can see, is in adding the time for allocating the R matrix in EX1:
> |
> | timing <- function() {
> |     m <- matrix(0, 100000,1000)
> |     m <- randomN(m)
> | }
> |
> | So, the result would be: If you want to perform fastest in these examples use EX2. Create an NumericVector in C++ fill it inside the loop and assign to an R object.
>
> Now you conflating an R+C++ operation with a pure C++ operation. Should you
> not compare, time _and report here_ the timings of both?
>
> | Now towards your little note at the end of your mail:
> |
> | I conform to your opinion, that my questions sometimes miss adequate reactions, in form of solutions and/or a dialogue on the questioned issues. I will give more feedback in the future. In my defence I would like to say, that I am very often occupied with bunch of work in my institute, (regrettably) not related to Rcpp. I really want to find more time to work in the fields I like the most (which is programming and statistics). So, it takes often much longer for me to read into some code, as other duties deter me from doing things immediately. Hence, my disappearance was never meant to be disrespectful to the members who make efforts to answer my questions and contribute to the list. I also have some things on my todo-list which I want to pass to the community and it bothers me, that I do not find the time to do it immediately (I also do not understand how you find the time for all of this as I guess you are working for a company).
>
> It was just a hint, you can of course do as you please.
>
> But if over time your only interaction with the list is "taking", readers may
> well be less and less inclined to solve your problems for you when they never
> get anything back in return. Time will tell.
>
> Dirk
>
> --
> Dirk Eddelbuettel | [email protected] | http://dirk.eddelbuettel.com
_______________________________________________
Rcpp-devel mailing list
[email protected]
https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
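
For illustration only, the two points from the thread (time matrix access separately from the random draws, and sample over repeated calls) could be sketched in plain standard C++ as below. This is not Rcpp code and not the benchmark from examples/ConvolveBenchmarks/: std::normal_distribution merely stands in for Rcpp::rnorm, and the names median_seconds, draw_normals and fill_rows are hypothetical helpers made up for this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <random>
#include <vector>

// Run f() reps times and return the median wall-clock time in seconds.
// Sampling over repeated calls is steadier than a single timing.
template <typename F>
double median_seconds(F&& f, int reps) {
    assert(reps > 0);
    std::vector<double> t;
    t.reserve(static_cast<std::size_t>(reps));
    for (int r = 0; r < reps; ++r) {
        auto start = std::chrono::steady_clock::now();
        f();
        auto stop = std::chrono::steady_clock::now();
        t.push_back(std::chrono::duration<double>(stop - start).count());
    }
    auto mid = t.begin() + static_cast<std::ptrdiff_t>(t.size() / 2);
    std::nth_element(t.begin(), mid, t.end());
    return *mid;
}

// Draw n standard-normal values.
// Assumption: std::normal_distribution is only a stand-in for Rcpp::rnorm.
std::vector<double> draw_normals(std::size_t n, unsigned seed) {
    std::mt19937 gen(seed);
    std::normal_distribution<double> dist(0.0, 1.0);
    std::vector<double> out(n);
    for (double& x : out) x = dist(gen);
    return out;
}

// Copy pre-drawn values into a column-major rows x cols matrix, one row at
// a time. This isolates the memory writes from the cost of the draws.
void fill_rows(std::vector<double>& m, std::size_t rows, std::size_t cols,
               const std::vector<double>& values) {
    assert(m.size() == rows * cols && values.size() == rows * cols);
    for (std::size_t i = 0; i < rows; ++i)
        for (std::size_t j = 0; j < cols; ++j)
            m[i + j * rows] = values[i * cols + j];
}
```

One would then time the two pieces on their own, for example median_seconds([&] { values = draw_normals(rows * cols, 42u); }, 10) against median_seconds([&] { fill_rows(m, rows, cols, values); }, 10), and report both numbers instead of a single combined one.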
