On Thu, Feb 26, 2015 at 12:28 PM, Matt D. <[email protected]> wrote:
> On 2/26/2015 18:59, Dirk Eddelbuettel wrote:
>>
>> On 26 February 2015 at 18:35, Matt D. wrote:
>> | Which incidentally brings me to the advice I usually give in these
>> | situations:
>> | unless you're absolutely dependent on the "features" of
>> | `Rcpp::NumericVector`
>> | just forget about it and replace all uses with the standard container
>> | `std::vector<double>`.
>>
>> Note that this means you will always force a copy on the way in, and on
>> the way out. That is a guaranteed performance penalty.
>
> Perhaps a better self-documenting code could attempt to help the users by
> having, say, `Rcpp::NumericVectorView` (or `Rcpp::NumericVectorProxy`) used
> for view (proxy) purposes -- and sticking with the default (expected by R --
> as well as C++ -- programmers) for `Rcpp::NumericVector`?

From the perspective of a performance-oriented C programmer using Rcpp to
improve performance of R algorithms, Matt's approach would be completely
backward to my expectation. Optimized computation is cheap enough to
rarely be a bottleneck, but memory access is expensive. Garbage
collection is one of R's weakest spots. Almost my entire reason for
dropping to Rcpp is to avoid unnecessary copies.

But I agree with Matt that things could be clearer. The main issue I have
is the interaction with R's "multiple names" behavior. If the vector is
already multi-named, a copy is silently made as part of passing it. This
is "correct" if you are making changes in place, but if you are calling a
function that operates read-only on a multi-GB vector in a loop, this can
destroy performance.

The best solution I've found is to proactively make a single copy outside
the loop by doing some sort of null-op (like A = A + 0), but this
certainly feels hackish. And unless you do this everywhere, it doesn't
solve the problem that a small change elsewhere in your program can
drastically alter performance in the critical section.

> (Alternatively, making `f(Rcpp::NumericVector & v)` signify the need for
> mutation, while keeping the expected copied-value behavior for
> `f(Rcpp::NumericVector v)`; or is implementing this inherently blocked by
> the way RCpp has to interoperate with R through SEXPs? Similarly for
> `f(std::vector<double> & v)` vs `f(std::vector<double> v)` vs `f(const
> std::vector<double> & v)`?).

I like this approach.
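
For reference, the three `std::vector<double>` signatures in the quoted question already have the "expected" semantics in plain C++, and a minimal sketch makes the difference concrete. This is standard C++ only, not Rcpp: the function names are made up for illustration, and `Rcpp::NumericVector` passed by value does not copy like this, since it is a handle to the underlying SEXP.

```cpp
#include <vector>

// By value: the caller's vector is copied, so changes stay local.
double sum_after_scaling_copy(std::vector<double> v, double k) {
    double s = 0.0;
    for (double& x : v) { x *= k; s += x; }
    return s;
}

// By non-const reference: no copy, and the caller sees the mutation.
void scale_in_place(std::vector<double>& v, double k) {
    for (double& x : v) x *= k;
}

// By const reference: no copy, read-only; the compiler rejects writes.
double sum_read_only(const std::vector<double>& v) {
    double s = 0.0;
    for (double x : v) s += x;
    return s;
}

// Hypothetical helpers: report whether the parameter shares storage
// with the caller's vector (i.e. whether a copy was avoided).
bool shares_storage_by_value(std::vector<double> v, const double* caller_data) {
    return v.data() == caller_data;
}

bool shares_storage_by_const_ref(const std::vector<double>& v,
                                 const double* caller_data) {
    return v.data() == caller_data;
}
```

Calling `shares_storage_by_value(a, a.data())` on a live vector `a` reports a separate buffer, while the const-reference variant reports the same one. That is the copy a read-only call on a multi-GB vector would want to avoid.
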
Even better would be if it paid attention to 'const' such that no copy
was made even if the R function call rules would normally require it:

f(Rcpp::NumericVector &v)      : no copy unless multi-named SEXP
f(Rcpp::NumericVector v)       : copy unless no outside references possible (temp var)
f(Rcpp::NumericVector const v) : read only, no copy ever, blows up if contract violated

In my world, it would be wonderful if the "&v" version had an option that
either printed a warning or returned an error if a copy was required to
pass the argument. I don't know if "const &v" could be shoehorned into
providing this, or if it would be clear enough.

--nate

_______________________________________________
Rcpp-devel mailing list
[email protected]
https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
