Dewey Dunnington created ARROW-17148:
----------------------------------------

             Summary: [R] Improve evaluation of R functions from C++
                 Key: ARROW-17148
                 URL: https://issues.apache.org/jira/browse/ARROW-17148
             Project: Apache Arrow
          Issue Type: Improvement
          Components: R
            Reporter: Dewey Dunnington


There are currently a few places where we call R code from C++. After 
ARROW-16444 and ARROW-16703 there will be more, including cases where the 
overhead of calling into R may exceed the time it takes to evaluate the 
function itself, or where the function is called in a tight loop.

The current approach uses {{cpp11::function}}. This is totally fine and safe 
but generates some ugly backtraces on error and is potentially slower than the 
lean-and-mean approach of purrr (whose entire job is to call R functions in a 
loop and which has been heavily optimized). The purrr approach is to construct 
the call object and the calling environment in advance and then just run 
{{Rf_eval(call, env)}} in the loop. This is both faster (fewer R API calls) and 
generates better backtraces (e.g., {{Error in fun(arg1, arg2)}} instead of 
{{Error in (function(a, b) { ...the whole content of the function ... })(every, 
deparsed, argument)}}).
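
A minimal sketch of the "build the call once, evaluate it in the loop" idea 
described above, using the R C API. It is an illustration under assumptions, 
not the purrr or arrow implementation: the function name {{call_each}}, the 
symbol names {{fun}} and {{x}}, and the choice to have the caller pass in an 
environment created on the R side (e.g. with {{new.env()}}) are all 
hypothetical.

```cpp
// Sketch only. Assumes R headers are available and that this is compiled as
// part of an R package. All names below are hypothetical.
#define R_NO_REMAP
#include <R.h>
#include <Rinternals.h>

// Evaluates fun(x) for every element x of the list `xs`.
// `env` is assumed to be an environment owned by the caller.
void call_each(SEXP fun, SEXP xs, SEXP env) {
  SEXP fun_sym = Rf_install("fun");
  SEXP x_sym = Rf_install("x");

  // Bind the function once; the call refers to it by symbol, so an error
  // is reported against the short call `fun(x)` rather than a deparsed
  // function body.
  Rf_defineVar(fun_sym, fun, env);

  // Build the call object `fun(x)` a single time, outside the loop.
  SEXP call = PROTECT(Rf_lang2(fun_sym, x_sym));

  R_xlen_t n = Rf_xlength(xs);
  for (R_xlen_t i = 0; i < n; i++) {
    // Only the binding of `x` changes per iteration.
    Rf_defineVar(x_sym, VECTOR_ELT(xs, i), env);

    // Note: an R error raised here long-jumps out of this function. In C++
    // code with live destructors this call would need to be wrapped, e.g.
    // with cpp11::safe[Rf_eval](call, env).
    Rf_eval(call, env);
  }

  UNPROTECT(1);
}
```

The result of each evaluation is discarded here to keep the sketch short; a 
real caller would protect and collect it. Whether this is measurably faster 
than {{cpp11::function}} is exactly what the benchmark should establish.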

Before optimizing that heavily we should of course benchmark to see exactly how 
much that matters!


