It's great to see the community mobilize to try to resolve this issue. Obviously C++ has become a big part of R extensions, so it would be nice to have clear guidelines and tools to be able to use C++ safely with the R API.

Unfortunately doing this will probably require a fair bit of work. If R-core were to do this it would take away from other valuable improvements they could be making on R itself. Given there is already a supported and documented extension mechanism with access to the R API via C, I can see why R-core might be reluctant to divert resources from R development to add the same level of support for C++.

Obviously it would be impossible to try to provide better documentation and/or mechanisms for C++ extensions without some R-core involvement, but it seems like much of the grunt work could be done by others. I unfortunately have no C++ experience so cannot help here, but hopefully there are others who have the experience and the recognition in the community to offer to help and have their offer accepted. Perhaps the R Consortium could even fund this work, although given the level of expertise required here the funding may need to be meaningful.

That seems like the natural next step here. Someone with the qualifications to do so either volunteers or is funded to do this, and hopefully R-core agrees to provide input and a final stamp of approval. The documentation is probably the more straightforward part, as tools will need more work from R-core to integrate. It is possible R-core may decline to do this, but absent someone actually offering to put in the hard work it's all theoretical.

Respectfully,

Brodie.

On 3/30/19 3:59 AM, Romain Francois wrote:
tl;dr: we need better C++ tools and documentation.

We collectively know more now with the rise of tools like rchk and improved 
documentation such as Tomas’s post. That’s a start, but it appears that there 
still is a lot of knowledge that would deserve to be promoted to actual 
documentation of best practices.

I think it is important not to equate C++ as a language with Rcpp.

Also, C++ is not just RAII.

RAII is an important part of how Rcpp was conceived for sure, but it’s not the
only thing C++ can bring as a language. Templates, lambdas, the STL are
examples of things that can be used for expressiveness when just accessing data
without interfering with R, calling R API functions ...

It would be nice if the usual "you should do that only if you know what
you’re doing" were turned into precise documentation, and maybe became part
of some better tool. If precautions have to be taken before calling such and
such functions, that’s OK. What are they? Can we embed that in some tool?

It is easy enough to enclose potentially jumpy code in a C++ lambda. This
could go together with recommendations such as: the body of the lambda shall
only use POC data structures.

This is similar to precautions you’d take when writing concurrent code.
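
A minimal sketch of the lambda idea above, under stated assumptions: `jumpy_sqrt_floor`, `current_handler`, `enscoped` and `sum_of_roots` are hypothetical names, and the jumpy function only models an R API call that may longjmp; none of this is the R or Rcpp API. "POC" is read here as plain C data (raw pointers and ints). The wrapper catches the jump in a frame that owns no destructors and rethrows it as a C++ exception, so RAII objects outside the lambda are cleaned up normally.

```cpp
#include <cassert>
#include <csetjmp>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in, NOT the R API: models a C function that reports
// errors by longjmp-ing to the innermost registered handler.
std::jmp_buf* current_handler = nullptr;

int jumpy_sqrt_floor(int x) {
    if (x < 0) {
        assert(current_handler != nullptr);  // a handler must be installed
        std::longjmp(*current_handler, 1);   // never returns to the caller
    }
    int r = 0;
    while ((r + 1) * (r + 1) <= x) ++r;
    return r;
}

// Runs `body` with a jump handler installed. A jump lands in this frame and
// is rethrown as a C++ exception. Assumes `body` itself does not throw and
// keeps only trivially destructible data on its stack.
template <typename Body>
void enscoped(Body&& body) {
    std::jmp_buf env;
    std::jmp_buf* const saved = current_handler;
    current_handler = &env;
    if (setjmp(env) == 0) {
        body();
        current_handler = saved;
    } else {
        current_handler = saved;
        throw std::runtime_error("jump caught by enscoped()");
    }
}

// The vectors live outside the lambda; the lambda touches only raw pointers
// and ints, so a jump out of it skips no destructor.
int sum_of_roots(const std::vector<int>& input) {
    std::vector<int> out(input.size());  // destroyed normally if we throw
    const int* in = input.data();
    int* dst = out.data();
    const int n = static_cast<int>(input.size());
    enscoped([&] {
        for (int i = 0; i < n; ++i) dst[i] = jumpy_sqrt_floor(in[i]);
    });
    int total = 0;
    for (int v : out) total += v;
    return total;
}
```

Calling `sum_of_roots` with a negative element surfaces as an ordinary `std::runtime_error` in the caller. The discipline is the part that cannot be enforced by the compiler: anything with a non-trivial destructor has to stay outside the lambda body.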

Romain

On 30 March 2019 at 00:58, Simon Urbanek <simon.urba...@r-project.org> wrote:

Kevin,


On Mar 29, 2019, at 17:01, Kevin Ushey <kevinus...@gmail.com> wrote:

I think it's also worth saying that some of these issues affect C code
as well; e.g. this is not safe:

   FILE* f = fopen(...);
   Rf_eval(...);
   fclose(f);

I fully agree, but developers using C are well aware of the necessity of
handling the lifespan of objects explicitly, so at least there are no surprises.


whereas the C++ equivalent would likely handle closing of the file in the 
destructor. In other words, I think many users just may not be cognizant of the 
fact that most R APIs can longjmp, and what that implies for cleanup of 
allocated resources. R_alloc() may help solve the issue specifically for memory 
allocations, but for any library interface that has a 'open' and 'close' step, 
the same sort of issue will arise.
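
To make the open/close problem concrete, here is a small sketch of the fopen / Rf_eval / fclose pattern quoted earlier. It is self-contained and does not use R: `jumpy_eval` is a hypothetical stand-in for an R API call that may longjmp, `top_level` models the context the jump lands in, and a counter models open file handles. In real C++ code, jumping over a frame that holds an object with a non-trivial destructor is formally undefined behavior, and in practice the destructor typically does not run, so the outcome resembles what this C-style version shows.

```cpp
#include <cassert>
#include <csetjmp>

// Hypothetical stand-ins, NOT the R API.
std::jmp_buf top_level;   // models the error context the jump lands in
int open_handles = 0;     // models the number of open files

void open_resource() { ++open_handles; }   // models fopen()

void close_resource() {                    // models fclose()
    assert(open_handles > 0);
    --open_handles;
}

// Models an R API call: on error, control never returns to the caller.
void jumpy_eval(bool fail) {
    if (fail) std::longjmp(top_level, 1);
}

// Same shape as the fopen / Rf_eval / fclose pattern.
void unsafe_call(bool fail) {
    open_resource();
    jumpy_eval(fail);
    close_resource();   // skipped whenever jumpy_eval() jumps
}

// Runs unsafe_call() under a top-level context and reports how many
// handles are still open afterwards.
int leaked_after_call(bool fail) {
    open_handles = 0;
    if (setjmp(top_level) == 0) {
        unsafe_call(fail);
    }
    return open_handles;
}
```

`leaked_after_call(false)` leaves nothing open, while `leaked_after_call(true)` leaves one handle open because the jump bypasses `close_resource()`.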

Well, I hope that anyone writing native code in a package is well aware of that
and will use an external pointer with a finalizer to clean up native objects in
any 3rd party library that are created during the call.


What I believe we should do, and what Rcpp has made steps towards, is make it 
possible to interact with some subset of the R API safely from C++ contexts. 
This has always been possible with e.g. R_ToplevelExec() and 
R_ExecWithCleanup(), and now things are even better with R_UnwindProtect(). In 
theory, as a prototype, an R package could provide a 'safe' C++ interface to 
the R API using R_UnwindProtect() and friends as appropriate, and client 
packages could import and link to that package to gain access to the interface. 
Code generators (as Rcpp Attributes does) can handle some of the pain in these 
interfaces, so that users are mostly insulated from the nitty gritty details.

I agree that we should strive to provide tools that make it safer, but note 
that it still requires participation of the users - they have to use such 
facilities or else they hit the same problem. So we can only fix this for the 
future, but let's start now.


I agree that the content of Tomas's post is very helpful, especially since I expect many R 
programmers who dip their toes into the C++ world are not aware of the caveats of talking to R from 
C++. However, I don't think it's helpful to recommend "don't use C++"; rather, I believe 
the question should be, "what can we do to make it possible to easily and safely interact with 
R from C++?". Because, as I understand it, all of the problems raised are solvable: either 
through a well-defined C++ interface, or through better education.

I think the recommendation would be different if such tools existed, but they don't. It 
was based on the current reality which is not so rosy.  Apparently the post had its 
effect of mobilizing C++ proponents to do something about it, which is great, because if 
this leads to some solution, the recommendation in the future may change to "use C++ 
using tools XYZ".


I'll add my own opinion: writing correct C code is an incredibly difficult 
task. C++, while obviously not perfect, makes things substantially easier with 
tools like RAII, the STL, smart pointers, and so on. And I strongly believe 
that C++ (with Rcpp) is still a better choice than C for new users who want to 
interface with R from compiled code.

My take is that Rcpp makes the interface *look* easier, but you still have to
understand more about the R API than you think. Hence it is much easier to write
buggy code. Personally, that's why I don't like it (apart from the code bloat),
because things are hidden that will get you into trouble, whereas using the C 
API is at least very clear - you have to understand what it's doing when you 
use it. That said, I'm obviously biased since I know a lot about R internals ;) 
so this doesn't necessarily generalize.


tl;dr: I (and I think most others) just wish the summary had a more positive 
outlook for the future of C++ with R.

Well, unless someone actually takes the initiative there is no reason to 
believe in a bright future of C++. As we have seen with the lack of adoption of 
CXXR (which I thought was an incredible achievement), not enough people seem to 
really care about C++. If that is not true, then let's come out of hiding, get 
together and address it (it seems that this thread is a good start).

Cheers,
Simon



Best,
Kevin

On Fri, Mar 29, 2019 at 10:16 AM Simon Urbanek
<simon.urba...@r-project.org> wrote:

Jim,

I think the main point of Tomas' post was to alert R users to the fact that 
there are very serious issues that you have to understand when interfacing R 
from C++. Using C++ code from R is fine; in many cases you only want to access
R data, use some library or compute in C++ and return results. Such use-cases 
are completely fine in C++ as they don't need to trigger the issues mentioned 
and it should be made clear that it was not what Tomas' blog was about.

I agree with Tomas that it is safer to give the advice not to use C++ to call R
API since C++ may give a false impression that you don't need to know what 
you're doing. Note that it is possible to avoid longjmps by using 
R_ExecWithCleanup() which can catch any longjmps from the called function. So 
if you know what you're doing you can make things work. I think the issue here 
is not necessarily lack of tools, it is lack of knowledge - which is why I 
think Tomas' post is so important.

Cheers,
Simon


On Mar 29, 2019, at 11:19 AM, Jim Hester <james.f.hes...@gmail.com> wrote:

First, thank you to Tomas for writing his recent post[0] on the R
developer blog. It raised important issues in interfacing R's C API
and C++ code.

However I do _not_ think the conclusion reached in the post is helpful:

    don’t use C++ to interface with R

There are now more than 1,600 packages on CRAN using C++; the time is
long past when that type of warning is going to be useful to the R
community.

These same issues will also occur with any newer language (such as
Rust or Julia[1]) which uses RAII to manage resources and tries to
interface with R. It doesn't seem a productive way forward for R to
say it can't interface with these languages without first doing
expensive copies into an intermediate heap.

The advice to avoid C++ is also antithetical to John Chambers' vision
of first S and then R as an interface language (from Extending R [2]):

The *interface* principle has always been central to R and to S
before. An interface to subroutines was _the_ way to extend the first
version of S. Subroutine interfaces have continued to be central to R.

The book also has extensive sections on both C++ (via Rcpp) and Julia,
so clearly John thinks these are legitimate ways to extend R.

So if 'don't use C++' is not realistic and the current R API does not
allow safe use of C++ exceptions, what are the alternatives?

One thing we could do is look at how this is handled in other languages
written in C which also use longjmp for errors.

Lua is one example: it provides an alternative interface,
lua_pcall[3] and lua_cpcall[4], which wrap a normal Lua call and return
an error code rather than long jumping. These interfaces can then be safely
wrapped by RAII- and exception-based languages.
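
As a rough illustration of what such an error-code interface could look like, here is a self-contained sketch in the spirit of lua_pcall. All names (`api_error`, `api_pcall`, `error_target`, `FileGuard`, `parse_positive`, `guarded_parse`) are hypothetical; this models the idea only and is neither Lua's nor R's API. The protected call catches the jump inside the C-style layer and returns a status, so no jump ever crosses a frame that owns an RAII object.

```cpp
#include <cassert>
#include <csetjmp>

// Hypothetical names throughout; a model of the idea, not a real API.
std::jmp_buf* error_target = nullptr;   // innermost protected call, if any
int last_error = 0;                     // code of the most recent error

// A "jumpy" primitive: raises an error by jumping, as longjmp-based
// interpreters do.
void api_error(int code) {
    assert(error_target != nullptr);    // must run inside api_pcall()
    last_error = code;
    std::longjmp(*error_target, 1);
}

// Protected call: runs fn(data) and returns 0 on success, or the error
// code if fn jumped. The jump is caught here and never reaches the caller.
int api_pcall(void (*fn)(void*), void* data) {
    std::jmp_buf env;
    std::jmp_buf* const saved = error_target;
    error_target = &env;
    int status = 0;
    if (setjmp(env) == 0) {
        fn(data);
    } else {
        status = last_error;
    }
    error_target = saved;
    return status;
}

// RAII guard whose destructor must run; a counter stands in for a file.
int open_files = 0;
struct FileGuard {
    FileGuard()  { ++open_files; }
    ~FileGuard() { --open_files; }
};

// Callback holding only plain data, so jumping out of it is harmless.
void parse_positive(void* data) {
    int* value = static_cast<int*>(data);
    if (*value <= 0) api_error(22);     // 22 is an arbitrary example code
    *value *= 2;
}

// The guard is released on every path because the error comes back as a
// return value rather than as a jump through this frame.
int guarded_parse(int* value) {
    FileGuard guard;
    return api_pcall(parse_positive, value);
}
```

A C++ wrapper could turn the non-zero status into an exception at this point, after the C-style layer has fully returned.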

This alternative error code interface is not just useful for C++, but
also for resource cleanup in C. It is currently non-trivial to handle
cleanup in all the possible cases where a longjmp can occur (interrupts,
warnings, custom conditions, timeouts, any allocation, etc.), even with R
finalizers.

It is past time for R to consider a non-jumpy C interface, so it can
continue to be used as an effective interface to programming routines
in the years to come.

[0]: 
https://developer.r-project.org/Blog/public/2019/03/28/use-of-c---in-packages/
[1]: https://github.com/JuliaLang/julia/issues/28606
[2]: https://doi.org/10.1201/9781315381305
[3]: http://www.lua.org/manual/5.1/manual.html#lua_pcall
[4]: http://www.lua.org/manual/5.1/manual.html#lua_cpcall

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
