On Thu, Oct 28, 2010 at 1:44 PM, Dominick Samperi <djsamp...@gmail.com> wrote:
> See comments on Rcpp below.
>
> On Thu, Oct 28, 2010 at 11:28 AM, William Dunlap <wdun...@tibco.com> wrote:
>>
>> > -----Original Message-----
>> > From: r-devel-boun...@r-project.org
>> > [mailto:r-devel-boun...@r-project.org] On Behalf Of Andrew Piskorski
>> > Sent: Thursday, October 28, 2010 6:48 AM
>> > To: Simon Urbanek
>> > Cc: r-devel@r-project.org
>> > Subject: Re: [Rd] must .Call C functions return SEXP?
>> >
>> > On Thu, Oct 28, 2010 at 12:15:56AM -0400, Simon Urbanek wrote:
>> >
>> > > > Reason I ask, is I've written some R code which allocates two long
>> > > > lists, and then calls a C function with .Call. My C code writes to
>> > > > those two pre-allocated lists,
>> >
>> > > That's bad! All arguments are essentially read-only so you should
>> > > never write into them!
>> >
>> > I don't see how. (So, what am I missing?) The R docs themselves
>> > state that the main point of using .Call rather than .C is that .Call
>> > does not do any extra copying and gives one direct access to the R
>> > objects. (This is indeed very useful, e.g. to reorder a large matrix
>> > in seconds rather than hours.)
>> >
>> > I could allocate the two lists in my C code, but so far it was more
>> > convenient to do so in R. What possible difference in behavior can there
>> > be between the two approaches?
>>
>> Here is an example of how you break the rule that R-language functions
>> do not change their arguments if you use .Call in the way that you
>> describe. The C code is in alter_argument.c:
>>
>> #include <R.h>
>> #include <Rinternals.h>
>>
>> SEXP alter_argument(SEXP arg)
>> {
>>     SEXP dim ;
>>     PROTECT(dim = allocVector(INTSXP, 2));
>>     INTEGER(dim)[0] = 1 ;
>>     INTEGER(dim)[1] = LENGTH(arg) ;
>>     setAttrib(arg, R_DimSymbol, dim);
>>     UNPROTECT(1) ;
>>     return dim ;
>> }
>>
>> Make a shared library out of this.
>> E.g., on Linux do
>>    R CMD SHLIB -o Ralter_argument.so alter_argument.c
>> and load it into R with
>>    dyn.load("./Ralter_argument.so")
>> (Or, on any platform, put it into a package along with
>> the following R code and build it.)
>>
>> The associated R code is
>>    myDim <- function(v) .Call("alter_argument", v)
>>    f <- function(z) myDim(z)[2]
>> Now try using it:
>>    > myData <- 6:10
>>    > myData
>>    [1]  6  7  8  9 10
>>    > f(myData)
>>    [1] 5
>>    > myData
>>         [,1] [,2] [,3] [,4] [,5]
>>    [1,]    6    7    8    9   10
>> The argument to f was changed! This should never happen in R.
>>
>> If you are very careful you might be able to ensure that
>> no part of the argument to be altered can come from
>> outside the function calling .Call(). It can be tricky
>> to ensure that, especially when the argument is more complicated
>> than an atomic vector.
>>
>> "If you live outside the law you must be honest" - Bob Dylan.
>
> This thread seems to suggest (following Bob Dylan) that one needs
> to be very careful when using C/C++ to modify R's memory
> directly, because you may modify other R variables that point
> to the same memory (due to R's copy-by-value semantics and
> optimizations).
>
> What are the implications for the Rcpp package where R
> objects are exposed to the C++ side in precisely this way,
> permitting unrestricted modifications? (In the original
> or "classic" version of this package direct writes to R's
> memory were done only for performance reasons.)
>
> Seems like extra precautions need to be taken to
> avoid the aliasing problem.
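The aliasing concern in the question above can be modelled outside R. What follows is a toy sketch in plain C++, not R's or Rcpp's actual implementation: `Handle`, `make_handle`, `deep_copy`, `scale_in_place` and `scaled_copy` are hypothetical names invented for illustration. Copying a `Handle` shares the underlying buffer, loosely like two R variables bound to the same unmodified vector. Writing through one alias changes what the other sees, while copying first (what Rcpp calls cloning) does not.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for a handle to an R vector: copying the handle
// copies the pointer only, so both copies refer to the same buffer.
struct Handle {
    std::shared_ptr<std::vector<int>> data;
};

// Build a handle that owns a fresh buffer.
Handle make_handle(std::vector<int> values) {
    return Handle{std::make_shared<std::vector<int>>(std::move(values))};
}

// Deep copy: the result owns its own buffer and shares nothing with h.
Handle deep_copy(const Handle& h) {
    return Handle{std::make_shared<std::vector<int>>(*h.data)};
}

// Unsafe pattern: writes straight into the shared buffer, so the change
// is visible through every alias of the argument.
void scale_in_place(Handle h, int factor) {
    for (int& x : *h.data) {
        x *= factor;
    }
}

// Safe pattern: copy first, then modify and return the copy.
Handle scaled_copy(const Handle& h, int factor) {
    Handle out = deep_copy(h);
    for (int& x : *out.data) {
        x *= factor;
    }
    return out;
}
```

With `Handle b = a;`, calling `scale_in_place(a, 2)` also changes what `b` sees, whereas `scaled_copy(a, 2)` leaves both `a` and `b` untouched and returns a new object.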
The current Rcpp facilities have the same benefits and dangers as the C
macros used in .Call. You get access to the memory of the R object passed
as an argument, saving a copy step. You shouldn't modify that memory. If
you do, bad things can happen and they will be your fault. If you want a
read-write copy, you clone the argument (in Rcpp terminology).

To Bill: I seem to remember the Dylan quote as "To live outside the law
you must be honest."

>
> Dominick
>
>> In R, .Call() does not copy its arguments but the C code
>> writer is expected to do so if they will be altered.
>> In S+ (and S), .Call() copies the arguments if altering
>> them would make a user-visible change in the environment,
>> unless you specify that the C code will not be altering them.
>>
>> Bill Dunlap
>> Spotfire, TIBCO Software
>> wdunlap tibco.com
>>
>> > > R has pass-by-value(!) semantics, so semantically your code has
>> > > nothing to do with the result.1 and result.2 variables since only
>> > > their *values* are guaranteed to be passed (possibly a copy).
>> >
>> > Clearly C code called from .Call must be allowed to construct R
>> > objects, as that's how much of R itself is implemented, and further
>> > down, it's what you recommend I should do instead.
>> >
>> > But why does it follow that C code must never modify an object
>> > initially allocated by R code? Are you saying there is some special
>> > magic difference in the state of an object allocated by R's C code
>> > vs. one allocated by R code? If so, what is it?
>> >
>> > What is the potential problem here, that the garbage collector will
>> > suddenly run while my C code is in the middle of writing to an R list?
>> > Yes, if the gc is going to move the object elsewhere, that would be
>> > very bad. But it looks to me like that cannot happen, because lots of
>> > the R implementation itself would fail badly if it did.
>> >
>> > E.g.: The PROTECT call is used to increment reference counts, but I
>> > see no guarantees that it is atomic with the operations that allocate
>> > objects. I see no mutexes or other barriers in C code to prevent the
>> > gc from running, thus implying that it *can't* run until the C
>> > function completes.
>> >
>> > And R is single threaded, of course. But what about signal handlers,
>> > could they ever invoke R's gc?
>> >
>> > Also, I was initially surprised not to find any matrix C APIs, but
>> > grepping for examples (sorry, I don't remember exactly which
>> > functions) showed me that the apparently accepted way to do matrix
>> > operations from C is to simply assume R's column-first dense matrix
>> > order, and access the 2D matrix as a flat 1D vector. (Which is easy.)
>> >
>> > > The fact that internally R attempts to avoid copying for performance
>> > > reasons is the only reason why your code may have appeared to work,
>> > > but it's invalid!
>> >
>> > I will probably change my code to allocate a new list from the C code
>> > and return that, as you recommend. My main reason for doing the
>> > allocation in R was just that it was simpler, especially given the
>> > very limited documentation of R's C API.
>> >
>> > But, I didn't see anything in the "Writing R Extensions" doc saying
>> > that what my code is doing is "invalid", and more importantly, I don't
>> > see why it would or should be invalid...
>> >
>> > I'd still like to better understand why you think doing the initial
>> > allocation of an object in R rather than C code is such a problem. So
>> > far, I don't see any way that the R interpreter could ever tell the
>> > difference.
>> >
>> > Wait, or is the only objection here that I'm using C in a way that
>> > makes pass-by-reference semantics visible to my R code? Which will
>> > work completely correctly, but is not The Proper R Way?
>> >
>> > I don't actually need pass-by-reference behavior here at all, but I
>> > can imagine cases where I might want it, so I'd like to understand
>> > your objections better. Is using C to implement pass-by-reference
>> > actually Broken, or merely Ugly? From my reasons above, I think it
>> > will always work correctly and thus is not Broken. But of course
>> > given R's devotion to pass-by-value, it could be considered
>> > unacceptably Ugly.
>> >
>> > --
>> > Andrew Piskorski <a...@piskorski.com>
>> > http://www.piskorski.com/
>> >
>> > ______________________________________________
>> > R-devel@r-project.org mailing list
>> > https://stat.ethz.ch/mailman/listinfo/r-devel
>
> _______________________________________________
> Rcpp-devel mailing list
> rcpp-de...@lists.r-forge.r-project.org
> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel