Don't know how useful it is any more, but back in the day, I gave this talk in Vienna:
http://www.ci.tuwien.ac.at/Conferences/useR-2004/Keynotes/Dalgaard.pdf

Looking at it now, perhaps it moves a little too quickly into the hairy
stuff. On the other hand, those were the things that I had found important
to figure out at the time. At a quick glance, I didn't spot anything
obviously outdated.

On Mar 22, 2012, at 16:15, Ramon Diaz-Uriarte wrote:

> On Thu, 22 Mar 2012 10:38:55 -0400, Simon Urbanek
> <simon.urba...@r-project.org> wrote:
>
>> On Mar 22, 2012, at 9:45 AM, Terry Therneau <thern...@mayo.edu> wrote:
>
>>>>>> strongly disagree. I'm appalled to see that sentence here.
>>>>>>
>>>>>> Come on!
>>>>>>
>>>>>>>> The overhead is significant for any large vector and it is in
>>>>>>>> particular unnecessary since in .C you have to allocate *and copy*
>>>>>>>> space even for results (twice!). Also it is very error-prone, because
>>>>>>>> you have no information about the length of vectors so it's easy to
>>>>>>>> run out of bounds and there is no way to check. IMHO .C should not be
>>>>>>>> used for any code written in this century (the only exception may be
>>>>>>>> if you are passing no data, e.g. if all you do is to pass a flag and
>>>>>>>> expect no result, you can get away with it even if it is more
>>>>>>>> dangerous). It is a legacy interface that dates way back and is
>>>>>>>> essentially just re-named .Fortran interface. Again, I would strongly
>>>>>>>> recommend the use of .Call in any recent code because it is safer and
>>>>>>>> more efficient (if you don't care about either attribute, well, feel
>>>>>>>> free ;)).
>>>>>>
>>>>>> So aleph will not support the .C interface? ;-)
>>>>>>
>>>> It will look at the timestamp of the source file and delete the package if
>>>> it is not before 1980 ;).
>>>> Otherwise it will send a request for punch cards
>>>> with ".C is deprecated, please upgrade to .Call" stamped out :P At that
>>>> point I'll be flaming about using the native Aleph interface and not the R
>>>> compatibility layer ;)
>>>>
>>>> Cheers,
>>>> S
>>>
>>> I'll dissent -- I don't think .C is inherently any more dangerous than
>>> .Call and prefer its simplicity in many cases. Calling C at all is what
>>> is inherently dangerous -- I can reference beyond the end of a vector,
>>> write over objects that should be read only, and branch to random places
>>> using either interface.
>
>> You can always do so deliberately, but with .C you have no way of preventing
>> it since you don't even know what is the length! That is certainly far more
>> dangerous than .Call where you can simply loop over the length, check that
>> the lengths are compatible etc. Also for types like strings .C is a
>> minefield that is hard to not blow up whereas with .Call it is even safer
>> than scalar arrays. You can do none of that with .C which relies entirely on
>> conventions with no recorded semantics.
>
>>> If you are dealing with large objects and worry about memory efficiency
>>> then .Call puts more tools at your disposal and is worth the effort.
>>> However, I did not find the .Call interface at all easy to use at first
>
>> I guess this depends on the developer and is certainly a
>> factor. Personally, I find the subset of the R API needed for .Call
>> fairly small and intuitive (in particular when you are just writing a
>> safer replacement for .C), but I'm obviously biased. Maybe in a separate
>> thread we could discuss this - I'd be happy to write a ref card or cheat
>> sheet if I find out what people find challenging on .Call. Nonetheless,
>> my point is that it is more than worth investing the effort both in
>> safety and performance.
>
> After your previous email I made a mental note "try to finally learn to
> use .Call since I often deal with large objects".
> So, yes, I'd love to see
> a ref card and cheat sheet: I have tried learning to use .Call a few
> times, but have always gone back to .C since (it seems that) all I needed
> to know are just a couple of conventions, and the rest is "C as usual".
>
> You say "if I find out what people find challenging on
> .Call". Hummm... can I answer "basically everything"? I think Terry
> Therneau says, "the things I needed to know are scattered about in
> multiple places". When I see the convolve example (5.2 in Writing R
> Extensions) I understand the C code; when I see the convolve2 example in
> 5.10.1 I think I can guess what lines "PROTECT(a ..." to "xab =
> NUMERIC_POINTER ..." might be doing, but I would not know how to do that
> on my own. Yes, I can go to 5.9.1 to read about PROTECT, then search for
> ... But, at that point, I've gone back to .C. Of course, this might just
> be my laziness/incompetence/whatever.
>
> Best,
>
> R.
>
>>> and we should keep that in mind before getting too pompous in our lectures
>>> to the "sinners of .C". (Mostly because the things I needed to know are
>>> scattered about in multiple places.)
>>>
>>> I might have to ask for an exemption on that timestamp -- the first bits of
>>> the survival package only reach back to 1986. And I've had to change
>>> source code systems multiple times which plays hob with the file times,
>>> though I did try to preserve the changelog history to forestall some future
>>> litigious soul who claims they wrote it first (sccs -> rcs -> cvs -> svn
>>> -> mercurial).
>>> :-)
>
>> ;) Maybe the rule should be based on the date of the first appearance of the
>> package, fair enough :)
>
>> Cheers,
>> Simon
>
> --
> Ramon Diaz-Uriarte
> Department of Biochemistry, Lab B-25
> Facultad de Medicina
> Universidad Autónoma de Madrid
> Arzobispo Morcillo, 4
> 28029 Madrid
> Spain
>
> Phone: +34-91-497-2412
>
> Email: rdia...@gmail.com
>        ramon.d...@iib.uam.es
>
> http://ligarto.org/rdiaz

--
Peter Dalgaard, Professor
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd....@cbs.dk  Priv: pda...@gmail.com

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
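
For anyone reading the thread without the manual at hand, here is a minimal
sketch of the .C-style convention under discussion, modelled on the convolve
example that Ramon cites from Writing R Extensions. Every argument arrives as
a bare pointer and the lengths travel as separate int pointers, so the C side
has no way to check them against the real allocations, which appears to be
Simon's point about safety. It is an illustration only, not a recommendation
for either interface.

```c
/* .C-style entry point: all arguments are bare pointers, and vector
   lengths are passed separately by the caller as int pointers.
   Nothing in this function can verify that *na and *nb match the
   memory actually allocated for a, b and ab. */
void convolve(double *a, int *na, double *b, int *nb, double *ab)
{
    /* The caller must have allocated at least nab doubles for ab. */
    int nab = *na + *nb - 1;

    /* Clear the result buffer before accumulating into it. */
    for (int i = 0; i < nab; i++)
        ab[i] = 0.0;

    /* Accumulate the products a[i] * b[j] into position i + j. */
    for (int i = 0; i < *na; i++)
        for (int j = 0; j < *nb; j++)
            ab[i + j] += a[i] * b[j];
}
```

From R, once the compiled shared object is loaded, the call would look
roughly like
.C("convolve", as.double(a), as.integer(length(a)), as.double(b),
as.integer(length(b)), ab = double(length(a) + length(b) - 1))$ab.
A wrong length passed there goes undetected on the C side.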