On Mon, Jan 17, 2011 at 2:08 PM, Spencer Graves <
spencer.gra...@structuremonitoring.com> wrote:

>      Another point I have not yet seen mentioned:  If your code is
> painfully slow, that can often be fixed without leaving R by experimenting
> with different ways of doing the same thing -- often after profiling your
> code to find the slowest part, as described in chapter 3 of "Writing R
> Extensions".
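>
>      For example, a minimal profiling sketch (Rprof() and summaryRprof()
> are in base R; fit_model() here is a hypothetical stand-in for the slow
> code):
>
>          Rprof("prof.out")             # start collecting timing samples
>          result <- fit_model(x, y)     # run the code you want to speed up
>          Rprof(NULL)                   # stop profiling
>          summaryRprof("prof.out")      # see which functions dominate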
>
>
>      If I'm given code already written in C (or some other language),
> unless it's really simple, I may link to it rather than recode it in R.
> However, the problems with portability, maintainability, transparency to
> others who may not be very facile with C, etc., all suggest that it's well
> worth some effort experimenting with alternate ways of doing the same
> thing in R before jumping to C or something else.
>

>      Hope this helps.
>      Spencer
>
>
>
> On 1/17/2011 10:57 AM, David Henderson wrote:
>
>> I think we're also forgetting something, namely testing.  If you write
>> your routine in C, you have placed an additional burden upon yourself to
>> test your C code through unit tests, etc.  If you write your code in R,
>> you still need unit tests, but you can rely on the well-tested nature of
>> R itself to reduce the number of tests of your algorithm.  I routinely
>> tell people at Sage Bionetworks, where I am working now, that new C code
>> needs to deliver at least an order-of-magnitude performance increase to
>> warrant the effort of moving from R to C.
>>
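>> As a minimal sketch of the kind of unit test I mean, in base R (my_mean()
>> is a hypothetical routine being checked against R's own well-tested
>> mean()):
>>
>>     x <- c(1, 5, NA, 6)
>>     stopifnot(isTRUE(all.equal(my_mean(x, na.rm = TRUE),
>>                                mean(x, na.rm = TRUE))))
>>     stopifnot(is.nan(my_mean(numeric(0))))  # edge case: empty input
>>
>> A C version of the same routine needs these tests and more: NA handling,
>> memory protection, and overflow all have to be checked at the C level too.
>>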
>> But, then again, I am working with scientists who are not primarily, or
>> even secondarily, coders...
>>
>> Dave H
>>
>>
This makes sense, but I have seen some very transparent algorithms turned
into vectorized R code that is difficult to read (and thus to maintain or
to change). These chunks of optimized R code are like embedded assembly, in
the sense that nobody is likely to want to mess with them. This could be
addressed by including pseudo code for the original (more transparent)
algorithm as a comment, but I have never seen this done in practice
(perhaps it could be enforced by R CMD check?!).
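As an example of what I mean (a sketch; the variable names are mine), here
is last observation carried forward written as a vectorized one-liner:

    # pseudo code of the transparent algorithm:
    #   for each i: if x[i] is NA, replace it with the last non-NA value
    idx <- cumsum(!is.na(x))
    x_filled <- c(NA, x[!is.na(x)])[idx + 1]

The vectorized version is fast, but without the pseudo-code comment few
readers would guess what the indexing is doing.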

On the other hand, in principle a well-documented piece of C/C++ code could
be much easier to understand,
without paying a performance penalty...but "coders" are not likely to place
this high on their
list of priorities.

The bottom line is that R is an adaptor ("glue") language, like Lisp, that
makes it easy to mix and match functions (using classes and generic
functions), many of which are written in C (or C++ or Fortran) for
performance reasons. Like any object-based system, there can be a lot of
object copying, and like any functional programming system, there can be a
lot of function calls, resulting in poor performance for some applications.
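For example, two small illustrations of that overhead (timed with base R's
system.time(); exact numbers will vary by machine):

    x <- runif(1e6)
    system.time(y1 <- sapply(x, function(v) 2 * v + 1))  # one R call per element
    system.time(y2 <- 2 * x + 1)                         # one vectorized pass

    out <- numeric(0)
    system.time(for (i in 1:1e5) out <- c(out, i))       # copies 'out' every iteration

Both slow variants spend their time on call and copy overhead, not on the
arithmetic itself.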

If you can vectorize your R code then you have effectively found a way to
benefit from
somebody else's C code, thus saving yourself some time. For operations other
than pure
vector calculations you will have to do the C/C++ programming yourself (or
call a library
that somebody else has written).
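For instance, rowSums() is effectively somebody else's C loop (an
illustration; timings will vary):

    m <- matrix(runif(1e6), nrow = 1000)
    system.time(apply(m, 1, sum))   # R-level looping over the rows
    system.time(rowSums(m))         # a single call into compiled code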

Dominick



>
>>
>> ----- Original Message ----
>> From: Dirk Eddelbuettel<e...@debian.org>
>> To: Patrick Leyshock<ngkbr...@gmail.com>
>> Cc: r-devel@r-project.org
>> Sent: Mon, January 17, 2011 10:13:36 AM
>> Subject: Re: [Rd] R vs. C
>>
>>
>> On 17 January 2011 at 09:13, Patrick Leyshock wrote:
>> | A question, please, about development of R packages:
>> |
>> | Are there any guidelines or best practices for deciding when and why
>> | to implement an operation in R, vs. implementing it in C?  The "Writing
>> | R Extensions" manual recommends "working in interpreted R code . . .
>> | this is normally the best option."  But we do write C functions and
>> | access them in R - the question is, when/why is this justified, and
>> | when/why is it NOT justified?
>> |
>> | While I have identified helpful documents on R coding standards, I
>> | have not seen notes/discussions on when/why to implement in R, vs.
>> | when to implement in C.
>>
>> The (still fairly recent) book 'Software for Data Analysis: Programming
>> with R' by John Chambers (Springer, 2008) has a lot to say about this.
>> John also gave a talk in November which stressed 'multilanguage'
>> approaches; see e.g.
>>
>> http://blog.revolutionanalytics.com/2010/11/john-chambers-on-r-and-multilingualism.html
>>
>> In short, it all depends, and it is unlikely that you will get a coherent
>> answer that is valid for all circumstances.  We all love R for how
>> expressive and powerful it is, yet there are times when something else is
>> called for.  Exactly when that time is depends on a great many things,
>> and you have not mentioned a single metric in your question.  So I'd
>> start with John's book.
>>
>> Hope this helps, Dirk
>>
>
