Hello,

I'm arriving after the battle here.

Let's clear out some misconceptions. The copy constructor of NumericVector is very cheap, it just copies a pointer to the underlying R object, i.e. the data is not copied. This is the way Rcpp classes are implemented, they encapsulate an R object and provide nicer syntax, but they work with the data of the R object.

So when you call e.g.

double get_var(NumericVector v,double m,int l)

which triggers a copy of v, this is cheap, you don't have to worry about it. I would still recommend to use reference semantics, like others have pointers and delcare your function as:

double get_var(const NumericVector& v,double m,int l)

Learning about references and const references is very useful in C++.



Now, the data you first pass in from R is an integer vector:

> a=1:1000000
> typeof( a )
[1] "integer"

So creating a NumericVector from it will be expensive because a coercion from an integer vector to a numeric vector is needed.

Consider this code:

#include <Rcpp/Benchmark/Timer.h>

// [[Rcpp::export]]
SEXP construct(SEXP x, int n){
    Timer timer ;
    timer.step("start");
    for(int i=0; i<n; i++){
        NumericVector vec(x) ;
    }
    timer.step("NumericVector ctor") ;
    return timer ;
}

This uses our internal Timer, i.e.. to time things at the C++ level to measure how many nanoseconds are needed for n constructions of a NumericVector from an R object.

Now things are very different if I feed in an integer vector or a numeric vector:

> a <- 1:1e+06

> b <- seq(1, 1e+06, by = 1)

> typeof(a)
[1] "integer"

> typeof(b)
[1] "double"

> n <- 1000

> construct(a, n)[-1]/n
NumericVector ctor
           4128426

> construct(b, n)[-1]/n
NumericVector ctor
            36.767


On average it took 4 128 426 ns (4ms) to construct s, this is because a copy is made.

but it takes 36 ns to create the vector from a numeric vector. not free, but very cheap. This only includes copying a pointer, and doing the business of protection of the underlying object from R's garbage collection.


Now JJ's timings of callRcppVar:

// [[Rcpp::export]]
void callRcppVar(const NumericVector& v) {
   for (int i=0; i<100; i++)
      rcppVar(v);
}

This takes about the same time it takes to construct the numeric vector, so I suspect there is some compiler magic taking place, i.e. since the returned value of rcppVar is just ignored, I'm guessing the compiler just does not bother. Let's split things :

// [[Rcpp::export]]
double returnDouble() {
   return 1.0 ;
}

> system.time(returnDouble())
   user  system elapsed
      0       0       0

The price of returning a double is none, or very small. I'm just trying to evaluate the cost of the invocation of whatever sourceCpp makes since it does very nice work of marsjalling inputs and outputs.

Now, let's do the same, but taking the NumericVector as before:

// [[Rcpp::export]]
double returnDouble(const NumericVector& v) {
   return 1.0 ;
}

> system.time(returnDouble(a))
   user  system elapsed
  0.003   0.000   0.003

Those 3 ms are the time it takes to construct the NumericVector from an integer vector.

Now,

> system.time(callRcppVar(a))
   user  system elapsed
  0.003   0.000   0.002

I believe nothing is done here. Here is a version that won't be optimized away by the compiler because we need the value:

// [[Rcpp::export]]
double callRcppVar2(const NumericVector& v) {
   double x = 0.0 ;
   for (int i=0; i<100; i++)
      x += rcppVar(v);
   return x ;
}

> system.time(callRcppVar2(a))
   user  system elapsed
  0.198   0.000   0.198

That's more like it. And we are not paying the price of +=

// [[Rcpp::export]]
double callRcppVar3(const NumericVector& v) {
   double x = 0.0 ;
   for (int i=0; i<100; i++)
      x += 2.0;
   return x ;
}

> system.time(callRcppVar3(a))
   user  system elapsed
  0.003   0.000   0.003


Sorry for the long-ish email, I hope this clarifies things.

Romain

PS: Alon. You are welcome to use stack overflow or here, but if you use stack overflow, please play the game as it is supposed to. You've asked 5 questions there, got some really valid answers but you never bothered to accept any of them. You are not likely to get more answers if you don't play the game correctly. Please read the faq. http://stackoverflow.com/faq

stack overflow is really good, but it is only useful if people use it appropriately.


Le 20/12/12 18:35, Alon Honig a écrit :
My current RCPP program is only 4 times faster than its R byte code
compiled equivalent. I am trying to speed up my code and from what I
understand using pointers prevents an external function from copying the
entire object when executing its methods.   I am having difficulty
finding salient examples that use points and NumericVectors. Please tell
me how in this same code I could get the functions "get_sum" and
"get_var" to point to the NumericVector object "v" rather than copy the
whole thing.

    library(inline)
    library(Rcpp)
      a=1:1000000
    Rcpp.var = cxxfunction(signature(input="numeric"), plugin="Rcpp",
    body="

          NumericVector v = input;

          int n = v.size();

          double v_mean =  get_sum(v,n)/n;

          double v_var = get_var(v,v_mean,n);

          return wrap(v_var);

        ",includes="

          double get_var(NumericVector v,double m,int l)

          {double a = 0;

              for (int i = 0; i <l;i++)

        {a += (v[i]-m)*(v[i]-m);}

        return(a/l);

          }


          double get_sum(NumericVector v,int l)

          { double s = 0;

            for (int i = 0; i <l;i++)

        {s += v[i];}

        return(s);

          }

      ")

      b=system.time(for (i in 1:100)Rcpp.var (a))
      c= system.time(for (i in 1:100)var (a))



Thank you Alon.

P.S. I am aware that the "get_var" function provides the population
variance and not the sample variance.


--
Romain Francois
Professional R Enthusiast
+33(0) 6 28 91 30 30

R Graph Gallery: http://gallery.r-enthusiasts.com

blog:            http://romainfrancois.blog.free.fr
|- http://bit.ly/RE6sYH : OOP with Rcpp modules
`- http://bit.ly/Thw7IK : Rcpp modules more flexible

_______________________________________________
Rcpp-devel mailing list
Rcpp-devel@lists.r-forge.r-project.org
https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel

Reply via email to