On Apr 21, 2010, at 4:39 PM, Simon Urbanek wrote:

> 
> On Apr 21, 2010, at 4:13 PM, Romain Francois wrote:
> 
>> Le 21/04/10 21:39, Simon Urbanek a écrit :
>>> 
>>> 
>>> On Apr 21, 2010, at 3:32 PM, Romain Francois wrote:
>>> 
>>>> Le 21/04/10 17:54, Matthew Dowle a écrit :
>>>>> 
>>>>>> From copyVector in duplicate.c :
>>>>> 
>>>>> void copyVector(SEXP s, SEXP t)
>>>>> {
>>>>>    int i, ns, nt;
>>>>>    nt = LENGTH(t);
>>>>>    ns = LENGTH(s);
>>>>>    switch (TYPEOF(s)) {
>>>>> ...
>>>>>    case INTSXP:
>>>>>    for (i = 0; i<   ns; i++)
>>>>>        INTEGER(s)[i] = INTEGER(t)[i % nt];
>>>>>    break;
>>>>> ...
>>>>> 
>>>>> could that be replaced with :
>>>>> 
>>>>>    case INTSXP:
>>>>>    for (i=0; i<ns/nt; i++)
>>>>>        memcpy((char *)DATAPTR(s)+i*nt*sizeof(int), (char *)DATAPTR(t),
>>>>> nt*sizeof(int));
>>>>>    break;
>>>> 
>>>> or at least with something like this:
>>>> 
>>>> int* p_s = INTEGER(s) ;
>>>> int* p_t = INTEGER(t) ;
>>>> for( i=0 ; i<  ns ; i++){
>>>>    p_s[i] = p_t[i % nt];
>>>> }
>>>> 
>>>> since expanding the INTEGER macro over and over has a price.
>>>> 
>>> 
>>> ... in packages, yes, but not in core R.
>> 
>> Hmm. I was not talking about the overhead of the INTEGER function:
>> 
>> int *(INTEGER)(SEXP x) {
>>   if(TYPEOF(x) != INTSXP && TYPEOF(x) != LGLSXP)
>>      error("%s() can only be applied to a '%s', not a '%s'",
>>            "INTEGER", "integer", type2char(TYPEOF(x)));
>>   return INTEGER(x);
>> }
>> 
>> 
>> 
>> but the one related to the macro.
>> 
>> #define INTEGER(x)   ((int *) DATAPTR(x))
>> #define DATAPTR(x)   (((SEXPREC_ALIGN *) (x)) + 1)
>> 
>> so the loop expands to :
>> 
>> for (i = 0; i<   ns; i++)
>>         ((int *) (((SEXPREC_ALIGN *) (s)) + 1))[i] = ((int *) 
>> (((SEXPREC_ALIGN *) (t)) + 1))[i % nt];
>> 
>> I still believe grabbing the pointer just once for s and once for t is more 
>> efficient ...
>> 
> 
> Nope, since everything involved is loop invariant so the pointer values don't 
> change (you'd have to declare s or t volatile to prevent that).
> 
> Try using gcc -s

Sorry, I meant gcc -S of course.


> and you'll see that the code is the same (depending on the version of gcc the 
> order of the first comparison can change so technically the INTEGER(x) 
> version can save one add instruction in the degenerate case and be faster(!) 
> in old gcc

[the reason being that INTEGER(x) does not need to be evaluated if the loop is 
not entered whereas p_s and p_t are populated unconditionally - smarter 
compilers will notice that p_s/p_t are not used in that case and thus generate 
identical code in both cases]

Cheers,
Simon


> 
> 
>>> 
>>>>> and similar for the other types in copyVector.  This won't help regular
>>>>> vector copies, since those seem to be done by the DUPLICATE_ATOMIC_VECTOR
>>>>> macro, see next suggestion below, but it should help copyMatrix which 
>>>>> calls
>>>>> copyVector, scan.c which calls copyVector on three lines, dcf.c (once) and
>>>>> dounzip.c (once).
>>>>> 
>>>>> For the DUPLICATE_ATOMIC_VECTOR macro there is already a comment next to 
>>>>> it
>>>>> :
>>>>> 
>>>>>    <FIXME>: surely memcpy would be faster here?
>>>>> 
>>>>> which seems to refer to the for loop  :
>>>>> 
>>>>>    else { \
>>>>>    int __i__; \
>>>>>    type *__fp__ = fun(from), *__tp__ = fun(to); \
>>>>>    for (__i__ = 0; __i__<   __n__; __i__++) \
>>>>>      __tp__[__i__] = __fp__[__i__]; \
>>>>>  } \
>>>>> 
>>>>> Could that loop be replaced by the following ?
>>>>> 
>>>>>   else { \
>>>>>   memcpy((char *)DATAPTR(to), (char *)DATAPTR(from), __n__*sizeof(type)); 
>>>>> \
>>>>>   }\
>>>>> 
>>>>> In the data.table package, dogroups.c uses this technique, so the 
>>>>> principle
>>>>> is tested and works well so far.
>>>>> 
>>>>> Are there any road blocks preventing this change, or is anyone already
>>>>> working on it ?  If not then I'll try and test it (on Ubuntu 32bit) and
>>>>> submit patch with timings, as before.  Comments/pointers much appreciated.
>>>>> 
>>>>> Matthew
>> 
>> 
>> -- 
>> Romain Francois
>> Professional R Enthusiast
>> +33(0) 6 28 91 30 30
>> http://romainfrancois.blog.free.fr
>> |- http://bit.ly/9aKDM9 : embed images in Rd documents
>> |- http://tr.im/OIXN : raster images and RImageJ
>> |- http://tr.im/OcQe : Rcpp 0.7.7
>> 
>> 
>> 
> 
> ______________________________________________
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
> 
> 

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Reply via email to