On Sunday 20 March 2005 15:17, Adrian Bunk wrote:
> Hi Denis,
>
> what do your benchmarks say about replacing the whole assembler code
> with a
>
> #define __memcpy __builtin_memcpy
It generates a call to the out-of-line memcpy()
if the count is non-constant.
# cat t.c
extern char *a, *b;
extern int n
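
The t.c listing above is cut off, so here is a minimal sketch of the kind of test that would show the difference. It is not the original file: the function names copy_const, copy_var and check_copies are hypothetical, and the generated code depends on the gcc version, optimization flags and target. Compiling with something like `gcc -O2 -S` and reading the assembly should show whether copy_var() ends up as a call to memcpy().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Constant count: the compiler is free to expand this inline. */
void copy_const(char *dst, const char *src)
{
    __builtin_memcpy(dst, src, 16);
}

/* Non-constant count: the case described in the reply, where gcc
 * is reported to emit a call to the out-of-line memcpy(). */
void copy_var(char *dst, const char *src, size_t n)
{
    __builtin_memcpy(dst, src, n);
}

/* Hypothetical sanity check: both variants must copy the same bytes.
 * This checks behaviour only; it says nothing about code generation. */
void check_copies(void)
{
    const char src[16] = "0123456789abcde"; /* 15 characters plus NUL */
    char a[16] = {0};
    char b[16] = {0};

    copy_const(a, src);
    copy_var(b, src, sizeof src);

    assert(memcmp(a, src, sizeof src) == 0);
    assert(memcmp(b, src, sizeof src) == 0);
}
```

Both functions behave identically at run time; the only point of interest is the assembly the compiler produces for each.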
Hi Denis,
what do your benchmarks say about replacing the whole assembler code
with a
#define __memcpy __builtin_memcpy
?
cu
Adrian
--
"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a prom
On Friday 18 March 2005 11:21, Denis Vlasenko wrote:
> This memcpy() is 2 bytes shorter than the one currently in mainline
> and it has one branch less. It is also 3-4% faster in microbenchmarks
> on small blocks if the block size is a multiple of 4. Mainline is slower
> because it has to branch twice per m
This memcpy() is 2 bytes shorter than the one currently in mainline
and it has one branch less. It is also 3-4% faster in microbenchmarks
on small blocks if the block size is a multiple of 4. Mainline is slower
because it has to branch twice per memcpy, both mispredicted
(but branch prediction hides that in