Agner Fog wrote:
> Basile STARYNKEVITCH wrote:
>> At last, at the recent (July 2008) GCC summit, someone (sorry, I forgot
>> who, probably someone from SuSE) proposed in a BoF session to have
>> architecture- and machine-specific hand-tuned (or even hand-written
>> assembly) low-level libraries for such basic things as memset etc.
> 
> That's exactly what I meant. The most important memory, string and math
> functions should use hand-tuned assembly with CPU dispatching for the
> latest instruction sets. My experiments show that the speed can be
> improved by a factor of 3 to 10 for unaligned memcpy on Intel processors
> (http://www.agner.org/optimize/optimizing_cpp.pdf page 12).

Is this still true if you have to go through the PLT to make a position-
independent call?  That's the most common case for userspace on GNU/Linux.
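
To make the question concrete, here is a minimal sketch in C of what CPU
dispatching could look like. All names (my_copy, copy_generic, copy_sse2,
pick_copy) are hypothetical, the "tuned" variant is only a placeholder that
forwards to memcpy, and the feature test assumes a compiler that provides
__builtin_cpu_supports on x86. It is not thread-safe and is not how any
existing library does it; it only shows that the dispatch costs one indirect
call per invocation, which comes on top of whatever the PLT already adds.

```c
#include <stddef.h>
#include <string.h>

typedef void *(*copy_fn)(void *, const void *, size_t);

/* Hypothetical portable fallback: a plain byte loop. */
static void *copy_generic(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    while (n--)
        *d++ = *s++;
    return dst;
}

/* Placeholder for a hand-tuned SSE2 routine (assumption: a real one
   would be written in assembly; here it just forwards to memcpy). */
static void *copy_sse2(void *dst, const void *src, size_t n)
{
    return memcpy(dst, src, n);
}

static void *copy_resolve(void *dst, const void *src, size_t n);

/* Starts out pointing at the resolver; overwritten on the first call. */
static copy_fn copy_impl = copy_resolve;

/* Choose an implementation from the CPU features seen at run time. */
static copy_fn pick_copy(void)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    if (__builtin_cpu_supports("sse2"))
        return copy_sse2;
#endif
    return copy_generic;
}

/* First call lands here: resolve once, then forward the call. */
static void *copy_resolve(void *dst, const void *src, size_t n)
{
    copy_impl = pick_copy();   /* not thread-safe in this sketch */
    return copy_impl(dst, src, n);
}

/* Public entry point: one indirect call through copy_impl. */
void *my_copy(void *dst, const void *src, size_t n)
{
    return copy_impl(dst, src, n);
}
```

If my_copy lives in a shared library, a caller in another module reaches it
through the PLT first and then takes the indirect call through copy_impl, so
for small copies the two indirections are the overhead to measure.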

Andrew.
