On 29.12.2011 16:07, Vladimir Panteleev wrote:
On Thursday, 29 December 2011 at 14:44:45 UTC, Don wrote:
http://www.danielvik.com/2010/02/fast-memcpy-in-c.html . It doesn't even
use inline assembler or compiler intrinsics.

Note that the memcpy described there is _far_ from optimal. Memcpy is
all about cache efficiency. DMD translates memcpy to the single
instruction "rep movsd" which you'd think would be optimal, but you
can actually beat it by a factor of four or more for long lengths.

I've never seen DMD emit rep movsd. Does rep movsd even make sense when
the memory areas do not have the same alignment? memcpy in snn.lib has a
rep movsd instruction, but there's lots of other code (including what
looks like Duff's device).
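
To make the alignment question concrete, here is a minimal sketch in C of why a library memcpy tends to carry extra code around its word-copy loop. It is not the snn.lib or DMD code; the name copy_words and the 4-byte word size are assumptions chosen for illustration. The word loop is used only when source and destination share the same offset within a 4-byte word; otherwise the sketch falls back to a plain byte loop.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper for illustration; not the snn.lib or DMD implementation. */
void *copy_words(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    /* Both pointers can only become word-aligned together if they
       start at the same offset within a 4-byte word. */
    if (((uintptr_t)d & 3) == ((uintptr_t)s & 3)) {
        /* Head: copy single bytes until the pointers are 4-byte aligned. */
        while (n > 0 && ((uintptr_t)d & 3) != 0) {
            *d++ = *s++;
            n--;
        }
        /* Body: copy one 32-bit word per step, the part a
           "rep movsd" would cover. */
        while (n >= 4) {
            uint32_t w;
            memcpy(&w, s, sizeof w);   /* load without aliasing problems */
            memcpy(d, &w, sizeof w);   /* store */
            d += 4;
            s += 4;
            n -= 4;
        }
    }

    /* Tail bytes, or the whole buffer when the alignments differ. */
    while (n > 0) {
        *d++ = *s++;
        n--;
    }
    return dst;
}
```

A tuned implementation would handle the mismatched case with something better than a byte loop; the sketch only shows where the head and tail code comes from.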

It's in the backend in cod2.c, line 3260. But on closer inspection --
you're right! It's in an if(0 && ...) block.
So DMD never emits rep movsd there, even when everything's aligned.
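
For anyone unfamiliar with the idiom, a small C sketch of why such a block is dead code. The names operands_suitable, choose_path and predicate_calls are hypothetical and are not taken from cod2.c. Because && short-circuits, the constant 0 on the left means the right-hand condition is never evaluated and the body is never reached.

```c
#include <stdbool.h>

static int calls = 0;

/* Hypothetical stand-in for whatever condition the real code tests. */
static bool operands_suitable(void)
{
    calls++;
    return true;
}

/* Returns 1 if the disabled fast path was taken, 0 otherwise. */
int choose_path(void)
{
    /* 0 && x is always false and x is never evaluated. */
    if (0 && operands_suitable()) {
        return 1;   /* unreachable */
    }
    return 0;       /* the general path is always used */
}

/* Reports how often the condition was actually evaluated. */
int predicate_calls(void)
{
    return calls;
}
```

Calling choose_path() returns 0 and leaves the call counter at zero, which matches the behaviour described above: the special case is compiled out regardless of its operands.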

There's a _huge_ potential for improvement in that function.
