On Mon, Mar 16, 2009 at 8:43 AM, bearophile <bearophileh...@lycos.com> wrote:
> Don:
>>Which means that memcpy probably isn't anywhere near optimal, either.<
>
> Time ago I have read an article written by AMD that shows that indeed with
> modern CPUs there are ways to go much faster, using vector asm instructions,
> loop unrolling and explicit cache prefetching

I'm actually kind of shocked that, given the prevalence of memory block copy operations, more CPUs haven't implemented it as a basic instruction. Yes, RISC is nice, but geez, this seems like a no-brainer.
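
For anyone wondering what the unrolling and prefetching tricks quoted above look like in practice, here is a rough C sketch. It is not taken from the AMD article and it is not a tuned memcpy: the name copy_unrolled, the 32-byte unroll factor and the 256-byte prefetch distance are all made up for illustration, and the prefetch hint assumes a GCC-compatible compiler. It also skips the vector instructions entirely and sticks to plain 8-byte moves.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper for illustration only; the regions must not overlap. */
void *copy_unrolled(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    /* Unrolled main loop: 32 bytes per iteration as four 8-byte moves. */
    while (n >= 32) {
#if defined(__GNUC__)
        /* Prefetch hint, issued only while the target address is still
           inside the source buffer. The 256-byte distance is a guess. */
        if (n > 256)
            __builtin_prefetch(s + 256, 0, 0);
#endif
        uint64_t w0, w1, w2, w3;
        /* Fixed-size memcpy is the portable way to express an unaligned
           8-byte load or store; optimizing compilers typically turn each
           call into a single move instruction. */
        memcpy(&w0, s,      8);
        memcpy(&w1, s + 8,  8);
        memcpy(&w2, s + 16, 8);
        memcpy(&w3, s + 24, 8);
        memcpy(d,      &w0, 8);
        memcpy(d + 8,  &w1, 8);
        memcpy(d + 16, &w2, 8);
        memcpy(d + 24, &w3, 8);
        s += 32;
        d += 32;
        n -= 32;
    }

    /* Tail: copy the remaining 0..31 bytes one at a time. */
    while (n > 0) {
        *d++ = *s++;
        n--;
    }
    return dst;
}
```

Whether something like this beats the library memcpy depends heavily on the CPU, the compiler and the block size, so measure before believing it.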