Memory is slow. While the slice fits in cache, memclr is measurably faster.
Once the slice no longer fits in cache, memclr is no longer significantly
faster.

I've heard that adaptive prefetching is turned on after 3 consecutive
accesses to the same cache line in increasing address order. So perhaps the
optimised SSE/AVX zeroing doesn't trigger adaptive prefetch because it makes
fewer memory accesses. It may also vary a lot by CPU model: newer models may
fix adaptive prefetch, so that memclr is great again.
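
For anyone who wants to check this on their own CPU, here is a rough sketch
of the comparison. The names clearIdiom, clearManual and measure are made up
for illustration, and the sizes are guesses meant to land on both sides of a
typical cache hierarchy. It assumes the range-loop form is the one the
compiler lowers to memclr and that the plain indexed loop is not; I have not
verified that for every Go version.

```go
package main

import (
	"fmt"
	"time"
)

// clearIdiom uses the range-loop form that the Go compiler is assumed
// to recognise and replace with a call to the runtime's memclr routine.
func clearIdiom(b []byte) {
	for i := range b {
		b[i] = 0
	}
}

// clearManual zeroes the slice with a plain indexed loop. It is assumed
// here that this form is NOT turned into memclr, so it serves as the
// "many small accesses in increasing address order" case.
func clearManual(b []byte) {
	for i := 0; i < len(b); i++ {
		b[i] = 0
	}
}

// measure returns the average time one call to clear takes on buf.
func measure(clear func([]byte), buf []byte, rounds int) time.Duration {
	start := time.Now()
	for r := 0; r < rounds; r++ {
		clear(buf)
	}
	return time.Since(start) / time.Duration(rounds)
}

func main() {
	// Sizes chosen (as a guess) to span in-cache and out-of-cache cases.
	for _, size := range []int{16 << 10, 256 << 10, 4 << 20, 64 << 20} {
		buf := make([]byte, size)
		// Keep the total number of bytes written roughly constant.
		rounds := (256 << 20) / size
		fmt.Printf("%9d bytes: range loop %v, indexed loop %v\n",
			size,
			measure(clearIdiom, buf, rounds),
			measure(clearManual, buf, rounds))
	}
}
```

Wall-clock timing like this is noisy, so treat the numbers as a hint only;
a proper benchmark with the testing package would be more trustworthy.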
