Jan Hubicka <hubi...@ucw.cz> writes:

> Note that I think Core has similar characteristics - at least for string
> operations it fares well with unaligned accesses.

Nehalem and later have very fast unaligned vector loads. There's still some
penalty when they cross cache lines, however.

IIRC the rule of thumb is to use unaligned accesses for 128-bit vectors
but avoid them for 256-bit vectors, because the cache line crossing
penalty is larger on Sandy Bridge and crossings are more likely with the
larger vectors.
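
To put a rough number on "more likely", here is a minimal sketch that
counts how many starting offsets within a cache line make a load spill
into the next line. It assumes 64-byte cache lines and byte-granular,
uniformly distributed addresses; the names crosses_cache_line and
crossing_offsets are made up for illustration and say nothing about the
size of the penalty itself.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: 64-byte cache lines. */
#define CACHE_LINE_BYTES 64u

/* Returns 1 if a load of `width` bytes starting at `addr` touches two
   cache lines, 0 if it fits inside one. */
static int crosses_cache_line(uintptr_t addr, size_t width)
{
    return (addr % CACHE_LINE_BYTES) + width > CACHE_LINE_BYTES;
}

/* Counts how many of the 64 possible byte offsets within a line cause a
   `width`-byte load to cross into the next line. */
static unsigned crossing_offsets(size_t width)
{
    unsigned count = 0;
    for (uintptr_t off = 0; off < CACHE_LINE_BYTES; off++)
        count += (unsigned)crosses_cache_line(off, width);
    return count;
}
```

Under those assumptions crossing_offsets(16) is 15 and
crossing_offsets(32) is 31, so a 256-bit load crosses a line about twice
as often as a 128-bit one. Real data is usually at least element-aligned,
which changes the exact fractions but not the direction.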

-Andi
 
-- 
a...@linux.intel.com -- Speaking for myself only
