Michael Neuling wrote:
Philippe Bergheaud <fe...@linux.vnet.ibm.com> wrote:


Unaligned stores take alignment exceptions on POWER7 running in little-endian mode.
This is a dumb little-endian base memcpy that prevents unaligned stores.
It is replaced by the VMX memcpy at boot.
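
For illustration only, a minimal C sketch of a copy loop that never issues an unaligned store, because it only stores single bytes. It is modeled on the style of a generic byte-wise memcpy and is not the assembly from the patch; the name bytewise_memcpy is hypothetical.

```c
#include <stddef.h>

/* Hypothetical helper, not the patch itself: copy one byte at a time. */
void *bytewise_memcpy(void *dest, const void *src, size_t count)
{
    unsigned char *d = dest;
    const unsigned char *s = src;

    /* As written, every store is one byte wide, so it cannot be misaligned. */
    while (count--)
        *d++ = *s++;

    return dest;
}
```

The price is one load and one store per byte, which is why such a routine only makes sense as a fallback.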


Is this any faster than the generic version?

The little-endian assembly code of the base memcpy is similar to the code
emitted by gcc when compiling the generic memcpy in lib/string.c, and runs at
the same speed.
However, the base memcpy has to be written in little-endian assembly rather
than in C, so that it can use the self-modifying code instrumentation system.
Once the CPU feature CPU_FTR_ALTIVEC is detected at boot, the slow base memcpy
is nop'ed out, and the fast memcpy_power7 is used instead.
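
As a rough user-space analogy of the boot-time switch described above, the sketch below starts with a slow byte-wise copy and swaps in a faster routine once a feature flag is known. It uses a function pointer, whereas the kernel patches instructions in place, so this is a substitute technique for illustration. All names (base_memcpy, fast_memcpy, select_memcpy, my_memcpy) are hypothetical, and fast_memcpy simply defers to the C library.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

typedef void *(*memcpy_fn)(void *, const void *, size_t);

/* Slow fallback: byte stores only. */
static void *base_memcpy(void *dest, const void *src, size_t n)
{
    unsigned char *d = dest;
    const unsigned char *s = src;

    while (n--)
        *d++ = *s++;
    return dest;
}

/* Stand-in for an optimized routine; here it just calls libc memcpy. */
static void *fast_memcpy(void *dest, const void *src, size_t n)
{
    return memcpy(dest, src, n);
}

/* Active implementation; starts out as the safe fallback. */
static memcpy_fn active_memcpy = base_memcpy;

/* Hypothetical boot hook: call once after CPU features are probed. */
void select_memcpy(bool has_vector_unit)
{
    if (has_vector_unit)
        active_memcpy = fast_memcpy;
}

/* Public entry point: dispatches to whichever routine is active. */
void *my_memcpy(void *dest, const void *src, size_t n)
{
    return active_memcpy(dest, src, n);
}
```

Callers use my_memcpy throughout; before select_memcpy(true) runs they get the byte-wise copy, afterwards the fast one. Patching the code in place, as the kernel does, avoids the indirect call on every copy.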

Philippe

_______________________________________________
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev
