Michael Neuling wrote:
> Philippe Bergheaud <fe...@linux.vnet.ibm.com> wrote:
>> Unaligned stores take alignment exceptions on POWER7 running in
>> little-endian mode.
>> This is a dumb little-endian base memcpy that prevents unaligned stores.
>> It is replaced by the VMX memcpy at boot.
>
> Is this any faster than the generic version?

The little-endian assembly code of the base memcpy is similar to the code
emitted by gcc when compiling the generic memcpy in lib/string.c, and runs at
the same speed.
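
For reference, the generic version being compared against is essentially a byte-at-a-time loop. The following is a minimal sketch of that style of copy, not the patch itself; the name bytewise_memcpy is made up for illustration. Because each store in the source is one byte wide, no store can be misaligned, which is the property the base memcpy needs. An optimizing compiler may still widen the stores, so this shows the idea rather than guaranteed machine code.

```c
#include <stddef.h>

/*
 * Hypothetical illustration: copy one byte at a time.
 * Every store is a single byte, so the alignment of dest
 * and src does not matter.
 */
void *bytewise_memcpy(void *dest, const void *src, size_t count)
{
	unsigned char *d = dest;
	const unsigned char *s = src;

	while (count--)
		*d++ = *s++;

	/* Like memcpy, return the original destination pointer. */
	return dest;
}
```

Calling it with deliberately odd offsets, for example bytewise_memcpy(dst + 1, src + 3, 6), copies the bytes correctly without ever issuing a multi-byte store in the source code.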

However, a little-endian assembly version of the base memcpy is required (as
opposed to a C version) in order to use the self-modifying code
instrumentation system.

After the CPU feature CPU_FTR_ALTIVEC is detected at boot, the slow base memcpy
is nop'ed out, and the fast memcpy_power7 is used instead.

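
The boot-time switch can be pictured with a rough user-space analogy. This is only a sketch under stated assumptions: the kernel rewrites instructions in place, whereas the code below swaps a function pointer, and every name in it (base_memcpy, fast_memcpy, select_memcpy, do_memcpy, using_fast_memcpy) is hypothetical. The "fast" routine simply defers to the C library as a stand-in for memcpy_power7.

```c
#include <stddef.h>
#include <string.h>

typedef void *(*memcpy_fn)(void *, const void *, size_t);

/* Slow path: byte stores only, safe before CPU features are known. */
static void *base_memcpy(void *dest, const void *src, size_t n)
{
	unsigned char *d = dest;
	const unsigned char *s = src;

	while (n--)
		*d++ = *s++;
	return dest;
}

/* Stand-in for the optimized routine; here it just calls libc memcpy. */
static void *fast_memcpy(void *dest, const void *src, size_t n)
{
	return memcpy(dest, src, n);
}

/* Start on the slow path, as the base memcpy does at early boot. */
static memcpy_fn active_memcpy = base_memcpy;

/* Hypothetical "boot-time fixup": switch once the feature is detected. */
void select_memcpy(int cpu_has_altivec)
{
	if (cpu_has_altivec)
		active_memcpy = fast_memcpy;
}

/* All callers go through the currently selected routine. */
void *do_memcpy(void *dest, const void *src, size_t n)
{
	return active_memcpy(dest, src, n);
}

/* Helper for inspection: nonzero once the fast path is in use. */
int using_fast_memcpy(void)
{
	return active_memcpy == fast_memcpy;
}
```

A function pointer costs an indirect call on every copy; patching the code in place, as described above, avoids that cost, which is one reason the real mechanism needs an assembly entry point.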
Philippe