On Fri, May 17, 2013 at 4:47 AM, Jakub Jelinek <ja...@redhat.com> wrote:
> On Fri, May 17, 2013 at 10:32:11AM +0200, Richard Biener wrote:
>> On Fri, May 17, 2013 at 4:40 AM, Bill Schmidt
>> <wschm...@linux.vnet.ibm.com> wrote:
>> > This removes two degradations in CPU2006 for 32-bit PowerPC due to lost
>> > vectorization opportunities.  Previously, GCC treated malloc'd arrays as
>> > only guaranteeing 4-byte alignment, even though the glibc implementation
>> > guarantees 8-byte alignment.  This raises the guarantee to 8 bytes,
>> > which is sufficient to permit the missed vectorization opportunities.
>> >
>> > The guarantee for 64-bit PowerPC should be raised to 16-byte alignment,
>> > but doing so currently exposes a latent bug that degrades a 64-bit
>> > benchmark.  I have therefore not included that change at this time, but
>> > added a FIXME recording the information.
>> >
>> > Bootstrapped and tested on powerpc64-unknown-linux-gnu with no new
>> > regressions.  Verified that SPEC CPU2006 degradations are fixed with no
>> > new degradations.  Ok for trunk?  Also, do you want any backports?
>>
>> You say this is because glibc guarantees such alignment - shouldn't this
>> be guarded by a proper check then?  Probably AIX guarantees similar
>> alignment but there are also embedded elf/newlib targets, no?
>>
>> Btw, the #define should possibly move to config/linux.h guarded by
>> OPTION_GLIBC?
>
> For that location it shouldn't be 64, but 2 * POINTER_SIZE, which is what
> glibc really guarantees on most architectures (I think x32 provides more
> than that, but it can override).  But changing it requires some analysis
> on commonly used malloc alternatives on Linux, what exactly do they
> guarantee.  Say valgrind, ElectricFence, jemalloc, tcmalloc, libasan.
> Note that on most targets there is some standard type that requires
> such alignment, thus at least 2 * POINTER_SIZE alignment is also a C
> conformance matter.

Jakub,

As Bill wrote earlier, 2 * POINTER_SIZE would mean 16-byte alignment on
PPC64, and raising the guarantee that far causes a different performance
regression.  8 bytes is a work-around for now.

- David
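
For anyone who wants to see what a particular malloc actually hands back,
here is a minimal sketch in C.  It is not part of the patch, and both helper
names (address_alignment, malloc_meets_alignment) are made up for
illustration.  It only observes a sample of allocations at run time, so a
passing result is evidence about one allocator on one system, not a
guarantee the compiler can rely on.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: largest power of two dividing addr, i.e. the
   alignment the address actually has.  Returns 0 for a zero address. */
static size_t address_alignment(uintptr_t addr)
{
    /* addr & -addr isolates the lowest set bit. */
    return (size_t)(addr & (~addr + 1));
}

/* Hypothetical helper: allocate every size from 1 to max_size bytes and
   return 1 if each pointer is aligned to at least `required` bytes,
   0 otherwise (including on allocation failure). */
static int malloc_meets_alignment(size_t max_size, size_t required)
{
    int ok = 1;
    for (size_t size = 1; size <= max_size; size++) {
        void *p = malloc(size);
        if (p == NULL)
            return 0;
        if (address_alignment((uintptr_t)p) < required)
            ok = 0;
        free(p);
    }
    return ok;
}
```

Calling malloc_meets_alignment(256, 2 * sizeof(void *)) probes the
2 * POINTER_SIZE figure discussed above for whichever malloc is linked in;
the answer may differ under valgrind, jemalloc, tcmalloc and the like.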