Re: [PATCH, rs6000] Increase MALLOC_ABI_ALIGNMENT for 32-bit PowerPC
On Fri, 2013-05-17 at 01:11 -0400, David Edelsohn wrote: > On Thu, May 16, 2013 at 10:40 PM, Bill Schmidt > wrote: > > This removes two degradations in CPU2006 for 32-bit PowerPC due to lost > > vectorization opportunities. Previously, GCC treated malloc'd arrays as > > only guaranteeing 4-byte alignment, even though the glibc implementation > > guarantees 8-byte alignment. This raises the guarantee to 8 bytes, > > which is sufficient to permit the missed vectorization opportunities. > > > > The guarantee for 64-bit PowerPC should be raised to 16-byte alignment, > > but doing so currently exposes a latent bug that degrades a 64-bit > > benchmark. I have therefore not included that change at this time, but > > added a FIXME recording the information. > > > > Bootstrapped and tested on powerpc64-unknown-linux-gnu with no new > > regressions. Verified that SPEC CPU2006 degradations are fixed with no > > new degradations. Ok for trunk? Also, do you want any backports? > > Okay. If you think that this is appropriate for 4.8 branch, that is > okay as well. Yes, the degradation appears to have started last summer, so updating 4.8 would be good. > > Is there a FSF GCC Bugzilla open for the 437.leslie3d problem? The > FIXME doesn't need to contain the complete explanation, but it would > be nice to reference a longer explanation and not pretend to be > Fermat's theorem that is too large to write in the marge. There is now. I opened 57309 after submitting the patch, so I can reference that. Thanks, Bill > > Thanks, David >
Re: [PATCH, rs6000] Increase MALLOC_ABI_ALIGNMENT for 32-bit PowerPC
On Fri, May 17, 2013 at 11:17:26AM +0200, Richard Biener wrote: > Yes - note that it's called MALLOC_*ABI*_ALIGNMENT for a reason - it > is supposed to be the alignment that is required for conforming to the C ABI > on the target. For different allocators I'd rather have a new function > attribute that tells us the alignment of the allocators allocation function > (of course replacing the allocator at runtime via ELF interposition makes > that possibly fragile as well). We shouldn't assume just because glibc Most of the allocators I've mentioned, if not all, are providing just malloc/calloc etc. entrypoints and are being in search scope before -lc, so it isn't anything you can control with attributes. I've looked at libasan so far (seems to guarantee either 16, or 64 or 128 bytes depending on flags), valgrind (seems to guarantee 8 bytes on 32-bit arches except ppc32, and 16 bytes elsewhere), ElectricFence (no guarantees at all, the default is 4 bytes, but can be overridden with EF_ALIGNMENT env option to smaller/larger value, I guess let's ignore efence). Jakub
Re: [PATCH, rs6000] Increase MALLOC_ABI_ALIGNMENT for 32-bit PowerPC
On Fri, May 17, 2013 at 10:47 AM, Jakub Jelinek wrote: > On Fri, May 17, 2013 at 10:32:11AM +0200, Richard Biener wrote: >> On Fri, May 17, 2013 at 4:40 AM, Bill Schmidt >> wrote: >> > This removes two degradations in CPU2006 for 32-bit PowerPC due to lost >> > vectorization opportunities. Previously, GCC treated malloc'd arrays as >> > only guaranteeing 4-byte alignment, even though the glibc implementation >> > guarantees 8-byte alignment. This raises the guarantee to 8 bytes, >> > which is sufficient to permit the missed vectorization opportunities. >> > >> > The guarantee for 64-bit PowerPC should be raised to 16-byte alignment, >> > but doing so currently exposes a latent bug that degrades a 64-bit >> > benchmark. I have therefore not included that change at this time, but >> > added a FIXME recording the information. >> > >> > Bootstrapped and tested on powerpc64-unknown-linux-gnu with no new >> > regressions. Verified that SPEC CPU2006 degradations are fixed with no >> > new degradations. Ok for trunk? Also, do you want any backports? >> >> You say this is because glibc guarantees such alignment - shouldn't this >> be guarded by a proper check then? Probably AIX guarantees similar >> alignment but there are also embedded elf/newlib targets, no? >> >> Btw, the #define should possibly move to config/linux.h guarded by >> OPTION_GLIBC? > > For that location it shouldn't be 64, but 2 * POINTER_SIZE, which is what > glibc really guarantees on most architectures (I think x32 provides more > than that, but it can override). But changing it requires some analysis > on commonly used malloc alternatives on Linux, what exactly do they > guarantee. Say valgrind, ElectricFence, jemalloc, tcmalloc, libasan. > Note that on most targets there is some standard type that requires > such alignment, thus at least 2 * POINTER_SIZE alignment is also a C > conformance matter. Yes - note that it's called MALLOC_*ABI*_ALIGNMENT for a reason - it is supposed to be the alignment that is required for conforming to the C ABI on the target. For different allocators I'd rather have a new function attribute that tells us the alignment of the allocators allocation function (of course replacing the allocator at runtime via ELF interposition makes that possibly fragile as well). We shouldn't assume just because glibc may at some day guarantee 256bit alignment that other allocators on linux do so, too. So - I'd rather play safe (and the default to align to BITS_PER_WORD probably catches the C ABI guarantee on most targets I suppose). Richard. > Jakub
Re: [PATCH, rs6000] Increase MALLOC_ABI_ALIGNMENT for 32-bit PowerPC
On Fri, May 17, 2013 at 04:55:20AM -0400, David Edelsohn wrote: > As Bill wrote earlier, 2 * POINTER_SIZE causes a different performance > regression for 16 byte alignment on PPC64. 8 bytes is a work-around > for now. I was talking about the config/linux.h definition, PPC64 can of course override it to workaround something temporarily, but because there is some problem on PPC64 we shouldn't penalize code say on x86_64, s390x or aarch64. Jakub
Re: [PATCH, rs6000] Increase MALLOC_ABI_ALIGNMENT for 32-bit PowerPC
On Fri, May 17, 2013 at 4:47 AM, Jakub Jelinek wrote: > On Fri, May 17, 2013 at 10:32:11AM +0200, Richard Biener wrote: >> On Fri, May 17, 2013 at 4:40 AM, Bill Schmidt >> wrote: >> > This removes two degradations in CPU2006 for 32-bit PowerPC due to lost >> > vectorization opportunities. Previously, GCC treated malloc'd arrays as >> > only guaranteeing 4-byte alignment, even though the glibc implementation >> > guarantees 8-byte alignment. This raises the guarantee to 8 bytes, >> > which is sufficient to permit the missed vectorization opportunities. >> > >> > The guarantee for 64-bit PowerPC should be raised to 16-byte alignment, >> > but doing so currently exposes a latent bug that degrades a 64-bit >> > benchmark. I have therefore not included that change at this time, but >> > added a FIXME recording the information. >> > >> > Bootstrapped and tested on powerpc64-unknown-linux-gnu with no new >> > regressions. Verified that SPEC CPU2006 degradations are fixed with no >> > new degradations. Ok for trunk? Also, do you want any backports? >> >> You say this is because glibc guarantees such alignment - shouldn't this >> be guarded by a proper check then? Probably AIX guarantees similar >> alignment but there are also embedded elf/newlib targets, no? >> >> Btw, the #define should possibly move to config/linux.h guarded by >> OPTION_GLIBC? > > For that location it shouldn't be 64, but 2 * POINTER_SIZE, which is what > glibc really guarantees on most architectures (I think x32 provides more > than that, but it can override). But changing it requires some analysis > on commonly used malloc alternatives on Linux, what exactly do they > guarantee. Say valgrind, ElectricFence, jemalloc, tcmalloc, libasan. > Note that on most targets there is some standard type that requires > such alignment, thus at least 2 * POINTER_SIZE alignment is also a C > conformance matter. Jakub, As Bill wrote earlier, 2 * POINTER_SIZE causes a different performance regression for 16 byte alignment on PPC64. 8 bytes is a work-around for now. - David
Re: [PATCH, rs6000] Increase MALLOC_ABI_ALIGNMENT for 32-bit PowerPC
On Fri, May 17, 2013 at 10:32:11AM +0200, Richard Biener wrote: > On Fri, May 17, 2013 at 4:40 AM, Bill Schmidt > wrote: > > This removes two degradations in CPU2006 for 32-bit PowerPC due to lost > > vectorization opportunities. Previously, GCC treated malloc'd arrays as > > only guaranteeing 4-byte alignment, even though the glibc implementation > > guarantees 8-byte alignment. This raises the guarantee to 8 bytes, > > which is sufficient to permit the missed vectorization opportunities. > > > > The guarantee for 64-bit PowerPC should be raised to 16-byte alignment, > > but doing so currently exposes a latent bug that degrades a 64-bit > > benchmark. I have therefore not included that change at this time, but > > added a FIXME recording the information. > > > > Bootstrapped and tested on powerpc64-unknown-linux-gnu with no new > > regressions. Verified that SPEC CPU2006 degradations are fixed with no > > new degradations. Ok for trunk? Also, do you want any backports? > > You say this is because glibc guarantees such alignment - shouldn't this > be guarded by a proper check then? Probably AIX guarantees similar > alignment but there are also embedded elf/newlib targets, no? > > Btw, the #define should possibly move to config/linux.h guarded by > OPTION_GLIBC? For that location it shouldn't be 64, but 2 * POINTER_SIZE, which is what glibc really guarantees on most architectures (I think x32 provides more than that, but it can override). But changing it requires some analysis on commonly used malloc alternatives on Linux, what exactly do they guarantee. Say valgrind, ElectricFence, jemalloc, tcmalloc, libasan. Note that on most targets there is some standard type that requires such alignment, thus at least 2 * POINTER_SIZE alignment is also a C conformance matter. Jakub
Re: [PATCH, rs6000] Increase MALLOC_ABI_ALIGNMENT for 32-bit PowerPC
On Fri, May 17, 2013 at 4:40 AM, Bill Schmidt wrote: > This removes two degradations in CPU2006 for 32-bit PowerPC due to lost > vectorization opportunities. Previously, GCC treated malloc'd arrays as > only guaranteeing 4-byte alignment, even though the glibc implementation > guarantees 8-byte alignment. This raises the guarantee to 8 bytes, > which is sufficient to permit the missed vectorization opportunities. > > The guarantee for 64-bit PowerPC should be raised to 16-byte alignment, > but doing so currently exposes a latent bug that degrades a 64-bit > benchmark. I have therefore not included that change at this time, but > added a FIXME recording the information. > > Bootstrapped and tested on powerpc64-unknown-linux-gnu with no new > regressions. Verified that SPEC CPU2006 degradations are fixed with no > new degradations. Ok for trunk? Also, do you want any backports? You say this is because glibc guarantees such alignment - shouldn't this be guarded by a proper check then? Probably AIX guarantees similar alignment but there are also embedded elf/newlib targets, no? Btw, the #define should possibly move to config/linux.h guarded by OPTION_GLIBC? Richard. > Thanks, > Bill > > > 2013-05-16 Bill Schmidt > > * config/rs6000/rs6000.h (MALLOC_ABI_ALIGNMENT): New #define. > > > Index: gcc/config/rs6000/rs6000.h > === > --- gcc/config/rs6000/rs6000.h (revision 198998) > +++ gcc/config/rs6000/rs6000.h (working copy) > @@ -2297,6 +2297,13 @@ extern char rs6000_reg_names[][8]; /* register > nam > /* How to align the given loop. */ > #define LOOP_ALIGN(LABEL) rs6000_loop_align(LABEL) > > +/* Alignment guaranteed by __builtin_malloc. */ > +/* FIXME: 128-bit alignment is guaranteed by glibc for TARGET_64BIT. > + However, specifying the stronger guarantee currently leads to > + a regression in SPEC CPU2006 437.leslie3d. The stronger > + guarantee should be implemented here once that's fixed. */ > +#define MALLOC_ABI_ALIGNMENT (64) > + > /* Pick up the return address upon entry to a procedure. Used for > dwarf2 unwind information. This also enables the table driven > mechanism. */ > >
Re: [PATCH, rs6000] Increase MALLOC_ABI_ALIGNMENT for 32-bit PowerPC
On Thu, May 16, 2013 at 10:40 PM, Bill Schmidt wrote: > This removes two degradations in CPU2006 for 32-bit PowerPC due to lost > vectorization opportunities. Previously, GCC treated malloc'd arrays as > only guaranteeing 4-byte alignment, even though the glibc implementation > guarantees 8-byte alignment. This raises the guarantee to 8 bytes, > which is sufficient to permit the missed vectorization opportunities. > > The guarantee for 64-bit PowerPC should be raised to 16-byte alignment, > but doing so currently exposes a latent bug that degrades a 64-bit > benchmark. I have therefore not included that change at this time, but > added a FIXME recording the information. > > Bootstrapped and tested on powerpc64-unknown-linux-gnu with no new > regressions. Verified that SPEC CPU2006 degradations are fixed with no > new degradations. Ok for trunk? Also, do you want any backports? Okay. If you think that this is appropriate for 4.8 branch, that is okay as well. Is there a FSF GCC Bugzilla open for the 437.leslie3d problem? The FIXME doesn't need to contain the complete explanation, but it would be nice to reference a longer explanation and not pretend to be Fermat's theorem that is too large to write in the marge. Thanks, David