Disable FMADD in chains for Zen4 and generic

2023-12-12 Thread Jan Hubicka
Hi, this patch disables use of FMA in matrix multiplication loop for generic (for x86-64-v3) and zen4. I tested this on zen4 and Xeon Gold 6212U. For Intel this is neutral both on the matrix multiplication microbenchmark (attached) and spec2k17 where the difference was within noise for Core
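For readers without the attachment, the kind of loop under discussion can be sketched as below. This is a hypothetical kernel, not the microbenchmark attached to the mail; the function name `matmul` and the flat row-major layout are assumptions made for illustration. The point is only that the accumulator update in the innermost loop is a multiply feeding an add, with each iteration depending on the previous one, so contracting it into a fused multiply-add puts the whole FMA on the loop-carried dependency chain.

```c
#include <stddef.h>

/* Hypothetical n x n matrix multiply over row-major arrays.
   Not the benchmark from the original mail. */
void matmul(size_t n, const float *a, const float *b, float *c)
{
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < n; j++) {
            float acc = 0.0f;
            /* Loop-carried chain: acc depends on its previous value.
               With FMA enabled the compiler may fuse the multiply and
               the add into one instruction; otherwise the multiply can
               be computed off the chain and only the add is on it. */
            for (size_t k = 0; k < n; k++)
                acc += a[i * n + k] * b[k * n + j];
            c[i * n + j] = acc;
        }
}
```

Whether GCC emits a fused instruction or a separate multiply and add for this loop depends on the target and tuning options in effect (for example `-march=x86-64-v3` or `-mtune=znver4`), so inspecting the generated assembly is the reliable way to compare the two forms.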

Re: Disable FMADD in chains for Zen4 and generic

2024-01-07 Thread Hongtao Liu
On Thu, Dec 14, 2023 at 12:03 AM Jan Hubicka wrote: > > > > The difference is that Cores understand the fact that fmadd does not need > > > all three parameters to start computation, while Zen cores don't. > > > > > > Since this seems noticeable win on zen and not loss on Core it seems like >

Re: Disable FMADD in chains for Zen4 and generic

2024-01-17 Thread Jan Hubicka
> Can we backport the patch (at least the generic part) to > GCC11/GCC12/GCC13 release branch? Yes, the periodic testers have picked up the change and as far as I can tell, there are no surprises. Thanks, Honza > > > > > > > > /* X86_TUNE_AVOID_512FMA_CHAINS: Avoid creating loops with tight > > > > 51

Re: Disable FMADD in chains for Zen4 and generic

2023-12-12 Thread Richard Biener
On Tue, Dec 12, 2023 at 3:38 PM Jan Hubicka wrote: > > Hi, > this patch disables use of FMA in matrix multiplication loop for generic (for > x86-64-v3) and zen4. I tested this on zen4 and Xeon Gold 6212U. > > For Intel this is neutral both on the matrix multiplication microbenchmark > (atta

Re: Disable FMADD in chains for Zen4 and generic

2023-12-12 Thread Jan Hubicka
> > This came up in a separate thread as well, but when doing reassoc of a > chain with > multiple dependent FMAs. > > I can't understand how this uarch detail can affect performance when > as in the testcase > the longest input latency is on the multiplication from a memory load. > Do we actuall

Re: Disable FMADD in chains for Zen4 and generic

2023-12-12 Thread Alexander Monakov
On Tue, 12 Dec 2023, Richard Biener wrote: > On Tue, Dec 12, 2023 at 3:38 PM Jan Hubicka wrote: > > > > Hi, > > this patch disables use of FMA in matrix multiplication loop for generic > > (for > > x86-64-v3) and zen4. I tested this on zen4 and Xeon Gold 6212U. > > > > For Intel this is

Re: Disable FMADD in chains for Zen4 and generic

2023-12-12 Thread Hongtao Liu
On Tue, Dec 12, 2023 at 10:38 PM Jan Hubicka wrote: > > Hi, > this patch disables use of FMA in matrix multiplication loop for generic (for > x86-64-v3) and zen4. I tested this on zen4 and Xeon Gold 6212U. > > For Intel this is neutral both on the matrix multiplication microbenchmark > (att

Re: Disable FMADD in chains for Zen4 and generic

2023-12-13 Thread Jan Hubicka
> > The difference is that Cores understand the fact that fmadd does not need > > all three parameters to start computation, while Zen cores don't. > > > > Since this seems noticeable win on zen and not loss on Core it seems like > > good > > default for generic. > > > > I plan to commit the pa