Re: [PR 81616] Deferring FMA transformations in tight loops

2018-01-12 Thread Richard Biener
On Wed, 10 Jan 2018, Martin Jambor wrote:
> Hello,
>
> I would really like to ping the FMA transformation prevention patch that
> I sent here in December, which, after incorporating a suggestion from
> Richi, re-base and re-testing, I re-post below. I really think that it
> should make it into gcc

Re: [PR 81616] Deferring FMA transformations in tight loops

2018-01-10 Thread Jeff Law
On 01/10/2018 11:46 AM, Martin Jambor wrote:
> Hello,
>
> I would really like to ping the FMA transformation prevention patch that
> I sent here in December, which, after incorporating a suggestion from
> Richi, re-base and re-testing, I re-post below. I really think that it
> should make it into gc

Re: [PR 81616] Deferring FMA transformations in tight loops

2018-01-10 Thread Martin Jambor
Hello,
I would really like to ping the FMA transformation prevention patch that I sent here in December, which, after incorporating a suggestion from Richi, re-base and re-testing, I re-post below. I really think that it should make it into gcc 8 in some form, because the performance wins are really

Re: [PR 81616] Deferring FMA transformations in tight loops

2017-12-21 Thread Martin Jambor
Hi,
On Mon, Dec 18 2017, Richard Biener wrote:
> On Fri, Dec 15, 2017 at 3:19 PM, Martin Jambor wrote:
>>
>> Hello,
>>
>> the patch below prevents creation of fused-multiply-and-add instructions
>> in the widening_mul gimple pass on the Zen-based AMD CPUs and as a
>> result fixes regressions of n

Re: [PR 81616] Deferring FMA transformations in tight loops

2017-12-18 Thread Richard Biener
On Fri, Dec 15, 2017 at 3:19 PM, Martin Jambor wrote:
>
> Hello,
>
> the patch below prevents creation of fused-multiply-and-add instructions
> in the widening_mul gimple pass on the Zen-based AMD CPUs and as a
> result fixes regressions of native znver1 tuning when compared to
> generic tuning in

[PR 81616] Deferring FMA transformations in tight loops

2017-12-15 Thread Martin Jambor
Hello,
the patch below prevents creation of fused-multiply-and-add instructions in the widening_mul gimple pass on the Zen-based AMD CPUs and as a result fixes regressions of native znver1 tuning when compared to generic tuning in:
- the matrix.c testcase of PR 81616 (straightforward matrix