Re: [TCWG CI] 400.perlbench slowed down by 6% after llvm: [SimplifyCFG] Ignore free instructions when computing cost for folding branch to common dest

2021-10-01 Thread Arthur Eubanks
On Fri, Oct 1, 2021 at 1:53 PM Maxim Kuvyrkov wrote: > > On 1 Oct 2021, at 23:37, Arthur Eubanks wrote: > > > > Thanks for the flags, I can now reproduce. > > > > I've basically come to the same conclusion as you. There's only one new > instance of this optimization triggering throughout the who

Re: [TCWG CI] 400.perlbench slowed down by 6% after llvm: [SimplifyCFG] Ignore free instructions when computing cost for folding branch to common dest

2021-10-01 Thread Arthur Eubanks
Thanks for the flags, I can now reproduce. I've basically come to the same conclusion as you. There's only one new instance of this optimization triggering throughout the whole file, in S_reginclass(). It doesn't look out of place, and S_regmatch() is identical before and after the patch. So it's

RE: [TCWG CI] 456.hmmer slowed down by 5% after llvm: Revert "Allow rematerialization of virtual reg uses"

2021-10-01 Thread Mekhanoshin, Stanislav
[AMD Official Use Only] I have tried a lot of different options, but I do not recall them now. Anyway, it is reverted and I do not seem to have the resources to pursue it further. Stas -Original Message- From: Maxim Kuvyrkov Sent: Friday, October 1, 2021 11:16 To: Mekhanoshin, Stanislav Cc: linaro

RE: [TCWG CI] 456.hmmer slowed down by 5% after llvm: Revert "Allow rematerialization of virtual reg uses"

2021-10-01 Thread Mekhanoshin, Stanislav
[AMD Official Use Only] > You mentioned that you saw different results for another ARM target — could > you elaborate please? When I was trying to reproduce hmmer asm I was trying to use different ARM targets. I was never able to pick the one you were using apparently, but then got very differ

Re: [TCWG CI] 400.perlbench slowed down by 6% after llvm: [SimplifyCFG] Ignore free instructions when computing cost for folding branch to common dest

2021-10-01 Thread Maxim Kuvyrkov
> On 1 Oct 2021, at 23:37, Arthur Eubanks wrote: > > Thanks for the flags, I can now reproduce. > > I've basically come to the same conclusion as you. There's only one new > instance of this optimization triggering throughout the whole file, in > S_reginclass(). It doesn't look out of place,

Re: [TCWG CI] 456.hmmer slowed down by 5% after llvm: Revert "Allow rematerialization of virtual reg uses"

2021-10-01 Thread Maxim Kuvyrkov
> On 1 Oct 2021, at 21:06, Mekhanoshin, Stanislav > wrote: > > [AMD Official Use Only] > >> You mentioned that you saw different results for another ARM target — could >> you elaborate please? > > When I was trying to reproduce hmmer asm I was trying to use different ARM > targets. I was nev

[ACTIVITY] report week ending 1 Oct

2021-10-01 Thread Peter Maydell
Progress * UM-2 [QEMU upstream maintainership] + Worked through my code-review backlog + Noticed that we never got round to making our emulated GICv3 support having redistributors in more than one contiguous region; this prevents using more than 123 CPUs with the virt board. Sent o

RE: [TCWG CI] 456.hmmer slowed down by 5% after llvm: Revert "Allow rematerialization of virtual reg uses"

2021-10-01 Thread Mekhanoshin, Stanislav
[AMD Official Use Only] Maxim, It is really difficult for me to work on this, as I do not have the various affected targets and hardware. I am sure there were quite a lot of progressions, but, as I said in the beginning, regressions are also inevitable, as is the case whenever a heuristic is involved. For t

Re: [TCWG CI] 400.perlbench slowed down by 6% after llvm: [SimplifyCFG] Ignore free instructions when computing cost for folding branch to common dest

2021-10-01 Thread Maxim Kuvyrkov
Hi Arthur, Pre-processed source is in the save-temps tarballs linked below; S_regmatch() is in regexec.i. The save-temps also have .s assembly files for before and after your patch, and the only code-gen difference is in the S_reginclass() function; see the attached screenshot #1. Looking into p

Re: [TCWG CI] 471.omnetpp slowed down by 8% after gcc: Avoid invalid loop transformations in jump threading registry.

2021-10-01 Thread Jeff Law
On 9/27/2021 7:52 AM, Aldy Hernandez wrote: [CCing Jeff and list for broader audience] On 9/27/21 2:53 PM, Maxim Kuvyrkov wrote: Hi Aldy, Your patch seems to slow down 471.omnetpp by 8% at -O3. Could you please take a look if this is something that could be easily fixed? First of all, th

Re: [TCWG CI] 471.omnetpp slowed down by 8% after gcc: Avoid invalid loop transformations in jump threading registry.

2021-10-01 Thread Gerald Pfeifer
On Wed, 29 Sep 2021, Maxim Kuvyrkov via Gcc wrote: > Configurations that track master branches have 3-day intervals. > Configurations that track release branches — 6 days. If a regression is > detected it is narrowed down to component first — binutils, gcc or glibc > — and then the commit rang

Re: [TCWG CI] 456.hmmer slowed down by 5% after llvm: Revert "Allow rematerialization of virtual reg uses"

2021-10-01 Thread Maxim Kuvyrkov
Hi Stanislav, I fully understand the challenges of compiler optimizations and the fact that a generally good optimization can slow down a small number of benchmarks. Still, benchmarking your original patch (commit 92c1fd19abb15bc68b1127a26137a69e033cdb39) on arm-linux-gnueabihf results in over

Re: [TCWG CI] 400.perlbench slowed down by 6% after llvm: [SimplifyCFG] Ignore free instructions when computing cost for folding branch to common dest

2021-10-01 Thread Maxim Kuvyrkov
Hi Arthur, Thanks for looking into this! The flags to compile regexec.c were: -O3 --target=aarch64-linux-gnu -fgnu89-inline Clang was configured with (on x86_64-linux-gnu host): cmake -G Ninja ../llvm/llvm '-DLLVM_ENABLE_PROJECTS=clang;lld' -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_ASSERTIONS=Tr

Re: [TCWG CI] 471.omnetpp slowed down by 8% after gcc: Avoid invalid loop transformations in jump threading registry.

2021-10-01 Thread Maxim Kuvyrkov
> On 29 Sep 2021, at 21:21, Andrew MacLeod wrote: > > On 9/29/21 7:59 AM, Maxim Kuvyrkov wrote: >> >>> Does it run like once a day/some-time-period, and if you note a >>> regression, narrow it down? >> Configurations that track master branches have 3-day intervals. >> Configurations that tr