Re: [TCWG CI] 400.perlbench slowed down by 6% after llvm: [SimplifyCFG] Ignore free instructions when computing cost for folding branch to common dest

2021-10-01 Thread Arthur Eubanks
On Fri, Oct 1, 2021 at 1:53 PM Maxim Kuvyrkov wrote:
>
> On 1 Oct 2021, at 23:37, Arthur Eubanks wrote:
> >
> > Thanks for the flags, I can now reproduce.
> >
> > I've basically come to the same conclusion as you. There's only one new
> instance of this optimization triggering throughout the

Re: [TCWG CI] 400.perlbench slowed down by 6% after llvm: [SimplifyCFG] Ignore free instructions when computing cost for folding branch to common dest

2021-10-01 Thread Arthur Eubanks
Thanks for the flags, I can now reproduce. I've basically come to the same conclusion as you. There's only one new instance of this optimization triggering throughout the whole file, in S_reginclass(). It doesn't look out of place, and S_regmatch() is identical before and after the patch. So it's

Re: [TCWG CI] 400.perlbench slowed down by 6% after llvm: [SimplifyCFG] Ignore free instructions when computing cost for folding branch to common dest

2021-10-01 Thread Maxim Kuvyrkov
> On 1 Oct 2021, at 23:37, Arthur Eubanks wrote:
>
> Thanks for the flags, I can now reproduce.
>
> I've basically come to the same conclusion as you. There's only one new
> instance of this optimization triggering throughout the whole file, in
> S_reginclass(). It doesn't look out of place,

Re: [TCWG CI] 400.perlbench slowed down by 6% after llvm: [SimplifyCFG] Ignore free instructions when computing cost for folding branch to common dest

2021-10-01 Thread Maxim Kuvyrkov
Hi Arthur,

Pre-processed source is in the save-temps tarballs linked below; S_regmatch() is in regexec.i. The save-temps also have .s assembly files for before and after your patch, and the only code-gen difference is in the S_reginclass() function (see the attached screenshot #1).

Looking into

Re: [TCWG CI] 400.perlbench slowed down by 6% after llvm: [SimplifyCFG] Ignore free instructions when computing cost for folding branch to common dest

2021-10-01 Thread Maxim Kuvyrkov
Hi Arthur,

Thanks for looking into this!

The flags to compile regexec.c were:
-O3 --target=aarch64-linux-gnu -fgnu89-inline

Clang was configured with (on x86_64-linux-gnu host):
cmake -G Ninja ../llvm/llvm '-DLLVM_ENABLE_PROJECTS=clang;lld' -DCMAKE_BUILD_TYPE=Release

Re: [TCWG CI] 400.perlbench slowed down by 6% after llvm: [SimplifyCFG] Ignore free instructions when computing cost for folding branch to common dest

2021-09-30 Thread Arthur Eubanks
Could I get the source file with S_regmatch()?

On Mon, Sep 27, 2021 at 6:07 AM Maxim Kuvyrkov wrote:
> Hi Arthur,
>
> Your patch seems to be slowing down 400.perlbench by 6%, due to a slowdown
> of its hot function S_regmatch() by 14%.
>
> Could you take a look if this is easily fixable,

Re: [TCWG CI] 400.perlbench slowed down by 6% after llvm: [SimplifyCFG] Ignore free instructions when computing cost for folding branch to common dest

2021-09-30 Thread Arthur Eubanks
I have a wild guess at a potential fix in the meantime:

diff --cc llvm/lib/Transforms/Utils/SimplifyCFG.cpp
index 3add561c66d5,3add561c66d5..bedbca9dd4b7
--- a/llvm/lib/Transforms/Utils/SimplifyCFG.cpp
+++ b/llvm/lib/Transforms/Utils/SimplifyCFG.cpp
@@@ -3242,9 -3242,9 +3242,9 @@@ bool

Re: [TCWG CI] 400.perlbench slowed down by 6% after llvm: [SimplifyCFG] Ignore free instructions when computing cost for folding branch to common dest

2021-09-30 Thread Maxim Kuvyrkov
Hi Arthur,

Your patch seems to be slowing down 400.perlbench by 6%, due to a 14% slowdown of its hot function S_regmatch().

Could you take a look at whether this is easily fixable, please?

Regards,
--
Maxim Kuvyrkov
https://www.linaro.org

> On 24 Sep 2021, at 15:07, ci_not...@linaro.org wrote:
>
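As a rough sanity check on the two percentages in this message, the sketch below estimates what share of the benchmark's runtime S_regmatch() would need to account for. It assumes the entire 6% benchmark slowdown comes from the 14% function slowdown, which the thread does not establish. The function name `implied_time_share` is hypothetical, and both inputs are the rounded figures quoted above.

```python
def implied_time_share(total_slowdown: float, function_slowdown: float) -> float:
    """Estimate the fraction of baseline runtime spent in one function.

    Assumption (not confirmed by the thread): the whole-benchmark slowdown
    is caused entirely by that one function, so
        share * function_slowdown == total_slowdown.
    """
    if function_slowdown <= 0:
        raise ValueError("function_slowdown must be positive")
    return total_slowdown / function_slowdown


# Rounded figures from the message: 6% overall, 14% in S_regmatch().
share = implied_time_share(0.06, 0.14)
print(f"Implied share of runtime: {share:.0%}")
```

Because both inputs are rounded, the result is only a ballpark figure.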

[TCWG CI] 400.perlbench slowed down by 6% after llvm: [SimplifyCFG] Ignore free instructions when computing cost for folding branch to common dest

2021-09-24 Thread ci_notify
After llvm commit e7249e4acf3cf9438d6d9e02edecebd5b622a4dc
Author: Arthur Eubanks

    [SimplifyCFG] Ignore free instructions when computing cost for folding branch to common dest

the following benchmarks slowed down by more than 2%:
- 400.perlbench slowed down by 6% from 9730 to 10312 perf
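The reported 6% can be checked directly from the two measurements in the notification. This is a minimal sketch: the helper name `relative_slowdown` is hypothetical, and the 2% threshold is taken from the wording of the report, not from the CI implementation.

```python
def relative_slowdown(before: float, after: float) -> float:
    """Return the fractional increase from `before` to `after`."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (after - before) / before


# Measurements quoted in the CI notification for 400.perlbench.
slowdown = relative_slowdown(9730, 10312)
print(f"{slowdown:.2%}")  # prints "5.98%"

# Threshold wording from the report: "slowed down by more than 2%".
print(slowdown > 0.02)  # prints "True"
```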