[Bug rtl-optimization/119285] New: [15 Regression] 5% slowdown of 519.lbm_r on Zen2 and Zen4 since r15-7932-ge355fe414aa3aa

pheeck at gcc dot gnu.org via Gcc-bugs Fri, 14 Mar 2025 02:18:43 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119285


            Bug ID: 119285
           Summary: [15 Regression] 5% slowdown of 519.lbm_r on Zen2 and
                    Zen4 since r15-7932-ge355fe414aa3aa
           Product: gcc
           Version: 15.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: rtl-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: pheeck at gcc dot gnu.org
                CC: vmakarov at gcc dot gnu.org
            Blocks: 26163
  Target Milestone: ---
              Host: x86_64-linux
            Target: x86_64-linux

As seen here

https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=286.477.0

there was a 5% execution-time slowdown of the 519.lbm_r SPEC 2017
benchmark when built with -Ofast -march=native -flto and PGO and run on an
AMD Zen2 machine.  I bisected it to r15-7932-ge355fe414aa3aa:

commit e355fe414aa3aaf215c7dd9dd789ce217a1b458c
Author:     Vladimir N. Makarov <[email protected]>
AuthorDate: Mon Mar 10 16:26:59 2025 -0400
Commit:     Vladimir N. Makarov <[email protected]>
CommitDate: Mon Mar 10 16:27:51 2025 -0400

    [PR114991][IRA]: Improve reg equiv invariant calculation

    In PR test case IRA preferred to allocate hard reg to a pseudo instead
    of its equivalence.  This resulted in allocating caller-saved hard reg
    and generating save/restore insns in the function prologue/epilogue.
    The equivalence is an invariant (stack pointer plus offset) and the
    pseudo is used mostly as memory address.  This happened as there was
    no simplification of insn after the invariant substitution.  The patch
    adds the necessary code.
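
To illustrate the situation the commit message describes, here is a minimal
C sketch of a value that is equivalent to an invariant (stack pointer plus
offset) and is used mostly as a memory address.  This is not the PR114991
test case and not a reproducer for this regression; the names sum_local and
N are hypothetical, and whether IRA makes the choice described depends on
the target and the options used.

```c
#define N 16

/* Hypothetical example, not taken from PR114991 or from 519.lbm_r. */
int sum_local(int seed)
{
    /* buf lives in the stack frame, so its address is an invariant of the
       form "stack pointer + constant offset".  volatile is used here only
       to keep the memory accesses from being optimized away. */
    volatile int buf[N];
    int s = 0;

    /* Every access below uses the address of buf as a memory address. */
    for (int i = 0; i < N; i++)
        buf[i] = seed + i;
    for (int i = 0; i < N; i++)
        s += buf[i];

    return s;
}
```

In code of this shape the allocator can either keep the address in a hard
register or substitute the equivalent invariant at each use; the commit
message says the cost of that choice was miscalculated because the insn
was not simplified after the substitution.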

This is a regression against GCC 14.  See the comparison here:

https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.8=1077.477.0&plot.9=286.477.0&;

The same regression also appears on AMD Zen4 with -O2 -march=generic -flto.
See the slowdown here:

https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=957.477.0


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=26163
[Bug 26163] [meta-bug] missed optimization in SPEC (2k17, 2k and 2k6 and 95)
