https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119285
Bug ID: 119285
Summary: [15 Regression] 5% slowdown of 519.lbm_r on Zen2 and Zen4 since r15-7932-ge355fe414aa3aa
Product: gcc
Version: 15.0
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: normal
Priority: P3
Component: rtl-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: pheeck at gcc dot gnu.org
CC: vmakarov at gcc dot gnu.org
Blocks: 26163
Target Milestone: ---
Host: x86_64-linux
Target: x86_64-linux
As seen here:
https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=286.477.0
there was a 5% execution-time slowdown of the 519.lbm_r SPEC 2017
benchmark when run with -Ofast -march=native -flto and PGO on an AMD Zen2 machine.
I bisected it to r15-7932-ge355fe414aa3aa:
commit e355fe414aa3aaf215c7dd9dd789ce217a1b458c
Author: Vladimir N. Makarov <[email protected]>
AuthorDate: Mon Mar 10 16:26:59 2025 -0400
Commit: Vladimir N. Makarov <[email protected]>
CommitDate: Mon Mar 10 16:27:51 2025 -0400
[PR114991][IRA]: Improve reg equiv invariant calculation
In PR test case IRA preferred to allocate hard reg to a pseudo instead
of its equivalence. This resulted in allocating caller-saved hard reg
and generating save/restore insns in the function prologue/epilogue.
The equivalence is an invariant (stack pointer plus offset) and the
pseudo is used mostly as memory address. This happened as there was
no simplification of insn after the invariant substitution. The patch
adds the necessary code.
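To make the commit message concrete, here is a minimal, hypothetical C sketch of the kind of code it describes. It is not the PR114991 test case, and the names sum_squares, buf and p are made up for illustration. The pointer p holds the address of a stack-allocated array, so its value is an invariant of the form "stack pointer plus constant offset", and it is used only as a memory address. Whether GCC actually creates a pseudo with such an equivalence for this function depends on the compiler version and optimization flags.

```c
#include <stddef.h>

/* Hypothetical illustration only; not taken from PR114991 or 519.lbm_r. */
enum { N = 16 };

long sum_squares(long seed)
{
    long buf[N];     /* lives in the current stack frame */
    long *p = buf;   /* value is "stack pointer + constant offset" (an invariant) */
    long total = 0;

    /* p is used only to form memory addresses. */
    for (size_t i = 0; i < N; i++)
        p[i] = (seed + (long)i) * (seed + (long)i);
    for (size_t i = 0; i < N; i++)
        total += p[i];

    return total;
}
```

In a case like this, the allocator can either keep p in a hard register or rematerialize the address from the stack pointer at each use. The commit message describes IRA choosing the former when the latter would have been cheaper.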
This is a regression against GCC 14. See the comparison here:
https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.8=1077.477.0&plot.9=286.477.0&
The same regression also happens on AMD Zen4 with -O2 -march=generic -flto.
See the slowdown here:
https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=957.477.0
Referenced Bugs:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=26163
[Bug 26163] [meta-bug] missed optimization in SPEC (2k17, 2k and 2k6 and 95)