https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119044
Bug ID: 119044
Summary: 5-16% slowdown of 436.cactusADM since r15-7661-g8293b9e40f12e9
Product: gcc
Version: 15.0
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: pheeck at gcc dot gnu.org
CC: rguenth at gcc dot gnu.org
Blocks: 26163
Target Milestone: ---
Host: x86_64-linux
Target: x86_64-linux
As seen here
https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=292.100.0
there was a 10% execution-time slowdown of the 436.cactusADM SPEC 2006
benchmark when run with -O2 -march=generic -flto on an AMD Zen2 machine.
I bisected it to r15-7661-g8293b9e40f12e9:
ee30e2586a3142e63daaf301a561984f1d22d38d is the first bad commit
commit ee30e2586a3142e63daaf301a561984f1d22d38d
Author: Richard Biener <[email protected]>
Date: Fri Feb 21 09:58:04 2025 +0100
tree-optimization/118954 - avoid UB on ref created by predcom
When predictive commoning moves an invariant ref it makes sure to
not build a MEM_REF with a base that is negatively offsetted from
an object. But in trying to preserve some transforms it does not
consider association of a constant offset with the address computation
in DR_BASE_ADDRESS leading to exactly this problem again. This is
arguably a problem in data-ref analysis producing such an out-of-bound
DR_BASE_ADDRESS, but this looks quite involved to fix, so the
following avoids the association in one more case. This fixes the
testcase while preserving the desired transform in
gcc.dg/tree-ssa/predcom-1.c.
PR tree-optimization/118954
* tree-predcom.cc (ref_at_iteration): Make sure to not
associate the constant offset with DR_BASE_ADDRESS when
that is an offsetted pointer.
* gcc.dg/torture/pr118954.c: New testcase.
gcc/testsuite/gcc.dg/torture/pr118954.c | 22 ++++++++++++++++++++++
gcc/tree-predcom.cc | 3 ++-
2 files changed, 24 insertions(+), 1 deletion(-)
create mode 100644 gcc/testsuite/gcc.dg/torture/pr118954.c
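For readers unfamiliar with the pass named in the commit message above, the following is a hand-written sketch of the kind of rewrite predictive commoning performs. It is not the code GCC generates, not the pr118954.c testcase, and not taken from cactusADM; the function names sum3_plain and sum3_commoned are made up for illustration. The idea is that values loaded in one iteration are carried in scalars into the next, so each iteration needs one load instead of three.

```c
#include <stddef.h>

/* Straightforward version: three loads from a[] per iteration. */
void sum3_plain(const int *a, int *out, size_t n)
{
    for (size_t i = 0; i + 2 < n; i++)
        out[i] = a[i] + a[i + 1] + a[i + 2];
}

/* Hand-applied illustration of predictive commoning (hypothetical,
   not compiler output): a[i] and a[i + 1] were already loaded by
   earlier iterations, so they are kept in x0 and x1 and only
   a[i + 2] is loaded in the loop body. */
void sum3_commoned(const int *a, int *out, size_t n)
{
    if (n < 3)
        return;

    int x0 = a[0];
    int x1 = a[1];
    for (size_t i = 0; i + 2 < n; i++) {
        int x2 = a[i + 2];   /* the only load left in the loop */
        out[i] = x0 + x1 + x2;
        x0 = x1;             /* rotate the carried values */
        x1 = x2;
    }
}
```

Whether a given loop in cactusADM gets this treatment before or after the commit is exactly what would need to be checked in the predcom dump; the sketch only shows why losing the transform could cost extra loads.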
There were also these cactusADM slowdowns in the same timeframe (so probably
caused by the same commit):
16% Zen2 -O2 -march=native -flto
https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=290.100.0
16% Zen3 -O2 -march=native
https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=464.100.0
These aren't regressions against older GCC versions.
Btw, there were also some speedups:
21% Zen2 -Ofast -march=native
https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=301.100.0
13% Zen2 -O2 -march=native
https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=291.100.0
From what I've seen it looks like the speedups balance out the slowdowns, maybe
even dominate them. So maybe this isn't an issue?
Referenced Bugs:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=26163
[Bug 26163] [meta-bug] missed optimization in SPEC (2k17, 2k and 2k6 and 95)