https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121095
Bug ID: 121095
Summary: Possibly unnecessary PRE pass on aarch64 for fpmr
Product: gcc
Version: 15.1.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: rtl-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: lucier at math dot purdue.edu
Target Milestone: ---
With this compiler:
[MacBook-Pro:gambit/gambit/lib] lucier% gcc-15 -v
Using built-in specs.
COLLECT_GCC=gcc-15
COLLECT_LTO_WRAPPER=/opt/homebrew/Cellar/gcc/15.1.0/bin/../libexec/gcc/aarch64-apple-darwin24/15/lto-wrapper
Target: aarch64-apple-darwin24
Configured with: ../configure --prefix=/opt/homebrew/opt/gcc
--libdir=/opt/homebrew/opt/gcc/lib/gcc/current --disable-nls
--enable-checking=release --with-gcc-major-version-only
--enable-languages=c,c++,objc,obj-c++,fortran --program-suffix=-15
--with-gmp=/opt/homebrew/opt/gmp --with-mpfr=/opt/homebrew/opt/mpfr
--with-mpc=/opt/homebrew/opt/libmpc --with-isl=/opt/homebrew/opt/isl
--with-zstd=/opt/homebrew/opt/zstd --with-pkgversion='Homebrew GCC 15.1.0'
--with-bugurl=https://github.com/Homebrew/homebrew-core/issues
--with-system-zlib --build=aarch64-apple-darwin24
--with-sysroot=/Library/Developer/CommandLineTools/SDKs/MacOSX15.sdk
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 15.1.0 (Homebrew GCC 15.1.0)
and the test file all.i from
https://gcc.gnu.org/bugzilla/attachment.cgi?id=61159
and this command line
gcc-15 -O1 -c -Wdisabled-optimization all.i
I see the warning message
all.c: In function '___H__20_all_2e_o1':
all.c:131776:161: warning: PRE disabled: 57919 basic blocks and 222307
registers; increase '--param max-gcse-memory' above 1571957
[-Wdisabled-optimization]
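For reference, the figure in the warning can be reproduced with a little arithmetic. The sketch below assumes the estimate is one bit per register per basic block, stored in 64-bit words and reported in kilobytes. The function name is hypothetical and the code is not copied from gcse.cc; it is only a model that happens to agree with the numbers printed above.

```c
#include <stdint.h>

/* Assumed model (not taken from gcse.cc): each basic block owns one bitmap
   with a bit per register, rounded up to whole 64-bit words.  The result is
   the total size in kilobytes, truncated.  */
static uint64_t
gcse_memory_estimate_kb (uint64_t n_basic_blocks, uint64_t n_regs)
{
  const uint64_t bits_per_word = 64;
  /* Round the register count up to a whole number of bitmap words.  */
  uint64_t words_per_block = (n_regs + bits_per_word - 1) / bits_per_word;
  uint64_t bytes = n_basic_blocks * words_per_block * sizeof (uint64_t);
  return bytes / 1024;
}
```

Under that assumption, gcse_memory_estimate_kb (57919, 222307) gives 1571957, the value the warning asks --param max-gcse-memory to exceed.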
The reason PRE is invoked at all seems to be this code in gcc/gcse.cc:
static unsigned int
execute_hardreg_pre (void)
{
#ifdef HARDREG_PRE_REGNOS
  doing_hardreg_pre_p = true;
  unsigned int regnos[] = HARDREG_PRE_REGNOS;
  /* It's possible to avoid this loop, but it isn't worth doing so until
     hardreg PRE is used for multiple hardregs. */
  for (int i = 0; regnos[i] != 0; i++)
    {
      int changed;
      current_hardreg_regno = regnos[i];
      if (dump_file)
        fprintf(dump_file, "Entering hardreg PRE for regno %d\n",
                current_hardreg_regno);
      delete_unreachable_blocks ();
      df_analyze ();
      changed = one_pre_gcse_pass ();
      if (changed)
        cleanup_cfg (0);
    }
  doing_hardreg_pre_p = false;
#endif
  return 0;
}
and the only architecture that defines HARDREG_PRE_REGNOS is aarch64:
[MacBook-Pro:~/programs/gcc-mainline] lucier% grep -R HARDREG_PRE gcc/config
gcc/config/aarch64/aarch64.h:#define HARDREG_PRE_REGNOS { FPM_REGNUM, 0 }
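To make the consequence of that define concrete, here is a minimal standalone model of the loop in execute_hardreg_pre. The value given to FPM_REGNUM is a hypothetical stand-in (the real one comes from the aarch64 backend) and count_hardreg_pre_passes is a made-up helper. It only counts iterations instead of running a PRE pass.

```c
/* Hypothetical stand-in value; the real FPM_REGNUM is defined by the
   aarch64 backend.  */
#define FPM_REGNUM 97
#define HARDREG_PRE_REGNOS { FPM_REGNUM, 0 }

/* Counts how many per-register passes the loop would launch.  */
static int
count_hardreg_pre_passes (void)
{
  int passes = 0;
#ifdef HARDREG_PRE_REGNOS
  unsigned int regnos[] = HARDREG_PRE_REGNOS;
  /* The list is terminated by register number 0, so the loop body runs
     once for every non-zero entry that precedes the terminator.  */
  for (int i = 0; regnos[i] != 0; i++)
    passes++;
#endif
  return passes;
}
```

With the aarch64 definition the model reports exactly one pass, and nothing in the loop looks at whether the function being compiled ever touches that register. On targets that leave HARDREG_PRE_REGNOS undefined, the body is compiled out and the count is zero.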
FPM_REGNUM refers to the fpmr register:
https://developer.arm.com/documentation/ddi0601/2025-06/AArch64-Registers/FPMR--Floating-point-Mode-Register?lang=en
which determines some properties of FP8 (8-bit floating point) arithmetic.
Why I am reporting this: the file all.i does not use any FP8 arithmetic, so I
question why a PRE pass dedicated to a register that controls FP8 arithmetic
is run unconditionally.