https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121095
            Bug ID: 121095
           Summary: Possibly unnecessary PRE pass on aarch64 for fpmr
           Product: gcc
           Version: 15.1.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: rtl-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: lucier at math dot purdue.edu
  Target Milestone: ---

With this compiler:

[MacBook-Pro:gambit/gambit/lib] lucier% gcc-15 -v
Using built-in specs.
COLLECT_GCC=gcc-15
COLLECT_LTO_WRAPPER=/opt/homebrew/Cellar/gcc/15.1.0/bin/../libexec/gcc/aarch64-apple-darwin24/15/lto-wrapper
Target: aarch64-apple-darwin24
Configured with: ../configure --prefix=/opt/homebrew/opt/gcc --libdir=/opt/homebrew/opt/gcc/lib/gcc/current --disable-nls --enable-checking=release --with-gcc-major-version-only --enable-languages=c,c++,objc,obj-c++,fortran --program-suffix=-15 --with-gmp=/opt/homebrew/opt/gmp --with-mpfr=/opt/homebrew/opt/mpfr --with-mpc=/opt/homebrew/opt/libmpc --with-isl=/opt/homebrew/opt/isl --with-zstd=/opt/homebrew/opt/zstd --with-pkgversion='Homebrew GCC 15.1.0' --with-bugurl=https://github.com/Homebrew/homebrew-core/issues --with-system-zlib --build=aarch64-apple-darwin24 --with-sysroot=/Library/Developer/CommandLineTools/SDKs/MacOSX15.sdk
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 15.1.0 (Homebrew GCC 15.1.0)

and the test file all.i from

https://gcc.gnu.org/bugzilla/attachment.cgi?id=61159

and this command line

gcc-15 -O1 -c -Wdisabled-optimization all.i

I see the warning message

all.c: In function '___H__20_all_2e_o1':
all.c:131776:161: warning: PRE disabled: 57919 basic blocks and 222307 registers; increase '--param max-gcse-memory' above 1571957 [-Wdisabled-optimization]

That PRE is called at all seems to be because of this code in gcc/gcse.cc:

static unsigned int
execute_hardreg_pre (void)
{
#ifdef HARDREG_PRE_REGNOS
  doing_hardreg_pre_p = true;
  unsigned int regnos[] = HARDREG_PRE_REGNOS;
  /* It's possible to avoid this loop, but it isn't worth doing so until
     hardreg PRE is used for multiple hardregs.  */
  for (int i = 0; regnos[i] != 0; i++)
    {
      int changed;
      current_hardreg_regno = regnos[i];
      if (dump_file)
        fprintf(dump_file, "Entering hardreg PRE for regno %d\n",
                current_hardreg_regno);
      delete_unreachable_blocks ();
      df_analyze ();
      changed = one_pre_gcse_pass ();
      if (changed)
        cleanup_cfg (0);
    }
  doing_hardreg_pre_p = false;
#endif
  return 0;
}

and the only architecture that defines HARDREG_PRE_REGNOS is aarch64:

[MacBook-Pro:~/programs/gcc-mainline] lucier% grep -R HARDREG_PRE gcc/config
gcc/config/aarch64/aarch64.h:#define HARDREG_PRE_REGNOS { FPM_REGNUM, 0 }

FPM_REGNUM refers to the fpmr register:

https://developer.arm.com/documentation/ddi0601/2025-06/AArch64-Registers/FPMR--Floating-point-Mode-Register?lang=en

which determines some properties of FP8 (8-bit floating point) arithmetic.

Why I write: The file all.i does not use any FP8 arithmetic, so I question why
a PRE pass that deals with a register that controls FP8 arithmetic is run
unconditionally.
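
As a side note, the 1571957 figure in the warning above can be reproduced from the block and register counts it prints. The sketch below is an assumption about how the size check arrives at that number, not a copy of the GCC source: it assumes one bitmap per basic block, one bit per register rounded up to whole 64-bit words, and a result truncated to KiB. The names bitmap_bytes, gcse_memory_kib and check_reported_case are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: a per-block bitmap holds one bit per register and is
   rounded up to a whole number of 64-bit words.  */
uint64_t
bitmap_bytes (uint64_t n_regs)
{
  const uint64_t word_bits = 64;
  uint64_t n_words = (n_regs + word_bits - 1) / word_bits;
  return n_words * sizeof (uint64_t);
}

/* Assumption: the estimate is one such bitmap per basic block,
   reported in KiB with the fractional part truncated.  */
uint64_t
gcse_memory_kib (uint64_t n_blocks, uint64_t n_regs)
{
  return n_blocks * bitmap_bytes (n_regs) / 1024;
}

/* The counts from the warning: 57919 basic blocks, 222307 registers.
   222307 bits round up to 3474 words, i.e. 27792 bytes per block.  */
void
check_reported_case (void)
{
  assert (gcse_memory_kib (57919, 222307) == 1571957);
}
```

Under those assumptions the estimate is about 1.5 GiB for this one function, which is consistent with the pass declining to run and printing the warning.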