On Fri, Dec 15, 2023 at 2:25 AM haochen.jiang
<haoch...@ecsmtp.sh.intel.com> wrote:
>
> On Linux/x86_64,
>
> 8afdbcdd7abe1e3c7a81e07f34c256e7f2dbc652 is the first bad commit
> commit 8afdbcdd7abe1e3c7a81e07f34c256e7f2dbc652
> Author: Di Zhao <diz...@os.amperecomputing.com>
> Date:   Fri Dec 15 03:22:32 2023 +0800
>
>     Consider fully pipelined FMA in get_reassociation_width
>
> caused
>
> FAIL: gcc.dg/guality/pr58791-4.c -O2 -DPREVENT_OPTIMIZATION line
> pr58791-4.c:32 i2 == 487
> FAIL: gcc.dg/guality/pr58791-4.c -O2 -DPREVENT_OPTIMIZATION line
> pr58791-4.c:32 i == 486
> FAIL: gcc.dg/guality/pr58791-4.c -O2 -flto -fno-use-linker-plugin
> -flto-partition=none -DPREVENT_OPTIMIZATION line pr58791-4.c:32 i2 == 487
> FAIL: gcc.dg/guality/pr58791-4.c -O2 -flto -fno-use-linker-plugin
> -flto-partition=none -DPREVENT_OPTIMIZATION line pr58791-4.c:32 i == 486
> FAIL: gcc.dg/guality/pr58791-4.c -O2 -flto -fuse-linker-plugin
> -fno-fat-lto-objects -DPREVENT_OPTIMIZATION line pr58791-4.c:32 i2 == 487
> FAIL: gcc.dg/guality/pr58791-4.c -O2 -flto -fuse-linker-plugin
> -fno-fat-lto-objects -DPREVENT_OPTIMIZATION line pr58791-4.c:32 i == 486
> FAIL: gcc.dg/guality/pr58791-4.c -O3 -g -DPREVENT_OPTIMIZATION line
> pr58791-4.c:32 i2 == 487
> FAIL: gcc.dg/guality/pr58791-4.c -O3 -g -DPREVENT_OPTIMIZATION line
> pr58791-4.c:32 i == 486
> FAIL: gcc.dg/guality/pr58791-4.c -Os -DPREVENT_OPTIMIZATION line
> pr58791-4.c:32 i2 == 487
> FAIL: gcc.dg/guality/pr58791-4.c -Os -DPREVENT_OPTIMIZATION line
> pr58791-4.c:32 i == 486
>
> with GCC configured with
>
> ../../gcc/configure
> --prefix=/export/users/haochenj/src/gcc-bisect/master/master/r14-6559/usr
> --enable-clocale=gnu --with-system-zlib --with-demangler-in-ld
> --with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl
> --enable-libmpx x86_64-linux --disable-bootstrap
>
> To reproduce:
>
> $ cd {build_dir}/gcc && make check
> RUNTESTFLAGS="guality.exp=gcc.dg/guality/pr58791-4.c
> --target_board='unix{-m64\ -march=cascadelake}'"

There's an extra intermediate stmt inserted (for much later use, but
reassoc inserts close to defs) that is then also used for FMA forming.
This disturbs things in some way:

  g_5 = (double) f_4;
  # DEBUG g => g_5
  # DEBUG BEGIN_STMT
  h_7 = (double) b_6(D);
  # DEBUG h => h_7
  # DEBUG BEGIN_STMT
  _39 = h_7 * 3.25e+0;
  # DEBUG D#5 => g_5 * h_7
  # DEBUG i => D#5
  # DEBUG BEGIN_STMT
  # DEBUG i2 => D#5 + 1.0e+0
  # DEBUG BEGIN_STMT
  # DEBUG D#8 => g_5 * _39
  _3 = .FMA (g_5, _39, h_7);

g_5 is dead after the FMA.  Interestingly removing the

  asm volatile (NOP : : : "memory");
  asm volatile (NOP : : : "memory");

lines fixes the regression because then we can TER the FMA, keeping
g_5 live for longer.

Richard.
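
For readers who want the expression shape without the dump syntax, here is a minimal C sketch. It is not pr58791-4.c itself: the function names source_order and reassociated are hypothetical, and the source-level form (h + (g * h) * 3.25) is an assumption inferred from the D#8 and .FMA lines in the dump. It only shows how the multiply by 3.25 can move onto h, so that the product g * h no longer exists as a value and the debug binds for i and i2 depend on g_5 staying live.

```c
/* Hypothetical illustration only; names and source shape are assumed. */

/* Source-order evaluation: i = g * h is materialized, so i and i2 are
   directly available at the point where a debugger would inspect them. */
double source_order (double g, double h, double *i, double *i2)
{
  *i = g * h;              /* corresponds to D#5 => g_5 * h_7 */
  *i2 = *i + 1.0;          /* corresponds to i2 => D#5 + 1.0e+0 */
  double j = *i * 3.25;    /* corresponds to D#8 => g_5 * _39 */
  return h + j;
}

/* Shape after reassociation: t plays the role of _39, and the return
   expression is the candidate for .FMA (g_5, _39, h_7).  g * h is never
   computed here, so i can only be recovered while g is still live. */
double reassociated (double g, double h)
{
  double t = h * 3.25;     /* _39 = h_7 * 3.25e+0 */
  return g * t + h;        /* _3 = .FMA (g_5, _39, h_7) */
}
```

With g = 2.0 and h = 4.0 both functions return 30.0. In general the two forms can round differently, which is why GCC only reassociates floating-point expressions when flags such as -fassociative-math allow it.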