On Fri, Oct 15, 2021 at 2:15 PM Roger Sayle <ro...@nextmovesoftware.com> wrote:
>
>
> My previous patch, which was intended to reduce the differences seen by
> the combination of -march=cascadelake and -m32, has additionally found
> some more instances where this combination behaves differently to regular
> x86_64-pc-linux-gnu. The middle-end always, and backends usually, use
> emit_move_insn to emit/expand move instructions allowing the backend
> control over placing things in constant pools, adding REG_EQUAL notes,
> and so on. Several of the AVX512 built-in expanders bypass this logic,
> and instead generate moves directly using emit_insn(gen_rtx_SET (dst,src)).
>
> For example, i386-expand.c line 12004 contains:
>     for (i = 0; i < 8; i++)
>       emit_insn (gen_rtx_SET (xmm_regs[i], const0_rtx));
>
> I suspect that in this case, loading of standard_sse_constant_p, my
> change to require loading of likely spilled hard registers via a
> pseudo is perhaps overly strict, so this patch/fix reallows these
> immediate constants values to be loaded directly prior to reload.
>
> If anyone notices a (SPEC benchmark) performance regression with
> this patch, I'll propose the more invasive fix to make more use of
> emit_move_insn in the backend (and revert this fix), but all things
> being equal it's best to leave things the way they previously were.
>
> This patch not only cures the regressions reported by Sunil's
> tester, but in combination with the previous patch now has 7 fewer
> unexpected failures in the testsuite with -m32 -march=cascadelake.
> This patch has also been tested with "make bootstrap" and
> "make -k check" on x86_64-pc-linux-gnu with no new failures.
>
> Ok for mainline?
> Sorry again for the temporary inconvenience.
>
>
> 2021-10-15 Roger Sayle <ro...@nextmovesoftware.com>
>
> gcc/ChangeLog
>     * config/i386/i386.c (ix86_hardreg_mov_ok): For vector modes,
>     allow standard_sse_constant_p immediate constants.

LGTM.

Thanks,
Uros.
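
For readers following the thread, the rule described in the quoted ChangeLog entry can be sketched as a small stand-alone toy model. This is not GCC code: `struct toy_move`, `enum toy_src` and `toy_hardreg_mov_ok` are hypothetical names invented for illustration, and the treatment of register/memory sources and of non-vector modes is an assumption, since the message above only describes the vector-mode case.

```c
#include <stdbool.h>

/* Hypothetical classification of the source operand of a move.
   TOY_SRC_STD_SSE_CONST stands in for "standard_sse_constant_p accepts it".  */
enum toy_src { TOY_SRC_REG, TOY_SRC_MEM, TOY_SRC_STD_SSE_CONST, TOY_SRC_OTHER };

/* Hypothetical summary of the facts such a predicate would inspect.  */
struct toy_move {
  bool dst_is_hard_reg;        /* destination is a hard register */
  bool dst_is_vector_mode;     /* destination has a vector mode */
  bool dst_likely_spilled;     /* destination's class is likely spilled */
  enum toy_src src;            /* kind of source operand */
  bool reload_completed;       /* register allocation has finished */
};

/* Returns true if the move may be emitted directly, false if it should
   go through a pseudo first.  Toy model only.  */
static bool toy_hardreg_mov_ok(const struct toy_move *m)
{
  /* The restriction only applies before reload, to likely-spilled
     hard registers.  */
  if (!m->dst_is_hard_reg || !m->dst_likely_spilled || m->reload_completed)
    return true;

  /* Assumption: plain register and memory sources are not restricted.  */
  if (m->src == TOY_SRC_REG || m->src == TOY_SRC_MEM)
    return true;

  /* The relaxation from the ChangeLog entry: for vector modes, a
     standard SSE constant may be loaded directly.  */
  if (m->dst_is_vector_mode)
    return m->src == TOY_SRC_STD_SSE_CONST;

  /* Non-vector modes are outside the scope of the quoted entry; this
     toy conservatively rejects them.  */
  return false;
}
```

In this model, a move of a standard SSE constant into a likely-spilled vector hard register before reload is accepted, while any other complex source is still routed through a pseudo.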