When we search for an insertion point in a candidate block that has
incoming live call-clobbered regs, we look for REG_DEAD notes for
those regs and for indications of the FLAGS reg becoming live. But we
consider insns like
(insn 807 805 6 2 (parallel [
            (set (subreg:SI (reg:HI 509) 0)
                (lshiftrt:SI (reg:SI 514)
                    (const_int 16 [0x10])))
            (clobber (reg:CC 17 flags))
        ])
"/home/packages/tmp/onednn-3.9.1+ds/src/cpu/x64/brgemm/jit_brgemm_amx_uker.cpp":1891:25
1213 {*lshrsi3_1}
     (expr_list:REG_UNUSED (reg:CC 17 flags)
        (expr_list:REG_DEAD (reg:SI 514)
            (nil))))
as making FLAGS_REG live despite the REG_UNUSED note or the setter
being a CLOBBER. The following improves this by also honoring
REG_UNUSED for FLAGS_REG, pruning it again immediately.
This reduces the expensive iteration over other candidate BBs,
cutting compile time for the testcase in the PR from hours to 6s.
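
To illustrate the effect outside of GCC, here is a minimal standalone
C++ sketch of the note handling. It is a toy model, not the code in
i386-features.cc: the types Insn and Note, the function step and the
honor_unused_flags switch are made up for illustration, and the order
in which register writes and notes are processed is an assumption.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Toy stand-ins for RTL concepts; none of these are GCC types.
constexpr int FLAGS = 17;  // hard register number of flags in the dump above

enum class NoteKind { Dead, Unused };

struct Note {
  NoteKind kind;
  int regno;
};

struct Insn {
  std::vector<int> writes;  // registers set or clobbered by the insn
  std::vector<Note> notes;  // REG_DEAD / REG_UNUSED notes on the insn
};

// Update LIVE across INSN.  Every written register becomes live, then
// notes prune the set.  With HONOR_UNUSED_FLAGS, a REG_UNUSED note on
// FLAGS prunes it just like a REG_DEAD note would.
void step(std::set<int> &live, const Insn &insn, bool honor_unused_flags)
{
  for (int regno : insn.writes)
    live.insert(regno);

  for (const Note &note : insn.notes)
    {
      assert(note.regno >= 0);
      bool prune = note.kind == NoteKind::Dead
                   || (honor_unused_flags
                       && note.kind == NoteKind::Unused
                       && note.regno == FLAGS);
      if (prune)
        live.erase(note.regno);
    }
}
```

In this model, an insn that clobbers FLAGS and carries a REG_UNUSED
note for it leaves FLAGS in the live set when honor_unused_flags is
false, and leaves the set unchanged when it is true.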
Bootstrap and regtest running on x86_64-unknown-linux-gnu.
OK for trunk?
I also have an optional followup simplifying the logic added in
r16-3290, which I will post for consideration once testing has succeeded.
Thanks,
Richard.
PR target/123137
* config/i386/i386-features.cc (ix86_emit_tls_call): Improve
local FLAGS_REG liveness calculation.
---
gcc/config/i386/i386-features.cc | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/gcc/config/i386/i386-features.cc b/gcc/config/i386/i386-features.cc
index d5435f009cb..a25f9c245d7 100644
--- a/gcc/config/i386/i386-features.cc
+++ b/gcc/config/i386/i386-features.cc
@@ -4080,7 +4080,9 @@ ix86_emit_tls_call (rtx tls_set, x86_cse_kind kind,
basic_block bb,
rtx link;
for (link = REG_NOTES (insn); link; link = XEXP (link, 1))
- if (REG_NOTE_KIND (link) == REG_DEAD
+ if ((REG_NOTE_KIND (link) == REG_DEAD
+ || (REG_NOTE_KIND (link) == REG_UNUSED
+ && REGNO (XEXP (link, 0)) == FLAGS_REG))
&& REG_P (XEXP (link, 0)))
{
/* Mark the live caller-saved register as dead. */
--
2.51.0