Following up on posts/reviews by Segher and Uros, there's some question
over why the middle-end's lower subreg pass emits a clobber (of a
multi-word register) into the instruction stream before emitting the
sequence of moves of the word-sized parts.  This clobber interferes
with (LRA) register allocation, preventing the multi-word pseudo from
remaining in the same hard registers.  This patch eliminates this
(presumably superfluous) clobber and thereby improves register allocation.

A concrete example of the observed improvement is PR target/43644.
For the test case:

__int128 foo(__int128 x, __int128 y) { return x+y; }

on x86_64-pc-linux-gnu, gcc -O2 currently generates:

foo:    movq    %rsi, %rax
        movq    %rdi, %r8
        movq    %rax, %rdi
        movq    %rdx, %rax
        movq    %rcx, %rdx
        addq    %r8, %rax
        adcq    %rdi, %rdx
        ret

With this patch, we now generate the much improved:

foo:    movq    %rdx, %rax
        movq    %rcx, %rdx
        addq    %rdi, %rax
        adcq    %rsi, %rdx
        ret
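
For readers less familiar with the by-parts view, the following is a rough
C sketch of what the improved sequence computes: the low words are added
first, and the carry out of that addition is folded into the sum of the
high words.  The type u128_words and the function add_by_words are
hypothetical names used only for illustration; they are not part of the
patch or of GCC.

```c
#include <stdint.h>

/* Hypothetical two-word view of a 128-bit value (illustration only).  */
typedef struct
{
  uint64_t lo;  /* least significant 64 bits */
  uint64_t hi;  /* most significant 64 bits */
} u128_words;

/* Add two 128-bit values one 64-bit word at a time.  */
static u128_words
add_by_words (u128_words x, u128_words y)
{
  u128_words r;
  r.lo = x.lo + y.lo;            /* low words: the addq in the listing */
  uint64_t carry = r.lo < x.lo;  /* 1 if the low addition wrapped around */
  r.hi = x.hi + y.hi + carry;    /* high words plus carry: the adcq */
  return r;
}
```

In the listing above, x arrives in %rdi (low) and %rsi (high), y in %rdx
(low) and %rcx (high), and the result is returned in %rax (low) and %rdx
(high), so only the two moves that place y into the return registers are
needed.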

This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
and make -k check, both with and without --target_board=unix{-m32} with
no new failures.  OK for mainline?


2023-05-06  Roger Sayle  <ro...@nextmovesoftware.com>

gcc/ChangeLog
        PR target/43644
        * lower-subreg.cc (resolve_simple_move): Don't emit a clobber
        immediately before moving a multi-word register by parts.

gcc/testsuite/ChangeLog
        PR target/43644
        * gcc.target/i386/pr43644.c: New test case.


Thanks in advance,
Roger
--

diff --git a/gcc/lower-subreg.cc b/gcc/lower-subreg.cc
index 81fc5380..7c9cc3c 100644
--- a/gcc/lower-subreg.cc
+++ b/gcc/lower-subreg.cc
@@ -1086,9 +1086,6 @@ resolve_simple_move (rtx set, rtx_insn *insn)
     {
       unsigned int i;
 
-      if (REG_P (dest) && !HARD_REGISTER_NUM_P (REGNO (dest)))
-       emit_clobber (dest);
-
       for (i = 0; i < words; ++i)
        {
          rtx t = simplify_gen_subreg_concatn (word_mode, dest,
diff --git a/gcc/testsuite/gcc.target/i386/pr43644.c b/gcc/testsuite/gcc.target/i386/pr43644.c
new file mode 100644
index 0000000..ffdf31c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr43644.c
@@ -0,0 +1,11 @@
+/* { dg-do compile { target int128 } } */
+/* { dg-options "-O2" } */
+
+__int128 foo(__int128 x, __int128 y)
+{
+  return x+y;
+}
+
+/* { dg-final { scan-assembler-times "movq" 2 } } */
+/* { dg-final { scan-assembler-not "push" } } */
+/* { dg-final { scan-assembler-not "pop" } } */
