https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114996

            Bug ID: 114996
           Summary: [15 Regression] [RISC-V] 2->2 combination no longer
                    occurring
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: rtl-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: law at gcc dot gnu.org
  Target Milestone: ---

So this test has started failing on RISC-V after re-introduction of the change
to avoid 2->2 combinations when i2 is unchanged:

/* { dg-do compile } */
/* { dg-require-effective-target rv64 } */
/* { dg-skip-if "" { *-*-* } { "-O0" "-Og" "-Os" "-Oz" } } */
/* { dg-options "-march=rv64gc_zba" } */

typedef unsigned int uint32_t;
typedef unsigned long uint64_t;

void foo(uint32_t a, uint64_t *b_ptr, uint64_t b, uint64_t *c_ptr, uint64_t c)
{
  uint64_t x = a;
  *b_ptr = b + x;
  *c_ptr = c + x;
}

/* { dg-final { scan-assembler-not "\\szext.w\\s" } } */


[ That's heavily reduced and twiddled a bit from a hot loop in xz IIRC. ]

The key thing to note about the test is we have a zero extension of a 32 bit
value to 64 bits.  The resulting 64 bit value is used in *two* subsequent
instructions.

At the start of combine the RTL looks like:

(insn 10 7 11 2 (set (reg/v:DI 136 [ x ])
        (zero_extend:DI (subreg/s/u:SI (reg/v:DI 137 [ a ]) 0))) "j.c":11:12 458 {*zero_extendsidi2_bitmanip}
     (expr_list:REG_DEAD (reg/v:DI 137 [ a ])
        (nil)))
(insn 11 10 12 2 (set (reg:DI 142 [ _1 ])
        (plus:DI (reg/v:DI 136 [ x ])
            (reg/v:DI 139 [ b ]))) "j.c":12:14 5 {adddi3}
     (expr_list:REG_DEAD (reg/v:DI 139 [ b ])
        (nil)))
(insn 12 11 13 2 (set (mem:DI (reg/v/f:DI 138 [ b_ptr ]) [1 *b_ptr_7(D)+0 S8 A64])
        (reg:DI 142 [ _1 ])) "j.c":12:10 268 {*movdi_64bit}
     (expr_list:REG_DEAD (reg:DI 142 [ _1 ])
        (expr_list:REG_DEAD (reg/v/f:DI 138 [ b_ptr ])
            (nil))))
(insn 13 12 14 2 (set (reg:DI 143 [ _2 ])
        (plus:DI (reg/v:DI 136 [ x ])
            (reg/v:DI 141 [ c ]))) "j.c":13:14 5 {adddi3}
     (expr_list:REG_DEAD (reg/v:DI 141 [ c ])
        (expr_list:REG_DEAD (reg/v:DI 136 [ x ])
            (nil))))
(insn 14 13 0 2 (set (mem:DI (reg/v/f:DI 140 [ c_ptr ]) [1 *c_ptr_10(D)+0 S8 A64])
        (reg:DI 143 [ _2 ])) "j.c":13:10 268 {*movdi_64bit}
     (expr_list:REG_DEAD (reg:DI 143 [ _2 ])
        (expr_list:REG_DEAD (reg/v/f:DI 140 [ c_ptr ])
            (nil))))


Without the problematic combine change we would first combine 10->11 as a
2->2 combination.  Insn 10 would remain unchanged, but insn 11 would
incorporate the zero extension (RISC-V has an instruction for this).

After combining 10->11, (reg:DI 136) will have a single use, enabling a 2->1
combination 10->13.

With the problematic combiner change, the 10->11 combination is rejected because
i2 is unchanged, so (reg:DI 136) keeps two uses and the 10->13 combination fails
as well.  The net result is an unnecessary zero extension in that loop.  Proper
code for that testcase is:
        add.uw  a2,a0,a2
        sd      a2,0(a1)
        add.uw  a0,a0,a4
        sd      a0,0(a3)

With that combiner change instead we're getting:

        zext.w  a0,a0
        add     a2,a0,a2
        sd      a2,0(a1)
        add     a0,a0,a4
        sd      a0,0(a3)
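
For anyone who wants to convince themselves the two sequences store the same
values, here is a rough C model.  It is only a sketch and not part of the
testcase: the helper names (add_uw, zext_w, foo_expected, foo_current,
check_agree) are made up for illustration, and registers are modelled as plain
64-bit values.  It assumes the Zba semantics of add.uw, i.e. rd = rs2 +
zero-extended low 32 bits of rs1.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of Zba add.uw: rd = rs2 + zext(rs1[31:0]).  */
static uint64_t add_uw(uint64_t rs1, uint64_t rs2)
{
  return rs2 + (uint64_t)(uint32_t)rs1;
}

/* Hypothetical model of zext.w: rd = zext(rs1[31:0]).  */
static uint64_t zext_w(uint64_t rs1)
{
  return (uint64_t)(uint32_t)rs1;
}

/* The desired sequence: two add.uw, no separate extension.  */
static void foo_expected(uint64_t a0, uint64_t *a1, uint64_t a2,
                         uint64_t *a3, uint64_t a4)
{
  *a1 = add_uw(a0, a2);   /* add.uw a2,a0,a2 ; sd a2,0(a1) */
  *a3 = add_uw(a0, a4);   /* add.uw a0,a0,a4 ; sd a0,0(a3) */
}

/* The sequence we get now: one zext.w feeding two plain adds.  */
static void foo_current(uint64_t a0, uint64_t *a1, uint64_t a2,
                        uint64_t *a3, uint64_t a4)
{
  a0 = zext_w(a0);        /* zext.w a0,a0 */
  *a1 = a0 + a2;          /* add a2,a0,a2 ; sd a2,0(a1) */
  *a3 = a0 + a4;          /* add a0,a0,a4 ; sd a0,0(a3) */
}

/* Assert that both models store the same values for one input set.  */
static void check_agree(uint64_t a, uint64_t b, uint64_t c)
{
  uint64_t e1, e2, c1, c2;
  foo_expected(a, &e1, b, &e2, c);
  foo_current(a, &c1, b, &c2, c);
  assert(e1 == c1);
  assert(e2 == c2);
}
```

Calling check_agree with a value of a whose upper 32 bits are nonzero exercises
the extension in both paths.  The results match; the only difference is the
extra zext.w instruction in the current output.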
