https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103302

Jim Wilson <wilson at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |wilson at gcc dot gnu.org

--- Comment #2 from Jim Wilson <wilson at gcc dot gnu.org> ---
It is the second reversed comparison that is wrong.  This is the
  u32_0 <= (...)
on the first line of foo0.  In the assembly file, this ends up as
        mv      a0,a6
        mv      a1,a7
        xor     a6,a0,a6
        xor     a7,a1,a7
        or      a6,a6,a7
        seqz    a6,a6
and note that it is comparing a value against itself when it should be
comparing two different values.

The harden compare pass is generating RTL

insn 156 152 155 6 (set (reg:TI 201)
        (asm_operands:TI ("") ("=g") 0 [
                (reg:TI 77 [ _8 ])
            ]
             [
                (asm_input:TI ("0"))
            ]
             [])) -1
     (nil))
(insn 155 156 153 6 (clobber (reg:TI 77 [ _8 ])) -1
     (nil))
(insn 153 155 154 6 (set (subreg:DI (reg:TI 77 [ _8 ]) 0)
        (subreg:DI (reg:TI 201) 0)) -1
     (nil))
(insn 154 153 160 6 (set (subreg:DI (reg:TI 77 [ _8 ]) 8)
        (subreg:DI (reg:TI 201) 8)) -1
     (nil))

Then the asmcons pass is changing this to

(insn 851 152 849 5 (clobber (reg:TI 201)) -1
     (nil))
(insn 849 851 850 5 (set (subreg:DI (reg:TI 201) 0)
        (subreg:DI (reg:TI 77 [ _8 ]) 0)) -1
     (nil))
(insn 850 849 156 5 (set (subreg:DI (reg:TI 201) 8)
        (subreg:DI (reg:TI 77 [ _8 ]) 8)) -1
     (nil))
(insn 156 850 155 5 (set (reg:TI 201)
        (asm_operands:TI ("") ("=g") 0 [
                (reg:TI 201)
            ]
             [
                (asm_input:TI ("0"))
            ]
             [])) -1
     (expr_list:REG_DEAD (reg:TI 77 [ _8 ])
        (nil)))
(insn 155 156 153 5 (clobber (reg:TI 77 [ _8 ])) -1
     (nil))
(insn 153 155 154 5 (set (subreg:DI (reg:TI 77 [ _8 ]) 0)
        (subreg:DI (reg:TI 201) 0)) 135 {*movdi_64bit}
     (nil))
(insn 154 153 854 5 (set (subreg:DI (reg:TI 77 [ _8 ]) 8)
        (subreg:DI (reg:TI 201) 8)) 135 {*movdi_64bit}
     (expr_list:REG_DEAD (reg:TI 201)
        (nil)))

Then the register allocator puts both 77 and 201 in the same register, which
means we are now clobbering values we need.

In the reload dump I see

(insn 851 152 849 5 (clobber (reg:TI 16 a6 [201])) -1
     (nil))
(insn 849 851 850 5 (set (reg:DI 16 a6 [201])
        (reg:DI 10 a0 [orig:77 _8 ] [77])) 135 {*movdi_64bit}
     (nil))
(insn 850 849 907 5 (set (reg:DI 17 a7 [+8 ])
        (reg:DI 11 a1 [ _8+8 ])) 135 {*movdi_64bit}
     (nil))
(insn 907 850 1014 5 (clobber (reg:TI 16 a6 [201])) -1
     (nil))

so the insns 849 and 850 get optimized away, but we need them.  Also, we have

(insn 854 154 852 5 (clobber (reg:TI 16 a6 [202])) -1
     (nil))
(insn 852 854 853 5 (set (reg:DI 16 a6 [202])
        (reg:DI 6 t1 [orig:86 _39 ] [86])) 135 {*movdi_64bit}
     (nil))
(insn 853 852 913 5 (set (reg:DI 17 a7 [+8 ])
        (reg:DI 7 t2 [ _39+8 ])) 135 {*movdi_64bit}
     (nil))
(insn 913 853 1010 5 (clobber (reg:TI 16 a6 [202])) -1
     (nil))

and the insns 852 and 853 get optimized away, but we need them.  The comparison
is supposed to be a0/a1 versus t1/t2, but we end up with comparing a6/a7
against itself.

asmcons is calling emit_move_insn to copy the asm source to the asm dest so it
can simplify the asm.  Since this is a multiword mode, and the riscv backend
doesn't have a movti pattern, this ends up calling emit_move_multi_word which
emits the extra clobber that causes the problem.

I suppose we could fix this by adding a movti pattern to the riscv backend to
avoid the clobbers but we shouldn't have to.  Though it would be interesting to
see if this maybe results in better code optimization.

It isn't clear exactly where the problem is.  Maybe asmcons shouldn't try to
fix an asm when the mode is larger than the word mode?  This could be left to
the register allocator to fix.  Or maybe harden compare shouldn't generate RTL
like this?  This could be a harden compare issue, or maybe an issue with the
RTL expander to emit the rtl differently.  Looks like the same issue with the
RTL expander calling emit_move_multi_word which generates the clobber.  Or
maybe a movti pattern is actually required now?

I did verify that disabling asmcons fixes the problem for this testcase.  I had
to hack the code in function.c to do that as there is no option to disable it.

Reply via email to