On Wed, Aug 02, 2023 at 01:41:00 PM Jeff Law <j...@ventanamicro.com> wrote: > >c-torture/execute/pr59014-2.c fails with the Zicond work on rv64. We >miscompile the "foo" routine because we have eliminated a required sign >extension. > >The key routine looks like this: > >foo (long long int x, long long int y) >{ > if (((int) x | (int) y) != 0) > return 6; > return x + y; >} > >So we kindof do the expected thing at expansion time. We IOR X and Y, >sign extend the result from 32 to 64 bits (note how the values in the >source are casted from long long to ints), then emit a suitable >conditional branch. ie: > >> (insn 10 4 12 2 (set (reg:DI 142) >> (ior:DI (reg/v:DI 138 [ x ]) >> (reg/v:DI 139 [ y ]))) "j.c":6:16 99 {iordi3} >> (nil)) >> (insn 12 10 13 2 (set (reg:DI 144) >> (sign_extend:DI (subreg:SI (reg:DI 142) 0))) "j.c":6:6 116 >>{extendsidi2} >> (nil)) >> (jump_insn 13 12 14 2 (set (pc) >> (if_then_else (ne (reg:DI 144) >> (const_int 0 [0])) >> (label_ref:DI 27) >> (pc))) "j.c":6:6 243 {*branchdi} >> (expr_list:REG_DEAD (reg:DI 144) >> (int_list:REG_BR_PROB 233216732 (nil))) > >When we if-convert that we generate this sequence: > >> (insn 10 4 12 2 (set (reg:DI 142) >> (ior:DI (reg/v:DI 138 [ x ]) >> (reg/v:DI 139 [ y ]))) "j.c":6:16 99 {iordi3} >> (nil)) >> (insn 12 10 30 2 (set (reg:DI 144) >> (sign_extend:DI (subreg:SI (reg:DI 142) 0))) "j.c":6:6 116 >>{extendsidi2} >> (nil)) >> (insn 30 12 31 2 (set (reg:DI 147) >> (const_int 6 [0x6])) "j.c":8:12 179 {*movdi_64bit} >> (nil)) >> (insn 31 30 33 2 (set (reg:DI 146) >> (plus:DI (reg/v:DI 138 [ x ]) >> (reg/v:DI 139 [ y ]))) "j.c":8:12 5 {adddi3} >> (nil)) >> (insn 33 31 34 2 (set (reg:DI 149) >> (if_then_else:DI (ne:DI (reg:DI 144) >> (const_int 0 [0])) >> (const_int 0 [0]) >> (reg:DI 146))) "j.c":8:12 11368 {*czero.nez.didi} >> (nil)) >> (insn 34 33 35 2 (set (reg:DI 148) >> (if_then_else:DI (eq:DI (reg:DI 144) >> (const_int 0 [0])) >> (const_int 0 [0]) >> (reg:DI 147))) "j.c":8:12 11367 {*czero.eqz.didi} >> (nil)) >> (insn 35 34 21 2 (set (reg:DI 137 [ <retval> ]) >> (ior:DI (reg:DI 148) >> (reg:DI 149))) "j.c":8:12 99 {iordi3} >> (nil)) > >Which looks basically OK. The sign extended subreg is a bit worrisome >though. And sure enough when we get into combine: > >> Failed to match this instruction: >> (parallel [ >> (set (reg:DI 149) >> (if_then_else:DI (eq:DI (subreg:SI (reg:DI 142) 0) >> (const_int 0 [0])) >> (reg:DI 146) >> (const_int 0 [0]))) >> (set (reg:DI 144) >> (sign_extend:DI (subreg:SI (reg:DI 142) 0))) >> ]) >> Successfully matched this instruction: >> (set (reg:DI 144) >> (sign_extend:DI (subreg:SI (reg:DI 142) 0))) >> Successfully matched this instruction: >> (set (reg:DI 149) >> (if_then_else:DI (eq:DI (subreg:SI (reg:DI 142) 0) >> (const_int 0 [0])) >> (reg:DI 146) >> (const_int 0 [0]))) >> allowing combination of insns 12 and 33 > >Since we need the side effect we first try the PARALLEL with two sets. >That, as expected, fails. Generic combine code then tries to pull apart >the two sets as distinct insns resulting in this conditional move: > >> (insn 33 31 34 2 (set (reg:DI 149) >> (if_then_else:DI (eq:DI (subreg:SI (reg:DI 142) 0) >> (const_int 0 [0])) >> (reg:DI 146) >> (const_int 0 [0]))) "j.c":8:12 11347 {*czero.nez.disi} >> (expr_list:REG_DEAD (reg:DI 146) >> (nil))) > >Bzzt. We can't actually implement this RTL in the hardware. Basically >it's asking to do 32bit comparison on rv64, ignoring the upper 32 bits >of the input register. That's not actually how zicond works. > >The operands to the comparison need to be in DImode for rv64 and SImode >for rv32. That's the X iterator. After analyzing the rtl log, I can't agree more with this sentence.
>Note the mode of the comparison >operands may be different than the mode of the destination. ie, we >might have a 64bit comparison and produce a 32bit sign extended result >much like the setcc insns support. > >This patch changes the 6 zicond patterns to use the X iterator on the >comparison inputs. That at least makes the patterns correct and fixes >this particular testcase. There's a few other lurking problems that >I'll address in additional patches. > >Committed to the trunk, >Jeff 1 In any case, jeff's analysis is convincing, here I will add a little bit of my own analysis. 2 First, for the test cases: foo(long long int x, long long int y) { if (((int)x | (int)y) != 0) return 6; return x + y; } look directly at the compared assembly code. This allows people to quickly realize where the error occurred. X_mode.s(right) ANYI_mode.S(error) 10 foo: 10 foo: 11 or a5,a0,a1 11 or a5,a0,a1 12 sext.w a5,a5 12 li a4,6 13 addw a0,a0,a1 13 addw a0,a0,a1 14 li a4,6 15 czero.nez a1,a0,a5 14 czero.nez a1,a0,a5 16 czero.eqz a0,a4,a5 15 czero.eqz a0,a4,a5 17 or a0,a0,a1 16 or a0,a0,a1 18 ret 17 ret 3 You will find that the correct assembly is just one more assembly instruction: sext.w a5,a5, the rest of the two are exactly the same. 4 From the perspective of assembly instructions, the a5 value obtained by sext.w a5, a5 may be different from the original a5 value, which leads to errors in the test case. 5 However, it is difficult to directly see that an error has occurred from the rtl log of gcc's passe. 6 I'm wondering about transforms like this: In test.c.c.301r.ira (insn 36 34 42 2 (set (reg:DI 153) (if_then_else:DI (eq:DI (subreg:SI (reg:DI 145) 0) (const_int 0 [0])) (reg:DI 149) (const_int 0 [0]))) "test.c":4:12 13599 {*czero.nez.disi} (expr_list:REG_DEAD (reg:DI 149) (nil))) In test.c.c.302r.reload it becomes (insn 36 34 42 2 (set (reg:DI 11 a1 [153]) (if_then_else:DI (eq:DI (reg:SI 15 a5 [145]) (const_int 0 [0])) (reg:DI 10 a0 [149]) (const_int 0 [0]))) "test.c":4:12 13599 {*czero.nez.disi} (nil)) Obviously, (subreg:SI (reg:DI 145) 0) is transformed into (reg:SI 15 a5 [145]) after passing through reload pass. This conversion is wrong, why did gcc not warn? 7 I'm not very familiar with reload pass, maybe someone can give me a brief introduction, or tell me where to find relevant information? Thanks. Thanks Xiao Zeng