On Wed, Aug 02, 2023 at 01:41:00 PM  Jeff Law <j...@ventanamicro.com> wrote:
>
>c-torture/execute/pr59014-2.c fails with the Zicond work on rv64.  We
>miscompile the "foo" routine because we have eliminated a required sign
>extension.
>
>The key routine looks like this:
>
>foo (long long int x, long long int y)
>{
>   if (((int) x | (int) y) != 0)
>     return 6;
>   return x + y;
>}
>
>So we kindof do the expected thing at expansion time.  We IOR X and Y,
>sign extend the result from 32 to 64 bits (note how the values in the
>source are casted from long long to ints), then emit a suitable
>conditional branch.  ie:
>
>> (insn 10 4 12 2 (set (reg:DI 142)
>>         (ior:DI (reg/v:DI 138 [ x ])
>>             (reg/v:DI 139 [ y ]))) "j.c":6:16 99 {iordi3}
>>      (nil))
>> (insn 12 10 13 2 (set (reg:DI 144)
>>         (sign_extend:DI (subreg:SI (reg:DI 142) 0))) "j.c":6:6 116 
>>{extendsidi2}
>>      (nil))
>> (jump_insn 13 12 14 2 (set (pc)
>>         (if_then_else (ne (reg:DI 144)
>>                 (const_int 0 [0]))
>>             (label_ref:DI 27)
>>             (pc))) "j.c":6:6 243 {*branchdi}
>>      (expr_list:REG_DEAD (reg:DI 144)
>>         (int_list:REG_BR_PROB 233216732 (nil)))
>
>When we if-convert that we generate this sequence:
>
>> (insn 10 4 12 2 (set (reg:DI 142)
>>         (ior:DI (reg/v:DI 138 [ x ])
>>             (reg/v:DI 139 [ y ]))) "j.c":6:16 99 {iordi3}
>>      (nil))
>> (insn 12 10 30 2 (set (reg:DI 144)
>>         (sign_extend:DI (subreg:SI (reg:DI 142) 0))) "j.c":6:6 116 
>>{extendsidi2}
>>      (nil))
>> (insn 30 12 31 2 (set (reg:DI 147)
>>         (const_int 6 [0x6])) "j.c":8:12 179 {*movdi_64bit}
>>      (nil))    
>> (insn 31 30 33 2 (set (reg:DI 146)
>>         (plus:DI (reg/v:DI 138 [ x ])
>>             (reg/v:DI 139 [ y ]))) "j.c":8:12 5 {adddi3}
>>      (nil))     
>> (insn 33 31 34 2 (set (reg:DI 149)
>>         (if_then_else:DI (ne:DI (reg:DI 144)
>>                 (const_int 0 [0]))
>>             (const_int 0 [0])
>>             (reg:DI 146))) "j.c":8:12 11368 {*czero.nez.didi}
>>      (nil))
>> (insn 34 33 35 2 (set (reg:DI 148)
>>         (if_then_else:DI (eq:DI (reg:DI 144)
>>                 (const_int 0 [0]))
>>             (const_int 0 [0])
>>             (reg:DI 147))) "j.c":8:12 11367 {*czero.eqz.didi}
>>      (nil))
>> (insn 35 34 21 2 (set (reg:DI 137 [ <retval> ])
>>         (ior:DI (reg:DI 148)
>>             (reg:DI 149))) "j.c":8:12 99 {iordi3}
>>      (nil))
>
>Which looks basically OK.  The sign extended subreg is a bit worrisome
>though.  And sure enough when we get into combine:
>
>> Failed to match this instruction:
>> (parallel [
>>         (set (reg:DI 149)
>>             (if_then_else:DI (eq:DI (subreg:SI (reg:DI 142) 0)
>>                     (const_int 0 [0]))
>>                 (reg:DI 146)
>>                 (const_int 0 [0])))
>>         (set (reg:DI 144)
>>             (sign_extend:DI (subreg:SI (reg:DI 142) 0)))
>>     ])
>> Successfully matched this instruction:
>> (set (reg:DI 144)
>>     (sign_extend:DI (subreg:SI (reg:DI 142) 0)))
>> Successfully matched this instruction:
>> (set (reg:DI 149)
>>     (if_then_else:DI (eq:DI (subreg:SI (reg:DI 142) 0)
>>             (const_int 0 [0]))
>>         (reg:DI 146)
>>         (const_int 0 [0])))
>> allowing combination of insns 12 and 33
>
>Since we need the side effect we first try the PARALLEL with two sets.
>That, as expected, fails.  Generic combine code then tries to pull apart
>the two sets as distinct insns resulting in this conditional move:
>
>> (insn 33 31 34 2 (set (reg:DI 149)
>>         (if_then_else:DI (eq:DI (subreg:SI (reg:DI 142) 0)
>>                 (const_int 0 [0]))
>>             (reg:DI 146)
>>             (const_int 0 [0]))) "j.c":8:12 11347 {*czero.nez.disi}
>>      (expr_list:REG_DEAD (reg:DI 146)
>>         (nil)))
>
>Bzzt.  We can't actually implement this RTL in the hardware.  Basically
>it's asking to do 32bit comparison on rv64, ignoring the upper 32 bits
>of the input register.  That's not actually how zicond works.
>
>The operands to the comparison need to be in DImode for rv64 and SImode
>for rv32.  That's the X iterator.  
After analyzing the rtl log, I can't agree more with this sentence.

>Note the mode of the comparison
>operands may be different than the mode of the destination.  ie, we
>might have a 64bit comparison and produce a 32bit sign extended result
>much like the setcc insns support.
>
>This patch changes the 6 zicond patterns to use the X iterator on the
>comparison inputs.  That at least makes the patterns correct and fixes
>this particular testcase.   There's a few other lurking problems that
>I'll address in additional patches.
>
>Committed to the trunk,
>Jeff

1 In any case, jeff's analysis is convincing, here I will add a little bit of 
my own analysis.

2 First, for the test cases:

foo(long long int x, long long int y) {
  if (((int)x | (int)y) != 0)
    return 6;
  return x + y;
}

look directly at the compared assembly code. This allows people to quickly
realize where the error occurred.

X_mode.s(right)                                                    
ANYI_mode.S(error)
10 foo:                                                                  10 
foo:                                                                            
                
      11         or      a5,a0,a1                                          11   
     or      a5,a0,a1                                                           
             
      12         sext.w  a5,a5                                            12    
    li      a4,6                                                                
            
      13         addw    a0,a0,a1                                      13       
 addw    a0,a0,a1                                                               
         
      14         li      a4,6                                                   
                                                                                
                                                    
      15         czero.nez       a1,a0,a5                             14        
czero.nez       a1,a0,a5                                                        
        
      16         czero.eqz       a0,a4,a5                             15        
czero.eqz       a0,a4,a5                                                        
        
      17         or      a0,a0,a1                                          16   
     or      a0,a0,a1                                                           
             
      18         ret                                                            
17        ret        

3 You will find that the correct assembly is just one more assembly
instruction: sext.w  a5,a5, the rest of the two are exactly the same.

4 From the perspective of assembly instructions, the a5 value obtained
by sext.w a5, a5 may be different from the original a5 value, which leads
to errors in the test case.

5 However, it is difficult to directly see that an error has occurred
from the rtl log of gcc's passe.

6 I'm wondering about transforms like this:

In test.c.c.301r.ira
(insn 36 34 42 2 (set (reg:DI 153)
        (if_then_else:DI (eq:DI (subreg:SI (reg:DI 145) 0)
                (const_int 0 [0]))
            (reg:DI 149)
            (const_int 0 [0]))) "test.c":4:12 13599 {*czero.nez.disi}
     (expr_list:REG_DEAD (reg:DI 149)
        (nil)))

In test.c.c.302r.reload it becomes
(insn 36 34 42 2 (set (reg:DI 11 a1 [153])
        (if_then_else:DI (eq:DI (reg:SI 15 a5 [145])
                (const_int 0 [0]))
            (reg:DI 10 a0 [149])
            (const_int 0 [0]))) "test.c":4:12 13599 {*czero.nez.disi}
     (nil))

Obviously, (subreg:SI (reg:DI 145) 0) is transformed into (reg:SI 15 a5 [145]) 
after
passing through reload pass. This conversion is wrong, why did gcc not warn?

7 I'm not very familiar with reload pass, maybe someone can give me a
brief introduction, or tell me where to find relevant information? Thanks.

 
Thanks
Xiao Zeng

Reply via email to