Re: [committed] [RISC-V] Avoid sub-word mode comparisons with Zicond

Jeff Law Sat, 05 Aug 2023 07:56:30 -0700



On 8/5/23 01:46, Xiao Zeng wrote:


The operands to the comparison need to be in DImode for rv64 and SImode
for rv32.  That's the X iterator.

After analyzing the rtl log, I can't agree more with this sentence.

Note the mode of the comparison
operands may be different than the mode of the destination.  ie, we
might have a 64bit comparison and produce a 32bit sign extended result
much like the setcc insns support.

This patch changes the 6 zicond patterns to use the X iterator on the
comparison inputs.  That at least makes the patterns correct and fixes
this particular testcase.   There's a few other lurking problems that
I'll address in additional patches.

Committed to the trunk,
Jeff


1 In any case, jeff's analysis is convincing, here I will add a little bit of 
my own analysis.

2 First, for the test cases:

foo(long long int x, long long int y) {
   if (((int)x | (int)y) != 0)
     return 6;
   return x + y;
}

look directly at the compared assembly code. This allows people to quickly
realize where the error occurred.

X_mode.s(right)                                                    
ANYI_mode.S(error)
10 foo:                                                                  10 foo:
       11         or      a5,a0,a1                                          11  
      or      a5,a0,a1
       12         sext.w  a5,a5                                            12   
     li      a4,6
       13         addw    a0,a0,a1                                      13      
  addw    a0,a0,a1
       14         li      a4,6
       15         czero.nez       a1,a0,a5                             14       
 czero.nez       a1,a0,a5
       16         czero.eqz       a0,a4,a5                             15       
 czero.eqz       a0,a4,a5
       17         or      a0,a0,a1                                          16  
      or      a0,a0,a1
       18         ret                                                           
 17        ret

3 You will find that the correct assembly is just one more assembly
instruction: sext.w  a5,a5, the rest of the two are exactly the same.

4 From the perspective of assembly instructions, the a5 value obtained
by sext.w a5, a5 may be different from the original a5 value, which leads
to errors in the test case.

Exactly.


5 However, it is difficult to directly see that an error has occurred
from the rtl log of gcc's passe.

Yes. It's pretty subtle, but par for the course once subreg expressionsare involved. subregs are a major wart in the design/implementation ofRTL due to them having fairly obscure semantics that are dependent onwhether they are widening or narrowing subregs, if the inner object is amem or something else, etc and how they interact withWORD_REGISTER_OPERATIONS.


6 I'm wondering about transforms like this:

In test.c.c.301r.ira
(insn 36 34 42 2 (set (reg:DI 153)
         (if_then_else:DI (eq:DI (subreg:SI (reg:DI 145) 0)
                 (const_int 0 [0]))
             (reg:DI 149)
             (const_int 0 [0]))) "test.c":4:12 13599 {*czero.nez.disi}
      (expr_list:REG_DEAD (reg:DI 149)
         (nil)))

But it's already broken in this form. There was a sign extension inthe IL up through the combine pass. In fact, I nearly made the argumentthat the bug was in combine's simplification of (sign_extend (subreg...)) to (subreg ...).

But ultimately the hardware semantics of risc-v are that comparisonslook at the full register, plain and simple. So a narrowing subreg to asub-word mode is simply wrong i this context. This is also consistentwith the implementation of conditional branches and scc insns in therisc-v backend.


In test.c.c.302r.reload it becomes
(insn 36 34 42 2 (set (reg:DI 11 a1 [153])
         (if_then_else:DI (eq:DI (reg:SI 15 a5 [145])
                 (const_int 0 [0]))
             (reg:DI 10 a0 [149])
             (const_int 0 [0]))) "test.c":4:12 13599 {*czero.nez.disi}
      (nil))

Obviously, (subreg:SI (reg:DI 145) 0) is transformed into (reg:SI 15 a5 [145]) 
after
passing through reload pass. This conversion is wrong, why did gcc not warn?

The only reason subregs exist is because pseudo registers have a singlemode -- which derives from the fact that there is precisely onceinstance of any given pseudo register. See RTL sharing section of theinternals manual.

A subreg expression allows us to look at a pseudo in a mode other thanits natural mode. Essentially its the same bits reinterpreted inanother mode. I'm over-simplifying a lot. The full details of subregsemantics are documented in the RTL section of the internals manual.

WORD_REGISTER_OPERATIONS is a property of the target that (againover-simplifying) says that an operation such as an add may have a givensize (say SImode) in the RTL, but the hardware performs the operation inits native word size (DImode for rv64).

While WORD_REGISTER_OPERATIONS has been around since the 90s, itssemantics have always been a bit vague. It's seen as a wart and I'dvery much like to get away from using it on the RISC-V port.

Register allocation assigns hard registers to the pseudo registers andthe RTL is changed to reflect that register allocation. At that pointwe have no more pseudos in the IL as they have been replaced by hardregisters. Register allocation is split into two distinct phases. IRAand LRA. It's an over-simplification, but IRA is the main registerallocator and LRA is primarily tasked with making each operand match itsconstraints -- ie spilling and reloading.

Hard registers are not shared objects, meaning we can have multipleinstances of the same hard register. So near the end of registerallocation we transform all subregs into an appropriate reg expression.This simplifies the implementation of various passes that run afterregister allocation/reloading.

So the transformation (subreg:SI (reg:DI a5)) into (reg:SI a5) by theregister allocation is valid and normal. That subreg expression shouldnever have shown up in a comparison to begin with.


7 I'm not very familiar with reload pass, maybe someone can give me a
brief introduction, or tell me where to find relevant information? Thanks.

There is some discussion of reload in the internals manual which coverssome of the basics. The implementation details are incredibly complex.Additionally there are two completely distinct implementations. Thereis "reload" which was the original implementation from the late 1980sand is still used by some targets, though it's considered deprecated atthis point. And there's the LRA implementation which was added in the2000s.

The RISC-V port is an LRA port. The best documentation of LRA is thecode itself (lra-*.cc) and the papers that have presented at the variousGNU Cauldron and GCC Summit conferences.


Jeff

Re: [committed] [RISC-V] Avoid sub-word mode comparisons with Zicond

Reply via email to