reg copies in cse

Richard Sandiford Wed, 05 Feb 2020 05:31:28 -0800

Jeff Law <l...@redhat.com> writes:
> Richard & Segher, if y'all could check my analysis here, it'd be
> appreciated.
>
> pr90275 is a P2 regression that is only triggering on ARM.  David's
> testcase in c#1 is the best for this problem as it doesn't require
> magic flags like -fno-dce to trigger.
>
> The block in question:
>
>> (code_label 89 88 90 24 15 (nil) [0 uses])
>> (note 90 89 97 24 [bb 24] NOTE_INSN_BASIC_BLOCK)
>> (insn 97 90 98 24 (parallel [
>>             (set (reg:CC 100 cc)
>>                 (compare:CC (reg:SI 131 [ d_lsm.21 ])
>>                     (const_int 0 [0])))
>>             (set (reg:SI 135 [ d_lsm.21 ])
>>                 (reg:SI 131 [ d_lsm.21 ]))
>>         ]) "pr90275.c":21:45 248 {*movsi_compare0}
>>      (expr_list:REG_DEAD (reg:SI 131 [ d_lsm.21 ])
>>         (nil)))
>> (insn 98 97 151 24 (set (reg:SI 136 [+4 ])
>>         (reg:SI 132 [ d_lsm.21+4 ])) "pr90275.c":21:45 241 {*arm_movsi_insn}
>>      (expr_list:REG_DEAD (reg:SI 132 [ d_lsm.21+4 ])
>>         (expr_list:REG_DEAD (reg:CC 100 cc)
>>             (nil))))
>> (insn 151 98 152 24 (set (reg:SI 131 [ d_lsm.21 ])
>>         (reg:SI 131 [ d_lsm.21 ])) "pr90275.c":21:45 241 {*arm_movsi_insn}
>>      (expr_list:REG_DEAD (reg:SI 135 [ d_lsm.21 ])
>>         (nil)))
>> (insn 152 151 103 24 (set (reg:SI 132 [ d_lsm.21+4 ])
>>         (reg:SI 136 [+4 ])) "pr90275.c":21:45 241 {*arm_movsi_insn}
>>      (expr_list:REG_DEAD (reg:SI 136 [+4 ])
>>         (nil)))
>> 
> insns 97 and 151 are the most important.
>
> We process insn 97 which creates an equivalency between r135 and r131. 
> This is expressed by putting both on on the "same_value" chain
> (table_elt->{next,prev}_same_value).
>
> When we put the REGs on the chain we'll set REG_QTY to a positive
> number which indicates their values are valid.
>
> We continue processing insns forward and run into insn 151 which is a
> self-copy.
>
> First CSE will invalidate r131 (because its set).  Invalidation is
> accomplished by setting REG_QTY for r131 to a negative value.  It does
> not remove r131 from the same value chains.
>
> Then CSE will call insert_regs for r131.  The qty is not valid, so we
> get into this code:
>
>>      if (modified || ! qty_valid)
>>         {
>>           if (classp)
>>             for (classp = classp->first_same_value;
>>                  classp != 0;
>>                  classp = classp->next_same_value)
>>               if (REG_P (classp->exp)
>>                   && GET_MODE (classp->exp) == GET_MODE (x))
>>                 {
>>                   unsigned c_regno = REGNO (classp->exp);
>> 
>>                   gcc_assert (REGNO_QTY_VALID_P (c_regno));
>> [ ... ]
>
> So we walk the chain of same values for r131.  WHen walking we run into
> r131 itself.  Since r131 has been invalidated  we trip the assert.
>
>
> The fix is pretty simple.  We just arrange to stop processing insns
> that are nop reg->reg copies much like we already do for mem->mem
> copies and (set (pc) (pc)).
>
> This has bootstrapped and regression tested on x86_64.  I've also
> verified it fixes the testcase in c#1 of pr90275, the test in pr93125
> and pr92388.  Interestingly enough I couldn't trigger the original
> testcase in 90275, but I'm confident this is ultimately all the same
> problem.


This looks similar to the infamous (to me):

   https://gcc.gnu.org/ml/gcc-patches/2019-11/msg01581.html

which had to be reverted because it broke powerpc64 bootstrap.
The problem was that n_sets is misleading for calls:

   https://gcc.gnu.org/ml/gcc-patches/2019-11/msg01858.html

That's easy to fix (and I have a fix).  But given the damage this caused
last time, I think it's probably best left to GCC 11.

Thanks,
Richard

Re: [RFA] [PR rtl-optimization/90275] Handle nop reg->reg copies in cse

Reply via email to