Hi Vineet, Jeff,

On Wed, Jul 19, 2023 at 7:31 AM Jeff Law <jeffreya...@gmail.com> wrote:
>
>
>
> On 7/18/23 17:42, Vineet Gupta wrote:
> > Hi Manolis,
> >
> > On 7/18/23 11:01, Jeff Law via Gcc-patches wrote:
> >> Vineet @ Rivos has indicated he stumbled across an ICE with the V3
> >> code.  Hopefully he'll get a testcase for that extracted shortly.
> >
> > Yeah, I was trying to build SPEC2017 with this patch and ran into ICE
> > for several of them with -Ofast build: The reduced test from 455.nab is
> > attached here.
> > The issue happens with v2 as well, so not something introduced by v3.
> >
> > There's ICE in cprop_hardreg which immediately follows f-m-o.
> >
> >
> > The protagonist is ins 93 which starts off in combine as a simple set of
> > a DF 0.
> >
> > | sff.i.288r.combine:(insn 93 337 94 8 (set (reg/v:DF 236 [ e ])
> > | sff.i.288r.combine- (const_double:DF 0.0 [0x0.0p+0])) "sff.i":23:11
> > 190 {*movdf_hardfloat_rv64}
> >
> > Subsequently reload transforms it into SP + offset
> >
> > | sff.i.303r.reload:(insn 93 337 94 9 (set (mem/c:DF (plus:DI (reg/f:DI
> > 2 sp)
> > | sff.i.303r.reload- (const_int 8 [0x8])) [4 %sfp+-8 S8 A64])
> > | sff.i.303r.reload- (const_double:DF 0.0 [0x0.0p+0])) "sff.i":23:11 190
> > {*movdf_hardfloat_rv64}
> > | sff.i.303r.reload- (expr_list:REG_EQUAL (const_double:DF 0.0 [0x0.0p+0])
> >
> > It gets processed by f-m-o and lands in cprop_hardreg, where it triggers
> > ICE.
> >
> > | (insn 93 337 523 11 (set (mem/c:DF (plus:DI (reg/f:DI 2 sp)
> > |                 (const_int 8 [0x8])) [4 %sfp+-8 S8 A64])
> > |         (const_double:DF 0.0 [0x0.0p+0])) "sff.i":23:11 -1
> > ^^^
> > |      (expr_list:REG_EQUAL (const_double:DF 0.0 [0x0.0p+0])
> > |        (nil)))
> > | during RTL pass: cprop_hardreg
> >
> > Here's my analysis:
> >
> > f-m-o: do_check_validity() -> insn_invalid_p() tries to recog() a
> > modified version of insn 93 (actually there is no change, so perhaps
> > something we can optimize later). The corresponding md pattern
> > movdf_hardfloat_rv64 no longer matches since it expects REG_P for
> > operand0, while reload has converted it into SP + offset. f-m-o then
> > does the right thing by invalidating INSN_CODE=-1 for a subsequent
> > recog() to work correctly.
> > But it seems this -1 lingers into the next pass, and trips up
> > copyprop_hardreg_forward_1() -> extract_constrain_insn()
> > So I don't know what the right fix here should be.
> This is a bug in the RISC-V backend.  I actually fixed basically the
> same bug in another backend that was exposed by the f-m-o code.
>

I stumbled upon the same thing when doing an aarch64 bootstrap build yesterday.
Given that this causes issues, maybe doing
  int icode = INSN_CODE (insn);
  ...
  INSN_CODE (insn) = icode;
Is a good option and should also be more performant.
Even with that I'm still getting a segfault while doing a bootstrap
build that I'm investigating.

>
> >
> > In a run with -fno-fold-mem-offsets, the same insn 93 is successfully
> > grok'ed by cprop_hardreg,
> >
> > | (insn 93 337 522 11 (set (mem/c:DF (plus:DI (reg/f:DI 2 sp)
> > |                (const_int 8 [0x8])) [4 %sfp+-8 S8 A64])
> > |        (const_double:DF 0.0 [0x0.0p+0])) "sff.i":23:11 190
> > {*movdf_hardfloat_rv64}
> > ^^^^^^^^^^^^^^^
> > |     (expr_list:REG_EQUAL (const_double:DF 0.0 [0x0.0p+0])
> > |        (nil)))
> >
> > P.S. I wonder if it is a good idea in general to call recog() post
> > reload since the insn could be changed sufficiently to no longer match
> > the md patterns. Of course I don't know the answer.
> If this ever causes a problem, it's a backend bug.  It's that simple.
>
> Conceptually it should always be safe to set INSN_CODE to -1 for any insn.
>
> Odds are for this specific case in the RV backend, we just need a
> constraint to store 0.0 into a memory location.  That can actually be
> implemented as a store from x0 since 0.0 has the bit pattern 0x0.  This
> is probably a good thing to expose anyway as an optimization and can
> move forward independently of the f-m-o patch.
>
>
>
> >
> > P.S.2 When debugging code, I noticed a minor annoyance in the patch with
> > the whole fold_mem_offsets_driver() switch-case indirection. It doesn't
> > seem to be serving any purpose, and we could simply call corresponding
> > do_* routines in execute () itself.
> We were in the process of squashing some of this out of the
> implementation.   I hadn't looked at the V3 patch to see how much
> progress had been made on this yet.
>

Thanks for pointing that out Vineet!
When I refactored the code in the separate do_* functions it never
occured to me that both the _driver function and the state enum are
now useless. I will remove all of this in the next iteration.

Manolis

> Thanks for digging into this!
>
> jeff

Reply via email to