Re: [PATCH] PR middle-end/103059: reload: Also accept ASHIFT with indexed addressing

Maciej W. Rozycki Thu, 04 Nov 2021 14:04:31 -0700

On Thu, 4 Nov 2021, Jeff Law wrote:

> On 11/3/2021 7:53 AM, Maciej W. Rozycki wrote:
> > Correct a `vax-netbsdelf' target regression ultimately caused by commit
> > c605a8bf9270 ("VAX: Accept ASHIFT in address expressions") (needed for
> > LRA) and as of commit 4a960d548b7d ("Avoid invalid loop transformations
> > in jump threading registry.") causing a build error in libgcc:
> But within a MEM the ASHIFT should have been canonicalized into a MULT by an
> appropriate power of two according to the canonicalization rules.


 I thought so as well, but was straigtened out by Richard:

On Thu, 25 Mar 2021, Richard Sandiford wrote:

> > [From <https://gcc.gnu.org/pipermail/gcc-patches/2020-August/552586.html>; 
> > also commit 6b3034eaba83.]
> >
> >  Guys, this triggers a backend's functional regression and an ICE in the 
> > test suite with the LRA conversion I'm currently working on for the VAX 
> > backend.  Before I go ahead and paper it over in the backend I'd like to 
> > understand why this change was considered correct in the first place.
> >
> >  Contrary to what the change description suggests the ASHIFT form is not 
> > documented to be the canonical form for constant multiplication involving 
> > a power of two for addresses used outside `mem'.
> 
> One thing to note here is that, outside of a mem, there's no distinction
> between an address calculation and normal integer arithmetic.  In other
> words, “addresses used outside of a ‘mem’” aren't a distinct category of
> rtx that can be separated from other things outside of a “mem“.  So…
> 
> > What our rules only say 
> > is that for addresses inside `mem' the MULT form is:
> >
> >    * Within address computations (i.e., inside 'mem'), a left shift is
> >      converted into the appropriate multiplication by a power of two.
> >
> >  This change does the reverse of the conversion described above and makes
> > TARGET_LEGITIMATE_ADDRESS_P and possibly other backend code be presented 
> > with either form for indexed addresses, which complicates handling.  The 
> > ICE mentioned above specifically is caused by:
> >
> > (plus:SI (plus:SI (mult:SI (reg:SI 30 [ _10 ])
> >             (const_int 4 [0x4]))
> >         (reg/f:SI 26 [ _6 ]))
> >     (const_int 12 [0xc]))
> 
> …if you write:
> 
> -----------------------------------------------------------
> long *foo ();
> long *bar (long *ptr, long x) { return &foo ()[x + 3]; }
> -----------------------------------------------------------
> 
> then the rtl you get is:
> 
> -----------------------------------------------------------
> …
> (insn 10 9 11 2 (set (reg:SI 32)
>         (plus:SI (reg/v:SI 29 [ x ])
>             (const_int 3 [0x3]))) "/tmp/foo.c":2:47 -1
>      (nil))
> (insn 11 10 12 2 (set (reg:SI 33)
>         (ashift:SI (reg:SI 32)
>             (const_int 2 [0x2]))) "/tmp/foo.c":2:47 -1
>      (nil))
> (insn 12 11 13 2 (set (reg:SI 31)
>         (plus:SI (reg/f:SI 23 [ _1 ])
>             (reg:SI 33))) "/tmp/foo.c":2:40 -1
>      (nil))
> …
> -----------------------------------------------------------
> 
> where the address uses “ashift” rather than “mult”.  Then combine
> tries to generate the same kind of address as the one you quote above,
> but with “ashift” rather than “mult”:
> 
> -----------------------------------------------------------
> Trying 10, 11 -> 12:
>    10: r32:SI=r29:SI+0x3
>       REG_DEAD r29:SI
>    11: r33:SI=r32:SI<<0x2
>       REG_DEAD r32:SI
>    12: r31:SI=r34:SI+r33:SI
>       REG_DEAD r34:SI
>       REG_DEAD r33:SI
> Failed to match this instruction:
> (set (reg:SI 31)
>     (plus:SI (plus:SI (ashift:SI (reg/v:SI 29 [ x ])
>                 (const_int 2 [0x2]))
>             (reg:SI 34))
>         (const_int 12 [0xc])))
> -----------------------------------------------------------
> 
> So I don't see your VAX change as papering over the issue.  The above
> “ashift” form is what address calculations normally look like outside
> of a “mem”.  The point of the rtl canonicalisation rules is to make sure
> that targets don't need to support two different ways of writing the
> same thing, which in this case means not having to support
> “mult”-by-a-power-of-2 as well as “ashift” for the LEA above.
> 
> > coming from:
> >
> > (insn 58 57 59 10 (set (reg:SI 33 [ _13 ])
> >         (zero_extract:SI (mem:SI (plus:SI (plus:SI (mult:SI (reg:SI 30 [ _10
> ])
> >                             (const_int 4 [0x4]))
> >                         (reg/f:SI 26 [ _6 ]))
> >                     (const_int 12 [0xc])) [4 _6->bits[_10]+0 S4 A32])
> >             (reg:QI 56)
> >             (reg:SI 53))) 
> > ".../gcc/testsuite/gcc.c-torture/execute/20090113-2.c":64:12 490
> {*extzv_non_const}
> >      (expr_list:REG_DEAD (reg:QI 56)
> >         (expr_list:REG_DEAD (reg:SI 53)
> >             (expr_list:REG_DEAD (reg:SI 30 [ _10 ])
> >                 (expr_list:REG_DEAD (reg/f:SI 26 [ _6 ])
> >                     (nil))))))
> >
> > being converted into:
> >
> > (plus:SI (plus:SI (ashift:SI (reg:SI 30 [ _10 ])
> >             (const_int 2 [0x2]))
> >         (reg/f:SI 26 [ _6 ]))
> >     (const_int 12 [0xc]))
> >
> > which the backend currently does not recognise as a valid machine address 
> > and clearly all the fallback handling fails in this case.  It also causes 
> > indexed addressing for non-byte data (scaling is implicit in the VAX ISA) 
> > to cease being used where no ICE actually triggers, which causes a serious 
> > code quality regression from extraneous manual address calculations.
> >
> >  Given that the VAX backend currently does not expect ASHIFT in addresses 
> > and it works, this single piece in LRA must be the only one across the 
> > middle end that uses this form and all the other code must have stuck to 
> > the MULT form.  So it does not appear to me that ASHIFT form indeed used 
> > not to be considered canonical until this problematic change.
> 
> Hopefully the above combine example answers this.
> 
> >  I have looked into what other backends do that support scaled indexed 
> > addressing and x86 escaped a regression here owing to an unrelated much 
> > earlier fix: <https://gcc.gnu.org/ml/gcc-patches/2010-04/msg01170.html> 
> > for PR target/43766 (commit 90f775a9c7af) that added ASHIFT support to its 
> > TARGET_LEGITIMATE_ADDRESS_P handler, and Aarch64 presumably has always had 
> > it.
> >
> >  I have therefore made an experimental change for the VAX backend to 
> > accept ASHIFT in its TARGET_LEGITIMATE_ADDRESS_P handler and just like 
> > reverting this change it makes the ICE go away and indexed addressing to 
> > be used again.  However there are numerous other places across the VAX 
> > backend that expect addresses to be in the MULT from, including in 
> > particular expression cost calculation, and it is not clear to me if they 
> > all have to be adjusted for the possibility created by this change for 
> > addresses to come in either form.
> 
> Does the patch also help to optimise my example above?  If so,
> it sounds like a good thing for that reason alone.
> 
> >  So why do we want to use a different canonical form for addresses 
> > depending on the context they are used with?
> >
> >  It does complicate handling in the backend and my understanding has been 
> > that canonicalisation is meant to simplify handling throughout instead.  
> > And sadly the change description does not explain why it is correct to 
> > have addresses use the ASHIFT form in certain contexts and the MULT form 
> > in the remaining ones.
> 
> Yeah, canonicalisation is supposed to simplify handling.  I think the
> thing that thwarts that in this particular context is the fact that
> we have different rules for “mult”s inside addresses.  IMO that was
> a mistake, and “mult” by a power of 2 should be an “ashift”
> wherever it occurs.  But fixing that for all backends would probably
> be a huge task.
> 
> While the special “mem” case exists, we're trading complication in
> one direction for complication in another.  If code matches addresses
> that are used for both address constraints and memory constraints,
> it will need to support both the “ashift” and “mem” forms (and for
> good code quality it should have done that even before the patch).
> On the other side, rtl patterns that match address-like expressions
> can rely on “ashift” being used and do not need to support “mult”.

cf. <https://gcc.gnu.org/pipermail/gcc-patches/2021-March/567318.html> 
(downthread from the message referred from the commit description; shall I 
point to Richard's answer perhaps?), which is why commit c605a8bf9270 
("VAX: Accept ASHIFT in address expressions") was created.

> > .../libgcc/libgcov-driver.c: In function 'gcov_do_dump':
> > .../libgcc/libgcov-driver.c:686:1: error: insn does not satisfy its
> > constraints:
> >    686 | }
> >        | ^
> > (insn 2051 2050 2052 185 (set (reg/f:SI 0 %r0 [555])
> >          (plus:SI (ashift:SI (mem/c:SI (plus:SI (reg/f:SI 13 %fp)
> >                          (const_int -28 [0xffffffffffffffe4])) [40 %sfp+-28
> > S4 A32])
> >                  (const_int 3 [0x3]))
> >              (plus:SI (reg/v/f:SI 9 %r9 [orig:176 fn_buffer ] [176])
> >                  (const_int 24 [0x18]))))
> > ".../libgcc/libgcov-driver.c":172:40 614 {movaddrdi}
> >       (nil))
> I'm guessing this insn is the result of reloading an address within a MEM into
> a REG.

 No, the address has never been in a MEM in the first place.  The original 
insns are as follows:

(insn 2049 2048 2050 166 (set (reg:SI 553)
        (plus:SI (reg/v:SI 154 [ n_ctrs ])
            (const_int 3 [0x3]))) ".../libgcc/libgcov-driver.c":172:40 201 
{addsi3}
     (nil))
(insn 2050 2049 2051 166 (set (reg:SI 554)
        (ashift:SI (reg:SI 553)
            (const_int 3 [0x3]))) ".../libgcc/libgcov-driver.c":172:40 432 
{ashlsi3}
     (expr_list:REG_DEAD (reg:SI 553)
        (nil)))
(insn 2051 2050 2052 166 (set (reg/f:SI 555)
        (plus:SI (reg/v/f:SI 176 [ fn_buffer ])
            (reg:SI 554))) ".../libgcc/libgcov-driver.c":172:40 201 {addsi3}
     (expr_list:REG_DEAD (reg:SI 554)
        (nil)))

and then combine merges them as follows (pretty damn good job IMO!):

(note 2049 2048 2050 166 NOTE_INSN_DELETED)
(note 2050 2049 2051 166 NOTE_INSN_DELETED)
(insn 2051 2050 2052 166 (set (reg/f:SI 555)
        (plus:SI (plus:SI (ashift:SI (reg/v:SI 154 [ n_ctrs ])
                    (const_int 3 [0x3]))
                (reg/v/f:SI 176 [ fn_buffer ]))
            (const_int 24 [0x18]))) ".../libgcc/libgcov-driver.c":172:40 614 
{movaddrdi}
     (nil))

and then reload substitutes (reg/v:SI 154 [ n_ctrs ]) with the inner MEM 
as it fails to reload the pseudo and just uses its memory location.

 For the record libgcc/libgcov-driver.c has this at line 172 as at commit 
4a960d548b7d:

      fn_buffer->info.ctrs[n_ctrs].num = length;
      fn_buffer->info.ctrs[n_ctrs].values = values;

so I guess this insn calculates `fn_buffer->info.ctrs + n_ctrs'.

>   Had that address been in a canonical form I don't think this patch
> would be needed.

 I guess so.  In any case reverting commit c605a8bf9270 does make the 
compiler work (hence I marked the issue as a regression), but said commit 
is needed for LRA to work with the VAX target, and as noted with the 
original submission it also improves code generation with old reload, so 
it's not only that LRA relies on this non-canonical form.

> Am I missing something?

 Well, am *I* missing something?

  Maciej

Re: [PATCH] PR middle-end/103059: reload: Also accept ASHIFT with indexed addressing

Reply via email to