Re: [pushed] LRA, rs6000, Darwin: Amend lo_sum use for forced constants [PR104117].

Iain Sandoe Sun, 20 Feb 2022 09:34:47 -0800

Hi Folks.

> On 14 Feb 2022, at 16:58, Vladimir Makarov <vmaka...@redhat.com> wrote:
> On 2022-02-14 11:00, Richard Sandiford wrote:


>> Vladimir Makarov via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
>>> 
>>> Hi, Richard.  Change LRA is mine and I approved it for Iain's patch.
>>> 
>>> I think there is no need for this code and it is misleading.  If
>>> 'mem[low_sum]' does not work, I don't think that 'reg=low_sum;mem[reg]'
>>> will help for any existing target.  As machine-dependent code for any
>>> target most probably (for ppc64 darwin it is exactly the case) checks
>>> address only in memory, it can wrongly accept wrong address by reloading
>>> it into reg and use it in memory. So these are my arguments for the
>>> remove this code from process_address_1.
>> I'm probably making too much of this, but:
>> 
>> I think the code is potentially useful in that existing targets do forbid
>> forbid lo_sum addresses in certain contexts (due to limited offset range)
>> while still wanting lo_sum to be used to be load the address.  If we
>> handle the high/lo_sum split in generic code then we have more chance
>> of being able to optimise things.  So it feels like this is setting an
>> unfortunate precedent.
>> 
>> I still don't understand what went wrong before though (the PR trail
>> was a bit too long to process :-)).  Is there a case where
>> (lo_sum (high X) X) != X?  If so, that seems like a target bug to me.
>> Or does the target accept (set R1 (lo_sum R2 X)) for an X that cannot
>> be split into a HIGH/LO_SUM pair?  I'd argue that's a target bug too.
>> 
> Sometimes it is hard to make a line where an RA bug is a bug in 
> machine-dependent code or in RA itself.
> 
> For this case I would say it is a bug in the both parts.
> 
> Low-sum is generated by LRA and it does not know that it should be wrapped by 
> unspec for darwin. Generally speaking we could avoid the change in LRA but it 
> would require to do non-trivial analysis in machine dependent code to find 
> cases when 'reg=low_sum ... mem[reg]' is incorrect code for darwin (PIC) 
> target (and may be some other PIC targets too). Therefore I believe the 
> change in LRA is a good solution even if the change can potentially result in 
> less optimized code for some cases.  Taking your concern into account we 
> could probably improve the patch by introducing a hook (I never liked such 
> solutions as we already have too many hooks directing RA) or better to make 
> the LRA change working only for PIC target. Something like this (it probably 
> needs better recognition of pic target):
> 
> --- a/gcc/lra-constraints.cc
> +++ b/gcc/lra-constraints.cc
> @@ -3616,21 +3616,21 @@ process_address_1 (int nop, bool check_only_p,
>           if (HAVE_lo_sum)
>             {
>               /* addr => lo_sum (new_base, addr), case (2) above.  */
>               insn = emit_insn (gen_rtx_SET
>                                 (new_reg,
>                                  gen_rtx_HIGH (Pmode, copy_rtx (addr))));
>               code = recog_memoized (insn);
>               if (code >= 0)
>                 {
>                   *ad.inner = gen_rtx_LO_SUM (Pmode, new_reg, addr);
> -                 if (!valid_address_p (op, &ad, cn))
> +                 if (!valid_address_p (op, &ad, cn) && !flag_pic)

IMO the PIC aspect of this is possibly misleading
 - the issue is that we have an invalid address, and that such addresses in 
this case need to be legitimised by wrapping them in an UNSPEC. 
- My concern about the generic code was that I would not expect Darwin to be 
the only platform that might need to wrap an invlaid address in an unspec  
[TOC, TLS, small data etc. spring to mind].

I need some help understanding the expected pathway through this code that 
could be useful.

we start with an invalid address.

1. we build (set reg (high invalid_address))
 - Darwin was allowing this (and the lo_sum) [eveywhere, not just here] on the 
basis that the target legitimizer would be called later to fix it up.  (that is 
why the initial recog passes) - but AFAICT we never call the target’s address 
legitimizer.

 - I am curious about what (other) circumstance there would be where a (high of 
an invalid address would be useful.

2. …  assuming the we allowed the build of the (high invalid)

 - we now build the lo_sum and check to see if it is valid.

3. if it is _not_ valid, we load it into a reg

  - I am not sure (outside the comment about about post-legitimiizer use) about 
how an invalid lo_sum can be used in this way.

  - assuming we accept this, we then test to see if the register is a valid 
address (my guess is that test will pass pretty much everywhere, since we 
picked a suitable register in the first place).

^^^ this is mostly for my education - the stuff below is a potential solution 
to leaving lra-constraints unchanged and fixing the Darwin bug….

[ part of me wonders why we do not just call the target’s address legitimizer 
when we have an illegal address ]

——— current WIP:

So .. I have split the Darwin patterns into a hi/lo pair for non-PIC and a 
hi/lo pair for PIC.

I added a predicate for the PIC case that requires it to match (unspec [….] 
UNSPEC_MACHOPIC_OFFSET).

Thus both the attempts (high and (lo_sum will fail recog now which has the same 
effect as punting on the bad lo_sum (and the original testcase now generates 
correct code) …

… the hardware is slow (well, it isn’t really - faster than a Cray XMP .. but 
…) .. so regstraps on 11.2 (to check the fix works) and master will take some 
time.

Then, hopefully, there is a target-local fix (and the LRA change could be 
reverted if that’s the concensus about the best solution) ...

cheers
Iain

Re: [pushed] LRA, rs6000, Darwin: Amend lo_sum use for forced constants [PR104117].

Reply via email to