Hi Richard, (hopefully, my take won’t cloud the issue ….)
> On 14 Feb 2022, at 16:00, Richard Sandiford via Gcc-patches > <[email protected]> wrote: > > Hi Vlad, > > Vladimir Makarov via Gcc-patches <[email protected]> writes: >> On 2022-02-14 04:44, Richard Sandiford via Gcc-patches wrote: >>> Iain Sandoe via Gcc-patches <[email protected]> writes: >>>> Two issues resulted in this PR, which manifests when we force a constant >>>> into >>>> memory in LRA (in PIC code on Darwin). The presence of such forced >>>> constants >>>> is quite dependent on other RTL optimisations, and it is easy for the >>>> issue to >>>> become latent for a specific case. >>>> >>>> First, in the Darwin-specific rs6000 backend code, we were not being >>>> careful >>>> enough in rejecting invalid symbolic addresses. Specifically, when >>>> generating >>>> PIC code, we require a SYMBOL_REF to be wrapped in an >>>> UNSPEC_MACHOPIC_OFFSET. >>>> >>>> Second, LRA was attempting to load a register using an invalid lo_sum >>>> address. >>>> >>>> The LRA changes are approved in the PR by Vladimir, and the RS6000 changes >>>> are >>>> Darwin-specific (although, of course, any observations are welcome). >>>> >>>> Tested on several lo_sum targets and x86_64 all languages except as noted: >>>> powerpc64-linux (m32/m64) -D >>>> powerpc64le-linux -D >>>> powerpc64-aix -Ada -Go -D >>>> aarch64-linux -Ada -D >>>> x86_64-linux all langs -D >>>> powerpc-darwin9 (master and 11.2) -D -Go. >>>> >>>> pushed to master, thanks, >>>> Iain >>>> >>>> Signed-off-by: Iain Sandoe <[email protected]> >>>> Co-authored-by: Vladimir Makarov <[email protected]> >>>> >>>> PR target/104117 >>>> >>>> gcc/ChangeLog: >>>> >>>> * config/rs6000/rs6000.cc (darwin_rs6000_legitimate_lo_sum_const_p): >>>> Check for UNSPEC_MACHOPIC_OFFSET wrappers on symbolic addresses when >>>> emitting PIC code. >>>> (legitimate_lo_sum_address_p): Likewise. >>>> * lra-constraints.cc (process_address_1): Do not attempt to emit a reg >>>> load from an invalid lo_sum address. >>>> --- >>>> gcc/config/rs6000/rs6000.cc | 38 +++++++++++++++++++++++++++++++++++-- >>>> gcc/lra-constraints.cc | 17 ++--------------- >>>> 2 files changed, 38 insertions(+), 17 deletions(-) >>>> >>>> […] >>>> diff --git a/gcc/lra-constraints.cc b/gcc/lra-constraints.cc >>>> index fdff9e0720a..c700c3f4578 100644 >>>> --- a/gcc/lra-constraints.cc >>>> +++ b/gcc/lra-constraints.cc >>>> @@ -3625,21 +3625,8 @@ process_address_1 (int nop, bool check_only_p, >>>> *ad.inner = gen_rtx_LO_SUM (Pmode, new_reg, addr); >>>> if (!valid_address_p (op, &ad, cn)) >>>> { >>>> - /* Try to put lo_sum into register. */ >>>> - insn = emit_insn (gen_rtx_SET >>>> - (new_reg, >>>> - gen_rtx_LO_SUM (Pmode, new_reg, >>>> addr))); >>>> - code = recog_memoized (insn); >>>> - if (code >= 0) >>>> - { >>>> - *ad.inner = new_reg; >>>> - if (!valid_address_p (op, &ad, cn)) >>>> - { >>>> - *ad.inner = addr; >>>> - code = -1; >>>> - } >>>> - } >>>> - >>>> + *ad.inner = addr; /* Punt. */ >>>> + code = -1; >>>> } >>>> } >>>> if (code < 0) >>> Could you go into more details about this? Why is it OK to continue >>> to try: >>> >>> (lo_sum new_reg addr) >>> >>> directly as an address (the context at the top of the hunk), but not try >>> moving the lo_sum into a register? They should be semantically equivalent, >>> so it seems that if one is wrong, the other would be too. >>> >> Hi, Richard. Change LRA is mine and I approved it for Iain's patch. >> >> I think there is no need for this code and it is misleading. If >> 'mem[low_sum]' does not work, I don't think that 'reg=low_sum;mem[reg]' >> will help for any existing target. As machine-dependent code for any >> target most probably (for ppc64 darwin it is exactly the case) checks >> address only in memory, it can wrongly accept wrong address by reloading >> it into reg and use it in memory. So these are my arguments for the >> remove this code from process_address_1. > > I'm probably making too much of this, but: > > I think the code is potentially useful in that existing targets do forbid > forbid lo_sum addresses in certain contexts (due to limited offset range) > while still wanting lo_sum to be used to be load the address. If we > handle the high/lo_sum split in generic code then we have more chance > of being able to optimise things. So it feels like this is setting an > unfortunate precedent. > > I still don't understand what went wrong before though (the PR trail > was a bit too long to process :-)). Is there a case where > (lo_sum (high X) X) != X? If so, that seems like a target bug to me. If X is an invalid address (in this case for PIC code a SYMBOL_REF is not valid without an UNSPEC wrapper) - then (high X) and (lo_sum (high X) X) are also invalid. AFAICT the target never (this late in the process) gets the opportunity to apply a transform to validate the address. > Or does the target accept (set R1 (lo_sum R2 X)) for an X that cannot > be split into a HIGH/LO_SUM pair? I'd argue that's a target bug too. once again if X is plain invalid - loading it into a register or splitting it does not solve that - we’d need to call the target’s address legitimizer. Of course, if we conclude that it is a target bug, I’ll try to fix it up . thanks Iain > > Thanks, > Richard
