Re: [PATCH v3] Implement new RTL optimizations pass: fold-mem-offsets.

Jeff Law via Gcc-patches Tue, 18 Jul 2023 11:02:34 -0700



On 7/18/23 11:15, Manolis Tsamis wrote:

On Fri, Jul 14, 2023 at 8:35 AM Jeff Law <[email protected]> wrote:




On 7/13/23 09:05, Manolis Tsamis wrote:

In this version I have made f-m-o able to also eliminate constant
moves in addition to the add constant instructions.
This increases the number of simplified/eliminated instructions and is
a good addition for RISC style ISAs where these are more common.

This has led to pr52146.c failing on x86, which I haven't been able to
find a way to fix.
The test involves writing directly to a constant address with -mx32.

The code
          movl    $-18874240, %eax
          movl    $0, (%eax)

is 'optimized' to
          movl    $0, %eax
          movl    $0, -18874240(%eax)

Which is actually
          movl    $0, -18874240

which is wrong per the ticket.
The fix for the ticket involved changes to legitimate_address_p, which
f-m-o does call, but the check does not reject the new address because
of the (%eax) base register, whose value is in fact zero.
I believe this is not strictly an f-m-o issue since the pass calls all
the required functions to test whether the newly synthesized memory
instruction is valid.
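
To make the difference between the forms concrete, here is a minimal sketch of the address arithmetic. It rests on an assumption that is not stated in this thread: that the register-based form is evaluated with a 32-bit address size (wrapping modulo 2**32), while a bare 32-bit displacement used as an absolute address in 64-bit mode is sign-extended to 64 bits. The function names are hypothetical and only model the arithmetic, not GCC or the assembler.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF

def addr32(base, disp):
    """Effective address with a 32-bit address size (assumed): wraps modulo 2**32."""
    return (base + disp) & MASK32

def abs_disp32_in_64bit_mode(disp):
    """Bare signed 32-bit displacement, sign-extended to a 64-bit address (assumed)."""
    return disp & MASK64

DISP = -18874240  # the constant from the test case

print(hex(addr32(0, DISP)))                 # 0xfee00080
print(hex(abs_disp32_in_64bit_mode(DISP)))  # 0xfffffffffee00080
```

Under that assumption the two results differ in the upper 32 bits, which would explain why dropping the zero base register changes which location is written.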

Any ideas on how to solve this issue are appreciated.

I wonder if costing might be useful here.  I would expect the 2nd
sequence is the most costly of the three if address costing models are
reasonably accurate.

Another way would be to look at the length of the memory reference insn.
If it's larger, then it's likely more costly.

That's what I've got off the top of my head.
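
The two suggestions above amount to a "do not make it worse" gate, which could be sketched roughly as follows. This is only a toy model: the per-instruction numbers are made up for illustration and do not come from any real target cost table or instruction encoder, and accept_fold is a hypothetical name, not a function in the patch.

```python
def accept_fold(before, after):
    """Accept the rewritten sequence only if its total metric does not grow.

    `before` and `after` are lists of per-instruction numbers; they can be
    read either as address/insn costs or as encoded lengths in bytes.
    """
    return sum(after) <= sum(before)

# Hypothetical metrics: [constant move, memory store]
original = [1, 1]  # mov const -> reg ; store through (reg)
folded = [1, 2]    # mov 0 -> reg     ; store through const(reg)

print(accept_fold(original, folded))  # False: the folded store is assumed costlier
```

A gate like this only helps when the metric really ranks the unwanted form as worse, which is the limitation raised in the reply below.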


I could test whether the cost function prefers the version that we
want, but that would be a workaround I would like to avoid. It may
also be the case that this reproduces with a different sequence where
the unwanted code is actually more profitable.

I was trying to find out whether the original fix can be extended in a
way that solves this, because having an address that is reported as
legitimate but is actually not could also create issues elsewhere.
But I don't have a suggestion on how to fix it yet.

I was thinking a bit more about this yesterday, and even in the case
where the new mem crosses a boundary thus making the memory load/store
more expensive I think we're still OK.

The key is at worst we will have changed an earlier instruction like
t = sp + <const> into t = sp which should reduce the cost of that earlier
instruction. And I would expect the vast majority of the time we
completely eliminate that earlier instruction.
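
As an illustration of that worst case, here is a small sketch of the rewrite on a toy instruction representation. The tuple layout and the name fold_offset are hypothetical and far simpler than the RTL the pass really works on; the sketch only shows the constant moving from the add into the memory offset.

```python
def fold_offset(add_insn, mem_insn):
    """Fold the constant of `dest = base + const` into a memory access via `dest`.

    add_insn: (dest, base, const)       models  dest = base + const
    mem_insn: (op, offset, addr_reg)    models  op offset(addr_reg)
    """
    dest, base, const = add_insn
    op, offset, addr_reg = mem_insn
    if addr_reg != dest:
        # The memory access does not use the add's result; leave both alone.
        return add_insn, mem_insn
    # Worst case: the add stays but degenerates to a plain copy (const 0).
    return (dest, base, 0), (op, offset + const, addr_reg)

print(fold_offset(("t", "sp", 16), ("load", 8, "t")))
# (('t', 'sp', 0), ('load', 24, 't'))
```

In the common case described above, the remaining `t = sp` copy would then be removed entirely by later cleanup.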


So this may ultimately be a non-issue.

Vineet @ Rivos has indicated he stumbled across an ICE with the V3 code.
Hopefully he'll get a testcase for that extracted shortly.


jeff
