On 09/09/2021 13:23, Richard Biener via Gcc-patches wrote:
On Thu, Sep 9, 2021 at 1:09 PM Richard Earnshaw <rearn...@arm.com> wrote:
gen_lowpart_general handles forming a SUBREG of a MEM by using
adjust_address to rework and validate a new version of the MEM.
However, gen_highpart does not attempt this and simply returns (SUBREG
(MEM)) if the change is not 'obviously' safe. Improve on that by
using a similar approach so that gen_lowpart and gen_highpart are
mostly symmetrical in this regard.
When I decipher gen_lowpart correctly then it doesn't generate the
subreg of the mem in the first place so doing it like that in gen_highpart
would _not_ invoke simplify_gen_subreg on a MEM_P but instead
do what you now do directly?
I also wonder why gen_lowpart_general uses byte_lowpart_offset
while you use subreg_highpart_offset where subreg_lowpart_offset
is also available ... huh - and there's also
subreg_size_{lowpart,highpart}_offset.
So it looks like your case wouldn't handle the paradoxical highpart
(which better shouldn't be accessed?).
Surely the highpart of a paradoxical subreg is meaningless... what's the
highpart when the outer subreg is wider than the inner one?
And that's why there is subreg_lowpart_offset, subreg_highpart_offset
and byte_lowpart_offset, but not byte_highpart_offset (because the
latter is there to handle paradoxical cases, but decays to
subreg_lowpart_offset for a non-paradoxical subreg case).
So like
diff --git a/gcc/emit-rtl.c b/gcc/emit-rtl.c
index 77ea8948ee8..c3dae7d8075 100644
--- a/gcc/emit-rtl.c
+++ b/gcc/emit-rtl.c
@@ -1585,6 +1585,13 @@ gen_highpart (machine_mode mode, rtx x)
gcc_assert (known_le (msize, (unsigned int) UNITS_PER_WORD)
|| known_eq (msize, GET_MODE_UNIT_SIZE (GET_MODE (x))));
+ /* Offset MEMs. */
+ if (MEM_P (x))
+ {
+ poly_int64 offset = subreg_highpart_offset (mode, GET_MODE (x));
+ return adjust_address (x, mode, offset);
+ }
+
result = simplify_gen_subreg (mode, x, GET_MODE (x),
subreg_highpart_offset (mode, GET_MODE (x)));
gcc_assert (result);
In which case, I'm pretty certain the subsequent MEM_P (result) test can
be removed, as I can't see how simplify_gen_subreg would return a MEM
with such a change.
Testing
+ else if (GET_CODE (result) == SUBREG && MEM_P (SUBREG_REG (result))
+ && MEM_P (x))
looks a bit odd to me.
I'll note it leaves gen_highpart_mode "unfixed", some refactoring should
instead commonize the worker for both interfaces, making gen_highpart
invoke gen_highpart_mode or so.
gen_highpart_mode invokes gen_highpart if the inner mode is not
VOIDmode. Perhaps the logic is somewhat backwards, or perhaps it's just
a bit more efficient that way.
I'll try your suggested change.
R.
gcc/ChangeLog:
PR target/102125
* emit-rtl.c (gen_highpart): If simplify_gen_subreg returns
SUBREG (MEM) for a MEM, use adjust_address to produce a new
MEM.
---
gcc/emit-rtl.c | 8 ++++++++
1 file changed, 8 insertions(+)