subject:"\[PATCH\] Always enable LRA"

On 10/14/22 15:21, Koning, Paul wrote:

On Oct 14, 2022, at 5:15 PM, Jeff Law via Gcc-patches
wrote:

On 10/14/22 11:36, Koning, Paul wrote:

On Oct 14, 2022, at 1:10 PM, Jeff Law wrote:

On 10/14/22 10:37, Koning, Paul wrote:

...
But that approach falls down with reload/lra doing substitutions without
validating the result. I guess it might be possible to cobble together
something with secondary reloads, but it's way way way down on my todo list.

Aren't the constraints enforced? My experience is that I was getting these bad
addressing modes in some test programs, and that the constraints I created to
make the requirement explicit cured that. Maybe I'm expecting too much from
constraints, but my (admittedly inexperienced) understanding of them is that
they inform reload what sort of things it can construct, and what it cannot.

It's not really a constraint issue -- the pattern's condition would cause this
not to recognize, but LRA doesn't re-recognize the insn. We might be able to
hack something in the constraints to force a reload of the source operand in
this case. Ugly, but a possibility.

I find it hard to cope with constraints that don't constrain. Minimally it
should be clearly documented exactly what cases fail to obey the constraints
and what a target writer can do to deal with those failures.

Constraints have a purpose, but as I've noted, they really don't come into play here.
Had LRA tried to see if what it created as a valid move insn, the backend would have said
"nope, that's not valid". That's a stronger test than checking the
constraints. If the insn is not valid according to its condition, then the constraints
simply don't matter.

I'm not aware of a case where constraints are failing to be obeyed and
constraints simply aren't a viable solution here other than to paper over the
problem and hope it doesn't show up elsewhere.

Right now operand 0's constraint is "<" meaning pre-inc operand, operand 1 is
"r". How would you define a new constraint for operand 1 that disallows overlap with
operand 0 given that the H8 allows autoinc on any register operand? You can't look at operand 0
while processing the constraint for operand 1. Similarly if you try to define a new constraint for
operand0 without looking at operand1.

Easy but cumbersome: define constraints for "register N" (for each N) and another set for "autoinc on
any register other than N". In pdp11, I called these Z0, Z1... and Za, Zb... respectively. Then the insn gets
constraints that look like "Z0,Z1,Z2..." and "Za, Zb, Zc..." for the two operands. As I said, see
pdp11.md, the mov insn.

It generally looks sound, but golly gee, this runs into the "reload
doesn't validate insns problem" if it's done in a reload tree rather
than an lra tree. We've got an insn with a pre-inc destination and a
reg source. The source pseudo doesn't get a hard reg, reload replaces
the pseudo with a mem as expected. Reload finishes with something like this:

(insn 100 98 101 15 (set (mem/f:SI (pre_dec:SI (reg/f:SI 7 sp)) [8 S4 A32])
(mem/c:SI (plus:SI (reg/f:SI 7 sp)
(const_int 8 [0x8])) [9 %sfp+-24 S4 A32])) "j.c":62:11
19 {*movsix}

(expr_list:REG_ARGS_SIZE (const_int 4 [0x4])
(nil)))

Which, isn't a valid instruction on the H8. The insn's condition
verifies that one of the two operands must be a REG. But reload never
bothered to re-recognize the insn after makng the substitution, then
naturally it blows up in reload_cse with a constraint failure because
the pre-inc destination constraints require a reg for the source
operand. But the real culprit here is reload making the substitution
and not validing that the result is valid.

Arggh!

Which brings me back to pondering just removing the autoinc magic
checking in the H8 port :-) I've actually got that spinning in the
tester just to see if there's any obvious fallout. I've already spent
more time on this than I can reasonably justify.

Jeff

Re: [PATCH] Always enable LRA

On 10/14/22 15:21, Koning, Paul wrote:

On Oct 14, 2022, at 5:15 PM, Jeff Law via Gcc-patches
wrote:

On 10/14/22 11:36, Koning, Paul wrote:

On Oct 14, 2022, at 1:10 PM, Jeff Law wrote:

On 10/14/22 10:37, Koning, Paul wrote:

I'm not aware of a case where constraints are failing to be obeyed and
constraints simply aren't a viable solution here other than to paper over the
problem and hope it doesn't show up elsewhere.

Yea, you're right. It's definitely possible. Painful, but possible.

jeff

Re: [PATCH] Always enable LRA




> On Oct 14, 2022, at 5:15 PM, Jeff Law via Gcc-patches 
>  wrote:
> 
> 
> On 10/14/22 11:36, Koning, Paul wrote:
>> 
>>> On Oct 14, 2022, at 1:10 PM, Jeff Law  wrote:
>>> 
>>> On 10/14/22 10:37, Koning, Paul wrote:
> ...
> But that approach falls down with reload/lra doing substitutions without 
> validating the result.  I guess it might be possible to cobble together 
> something with secondary reloads, but it's way way way down on my todo 
> list.
 Aren't the constraints enforced?  My experience is that I was getting 
 these bad addressing modes in some test programs, and that the constraints 
 I created to make the requirement explicit cured that.  Maybe I'm 
 expecting too much from constraints, but my (admittedly inexperienced) 
 understanding of them is that they inform reload what sort of things it 
 can construct, and what it cannot.
>>> It's not really a constraint issue -- the pattern's condition would cause 
>>> this not to recognize, but LRA doesn't re-recognize the insn.  We might be 
>>> able to hack something in the constraints to force a reload of the source 
>>> operand in this case.   Ugly, but a possibility.
>> I find it hard to cope with constraints that don't constrain.  Minimally it 
>> should be clearly documented exactly what cases fail to obey the constraints 
>> and what a target writer can do to deal with those failures.
> 
> Constraints have a purpose, but as I've noted, they really don't come into 
> play here.   Had LRA tried to see if what it created as a valid move insn, 
> the backend would have said "nope, that's not valid".  That's a stronger test 
> than checking the constraints.  If the insn is not valid according to its 
> condition, then the constraints simply don't matter.
> 
> I'm not aware of a case where constraints are failing to be obeyed and 
> constraints simply aren't a viable solution here other than to paper over the 
> problem and hope it doesn't show up elsewhere.
> 
> Right now operand 0's constraint is "<" meaning pre-inc operand, operand 1 is 
> "r".  How would you define a new constraint for operand 1 that disallows 
> overlap with operand 0 given that the H8 allows autoinc on any register 
> operand?   You can't look at operand 0 while processing the constraint for 
> operand 1. Similarly if you try to define a new constraint for operand0 
> without looking at operand1.

Easy but cumbersome: define constraints for "register N" (for each N) and 
another set for "autoinc on any register other than N".  In pdp11, I called 
these Z0, Z1... and Za, Zb... respectively.  Then the insn gets constraints 
that look like "Z0,Z1,Z2..." and "Za, Zb, Zc..." for the two operands.  As I 
said, see pdp11.md, the mov insn.

paul

Re: [PATCH] Always enable LRA




On 10/14/22 11:36, Koning, Paul wrote:



On Oct 14, 2022, at 1:10 PM, Jeff Law  wrote:

On 10/14/22 10:37, Koning, Paul wrote:

...
But that approach falls down with reload/lra doing substitutions without 
validating the result.  I guess it might be possible to cobble together 
something with secondary reloads, but it's way way way down on my todo list.

Aren't the constraints enforced?  My experience is that I was getting these bad 
addressing modes in some test programs, and that the constraints I created to 
make the requirement explicit cured that.  Maybe I'm expecting too much from 
constraints, but my (admittedly inexperienced) understanding of them is that 
they inform reload what sort of things it can construct, and what it cannot.

It's not really a constraint issue -- the pattern's condition would cause this 
not to recognize, but LRA doesn't re-recognize the insn.  We might be able to 
hack something in the constraints to force a reload of the source operand in 
this case.   Ugly, but a possibility.

I find it hard to cope with constraints that don't constrain.  Minimally it 
should be clearly documented exactly what cases fail to obey the constraints 
and what a target writer can do to deal with those failures.


Constraints have a purpose, but as I've noted, they really don't come 
into play here.   Had LRA tried to see if what it created as a valid 
move insn, the backend would have said "nope, that's not valid".  That's 
a stronger test than checking the constraints.  If the insn is not valid 
according to its condition, then the constraints simply don't matter.


I'm not aware of a case where constraints are failing to be obeyed and 
constraints simply aren't a viable solution here other than to paper 
over the problem and hope it doesn't show up elsewhere.


Right now operand 0's constraint is "<" meaning pre-inc operand, operand 
1 is "r".  How would you define a new constraint for operand 1 that 
disallows overlap with operand 0 given that the H8 allows autoinc on any 
register operand?   You can't look at operand 0 while processing the 
constraint for operand 1. Similarly if you try to define a new 
constraint for operand0 without looking at operand1.


Hence the h8300_move_ok test in the insn's condition where we can look 
at both operands to assess if it's a legitimate insn.





As it stands, I find myself working hard to write MD code that accurately 
describes the rules of the machine, and for the core machinery to disregard 
those instructions is painful.


No doubt.




Is there a compelling argument for every case where LRA fails to obey the 
constraints?  If not, can they just be called bugs and added to the to-be-fixed 
queue?


There was in the reload days, though I honestly don't remember what it 
was, I'm much less familiar with LRA in this regard, but I trust Vlad's 
engineering skills and strongly believe that failing to recognize was 
done for a good reason.



It's also worth repeating, we can get the same fundamental failure on 
the H8 with reload.  The testcase is different, but the core issue is 
the same.  We have a move with an autoinc destination and the same 
register is also used as a source operand incorerctly created by reload.



What's a bit interesting here is the m68k doesn't do any kind of 
checking for these scenarios. It just accepts them and generates the 
obvious code.  I'm more tempted by the minute to do the same on the H8 :-)



Jeff

Re: [PATCH] Always enable LRA

> On Oct 14, 2022, at 4:12 PM, Segher Boessenkool  
> wrote:
> 
> On Fri, Oct 14, 2022 at 07:58:39PM +, Koning, Paul wrote:
>>> On Oct 14, 2022, at 2:03 PM, Jeff Law via Gcc-patches 
>>>  wrote:
>>> On 10/14/22 11:35, Segher Boessenkool wrote:
 On Fri, Oct 14, 2022 at 11:07:43AM -0600, Jeff Law wrote:
>> LRA only ever generates insns that pass recog.  The backend allows this
>> define_insn, requiring it to be split (it returns template "#"), but
>> then somehow it doesn't match in any split pass?
> Nope.  The elimination code will just change one register without
> re-recognizing.  That's precisely what happens here.
 That is a big oversight then.  Please file a PR?
>>> 
>>> Sure.  But just recognizing (for this particular case) will just move the 
>>> fault from a failure to split to a failure to recognize. From my wanderings 
>>> in the elimination code, I don't see that it has a path that would allow it 
>>> to reasonably handle this case -- ie, if the insn does not recognize, what 
>>> then?   Conceptually we need to generate an input-reload but I don't see a 
>>> way to do that in the elimination code.  Maybe Vlad knows how it ought to 
>>> be handled.
>> 
>> I probably have too simplistic a view of this, but the way I think of it is 
>> that LRA (and reload) make decisions subject to constraints, and among those 
>> constraints are the ones specified in the MD file patterns.  That to me 
>> means that a substitution proposed to be made by the LRA code is subject to 
>> those invariants: it can't do that if the constraints say "no" and must then 
>> consider some other alternative.
> 
> I think that is exactly right for LRA.
> 
> Old reload conceptually changed the whole function all at once, starting
> with valid RTL, and ending with strictly valid RTL.  LRA works locally,
> one instruction at a time essentially, and makes the changes
> immediately.  If when it has finished work on the function offsets have
> changed, it walks over the whole function again, repeat until done.
> 
> "Strictly valid" means that the constraints are considered, and the insn
> is only valid if some enabled alternative satisfies all constraints.
> 
> I hope I got that all right, I'm not an expert!  :-)

Thanks Segher.

As I said earlier, if for some reason this straightforward understanding is not 
completely accurate, that can be handled provided it is documented when and why 
the exceptions arise, and what methods the target author should use to deal 
with these things when they happen.

As a target maintainer not deeply skilled in the GCC common internals, I tend 
to trip over these things.  With the old reload, and secondary reload in 
particular, it always felt to  me  like the answer was "keep tweaking the 
target definition files until the test cases stop breaking".  That isn't how it 
should be.  

Perhaps some of these issues come from out of the ordinary target restrictions. 
 The autoinc/autodec case we're discussing may be an example of that.  The one 
I remember in particular was the pdp11 float instructions, where I have 6 
registers but only 4 of these can be loaded from or stored to memory.  Putting 
the other two to work while having spill to memory work right took quite a lot 
of iteration.

It may be LRA is better in these areas.  I haven't spent much time with that, 
other than to create a way to enable its use and observing that (a) I got about 
the same test suite numbers either way and (b) the LRA code was not as good in 
some of the cases.

paul

Re: [PATCH] Always enable LRA

On Fri, Oct 14, 2022 at 07:58:39PM +, Koning, Paul wrote:
> > On Oct 14, 2022, at 2:03 PM, Jeff Law via Gcc-patches 
> >  wrote:
> > On 10/14/22 11:35, Segher Boessenkool wrote:
> >> On Fri, Oct 14, 2022 at 11:07:43AM -0600, Jeff Law wrote:
>  LRA only ever generates insns that pass recog.  The backend allows this
>  define_insn, requiring it to be split (it returns template "#"), but
>  then somehow it doesn't match in any split pass?
> >>> Nope.  The elimination code will just change one register without
> >>> re-recognizing.  That's precisely what happens here.
> >> That is a big oversight then.  Please file a PR?
> > 
> > Sure.  But just recognizing (for this particular case) will just move the 
> > fault from a failure to split to a failure to recognize. From my wanderings 
> > in the elimination code, I don't see that it has a path that would allow it 
> > to reasonably handle this case -- ie, if the insn does not recognize, what 
> > then?   Conceptually we need to generate an input-reload but I don't see a 
> > way to do that in the elimination code.  Maybe Vlad knows how it ought to 
> > be handled.
> 
> I probably have too simplistic a view of this, but the way I think of it is 
> that LRA (and reload) make decisions subject to constraints, and among those 
> constraints are the ones specified in the MD file patterns.  That to me means 
> that a substitution proposed to be made by the LRA code is subject to those 
> invariants: it can't do that if the constraints say "no" and must then 
> consider some other alternative.

I think that is exactly right for LRA.

Old reload conceptually changed the whole function all at once, starting
with valid RTL, and ending with strictly valid RTL.  LRA works locally,
one instruction at a time essentially, and makes the changes
immediately.  If when it has finished work on the function offsets have
changed, it walks over the whole function again, repeat until done.

"Strictly valid" means that the constraints are considered, and the insn
is only valid if some enabled alternative satisfies all constraints.

I hope I got that all right, I'm not an expert!  :-)


Segher

Re: [PATCH] Always enable LRA




> On Oct 14, 2022, at 2:03 PM, Jeff Law via Gcc-patches 
>  wrote:
> 
> 
> On 10/14/22 11:35, Segher Boessenkool wrote:
>> On Fri, Oct 14, 2022 at 11:07:43AM -0600, Jeff Law wrote:
 LRA only ever generates insns that pass recog.  The backend allows this
 define_insn, requiring it to be split (it returns template "#"), but
 then somehow it doesn't match in any split pass?
>>> Nope.  The elimination code will just change one register without
>>> re-recognizing.  That's precisely what happens here.
>> That is a big oversight then.  Please file a PR?
> 
> Sure.  But just recognizing (for this particular case) will just move the 
> fault from a failure to split to a failure to recognize. From my wanderings 
> in the elimination code, I don't see that it has a path that would allow it 
> to reasonably handle this case -- ie, if the insn does not recognize, what 
> then?   Conceptually we need to generate an input-reload but I don't see a 
> way to do that in the elimination code.  Maybe Vlad knows how it ought to be 
> handled.

I probably have too simplistic a view of this, but the way I think of it is 
that LRA (and reload) make decisions subject to constraints, and among those 
constraints are the ones specified in the MD file patterns.  That to me means 
that a substitution proposed to be made by the LRA code is subject to those 
invariants: it can't do that if the constraints say "no" and must then consider 
some other alternative.

paul

Re: [PATCH] Always enable LRA




On 10/14/22 11:35, Segher Boessenkool wrote:

On Fri, Oct 14, 2022 at 11:07:43AM -0600, Jeff Law wrote:

LRA only ever generates insns that pass recog.  The backend allows this
define_insn, requiring it to be split (it returns template "#"), but
then somehow it doesn't match in any split pass?

Nope.  The elimination code will just change one register without
re-recognizing.  That's precisely what happens here.

That is a big oversight then.  Please file a PR?


Sure.  But just recognizing (for this particular case) will just move 
the fault from a failure to split to a failure to recognize. From my 
wanderings in the elimination code, I don't see that it has a path that 
would allow it to reasonably handle this case -- ie, if the insn does 
not recognize, what then?   Conceptually we need to generate an 
input-reload but I don't see a way to do that in the elimination code.  
Maybe Vlad knows how it ought to be handled.





It is the only way it can know if it needs to reload more.  Even if it
somehow can assume it doesn't have to check this in some cases, an
assert (inside a CHECKING_P) would be nice?


Agreed.


jeff

Re: [PATCH] Always enable LRA

> On Oct 14, 2022, at 1:10 PM, Jeff Law  wrote:
> 
> On 10/14/22 10:37, Koning, Paul wrote:
>> 
>>> ...
>>> But that approach falls down with reload/lra doing substitutions without 
>>> validating the result.  I guess it might be possible to cobble together 
>>> something with secondary reloads, but it's way way way down on my todo list.
>> Aren't the constraints enforced?  My experience is that I was getting these 
>> bad addressing modes in some test programs, and that the constraints I 
>> created to make the requirement explicit cured that.  Maybe I'm expecting 
>> too much from constraints, but my (admittedly inexperienced) understanding 
>> of them is that they inform reload what sort of things it can construct, and 
>> what it cannot.
> 
> It's not really a constraint issue -- the pattern's condition would cause 
> this not to recognize, but LRA doesn't re-recognize the insn.  We might be 
> able to hack something in the constraints to force a reload of the source 
> operand in this case.   Ugly, but a possibility.

I find it hard to cope with constraints that don't constrain.  Minimally it 
should be clearly documented exactly what cases fail to obey the constraints 
and what a target writer can do to deal with those failures.

As it stands, I find myself working hard to write MD code that accurately 
describes the rules of the machine, and for the core machinery to disregard 
those instructions is painful.

Is there a compelling argument for every case where LRA fails to obey the 
constraints?  If not, can they just be called bugs and added to the to-be-fixed 
queue?

paul

Re: [PATCH] Always enable LRA

On Fri, Oct 14, 2022 at 11:07:43AM -0600, Jeff Law wrote:
> >LRA only ever generates insns that pass recog.  The backend allows this
> >define_insn, requiring it to be split (it returns template "#"), but
> >then somehow it doesn't match in any split pass?
> 
> Nope.  The elimination code will just change one register without 
> re-recognizing.  That's precisely what happens here.

That is a big oversight then.  Please file a PR?

> >>Register elimination ultimately discovered that (reg 30) was the same as
> >>the stack pointer and did the natural substitution.    The natural
> >>substitution results in invalid RTL and there's really no way to back
> >>out and do something different AFAICT in lra-eliminations.cc.
> >>
> >>The only reason we fault is because the H8 backend knows this is invalid
> >>RTL and refuses to split it.  If we were to try and re-recognize the
> >>insn in question it would fail to recognize after the substitution
> >>because the auto-inc'd operand overlaps with the other operand.
> >But it *did* recog?  Or does LRA somehow not always recog() everything?
> >I always thought that was one of the huge improvements over old reload
> >(it does everything very locally instead of very globally)!
> 
> No, LRA does not force re-recognition in some cases, particularly for 
> register eliminations.

It is the only way it can know if it needs to reload more.  Even if it
somehow can assume it doesn't have to check this in some cases, an
assert (inside a CHECKING_P) would be nice?


Segher

Re: [PATCH] Always enable LRA

On 10/14/22 10:39, Richard Biener wrote:

Am 14.10.2022 um 16:40 schrieb Jeff Law via Gcc-patches
:

On 10/14/22 06:37, Koning, Paul wrote:

On Oct 13, 2022, at 9:07 PM, Jeff Law via Gcc-patches
wrote:

On 10/13/22 17:56, Segher Boessenkool wrote:

h8300 fails during GCC build:
/home/segher/src/gcc/libgcc/unwind.inc: In function
'_Unwind_SjLj_RaiseException':
/home/segher/src/gcc/libgcc/unwind.inc:141:1: error: could not split insn
141 | }
| ^
(insn 69 256 327 (set (mem/f:SI (pre_dec:SI (reg/f:SI 7 sp)) [12 S4 A32])
(reg/f:SI 7 sp)) "/home/segher/src/gcc/libgcc/unwind.inc":118:12 19
{*movsi}
(expr_list:REG_ARGS_SIZE (const_int 4 [0x4])
(nil)))
during RTL pass: final
which looks like a backend bug, I don't see a pattern that could split
this (without needing an extra clobber)?

I'm aware of this -- its invalid RTL:

Uses of the register outside of an address are not permitted within the
same insn as a use in an embedded side effect expression because such
insns behave differently on different machines and hence must be treated
as ambiguous and disallowed.

I had a bit of a fight with this sort of thing in pdp11, where in fact such operations are executed
differently on different machine models. The solution I picked is to create two sets of machine-specific
constraint codes, one for "register N" and the other for "autoinc/dec of any register
other than N" and pairing those. (You can see this in pdp11.md, the mov definition.)

I've long suspected the pdp11 was the inspiration for this restriction (I have
memories of noting it before I relocated to Utah, so circa 1992). The key
problem is the generic parts of the compiler don't know what the semantics
ought to be -- so it's not obvious when they do a substitution whether or not
the substitution of one reg for another is actually valid. It's important to
remember that sometimes when we substitute one register for another, we don't
have any contextual information about source vs dest -- it's a long standing
wart that causes problems in other cases as well.

That punts the problem to the backends and the H8 actually tries to deal with
this restriction. Basically in the movxx pattern conditions, when the
destination uses an autoinc addressing mode, the pattern's condition will check
that the source register is different. I would expect other ports likely to do
something similar.

But that approach falls down with reload/lra doing substitutions without
validating the result. I guess it might be possible to cobble together
something with secondary reloads, but it's way way way down on my todo list.

And yes, this case where the autoinc is on the destination works consistently
on the H8 as well. We could consider loosening the restrictions and let this
through. It's certainly simpler as it's a doc change and removing a bit of
code on the H8. It sounds like the pdp11 already assumes that case is valid.

But what is the semantics of the RTL IL?
That ought to be documented.

*If* we went a route to relax the restriction (and I'm still not sure
that's a good idea), we absolutely would have to document the semantics.

jeff

Re: [PATCH] Always enable LRA




On 10/14/22 10:37, Koning, Paul wrote:



On Oct 14, 2022, at 10:38 AM, Jeff Law via Gcc-patches 
 wrote:


On 10/14/22 06:37, Koning, Paul wrote:

On Oct 13, 2022, at 9:07 PM, Jeff Law via Gcc-patches  
wrote:


On 10/13/22 17:56, Segher Boessenkool wrote:

h8300 fails during GCC build:
/home/segher/src/gcc/libgcc/unwind.inc: In function 
'_Unwind_SjLj_RaiseException':
/home/segher/src/gcc/libgcc/unwind.inc:141:1: error: could not split insn
   141 | }
   | ^
(insn 69 256 327 (set (mem/f:SI (pre_dec:SI (reg/f:SI 7 sp)) [12  S4 A32])
 (reg/f:SI 7 sp)) "/home/segher/src/gcc/libgcc/unwind.inc":118:12 19 
{*movsi}
  (expr_list:REG_ARGS_SIZE (const_int 4 [0x4])
 (nil)))
during RTL pass: final
which looks like a backend bug, I don't see a pattern that could split
this (without needing an extra clobber)?

I'm aware of this -- its invalid RTL:

Uses of the register outside of an address are not permitted within the
same insn as a use in an embedded side effect expression because such
insns behave differently on different machines and hence must be treated
as ambiguous and disallowed.

I had a bit of a fight with this sort of thing in pdp11, where in fact such operations are executed 
differently on different machine models.  The solution I picked is to create two sets of machine-specific 
constraint codes, one for "register N" and the other for "autoinc/dec of any register 
other than N" and pairing those.  (You can see this in pdp11.md, the mov definition.)

I've long suspected the pdp11 was the inspiration for this restriction (I have 
memories of noting it before I relocated to Utah, so circa 1992).  The key 
problem is the generic parts of the compiler don't know what the semantics 
ought to be -- so it's not obvious when they do a substitution whether or not 
the substitution of one reg for another is actually valid.  It's important to 
remember that sometimes when we substitute one register for another, we don't 
have any contextual information about source vs dest -- it's a long standing 
wart that causes problems in other cases as well.

That punts the problem to the backends and the H8 actually tries to deal with 
this restriction.  Basically in the movxx pattern conditions, when the 
destination uses an autoinc addressing mode, the pattern's condition will check 
that the source register is different.  I would expect other ports likely to do 
something similar.

But that approach falls down with reload/lra doing substitutions without 
validating the result.  I guess it might be possible to cobble together 
something with secondary reloads, but it's way way way down on my todo list.

Aren't the constraints enforced?  My experience is that I was getting these bad 
addressing modes in some test programs, and that the constraints I created to 
make the requirement explicit cured that.  Maybe I'm expecting too much from 
constraints, but my (admittedly inexperienced) understanding of them is that 
they inform reload what sort of things it can construct, and what it cannot.


It's not really a constraint issue -- the pattern's condition would 
cause this not to recognize, but LRA doesn't re-recognize the insn.  We 
might be able to hack something in the constraints to force a reload of 
the source operand in this case.   Ugly, but a possibility.



Jeff

Re: [PATCH] Always enable LRA




On 10/14/22 10:37, Segher Boessenkool wrote:

Hi!

On Thu, Oct 13, 2022 at 10:47:20PM -0600, Jeff Law wrote:

On 10/13/22 17:56, Segher Boessenkool wrote:

h8300 fails during GCC build:
/home/segher/src/gcc/libgcc/unwind.inc: In function
'_Unwind_SjLj_RaiseException':
/home/segher/src/gcc/libgcc/unwind.inc:141:1: error: could not split insn
   141 | }
   | ^
(insn 69 256 327 (set (mem/f:SI (pre_dec:SI (reg/f:SI 7 sp)) [12  S4 A32])
 (reg/f:SI 7 sp)) "/home/segher/src/gcc/libgcc/unwind.inc":118:12
 19 {*movsi}
  (expr_list:REG_ARGS_SIZE (const_int 4 [0x4])
 (nil)))
during RTL pass: final
which looks like a backend bug, I don't see a pattern that could split
this (without needing an extra clobber)?

Really smells like an LRA bug to me.


We have this as we leave IRA:


(insn 10 9 11 2 (set (reg/f:SI 30)
    (plus:SI (reg/f:SI 11 fp)
    (const_int -4 [0xfffc]))) "j.c":31:7 264
{*addsi}
 (expr_list:REG_EQUIV (plus:SI (reg/f:SI 11 fp)
    (const_int -4 [0xfffc]))
(nil)))
(insn 11 10 12 2 (set (mem/f:SI (pre_dec:SI (reg/f:SI 7 sp)) [3  S4 A32])
    (reg/f:SI 30)) "j.c":31:7 19 {*movsi}
 (expr_list:REG_DEAD (reg/f:SI 30)
    (expr_list:REG_ARGS_SIZE (const_int 4 [0x4])
    (nil


And as we leave LRA:

(note 10 9 11 2 NOTE_INSN_DELETED)
(insn 11 10 13 2 (set (mem/f:SI (pre_dec:SI (reg/f:SI 7 sp)) [3 S4 A32])
     (reg/f:SI 7 sp)) "j.c":31:7 19 {*movsi}
  (expr_list:REG_ARGS_SIZE (const_int 4 [0x4])
     (nil)))

LRA only ever generates insns that pass recog.  The backend allows this
define_insn, requiring it to be split (it returns template "#"), but
then somehow it doesn't match in any split pass?


Nope.  The elimination code will just change one register without 
re-recognizing.  That's precisely what happens here.






Register elimination ultimately discovered that (reg 30) was the same as
the stack pointer and did the natural substitution.    The natural
substitution results in invalid RTL and there's really no way to back
out and do something different AFAICT in lra-eliminations.cc.

The only reason we fault is because the H8 backend knows this is invalid
RTL and refuses to split it.  If we were to try and re-recognize the
insn in question it would fail to recognize after the substitution
because the auto-inc'd operand overlaps with the other operand.

But it *did* recog?  Or does LRA somehow not always recog() everything?
I always thought that was one of the huge improvements over old reload
(it does everything very locally instead of very globally)!


No, LRA does not force re-recognition in some cases, particularly for 
register eliminations.



jeff

Re: [PATCH] Always enable LRA




> On Oct 14, 2022, at 12:18 PM, Segher Boessenkool  
> wrote:
> 
> On Fri, Oct 14, 2022 at 12:36:47AM +, Koning, Paul wrote:
>> I guess I'll have to look harder to see if it's possible to make LRA handle 
>> CISC addressing modes like memory indirect, autoincrement, autodecrement, 
>> and others that the old reload handles at least somewhat.  Ideally LRA 
>> should do a better job; right now I believe it doesn't really do these 
>> things at all.  Targets like pdp11 and vax would like these.
> 
> So what does it do now?  Break every more complex addressing mode apart
> again?  Or ICE?  Or something in between?

The former.  LRA does handle some cases but not all that the target permits and 
not as many as the old reload.

Example:

unsigned int cksum (unsigned int *buf, unsigned int len)
{
unsigned int ret = 0;
do {
ret += *buf++;
} while (--len != 0);
return ret;
}

The loop looks like this:

L_2:
add (r2)+,r0
sob r1,L_2

which is what I would expect.  Now throw in an indirection:

Old reload produces this loop:

L_2:
add @(r2)+,r0
sob r1,L_2

while LRA doesn't understand it can use the autoincrement indirect mode:

L_2:
mov (r2)+,r3
add (r3),r0
sob r1,L_2

This is from a GCC 13.0 test build of last June, with -O2 -m45, with and 
without -mlra.

paul

Re: [PATCH] Always enable LRA

2022-10-14 Thread Richard Biener via Gcc-patches




> Am 14.10.2022 um 16:40 schrieb Jeff Law via Gcc-patches 
> :
> 
> 
>> On 10/14/22 06:37, Koning, Paul wrote:
>> 
 On Oct 13, 2022, at 9:07 PM, Jeff Law via Gcc-patches 
  wrote:
>>> 
>>> 
>>> On 10/13/22 17:56, Segher Boessenkool wrote:
 h8300 fails during GCC build:
 /home/segher/src/gcc/libgcc/unwind.inc: In function 
 '_Unwind_SjLj_RaiseException':
 /home/segher/src/gcc/libgcc/unwind.inc:141:1: error: could not split insn
   141 | }
   | ^
 (insn 69 256 327 (set (mem/f:SI (pre_dec:SI (reg/f:SI 7 sp)) [12  S4 A32])
 (reg/f:SI 7 sp)) "/home/segher/src/gcc/libgcc/unwind.inc":118:12 
 19 {*movsi}
  (expr_list:REG_ARGS_SIZE (const_int 4 [0x4])
 (nil)))
 during RTL pass: final
 which looks like a backend bug, I don't see a pattern that could split
 this (without needing an extra clobber)?
>>> I'm aware of this -- its invalid RTL:
>>> 
>>> Uses of the register outside of an address are not permitted within the
>>> same insn as a use in an embedded side effect expression because such
>>> insns behave differently on different machines and hence must be treated
>>> as ambiguous and disallowed.
>> I had a bit of a fight with this sort of thing in pdp11, where in fact such 
>> operations are executed differently on different machine models.  The 
>> solution I picked is to create two sets of machine-specific constraint 
>> codes, one for "register N" and the other for "autoinc/dec of any register 
>> other than N" and pairing those.  (You can see this in pdp11.md, the 
>> mov definition.)
> 
> I've long suspected the pdp11 was the inspiration for this restriction (I 
> have memories of noting it before I relocated to Utah, so circa 1992).  The 
> key problem is the generic parts of the compiler don't know what the 
> semantics ought to be -- so it's not obvious when they do a substitution 
> whether or not the substitution of one reg for another is actually valid.  
> It's important to remember that sometimes when we substitute one register for 
> another, we don't have any contextual information about source vs dest -- 
> it's a long standing wart that causes problems in other cases as well.
> 
> That punts the problem to the backends and the H8 actually tries to deal with 
> this restriction.  Basically in the movxx pattern conditions, when the 
> destination uses an autoinc addressing mode, the pattern's condition will 
> check that the source register is different.  I would expect other ports 
> likely to do something similar.
> 
> But that approach falls down with reload/lra doing substitutions without 
> validating the result.  I guess it might be possible to cobble together 
> something with secondary reloads, but it's way way way down on my todo list.
> 
> And yes, this case where the autoinc is on the destination works consistently 
> on the H8 as well.  We could consider loosening the restrictions and let this 
> through.  It's certainly simpler as it's a doc change and removing a bit of 
> code on the H8.  It sounds like the pdp11 already assumes that case is valid.

But what is the semantics of the RTL IL?
That ought to be documented.


> 
> Jeff
>

Re: [PATCH] Always enable LRA

Hi!

On Thu, Oct 13, 2022 at 10:47:20PM -0600, Jeff Law wrote:
> On 10/13/22 17:56, Segher Boessenkool wrote:
> >h8300 fails during GCC build:
> >/home/segher/src/gcc/libgcc/unwind.inc: In function 
> >'_Unwind_SjLj_RaiseException':
> >/home/segher/src/gcc/libgcc/unwind.inc:141:1: error: could not split insn
> >   141 | }
> >   | ^
> >(insn 69 256 327 (set (mem/f:SI (pre_dec:SI (reg/f:SI 7 sp)) [12  S4 A32])
> > (reg/f:SI 7 sp)) "/home/segher/src/gcc/libgcc/unwind.inc":118:12 
> > 19 {*movsi}
> >  (expr_list:REG_ARGS_SIZE (const_int 4 [0x4])
> > (nil)))
> >during RTL pass: final
> >which looks like a backend bug, I don't see a pattern that could split
> >this (without needing an extra clobber)?
> 
> Really smells like an LRA bug to me.
> 
> 
> We have this as we leave IRA:
> 
> 
> (insn 10 9 11 2 (set (reg/f:SI 30)
>    (plus:SI (reg/f:SI 11 fp)
>    (const_int -4 [0xfffc]))) "j.c":31:7 264 
> {*addsi}
> (expr_list:REG_EQUIV (plus:SI (reg/f:SI 11 fp)
>    (const_int -4 [0xfffc]))
> (nil)))
> (insn 11 10 12 2 (set (mem/f:SI (pre_dec:SI (reg/f:SI 7 sp)) [3  S4 A32])
>    (reg/f:SI 30)) "j.c":31:7 19 {*movsi}
> (expr_list:REG_DEAD (reg/f:SI 30)
>    (expr_list:REG_ARGS_SIZE (const_int 4 [0x4])
>    (nil
> 
> 
> And as we leave LRA:
> 
> (note 10 9 11 2 NOTE_INSN_DELETED)
> (insn 11 10 13 2 (set (mem/f:SI (pre_dec:SI (reg/f:SI 7 sp)) [3 S4 A32])
>     (reg/f:SI 7 sp)) "j.c":31:7 19 {*movsi}
>  (expr_list:REG_ARGS_SIZE (const_int 4 [0x4])
>     (nil)))

LRA only ever generates insns that pass recog.  The backend allows this
define_insn, requiring it to be split (it returns template "#"), but
then somehow it doesn't match in any split pass?

> Register elimination ultimately discovered that (reg 30) was the same as 
> the stack pointer and did the natural substitution.    The natural 
> substitution results in invalid RTL and there's really no way to back 
> out and do something different AFAICT in lra-eliminations.cc.
> 
> The only reason we fault is because the H8 backend knows this is invalid 
> RTL and refuses to split it.  If we were to try and re-recognize the 
> insn in question it would fail to recognize after the substitution 
> because the auto-inc'd operand overlaps with the other operand.

But it *did* recog?  Or does LRA somehow not always recog() everything?
I always thought that was one of the huge improvements over old reload
(it does everything very locally instead of very globally)!


Segher

Re: [PATCH] Always enable LRA




> On Oct 14, 2022, at 10:38 AM, Jeff Law via Gcc-patches 
>  wrote:
> 
> 
> On 10/14/22 06:37, Koning, Paul wrote:
>> 
>>> On Oct 13, 2022, at 9:07 PM, Jeff Law via Gcc-patches 
>>>  wrote:
>>> 
>>> 
>>> On 10/13/22 17:56, Segher Boessenkool wrote:
 h8300 fails during GCC build:
 /home/segher/src/gcc/libgcc/unwind.inc: In function 
 '_Unwind_SjLj_RaiseException':
 /home/segher/src/gcc/libgcc/unwind.inc:141:1: error: could not split insn
   141 | }
   | ^
 (insn 69 256 327 (set (mem/f:SI (pre_dec:SI (reg/f:SI 7 sp)) [12  S4 A32])
 (reg/f:SI 7 sp)) "/home/segher/src/gcc/libgcc/unwind.inc":118:12 
 19 {*movsi}
  (expr_list:REG_ARGS_SIZE (const_int 4 [0x4])
 (nil)))
 during RTL pass: final
 which looks like a backend bug, I don't see a pattern that could split
 this (without needing an extra clobber)?
>>> I'm aware of this -- its invalid RTL:
>>> 
>>> Uses of the register outside of an address are not permitted within the
>>> same insn as a use in an embedded side effect expression because such
>>> insns behave differently on different machines and hence must be treated
>>> as ambiguous and disallowed.
>> I had a bit of a fight with this sort of thing in pdp11, where in fact such 
>> operations are executed differently on different machine models.  The 
>> solution I picked is to create two sets of machine-specific constraint 
>> codes, one for "register N" and the other for "autoinc/dec of any register 
>> other than N" and pairing those.  (You can see this in pdp11.md, the 
>> mov definition.)
> 
> I've long suspected the pdp11 was the inspiration for this restriction (I 
> have memories of noting it before I relocated to Utah, so circa 1992).  The 
> key problem is the generic parts of the compiler don't know what the 
> semantics ought to be -- so it's not obvious when they do a substitution 
> whether or not the substitution of one reg for another is actually valid.  
> It's important to remember that sometimes when we substitute one register for 
> another, we don't have any contextual information about source vs dest -- 
> it's a long standing wart that causes problems in other cases as well.
> 
> That punts the problem to the backends and the H8 actually tries to deal with 
> this restriction.  Basically in the movxx pattern conditions, when the 
> destination uses an autoinc addressing mode, the pattern's condition will 
> check that the source register is different.  I would expect other ports 
> likely to do something similar.
> 
> But that approach falls down with reload/lra doing substitutions without 
> validating the result.  I guess it might be possible to cobble together 
> something with secondary reloads, but it's way way way down on my todo list.

Aren't the constraints enforced?  My experience is that I was getting these bad 
addressing modes in some test programs, and that the constraints I created to 
make the requirement explicit cured that.  Maybe I'm expecting too much from 
constraints, but my (admittedly inexperienced) understanding of them is that 
they inform reload what sort of things it can construct, and what it cannot.

If reload obeys the constraints in the patterns then the back end machine 
definition can be written to avoid the problematic cases, and it is no longer 
necessary to have a general (and as I pointed out, overly broad) rule in 
generic code.

paul

Re: [PATCH] Always enable LRA

On Fri, Oct 14, 2022 at 03:20:40PM +0900, Takayuki 'January June' Suwa wrote:
> On 2022/10/14 8:56, Segher Boessenkool wrote:
> > And finally, xtensa does
> > /home/segher/src/gcc/libgcc/libgcc2.c:840:1: error: insn does not satisfy 
> > its constraints:
> >   840 | }
> >   | ^
> > (insn 8 7 9 2 (set (reg:SI 9 a9 [57])
> > (const_int 1431655765 [0x])) 
> > "/home/segher/src/gcc/libgcc/libgcc2.c":828:21 37 {movsi_internal}
> >  (expr_list:REG_EQUIV (const_int 1431655765 [0x])
> > (nil)))
> > during RTL pass: postreload
> > /home/segher/src/gcc/libgcc/libgcc2.c:840:1: internal compiler error: in 
> > extract_constrain_insn, at recog.cc:2692
> 
> This is a result of knowing that Reload is tolerant of out-of-constraint 
> constants.
> LRA support needs to be taken care of before that, ie. in the "split1" pass.
> Excuse me in haste.

So old reload did a split itself?  Or it left it to say the split2 pass?

Thanks for looking into it!


Segher

Re: [PATCH] Always enable LRA

On Fri, Oct 14, 2022 at 12:36:47AM +, Koning, Paul wrote:
> I guess I'll have to look harder to see if it's possible to make LRA handle 
> CISC addressing modes like memory indirect, autoincrement, autodecrement, and 
> others that the old reload handles at least somewhat.  Ideally LRA should do 
> a better job; right now I believe it doesn't really do these things at all.  
> Targets like pdp11 and vax would like these.

So what does it do now?  Break every more complex addressing mode apart
again?  Or ICE?  Or something in between?


Segher

Re: [PATCH] Always enable LRA




On 10/14/22 06:37, Koning, Paul wrote:



On Oct 13, 2022, at 9:07 PM, Jeff Law via Gcc-patches  
wrote:


On 10/13/22 17:56, Segher Boessenkool wrote:

h8300 fails during GCC build:
/home/segher/src/gcc/libgcc/unwind.inc: In function 
'_Unwind_SjLj_RaiseException':
/home/segher/src/gcc/libgcc/unwind.inc:141:1: error: could not split insn
   141 | }
   | ^
(insn 69 256 327 (set (mem/f:SI (pre_dec:SI (reg/f:SI 7 sp)) [12  S4 A32])
 (reg/f:SI 7 sp)) "/home/segher/src/gcc/libgcc/unwind.inc":118:12 19 
{*movsi}
  (expr_list:REG_ARGS_SIZE (const_int 4 [0x4])
 (nil)))
during RTL pass: final
which looks like a backend bug, I don't see a pattern that could split
this (without needing an extra clobber)?

I'm aware of this -- its invalid RTL:

Uses of the register outside of an address are not permitted within the
same insn as a use in an embedded side effect expression because such
insns behave differently on different machines and hence must be treated
as ambiguous and disallowed.

I had a bit of a fight with this sort of thing in pdp11, where in fact such operations are executed 
differently on different machine models.  The solution I picked is to create two sets of machine-specific 
constraint codes, one for "register N" and the other for "autoinc/dec of any register 
other than N" and pairing those.  (You can see this in pdp11.md, the mov definition.)


I've long suspected the pdp11 was the inspiration for this restriction 
(I have memories of noting it before I relocated to Utah, so circa 
1992).  The key problem is the generic parts of the compiler don't know 
what the semantics ought to be -- so it's not obvious when they do a 
substitution whether or not the substitution of one reg for another is 
actually valid.  It's important to remember that sometimes when we 
substitute one register for another, we don't have any contextual 
information about source vs dest -- it's a long standing wart that 
causes problems in other cases as well.


That punts the problem to the backends and the H8 actually tries to deal 
with this restriction.  Basically in the movxx pattern conditions, when 
the destination uses an autoinc addressing mode, the pattern's condition 
will check that the source register is different.  I would expect other 
ports likely to do something similar.


But that approach falls down with reload/lra doing substitutions without 
validating the result.  I guess it might be possible to cobble together 
something with secondary reloads, but it's way way way down on my todo list.


And yes, this case where the autoinc is on the destination works 
consistently on the H8 as well.  We could consider loosening the 
restrictions and let this through.  It's certainly simpler as it's a doc 
change and removing a bit of code on the H8.  It sounds like the pdp11 
already assumes that case is valid.



Jeff

Re: [PATCH] Always enable LRA