Re: [PATCH] Always enable LRA

2022-10-14 Thread Jeff Law via Gcc-patches



On 10/14/22 15:21, Koning, Paul wrote:



On Oct 14, 2022, at 5:15 PM, Jeff Law via Gcc-patches  
wrote:


On 10/14/22 11:36, Koning, Paul wrote:

On Oct 14, 2022, at 1:10 PM, Jeff Law  wrote:

On 10/14/22 10:37, Koning, Paul wrote:

...
But that approach falls down with reload/lra doing substitutions without 
validating the result.  I guess it might be possible to cobble together 
something with secondary reloads, but it's way way way down on my todo list.

Aren't the constraints enforced?  My experience is that I was getting these bad 
addressing modes in some test programs, and that the constraints I created to 
make the requirement explicit cured that.  Maybe I'm expecting too much from 
constraints, but my (admittedly inexperienced) understanding of them is that 
they inform reload what sort of things it can construct, and what it cannot.

It's not really a constraint issue -- the pattern's condition would cause this 
not to recognize, but LRA doesn't re-recognize the insn.  We might be able to 
hack something in the constraints to force a reload of the source operand in 
this case.   Ugly, but a possibility.

I find it hard to cope with constraints that don't constrain.  Minimally it 
should be clearly documented exactly what cases fail to obey the constraints 
and what a target writer can do to deal with those failures.

Constraints have a purpose, but as I've noted, they really don't come into play here.   
Had LRA tried to see if what it created as a valid move insn, the backend would have said 
"nope, that's not valid".  That's a stronger test than checking the 
constraints.  If the insn is not valid according to its condition, then the constraints 
simply don't matter.

I'm not aware of a case where constraints are failing to be obeyed and 
constraints simply aren't a viable solution here other than to paper over the 
problem and hope it doesn't show up elsewhere.

Right now operand 0's constraint is "<" meaning pre-inc operand, operand 1 is 
"r".  How would you define a new constraint for operand 1 that disallows overlap with 
operand 0 given that the H8 allows autoinc on any register operand?   You can't look at operand 0 
while processing the constraint for operand 1. Similarly if you try to define a new constraint for 
operand0 without looking at operand1.

Easy but cumbersome: define constraints for "register N" (for each N) and another set for "autoinc on 
any register other than N".  In pdp11, I called these Z0, Z1... and Za, Zb... respectively.  Then the insn gets 
constraints that look like "Z0,Z1,Z2..." and "Za, Zb, Zc..." for the two operands.  As I said, see 
pdp11.md, the mov insn.


It generally looks sound, but golly gee, this runs into the "reload 
doesn't validate insns problem"  if it's done in a reload tree rather 
than an lra tree.  We've got an insn with a pre-inc destination and a 
reg source.  The source pseudo doesn't get a hard reg, reload replaces 
the pseudo with a mem as expected. Reload finishes with something like this:


(insn 100 98 101 15 (set (mem/f:SI (pre_dec:SI (reg/f:SI 7 sp)) [8  S4 A32])
    (mem/c:SI (plus:SI (reg/f:SI 7 sp)
    (const_int 8 [0x8])) [9 %sfp+-24 S4 A32])) "j.c":62:11 
19 {*movsix}

 (expr_list:REG_ARGS_SIZE (const_int 4 [0x4])
    (nil)))

Which, isn't a valid instruction on the H8.  The insn's condition 
verifies that one of the two operands must be a REG.  But reload never 
bothered to re-recognize the insn after makng the substitution, then 
naturally it blows up in reload_cse with a constraint failure because 
the pre-inc destination constraints require a reg for the source 
operand.  But the real culprit here is reload making the substitution 
and not validing that the result is valid.


Arggh!

Which brings me back to pondering just removing the autoinc magic 
checking in the H8 port :-)  I've actually got that spinning in the 
tester just to see if there's any obvious fallout.  I've already spent 
more time on this than I can reasonably justify.



Jeff


Re: [PATCH] Always enable LRA

2022-10-14 Thread Jeff Law via Gcc-patches



On 10/14/22 15:21, Koning, Paul wrote:



On Oct 14, 2022, at 5:15 PM, Jeff Law via Gcc-patches  
wrote:


On 10/14/22 11:36, Koning, Paul wrote:

On Oct 14, 2022, at 1:10 PM, Jeff Law  wrote:

On 10/14/22 10:37, Koning, Paul wrote:

...
But that approach falls down with reload/lra doing substitutions without 
validating the result.  I guess it might be possible to cobble together 
something with secondary reloads, but it's way way way down on my todo list.

Aren't the constraints enforced?  My experience is that I was getting these bad 
addressing modes in some test programs, and that the constraints I created to 
make the requirement explicit cured that.  Maybe I'm expecting too much from 
constraints, but my (admittedly inexperienced) understanding of them is that 
they inform reload what sort of things it can construct, and what it cannot.

It's not really a constraint issue -- the pattern's condition would cause this 
not to recognize, but LRA doesn't re-recognize the insn.  We might be able to 
hack something in the constraints to force a reload of the source operand in 
this case.   Ugly, but a possibility.

I find it hard to cope with constraints that don't constrain.  Minimally it 
should be clearly documented exactly what cases fail to obey the constraints 
and what a target writer can do to deal with those failures.

Constraints have a purpose, but as I've noted, they really don't come into play here.   
Had LRA tried to see if what it created as a valid move insn, the backend would have said 
"nope, that's not valid".  That's a stronger test than checking the 
constraints.  If the insn is not valid according to its condition, then the constraints 
simply don't matter.

I'm not aware of a case where constraints are failing to be obeyed and 
constraints simply aren't a viable solution here other than to paper over the 
problem and hope it doesn't show up elsewhere.

Right now operand 0's constraint is "<" meaning pre-inc operand, operand 1 is 
"r".  How would you define a new constraint for operand 1 that disallows overlap with 
operand 0 given that the H8 allows autoinc on any register operand?   You can't look at operand 0 
while processing the constraint for operand 1. Similarly if you try to define a new constraint for 
operand0 without looking at operand1.

Easy but cumbersome: define constraints for "register N" (for each N) and another set for "autoinc on 
any register other than N".  In pdp11, I called these Z0, Z1... and Za, Zb... respectively.  Then the insn gets 
constraints that look like "Z0,Z1,Z2..." and "Za, Zb, Zc..." for the two operands.  As I said, see 
pdp11.md, the mov insn.


Yea, you're right.  It's definitely possible.  Painful, but possible.

jeff



Re: [PATCH] Always enable LRA

2022-10-14 Thread Koning, Paul via Gcc-patches



> On Oct 14, 2022, at 5:15 PM, Jeff Law via Gcc-patches 
>  wrote:
> 
> 
> On 10/14/22 11:36, Koning, Paul wrote:
>> 
>>> On Oct 14, 2022, at 1:10 PM, Jeff Law  wrote:
>>> 
>>> On 10/14/22 10:37, Koning, Paul wrote:
> ...
> But that approach falls down with reload/lra doing substitutions without 
> validating the result.  I guess it might be possible to cobble together 
> something with secondary reloads, but it's way way way down on my todo 
> list.
 Aren't the constraints enforced?  My experience is that I was getting 
 these bad addressing modes in some test programs, and that the constraints 
 I created to make the requirement explicit cured that.  Maybe I'm 
 expecting too much from constraints, but my (admittedly inexperienced) 
 understanding of them is that they inform reload what sort of things it 
 can construct, and what it cannot.
>>> It's not really a constraint issue -- the pattern's condition would cause 
>>> this not to recognize, but LRA doesn't re-recognize the insn.  We might be 
>>> able to hack something in the constraints to force a reload of the source 
>>> operand in this case.   Ugly, but a possibility.
>> I find it hard to cope with constraints that don't constrain.  Minimally it 
>> should be clearly documented exactly what cases fail to obey the constraints 
>> and what a target writer can do to deal with those failures.
> 
> Constraints have a purpose, but as I've noted, they really don't come into 
> play here.   Had LRA tried to see if what it created as a valid move insn, 
> the backend would have said "nope, that's not valid".  That's a stronger test 
> than checking the constraints.  If the insn is not valid according to its 
> condition, then the constraints simply don't matter.
> 
> I'm not aware of a case where constraints are failing to be obeyed and 
> constraints simply aren't a viable solution here other than to paper over the 
> problem and hope it doesn't show up elsewhere.
> 
> Right now operand 0's constraint is "<" meaning pre-inc operand, operand 1 is 
> "r".  How would you define a new constraint for operand 1 that disallows 
> overlap with operand 0 given that the H8 allows autoinc on any register 
> operand?   You can't look at operand 0 while processing the constraint for 
> operand 1. Similarly if you try to define a new constraint for operand0 
> without looking at operand1.

Easy but cumbersome: define constraints for "register N" (for each N) and 
another set for "autoinc on any register other than N".  In pdp11, I called 
these Z0, Z1... and Za, Zb... respectively.  Then the insn gets constraints 
that look like "Z0,Z1,Z2..." and "Za, Zb, Zc..." for the two operands.  As I 
said, see pdp11.md, the mov insn.

paul



Re: [PATCH] Always enable LRA

2022-10-14 Thread Jeff Law via Gcc-patches



On 10/14/22 11:36, Koning, Paul wrote:



On Oct 14, 2022, at 1:10 PM, Jeff Law  wrote:

On 10/14/22 10:37, Koning, Paul wrote:

...
But that approach falls down with reload/lra doing substitutions without 
validating the result.  I guess it might be possible to cobble together 
something with secondary reloads, but it's way way way down on my todo list.

Aren't the constraints enforced?  My experience is that I was getting these bad 
addressing modes in some test programs, and that the constraints I created to 
make the requirement explicit cured that.  Maybe I'm expecting too much from 
constraints, but my (admittedly inexperienced) understanding of them is that 
they inform reload what sort of things it can construct, and what it cannot.

It's not really a constraint issue -- the pattern's condition would cause this 
not to recognize, but LRA doesn't re-recognize the insn.  We might be able to 
hack something in the constraints to force a reload of the source operand in 
this case.   Ugly, but a possibility.

I find it hard to cope with constraints that don't constrain.  Minimally it 
should be clearly documented exactly what cases fail to obey the constraints 
and what a target writer can do to deal with those failures.


Constraints have a purpose, but as I've noted, they really don't come 
into play here.   Had LRA tried to see if what it created as a valid 
move insn, the backend would have said "nope, that's not valid".  That's 
a stronger test than checking the constraints.  If the insn is not valid 
according to its condition, then the constraints simply don't matter.


I'm not aware of a case where constraints are failing to be obeyed and 
constraints simply aren't a viable solution here other than to paper 
over the problem and hope it doesn't show up elsewhere.


Right now operand 0's constraint is "<" meaning pre-inc operand, operand 
1 is "r".  How would you define a new constraint for operand 1 that 
disallows overlap with operand 0 given that the H8 allows autoinc on any 
register operand?   You can't look at operand 0 while processing the 
constraint for operand 1. Similarly if you try to define a new 
constraint for operand0 without looking at operand1.


Hence the h8300_move_ok test in the insn's condition where we can look 
at both operands to assess if it's a legitimate insn.





As it stands, I find myself working hard to write MD code that accurately 
describes the rules of the machine, and for the core machinery to disregard 
those instructions is painful.


No doubt.




Is there a compelling argument for every case where LRA fails to obey the 
constraints?  If not, can they just be called bugs and added to the to-be-fixed 
queue?


There was in the reload days, though I honestly don't remember what it 
was, I'm much less familiar with LRA in this regard, but I trust Vlad's 
engineering skills and strongly believe that failing to recognize was 
done for a good reason.



It's also worth repeating, we can get the same fundamental failure on 
the H8 with reload.  The testcase is different, but the core issue is 
the same.  We have a move with an autoinc destination and the same 
register is also used as a source operand incorerctly created by reload.



What's a bit interesting here is the m68k doesn't do any kind of 
checking for these scenarios. It just accepts them and generates the 
obvious code.  I'm more tempted by the minute to do the same on the H8 :-)



Jeff




Re: [PATCH] Always enable LRA

2022-10-14 Thread Koning, Paul via Gcc-patches



> On Oct 14, 2022, at 4:12 PM, Segher Boessenkool  
> wrote:
> 
> On Fri, Oct 14, 2022 at 07:58:39PM +, Koning, Paul wrote:
>>> On Oct 14, 2022, at 2:03 PM, Jeff Law via Gcc-patches 
>>>  wrote:
>>> On 10/14/22 11:35, Segher Boessenkool wrote:
 On Fri, Oct 14, 2022 at 11:07:43AM -0600, Jeff Law wrote:
>> LRA only ever generates insns that pass recog.  The backend allows this
>> define_insn, requiring it to be split (it returns template "#"), but
>> then somehow it doesn't match in any split pass?
> Nope.  The elimination code will just change one register without
> re-recognizing.  That's precisely what happens here.
 That is a big oversight then.  Please file a PR?
>>> 
>>> Sure.  But just recognizing (for this particular case) will just move the 
>>> fault from a failure to split to a failure to recognize. From my wanderings 
>>> in the elimination code, I don't see that it has a path that would allow it 
>>> to reasonably handle this case -- ie, if the insn does not recognize, what 
>>> then?   Conceptually we need to generate an input-reload but I don't see a 
>>> way to do that in the elimination code.  Maybe Vlad knows how it ought to 
>>> be handled.
>> 
>> I probably have too simplistic a view of this, but the way I think of it is 
>> that LRA (and reload) make decisions subject to constraints, and among those 
>> constraints are the ones specified in the MD file patterns.  That to me 
>> means that a substitution proposed to be made by the LRA code is subject to 
>> those invariants: it can't do that if the constraints say "no" and must then 
>> consider some other alternative.
> 
> I think that is exactly right for LRA.
> 
> Old reload conceptually changed the whole function all at once, starting
> with valid RTL, and ending with strictly valid RTL.  LRA works locally,
> one instruction at a time essentially, and makes the changes
> immediately.  If when it has finished work on the function offsets have
> changed, it walks over the whole function again, repeat until done.
> 
> "Strictly valid" means that the constraints are considered, and the insn
> is only valid if some enabled alternative satisfies all constraints.
> 
> I hope I got that all right, I'm not an expert!  :-)

Thanks Segher.

As I said earlier, if for some reason this straightforward understanding is not 
completely accurate, that can be handled provided it is documented when and why 
the exceptions arise, and what methods the target author should use to deal 
with these things when they happen.

As a target maintainer not deeply skilled in the GCC common internals, I tend 
to trip over these things.  With the old reload, and secondary reload in 
particular, it always felt to  me  like the answer was "keep tweaking the 
target definition files until the test cases stop breaking".  That isn't how it 
should be.  

Perhaps some of these issues come from out of the ordinary target restrictions. 
 The autoinc/autodec case we're discussing may be an example of that.  The one 
I remember in particular was the pdp11 float instructions, where I have 6 
registers but only 4 of these can be loaded from or stored to memory.  Putting 
the other two to work while having spill to memory work right took quite a lot 
of iteration.

It may be LRA is better in these areas.  I haven't spent much time with that, 
other than to create a way to enable its use and observing that (a) I got about 
the same test suite numbers either way and (b) the LRA code was not as good in 
some of the cases.

paul



Re: [PATCH] Always enable LRA

2022-10-14 Thread Segher Boessenkool
On Fri, Oct 14, 2022 at 07:58:39PM +, Koning, Paul wrote:
> > On Oct 14, 2022, at 2:03 PM, Jeff Law via Gcc-patches 
> >  wrote:
> > On 10/14/22 11:35, Segher Boessenkool wrote:
> >> On Fri, Oct 14, 2022 at 11:07:43AM -0600, Jeff Law wrote:
>  LRA only ever generates insns that pass recog.  The backend allows this
>  define_insn, requiring it to be split (it returns template "#"), but
>  then somehow it doesn't match in any split pass?
> >>> Nope.  The elimination code will just change one register without
> >>> re-recognizing.  That's precisely what happens here.
> >> That is a big oversight then.  Please file a PR?
> > 
> > Sure.  But just recognizing (for this particular case) will just move the 
> > fault from a failure to split to a failure to recognize. From my wanderings 
> > in the elimination code, I don't see that it has a path that would allow it 
> > to reasonably handle this case -- ie, if the insn does not recognize, what 
> > then?   Conceptually we need to generate an input-reload but I don't see a 
> > way to do that in the elimination code.  Maybe Vlad knows how it ought to 
> > be handled.
> 
> I probably have too simplistic a view of this, but the way I think of it is 
> that LRA (and reload) make decisions subject to constraints, and among those 
> constraints are the ones specified in the MD file patterns.  That to me means 
> that a substitution proposed to be made by the LRA code is subject to those 
> invariants: it can't do that if the constraints say "no" and must then 
> consider some other alternative.

I think that is exactly right for LRA.

Old reload conceptually changed the whole function all at once, starting
with valid RTL, and ending with strictly valid RTL.  LRA works locally,
one instruction at a time essentially, and makes the changes
immediately.  If when it has finished work on the function offsets have
changed, it walks over the whole function again, repeat until done.

"Strictly valid" means that the constraints are considered, and the insn
is only valid if some enabled alternative satisfies all constraints.

I hope I got that all right, I'm not an expert!  :-)


Segher


Re: [PATCH] Always enable LRA

2022-10-14 Thread Koning, Paul via Gcc-patches



> On Oct 14, 2022, at 2:03 PM, Jeff Law via Gcc-patches 
>  wrote:
> 
> 
> On 10/14/22 11:35, Segher Boessenkool wrote:
>> On Fri, Oct 14, 2022 at 11:07:43AM -0600, Jeff Law wrote:
 LRA only ever generates insns that pass recog.  The backend allows this
 define_insn, requiring it to be split (it returns template "#"), but
 then somehow it doesn't match in any split pass?
>>> Nope.  The elimination code will just change one register without
>>> re-recognizing.  That's precisely what happens here.
>> That is a big oversight then.  Please file a PR?
> 
> Sure.  But just recognizing (for this particular case) will just move the 
> fault from a failure to split to a failure to recognize. From my wanderings 
> in the elimination code, I don't see that it has a path that would allow it 
> to reasonably handle this case -- ie, if the insn does not recognize, what 
> then?   Conceptually we need to generate an input-reload but I don't see a 
> way to do that in the elimination code.  Maybe Vlad knows how it ought to be 
> handled.

I probably have too simplistic a view of this, but the way I think of it is 
that LRA (and reload) make decisions subject to constraints, and among those 
constraints are the ones specified in the MD file patterns.  That to me means 
that a substitution proposed to be made by the LRA code is subject to those 
invariants: it can't do that if the constraints say "no" and must then consider 
some other alternative.

paul




Re: [PATCH] Always enable LRA

2022-10-14 Thread Jeff Law via Gcc-patches



On 10/14/22 11:35, Segher Boessenkool wrote:

On Fri, Oct 14, 2022 at 11:07:43AM -0600, Jeff Law wrote:

LRA only ever generates insns that pass recog.  The backend allows this
define_insn, requiring it to be split (it returns template "#"), but
then somehow it doesn't match in any split pass?

Nope.  The elimination code will just change one register without
re-recognizing.  That's precisely what happens here.

That is a big oversight then.  Please file a PR?


Sure.  But just recognizing (for this particular case) will just move 
the fault from a failure to split to a failure to recognize. From my 
wanderings in the elimination code, I don't see that it has a path that 
would allow it to reasonably handle this case -- ie, if the insn does 
not recognize, what then?   Conceptually we need to generate an 
input-reload but I don't see a way to do that in the elimination code.  
Maybe Vlad knows how it ought to be handled.





It is the only way it can know if it needs to reload more.  Even if it
somehow can assume it doesn't have to check this in some cases, an
assert (inside a CHECKING_P) would be nice?


Agreed.


jeff



Re: [PATCH] Always enable LRA

2022-10-14 Thread Koning, Paul via Gcc-patches



> On Oct 14, 2022, at 1:10 PM, Jeff Law  wrote:
> 
> On 10/14/22 10:37, Koning, Paul wrote:
>> 
>>> ...
>>> But that approach falls down with reload/lra doing substitutions without 
>>> validating the result.  I guess it might be possible to cobble together 
>>> something with secondary reloads, but it's way way way down on my todo list.
>> Aren't the constraints enforced?  My experience is that I was getting these 
>> bad addressing modes in some test programs, and that the constraints I 
>> created to make the requirement explicit cured that.  Maybe I'm expecting 
>> too much from constraints, but my (admittedly inexperienced) understanding 
>> of them is that they inform reload what sort of things it can construct, and 
>> what it cannot.
> 
> It's not really a constraint issue -- the pattern's condition would cause 
> this not to recognize, but LRA doesn't re-recognize the insn.  We might be 
> able to hack something in the constraints to force a reload of the source 
> operand in this case.   Ugly, but a possibility.

I find it hard to cope with constraints that don't constrain.  Minimally it 
should be clearly documented exactly what cases fail to obey the constraints 
and what a target writer can do to deal with those failures.

As it stands, I find myself working hard to write MD code that accurately 
describes the rules of the machine, and for the core machinery to disregard 
those instructions is painful.

Is there a compelling argument for every case where LRA fails to obey the 
constraints?  If not, can they just be called bugs and added to the to-be-fixed 
queue?

paul



Re: [PATCH] Always enable LRA

2022-10-14 Thread Segher Boessenkool
On Fri, Oct 14, 2022 at 11:07:43AM -0600, Jeff Law wrote:
> >LRA only ever generates insns that pass recog.  The backend allows this
> >define_insn, requiring it to be split (it returns template "#"), but
> >then somehow it doesn't match in any split pass?
> 
> Nope.  The elimination code will just change one register without 
> re-recognizing.  That's precisely what happens here.

That is a big oversight then.  Please file a PR?

> >>Register elimination ultimately discovered that (reg 30) was the same as
> >>the stack pointer and did the natural substitution.    The natural
> >>substitution results in invalid RTL and there's really no way to back
> >>out and do something different AFAICT in lra-eliminations.cc.
> >>
> >>The only reason we fault is because the H8 backend knows this is invalid
> >>RTL and refuses to split it.  If we were to try and re-recognize the
> >>insn in question it would fail to recognize after the substitution
> >>because the auto-inc'd operand overlaps with the other operand.
> >But it *did* recog?  Or does LRA somehow not always recog() everything?
> >I always thought that was one of the huge improvements over old reload
> >(it does everything very locally instead of very globally)!
> 
> No, LRA does not force re-recognition in some cases, particularly for 
> register eliminations.

It is the only way it can know if it needs to reload more.  Even if it
somehow can assume it doesn't have to check this in some cases, an
assert (inside a CHECKING_P) would be nice?


Segher


Re: [PATCH] Always enable LRA

2022-10-14 Thread Jeff Law via Gcc-patches



On 10/14/22 10:39, Richard Biener wrote:



Am 14.10.2022 um 16:40 schrieb Jeff Law via Gcc-patches 
:



On 10/14/22 06:37, Koning, Paul wrote:


On Oct 13, 2022, at 9:07 PM, Jeff Law via Gcc-patches  
wrote:


On 10/13/22 17:56, Segher Boessenkool wrote:

h8300 fails during GCC build:
/home/segher/src/gcc/libgcc/unwind.inc: In function 
'_Unwind_SjLj_RaiseException':
/home/segher/src/gcc/libgcc/unwind.inc:141:1: error: could not split insn
   141 | }
   | ^
(insn 69 256 327 (set (mem/f:SI (pre_dec:SI (reg/f:SI 7 sp)) [12  S4 A32])
 (reg/f:SI 7 sp)) "/home/segher/src/gcc/libgcc/unwind.inc":118:12 19 
{*movsi}
  (expr_list:REG_ARGS_SIZE (const_int 4 [0x4])
 (nil)))
during RTL pass: final
which looks like a backend bug, I don't see a pattern that could split
this (without needing an extra clobber)?

I'm aware of this -- its invalid RTL:

Uses of the register outside of an address are not permitted within the
same insn as a use in an embedded side effect expression because such
insns behave differently on different machines and hence must be treated
as ambiguous and disallowed.

I had a bit of a fight with this sort of thing in pdp11, where in fact such operations are executed 
differently on different machine models.  The solution I picked is to create two sets of machine-specific 
constraint codes, one for "register N" and the other for "autoinc/dec of any register 
other than N" and pairing those.  (You can see this in pdp11.md, the mov definition.)

I've long suspected the pdp11 was the inspiration for this restriction (I have 
memories of noting it before I relocated to Utah, so circa 1992).  The key 
problem is the generic parts of the compiler don't know what the semantics 
ought to be -- so it's not obvious when they do a substitution whether or not 
the substitution of one reg for another is actually valid.  It's important to 
remember that sometimes when we substitute one register for another, we don't 
have any contextual information about source vs dest -- it's a long standing 
wart that causes problems in other cases as well.

That punts the problem to the backends and the H8 actually tries to deal with 
this restriction.  Basically in the movxx pattern conditions, when the 
destination uses an autoinc addressing mode, the pattern's condition will check 
that the source register is different.  I would expect other ports likely to do 
something similar.

But that approach falls down with reload/lra doing substitutions without 
validating the result.  I guess it might be possible to cobble together 
something with secondary reloads, but it's way way way down on my todo list.

And yes, this case where the autoinc is on the destination works consistently 
on the H8 as well.  We could consider loosening the restrictions and let this 
through.  It's certainly simpler as it's a doc change and removing a bit of 
code on the H8.  It sounds like the pdp11 already assumes that case is valid.

But what is the semantics of the RTL IL?
That ought to be documented.


*If* we went a route to relax the restriction (and I'm still not sure 
that's a good idea), we absolutely would have to document the semantics.



jeff




Re: [PATCH] Always enable LRA

2022-10-14 Thread Jeff Law via Gcc-patches



On 10/14/22 10:37, Koning, Paul wrote:



On Oct 14, 2022, at 10:38 AM, Jeff Law via Gcc-patches 
 wrote:


On 10/14/22 06:37, Koning, Paul wrote:

On Oct 13, 2022, at 9:07 PM, Jeff Law via Gcc-patches  
wrote:


On 10/13/22 17:56, Segher Boessenkool wrote:

h8300 fails during GCC build:
/home/segher/src/gcc/libgcc/unwind.inc: In function 
'_Unwind_SjLj_RaiseException':
/home/segher/src/gcc/libgcc/unwind.inc:141:1: error: could not split insn
   141 | }
   | ^
(insn 69 256 327 (set (mem/f:SI (pre_dec:SI (reg/f:SI 7 sp)) [12  S4 A32])
 (reg/f:SI 7 sp)) "/home/segher/src/gcc/libgcc/unwind.inc":118:12 19 
{*movsi}
  (expr_list:REG_ARGS_SIZE (const_int 4 [0x4])
 (nil)))
during RTL pass: final
which looks like a backend bug, I don't see a pattern that could split
this (without needing an extra clobber)?

I'm aware of this -- its invalid RTL:

Uses of the register outside of an address are not permitted within the
same insn as a use in an embedded side effect expression because such
insns behave differently on different machines and hence must be treated
as ambiguous and disallowed.

I had a bit of a fight with this sort of thing in pdp11, where in fact such operations are executed 
differently on different machine models.  The solution I picked is to create two sets of machine-specific 
constraint codes, one for "register N" and the other for "autoinc/dec of any register 
other than N" and pairing those.  (You can see this in pdp11.md, the mov definition.)

I've long suspected the pdp11 was the inspiration for this restriction (I have 
memories of noting it before I relocated to Utah, so circa 1992).  The key 
problem is the generic parts of the compiler don't know what the semantics 
ought to be -- so it's not obvious when they do a substitution whether or not 
the substitution of one reg for another is actually valid.  It's important to 
remember that sometimes when we substitute one register for another, we don't 
have any contextual information about source vs dest -- it's a long standing 
wart that causes problems in other cases as well.

That punts the problem to the backends and the H8 actually tries to deal with 
this restriction.  Basically in the movxx pattern conditions, when the 
destination uses an autoinc addressing mode, the pattern's condition will check 
that the source register is different.  I would expect other ports likely to do 
something similar.

But that approach falls down with reload/lra doing substitutions without 
validating the result.  I guess it might be possible to cobble together 
something with secondary reloads, but it's way way way down on my todo list.

Aren't the constraints enforced?  My experience is that I was getting these bad 
addressing modes in some test programs, and that the constraints I created to 
make the requirement explicit cured that.  Maybe I'm expecting too much from 
constraints, but my (admittedly inexperienced) understanding of them is that 
they inform reload what sort of things it can construct, and what it cannot.


It's not really a constraint issue -- the pattern's condition would 
cause this not to recognize, but LRA doesn't re-recognize the insn.  We 
might be able to hack something in the constraints to force a reload of 
the source operand in this case.   Ugly, but a possibility.



Jeff


Re: [PATCH] Always enable LRA

2022-10-14 Thread Jeff Law via Gcc-patches



On 10/14/22 10:37, Segher Boessenkool wrote:

Hi!

On Thu, Oct 13, 2022 at 10:47:20PM -0600, Jeff Law wrote:

On 10/13/22 17:56, Segher Boessenkool wrote:

h8300 fails during GCC build:
/home/segher/src/gcc/libgcc/unwind.inc: In function
'_Unwind_SjLj_RaiseException':
/home/segher/src/gcc/libgcc/unwind.inc:141:1: error: could not split insn
   141 | }
   | ^
(insn 69 256 327 (set (mem/f:SI (pre_dec:SI (reg/f:SI 7 sp)) [12  S4 A32])
 (reg/f:SI 7 sp)) "/home/segher/src/gcc/libgcc/unwind.inc":118:12
 19 {*movsi}
  (expr_list:REG_ARGS_SIZE (const_int 4 [0x4])
 (nil)))
during RTL pass: final
which looks like a backend bug, I don't see a pattern that could split
this (without needing an extra clobber)?

Really smells like an LRA bug to me.


We have this as we leave IRA:


(insn 10 9 11 2 (set (reg/f:SI 30)
    (plus:SI (reg/f:SI 11 fp)
    (const_int -4 [0xfffc]))) "j.c":31:7 264
{*addsi}
 (expr_list:REG_EQUIV (plus:SI (reg/f:SI 11 fp)
    (const_int -4 [0xfffc]))
(nil)))
(insn 11 10 12 2 (set (mem/f:SI (pre_dec:SI (reg/f:SI 7 sp)) [3  S4 A32])
    (reg/f:SI 30)) "j.c":31:7 19 {*movsi}
 (expr_list:REG_DEAD (reg/f:SI 30)
    (expr_list:REG_ARGS_SIZE (const_int 4 [0x4])
    (nil


And as we leave LRA:

(note 10 9 11 2 NOTE_INSN_DELETED)
(insn 11 10 13 2 (set (mem/f:SI (pre_dec:SI (reg/f:SI 7 sp)) [3 S4 A32])
     (reg/f:SI 7 sp)) "j.c":31:7 19 {*movsi}
  (expr_list:REG_ARGS_SIZE (const_int 4 [0x4])
     (nil)))

LRA only ever generates insns that pass recog.  The backend allows this
define_insn, requiring it to be split (it returns template "#"), but
then somehow it doesn't match in any split pass?


Nope.  The elimination code will just change one register without 
re-recognizing.  That's precisely what happens here.






Register elimination ultimately discovered that (reg 30) was the same as
the stack pointer and did the natural substitution.    The natural
substitution results in invalid RTL and there's really no way to back
out and do something different AFAICT in lra-eliminations.cc.

The only reason we fault is because the H8 backend knows this is invalid
RTL and refuses to split it.  If we were to try and re-recognize the
insn in question it would fail to recognize after the substitution
because the auto-inc'd operand overlaps with the other operand.

But it *did* recog?  Or does LRA somehow not always recog() everything?
I always thought that was one of the huge improvements over old reload
(it does everything very locally instead of very globally)!


No, LRA does not force re-recognition in some cases, particularly for 
register eliminations.



jeff




Re: [PATCH] Always enable LRA

2022-10-14 Thread Koning, Paul via Gcc-patches



> On Oct 14, 2022, at 12:18 PM, Segher Boessenkool  
> wrote:
> 
> On Fri, Oct 14, 2022 at 12:36:47AM +, Koning, Paul wrote:
>> I guess I'll have to look harder to see if it's possible to make LRA handle 
>> CISC addressing modes like memory indirect, autoincrement, autodecrement, 
>> and others that the old reload handles at least somewhat.  Ideally LRA 
>> should do a better job; right now I believe it doesn't really do these 
>> things at all.  Targets like pdp11 and vax would like these.
> 
> So what does it do now?  Break every more complex addressing mode apart
> again?  Or ICE?  Or something in between?

The former.  LRA does handle some cases but not all that the target permits and 
not as many as the old reload.

Example:

unsigned int cksum (unsigned int *buf, unsigned int len)
{
unsigned int ret = 0;
do {
ret += *buf++;
} while (--len != 0);
return ret;
}

The loop looks like this:

L_2:
add (r2)+,r0
sob r1,L_2

which is what I would expect.  Now throw in an indirection:

Old reload produces this loop:

L_2:
add @(r2)+,r0
sob r1,L_2

while LRA doesn't understand it can use the autoincrement indirect mode:

L_2:
mov (r2)+,r3
add (r3),r0
sob r1,L_2

This is from a GCC 13.0 test build of last June, with -O2 -m45, with and 
without -mlra.

paul



Re: [PATCH] Always enable LRA

2022-10-14 Thread Richard Biener via Gcc-patches



> Am 14.10.2022 um 16:40 schrieb Jeff Law via Gcc-patches 
> :
> 
> 
>> On 10/14/22 06:37, Koning, Paul wrote:
>> 
 On Oct 13, 2022, at 9:07 PM, Jeff Law via Gcc-patches 
  wrote:
>>> 
>>> 
>>> On 10/13/22 17:56, Segher Boessenkool wrote:
 h8300 fails during GCC build:
 /home/segher/src/gcc/libgcc/unwind.inc: In function 
 '_Unwind_SjLj_RaiseException':
 /home/segher/src/gcc/libgcc/unwind.inc:141:1: error: could not split insn
   141 | }
   | ^
 (insn 69 256 327 (set (mem/f:SI (pre_dec:SI (reg/f:SI 7 sp)) [12  S4 A32])
 (reg/f:SI 7 sp)) "/home/segher/src/gcc/libgcc/unwind.inc":118:12 
 19 {*movsi}
  (expr_list:REG_ARGS_SIZE (const_int 4 [0x4])
 (nil)))
 during RTL pass: final
 which looks like a backend bug, I don't see a pattern that could split
 this (without needing an extra clobber)?
>>> I'm aware of this -- its invalid RTL:
>>> 
>>> Uses of the register outside of an address are not permitted within the
>>> same insn as a use in an embedded side effect expression because such
>>> insns behave differently on different machines and hence must be treated
>>> as ambiguous and disallowed.
>> I had a bit of a fight with this sort of thing in pdp11, where in fact such 
>> operations are executed differently on different machine models.  The 
>> solution I picked is to create two sets of machine-specific constraint 
>> codes, one for "register N" and the other for "autoinc/dec of any register 
>> other than N" and pairing those.  (You can see this in pdp11.md, the 
>> mov definition.)
> 
> I've long suspected the pdp11 was the inspiration for this restriction (I 
> have memories of noting it before I relocated to Utah, so circa 1992).  The 
> key problem is the generic parts of the compiler don't know what the 
> semantics ought to be -- so it's not obvious when they do a substitution 
> whether or not the substitution of one reg for another is actually valid.  
> It's important to remember that sometimes when we substitute one register for 
> another, we don't have any contextual information about source vs dest -- 
> it's a long standing wart that causes problems in other cases as well.
> 
> That punts the problem to the backends and the H8 actually tries to deal with 
> this restriction.  Basically in the movxx pattern conditions, when the 
> destination uses an autoinc addressing mode, the pattern's condition will 
> check that the source register is different.  I would expect other ports 
> likely to do something similar.
> 
> But that approach falls down with reload/lra doing substitutions without 
> validating the result.  I guess it might be possible to cobble together 
> something with secondary reloads, but it's way way way down on my todo list.
> 
> And yes, this case where the autoinc is on the destination works consistently 
> on the H8 as well.  We could consider loosening the restrictions and let this 
> through.  It's certainly simpler as it's a doc change and removing a bit of 
> code on the H8.  It sounds like the pdp11 already assumes that case is valid.

But what is the semantics of the RTL IL?
That ought to be documented.


> 
> Jeff
> 


Re: [PATCH] Always enable LRA

2022-10-14 Thread Segher Boessenkool
Hi!

On Thu, Oct 13, 2022 at 10:47:20PM -0600, Jeff Law wrote:
> On 10/13/22 17:56, Segher Boessenkool wrote:
> >h8300 fails during GCC build:
> >/home/segher/src/gcc/libgcc/unwind.inc: In function 
> >'_Unwind_SjLj_RaiseException':
> >/home/segher/src/gcc/libgcc/unwind.inc:141:1: error: could not split insn
> >   141 | }
> >   | ^
> >(insn 69 256 327 (set (mem/f:SI (pre_dec:SI (reg/f:SI 7 sp)) [12  S4 A32])
> > (reg/f:SI 7 sp)) "/home/segher/src/gcc/libgcc/unwind.inc":118:12 
> > 19 {*movsi}
> >  (expr_list:REG_ARGS_SIZE (const_int 4 [0x4])
> > (nil)))
> >during RTL pass: final
> >which looks like a backend bug, I don't see a pattern that could split
> >this (without needing an extra clobber)?
> 
> Really smells like an LRA bug to me.
> 
> 
> We have this as we leave IRA:
> 
> 
> (insn 10 9 11 2 (set (reg/f:SI 30)
>    (plus:SI (reg/f:SI 11 fp)
>    (const_int -4 [0xfffc]))) "j.c":31:7 264 
> {*addsi}
> (expr_list:REG_EQUIV (plus:SI (reg/f:SI 11 fp)
>    (const_int -4 [0xfffc]))
> (nil)))
> (insn 11 10 12 2 (set (mem/f:SI (pre_dec:SI (reg/f:SI 7 sp)) [3  S4 A32])
>    (reg/f:SI 30)) "j.c":31:7 19 {*movsi}
> (expr_list:REG_DEAD (reg/f:SI 30)
>    (expr_list:REG_ARGS_SIZE (const_int 4 [0x4])
>    (nil
> 
> 
> And as we leave LRA:
> 
> (note 10 9 11 2 NOTE_INSN_DELETED)
> (insn 11 10 13 2 (set (mem/f:SI (pre_dec:SI (reg/f:SI 7 sp)) [3 S4 A32])
>     (reg/f:SI 7 sp)) "j.c":31:7 19 {*movsi}
>  (expr_list:REG_ARGS_SIZE (const_int 4 [0x4])
>     (nil)))

LRA only ever generates insns that pass recog.  The backend allows this
define_insn, requiring it to be split (it returns template "#"), but
then somehow it doesn't match in any split pass?

> Register elimination ultimately discovered that (reg 30) was the same as 
> the stack pointer and did the natural substitution.    The natural 
> substitution results in invalid RTL and there's really no way to back 
> out and do something different AFAICT in lra-eliminations.cc.
> 
> The only reason we fault is because the H8 backend knows this is invalid 
> RTL and refuses to split it.  If we were to try and re-recognize the 
> insn in question it would fail to recognize after the substitution 
> because the auto-inc'd operand overlaps with the other operand.

But it *did* recog?  Or does LRA somehow not always recog() everything?
I always thought that was one of the huge improvements over old reload
(it does everything very locally instead of very globally)!


Segher


Re: [PATCH] Always enable LRA

2022-10-14 Thread Koning, Paul via Gcc-patches



> On Oct 14, 2022, at 10:38 AM, Jeff Law via Gcc-patches 
>  wrote:
> 
> 
> On 10/14/22 06:37, Koning, Paul wrote:
>> 
>>> On Oct 13, 2022, at 9:07 PM, Jeff Law via Gcc-patches 
>>>  wrote:
>>> 
>>> 
>>> On 10/13/22 17:56, Segher Boessenkool wrote:
 h8300 fails during GCC build:
 /home/segher/src/gcc/libgcc/unwind.inc: In function 
 '_Unwind_SjLj_RaiseException':
 /home/segher/src/gcc/libgcc/unwind.inc:141:1: error: could not split insn
   141 | }
   | ^
 (insn 69 256 327 (set (mem/f:SI (pre_dec:SI (reg/f:SI 7 sp)) [12  S4 A32])
 (reg/f:SI 7 sp)) "/home/segher/src/gcc/libgcc/unwind.inc":118:12 
 19 {*movsi}
  (expr_list:REG_ARGS_SIZE (const_int 4 [0x4])
 (nil)))
 during RTL pass: final
 which looks like a backend bug, I don't see a pattern that could split
 this (without needing an extra clobber)?
>>> I'm aware of this -- its invalid RTL:
>>> 
>>> Uses of the register outside of an address are not permitted within the
>>> same insn as a use in an embedded side effect expression because such
>>> insns behave differently on different machines and hence must be treated
>>> as ambiguous and disallowed.
>> I had a bit of a fight with this sort of thing in pdp11, where in fact such 
>> operations are executed differently on different machine models.  The 
>> solution I picked is to create two sets of machine-specific constraint 
>> codes, one for "register N" and the other for "autoinc/dec of any register 
>> other than N" and pairing those.  (You can see this in pdp11.md, the 
>> mov definition.)
> 
> I've long suspected the pdp11 was the inspiration for this restriction (I 
> have memories of noting it before I relocated to Utah, so circa 1992).  The 
> key problem is the generic parts of the compiler don't know what the 
> semantics ought to be -- so it's not obvious when they do a substitution 
> whether or not the substitution of one reg for another is actually valid.  
> It's important to remember that sometimes when we substitute one register for 
> another, we don't have any contextual information about source vs dest -- 
> it's a long standing wart that causes problems in other cases as well.
> 
> That punts the problem to the backends and the H8 actually tries to deal with 
> this restriction.  Basically in the movxx pattern conditions, when the 
> destination uses an autoinc addressing mode, the pattern's condition will 
> check that the source register is different.  I would expect other ports 
> likely to do something similar.
> 
> But that approach falls down with reload/lra doing substitutions without 
> validating the result.  I guess it might be possible to cobble together 
> something with secondary reloads, but it's way way way down on my todo list.

Aren't the constraints enforced?  My experience is that I was getting these bad 
addressing modes in some test programs, and that the constraints I created to 
make the requirement explicit cured that.  Maybe I'm expecting too much from 
constraints, but my (admittedly inexperienced) understanding of them is that 
they inform reload what sort of things it can construct, and what it cannot.

If reload obeys the constraints in the patterns then the back end machine 
definition can be written to avoid the problematic cases, and it is no longer 
necessary to have a general (and as I pointed out, overly broad) rule in 
generic code.

paul



Re: [PATCH] Always enable LRA

2022-10-14 Thread Segher Boessenkool
On Fri, Oct 14, 2022 at 03:20:40PM +0900, Takayuki 'January June' Suwa wrote:
> On 2022/10/14 8:56, Segher Boessenkool wrote:
> > And finally, xtensa does
> > /home/segher/src/gcc/libgcc/libgcc2.c:840:1: error: insn does not satisfy 
> > its constraints:
> >   840 | }
> >   | ^
> > (insn 8 7 9 2 (set (reg:SI 9 a9 [57])
> > (const_int 1431655765 [0x])) 
> > "/home/segher/src/gcc/libgcc/libgcc2.c":828:21 37 {movsi_internal}
> >  (expr_list:REG_EQUIV (const_int 1431655765 [0x])
> > (nil)))
> > during RTL pass: postreload
> > /home/segher/src/gcc/libgcc/libgcc2.c:840:1: internal compiler error: in 
> > extract_constrain_insn, at recog.cc:2692
> 
> This is a result of knowing that Reload is tolerant of out-of-constraint 
> constants.
> LRA support needs to be taken care of before that, ie. in the "split1" pass.
> Excuse me in haste.

So old reload did a split itself?  Or it left it to say the split2 pass?

Thanks for looking into it!


Segher


Re: [PATCH] Always enable LRA

2022-10-14 Thread Segher Boessenkool
On Fri, Oct 14, 2022 at 12:36:47AM +, Koning, Paul wrote:
> I guess I'll have to look harder to see if it's possible to make LRA handle 
> CISC addressing modes like memory indirect, autoincrement, autodecrement, and 
> others that the old reload handles at least somewhat.  Ideally LRA should do 
> a better job; right now I believe it doesn't really do these things at all.  
> Targets like pdp11 and vax would like these.

So what does it do now?  Break every more complex addressing mode apart
again?  Or ICE?  Or something in between?


Segher


Re: [PATCH] Always enable LRA

2022-10-14 Thread Jeff Law via Gcc-patches



On 10/14/22 06:37, Koning, Paul wrote:



On Oct 13, 2022, at 9:07 PM, Jeff Law via Gcc-patches  
wrote:


On 10/13/22 17:56, Segher Boessenkool wrote:

h8300 fails during GCC build:
/home/segher/src/gcc/libgcc/unwind.inc: In function 
'_Unwind_SjLj_RaiseException':
/home/segher/src/gcc/libgcc/unwind.inc:141:1: error: could not split insn
   141 | }
   | ^
(insn 69 256 327 (set (mem/f:SI (pre_dec:SI (reg/f:SI 7 sp)) [12  S4 A32])
 (reg/f:SI 7 sp)) "/home/segher/src/gcc/libgcc/unwind.inc":118:12 19 
{*movsi}
  (expr_list:REG_ARGS_SIZE (const_int 4 [0x4])
 (nil)))
during RTL pass: final
which looks like a backend bug, I don't see a pattern that could split
this (without needing an extra clobber)?

I'm aware of this -- its invalid RTL:

Uses of the register outside of an address are not permitted within the
same insn as a use in an embedded side effect expression because such
insns behave differently on different machines and hence must be treated
as ambiguous and disallowed.

I had a bit of a fight with this sort of thing in pdp11, where in fact such operations are executed 
differently on different machine models.  The solution I picked is to create two sets of machine-specific 
constraint codes, one for "register N" and the other for "autoinc/dec of any register 
other than N" and pairing those.  (You can see this in pdp11.md, the mov definition.)


I've long suspected the pdp11 was the inspiration for this restriction 
(I have memories of noting it before I relocated to Utah, so circa 
1992).  The key problem is the generic parts of the compiler don't know 
what the semantics ought to be -- so it's not obvious when they do a 
substitution whether or not the substitution of one reg for another is 
actually valid.  It's important to remember that sometimes when we 
substitute one register for another, we don't have any contextual 
information about source vs dest -- it's a long standing wart that 
causes problems in other cases as well.


That punts the problem to the backends and the H8 actually tries to deal 
with this restriction.  Basically in the movxx pattern conditions, when 
the destination uses an autoinc addressing mode, the pattern's condition 
will check that the source register is different.  I would expect other 
ports likely to do something similar.


But that approach falls down with reload/lra doing substitutions without 
validating the result.  I guess it might be possible to cobble together 
something with secondary reloads, but it's way way way down on my todo list.


And yes, this case where the autoinc is on the destination works 
consistently on the H8 as well.  We could consider loosening the 
restrictions and let this through.  It's certainly simpler as it's a doc 
change and removing a bit of code on the H8.  It sounds like the pdp11 
already assumes that case is valid.



Jeff



Re: [PATCH] Always enable LRA

2022-10-14 Thread Koning, Paul via Gcc-patches



> On Oct 13, 2022, at 9:07 PM, Jeff Law via Gcc-patches 
>  wrote:
> 
> 
> On 10/13/22 17:56, Segher Boessenkool wrote:
>> 
>> h8300 fails during GCC build:
>> /home/segher/src/gcc/libgcc/unwind.inc: In function 
>> '_Unwind_SjLj_RaiseException':
>> /home/segher/src/gcc/libgcc/unwind.inc:141:1: error: could not split insn
>>   141 | }
>>   | ^
>> (insn 69 256 327 (set (mem/f:SI (pre_dec:SI (reg/f:SI 7 sp)) [12  S4 A32])
>> (reg/f:SI 7 sp)) "/home/segher/src/gcc/libgcc/unwind.inc":118:12 19 
>> {*movsi}
>>  (expr_list:REG_ARGS_SIZE (const_int 4 [0x4])
>> (nil)))
>> during RTL pass: final
>> which looks like a backend bug, I don't see a pattern that could split
>> this (without needing an extra clobber)?
> 
> I'm aware of this -- its invalid RTL:
> 
> Uses of the register outside of an address are not permitted within the
> same insn as a use in an embedded side effect expression because such
> insns behave differently on different machines and hence must be treated
> as ambiguous and disallowed.

I had a bit of a fight with this sort of thing in pdp11, where in fact such 
operations are executed differently on different machine models.  The solution 
I picked is to create two sets of machine-specific constraint codes, one for 
"register N" and the other for "autoinc/dec of any register other than N" and 
pairing those.  (You can see this in pdp11.md, the mov definition.)

But the pdp11 case is actually not as restrictive as the rule you mentioned.  
The problem case is register N source, autoinc/dec rN destination.  The 
opposite case, which we see here -- autoinc/dec Rn source, Rn destination -- is 
just fine.  Perhaps not all that important, but the ISA definition does not 
object to it.  So I'm not sure why there would be a general rule that says it's 
considered ambiguous when the target machine architecture says it is not.

paul




Re: [PATCH] Always enable LRA

2022-10-13 Thread Jeff Law via Gcc-patches



On 10/13/22 17:56, Segher Boessenkool wrote:


h8300 fails during GCC build:
/home/segher/src/gcc/libgcc/unwind.inc: In function 
'_Unwind_SjLj_RaiseException':
/home/segher/src/gcc/libgcc/unwind.inc:141:1: error: could not split insn
   141 | }
   | ^
(insn 69 256 327 (set (mem/f:SI (pre_dec:SI (reg/f:SI 7 sp)) [12  S4 A32])
 (reg/f:SI 7 sp)) "/home/segher/src/gcc/libgcc/unwind.inc":118:12 19 
{*movsi}
  (expr_list:REG_ARGS_SIZE (const_int 4 [0x4])
 (nil)))
during RTL pass: final
which looks like a backend bug, I don't see a pattern that could split
this (without needing an extra clobber)?


Really smells like an LRA bug to me.


We have this as we leave IRA:


(insn 10 9 11 2 (set (reg/f:SI 30)
   (plus:SI (reg/f:SI 11 fp)
   (const_int -4 [0xfffc]))) "j.c":31:7 264 {*addsi}
(expr_list:REG_EQUIV (plus:SI (reg/f:SI 11 fp)
   (const_int -4 [0xfffc]))
(nil)))
(insn 11 10 12 2 (set (mem/f:SI (pre_dec:SI (reg/f:SI 7 sp)) [3  S4 A32])
   (reg/f:SI 30)) "j.c":31:7 19 {*movsi}
(expr_list:REG_DEAD (reg/f:SI 30)
   (expr_list:REG_ARGS_SIZE (const_int 4 [0x4])
   (nil


And as we leave LRA:

(note 10 9 11 2 NOTE_INSN_DELETED)
(insn 11 10 13 2 (set (mem/f:SI (pre_dec:SI (reg/f:SI 7 sp)) [3 S4 A32])
    (reg/f:SI 7 sp)) "j.c":31:7 19 {*movsi}
 (expr_list:REG_ARGS_SIZE (const_int 4 [0x4])
    (nil)))


Register elimination ultimately discovered that (reg 30) was the same as 
the stack pointer and did the natural substitution.    The natural 
substitution results in invalid RTL and there's really no way to back 
out and do something different AFAICT in lra-eliminations.cc.


The only reason we fault is because the H8 backend knows this is invalid 
RTL and refuses to split it.  If we were to try and re-recognize the 
insn in question it would fail to recognize after the substitution 
because the auto-inc'd operand overlaps with the other operand.


Jeff


Re: [PATCH] Always enable LRA

2022-10-13 Thread Jeff Law via Gcc-patches



On 10/13/22 17:56, Segher Boessenkool wrote:


h8300 fails during GCC build:
/home/segher/src/gcc/libgcc/unwind.inc: In function 
'_Unwind_SjLj_RaiseException':
/home/segher/src/gcc/libgcc/unwind.inc:141:1: error: could not split insn
   141 | }
   | ^
(insn 69 256 327 (set (mem/f:SI (pre_dec:SI (reg/f:SI 7 sp)) [12  S4 A32])
 (reg/f:SI 7 sp)) "/home/segher/src/gcc/libgcc/unwind.inc":118:12 19 
{*movsi}
  (expr_list:REG_ARGS_SIZE (const_int 4 [0x4])
 (nil)))
during RTL pass: final
which looks like a backend bug, I don't see a pattern that could split
this (without needing an extra clobber)?


I'm aware of this -- its invalid RTL:

Uses of the register outside of an address are not permitted within the
same insn as a use in an embedded side effect expression because such
insns behave differently on different machines and hence must be treated
as ambiguous and disallowed.


I'd actually tried to turn on LRA for the H8 port a little while ago and 
stumbled over it.


I'm aware of a similar situation involving a general register on the H8, 
but using reload instead of LRA.  I looked at it a while back and my 
recollection was that the insn was actually fine until reload got its 
grubby hands on it.  And when I wandered reload to hunt for anything 
which handled the restriction noted above, I didn't find anything.  If 
you were to search for H8 bugs in bugzilla, it should be discoverable.


While we could potentially work around this in the backend, it'd be a 
hack at best.  It hasn't risen to the top of my priority list yet.  I 
considered suggesting we change this from "invalid" to "target defined" 
behavior, but that felt like a cop-out.



jeff



Re: [PATCH] Always enable LRA

2022-10-13 Thread Koning, Paul via Gcc-patches



> On Oct 13, 2022, at 7:56 PM, Segher Boessenkool  
> wrote:
> 
> This small patch changes everything that checks targetm.lra_p behave as
> if it returned true.
> 
> It has no effect on any primary or secondary target.  It also is fine
> for nds32 and for nios2, and it works fine for microblaze (which used
> old reload before), resulting in smaller code.
> 
> I have patches to completely rip out old reload, and more stuff after
> that, but of course not everything is nice yet:

I guess I'll have to look harder to see if it's possible to make LRA handle 
CISC addressing modes like memory indirect, autoincrement, autodecrement, and 
others that the old reload handles at least somewhat.  Ideally LRA should do a 
better job; right now I believe it doesn't really do these things at all.  
Targets like pdp11 and vax would like these.

paul




[PATCH] Always enable LRA

2022-10-13 Thread Segher Boessenkool
This small patch changes everything that checks targetm.lra_p behave as
if it returned true.

It has no effect on any primary or secondary target.  It also is fine
for nds32 and for nios2, and it works fine for microblaze (which used
old reload before), resulting in smaller code.

I have patches to completely rip out old reload, and more stuff after
that, but of course not everything is nice yet:

It makes a few targets no longer build though.  In my testing (of all
linux targets that built before) these are alpha, c6x, h8300, m68k,
32-bit parisc, sh, and xtensa.

alpha builds the compiler, but it then crashes with
/home/segher/src/kernel/drivers/tty/serial/serial_core.c:1029:1: internal 
compiler error: maximum number of generated reload insns per insn achieved (90)
(and in three more files) which can mean anything unfortunately.

c6x is more exciting:
/home/segher/src/kernel/fs/char_dev.c:598:1: internal compiler error: in 
priority, at haifa-sched.cc:1597
which is
  /* We should not be interested in priority of an already scheduled insn.  */
  gcc_assert (QUEUE_INDEX (insn) != QUEUE_SCHEDULED);

h8300 fails during GCC build:
/home/segher/src/gcc/libgcc/unwind.inc: In function 
'_Unwind_SjLj_RaiseException':
/home/segher/src/gcc/libgcc/unwind.inc:141:1: error: could not split insn
  141 | }
  | ^
(insn 69 256 327 (set (mem/f:SI (pre_dec:SI (reg/f:SI 7 sp)) [12  S4 A32])
(reg/f:SI 7 sp)) "/home/segher/src/gcc/libgcc/unwind.inc":118:12 19 
{*movsi}
 (expr_list:REG_ARGS_SIZE (const_int 4 [0x4])
(nil)))
during RTL pass: final
which looks like a backend bug, I don't see a pattern that could split
this (without needing an extra clobber)?

m68k builds the compiler fine, but then has
/home/segher/src/kernel/fs/squashfs/namei.c: In function 'squashfs_lookup':
/home/segher/src/kernel/fs/squashfs/namei.c:232:1: error: insn does not satisfy 
its constraints:
  232 | }
  | ^
(insn 270 470 271 30 (set (mem:SI (pre_dec:SI (reg/f:SI 15 %sp)) [1  S4 A16])
(plus:SI (sign_extend:SI (reg:HI 8 %a0 [175]))
(reg:SI 2 %d2 [orig:173 val ] [173]))) 
"/home/segher/src/kernel/fs/squashfs/namei.c":212:13 77 {pushasi}
 (expr_list:REG_ARGS_SIZE (const_int 4 [0x4])
(nil)))
during RTL pass: postreload
/home/segher/src/kernel/fs/squashfs/namei.c:232:1: internal compiler error: in 
extract_constrain_insn, at recog.cc:2692
(and similar in two more files).

32-bit parisc now runs into the 90 reloads thing (parisc64 already did
without the patch).

sh doesn't build GCC:
/home/segher/src/gcc/libgcc/libgcc2.c: In function '__divdc3':
/home/segher/src/gcc/libgcc/libgcc2.c:2182:1: error: unable to generate reloads 
for:
 2182 | }
  | ^
(call_insn/u 132 131 1855 7 (parallel [
(set (reg:SI 0 r0)
(call (mem:SI (symbol_ref:SI ("__ltdf2") [flags 0x41] 
) [0  S4 A32])
(const_int 0 [0])))
(use (reg:SI 154 fpscr0))
(use (reg:SI 12 r12))
(clobber (reg:SI 146 pr))
(clobber (reg:SI 758))
]) "/home/segher/src/gcc/libgcc/libgcc2.c":2082:7 233 {call_value_pcrel}
 (expr_list:REG_DEAD (reg:DF 6 r6)
(expr_list:REG_DEAD (reg:DF 4 r4)
(expr_list:REG_CALL_DECL (symbol_ref:SI ("__ltdf2") [flags 0x41] 
)
(expr_list:REG_EH_REGION (const_int -2147483648 
[0x8000])
(nil)
(expr_list (use (reg:DF 6 r6))
(expr_list (use (reg:DF 4 r4))
(nil
during RTL pass: reload

And finally, xtensa does
/home/segher/src/gcc/libgcc/libgcc2.c:840:1: error: insn does not satisfy its 
constraints:
  840 | }
  | ^
(insn 8 7 9 2 (set (reg:SI 9 a9 [57])
(const_int 1431655765 [0x])) 
"/home/segher/src/gcc/libgcc/libgcc2.c":828:21 37 {movsi_internal}
 (expr_list:REG_EQUIV (const_int 1431655765 [0x])
(nil)))
during RTL pass: postreload
/home/segher/src/gcc/libgcc/libgcc2.c:840:1: internal compiler error: in 
extract_constrain_insn, at recog.cc:2692

 - ~ - ~ -

All in all, more worked than expected.  I think it is time to finally
make this switch.

I did not test targets without a linux port.  I build the linux kernel
as well, to see if the resulting compiler actually works :-)
Suggestions what to use for other targets are welcome.

Is there any way to get the final targets updated for LRA?

Other rants?  :-)


Segher

---
 gcc/auto-inc-dec.cc | 2 +-
 gcc/combine.cc  | 2 +-
 gcc/ira-lives.cc| 5 -
 gcc/ira.cc  | 2 +-
 gcc/recog.cc| 2 +-
 gcc/targhooks.cc| 4 
 6 files changed, 4 insertions(+), 13 deletions(-)

diff --git a/gcc/auto-inc-dec.cc b/gcc/auto-inc-dec.cc
index 481e7af..0186c17 100644
--- a/gcc/auto-inc-dec.cc
+++ b/gcc/auto-inc-dec.cc
@@ -1443,7 +1443,7 @@ merge_in_block (int max_reg, basic_block bb)
 
   /* Reload should handle auto-inc within a jump correctly, while LRA
 is known to have issues with autoinc.  */
-  if (JUMP_P