[Bug rtl-optimization/119174] [15 Regression] IRA allocating value live across a call to call clobbered register

2025-03-10 Thread law at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119174

Jeffrey A. Law  changed:

   What|Removed |Added

   Priority|P1  |P4

--- Comment #14 from Jeffrey A. Law  ---
Thanks for reminding me of the basics of caller-saves.  Andrew and you are
correct.  Downgrading to P4 since this isn't going to hit LRA targets.

[Bug rtl-optimization/119174] [15 Regression] IRA allocating value live across a call to call clobbered register

2025-03-10 Thread rsandifo at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119174

--- Comment #13 from Richard Sandiford  ---
(In reply to Jeffrey A. Law from comment #11)
> Andrew.  You're missing the point.  This scenario isn't the kind of thing
> that reload and LRA are supposed to fix.  They fix constraint problems.  ie,
> I got the wrong kind of register (wrong register file), or I can't really
> handlea  constant, so reload the value into a register, etc.
> 
> We have a pseudo live across a call and IRA allocates that pseudo to a call
> clobbered hard register.  That's an IRA bug in my book.
I think Andrew's comment 10 is right.  Using call-clobbered registers for
values that are live across a call is a deliberate choice, and like Andrew says
is what -fcaller-saves is designed to do.  All the recent patch did is tweak
the costs around doing that.

For LRA, the job of handling caller saves is integrated into LRA itself.  For
reload, it's instead handled by caller-save.cc.  And I think that's the
problem.  All caller-save.cc knows how to do is to store a pseudo register
before the call and restore the register before its next use (or at the end of
the block, whichever is sooner).  The increment emitted by reload will then be
lost.

So I suppose the tailored fix would be to avoid allocating a call-clobbered
register to a pseudo P if reload is being used and if P is mentioned in a
REG_INC note on a call.  But it's probably easiest just to disable
-fcaller-saves for reload targets instead.

[Bug rtl-optimization/119174] [15 Regression] IRA allocating value live across a call to call clobbered register

2025-03-09 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119174

--- Comment #12 from Andrew Pinski  ---
(In reply to Jeffrey A. Law from comment #11)
> Andrew.  You're missing the point.  This scenario isn't the kind of thing
> that reload and LRA are supposed to fix.  They fix constraint problems.  ie,
> I got the wrong kind of register (wrong register file), or I can't really
> handlea  constant, so reload the value into a register, etc.
> 
> We have a pseudo live across a call and IRA allocates that pseudo to a call
> clobbered hard register.  That's an IRA bug in my book.

Then it is wrong even for aarch64 which allocates r1 to the pseduo case:
  Popping a3(r109,l0)  -- assign reg 1

I am trying to point out that it looks like LRA can fix this up (reload can do
it too in some but not all cases as shown by this bug). I do think it is wrong
at least for a missed optimization case where we need now to spill the register
to the stack across the call.

I see -fno-caller-saves fixes it too.

[Bug rtl-optimization/119174] [15 Regression] IRA allocating value live across a call to call clobbered register

2025-03-09 Thread law at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119174

--- Comment #11 from Jeffrey A. Law  ---
Andrew.  You're missing the point.  This scenario isn't the kind of thing that
reload and LRA are supposed to fix.  They fix constraint problems.  ie, I got
the wrong kind of register (wrong register file), or I can't really handlea 
constant, so reload the value into a register, etc.

We have a pseudo live across a call and IRA allocates that pseudo to a call
clobbered hard register.  That's an IRA bug in my book.

[Bug rtl-optimization/119174] [15 Regression] IRA allocating value live across a call to call clobbered register

2025-03-09 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119174

--- Comment #10 from Andrew Pinski  ---
So I think the wrong code is due to reload NOT being able to handle:
(mem/f:HI (post_inc:HI (reg:HI 26 [ ivtmp.10 ])) [1 MEM[(int (*) (int)
*)_18]+0 S2 A16])


That is if you look at:
```
void f3(int x, int (*p3 []) (int))
{
  int i;
  int next = x;
  for (i = 0; i < x; i++)
next = p3[0](next);
}

```

IRA produces:
```
  Popping a3(r30,l0)  -- assign reg 13
```

For:
```
(call_insn 14 13 28 3 (set (reg:HI 12 R12)
(call:HI (mem:HI (mem/f:HI (reg/f:HI 30 [ p3 ]) [1 *p3_7(D)+0 S2 A16])
[0 *_1 S2 A16])
(const_int 0 [0]))) "t.c":7:12 89 {call_value_internal}
 (expr_list:REG_CALL_DECL (nil)
(nil))
(expr_list:HI (use (reg:HI 12 R12))
(nil)))

```

Which reload is able to handle:
```
.L2:
CMP.W   2(R1), R10 { JL .L3
; start of epilogue
ADD.W   #4, R1
POPM.W  #1, r10
RET
.L3:
MOV.W   R13, @R1 ;;;<<< store
MOV.W   @R13, R14
CALLR14
ADD.W   #1, R10
MOV.W   @R1, R13 ;; << load
BR  #.L2
```


BUT then again this is less than optimal here. because we are doing the
save/restore of r13 around the function call.

[Bug rtl-optimization/119174] [15 Regression] IRA allocating value live across a call to call clobbered register

2025-03-09 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119174

Andrew Pinski  changed:

   What|Removed |Added

 Ever confirmed|0   |1
   Last reconfirmed||2025-03-09
   Keywords||missed-optimization
 Status|UNCONFIRMED |NEW

--- Comment #9 from Andrew Pinski  ---
Note the same effect can be seen on aarch64 (though not the wrong code part due
to reload vs LRA):

  Popping a3(r109,l0)  -- assign reg 1

And then getting:
.L2:
cmp w20, w19
bgt .L3
ldp x19, x20, [sp, 16]
ldp x29, x30, [sp], 48
ret
.L3:
.cfi_restore_state
ldr x2, [x1, x19, lsl 3]
str x1, [sp, 40]
add x19, x19, 1
blr x2
ldr x1, [sp, 40]
b   .L2

See the spill around the call.

[Bug rtl-optimization/119174] [15 Regression] IRA allocating value live across a call to call clobbered register

2025-03-09 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119174

--- Comment #8 from Andrew Pinski  ---
(In reply to Andrew Pinski from comment #7)
> Commenting out TARGET_LRA_P in msp430.cc and gets us code that works:
> ```
> .L2:
> CMP.W   R9, R10 { JL.L3
> ; start of epilogue
> ADD.W   #2, R1
> POPM.W  #2, r10
> RET
> .L3:
> MOV.W   @R13+, R14
> MOV.W   R13, @R1
> CALLR14
> ADD.W   #1, R10
> MOV.W   @R1, R13
> BR  #.L2
> 
> ```
> 
> Not the best code as it should have been in R1 in the first place but code
> that definitely works.

Wait `MOV.W   R13, @R1` is a store and `MOV.W   @R1, R13` is also a store. so
my comment about being in R1 is incorrect but that is definitely doing the
right thing and spilling it into a stack location across the call.

[Bug rtl-optimization/119174] [15 Regression] IRA allocating value live across a call to call clobbered register

2025-03-09 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119174

--- Comment #7 from Andrew Pinski  ---
Commenting out TARGET_LRA_P in msp430.cc and gets us code that works:
```
.L2:
CMP.W   R9, R10 { JL.L3
; start of epilogue
ADD.W   #2, R1
POPM.W  #2, r10
RET
.L3:
MOV.W   @R13+, R14
MOV.W   R13, @R1
CALLR14
ADD.W   #1, R10
MOV.W   @R1, R13
BR  #.L2

```

Not the best code as it should have been in R1 in the first place but code that
definitely works.

[Bug rtl-optimization/119174] [15 Regression] IRA allocating value live across a call to call clobbered register

2025-03-09 Thread law at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119174

--- Comment #6 from Jeffrey A. Law  ---
I think the mcore failure was graphite/interchange-0.

On the topic of reload vs LRA.  The code is broken before IRA hands off to
LRA/reload in the msp430 case.

[Bug rtl-optimization/119174] [15 Regression] IRA allocating value live across a call to call clobbered register

2025-03-09 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119174

--- Comment #5 from Andrew Pinski  ---
(In reply to Andrew Pinski from comment #3)
> (In reply to Jeffrey A. Law from comment #0)
> > I've also bisected an mcore-elf regression to the same change, but have not
> > debugged it as the testcase is more complex and GDB port for mcore faults,
> > so debugging is nontrivial.
> 
> It would be useful to list which testcases fail on mcore-elf at least.

Note mcore also uses reload too (PR 113940).

I wonder if there is some bad interaction between reload and IRA after this
patch which using LRA fixes.

[Bug rtl-optimization/119174] [15 Regression] IRA allocating value live across a call to call clobbered register

2025-03-09 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119174

Andrew Pinski  changed:

   What|Removed |Added

   See Also||https://gcc.gnu.org/bugzill
   ||a/show_bug.cgi?id=113942

--- Comment #4 from Andrew Pinski  ---
Hmm, msp430 still uses reload, I wonder if the problem is in reload rather than
IRA.

[Bug rtl-optimization/119174] [15 Regression] IRA allocating value live across a call to call clobbered register

2025-03-09 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119174

--- Comment #3 from Andrew Pinski  ---
(In reply to Jeffrey A. Law from comment #0)
> I've also bisected an mcore-elf regression to the same change, but have not
> debugged it as the testcase is more complex and GDB port for mcore faults,
> so debugging is nontrivial.

It would be useful to list which testcases fail on mcore-elf at least.

[Bug rtl-optimization/119174] [15 Regression] IRA allocating value live across a call to call clobbered register

2025-03-09 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119174

Andrew Pinski  changed:

   What|Removed |Added

   Keywords||ra

--- Comment #1 from Andrew Pinski  ---
It seems interesting to have a mem inside a mem for the call instruction.
Also maybe the post_inc is what it is getting confused here .

[Bug rtl-optimization/119174] [15 Regression] IRA allocating value live across a call to call clobbered register

2025-03-09 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119174

--- Comment #2 from Andrew Pinski  ---
(In reply to Andrew Pinski from comment #1)
> It seems interesting to have a mem inside a mem for the call instruction.
> Also maybe the post_inc is what it is getting confused here .

Maybe that is correct here. the constraints support a mem there.

[Bug rtl-optimization/119174] [15 Regression] IRA allocating value live across a call to call clobbered register

2025-03-09 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119174

Sam James  changed:

   What|Removed |Added

   Keywords||wrong-code
   Target Milestone|--- |15.0

[Bug rtl-optimization/119174] [15 Regression] IRA allocating value live across a call to call clobbered register

2025-03-09 Thread law at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119174

Jeffrey A. Law  changed:

   What|Removed |Added

 Target||msp430-elf
 CC||hjl at gcc dot gnu.org,
   ||rsandifo at gcc dot gnu.org
   Priority|P3  |P1