Re: LRA for avr: Arithmetic on stack pointer

2023-08-09 Thread Vladimir Makarov via Gcc



On 8/9/23 07:15, senthilkumar.selva...@microchip.com wrote:

Hi,

   After Vlad fixed an elimination issue in
https://gcc.gnu.org/git?p=gcc.git;a=commit;h=2971ff7b1d564ac04b537d907c70e6093af70832,
   I turned on FP -> SP elimination and am now running into a reload failure
   when arithmetic is done on SP.

   For a call to a vararg function, the avr target pushes the args onto the stack,
   calls the function, and then adjusts SP back to where it was before the
   args were pushed.

   So for code like

extern int foo(int, ...);
int bar(void) {
   long double l = 1.2345E6;
   foo(0, l);
   return 0;
}

With some effort, I reproduced this problem.

   and

$ avr-gcc -mmcu=avr51 -Os ../20031208-1.c
   

...



   I guess the condition exists to ensure sp_off is always correct? Considering
   LRA already handles post_dec of SP just fine, perhaps it can allow RTX like


This is very old code, from a time when LRA elimination was quite constrained.


(set (reg/f:HI 32 __SP_L__)
  (plus:HI (reg/f:HI 32 __SP_L__)
   (const_int 10 [0xa]))) "../20031208-1.c":5:10 discrim 1 165 {*addhi3_split}

   as long as the PLUS/MINUS is by a constant, and update sp_off accordingly?

   Or is there something the avr target has to do differently?


I think we can permit stack pointer output reloads.  The only thing we
need is to update the sp offset accurately for the original and reload
insns.  I'll try to make a patch this week.
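
As a rough illustration of the bookkeeping being discussed, here is a toy model in C of how an SP offset could be tracked across pushes and a constant SP adjustment such as insn 21. This is only a sketch and not LRA code: every name in it (toy_kind, toy_insn, toy_apply, toy_track) is hypothetical, and the sign convention (the offset goes negative as bytes are pushed) is chosen to agree with the sp_off=-10 shown in the LRA dump in the original message.

```c
/* Toy model only: hypothetical names, not LRA's data structures. */
enum toy_kind {
  TOY_PUSH,          /* store through (post_dec SP): SP moves down */
  TOY_SP_ADD_CONST,  /* (set SP (plus SP (const_int N))) */
  TOY_OTHER          /* anything that leaves SP alone */
};

struct toy_insn {
  enum toy_kind kind;
  long value;        /* bytes pushed, or the constant N; unused for TOY_OTHER */
};

/* Apply one insn to the running offset of SP, measured from its
   value before the argument pushes started.  */
static long toy_apply(long sp_off, struct toy_insn insn)
{
  switch (insn.kind) {
  case TOY_PUSH:         return sp_off - insn.value;
  case TOY_SP_ADD_CONST: return sp_off + insn.value;
  default:               return sp_off;
  }
}

/* Offset in effect after the first n insns of a sequence.  */
static long toy_track(const struct toy_insn *insns, int n)
{
  long sp_off = 0;
  for (int i = 0; i < n; i++)
    sp_off = toy_apply(sp_off, insns[i]);
  return sp_off;
}
```

Feeding toy_track ten one-byte TOY_PUSH entries gives -10, matching the sp_off=-10 reported at insn 21; appending a TOY_SP_ADD_CONST of 10 brings the offset back to 0, in line with the REG_ARGS_SIZE 0 note on that insn. A non-constant adjustment has no entry in this model, which is why the constant case is the easy one to permit.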





Re: LRA for avr: Arithmetic on stack pointer

2023-08-09 Thread Georg-Johann Lay




On 09.08.23 at 13:15, SenthilKumar.Selvaraj--- via Gcc wrote:

[...]
   I guess the condition exists to ensure sp_off is always correct? Considering
   LRA already handles post_dec of SP just fine, perhaps it can allow RTX like

(set (reg/f:HI 32 __SP_L__)
  (plus:HI (reg/f:HI 32 __SP_L__)
   (const_int 10 [0xa]))) "../20031208-1.c":5:10 discrim 1 165 {*addhi3_split}

   as long as the PLUS/MINUS is by a constant, and update sp_off accordingly?


Non-constant addends may also occur, e.g. with alloca-like code?
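
For illustration only, a minimal C function of the alloca-like kind in question might look as follows. It is a hypothetical test case, not one taken from this thread: the variable-length array makes the compiler adjust SP by an amount known only at run time. Whether such code actually reaches LRA as a PLUS/MINUS on SP with a register addend on avr, and whether it hits the same check, is not established here.

```c
#include <string.h>

/* Hypothetical example: the size of buf is only known at run time,
   so the stack adjustment for it cannot be a compile-time constant.  */
int sum_bytes(unsigned n, unsigned char fill)
{
  unsigned char buf[n ? n : 1];     /* VLA; at least one element */
  int sum = 0;

  memset(buf, fill, sizeof buf);    /* sizeof of a VLA is evaluated at run time */
  for (size_t i = 0; i < sizeof buf; i++)
    sum += buf[i];
  return sum;
}
```

A call such as sum_bytes(4, 2) allocates four bytes on the stack and returns 8.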


Johann



   Or is there something the avr target has to do differently?

Regards
Senthil


LRA for avr: Arithmetic on stack pointer

2023-08-09 Thread SenthilKumar.Selvaraj--- via Gcc
Hi,

  After Vlad fixed an elimination issue in
https://gcc.gnu.org/git?p=gcc.git;a=commit;h=2971ff7b1d564ac04b537d907c70e6093af70832,
  I turned on FP -> SP elimination and am now running into a reload failure
  when arithmetic is done on SP.

  For a call to a vararg function, the avr target pushes the args onto the stack,
  calls the function, and then adjusts SP back to where it was before the
  args were pushed.

  So for code like

extern int foo(int, ...);
int bar(void) {
  long double l = 1.2345E6;
  foo(0, l);
  return 0;
}

  and 

$ avr-gcc -mmcu=avr51 -Os ../20031208-1.c
  
  Reload sees


(insn 5 2 6 2 (set (reg:QI 44)
(const_int 65 [0x41])) "../20031208-1.c":4:3 86 {movqi_insn_split}
 (expr_list:REG_EQUIV (const_int 65 [0x41])
(nil)))
(insn 6 5 7 2 (set (mem:QI (post_dec:HI (reg/f:HI 32 __SP_L__)) [0  S1 A8])
(reg:QI 44)) "../20031208-1.c":4:3 1 {pushqi1}
 (expr_list:REG_DEAD (reg:QI 44)
(expr_list:REG_ARGS_SIZE (const_int 1 [0x1])
(nil
...

...
(call_insn 19 18 21 2 (parallel [
(set (reg:HI 24 r24)
(call (mem:HI (symbol_ref:HI ("foo") [flags 0x41]  
) [0 foo S2 A8])
(const_int 10 [0xa])))
(use (const_int 0 [0]))
]) "../20031208-1.c":4:3 776 {call_value_insn}
 (expr_list:REG_UNUSED (reg:HI 24 r24)
(expr_list:REG_CALL_DECL (symbol_ref:HI ("foo") [flags 0x41]  
)
(nil)))
(nil))
(insn 21 19 25 2 (set (reg/f:HI 32 __SP_L__)
(plus:HI (reg/f:HI 32 __SP_L__)
(const_int 10 [0xa]))) "../20031208-1.c":5:10 discrim 1 165 {*addhi3_split}
 (expr_list:REG_UNUSED (reg:QI 33 __SP_H__)
(expr_list:REG_ARGS_SIZE (const_int 0 [0])
(nil

  LRA doesn't pick any of the 4 alternatives for insn 21

Considering alt=0 of insn 21:   (0) =??r  (1) %0  (2) r
Staticly defined alt reject+=12
 Considering alt=1 of insn 21:   (0) d  (1) 0  (2) s
 Considering alt=2 of insn 21:   (0) !w  (1) 0  (2) IJYIJ
Staticly defined alt reject+=600
 Considering alt=3 of insn 21:   (0) d  (1) 0  (2) nYnn
 Considering alt=0 of insn 21:   (0) =??r  (1) %0  (2) r
Staticly defined alt reject+=12
 Considering alt=1 of insn 21:   (0) d  (1) 0  (2) s
 Considering alt=2 of insn 21:   (0) !w  (1) 0  (2) IJYIJ
Staticly defined alt reject+=600
 Considering alt=3 of insn 21:   (0) d  (1) 0  (2) nYnn

  whereas classic reload does.

Reloads for insn # 21
Reload 0: reload_in (HI) = (reg/f:HI 32 __SP_L__)
reload_out (HI) = (reg/f:HI 32 __SP_L__)
LD_REGS, RELOAD_OTHER (opnum = 0)
reload_in_reg: (reg/f:HI 32 __SP_L__)
reload_out_reg: (reg/f:HI 32 __SP_L__)
reload_reg_rtx: (reg:HI 24 r24)


  Digging through the code, I found lra-constraints.c:2687, which forbids
  output reload of SP, with a comment that says

  /* Never do output reload of stack pointer.  It makes
     impossible to do elimination when SP is changed in
     RTL.  */

   Just to check if that is indeed the problem, I commented out the check
   and the "goto fail;" that follow that comment, and reload and code
   generation worked fine.

Considering alt=0 of insn 21:   (0) =??r  (1) %0  (2) r
Staticly defined alt reject+=12
0 Non-pseudo reload: reject+=2
0 Small class reload: reject+=3
0 Non input pseudo reload: reject++
overall=24,losers=1 -- refuse
 Considering alt=1 of insn 21:   (0) d  (1) 0  (2) s
0 Non-pseudo reload: reject+=2
0 Small class reload: reject+=3
0 Non input pseudo reload: reject++
1 Matching alt: reject+=2
1 Non-pseudo reload: reject+=2
1 Small class reload: reject+=3
1 Non input pseudo reload: reject++
overall=26,losers=2 -- refuse
 Considering alt=2 of insn 21:   (0) !w  (1) 0  (2) IJYIJ
Staticly defined alt reject+=600
0 Non-pseudo reload: reject+=2
0 Non input pseudo reload: reject++
overall=609,losers=1 -- refuse
 Considering alt=3 of insn 21:   (0) d  (1) 0  (2) nYnn
0 Non-pseudo reload: reject+=2
0 Small class reload: reject+=3
0 Non input pseudo reload: reject++
1 Matching alt: reject+=2
1 Non-pseudo reload: reject+=2
1 Small class reload: reject+=3
1 Non input pseudo reload: reject++
overall=26,losers=2 -- refuse
  Choosing alt 3 in insn 21:  (0) d  (1) 0  (2) nYnn {*addhi3_split} (sp_off=-10)
  Creating newreg=50 from oldreg=32, assigning class LD_REGS to r50
   21: r50:HI=r50:HI+0xa
  REG_UNUSED __SP_H__:QI
  REG_ARGS_SIZE 0
Inserting insn reload before:
   29: r50:HI=__SP_L__:HI
Inserting insn reload after:
   30: __SP_L__:HI=r50:HI
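
The overall= figures in the dump above can be reproduced by summing the reject increments listed for an alternative and adding a fixed penalty per loser. The sketch below does just that; it is not GCC code, the names toy_overall and TOY_LOSER_COST_FACTOR are hypothetical, and the factor of 6 is an assumption inferred from the dump (it is the one value consistent with all four lines).

```c
/* Assumption: per-loser penalty inferred from the dump, not quoted from GCC.  */
#define TOY_LOSER_COST_FACTOR 6

/* Sum the reject increments of one alternative and add the loser penalty.  */
static int toy_overall(int losers, const int *rejects, int n)
{
  int reject = 0;
  for (int i = 0; i < n; i++)
    reject += rejects[i];
  return losers * TOY_LOSER_COST_FACTOR + reject;
}
```

For alt=0 the increments are 12, 2, 3 and 1 with one loser, giving 18 + 6 = 24. For alt=1 and alt=3 they are 2, 3, 1, 2, 2, 3 and 1 with two losers, giving 14 + 12 = 26. For alt=2 they are 600, 2 and 1 with one loser, giving 603 + 6 = 609.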

  eventually generating