Re: split_live_ranges_for_shrink_wrap and !SHRINK_WRAPPING_ENABLED?

Denis Chertykov Sat, 23 Apr 2016 01:33:07 -0700

2016-04-18 11:53 GMT+03:00 Senthil Kumar Selvaraj
<senthil_kumar.selva...@atmel.com>:
> Hi,
>
>   While tracking down a performance regression for the AVR target from
>   4.9.x to trunk, I noticed that failing the SHRINK_WRAPPING_ENABLED
>   check in ira.c led to noticeably worse code for the following
>   case. That check prevents live range splitting of pseudos containing
>   formal args, and between 4.9.x and now, the check was modified from
>   just flag_shrink_wrap to now flag_shrink_wrap && targetm.have_simple_return.
>
> #include <stdint.h>
>
> extern int bar(uint32_t, uint16_t);
> extern int baz(void);
>
> int foo(uint8_t x, uint32_t y, uint16_t z)
> {
>    uint8_t status;
>    switch(x)
>    {
>       case 0:
>         status = bar(y, z);
>         if (status == 0)
>            status = baz();
>         break;01
>
>    }
>    return status;
> }
>
> If the live range splitting of pseudos containing formal args is not
> done in IRA, then reload uses callee-saved registers (SI r12), resulting in
> extra saves and restores in the prologue/epilogue.
>
> <snip>
> (insn 3 2 5 2 (set (reg/v:SI 12 r12 [orig:47 y ] [47])
>         (reg:SI 20 r20 [ y ])) ../ctrl_access.c:7 95 {*movsi}
>      (nil))
> (note 5 3 8 2 NOTE_INSN_FUNCTION_BEG)
> (insn 8 5 9 2 (set (cc0)
>         (compare (reg:QI 24 r24 [ x ])
>             (const_int 0 [0]))) ../ctrl_access.c:9 404 {*cmpqi}
>      (nil))
> (jump_insn 9 8 13 2 (set (pc)
>         (if_then_else (ne (cc0)
>                 (const_int 0 [0]))
>             (label_ref:HI 25)
>             (pc))) ../ctrl_access.c:9 428 {branch}
>      (int_list:REG_BR_PROB 6102 (nil))
>  -> 25)
> (note 13 9 14 3 [bb 3] NOTE_INSN_BASIC_BLOCK)
> (insn 14 13 15 3 (set (reg:HI 20 r20)
>         (reg/v:HI 18 r18 [orig:48 z ] [48])) ../ctrl_access.c:12 83 {*movhi}
>      (nil))
> (insn 15 14 16 3 (set (reg:SI 22 r22)
>         (reg/v:SI 12 r12 [orig:47 y ] [47])) ../ctrl_access.c:12 95 {*movsi}
>      (nil))
> </snip>
>
> leading to code like
> <snip>
> push r12
> push r13
> push r14
> push r15
> /* prologue: function */
> /* frame size = 0 */
> /* stack size = 4 */
> .L__stack_usage = 4
> movw r12,r20
> movw r14,r22
> cpse r24,__zero_reg__
> rjmp .L2
> movw r20,r18
> movw r24,r14
> movw r22,r12
> call bar
> </snip>
>
> Whereas in 4.9.2, live range splitting allows reload to generate far
> better RTL, like so
>
> <snip>
> (insn 8 5 9 2 (set (cc0)
>         (compare (reg:QI 24 r24 [ x ])
>             (const_int 0 [0]))) ../ctrl_access.c:9 404 {*cmpqi}
>      (nil))
> (jump_insn 9 8 13 2 (set (pc)
>         (if_then_else (ne (cc0)
>                 (const_int 0 [0]))
>             (label_ref:HI 25)
>             (pc))) ../ctrl_access.c:9 428 {branch}
>      (int_list:REG_BR_PROB 6102 (nil))
>  -> 25)
> (note 13 9 42 3 [bb 3] NOTE_INSN_BASIC_BLOCK)
> (insn 42 13 14 3 (set (reg/v:SI 22 r22 [orig:48 y ] [48])
>         (reg/v:SI 20 r20 [orig:48 y ] [48])) 95 {*movsi}
>      (nil))
> (insn 14 42 16 3 (set (reg:HI 20 r20)
>         (reg/v:HI 18 r18 [orig:49 z ] [49])) ../ctrl_access.c:12 83 {*movhi}
>      (nil))
> </snip>
>
> leading to code like
> <snip>
> cpse r24,__zero_reg__
> rjmp .L2
> movw r24,r22
> movw r22,r20
> movw r20,r18
> call bar
> </snip>
>
> In either case, shrink wrapping isn't done in the end, but the live
> range splitting ends up helping reload, AFAICT.
>
> I verified that adding simple_return pattern to avr.md fixes the
> regression, but the code size increase if shrink wrapping
> really does happen would badly hurt a flash constrained target like the
> AVR. I also noticed that targets like sh and arm (for thumb) disable it when
> optimizing for size (using the condition part in simple_return pattern)
> - understandable.
>
> What do you guys think is the right way to deal with this? Should I look
> into why range splitting helps reload and have reload itself handle this?


It's a place for experiments.
Generally, live range splitting must be very useful for targets with
constrained register classes and for targets with small amount of
registers.

Denis.

Re: split_live_ranges_for_shrink_wrap and !SHRINK_WRAPPING_ENABLED?

Reply via email to