On Mon, Mar 24, 2014 at 11:54:49AM +0000, Richard Biener wrote:
> On Mon, Mar 24, 2014 at 12:44 PM, James Greenhalgh
> <james.greenha...@arm.com> wrote:
> >
> > Hi,
> >
> > Consider this testcase:
> >
> >   extern void foo (int a, int b, int c, int d);
> >
> >   void
> >   bar (int b, int c, int d)
> >   {
> >       foo (3, b, c, d);
> >   }
> >
> > For many ABI's we will have on entry to the function
> >
> >   r0 = b
> >   r1 = c
> >   r2 = d
> >
> > If we process arguments to the call to foo left-to-right
> > (PUSH_ARGS_REVERSED == 0) we assign temporaries, and then hard registers
> > in this order:
> >
> >   t0 = 3
> >   t1 = r0
> >   t2 = r1
> >   t3 = r2
> >
> >   r0 = t0
> >   r1 = t1
> >   r2 = t2
> >   r3 = t3
> >
> > We can't eliminate any of these temporaries as their live ranges overlap.
> >
> > However, If we process the arguments right-to-left (PUSH_ARGS_REVERSED == 
> > 1),
> > we get this order:
> >
> >   t0 = 3
> >   t1 = r0
> >   t2 = r1
> >   t3 = r2
> >
> >   r3 = t3
> >   r2 = t2
> >   r1 = t1
> >   r0 = t0
> >
> > And then we can start eliminating temporaries we don't need and get:
> >
> >   r3 = r2
> >   r2 = r1
> >   r1 = r0
> >   r0 = 3
> >
> > Which is much neater.
> >
> > It seems to me that this would probably be of benefit on all targets.
> >
> > This would be an argument for setting PUSH_ARGS_REVERSED to 1 by default
> > for all targets. However, this would have a perceivable impact on argument
> > evaluation order (which is unspecified in C, but people have shown
> > sensitivity to it changing in the past and we suspect some people may
> > have built systems relying on the behviour... ho hum...).
> >
> > However, now that all side effects are handled when we are in SSA
> > form, it should be possible to refactor calls.c and expr.c as if
> > PUSH_ARGS_REVERSED were always defined to 1, without changing the
> > argument evaluation order in gimplify.c.
> >
> > In this way, we have the best of both worlds; we maintain argument
> > evaluation order and gain the optimizations given by removing
> > overlapping live ranges for parameters.
> >
> > I've bootstrapped and regression tested this on x86, ARM and AArch64 - but
> > I am well aware these are fairly standard targets, do you have any
> > suggestions or comments on which targets this change will affect?
> >
> > If not, is this OK for stage-1?
> 
> Maybe somebody remembers why we have both paths.  I'd rather
> make gimplification independent of this as well, choosing either
> variant.
> 
> Do you have any hard numbers to compare?  Say cc1 code size,
> when comparing PUSH_ARGS_REVERSED 1 vs 0?

Sure, here is the output of size for AArch64 and ARM. Both of these targets
have PUSH_ARGS_REVERSED == 0. i386 results are not interesting (on the order
of tens of bytes), as PUSH_ARGS_REVERSED == 1 for that target.

trunk is revision 208685
patched is revision 208685 with this patch applied.

AArch64:

   text    data     bss     dec     hex     filename
14546902   82424  781944 15411270  eb2846   trunk/gcc/cc1
14423510   82424  781944 15287878  e94646   patched/gcc/cc1
16721485   82480  805000 17608965  10cb105  trunk/gcc/cc1plus
16564613   82480  805000 17452093  10a4c3d  patched/gcc/cc1plus
724550     7904   80744  813198    c688e    trunk/gcc/gfortran
722606     7904   80744  811254    c60f6    patched/gcc/gfortran

ARM:

   text    data     bss     dec     hex     filename
15176835   31880  694132 15902847  f2a87f   trunk/gcc/cc1
15171939   31880  694124 15897943  f29557   patched/gcc/cc1
17255818   31916  706404 17994138  112919a  trunk/gcc/cc1plus
17250418   31916  706396 17988730  1127c7a  patched/gcc/cc1plus
666307     4824   26780  697911    aa637    trunk/gcc/gfortran
664887     4824   26780  696491    aa0ab    patched/gcc/gfortran

AArch64 benefits more from the optimization, but there is also an
improvement for ARM.

Thanks,
James

> > ---
> > gcc/
> >
> > 2014-03-21  James Greenhalgh  <james.greenha...@arm.com>
> >
> >         * calls.c (initialize_argument_information): Always treat
> >         PUSH_ARGS_REVERSED as 1, simplify code accordingly.
> >         (expand_call): Likewise.
> >         (emit_library_call_calue_1): Likewise.
> >         * expr.c (PUSH_ARGS_REVERSED): Do not define.
> >         (emit_push_insn): Always treat PUSH_ARGS_REVERSED as 1, simplify
> >         code accordingly.
> 

Reply via email to