On Mon, Mar 24, 2014 at 11:54:49AM +0000, Richard Biener wrote: > On Mon, Mar 24, 2014 at 12:44 PM, James Greenhalgh > <james.greenha...@arm.com> wrote: > > > > Hi, > > > > Consider this testcase: > > > > extern void foo (int a, int b, int c, int d); > > > > void > > bar (int b, int c, int d) > > { > > foo (3, b, c, d); > > } > > > > For many ABI's we will have on entry to the function > > > > r0 = b > > r1 = c > > r2 = d > > > > If we process arguments to the call to foo left-to-right > > (PUSH_ARGS_REVERSED == 0) we assign temporaries, and then hard registers > > in this order: > > > > t0 = 3 > > t1 = r0 > > t2 = r1 > > t3 = r2 > > > > r0 = t0 > > r1 = t1 > > r2 = t2 > > r3 = t3 > > > > We can't eliminate any of these temporaries as their live ranges overlap. > > > > However, If we process the arguments right-to-left (PUSH_ARGS_REVERSED == > > 1), > > we get this order: > > > > t0 = 3 > > t1 = r0 > > t2 = r1 > > t3 = r2 > > > > r3 = t3 > > r2 = t2 > > r1 = t1 > > r0 = t0 > > > > And then we can start eliminating temporaries we don't need and get: > > > > r3 = r2 > > r2 = r1 > > r1 = r0 > > r0 = 3 > > > > Which is much neater. > > > > It seems to me that this would probably be of benefit on all targets. > > > > This would be an argument for setting PUSH_ARGS_REVERSED to 1 by default > > for all targets. However, this would have a perceivable impact on argument > > evaluation order (which is unspecified in C, but people have shown > > sensitivity to it changing in the past and we suspect some people may > > have built systems relying on the behviour... ho hum...). > > > > However, now that all side effects are handled when we are in SSA > > form, it should be possible to refactor calls.c and expr.c as if > > PUSH_ARGS_REVERSED were always defined to 1, without changing the > > argument evaluation order in gimplify.c. > > > > In this way, we have the best of both worlds; we maintain argument > > evaluation order and gain the optimizations given by removing > > overlapping live ranges for parameters. > > > > I've bootstrapped and regression tested this on x86, ARM and AArch64 - but > > I am well aware these are fairly standard targets, do you have any > > suggestions or comments on which targets this change will affect? > > > > If not, is this OK for stage-1? > > Maybe somebody remembers why we have both paths. I'd rather > make gimplification independent of this as well, choosing either > variant. > > Do you have any hard numbers to compare? Say cc1 code size, > when comparing PUSH_ARGS_REVERSED 1 vs 0?
Sure, here is the output of size for AArch64 and ARM. Both of these targets have PUSH_ARGS_REVERSED == 0. i386 results are not interesting (on the order of tens of bytes), as PUSH_ARGS_REVERSED == 1 for that target. trunk is revision 208685 patched is revision 208685 with this patch applied. AArch64: text data bss dec hex filename 14546902 82424 781944 15411270 eb2846 trunk/gcc/cc1 14423510 82424 781944 15287878 e94646 patched/gcc/cc1 16721485 82480 805000 17608965 10cb105 trunk/gcc/cc1plus 16564613 82480 805000 17452093 10a4c3d patched/gcc/cc1plus 724550 7904 80744 813198 c688e trunk/gcc/gfortran 722606 7904 80744 811254 c60f6 patched/gcc/gfortran ARM: text data bss dec hex filename 15176835 31880 694132 15902847 f2a87f trunk/gcc/cc1 15171939 31880 694124 15897943 f29557 patched/gcc/cc1 17255818 31916 706404 17994138 112919a trunk/gcc/cc1plus 17250418 31916 706396 17988730 1127c7a patched/gcc/cc1plus 666307 4824 26780 697911 aa637 trunk/gcc/gfortran 664887 4824 26780 696491 aa0ab patched/gcc/gfortran AArch64 benefits more from the optimization, but there is also an improvement for ARM. Thanks, James > > --- > > gcc/ > > > > 2014-03-21 James Greenhalgh <james.greenha...@arm.com> > > > > * calls.c (initialize_argument_information): Always treat > > PUSH_ARGS_REVERSED as 1, simplify code accordingly. > > (expand_call): Likewise. > > (emit_library_call_calue_1): Likewise. > > * expr.c (PUSH_ARGS_REVERSED): Do not define. > > (emit_push_insn): Always treat PUSH_ARGS_REVERSED as 1, simplify > > code accordingly. >