Re: [RFC] Remove PUSH_ARGS_REVERSED from the RTL expander.
On 26/04/14 14:25, Eric Botcazou wrote: 2014-03-21 James Greenhalgh james.greenha...@arm.com * calls.c (initialize_argument_information): Always treat PUSH_ARGS_REVERSED as 1, simplify code accordingly. (expand_call): Likewise. (emit_library_call_calue_1): Likewise. * expr.c (PUSH_ARGS_REVERSED): Do not define. (emit_push_insn): Always treat PUSH_ARGS_REVERSED as 1, simplify code accordingly. This is fine for the trunk at this point. Are you sure that it's not a correctness issue for some targets though? Ordering of side-effects these days are handled during gimplification. If there are any correctness issues relating to this, then I think we've got a bigger problem. R.
Re: [RFC] Remove PUSH_ARGS_REVERSED from the RTL expander.
On Fri, Apr 25, 2014 at 09:15:57PM +0100, Jeff Law wrote: On 03/24/14 05:44, James Greenhalgh wrote: Which is much neater. It seems to me that this would probably be of benefit on all targets. This would be an argument for setting PUSH_ARGS_REVERSED to 1 by default for all targets. However, this would have a perceivable impact on argument evaluation order (which is unspecified in C, but people have shown sensitivity to it changing in the past and we suspect some people may have built systems relying on the behviour... ho hum...). However, now that all side effects are handled when we are in SSA form, it should be possible to refactor calls.c and expr.c as if PUSH_ARGS_REVERSED were always defined to 1, without changing the argument evaluation order in gimplify.c. In this way, we have the best of both worlds; we maintain argument evaluation order and gain the optimizations given by removing overlapping live ranges for parameters. I've bootstrapped and regression tested this on x86, ARM and AArch64 - but I am well aware these are fairly standard targets, do you have any suggestions or comments on which targets this change will affect? If not, is this OK for stage-1? Thanks, James --- gcc/ 2014-03-21 James Greenhalgh james.greenha...@arm.com * calls.c (initialize_argument_information): Always treat PUSH_ARGS_REVERSED as 1, simplify code accordingly. (expand_call): Likewise. (emit_library_call_calue_1): Likewise. * expr.c (PUSH_ARGS_REVERSED): Do not define. (emit_push_insn): Always treat PUSH_ARGS_REVERSED as 1, simplify code accordingly. This is fine for the trunk at this point. Sorry for the delay. Thanks Jeff. I've committed this as revision 209897 after checking it still bootstraps across arm/aarch64/i386. I'll look out for any fallout on other targets, please shout if I've broken something! Thanks, James
Re: [RFC] Remove PUSH_ARGS_REVERSED from the RTL expander.
2014-03-21 James Greenhalgh james.greenha...@arm.com * calls.c (initialize_argument_information): Always treat PUSH_ARGS_REVERSED as 1, simplify code accordingly. (expand_call): Likewise. (emit_library_call_calue_1): Likewise. * expr.c (PUSH_ARGS_REVERSED): Do not define. (emit_push_insn): Always treat PUSH_ARGS_REVERSED as 1, simplify code accordingly. This is fine for the trunk at this point. Are you sure that it's not a correctness issue for some targets though? -- Eric Botcazou
Re: [RFC] Remove PUSH_ARGS_REVERSED from the RTL expander.
On 03/24/14 05:44, James Greenhalgh wrote: Which is much neater. It seems to me that this would probably be of benefit on all targets. This would be an argument for setting PUSH_ARGS_REVERSED to 1 by default for all targets. However, this would have a perceivable impact on argument evaluation order (which is unspecified in C, but people have shown sensitivity to it changing in the past and we suspect some people may have built systems relying on the behviour... ho hum...). However, now that all side effects are handled when we are in SSA form, it should be possible to refactor calls.c and expr.c as if PUSH_ARGS_REVERSED were always defined to 1, without changing the argument evaluation order in gimplify.c. In this way, we have the best of both worlds; we maintain argument evaluation order and gain the optimizations given by removing overlapping live ranges for parameters. I've bootstrapped and regression tested this on x86, ARM and AArch64 - but I am well aware these are fairly standard targets, do you have any suggestions or comments on which targets this change will affect? If not, is this OK for stage-1? Thanks, James --- gcc/ 2014-03-21 James Greenhalgh james.greenha...@arm.com * calls.c (initialize_argument_information): Always treat PUSH_ARGS_REVERSED as 1, simplify code accordingly. (expand_call): Likewise. (emit_library_call_calue_1): Likewise. * expr.c (PUSH_ARGS_REVERSED): Do not define. (emit_push_insn): Always treat PUSH_ARGS_REVERSED as 1, simplify code accordingly. This is fine for the trunk at this point. Sorry for the delay. jeff
Re: [RFC] Remove PUSH_ARGS_REVERSED from the RTL expander.
Hi, On Mon, 24 Mar 2014, Richard Henderson wrote: See http://en.wikipedia.org/wiki/X86_calling_conventions#pascal Since we don't actually support this anymore, we can certainly tidy this up. Yeah, I thought about that, but I couldn't see how that could have used PUSH_ARGS_REVERSED. The above calling convention would be a language setting while the macro is a target setting, and for interoperability even the Pascal compiler would have to be able to create some calls with reversed arguments. So I thought that Pascal never was the purpose for the macro. (I didn't check to make sure, though) Ciao, Michael.
Re: [RFC] Remove PUSH_ARGS_REVERSED from the RTL expander.
On Mon, Mar 24, 2014 at 12:44 PM, James Greenhalgh james.greenha...@arm.com wrote: Hi, Consider this testcase: extern void foo (int a, int b, int c, int d); void bar (int b, int c, int d) { foo (3, b, c, d); } For many ABI's we will have on entry to the function r0 = b r1 = c r2 = d If we process arguments to the call to foo left-to-right (PUSH_ARGS_REVERSED == 0) we assign temporaries, and then hard registers in this order: t0 = 3 t1 = r0 t2 = r1 t3 = r2 r0 = t0 r1 = t1 r2 = t2 r3 = t3 We can't eliminate any of these temporaries as their live ranges overlap. However, If we process the arguments right-to-left (PUSH_ARGS_REVERSED == 1), we get this order: t0 = 3 t1 = r0 t2 = r1 t3 = r2 r3 = t3 r2 = t2 r1 = t1 r0 = t0 And then we can start eliminating temporaries we don't need and get: r3 = r2 r2 = r1 r1 = r0 r0 = 3 Which is much neater. It seems to me that this would probably be of benefit on all targets. This would be an argument for setting PUSH_ARGS_REVERSED to 1 by default for all targets. However, this would have a perceivable impact on argument evaluation order (which is unspecified in C, but people have shown sensitivity to it changing in the past and we suspect some people may have built systems relying on the behviour... ho hum...). However, now that all side effects are handled when we are in SSA form, it should be possible to refactor calls.c and expr.c as if PUSH_ARGS_REVERSED were always defined to 1, without changing the argument evaluation order in gimplify.c. In this way, we have the best of both worlds; we maintain argument evaluation order and gain the optimizations given by removing overlapping live ranges for parameters. I've bootstrapped and regression tested this on x86, ARM and AArch64 - but I am well aware these are fairly standard targets, do you have any suggestions or comments on which targets this change will affect? If not, is this OK for stage-1? Maybe somebody remembers why we have both paths. I'd rather make gimplification independent of this as well, choosing either variant. Do you have any hard numbers to compare? Say cc1 code size, when comparing PUSH_ARGS_REVERSED 1 vs 0? Thanks, Richard. Thanks, James --- gcc/ 2014-03-21 James Greenhalgh james.greenha...@arm.com * calls.c (initialize_argument_information): Always treat PUSH_ARGS_REVERSED as 1, simplify code accordingly. (expand_call): Likewise. (emit_library_call_calue_1): Likewise. * expr.c (PUSH_ARGS_REVERSED): Do not define. (emit_push_insn): Always treat PUSH_ARGS_REVERSED as 1, simplify code accordingly.
Re: [RFC] Remove PUSH_ARGS_REVERSED from the RTL expander.
Hi, On Mon, 24 Mar 2014, Richard Biener wrote: Maybe somebody remembers why we have both paths. I'd rather make gimplification independent of this as well, choosing either variant. I can't say why we have the !PUSH_ARGS_REVERSED path (so I guess that's the way that initially was there, before the distinction was made), but the PUSH_ARGS_REVERSED==1 path is important when you have push instructions and arguments grow in the opposite direction of the stack. The latter is the case for almost all targets, so as soon as you have push instructions (and no ACCUMULATE_OUTGOING_ARGS) you'll want to have PUSH_ARGS_REVERSED==1 as well. I guess the !PUSH_ARGS_REVERSED path only has advantages for the few targets where stack and args grow the same way and that use push instructions for argument passing, which currently are none. pa-64 does have same-direction arguments and stack (both upward), but doesn't use push, and the only other ARGS_GROW_DOWNWARD (pa-32bit and stormy16) also have !STACK_GROWS_DOWNWARD, so again opposite direction. Without push instructions or with ACCUMULATE_OUTGOING_ARGS I guess it doesn't matter much for parameters passed in memory (the offsets simply will be N to 0 instead of 0 to N), and for register arguments there's a slight advantage for PUSH_ARGS_REVERSED==1 for the usual case of same-ordered arguments passed down, like in James' example. Do you have any hard numbers to compare? Say cc1 code size, when comparing PUSH_ARGS_REVERSED 1 vs 0? If we'd agree on only one way, then it must be PUSH_ARGS_REVERSED==1. That would be a change in argument gimplification (and hence eval) order for many targets, and could result in unexpected side-effects for invalid code, like James said. IMO no reason to not try that in stage 1, though. Ciao, Michael.
Re: [RFC] Remove PUSH_ARGS_REVERSED from the RTL expander.
On Mon, Mar 24, 2014 at 11:54:49AM +, Richard Biener wrote: On Mon, Mar 24, 2014 at 12:44 PM, James Greenhalgh james.greenha...@arm.com wrote: Hi, Consider this testcase: extern void foo (int a, int b, int c, int d); void bar (int b, int c, int d) { foo (3, b, c, d); } For many ABI's we will have on entry to the function r0 = b r1 = c r2 = d If we process arguments to the call to foo left-to-right (PUSH_ARGS_REVERSED == 0) we assign temporaries, and then hard registers in this order: t0 = 3 t1 = r0 t2 = r1 t3 = r2 r0 = t0 r1 = t1 r2 = t2 r3 = t3 We can't eliminate any of these temporaries as their live ranges overlap. However, If we process the arguments right-to-left (PUSH_ARGS_REVERSED == 1), we get this order: t0 = 3 t1 = r0 t2 = r1 t3 = r2 r3 = t3 r2 = t2 r1 = t1 r0 = t0 And then we can start eliminating temporaries we don't need and get: r3 = r2 r2 = r1 r1 = r0 r0 = 3 Which is much neater. It seems to me that this would probably be of benefit on all targets. This would be an argument for setting PUSH_ARGS_REVERSED to 1 by default for all targets. However, this would have a perceivable impact on argument evaluation order (which is unspecified in C, but people have shown sensitivity to it changing in the past and we suspect some people may have built systems relying on the behviour... ho hum...). However, now that all side effects are handled when we are in SSA form, it should be possible to refactor calls.c and expr.c as if PUSH_ARGS_REVERSED were always defined to 1, without changing the argument evaluation order in gimplify.c. In this way, we have the best of both worlds; we maintain argument evaluation order and gain the optimizations given by removing overlapping live ranges for parameters. I've bootstrapped and regression tested this on x86, ARM and AArch64 - but I am well aware these are fairly standard targets, do you have any suggestions or comments on which targets this change will affect? If not, is this OK for stage-1? Maybe somebody remembers why we have both paths. I'd rather make gimplification independent of this as well, choosing either variant. Do you have any hard numbers to compare? Say cc1 code size, when comparing PUSH_ARGS_REVERSED 1 vs 0? Sure, here is the output of size for AArch64 and ARM. Both of these targets have PUSH_ARGS_REVERSED == 0. i386 results are not interesting (on the order of tens of bytes), as PUSH_ARGS_REVERSED == 1 for that target. trunk is revision 208685 patched is revision 208685 with this patch applied. AArch64: textdata bss dec hex filename 14546902 82424 781944 15411270 eb2846 trunk/gcc/cc1 14423510 82424 781944 15287878 e94646 patched/gcc/cc1 16721485 82480 805000 17608965 10cb105 trunk/gcc/cc1plus 16564613 82480 805000 17452093 10a4c3d patched/gcc/cc1plus 724550 7904 80744 813198c688etrunk/gcc/gfortran 722606 7904 80744 811254c60f6patched/gcc/gfortran ARM: textdata bss dec hex filename 15176835 31880 694132 15902847 f2a87f trunk/gcc/cc1 15171939 31880 694124 15897943 f29557 patched/gcc/cc1 17255818 31916 706404 17994138 112919a trunk/gcc/cc1plus 17250418 31916 706396 17988730 1127c7a patched/gcc/cc1plus 666307 4824 26780 697911aa637trunk/gcc/gfortran 664887 4824 26780 696491aa0abpatched/gcc/gfortran AArch64 benefits more from the optimization, but there is also an improvement for ARM. Thanks, James --- gcc/ 2014-03-21 James Greenhalgh james.greenha...@arm.com * calls.c (initialize_argument_information): Always treat PUSH_ARGS_REVERSED as 1, simplify code accordingly. (expand_call): Likewise. (emit_library_call_calue_1): Likewise. * expr.c (PUSH_ARGS_REVERSED): Do not define. (emit_push_insn): Always treat PUSH_ARGS_REVERSED as 1, simplify code accordingly.
Re: [RFC] Remove PUSH_ARGS_REVERSED from the RTL expander.
On 03/24/2014 06:23 AM, Michael Matz wrote: I can't say why we have the !PUSH_ARGS_REVERSED path (so I guess that's the way that initially was there, before the distinction was made), but the PUSH_ARGS_REVERSED==1 path is important when you have push instructions and arguments grow in the opposite direction of the stack. The latter is the case for almost all targets, so as soon as you have push instructions (and no ACCUMULATE_OUTGOING_ARGS) you'll want to have PUSH_ARGS_REVERSED==1 as well. I guess the !PUSH_ARGS_REVERSED path only has advantages for the few targets where stack and args grow the same way and that use push instructions for argument passing, which currently are none. pa-64 does have same-direction arguments and stack (both upward), but doesn't use push, and the only other ARGS_GROW_DOWNWARD (pa-32bit and stormy16) also have !STACK_GROWS_DOWNWARD, so again opposite direction. See http://en.wikipedia.org/wiki/X86_calling_conventions#pascal Since we don't actually support this anymore, we can certainly tidy this up. r~