RE: [PATCH] Disable loop2_invariant for -Os
-Original Message- From: Richard Guenther [mailto:richard.guent...@gmail.com] Sent: 2012年6月28日 17:24 To: Zhenqiang Chen Cc: gcc-patches@gcc.gnu.org Subject: Re: [PATCH] Disable loop2_invariant for -Os On Thu, Jun 28, 2012 at 10:33 AM, Zhenqiang Chen zhenqiang.c...@arm.com wrote: diff --git a/gcc/loop-init.c b/gcc/loop-init.c index 03f8f61..5d8cf73 100644 --- a/gcc/loop-init.c +++ b/gcc/loop-init.c @@ -273,6 +273,12 @@ struct rtl_opt_pass pass_rtl_loop_done = static bool gate_rtl_move_loop_invariants (void) { + /* In general, invariant motion can not reduce code size. But it + will + change the liverange of the invariant, which increases the + register + pressure and might lead to more spilling. */ + if (optimize_function_for_size_p (cfun)) +return false; + Can you do this per loop instead? Using optimize_loop_nest_for_size_p? Update it according to the comments. Thanks! -Zhenqiang diff --git a/gcc/loop-invariant.c b/gcc/loop-invariant.c index f8405dd..b0e84a7 100644 --- a/gcc/loop-invariant.c +++ b/gcc/loop-invariant.c @@ -1931,7 +1931,8 @@ move_loop_invariants (void) curr_loop = loop; /* move_single_loop_invariants for very large loops is time consuming and might need a lot of memory. */ - if (loop-num_nodes = (unsigned) LOOP_INVARIANT_MAX_BBS_IN_LOOP) + if (loop-num_nodes = (unsigned) + LOOP_INVARIANT_MAX_BBS_IN_LOOP + ! optimize_loop_nest_for_size_p (loop)) move_single_loop_invariants (loop); Wait - move_single_loop_invariants itself already uses optimize_loop_for_speed_p. And looking down it seems to have support for tracking spill cost (eventually only with -fira-loop-pressure) - please work out why this support is not working for you. 1) If -fira_loop_pressure is enabled, it reduces ~24% invariant motions in my tests. But it does not help on total code size. Seams there is issue to update the regs_needed after moving an invariant out of the loop (My benchmark logs show ~73% cases have more than one invariants moved). During tracing, I found that move an integer constant out of the loop does not increase regs_needed. Function get_pressure_class_and_nregs (rtx insn, int *nregs) computes the regs_needed. *nregs = ira_reg_class_max_nregs[pressure_class][GET_MODE (SET_SRC (set))]; In ARM, the insn to set an integer is like (set (reg:SI 183) (const_int 32 [0x20])) inv1.c:64 182 {*thumb1_movsi_insn} (nil)) GET_MODE (SET_SRC (set)) is VOIDMode and ira_reg_class_max_nregs[pressure_class][VOIDMode] is 0. In one of my test cases, it moves 4 integer constants out of the loop, which leads to spilling. According to the algorithm in calculate_loop_reg_pressure, moving an invariant out of the loop should impact on the register pressure. So I try to add the following code if (! (*nregs)) *nregs = ira_reg_class_max_nregs[pressure_class][GET_MODE (reg)]; Logs show it reduces another 32% invariant motions. But the code size is still far from disabling the pass. Logs show -fira_loop_pressure impact other passes in addition to loop2_invariant (The result of -fira_loop_pressure -fno-move-loop-invariants is different from the result of -fno-move-loop-invariants). 2) By default -fira_loop_pressure is not enabled for -Os, the logic to compute regs_used seams not sound. The following codes is from function find_invariants_to_move { unsigned int n_regs = DF_REG_SIZE (df); regs_used = 2; for (i = 0; i n_regs; i++) { if (!DF_REGNO_FIRST_DEF (i) DF_REGNO_LAST_USE (i)) { /* This is a value that is used but not changed inside loop. */ regs_used++; } } } * There is no loop related inform in the code. * Benchmark logs show the condition (!DF_REGNO_FIRST_DEF (i) DF_REGNO_LAST_USE (i)) is never true. Thanks! -Zhenqiang
Re: [PATCH] Disable loop2_invariant for -Os
On Tue, Jul 3, 2012 at 10:29 AM, Zhenqiang Chen zhenqiang.c...@arm.com wrote: -Original Message- From: Richard Guenther [mailto:richard.guent...@gmail.com] Sent: 2012年6月28日 17:24 To: Zhenqiang Chen Cc: gcc-patches@gcc.gnu.org Subject: Re: [PATCH] Disable loop2_invariant for -Os On Thu, Jun 28, 2012 at 10:33 AM, Zhenqiang Chen zhenqiang.c...@arm.com wrote: diff --git a/gcc/loop-init.c b/gcc/loop-init.c index 03f8f61..5d8cf73 100644 --- a/gcc/loop-init.c +++ b/gcc/loop-init.c @@ -273,6 +273,12 @@ struct rtl_opt_pass pass_rtl_loop_done = static bool gate_rtl_move_loop_invariants (void) { + /* In general, invariant motion can not reduce code size. But it + will + change the liverange of the invariant, which increases the + register + pressure and might lead to more spilling. */ + if (optimize_function_for_size_p (cfun)) +return false; + Can you do this per loop instead? Using optimize_loop_nest_for_size_p? Update it according to the comments. Thanks! -Zhenqiang diff --git a/gcc/loop-invariant.c b/gcc/loop-invariant.c index f8405dd..b0e84a7 100644 --- a/gcc/loop-invariant.c +++ b/gcc/loop-invariant.c @@ -1931,7 +1931,8 @@ move_loop_invariants (void) curr_loop = loop; /* move_single_loop_invariants for very large loops is time consuming and might need a lot of memory. */ - if (loop-num_nodes = (unsigned) LOOP_INVARIANT_MAX_BBS_IN_LOOP) + if (loop-num_nodes = (unsigned) + LOOP_INVARIANT_MAX_BBS_IN_LOOP + ! optimize_loop_nest_for_size_p (loop)) move_single_loop_invariants (loop); Wait - move_single_loop_invariants itself already uses optimize_loop_for_speed_p. And looking down it seems to have support for tracking spill cost (eventually only with -fira-loop-pressure) - please work out why this support is not working for you. 1) If -fira_loop_pressure is enabled, it reduces ~24% invariant motions in my tests. But it does not help on total code size. Seams there is issue to update the regs_needed after moving an invariant out of the loop (My benchmark logs show ~73% cases have more than one invariants moved). During tracing, I found that move an integer constant out of the loop does not increase regs_needed. Function get_pressure_class_and_nregs (rtx insn, int *nregs) computes the regs_needed. *nregs = ira_reg_class_max_nregs[pressure_class][GET_MODE (SET_SRC (set))]; In ARM, the insn to set an integer is like (set (reg:SI 183) (const_int 32 [0x20])) inv1.c:64 182 {*thumb1_movsi_insn} (nil)) GET_MODE (SET_SRC (set)) is VOIDMode and ira_reg_class_max_nregs[pressure_class][VOIDMode] is 0. In one of my test cases, it moves 4 integer constants out of the loop, which leads to spilling. According to the algorithm in calculate_loop_reg_pressure, moving an invariant out of the loop should impact on the register pressure. So I try to add the following code if (! (*nregs)) *nregs = ira_reg_class_max_nregs[pressure_class][GET_MODE (reg)]; Logs show it reduces another 32% invariant motions. But the code size is still far from disabling the pass. Logs show -fira_loop_pressure impact other passes in addition to loop2_invariant (The result of -fira_loop_pressure -fno-move-loop-invariants is different from the result of -fno-move-loop-invariants). 2) By default -fira_loop_pressure is not enabled for -Os, the logic to compute regs_used seams not sound. The following codes is from function find_invariants_to_move { unsigned int n_regs = DF_REG_SIZE (df); regs_used = 2; for (i = 0; i n_regs; i++) { if (!DF_REGNO_FIRST_DEF (i) DF_REGNO_LAST_USE (i)) { /* This is a value that is used but not changed inside loop. */ regs_used++; } } } * There is no loop related inform in the code. * Benchmark logs show the condition (!DF_REGNO_FIRST_DEF (i) DF_REGNO_LAST_USE (i)) is never true. Still there is code that tries to deal with -Os. Simply disabling the pass makes that logic pointless. Thus, please try to fix the code that is there to deal with -Os (a target may opt to enable -fira-loop-pressure by default for -Os). Thanks, Richard. Thanks! -Zhenqiang
RE: [PATCH] Disable loop2_invariant for -Os
diff --git a/gcc/loop-init.c b/gcc/loop-init.c index 03f8f61..5d8cf73 100644 --- a/gcc/loop-init.c +++ b/gcc/loop-init.c @@ -273,6 +273,12 @@ struct rtl_opt_pass pass_rtl_loop_done = static bool gate_rtl_move_loop_invariants (void) { + /* In general, invariant motion can not reduce code size. But it + will + change the liverange of the invariant, which increases the + register + pressure and might lead to more spilling. */ + if (optimize_function_for_size_p (cfun)) + return false; + Can you do this per loop instead? Using optimize_loop_nest_for_size_p? Update it according to the comments. Thanks! -Zhenqiang diff --git a/gcc/loop-invariant.c b/gcc/loop-invariant.c index f8405dd..b0e84a7 100644 --- a/gcc/loop-invariant.c +++ b/gcc/loop-invariant.c @@ -1931,7 +1931,8 @@ move_loop_invariants (void) curr_loop = loop; /* move_single_loop_invariants for very large loops is time consuming and might need a lot of memory. */ - if (loop-num_nodes = (unsigned) LOOP_INVARIANT_MAX_BBS_IN_LOOP) + if (loop-num_nodes = (unsigned) LOOP_INVARIANT_MAX_BBS_IN_LOOP + ! optimize_loop_nest_for_size_p (loop)) move_single_loop_invariants (loop); } ChangeLog: 2012-06-28 Zhenqiang Chen zhenqiang.c...@arm.com * loop-invariant.c (move_loop_invariants): Skip move_single_loop_invariants when optimizing loop for size
RE: [PATCH] Disable loop2_invariant for -Os
-Original Message- From: Steven Bosscher [mailto:stevenb@gmail.com] Sent: 2012年6月27日 16:54 To: Zhenqiang Chen Cc: gcc-patches@gcc.gnu.org Subject: Re: [PATCH] Disable loop2_invariant for -Os On Wed, Jun 27, 2012 at 10:40 AM, Zhenqiang Chen zhenqiang.c...@arm.com wrote: Hi, In general, invariant motion itself can not reduce code size. But it will change the liverange of the invariant, which might lead to more spilling. This may be true for ARM but it's not true in general. Sometimes loop-invariant Benchmark tests show it also benefits MIPS, PPC and X86 for code size. address arithmetic, that is not exposed in GIMPLE, is profitable to hoist out of the loop. See e.g. PR41026 (for which I still have a patch in the queue). If this goes in anyway, please mention PR39837 in your ChangeLog entry. It can not handle the case. Thanks! -Zhenqiang
Re: [PATCH] Disable loop2_invariant for -Os
On Thu, Jun 28, 2012 at 10:33 AM, Zhenqiang Chen zhenqiang.c...@arm.com wrote: diff --git a/gcc/loop-init.c b/gcc/loop-init.c index 03f8f61..5d8cf73 100644 --- a/gcc/loop-init.c +++ b/gcc/loop-init.c @@ -273,6 +273,12 @@ struct rtl_opt_pass pass_rtl_loop_done = static bool gate_rtl_move_loop_invariants (void) { + /* In general, invariant motion can not reduce code size. But it + will + change the liverange of the invariant, which increases the + register + pressure and might lead to more spilling. */ + if (optimize_function_for_size_p (cfun)) + return false; + Can you do this per loop instead? Using optimize_loop_nest_for_size_p? Update it according to the comments. Thanks! -Zhenqiang diff --git a/gcc/loop-invariant.c b/gcc/loop-invariant.c index f8405dd..b0e84a7 100644 --- a/gcc/loop-invariant.c +++ b/gcc/loop-invariant.c @@ -1931,7 +1931,8 @@ move_loop_invariants (void) curr_loop = loop; /* move_single_loop_invariants for very large loops is time consuming and might need a lot of memory. */ - if (loop-num_nodes = (unsigned) LOOP_INVARIANT_MAX_BBS_IN_LOOP) + if (loop-num_nodes = (unsigned) LOOP_INVARIANT_MAX_BBS_IN_LOOP + ! optimize_loop_nest_for_size_p (loop)) move_single_loop_invariants (loop); Wait - move_single_loop_invariants itself already uses optimize_loop_for_speed_p. And looking down it seems to have support for tracking spill cost (eventually only with -fira-loop-pressure) - please work out why this support is not working for you. Richard. } ChangeLog: 2012-06-28 Zhenqiang Chen zhenqiang.c...@arm.com * loop-invariant.c (move_loop_invariants): Skip move_single_loop_invariants when optimizing loop for size
[PATCH] Disable loop2_invariant for -Os
Hi, In general, invariant motion itself can not reduce code size. But it will change the liverange of the invariant, which might lead to more spilling. The patch disables loop2_invariant when optimizing for size. I measured the code size benefit for four targets based on CSiBE benchmark: ARM: 0.33% MIPS: 1.15% PPC: 0.24% X86: 0.45% Is it OK for trunk? Thanks! -Zhenqiang ChangeLog: 2012-06-27 Zhenqiang Chen zhenqiang.c...@arm.com * loop-init.c (gate_rtl_move_loop_invariants): Disable loop2_invariant when optimizing function for size. diff --git a/gcc/loop-init.c b/gcc/loop-init.c index 03f8f61..5d8cf73 100644 --- a/gcc/loop-init.c +++ b/gcc/loop-init.c @@ -273,6 +273,12 @@ struct rtl_opt_pass pass_rtl_loop_done = static bool gate_rtl_move_loop_invariants (void) { + /* In general, invariant motion can not reduce code size. But it will + change the liverange of the invariant, which increases the register + pressure and might lead to more spilling. */ + if (optimize_function_for_size_p (cfun)) +return false; + return flag_move_loop_invariants; }
Re: [PATCH] Disable loop2_invariant for -Os
On Wed, Jun 27, 2012 at 10:40 AM, Zhenqiang Chen zhenqiang.c...@arm.com wrote: Hi, In general, invariant motion itself can not reduce code size. It can expose CSE opportunities across loops though. But it will change the liverange of the invariant, which might lead to more spilling. might - indeed. I wonder what the trade-off is here ... but given that you leave tree loop invariant motion enabled it might not make much of a difference. Still as this is mostly a spilling issue it looks odd to do that generally. In fact you could improve things by only disabling motion when that increases register lifetime - it can after all reduce overall register lifetime: for (;;) inv = inv1 + inv2; ... use inv; to inv = inv1 + inv2; for (;;) ... use inv; has register lifetime reduced. Or at least like I suggest below. The patch disables loop2_invariant when optimizing for size. I measured the code size benefit for four targets based on CSiBE benchmark: ARM: 0.33% MIPS: 1.15% PPC: 0.24% X86: 0.45% Is it OK for trunk? Thanks! -Zhenqiang ChangeLog: 2012-06-27 Zhenqiang Chen zhenqiang.c...@arm.com * loop-init.c (gate_rtl_move_loop_invariants): Disable loop2_invariant when optimizing function for size. diff --git a/gcc/loop-init.c b/gcc/loop-init.c index 03f8f61..5d8cf73 100644 --- a/gcc/loop-init.c +++ b/gcc/loop-init.c @@ -273,6 +273,12 @@ struct rtl_opt_pass pass_rtl_loop_done = static bool gate_rtl_move_loop_invariants (void) { + /* In general, invariant motion can not reduce code size. But it will + change the liverange of the invariant, which increases the register + pressure and might lead to more spilling. */ + if (optimize_function_for_size_p (cfun)) + return false; + Can you do this per loop instead? Using optimize_loop_nest_for_size_p? Thanks, Richard. return flag_move_loop_invariants; }
Re: [PATCH] Disable loop2_invariant for -Os
On Wed, Jun 27, 2012 at 10:40 AM, Zhenqiang Chen zhenqiang.c...@arm.com wrote: Hi, In general, invariant motion itself can not reduce code size. But it will change the liverange of the invariant, which might lead to more spilling. This may be true for ARM but it's not true in general. Sometimes loop-invariant address arithmetic, that is not exposed in GIMPLE, is profitable to hoist out of the loop. See e.g. PR41026 (for which I still have a patch in the queue). If this goes in anyway, please mention PR39837 in your ChangeLog entry. Ciao! Steven