Ping. On 05/29/2013 12:15 PM, Meador Inge wrote: > Hi All, > > This patch fixes a bug in one of the ARM peephole2 optimizations. The > peephole2 optimization in question was changed to use the CC-updating > form for all of the instructions produced by the peephole so that the > encoding will be smaller when compiling for thumb [1]. However, I don't > think that is always safe. > > For example, the CC register might be used by something *after* the > peephole window. The current peephole will transform: > > > (insn:TI 7 49 18 2 (set (reg:CC 24 cc) > (compare:CC (reg:SI 3 r3 [orig:136 *endname_1(D) ] [136]) > (const_int 0 [0]))) repro.c:5 212 {*arm_cmpsi_insn} > (nil)) > > (insn:TI 18 7 11 2 (cond_exec (ne (reg:CC 24 cc) > (const_int 0 [0])) > (set (reg:SI 3 r3 [140]) > (const_int 0 [0]))) repro.c:8 3366 {*p *arm_movsi_vfp} > (expr_list:REG_EQUIV (const_int 0 [0]) > (nil))) > > (insn 11 18 19 2 (cond_exec (eq (reg:CC 24 cc) > (const_int 0 [0])) > (set (reg:SI 3 r3 [138]) > (const_int 1 [0x1]))) repro.c:6 3366 {*p *arm_movsi_vfp} > (expr_list:REG_EQUIV (const_int 1 [0x1]) > (nil))) > > (insn:TI 19 11 12 2 (cond_exec (ne (reg:CC 24 cc) > (const_int 0 [0])) > (set (mem/c:SI (reg/f:SI 2 r2 [143]) [2 atend+0 S4 A32]) > (reg:SI 3 r3 [140]))) repro.c:8 3366 {*p *arm_movsi_vfp} > (expr_list:REG_DEAD (reg/f:SI 2 r2 [143]) > (nil))) > > (insn:TI 12 19 22 2 (cond_exec (eq (reg:CC 24 cc) > (const_int 0 [0])) > (set (mem/c:SI (reg/f:SI 2 r2 [143]) [2 atend+0 S4 A32]) > (reg:SI 3 r3 [138]))) repro.c:6 3366 {*p *arm_movsi_vfp} > (nil)) > > (insn:TI 22 12 58 2 (cond_exec (ne (reg:CC 24 cc) > (const_int 0 [0])) > (set (mem:QI (reg/v/f:SI 0 r0 [orig:135 endname ] [135]) [0 > *endname_1(D)+0 S1 A8]) > (reg:QI 3 r3 [140]))) repro.c:9 3115 {*p *arm_movqi_insn} > (expr_list:REG_DEAD (reg:CC 24 cc) > (expr_list:REG_DEAD (reg:QI 3 r3 [140]) > (expr_list:REG_DEAD (reg/v/f:SI 0 r0 [orig:135 endname ] > [135]) > (nil))))) > > into the following: > > > (insn 59 49 60 2 (parallel [ > (set (reg:CC 24 cc) > (compare:CC (reg:SI 3 r3 [orig:136 *endname_1(D) ] [136]) > (const_int 0 [0]))) > (set (reg:SI 1 r1) > (minus:SI (reg:SI 3 r3 [orig:136 *endname_1(D) ] [136]) > (const_int 0 [0]))) > ]) repro.c:6 -1 > (nil)) > > (insn 60 59 61 2 (parallel [ > (set (reg:CC 24 cc) > (compare:CC (const_int 0 [0]) > (reg:SI 1 r1))) > (set (reg:SI 3 r3 [140]) > (minus:SI (const_int 0 [0]) > (reg:SI 1 r1))) > ]) repro.c:6 -1 > (nil)) > > (insn 61 60 19 2 (parallel [ > (set (reg:SI 3 r3 [140]) > (plus:SI (plus:SI (reg:SI 3 r3 [140]) > (reg:SI 1 r1)) > (geu:SI (reg:CC 24 cc) > (const_int 0 [0])))) > (clobber (reg:CC 24 cc)) > ]) repro.c:6 -1 > (nil)) > > (insn:TI 19 61 12 2 (cond_exec (ne (reg:CC 24 cc) > (const_int 0 [0])) > (set (mem/c:SI (reg/f:SI 2 r2 [143]) [2 atend+0 S4 A32]) > (reg:SI 3 r3 [140]))) repro.c:8 3366 {*p *arm_movsi_vfp} > (nil)) > > (insn:TI 12 19 22 2 (cond_exec (eq (reg:CC 24 cc) > (const_int 0 [0])) > (set (mem/c:SI (reg/f:SI 2 r2 [143]) [2 atend+0 S4 A32]) > (reg:SI 3 r3 [138]))) repro.c:6 3366 {*p *arm_movsi_vfp} > (expr_list:REG_DEAD (reg/f:SI 2 r2 [143]) > (nil))) > > (insn:TI 22 12 58 2 (cond_exec (ne (reg:CC 24 cc) > (const_int 0 [0])) > (set (mem:QI (reg/v/f:SI 0 r0 [orig:135 endname ] [135]) [0 > *endname_1(D)+0 S1 A8]) > (reg:QI 3 r3 [140]))) repro.c:9 3115 {*p *arm_movqi_insn} > (expr_list:REG_DEAD (reg:CC 24 cc) > (expr_list:REG_DEAD (reg:QI 3 r3 [140]) > (expr_list:REG_DEAD (reg/v/f:SI 0 r0 [orig:135 endname ] > [135]) > (nil))))) > > > This gets compiled into the incorrect sequence: > > > ldrb r3, [r0, #0] > ldr r2, .L4 > subs r1, r3, #0 > rsbs r3, r1, #0 > adcs r3, r3, r1 > strne r3, [r2, #0] > streq r3, [r2, #0] > strneb r3, [r0, #0] > > > The conditional stores are now dealing with an incorrect condition state. > > This patch fixes the problem by ensuring that the CC reg is dead after the > peephole window for the current peephole definition and falls back on the > original pre-PR46975 peephole when it is live. Unfortunately I had trouble > coming up with a reproduction case against mainline. I only noticed the bug > while working with some local changes that exposed it. > > Built and tested a full ARM GNU/Linux toolchain. No regressions in the GCC > test suite. > > OK? > > gcc/ > > 2013-05-29 Meador Inge <mead...@codesourcery.com> > > * config/arm/arm.md (conditional move peephole2): Only clobber CC > register when it is dead after the peephole window. > > [1] http://gcc.gnu.org/ml/gcc-patches/2010-12/msg01336.html > > Index: gcc/config/arm/arm.md > =================================================================== > --- gcc/config/arm/arm.md (revision 199414) > +++ gcc/config/arm/arm.md (working copy) > @@ -9978,29 +9978,48 @@ > ;; Attempt to improve the sequence generated by the compare_scc splitters > ;; not to use conditional execution. > (define_peephole2 > - [(set (reg:CC CC_REGNUM) > + [(set (match_operand 0 "cc_register" "") > (compare:CC (match_operand:SI 1 "register_operand" "") > (match_operand:SI 2 "arm_rhs_operand" ""))) > (cond_exec (ne (reg:CC CC_REGNUM) (const_int 0)) > - (set (match_operand:SI 0 "register_operand" "") (const_int 0))) > + (set (match_operand:SI 3 "register_operand" "") (const_int 0))) > (cond_exec (eq (reg:CC CC_REGNUM) (const_int 0)) > - (set (match_dup 0) (const_int 1))) > - (match_scratch:SI 3 "r")] > - "TARGET_32BIT" > + (set (match_dup 3) (const_int 1))) > + (match_scratch:SI 4 "r")] > + "TARGET_32BIT && peep2_reg_dead_p (3, operands[0])" > [(parallel > [(set (reg:CC CC_REGNUM) > (compare:CC (match_dup 1) (match_dup 2))) > - (set (match_dup 3) (minus:SI (match_dup 1) (match_dup 2)))]) > + (set (match_dup 4) (minus:SI (match_dup 1) (match_dup 2)))]) > (parallel > [(set (reg:CC CC_REGNUM) > - (compare:CC (const_int 0) (match_dup 3))) > - (set (match_dup 0) (minus:SI (const_int 0) (match_dup 3)))]) > + (compare:CC (const_int 0) (match_dup 4))) > + (set (match_dup 3) (minus:SI (const_int 0) (match_dup 4)))]) > (parallel > - [(set (match_dup 0) > - (plus:SI (plus:SI (match_dup 0) (match_dup 3)) > + [(set (match_dup 3) > + (plus:SI (plus:SI (match_dup 3) (match_dup 4)) > (geu:SI (reg:CC CC_REGNUM) (const_int 0)))) > (clobber (reg:CC CC_REGNUM))])]) > > +(define_peephole2 > + [(set (reg:CC CC_REGNUM) > + (compare:CC (match_operand:SI 0 "register_operand" "") > + (match_operand:SI 1 "arm_rhs_operand" ""))) > + (cond_exec (ne (reg:CC CC_REGNUM) (const_int 0)) > + (set (match_operand:SI 2 "register_operand" "") (const_int 0))) > + (cond_exec (eq (reg:CC CC_REGNUM) (const_int 0)) > + (set (match_dup 2) (const_int 1))) > + (match_scratch:SI 3 "r")] > + "TARGET_32BIT && !peep2_reg_dead_p (3, operands[0])" > + [(set (match_dup 3) (minus:SI (match_dup 0) (match_dup 1))) > + (parallel > + [(set (reg:CC CC_REGNUM) > + (compare:CC (const_int 0) (match_dup 3))) > + (set (match_dup 2) (minus:SI (const_int 0) (match_dup 3)))]) > + (set (match_dup 2) > + (plus:SI (plus:SI (match_dup 2) (match_dup 3)) > + (geu:SI (reg:CC CC_REGNUM) (const_int 0))))]) > + > (define_insn "*cond_move" > [(set (match_operand:SI 0 "s_register_operand" "=r,r,r") > (if_then_else:SI (match_operator 3 "equality_operator" >
-- Meador Inge CodeSourcery / Mentor Embedded