https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78200
--- Comment #11 from Venkataramanan <venkataramanan.kumar at amd dot com> ---
Hi Richard,

On a Haswell machine, the original run time for -O3 -mavx2 -mprefer-avx2 is:

real    2m35.325s
user    2m35.257s
sys     0m0.070s

Changing the assembly from

.L98:
        jle     .L97
        cmpl    $2, %r9d
        jne     .L97
.L99:

to

.L98:
        cmpl    $2, %r9d
        jne     .L97
        cmpq    $0, %rdi
        jle     .L97
.L99:

improves the run time to:

real    2m27.224s
user    2m27.138s
sys     0m0.087s

> -----Original Message-----
> From: rguenth at gcc dot gnu.org [mailto:gcc-bugzi...@gcc.gnu.org]
> Sent: Wednesday, November 9, 2016 6:02 PM
> To: Kumar, Venkataramanan <venkataramanan.ku...@amd.com>
> Subject: [Bug rtl-optimization/78200] [7 Regression] 429.mcf of cpu2006
> regresses in GCC trunk for avx2 target.
>
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78200
>
> --- Comment #10 from Richard Biener <rguenth at gcc dot gnu.org> ---
> OTOH we _do_ have initial RTL
>
> (insn 167 166 168 20 (set (reg:CCGOC 17 flags)
>         (compare:CCGOC (reg/v:DI 217 [ red_cost ])
>             (const_int 0 [0]))) "pbeampp.c":42 -1
>      (nil))
> (jump_insn 168 167 169 20 (set (pc)
>         (if_then_else (ge (reg:CCGOC 17 flags)
>                 (const_int 0 [0]))
>             (label_ref 175)
>             (pc))) "pbeampp.c":42 -1
>      (int_list:REG_BR_PROB 6400 (nil))
>  -> 175)
> ;;  succ:       21 [36.0%]  (FALLTHRU)
> ;;              23 [64.0%]
>
> ;; basic block 23, loop depth 2, count 0, freq 1067, maybe hot
> ;; Invalid sum of incoming frequencies 1216, should be 1067
> ;;  prev block 22, next block 24, flags: (NEW, REACHABLE, RTL, MODIFIED, VISITED)
> ;;  pred:       20 [64.0%]
> (code_label 175 173 176 23 98 "" [1 uses])
> (note 176 175 177 23 [bb 23] NOTE_INSN_BASIC_BLOCK)
> (insn 177 176 178 23 (set (reg:CCNO 17 flags)
>         (compare:CCNO (reg/v:DI 217 [ red_cost ])
>             (const_int 0 [0]))) "pbeampp.c":42 -1
>      (nil))
> (insn 178 177 179 23 (set (reg:QI 273)
>         (gt:QI (reg:CCNO 17 flags)
>             (const_int 0 [0]))) "pbeampp.c":42 -1
>      (nil))
> (insn 179 178 180 23 (set (reg:CCZ 17 flags)
>         (compare:CCZ (reg:QI 273)
>             (const_int 0 [0]))) "pbeampp.c":42 -1
>      (nil))
> (jump_insn 180 179 587 23 (set (pc)
>         (if_then_else (eq (reg:CCZ 17 flags)
>                 (const_int 0 [0]))
>             (label_ref 196)
>             (pc))) "pbeampp.c":42 -1
>      (int_list:REG_BR_PROB 3300 (nil))
>  -> 196)
>
> that is, it compares in a sensible order allowing for combining (which
> apparently is what causes the code to run slower for not yet explored
> reasons).
>
> Expanding the other way around does not have any justification IMHO and thus
> the "fix" would be to the later stage where we combine the compare with the
> one on the backedge.
>
> The issue is CSE2 which does
>
> (insn 167 166 168 21 (set (reg:CC 17 flags)
>         (compare:CC (reg/v:DI 217 [ red_cost ])
>             (const_int 0 [0]))) "pbeampp.c":42 8 {*cmpdi_1}
>      (nil))
> (jump_insn 168 167 169 21 (set (pc)
>         (if_then_else (ge (reg:CC 17 flags)
>                 (const_int 0 [0]))
>             (label_ref 175)
>             (pc))) "pbeampp.c":42 635 {*jcc_1}
>      (expr_list:REG_DEAD (reg:CC 17 flags)
>         (int_list:REG_BR_PROB 6400 (nil)))
>  -> 175)
> ...
> (insn 178 176 179 24 (set (reg:QI 273)
>         (gt:QI (reg:CC 17 flags)
>             (const_int 0 [0]))) "pbeampp.c":42 631 {*setcc_qi}
>      (expr_list:REG_DEAD (reg:CC 17 flags)
>         (nil)))
>
> thus changes the earlier compare to CC and re-uses that CCmode. Note it's
> still a mystery to me why this is slower (and I did not reproduce that
> myself yet).
>
> Then we combine it to
>
> (insn 167 166 168 18 (set (reg:CC 17 flags)
>         (compare:CC (reg/v:DI 217 [ red_cost ])
>             (const_int 0 [0]))) "pbeampp.c":42 8 {*cmpdi_1}
>      (nil))
> (jump_insn 168 167 169 18 (set (pc)
>         (if_then_else (ge (reg:CC 17 flags)
>                 (const_int 0 [0]))
>             (label_ref 175)
>             (pc))) "pbeampp.c":42 635 {*jcc_1}
>      (int_list:REG_BR_PROB 6400 (nil))
>  -> 175)
> ;;  succ:       19 [36.0%]  (FALLTHRU)
> ;;              20 [64.0%]
>
>
> ;; basic block 20, loop depth 0, count 0, freq 1067, maybe hot
> ;; Invalid sum of incoming frequencies 1216, should be 1067
> (jump_insn 180 179 587 20 (set (pc)
>         (if_then_else (le (reg:CC 17 flags)
>                 (const_int 0 [0]))
>             (label_ref:DI 196)
>             (pc))) "pbeampp.c":42 635 {*jcc_1}
>      (int_list:REG_BR_PROB 3300 (expr_list:REG_DEAD (reg:CCZ 17 flags)
>             (nil)))
>
> --
> You are receiving this mail because:
> You reported the bug.
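
For readers following the assembly change in comment #11: the two orderings of the checks at .L98 should be logically equivalent, so any run-time difference would come from how the branches execute rather than from what they compute. Below is a minimal C sketch of that equivalence, not the benchmark source. It assumes the jle in the original sequence consumes flags from an earlier compare of red_cost against zero (as the RTL in comment #10 suggests) and that %r9d holds some integer tag, called ident here. The function names and the ident name are hypothetical; only red_cost appears in the dumps.

```c
#include <assert.h>

/* Original order: test red_cost first (jle .L97), then the tag
   (cmpl $2, %r9d ; jne .L97). Returns 1 when control would fall
   through to .L99. "ident" is a hypothetical name for %r9d. */
static int reaches_L99_original(long red_cost, int ident)
{
    if (red_cost <= 0)
        return 0;
    if (ident != 2)
        return 0;
    return 1;
}

/* Hand-edited order: test the tag first, then compare red_cost
   against zero again (cmpq $0, %rdi ; jle .L97). */
static int reaches_L99_reordered(long red_cost, int ident)
{
    if (ident != 2)
        return 0;
    if (red_cost <= 0)
        return 0;
    return 1;
}

/* Check that both orderings agree on a small grid of inputs
   covering negative, zero and positive red_cost values. */
static void check_orderings_agree(void)
{
    for (long red_cost = -2; red_cost <= 2; ++red_cost)
        for (int ident = 0; ident <= 3; ++ident)
            assert(reaches_L99_original(red_cost, ident) ==
                   reaches_L99_reordered(red_cost, ident));
}
```

Calling check_orderings_agree() from a test harness passes silently if the two orderings agree on the sampled inputs. The sketch says nothing about timing; it only illustrates that the reordering preserves the branch outcome.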