Hi Uros,

With your new fix that adds if-then-else splitting for memory operands, I got the expected performance speed-up: +6.7% for Atom and +8.4% for SNB. We need to do all testing this weekend, and I will get you our final feedback on Monday.

Thanks in advance for all your help.
Yuri.

2012/12/13 Uros Bizjak <ubiz...@gmail.com>:
> On Thu, Dec 13, 2012 at 4:02 PM, Yuri Rumyantsev <ysrum...@gmail.com> wrote:
>
>> We did not see any performance improvement on Atom in 32-bit mode at
>> routelookup from eembc_2_0 (eembc_1_1).
>
> I assume that for x86_64 the patch works as expected. Let's take a
> bigger hammer for 32bit targets - the splitter that effectively does
> the same as your proposed patch. Please note, that this splitter
> operates with optimize_function_for_speed_p predicate, so this
> transformation will take place in the whole function. Can you perhaps
> investigate what happens if this predicate is changed back to
> optimize_insn_for_speed_p () - this is what we would like to have
> here?
>
> ;; Don't do conditional moves with memory inputs. This splitter helps
> ;; register starved x86_32 by forcing inputs into registers before reload.
> (define_split
>   [(set (match_operand:SWI248 0 "register_operand")
>         (if_then_else:SWI248 (match_operator 1 "ix86_comparison_operator"
>                                [(reg FLAGS_REG) (const_int 0)])
>           (match_operand:SWI248 2 "nonimmediate_operand")
>           (match_operand:SWI248 3 "nonimmediate_operand")))]
>   "!TARGET_64BIT && TARGET_CMOVE
>    && (MEM_P (operands[2]) || MEM_P (operands[3]))
>    && can_create_pseudo_p ()
>    && optimize_function_for_speed_p (cfun)"
>   [(set (match_dup 0)
>         (if_then_else:SWI248 (match_dup 1) (match_dup 2) (match_dup 3)))]
> {
>   if (MEM_P (operands[2]))
>     operands[2] = force_reg (<MODE>mode, operands[2]);
>   if (MEM_P (operands[3]))
>     operands[3] = force_reg (<MODE>mode, operands[3]);
> })
>
> Attached is the complete patch, including peephole2s.
>
> Uros.