Hi Uros,

With your new fix, which adds if-then-else splitting for memory operands,
I got the expected performance speed-up: +6.7% on Atom and +8.4% on SNB.
We need to complete all testing this weekend, and I will send you our
final feedback on Monday.

Thanks in advance for all your help.
Yuri.

2012/12/13 Uros Bizjak <ubiz...@gmail.com>:
> On Thu, Dec 13, 2012 at 4:02 PM, Yuri Rumyantsev <ysrum...@gmail.com> wrote:
>
>> We did not see any performance improvement on Atom in 32-bit mode at
>> routelookup from eembc_2_0 (eembc_1_1).
>
> I assume that for x86_64 the patch works as expected. Let's take a
> bigger hammer for 32bit targets - the splitter that effectively does
> the same as your proposed patch. Please note that this splitter
> uses the optimize_function_for_speed_p predicate, so this
> transformation will take place in the whole function. Can you perhaps
> investigate what happens if this predicate is changed back to
> optimize_insn_for_speed_p () - this is what we would like to have
> here?
>
> ;; Don't do conditional moves with memory inputs.  This splitter helps
> ;; register starved x86_32 by forcing inputs into registers before reload.
> (define_split
>   [(set (match_operand:SWI248 0 "register_operand")
>         (if_then_else:SWI248 (match_operator 1 "ix86_comparison_operator"
>                                [(reg FLAGS_REG) (const_int 0)])
>           (match_operand:SWI248 2 "nonimmediate_operand")
>           (match_operand:SWI248 3 "nonimmediate_operand")))]
>   "!TARGET_64BIT && TARGET_CMOVE
>    && (MEM_P (operands[2]) || MEM_P (operands[3]))
>    && can_create_pseudo_p ()
>    && optimize_function_for_speed_p (cfun)"
>   [(set (match_dup 0)
>         (if_then_else:SWI248 (match_dup 1) (match_dup 2) (match_dup 3)))]
> {
>   if (MEM_P (operands[2]))
>     operands[2] = force_reg (<MODE>mode, operands[2]);
>   if (MEM_P (operands[3]))
>     operands[3] = force_reg (<MODE>mode, operands[3]);
> })
>
> Attached is the complete patch, including peephole2s.
>
> Uros.
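
For readers following the thread, here is a minimal C sketch of what the
quoted splitter does at the source level. It is only an illustration under
stated assumptions: the function names are hypothetical, both pointers are
assumed valid, and whether a compiler actually emits cmov for either form
depends on the target and options. The first function is a select whose
inputs live in memory; the second loads both inputs into temporaries first,
which is analogous to the force_reg calls in the splitter.

```c
#include <assert.h>

/* Hypothetical name: a select whose inputs are memory references.
   On x86 with CMOV this may become a cmov with a memory source operand. */
int select_from_memory(int cond, const int *a, const int *b)
{
    return cond ? *a : *b;
}

/* Hypothetical name: the same select with both inputs loaded into
   temporaries first, analogous to force_reg in the splitter.
   Assumption: both pointers are valid, since both are always read. */
int select_from_registers(int cond, const int *a, const int *b)
{
    int ra = *a;   /* load input 2 into a temporary */
    int rb = *b;   /* load input 3 into a temporary */
    return cond ? ra : rb;
}

/* Sanity check: both forms agree whenever both pointers are valid. */
void check_select_equivalence(void)
{
    int x = 7, y = 9;
    assert(select_from_memory(1, &x, &y) == select_from_registers(1, &x, &y));
    assert(select_from_memory(0, &x, &y) == select_from_registers(0, &x, &y));
}
```

The two forms return the same value; the difference the thread is concerned
with is only where the loads happen relative to the conditional move.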
