2015-08-03 17:04 GMT+03:00 Uros Bizjak <ubiz...@gmail.com>:
> On Mon, Aug 3, 2015 at 3:02 PM, Kyrill Tkachov <kyrylo.tkac...@arm.com> wrote:
>>
>> On 03/08/15 13:33, Uros Bizjak wrote:
>>>
>>> Hello!
>>>
>>>> 2015-07-30  Kyrylo Tkachov  <kyrylo.tkac...@arm.com>
>>>>
>>>>      * ifcvt.c (noce_try_store_flag_constants): Make logic of the case
>>>>      when diff == STORE_FLAG_VALUE or diff == -STORE_FLAG_VALUE more
>>>>      explicit.  Prefer to add the flag whenever possible.
>>>>      (noce_process_if_block): Try noce_try_store_flag_constants before
>>>>      noce_try_cmove.
>>>>
>>>> 2015-07-30  Kyrylo Tkachov  <kyrylo.tkac...@arm.com>
>>>>
>>>>      * gcc.target/aarch64/csel_bfx_1.c: New test.
>>>>      * gcc.target/aarch64/csel_imms_inc_1.c: Likewise.
>>>
>>> This patch regressed following tests on x86_64:
>>>
>>> FAIL: gcc.target/i386/cmov2.c scan-assembler sbb
>>> FAIL: gcc.target/i386/cmov3.c scan-assembler cmov[^3]
>
> The difference for cmov3.c on x86_64 is:
>
>        cmpl    %esi, %edi
>        movl    $-5, %edx
>        movl    $5, %eax
>        cmovg   %edx, %eax
>        ret
>
> vs. new code:
>
>        xorl    %eax, %eax
>        cmpl    %esi, %edi
>        setle   %al
>        negl    %eax
>        andl    $10, %eax
>        subl    $5, %eax
>        ret
>
> I'm not sure old code is really better than new. HJ, do you have any
> better insight?
>
> Uros.

The original code looks better: its dependency tree has height 2
(cmpl and the two movl are independent, and cmovg consumes them), so
it can be executed in 2 cycles. The new code is a single serial chain
(cmpl, setle, negl, andl, subl), so its tree height becomes 5. It is
always hard to say for all x86 targets, but as generic code the
original version is better.
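
For reference, here is a rough C model of the two sequences quoted
above. It is only a sketch: the function names are mine, and the
source expression (a > b ? -5 : 5) is inferred from the assembly,
assuming a is in %edi and b is in %esi; it is not copied from cmov3.c.

```c
#include <assert.h>

/* Old sequence: load both constants, then cmovg picks one.
   The compare and the two loads do not depend on each other. */
static int select_cmov(int a, int b)
{
    return a > b ? -5 : 5;
}

/* New sequence, one statement per instruction after the cmpl.
   Each step needs the result of the previous one. */
static int select_arith(int a, int b)
{
    int flag = (a <= b);  /* setle %al : 1 or 0            */
    int mask = -flag;     /* negl      : -1 (all ones) or 0 */
    int t = mask & 10;    /* andl $10  : 10 or 0            */
    return t - 5;         /* subl $5   : 5 or -5            */
}

/* Hypothetical helper: checks that both models agree on a small
   grid of inputs. */
static void check_models(void)
{
    for (int a = -3; a <= 3; a++)
        for (int b = -3; b <= 3; b++)
            assert(select_cmov(a, b) == select_arith(a, b));
}
```

Both functions return the same values; the second one just makes the
serial setle/negl/andl/subl chain visible.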

Thanks,
Ilya
