On Mon, Jul 26, 2021 at 1:27 PM Roger Sayle <ro...@nextmovesoftware.com> wrote:
>
>
> The following patch to the x86_64 backend improves the code generated
> for a decrement followed by a conditional move.  The primary change is
> to recognize that, after subtracting one, checking whether the result
> is -1 (or equivalently whether the original value was zero) can be
> implemented using the borrow/carry flag instead of requiring an
> explicit test instruction.
> This is achieved by a new define_insn_and_split that allows combine to
> split the desired sequence/composite into a *subsi_3 and *movsicc_noc.
>
> The other change in this patch is a pair of peephole2 optimizations
> that eliminate register-to-register moves generated during register
> allocation.  During reload, the compiler doesn't know that inverting
> the condition of a conditional move can sometimes reduce register
> pressure, but this is easy to tidy up during the peephole2 pass (where
> swapping the order of the insn's operands performs the required
> logic inversion).
>
> Both improvements are demonstrated by the case below:
>
> int foo(int x) {
>   if (x == 0)
>     x = 16;
>   else x--;
>   return x;
> }
>
> Before:
> foo:    leal    -1(%rdi), %eax
>         testl   %edi, %edi
>         movl    $16, %edx
>         cmove   %edx, %eax
>         ret
>
> After:
> foo:    subl    $1, %edi
>         movl    $16, %eax
>         cmovnc  %edi, %eax
>         ret
>
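
The "After" sequence for foo works because `subl $1` sets the carry (borrow) flag exactly when the original value was zero. A minimal C sketch of that reasoning follows; `borrow_after_dec` and `foo_via_borrow` are hypothetical names used only for illustration and are not part of the patch.

```c
#include <stdint.h>

/* Hypothetical helper: models the carry (borrow) flag that
   `sub $1, x` would produce, and stores the decremented value.  */
static int borrow_after_dec(uint32_t x, uint32_t *result)
{
    *result = x - 1u;   /* wraps to 0xFFFFFFFF when x == 0 */
    return x < 1u;      /* unsigned borrow occurs only when x == 0 */
}

/* foo() from the description, written the way the new sequence
   computes it: start from 16, keep x - 1 when there was no borrow
   (the cmovnc case).  The cast back to int assumes GCC's wrapping
   conversion for out-of-range values.  */
static int foo_via_borrow(int x)
{
    uint32_t dec;
    int borrow = borrow_after_dec((uint32_t)x, &dec);
    return borrow ? 16 : (int)dec;
}
```

No separate test of the original value is needed, since the borrow already encodes `x == 0`.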
> And the value of the peephole2 clean-up can be seen on its own in:
>
> int bar(int x) {
>   x--;
>   if (x == 0)
>     x = 16;
>   return x;
> }
>
> Before:
> bar:    movl    %edi, %eax
>         movl    $16, %edx
>         subl    $1, %eax
>         cmove   %edx, %eax
>         ret
>
> After:
> bar:    subl    $1, %edi
>         movl    $16, %eax
>         cmovne  %edi, %eax
>         ret
>
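
The two bar sequences above compute the same value; the peephole2 only inverts the condition and swaps which register holds the constant, so the initial copy of x disappears. A rough C model of both forms is sketched below, with variables named after the registers in the listings; `bar_cmove` and `bar_cmovne` are hypothetical names for illustration only.

```c
/* Models the "Before" code: copy x, decrement the copy, then
   overwrite it with 16 when the result is zero (cmove).  */
static int bar_cmove(int x)
{
    int eax = x;        /* movl  %edi, %eax  (the extra move) */
    int edx = 16;       /* movl  $16, %edx */
    eax -= 1;           /* subl  $1, %eax */
    if (eax == 0)       /* cmove %edx, %eax */
        eax = edx;
    return eax;
}

/* Models the "After" code: decrement x in place, start from 16,
   and take the decremented value when it is non-zero (cmovne).  */
static int bar_cmovne(int x)
{
    int edi = x - 1;    /* subl   $1, %edi */
    int eax = 16;       /* movl   $16, %eax */
    if (edi != 0)       /* cmovne %edi, %eax */
        eax = edi;
    return eax;
}
```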
> These idioms were inspired by the source code of NIST SciMark4's
> Random_nextDouble function, where the tweaks above result in
> a ~1% improvement in the MonteCarlo benchmark kernel.
>
> This patch has been tested on x86_64-pc-linux-gnu with a
> "make bootstrap" and "make -k check" with no new failures.
>
> Ok for mainline?
>
>
> 2021-07-26  Roger Sayle  <ro...@nextmovesoftware.com>
>
> gcc/ChangeLog
>         * config/i386/i386.md (*dec_cmov<mode>): New define_insn_and_split
>         to generate a conditional move using the carry flag after sub $1.
>         (peephole2): Eliminate a register-to-register move by inverting
>         the condition of a conditional move.
>
> gcc/testsuite/ChangeLog
>         * gcc.target/i386/dec-cmov-1.c: New test.
>         * gcc.target/i386/dec-cmov-2.c: New test.

Please also allow ia32 in the testcases: wrap the 64-bit-specific
(long long) tests in #ifdef __x86_64__ and add:

/* { dg-additional-options "-march=pentiumpro -mregparm=3" { target ia32 } } */

(cmov generation uses ancient ix86_arch_features, it gets enabled by
using -march=pentiumpro).
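
For illustration, a testcase laid out along these lines might look roughly like the sketch below. The actual contents of dec-cmov-1.c are not shown in this thread, so the `dg-do` and `dg-options` lines, the function bodies, and the name `foo_ll` are assumptions; only the `dg-additional-options` line is taken from the request above.

```c
/* { dg-do compile } */
/* { dg-options "-O2" } */
/* { dg-additional-options "-march=pentiumpro -mregparm=3" { target ia32 } } */

/* Assumed layout, not the committed test: the 32-bit case is
   compiled on both x86_64 and ia32.  */
int foo (int x)
{
  if (x == 0)
    x = 16;
  else
    x--;
  return x;
}

/* The long long variant is 64-bit specific, so it is guarded.
   foo_ll is a hypothetical name.  */
#ifdef __x86_64__
long long foo_ll (long long x)
{
  if (x == 0)
    x = 16;
  else
    x--;
  return x;
}
#endif
```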

OK with the above change.

Thanks,
Uros.
