https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95424

--- Comment #7 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jeff Law <l...@gcc.gnu.org>:

https://gcc.gnu.org/g:c2b610e7c6c89fd422c5c31f01023bcddf3cf4a5

commit r12-6924-gc2b610e7c6c89fd422c5c31f01023bcddf3cf4a5
Author: Zhao Wei Liew <zhaoweil...@gmail.com>
Date:   Fri Jan 28 13:36:39 2022 -0500

    match.pd: Simplify 1 / X for integer X [PR95424]

    This patch implements an optimization for the following C++ code:

    int f(int x) {
        return 1 / x;
    }

    int f(unsigned int x) {
        return 1 / x;
    }

    Before this patch, x86-64 gcc -std=c++20 -O3 produces the following
    assembly:

    f(int):
        xor edx, edx
        mov eax, 1
        idiv edi
        ret
    f(unsigned int):
        xor edx, edx
        mov eax, 1
        div edi
        ret

    In comparison, clang++ -std=c++20 -O3 produces the following assembly:

    f(int):
        lea ecx, [rdi + 1]
        xor eax, eax
        cmp ecx, 3
        cmovb eax, edi
        ret
    f(unsigned int):
        xor eax, eax
        cmp edi, 1
        sete al
        ret

    Clang's output is more efficient as it avoids expensive div operations.

    With this patch, GCC now produces the following assembly:

    f(int):
        lea eax, [rdi + 1]
        cmp eax, 2
        mov eax, 0
        cmovbe eax, edi
        ret
    f(unsigned int):
        xor eax, eax
        cmp edi, 1
        sete al
        ret

    which is virtually identical to Clang's assembly output. Any slight
    differences in the output for f(int) are possibly related to a
    different missed optimization.

    v2: https://gcc.gnu.org/pipermail/gcc-patches/2022-January/587751.html
    Changes from v2:
    1. Refactor from using a switch statement to using the built-in
    if-else statement.

    v1: https://gcc.gnu.org/pipermail/gcc-patches/2022-January/587634.html
    Changes from v1:
    1. Refactor common if conditions.
    2. Use build_[minus_]one_cst (type) to get -1/1 of the correct type.
    3. Match only for TRUNC_DIV_EXPR and TYPE_PRECISION (type) > 1.

    gcc/ChangeLog:

            PR tree-optimization/95424
            * match.pd: Simplify 1 / X where X is an integer.
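
The identity behind the new assembly can be checked at the source level. The following is a minimal sketch, not the match.pd pattern itself: the helper names recip_signed and recip_unsigned are hypothetical, and the branch-free forms are read off the assembly shown in the commit message. For a signed x, 1 / x is x itself when x is -1 or 1 and 0 for every other nonzero value; for an unsigned x, it is 1 only when x is 1. The original expression is undefined for x == 0; the helpers below happen to return 0 in that case.

```cpp
#include <climits>

// Hypothetical helper: branch-free equivalent of 1 / x for signed x != 0.
// Converting to unsigned and adding 1 maps -1, 0, 1 to 0, 1, 2; every
// other value lands above 2, so a single unsigned compare covers the range.
constexpr int recip_signed(int x) {
    return (static_cast<unsigned>(x) + 1u <= 2u) ? x : 0;
}

// Hypothetical helper: equivalent of 1 / x for unsigned x != 0.
constexpr unsigned recip_unsigned(unsigned x) {
    return x == 1u ? 1u : 0u;
}

// Compile-time spot checks against the real division.
static_assert(recip_signed(1) == 1 / 1, "x = 1");
static_assert(recip_signed(-1) == 1 / -1, "x = -1");
static_assert(recip_signed(2) == 1 / 2, "x = 2");
static_assert(recip_signed(-2) == 1 / -2, "x = -2");
static_assert(recip_signed(INT_MAX) == 1 / INT_MAX, "x = INT_MAX");
static_assert(recip_signed(INT_MIN) == 1 / INT_MIN, "x = INT_MIN");
static_assert(recip_unsigned(1u) == 1u / 1u, "x = 1u");
static_assert(recip_unsigned(2u) == 1u / 2u, "x = 2u");
static_assert(recip_unsigned(UINT_MAX) == 1u / UINT_MAX, "x = UINT_MAX");
```

The unsigned compare in recip_signed corresponds to the lea / cmp / cmovbe sequence in the post-patch output for f(int), and recip_unsigned corresponds to the cmp / sete sequence for f(unsigned int).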
