https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106076
Bug ID:           106076
Summary:          Sub-optimal code is generated for checking bitfields via proxy functions
Product:          gcc
Version:          13.0
Status:           UNCONFIRMED
Severity:         normal
Priority:         P3
Component:        tree-optimization
Assignee:         unassigned at gcc dot gnu.org
Reporter:         kyrylo.bohdanenko at gmail dot com
Target Milestone: ---

Consider the following struct:

    #include <cstdint>

    struct SomeClass {
        uint16_t dummy1 : 1;
        uint16_t cfg2   : 1;
        uint16_t cfg3   : 1;
        uint16_t dummy2 : 1;
        uint16_t dummy3 : 1;
        uint16_t dummy4 : 1;
        uint16_t cfg1   : 1;
        uint16_t dummy5 : 1;
        uint16_t cfg4   : 1;

        constexpr bool checkA() const { return cfg1 || cfg2 || cfg3; }
        constexpr bool checkB() const { return cfg4; }

        constexpr bool checkA_B() const { return (cfg1 || cfg2 || cfg3) || cfg4; }
        constexpr bool checkA_B_SLOW() const { return checkA() || checkB(); }
    };

For the following two functions, which do the same thing, GCC generates different assembly:

    bool check(const SomeClass& rt) { return rt.checkA_B(); }
    bool check_SLOW(const SomeClass& rt) { return rt.checkA_B_SLOW(); }

Compiled with:

    g++ -std=c++17 -S

The resulting assembly:

    ; demangled: check(SomeClass const&)
    _Z5checkRK9SomeClass:
            endbr64
            testw   $326, (%rdi)
            setne   %al
            ret

    ; demangled: check_SLOW(SomeClass const&)
    _Z10check_SLOWRK9SomeClass:
            endbr64
            movzwl  (%rdi), %edx
            movl    $1, %eax
            testb   $70, %dl
            jne     .L3
            movzbl  %dh, %eax
            andl    $1, %eax
    .L3:
            ret

As we can see, in check_SLOW GCC decided to test the flags byte by byte, introducing a conditional jump in between. It looks like GCC did not fully analyse the code after inlining checkA() and checkB(). FYI, Clang produces the first variant of the assembly for both functions.
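To confirm that the two assembly sequences test the same bits, the masks can be recomputed from the bitfield layout (a sketch; the bit-position constants below are derived by hand from the struct definition above, assuming the usual layout where dummy1 occupies bit 0 and successive fields occupy successive bits):

    #include <cassert>
    #include <cstdint>

    int main() {
        // Bit positions per the struct layout: cfg2 = bit 1, cfg3 = bit 2,
        // cfg1 = bit 6, cfg4 = bit 8.
        constexpr uint16_t cfg2 = 1u << 1;
        constexpr uint16_t cfg3 = 1u << 2;
        constexpr uint16_t cfg1 = 1u << 6;
        constexpr uint16_t cfg4 = 1u << 8;

        // Single-test version: testw $326, (%rdi) covers all four flags at once.
        constexpr uint16_t mask = cfg1 | cfg2 | cfg3 | cfg4;
        static_assert(mask == 326, "one 16-bit test suffices");

        // Split version: testb $70, %dl checks the low byte (cfg1|cfg2|cfg3),
        // then movzbl %dh / andl $1 checks cfg4 in the high byte.
        static_assert((mask & 0xFF) == 70, "low-byte mask");
        static_assert((mask >> 8) == 1, "high-byte mask");
        return 0;
    }

So both versions inspect exactly the same four bits; the slow version merely splits one 16-bit test into two byte tests joined by a branch.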