https://gcc.gnu.org/bugzilla/show_bug.cgi?id=125178

            Bug ID: 125178
           Summary: [RISC-V] Missed optimization: Null-byte detection
                    idiom not lowered to orc.b with Zbb
           Product: gcc
           Version: 16.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: [email protected]
  Target Milestone: ---
            Target: riscv
             Build: -O2 -march=rv64gc_zbb

The function below uses the classic null-byte detection idiom, which tests
whether any byte of a 64-bit word is zero. On RISC-V, when the Zbb extension
is available, this pattern can be implemented much more compactly using the
orc.b instruction. However, GCC currently fails to recognize the idiom and
instead generates a long sequence that constructs both constants and then
performs the arithmetic.
The same logic is used by the has_zero function in glibc's string-fzb.h, and
a similar case is the has_zero function in Linux's word-at-a-time.h.

#include <stdbool.h>
#include <stdint.h>
bool has_zero_byte(uint64_t x) {
    return ((x - 0x0101010101010101ULL) & ~x & 0x8080808080808080ULL) != 0;
}

GCC trunk with -O2 -march=rv64gc_zbb generates:
        li      a4,-16842752
        addi    a4,a4,-257
        li      a3,2139062272
        addi    a3,a3,-129
        slli    a5,a4,32
        add     a4,a5,a4
        slli    a5,a3,32
        add     a4,a0,a4
        add     a5,a5,a3
        andn    a0,a4,a0
        xori    a5,a5,-1
        and     a0,a0,a5
        snez    a0,a0
        ret

I think the generated code should look like this instead:

has_zero_byte_asm:
        orc.b  a0, a0
        not    a0, a0
        snez   a0, a0
        ret
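
For reference, the equivalence claimed above can be checked with a small
portable sketch. It is only an illustration, not compiler code: orc_b_model
is a hypothetical software model of orc.b (each byte becomes 0xff if any of
its bits is set, 0x00 otherwise), and has_zero_byte_idiom and
has_zero_byte_orc are names made up here for the two formulations.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical software model of the Zbb orc.b instruction:
   every nonzero byte becomes 0xff, every zero byte stays 0x00. */
static uint64_t orc_b_model(uint64_t x) {
    uint64_t r = 0;
    for (int i = 0; i < 8; i++) {
        if ((x >> (8 * i)) & 0xff)
            r |= (uint64_t)0xff << (8 * i);
    }
    return r;
}

/* The idiom from the report. */
static bool has_zero_byte_idiom(uint64_t x) {
    return ((x - 0x0101010101010101ULL) & ~x & 0x8080808080808080ULL) != 0;
}

/* The proposed lowering: orc.b, not, snez. */
static bool has_zero_byte_orc(uint64_t x) {
    return ~orc_b_model(x) != 0;
}
```

The per-byte result of the idiom can differ from ~orc.b(x) in bytes above a
zero byte, because the subtraction borrows across byte boundaries, but the
final "!= 0" test agrees, which is all that this function needs.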
