https://gcc.gnu.org/bugzilla/show_bug.cgi?id=125178
Bug ID: 125178
Summary: [RISC-V] Missed optimization: Null-byte detection idiom not lowered to orc.b with Zbb
Product: gcc
Version: 16.0
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: [email protected]
Target Milestone: ---
Target: riscv
Build: -O2 -march=rv64gc_zbb
The word-at-a-time null-byte detection idiom tests whether any byte of a
64-bit word is zero. On RISC-V, when the Zbb extension is available, this
idiom can be implemented with a single orc.b instruction followed by an
inversion and a test against zero. However, GCC trunk does not recognize the
idiom and instead emits a long sequence that first materializes two 64-bit
constants and then performs the arithmetic literally.
The idiom is used in practice: the has_zero function in glibc's string-fzb.h
has the same logic, and the has_zero function in Linux's word-at-a-time.h is
a similar case.
#include <stdbool.h>
#include <stdint.h>
bool has_zero_byte(uint64_t x) {
    return ((x - 0x0101010101010101ULL) & ~x & 0x8080808080808080ULL) != 0;
}
GCC trunk with -O2 -march=rv64gc_zbb generates:
li a4,-16842752
addi a4,a4,-257
li a3,2139062272
addi a3,a3,-129
slli a5,a4,32
add a4,a5,a4
slli a5,a3,32
add a4,a0,a4
add a5,a5,a3
andn a0,a4,a0
xori a5,a5,-1
and a0,a0,a5
snez a0,a0
ret
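To make the constant construction in the listing above easier to follow, here is a rough C trace of the li/addi/slli/add steps, grouped by register dependency rather than in the emitted order. The function names (sub_constant, mask_constant, check_constants) are made up for this illustration and do not come from GCC; the trace only assumes ordinary 64-bit wraparound arithmetic.

```c
#include <assert.h>
#include <stdint.h>

/* Trace of the a4 chain: li, addi, slli, add. */
uint64_t sub_constant(void) {
    uint64_t a4 = (uint64_t)-16842752LL;  /* li   a4,-16842752 */
    a4 += (uint64_t)-257LL;               /* addi a4,a4,-257   */
    uint64_t a5 = a4 << 32;               /* slli a5,a4,32     */
    return a5 + a4;                       /* add  a4,a5,a4     */
}

/* Trace of the a3/a5 chain: li, addi, slli, add, xori. */
uint64_t mask_constant(void) {
    uint64_t a3 = 2139062272ULL;          /* li   a3,2139062272 */
    a3 += (uint64_t)-129LL;               /* addi a3,a3,-129    */
    uint64_t a5 = a3 << 32;               /* slli a5,a3,32      */
    a5 += a3;                             /* add  a5,a5,a3      */
    return ~a5;                           /* xori a5,a5,-1      */
}

/* Sanity check: the two chains rebuild the constants from the C source. */
void check_constants(void) {
    /* Adding this value is the same as subtracting 0x0101010101010101. */
    assert(sub_constant() == (uint64_t)0 - 0x0101010101010101ULL);
    assert(mask_constant() == 0x8080808080808080ULL);
}
```

So ten of the fourteen instructions only rebuild the two constants; the remaining add, andn, and, and snez perform the actual test.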
I think the generated code should look like the assembly below:
has_zero_byte_asm:
orc.b a0, a0
not a0, a0
snez a0, a0
ret
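As a sanity check on the proposed sequence, here is a minimal C sketch that compares the original idiom against a software model of orc.b followed by not and snez. The orc_b_model helper is a hypothetical stand-in written for this illustration (it assumes orc.b sets each byte to 0xff if any bit in that byte is set, and to 0x00 otherwise); it is not a GCC builtin, and the sample values are hand-picked rather than exhaustive.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* The idiom from this report, unchanged. */
bool has_zero_byte(uint64_t x) {
    return ((x - 0x0101010101010101ULL) & ~x & 0x8080808080808080ULL) != 0;
}

/* Hypothetical software model of orc.b: every non-zero byte becomes 0xff,
   every zero byte stays 0x00. */
uint64_t orc_b_model(uint64_t x) {
    uint64_t r = 0;
    for (int i = 0; i < 8; i++) {
        if ((x >> (8 * i)) & 0xff)
            r |= (uint64_t)0xff << (8 * i);
    }
    return r;
}

/* Mirrors the proposed sequence: orc.b, not, snez. */
bool has_zero_byte_orc(uint64_t x) {
    return ~orc_b_model(x) != 0;
}

/* Compare both versions on a few hand-picked inputs. */
void check_equivalence(void) {
    static const uint64_t samples[] = {
        0x0000000000000000ULL, 0x0101010101010101ULL,
        0xffffffffffffffffULL, 0x8080808080808080ULL,
        0x1122334455667700ULL, 0x0011223344556677ULL,
        0x1122330055667788ULL, 0x0000000000000100ULL,
    };
    for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++)
        assert(has_zero_byte(samples[i]) == has_zero_byte_orc(samples[i]));
}
```

After orc.b, a zero byte is the only kind of byte that is not 0xff, so the inverted word is non-zero exactly when the input contains a zero byte.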