https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103071
Bug ID: 103071 Summary: Missed optimization for symmetric subset: (a & b) == a || (a & b) == b Product: gcc Version: 11.2.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: rtl-optimization Assignee: unassigned at gcc dot gnu.org Reporter: steinar+gcc at gunderson dot no Target Milestone: --- This is a bit of a long shot, but I'll file it anyway :-) I have this function in a hot path (of course, in the real project, it's inlined): #include <stdbool.h> #include <stdint.h> void foo(); void bar(); void EitherIsSubset(uint64_t v0, uint64_t v1) { if ((v0 & v1) == v0 || (v0 & v1) == v1) { foo(); } else { bar(); } } It is intended to treat v0 and v1 as bit sets, and then test whether either v0 or v1 is a subset of each other (or that they are equal). (An equivalent formulation happens to be replacing & with |.) GCC compiles (with -O2, x86-64) this to: EitherIsSubset: movq %rdi, %rax andq %rsi, %rax cmpq %rsi, %rax je .L4 cmpq %rdi, %rax je .L4 xorl %eax, %eax jmp bar@PLT .L4: xorl %eax, %eax jmp foo@PLT This is pretty straight-forward, but feels like it's using two (relatively hard-to-predict) branches where it should be possible to deal with one. And indeed, GNU superopt (!) found this amazing sequence instead, with v0 in eax and v1 in edx (this is, of course, trivially portable to 64-bit): 14: mov %edx,%ecx or %eax,%edx cmp %edx,%eax sbb %ebx,%ebx sbb %ecx,%edx adc $1,%ebx I can't claim to understand fully what it does, but after this, ebx contains either 0 or 1 with the right answer, and one would assume that after this, the zero flag is also usable to branch on (leaving us with one branch instead of two, in all). Is it possible to teach GCC this sequence? I tried using it as inline assembler, and while it works, it seems it becomes suboptimal and slower, because I can't return a condition code (so I get a redundant test): inline bool EitherIsSubsetAsm(uint64_t v0, uint64_t v1) { uint64_t tmp = v0 | v1; bool result; asm("cmp %1, %2 ; sbb %0, %0 ; sbb %3, %1 ; adc $1, %0" : "=r"(result), "+&r"(tmp) : "r"(v0), "r"(v1) : "cc"); return result; } void EitherIsSubset(uint64_t v0, uint64_t v1) { if (EitherIsSubsetAsm(v0, v1)) { foo(); } else { bar(); } }