https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65871
Bug ID: 65871
Summary: bzhi builtin/intrinsic wrongly assumes bzhi
instruction doesn't set the ZF flag
Product: gcc
Version: unknown
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: jamrial at gmail dot com
unsigned foo(void);

int main(void)
{
    if (__builtin_ia32_bzhi_si(foo(), foo()))
        return 1;
    return 0;
}
Compiled with -mbmi2 -O3:
0000000000000000 <main>:
   0:   53                      push   rbx
   1:   e8 00 00 00 00          call   6 <main+0x6>
   6:   89 c3                   mov    ebx,eax
   8:   e8 00 00 00 00          call   d <main+0xd>
   d:   c4 e2 60 f5 c0          bzhi   eax,eax,ebx
  12:   85 c0                   test   eax,eax
  14:   0f 95 c0                setne  al
  17:   0f b6 c0                movzx  eax,al
  1a:   5b                      pop    rbx
  1b:   c3                      ret
GCC generates a redundant test instruction after the bzhi. According to
http://www.felixcloutier.com/x86/BZHI.html bzhi already sets the ZF flag based
on its result, so the setne could use the flags left by bzhi directly.
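For reference, the documented behaviour can be modelled in plain C. This is only a sketch based on my reading of the instruction description linked above; the names bzhi_result and bzhi32_model are made up for illustration and are not part of GCC or any header.

```c
#include <stdint.h>

/* Hypothetical helper type: the 32-bit result plus the ZF value
 * the instruction is documented to leave behind. */
struct bzhi_result {
    uint32_t value;
    int zf;
};

/* Software model of 32-bit bzhi (illustrative, not GCC's implementation). */
static struct bzhi_result bzhi32_model(uint32_t src, uint32_t index)
{
    struct bzhi_result r;
    uint32_t n = index & 0xFFu;   /* only bits 7:0 of the index are used */

    /* Clear bits n..31; for n >= 32 the source passes through unchanged. */
    r.value = (n < 32) ? (src & ((UINT32_C(1) << n) - 1)) : src;

    /* ZF reflects the result, which is why a following test is redundant. */
    r.zf = (r.value == 0);
    return r;
}
```

In this model, the condition in the test case above corresponds to zf == 0, which is what the test/setne pair recomputes.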
The same happens when using inline assembly instead of the builtin to generate
the bzhi instruction. Reproducible in all cases with GCC 4.9.2 and GCC 5.1.0.
I didn't test the 4.8 branch or trunk.
This aside, it would be nice if gcc could generate a bzhi instruction on its
own when it detects "X & ((1 << Y) - 1)" where Y is not a constant, as it
already does for several other bmi and tbm instructions, instead of requiring
the builtin (which is only available when targeting bmi2).
I can open a new bug report for that if needed.
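To make the requested pattern concrete, here is a minimal sketch of the kind of source I have in mind. The function name low_bits is just an example, and the expression is only well defined for y in the range 0..31, since shifting a 32-bit value by 32 or more is undefined in C.

```c
#include <stdint.h>

/* Hypothetical example: keep the low y bits of x, written without the
 * builtin. Valid only for y in 0..31 (larger shifts are undefined in C). */
static uint32_t low_bits(uint32_t x, unsigned y)
{
    return x & ((UINT32_C(1) << y) - 1);
}
```

For y in that range the result matches what bzhi computes for the same operands, so a single bzhi would presumably be enough when bmi2 is enabled.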