http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48634

           Summary: Missed optimization for use of __builtin_ctzll() and
                    __builtin_clzll
           Product: gcc
           Version: 4.6.0
            Status: UNCONFIRMED
          Severity: enhancement
          Priority: P3
         Component: target
        AssignedTo: unassig...@gcc.gnu.org
        ReportedBy: svfue...@gmail.com
            Target: amd64


unsigned long long foo(unsigned long long x)
{
    return __builtin_ctzll(x);
}

compiles into

bsf    %rdi,%rax
cltq
retq

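at -O3 with GCC 4.6.0.

The cltq instruction isn't needed: for any nonzero input, bsf leaves a value
between 0 and 63 in rax, so the upper 32 bits are already zero and sign-extending
eax changes nothing.  (For a zero input the result of the builtin is undefined
anyway.)  Basically, the return value of these intrinsics should be unsigned
long long instead of int on 64-bit machines.  Under the amd64 ABI, the reverse
process of truncating back down to an int costs zero instructions, so callers
that want an int would lose nothing.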
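
The value argument above can be checked with a small sketch. The helper names
(ctz_sign_extended, ctz_zero_extended, ctz_truncated, check_all_single_bits)
are made up for illustration and are not part of GCC. It only shows that the
different conversions agree on the value for nonzero inputs; it does not show
what code any particular GCC version emits for them.

```c
#include <assert.h>

/* Form from the report: the int result of the builtin is converted to a
 * 64-bit return type, which semantically is a sign extension. */
static unsigned long long ctz_sign_extended(unsigned long long x)
{
    return __builtin_ctzll(x);
}

/* Hypothetical variant: go through unsigned int, so the widening is a
 * zero extension instead. */
static unsigned long long ctz_zero_extended(unsigned long long x)
{
    return (unsigned int)__builtin_ctzll(x);
}

/* Truncating the 64-bit result back down to int keeps the value, because
 * it always fits in 0..63 for nonzero x. */
static int ctz_truncated(unsigned long long x)
{
    return (int)ctz_zero_extended(x);
}

/* Check every single-bit input. Zero is skipped on purpose: both builtins
 * are undefined for it. */
static void check_all_single_bits(void)
{
    for (int bit = 0; bit < 64; ++bit) {
        unsigned long long x = 1ULL << bit;

        assert(ctz_sign_extended(x) == (unsigned long long)bit);
        assert(ctz_zero_extended(x) == ctz_sign_extended(x));
        assert(ctz_truncated(x) == bit);

        /* The same range argument applies to __builtin_clzll. */
        assert(__builtin_clzll(x) == 63 - bit);
    }
}
```

Calling check_all_single_bits() from a test program should pass silently,
since the result never has its sign bit set and both extensions therefore
produce the same 64-bit value.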
