https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84101

--- Comment #30 from Chris Hall <gcc at gmch dot uk> ---
godbolt shows that gcc v9.1 -O3 generates:

   0000: 8d 04 3f   lea    (%rdi,%rdi,1),%eax
   0003: d1 ff      sar    $1,%edi
   0005: 48 98      cltq
   0007: 48 63 d7   movslq %edi,%rdx
   000A: c3         ret
   000B:
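
For reference, the v9.1 listing above amounts to roughly the following C. This is a sketch reconstructed from the disassembly, not the test case attached to the PR: the names `pair_t`, `w0`, `w1` and `f` are made up here, and it assumes the x86-64 SysV convention of returning a 16-byte struct of two integers in %rax and %rdx.

```c
#include <stdint.h>

/* Hypothetical names; inferred from the disassembly, not taken from the PR. */
typedef struct { int64_t w0; int64_t w1; } pair_t;

pair_t f(int num)
{
    pair_t p;
    /* lea (%rdi,%rdi,1),%eax ; cltq  -> low half, returned in %rax */
    p.w0 = num * 2;
    /* sar $1,%edi ; movslq %edi,%rdx -> high half, returned in %rdx.
       Right-shifting a negative int is implementation-defined in C;
       gcc uses an arithmetic shift, which is what sar does. */
    p.w1 = num >> 1;
    return p;
}
```

Both halves are computed in 32 bits and then sign-extended, which is why each one needs its own cltq or movslq.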

The newest earlier version available on godbolt is v8.5, which generates 47
bytes of code shuffling values to and from %xmm0.

gcc v13.3 generates the same code as v9.1.  I haven't tried all the intervening
versions, but all the ones I did try also gave the same.

BUT, v14.1 generates:

   0000: 8d 14 3f   lea    (%rdi,%rdi,1),%edx
   0003: d1 ff      sar    $1,%edi
   0005: 48 63 c7   movslq %edi,%rax
   0008: 48 63 d2   movslq %edx,%rdx
   000B: 48 92      xchg   %rax,%rdx
   000D: c3         ret
   000E:

The difference is clearly trivial (1 extra instruction and 3 extra bytes)...
but I don't see why it would choose %edx instead of %eax for the first
operation?
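
To make the comparison concrete, the two listings can be modelled instruction by instruction in C. This is only an illustrative sketch: `regs_t`, `seq_v9` and `seq_v14` are hypothetical names, and the narrowing casts rely on gcc's wrap-around behaviour for out-of-range conversions to a signed type.

```c
#include <stdint.h>

/* Hypothetical model of the two return registers. */
typedef struct { uint64_t rax, rdx; } regs_t;

/* Instruction sequence emitted by v9.1 through v13.3. */
regs_t seq_v9(int32_t edi)
{
    regs_t r;
    int32_t eax = (int32_t)((uint32_t)edi + (uint32_t)edi); /* lea (%rdi,%rdi,1),%eax */
    edi >>= 1;                                              /* sar $1,%edi */
    r.rax = (uint64_t)(int64_t)eax;                         /* cltq */
    r.rdx = (uint64_t)(int64_t)edi;                         /* movslq %edi,%rdx */
    return r;
}

/* Instruction sequence emitted by v14.1. */
regs_t seq_v14(int32_t edi)
{
    regs_t r;
    uint64_t tmp;
    int32_t edx = (int32_t)((uint32_t)edi + (uint32_t)edi); /* lea (%rdi,%rdi,1),%edx */
    edi >>= 1;                                              /* sar $1,%edi */
    r.rax = (uint64_t)(int64_t)edi;                         /* movslq %edi,%rax */
    r.rdx = (uint64_t)(int64_t)edx;                         /* movslq %edx,%rdx */
    tmp = r.rax; r.rax = r.rdx; r.rdx = tmp;                /* xchg %rax,%rdx */
    return r;
}
```

In this model the final xchg does nothing except undo the initial register choice: had the lea targeted %eax, the values would already be in the right return registers.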
