https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96034
Bug ID: 96034
Summary: missed optimization with extended registers
Product: gcc
Version: 9.1.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: sshannin at gmail dot com
Target Milestone: ---

Noticed in example for PR96009.

Consider this simple function:

    double bar(char i) {
        return i;
    }

Compiled with -O3, we get:

    movsbl %dil, %edi
    pxor %xmm0, %xmm0
    cvtsi2sdl %edi, %xmm0
    ret

But aren't the movsbl and pxor unnecessary? I think we should be able to just cvtsi2sd and then ret.

Interestingly, compiling with -Os instead of -O3 manages to remove the pxor:

    movsbl %dil, %edi
    cvtsi2sdl %edi, %xmm0
    ret

Which is one instruction better (unless -O3 is trying to keep the pxor for alignment?), but even here I think the movsbl could still go too.

The closest thing I can find is PR48701, in that it also doesn't seem to understand which registers are the same.

seth@fr-dev3:$ /toolchain14/bin/gcc -v
Using built-in specs.
COLLECT_GCC=/toolchain14/bin/gcc
COLLECT_LTO_WRAPPER=/toolchain14/libexec/gcc/x86_64-pc-linux-gnu/9.1.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ../gcc_9_1_0/configure --prefix=/toolchain14 --enable-languages=c,c++,fortran --enable-lto --disable-plugin --program-suffix=-9.1.0 --disable-multilib
Thread model: posix
gcc version 9.1.0 (GCC)
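
For anyone who wants to reproduce the comparison, here is a minimal sketch of a standalone test file. Only bar() comes from the report; bar_int() is a hypothetical variant added here so the codegen for a plain int argument (where no widening from char is involved) can be inspected side by side. It makes no claim about what the output should look like on other GCC versions.

```c
/* Function from the report: a char argument converted to double. */
double bar(char i) {
    return i;
}

/* Hypothetical comparison function (not in the report): the same
 * conversion starting from an int, so no char-to-int widening is needed. */
double bar_int(int i) {
    return i;
}
```

Assuming the file is saved as test.c (an arbitrary name), running `gcc -O3 -S test.c` and `gcc -Os -S test.c` writes the assembly to test.s, which can then be compared against the listings in the report.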