https://gcc.gnu.org/bugzilla/show_bug.cgi?id=50918
Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization

--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Since GCC 9 for test_cst, the following is produced:

        movdqa  a(%rip), %xmm0
        psllw   $3, %xmm0
        movaps  %xmm0, r(%rip)
        ret

For test_var, GCC produces the following on the trunk:

test_var:
        movdqa  a(%rip), %xmm0
        movslq  %edi, %rax
        movq    %rax, %xmm2
        pmovsxwd        %xmm0, %xmm1
        psrldq  $8, %xmm0
        pmovsxwd        %xmm0, %xmm0
        pslld   %xmm2, %xmm1
        pslld   %xmm2, %xmm0
        movdqa  .LC0(%rip), %xmm2
        pand    %xmm2, %xmm1
        pand    %xmm0, %xmm2
        movdqa  %xmm1, %xmm0
        packusdw        %xmm2, %xmm0
        movaps  %xmm0, r(%rip)
        ret
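
For readers without the original testcase at hand, the two functions might look roughly like the sketch below. This is an assumption, not the testcase from the report: the element type (signed 16-bit) and the length of 8 are inferred from the psllw/pmovsxwd instructions and the single 128-bit load in the listings, and the initial values are made up. The names a, r, test_cst and test_var are taken from the assembly. Presumably the missed optimization is that test_var widens to 32-bit lanes, shifts with pslld, masks and repacks, even though psllw also accepts a shift count held in an XMM register.

```c
#include <stdint.h>

#define N 8  /* assumed: 8 x 16-bit elements fill one 128-bit vector */

/* Globals named after the symbols a and r in the assembly listings.
   The initial values are hypothetical. */
int16_t a[N] = {1, 2, 3, 4, 5, 6, 7, 8};
int16_t r[N];

/* Constant shift count: per the comment, GCC 9 and later emit a
   single psllw $3 for a loop of this shape. */
void test_cst(void)
{
    for (int i = 0; i < N; i++)
        r[i] = (int16_t)(a[i] << 3);
}

/* Variable shift count: per the comment, trunk widens to 32-bit
   lanes and uses pslld instead of a single psllw. */
void test_var(int n)
{
    for (int i = 0; i < N; i++)
        r[i] = (int16_t)(a[i] << n);
}
```

Compiling a file like this with optimization and vectorization enabled and inspecting the generated assembly should show whether a given GCC version still produces the longer sequence for test_var.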