https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114514

--- Comment #5 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by hongtao Liu <liuho...@gcc.gnu.org>:

https://gcc.gnu.org/g:090714e6cf8029f4ff8883dce687200024adbaeb

commit r15-530-g090714e6cf8029f4ff8883dce687200024adbaeb
Author: liuhongt <hongtao....@intel.com>
Date:   Wed May 15 10:56:24 2024 +0800

    Set d.one_operand_p to true when TARGET_SSSE3 in ix86_expand_vecop_qihi_partial.

    pshufb is available under TARGET_SSSE3, so
    ix86_expand_vec_perm_const_1 must return true when TARGET_SSSE3.

    With the patch, under -march=x86-64-v2 (< is with the patch, > is without):

    v8qi
    foo (v8qi a)
    {
      return a >> 5;
    }

    <       pmovsxbw        %xmm0, %xmm0
    <       psraw   $5, %xmm0
    <       pshufb  .LC0(%rip), %xmm0

            vs.

    >       movdqa  %xmm0, %xmm1
    >       pcmpeqd %xmm0, %xmm0
    >       pmovsxbw        %xmm1, %xmm1
    >       psrlw   $8, %xmm0
    >       psraw   $5, %xmm1
    >       pand    %xmm1, %xmm0
    >       packuswb        %xmm0, %xmm0

    Although the new sequence loads from the constant pool, it should be
    better inside a loop, where the load can be hoisted out. The
    remaining comparison is 1 instruction vs. 4 instructions:

    <       pshufb  .LC0(%rip), %xmm0

    vs.

    >       pcmpeqd %xmm0, %xmm0
    >       psrlw   $8, %xmm0
    >       pand    %xmm1, %xmm0
    >       packuswb        %xmm0, %xmm0

    gcc/ChangeLog:

            PR target/114514
            * config/i386/i386-expand.cc (ix86_expand_vecop_qihi_partial):
            Set d.one_operand_p to true when TARGET_SSSE3.

    gcc/testsuite/ChangeLog:

            * gcc.target/i386/pr114514-shufb.c: New test.
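
The loop case mentioned in the commit message can be sketched roughly as follows. This is only an illustration, not the test added by the patch: `foo` is copied from the commit message, while `shift_buffer` is a hypothetical wrapper introduced here. Whether the compiler actually hoists the pshufb mask load out of the loop depends on the GCC version and options used.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Vector of 8 signed chars, as in the commit message (GNU vector extension). */
typedef signed char v8qi __attribute__ ((vector_size (8)));

/* From the commit message: element-wise arithmetic right shift by 5. */
v8qi
foo (v8qi a)
{
  return a >> 5;
}

/* Hypothetical wrapper (not part of the patch): shifts a buffer in place,
   8 bytes per iteration.  Any constant operand needed by the vector shift
   is loop-invariant, so the compiler is free to load it once before the
   loop.  */
void
shift_buffer (signed char *buf, size_t n)
{
  assert (n % 8 == 0);  /* Sketch only handles whole vectors. */
  for (size_t i = 0; i < n; i += 8)
    {
      v8qi v;
      memcpy (&v, buf + i, sizeof v);   /* Unaligned-safe load. */
      v = foo (v);
      memcpy (buf + i, &v, sizeof v);   /* Unaligned-safe store. */
    }
}
```

Compiling this with something like `gcc -O2 -march=x86-64-v2 -S` and reading the loop body should show which of the two sequences is generated.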
