https://gcc.gnu.org/bugzilla/show_bug.cgi?id=125357

--- Comment #5 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jakub Jelinek <[email protected]>:

https://gcc.gnu.org/g:daf225605e85d17f5c6c1a205918e44b3b1eccba

commit r17-593-gdaf225605e85d17f5c6c1a205918e44b3b1eccba
Author: Jakub Jelinek <[email protected]>
Date:   Tue May 19 10:11:08 2026 +0200

    i386: Use vpermilps for some non-const permutations [PR125357]

    We don't use vpermilps insn for V4S[IF]mode variable permutations on
    TARGET_AVX without TARGET_AVX512*.  For TARGET_AVX512* there are plenty
    of permutation instructions already.  For TARGET_AVX2, the function has
    special cases for one_operand_shuffle for V8SImode/V8SFmode and emits
    reasonable code, but for V4SImode/V4SFmode with TARGET_AVX2 it handles
    those using V8SImode/V8SFmode as a two-operand shuffle, which requires
    2 preparation instructions, vpermd, and one finalization instruction.
    And for !TARGET_AVX2 && TARGET_AVX we just emit terrible code for these.

    So, the following patch uses vpermilps for V4S[IF]mode one_operand_shuffle.

    Trying to handle V8S[IF]mode is not worth it: for TARGET_AVX2 we already
    emit good code (see above), and for !TARGET_AVX2 && TARGET_AVX a V8SImode
    mask is not a valid vector mode, so we emit terrible code no matter what.

    2026-05-19  Jakub Jelinek  <[email protected]>

            PR target/125357
            * config/i386/i386-expand.cc (ix86_expand_vec_perm): For
            one_operand_shuffle if TARGET_AVX and not TARGET_AVX512F use
            vpermilps for V4SImode/V4SFmode.  Formatting fix.

            * gcc.target/i386/avx-pr125357.c: New test.
            * gcc.target/i386/avx2-pr125357.c: New test.

    Reviewed-by: Hongtao Liu <[email protected]>
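
For illustration only, here is a minimal sketch of the kind of source this
commit is about: a one-operand permutation of a 4 x 32-bit vector whose
selector is not a compile-time constant.  It is not taken from the new tests;
the typedef and function names are hypothetical.  It uses GCC's generic vector
extension and the one-operand form of __builtin_shuffle, which takes each
selector element modulo the number of elements.  Whether a given compiler
emits vpermilps for it (for example with -O2 -mavx) was not verified here.

```c
/* Hypothetical names; GCC generic vector extension, 4 x 32-bit elements.  */
typedef int v4si __attribute__ ((vector_size (16)));
typedef float v4sf __attribute__ ((vector_size (16)));

/* One-operand variable permutation: result[i] = x[sel[i] % 4].
   The selector is a run-time value, so this is a non-const permutation
   in V4SImode.  */
v4si
permute_v4si (v4si x, v4si sel)
{
  return __builtin_shuffle (x, sel);
}

/* Same operation in V4SFmode; the selector is still an integer vector
   with the same element width and count as the data vector.  */
v4sf
permute_v4sf (v4sf x, v4si sel)
{
  return __builtin_shuffle (x, sel);
}
```

Comparing the assembly from -O2 -mavx and -O2 -mavx2 before and after the
commit would be one way to see the effect described in the log.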
