https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110170

--- Comment #21 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The releases/gcc-11 branch has been updated by hongtao Liu
<liuho...@gcc.gnu.org>:

https://gcc.gnu.org/g:0d005deb6c8a956b4f7ccb6e70e8e7830a40fed9

commit r11-11065-g0d005deb6c8a956b4f7ccb6e70e8e7830a40fed9
Author: liuhongt <hongtao....@intel.com>
Date:   Wed Jul 5 13:45:11 2023 +0800

    Slightly disparage the alternatives that move DFmode between SSE_REGS
    and GENERAL_REGS.

    For the testcase

    void __cond_swap(double* __x, double* __y) {
      bool __r = (*__x < *__y);
      auto __tmp = __r ? *__x : *__y;
      *__y = __r ? *__y : *__x;
      *__x = __tmp;
    }

    GCC-14 with -O2 and -march=x86-64 options generates the following code:

    __cond_swap(double*, double*):
            movsd   xmm1, QWORD PTR [rdi]
            movsd   xmm0, QWORD PTR [rsi]
            comisd  xmm0, xmm1
            jbe     .L2
            movq    rax, xmm1
            movapd  xmm1, xmm0
            movq    xmm0, rax
    .L2:
            movsd   QWORD PTR [rsi], xmm1
            movsd   QWORD PTR [rdi], xmm0
            ret

    rax is used to save and restore the DFmode value. During register
    allocation, GENERAL_REGS and SSE_REGS both cost zero because the
    alternatives in the movdf_internal pattern are not disparaged, so the
    register allocation order decides and GENERAL_REGS is allocated. The
    patch adds ? to alternatives (r,v) and (v,r), as was already done for
    the movsf/hf/bf_internal patterns. With that change we get optimal RA:

    __cond_swap:
    .LFB0:
            .cfi_startproc
            movsd   (%rdi), %xmm1
            movsd   (%rsi), %xmm0
            comisd  %xmm1, %xmm0
            jbe     .L2
            movapd  %xmm1, %xmm2
            movapd  %xmm0, %xmm1
            movapd  %xmm2, %xmm0
    .L2:
            movsd   %xmm1, (%rsi)
            movsd   %xmm0, (%rdi)
            ret

    gcc/ChangeLog:

            PR target/110170
            * config/i386/i386.md (movdf_internal): Slightly disparage the
            2 alternatives (r,v) and (v,r) by adding the constraint
            modifier '?'.

    gcc/testsuite/ChangeLog:

            * gcc.target/i386/pr110170-3.c: New test.

    (cherry picked from commit 37a231cc7594d12ba0822077018aad751a6fb94e)
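
The register-allocation reasoning in the commit message (two classes tied at cost zero, the tie broken by allocation order, and '?' breaking the tie) can be illustrated with a toy model. This is only a sketch under stated assumptions: `ClassCost`, `pick_class`, and the penalty value are hypothetical names and numbers chosen for illustration, and none of this is GCC's actual IRA/LRA code or data.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy model only: the struct, the function, and the penalty value are
// illustrative assumptions, not GCC's real data structures or numbers.
struct ClassCost {
    std::string reg_class;  // e.g. "GENERAL_REGS" or "SSE_REGS"
    int base_cost;          // cost before any disparagement
    int question_marks;     // number of '?' modifiers charged to this class
};

constexpr int kPenaltyPerQuestionMark = 1;  // assumed penalty per '?'

// Return the cheapest class. The input is in allocation order, and only a
// strictly lower cost replaces the current best, so ties go to the earlier
// entry.
std::string pick_class(const std::vector<ClassCost>& in_alloc_order) {
    const ClassCost* best = nullptr;
    int best_cost = 0;
    for (const ClassCost& c : in_alloc_order) {
        const int cost =
            c.base_cost + c.question_marks * kPenaltyPerQuestionMark;
        if (best == nullptr || cost < best_cost) {
            best = &c;
            best_cost = cost;
        }
    }
    return best ? best->reg_class : std::string();
}
```

With both classes at cost zero and GENERAL_REGS first in the order, `pick_class` returns "GENERAL_REGS", which mirrors the rax spill in the first listing. Charging one '?' to GENERAL_REGS makes it return "SSE_REGS", which mirrors the all-xmm swap in the second listing.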
