Hi!

On Tue, Feb 04, 2020 at 11:16:06AM +0100, Uros Bizjak wrote:
> I guess that Comment #9 patch from the PR should be trivially correct,
> but although it looks obvious, I don't want to propose the patch since
> I have no means of testing it.

I don't have means of testing it either.
https://docs.microsoft.com/en-us/cpp/build/x64-calling-convention?view=vs-2019
is quite explicit that [xyz]mm16-31 are call clobbered and only xmm6-15 (low
128-bits only) are call preserved.
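
As a rough illustration of that rule, here is a minimal sketch in C.  The
helper name ms_x64_vec_callee_saved_p is hypothetical (it is not a GCC
function), and treating xmm0-5 as call clobbered is taken from the same
Microsoft document rather than from the text above.

```c
#include <stdbool.h>

/* Hypothetical helper, not part of GCC.  Returns true if the low 128 bits
   of vector register number REGNO (0..31) have to be preserved by the
   callee under the Microsoft x64 calling convention:
     xmm0-xmm5    call clobbered
     xmm6-xmm15   call preserved (low 128 bits only)
     xmm16-xmm31  call clobbered  */
static bool
ms_x64_vec_callee_saved_p (unsigned regno)
{
  return regno >= 6 && regno <= 15;
}
```

With this, ms_x64_vec_callee_saved_p (15) is true while
ms_x64_vec_callee_saved_p (18) is false.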

Jonathan, could you please test whether it is sufficient to just change
CALL_USED_REGISTERS, or whether e.g. something in the pro/epilogue needs
tweaking too?  Thanks.

We are talking e.g. about
/* { dg-options "-O2 -mabi=ms -mavx512vl" } */

typedef double V __attribute__((vector_size (16)));
void foo (void);
V bar (void);
void baz (V);
void
qux (void)
{
  V c;
  {
    register V a __asm ("xmm18");
    V b = bar ();
    asm ("" : "=x" (a) : "0" (b));
    c = a;
  }
  foo ();
  {
    register V d __asm ("xmm18");
    V e;
    d = c;
    asm ("" : "=x" (e) : "0" (d));
    baz (e);
  }
}
where, according to the MSDN doc, gcc incorrectly holds the c value
in the xmm18 register across the foo call; if foo is compiled by some
Microsoft compiler (or LLVM), then it could clobber %xmm18.
If all xmm18 occurrences are changed to, say, xmm15, then it is valid to hold
the 128-bit value across the foo call (though, surprisingly, LLVM saves it
to the stack anyway).
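
For readers unfamiliar with the 6 and 1 entries in the CALL_USED_REGISTERS
table touched by the patch, the following is a minimal sketch of how such an
entry can be decoded.  The bit assignments are an assumption based on the
comment above that macro in i386.h and should be checked against the header;
the enum and the call_used_p helper are hypothetical names, not GCC code.

```c
#include <stdbool.h>

/* Assumed meaning of the bits in a CALL_USED_REGISTERS entry; verify
   against the comment in i386.h before relying on it.  */
enum
{
  CALL_USED_ALWAYS   = 1 << 0,  /* call used under every ABI */
  CALL_USED_32BIT    = 1 << 1,  /* call used for 32-bit code */
  CALL_USED_64BIT    = 1 << 2,  /* call used for 64-bit non-ms-abi code */
  CALL_USED_64BIT_MS = 1 << 3   /* call used for the 64-bit ms-abi */
};

/* Hypothetical helper: returns true if a register whose table entry is
   ENTRY is call used under the ABI selected by ABI_BIT.  */
static bool
call_used_p (unsigned entry, unsigned abi_bit)
{
  return (entry & (CALL_USED_ALWAYS | abi_bit)) != 0;
}
```

Under these assumptions, call_used_p (6, CALL_USED_64BIT_MS) is false, i.e.
an entry of 6 makes the register call saved for the ms-abi, while
call_used_p (1, CALL_USED_64BIT_MS) is true.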

2020-02-04  Uroš Bizjak  <ubiz...@gmail.com>

        * config/i386/config/i386/i386.h (CALL_USED_REGISTERS): Make
        xmm16-xmm31 call-used even in 64-bit ms-abi.

--- gcc/config/i386/config/i386/i386.h.jj       2020-01-22 10:19:24.199221986 +0100
+++ gcc/config/i386/config/i386/i386.h  2020-02-04 12:09:12.338341003 +0100
@@ -1128,9 +1128,9 @@ extern const char *host_detect_local_cpu
 /*xmm8,xmm9,xmm10,xmm11,xmm12,xmm13,xmm14,xmm15*/              \
      6,   6,    6,    6,    6,    6,    6,    6,               \
 /*xmm16,xmm17,xmm18,xmm19,xmm20,xmm21,xmm22,xmm23*/            \
-     6,    6,     6,    6,    6,    6,    6,    6,             \
+     1,    1,     1,    1,    1,    1,    1,    1,             \
 /*xmm24,xmm25,xmm26,xmm27,xmm28,xmm29,xmm30,xmm31*/            \
-     6,    6,     6,    6,    6,    6,    6,    6,             \
+     1,    1,     1,    1,    1,    1,    1,    1,             \
  /* k0,  k1,  k2,  k3,  k4,  k5,  k6,  k7*/                    \
      1,   1,   1,   1,   1,   1,   1,   1 }
 


        Jakub
