https://gcc.gnu.org/bugzilla/show_bug.cgi?id=124349

--- Comment #8 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
(In reply to Zdenek Sojka from comment #5)
> I observe the same issue with vcvthf82ph:
> 
> a.s:16: Error: operand size mismatch for `vcvthf82ph'
> 
>         vcvthf82ph      xmm0{k1}, XMMWORD PTR [rsp+8]   # tmp125, tmp130, tmp125, v128s8_0      # 25    [c=18 l=16]  vcvthf82phv8hf_mask
> 
> (generated due to __builtin_ia32_vcvthf82ph128_mask)
> 
> I will open a full separate PR if needed, later.

The binutils testsuite shows that for this insn the memory operand qualifier
has to be half the size of the first (destination) operand, and that the
reg, reg forms are xmm1, xmm2; ymm1, xmm2; zmm1, ymm2:
avx10_2-256-cvt.s:      vcvthf82ph      xmm6{k7}, QWORD PTR [esp+esi*8+0x10000000]
avx10_2-256-cvt.s:      vcvthf82ph      xmm6, QWORD PTR [ecx]
avx10_2-256-cvt.s:      vcvthf82ph      xmm6, QWORD PTR [ecx+1016]
avx10_2-256-cvt.s:      vcvthf82ph      xmm6{k7}{z}, QWORD PTR [edx-1024]
avx10_2-256-cvt.s:      vcvthf82ph      ymm6{k7}, XMMWORD PTR [esp+esi*8+0x10000000]
avx10_2-256-cvt.s:      vcvthf82ph      ymm6, XMMWORD PTR [ecx]
avx10_2-256-cvt.s:      vcvthf82ph      ymm6, XMMWORD PTR [ecx+2032]
avx10_2-256-cvt.s:      vcvthf82ph      ymm6{k7}{z}, XMMWORD PTR [edx-2048]
avx10_2-512-cvt.s:      vcvthf82ph      zmm6{k7}, YMMWORD PTR [esp+esi*8+0x10000000]
avx10_2-512-cvt.s:      vcvthf82ph      zmm6, YMMWORD PTR [ecx]
avx10_2-512-cvt.s:      vcvthf82ph      zmm6, YMMWORD PTR [ecx+4064]
avx10_2-512-cvt.s:      vcvthf82ph      zmm6{k7}{z}, YMMWORD PTR [edx-4096]
x86-64-avx10_2-256-cvt.s:       vcvthf82ph      xmm30{k7}, QWORD PTR [rbp+r14*8+0x10000000]
x86-64-avx10_2-256-cvt.s:       vcvthf82ph      xmm30, QWORD PTR [r9]
x86-64-avx10_2-256-cvt.s:       vcvthf82ph      xmm30, QWORD PTR [rcx+1016]
x86-64-avx10_2-256-cvt.s:       vcvthf82ph      xmm30{k7}{z}, QWORD PTR [rdx-1024]
x86-64-avx10_2-256-cvt.s:       vcvthf82ph      ymm30{k7}, XMMWORD PTR [rbp+r14*8+0x10000000]
x86-64-avx10_2-256-cvt.s:       vcvthf82ph      ymm30, XMMWORD PTR [r9]
x86-64-avx10_2-256-cvt.s:       vcvthf82ph      ymm30, XMMWORD PTR [rcx+2032]
x86-64-avx10_2-256-cvt.s:       vcvthf82ph      ymm30{k7}{z}, XMMWORD PTR [rdx-2048]
x86-64-avx10_2-512-cvt.s:       vcvthf82ph      zmm30{k7}, YMMWORD PTR [rbp+r14*8+0x10000000]
x86-64-avx10_2-512-cvt.s:       vcvthf82ph      zmm30, YMMWORD PTR [r9]
x86-64-avx10_2-512-cvt.s:       vcvthf82ph      zmm30, YMMWORD PTR [rcx+4064]
x86-64-avx10_2-512-cvt.s:       vcvthf82ph      zmm30{k7}{z}, YMMWORD PTR [rdx-4096]
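
The size rule visible in those test lines can be written down as a small
lookup. This is only an illustrative Python sketch, not anything taken from
binutils or GCC: the names MEM_QUALIFIER and expected_mem_qualifier are
invented here, and the table is simply read off the test lines quoted above.

```python
# Illustrative only: Intel-syntax memory qualifier used for vcvthf82ph in the
# binutils test lines quoted above, keyed by destination register class.
MEM_QUALIFIER = {
    "xmm": "QWORD PTR",    # 128-bit destination, 64-bit memory source
    "ymm": "XMMWORD PTR",  # 256-bit destination, 128-bit memory source
    "zmm": "YMMWORD PTR",  # 512-bit destination, 256-bit memory source
}


def expected_mem_qualifier(dest_reg: str) -> str:
    """Return the qualifier for a destination such as 'xmm0' or 'zmm30'."""
    # Hypothetical helper: only the three-letter register class matters.
    return MEM_QUALIFIER[dest_reg[:3].lower()]


if __name__ == "__main__":
    for reg in ("xmm0", "ymm6", "zmm30"):
        print(reg, "->", expected_mem_qualifier(reg))
```

Under this table the xmm0{k1}, XMMWORD PTR [rsp+8] form from comment #5 is
the odd one out, which would be consistent with the assembler error.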

So I think we want something like
--- gcc/config/i386/sse.md.jj   2026-03-04 09:38:22.303804750 +0100
+++ gcc/config/i386/sse.md      2026-03-04 11:19:43.379897634 +0100
@@ -32450,6 +32450,9 @@
 (define_mode_attr ssebvecmode_2
   [(V8HF "V16QI") (V16HF "V16QI") (V32HF "V32QI")])

+(define_mode_attr iptrssebvec_2
+  [(V8HF "q") (V16HF "") (V32HF "")])
+
 (define_int_iterator UNSPEC_VCVTBIASPH2FP8_PACK
    [UNSPEC_VCVTBIASPH2BF8 UNSPEC_VCVTBIASPH2BF8S
     UNSPEC_VCVTBIASPH2HF8 UNSPEC_VCVTBIASPH2HF8S])
@@ -32626,7 +32629,7 @@
          [(match_operand:<ssebvecmode_2> 1 "nonimmediate_operand" "vm")]
          UNSPEC_VCVTHF82PH))]
   "TARGET_AVX10_2"
-  "vcvthf82ph\t{%1, %0<mask_operand2>|%0<mask_operand2>, %1}"
+  "vcvthf82ph\t{%1, %0<mask_operand2>|%0<mask_operand2>, %<iptrssebvec_2>1}"
   [(set_attr "prefix" "evex")])

 (define_int_iterator VPDPWPROD
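
To check the intent of the new mode attribute against the binutils
expectations, here is a rough Python model of the Intel-syntax memory
qualifier with and without the patch. The two mode attribute tables are copied
from the diff above. Everything else is an assumption made for illustration:
that an unmodified memory operand is printed with the size of its own mode
(V16QI as XMMWORD PTR, V32QI as YMMWORD PTR), that the q operand modifier
forces QWORD PTR, and the helper name intel_mem_qualifier itself.

```python
# Mode attributes copied from the diff above.
SSEBVECMODE_2 = {"V8HF": "V16QI", "V16HF": "V16QI", "V32HF": "V32QI"}
IPTRSSEBVEC_2 = {"V8HF": "q", "V16HF": "", "V32HF": ""}

# Assumption: qualifier printed for an unmodified memory operand of each mode.
MODE_PTR = {"V16QI": "XMMWORD PTR", "V32QI": "YMMWORD PTR"}
# Assumption: the 'q' operand modifier overrides the size with 64 bits.
MODIFIER_PTR = {"q": "QWORD PTR"}


def intel_mem_qualifier(dest_mode: str, patched: bool = True) -> str:
    """Model the qualifier printed for operand 1 (hypothetical helper)."""
    modifier = IPTRSSEBVEC_2[dest_mode] if patched else ""
    if modifier:
        return MODIFIER_PTR[modifier]
    # No modifier: fall back to the size of the operand's own mode.
    return MODE_PTR[SSEBVECMODE_2[dest_mode]]


if __name__ == "__main__":
    for mode in ("V8HF", "V16HF", "V32HF"):
        print(mode, intel_mem_qualifier(mode, patched=False),
              "->", intel_mem_qualifier(mode, patched=True))
```

In this model only the V8HF case changes, from XMMWORD PTR to QWORD PTR; the
V16HF and V32HF cases already match the binutils tests.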
