https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119357
Jakub Jelinek <jakub at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Keywords|ra |
Status|NEW |ASSIGNED
Assignee|unassigned at gcc dot gnu.org |jakub at gcc dot gnu.org
--- Comment #7 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
So, either we fix up the splitters so that they use an appropriate predicate:
--- gcc/config/i386/sse.md.jj 2025-02-08 08:54:24.070260101 +0100
+++ gcc/config/i386/sse.md 2025-03-18 19:45:24.041656689 +0100
@@ -22406,7 +22406,7 @@
[(set (reg:CCZ FLAGS_REG)
(compare:CCZ (unspec:SI
[(eq:VI1_AVX2
- (match_operand:VI1_AVX2 0 "vector_operand")
+ (match_operand:VI1_AVX2 0 "register_operand")
(match_operand:VI1_AVX2 1 "const0_operand"))]
UNSPEC_MOVMSK)
(match_operand 2 "const_int_operand")))]
@@ -22443,7 +22443,7 @@
(match_operand:VI1_AVX2 3 "vector_all_ones_operand")
(match_operand:VI1_AVX2 4 "const0_operand")
(unspec:<avx512fmaskmode>
- [(match_operand:VI1_AVX2 0 "vector_operand")
+ [(match_operand:VI1_AVX2 0 "register_operand")
(match_operand:VI1_AVX2 1 "const0_operand")
(const_int 0)]
UNSPEC_PCMP))]
because all the vptest instructions have one operand with the register_operand
predicate and the other with vector_operand, and the splitters use the same
operand for both.
Or, perhaps better, just force it into a REG:
--- gcc/config/i386/sse.md.jj 2025-02-08 08:54:24.070260101 +0100
+++ gcc/config/i386/sse.md 2025-03-18 19:58:46.603529373 +0100
@@ -22414,7 +22414,8 @@
[(set (reg:CCZ FLAGS_REG)
(unspec:CCZ [(match_dup 0)
(match_dup 0)]
- UNSPEC_PTEST))])
+ UNSPEC_PTEST))]
+ "operands[0] = force_reg (<MODE>mode, operands[0]);")
(define_insn_and_split "*pmovsk_mask_cmp_<mode>_avx512"
[(set (reg:CCZ FLAGS_REG)
@@ -22455,7 +22456,8 @@
[(set (reg:CCZ FLAGS_REG)
(unspec:CCZ [(match_dup 0)
(match_dup 0)]
- UNSPEC_PTEST))])
+ UNSPEC_PTEST))]
+ "operands[0] = force_reg (<MODE>mode, operands[0]);")
(define_expand "sse2_maskmovdqu"
[(set (match_operand:V16QI 0 "memory_operand")
The difference in the generated code for the testcase is:
- vpxor %xmm0, %xmm0, %xmm0
- vpcmpeqb (%rdi), %xmm0, %xmm0
- vpmovmskb %xmm0, %eax
- cmpl $65535, %eax
+ vmovdqa (%rdi), %xmm0
+ vptest %xmm0, %xmm0
(first patch vs. second), so I think I'll test the latter.
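
For readers who want to see why the two instruction sequences above compute the same condition, here is a minimal scalar sketch in C. It is not the testcase from the PR and it does not use real intrinsics. The function names (movmsk_eq_zero, ptest_zf, self_check) are hypothetical, and the vector is modelled as a plain 16-byte array. It only illustrates that "all 16 mask bits of a byte-wise compare against zero are set" holds exactly when the ZF result of vptest with both operands equal would be set.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model (assumption, for illustration only) of vpcmpeqb against
   a zero vector followed by vpmovmskb: bit i of the result is set when
   byte i of v is zero.  */
static unsigned movmsk_eq_zero(const uint8_t v[16])
{
    unsigned mask = 0;
    for (int i = 0; i < 16; i++)
        if (v[i] == 0)
            mask |= 1u << i;
    return mask;
}

/* Scalar model (assumption, for illustration only) of the ZF result of
   "vptest %xmm0, %xmm0": ZF is set when (v AND v) is all zeros, which
   is the same as v being all zeros.  */
static int ptest_zf(const uint8_t v[16])
{
    uint8_t acc = 0;
    for (int i = 0; i < 16; i++)
        acc |= (uint8_t)(v[i] & v[i]);
    return acc == 0;
}

/* Check that "mask == 0xFFFF" and "ZF set" agree on a few inputs.  */
static void self_check(void)
{
    uint8_t v[16] = {0};

    /* All-zero vector: every compare bit is set and ZF is set.  */
    assert(movmsk_eq_zero(v) == 0xFFFFu);
    assert(ptest_zf(v) == 1);

    /* A single non-zero byte clears one mask bit and clears ZF.  */
    for (int i = 0; i < 16; i++) {
        v[i] = 1;
        assert((movmsk_eq_zero(v) == 0xFFFFu) == ptest_zf(v));
        v[i] = 0;
    }
}
```

Calling self_check() from a small driver exercises both models. The real splitters of course operate on RTL, so this only mirrors the condition being tested, not the register-allocation issue itself.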