On Thu, Jul 10, 2025 at 9:44 PM Uros Bizjak <ubiz...@gmail.com> wrote: > > On Thu, Jul 10, 2025 at 2:31 PM Uros Bizjak <ubiz...@gmail.com> wrote: > > > > On Thu, Jul 10, 2025 at 1:57 PM H.J. Lu <hjl.to...@gmail.com> wrote: > > > > > > commit 77473a27bae04da99d6979d43e7bd0a8106f4557 > > > Author: H.J. Lu <hjl.to...@gmail.com> > > > Date: Thu Jun 26 06:08:51 2025 +0800 > > > > > > x86: Also handle all 1s float vector constant > > > > > > replaces > > > > > > (insn 29 28 30 5 (set (reg:V2SF 107) > > > (mem/u/c:V2SF (symbol_ref/u:DI ("*.LC0") [flags 0x2]) [0 S8 > > > A64])) 2031 > > > {*movv2sf_internal} > > > (expr_list:REG_EQUAL (const_vector:V2SF [ > > > (const_double:SF -QNaN [-QNaN]) repeated x2 > > > ]) > > > (nil))) > > > > > > with > > > > > > (insn 98 13 14 3 (set (reg:V8QI 112) > > > (const_vector:V8QI [ > > > (const_int -1 [0xffffffffffffffff]) repeated x8 > > > ])) -1 > > > (nil)) > > > ... > > > (insn 29 28 30 5 (set (reg:V2SF 107) > > > (subreg:V2SF (reg:V8QI 112) 0)) 2031 {*movv2sf_internal} > > > (expr_list:REG_EQUAL (const_vector:V2SF [ > > > (const_double:SF -QNaN [-QNaN]) repeated x2 > > > ]) > > > (nil))) > > > > > > which leads to > > > > > > pr121015.c: In function ‘render_result_from_bake_h’: > > > pr121015.c:34:1: error: unrecognizable insn: > > > 34 | } > > > | ^ > > > (insn 98 13 14 3 (set (reg:V8QI 112) > > > (const_vector:V8QI [ > > > (const_int -1 [0xffffffffffffffff]) repeated x8 > > > ])) -1 > > > (expr_list:REG_EQUIV (const_vector:V8QI [ > > > (const_int -1 [0xffffffffffffffff]) repeated x8 > > > ]) > > > (nil))) > > > during RTL pass: ira > > > > > > 1. Update constm1_operand to also return true for integer and float all > > > 1s vectors. > > > 2. Add nonimm_or_0_or_m1_operand for nonimmediate, zero or -1 operand. > > > 3. Add BI for constant all 0s/1s operand. > > > 4. Update "*mov<mode>_internal" in mmx.md to handle integer all 1s > > > vectors. > > > 5. Update MMXMODE move splitter to also split all 1s source operand. > > > > > > gcc/ > > > > > > PR target/121015 > > > * config/i386/constraints.md (BI): New constraint. > > > * config/i386/i386.cc (ix86_print_operand): Support CONSTM1_RTX. > > > * config/i386/mmx.md (*mov<mode>_internal): Replace C with BI > > > memory and integer register destination. > > > Update MMXMODE move splitter to also split all 1s source operand. > > > * config/i386/predicates.md (constm1_operand): Also return true > > > for int_float_vector_all_ones_operand. > > > (nonimm_or_0_or_m1_operand): New predicate. > > > > > > gcc/testsuite/ > > > > > > PR target/121015 > > > * gcc.target/i386/pr106022-2.c: Adjusted. > > > * gcc.target/i386/pr121015.c: New test. > > > > > > OK for master? > > > > +;; Match exactly -1. > > +(define_predicate "constm1_operand" > > + (ior (and (match_code "const_int") > > + (match_test "op == constm1_rtx")) > > + (match_operand 0 "int_float_vector_all_ones_operand"))) > > > > No, this predicate should not be repurposed to all-ones predicate.
I will change them to ;; Match vector 0 or integer vector -1. (define_predicate "vector_const0_or_m1_operand" (and (match_code "const_vector") (match_test "op == CONST0_RTX (GET_MODE (op)) || (INTEGRAL_MODE_P (GET_MODE (op)) && op == CONSTM1_RTX (GET_MODE (op)))"))) ; Return true when OP is a nonimmediate, vector 0 or integer vector -1. (define_predicate "nonimm_or_vector_const0_or_m1_operand" (ior (match_operand 0 "nonimmediate_operand") (match_operand 0 "vector_const0_or_m1_operand"))) > > > > For SSE we have a <sseconstm1> macro that defines different > > constraints for float and int moves, I think we should have the same > > approach for MMX. IMO, you also need to amend the <v,C> case. > > Assuming that problematic conversion only happens with SSE2+, IMO the > correct approach is to use nonimmediate_or_sse_const_operand predicate > in mov<MMXMODE:mode>_internal with: > > @@ -5448,6 +5448,7 @@ standard_sse_constant_p (rtx x, machine_mode pred_mode) > return 2; > break; > case 16: > + case 8: > if (TARGET_SSE2) > return 2; > break; > > and adding <sseconstm1> alternative after <v,C>. Something like: > > (define_insn "*mov<mode>_internal" > [(set (match_operand:MMXMODE 0 "nonimmediate_operand" > "=r ,o ,r,r ,m ,?!y,!y,?!y,m ,r ,?!y,v,v,v,v,m,r,v,!y,*x") > (match_operand:MMXMODE 1 "nonimmediate_or_sse_const_operand" > "rCo,rC,C,rm,rC,C ,!y,m ,?!y,?!y,r ,C,<sseconstm1>,v,m,v,v,r,*x,!y"))] 32-bit MMXMODE move splitter needs to support 8-byte all 1s vectors. > The predicate will only allow -1 with SSE2 (the new alternative should > also be enabled only with SSE2), and the register allocator will > always use "v" output for it, avoiding (-1) -> general reg -> xmm > sequence. Maybe also change mov<MMXMODE:mode>" expander to use > nonimmediate_or_sse_const_operand predicate > <sseconstm1> doesn't support 8-byte vector modes. We can use (define_constraint "BX" "@internal MMX vector constant all 0s/1s operand." (and (match_test "TARGET_MMX || TARGET_MMX_WITH_SSE") (match_operand 0 "vector_const0_or_m1_operand"))) and (define_insn "*mov<mode>_internal" [(set (match_operand:MMXMODE 0 "nonimmediate_operand" "=r ,o ,r ,r ,m ,?!y,!y,?!y,m ,r ,?!y,v ,v,v,m,r,v,!y,*x") (match_operand:MMXMODE 1 "nonimm_or_vector_const0_or_m1_operand" "rBXo,rBX,BX,rm,rBX,C ,!y,m ,?!y,?!y,r ,BX,v,m,v,v,r,*x,!y"))] ... (define_split [(set (match_operand:MMXMODE 0 "nonimmediate_gr_operand") (match_operand:MMXMODE 1 "vector_const0_or_m1_operand"))] "!TARGET_64BIT && reload_completed" [(const_int 0)] "ix86_split_long_move (operands); DONE;") -- H.J.