On Thu, Jul 10, 2025 at 9:44 PM Uros Bizjak <ubiz...@gmail.com> wrote:
>
> On Thu, Jul 10, 2025 at 2:31 PM Uros Bizjak <ubiz...@gmail.com> wrote:
> >
> > On Thu, Jul 10, 2025 at 1:57 PM H.J. Lu <hjl.to...@gmail.com> wrote:
> > >
> > > commit 77473a27bae04da99d6979d43e7bd0a8106f4557
> > > Author: H.J. Lu <hjl.to...@gmail.com>
> > > Date:   Thu Jun 26 06:08:51 2025 +0800
> > >
> > >     x86: Also handle all 1s float vector constant
> > >
> > > replaces
> > >
> > > (insn 29 28 30 5 (set (reg:V2SF 107)
> > >         (mem/u/c:V2SF (symbol_ref/u:DI ("*.LC0") [flags 0x2]) [0  S8 
> > > A64])) 2031
> > >  {*movv2sf_internal}
> > >      (expr_list:REG_EQUAL (const_vector:V2SF [
> > >                 (const_double:SF -QNaN [-QNaN]) repeated x2
> > >             ])
> > >         (nil)))
> > >
> > > with
> > >
> > > (insn 98 13 14 3 (set (reg:V8QI 112)
> > >         (const_vector:V8QI [
> > >                 (const_int -1 [0xffffffffffffffff]) repeated x8
> > >             ])) -1
> > >      (nil))
> > > ...
> > > (insn 29 28 30 5 (set (reg:V2SF 107)
> > >         (subreg:V2SF (reg:V8QI 112) 0)) 2031 {*movv2sf_internal}
> > >      (expr_list:REG_EQUAL (const_vector:V2SF [
> > >                 (const_double:SF -QNaN [-QNaN]) repeated x2
> > >             ])
> > >         (nil)))
> > >
> > > which leads to
> > >
> > > pr121015.c: In function ‘render_result_from_bake_h’:
> > > pr121015.c:34:1: error: unrecognizable insn:
> > >    34 | }
> > >       | ^
> > > (insn 98 13 14 3 (set (reg:V8QI 112)
> > >         (const_vector:V8QI [
> > >                 (const_int -1 [0xffffffffffffffff]) repeated x8
> > >             ])) -1
> > >      (expr_list:REG_EQUIV (const_vector:V8QI [
> > >                 (const_int -1 [0xffffffffffffffff]) repeated x8
> > >             ])
> > >         (nil)))
> > > during RTL pass: ira
> > >
> > > 1. Update constm1_operand to also return true for integer and float all
> > > 1s vectors.
> > > 2. Add nonimm_or_0_or_m1_operand for nonimmediate, zero or -1 operand.
> > > 3. Add BI for constant all 0s/1s operand.
> > > 4. Update "*mov<mode>_internal" in mmx.md to handle integer all 1s 
> > > vectors.
> > > 5. Update MMXMODE move splitter to also split all 1s source operand.
> > >
> > > gcc/
> > >
> > > PR target/121015
> > > * config/i386/constraints.md (BI): New constraint.
> > > * config/i386/i386.cc (ix86_print_operand): Support CONSTM1_RTX.
> > > * config/i386/mmx.md (*mov<mode>_internal): Replace C with BI
> > > memory and integer register destination.
> > > Update MMXMODE move splitter to also split all 1s source operand.
> > > * config/i386/predicates.md (constm1_operand): Also return true
> > > for int_float_vector_all_ones_operand.
> > > (nonimm_or_0_or_m1_operand): New predicate.
> > >
> > > gcc/testsuite/
> > >
> > > PR target/121015
> > > * gcc.target/i386/pr106022-2.c: Adjusted.
> > > * gcc.target/i386/pr121015.c: New test.
> > >
> > > OK for master?
> >
> > +;; Match exactly -1.
> > +(define_predicate "constm1_operand"
> > +  (ior (and (match_code "const_int")
> > +            (match_test "op == constm1_rtx"))
> > +       (match_operand 0 "int_float_vector_all_ones_operand")))
> >
> > No, this predicate should not be repurposed to all-ones predicate.

I will change them to

;; Match vector 0 or integer vector -1.
(define_predicate "vector_const0_or_m1_operand"
  (and (match_code "const_vector")
       (match_test "op == CONST0_RTX (GET_MODE (op))
                    || (INTEGRAL_MODE_P (GET_MODE (op))
                        && op == CONSTM1_RTX (GET_MODE (op)))")))

; Return true when OP is a nonimmediate, vector 0 or integer vector -1.
(define_predicate "nonimm_or_vector_const0_or_m1_operand"
  (ior (match_operand 0 "nonimmediate_operand")
       (match_operand 0 "vector_const0_or_m1_operand")))

> >
> > For SSE we have a <sseconstm1> macro that defines different
> > constraints for float and int moves, I think we should have the same
> > approach for MMX. IMO, you also need to amend the <v,C> case.
>
> Assuming that problematic conversion only happens with SSE2+, IMO the
> correct approach is to use nonimmediate_or_sse_const_operand predicate
> in mov<MMXMODE:mode>_internal with:
>
> @@ -5448,6 +5448,7 @@ standard_sse_constant_p (rtx x, machine_mode pred_mode)
>            return 2;
>          break;
>        case 16:
> +       case 8:
>          if (TARGET_SSE2)
>            return 2;
>          break;
>
> and adding <sseconstm1> alternative after <v,C>. Something like:
>
> (define_insn "*mov<mode>_internal"
>   [(set (match_operand:MMXMODE 0 "nonimmediate_operand"
>     "=r ,o ,r,r ,m ,?!y,!y,?!y,m  ,r  ,?!y,v,v,v,v,m,r,v,!y,*x")
>     (match_operand:MMXMODE 1 "nonimmediate_or_sse_const_operand"
>     "rCo,rC,C,rm,rC,C  ,!y,m  ,?!y,?!y,r  ,C,<sseconstm1>,v,m,v,v,r,*x,!y"))]

32-bit MMXMODE move splitter needs to support 8-byte all 1s vectors.

> The predicate will only allow -1 with SSE2 (the new alternative should
> also be enabled only with SSE2), and the register allocator will
> always use "v" output for it, avoiding (-1) -> general reg -> xmm
> sequence. Maybe also change mov<MMXMODE:mode>" expander to use
> nonimmediate_or_sse_const_operand predicate
>

<sseconstm1> doesn't support 8-byte vector modes.   We can use

(define_constraint "BX"
  "@internal MMX vector constant all 0s/1s operand."
  (and (match_test "TARGET_MMX || TARGET_MMX_WITH_SSE")
       (match_operand 0 "vector_const0_or_m1_operand")))

and

(define_insn "*mov<mode>_internal"
  [(set (match_operand:MMXMODE 0 "nonimmediate_operand"
    "=r  ,o  ,r ,r ,m  ,?!y,!y,?!y,m  ,r  ,?!y,v ,v,v,m,r,v,!y,*x")
    (match_operand:MMXMODE 1 "nonimm_or_vector_const0_or_m1_operand"
    "rBXo,rBX,BX,rm,rBX,C  ,!y,m  ,?!y,?!y,r  ,BX,v,m,v,v,r,*x,!y"))]
...
(define_split
  [(set (match_operand:MMXMODE 0 "nonimmediate_gr_operand")
    (match_operand:MMXMODE 1 "vector_const0_or_m1_operand"))]
  "!TARGET_64BIT && reload_completed"
  [(const_int 0)]
  "ix86_split_long_move (operands); DONE;")

-- 
H.J.

Reply via email to