Hi,

Comments inlined.

> >+;; The attribute gives half modes for vector modes.
> >+(define_mode_attr VHMODE
> >+  [(V8HI "V16QI")
> >+   (V4SI "V8HI")
> >+   (V2DI "V4SI")
> >+   (V2DF "V4SF")])
> >+
> >+;; The attribute gives double modes for vector modes.
> >+(define_mode_attr VDMODE
> >+  [(V4SI "V2DI")
> >+   (V8HI "V4SI")
> >+   (V16QI "V8HI")])
> 
> Presumably there is a reason why this is not a mirror of VHMODE. I.e. it does
> not have floating point modes?

This is a mistake. The floating point mode in VHMODE is never used. Removed.
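For reference, a sketch of the corrected attribute with the floating-point entry dropped:

```lisp
;; The attribute gives half modes for vector modes.
(define_mode_attr VHMODE
  [(V8HI "V16QI")
   (V4SI "V8HI")
   (V2DI "V4SI")])
```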

> >+;; The attribute gives half modes with same number of elements for vector
> modes.
> >+(define_mode_attr TRUNCMODE
> >+  [(V8HI "V8QI")
> >+   (V4SI "V4HI")
> >+   (V2DI "V2SI")])
> >+
> >+;; This attribute gives the mode of the result for "copy_s_b, copy_u_b" etc.
> >+(define_mode_attr RES
> >+  [(V2DF "DF")
> >+   (V4SF "SF")
> >+   (V2DI "DI")
> >+   (V4SI "SI")
> >+   (V8HI "SI")
> >+   (V16QI "SI")])
> 
> Perhaps prefix these with a 'V' to clarify they are vector mode attributes.

Done.
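For example, the result-mode attribute with the suggested prefix (the exact new name is an assumption):

```lisp
;; This attribute gives the mode of the result for "copy_s_b, copy_u_b" etc.
(define_mode_attr VRES
  [(V2DF "DF")
   (V4SF "SF")
   (V2DI "DI")
   (V4SI "SI")
   (V8HI "SI")
   (V16QI "SI")])
```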


> >+;; This attribute gives define_insn suffix for MSA instructions with need
> 
> with => that need

Fixed.

> >+;; distinction between integer and floating point.
> >+(define_mode_attr msafmt_f
> >+  [(V2DF "d_f")
> >+   (V4SF "w_f")
> >+   (V2DI "d")
> >+   (V4SI "w")
> >+   (V8HI "h")
> >+   (V16QI "b")])
> >+
> >+;; To represent bitmask needed for vec_merge using 
> >"const_<bitmask>_operand".
> 
> Commenting style is different here. Everything else starts with: This 
> attribute
> ...

Changed.
> 
> >+(define_mode_attr bitmask
> >+  [(V2DF "exp_2")
> >+   (V4SF "exp_4")
> >+   (V2DI "exp_2")
> >+   (V4SI "exp_4")
> >+   (V8HI "exp_8")
> >+   (V16QI "exp_16")])
> >+
> >+;; This attribute used to form an immediate operand constraint using
> 
> used to => is used to

Fixed.

> 
> >+;; "const_<bitimm>_operand".
> >+(define_mode_attr bitimm
> >+  [(V16QI "uimm3")
> >+   (V8HI  "uimm4")
> >+   (V4SI  "uimm5")
> >+   (V2DI  "uimm6")
> >+  ])
> >+
> 
> >+(define_expand "fixuns_trunc<FMSA:mode><mode_i>2"
> >+  [(set (match_operand:<VIMODE> 0 "register_operand" "=f")
> >+    (unsigned_fix:<VIMODE> (match_operand:FMSA 1 "register_operand" "f")))]
> >+  "ISA_HAS_MSA"
> >+{
> >+  emit_insn (gen_msa_ftrunc_u_<msafmt> (operands[0], operands[1]));
> >+  DONE;
> >+})
> 
> The msa_ftrunc_u_* define_insns should just be renamed to use the standard
> pattern names and, more importantly, standard RTL not UNSPEC.

Renamed and removed the define_expands. I also replaced FINT with VIMODE and
removed the FINT mode attribute to avoid duplication.
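Roughly, the renamed define_insn using standard RTL rather than an UNSPEC looks like this (the assembler template and insn attributes shown are assumptions):

```lisp
(define_insn "fix_trunc<FMSA:mode><mode_i>2"
  [(set (match_operand:<VIMODE> 0 "register_operand" "=f")
	(fix:<VIMODE> (match_operand:FMSA 1 "register_operand" "f")))]
  "ISA_HAS_MSA"
  "ftrunc_s.<msafmt>\t%w0,%w1"
  [(set_attr "type" "simd_fcvt")
   (set_attr "mode" "<MODE>")])
```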
> 
> >+
> >+(define_expand "fix_trunc<FMSA:mode><mode_i>2"
> >+  [(set (match_operand:<VIMODE> 0 "register_operand" "=f")
> >+    (fix:<VIMODE> (match_operand:FMSA 1 "register_operand" "f")))]
> >+  "ISA_HAS_MSA"
> >+{
> >+  emit_insn (gen_msa_ftrunc_s_<msafmt> (operands[0], operands[1]));
> >+  DONE;
> >+})
> 
> Likewise.
> 
> >+
> >+(define_expand "vec_pack_trunc_v2df"
> >+  [(set (match_operand:V4SF 0 "register_operand")
> >+    (vec_concat:V4SF
> >+      (float_truncate:V2SF (match_operand:V2DF 1 "register_operand"))
> >+      (float_truncate:V2SF (match_operand:V2DF 2 "register_operand"))))]
> >+  "ISA_HAS_MSA"
> >+  "")
> 
> Rename msa_fexdo_w to vec_pack_trunc_v2df.
> 
> I see that fexdo has a 'halfword' variant which creates a half-float. What
> else can operate on half-float and should this really be recorded as an
> HFmode?
> 
> >+(define_expand "vec_unpacks_hi_v4sf"
> >+  [(set (match_operand:V2DF 0 "register_operand" "=f")
> >+    (float_extend:V2DF
> >+      (vec_select:V2SF
> >+        (match_operand:V4SF 1 "register_operand" "f")
> >+        (parallel [(const_int 0) (const_int 1)])
> 
> If we swap the (parallel) for a match_operand 2...
> 
> >+      )))]
> >+  "ISA_HAS_MSA"
> >+{
> >+  if (BYTES_BIG_ENDIAN)
> >+    emit_insn (gen_msa_fexupr_d (operands[0], operands[1]));
> >+  else
> >+    emit_insn (gen_msa_fexupl_d (operands[0], operands[1]));
> 
> Then these two could change to set up operands[2] with either
> a parallel of 0/1 or 2/3 and then...
> 
> You could change the fexupr_d and fexupl_d insn patterns to use normal RTL
> that select the appropriate elements (either 0/1 and 2/3).
> 
> >+  DONE;
> 
> Which means the RTL in the pattern would be used to expand this and
> you would remove the DONE. As it stands the pattern on this expand
> is simply never used.

Reworked. The parallel expression for operands[2] is now generated.
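A sketch of the reworked expander; the helper name mips_msa_vec_parallel_const_half is hypothetical here, standing in for whatever generates the 0/1 or 2/3 parallel:

```lisp
(define_expand "vec_unpacks_hi_v4sf"
  [(set (match_operand:V2DF 0 "register_operand" "=f")
	(float_extend:V2DF
	  (vec_select:V2SF
	    (match_operand:V4SF 1 "register_operand" "f")
	    (match_operand 2 ""))))]
  "ISA_HAS_MSA"
{
  /* Select elements 0/1 or 2/3 depending on endianness; the RTL in the
     pattern is then used for the expansion, so there is no DONE.  */
  operands[2] = mips_msa_vec_parallel_const_half (V4SFmode,
						  !BYTES_BIG_ENDIAN);
})
```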

> 
> >+})
> >+
> >+(define_expand "vec_unpacks_lo_v4sf"
> >+  [(set (match_operand:V2DF 0 "register_operand" "=f")
> >+    (float_extend:V2DF
> >+      (vec_select:V2SF
> >+        (match_operand:V4SF 1 "register_operand" "f")
> >+        (parallel [(const_int 0) (const_int 1)])
> >+      )))]
> >+  "ISA_HAS_MSA"
> >+{
> >+  if (BYTES_BIG_ENDIAN)
> >+    emit_insn (gen_msa_fexupl_d (operands[0], operands[1]));
> >+  else
> >+    emit_insn (gen_msa_fexupr_d (operands[0], operands[1]));
> >+  DONE;
> >+})
> 
> Likewise but inverted.

As above.

> 
> >+
> >+(define_expand "vec_unpacks_hi_<mode>"
> >+  [(set (match_operand:<VDMODE> 0 "register_operand")
> >+    (match_operand:IMSA_WHB 1 "register_operand"))]
> 
> Not much point in having the (set) in here as it would be illegal anyway
> if it were actually expanded. Just list the two operands. (same throughout)
>

Done.
 
> >+  "ISA_HAS_MSA"
> >+{
> >+  mips_expand_vec_unpack (operands, false/*unsigned_p*/, true/*high_p*/);
> >+  DONE;
> >+})
> >+
> >+(define_expand "vec_unpacks_lo_<mode>"
> >+  [(set (match_operand:<VDMODE> 0 "register_operand")
> >+    (match_operand:IMSA_WHB 1 "register_operand"))]
> >+  "ISA_HAS_MSA"
> >+{
> >+  mips_expand_vec_unpack (operands, false/*unsigned_p*/, false/*high_p*/);
> >+  DONE;
> >+})
> >+
> >+(define_expand "vec_unpacku_hi_<mode>"
> >+  [(set (match_operand:<VDMODE> 0 "register_operand")
> >+    (match_operand:IMSA_WHB 1 "register_operand"))]
> >+  "ISA_HAS_MSA"
> >+{
> >+  mips_expand_vec_unpack (operands, true/*unsigned_p*/, true/*high_p*/);
> >+  DONE;
> >+})
> >+
> >+(define_expand "vec_unpacku_lo_<mode>"
> >+  [(set (match_operand:<VDMODE> 0 "register_operand")
> >+    (match_operand:IMSA_WHB 1 "register_operand"))]
> >+  "ISA_HAS_MSA"
> >+{
> >+  mips_expand_vec_unpack (operands, true/*unsigned_p*/, false/*high_p*/);
> >+  DONE;
> >+})
> >+
> 
> >+(define_expand "vec_set<mode>"
> >+  [(match_operand:IMSA 0 "register_operand")
> >+   (match_operand:<UNITMODE> 1 "reg_or_0_operand")
> >+   (match_operand 2 "const_<indeximm>_operand")]
> >+  "ISA_HAS_MSA"
> >+{
> >+  emit_insn (gen_msa_insert_<msafmt>_insn (operands[0], operands[1],
> >+                                       operands[0],
> >+                                       GEN_INT(1 << INTVAL (operands[2]))));
> >+  DONE;
> >+})
> >+
> >+(define_expand "vec_set<mode>"
> >+  [(match_operand:FMSA 0 "register_operand")
> >+   (match_operand:<UNITMODE> 1 "register_operand")
> >+   (match_operand 2 "const_<indeximm>_operand")]
> >+  "ISA_HAS_MSA"
> >+{
> >+  emit_insn (gen_msa_insve_<msafmt_f>_s (operands[0], operands[0],
> >+                                     GEN_INT(1 << INTVAL (operands[2])),
> >+                                     operands[1]));
> >+  DONE;
> >+})
> >+
> >+(define_expand "vcondu<MSA:mode><IMSA:mode>"
> >+  [(set (match_operand:MSA 0 "register_operand")
> >+    (if_then_else:MSA
> >+      (match_operator 3 ""
> >+        [(match_operand:IMSA 4 "register_operand")
> >+         (match_operand:IMSA 5 "register_operand")])
> >+      (match_operand:MSA 1 "reg_or_m1_operand")
> >+      (match_operand:MSA 2 "reg_or_0_operand")))]
> >+  "ISA_HAS_MSA
> >+   && (GET_MODE_NUNITS (<MSA:MODE>mode) == GET_MODE_NUNITS
> (<IMSA:MODE>mode))"
> >+{
> >+  mips_expand_vec_cond_expr (<MSA:MODE>mode, <MSA:VIMODE>mode, operands);
> >+  DONE;
> >+})
> >+
> >+(define_expand "vcond<MSA:mode><MSA_2:mode>"
> >+  [(set (match_operand:MSA 0 "register_operand")
> >+    (if_then_else:MSA
> >+      (match_operator 3 ""
> >+        [(match_operand:MSA_2 4 "register_operand")
> >+         (match_operand:MSA_2 5 "register_operand")])
> >+      (match_operand:MSA 1 "reg_or_m1_operand")
> >+      (match_operand:MSA 2 "reg_or_0_operand")))]
> >+  "ISA_HAS_MSA
> >+   && (GET_MODE_NUNITS (<MSA:MODE>mode) == GET_MODE_NUNITS 
> >(<MSA:MODE>mode))"
> 
> Bug: This compares MSA to MSA. the second should be MSA_2.

Fixed.
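The corrected condition now compares against MSA_2:

```lisp
  "ISA_HAS_MSA
   && (GET_MODE_NUNITS (<MSA:MODE>mode) == GET_MODE_NUNITS (<MSA_2:MODE>mode))"
```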

> 
> >+{
> >+  mips_expand_vec_cond_expr (<MSA:MODE>mode, <MSA:VIMODE>mode, operands);
> >+  DONE;
> >+})
> >+
> >+;; Not used directly by builtins but via the following define_expand.
> >+(define_insn "msa_insert_<msafmt>_insn"
> >+  [(set (match_operand:IMSA 0 "register_operand" "=f")
> >+    (vec_merge:IMSA
> >+      (vec_duplicate:IMSA
> >+        (match_operand:<UNITMODE> 1 "reg_or_0_operand" "dJ"))
> >+      (match_operand:IMSA 2 "register_operand" "0")
> >+      (match_operand 3 "const_<bitmask>_operand" "")))]
> >+  "ISA_HAS_MSA"
> >+  "insert.<msafmt>\t%w0[%y3],%z1"
> >+  [(set_attr "type" "simd_insert")
> >+   (set_attr "mode" "<MODE>")])
> 
> Rename to remove _insn, see below. V2DI mode should have # as its pattern
> for 32-bit targets to ensure it gets split

Done.
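One way to express the rename plus the forced split on 32-bit targets (the output-template logic is an assumption):

```lisp
(define_insn "msa_insert_<msafmt>"
  [(set (match_operand:IMSA 0 "register_operand" "=f")
	(vec_merge:IMSA
	  (vec_duplicate:IMSA
	    (match_operand:<UNITMODE> 1 "reg_or_0_operand" "dJ"))
	  (match_operand:IMSA 2 "register_operand" "0")
	  (match_operand 3 "const_<bitmask>_operand" "")))]
  "ISA_HAS_MSA"
{
  /* V2DI inserts cannot be done in one instruction on 32-bit targets;
     emit "#" so the pattern is split after reload.  */
  if (!TARGET_64BIT && <MODE>mode == V2DImode)
    return "#";
  return "insert.<msafmt>\t%w0[%y3],%z1";
}
  [(set_attr "type" "simd_insert")
   (set_attr "mode" "<MODE>")])
```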

> 
> >+
> >+;; Expand builtin for HImode and QImode which takes SImode.
> >+(define_expand "msa_insert_<msafmt>"
> >+  [(match_operand:IMSA 0 "register_operand")
> >+   (match_operand:IMSA 1 "register_operand")
> >+   (match_operand 2 "const_<indeximm>_operand")
> >+   (match_operand:<RES> 3 "reg_or_0_operand")]
> >+  "ISA_HAS_MSA"
> >+{
> >+  if ((GET_MODE_SIZE (<UNITMODE>mode) < GET_MODE_SIZE (<RES>mode))
> >+      && (REG_P (operands[3]) || (GET_CODE (operands[3]) == SUBREG
> >+                              && REG_P (SUBREG_REG (operands[3])))))
> >+    operands[3] = lowpart_subreg (<UNITMODE>mode, operands[3], <RES>mode);
> >+  emit_insn (gen_msa_insert_<msafmt>_insn (operands[0], operands[3],
> >+                                       operands[1],
> >+                                       GEN_INT(1 << INTVAL (operands[2]))));
> >+  DONE;
> >+})
> >+
> 
> Lets do this during mips_expand_builtin_insn like ilvl and friends. Having
> expanders
> simply to map the builtins to real instructions doesn't seem very useful.

Done.
> 
> >+(define_expand "msa_insert_<msafmt_f>"
> >+  [(match_operand:FMSA 0 "register_operand")
> >+   (match_operand:FMSA 1 "register_operand")
> >+   (match_operand 2 "const_<indeximm>_operand")
> >+   (match_operand:<UNITMODE> 3 "reg_or_0_operand")]
> >+  "ISA_HAS_MSA"
> >+{
> >+  emit_insn (gen_msa_insert_<msafmt_f>_insn (operands[0], operands[3],
> >+                                         operands[1],
> >+                                         GEN_INT(1 << INTVAL 
> >(operands[2]))));
> >+  DONE;
> >+})
> 
> Likewise.
> 
> >+
> >+(define_insn "msa_insert_<msafmt_f>_insn"
> >+  [(set (match_operand:FMSA 0 "register_operand" "=f")
> >+    (vec_merge:FMSA
> >+      (vec_duplicate:FMSA
> >+        (match_operand:<UNITMODE> 1 "register_operand" "d"))
> >+      (match_operand:FMSA 2 "register_operand" "0")
> >+      (match_operand 3 "const_<bitmask>_operand" "")))]
> >+  "ISA_HAS_MSA"
> >+  "insert.<msafmt>\t%w0[%y3],%z1"
> >+  [(set_attr "type" "simd_insert")
> >+   (set_attr "mode" "<MODE>")])
> 
> Rename to remove _insn, see above. V2DF mode should have # for 32-bit targets
> to
> ensure it gets split.

Likewise.
> 
> >+
> >+(define_split
> >+  [(set (match_operand:MSA_D 0 "register_operand")
> >+    (vec_merge:MSA_D
> >+      (vec_duplicate:MSA_D
> >+        (match_operand:<UNITMODE> 1 "<MSA_D:msa_d>_operand"))
> >+      (match_operand:MSA_D 2 "register_operand")
> >+      (match_operand 3 "const_<bitmask>_operand")))]
> >+  "reload_completed && TARGET_MSA && !TARGET_64BIT"
> >+  [(const_int 0)]
> >+{
> >+  mips_split_msa_insert_d (operands[0], operands[2], operands[3],
> operands[1]);
> >+  DONE;
> >+})
> 
> ...
> 
> >+(define_expand "msa_insve_<msafmt_f>"
> >+  [(set (match_operand:MSA 0 "register_operand")
> >+    (vec_merge:MSA
> >+      (vec_duplicate:MSA
> >+        (vec_select:<UNITMODE>
> >+          (match_operand:MSA 3 "register_operand")
> >+          (parallel [(const_int 0)])))
> >+      (match_operand:MSA 1 "register_operand")
> >+      (match_operand 2 "const_<indeximm>_operand")))]
> >+  "ISA_HAS_MSA"
> >+{
> >+  operands[2] = GEN_INT (1 << INTVAL (operands[2]));
> >+})
> 
> Like for insert patterns do this in mips_expand_builtin_insn and rename the
> instruction below to remove _insn.

Done.
> 
> >+(define_insn "msa_insve_<msafmt_f>_insn"
> >+  [(set (match_operand:MSA 0 "register_operand" "=f")
> >+    (vec_merge:MSA
> >+      (vec_duplicate:MSA
> >+        (vec_select:<UNITMODE>
> >+          (match_operand:MSA 3 "register_operand" "f")
> >+          (parallel [(const_int 0)])))
> >+      (match_operand:MSA 1 "register_operand" "0")
> >+      (match_operand 2 "const_<bitmask>_operand" "")))]
> >+  "ISA_HAS_MSA"
> >+  "insve.<msafmt>\t%w0[%y2],%w3[0]"
> >+  [(set_attr "type" "simd_insert")
> >+   (set_attr "mode" "<MODE>")])
> >+
> >+;; Operand 3 is a scalar.
> >+(define_insn "msa_insve_<msafmt>_f_s"
> 
> It would be more clear to have <msafmt_f> instead of <msafmt>_f. The 's' here
> is for scalar I believe. Perhaps spell it out and put scalar.

OK.
> 
> >+  [(set (match_operand:FMSA 0 "register_operand" "=f")
> >+    (vec_merge:FMSA
> >+      (vec_duplicate:FMSA
> >+        (match_operand:<UNITMODE> 3 "register_operand" "f"))
> >+      (match_operand:FMSA 1 "register_operand" "0")
> >+      (match_operand 2 "const_<bitmask>_operand" "")))]
> >+  "ISA_HAS_MSA"
> >+  "insve.<msafmt>\t%w0[%y2],%w3[0]"
> >+  [(set_attr "type" "simd_insert")
> >+   (set_attr "mode" "<MODE>")])
> 
> >+;; Note that copy_s.d and copy_s.d_f will be split later if !TARGET_64BIT.
> >+(define_insn "msa_copy_s_<msafmt_f>"
> >+  [(set (match_operand:<RES> 0 "register_operand" "=d")
> >+    (sign_extend:<RES>
> >+      (vec_select:<UNITMODE>
> >+        (match_operand:MSA 1 "register_operand" "f")
> >+        (parallel [(match_operand 2 "const_<indeximm>_operand" "")]))))]
> >+  "ISA_HAS_MSA"
> >+  "copy_s.<msafmt>\t%0,%w1[%2]"
> >+  [(set_attr "type" "simd_copy")
> >+   (set_attr "mode" "<MODE>")])
> 
> I think the splits should be explicit and therefore generate # for the the
> two patterns that will be split for !TARGET_64BIT. The sign_extend should
> only be present for V8HI and V16QI modes. There could be value in adding
> widening patterns to extend to DImode on 64-bit targets but it may not
> trigger much.
> 
> >+(define_split
> >+  [(set (match_operand:<UNITMODE> 0 "register_operand")
> >+    (sign_extend:<UNITMODE>
> >+      (vec_select:<UNITMODE>
> >+        (match_operand:MSA_D 1 "register_operand")
> >+        (parallel [(match_operand 2 "const_0_or_1_operand")]))))]
> >+  "reload_completed && TARGET_MSA && !TARGET_64BIT"
> >+  [(const_int 0)]
> >+{
> >+  mips_split_msa_copy_d (operands[0], operands[1], operands[2],
> >+                     gen_msa_copy_s_w);
> >+  DONE;
> >+})
> >+
> >+;; Note that copy_u.d and copy_u.d_f will be split later if !TARGET_64BIT.
> >+(define_insn "msa_copy_u_<msafmt_f>"
> >+  [(set (match_operand:<RES> 0 "register_operand" "=d")
> >+    (zero_extend:<RES>
> >+      (vec_select:<UNITMODE>
> >+        (match_operand:MSA 1 "register_operand" "f")
> >+        (parallel [(match_operand 2 "const_<indeximm>_operand" "")]))))]
> >+  "ISA_HAS_MSA"
> >+  "copy_u.<msafmt>\t%0,%w1[%2]"
> >+  [(set_attr "type" "simd_copy")
> >+   (set_attr "mode" "<MODE>")])
> 
> Likewise on all accounts except that V2SI mode should not be included at all
> here as we need to use the copy_s for V2SI->SImode on 64-bit targets to get
> the correct canonical result.
> 
> >+(define_split
> >+  [(set (match_operand:<UNITMODE> 0 "register_operand")
> >+    (zero_extend:<UNITMODE>
> >+      (vec_select:<UNITMODE>
> >+        (match_operand:MSA_D 1 "register_operand")
> >+        (parallel [(match_operand 2 "const_0_or_1_operand")]))))]
> >+  "reload_completed && TARGET_MSA && !TARGET_64BIT"
> >+  [(const_int 0)]
> >+{
> >+  mips_split_msa_copy_d (operands[0], operands[1], operands[2],
> >+                     gen_msa_copy_u_w);
> 
> This should use copy_s so that when the 32-bit code is used on a 64-bit arch
> then the data stays in canonical form. The copy_u should never be used on
> a 32-bit target, I don't think it should even exist in the MSA32 spec
> actually. I am discussing this with our architecture team and will update
> the thread with the outcome.
> 
> >+  DONE;
> >+})

Refactored throughout.

> 
> = mips.c =
> 
> >+/* Expand VEC_COND_EXPR, where:
> >+   MODE is mode of the result
> >+   VIMODE equivalent integer mode
> >+   OPERANDS operands of VEC_COND_EXPR.  */
> >+
> >+void
> >+mips_expand_vec_cond_expr (machine_mode mode, machine_mode vimode,
> >+                       rtx *operands)
> >+{
> >+  rtx cond = operands[3];
> >+  rtx cmp_op0 = operands[4];
> >+  rtx cmp_op1 = operands[5];
> >+  rtx cmp_res = gen_reg_rtx (vimode);
> >+
> >+  mips_expand_msa_cmp (cmp_res, GET_CODE (cond), cmp_op0, cmp_op1);
> >+
> >+  /* We handle the following cases:
> >+     1) r = a CMP b ? -1 : 0
> >+     2) r = a CMP b ? -1 : v
> >+     3) r = a CMP b ?  v : 0
> >+     4) r = a CMP b ? v1 : v2  */
> >+
> >+  /* Case (1) above.  We only move the results.  */
> >+  if (operands[1] == CONSTM1_RTX (vimode)
> >+      && operands[2] == CONST0_RTX (vimode))
> >+    emit_move_insn (operands[0], cmp_res);
> >+  else
> >+    {
> >+      rtx src1 = gen_reg_rtx (vimode);
> >+      rtx src2 = gen_reg_rtx (vimode);
> >+      rtx mask = gen_reg_rtx (vimode);
> >+      rtx bsel;
> >+
> >+      /* Move the vector result to use it as a mask.  */
> >+      emit_move_insn (mask, cmp_res);
> >+
> >+      if (register_operand (operands[1], mode))
> >+    {
> >+      rtx xop1 = operands[1];
> >+      if (mode != vimode)
> >+        {
> >+          xop1 = gen_reg_rtx (vimode);
> >+          emit_move_insn (xop1, gen_rtx_SUBREG (vimode, operands[1], 0));
> >+        }
> >+      emit_move_insn (src1, xop1);
> >+    }
> >+      else
> >+    /* Case (2) if the below doesn't move the mask to src2.  */
> >+    emit_move_insn (src1, mask);
> 
> Please assert that operands[1] is constm1.

Added.

> 
> >+
> >+      if (register_operand (operands[2], mode))
> >+    {
> >+      rtx xop2 = operands[2];
> >+      if (mode != vimode)
> >+        {
> >+          xop2 = gen_reg_rtx (vimode);
> >+          emit_move_insn (xop2, gen_rtx_SUBREG (vimode, operands[2], 0));
> >+        }
> >+      emit_move_insn (src2, xop2);
> >+    }
> >+      else
> >+    /* Case (3) if the above didn't move the mask to src1.  */
> >+    emit_move_insn (src2, mask);
> 
> Please assert that operands[2] is const0.

Done.
> 
> >+
> >+      /* We deal with case (4) if the mask wasn't moved to either src1 or
> src2.
> >+     In any case, we eventually do vector mask-based copy.  */
> >+      bsel = gen_rtx_UNSPEC (vimode, gen_rtvec (3, mask, src2, src1),
> >+                         UNSPEC_MSA_BSEL_V);
> >+      /* The result is placed back to a register with the mask.  */
> >+      emit_insn (gen_rtx_SET (mask, bsel));
> 
> I guess you expand like this instead of gen_ to avoid having to select the
> function
> based on vimode.

Correct. I thought it would be slightly cleaner to expand this way.

> 
> >+      emit_move_insn (operands[0], gen_rtx_SUBREG (mode, mask, 0));
> >+    }
> >+}
> >+
> 
> Thanks,
> Matthew

Regards,
Robert
