On Fri, 2020-05-22 at 13:27 -0700, Carl Love wrote:
> GCC maintainers:
>
> The following patch adds support for builtins
> vec_genbm(), vec_genhm(),
> vec_genwm(), vec_gendm(), vec_genqm(), vec_cntm(), vec_expandm(),
> vec_extractm(). Support for instructions mtvsrbm, mtvsrhm, mtvsrwm,
> mtvsrdm, mtvsrqm, cntm, vexpandm, vextractm.
>
> The test has been tested on:
>
> powerpc64le-unknown-linux-gnu (Power 9 LE)
>
> and mambo with no regression errors.
>
> Please let me know if this patch is acceptable for mainline.
>
> Thanks.
>
> Carl Love
> -------------------------------------------------------------------
>
> RS6000 RFC 2629, add VSX mask manipulation support
>
> gcc/ChangeLog
>
> 2020-05-22 Carl Love <c...@us.ibm.com>
>
> * config/rs6000/vsx.md (VSX_MM): New define_mode_iterator.
> (VSX_MM4): New define_mode_iterator.
> (VSX_MM_SUFFIX4): New define_mode_attr.
> (vec_mtvsrbm): New define_expand.
> (vec_mtvsrbmi): New define_insn.
> (vec_mtvsr_<mode>): New define_insn.
> (vec_cntmb_<mode>): New define_insn.
> (vec_extract_<mode>): New define_insn.
> (vec_expand_<mode>): New define_insn.
> (define_c_enum unspec): Add entries UNSPEC_MTVSBM, UNSPEC_VCNTMB,
> UNSPEC_VEXTRACT, UNSPEC_VEXPAND.
> * config/rs6000/altivec.h: Add defines vec_genbm, vec_genhm, vec_genwm,
> vec_gendm, vec_genqm, vec_cntm, vec_expandm, vec_extractm.

Nit (?): name/symbol first, i.e. "(vec_genbm, vec_genhm, ...): Add definitions."

> * config/rs6000/rs6000-builtin.c: Add defines BU_FUTURE_2, BU_FUTURE_1.
> (BU_FUTURE_1): Add definitions for mtvsrbm, mtvsrhm, mtvsrwm,
> mtvsrdm, mtvsrqm, vexpandmb, vexpandmh, vexpandmw, vexpandmd, vexpandmq,
> vextractmb, vextractmh, vextractmw, vextractmd, vextractmq.
> (BU_FUTURE_2): Add definitions for cntmbb, cntmbh, cntmbw, cntmbd.
> (BU_FUTURE_OVERLOAD_1): Add definitions for mtvsrbm, mtvsrhm,
> mtvsrwm, mtvsrdm, mtvsrqm, vexpandm, vextractm.
> (BU_FUTURE_OVERLOAD_2): Add defition for cntm.
> * config/rs6000/rs6000-call.c (rs6000_expand_binop_builtin): Add
> checks for CODE_FOR_vec_cntmbb_v16qi, CODE_FOR_vec_cntmb_v8hi,
> CODE_FOR_vec_cntmb_v4si, CODE_FOR_vec_cntmb_v2di.
> (altivec_overloaded_builtins): Add overloaded argument entries for
> FUTURE_BUILTIN_VEC_MTVSRBM, FUTURE_BUILTIN_VEC_MTVSRHM,
> FUTURE_BUILTIN_VEC_MTVSRWM,
> FUTURE_BUILTIN_VEC_MTVSRDM, FUTURE_BUILTIN_VEC_MTVSRQM,
> FUTURE_BUILTIN_VEC_VCNTMBB,
> FUTURE_BUILTIN_VCNTMBB, FUTURE_BUILTIN_VCNTMBH, FUTURE_BUILTIN_VCNTMBW,
> FUTURE_BUILTIN_VCNTMBD, FUTURE_BUILTIN_VEXPANDMB,
> FUTURE_BUILTIN_VEXPANDMH,
> FUTURE_BUILTIN_VEXPANDMW, FUTURE_BUILTIN_VEXPANDMD,
> FUTURE_BUILTIN_VEXPANDMQ,
> FUTURE_BUILTIN_VEXTRACTMB, FUTURE_BUILTIN_VEXTRACTMH,
> FUTURE_BUILTIN_VEXTRACTMW,
> FUTURE_BUILTIN_VEXTRACTMD, FUTURE_BUILTIN_VEXTRACTMQ.
> (builtin_function_type): Add case entries for FUTURE_BUILTIN_MTVSRBM,
> FUTURE_BUILTIN_MTVSRHM, FUTURE_BUILTIN_MTVSRWM, FUTURE_BUILTIN_MTVSRDM,
> FUTURE_BUILTIN_MTVSRQM, FUTURE_BUILTIN_VCNTMBB, FUTURE_BUILTIN_VCNTMBH,
> FUTURE_BUILTIN_VCNTMBW, FUTURE_BUILTIN_VCNTMBD,
> FUTURE_BUILTIN_VEXPANDMB,
> FUTURE_BUILTIN_VEXPANDMH, FUTURE_BUILTIN_VEXPANDMW,
> FUTURE_BUILTIN_VEXPANDMD,
> FUTURE_BUILTIN_VEXPANDMQ.
> * config/rs6000/rs6000-c.c (altivec_overloaded_builtins): Add entries
> for MTVSRBM, MTVSRHM, MTVSRWM, MTVSRDM, MTVSRQM, VCNTM, VEXPANDM,
> VEXTRACTM.

The rs6000-c.c reference here ^ has no matching change in the diff below.
Looks like that was moved to rs6000-builtin.def.

> * testsuite/gcc.target/powerpc/vsx_mask-runnable.c: Add runnable test
> case.
> ---
>  gcc/config/rs6000/altivec.h                   |  10 +
>  gcc/config/rs6000/rs6000-builtin.def          |  45 ++
>  gcc/config/rs6000/rs6000-call.c               |  66 +-
>  gcc/config/rs6000/vsx.md                      |  67 ++
>  .../gcc.target/powerpc/vsx_mask-runnable.c    | 614 ++++++++++++++++++
>  5 files changed, 801 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/vsx_mask-runnable.c
>
> diff --git a/gcc/config/rs6000/altivec.h b/gcc/config/rs6000/altivec.h
> index 0a7e8ab3647..5917d3a2b76 100644
> --- a/gcc/config/rs6000/altivec.h
> +++ b/gcc/config/rs6000/altivec.h
> @@ -710,6 +710,16 @@ __altivec_scalar_pred(vec_any_nle,
>
>  #define vec_strir_p(a) __builtin_vec_strir_p (a)
>  #define vec_stril_p(a) __builtin_vec_stril_p (a)
> +
> +/* VSX Mask Manipulation builtin. */
> +#define vec_genbm __builtin_vec_mtvsrbm
> +#define vec_genhm __builtin_vec_mtvsrhm
> +#define vec_genwm __builtin_vec_mtvsrwm
> +#define vec_gendm __builtin_vec_mtvsrdm
> +#define vec_genqm __builtin_vec_mtvsrqm
> +#define vec_cntm __builtin_vec_cntm
> +#define vec_expandm __builtin_vec_vexpandm
> +#define vec_extractm __builtin_vec_vextractm
>  #endif

ok

>
>  #endif /* _ALTIVEC_H */
> diff --git a/gcc/config/rs6000/rs6000-builtin.def
> b/gcc/config/rs6000/rs6000-builtin.def
> index 8b1ddb00045..7cab5097aeb 100644
> --- a/gcc/config/rs6000/rs6000-builtin.def
> +++ b/gcc/config/rs6000/rs6000-builtin.def
> @@ -1049,6 +1049,22 @@
>                      (RS6000_BTC_ ## ATTR /* ATTR */ \
>                       | RS6000_BTC_TERNARY), \
>                      CODE_FOR_ ## ICODE) /* ICODE */
> +
> +#define BU_FUTURE_1(ENUM, NAME, ATTR, ICODE) \
> +  RS6000_BUILTIN_1 (FUTURE_BUILTIN_ ## ENUM, /* ENUM */ \
> +                    "__builtin_vec" NAME, /* NAME */ \
> +                    RS6000_BTM_FUTURE, /* MASK */ \
> +                    (RS6000_BTC_ ## ATTR /* ATTR */ \
> +                     | RS6000_BTC_UNARY), \
> +                    CODE_FOR_ ## ICODE) /* ICODE */
> +
> +#define BU_FUTURE_2(ENUM, NAME, ATTR, ICODE) \
> +  RS6000_BUILTIN_2 (FUTURE_BUILTIN_ ## ENUM, /* ENUM */ \
> +                    "__builtin_vec" NAME, /* NAME */ \
> +                    RS6000_BTM_FUTURE, /* MASK */ \
> +                    (RS6000_BTC_ ## ATTR /* ATTR */ \
> +                     | RS6000_BTC_BINARY), \
> +                    CODE_FOR_ ## ICODE) /* ICODE */
>  #endif
>

ok

>
> @@ -2637,6 +2653,26 @@ BU_FUTURE_V_1 (VSTRIHR_P, "vstrihr_p", CONST,
> vstrir_p_v8hi)
>  BU_FUTURE_V_1 (VSTRIBL_P, "vstribl_p", CONST, vstril_p_v16qi)
>  BU_FUTURE_V_1 (VSTRIHL_P, "vstrihl_p", CONST, vstril_p_v8hi)
>
> +BU_FUTURE_1 (MTVSRBM, "mtvsrbm", CONST, vec_mtvsrbm)
> +BU_FUTURE_1 (MTVSRHM, "mtvsrhm", CONST, vec_mtvsr_v8hi)
> +BU_FUTURE_1 (MTVSRWM, "mtvsrwm", CONST, vec_mtvsr_v4si)
> +BU_FUTURE_1 (MTVSRDM, "mtvsrdm", CONST, vec_mtvsr_v2di)
> +BU_FUTURE_1 (MTVSRQM, "mtvsrqm", CONST, vec_mtvsr_v1ti)
> +BU_FUTURE_2 (VCNTMBB, "cntmbb", CONST, vec_cntmb_v16qi)
> +BU_FUTURE_2 (VCNTMBH, "cntmbh", CONST, vec_cntmb_v8hi)
> +BU_FUTURE_2 (VCNTMBW, "cntmbw", CONST, vec_cntmb_v4si)
> +BU_FUTURE_2 (VCNTMBD, "cntmbd", CONST, vec_cntmb_v2di)
> +BU_FUTURE_1 (VEXPANDMB, "vexpandmb", CONST, vec_expand_v16qi)
> +BU_FUTURE_1 (VEXPANDMH, "vexpandmh", CONST, vec_expand_v8hi)
> +BU_FUTURE_1 (VEXPANDMW, "vexpandmw", CONST, vec_expand_v4si)
> +BU_FUTURE_1 (VEXPANDMD, "vexpandmd", CONST, vec_expand_v2di)
> +BU_FUTURE_1 (VEXPANDMQ, "vexpandmq", CONST, vec_expand_v1ti)
> +BU_FUTURE_1 (VEXTRACTMB, "vextractmb", CONST, vec_extract_v16qi)
> +BU_FUTURE_1 (VEXTRACTMH, "vextractmh", CONST, vec_extract_v8hi)
> +BU_FUTURE_1 (VEXTRACTMW, "vextractmw", CONST, vec_extract_v4si)
> +BU_FUTURE_1 (VEXTRACTMD, "vextractmd", CONST, vec_extract_v2di)
> +BU_FUTURE_1 (VEXTRACTMQ, "vextractmq", CONST, vec_extract_v1ti)
> +
>  /* Future architecture overloaded vector built-ins. */
>  BU_FUTURE_OVERLOAD_2 (CLRL, "clrl")
>  BU_FUTURE_OVERLOAD_2 (CLRR, "clrr")
> @@ -2652,6 +2688,15 @@ BU_FUTURE_OVERLOAD_1 (VSTRIL, "stril")
>
>  BU_FUTURE_OVERLOAD_1 (VSTRIR_P, "strir_p")
>  BU_FUTURE_OVERLOAD_1 (VSTRIL_P, "stril_p")
> +
> +BU_FUTURE_OVERLOAD_1 (MTVSRBM, "mtvsrbm")
> +BU_FUTURE_OVERLOAD_1 (MTVSRHM, "mtvsrhm")
> +BU_FUTURE_OVERLOAD_1 (MTVSRWM, "mtvsrwm")
> +BU_FUTURE_OVERLOAD_1 (MTVSRDM, "mtvsrdm")
> +BU_FUTURE_OVERLOAD_1 (MTVSRQM, "mtvsrqm")
> +BU_FUTURE_OVERLOAD_2 (VCNTM, "cntm")
> +BU_FUTURE_OVERLOAD_1 (VEXPANDM, "vexpandm")
> +BU_FUTURE_OVERLOAD_1 (VEXTRACTM, "vextractm")
>

ok

>
>  /* 1 argument crypto functions. */
>  BU_CRYPTO_1 (VSBOX, "vsbox", CONST, crypto_vsbox_v2di)
> diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
> index 0ac8054d030..f50c859b807 100644
> --- a/gcc/config/rs6000/rs6000-call.c
> +++ b/gcc/config/rs6000/rs6000-call.c
> @@ -5618,6 +5618,52 @@ const struct altivec_builtin_types
> altivec_overloaded_builtins[] = {
>    { FUTURE_BUILTIN_VEC_VSTRIR_P, FUTURE_BUILTIN_VSTRIHR_P,
>      RS6000_BTI_INTSI, RS6000_BTI_V8HI, 0, 0 },
>
> +  { FUTURE_BUILTIN_VEC_MTVSRBM, FUTURE_BUILTIN_MTVSRBM,
> +    RS6000_BTI_unsigned_V16QI, RS6000_BTI_UINTDI, 0, 0 },
> +  { FUTURE_BUILTIN_VEC_MTVSRHM, FUTURE_BUILTIN_MTVSRHM,
> +    RS6000_BTI_unsigned_V8HI, RS6000_BTI_UINTDI, 0, 0 },
> +  { FUTURE_BUILTIN_VEC_MTVSRWM, FUTURE_BUILTIN_MTVSRWM,
> +    RS6000_BTI_unsigned_V4SI, RS6000_BTI_UINTDI, 0, 0 },
> +  { FUTURE_BUILTIN_VEC_MTVSRDM, FUTURE_BUILTIN_MTVSRDM,
> +    RS6000_BTI_unsigned_V2DI, RS6000_BTI_UINTDI, 0, 0 },
> +  { FUTURE_BUILTIN_VEC_MTVSRQM, FUTURE_BUILTIN_MTVSRQM,
> +    RS6000_BTI_unsigned_V1TI, RS6000_BTI_UINTDI, 0, 0 },
> +
> +  { FUTURE_BUILTIN_VEC_VCNTM, FUTURE_BUILTIN_VCNTMBB,
> +    RS6000_BTI_unsigned_long_long,
> +    RS6000_BTI_unsigned_V16QI, RS6000_BTI_UINTQI, 0 },
> +  { FUTURE_BUILTIN_VEC_VCNTM, FUTURE_BUILTIN_VCNTMBH,
> +    RS6000_BTI_unsigned_long_long,
> +    RS6000_BTI_unsigned_V8HI, RS6000_BTI_UINTQI, 0 },
> +  { FUTURE_BUILTIN_VEC_VCNTM, FUTURE_BUILTIN_VCNTMBW,
> +    RS6000_BTI_unsigned_long_long,
> +    RS6000_BTI_unsigned_V4SI, RS6000_BTI_UINTQI, 0 },
> +  { FUTURE_BUILTIN_VEC_VCNTM, FUTURE_BUILTIN_VCNTMBD,
> +    RS6000_BTI_unsigned_long_long,
> +    RS6000_BTI_unsigned_V2DI, RS6000_BTI_UINTQI, 0 },
> +
> +  { FUTURE_BUILTIN_VEC_VEXPANDM, FUTURE_BUILTIN_VEXPANDMB,
> +    RS6000_BTI_unsigned_V16QI, RS6000_BTI_unsigned_V16QI, 0, 0 },
> +  { FUTURE_BUILTIN_VEC_VEXPANDM, FUTURE_BUILTIN_VEXPANDMH,
> +    RS6000_BTI_unsigned_V8HI, RS6000_BTI_unsigned_V8HI, 0, 0 },
> +  { FUTURE_BUILTIN_VEC_VEXPANDM, FUTURE_BUILTIN_VEXPANDMW,
> +    RS6000_BTI_unsigned_V4SI, RS6000_BTI_unsigned_V4SI, 0, 0 },
> +  { FUTURE_BUILTIN_VEC_VEXPANDM, FUTURE_BUILTIN_VEXPANDMD,
> +    RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, 0, 0 },
> +  { FUTURE_BUILTIN_VEC_VEXPANDM, FUTURE_BUILTIN_VEXPANDMQ,
> +    RS6000_BTI_unsigned_V1TI, RS6000_BTI_unsigned_V1TI, 0, 0 },
> +
> +  { FUTURE_BUILTIN_VEC_VEXTRACTM, FUTURE_BUILTIN_VEXTRACTMB,
> +    RS6000_BTI_INTSI, RS6000_BTI_unsigned_V16QI, 0, 0 },
> +  { FUTURE_BUILTIN_VEC_VEXTRACTM, FUTURE_BUILTIN_VEXTRACTMH,
> +    RS6000_BTI_INTSI, RS6000_BTI_unsigned_V8HI, 0, 0 },
> +  { FUTURE_BUILTIN_VEC_VEXTRACTM, FUTURE_BUILTIN_VEXTRACTMW,
> +    RS6000_BTI_INTSI, RS6000_BTI_unsigned_V4SI, 0, 0 },
> +  { FUTURE_BUILTIN_VEC_VEXTRACTM, FUTURE_BUILTIN_VEXTRACTMD,
> +    RS6000_BTI_INTSI, RS6000_BTI_unsigned_V2DI, 0, 0 },
> +  { FUTURE_BUILTIN_VEC_VEXTRACTM, FUTURE_BUILTIN_VEXTRACTMQ,
> +    RS6000_BTI_INTSI, RS6000_BTI_unsigned_V1TI, 0, 0 },
> +
>    { RS6000_BUILTIN_NONE, RS6000_BUILTIN_NONE, 0, 0, 0, 0 }
>  };

ok

>
> @@ -8968,7 +9014,11 @@ rs6000_expand_binop_builtin (enum insn_code icode,
> tree exp, rtx target)
>             || icode == CODE_FOR_unpackkf
>             || icode == CODE_FOR_unpacktf
>             || icode == CODE_FOR_unpackif
> -           || icode == CODE_FOR_unpacktd)
> +           || icode == CODE_FOR_unpacktd
> +           || icode == CODE_FOR_vec_cntmb_v16qi
> +           || icode == CODE_FOR_vec_cntmb_v8hi
> +           || icode == CODE_FOR_vec_cntmb_v4si
> +           || icode == CODE_FOR_vec_cntmb_v2di)
>      {
>        /* Only allow 1-bit unsigned literals. */
>        STRIP_NOPS (arg1);
> @@ -13170,6 +13220,20 @@ builtin_function_type (machine_mode mode_ret,
> machine_mode mode_arg0,
>      case P8V_BUILTIN_VGBBD:
>      case MISC_BUILTIN_CDTBCD:
>      case MISC_BUILTIN_CBCDTD:
> +    case FUTURE_BUILTIN_MTVSRBM:
> +    case FUTURE_BUILTIN_MTVSRHM:
> +    case FUTURE_BUILTIN_MTVSRWM:
> +    case FUTURE_BUILTIN_MTVSRDM:
> +    case FUTURE_BUILTIN_MTVSRQM:
> +    case FUTURE_BUILTIN_VCNTMBB:
> +    case FUTURE_BUILTIN_VCNTMBH:
> +    case FUTURE_BUILTIN_VCNTMBW:
> +    case FUTURE_BUILTIN_VCNTMBD:
> +    case FUTURE_BUILTIN_VEXPANDMB:
> +    case FUTURE_BUILTIN_VEXPANDMH:
> +    case FUTURE_BUILTIN_VEXPANDMW:
> +    case FUTURE_BUILTIN_VEXPANDMD:
> +    case FUTURE_BUILTIN_VEXPANDMQ:
>        h.uns_p[0] = 1;
>        h.uns_p[1] = 1;
>        break;

ok

> diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
> index 2a28215ac5b..96b6ad22812 100644
> --- a/gcc/config/rs6000/vsx.md
> +++ b/gcc/config/rs6000/vsx.md
> @@ -263,6 +263,13 @@
>  ;; Mode attribute to give the suffix for the splat instruction
>  (define_mode_attr VSX_SPLAT_SUFFIX [(V16QI "b") (V8HI "h")])
>
> +;; Iterator for the move to mask instructions
> +(define_mode_iterator VSX_MM [V16QI V8HI V4SI V2DI V1TI])
> +(define_mode_iterator VSX_MM4 [V16QI V8HI V4SI V2DI])
> +
> +;; Mode attribute to give the suffix for the mask instruction
> +(define_mode_attr VSX_MM_SUFFIX [(V16QI "b") (V8HI "h") (V4SI "w") (V2DI
> "d") (V1TI "q")])
> +
>  ;; Constants for creating unspecs
>  (define_c_enum "unspec"
>    [UNSPEC_VSX_CONCAT
> @@ -344,6 +351,10 @@
>     UNSPEC_VSX_FIRST_MISMATCH_INDEX
>     UNSPEC_VSX_FIRST_MISMATCH_EOS_INDEX
>     UNSPEC_XXGENPCV
> +   UNSPEC_MTVSBM
> +   UNSPEC_VCNTMB
> +   UNSPEC_VEXPAND
> +   UNSPEC_VEXTRACT
>    ])
>
>  ;; VSX moves
> @@ -5676,3 +5687,59 @@
>    DONE;
>  })
>
> +;; VSX mask manipulation instructions
> +(define_expand "vec_mtvsrbm"
> +  [(set (match_operand:V16QI 0 "vsx_register_operand" "=v")
> +        (unspec:V16QI [(match_operand:DI 1 "gpc_reg_operand" "b")]
> +         UNSPEC_MTVSBM))]
> +  "TARGET_FUTURE"
> +  {
> +    if (IN_RANGE (INTVAL (operands[1]), 0, 63))
> +      /* This is the vec_mtvsrbmi inst with six bit constant. */

It is either the vec_mtvsrbmi built-in or the mtvsrbmi instruction; the
comment mixes the two.

> +      emit_insn (gen_vec_mtvsrbmi (operands[0], operands[1]));
> +    else
> +      emit_insn (gen_vec_mtvsr_v16qi (operands[0], operands[1]));
> +
> +    DONE;
> +})
> +
> +(define_insn "vec_mtvsrbmi"
> +  [(set (match_operand:V16QI 0 "vsx_register_operand" "=v")
> +        (unspec:V16QI [(match_operand:QI 1 "u6bit_cint_operand" "n")]
> +         UNSPEC_MTVSBM))]
> +  "TARGET_FUTURE"
> +  "mtvsrbmi %0,%1"
> +)
> +
> +(define_insn "vec_mtvsr_<mode>"
> +  [(set (match_operand:VSX_MM 0 "vsx_register_operand" "=v")
> +        (unspec:VSX_MM [(match_operand:DI 1 "gpc_reg_operand" "b")]
> +         UNSPEC_MTVSBM))]
> +  "TARGET_FUTURE"
> +  "mtvsr<VSX_MM_SUFFIX>m %0,%1";
> +  [(set_attr "type" "vecsimple")])
> +
> +(define_insn "vec_cntmb_<mode>"
> +  [(set (match_operand:DI 0 "gpc_reg_operand" "=r")
> +        (unspec:DI [(match_operand:VSX_MM4 1 "vsx_register_operand" "v")
> +                    (match_operand:QI 2 "const_0_to_1_operand" "n")]
> +         UNSPEC_VCNTMB))]
> +  "TARGET_FUTURE"
> +  "vcntmb<VSX_MM_SUFFIX> %0,%1,%2"
> +  [(set_attr "type" "vecsimple")])
> +
> +(define_insn "vec_extract_<mode>"
> +  [(set (match_operand:SI 0 "register_operand" "=r")
> +        (unspec:SI [(match_operand:VSX_MM 1 "vsx_register_operand" "v")]
> +         UNSPEC_VEXTRACT))]
> +  "TARGET_FUTURE"
> +  "vextract<VSX_MM_SUFFIX>m %0,%1"
> +  [(set_attr "type" "vecsimple")])
> +
> +(define_insn "vec_expand_<mode>"
> +  [(set (match_operand:VSX_MM 0 "vsx_register_operand" "=v")
> +        (unspec:VSX_MM [(match_operand:VSX_MM 1 "vsx_register_operand" "v")]
> +         UNSPEC_VEXPAND))]
> +  "TARGET_FUTURE"
> +  "vexpand<VSX_MM_SUFFIX>m %0,%1"
> +  [(set_attr "type" "vecsimple")])

ok

> diff --git a/gcc/testsuite/gcc.target/powerpc/vsx_mask-runnable.c
> b/gcc/testsuite/gcc.target/powerpc/vsx_mask-runnable.c
> new file mode 100644
> index 00000000000..8eab7107b15
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/vsx_mask-runnable.c
> <snip testcase>

I'd probably split the test into a few smaller tests, but I have no issue
or concern with the test itself.

Aside from the cosmetic nits mentioned above, lgtm.

Thanks
-Will
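
For readers without the ISA text at hand, here is a rough scalar model in
portable C of what the byte-element forms of these builtins are understood
to compute. It is a sketch for discussion only, not code from the patch:
the model_* names are hypothetical, the array layout (index 0 is the
leftmost element) and the mask bit order are assumptions, and the real
builtins need a target that implements these instructions.

```c
#include <assert.h>
#include <stdint.h>

#define NBYTES 16  /* byte elements in a 128-bit vector */

/* Assumed model of vec_genbm: mask bit (15 - i) selects whether
   byte i becomes all ones or all zeros.  */
void model_genbm(uint64_t mask, uint8_t out[NBYTES])
{
    for (int i = 0; i < NBYTES; i++)
        out[i] = ((mask >> (NBYTES - 1 - i)) & 1u) ? 0xFF : 0x00;
}

/* Assumed model of vec_expandm: replicate each byte's top bit
   across the whole byte.  */
void model_expandm(const uint8_t in[NBYTES], uint8_t out[NBYTES])
{
    for (int i = 0; i < NBYTES; i++)
        out[i] = (in[i] & 0x80) ? 0xFF : 0x00;
}

/* Assumed model of vec_extractm: gather the top bit of each byte
   into a 16-bit mask, element 0 landing in the highest mask bit.  */
unsigned model_extractm(const uint8_t in[NBYTES])
{
    unsigned mask = 0;
    for (int i = 0; i < NBYTES; i++)
        mask = (mask << 1) | ((in[i] >> 7) & 1u);
    return mask;
}

/* Assumed model of vec_cntm: count the bytes whose top bit equals mp.
   The patch only accepts a 1-bit literal here, hence the assert.  */
unsigned model_cntm(const uint8_t in[NBYTES], unsigned mp)
{
    assert(mp <= 1);
    unsigned count = 0;
    for (int i = 0; i < NBYTES; i++)
        if (((in[i] >> 7) & 1u) == mp)
            count++;
    return count;
}
```

Under these assumptions model_extractm undoes model_genbm for any 16-bit
mask, which is the kind of round trip a smaller, focused test could check.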