On Fri, Sep 4, 2020 at 9:19 AM Richard Biener
<richard.guent...@gmail.com> wrote:
>
> On Fri, Sep 4, 2020 at 8:38 AM luoxhu <luo...@linux.ibm.com> wrote:
> >
> >
> >
> > On 2020/9/4 14:16, luoxhu via Gcc-patches wrote:
> > > Hi,
> > >
> > >
> > > Yes, I checked and found that both vec_set and vec_extract doesn't support
> > > variable index for most targets, store_bit_field_1 and extract_bit_field_1
> > > would only consider use optabs when index is integer value.  Anyway, it
> > > shouldn't be hard to extend depending on target requirements.
> > >
> > > Another problem is v[n&3]=i and vec_insert(v, i, n) are generating with
> > > different gimple code:
> > >
> > > {
> > > _1 = n & 3;
> > > VIEW_CONVERT_EXPR<int[4]>(v1)[_1] = i;
> > > }
> > >
> > > vs:
> > >
> > > {
> > >    __vector signed int v1;
> > >    __vector signed int D.3192;
> > >    long unsigned int _1;
> > >    long unsigned int _2;
> > >    int * _3;
> > >
> > >    <bb 2> [local count: 1073741824]:
> > >    D.3192 = v_4(D);
> > >    _1 = n_7(D) & 3;
> > >    _2 = _1 * 4;
> > >    _3 = &D.3192 + _2;
> > >    *_3 = i_8(D);
> > >    v1_10 = D.3192;
> > >    return v1_10;
> > > }
> >
> > Just realized use convert_vector_to_array_for_subscript would generate
> > "VIEW_CONVERT_EXPR<int[4]>(v1)[_1] = i;" before produce those instructions,
> >   your confirmation and comments will be highly appreciated...  Thanks in
> > advance. :)
>
> I think what the GCC vector extensions produce is generally better
> so wherever "code generation" for vec_insert resides it should be
> adjusted to produce the same code.  Same for vec_extract.

Guess altivec.h, dispatching to __builtin_vec_insert.  Wonder why it wasn't

#define vec_insert(a,b,c) (a)[c]=(b)

anyway, you obviously have some lowering of the builtin somewhere in rs6000.c
and thus can adjust that.

> > Xionghu
> >
> > >
> > > If not use builtin for "vec_insert(v, i, n)", the pointer is "int*" 
> > > instead
> > > of vector type, will this be difficult for expander to capture so many
> > > statements then call the optabs?  So shall we still keep the builtin style
> > > for "vec_insert(v, i, n)" and expand "v[n&3]=i" with optabs or expand both
> > > with optabs???
> > >
> > > Drafted a fast patch to expand "v[n&3]=i" with optabs as below, sorry 
> > > that not
> > > using existed vec_set yet as not quite sure, together with the first 
> > > patch, both
> > > cases could be handled as expected:
> > >
> > >
> > > [PATCH] Expander: expand VIEW_CONVERT_EXPR to vec_insert with variable 
> > > index
> > >
> > > v[n%4] = i has same semantic with vec_insert (i, v, n), but it will be
> > > optimized to "VIEW_CONVERT_EXPR<int[4]>(v1)[_1] = i;" in gimple, this
> > > patch tries to recognize the pattern in expander and use optabs to
> > > expand it to fast instructions like vec_insert: lvsl+xxperm+xxsel.
> > >
> > > gcc/ChangeLog:
> > >
> > >       * config/rs6000/vector.md:
> > >       * expr.c (expand_assignment):
> > >       * optabs.def (OPTAB_CD):
> > > ---
> > >   gcc/config/rs6000/vector.md | 13 +++++++++++
> > >   gcc/expr.c                  | 46 +++++++++++++++++++++++++++++++++++++
> > >   gcc/optabs.def              |  1 +
> > >   3 files changed, 60 insertions(+)
> > >
> > > diff --git a/gcc/config/rs6000/vector.md b/gcc/config/rs6000/vector.md
> > > index 796345c80d3..46d21271e17 100644
> > > --- a/gcc/config/rs6000/vector.md
> > > +++ b/gcc/config/rs6000/vector.md
> > > @@ -1244,6 +1244,19 @@ (define_expand "vec_extract<mode><VEC_base_l>"
> > >     DONE;
> > >   })
> > >
> > > +(define_expand "vec_insert<VEC_base_l><mode>"
> > > +  [(match_operand:VEC_E 0 "vlogical_operand")
> > > +   (match_operand:<VEC_base> 1 "register_operand")
> > > +   (match_operand 2 "register_operand")]
> > > +  "VECTOR_MEM_ALTIVEC_OR_VSX_P (<MODE>mode)"
> > > +{
> > > +  rtx target = gen_reg_rtx (V16QImode);
> > > +  rs6000_expand_vector_insert (target, operands[0], operands[1], 
> > > operands[2]);
> > > +  rtx sub_target = simplify_gen_subreg (GET_MODE(operands[0]), target, 
> > > V16QImode, 0);
> > > +  emit_insn (gen_rtx_SET (operands[0], sub_target));
> > > +  DONE;
> > > +})
> > > +
> > >   ;; Convert double word types to single word types
> > >   (define_expand "vec_pack_trunc_v2df"
> > >     [(match_operand:V4SF 0 "vfloat_operand")
> > > diff --git a/gcc/expr.c b/gcc/expr.c
> > > index dd2200ddea8..ce2890c1a2d 100644
> > > --- a/gcc/expr.c
> > > +++ b/gcc/expr.c
> > > @@ -5237,6 +5237,52 @@ expand_assignment (tree to, tree from, bool 
> > > nontemporal)
> > >
> > >         to_rtx = expand_expr (tem, NULL_RTX, VOIDmode, EXPAND_WRITE);
> > >
> > > +      tree type = TREE_TYPE (to);
> > > +      if (TREE_CODE (to) == ARRAY_REF && tree_fits_uhwi_p (TYPE_SIZE 
> > > (type))
> > > +       && tree_fits_uhwi_p (TYPE_SIZE_UNIT (type))
> > > +       && tree_to_uhwi (TYPE_SIZE (type))
> > > +              * tree_to_uhwi (TYPE_SIZE_UNIT (type))
> > > +            == 128)
> > > +     {
> > > +       tree op0 = TREE_OPERAND (to, 0);
> > > +       tree op1 = TREE_OPERAND (to, 1);
> > > +       if (TREE_CODE (op0) == VIEW_CONVERT_EXPR)
> > > +         {
> > > +           tree view_op0 = TREE_OPERAND (op0, 0);
> > > +           mode = TYPE_MODE (TREE_TYPE (view_op0));
> > > +           if (TREE_CODE (TREE_TYPE (view_op0)) == VECTOR_TYPE)
> > > +             {
> > > +               rtx value
> > > +                 = expand_expr (from, NULL_RTX, VOIDmode, EXPAND_NORMAL);
> > > +               rtx pos
> > > +                 = expand_expr (op1, NULL_RTX, VOIDmode, EXPAND_NORMAL);
> > > +               rtx temp_target = gen_reg_rtx (mode);
> > > +               emit_move_insn (temp_target, to_rtx);
> > > +
> > > +               machine_mode outermode = mode;
> > > +               scalar_mode innermode = GET_MODE_INNER (outermode);
> > > +               class expand_operand ops[3];
> > > +               enum insn_code icode
> > > +                 = convert_optab_handler (vec_insert_optab, innermode,
> > > +                                          outermode);
> > > +
> > > +               if (icode != CODE_FOR_nothing)
> > > +                 {
> > > +                   pos = convert_to_mode (E_SImode, pos, 0);
> > > +
> > > +                   create_fixed_operand (&ops[0], temp_target);
> > > +                   create_input_operand (&ops[1], value, innermode);
> > > +                   create_input_operand (&ops[2], pos, GET_MODE (pos));
> > > +                   if (maybe_expand_insn (icode, 3, ops))
> > > +                     {
> > > +                       emit_move_insn (to_rtx, temp_target);
> > > +                       pop_temp_slots ();
> > > +                       return;
> > > +                     }
> > > +                 }
> > > +             }
> > > +         }
> > > +     }
> > >         /* If the field has a mode, we want to access it in the
> > >        field's mode, not the computed mode.
> > >        If a MEM has VOIDmode (external with incomplete type),
> > > diff --git a/gcc/optabs.def b/gcc/optabs.def
> > > index 78409aa1453..21b163a969e 100644
> > > --- a/gcc/optabs.def
> > > +++ b/gcc/optabs.def
> > > @@ -96,6 +96,7 @@ OPTAB_CD(mask_gather_load_optab, "mask_gather_load$a$b")
> > >   OPTAB_CD(scatter_store_optab, "scatter_store$a$b")
> > >   OPTAB_CD(mask_scatter_store_optab, "mask_scatter_store$a$b")
> > >   OPTAB_CD(vec_extract_optab, "vec_extract$a$b")
> > > +OPTAB_CD(vec_insert_optab, "vec_insert$a$b")
> > >   OPTAB_CD(vec_init_optab, "vec_init$a$b")
> > >
> > >   OPTAB_CD (while_ult_optab, "while_ult$a$b")
> > >

Reply via email to