Re: Re: [PATCH V2] OPTABS/IFN: Add mask_len_strided_load/mask_len_strided_store OPTABS/IFN

Richard Biener Fri, 03 Nov 2023 01:27:58 -0700

On Fri, 3 Nov 2023, juzhe.zh...@rivai.ai wrote:

> Hi?Richi.
> 
> >> Element @var{i}.  I think we don't want to document 'undefined' but
> >> rather match mask_gather_load and say the result should be zero.
> 
> I am confused here,  will vectorizer depend on zero inactive value ?


I don't think it does, but why be inconsistent with gather and at
the same time document it behaves like gather?

> Since RVV load/store doesn't touch the inactive(tail or unmasked) elements, I 
> am wondering whether it will cause issues.

For load not zeroing will cause a false dependence on the previous value
of the register, so I doubt this is optimal behavior.  For stores the
inactive elements are of course not stored.

> 
> juzhe.zh...@rivai.ai
>  
> From: Richard Biener
> Date: 2023-11-03 15:40
> To: juzhe.zh...@rivai.ai
> CC: gcc-patches; jeffreyalaw; richard.sandiford; Robin Dapp
> Subject: Re: Re: [PATCH V2] OPTABS/IFN: Add 
> mask_len_strided_load/mask_len_strided_store OPTABS/IFN
> On Fri, 3 Nov 2023, juzhe.zh...@rivai.ai wrote:
>  
> > Hi, Richi.
> > 
> > The following is strided load/store doc:
> > 
> > diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
> > index fab2513105a..4f0821a291d 100644
> > --- a/gcc/doc/md.texi
> > +++ b/gcc/doc/md.texi
> > @@ -5094,6 +5094,22 @@ Bit @var{i} of the mask is set if element @var{i} of 
> > the result should
> >  be loaded from memory and clear if element @var{i} of the result should be 
> > undefined.
> >  Mask elements @var{i} with @var{i} > (operand 6 + operand 7) are ignored.
> > 
> > +@cindex @code{mask_len_strided_load@var{m}} instruction pattern
> > +@item @samp{mask_len_strided_load@var{m}}
> > +Load several separate memory locations into a destination vector of mode 
> > @var{m}.
> > +Operand 0 is a destination vector of mode @var{m}.
> > +Operand 1 is a scalar base address and operand 2 is a scalar stride of 
> > Pmode.
> > +operand 3 indicates stride operand is signed or unsigned number and 
> > operand 4 is mask operand.
>  
> we don't need operand 3
>  
> > +operand 5 is length operand and operand 6 is bias operand.
> > +The instruction can be seen as a special case of 
> > @code{mask_len_gather_load@var{m}@var{n}}
> > +with an offset vector that is a @code{vec_series} with operand 1 as base 
> > and operand 2 as step.
> > +For each element index i load address is operand 1 + @var{i} * operand 2.
> > +operand 2 can be negative stride if operand 3 is 0.
> > +Similar to mask_len_load, the instruction loads at most (operand 5 + 
> > operand 6) elements from memory.
> > +Bit @var{i} of the mask (operand 4) is set if element @var{i} of the 
> > result should
>  
> Element @var{i}.  I think we don't want to document 'undefined' but
> rather match mask_gather_load and say the result should be zero.
>  
> Similar adjustments below.
>  
> > +be loaded from memory and clear if element @var{i} of the result should be 
> > undefined.
> > +Mask elements @var{i} with @var{i} > (operand 5 + operand 6) are ignored.
> > +
> >  @cindex @code{scatter_store@var{m}@var{n}} instruction pattern
> >  @item @samp{scatter_store@var{m}@var{n}}
> >  Store a vector of mode @var{m} into several distinct memory locations.
> > @@ -5131,6 +5147,21 @@ at most (operand 6 + operand 7) elements of (operand 
> > 4) to memory.
> >  Bit @var{i} of the mask is set if element @var{i} of (operand 4) should be 
> > stored.
> >  Mask elements @var{i} with @var{i} > (operand 6 + operand 7) are ignored.
> > 
> > +@cindex @code{mask_len_strided_store@var{m}} instruction pattern
> > +@item @samp{mask_len_strided_store@var{m}}
> > +Store a vector of mode m into several distinct memory locations.
> > +Operand 0 is a scalar base address and operand 1 is scalar stride of Pmode.
> > +operand 2 indicates stride operand is signed or unsigned number.
> > +Operand 3 is the vector of values that should be stored, which is of mode 
> > @var{m}.
> > +operand 4 is mask operand, operand 5 is length operand and operand 6 is 
> > bias operand.
> > +The instruction can be seen as a special case of 
> > @code{mask_len_scatter_store@var{m}@var{n}}
> > +with an offset vector that is a @code{vec_series} with operand 1 as base 
> > and operand 1 as step.
> > +For each element index i store address is operand 0 + @var{i} * operand 1.
> > +operand 1 can be negative stride if operand 2 is 0.
> > +Similar to mask_len_store, the instruction stores at most (operand 5 + 
> > operand 6) elements of mask (operand 4) to memory.
> > +Bit @var{i} of the mask is set if element @var{i} of (operand 3) should be 
> > stored.
> > +Mask elements @var{i} with @var{i} > (operand 5 + operand 6) are ignored.
> > +
> >  @cindex @code{vec_set@var{m}} instruction pattern
> >  @item @samp{vec_set@var{m}}
> >  Set given field in the vector value.  Operand 0 is the vector to modify,
> > 
> > Does it look reasonable ? Thanks.
> > 
> > 
> > 
> > juzhe.zh...@rivai.ai
> >  
> > From: Richard Biener
> > Date: 2023-11-02 22:34
> > To: ???
> > CC: gcc-patches; Jeff Law; richard.sandiford; rdapp.gcc
> > Subject: Re: Re: [PATCH V2] OPTABS/IFN: Add 
> > mask_len_strided_load/mask_len_strided_store OPTABS/IFN
> > On Thu, 2 Nov 2023, ??? wrote:
> >  
> > > Ok. So drop 'scale' and keep signed/unsigned argument, is that right?
> >  
> > I don't think we need signed/unsigned.  RTL expansion has the signedness
> > of the offset argument there and can just extend to the appropriate mode
> > to offset a pointer.
> >  
> > > And I wonder I should create the stride_type using size_type_node or 
> > > ptrdiff_type_node ?
> > > Which is preferrable ?
> >  
> > 'sizetype' - that's the type we require to be used for 
> > the POINTER_PLUS_EXPR offset operand.
> >  
> >  
> > > Thanks.
> > > 
> > > 
> > > 
> > > juzhe.zh...@rivai.ai
> > >  
> > > From: Richard Biener
> > > Date: 2023-11-02 22:27
> > > To: ???
> > > CC: gcc-patches; Jeff Law; richard.sandiford; rdapp.gcc
> > > Subject: Re: Re: [PATCH V2] OPTABS/IFN: Add 
> > > mask_len_strided_load/mask_len_strided_store OPTABS/IFN
> > > On Thu, 2 Nov 2023, ??? wrote:
> > >  
> > > > Hi, Richi.
> > > > 
> > > > >> Do we really need to have two modes for the optab though or could we
> > > > >> simply require the target to support arbitrary offset modes (give it
> > > > >> is implicitly constrained to ptr_mode for the base already)?  Or
> > > > >> properly extend/truncate the offset at expansion time, say to 
> > > > >> ptr_mode
> > > > >> or to the mode of sizetype.
> > > > 
> > > > For RVV, it's ok by default set stride type as ptr_mode/size_type by 
> > > > default.
> > > > Is it ok that I define strided load/store as single mode optab and 
> > > > default Pmode as stride operand?
> > > > How about scale and signed/unsigned operand ?
> > > > It seems scale operand can be removed ? Since we can pass DR_STEP 
> > > > directly to the stride arguments.
> > > > But I think we can't remove signed/unsigned operand since for strided 
> > > > mode = SI mode, the unsigned
> > > > maximum stride = 2^31 wheras signed is 2 ^ 30.
> > >  
> > > On the GIMPLE side I think we want to have a sizetype operand and
> > > indeed drop 'scale', the sizetype operand should be readily available.
> > >  
> > > > 
> > > > 
> > > > 
> > > > juzhe.zh...@rivai.ai
> > > >  
> > > > From: Richard Biener
> > > > Date: 2023-11-02 21:52
> > > > To: Juzhe-Zhong
> > > > CC: gcc-patches; jeffreyalaw; richard.sandiford; rdapp.gcc
> > > > Subject: Re: [PATCH V2] OPTABS/IFN: Add 
> > > > mask_len_strided_load/mask_len_strided_store OPTABS/IFN
> > > > On Tue, 31 Oct 2023, Juzhe-Zhong wrote:
> > > >  
> > > > > As previous Richard's suggested, we should support strided load/store 
> > > > > in
> > > > > loop vectorizer instead hacking RISC-V backend.
> > > > > 
> > > > > This patch adds MASK_LEN_STRIDED LOAD/STORE OPTABS/IFN.
> > > > > 
> > > > > The GIMPLE IR is same as mask_len_gather_load/mask_len_scatter_store 
> > > > > but with
> > > > > changing vector offset into scalar stride.
> > > >  
> > > > I see that it follows gather/scatter.  I'll note that when introducing
> > > > those we failed to add a specifier for TBAA and alignment info for the
> > > > data access.  That means we have to use alias-set zero for the accesses
> > > > (I see existing targets use UNSPECs with some not elaborate MEM anyway,
> > > > but TBAA info might have been the "easy" and obvious property to 
> > > > preserve).  For alignment we either have to assume unaligned or reject
> > > > vectorization of accesses that do not have their original scalar 
> > > > accesses
> > > > naturally aligned (aligned according to their mode).  We don't seem
> > > > to check that though.
> > > >  
> > > > It might be fine to go forward with this since gather/scatter are broken
> > > > in a similar way.
> > > >  
> > > > Do we really need to have two modes for the optab though or could we
> > > > simply require the target to support arbitrary offset modes (give it
> > > > is implicitly constrained to ptr_mode for the base already)?  Or
> > > > properly extend/truncate the offset at expansion time, say to ptr_mode
> > > > or to the mode of sizetype.
> > > >  
> > > > Thanks,
> > > > Richard.
> > > > > We don't have strided_load/strided_store and 
> > > > > mask_strided_load/mask_strided_store since
> > > > > it't unlikely RVV will have such optabs and we can't add the patterns 
> > > > > that we can't test them.
> > > > >
> > > > > 
> > > > > gcc/ChangeLog:
> > > > > 
> > > > > * doc/md.texi: Add mask_len_strided_load/mask_len_strided_store.
> > > > > * internal-fn.cc (internal_load_fn_p): Ditto.
> > > > > (internal_strided_fn_p): Ditto.
> > > > > (internal_fn_len_index): Ditto.
> > > > > (internal_fn_mask_index): Ditto.
> > > > > (internal_fn_stored_value_index): Ditto.
> > > > > (internal_strided_fn_supported_p): Ditto.
> > > > > * internal-fn.def (MASK_LEN_STRIDED_LOAD): Ditto.
> > > > > (MASK_LEN_STRIDED_STORE): Ditto.
> > > > > * internal-fn.h (internal_strided_fn_p): Ditto.
> > > > > (internal_strided_fn_supported_p): Ditto.
> > > > > * optabs.def (OPTAB_CD): Ditto.
> > > > > 
> > > > > ---
> > > > >  gcc/doc/md.texi     | 51 
> > > > > +++++++++++++++++++++++++++++++++++++++++++++
> > > > >  gcc/internal-fn.cc  | 44 ++++++++++++++++++++++++++++++++++++++
> > > > >  gcc/internal-fn.def |  4 ++++
> > > > >  gcc/internal-fn.h   |  2 ++
> > > > >  gcc/optabs.def      |  2 ++
> > > > >  5 files changed, 103 insertions(+)
> > > > > 
> > > > > diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
> > > > > index fab2513105a..5bac713a0dd 100644
> > > > > --- a/gcc/doc/md.texi
> > > > > +++ b/gcc/doc/md.texi
> > > > > @@ -5094,6 +5094,32 @@ Bit @var{i} of the mask is set if element 
> > > > > @var{i} of the result should
> > > > >  be loaded from memory and clear if element @var{i} of the result 
> > > > > should be undefined.
> > > > >  Mask elements @var{i} with @var{i} > (operand 6 + operand 7) are 
> > > > > ignored.
> > > > >  
> > > > > +@cindex @code{mask_len_strided_load@var{m}@var{n}} instruction 
> > > > > pattern
> > > > > +@item @samp{mask_len_strided_load@var{m}@var{n}}
> > > > > +Load several separate memory locations into a destination vector of 
> > > > > mode @var{m}.
> > > > > +Operand 0 is a destination vector of mode @var{m}.
> > > > > +Operand 1 is a scalar base address and operand 2 is a scalar stride 
> > > > > of mode @var{n}.
> > > > > +The instruction can be seen as a special case of 
> > > > > @code{mask_len_gather_load@var{m}@var{n}}
> > > > > +with an offset vector that is a @code{vec_series} with operand 1 as 
> > > > > base and operand 2 as step.
> > > > > +For each element index i:
> > > > > +
> > > > > +@itemize @bullet
> > > > > +@item
> > > > > +extend the stride to address width, using zero
> > > > > +extension if operand 3 is 1 and sign extension if operand 3 is zero;
> > > > > +@item
> > > > > +multiply the extended stride by operand 4;
> > > > > +@item
> > > > > +add the result to the base; and
> > > > > +@item
> > > > > +load the value at that address (operand 1 + @var{i} * multiplied and 
> > > > > extended stride) into element @var{i} of operand 0.
> > > > > +@end itemize
> > > > > +
> > > > > +Similar to mask_len_load, the instruction loads at most (operand 6 + 
> > > > > operand 7) elements from memory.
> > > > > +Bit @var{i} of the mask is set if element @var{i} of the result 
> > > > > should
> > > > > +be loaded from memory and clear if element @var{i} of the result 
> > > > > should be undefined.
> > > > > +Mask elements @var{i} with @var{i} > (operand 6 + operand 7) are 
> > > > > ignored.
> > > > > +
> > > > >  @cindex @code{scatter_store@var{m}@var{n}} instruction pattern
> > > > >  @item @samp{scatter_store@var{m}@var{n}}
> > > > >  Store a vector of mode @var{m} into several distinct memory 
> > > > > locations.
> > > > > @@ -5131,6 +5157,31 @@ at most (operand 6 + operand 7) elements of 
> > > > > (operand 4) to memory.
> > > > >  Bit @var{i} of the mask is set if element @var{i} of (operand 4) 
> > > > > should be stored.
> > > > >  Mask elements @var{i} with @var{i} > (operand 6 + operand 7) are 
> > > > > ignored.
> > > > >  
> > > > > +@cindex @code{mask_len_strided_store@var{m}@var{n}} instruction 
> > > > > pattern
> > > > > +@item @samp{mask_len_strided_store@var{m}@var{n}}
> > > > > +Store a vector of mode m into several distinct memory locations.
> > > > > +Operand 0 is a scalar base address and operand 1 is scalar stride of 
> > > > > mode @var{n}.
> > > > > +Operand 2 is the vector of values that should be stored, which is of 
> > > > > mode @var{m}.
> > > > > +The instruction can be seen as a special case of 
> > > > > @code{mask_len_scatter_store@var{m}@var{n}}
> > > > > +with an offset vector that is a @code{vec_series} with operand 1 as 
> > > > > base and operand 2 as step.
> > > > > +For each element index i:
> > > > > +
> > > > > +@itemize @bullet
> > > > > +@item
> > > > > +extend the stride to address width, using zero
> > > > > +extension if operand 2 is 1 and sign extension if operand 2 is zero;
> > > > > +@item
> > > > > +multiply the extended stride by operand 3;
> > > > > +@item
> > > > > +add the result to the base; and
> > > > > +@item
> > > > > +store element @var{i} of operand 4 to that address (operand 1 + 
> > > > > @var{i} * multiplied and extended stride).
> > > > > +@end itemize
> > > > > +
> > > > > +Similar to mask_len_store, the instruction stores at most (operand 6 
> > > > > + operand 7) elements of (operand 4) to memory.
> > > > > +Bit @var{i} of the mask is set if element @var{i} of (operand 4) 
> > > > > should be stored.
> > > > > +Mask elements @var{i} with @var{i} > (operand 6 + operand 7) are 
> > > > > ignored.
> > > > > +
> > > > >  @cindex @code{vec_set@var{m}} instruction pattern
> > > > >  @item @samp{vec_set@var{m}}
> > > > >  Set given field in the vector value.  Operand 0 is the vector to 
> > > > > modify,
> > > > > diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc
> > > > > index e7451b96353..f7f85aa7dde 100644
> > > > > --- a/gcc/internal-fn.cc
> > > > > +++ b/gcc/internal-fn.cc
> > > > > @@ -4596,6 +4596,7 @@ internal_load_fn_p (internal_fn fn)
> > > > >      case IFN_GATHER_LOAD:
> > > > >      case IFN_MASK_GATHER_LOAD:
> > > > >      case IFN_MASK_LEN_GATHER_LOAD:
> > > > > +    case IFN_MASK_LEN_STRIDED_LOAD:
> > > > >      case IFN_LEN_LOAD:
> > > > >      case IFN_MASK_LEN_LOAD:
> > > > >        return true;
> > > > > @@ -4648,6 +4649,22 @@ internal_gather_scatter_fn_p (internal_fn fn)
> > > > >      }
> > > > >  }
> > > > >  
> > > > > +/* Return true if IFN is some form of strided load or strided store. 
> > > > >  */
> > > > > +
> > > > > +bool
> > > > > +internal_strided_fn_p (internal_fn fn)
> > > > > +{
> > > > > +  switch (fn)
> > > > > +    {
> > > > > +    case IFN_MASK_LEN_STRIDED_LOAD:
> > > > > +    case IFN_MASK_LEN_STRIDED_STORE:
> > > > > +      return true;
> > > > > +
> > > > > +    default:
> > > > > +      return false;
> > > > > +    }
> > > > > +}
> > > > > +
> > > > >  /* If FN takes a vector len argument, return the index of that 
> > > > > argument,
> > > > >     otherwise return -1.  */
> > > > >  
> > > > > @@ -4662,6 +4679,8 @@ internal_fn_len_index (internal_fn fn)
> > > > >  
> > > > >      case IFN_MASK_LEN_GATHER_LOAD:
> > > > >      case IFN_MASK_LEN_SCATTER_STORE:
> > > > > +    case IFN_MASK_LEN_STRIDED_LOAD:
> > > > > +    case IFN_MASK_LEN_STRIDED_STORE:
> > > > >      case IFN_COND_LEN_FMA:
> > > > >      case IFN_COND_LEN_FMS:
> > > > >      case IFN_COND_LEN_FNMA:
> > > > > @@ -4719,6 +4738,8 @@ internal_fn_mask_index (internal_fn fn)
> > > > >      case IFN_MASK_SCATTER_STORE:
> > > > >      case IFN_MASK_LEN_GATHER_LOAD:
> > > > >      case IFN_MASK_LEN_SCATTER_STORE:
> > > > > +    case IFN_MASK_LEN_STRIDED_LOAD:
> > > > > +    case IFN_MASK_LEN_STRIDED_STORE:
> > > > >        return 4;
> > > > >  
> > > > >      default:
> > > > > @@ -4740,6 +4761,7 @@ internal_fn_stored_value_index (internal_fn fn)
> > > > >      case IFN_SCATTER_STORE:
> > > > >      case IFN_MASK_SCATTER_STORE:
> > > > >      case IFN_MASK_LEN_SCATTER_STORE:
> > > > > +    case IFN_MASK_LEN_STRIDED_STORE:
> > > > >        return 3;
> > > > >  
> > > > >      case IFN_LEN_STORE:
> > > > > @@ -4784,6 +4806,28 @@ internal_gather_scatter_fn_supported_p 
> > > > > (internal_fn ifn, tree vector_type,
> > > > >    && insn_operand_matches (icode, 3 + output_ops, GEN_INT (scale)));
> > > > >  }
> > > > >  
> > > > > +/* Return true if the target supports strided load or strided store 
> > > > > function
> > > > > +   IFN.  For loads, VECTOR_TYPE is the vector type of the load 
> > > > > result,
> > > > > +   while for stores it is the vector type of the stored data 
> > > > > argument.
> > > > > +   STRIDE_TYPE is the type that holds the stride from the previous 
> > > > > element
> > > > > +   memory address of each loaded or stored element.
> > > > > +   SCALE is the amount by which these stride should be multiplied
> > > > > +   *after* they have been extended to address width.  */
> > > > > +
> > > > > +bool
> > > > > +internal_strided_fn_supported_p (internal_fn ifn, tree vector_type,
> > > > > + tree offset_type, int scale)
> > > > > +{
> > > > > +  optab optab = direct_internal_fn_optab (ifn);
> > > > > +  insn_code icode = convert_optab_handler (optab, TYPE_MODE 
> > > > > (vector_type),
> > > > > +    TYPE_MODE (offset_type));
> > > > > +  int output_ops = internal_load_fn_p (ifn) ? 1 : 0;
> > > > > +  bool unsigned_p = TYPE_UNSIGNED (offset_type);
> > > > > +  return (icode != CODE_FOR_nothing
> > > > > +   && insn_operand_matches (icode, 2 + output_ops, GEN_INT 
> > > > > (unsigned_p))
> > > > > +   && insn_operand_matches (icode, 3 + output_ops, GEN_INT (scale)));
> > > > > +}
> > > > > +
> > > > >  /* Return true if the target supports IFN_CHECK_{RAW,WAR}_PTRS 
> > > > > function IFN
> > > > >     for pointers of type TYPE when the accesses have LENGTH bytes and 
> > > > > their
> > > > >     common byte alignment is ALIGN.  */
> > > > > diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
> > > > > index a2023ab9c3d..0fa532e8f6b 100644
> > > > > --- a/gcc/internal-fn.def
> > > > > +++ b/gcc/internal-fn.def
> > > > > @@ -199,6 +199,8 @@ DEF_INTERNAL_OPTAB_FN (MASK_GATHER_LOAD, ECF_PURE,
> > > > >         mask_gather_load, gather_load)
> > > > >  DEF_INTERNAL_OPTAB_FN (MASK_LEN_GATHER_LOAD, ECF_PURE,
> > > > >         mask_len_gather_load, gather_load)
> > > > > +DEF_INTERNAL_OPTAB_FN (MASK_LEN_STRIDED_LOAD, ECF_PURE,
> > > > > +        mask_len_strided_load, gather_load)
> > > > >  
> > > > >  DEF_INTERNAL_OPTAB_FN (LEN_LOAD, ECF_PURE, len_load, len_load)
> > > > >  DEF_INTERNAL_OPTAB_FN (MASK_LEN_LOAD, ECF_PURE, mask_len_load, 
> > > > > mask_len_load)
> > > > > @@ -208,6 +210,8 @@ DEF_INTERNAL_OPTAB_FN (MASK_SCATTER_STORE, 0,
> > > > >         mask_scatter_store, scatter_store)
> > > > >  DEF_INTERNAL_OPTAB_FN (MASK_LEN_SCATTER_STORE, 0,
> > > > >         mask_len_scatter_store, scatter_store)
> > > > > +DEF_INTERNAL_OPTAB_FN (MASK_LEN_STRIDED_STORE, 0,
> > > > > +        mask_len_strided_store, scatter_store)
> > > > >  
> > > > >  DEF_INTERNAL_OPTAB_FN (MASK_STORE, 0, maskstore, mask_store)
> > > > >  DEF_INTERNAL_OPTAB_FN (STORE_LANES, ECF_CONST, vec_store_lanes, 
> > > > > store_lanes)
> > > > > diff --git a/gcc/internal-fn.h b/gcc/internal-fn.h
> > > > > index 99de13a0199..8379c61dff7 100644
> > > > > --- a/gcc/internal-fn.h
> > > > > +++ b/gcc/internal-fn.h
> > > > > @@ -235,11 +235,13 @@ extern bool can_interpret_as_conditional_op_p 
> > > > > (gimple *, tree *,
> > > > >  extern bool internal_load_fn_p (internal_fn);
> > > > >  extern bool internal_store_fn_p (internal_fn);
> > > > >  extern bool internal_gather_scatter_fn_p (internal_fn);
> > > > > +extern bool internal_strided_fn_p (internal_fn);
> > > > >  extern int internal_fn_mask_index (internal_fn);
> > > > >  extern int internal_fn_len_index (internal_fn);
> > > > >  extern int internal_fn_stored_value_index (internal_fn);
> > > > >  extern bool internal_gather_scatter_fn_supported_p (internal_fn, 
> > > > > tree,
> > > > >      tree, tree, int);
> > > > > +extern bool internal_strided_fn_supported_p (internal_fn, tree, 
> > > > > tree, int);
> > > > >  extern bool internal_check_ptrs_fn_supported_p (internal_fn, tree,
> > > > >  poly_uint64, unsigned int);
> > > > >  #define VECT_PARTIAL_BIAS_UNSUPPORTED 127
> > > > > diff --git a/gcc/optabs.def b/gcc/optabs.def
> > > > > index 2ccbe4197b7..3d85ac5f678 100644
> > > > > --- a/gcc/optabs.def
> > > > > +++ b/gcc/optabs.def
> > > > > @@ -98,9 +98,11 @@ OPTAB_CD(mask_len_store_optab, 
> > > > > "mask_len_store$a$b")
> > > > >  OPTAB_CD(gather_load_optab, "gather_load$a$b")
> > > > >  OPTAB_CD(mask_gather_load_optab, "mask_gather_load$a$b")
> > > > >  OPTAB_CD(mask_len_gather_load_optab, "mask_len_gather_load$a$b")
> > > > > +OPTAB_CD(mask_len_strided_load_optab, "mask_len_strided_load$a$b")
> > > > >  OPTAB_CD(scatter_store_optab, "scatter_store$a$b")
> > > > >  OPTAB_CD(mask_scatter_store_optab, "mask_scatter_store$a$b")
> > > > >  OPTAB_CD(mask_len_scatter_store_optab, "mask_len_scatter_store$a$b")
> > > > > +OPTAB_CD(mask_len_strided_store_optab, "mask_len_strided_store$a$b")
> > > > >  OPTAB_CD(vec_extract_optab, "vec_extract$a$b")
> > > > >  OPTAB_CD(vec_init_optab, "vec_init$a$b")
> > > > >  
> > > > > 
> > > >  
> > > > 
> > >  
> > > 
> >  
> > 
>  
> 

-- 
Richard Biener <rguent...@suse.de>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)

Re: Re: [PATCH V2] OPTABS/IFN: Add mask_len_strided_load/mask_len_strided_store OPTABS/IFN

Reply via email to