On Fri, 12 May 2023, Richard Sandiford wrote:

> Richard Biener <rguent...@suse.de> writes:
> > On Fri, 12 May 2023, Andre Vieira (lists) wrote:
> >
> >> I have dealt with, I think..., most of your comments. There's quite a few
> >> changes, I think it's all a bit simpler now. I made some other changes to 
> >> the
> >> costing in tree-inline.cc and gimple-range-op.cc in which I try to preserve
> >> the same behaviour as we had with the tree codes before. Also added some 
> >> extra
> >> checks to tree-cfg.cc that made sense to me.
> >> 
> >> I am still regression testing the gimple-range-op change, as that was a 
> >> last
> >> minute change, but the rest survived a bootstrap and regression test on
> >> aarch64-unknown-linux-gnu.
> >> 
> >> cover letter:
> >> 
> >> This patch replaces the existing tree_code widen_plus and widen_minus
> >> patterns with internal_fn versions.
> >> 
> >> DEF_INTERNAL_OPTAB_WIDENING_HILO_FN and 
> >> DEF_INTERNAL_OPTAB_NARROWING_HILO_FN
> >> are like DEF_INTERNAL_SIGNED_OPTAB_FN and DEF_INTERNAL_OPTAB_FN 
> >> respectively
> >> except they provide convenience wrappers for defining conversions that 
> >> require
> >> a hi/lo split.  Each definition for <NAME> will require optabs for _hi and 
> >> _lo
> >> and each of those will also require a signed and unsigned version in the 
> >> case
> >> of widening. The hi/lo pair is necessary because the widening and narrowing
> >> operations take n narrow elements as inputs and return n/2 wide elements as
> >> outputs. The 'lo' operation operates on the first n/2 elements of input. 
> >> The
> >> 'hi' operation operates on the second n/2 elements of input. Defining an
> >> internal_fn along with hi/lo variations allows a single internal function 
> >> to
> >> be returned from a vect_recog function that will later be expanded to 
> >> hi/lo.
> >> 
> >> 
> >>  For example:
> >>  IFN_VEC_WIDEN_PLUS -> IFN_VEC_WIDEN_PLUS_HI, IFN_VEC_WIDEN_PLUS_LO
> >> for aarch64: IFN_VEC_WIDEN_PLUS_HI   -> vec_widen_<su>add_hi_<mode> ->
> >> (u/s)addl2
> >>                        IFN_VEC_WIDEN_PLUS_LO  -> 
> >> vec_widen_<su>add_lo_<mode>
> >> -> (u/s)addl
> >> 
> >> This gives the same functionality as the previous WIDEN_PLUS/WIDEN_MINUS 
> >> tree
> >> codes which are expanded into VEC_WIDEN_PLUS_LO, VEC_WIDEN_PLUS_HI.
> >
> > What I still don't understand is how we are so narrowly focused on
> > HI/LO?  We need a combined scalar IFN for pattern selection (not
> > sure why that's now called _HILO, I expected no suffix).  Then there's
> > three possibilities the target can implement this:
> >
> >  1) with a widen_[su]add<mode> instruction - I _think_ that's what
> >     RISCV is going to offer since it is a target where vector modes
> >     have "padding" (aka you cannot subreg a V2SI to get V4HI).  Instead
> >     RVV can do a V4HI to V4SI widening and widening add/subtract
> >     using vwadd[u] and vwsub[u] (the HI->SI widening is actually
> >     done with a widening add of zero - eh).
> >     IIRC GCN is the same here.
> 
> SVE currently does this too, but the addition and widening are
> separate operations.  E.g. in principle there's no reason why
> you can't sign-extend one operand, zero-extend the other, and
> then add the result together.  Or you could extend them from
> different sizes (QI and HI).  All of those are supported
> (if the costing allows them).

I see.  So why does the target the expose widen_[su]add<mode> at all?

> If the target has operations to do combined extending and adding (or
> whatever), then at the moment we rely on combine to generate them.
> 
> So I think this case is separate from Andre's work.  The addition
> itself is just an ordinary addition, and any widening happens by
> vectorising a CONVERT/NOP_EXPR.
> 
> >  2) with a widen_[su]add{_lo,_hi}<mode> combo - that's what the tree
> >     codes currently support (exclusively)
> >  3) similar, but widen_[su]add{_even,_odd}<mode>
> >
> > that said, things like decomposes_to_hilo_fn_p look to paint us into
> > a 2) corner without good reason.
> 
> I suppose one question is: how much of the patch is really specific
> to HI/LO, and how much is just grouping two halves together?

Yep, that I don't know for sure.

>  The nice
> thing about the internal-fn grouping macros is that, if (3) is
> implemented in future, the structure will strongly encourage even/odd
> pairs to be supported for all operations that support hi/lo.  That is,
> I would expect the grouping macros to be extended to define even/odd
> ifns alongside hi/lo ones, rather than adding separate definitions
> for even/odd functions.
> 
> If so, at least from the internal-fn.* side of things, I think the question
> is whether it's OK to stick with hilo names for now, or whether we should
> use more forward-looking names.

I think for parts that are independent we could use a more
forward-looking name.  Maybe _halves?  But I'm also not sure
how much of that is really needed (it seems to be tied around
optimizing optabs space?)

Richard.

> Thanks,
> Richard
> 
> >
> > Richard.
> >
> >> gcc/ChangeLog:
> >> 
> >> 2023-05-12  Andre Vieira  <andre.simoesdiasvie...@arm.com>
> >>             Joel Hutton  <joel.hut...@arm.com>
> >>             Tamar Christina  <tamar.christ...@arm.com>
> >> 
> >>         * config/aarch64/aarch64-simd.md (vec_widen_<su>addl_lo_<mode>):
> >> Rename
> >>         this ...
> >>         (vec_widen_<su>add_lo_<mode>): ... to this.
> >>         (vec_widen_<su>addl_hi_<mode>): Rename this ...
> >>         (vec_widen_<su>add_hi_<mode>): ... to this.
> >>         (vec_widen_<su>subl_lo_<mode>): Rename this ...
> >>         (vec_widen_<su>sub_lo_<mode>): ... to this.
> >>         (vec_widen_<su>subl_hi_<mode>): Rename this ...
> >>         (vec_widen_<su>sub_hi_<mode>): ...to this.
> >>         * doc/generic.texi: Document new IFN codes.
> >>    * internal-fn.cc (DEF_INTERNAL_OPTAB_WIDENING_HILO_FN): Macro to
> >>    define an
> >>         internal_fn that expands into multiple internal_fns for widening.
> >>         (DEF_INTERNAL_OPTAB_NARROWING_HILO_FN): Likewise but for narrowing.
> >>    (ifn_cmp): Function to compare ifn's for sorting/searching.
> >>    (lookup_hilo_internal_fn): Add lookup function.
> >>    (commutative_binary_fn_p): Add widen_plus fn's.
> >>    (widening_fn_p): New function.
> >>    (narrowing_fn_p): New function.
> >>    (decomposes_to_hilo_fn_p): New function.
> >>             (direct_internal_fn_optab): Change visibility.
> >>            * internal-fn.def (DEF_INTERNAL_OPTAB_WIDENING_HILO_FN): Define
> >>     widening
> >>     plus,minus functions.
> >>    (VEC_WIDEN_PLUS): Replacement for VEC_WIDEN_PLUS_EXPR tree code.
> >>    (VEC_WIDEN_MINUS): Replacement for VEC_WIDEN_MINUS_EXPR tree code.
> >>    * internal-fn.h (GCC_INTERNAL_FN_H): Add headers.
> >>             (direct_internal_fn_optab): Declare new prototype.
> >>    (lookup_hilo_internal_fn): Likewise.
> >>    (widening_fn_p): Likewise.
> >>    (Narrowing_fn_p): Likewise.
> >>    (decomposes_to_hilo_fn_p): Likewise.
> >>    * optabs.cc (commutative_optab_p): Add widening plus optabs.
> >>    * optabs.def (OPTAB_D): Define widen add, sub optabs.
> >>         * tree-cfg.cc (verify_gimple_call): Add checks for new widen
> >>         add and sub IFNs.
> >>         * tree-inline.cc (estimate_num_insns): Return same
> >>         cost for widen add and sub IFNs as previous tree_codes.
> >>            * tree-vect-patterns.cc (vect_recog_widen_op_pattern): Support
> >>     patterns with a hi/lo split.
> >>         (vect_recog_sad_pattern): Refactor to use new IFN codes.
> >>         (vect_recog_widen_plus_pattern): Likewise.
> >>         (vect_recog_widen_minus_pattern): Likewise.
> >>         (vect_recog_average_pattern): Likewise.
> >>    * tree-vect-stmts.cc (vectorizable_conversion): Add support for
> >>             _HILO IFNs.
> >>    (supportable_widening_operation): Likewise.
> >>         * tree.def (WIDEN_SUM_EXPR): Update example to use new IFNs.
> >> 
> >> gcc/testsuite/ChangeLog:
> >> 
> >>            * gcc.target/aarch64/vect-widen-add.c: Test that new
> >>     IFN_VEC_WIDEN_PLUS is being used.
> >>            * gcc.target/aarch64/vect-widen-sub.c: Test that new
> >>     IFN_VEC_WIDEN_MINUS is being used.
> >> 
> 

-- 
Richard Biener <rguent...@suse.de>
SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg,
Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman;
HRB 36809 (AG Nuernberg)

Reply via email to