RE: [RFC, vectorizer] Allow single element vector types for vector reduction operations

Tamar Christina Mon, 04 Sep 2017 07:29:02 -0700

> >   vect__5.25_58 = VIEW_CONVERT_EXPR<vector(1) long unsigned
> intD.11>(vect__4.21_65);
> >   vect__5.25_57 = VIEW_CONVERT_EXPR<vector(1) long unsigned
> intD.11>(vect__4.22_63);
> >   vect__5.25_56 = VIEW_CONVERT_EXPR<vector(1) long unsigned
> intD.11>(vect__4.23_61);
> >   vect__5.25_55 = VIEW_CONVERT_EXPR<vector(1) long unsigned
> > intD.11>(vect__4.24_59);
> >
> > I suspect this patch will be quite bad for us performance wise as it
> > thinks it's as cheap to do all our integer operations on the vector side 
> > with
> vectors of 1 element. But I'm still waiting for the perf numbers to confirm.
> 
> Looks like the backend advertises that it can do POPCOUNT on V1DI.  So SLP
> vectorization decides it can vectorize this without unrolling.


We don't, POPCOUNT is only defined for vector modes V8QI and V16QI, we also 
don't define support
For V1DI anywhere in the backend, we do however say we support V1DF, but 
removing
That doesn't cause the ICE to go away. 

> Vectorization with V2DI is rejected:
> 
> /tmp/trunk/gcc/testsuite/gcc.dg/vect/pr68577.c:18:3: note: function is not
> vectorizable.
> /tmp/trunk/gcc/testsuite/gcc.dg/vect/pr68577.c:18:3: note: not vectorized:
> relevant stmt not supported: _8 = __builtin_popcountl (_5); ...
> /tmp/trunk/gcc/testsuite/gcc.dg/vect/pr68577.c:18:3: note: ***** Re-trying
> analysis with vector size 8
> 
> and that now succeeds (it probably didn't succeed before the patch).

In the .optimized file, I see it vectorised it to 

  vect__5.25_58 = VIEW_CONVERT_EXPR<vector(1) long unsigned 
intD.11>(vect__4.21_65);
  vect__5.25_57 = VIEW_CONVERT_EXPR<vector(1) long unsigned 
intD.11>(vect__4.22_63);
  vect__5.25_56 = VIEW_CONVERT_EXPR<vector(1) long unsigned 
intD.11>(vect__4.23_61);
  vect__5.25_55 = VIEW_CONVERT_EXPR<vector(1) long unsigned 
intD.11>(vect__4.24_59);
  _54 = POPCOUNT (vect__5.25_58);
  _53 = POPCOUNT (vect__5.25_57);

Which is something we just don't have a pattern for. Before this patch, it was 
rejecting "long unsigned int"
With this patch is somehow thinks we support an integer vector of 1 element, 
even though 1) we don't have an optab
Defined for this operation for POPCOUNT (or at all in aarch64 as far as I can 
tell), and 2) we don't have it in our supported list of vector modes.

So I don't quite understand how we end up with this expression.

Regards,
Tamar

> 
> Richard.
> 
> > ________________________________________
> > From: gcc-patches-ow...@gcc.gnu.org <gcc-patches-ow...@gcc.gnu.org>
> on
> > behalf of Andreas Schwab <sch...@linux-m68k.org>
> > Sent: Saturday, September 2, 2017 10:58 PM
> > To: Jon Beniston
> > Cc: 'Richard Biener'; gcc-patches@gcc.gnu.org
> > Subject: Re: [RFC, vectorizer] Allow single element vector types for
> > vector reduction operations
> >
> > On Aug 30 2017, "Jon Beniston" <j...@beniston.com> wrote:
> >
> > > gcc/
> > > 2017-08-30  Jon Beniston  <j...@beniston.com>
> > >             Richard Biener  <rguent...@suse.de>
> > >
> > > diff --git a/gcc/tree-vect-patterns.c b/gcc/tree-vect-patterns.c
> > > index cfdb72c..5ebeac2 100644
> > > --- a/gcc/tree-vect-patterns.c
> > > +++ b/gcc/tree-vect-patterns.c
> > > @@ -4150,7 +4150,7 @@ vect_pattern_recog_1 (vect_recog_func
> *recog_func,
> > >    loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info);
> > >
> > >    if (VECTOR_BOOLEAN_TYPE_P (type_in)
> > > -      || VECTOR_MODE_P (TYPE_MODE (type_in)))
> > > +      || VECTOR_TYPE_P (type_in))
> > >      {
> > >        /* No need to check target support (already checked by the pattern
> > >           recognition function).  */ diff --git
> > > a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c index
> > > 013fb1f..fc62efb 100644
> > > --- a/gcc/tree-vect-stmts.c
> > > +++ b/gcc/tree-vect-stmts.c
> > > @@ -9098,7 +9098,8 @@ get_vectype_for_scalar_type_and_size (tree
> > > scalar_type, unsigned size)
> > >    else
> > >      simd_mode = mode_for_vector (inner_mode, size / nbytes);
> > >    nunits = GET_MODE_SIZE (simd_mode) / nbytes;
> > > -  if (nunits <= 1)
> > > +  /* NOTE: nunits == 1 is allowed to support single element vector types.
> > > */
> > > +  if (nunits < 1)
> > >      return NULL_TREE;
> > >
> > >    vectype = build_vector_type (scalar_type, nunits);
> > >
> > >
> > >
> >
> > That breaks vect/pr68577.c on aarch64.
> >
> > during RTL pass: expand
> > /opt/gcc/gcc-20170902/gcc/testsuite/gcc.dg/vect/pr68577.c: In function
> 'slp_test':
> > /opt/gcc/gcc-20170902/gcc/testsuite/gcc.dg/vect/pr68577.c:20:12:
> > internal compiler error: in simplify_subreg, at simplify-rtx.c:6050
> > 0xb55983 simplify_subreg(machine_mode, rtx_def*, machine_mode,
> unsigned int)
> >         ../../gcc/simplify-rtx.c:6049
> > 0xb595f7 simplify_gen_subreg(machine_mode, rtx_def*, machine_mode,
> unsigned int)
> >         ../../gcc/simplify-rtx.c:6278
> > 0x81d277 store_bit_field_1
> >         ../../gcc/expmed.c:798
> > 0x81d55f store_bit_field(rtx_def*, unsigned long, unsigned long, unsigned
> long, unsigned long, machine_mode, rtx_def*, bool)
> >         ../../gcc/expmed.c:1133
> > 0x840bf7 store_field
> >         ../../gcc/expr.c:6950
> > 0x84792f store_constructor_field
> >         ../../gcc/expr.c:6142
> > 0x83edbf store_constructor
> >         ../../gcc/expr.c:6726
> > 0x840443 expand_constructor
> >         ../../gcc/expr.c:8027
> > 0x82d5bf expand_expr_real_1(tree_node*, rtx_def*, machine_mode,
> expand_modifier, rtx_def**, bool)
> >         ../../gcc/expr.c:10133
> > 0x82eaeb expand_expr_real_1(tree_node*, rtx_def*, machine_mode,
> expand_modifier, rtx_def**, bool)
> >         ../../gcc/expr.c:9819
> > 0x82dadf expand_expr_real_1(tree_node*, rtx_def*, machine_mode,
> expand_modifier, rtx_def**, bool)
> >         ../../gcc/expr.c:10942
> > 0x82eaeb expand_expr_real_1(tree_node*, rtx_def*, machine_mode,
> expand_modifier, rtx_def**, bool)
> >         ../../gcc/expr.c:9819
> > 0x83d197 expand_expr
> >         ../../gcc/expr.h:276
> > 0x83d197 expand_assignment(tree_node*, tree_node*, bool)
> >         ../../gcc/expr.c:4971
> > 0x71e2f3 expand_gimple_stmt_1
> >         ../../gcc/cfgexpand.c:3653
> > 0x71e2f3 expand_gimple_stmt
> >         ../../gcc/cfgexpand.c:3751
> > 0x721cdb expand_gimple_basic_block
> >         ../../gcc/cfgexpand.c:5750
> > 0x726b07 execute
> >         ../../gcc/cfgexpand.c:6357
> >
> > Andreas.
> >
> > --
> > Andreas Schwab, sch...@linux-m68k.org
> > GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276
> > 4ED5 "And now for something completely different."
> >
> >
> 
> --
> Richard Biener <rguent...@suse.de>
> SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton,
> HRB 21284 (AG Nuernberg)

RE: [RFC, vectorizer] Allow single element vector types for vector reduction operations

Reply via email to