On Mon, Sep 4, 2017 at 7:28 AM, Tamar Christina <tamar.christ...@arm.com> wrote:
>> >   vect__5.25_58 = VIEW_CONVERT_EXPR<vector(1) long unsigned
>> intD.11>(vect__4.21_65);
>> >   vect__5.25_57 = VIEW_CONVERT_EXPR<vector(1) long unsigned
>> intD.11>(vect__4.22_63);
>> >   vect__5.25_56 = VIEW_CONVERT_EXPR<vector(1) long unsigned
>> intD.11>(vect__4.23_61);
>> >   vect__5.25_55 = VIEW_CONVERT_EXPR<vector(1) long unsigned
>> > intD.11>(vect__4.24_59);
>> >
>> > I suspect this patch will be quite bad for us performance wise as it
>> > thinks it's as cheap to do all our integer operations on the vector side 
>> > with
>> vectors of 1 element. But I'm still waiting for the perf numbers to confirm.
>>
>> Looks like the backend advertises that it can do POPCOUNT on V1DI.  So SLP
>> vectorization decides it can vectorize this without unrolling.
>
> We don't, POPCOUNT is only defined for vector modes V8QI and V16QI, we also 
> don't define support
> For V1DI anywhere in the backend, we do however say we support V1DF, but 
> removing
> That doesn't cause the ICE to go away.
>
>> Vectorization with V2DI is rejected:
>>
>> /tmp/trunk/gcc/testsuite/gcc.dg/vect/pr68577.c:18:3: note: function is not
>> vectorizable.
>> /tmp/trunk/gcc/testsuite/gcc.dg/vect/pr68577.c:18:3: note: not vectorized:
>> relevant stmt not supported: _8 = __builtin_popcountl (_5); ...
>> /tmp/trunk/gcc/testsuite/gcc.dg/vect/pr68577.c:18:3: note: ***** Re-trying
>> analysis with vector size 8
>>
>> and that now succeeds (it probably didn't succeed before the patch).
>
> In the .optimized file, I see it vectorised it to
>
>   vect__5.25_58 = VIEW_CONVERT_EXPR<vector(1) long unsigned 
> intD.11>(vect__4.21_65);
>   vect__5.25_57 = VIEW_CONVERT_EXPR<vector(1) long unsigned 
> intD.11>(vect__4.22_63);
>   vect__5.25_56 = VIEW_CONVERT_EXPR<vector(1) long unsigned 
> intD.11>(vect__4.23_61);
>   vect__5.25_55 = VIEW_CONVERT_EXPR<vector(1) long unsigned 
> intD.11>(vect__4.24_59);
>   _54 = POPCOUNT (vect__5.25_58);
>   _53 = POPCOUNT (vect__5.25_57);
>
> Which is something we just don't have a pattern for. Before this patch, it 
> was rejecting "long unsigned int"
> With this patch is somehow thinks we support an integer vector of 1 element, 
> even though 1) we don't have an optab
> Defined for this operation for POPCOUNT (or at all in aarch64 as far as I can 
> tell), and 2) we don't have it in our supported list of vector modes.

Here are the two popcount optab aarch64 has:
(define_mode_iterator GPI [SI DI])
(define_expand "popcount<mode>2"
  [(match_operand:GPI 0 "register_operand")
   (match_operand:GPI 1 "register_operand")]


(define_insn "popcount<mode>2"
(define_mode_iterator VB [V8QI V16QI])
  [(set (match_operand:VB 0 "register_operand" "=w")
        (popcount:VB (match_operand:VB 1 "register_operand" "w")))]

As you can see we only define popcount optab for SI, DI, V8QI and
V16QI.  (note SI and DI uses the V8QI and V16QI during the expansion
but that is a different story).

Maybe somehow the vectorizer is thinking V1DI and DI are
interchangeable in some places.

Thanks,
Andrew


>
> So I don't quite understand how we end up with this expression.
>
> Regards,
> Tamar
>
>>
>> Richard.
>>
>> > ________________________________________
>> > From: gcc-patches-ow...@gcc.gnu.org <gcc-patches-ow...@gcc.gnu.org>
>> on
>> > behalf of Andreas Schwab <sch...@linux-m68k.org>
>> > Sent: Saturday, September 2, 2017 10:58 PM
>> > To: Jon Beniston
>> > Cc: 'Richard Biener'; gcc-patches@gcc.gnu.org
>> > Subject: Re: [RFC, vectorizer] Allow single element vector types for
>> > vector reduction operations
>> >
>> > On Aug 30 2017, "Jon Beniston" <j...@beniston.com> wrote:
>> >
>> > > gcc/
>> > > 2017-08-30  Jon Beniston  <j...@beniston.com>
>> > >             Richard Biener  <rguent...@suse.de>
>> > >
>> > > diff --git a/gcc/tree-vect-patterns.c b/gcc/tree-vect-patterns.c
>> > > index cfdb72c..5ebeac2 100644
>> > > --- a/gcc/tree-vect-patterns.c
>> > > +++ b/gcc/tree-vect-patterns.c
>> > > @@ -4150,7 +4150,7 @@ vect_pattern_recog_1 (vect_recog_func
>> *recog_func,
>> > >    loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info);
>> > >
>> > >    if (VECTOR_BOOLEAN_TYPE_P (type_in)
>> > > -      || VECTOR_MODE_P (TYPE_MODE (type_in)))
>> > > +      || VECTOR_TYPE_P (type_in))
>> > >      {
>> > >        /* No need to check target support (already checked by the pattern
>> > >           recognition function).  */ diff --git
>> > > a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c index
>> > > 013fb1f..fc62efb 100644
>> > > --- a/gcc/tree-vect-stmts.c
>> > > +++ b/gcc/tree-vect-stmts.c
>> > > @@ -9098,7 +9098,8 @@ get_vectype_for_scalar_type_and_size (tree
>> > > scalar_type, unsigned size)
>> > >    else
>> > >      simd_mode = mode_for_vector (inner_mode, size / nbytes);
>> > >    nunits = GET_MODE_SIZE (simd_mode) / nbytes;
>> > > -  if (nunits <= 1)
>> > > +  /* NOTE: nunits == 1 is allowed to support single element vector 
>> > > types.
>> > > */
>> > > +  if (nunits < 1)
>> > >      return NULL_TREE;
>> > >
>> > >    vectype = build_vector_type (scalar_type, nunits);
>> > >
>> > >
>> > >
>> >
>> > That breaks vect/pr68577.c on aarch64.
>> >
>> > during RTL pass: expand
>> > /opt/gcc/gcc-20170902/gcc/testsuite/gcc.dg/vect/pr68577.c: In function
>> 'slp_test':
>> > /opt/gcc/gcc-20170902/gcc/testsuite/gcc.dg/vect/pr68577.c:20:12:
>> > internal compiler error: in simplify_subreg, at simplify-rtx.c:6050
>> > 0xb55983 simplify_subreg(machine_mode, rtx_def*, machine_mode,
>> unsigned int)
>> >         ../../gcc/simplify-rtx.c:6049
>> > 0xb595f7 simplify_gen_subreg(machine_mode, rtx_def*, machine_mode,
>> unsigned int)
>> >         ../../gcc/simplify-rtx.c:6278
>> > 0x81d277 store_bit_field_1
>> >         ../../gcc/expmed.c:798
>> > 0x81d55f store_bit_field(rtx_def*, unsigned long, unsigned long, unsigned
>> long, unsigned long, machine_mode, rtx_def*, bool)
>> >         ../../gcc/expmed.c:1133
>> > 0x840bf7 store_field
>> >         ../../gcc/expr.c:6950
>> > 0x84792f store_constructor_field
>> >         ../../gcc/expr.c:6142
>> > 0x83edbf store_constructor
>> >         ../../gcc/expr.c:6726
>> > 0x840443 expand_constructor
>> >         ../../gcc/expr.c:8027
>> > 0x82d5bf expand_expr_real_1(tree_node*, rtx_def*, machine_mode,
>> expand_modifier, rtx_def**, bool)
>> >         ../../gcc/expr.c:10133
>> > 0x82eaeb expand_expr_real_1(tree_node*, rtx_def*, machine_mode,
>> expand_modifier, rtx_def**, bool)
>> >         ../../gcc/expr.c:9819
>> > 0x82dadf expand_expr_real_1(tree_node*, rtx_def*, machine_mode,
>> expand_modifier, rtx_def**, bool)
>> >         ../../gcc/expr.c:10942
>> > 0x82eaeb expand_expr_real_1(tree_node*, rtx_def*, machine_mode,
>> expand_modifier, rtx_def**, bool)
>> >         ../../gcc/expr.c:9819
>> > 0x83d197 expand_expr
>> >         ../../gcc/expr.h:276
>> > 0x83d197 expand_assignment(tree_node*, tree_node*, bool)
>> >         ../../gcc/expr.c:4971
>> > 0x71e2f3 expand_gimple_stmt_1
>> >         ../../gcc/cfgexpand.c:3653
>> > 0x71e2f3 expand_gimple_stmt
>> >         ../../gcc/cfgexpand.c:3751
>> > 0x721cdb expand_gimple_basic_block
>> >         ../../gcc/cfgexpand.c:5750
>> > 0x726b07 execute
>> >         ../../gcc/cfgexpand.c:6357
>> >
>> > Andreas.
>> >
>> > --
>> > Andreas Schwab, sch...@linux-m68k.org
>> > GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276
>> > 4ED5 "And now for something completely different."
>> >
>> >
>>
>> --
>> Richard Biener <rguent...@suse.de>
>> SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton,
>> HRB 21284 (AG Nuernberg)

Reply via email to