Richard Biener <richard.guent...@gmail.com> writes:
> On Sat, May 21, 2022 at 5:31 PM Roger Sayle <ro...@nextmovesoftware.com> 
> wrote:
>> This patch simplifies vec_unpack_hi_expr/vec_unpack_lo_expr of a uniform
>> constructor or vec_duplicate operand.  The motivation is from PR 105621
>> where after optimization, we're left with:
>>
>>   vect_cst__21 = {c_8(D), c_8(D), c_8(D), c_8(D)};
>>   vect_iftmp.7_4 = [vec_unpack_hi_expr] vect_cst__21;
>>
>> It turns out that there are no constant folding/simplification patterns
>> in match.pd, but the above can be simplified further to the equivalent:
>>
>>   _20 = (long int) c_8(D);
>>   vect_iftmp.7_4 = [vec_duplicate_expr] _20;
>>
>> which on x86-64 results in one less instruction, replacing pshufd $0
>> then punpackhq, with punpcklqdq.  This transformation is also useful
>> for helping CSE to spot that unpack_hi and unpack_lo are equivalent.
>>
>> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
>> and make -k check with no new failures.  Ok for mainline?
>
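For readers following along, the equivalence the patch relies on can be checked with a small scalar model. This is only a sketch, not GCC code: the function names unpack_hi, unpack_lo and duplicate are made up for illustration, vectors are modelled as std::array, and treating lanes 2 and 3 as the "hi" half is an assumption.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Illustrative model of VEC_UNPACK_HI_EXPR for 4 x int32 -> 2 x int64.
// Assumption: the "hi" half is lanes 2 and 3.
std::array<std::int64_t, 2> unpack_hi(const std::array<std::int32_t, 4> &v)
{
  return {static_cast<std::int64_t>(v[2]), static_cast<std::int64_t>(v[3])};
}

// Illustrative model of VEC_UNPACK_LO_EXPR: widens lanes 0 and 1.
std::array<std::int64_t, 2> unpack_lo(const std::array<std::int32_t, 4> &v)
{
  return {static_cast<std::int64_t>(v[0]), static_cast<std::int64_t>(v[1])};
}

// Illustrative model of VEC_DUPLICATE_EXPR: every lane holds the same scalar.
template <typename T, std::size_t N>
std::array<T, N> duplicate(T x)
{
  std::array<T, N> r;
  r.fill(x);
  return r;
}
```

In this model, unpacking either half of a uniform vector gives the same result as widening the scalar first and then duplicating it, which is why the two unpack forms become candidates for CSE.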
> I think we need a way to query whether the target can do a VEC_DUPLICATE_EXPR.
> Currently we only ever have them for VL vectors and expand via
> expand_vector_broadcast which eventually simply gives up when there's no
> vec_duplicate or vec_init optabs suitable.
>
> IIRC with the VEC_PERM extension we should be able to handle
> VEC_DUPLICATE via VEC_PERM?  (but we don't yet accept a scalar
> input, just V1<mode>?)

Yeah, should be possible.  Not sure whether it would really help though.
A VEC_PERM_EXPR with only one scalar argument could only have one sensible
permute mask[*], so there'd be a bit of false generality.

Maybe allowing scalar arguments would be more useful for 2 distinct
scalar arguments, but then I guess the question is: why stop at 2?
So if we go down the route of accepting scalars, it might be more
consistent to make VEC_PERM_EXPR support any number of operands
and use it as a replacement for CONSTRUCTOR as well.
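To make the point concrete, here is a rough scalar model of a permute that selects from any number of scalar operands. It is a sketch only: perm_scalars is a hypothetical helper, not an existing GCC interface, and it rejects out-of-range selector values rather than wrapping them.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical generalised permute: lane i of the result is ops[sel[i]].
// Assumption for this sketch: selector values must be in range.
std::vector<long> perm_scalars(const std::vector<long> &ops,
                               const std::vector<std::size_t> &sel)
{
  std::vector<long> result;
  result.reserve(sel.size());
  for (std::size_t idx : sel)
    {
      assert(idx < ops.size());
      result.push_back(ops[idx]);
    }
  return result;
}
```

With a single operand the only valid selector is all zeros, which is just a duplicate. With as many operands as lanes and an identity selector, the same operation behaves like a CONSTRUCTOR.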

Thanks,
Richard

[*] At least until we support “don't care” elements.  However, like I
    mentioned before, I'd personally prefer a “don't care” mask to be
    a separate operand, rather than treating something like -1 as a
    special value.  Special values like that don't really fit the
    current encoding scheme for VL constants, but a separate mask would.

    A separate don't-care mask would also work for variable permute masks.
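    As a rough illustration of the separate-operand idea (a sketch only; permute_ok and the fixed lane count N are made up for this example and do not correspond to anything in GCC), the selector stays an ordinary index vector and a second boolean operand marks the lanes whose value is unconstrained:

```cpp
#include <array>
#include <cassert>
#include <cstddef>

constexpr std::size_t N = 4;

// Returns true if `candidate` is an acceptable result of permuting `v`
// by `sel`, where lanes flagged in `dont_care` may hold any value.
// The selector itself needs no special values such as -1.
bool permute_ok(const std::array<int, N> &v,
                const std::array<std::size_t, N> &sel,
                const std::array<bool, N> &dont_care,
                const std::array<int, N> &candidate)
{
  for (std::size_t i = 0; i < N; ++i)
    {
      if (dont_care[i])
        continue;            // any value is fine in this lane
      assert(sel[i] < N);    // sketch assumes in-range selectors
      if (candidate[i] != v[sel[i]])
        return false;
    }
  return true;
}
```

    Because dont_care is an independent operand, the same check works unchanged whether sel is a compile-time constant or a runtime value.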
>
> I see most targets have picked up vec_duplicate but sparc, but still
> we'd need to check the specific mode.  I think we can disregard
> vec_init checking and only apply the transforms when vec_duplicate
> is available.
>
> Richard.
>
>>
>> 2022-05-21  Roger Sayle  <ro...@nextmovesoftware.com>
>>
>> gcc/ChangeLog
>>         * match.pd (simplify vec_unpack_hi): Simplify VEC_UNPACK_*_EXPR
>>         of uniform vector constructors and vec_duplicate.
>>
>> gcc/testsuite/ChangeLog
>>         * g++.dg/vect/pr105621.cc: New test case.
>>
>>
>> Thanks in advance,
>> Roger
>> --
>>
