Richard Biener <richard.guent...@gmail.com> writes:
> On Sat, May 21, 2022 at 5:31 PM Roger Sayle <ro...@nextmovesoftware.com>
> wrote:
>> This patch simplifies vec_unpack_hi_expr/vec_unpack_lo_expr of a uniform
>> constructor or vec_duplicate operand.  The motivation is from PR 105621
>> where after optimization, we're left with:
>>
>>   vect_cst__21 = {c_8(D), c_8(D), c_8(D), c_8(D)};
>>   vect_iftmp.7_4 = [vec_unpack_hi_expr] vect_cst__21;
>>
>> It turns out that there are no constant folding/simplification patterns
>> in match.pd, but the above can be simplified further to the equivalent:
>>
>>   _20 = (long int) c_8(D);
>>   vect_iftmp.7_4 = [vec_duplicate_expr] _20;
>>
>> which on x86-64 results in one less instruction, replacing pshufd $0
>> then punpackhq, with punpcklqdq.  This transformation is also useful
>> for helping CSE to spot that unpack_hi and unpack_lo are equivalent.
>>
>> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
>> and make -k check with no new failures.  Ok for mainline?
>
> I think we need a way to query whether the target can do a VEC_DUPLICATE_EXPR.
> Currently we only ever have them for VL vectors and expand via
> expand_vector_broadcast which eventually simply gives up when there's no
> vec_duplicate or vec_init optabs suitable.
>
> IIRC with the VEC_PERM extension we should be able to handle
> VEC_DUPLICATE via VEC_PERM?  (but we don't yet accept a scalar
> input, just V1<mode>?)

Yeah, should be possible.  Not sure whether it would really help though.
A VEC_PERM_EXPR with only one scalar argument could only have one
sensible permute mask[*], so there'd be a bit of false generality.

Maybe allowing scalar arguments would be more useful for 2 distinct
scalar arguments, but then I guess the question is: why stop at 2?
So if we go down the route of accepting scalars, it might be more
consistent to make VEC_PERM_EXPR support any number of operands
and use it as a replacement for CONSTRUCTOR as well.

Thanks,
Richard

[*] At least until we support “don't care” elements.  However, like I
    mentioned before, I'd personally prefer a “don't care” mask to be
    a separate operand, rather than treating something like -1 as a
    special value.  Special values like that don't really fit the
    current encoding scheme for VL constants, but a separate mask would.
    A separate don't-care mask would also work for variable permute masks.

>
> I see most targets have picked up vec_duplicate but sparc, but still
> we'd need to check the specific mode.  I think we can disregard
> vec_init checking and only apply the transforms when vec_duplicate
> is available.
>
> Richard.
>
>>
>> 2022-05-21  Roger Sayle  <ro...@nextmovesoftware.com>
>>
>> gcc/ChangeLog
>>         * match.pd (simplify vec_unpack_hi): Simplify VEC_UNPACK_*_EXPR
>>         of uniform vector constructors and vec_duplicate.
>>
>> gcc/testsuite/ChangeLog
>>         * g++.dg/vect/pr105621.cc: New test case.
>>
>>
>> Thanks in advance,
>> Roger
>> --
>>
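
For reference, the equivalence that the quoted patch relies on can be
sketched with a scalar C model.  This is only an illustration under
stated assumptions: the helper names (unpack_hi, unpack_lo, dup_long,
unpack_of_uniform_is_dup) are hypothetical and are not GCC internals,
the vector is modelled as a plain array, and lanes 2 and 3 are assumed
to be the "high" half (in GCC the choice depends on endianness).

```c
/* Hypothetical scalar model of the GIMPLE operations in the thread.
   None of these helpers exist in GCC; they only mimic the semantics.  */

/* Model of vec_unpack_hi_expr: widen the assumed high half (lanes 2, 3)
   of a 4 x int vector into a 2 x long vector.  */
static void unpack_hi(const int in[4], long out[2])
{
  out[0] = (long) in[2];
  out[1] = (long) in[3];
}

/* Model of vec_unpack_lo_expr: widen the assumed low half (lanes 0, 1).  */
static void unpack_lo(const int in[4], long out[2])
{
  out[0] = (long) in[0];
  out[1] = (long) in[1];
}

/* Model of vec_duplicate_expr on an already widened scalar.  */
static void dup_long(long x, long out[2])
{
  out[0] = x;
  out[1] = x;
}

/* Lane-wise equality of two 2 x long vectors.  */
static int same2(const long a[2], const long b[2])
{
  return a[0] == b[0] && a[1] == b[1];
}

/* Returns 1 when both unpacks of the uniform vector {c, c, c, c} equal
   a duplicate of (long) c, i.e. when the proposed rewrite is valid in
   this model.  */
int unpack_of_uniform_is_dup(int c)
{
  int v[4] = { c, c, c, c };
  long hi[2], lo[2], d[2];

  unpack_hi(v, hi);
  unpack_lo(v, lo);
  dup_long((long) c, d);

  return same2(hi, d) && same2(lo, d);
}
```

Because every lane holds the same value, it does not matter which half
is unpacked, which is also why unpack_hi and unpack_lo become
equivalent for CSE purposes.  The model says nothing about whether a
given target can expand the resulting vec_duplicate, which is the
concern raised above.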