On Wed, Aug 4, 2021 at 3:34 PM H.J. Lu <hjl.to...@gmail.com> wrote:
>
> On Tue, Aug 3, 2021 at 6:56 AM H.J. Lu <hjl.to...@gmail.com> wrote:
> >
> > 1. Update x86 STORE_MAX_PIECES to use OImode and XImode only if inter-unit
> > move is enabled since x86 uses vec_duplicate, which is enabled only when
> > inter-unit move is enabled, to implement store_by_pieces.
> > 2. Update op_by_pieces_d::op_by_pieces_d to set m_max_size to
> > STORE_MAX_PIECES for store_by_pieces and to COMPARE_MAX_PIECES for
> > compare_by_pieces.
> >
> > gcc/
> >
> >     PR target/101742
> >     * expr.c (op_by_pieces_d::op_by_pieces_d): Set m_max_size to
> >     STORE_MAX_PIECES for store_by_pieces and to COMPARE_MAX_PIECES
> >     for compare_by_pieces.
> >     * config/i386/i386.h (STORE_MAX_PIECES): Use OImode and XImode
> >     only if TARGET_INTER_UNIT_MOVES_TO_VEC is true.
> >
> > gcc/testsuite/
> >
> >     PR target/101742
> >     * gcc.target/i386/pr101742a.c: New test.
> >     * gcc.target/i386/pr101742b.c: Likewise.
> > ---
> >  gcc/config/i386/i386.h                    | 20 +++++++++++---------
> >  gcc/expr.c                                |  6 +++++-
> >  gcc/testsuite/gcc.target/i386/pr101742a.c | 16 ++++++++++++++++
> >  gcc/testsuite/gcc.target/i386/pr101742b.c |  4 ++++
> >  4 files changed, 36 insertions(+), 10 deletions(-)
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr101742a.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr101742b.c
> >
> > diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
> > index bed9cd9da18..9b416abd5f4 100644
> > --- a/gcc/config/i386/i386.h
> > +++ b/gcc/config/i386/i386.h
> > @@ -1783,15 +1783,17 @@ typedef struct ix86_args {
> >  /* STORE_MAX_PIECES is the number of bytes at a time that we can
> >     store efficiently.  */
> >  #define STORE_MAX_PIECES \
> > -  ((TARGET_AVX512F && !TARGET_PREFER_AVX256) \
> > -   ? 64 \
> > -   : ((TARGET_AVX \
> > -       && !TARGET_PREFER_AVX128 \
> > -       && !TARGET_AVX256_SPLIT_UNALIGNED_STORE) \
> > -      ? 32 \
> > -      : ((TARGET_SSE2 \
> > -          && TARGET_SSE_UNALIGNED_STORE_OPTIMAL) \
> > -         ? 16 : UNITS_PER_WORD)))
> > +  (TARGET_INTER_UNIT_MOVES_TO_VEC \
> > +   ? ((TARGET_AVX512F && !TARGET_PREFER_AVX256) \
> > +      ? 64 \
> > +      : ((TARGET_AVX \
> > +          && !TARGET_PREFER_AVX128 \
> > +          && !TARGET_AVX256_SPLIT_UNALIGNED_STORE) \
> > +         ? 32 \
> > +         : ((TARGET_SSE2 \
> > +             && TARGET_SSE_UNALIGNED_STORE_OPTIMAL) \
> > +            ? 16 : UNITS_PER_WORD))) \
> > +   : UNITS_PER_WORD)
> >
> >  /* If a memory-to-memory move would take MOVE_RATIO or more simple
> >     move-instruction pairs, we will do a cpymem or libcall instead.
>
> expr.c has been fixed.  Here is the v2 patch for x86 backend.
> OK for master?

OK, but please add a comment about vec_duplicate before the define to
explain the situation with TARGET_INTER_UNIT_MOVES_TO_VEC.

Thanks,
Uros.
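
For readers following the thread, the selection logic of the patched STORE_MAX_PIECES can be modelled roughly as below. This is only a sketch and not GCC code: `struct tune_model` and `store_max_pieces` are hypothetical names, the boolean fields merely stand in for the corresponding TARGET_* macros, and the comment wording is one possible phrasing of the requested explanation, not the text that was committed.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the TARGET_* macros used in i386.h.
   In GCC these are macros derived from target options and tuning,
   not struct fields.  */
struct tune_model {
  bool inter_unit_moves_to_vec;       /* TARGET_INTER_UNIT_MOVES_TO_VEC */
  bool avx512f;                       /* TARGET_AVX512F */
  bool prefer_avx256;                 /* TARGET_PREFER_AVX256 */
  bool avx;                           /* TARGET_AVX */
  bool prefer_avx128;                 /* TARGET_PREFER_AVX128 */
  bool avx256_split_unaligned_store;  /* TARGET_AVX256_SPLIT_UNALIGNED_STORE */
  bool sse2;                          /* TARGET_SSE2 */
  bool sse_unaligned_store_optimal;   /* TARGET_SSE_UNALIGNED_STORE_OPTIMAL */
  int units_per_word;                 /* UNITS_PER_WORD */
};

/* x86 implements store_by_pieces with vec_duplicate, which is enabled
   only when inter-unit moves to vector registers are enabled.  Without
   them, fall back to word-sized stores regardless of the ISA level.  */
static int
store_max_pieces (const struct tune_model *t)
{
  if (!t->inter_unit_moves_to_vec)
    return t->units_per_word;
  if (t->avx512f && !t->prefer_avx256)
    return 64;
  if (t->avx && !t->prefer_avx128 && !t->avx256_split_unaligned_store)
    return 32;
  if (t->sse2 && t->sse_unaligned_store_optimal)
    return 16;
  return t->units_per_word;
}
```

With inter-unit moves disabled, the model returns the word size even when the AVX512F stand-in is set, which is the behaviour the outer conditional in the patch adds.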