On Wed, Dec 6, 2017 at 4:11 PM, Richard Sandiford
<richard.sandif...@linaro.org> wrote:
> Richard Biener <richard.guent...@gmail.com> writes:
>> On Thu, Nov 30, 2017 at 2:18 PM, Richard Sandiford
>> <richard.sandif...@linaro.org> wrote:
>>> Richard Sandiford <richard.sandif...@linaro.org> writes:
>>>> Richard Biener <richard.guent...@gmail.com> writes:
>>>>> On Wed, Nov 29, 2017 at 12:57 PM, Richard Sandiford
>>>>> <richard.sandif...@linaro.org> wrote:
>>>>>> It was clear from the SVE reviews that people were unhappy with how
>>>>>> "special" the variable-length case was.  One particular concern was
>>>>>> the use of VEC_DUPLICATE_CST and VEC_SERIES_CST, and the way that
>>>>>> that would in turn lead to different representations of VEC_PERM_EXPRs
>>>>>> with constant permute vectors, and difficulties in invoking
>>>>>> vec_perm_const_ok.
>>>>>>
>>>>>> This is an RFC for a VECTOR_CST representation that treats each
>>>>>> specific constant as one instance of an arbitrary-length sequence.
>>>>>> The reprensentation then extends to variable-length vectors in a
>>>>>> natural way.
>>>>>>
>>>>>> As discussed on IRC, if a vector contains X*N elements for some
>>>>>> constant N and integer X>0, the main features we need are:
>>>>>>
>>>>>> 1) the ability to represent a sequence that duplicates N values
>>>>>>
>>>>>>    This is useful for SLP invariants.
>>>>>>
>>>>>> 2) the ability to represent a sequence that starts with N values and
>>>>>>    is followed by zeros
>>>>>>
>>>>>>    This is useful for the initial value in a double or SLP reduction
>>>>>>
>>>>>> 3) the ability to represent N interleaved series
>>>>>>
>>>>>>    This is useful for SLP inductions and for VEC_PERM_EXPRs.
>>>>>>
>>>>>> For (2), zero isn't necessarily special, since vectors used in an AND
>>>>>> reduction might need to fill with ones.  Also, we might need up to N
>>>>>> different fill values with mixed SLP operations; it isn't necessarily
>>>>>> safe to assume that a single fill value will always be enough.
>>>>>>
>>>>>> The same goes for (3): there's no reason in principle why the
>>>>>> steps in an SLP induction should all be the same (although they
>>>>>> do need to be at the moment).  E.g. once we support SLP on:
>>>>>>
>>>>>>   for (unsigned int i = 0; i < n; i += 2)
>>>>>>     {
>>>>>>       x[i] += 4 + i;
>>>>>>       x[i + 1] += 11 + i * 3;
>>>>>>     }
>>>>>>
>>>>>> we'll need {[4, 14], +, [2, 6]}.
>>>>>>
>>>>>> So the idea is to represent vectors as P interleaved patterns of the 
>>>>>> form:
>>>>>>
>>>>>>   [BASE0, BASE1, BASE1 + STEP, BASE1 + STEP*2, ...]
>>>>>>
>>>>>> where the STEP is always zero (actually null) for non-integer vectors.
>>>>>> This is effectively projecting a "foreground" value of P elements
>>>>>> onto an arbitrary-length "background" sequenece, where the background
>>>>>> sequence contains P parallel linear series.
>>>>>>
>>>>>> E.g. to pick an extreme and unlikely example,
>>>>>>
>>>>>>   [42, 99, 2, 20, 3, 30, 4, 40, ...]
>>>>>>
>>>>>> has 2 patterns:
>>>>>>
>>>>>>   BASE0 = 42, BASE1 = 2, STEP = 1
>>>>>>   BASE0 = 99, BASE1 = 20, STEP = 10
>>>>>>
>>>>>> The more useful cases are degenerate versions of this general case.
>>>>>>
>>>>>> As far as memory consumption goes: the number of patterns needed for a
>>>>>> fixed-length vector with 2*N elements is always at most N; in the worst
>>>>>> case, we simply interleave the first N elements with the second N 
>>>>>> elements.
>>>>>> The worst-case increase in footprint is therefore N trees for the steps.
>>>>>> In practice the footprint is usually smaller than it was before, since
>>>>>> most constants do have a pattern.
>>>>>>
>>>>>> The patch below implements this for trees.  I have patches to use the
>>>>>> same style of encoding for CONST_VECTOR and vec_perm_indices, but the
>>>>>> tree one is probably easiest to read.
>>>>>>
>>>>>> The patch only adds the representation.  Follow-on patches make more
>>>>>> use of it (and usually make things simpler; e.g. integer_zerop is no
>>>>>> longer a looping operation).
>>>>>>
>>>>>> Does this look better?
>>>>>
>>>>> Yes, the overall design looks good.  I wonder why you chose to have
>>>>> the number of patterns being a power of two?  I suppose this is
>>>>> to have the same number of elements from all patterns in the final
>>>>> vector (which is power-of-two sized)?
>>>>
>>>> Right.  The rtl and vec_perm_indices parts don't have this restriction,
>>>> since some ports do define non-power-of-2 vectors for internal use.
>>>> The problem is that, since VECTOR_CSTs are used by the FE, we need
>>>> to support all valid vector lengths without blowing the 16-bit field.
>>>> Using the same style of representation as TYPE_VECTOR_SUBPARTS seemed
>>>> like the safest way of doing that.
>>>>
>>>>> I wonder if there exists a vector where say a three-pattern
>>>>> interleaving would be smaller than a four-pattern one?
>>>>
>>>> Only in the non-power-of-2 case.
>>>>
>>>>> Given you add flags for various purposes would it make sense to
>>>>> overload 'step' with a regular element to avoid the storage increase
>>>>> in case step is unnecessary?  This makes it have three elements
>>>>> which is of course awkward :/
>>>>
>>>> I wondered about keeping it as an array of trees and tacking the
>>>> steps onto the end as an optional addition.  But the idea is that
>>>> tree_vector_pattern becomes the preferred way of handling constant
>>>> vectors, if it can be used, so it seemed neater to use in the tree
>>>> node too.
>>>
>>> In the end it seemed better to encode the first NPATTERNS * N
>>> elements of the vector, where:
>>>
>>> - N==3 if at least one pattern needs a step; otherwise
>>> - N==2 if at least one pattern has different BASE0s and BASE1s; otherwise
>>> - N==1 (i.e. the vector is a repeated sequence of NPATTERNS elements)
>>>
>>> So effectively all we're doing for the constant-length case is
>>> reducing the number of elements that need to be stored and processed
>>> (once code is converted to use the new routines).
>>>
>>> The patch below does this.  It also introduces a new class
>>> vector_builder<T> for building vectors of Ts, with a derived
>>> tree_vector_builder specifically for trees.  This allows the
>>> compression code to be shared between representations and also
>>> avoids hard-coding the auto_vec<> count everywhere.
>>>
>>> I've also fixed the selftests to get the actual and expected values
>>> the right way round (thanks David for noticing that).
>>>
>>> What do you think?  Does this look better?
>>
>> Yes!  This is ok for trunk.
>
> Thanks :-)  I've since tidied up the documentation and vector-builder.h
> stuff and tested the patch properly with the rest of the SVE series
> (which seems to work well with it).  The encoding is the same as before.
>
> Sorry for the delay in getting the update out, but I wanted to be
> certain that no further changes were needed.
>
> Tested on aarch64-linux-gnu, x86_64-linux-gnu and powerpc64le-linux-gnu
> against current trunk.  Still OK?

Ok.

Thanks,
Richard.

> Richard
>
>
> 2017-12-06  Richard Sandiford  <richard.sandif...@arm.com>
>
> gcc/
>         * doc/generic.texi (VECTOR_CST): Describe new representation of
>         vector constants.
>         * vector-builder.h: New file.
>         * tree-vector-builder.h: Likewise.
>         * tree-vector-builder.c: Likewise.
>         * Makefile.in (OBJS): Add tree-vector-builder.o.
>         * tree.def (VECTOR_CST): Update comment to refer to generic.texi.
>         * tree-core.h (tree_base): Add a vector_cst field to the u union.
>         (tree_vector): Change the number of elements to
>         vector_cst_encoded_nelts.
>         * tree.h (VECTOR_CST_NELTS): Redefine using TYPE_VECTOR_SUBPARTS.
>         (VECTOR_CST_ELTS): Delete.
>         (VECTOR_CST_ELT): Redefine using vector_cst_elt.
>         (VECTOR_CST_LOG2_NPATTERNS, VECTOR_CST_NPATTERNS): New macros.
>         (VECTOR_CST_NELTS_PER_PATTERN, VECTOR_CST_DUPLICATE_P): Likewise.
>         (VECTOR_CST_STEPPED_P, VECTOR_CST_ENCODED_ELTS): Likewise.
>         (VECTOR_CST_ENCODED_ELT): Likewise.
>         (vector_cst_encoded_nelts): New function.
>         (make_vector): Take the values of VECTOR_CST_LOG2_NPATTERNS and
>         VECTOR_CST_NELTS_PER_PATTERN as arguments.
>         (vector_cst_int_elt, vector_cst_elt): Declare.
>         * tree.c: Include tree-vector-builder.h.
>         (tree_code_size): Abort if passed VECTOR_CST.
>         (tree_size): Update for new VECTOR_CST layout.
>         (make_vector): Take the values of VECTOR_CST_LOG2_NPATTERNS and
>         VECTOR_CST_NELTS_PER_PATTERN as arguments.
>         (build_vector): Use tree_vector_builder.
>         (vector_cst_int_elt, vector_cst_elt): New functions.
>         (drop_tree_overflow): For VECTOR_CST, drop the TREE_OVERFLOW from the
>         encoded elements and then create the vector in the canonical form.
>         (check_vector_cst, check_vector_cst_duplicate, check_vector_cst_fill)
>         (check_vector_cst_stepped, test_vector_cst_patterns): New functions.
>         (tree_c_tests): Call test_vector_cst_patterns.
>         * lto-streamer-out.c (DFS::DFS_write_tree_body): Handle the new
>         VECTOR_CST fields.
>         (hash_tree): Likewise.
>         * tree-streamer-out.c (write_ts_vector_tree_pointers): Likewise.
>         (streamer_write_tree_header): Likewise.
>         * tree-streamer-in.c (lto_input_ts_vector_tree_pointers): Likewise.
>         (streamer_alloc_tree): Likewise.  Update call to make_vector.
>         * fold-const.c (fold_ternary_loc): Avoid using VECTOR_CST_ELTS.
>
> gcc/lto/
>         * lto.c (compare_tree_sccs_1): Compare the new VECTOR_CST flags.
>
> Index: gcc/doc/generic.texi
> ===================================================================
> --- gcc/doc/generic.texi        2017-12-06 14:46:13.860611208 +0000
> +++ gcc/doc/generic.texi        2017-12-06 14:46:14.128600028 +0000
> @@ -1084,10 +1084,77 @@ These nodes are used to represent comple
>  imaginary parts respectively.
>
>  @item VECTOR_CST
> -These nodes are used to represent vector constants, whose parts are
> -constant nodes.  Each individual constant node is either an integer or a
> -double constant node.  The first operand is a @code{TREE_LIST} of the
> -constant nodes and is accessed through @code{TREE_VECTOR_CST_ELTS}.
> +These nodes are used to represent vector constants.  Each vector
> +constant @var{v} is treated as a specific instance of an arbitrary-length
> +sequence that itself contains @samp{VECTOR_CST_NPATTERNS (@var{v})}
> +interleaved patterns.  Each pattern has the form:
> +
> +@smallexample
> +@{ @var{base0}, @var{base1}, @var{base1} + @var{step}, @var{base1} + 
> @var{step} * 2, @dots{} @}
> +@end smallexample
> +
> +The first three elements in each pattern are enough to determine the
> +values of the other elements.  However, if all @var{step}s are zero,
> +only the first two elements are needed.  If in addition each @var{base1}
> +is equal to the corresponding @var{base0}, only the first element in
> +each pattern is needed.  The number of encoded elements per pattern
> +is given by @samp{VECTOR_CST_NELTS_PER_PATTERN (@var{v})}.
> +
> +For example, the constant:
> +
> +@smallexample
> +@{ 0, 1, 2, 6, 3, 8, 4, 10, 5, 12, 6, 14, 7, 16, 8, 18 @}
> +@end smallexample
> +
> +is interpreted as an interleaving of the sequences:
> +
> +@smallexample
> +@{ 0, 2, 3, 4, 5, 6, 7, 8 @}
> +@{ 1, 6, 8, 10, 12, 14, 16, 18 @}
> +@end smallexample
> +
> +where the sequences are represented by the following patterns:
> +
> +@smallexample
> +@var{base0} == 0, @var{base1} == 2, @var{step} == 1
> +@var{base0} == 1, @var{base1} == 6, @var{step} == 2
> +@end smallexample
> +
> +In this case:
> +
> +@smallexample
> +VECTOR_CST_NPATTERNS (@var{v}) == 2
> +VECTOR_CST_NELTS_PER_PATTERN (@var{v}) == 3
> +@end smallexample
> +
> +The vector is therefore encoded using the first 6 elements
> +(@samp{@{ 0, 1, 2, 6, 3, 8 @}}), with the remaining 10 elements
> +being implicit extensions of them.
> +
> +Sometimes this scheme can create two possible encodings of the same
> +vector.  For example @{ 0, 1 @} could be seen as two patterns with
> +one element each or one pattern with two elements (@var{base0} and
> +@var{base1}).  The canonical encoding is always the one with the
> +fewest patterns or (if both encodings have the same number of
> +petterns) the one with the fewest encoded elements.
> +
> +@samp{vector_cst_encoding_nelts (@var{v})} gives the total number of
> +encoded elements in @var{v}, which is 6 in the example above.
> +@code{VECTOR_CST_ENCODED_ELTS (@var{v})} gives a pointer to the elements
> +encoded in @var{v} and @code{VECTOR_CST_ENCODED_ELT (@var{v}, @var{i})}
> +accesses the value of encoded element @var{i}.
> +
> +@samp{VECTOR_CST_DUPLICATE_P (@var{v})} is true if @var{v} simply contains
> +repeated instances of @samp{VECTOR_CST_NPATTERNS (@var{v})} values.  This is
> +a shorthand for testing @samp{VECTOR_CST_NELTS_PER_PATTERN (@var{v}) == 1}.
> +
> +@samp{VECTOR_CST_STEPPED_P (@var{v})} is true if at least one
> +pattern in @var{v} has a nonzero step.  This is a shorthand for
> +testing @samp{VECTOR_CST_NELTS_PER_PATTERN (@var{v}) == 3}.
> +
> +The utility function @code{vector_cst_elt} gives the value of an
> +arbitrary index as a @code{tree}.  @code{vector_cst_int_elt} gives
> +the same value as a @code{wide_int}.
>
>  @item STRING_CST
>  These nodes represent string-constants.  The @code{TREE_STRING_LENGTH}
> Index: gcc/vector-builder.h
> ===================================================================
> --- /dev/null   2017-12-05 14:21:55.753572108 +0000
> +++ gcc/vector-builder.h        2017-12-06 14:46:14.133599820 +0000
> @@ -0,0 +1,394 @@
> +/* A class for building vector constant patterns.
> +   Copyright (C) 2017 Free Software Foundation, Inc.
> +
> +This file is part of GCC.
> +
> +GCC is free software; you can redistribute it and/or modify it under
> +the terms of the GNU General Public License as published by the Free
> +Software Foundation; either version 3, or (at your option) any later
> +version.
> +
> +GCC is distributed in the hope that it will be useful, but WITHOUT ANY
> +WARRANTY; without even the implied warranty of MERCHANTABILITY or
> +FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
> +for more details.
> +
> +You should have received a copy of the GNU General Public License
> +along with GCC; see the file COPYING3.  If not see
> +<http://www.gnu.org/licenses/>.  */
> +
> +#ifndef GCC_VECTOR_BUILDER_H
> +#define GCC_VECTOR_BUILDER_H
> +
> +/* This class is a wrapper around auto_vec<T> for building vectors of T.
> +   It aims to encode each vector as npatterns interleaved patterns,
> +   where each pattern represents a sequence:
> +
> +     { BASE0, BASE1, BASE1 + STEP, BASE1 + STEP*2, BASE1 + STEP*3, ... }
> +
> +   The first three elements in each pattern provide enough information
> +   to derive the other elements.  If all patterns have a STEP of zero,
> +   we only need to encode the first two elements in each pattern.
> +   If BASE1 is also equal to BASE0 for all patterns, we only need to
> +   encode the first element in each pattern.  The number of encoded
> +   elements per pattern is given by nelts_per_pattern.
> +
> +   The class can be used in two ways:
> +
> +   1. It can be used to build a full image of the vector, which is then
> +      canonicalized by finalize ().  In this case npatterns is initially
> +      the number of elements in the vector and nelts_per_pattern is
> +      initially 1.
> +
> +   2. It can be used to build a vector that already has a known encoding.
> +      This is preferred since it is more efficient and copes with
> +      variable-length vectors.  finalize () then canonicalizes the encoding
> +      to a simpler form if possible.
> +
> +   The derived class Derived provides this functionality for specific Ts.
> +   Derived needs to provide the following interface:
> +
> +      bool equal_p (T elt1, T elt2) const;
> +
> +         Return true if elements ELT1 and ELT2 are equal.
> +
> +      bool allow_steps_p () const;
> +
> +         Return true if a stepped representation is OK.  We don't allow
> +         linear series for anything other than integers, to avoid problems
> +         with rounding.
> +
> +      bool integral_p (T elt) const;
> +
> +         Return true if element ELT can be interpreted as an integer.
> +
> +      StepType step (T elt1, T elt2) const;
> +
> +         Return the value of element ELT2 minus the value of element ELT1,
> +         given integral_p (ELT1) && integral_p (ELT2).  There is no fixed
> +         choice of StepType.
> +
> +      bool can_elide_p (T elt) const;
> +
> +         Return true if we can drop element ELT, even if the retained
> +         elements are different.  This is provided for TREE_OVERFLOW
> +         handling.
> +
> +      void note_representative (T *elt1_ptr, T elt2);
> +
> +         Record that ELT2 is being elided, given that ELT1_PTR points to
> +         the last encoded element for the containing pattern.  This is
> +         again provided for TREE_OVERFLOW handling.  */
> +
> +template<typename T, typename Derived>
> +class vector_builder : public auto_vec<T, 32>
> +{
> +public:
> +  vector_builder ();
> +
> +  unsigned int full_nelts () const { return m_full_nelts; }
> +  unsigned int npatterns () const { return m_npatterns; }
> +  unsigned int nelts_per_pattern () const { return m_nelts_per_pattern; }
> +  unsigned int encoded_nelts () const;
> +  bool encoded_full_vector_p () const;
> +
> +  void finalize ();
> +
> +protected:
> +  void new_vector (unsigned int, unsigned int, unsigned int);
> +  void reshape (unsigned int, unsigned int);
> +  bool repeating_sequence_p (unsigned int, unsigned int, unsigned int);
> +  bool stepped_sequence_p (unsigned int, unsigned int, unsigned int);
> +  bool try_npatterns (unsigned int);
> +
> +private:
> +  vector_builder (const vector_builder &);
> +  vector_builder &operator= (const vector_builder &);
> +  Derived *derived () { return static_cast<Derived *> (this); }
> +  const Derived *derived () const;
> +
> +  unsigned int m_full_nelts;
> +  unsigned int m_npatterns;
> +  unsigned int m_nelts_per_pattern;
> +};
> +
> +template<typename T, typename Derived>
> +inline const Derived *
> +vector_builder<T, Derived>::derived () const
> +{
> +  return static_cast<const Derived *> (this);
> +}
> +
> +template<typename T, typename Derived>
> +inline
> +vector_builder<T, Derived>::vector_builder ()
> +  : m_full_nelts (0),
> +    m_npatterns (0),
> +    m_nelts_per_pattern (0)
> +{}
> +
> +/* Return the number of elements that are explicitly encoded.  The vec
> +   starts with these explicitly-encoded elements and may contain additional
> +   elided elements.  */
> +
> +template<typename T, typename Derived>
> +inline unsigned int
> +vector_builder<T, Derived>::encoded_nelts () const
> +{
> +  return m_npatterns * m_nelts_per_pattern;
> +}
> +
> +/* Return true if every element of the vector is explicitly encoded.  */
> +
> +template<typename T, typename Derived>
> +inline bool
> +vector_builder<T, Derived>::encoded_full_vector_p () const
> +{
> +  return m_npatterns * m_nelts_per_pattern == m_full_nelts;
> +}
> +
> +/* Start building a vector that has FULL_NELTS elements.  Initially
> +   encode it using NPATTERNS patterns with NELTS_PER_PATTERN each.  */
> +
> +template<typename T, typename Derived>
> +void
> +vector_builder<T, Derived>::new_vector (unsigned int full_nelts,
> +                                       unsigned int npatterns,
> +                                       unsigned int nelts_per_pattern)
> +{
> +  m_full_nelts = full_nelts;
> +  m_npatterns = npatterns;
> +  m_nelts_per_pattern = nelts_per_pattern;
> +  this->reserve (encoded_nelts ());
> +  this->truncate (0);
> +}
> +
> +/* Change the encoding to NPATTERNS patterns of NELTS_PER_PATTERN each,
> +   but without changing the underlying vector.  */
> +
> +template<typename T, typename Derived>
> +void
> +vector_builder<T, Derived>::reshape (unsigned int npatterns,
> +                                    unsigned int nelts_per_pattern)
> +{
> +  unsigned int old_encoded_nelts = encoded_nelts ();
> +  unsigned int new_encoded_nelts = npatterns * nelts_per_pattern;
> +  gcc_checking_assert (new_encoded_nelts <= old_encoded_nelts);
> +  unsigned int next = new_encoded_nelts - npatterns;
> +  for (unsigned int i = new_encoded_nelts; i < old_encoded_nelts; ++i)
> +    {
> +      derived ()->note_representative (&(*this)[next], (*this)[i]);
> +      next += 1;
> +      if (next == new_encoded_nelts)
> +       next -= npatterns;
> +    }
> +  m_npatterns = npatterns;
> +  m_nelts_per_pattern = nelts_per_pattern;
> +}
> +
> +/* Return true if elements [START, END) contain a repeating sequence of
> +   STEP elements.  */
> +
> +template<typename T, typename Derived>
> +bool
> +vector_builder<T, Derived>::repeating_sequence_p (unsigned int start,
> +                                                 unsigned int end,
> +                                                 unsigned int step)
> +{
> +  for (unsigned int i = start; i < end - step; ++i)
> +    if (!derived ()->equal_p ((*this)[i], (*this)[i + step]))
> +      return false;
> +  return true;
> +}
> +
> +/* Return true if elements [START, END) contain STEP interleaved linear
> +   series.  */
> +
> +template<typename T, typename Derived>
> +bool
> +vector_builder<T, Derived>::stepped_sequence_p (unsigned int start,
> +                                               unsigned int end,
> +                                               unsigned int step)
> +{
> +  if (!derived ()->allow_steps_p ())
> +    return false;
> +
> +  for (unsigned int i = start + step * 2; i < end; ++i)
> +    {
> +      T elt1 = (*this)[i - step * 2];
> +      T elt2 = (*this)[i - step];
> +      T elt3 = (*this)[i];
> +
> +      if (!derived ()->integral_p (elt1)
> +         || !derived ()->integral_p (elt2)
> +         || !derived ()->integral_p (elt3))
> +       return false;
> +
> +      if (derived ()->step (elt1, elt2) != derived ()->step (elt2, elt3))
> +       return false;
> +
> +      if (!derived ()->can_elide_p (elt3))
> +       return false;
> +    }
> +  return true;
> +}
> +
> +/* Try to change the number of encoded patterns to NPATTERNS, returning
> +   true on success.  */
> +
> +template<typename T, typename Derived>
> +bool
> +vector_builder<T, Derived>::try_npatterns (unsigned int npatterns)
> +{
> +  if (m_nelts_per_pattern == 1)
> +    {
> +      /* See whether NPATTERNS is valid with the current 
> 1-element-per-pattern
> +        encoding.  */
> +      if (repeating_sequence_p (0, encoded_nelts (), npatterns))
> +       {
> +         reshape (npatterns, 1);
> +         return true;
> +       }
> +
> +      /* We can only increase the number of elements per pattern if all
> +        elements are still encoded explicitly.  */
> +      if (!encoded_full_vector_p ())
> +       return false;
> +    }
> +
> +  if (m_nelts_per_pattern <= 2)
> +    {
> +      /* See whether NPATTERNS is valid with a 2-element-per-pattern
> +        encoding.  */
> +      if (repeating_sequence_p (npatterns, encoded_nelts (), npatterns))
> +       {
> +         reshape (npatterns, 2);
> +         return true;
> +       }
> +
> +      /* We can only increase the number of elements per pattern if all
> +        elements are still encoded explicitly.  */
> +      if (!encoded_full_vector_p ())
> +       return false;
> +    }
> +
> +  if (m_nelts_per_pattern <= 3)
> +    {
> +      /* See whether we have NPATTERNS interleaved linear series,
> +        giving a 3-element-per-pattern encoding.  */
> +      if (stepped_sequence_p (npatterns, encoded_nelts (), npatterns))
> +       {
> +         reshape (npatterns, 3);
> +         return true;
> +       }
> +      return false;
> +    }
> +
> +  gcc_unreachable ();
> +}
> +
> +/* Replace the current encoding with the canonical form.  */
> +
> +template<typename T, typename Derived>
> +void
> +vector_builder<T, Derived>::finalize ()
> +{
> +  /* The encoding requires the same number of elements to come from each
> +     pattern.  */
> +  gcc_assert (m_full_nelts % m_npatterns == 0);
> +
> +  /* Allow the caller to build more elements than necessary.  For example,
> +     it's often convenient to build a stepped vector from the natural
> +     encoding of three elements even if the vector itself only has two.  */
> +  if (m_full_nelts <= encoded_nelts ())
> +    {
> +      m_npatterns = m_full_nelts;
> +      m_nelts_per_pattern = 1;
> +    }
> +
> +  /* Try to whittle down the number of elements per pattern.  That is:
> +
> +     1. If we have stepped patterns whose steps are all 0, reduce the
> +        number of elements per pattern from 3 to 2.
> +
> +     2. If we have background fill values that are the same as the
> +        foreground values, reduce the number of elements per pattern
> +        from 2 to 1.  */
> +  while (m_nelts_per_pattern > 1
> +        && repeating_sequence_p (encoded_nelts () - m_npatterns * 2,
> +                                 encoded_nelts (), m_npatterns))
> +    /* The last two sequences of M_NPATTERNS elements are equal,
> +       so remove the last one.  */
> +    reshape (m_npatterns, m_nelts_per_pattern - 1);
> +
> +  if (pow2p_hwi (m_npatterns))
> +    {
> +      /* Try to halve the number of patterns while doing so gives a
> +        valid pattern.  This approach is linear in the number of
> +        elements, whereas searcing from 1 up would be O(n*log(n)).
> +
> +        Each halving step tries to keep the number of elements per pattern
> +        the same.  If that isn't possible, and if all elements are still
> +        explicitly encoded, the halving step can instead increase the number
> +        of elements per pattern.
> +
> +        E.g. for:
> +
> +            { 0, 2, 3, 4, 5, 6, 7, 8 }  npatterns == 8  full_nelts == 8
> +
> +        we first realize that the second half of the sequence is not
> +        equal to the first, so we cannot maintain 1 element per pattern
> +        for npatterns == 4.  Instead we halve the number of patterns
> +        and double the number of elements per pattern, treating this
> +        as a "foreground" { 0, 2, 3, 4 } against a "background" of
> +        { 5, 6, 7, 8 | 5, 6, 7, 8 ... }:
> +
> +            { 0, 2, 3, 4 | 5, 6, 7, 8 }  npatterns == 4
> +
> +        Next we realize that this is *not* a foreround of { 0, 2 }
> +        against a background of { 3, 4 | 3, 4 ... }, so the only
> +        remaining option for reducing the number of patterns is
> +        to use a foreground of { 0, 2 } against a stepped background
> +        of { 1, 2 | 3, 4 | 5, 6 ... }.  This is valid because we still
> +        haven't elided any elements:
> +
> +            { 0, 2 | 3, 4 | 5, 6 }  npatterns == 2
> +
> +        This in turn can be reduced to a foreground of { 0 } against a
> +        stepped background of { 1 | 2 | 3 ... }:
> +
> +            { 0 | 2 | 3 }  npatterns == 1
> +
> +        This last step would not have been possible for:
> +
> +            { 0, 0 | 3, 4 | 5, 6 }  npatterns == 2.  */
> +      while ((m_npatterns & 1) == 0 && try_npatterns (m_npatterns / 2))
> +       continue;
> +
> +      /* Builders of arbitrary fixed-length vectors can use:
> +
> +            new_vector (x, x, 1)
> +
> +        so that every element is specified explicitly.  Handle cases
> +        that are actually wrapping series, like { 0, 1, 2, 3, 0, 1, 2, 3 }
> +        would be for 2-bit elements.  We'll have treated them as
> +        duplicates in the loop above.  */
> +      if (m_nelts_per_pattern == 1
> +         && this->length () >= m_full_nelts
> +         && (m_npatterns & 3) == 0
> +         && stepped_sequence_p (m_npatterns / 4, m_full_nelts,
> +                                m_npatterns / 4))
> +       {
> +         reshape (m_npatterns / 4, 3);
> +         while ((m_npatterns & 1) == 0 && try_npatterns (m_npatterns / 2))
> +           continue;
> +       }
> +    }
> +  else
> +    /* For the non-power-of-2 case, do a simple search up from 1.  */
> +    for (unsigned int i = 1; i <= m_npatterns / 2; ++i)
> +      if (m_npatterns % i == 0 && try_npatterns (i))
> +       break;
> +}
> +
> +#endif
> Index: gcc/tree-vector-builder.h
> ===================================================================
> --- /dev/null   2017-12-05 14:21:55.753572108 +0000
> +++ gcc/tree-vector-builder.h   2017-12-06 14:46:14.131599903 +0000
> @@ -0,0 +1,135 @@
> +/* A class for building vector tree constants.
> +   Copyright (C) 2017 Free Software Foundation, Inc.
> +
> +This file is part of GCC.
> +
> +GCC is free software; you can redistribute it and/or modify it under
> +the terms of the GNU General Public License as published by the Free
> +Software Foundation; either version 3, or (at your option) any later
> +version.
> +
> +GCC is distributed in the hope that it will be useful, but WITHOUT ANY
> +WARRANTY; without even the implied warranty of MERCHANTABILITY or
> +FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
> +for more details.
> +
> +You should have received a copy of the GNU General Public License
> +along with GCC; see the file COPYING3.  If not see
> +<http://www.gnu.org/licenses/>.  */
> +
> +#ifndef GCC_TREE_VECTOR_BUILDER_H
> +#define GCC_TREE_VECTOR_BUILDER_H
> +
> +#include "vector-builder.h"
> +
> +/* This class is used to build VECTOR_CSTs from a sequence of elements.
> +   See vector_builder for more details.  */
> +class tree_vector_builder : public vector_builder<tree, tree_vector_builder>
> +{
> +  typedef vector_builder<tree, tree_vector_builder> parent;
> +  friend class vector_builder<tree, tree_vector_builder>;
> +
> +public:
> +  tree_vector_builder () : m_type (0) {}
> +  tree_vector_builder (tree, unsigned int, unsigned int);
> +  tree build ();
> +
> +  tree type () const { return m_type; }
> +
> +  void new_vector (tree, unsigned int, unsigned int);
> +  bool new_unary_operation (tree, tree, bool);
> +
> +private:
> +  bool equal_p (const_tree, const_tree) const;
> +  bool allow_steps_p () const;
> +  bool integral_p (const_tree) const;
> +  wide_int step (const_tree, const_tree) const;
> +  bool can_elide_p (const_tree) const;
> +  void note_representative (tree *, tree);
> +
> +  tree m_type;
> +};
> +
> +/* Create a new builder for a vector of type TYPE.  Initially encode the
> +   value as NPATTERNS interleaved patterns with NELTS_PER_PATTERN elements
> +   each.  */
> +
> +inline
> +tree_vector_builder::tree_vector_builder (tree type, unsigned int npatterns,
> +                                         unsigned int nelts_per_pattern)
> +{
> +  new_vector (type, npatterns, nelts_per_pattern);
> +}
> +
> +/* Start building a new vector of type TYPE.  Initially encode the value
> +   as NPATTERNS interleaved patterns with NELTS_PER_PATTERN elements each.  
> */
> +
> +inline void
> +tree_vector_builder::new_vector (tree type, unsigned int npatterns,
> +                                unsigned int nelts_per_pattern)
> +{
> +  m_type = type;
> +  parent::new_vector (TYPE_VECTOR_SUBPARTS (type), npatterns,
> +                     nelts_per_pattern);
> +}
> +
> +/* Return true if elements I1 and I2 are equal.  */
> +
> +inline bool
> +tree_vector_builder::equal_p (const_tree elt1, const_tree elt2) const
> +{
> +  return operand_equal_p (elt1, elt2, 0);
> +}
> +
> +/* Return true if a stepped representation is OK.  We don't allow
> +   linear series for anything other than integers, to avoid problems
> +   with rounding.  */
> +
> +inline bool
> +tree_vector_builder::allow_steps_p () const
> +{
> +  return INTEGRAL_TYPE_P (TREE_TYPE (m_type));
> +}
> +
> +/* Return true if ELT can be interpreted as an integer.  */
> +
> +inline bool
> +tree_vector_builder::integral_p (const_tree elt) const
> +{
> +  return TREE_CODE (elt) == INTEGER_CST;
> +}
> +
> +/* Return the value of element ELT2 minus the value of element ELT1.
> +   Both elements are known to be INTEGER_CSTs.  */
> +
> +inline wide_int
> +tree_vector_builder::step (const_tree elt1, const_tree elt2) const
> +{
> +  return wi::to_wide (elt2) - wi::to_wide (elt1);
> +}
> +
> +/* Return true if we can drop element ELT, even if the retained elements
> +   are different.  Return false if this would mean losing overflow
> +   information.  */
> +
> +inline bool
> +tree_vector_builder::can_elide_p (const_tree elt) const
> +{
> +  return !CONSTANT_CLASS_P (elt) || !TREE_OVERFLOW (elt);
> +}
> +
> +/* Record that ELT2 is being elided, given that ELT1_PTR points to the last
> +   encoded element for the containing pattern.  */
> +
> +inline void
> +tree_vector_builder::note_representative (tree *elt1_ptr, tree elt2)
> +{
> +  if (CONSTANT_CLASS_P (elt2) && TREE_OVERFLOW (elt2))
> +    {
> +      gcc_assert (operand_equal_p (*elt1_ptr, elt2, 0));
> +      if (!TREE_OVERFLOW (elt2))
> +       *elt1_ptr = elt2;
> +    }
> +}
> +
> +#endif
> Index: gcc/tree-vector-builder.c
> ===================================================================
> --- /dev/null   2017-12-05 14:21:55.753572108 +0000
> +++ gcc/tree-vector-builder.c   2017-12-06 14:46:14.131599903 +0000
> @@ -0,0 +1,64 @@
> +/* A class for building vector tree constants.
> +   Copyright (C) 2017 Free Software Foundation, Inc.
> +
> +This file is part of GCC.
> +
> +GCC is free software; you can redistribute it and/or modify it under
> +the terms of the GNU General Public License as published by the Free
> +Software Foundation; either version 3, or (at your option) any later
> +version.
> +
> +GCC is distributed in the hope that it will be useful, but WITHOUT ANY
> +WARRANTY; without even the implied warranty of MERCHANTABILITY or
> +FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
> +for more details.
> +
> +You should have received a copy of the GNU General Public License
> +along with GCC; see the file COPYING3.  If not see
> +<http://www.gnu.org/licenses/>.  */
> +
> +#include "config.h"
> +#include "system.h"
> +#include "coretypes.h"
> +#include "tree.h"
> +#include "fold-const.h"
> +#include "tree-vector-builder.h"
> +
> +/* Try to start building a new vector of type TYPE that holds the result of
> +   a unary operation on VECTOR_CST T.  ALLOW_STEPPED_P is true if the
> +   operation can handle stepped encodings directly, without having to
> +   expand the full sequence.
> +
> +   Return true if the operation is possible, which it always is when
> +   ALLOW_STEPPED_P is true.  Leave the builder unchanged otherwise.  */
> +
> +bool
> +tree_vector_builder::new_unary_operation (tree type, tree t,
> +                                         bool allow_stepped_p)
> +{
> +  unsigned int full_nelts = TYPE_VECTOR_SUBPARTS (type);
> +  gcc_assert (full_nelts == TYPE_VECTOR_SUBPARTS (TREE_TYPE (t)));
> +  unsigned int npatterns = VECTOR_CST_NPATTERNS (t);
> +  unsigned int nelts_per_pattern = VECTOR_CST_NELTS_PER_PATTERN (t);
> +  if (!allow_stepped_p && nelts_per_pattern > 2)
> +    {
> +      npatterns = full_nelts;
> +      nelts_per_pattern = 1;
> +    }
> +  new_vector (type, npatterns, nelts_per_pattern);
> +  return true;
> +}
> +
> +/* Return a VECTOR_CST for the current constant.  */
> +
> +tree
> +tree_vector_builder::build ()
> +{
> +  finalize ();
> +  gcc_assert (pow2p_hwi (npatterns ()));
> +  tree v = make_vector (exact_log2 (npatterns ()), nelts_per_pattern ());
> +  TREE_TYPE (v) = m_type;
> +  memcpy (VECTOR_CST_ENCODED_ELTS (v), address (),
> +         encoded_nelts () * sizeof (tree));
> +  return v;
> +}
> Index: gcc/Makefile.in
> ===================================================================
> --- gcc/Makefile.in     2017-12-06 14:46:13.860611208 +0000
> +++ gcc/Makefile.in     2017-12-06 14:46:14.128600028 +0000
> @@ -1572,6 +1572,7 @@ OBJS = \
>         tree-vect-loop-manip.o \
>         tree-vect-slp.o \
>         tree-vectorizer.o \
> +       tree-vector-builder.o \
>         tree-vrp.o \
>         tree.o \
>         typed-splay-tree.o \
> Index: gcc/tree.def
> ===================================================================
> --- gcc/tree.def        2017-12-06 14:46:13.860611208 +0000
> +++ gcc/tree.def        2017-12-06 14:46:14.132599861 +0000
> @@ -301,7 +301,7 @@ DEFTREECODE (FIXED_CST, "fixed_cst", tcc
>     whose contents are other constant nodes.  */
>  DEFTREECODE (COMPLEX_CST, "complex_cst", tcc_constant, 0)
>
> -/* Contents are in VECTOR_CST_ELTS field.  */
> +/* See generic.texi for details.  */
>  DEFTREECODE (VECTOR_CST, "vector_cst", tcc_constant, 0)
>
>  /* Contents are TREE_STRING_LENGTH and the actual contents of the string.  */
> Index: gcc/tree-core.h
> ===================================================================
> --- gcc/tree-core.h     2017-12-06 14:46:13.860611208 +0000
> +++ gcc/tree-core.h     2017-12-06 14:46:14.130599945 +0000
> @@ -972,8 +972,17 @@ struct GTY(()) tree_base {
>      /* VEC length.  This field is only used with TREE_VEC.  */
>      int length;
>
> -    /* Number of elements.  This field is only used with VECTOR_CST.  */
> -    unsigned int nelts;
> +    /* This field is only used with VECTOR_CST.  */
> +    struct {
> +      /* The value of VECTOR_CST_LOG2_NPATTERNS.  */
> +      unsigned int log2_npatterns : 8;
> +
> +      /* The value of VECTOR_CST_NELTS_PER_PATTERN.  */
> +      unsigned int nelts_per_pattern : 8;
> +
> +      /* For future expansion.  */
> +      unsigned int unused : 16;
> +    } vector_cst;
>
>      /* SSA version number.  This field is only used with SSA_NAME.  */
>      unsigned int version;
> @@ -1325,7 +1334,7 @@ struct GTY(()) tree_complex {
>
>  struct GTY(()) tree_vector {
>    struct tree_typed typed;
> -  tree GTY ((length ("VECTOR_CST_NELTS ((tree) &%h)"))) elts[1];
> +  tree GTY ((length ("vector_cst_encoded_nelts ((tree) &%h)"))) elts[1];
>  };
>
>  struct GTY(()) tree_identifier {
> Index: gcc/tree.h
> ===================================================================
> --- gcc/tree.h  2017-12-06 14:46:13.860611208 +0000
> +++ gcc/tree.h  2017-12-06 14:46:14.133599820 +0000
> @@ -1008,10 +1008,24 @@ #define TREE_STRING_POINTER(NODE) \
>  #define TREE_REALPART(NODE) (COMPLEX_CST_CHECK (NODE)->complex.real)
>  #define TREE_IMAGPART(NODE) (COMPLEX_CST_CHECK (NODE)->complex.imag)
>
> -/* In a VECTOR_CST node.  */
> -#define VECTOR_CST_NELTS(NODE) (VECTOR_CST_CHECK (NODE)->base.u.nelts)
> -#define VECTOR_CST_ELTS(NODE) (VECTOR_CST_CHECK (NODE)->vector.elts)
> -#define VECTOR_CST_ELT(NODE,IDX) (VECTOR_CST_CHECK (NODE)->vector.elts[IDX])
> +/* In a VECTOR_CST node.  See generic.texi for details.  */
> +#define VECTOR_CST_NELTS(NODE) (TYPE_VECTOR_SUBPARTS (TREE_TYPE (NODE)))
> +#define VECTOR_CST_ELT(NODE,IDX) vector_cst_elt (NODE, IDX)
> +
> +#define VECTOR_CST_LOG2_NPATTERNS(NODE) \
> +  (VECTOR_CST_CHECK (NODE)->base.u.vector_cst.log2_npatterns)
> +#define VECTOR_CST_NPATTERNS(NODE) \
> +  (1U << VECTOR_CST_LOG2_NPATTERNS (NODE))
> +#define VECTOR_CST_NELTS_PER_PATTERN(NODE) \
> +  (VECTOR_CST_CHECK (NODE)->base.u.vector_cst.nelts_per_pattern)
> +#define VECTOR_CST_DUPLICATE_P(NODE) \
> +  (VECTOR_CST_NELTS_PER_PATTERN (NODE) == 1)
> +#define VECTOR_CST_STEPPED_P(NODE) \
> +  (VECTOR_CST_NELTS_PER_PATTERN (NODE) == 3)
> +#define VECTOR_CST_ENCODED_ELTS(NODE) \
> +  (VECTOR_CST_CHECK (NODE)->vector.elts)
> +#define VECTOR_CST_ENCODED_ELT(NODE, ELT) \
> +  (VECTOR_CST_CHECK (NODE)->vector.elts[ELT])
>
>  /* Define fields and accessors for some special-purpose tree nodes.  */
>
> @@ -3882,6 +3896,14 @@ #define error_operand_p(NODE)                          
>           \
>    ((NODE) == error_mark_node                                   \
>     || ((NODE) && TREE_TYPE ((NODE)) == error_mark_node))
>
> +/* Return the number of elements encoded directly in a VECTOR_CST.  */
> +
> +inline unsigned int
> +vector_cst_encoded_nelts (const_tree t)
> +{
> +  return VECTOR_CST_NPATTERNS (t) * VECTOR_CST_NELTS_PER_PATTERN (t);
> +}
> +
>  extern tree decl_assembler_name (tree);
>  extern void overwrite_decl_assembler_name (tree decl, tree name);
>  extern tree decl_comdat_group (const_tree);
> @@ -4021,7 +4043,7 @@ extern tree force_fit_type (tree, const
>  extern tree build_int_cst (tree, HOST_WIDE_INT);
>  extern tree build_int_cstu (tree type, unsigned HOST_WIDE_INT cst);
>  extern tree build_int_cst_type (tree, HOST_WIDE_INT);
> -extern tree make_vector (unsigned CXX_MEM_STAT_INFO);
> +extern tree make_vector (unsigned, unsigned CXX_MEM_STAT_INFO);
>  extern tree build_vector (tree, vec<tree> CXX_MEM_STAT_INFO);
>  extern tree build_vector_from_ctor (tree, vec<constructor_elt, va_gc> *);
>  extern tree build_vector_from_val (tree, tree);
> @@ -4268,6 +4290,9 @@ extern tree first_field (const_tree);
>
>  extern bool initializer_zerop (const_tree);
>
> +extern wide_int vector_cst_int_elt (const_tree, unsigned int);
> +extern tree vector_cst_elt (const_tree, unsigned int);
> +
>  /* Given a vector VEC, return its first element if all elements are
>     the same.  Otherwise return NULL_TREE.  */
>
> Index: gcc/tree.c
> ===================================================================
> --- gcc/tree.c  2017-12-06 14:46:13.860611208 +0000
> +++ gcc/tree.c  2017-12-06 14:46:14.132599861 +0000
> @@ -66,6 +66,7 @@ Software Foundation; either version 3, o
>  #include "attribs.h"
>  #include "rtl.h"
>  #include "regs.h"
> +#include "tree-vector-builder.h"
>
>  /* Tree code classes.  */
>
> @@ -837,7 +838,7 @@ tree_code_size (enum tree_code code)
>         case REAL_CST:          return sizeof (tree_real_cst);
>         case FIXED_CST:         return sizeof (tree_fixed_cst);
>         case COMPLEX_CST:       return sizeof (tree_complex);
> -       case VECTOR_CST:        return sizeof (tree_vector);
> +       case VECTOR_CST:        gcc_unreachable ();
>         case STRING_CST:        gcc_unreachable ();
>         default:
>           gcc_checking_assert (code >= NUM_TREE_CODES);
> @@ -897,7 +898,7 @@ tree_size (const_tree node)
>
>      case VECTOR_CST:
>        return (sizeof (struct tree_vector)
> -             + (VECTOR_CST_NELTS (node) - 1) * sizeof (tree));
> +             + (vector_cst_encoded_nelts (node) - 1) * sizeof (tree));
>
>      case STRING_CST:
>        return TREE_STRING_LENGTH (node) + offsetof (struct tree_string, str) 
> + 1;
> @@ -1708,13 +1709,19 @@ cst_and_fits_in_hwi (const_tree x)
>           && (tree_fits_shwi_p (x) || tree_fits_uhwi_p (x)));
>  }
>
> -/* Build a newly constructed VECTOR_CST node of length LEN.  */
> +/* Build a newly constructed VECTOR_CST with the given values of
> +   (VECTOR_CST_)LOG2_NPATTERNS and (VECTOR_CST_)NELTS_PER_PATTERN.  */
>
>  tree
> -make_vector (unsigned len MEM_STAT_DECL)
> +make_vector (unsigned log2_npatterns,
> +            unsigned int nelts_per_pattern MEM_STAT_DECL)
>  {
> +  gcc_assert (IN_RANGE (nelts_per_pattern, 1, 3));
>    tree t;
> -  unsigned length = (len - 1) * sizeof (tree) + sizeof (struct tree_vector);
> +  unsigned npatterns = 1 << log2_npatterns;
> +  unsigned encoded_nelts = npatterns * nelts_per_pattern;
> +  unsigned length = (sizeof (struct tree_vector)
> +                    + (encoded_nelts - 1) * sizeof (tree));
>
>    record_node_allocation_statistics (VECTOR_CST, length);
>
> @@ -1722,7 +1729,8 @@ make_vector (unsigned len MEM_STAT_DECL)
>
>    TREE_SET_CODE (t, VECTOR_CST);
>    TREE_CONSTANT (t) = 1;
> -  VECTOR_CST_NELTS (t) = len;
> +  VECTOR_CST_LOG2_NPATTERNS (t) = log2_npatterns;
> +  VECTOR_CST_NELTS_PER_PATTERN (t) = nelts_per_pattern;
>
>    return t;
>  }
> @@ -1733,29 +1741,10 @@ make_vector (unsigned len MEM_STAT_DECL)
>  tree
>  build_vector (tree type, vec<tree> vals MEM_STAT_DECL)
>  {
> -  unsigned int nelts = vals.length ();
> -  gcc_assert (nelts == TYPE_VECTOR_SUBPARTS (type));
> -  int over = 0;
> -  unsigned cnt = 0;
> -  tree v = make_vector (nelts);
> -  TREE_TYPE (v) = type;
> -
> -  /* Iterate through elements and check for overflow.  */
> -  for (cnt = 0; cnt < nelts; ++cnt)
> -    {
> -      tree value = vals[cnt];
> -
> -      VECTOR_CST_ELT (v, cnt) = value;
> -
> -      /* Don't crash if we get an address constant.  */
> -      if (!CONSTANT_CLASS_P (value))
> -       continue;
> -
> -      over |= TREE_OVERFLOW (value);
> -    }
> -
> -  TREE_OVERFLOW (v) = over;
> -  return v;
> +  gcc_assert (vals.length () == TYPE_VECTOR_SUBPARTS (type));
> +  tree_vector_builder builder (type, vals.length (), 1);
> +  builder.splice (vals);
> +  return builder.build ();
>  }
>
>  /* Return a new VECTOR_CST node whose type is TYPE and whose values
> @@ -10370,6 +10359,59 @@ build_opaque_vector_type (tree innertype
>    return cand;
>  }
>
> +/* Return the value of element I of VECTOR_CST T as a wide_int.  */
> +
> +wide_int
> +vector_cst_int_elt (const_tree t, unsigned int i)
> +{
> +  /* First handle elements that are directly encoded.  */
> +  unsigned int encoded_nelts = vector_cst_encoded_nelts (t);
> +  if (i < encoded_nelts)
> +    return wi::to_wide (VECTOR_CST_ENCODED_ELT (t, i));
> +
> +  /* Identify the pattern that contains element I and work out the index of
> +     the last encoded element for that pattern.  */
> +  unsigned int npatterns = VECTOR_CST_NPATTERNS (t);
> +  unsigned int pattern = i % npatterns;
> +  unsigned int count = i / npatterns;
> +  unsigned int final_i = encoded_nelts - npatterns + pattern;
> +
> +  /* If there are no steps, the final encoded value is the right one.  */
> +  if (!VECTOR_CST_STEPPED_P (t))
> +    return wi::to_wide (VECTOR_CST_ENCODED_ELT (t, final_i));
> +
> +  /* Otherwise work out the value from the last two encoded elements.  */
> +  tree v1 = VECTOR_CST_ENCODED_ELT (t, final_i - npatterns);
> +  tree v2 = VECTOR_CST_ENCODED_ELT (t, final_i);
> +  wide_int diff = wi::to_wide (v2) - wi::to_wide (v1);
> +  return wi::to_wide (v2) + (count - 2) * diff;
> +}
> +
> +/* Return the value of element I of VECTOR_CST T.  */
> +
> +tree
> +vector_cst_elt (const_tree t, unsigned int i)
> +{
> +  /* First handle elements that are directly encoded.  */
> +  unsigned int encoded_nelts = vector_cst_encoded_nelts (t);
> +  if (i < encoded_nelts)
> +    return VECTOR_CST_ENCODED_ELT (t, i);
> +
> +  /* If there are no steps, the final encoded value is the right one.  */
> +  if (!VECTOR_CST_STEPPED_P (t))
> +    {
> +      /* Identify the pattern that contains element I and work out the index 
> of
> +        the last encoded element for that pattern.  */
> +      unsigned int npatterns = VECTOR_CST_NPATTERNS (t);
> +      unsigned int pattern = i % npatterns;
> +      unsigned int final_i = encoded_nelts - npatterns + pattern;
> +      return VECTOR_CST_ENCODED_ELT (t, final_i);
> +    }
> +
> +  /* Otherwise work out the value from the last two encoded elements.  */
> +  return wide_int_to_tree (TREE_TYPE (TREE_TYPE (t)),
> +                          vector_cst_int_elt (t, i));
> +}
>
>  /* Given an initializer INIT, return TRUE if INIT is zero or some
>     aggregate of zeros.  Otherwise return FALSE.  */
> @@ -12451,6 +12493,23 @@ drop_tree_overflow (tree t)
>    if (TREE_CODE (t) == INTEGER_CST)
>      return wide_int_to_tree (TREE_TYPE (t), wi::to_wide (t));
>
> +  /* For VECTOR_CST, remove the overflow bits from the encoded elements
> +     and canonicalize the result.  */
> +  if (TREE_CODE (t) == VECTOR_CST)
> +    {
> +      tree_vector_builder builder;
> +      builder.new_unary_operation (TREE_TYPE (t), t, true);
> +      unsigned int count = builder.encoded_nelts ();
> +      for (unsigned int i = 0; i < count; ++i)
> +       {
> +         tree elt = VECTOR_CST_ELT (t, i);
> +         if (TREE_OVERFLOW (elt))
> +           elt = drop_tree_overflow (elt);
> +         builder.quick_push (elt);
> +       }
> +      return builder.build ();
> +    }
> +
>    /* Otherwise, as all tcc_constants are possibly shared, copy the node
>       and drop the flag.  */
>    t = copy_node (t);
> @@ -12465,15 +12524,7 @@ drop_tree_overflow (tree t)
>        if (TREE_OVERFLOW (TREE_IMAGPART (t)))
>         TREE_IMAGPART (t) = drop_tree_overflow (TREE_IMAGPART (t));
>      }
> -  if (TREE_CODE (t) == VECTOR_CST)
> -    {
> -      for (unsigned i = 0; i < VECTOR_CST_NELTS (t); ++i)
> -       {
> -         tree& elt = VECTOR_CST_ELT (t, i);
> -         if (TREE_OVERFLOW (elt))
> -           elt = drop_tree_overflow (elt);
> -       }
> -    }
> +
>    return t;
>  }
>
> @@ -14016,6 +14067,139 @@ test_labels ()
>    ASSERT_FALSE (FORCED_LABEL (label_decl));
>  }
>
> +/* Check that VECTOR_CST ACTUAL contains the elements in EXPECTED.  */
> +
> +static void
> +check_vector_cst (vec<tree> expected, tree actual)
> +{
> +  ASSERT_EQ (expected.length (), TYPE_VECTOR_SUBPARTS (TREE_TYPE (actual)));
> +  for (unsigned int i = 0; i < expected.length (); ++i)
> +    ASSERT_EQ (wi::to_wide (expected[i]),
> +              wi::to_wide (vector_cst_elt (actual, i)));
> +}
> +
> +/* Check that VECTOR_CST ACTUAL contains NPATTERNS duplicated elements,
> +   and that its elements match EXPECTED.  */
> +
> +static void
> +check_vector_cst_duplicate (vec<tree> expected, tree actual,
> +                           unsigned int npatterns)
> +{
> +  ASSERT_EQ (npatterns, VECTOR_CST_NPATTERNS (actual));
> +  ASSERT_EQ (1, VECTOR_CST_NELTS_PER_PATTERN (actual));
> +  ASSERT_EQ (npatterns, vector_cst_encoded_nelts (actual));
> +  ASSERT_TRUE (VECTOR_CST_DUPLICATE_P (actual));
> +  ASSERT_FALSE (VECTOR_CST_STEPPED_P (actual));
> +  check_vector_cst (expected, actual);
> +}
> +
> +/* Check that VECTOR_CST ACTUAL contains NPATTERNS foreground elements
> +   and NPATTERNS background elements, and that its elements match
> +   EXPECTED.  */
> +
> +static void
> +check_vector_cst_fill (vec<tree> expected, tree actual,
> +                      unsigned int npatterns)
> +{
> +  ASSERT_EQ (npatterns, VECTOR_CST_NPATTERNS (actual));
> +  ASSERT_EQ (2, VECTOR_CST_NELTS_PER_PATTERN (actual));
> +  ASSERT_EQ (2 * npatterns, vector_cst_encoded_nelts (actual));
> +  ASSERT_FALSE (VECTOR_CST_DUPLICATE_P (actual));
> +  ASSERT_FALSE (VECTOR_CST_STEPPED_P (actual));
> +  check_vector_cst (expected, actual);
> +}
> +
> +/* Check that VECTOR_CST ACTUAL contains NPATTERNS stepped patterns,
> +   and that its elements match EXPECTED.  */
> +
> +static void
> +check_vector_cst_stepped (vec<tree> expected, tree actual,
> +                         unsigned int npatterns)
> +{
> +  ASSERT_EQ (npatterns, VECTOR_CST_NPATTERNS (actual));
> +  ASSERT_EQ (3, VECTOR_CST_NELTS_PER_PATTERN (actual));
> +  ASSERT_EQ (3 * npatterns, vector_cst_encoded_nelts (actual));
> +  ASSERT_FALSE (VECTOR_CST_DUPLICATE_P (actual));
> +  ASSERT_TRUE (VECTOR_CST_STEPPED_P (actual));
> +  check_vector_cst (expected, actual);
> +}
> +
> +/* Test the creation of VECTOR_CSTs.  */
> +
> +static void
> +test_vector_cst_patterns ()
> +{
> +  auto_vec<tree, 8> elements (8);
> +  elements.quick_grow (8);
> +  tree element_type = build_nonstandard_integer_type (16, true);
> +  tree vector_type = build_vector_type (element_type, 8);
> +
> +  /* Test a simple linear series with a base of 0 and a step of 1:
> +     { 0, 1, 2, 3, 4, 5, 6, 7 }.  */
> +  for (unsigned int i = 0; i < 8; ++i)
> +    elements[i] = build_int_cst (element_type, i);
> +  check_vector_cst_stepped (elements, build_vector (vector_type, elements), 
> 1);
> +
> +  /* Try the same with the first element replaced by 100:
> +     { 100, 1, 2, 3, 4, 5, 6, 7 }.  */
> +  elements[0] = build_int_cst (element_type, 100);
> +  check_vector_cst_stepped (elements, build_vector (vector_type, elements), 
> 1);
> +
> +  /* Try a series that wraps around.
> +     { 100, 65531, 65532, 65533, 65534, 65535, 0, 1 }.  */
> +  for (unsigned int i = 1; i < 8; ++i)
> +    elements[i] = build_int_cst (element_type, (65530 + i) & 0xffff);
> +  check_vector_cst_stepped (elements, build_vector (vector_type, elements), 
> 1);
> +
> +  /* Try a downward series:
> +     { 100, 79, 78, 77, 76, 75, 75, 73 }.  */
> +  for (unsigned int i = 1; i < 8; ++i)
> +    elements[i] = build_int_cst (element_type, 80 - i);
> +  check_vector_cst_stepped (elements, build_vector (vector_type, elements), 
> 1);
> +
> +  /* Try two interleaved series with different bases and steps:
> +     { 100, 53, 66, 206, 62, 212, 58, 218 }.  */
> +  elements[1] = build_int_cst (element_type, 53);
> +  for (unsigned int i = 2; i < 8; i += 2)
> +    {
> +      elements[i] = build_int_cst (element_type, 70 - i * 2);
> +      elements[i + 1] = build_int_cst (element_type, 200 + i * 3);
> +    }
> +  check_vector_cst_stepped (elements, build_vector (vector_type, elements), 
> 2);
> +
> +  /* Try a duplicated value:
> +     { 100, 100, 100, 100, 100, 100, 100, 100 }.  */
> +  for (unsigned int i = 1; i < 8; ++i)
> +    elements[i] = elements[0];
> +  check_vector_cst_duplicate (elements,
> +                             build_vector (vector_type, elements), 1);
> +
> +  /* Try an interleaved duplicated value:
> +     { 100, 55, 100, 55, 100, 55, 100, 55 }.  */
> +  elements[1] = build_int_cst (element_type, 55);
> +  for (unsigned int i = 2; i < 8; ++i)
> +    elements[i] = elements[i - 2];
> +  check_vector_cst_duplicate (elements,
> +                             build_vector (vector_type, elements), 2);
> +
> +  /* Try a duplicated value with 2 exceptions
> +     { 41, 97, 100, 55, 100, 55, 100, 55 }.  */
> +  elements[0] = build_int_cst (element_type, 41);
> +  elements[1] = build_int_cst (element_type, 97);
> +  check_vector_cst_fill (elements, build_vector (vector_type, elements), 2);
> +
> +  /* Try with and without a step
> +     { 41, 97, 100, 21, 100, 35, 100, 49 }.  */
> +  for (unsigned int i = 3; i < 8; i += 2)
> +    elements[i] = build_int_cst (element_type, i * 7);
> +  check_vector_cst_stepped (elements, build_vector (vector_type, elements), 
> 2);
> +
> +  /* Try a fully-general constant:
> +     { 41, 97, 100, 21, 100, 9990, 100, 49 }.  */
> +  elements[5] = build_int_cst (element_type, 9990);
> +  check_vector_cst_fill (elements, build_vector (vector_type, elements), 4);
> +}
> +
>  /* Run all of the selftests within this file.  */
>
>  void
> @@ -14024,6 +14208,7 @@ tree_c_tests ()
>    test_integer_constants ();
>    test_identifiers ();
>    test_labels ();
> +  test_vector_cst_patterns ();
>  }
>
>  } // namespace selftest
> Index: gcc/lto-streamer-out.c
> ===================================================================
> --- gcc/lto-streamer-out.c      2017-12-06 14:46:13.860611208 +0000
> +++ gcc/lto-streamer-out.c      2017-12-06 14:46:14.130599945 +0000
> @@ -747,8 +747,9 @@ #define DFS_follow_tree_edge(DEST) \
>
>    if (CODE_CONTAINS_STRUCT (code, TS_VECTOR))
>      {
> -      for (unsigned i = 0; i < VECTOR_CST_NELTS (expr); ++i)
> -       DFS_follow_tree_edge (VECTOR_CST_ELT (expr, i));
> +      unsigned int count = vector_cst_encoded_nelts (expr);
> +      for (unsigned int i = 0; i < count; ++i)
> +       DFS_follow_tree_edge (VECTOR_CST_ENCODED_ELT (expr, i));
>      }
>
>    if (CODE_CONTAINS_STRUCT (code, TS_COMPLEX))
> @@ -1195,8 +1196,11 @@ #define visit(SIBLING) \
>      }
>
>    if (CODE_CONTAINS_STRUCT (code, TS_VECTOR))
> -    for (unsigned i = 0; i < VECTOR_CST_NELTS (t); ++i)
> -      visit (VECTOR_CST_ELT (t, i));
> +    {
> +      unsigned int count = vector_cst_encoded_nelts (t);
> +      for (unsigned int i = 0; i < count; ++i)
> +       visit (VECTOR_CST_ENCODED_ELT (t, i));
> +    }
>
>    if (CODE_CONTAINS_STRUCT (code, TS_COMPLEX))
>      {
> Index: gcc/tree-streamer-out.c
> ===================================================================
> --- gcc/tree-streamer-out.c     2017-12-06 14:46:13.860611208 +0000
> +++ gcc/tree-streamer-out.c     2017-12-06 14:46:14.131599903 +0000
> @@ -533,11 +533,11 @@ write_ts_common_tree_pointers (struct ou
>  static void
>  write_ts_vector_tree_pointers (struct output_block *ob, tree expr, bool 
> ref_p)
>  {
> -  unsigned i;
>    /* Note that the number of elements for EXPR has already been emitted
>       in EXPR's header (see streamer_write_tree_header).  */
> -  for (i = 0; i < VECTOR_CST_NELTS (expr); ++i)
> -    stream_write_tree (ob, VECTOR_CST_ELT (expr, i), ref_p);
> +  unsigned int count = vector_cst_encoded_nelts (expr);
> +  for (unsigned int i = 0; i < count; ++i)
> +    stream_write_tree (ob, VECTOR_CST_ENCODED_ELT (expr, i), ref_p);
>  }
>
>
> @@ -960,7 +960,12 @@ streamer_write_tree_header (struct outpu
>    else if (CODE_CONTAINS_STRUCT (code, TS_IDENTIFIER))
>      write_identifier (ob, ob->main_stream, expr);
>    else if (CODE_CONTAINS_STRUCT (code, TS_VECTOR))
> -    streamer_write_hwi (ob, VECTOR_CST_NELTS (expr));
> +    {
> +      bitpack_d bp = bitpack_create (ob->main_stream);
> +      bp_pack_value (&bp, VECTOR_CST_LOG2_NPATTERNS (expr), 8);
> +      bp_pack_value (&bp, VECTOR_CST_NELTS_PER_PATTERN (expr), 8);
> +      streamer_write_bitpack (&bp);
> +    }
>    else if (CODE_CONTAINS_STRUCT (code, TS_VEC))
>      streamer_write_hwi (ob, TREE_VEC_LENGTH (expr));
>    else if (CODE_CONTAINS_STRUCT (code, TS_BINFO))
> Index: gcc/tree-streamer-in.c
> ===================================================================
> --- gcc/tree-streamer-in.c      2017-12-06 14:46:13.860611208 +0000
> +++ gcc/tree-streamer-in.c      2017-12-06 14:46:14.130599945 +0000
> @@ -592,8 +592,10 @@ streamer_alloc_tree (struct lto_input_bl
>      }
>    else if (CODE_CONTAINS_STRUCT (code, TS_VECTOR))
>      {
> -      HOST_WIDE_INT len = streamer_read_hwi (ib);
> -      result = make_vector (len);
> +      bitpack_d bp = streamer_read_bitpack (ib);
> +      unsigned int log2_npatterns = bp_unpack_value (&bp, 8);
> +      unsigned int nelts_per_pattern = bp_unpack_value (&bp, 8);
> +      result = make_vector (log2_npatterns, nelts_per_pattern);
>      }
>    else if (CODE_CONTAINS_STRUCT (code, TS_BINFO))
>      {
> @@ -650,9 +652,9 @@ lto_input_ts_common_tree_pointers (struc
>  lto_input_ts_vector_tree_pointers (struct lto_input_block *ib,
>                                    struct data_in *data_in, tree expr)
>  {
> -  unsigned i;
> -  for (i = 0; i < VECTOR_CST_NELTS (expr); ++i)
> -    VECTOR_CST_ELT (expr, i) = stream_read_tree (ib, data_in);
> +  unsigned int count = vector_cst_encoded_nelts (expr);
> +  for (unsigned int i = 0; i < count; ++i)
> +    VECTOR_CST_ENCODED_ELT (expr, i) = stream_read_tree (ib, data_in);
>  }
>
>
> Index: gcc/fold-const.c
> ===================================================================
> --- gcc/fold-const.c    2017-12-06 14:46:13.860611208 +0000
> +++ gcc/fold-const.c    2017-12-06 14:46:14.129599986 +0000
> @@ -11610,9 +11610,8 @@ fold_ternary_loc (location_t loc, enum t
>                   unsigned int nelts = VECTOR_CST_NELTS (arg0);
>                   auto_vec<tree, 32> elts (nelts);
>                   elts.quick_grow (nelts);
> -                 memcpy (&elts[0], VECTOR_CST_ELTS (arg0),
> -                         sizeof (tree) * nelts);
> -                 elts[k] = arg1;
> +                 for (unsigned int i = 0; i < VECTOR_CST_NELTS (arg0); ++i)
> +                   elts[i] = (i == k ? arg1 : VECTOR_CST_ELT (arg0, i));
>                   return build_vector (type, elts);
>                 }
>             }
> Index: gcc/lto/lto.c
> ===================================================================
> --- gcc/lto/lto.c       2017-12-06 14:46:13.860611208 +0000
> +++ gcc/lto/lto.c       2017-12-06 14:46:14.130599945 +0000
> @@ -1065,6 +1065,12 @@ #define compare_values(X) \
>                         TREE_FIXED_CST_PTR (t1), TREE_FIXED_CST_PTR (t2)))
>        return false;
>
> +  if (CODE_CONTAINS_STRUCT (code, TS_VECTOR))
> +    {
> +      compare_values (VECTOR_CST_LOG2_NPATTERNS);
> +      compare_values (VECTOR_CST_NELTS_PER_PATTERN);
> +    }
> +
>    if (CODE_CONTAINS_STRUCT (code, TS_DECL_COMMON))
>      {
>        compare_values (DECL_MODE);
> @@ -1281,11 +1287,12 @@ #define compare_tree_edges(E1, E2) \
>
>    if (CODE_CONTAINS_STRUCT (code, TS_VECTOR))
>      {
> -      unsigned i;
>        /* Note that the number of elements for EXPR has already been emitted
>          in EXPR's header (see streamer_write_tree_header).  */
> -      for (i = 0; i < VECTOR_CST_NELTS (t1); ++i)
> -       compare_tree_edges (VECTOR_CST_ELT (t1, i), VECTOR_CST_ELT (t2, i));
> +      unsigned int count = vector_cst_encoded_nelts (t1);
> +      for (unsigned int i = 0; i < count; ++i)
> +       compare_tree_edges (VECTOR_CST_ENCODED_ELT (t1, i),
> +                           VECTOR_CST_ENCODED_ELT (t2, i));
>      }
>
>    if (CODE_CONTAINS_STRUCT (code, TS_COMPLEX))

Reply via email to