The vectoriser aligned vectors to TYPE_ALIGN unconditionally, and there
was also a hard-coded assumption that this alignment was equal to the
type size.  This was inconvenient for SVE for two reasons:

- When compiling for a specific power-of-2 SVE vector length, we might
  want to align to a full vector.  However, the TYPE_ALIGN is governed
  by the ABI alignment, which is 128 bits regardless of size.

- For vector-length-agnostic code it doesn't usually make sense to align,
  since the runtime vector length might not be a power of two.  Even for
  power-of-two sizes, there's no guarantee that aligning to the previous
  16-byte boundary will be an improvement.

This patch therefore adds a target hook to control the preferred
vectoriser (as opposed to ABI) alignment.

Tested on aarch64-linux-gnu, x86_64-linux-gnu and powerpc64le-linux-gnu.
Also tested by comparing the testsuite assembly output on at least one
target per CPU directory.  OK to install?

Richard


2017-09-18  Richard Sandiford  <richard.sandif...@linaro.org>
            Alan Hayward  <alan.hayw...@arm.com>
            David Sherwood  <david.sherw...@arm.com>

gcc/
        * target.def (preferred_vector_alignment): New hook.
        * doc/tm.texi.in (TARGET_VECTORIZE_PREFERRED_VECTOR_ALIGNMENT): New
        hook.
        * doc/tm.texi: Regenerate.
        * targhooks.h (default_preferred_vector_alignment): Declare.
        * targhooks.c (default_preferred_vector_alignment): New function.
        * tree-vectorizer.h (dataref_aux): Add a target_alignment field.
        Expand commentary.
        (DR_TARGET_ALIGNMENT): New macro.
        (aligned_access_p): Update commentary.
        (vect_known_alignment_in_bytes): New function.
        * tree-vect-data-refs.c (vect_calculate_target_alignment): New
        function.
        (vect_compute_data_ref_alignment): Set DR_TARGET_ALIGNMENT.
        Calculate the misalignment based on the target alignment rather than
        the vector size.
        (vect_update_misalignment_for_peel): Use DR_TARGET_ALIGNMENT
        rather than TYPE_ALIGN / BITS_PER_UNIT to update the misalignment.
        (vect_enhance_data_refs_alignment): Mask the byte misalignment with
        the target alignment, rather than masking the element misalignment
        with the number of elements in a vector.  Also use the target
        alignment when calculating the maximum number of peels.
        (vect_find_same_alignment_drs): Use vect_calculate_target_alignment
        instead of TYPE_ALIGN_UNIT.
        (vect_duplicate_ssa_name_ptr_info): Remove stmt_info parameter.
        Measure DR_MISALIGNMENT relative to DR_TARGET_ALIGNMENT.
        (vect_create_addr_base_for_vector_ref): Update call accordingly.
        (vect_create_data_ref_ptr): Likewise.
        (vect_setup_realignment): Realign by ANDing with
        -DR_TARGET_ALIGNMENT.
        * tree-vect-loop-manip.c (vect_gen_prolog_loop_niters): Calculate
        the number of peels based on DR_TARGET_ALIGNMENT.
        * tree-vect-stmts.c (get_group_load_store_type): Compare the gap
        with the guaranteed alignment boundary when deciding whether
        overrun is OK.
        (vectorizable_mask_load_store): Interpret DR_MISALIGNMENT
        relative to DR_TARGET_ALIGNMENT instead of TYPE_ALIGN_UNIT.
        (ensure_base_align): Remove stmt_info parameter.  Get the
        target base alignment from DR_TARGET_ALIGNMENT.
        (vectorizable_store): Update call accordingly.  Interpret
        DR_MISALIGNMENT relative to DR_TARGET_ALIGNMENT instead of
        TYPE_ALIGN_UNIT.
        (vectorizable_load): Likewise.

gcc/testsuite/
        * gcc.dg/vect/vect-outer-3a.c: Adjust dump scan for new wording
        of alignment message.
        * gcc.dg/vect/vect-outer-3a-big-array.c: Likewise.

Index: gcc/target.def
===================================================================
*** gcc/target.def      2017-09-18 12:56:24.635070853 +0100
--- gcc/target.def      2017-09-18 12:56:24.847378559 +0100
*************** misalignment value (@var{misalign}).",
*** 1820,1825 ****
--- 1820,1839 ----
   int, (enum vect_cost_for_stmt type_of_cost, tree vectype, int misalign),
   default_builtin_vectorization_cost)
  
+ DEFHOOK
+ (preferred_vector_alignment,
+  "This hook returns the preferred alignment in bits for accesses to\n\
+ vectors of type @var{type} in vectorized code.  This might be less than\n\
+ or greater than the ABI-defined value returned by\n\
+ @code{TARGET_VECTOR_ALIGNMENT}.  It can be equal to the alignment of\n\
+ a single element, in which case the vectorizer will not try to optimize\n\
+ for alignment.\n\
+ \n\
+ The default hook returns @code{TYPE_ALIGN (@var{type})}, which is\n\
+ correct for most targets.",
+  HOST_WIDE_INT, (const_tree type),
+  default_preferred_vector_alignment)
+ 
  /* Return true if vector alignment is reachable (by peeling N
     iterations) for the given scalar type.  */
  DEFHOOK
Index: gcc/doc/tm.texi.in
===================================================================
*** gcc/doc/tm.texi.in  2017-09-18 12:56:24.635070853 +0100
--- gcc/doc/tm.texi.in  2017-09-18 12:56:24.846475122 +0100
*************** address;  but often a machine-dependent
*** 4086,4091 ****
--- 4086,4093 ----
  
  @hook TARGET_VECTORIZE_BUILTIN_VECTORIZATION_COST
  
+ @hook TARGET_VECTORIZE_PREFERRED_VECTOR_ALIGNMENT
+ 
  @hook TARGET_VECTORIZE_VECTOR_ALIGNMENT_REACHABLE
  
  @hook TARGET_VECTORIZE_VEC_PERM_CONST_OK
Index: gcc/doc/tm.texi
===================================================================
*** gcc/doc/tm.texi     2017-09-18 12:56:24.635070853 +0100
--- gcc/doc/tm.texi     2017-09-18 12:56:24.846475122 +0100
*************** For vector memory operations the cost ma
*** 5754,5759 ****
--- 5754,5771 ----
  misalignment value (@var{misalign}).
  @end deftypefn
  
+ @deftypefn {Target Hook} HOST_WIDE_INT TARGET_VECTORIZE_PREFERRED_VECTOR_ALIGNMENT (const_tree @var{type})
+ This hook returns the preferred alignment in bits for accesses to
+ vectors of type @var{type} in vectorized code.  This might be less than
+ or greater than the ABI-defined value returned by
+ @code{TARGET_VECTOR_ALIGNMENT}.  It can be equal to the alignment of
+ a single element, in which case the vectorizer will not try to optimize
+ for alignment.
+ 
+ The default hook returns @code{TYPE_ALIGN (@var{type})}, which is
+ correct for most targets.
+ @end deftypefn
+ 
  @deftypefn {Target Hook} bool TARGET_VECTORIZE_VECTOR_ALIGNMENT_REACHABLE (const_tree @var{type}, bool @var{is_packed})
  Return true if vector alignment is reachable (by peeling N iterations) for the given scalar type @var{type}.  @var{is_packed} is false if the scalar access using @var{type} is known to be naturally aligned.
  @end deftypefn
Index: gcc/targhooks.h
===================================================================
*** gcc/targhooks.h     2017-09-18 12:56:24.635070853 +0100
--- gcc/targhooks.h     2017-09-18 12:56:24.847378559 +0100
*************** extern tree default_builtin_reciprocal (
*** 95,100 ****
--- 95,101 ----
  
  extern HOST_WIDE_INT default_vector_alignment (const_tree);
  
+ extern HOST_WIDE_INT default_preferred_vector_alignment (const_tree);
  extern bool default_builtin_vector_alignment_reachable (const_tree, bool);
  extern bool
  default_builtin_support_vector_misalignment (machine_mode mode,
Index: gcc/targhooks.c
===================================================================
*** gcc/targhooks.c     2017-09-18 12:56:24.635070853 +0100
--- gcc/targhooks.c     2017-09-18 12:56:24.847378559 +0100
*************** default_vector_alignment (const_tree typ
*** 1175,1180 ****
--- 1175,1189 ----
    return align;
  }
  
+ /* The default implementation of
+    TARGET_VECTORIZE_PREFERRED_VECTOR_ALIGNMENT.  */
+ 
+ HOST_WIDE_INT
+ default_preferred_vector_alignment (const_tree type)
+ {
+   return TYPE_ALIGN (type);
+ }
+ 
  /* By default assume vectors of element TYPE require a multiple of the natural
     alignment of TYPE.  TYPE is naturally aligned if IS_PACKED is false.  */
  bool
Index: gcc/tree-vectorizer.h
===================================================================
*** gcc/tree-vectorizer.h       2017-09-18 12:56:24.635070853 +0100
--- gcc/tree-vectorizer.h       2017-09-18 12:56:24.850088870 +0100
*************** #define PURE_SLP_STMT(S)
*** 790,796 ****
--- 790,800 ----
  #define STMT_SLP_TYPE(S)                   (S)->slp_type
  
  struct dataref_aux {
+   /* The misalignment in bytes of the reference, or -1 if not known.  */
    int misalignment;
+   /* The byte alignment that we'd ideally like the reference to have,
+      and the value that misalignment is measured against.  */
+   int target_alignment;
    /* If true the alignment of base_decl needs to be increased.  */
    bool base_misaligned;
    tree base_decl;
*************** #define DR_MISALIGNMENT(DR) dr_misalignm
*** 1037,1043 ****
  #define SET_DR_MISALIGNMENT(DR, VAL) set_dr_misalignment (DR, VAL)
  #define DR_MISALIGNMENT_UNKNOWN (-1)
  
! /* Return TRUE if the data access is aligned, and FALSE otherwise.  */
  
  static inline bool
  aligned_access_p (struct data_reference *data_ref_info)
--- 1041,1051 ----
  #define SET_DR_MISALIGNMENT(DR, VAL) set_dr_misalignment (DR, VAL)
  #define DR_MISALIGNMENT_UNKNOWN (-1)
  
! /* Only defined once DR_MISALIGNMENT is defined.  */
! #define DR_TARGET_ALIGNMENT(DR) DR_VECT_AUX (DR)->target_alignment
! 
! /* Return true if data access DR is aligned to its target alignment
!    (which may be less than a full vector).  */
  
  static inline bool
  aligned_access_p (struct data_reference *data_ref_info)
*************** known_alignment_for_access_p (struct dat
*** 1054,1059 ****
--- 1062,1080 ----
    return (DR_MISALIGNMENT (data_ref_info) != DR_MISALIGNMENT_UNKNOWN);
  }
  
+ /* Return the minimum alignment in bytes that the vectorized version
+    of DR is guaranteed to have.  */
+ 
+ static inline unsigned int
+ vect_known_alignment_in_bytes (struct data_reference *dr)
+ {
+   if (DR_MISALIGNMENT (dr) == DR_MISALIGNMENT_UNKNOWN)
+     return TYPE_ALIGN_UNIT (TREE_TYPE (DR_REF (dr)));
+   if (DR_MISALIGNMENT (dr) == 0)
+     return DR_TARGET_ALIGNMENT (dr);
+   return DR_MISALIGNMENT (dr) & -DR_MISALIGNMENT (dr);
+ }
+ 
  /* Return the behavior of DR with respect to the vectorization context
     (which for outer loop vectorization might not be the behavior recorded
     in DR itself).  */
Index: gcc/tree-vect-data-refs.c
===================================================================
*** gcc/tree-vect-data-refs.c   2017-09-18 12:56:24.635070853 +0100
--- gcc/tree-vect-data-refs.c   2017-09-18 12:56:24.849185433 +0100
*************** vect_record_base_alignments (vec_info *v
*** 775,780 ****
--- 775,791 ----
        }
  }
  
+ /* Return the target alignment for the vectorized form of DR.  */
+ 
+ static unsigned int
+ vect_calculate_target_alignment (struct data_reference *dr)
+ {
+   gimple *stmt = DR_STMT (dr);
+   stmt_vec_info stmt_info = vinfo_for_stmt (stmt);
+   tree vectype = STMT_VINFO_VECTYPE (stmt_info);
+   return targetm.vectorize.preferred_vector_alignment (vectype);
+ }
+ 
  /* Function vect_compute_data_ref_alignment
  
     Compute the misalignment of the data reference DR.
*************** vect_compute_data_ref_alignment (struct
*** 811,816 ****
--- 822,831 ----
    innermost_loop_behavior *drb = vect_dr_behavior (dr);
    bool step_preserves_misalignment_p;
  
+   unsigned HOST_WIDE_INT vector_alignment
+     = vect_calculate_target_alignment (dr) / BITS_PER_UNIT;
+   DR_TARGET_ALIGNMENT (dr) = vector_alignment;
+ 
    /* No step for BB vectorization.  */
    if (!loop)
      {
*************** vect_compute_data_ref_alignment (struct
*** 823,865 ****
       relative to the outer-loop (LOOP).  This is ok only if the misalignment
       stays the same throughout the execution of the inner-loop, which is why
       we have to check that the stride of the dataref in the inner-loop evenly
!      divides by the vector size.  */
    else if (nested_in_vect_loop_p (loop, stmt))
      {
        step_preserves_misalignment_p
!       = (DR_STEP_ALIGNMENT (dr)
!          % GET_MODE_SIZE (TYPE_MODE (vectype))) == 0;
  
        if (dump_enabled_p ())
        {
          if (step_preserves_misalignment_p)
            dump_printf_loc (MSG_NOTE, vect_location,
!                            "inner step divides the vector-size.\n");
          else
            dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
!                            "inner step doesn't divide the vector-size.\n");
        }
      }
  
    /* Similarly we can only use base and misalignment information relative to
       an innermost loop if the misalignment stays the same throughout the
       execution of the loop.  As above, this is the case if the stride of
!      the dataref evenly divides by the vector size.  */
    else
      {
        unsigned vf = LOOP_VINFO_VECT_FACTOR (loop_vinfo);
        step_preserves_misalignment_p
!       = ((DR_STEP_ALIGNMENT (dr) * vf)
!          % GET_MODE_SIZE (TYPE_MODE (vectype))) == 0;
  
        if (!step_preserves_misalignment_p && dump_enabled_p ())
        dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
!                        "step doesn't divide the vector-size.\n");
      }
  
    unsigned int base_alignment = drb->base_alignment;
    unsigned int base_misalignment = drb->base_misalignment;
-   unsigned HOST_WIDE_INT vector_alignment = TYPE_ALIGN_UNIT (vectype);
  
    /* Calculate the maximum of the pooled base address alignment and the
       alignment that we can compute for DR itself.  */
--- 838,878 ----
       relative to the outer-loop (LOOP).  This is ok only if the misalignment
       stays the same throughout the execution of the inner-loop, which is why
       we have to check that the stride of the dataref in the inner-loop evenly
!      divides by the vector alignment.  */
    else if (nested_in_vect_loop_p (loop, stmt))
      {
        step_preserves_misalignment_p
!       = (DR_STEP_ALIGNMENT (dr) % vector_alignment) == 0;
  
        if (dump_enabled_p ())
        {
          if (step_preserves_misalignment_p)
            dump_printf_loc (MSG_NOTE, vect_location,
!                            "inner step divides the vector alignment.\n");
          else
            dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
!                            "inner step doesn't divide the vector"
!                            " alignment.\n");
        }
      }
  
    /* Similarly we can only use base and misalignment information relative to
       an innermost loop if the misalignment stays the same throughout the
       execution of the loop.  As above, this is the case if the stride of
!      the dataref evenly divides by the alignment.  */
    else
      {
        unsigned vf = LOOP_VINFO_VECT_FACTOR (loop_vinfo);
        step_preserves_misalignment_p
!       = ((DR_STEP_ALIGNMENT (dr) * vf) % vector_alignment) == 0;
  
        if (!step_preserves_misalignment_p && dump_enabled_p ())
        dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
!                        "step doesn't divide the vector alignment.\n");
      }
  
    unsigned int base_alignment = drb->base_alignment;
    unsigned int base_misalignment = drb->base_misalignment;
  
    /* Calculate the maximum of the pooled base address alignment and the
       alignment that we can compute for DR itself.  */
*************** vect_update_misalignment_for_peel (struc
*** 1007,1015 ****
      {
        bool negative = tree_int_cst_compare (DR_STEP (dr), size_zero_node) < 0;
        int misal = DR_MISALIGNMENT (dr);
-       tree vectype = STMT_VINFO_VECTYPE (stmt_info);
        misal += negative ? -npeel * dr_size : npeel * dr_size;
!       misal &= (TYPE_ALIGN (vectype) / BITS_PER_UNIT) - 1;
        SET_DR_MISALIGNMENT (dr, misal);
        return;
      }
--- 1020,1027 ----
      {
        bool negative = tree_int_cst_compare (DR_STEP (dr), size_zero_node) < 0;
        int misal = DR_MISALIGNMENT (dr);
        misal += negative ? -npeel * dr_size : npeel * dr_size;
!       misal &= DR_TARGET_ALIGNMENT (dr) - 1;
        SET_DR_MISALIGNMENT (dr, misal);
        return;
      }
*************** vect_enhance_data_refs_alignment (loop_v
*** 1657,1672 ****
          {
            if (known_alignment_for_access_p (dr))
              {
!               unsigned int npeel_tmp = 0;
              bool negative = tree_int_cst_compare (DR_STEP (dr),
                                                    size_zero_node) < 0;
  
!               vectype = STMT_VINFO_VECTYPE (stmt_info);
!               nelements = TYPE_VECTOR_SUBPARTS (vectype);
!             mis = DR_MISALIGNMENT (dr) / vect_get_scalar_dr_size (dr);
              if (DR_MISALIGNMENT (dr) != 0)
!               npeel_tmp = (negative ? (mis - nelements)
!                            : (nelements - mis)) & (nelements - 1);
  
                /* For multiple types, it is possible that the bigger type access
                   will have more than one peeling option.  E.g., a loop with two
--- 1669,1685 ----
          {
            if (known_alignment_for_access_p (dr))
              {
!             unsigned int npeel_tmp = 0;
              bool negative = tree_int_cst_compare (DR_STEP (dr),
                                                    size_zero_node) < 0;
  
!             vectype = STMT_VINFO_VECTYPE (stmt_info);
!             nelements = TYPE_VECTOR_SUBPARTS (vectype);
!             unsigned int target_align = DR_TARGET_ALIGNMENT (dr);
!             unsigned int dr_size = vect_get_scalar_dr_size (dr);
!             mis = (negative ? DR_MISALIGNMENT (dr) : -DR_MISALIGNMENT (dr));
              if (DR_MISALIGNMENT (dr) != 0)
!               npeel_tmp = (mis & (target_align - 1)) / dr_size;
  
                /* For multiple types, it is possible that the bigger type access
                   will have more than one peeling option.  E.g., a loop with two
*************** vect_enhance_data_refs_alignment (loop_v
*** 1701,1707 ****
                  {
                    vect_peeling_hash_insert (&peeling_htab, loop_vinfo,
                                            dr, npeel_tmp);
!                   npeel_tmp += nelements;
                  }
  
              one_misalignment_known = true;
--- 1714,1720 ----
                  {
                    vect_peeling_hash_insert (&peeling_htab, loop_vinfo,
                                            dr, npeel_tmp);
!                 npeel_tmp += target_align / dr_size;
                  }
  
              one_misalignment_known = true;
*************** vect_enhance_data_refs_alignment (loop_v
*** 1922,1928 ****
        stmt = DR_STMT (dr0);
        stmt_info = vinfo_for_stmt (stmt);
        vectype = STMT_VINFO_VECTYPE (stmt_info);
-       nelements = TYPE_VECTOR_SUBPARTS (vectype);
  
        if (known_alignment_for_access_p (dr0))
          {
--- 1935,1940 ----
*************** vect_enhance_data_refs_alignment (loop_v
*** 1935,1943 ****
                   updating DR_MISALIGNMENT values.  The peeling factor is the
                   vectorization factor minus the misalignment as an element
                   count.  */
!             mis = DR_MISALIGNMENT (dr0) / vect_get_scalar_dr_size (dr0);
!               npeel = ((negative ? mis - nelements : nelements - mis)
!                      & (nelements - 1));
              }
  
          /* For interleaved data access every iteration accesses all the
--- 1947,1956 ----
                   updating DR_MISALIGNMENT values.  The peeling factor is the
                   vectorization factor minus the misalignment as an element
                   count.  */
!             mis = negative ? DR_MISALIGNMENT (dr0) : -DR_MISALIGNMENT (dr0);
!             unsigned int target_align = DR_TARGET_ALIGNMENT (dr0);
!             npeel = ((mis & (target_align - 1))
!                      / vect_get_scalar_dr_size (dr0));
              }
  
          /* For interleaved data access every iteration accesses all the
*************** vect_enhance_data_refs_alignment (loop_v
*** 1976,1985 ****
                unsigned max_peel = npeel;
                if (max_peel == 0)
                  {
!                 gimple *dr_stmt = DR_STMT (dr0);
!                   stmt_vec_info vinfo = vinfo_for_stmt (dr_stmt);
!                   tree vtype = STMT_VINFO_VECTYPE (vinfo);
!                   max_peel = TYPE_VECTOR_SUBPARTS (vtype) - 1;
                  }
                if (max_peel > max_allowed_peel)
                  {
--- 1989,1996 ----
                unsigned max_peel = npeel;
                if (max_peel == 0)
                  {
!                 unsigned int target_align = DR_TARGET_ALIGNMENT (dr0);
!                 max_peel = target_align / vect_get_scalar_dr_size (dr0) - 1;
                  }
                if (max_peel > max_allowed_peel)
                  {
*************** vect_find_same_alignment_drs (struct dat
*** 2201,2208 ****
    if (diff != 0)
      {
        /* Get the wider of the two alignments.  */
!       unsigned int align_a = TYPE_ALIGN_UNIT (STMT_VINFO_VECTYPE (stmtinfo_a));
!       unsigned int align_b = TYPE_ALIGN_UNIT (STMT_VINFO_VECTYPE (stmtinfo_b));
        unsigned int max_align = MAX (align_a, align_b);
  
        /* Require the gap to be a multiple of the larger vector alignment.  */
--- 2212,2221 ----
    if (diff != 0)
      {
        /* Get the wider of the two alignments.  */
!       unsigned int align_a = (vect_calculate_target_alignment (dra)
!                             / BITS_PER_UNIT);
!       unsigned int align_b = (vect_calculate_target_alignment (drb)
!                             / BITS_PER_UNIT);
        unsigned int max_align = MAX (align_a, align_b);
  
        /* Require the gap to be a multiple of the larger vector alignment.  */
*************** vect_get_new_ssa_name (tree type, enum v
*** 3995,4010 ****
  /* Duplicate ptr info and set alignment/misaligment on NAME from DR.  */
  
  static void
! vect_duplicate_ssa_name_ptr_info (tree name, data_reference *dr,
!                                 stmt_vec_info stmt_info)
  {
    duplicate_ssa_name_ptr_info (name, DR_PTR_INFO (dr));
-   unsigned int align = TYPE_ALIGN_UNIT (STMT_VINFO_VECTYPE (stmt_info));
    int misalign = DR_MISALIGNMENT (dr);
    if (misalign == DR_MISALIGNMENT_UNKNOWN)
      mark_ptr_info_alignment_unknown (SSA_NAME_PTR_INFO (name));
    else
!     set_ptr_info_alignment (SSA_NAME_PTR_INFO (name), align, misalign);
  }
  
  /* Function vect_create_addr_base_for_vector_ref.
--- 4008,4022 ----
  /* Duplicate ptr info and set alignment/misaligment on NAME from DR.  */
  
  static void
! vect_duplicate_ssa_name_ptr_info (tree name, data_reference *dr)
  {
    duplicate_ssa_name_ptr_info (name, DR_PTR_INFO (dr));
    int misalign = DR_MISALIGNMENT (dr);
    if (misalign == DR_MISALIGNMENT_UNKNOWN)
      mark_ptr_info_alignment_unknown (SSA_NAME_PTR_INFO (name));
    else
!     set_ptr_info_alignment (SSA_NAME_PTR_INFO (name),
!                           DR_TARGET_ALIGNMENT (dr), misalign);
  }
  
  /* Function vect_create_addr_base_for_vector_ref.
*************** vect_create_addr_base_for_vector_ref (gi
*** 4109,4115 ****
        && TREE_CODE (addr_base) == SSA_NAME
        && !SSA_NAME_PTR_INFO (addr_base))
      {
!       vect_duplicate_ssa_name_ptr_info (addr_base, dr, stmt_info);
        if (offset || byte_offset)
        mark_ptr_info_alignment_unknown (SSA_NAME_PTR_INFO (addr_base));
      }
--- 4121,4127 ----
        && TREE_CODE (addr_base) == SSA_NAME
        && !SSA_NAME_PTR_INFO (addr_base))
      {
!       vect_duplicate_ssa_name_ptr_info (addr_base, dr);
        if (offset || byte_offset)
        mark_ptr_info_alignment_unknown (SSA_NAME_PTR_INFO (addr_base));
      }
*************** vect_create_data_ref_ptr (gimple *stmt,
*** 4368,4375 ****
        /* Copy the points-to information if it exists. */
        if (DR_PTR_INFO (dr))
        {
!         vect_duplicate_ssa_name_ptr_info (indx_before_incr, dr, stmt_info);
!         vect_duplicate_ssa_name_ptr_info (indx_after_incr, dr, stmt_info);
        }
        if (ptr_incr)
        *ptr_incr = incr;
--- 4380,4387 ----
        /* Copy the points-to information if it exists. */
        if (DR_PTR_INFO (dr))
        {
!         vect_duplicate_ssa_name_ptr_info (indx_before_incr, dr);
!         vect_duplicate_ssa_name_ptr_info (indx_after_incr, dr);
        }
        if (ptr_incr)
        *ptr_incr = incr;
*************** vect_create_data_ref_ptr (gimple *stmt,
*** 4398,4405 ****
        /* Copy the points-to information if it exists. */
        if (DR_PTR_INFO (dr))
        {
!         vect_duplicate_ssa_name_ptr_info (indx_before_incr, dr, stmt_info);
!         vect_duplicate_ssa_name_ptr_info (indx_after_incr, dr, stmt_info);
        }
        if (ptr_incr)
        *ptr_incr = incr;
--- 4410,4417 ----
        /* Copy the points-to information if it exists. */
        if (DR_PTR_INFO (dr))
        {
!         vect_duplicate_ssa_name_ptr_info (indx_before_incr, dr);
!         vect_duplicate_ssa_name_ptr_info (indx_after_incr, dr);
        }
        if (ptr_incr)
        *ptr_incr = incr;
*************** vect_setup_realignment (gimple *stmt, gi
*** 5003,5012 ****
        new_temp = copy_ssa_name (ptr);
        else
        new_temp = make_ssa_name (TREE_TYPE (ptr));
        new_stmt = gimple_build_assign
                   (new_temp, BIT_AND_EXPR, ptr,
!                   build_int_cst (TREE_TYPE (ptr),
!                                  -(HOST_WIDE_INT)TYPE_ALIGN_UNIT (vectype)));
        new_bb = gsi_insert_on_edge_immediate (pe, new_stmt);
        gcc_assert (!new_bb);
        data_ref
--- 5015,5024 ----
        new_temp = copy_ssa_name (ptr);
        else
        new_temp = make_ssa_name (TREE_TYPE (ptr));
+       unsigned int align = DR_TARGET_ALIGNMENT (dr);
        new_stmt = gimple_build_assign
                   (new_temp, BIT_AND_EXPR, ptr,
!                   build_int_cst (TREE_TYPE (ptr), -(HOST_WIDE_INT) align));
        new_bb = gsi_insert_on_edge_immediate (pe, new_stmt);
        gcc_assert (!new_bb);
        data_ref
Index: gcc/tree-vect-loop-manip.c
===================================================================
*** gcc/tree-vect-loop-manip.c  2017-09-18 12:56:24.635070853 +0100
--- gcc/tree-vect-loop-manip.c  2017-09-18 12:56:24.849185433 +0100
*************** vect_gen_prolog_loop_niters (loop_vec_in
*** 956,963 ****
    gimple *dr_stmt = DR_STMT (dr);
    stmt_vec_info stmt_info = vinfo_for_stmt (dr_stmt);
    tree vectype = STMT_VINFO_VECTYPE (stmt_info);
!   int vectype_align = TYPE_ALIGN (vectype) / BITS_PER_UNIT;
!   int nelements = TYPE_VECTOR_SUBPARTS (vectype);
  
    if (LOOP_VINFO_PEELING_FOR_ALIGNMENT (loop_vinfo) > 0)
      {
--- 956,962 ----
    gimple *dr_stmt = DR_STMT (dr);
    stmt_vec_info stmt_info = vinfo_for_stmt (dr_stmt);
    tree vectype = STMT_VINFO_VECTYPE (stmt_info);
!   unsigned int target_align = DR_TARGET_ALIGNMENT (dr);
  
    if (LOOP_VINFO_PEELING_FOR_ALIGNMENT (loop_vinfo) > 0)
      {
*************** vect_gen_prolog_loop_niters (loop_vec_in
*** 978,1009 ****
        tree start_addr = vect_create_addr_base_for_vector_ref (dr_stmt,
                                                              &stmts, offset);
        tree type = unsigned_type_for (TREE_TYPE (start_addr));
!       tree vectype_align_minus_1 = build_int_cst (type, vectype_align - 1);
!       HOST_WIDE_INT elem_size =
!                 int_cst_value (TYPE_SIZE_UNIT (TREE_TYPE (vectype)));
        tree elem_size_log = build_int_cst (type, exact_log2 (elem_size));
!       tree nelements_minus_1 = build_int_cst (type, nelements - 1);
!       tree nelements_tree = build_int_cst (type, nelements);
!       tree byte_misalign;
!       tree elem_misalign;
! 
!       /* Create:  byte_misalign = addr & (vectype_align - 1)  */
!       byte_misalign =
!       fold_build2 (BIT_AND_EXPR, type, fold_convert (type, start_addr),
!                    vectype_align_minus_1);
! 
!       /* Create:  elem_misalign = byte_misalign / element_size  */
!       elem_misalign =
!       fold_build2 (RSHIFT_EXPR, type, byte_misalign, elem_size_log);
  
!       /* Create:  (niters_type) (nelements - elem_misalign)&(nelements - 1)  */
        if (negative)
!       iters = fold_build2 (MINUS_EXPR, type, elem_misalign, nelements_tree);
        else
!       iters = fold_build2 (MINUS_EXPR, type, nelements_tree, elem_misalign);
!       iters = fold_build2 (BIT_AND_EXPR, type, iters, nelements_minus_1);
        iters = fold_convert (niters_type, iters);
!       *bound = nelements - 1;
      }
  
    if (dump_enabled_p ())
--- 977,1012 ----
        tree start_addr = vect_create_addr_base_for_vector_ref (dr_stmt,
                                                              &stmts, offset);
        tree type = unsigned_type_for (TREE_TYPE (start_addr));
!       tree target_align_minus_1 = build_int_cst (type, target_align - 1);
!       HOST_WIDE_INT elem_size
!       = int_cst_value (TYPE_SIZE_UNIT (TREE_TYPE (vectype)));
        tree elem_size_log = build_int_cst (type, exact_log2 (elem_size));
!       HOST_WIDE_INT align_in_elems = target_align / elem_size;
!       tree align_in_elems_minus_1 = build_int_cst (type, align_in_elems - 1);
!       tree align_in_elems_tree = build_int_cst (type, align_in_elems);
!       tree misalign_in_bytes;
!       tree misalign_in_elems;
! 
!       /* Create:  misalign_in_bytes = addr & (target_align - 1).  */
!       misalign_in_bytes
!       = fold_build2 (BIT_AND_EXPR, type, fold_convert (type, start_addr),
!                      target_align_minus_1);
! 
!       /* Create:  misalign_in_elems = misalign_in_bytes / element_size.  */
!       misalign_in_elems
!       = fold_build2 (RSHIFT_EXPR, type, misalign_in_bytes, elem_size_log);
  
!       /* Create:  (niters_type) ((align_in_elems - misalign_in_elems)
!                                & (align_in_elems - 1)).  */
        if (negative)
!       iters = fold_build2 (MINUS_EXPR, type, misalign_in_elems,
!                            align_in_elems_tree);
        else
!       iters = fold_build2 (MINUS_EXPR, type, align_in_elems_tree,
!                            misalign_in_elems);
!       iters = fold_build2 (BIT_AND_EXPR, type, iters, align_in_elems_minus_1);
        iters = fold_convert (niters_type, iters);
!       *bound = align_in_elems - 1;
      }
  
    if (dump_enabled_p ())
Index: gcc/tree-vect-stmts.c
===================================================================
*** gcc/tree-vect-stmts.c       2017-09-18 12:56:24.635070853 +0100
--- gcc/tree-vect-stmts.c       2017-09-18 12:56:24.850088870 +0100
*************** get_group_load_store_type (gimple *stmt,
*** 1737,1742 ****
--- 1737,1743 ----
    loop_vec_info loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info);
    struct loop *loop = loop_vinfo ? LOOP_VINFO_LOOP (loop_vinfo) : NULL;
    gimple *first_stmt = GROUP_FIRST_ELEMENT (stmt_info);
+   data_reference *first_dr = STMT_VINFO_DATA_REF (vinfo_for_stmt (first_stmt));
    unsigned int group_size = GROUP_SIZE (vinfo_for_stmt (first_stmt));
    bool single_element_p = (stmt == first_stmt
                           && !GROUP_NEXT_ELEMENT (stmt_info));
*************** get_group_load_store_type (gimple *stmt,
*** 1780,1789 ****
                               " non-consecutive accesses\n");
              return false;
            }
!         /* If the access is aligned an overrun is fine.  */
          if (overrun_p
!             && aligned_access_p
!                  (STMT_VINFO_DATA_REF (vinfo_for_stmt (first_stmt))))
            overrun_p = false;
          if (overrun_p && !can_overrun_p)
            {
--- 1781,1793 ----
                               " non-consecutive accesses\n");
              return false;
            }
!         /* An overrun is fine if the trailing elements are smaller
!            than the alignment boundary B.  Every vector access will
!            be a multiple of B and so we are guaranteed to access a
!            non-gap element in the same B-sized block.  */
          if (overrun_p
!             && gap < (vect_known_alignment_in_bytes (first_dr)
!                       / vect_get_scalar_dr_size (first_dr)))
            overrun_p = false;
          if (overrun_p && !can_overrun_p)
            {
*************** get_group_load_store_type (gimple *stmt,
*** 1804,1817 ****
        /* If there is a gap at the end of the group then these optimizations
         would access excess elements in the last iteration.  */
        bool would_overrun_p = (gap != 0);
!       /* If the access is aligned an overrun is fine, but only if the
!          overrun is not inside an unused vector (if the gap is as large
!        or larger than a vector).  */
        if (would_overrun_p
!         && gap < nunits
!         && aligned_access_p
!               (STMT_VINFO_DATA_REF (vinfo_for_stmt (first_stmt))))
        would_overrun_p = false;
        if (!STMT_VINFO_STRIDED_P (stmt_info)
          && (can_overrun_p || !would_overrun_p)
          && compare_step_with_zero (stmt) > 0)
--- 1808,1822 ----
        /* If there is a gap at the end of the group then these optimizations
         would access excess elements in the last iteration.  */
        bool would_overrun_p = (gap != 0);
!       /* An overrun is fine if the trailing elements are smaller than the
!        alignment boundary B.  Every vector access will be a multiple of B
!        and so we are guaranteed to access a non-gap element in the
!        same B-sized block.  */
        if (would_overrun_p
!         && gap < (vect_known_alignment_in_bytes (first_dr)
!                   / vect_get_scalar_dr_size (first_dr)))
        would_overrun_p = false;
+ 
        if (!STMT_VINFO_STRIDED_P (stmt_info)
          && (can_overrun_p || !would_overrun_p)
          && compare_step_with_zero (stmt) > 0)
*************** vectorizable_mask_load_store (gimple *st
*** 2351,2357 ****
                                             TYPE_SIZE_UNIT (vectype));
            }
  
!         align = TYPE_ALIGN_UNIT (vectype);
          if (aligned_access_p (dr))
            misalign = 0;
          else if (DR_MISALIGNMENT (dr) == -1)
--- 2356,2362 ----
                                             TYPE_SIZE_UNIT (vectype));
            }
  
!         align = DR_TARGET_ALIGNMENT (dr);
          if (aligned_access_p (dr))
            misalign = 0;
          else if (DR_MISALIGNMENT (dr) == -1)
*************** vectorizable_mask_load_store (gimple *st
*** 2404,2410 ****
                                             TYPE_SIZE_UNIT (vectype));
            }
  
!         align = TYPE_ALIGN_UNIT (vectype);
          if (aligned_access_p (dr))
            misalign = 0;
          else if (DR_MISALIGNMENT (dr) == -1)
--- 2409,2415 ----
                                             TYPE_SIZE_UNIT (vectype));
            }
  
!         align = DR_TARGET_ALIGNMENT (dr);
          if (aligned_access_p (dr))
            misalign = 0;
          else if (DR_MISALIGNMENT (dr) == -1)
*************** vectorizable_operation (gimple *stmt, gi
*** 5553,5577 ****
    return true;
  }
  
! /* A helper function to ensure data reference DR's base alignment
!    for STMT_INFO.  */
  
  static void
! ensure_base_align (stmt_vec_info stmt_info, struct data_reference *dr)
  {
    if (!dr->aux)
      return;
  
    if (DR_VECT_AUX (dr)->base_misaligned)
      {
-       tree vectype = STMT_VINFO_VECTYPE (stmt_info);
        tree base_decl = DR_VECT_AUX (dr)->base_decl;
  
        if (decl_in_symtab_p (base_decl))
!       symtab_node::get (base_decl)->increase_alignment (TYPE_ALIGN (vectype));
        else
        {
!           SET_DECL_ALIGN (base_decl, TYPE_ALIGN (vectype));
            DECL_USER_ALIGN (base_decl) = 1;
        }
        DR_VECT_AUX (dr)->base_misaligned = false;
--- 5558,5582 ----
    return true;
  }
  
! /* A helper function to ensure data reference DR's base alignment.  */
  
  static void
! ensure_base_align (struct data_reference *dr)
  {
    if (!dr->aux)
      return;
  
    if (DR_VECT_AUX (dr)->base_misaligned)
      {
        tree base_decl = DR_VECT_AUX (dr)->base_decl;
  
+       unsigned int align_base_to = DR_TARGET_ALIGNMENT (dr) * BITS_PER_UNIT;
+ 
        if (decl_in_symtab_p (base_decl))
!       symtab_node::get (base_decl)->increase_alignment (align_base_to);
        else
        {
!         SET_DECL_ALIGN (base_decl, align_base_to);
            DECL_USER_ALIGN (base_decl) = 1;
        }
        DR_VECT_AUX (dr)->base_misaligned = false;
*************** vectorizable_store (gimple *stmt, gimple
*** 5775,5781 ****
  
    /* Transform.  */
  
!   ensure_base_align (stmt_info, dr);
  
    if (memory_access_type == VMAT_GATHER_SCATTER)
      {
--- 5780,5786 ----
  
    /* Transform.  */
  
!   ensure_base_align (dr);
  
    if (memory_access_type == VMAT_GATHER_SCATTER)
      {
*************** vectorizable_store (gimple *stmt, gimple
*** 6417,6423 ****
                                      dataref_offset
                                      ? dataref_offset
                                      : build_int_cst (ref_type, 0));
!             align = TYPE_ALIGN_UNIT (vectype);
              if (aligned_access_p (first_dr))
                misalign = 0;
              else if (DR_MISALIGNMENT (first_dr) == -1)
--- 6422,6428 ----
                                      dataref_offset
                                      ? dataref_offset
                                      : build_int_cst (ref_type, 0));
!             align = DR_TARGET_ALIGNMENT (first_dr);
              if (aligned_access_p (first_dr))
                misalign = 0;
              else if (DR_MISALIGNMENT (first_dr) == -1)
*************** vectorizable_load (gimple *stmt, gimple_
*** 6813,6819 ****
  
    /* Transform.  */
  
!   ensure_base_align (stmt_info, dr);
  
    if (memory_access_type == VMAT_GATHER_SCATTER)
      {
--- 6818,6824 ----
  
    /* Transform.  */
  
!   ensure_base_align (dr);
  
    if (memory_access_type == VMAT_GATHER_SCATTER)
      {
*************** vectorizable_load (gimple *stmt, gimple_
*** 7512,7518 ****
                                     dataref_offset
                                     ? dataref_offset
                                     : build_int_cst (ref_type, 0));
!                   align = TYPE_ALIGN_UNIT (vectype);
                    if (alignment_support_scheme == dr_aligned)
                      {
                        gcc_assert (aligned_access_p (first_dr));
--- 7517,7523 ----
                                     dataref_offset
                                     ? dataref_offset
                                     : build_int_cst (ref_type, 0));
!                   align = DR_TARGET_ALIGNMENT (dr);
                    if (alignment_support_scheme == dr_aligned)
                      {
                        gcc_assert (aligned_access_p (first_dr));
*************** vectorizable_load (gimple *stmt, gimple_
*** 7555,7565 ****
                      ptr = copy_ssa_name (dataref_ptr);
                    else
                      ptr = make_ssa_name (TREE_TYPE (dataref_ptr));
                    new_stmt = gimple_build_assign
                                 (ptr, BIT_AND_EXPR, dataref_ptr,
                                  build_int_cst
                                  (TREE_TYPE (dataref_ptr),
!                                  -(HOST_WIDE_INT)TYPE_ALIGN_UNIT (vectype)));
                    vect_finish_stmt_generation (stmt, new_stmt, gsi);
                    data_ref
                      = build2 (MEM_REF, vectype, ptr,
--- 7560,7571 ----
                      ptr = copy_ssa_name (dataref_ptr);
                    else
                      ptr = make_ssa_name (TREE_TYPE (dataref_ptr));
+                   unsigned int align = DR_TARGET_ALIGNMENT (first_dr);
                    new_stmt = gimple_build_assign
                                 (ptr, BIT_AND_EXPR, dataref_ptr,
                                  build_int_cst
                                  (TREE_TYPE (dataref_ptr),
!                                  -(HOST_WIDE_INT) align));
                    vect_finish_stmt_generation (stmt, new_stmt, gsi);
                    data_ref
                      = build2 (MEM_REF, vectype, ptr,
*************** vectorizable_load (gimple *stmt, gimple_
*** 7581,7588 ****
                    new_stmt = gimple_build_assign
                                 (NULL_TREE, BIT_AND_EXPR, ptr,
                                  build_int_cst
!                                 (TREE_TYPE (ptr),
!                                  -(HOST_WIDE_INT)TYPE_ALIGN_UNIT (vectype)));
                    ptr = copy_ssa_name (ptr, new_stmt);
                    gimple_assign_set_lhs (new_stmt, ptr);
                    vect_finish_stmt_generation (stmt, new_stmt, gsi);
--- 7587,7593 ----
                    new_stmt = gimple_build_assign
                                 (NULL_TREE, BIT_AND_EXPR, ptr,
                                  build_int_cst
!                                 (TREE_TYPE (ptr), -(HOST_WIDE_INT) align));
                    ptr = copy_ssa_name (ptr, new_stmt);
                    gimple_assign_set_lhs (new_stmt, ptr);
                    vect_finish_stmt_generation (stmt, new_stmt, gsi);
*************** vectorizable_load (gimple *stmt, gimple_
*** 7592,7611 ****
                    break;
                  }
                case dr_explicit_realign_optimized:
!                 if (TREE_CODE (dataref_ptr) == SSA_NAME)
!                   new_temp = copy_ssa_name (dataref_ptr);
!                 else
!                   new_temp = make_ssa_name (TREE_TYPE (dataref_ptr));
!                 new_stmt = gimple_build_assign
!                              (new_temp, BIT_AND_EXPR, dataref_ptr,
!                               build_int_cst
!                                 (TREE_TYPE (dataref_ptr),
!                                  -(HOST_WIDE_INT)TYPE_ALIGN_UNIT (vectype)));
!                 vect_finish_stmt_generation (stmt, new_stmt, gsi);
!                 data_ref
!                   = build2 (MEM_REF, vectype, new_temp,
!                             build_int_cst (ref_type, 0));
!                 break;
                default:
                  gcc_unreachable ();
                }
--- 7597,7618 ----
                    break;
                  }
                case dr_explicit_realign_optimized:
!                 {
!                   if (TREE_CODE (dataref_ptr) == SSA_NAME)
!                     new_temp = copy_ssa_name (dataref_ptr);
!                   else
!                     new_temp = make_ssa_name (TREE_TYPE (dataref_ptr));
!                   unsigned int align = DR_TARGET_ALIGNMENT (first_dr);
!                   new_stmt = gimple_build_assign
!                     (new_temp, BIT_AND_EXPR, dataref_ptr,
!                      build_int_cst (TREE_TYPE (dataref_ptr),
!                                    -(HOST_WIDE_INT) align));
!                   vect_finish_stmt_generation (stmt, new_stmt, gsi);
!                   data_ref
!                     = build2 (MEM_REF, vectype, new_temp,
!                               build_int_cst (ref_type, 0));
!                   break;
!                 }
                default:
                  gcc_unreachable ();
                }
Index: gcc/testsuite/gcc.dg/vect/vect-outer-3a.c
===================================================================
*** gcc/testsuite/gcc.dg/vect/vect-outer-3a.c   2017-09-18 12:56:24.635070853 +0100
--- gcc/testsuite/gcc.dg/vect/vect-outer-3a.c   2017-09-18 12:56:24.849185433 +0100
*************** int main (void)
*** 49,52 ****
  }
  
  /* { dg-final { scan-tree-dump-times "OUTER LOOP VECTORIZED" 1 "vect" { xfail { vect_no_align && { ! vect_hw_misalign } } } } } */
! /* { dg-final { scan-tree-dump-times "step doesn't divide the vector-size" 1 "vect" } } */
--- 49,52 ----
  }
  
  /* { dg-final { scan-tree-dump-times "OUTER LOOP VECTORIZED" 1 "vect" { xfail { vect_no_align && { ! vect_hw_misalign } } } } } */
! /* { dg-final { scan-tree-dump-times "step doesn't divide the vector alignment" 1 "vect" } } */
Index: gcc/testsuite/gcc.dg/vect/vect-outer-3a-big-array.c
===================================================================
*** gcc/testsuite/gcc.dg/vect/vect-outer-3a-big-array.c 2017-09-18 12:56:24.635070853 +0100
--- gcc/testsuite/gcc.dg/vect/vect-outer-3a-big-array.c 2017-09-18 12:56:24.847378559 +0100
*************** int main (void)
*** 49,52 ****
  }
  
  /* { dg-final { scan-tree-dump-times "OUTER LOOP VECTORIZED" 1 "vect" { xfail { vect_no_align && { ! vect_hw_misalign } } } } } */
! /* { dg-final { scan-tree-dump-times "step doesn't divide the vector-size" 1 "vect" } } */
--- 49,52 ----
  }
  
  /* { dg-final { scan-tree-dump-times "OUTER LOOP VECTORIZED" 1 "vect" { xfail { vect_no_align && { ! vect_hw_misalign } } } } } */
! /* { dg-final { scan-tree-dump-times "step doesn't divide the vector alignment" 1 "vect" } } */
