https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63679
--- Comment #18 from rguenther at suse dot de <rguenther at suse dot de> ---
On Mon, 24 Nov 2014, belagod at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63679
>
> --- Comment #17 from Tejas Belagod <belagod at gcc dot gnu.org> ---
> >
> >        /* Do a block move either if the size is so small as to make
> >           each individual move a sub-unit move on average, or if it
> > -         is so large as to make individual moves inefficient.  */
> > +         is so large as to make individual moves inefficient.  Reuse
> > +         the same costs logic as we use in the SRA passes.  */
> > +      unsigned max_scalarization_size
> > +        = optimize_function_for_size_p (cfun)
> > +          ? PARAM_VALUE (PARAM_SRA_MAX_SCALARIZATION_SIZE_SIZE)
> > +          : PARAM_VALUE (PARAM_SRA_MAX_SCALARIZATION_SIZE_SPEED);
> > +
> >        if (size > 0
> >            && num_nonzero_elements > 1
> >            && (size < num_nonzero_elements
> > -              || !can_move_by_pieces (size, align)))
> > +              || size > max_scalarization_size))
> >          {
> >            if (notify_temp_creation)
> >              return GS_ERROR;
>
> I think both move_by_pieces and SRA can co-exist here:
>
> diff --git a/gcc/gimplify.c b/gcc/gimplify.c
> index 8e3dd83..be51ce7 100644
> --- a/gcc/gimplify.c
> +++ b/gcc/gimplify.c
> @@ -70,6 +70,7 @@ along with GCC; see the file COPYING3.  If not see
>  #include "omp-low.h"
>  #include "gimple-low.h"
>  #include "cilk.h"
> +#include "params.h"
>
>  #include "langhooks-def.h"  /* FIXME: for lhd_set_decl_assembler_name */
>  #include "tree-pass.h"      /* FIXME: only for PROP_gimple_any */
> @@ -3895,7 +3896,6 @@ gimplify_init_constructor (tree *expr_p, gimple_seq *pre_p, gimple_seq *post_p,
>                           DECL_ATTRIBUTES (current_function_decl))))
>          {
>            HOST_WIDE_INT size = int_size_in_bytes (type);
>            unsigned int align;
>
>            /* ??? We can still get unbounded array types, at least
>               from the C++ front end.  This seems wrong, but attempt
> @@ -3907,20 +3907,19 @@ gimplify_init_constructor (tree *expr_p, gimple_seq *pre_p, gimple_seq *post_p,
>                TREE_TYPE (ctor) = type = TREE_TYPE (object);
>              }
>
>            /* Find the maximum alignment we can assume for the object.  */
>            /* ??? Make use of DECL_OFFSET_ALIGN.  */
>            if (DECL_P (object))
>              align = DECL_ALIGN (object);
>            else
>              align = TYPE_ALIGN (type);
>
>            /* Do a block move either if the size is so small as to make
>               each individual move a sub-unit move on average, or if it
> -             is so large as to make individual moves inefficient.  */
> +             is so large as to make individual moves inefficient.  Reuse
> +             the same costs logic as we use in the SRA passes.  */
> +          unsigned max_scalarization_size
> +            = optimize_function_for_size_p (cfun)
> +              ? PARAM_VALUE (PARAM_SRA_MAX_SCALARIZATION_SIZE_SIZE)
> +              : PARAM_VALUE (PARAM_SRA_MAX_SCALARIZATION_SIZE_SPEED);
> +
>            if (size > 0
>                && num_nonzero_elements > 1
>                && (size < num_nonzero_elements
> +                  || size > max_scalarization_size
>                    || !can_move_by_pieces (size, align)))
>              {
>                if (notify_temp_creation)
>                  return GS_ERROR;
>
> If it isn't profitable to do an SRA, we can fall back to the backend
> hook to move it by pieces.  This way, I think we'll have more
> opportunity for optimization.

But that wouldn't fix the AArch64 case as the backend says "no" here
anyway?
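The condition the patch tweaks can be sketched as a standalone C predicate. This is only an illustration of the proposed heuristic, not GCC code: `can_move_by_pieces_stub` and the two `SRA_MAX_SCALARIZATION_SIZE_*` constants are hypothetical stand-ins for the target's real `can_move_by_pieces` hook and the `PARAM_VALUE` lookups.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the SRA --param values; the real numbers
   come from PARAM_VALUE (PARAM_SRA_MAX_SCALARIZATION_SIZE_{SIZE,SPEED})
   and are target/tuning dependent.  */
#define SRA_MAX_SCALARIZATION_SIZE_SPEED 17
#define SRA_MAX_SCALARIZATION_SIZE_SIZE  0

/* Hypothetical backend hook: pretend the target only likes small,
   well-aligned blocks for move-by-pieces.  */
static bool
can_move_by_pieces_stub (long size, unsigned align)
{
  return size <= 16 && align >= 8;
}

/* Mirror of the proposed test in gimplify_init_constructor: return
   true when the constructor should be emitted as a block move (copy
   from static constant data), i.e. when element-wise stores would be
   sub-unit moves on average, when the aggregate is too big for
   SRA-style scalarization, or when the backend cannot move the block
   by pieces anyway.  Returning false means keep the element stores.  */
static bool
use_block_move (long size, long num_nonzero_elements,
                unsigned align, bool optimize_size)
{
  long max_scalarization_size
    = optimize_size ? SRA_MAX_SCALARIZATION_SIZE_SIZE
                    : SRA_MAX_SCALARIZATION_SIZE_SPEED;

  return size > 0
         && num_nonzero_elements > 1
         && (size < num_nonzero_elements
             || size > max_scalarization_size
             || !can_move_by_pieces_stub (size, align));
}
```

Under these assumed thresholds, a 64-byte aggregate with 4 nonzero elements becomes a block move (too large to scalarize), while an 8-byte one with 4 nonzero elements keeps the individual stores.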