On Thu, Nov 12, 2015 at 7:35 PM, Alan Lawrence <alan.lawre...@arm.com> wrote:
> On 06/11/15 16:29, Richard Biener wrote:
>>>> 2) You should be able to use fold_ctor_reference directly (in place
>>> of
>>>> all your code
>>>> in case offset and size are readily available - don't remember
>>> exactly how
>>>> complete scalarization "walks" elements).  Alternatively use
>>>> fold_const_aggregate_ref.
>
> Well, yes I was able to remove subst_constant_pool_initial by using
> fold_ctor_reference, with some fairly strict checks about the access 
> expressions
> found while scanning the input. I still needed to either deep-scan the 
> constant
> value itself, or allow completely_scalarize to fail, due to Ada c460010 where
> a variable has DECL_INITIAL: {VIEW_CONVERT_EXPR<integer[2:4]>({1, 2, 3})}
> (that is, a CONSTRUCTOR whose only element is a V.C.E. of another 
> CONSTRUCTOR).
>
> However:
>
>> I tried to suggest creating piecewise loads from the constant pool as 
>> opposed to the aggregate one.  But I suppose the difficulty is to determine 
>> 'pieces', not their values (that followup passes and folding can handle 
>> quite efficiently).
>
> I've now got this working, and it simplifies the patch massively :).
>
> We now replace e.g. a constant-pool load a = *.LC0, with a series of loads 
> e.g.
> SR_a1 = SRlc0_1
> SR_a2 = SRlc0_2
> etc. etc. (well those wouldn't be quite the names, but you get the idea.)
>
> And then at the start of the function we initialise
> SRlc0_1 = *.LC0[0]
> SRlc0_2 = *.LC0[1]
> etc. etc.
> - I needed to use SRlc0_1 etc. variables because some of the constant-pool 
> loads
> coming from Ada are rather more complicated than 'a = *.LC0' and substituting
> the access expression (e.g. '*.LC0[0].foo'), rather than the replacement_decl,
> into those lead to just too many verify_gimple failures.
>
> However, the initialization seems to work well enough, if I put a check into
> sra_modify_expr to avoid changing 'SRlc0_1 = *.LC0[0]' into 'SRlc0_1 = 
> SRlc0_1'
> (!). Slightly hacky perhaps but harmless as there cannot be a statement 
> writing
> to a scalar replacement prior to sra_modify_expr.
>
> Also I had to prevent writing scalar replacements back to the constant pool
> (gnat opt31.adb).
>
> Finally - I've left the disqualified_constants code in, but am now only using 
> it
> for an assert...so I guess it'd be reasonable to take that out. In an ideal
> world we could do those checks only in checking builds but allocating bitmaps
> only for checking builds feels like it would make checking vs non-checking
> diverge too much.
>
> This is now passing bootstrap+check-{gcc,g++,fortran} on AArch64, and the same
> plus Ada on x64_64 and ARM, except for some additional guality failures in
> pr54970-1.c on AArch64 (tests which are already failing on ARM) - I haven't 
> had
> any success in figuring those out yet, suggestions welcome.
>
> However, without the DOM patches, this does not fix the oiginal 
> ssa-dom-cse-2.c
> testcase, even though the load is scalarized in esra and the individual 
> element
> loads resolved in ccp2. I don't think I'll be able to respin those between now
> and stage 1 ending on Saturday, so hence I've had to drop the testsuite 
> change,
> and can only / will merely describe the remaining problem in PR/63679. I'm
> now benchmarking on AArch64 to see what difference this patch makes on its 
> own.
>
> Thoughts?

The new function needs a comment.  Otherwise, given there are no testcases,
I wonder what this does to nested constructs like multi-dimensional arrays
in a struct.

The patch looks _much_ better now though.  But I also wonder what the effects
on code-generation are for the non-reg-type load of part case.

I'd say the patch is ok with the comment added and a testcase (or two) verifying
it works as desired.

Thanks,
Richard.

> gcc/ChangeLog:
>
>         PR target/63679
>         * tree-sra.c (disqualified_constants): New.
>         (sra_initialize): Allocate disqualified_constants.
>         (sra_deinitialize): Free disqualified_constants.
>         (disqualify_candidate): Update disqualified_constants when 
> appropriate.
>         (create_access): Scan for constant-pool entries as we go along.
>         (scalarizable_type_p): Add check against type_contains_placeholder_p.
>         (maybe_add_sra_candidate): Allow constant-pool entries.
>         (initialize_constant_pool_replacements): New.
>         (sra_modify_expr): Avoid mangling assignments created by previous.
>         (sra_modify_function_body): Call 
> initialize_constant_pool_replacements.
> ---
>  gcc/tree-sra.c | 93 
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++--
>  1 file changed, 90 insertions(+), 3 deletions(-)
>
> diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c
> index 1d4a632..aa60f06 100644
> --- a/gcc/tree-sra.c
> +++ b/gcc/tree-sra.c
> @@ -325,6 +325,10 @@ candidate (unsigned uid)
>     those which cannot be (because they are and need be used as a whole).  */
>  static bitmap should_scalarize_away_bitmap, cannot_scalarize_away_bitmap;
>
> +/* Bitmap of candidates in the constant pool, which cannot be scalarized
> +   because this would produce non-constant expressions (e.g. Ada).  */
> +static bitmap disqualified_constants;
> +
>  /* Obstack for creation of fancy names.  */
>  static struct obstack name_obstack;
>
> @@ -647,6 +651,7 @@ sra_initialize (void)
>      (vec_safe_length (cfun->local_decls) / 2);
>    should_scalarize_away_bitmap = BITMAP_ALLOC (NULL);
>    cannot_scalarize_away_bitmap = BITMAP_ALLOC (NULL);
> +  disqualified_constants = BITMAP_ALLOC (NULL);
>    gcc_obstack_init (&name_obstack);
>    base_access_vec = new hash_map<tree, auto_vec<access_p> >;
>    memset (&sra_stats, 0, sizeof (sra_stats));
> @@ -665,6 +670,7 @@ sra_deinitialize (void)
>    candidates = NULL;
>    BITMAP_FREE (should_scalarize_away_bitmap);
>    BITMAP_FREE (cannot_scalarize_away_bitmap);
> +  BITMAP_FREE (disqualified_constants);
>    access_pool.release ();
>    assign_link_pool.release ();
>    obstack_free (&name_obstack, NULL);
> @@ -679,6 +685,8 @@ disqualify_candidate (tree decl, const char *reason)
>  {
>    if (bitmap_clear_bit (candidate_bitmap, DECL_UID (decl)))
>      candidates->remove_elt_with_hash (decl, DECL_UID (decl));
> +  if (TREE_CODE (decl) == VAR_DECL && DECL_IN_CONSTANT_POOL (decl))
> +    bitmap_set_bit (disqualified_constants, DECL_UID (decl));
>
>    if (dump_file && (dump_flags & TDF_DETAILS))
>      {
> @@ -830,8 +838,11 @@ create_access_1 (tree base, HOST_WIDE_INT offset, 
> HOST_WIDE_INT size)
>    return access;
>  }
>
> +static bool maybe_add_sra_candidate (tree);
> +
>  /* Create and insert access for EXPR. Return created access, or NULL if it is
> -   not possible.  */
> +   not possible.  Also scan for uses of constant pool as we go along and add
> +   to candidates.  */
>
>  static struct access *
>  create_access (tree expr, gimple *stmt, bool write)
> @@ -854,6 +865,25 @@ create_access (tree expr, gimple *stmt, bool write)
>    else
>      ptr = false;
>
> +  /* For constant-pool entries, check we can substitute the constant value.  
> */
> +  if (TREE_CODE (base) == VAR_DECL && DECL_IN_CONSTANT_POOL (base)
> +      && (sra_mode == SRA_MODE_EARLY_INTRA || sra_mode == SRA_MODE_INTRA))
> +    {
> +      gcc_assert (!bitmap_bit_p (disqualified_constants, DECL_UID (base)));
> +      if (expr != base
> +         && !is_gimple_reg_type (TREE_TYPE (expr))
> +         && dump_file && (dump_flags & TDF_DETAILS))
> +       {
> +         /* This occurs in Ada with accesses to ARRAY_RANGE_REFs,
> +            and elements of multidimensional arrays (which are
> +            multi-element arrays in their own right).  */
> +         fprintf (dump_file, "Allowing non-reg-type load of part"
> +                             " of constant-pool entry: ");
> +         print_generic_expr (dump_file, expr, 0);
> +       }
> +      maybe_add_sra_candidate (base);
> +    }
> +
>    if (!DECL_P (base) || !bitmap_bit_p (candidate_bitmap, DECL_UID (base)))
>      return NULL;
>
> @@ -912,6 +942,8 @@ static bool
>  scalarizable_type_p (tree type)
>  {
>    gcc_assert (!is_gimple_reg_type (type));
> +  if (type_contains_placeholder_p (type))
> +    return false;
>
>    switch (TREE_CODE (type))
>    {
> @@ -1832,7 +1864,12 @@ maybe_add_sra_candidate (tree var)
>        reject (var, "not aggregate");
>        return false;
>      }
> -  if (needs_to_live_in_memory (var))
> +  /* Allow constant-pool entries (that "need to live in memory")
> +     unless we are doing IPA SRA.  */
> +  if (needs_to_live_in_memory (var)
> +      && (sra_mode == SRA_MODE_EARLY_IPA
> +         || TREE_CODE (var) != VAR_DECL
> +         || !DECL_IN_CONSTANT_POOL (var)))
>      {
>        reject (var, "needs to live in memory");
>        return false;
> @@ -3250,6 +3287,9 @@ sra_modify_assign (gimple *stmt, gimple_stmt_iterator 
> *gsi)
>    racc = get_access_for_expr (rhs);
>    if (!lacc && !racc)
>      return SRA_AM_NONE;
> +  /* Avoid modifying initializations of constant-pool replacements.  */
> +  if (racc && (racc->replacement_decl == lhs))
> +    return SRA_AM_NONE;
>
>    loc = gimple_location (stmt);
>    if (lacc && lacc->grp_to_be_replaced)
> @@ -3366,7 +3406,10 @@ sra_modify_assign (gimple *stmt, gimple_stmt_iterator 
> *gsi)
>        || contains_vce_or_bfcref_p (lhs)
>        || stmt_ends_bb_p (stmt))
>      {
> -      if (access_has_children_p (racc))
> +      /* No need to copy into a constant-pool, it comes pre-initialized.  */
> +      if (access_has_children_p (racc)
> +         && (TREE_CODE (racc->base) != VAR_DECL
> +             || !DECL_IN_CONSTANT_POOL (racc->base)))
>         generate_subtree_copies (racc->first_child, rhs, racc->offset, 0, 0,
>                                  gsi, false, false, loc);
>        if (access_has_children_p (lacc))
> @@ -3469,6 +3512,48 @@ sra_modify_assign (gimple *stmt, gimple_stmt_iterator 
> *gsi)
>      }
>  }
>
> +static void
> +initialize_constant_pool_replacements (void)
> +{
> +  gimple_seq seq = NULL;
> +  gimple_stmt_iterator gsi = gsi_start (seq);
> +  bitmap_iterator bi;
> +  unsigned i;
> +
> +  EXECUTE_IF_SET_IN_BITMAP (candidate_bitmap, 0, i, bi)
> +    if (bitmap_bit_p (should_scalarize_away_bitmap, i)
> +       && !bitmap_bit_p (cannot_scalarize_away_bitmap, i))
> +      {
> +       tree var = candidate (i);
> +       if (TREE_CODE (var) != VAR_DECL || !DECL_IN_CONSTANT_POOL (var))
> +         continue;
> +       vec<access_p> *access_vec = get_base_access_vector (var);
> +       if (!access_vec)
> +         continue;
> +       for (unsigned i = 0; i < access_vec->length (); i++)
> +         {
> +           struct access *access = (*access_vec)[i];
> +           if (!access->replacement_decl)
> +             continue;
> +           gassign *stmt = gimple_build_assign (
> +             get_access_replacement (access), unshare_expr (access->expr));
> +           if (dump_file && (dump_flags & TDF_DETAILS))
> +             {
> +               fprintf (dump_file, "Generating constant initializer: ");
> +               print_gimple_stmt (dump_file, stmt, 0, 1);
> +               fprintf (dump_file, "\n");
> +             }
> +           gsi_insert_after (&gsi, stmt, GSI_NEW_STMT);
> +           update_stmt (stmt);
> +         }
> +      }
> +
> +  seq = gsi_seq (gsi);
> +  if (seq)
> +    gsi_insert_seq_on_edge_immediate (
> +      single_succ_edge (ENTRY_BLOCK_PTR_FOR_FN (cfun)), seq);
> +}
> +
>  /* Traverse the function body and all modifications as decided in
>     analyze_all_variable_accesses.  Return true iff the CFG has been
>     changed.  */
> @@ -3479,6 +3564,8 @@ sra_modify_function_body (void)
>    bool cfg_changed = false;
>    basic_block bb;
>
> +  initialize_constant_pool_replacements ();
> +
>    FOR_EACH_BB_FN (bb, cfun)
>      {
>        gimple_stmt_iterator gsi = gsi_start_bb (bb);
> --
> 1.9.1
>

Reply via email to