On Thu, Nov 12, 2015 at 7:35 PM, Alan Lawrence <alan.lawre...@arm.com> wrote: > On 06/11/15 16:29, Richard Biener wrote: >>>> 2) You should be able to use fold_ctor_reference directly (in place >>> of >>>> all your code >>>> in case offset and size are readily available - don't remember >>> exactly how >>>> complete scalarization "walks" elements). Alternatively use >>>> fold_const_aggregate_ref. > > Well, yes I was able to remove subst_constant_pool_initial by using > fold_ctor_reference, with some fairly strict checks about the access > expressions > found while scanning the input. I still needed to either deep-scan the > constant > value itself, or allow completely_scalarize to fail, due to Ada c460010 where > a variable has DECL_INITIAL: {VIEW_CONVERT_EXPR<integer[2:4]>({1, 2, 3})} > (that is, a CONSTRUCTOR whose only element is a V.C.E. of another > CONSTRUCTOR). > > However: > >> I tried to suggest creating piecewise loads from the constant pool as >> opposed to the aggregate one. But I suppose the difficulty is to determine >> 'pieces', not their values (that followup passes and folding can handle >> quite efficiently). > > I've now got this working, and it simplifies the patch massively :). > > We now replace e.g. a constant-pool load a = *.LC0, with a series of loads > e.g. > SR_a1 = SRlc0_1 > SR_a2 = SRlc0_2 > etc. etc. (well those wouldn't be quite the names, but you get the idea.) > > And then at the start of the function we initialise > SRlc0_1 = *.LC0[0] > SRlc0_2 = *.LC0[1] > etc. etc. > - I needed to use SRlc0_1 etc. variables because some of the constant-pool > loads > coming from Ada are rather more complicated than 'a = *.LC0' and substituting > the access expression (e.g. '*.LC0[0].foo'), rather than the replacement_decl, > into those lead to just too many verify_gimple failures. > > However, the initialization seems to work well enough, if I put a check into > sra_modify_expr to avoid changing 'SRlc0_1 = *.LC0[0]' into 'SRlc0_1 = > SRlc0_1' > (!). Slightly hacky perhaps but harmless as there cannot be a statement > writing > to a scalar replacement prior to sra_modify_expr. > > Also I had to prevent writing scalar replacements back to the constant pool > (gnat opt31.adb). > > Finally - I've left the disqualified_constants code in, but am now only using > it > for an assert...so I guess it'd be reasonable to take that out. In an ideal > world we could do those checks only in checking builds but allocating bitmaps > only for checking builds feels like it would make checking vs non-checking > diverge too much. > > This is now passing bootstrap+check-{gcc,g++,fortran} on AArch64, and the same > plus Ada on x64_64 and ARM, except for some additional guality failures in > pr54970-1.c on AArch64 (tests which are already failing on ARM) - I haven't > had > any success in figuring those out yet, suggestions welcome. > > However, without the DOM patches, this does not fix the oiginal > ssa-dom-cse-2.c > testcase, even though the load is scalarized in esra and the individual > element > loads resolved in ccp2. I don't think I'll be able to respin those between now > and stage 1 ending on Saturday, so hence I've had to drop the testsuite > change, > and can only / will merely describe the remaining problem in PR/63679. I'm > now benchmarking on AArch64 to see what difference this patch makes on its > own. > > Thoughts?
The new function needs a comment. Otherwise, given there are no testcases, I wonder what this does to nested constructs like multi-dimensional arrays in a struct. The patch looks _much_ better now though. But I also wonder what the effects on code-generation are for the non-reg-type load of part case. I'd say the patch is ok with the comment added and a testcase (or two) verifying it works as desired. Thanks, Richard. > gcc/ChangeLog: > > PR target/63679 > * tree-sra.c (disqualified_constants): New. > (sra_initialize): Allocate disqualified_constants. > (sra_deinitialize): Free disqualified_constants. > (disqualify_candidate): Update disqualified_constants when > appropriate. > (create_access): Scan for constant-pool entries as we go along. > (scalarizable_type_p): Add check against type_contains_placeholder_p. > (maybe_add_sra_candidate): Allow constant-pool entries. > (initialize_constant_pool_replacements): New. > (sra_modify_expr): Avoid mangling assignments created by previous. > (sra_modify_function_body): Call > initialize_constant_pool_replacements. > --- > gcc/tree-sra.c | 93 > ++++++++++++++++++++++++++++++++++++++++++++++++++++++++-- > 1 file changed, 90 insertions(+), 3 deletions(-) > > diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c > index 1d4a632..aa60f06 100644 > --- a/gcc/tree-sra.c > +++ b/gcc/tree-sra.c > @@ -325,6 +325,10 @@ candidate (unsigned uid) > those which cannot be (because they are and need be used as a whole). */ > static bitmap should_scalarize_away_bitmap, cannot_scalarize_away_bitmap; > > +/* Bitmap of candidates in the constant pool, which cannot be scalarized > + because this would produce non-constant expressions (e.g. Ada). */ > +static bitmap disqualified_constants; > + > /* Obstack for creation of fancy names. */ > static struct obstack name_obstack; > > @@ -647,6 +651,7 @@ sra_initialize (void) > (vec_safe_length (cfun->local_decls) / 2); > should_scalarize_away_bitmap = BITMAP_ALLOC (NULL); > cannot_scalarize_away_bitmap = BITMAP_ALLOC (NULL); > + disqualified_constants = BITMAP_ALLOC (NULL); > gcc_obstack_init (&name_obstack); > base_access_vec = new hash_map<tree, auto_vec<access_p> >; > memset (&sra_stats, 0, sizeof (sra_stats)); > @@ -665,6 +670,7 @@ sra_deinitialize (void) > candidates = NULL; > BITMAP_FREE (should_scalarize_away_bitmap); > BITMAP_FREE (cannot_scalarize_away_bitmap); > + BITMAP_FREE (disqualified_constants); > access_pool.release (); > assign_link_pool.release (); > obstack_free (&name_obstack, NULL); > @@ -679,6 +685,8 @@ disqualify_candidate (tree decl, const char *reason) > { > if (bitmap_clear_bit (candidate_bitmap, DECL_UID (decl))) > candidates->remove_elt_with_hash (decl, DECL_UID (decl)); > + if (TREE_CODE (decl) == VAR_DECL && DECL_IN_CONSTANT_POOL (decl)) > + bitmap_set_bit (disqualified_constants, DECL_UID (decl)); > > if (dump_file && (dump_flags & TDF_DETAILS)) > { > @@ -830,8 +838,11 @@ create_access_1 (tree base, HOST_WIDE_INT offset, > HOST_WIDE_INT size) > return access; > } > > +static bool maybe_add_sra_candidate (tree); > + > /* Create and insert access for EXPR. Return created access, or NULL if it is > - not possible. */ > + not possible. Also scan for uses of constant pool as we go along and add > + to candidates. */ > > static struct access * > create_access (tree expr, gimple *stmt, bool write) > @@ -854,6 +865,25 @@ create_access (tree expr, gimple *stmt, bool write) > else > ptr = false; > > + /* For constant-pool entries, check we can substitute the constant value. > */ > + if (TREE_CODE (base) == VAR_DECL && DECL_IN_CONSTANT_POOL (base) > + && (sra_mode == SRA_MODE_EARLY_INTRA || sra_mode == SRA_MODE_INTRA)) > + { > + gcc_assert (!bitmap_bit_p (disqualified_constants, DECL_UID (base))); > + if (expr != base > + && !is_gimple_reg_type (TREE_TYPE (expr)) > + && dump_file && (dump_flags & TDF_DETAILS)) > + { > + /* This occurs in Ada with accesses to ARRAY_RANGE_REFs, > + and elements of multidimensional arrays (which are > + multi-element arrays in their own right). */ > + fprintf (dump_file, "Allowing non-reg-type load of part" > + " of constant-pool entry: "); > + print_generic_expr (dump_file, expr, 0); > + } > + maybe_add_sra_candidate (base); > + } > + > if (!DECL_P (base) || !bitmap_bit_p (candidate_bitmap, DECL_UID (base))) > return NULL; > > @@ -912,6 +942,8 @@ static bool > scalarizable_type_p (tree type) > { > gcc_assert (!is_gimple_reg_type (type)); > + if (type_contains_placeholder_p (type)) > + return false; > > switch (TREE_CODE (type)) > { > @@ -1832,7 +1864,12 @@ maybe_add_sra_candidate (tree var) > reject (var, "not aggregate"); > return false; > } > - if (needs_to_live_in_memory (var)) > + /* Allow constant-pool entries (that "need to live in memory") > + unless we are doing IPA SRA. */ > + if (needs_to_live_in_memory (var) > + && (sra_mode == SRA_MODE_EARLY_IPA > + || TREE_CODE (var) != VAR_DECL > + || !DECL_IN_CONSTANT_POOL (var))) > { > reject (var, "needs to live in memory"); > return false; > @@ -3250,6 +3287,9 @@ sra_modify_assign (gimple *stmt, gimple_stmt_iterator > *gsi) > racc = get_access_for_expr (rhs); > if (!lacc && !racc) > return SRA_AM_NONE; > + /* Avoid modifying initializations of constant-pool replacements. */ > + if (racc && (racc->replacement_decl == lhs)) > + return SRA_AM_NONE; > > loc = gimple_location (stmt); > if (lacc && lacc->grp_to_be_replaced) > @@ -3366,7 +3406,10 @@ sra_modify_assign (gimple *stmt, gimple_stmt_iterator > *gsi) > || contains_vce_or_bfcref_p (lhs) > || stmt_ends_bb_p (stmt)) > { > - if (access_has_children_p (racc)) > + /* No need to copy into a constant-pool, it comes pre-initialized. */ > + if (access_has_children_p (racc) > + && (TREE_CODE (racc->base) != VAR_DECL > + || !DECL_IN_CONSTANT_POOL (racc->base))) > generate_subtree_copies (racc->first_child, rhs, racc->offset, 0, 0, > gsi, false, false, loc); > if (access_has_children_p (lacc)) > @@ -3469,6 +3512,48 @@ sra_modify_assign (gimple *stmt, gimple_stmt_iterator > *gsi) > } > } > > +static void > +initialize_constant_pool_replacements (void) > +{ > + gimple_seq seq = NULL; > + gimple_stmt_iterator gsi = gsi_start (seq); > + bitmap_iterator bi; > + unsigned i; > + > + EXECUTE_IF_SET_IN_BITMAP (candidate_bitmap, 0, i, bi) > + if (bitmap_bit_p (should_scalarize_away_bitmap, i) > + && !bitmap_bit_p (cannot_scalarize_away_bitmap, i)) > + { > + tree var = candidate (i); > + if (TREE_CODE (var) != VAR_DECL || !DECL_IN_CONSTANT_POOL (var)) > + continue; > + vec<access_p> *access_vec = get_base_access_vector (var); > + if (!access_vec) > + continue; > + for (unsigned i = 0; i < access_vec->length (); i++) > + { > + struct access *access = (*access_vec)[i]; > + if (!access->replacement_decl) > + continue; > + gassign *stmt = gimple_build_assign ( > + get_access_replacement (access), unshare_expr (access->expr)); > + if (dump_file && (dump_flags & TDF_DETAILS)) > + { > + fprintf (dump_file, "Generating constant initializer: "); > + print_gimple_stmt (dump_file, stmt, 0, 1); > + fprintf (dump_file, "\n"); > + } > + gsi_insert_after (&gsi, stmt, GSI_NEW_STMT); > + update_stmt (stmt); > + } > + } > + > + seq = gsi_seq (gsi); > + if (seq) > + gsi_insert_seq_on_edge_immediate ( > + single_succ_edge (ENTRY_BLOCK_PTR_FOR_FN (cfun)), seq); > +} > + > /* Traverse the function body and all modifications as decided in > analyze_all_variable_accesses. Return true iff the CFG has been > changed. */ > @@ -3479,6 +3564,8 @@ sra_modify_function_body (void) > bool cfg_changed = false; > basic_block bb; > > + initialize_constant_pool_replacements (); > + > FOR_EACH_BB_FN (bb, cfun) > { > gimple_stmt_iterator gsi = gsi_start_bb (bb); > -- > 1.9.1 >