Hi,

On Tue, Aug 25, 2015 at 12:06:16PM +0100, Alan Lawrence wrote:
> This makes SRA replace loads of records/arrays from constant pool entries,
> with elementwise assignments of the constant values, hence, overcoming the
> fundamental problem in PR/63679.
> 
> As a first pass, the approach I took was to look for constant-pool loads as
> we scanned through other accesses, and add them as candidates there; to build 
> a
> constant replacement_decl for any such accesses in completely_scalarize; and 
> to
> use any existing replacement_decl rather than creating a variable in
> create_access_replacement. (I did try using CONSTANT_CLASS_P in the latter, 
> but
> that does not allow addresses of labels, which can still end up in the 
> constant
> pool.)
> 
> Feedback as to the approach or how it might be better structured / fitted into
> SRA, is solicited ;).

I'm not familiar with constant pools very much, but I'll try:

> 
> Bootstrapped + check-gcc on x86-none-linux-gnu, aarch64-none-linux-gnu and
> arm-none-linux-gnueabihf, including with the next patch (rfc), which greatly 
> increases the number of testcases in which this code is exercised!
> 
> Have also verified that the ssa-dom-cse-2.c scan-tree-dump test passes (using 
> a stage 1 compiler only, without execution) on alpha, hppa, powerpc, sparc, 
> avr, and sh.
> 
> gcc/ChangeLog:
> 
>       * tree-sra.c (create_access): Scan for uses of constant pool and add
>       to candidates.
>       (subst_initial): New.
>       (scalarize_elem): Build replacement_decl using subst_initial.
>       (create_access_replacement): Use replacement_decl if set.
> 
> gcc/testsuite/ChangeLog:
> 
>       * gcc.dg/tree-ssa/ssa-dom-cse-2.c: Remove xfail, add --param
>       sra-max-scalarization-size-Ospeed.
> ---
>  gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-cse-2.c |  7 +---
>  gcc/tree-sra.c                                | 56 
> +++++++++++++++++++++++++--
>  2 files changed, 55 insertions(+), 8 deletions(-)
> 
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-cse-2.c 
> b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-cse-2.c
> index 9eccdc9..b13d583 100644
> --- a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-cse-2.c
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-cse-2.c
> @@ -1,5 +1,5 @@
>  /* { dg-do compile } */
> -/* { dg-options "-O3 -fno-tree-fre -fno-tree-pre -fdump-tree-optimized" } */
> +/* { dg-options "-O3 -fno-tree-fre -fno-tree-pre -fdump-tree-optimized 
> --param sra-max-scalarization-size-Ospeed=32" } */
>  
>  int
>  foo ()
> @@ -17,7 +17,4 @@ foo ()
>  /* After late unrolling the above loop completely DOM should be
>     able to optimize this to return 28.  */
>  
> -/* See PR63679 and PR64159, if the target forces the initializer to memory 
> then
> -   DOM is not able to perform this optimization.  */
> -
> -/* { dg-final { scan-tree-dump "return 28;" "optimized" { xfail aarch64*-*-* 
> alpha*-*-* hppa*-*-* powerpc*-*-* sparc*-*-* s390*-*-* } } } */
> +/* { dg-final { scan-tree-dump "return 28;" "optimized" } } */
> diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c
> index af35fcc..a3ff2df 100644
> --- a/gcc/tree-sra.c
> +++ b/gcc/tree-sra.c
> @@ -865,6 +865,17 @@ create_access (tree expr, gimple stmt, bool write)
>    else
>      ptr = false;
>  
> +  /* FORNOW: scan for uses of constant pool as we go along.  */
> +  if (TREE_CODE (base) == VAR_DECL && DECL_IN_CONSTANT_POOL (base)
> +      && !bitmap_bit_p (candidate_bitmap, DECL_UID (base)))
> +    {
> +      gcc_assert (!write);
> +      bitmap_set_bit (candidate_bitmap, DECL_UID (base));
> +      tree_node **slot = candidates->find_slot_with_hash (base, DECL_UID 
> (base),
> +                                                       INSERT);
> +      *slot = base;
> +    }
> +

I believe you only want to do this if (sra_mode ==
SRA_MODE_EARLY_INTRA || sra_mode == SRA_MODE_INTRA).

The idea of candidates is that we gather them in find_var_candidates
and ten we only eliminate them, this has the benefit of not worrying
about disqualifying a candidate and then erroneously re-adding it
later.  So if you could find a way to structure your code this way, I'd
much happier.  If it is impossible without traversing the whole
function just for that purpose, we may need some mechanism to prevent
us from making a disqualified decl a candidate again.  Or, if we come
to the conclusion that constant pool decls do not ever get
disqualified, a gcc_assert making sure it actually does not happen in
disqualify_candidate.

And of course at find_var_candidates time we check that all candidates
pass simple checks in maybe_add_sra_candidate.  I suppose many of them
do not make sense for constant pool decls but at least please have a
look whether that is the case for all of them or not.

>    if (!DECL_P (base) || !bitmap_bit_p (candidate_bitmap, DECL_UID (base)))
>      return NULL;
>  
> @@ -1025,6 +1036,37 @@ completely_scalarize (tree base, tree decl_type, 
> HOST_WIDE_INT offset, tree ref)
>      }
>  }
>  
> +static tree
> +subst_initial (tree expr, tree var)

This needs a comment and a better name.  A name that would make it
clear this is for constant pool stuff only.

> +{
> +  if (TREE_CODE (expr) == VAR_DECL)
> +    {
> +      gcc_assert (DECL_IN_CONSTANT_POOL (expr));
> +      gcc_assert (expr == var);
> +      return DECL_INITIAL (expr);
> +    }
> +  if (TREE_CODE (expr) == COMPONENT_REF)
> +    {
> +      gcc_assert (TREE_CODE (TREE_OPERAND (expr, 1)) == FIELD_DECL);
> +      gcc_assert (TREE_OPERAND (expr, 2) == NULL_TREE);
> +      return fold_build3 (COMPONENT_REF, TREE_TYPE (expr),
> +                       subst_initial (TREE_OPERAND (expr, 0), var),
> +                       TREE_OPERAND (expr, 1),
> +                       NULL_TREE);
> +    }
> +  if (TREE_CODE (expr) == ARRAY_REF)
> +    {
> +      gcc_assert (TREE_OPERAND (expr, 2) == NULL_TREE);
> +      gcc_assert (TREE_OPERAND (expr, 3) == NULL_TREE);
> +      return fold (build4 (ARRAY_REF, TREE_TYPE (expr),
> +                        subst_initial (TREE_OPERAND (expr, 0), var),
> +                        TREE_OPERAND (expr, 1),
> +                        NULL_TREE,
> +                        NULL_TREE));
> +    }
> +  gcc_unreachable ();
> +}
> +
>  static void
>  scalarize_elem (tree base, HOST_WIDE_INT pos, HOST_WIDE_INT size,
>                tree ref, tree type)
> @@ -1033,6 +1075,9 @@ scalarize_elem (tree base, HOST_WIDE_INT pos, 
> HOST_WIDE_INT size,
>    {
>      struct access *access = create_access_1 (base, pos, size);
>      access->expr = ref;
> +    if (TREE_CODE (base) == VAR_DECL
> +     && DECL_IN_CONSTANT_POOL (base))
> +      access->replacement_decl = subst_initial (ref, base);

So replacement_decl ceases to be a decl and should be renamed to
replacement_expr.  (Such cleanups can be done as a followup, of
course).

Thanks,

Martin

Reply via email to