On Fri, 21 Feb 2014, Richard Biener wrote:
>
> This fixes the slowness of RTL expansion in PR60291 which is caused
> by excessive collisions in mem-attr sharing. The issue here is
> that sharing attempts happens across functions and we have a _lot_
> of functions in this testcase referencing the same lexically
> equivalent memory, for example MEM[(StgWord *)_5 + -64B]. That
> means those get the same hash value. But they don't compare
> equal because an SSA name _5 from function A is of course not equal
> to one from function B.
>
> The following fixes that by not doing mem-attr sharing across functions
> by clearing the mem-attrs hashtable in rest_of_clean_state.
>
> Another fix may be to do what the comment in iterative_hash_expr
> says for SSA names:
>
> case SSA_NAME:
> /* We can just compare by pointer. */
> return iterative_hash_host_wide_int (SSA_NAME_VERSION (t), val);
>
> (probably blame me for changing that to hashing the SSA version).
It was lxo.
> But I'm not sure that doesn't uncover issues with various hashtables and
> walking them, generating code dependent on the order. It's IMHO just not
> expected that you compare function-local expressions from different
> functions.
Same speedup result from
Index: gcc/tree.c
===================================================================
--- gcc/tree.c (revision 207960)
+++ gcc/tree.c (working copy)
@@ -7428,7 +7428,7 @@ iterative_hash_expr (const_tree t, hashv
}
case SSA_NAME:
/* We can just compare by pointer. */
- return iterative_hash_host_wide_int (SSA_NAME_VERSION (t), val);
+ return iterative_hash_hashval_t ((uintptr_t)t>>3, val);
case PLACEHOLDER_EXPR:
/* The node itself doesn't matter. */
return val;
and from
Index: gcc/tree.c
===================================================================
--- gcc/tree.c (revision 207960)
+++ gcc/tree.c (working copy)
@@ -7428,7 +7428,9 @@ iterative_hash_expr (const_tree t, hashv
}
case SSA_NAME:
/* We can just compare by pointer. */
- return iterative_hash_host_wide_int (SSA_NAME_VERSION (t), val);
+ return iterative_hash_host_wide_int
+ (DECL_UID (cfun->decl),
+ iterative_hash_host_wide_int (SSA_NAME_VERSION (t), val));
case PLACEHOLDER_EXPR:
/* The node itself doesn't matter. */
return val;
better than hashing pointers but requring cfun != NULL in this
function isn't good either.
At this point I'm more comfortable with clearing the hashtable
than with changing iterative_hash_expr in any way. It's also
along the way to get rid of the hash completely.
Oh, btw, the speedup is going from
expand : 481.98 (94%) usr 1.15 (17%) sys 481.94 (93%)
wall 293891 kB (15%) ggc
to
expand : 2.66 ( 7%) usr 0.13 ( 2%) sys 2.64 ( 6%)
wall 262544 kB (13%) ggc
at -O0 (less dramatic slowness for -On).
> The other thing would be to discard mem-attr sharing alltogether,
> but that doesn't seem appropriate at this stage (but it would
> also simplify quite some code). With only one function in RTL
> at a time that shouldn't be too bad (see several suggestions
> along that line, even with statistics).
>
> Bootstrap / regtest running on x86_64-unknown-linux-gnu, ok for
> trunk and 4.8 branch?
>
> Thanks,
> Richard.
>
> 2014-02-21 Richard Biener <[email protected]>
>
> PR middle-end/60291
> * rtl.h (clear_mem_attrs_htab): Declare.
> * emit-rtl.c (clear_mem_attrs_htab): New function.
> * final.c (rest_of_clean_state): Call clear_mem_attrs_htab
> to avoid sharing mem-attrs between functions.
>
> Index: gcc/rtl.h
> ===================================================================
> *** gcc/rtl.h (revision 207960)
> --- gcc/rtl.h (working copy)
> *************** extern int in_sequence_p (void);
> *** 2546,2551 ****
> --- 2546,2552 ----
> extern void init_emit (void);
> extern void init_emit_regs (void);
> extern void init_emit_once (void);
> + extern void clear_mem_attrs_htab (void);
> extern void push_topmost_sequence (void);
> extern void pop_topmost_sequence (void);
> extern void set_new_first_and_last_insn (rtx, rtx);
> Index: gcc/emit-rtl.c
> ===================================================================
> *** gcc/emit-rtl.c (revision 207960)
> --- gcc/emit-rtl.c (working copy)
> *************** init_emit_once (void)
> *** 5913,5918 ****
> --- 5913,5926 ----
> simple_return_rtx = gen_rtx_fmt_ (SIMPLE_RETURN, VOIDmode);
> cc0_rtx = gen_rtx_fmt_ (CC0, VOIDmode);
> }
> +
> + /* Clear the mem-attrs sharing hash table. */
> +
> + void
> + clear_mem_attrs_htab (void)
> + {
> + htab_empty (mem_attrs_htab);
> + }
>
> /* Produce exact duplicate of insn INSN after AFTER.
> Care updating of libcall regions if present. */
> Index: gcc/final.c
> ===================================================================
> *** gcc/final.c (revision 207960)
> --- gcc/final.c (working copy)
> *************** rest_of_clean_state (void)
> *** 4678,4683 ****
> --- 4678,4686 ----
>
> init_recog_no_volatile ();
>
> + /* Reset mem-attrs sharing. */
> + clear_mem_attrs_htab ();
> +
> /* We're done with this function. Free up memory if we can. */
> free_after_parsing (cfun);
> free_after_compilation (cfun);
>
--
Richard Biener <[email protected]>
SUSE / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
GF: Jeff Hawn, Jennifer Guild, Felix Imend"orffer