Re: [PATCH] Fix PR59890, improve var-tracking compile-time
On Mon, Jan 20, 2014 at 10:50 AM, Richard Biener rguent...@suse.de wrote: This improves var-tracking dataflow convergence by using post order on the inverted CFG - which is appropriate for forward dataflow problems. This haves compile-time spent in var-tracking for PR45364 (it also improves other testcases, but that one seems to be the best case). For this to pass bootstrap I had to fix PR59890, two latent bugs with the recently added local_get_addr_cache. Bootstrapped and tested on x86_64-unknown-linux-gnu, ok for trunk? Thanks, Richard. 2014-01-20 Richard Biener rguent...@suse.de PR rtl-optimization/45364 PR rtl-optimization/59890 * var-tracking.c (local_get_addr_clear_given_value): Handle already cleared slot. (val_reset): Handle not allocated local_get_addr_cache. (vt_find_locations): Use post-order on the inverted CFG. This caused: FAIL: gcc.dg/guality/pr43051-1.c -O1 line 40 v == 1 FAIL: gcc.dg/guality/pr43051-1.c -O1 line 41 e == a[1] FAIL: gcc.dg/guality/pr43051-1.c -Os line 40 v == 1 FAIL: gcc.dg/guality/pr43051-1.c -Os line 41 e == a[1] on Linux/x86. H.J.
Re: [PATCH] Fix PR59890, improve var-tracking compile-time
On Jan 20, 2014, Jakub Jelinek ja...@redhat.com wrote: On Mon, Jan 20, 2014 at 06:24:36PM -0200, Alexandre Oliva wrote: But I think this one is wrong. if (var-onepart == ONEPART_VALUE) { if (local_get_addr_cache == NULL) return; But when local_get_addr_cache is non-NULL, no matter if we find a slot there or don't, we still fall thru into the 3 loops etc. Uhh, yes indeed... Evidently I imagined a return at the end of this if :-/ The patch is good then; but what is my review worth after barfing like that? :-D -- Alexandre Oliva, freedom fighterhttp://FSFLA.org/~lxoliva/ You must be the change you wish to see in the world. -- Gandhi Be Free! -- http://FSFLA.org/ FSF Latin America board member Free Software Evangelist Red Hat Brazil Toolchain Engineer
[PATCH] Fix PR59890, improve var-tracking compile-time
This improves var-tracking dataflow convergence by using post order on the inverted CFG - which is appropriate for forward dataflow problems. This haves compile-time spent in var-tracking for PR45364 (it also improves other testcases, but that one seems to be the best case). For this to pass bootstrap I had to fix PR59890, two latent bugs with the recently added local_get_addr_cache. Bootstrapped and tested on x86_64-unknown-linux-gnu, ok for trunk? Thanks, Richard. 2014-01-20 Richard Biener rguent...@suse.de PR rtl-optimization/45364 PR rtl-optimization/59890 * var-tracking.c (local_get_addr_clear_given_value): Handle already cleared slot. (val_reset): Handle not allocated local_get_addr_cache. (vt_find_locations): Use post-order on the inverted CFG. Index: gcc/var-tracking.c === *** gcc/var-tracking.c (revision 206808) --- gcc/var-tracking.c (working copy) *** static bool *** 2481,2487 local_get_addr_clear_given_value (const void *v ATTRIBUTE_UNUSED, void **slot, void *x) { ! if (vt_get_canonicalize_base ((rtx)*slot) == x) *slot = NULL; return true; } --- 2481,2488 local_get_addr_clear_given_value (const void *v ATTRIBUTE_UNUSED, void **slot, void *x) { ! if (*slot != NULL !vt_get_canonicalize_base ((rtx)*slot) == x) *slot = NULL; return true; } *** val_reset (dataflow_set *set, decl_or_va *** 2501,2507 gcc_assert (var-n_var_parts == 1); ! if (var-onepart == ONEPART_VALUE) { rtx x = dv_as_value (dv); void **slot; --- 2502,2509 gcc_assert (var-n_var_parts == 1); ! if (var-onepart == ONEPART_VALUE !local_get_addr_cache != NULL) { rtx x = dv_as_value (dv); void **slot; *** vt_find_locations (void) *** 6934,6945 bool success = true; timevar_push (TV_VAR_TRACKING_DATAFLOW); ! /* Compute reverse completion order of depth first search of the CFG so that the data-flow runs faster. */ ! rc_order = XNEWVEC (int, n_basic_blocks_for_fn (cfun) - NUM_FIXED_BLOCKS); bb_order = XNEWVEC (int, last_basic_block_for_fn (cfun)); ! pre_and_rev_post_order_compute (NULL, rc_order, false); ! for (i = 0; i n_basic_blocks_for_fn (cfun) - NUM_FIXED_BLOCKS; i++) bb_order[rc_order[i]] = i; free (rc_order); --- 6936,6947 bool success = true; timevar_push (TV_VAR_TRACKING_DATAFLOW); ! /* Compute reverse top sord order of the inverted CFG so that the data-flow runs faster. */ ! rc_order = XNEWVEC (int, n_basic_blocks_for_fn (cfun)); bb_order = XNEWVEC (int, last_basic_block_for_fn (cfun)); ! int num = inverted_post_order_compute (rc_order); ! for (i = 0; i num; i++) bb_order[rc_order[i]] = i; free (rc_order);
Re: [PATCH] Fix PR59890, improve var-tracking compile-time
On Jan 20, 2014, Richard Biener rguent...@suse.de wrote: local_get_addr_clear_given_value (const void *v ATTRIBUTE_UNUSED, void **slot, void *x) { ! if (*slot != NULL !vt_get_canonicalize_base ((rtx)*slot) == x) *slot = NULL; return true; } This part is ok *** val_reset (dataflow_set *set, decl_or_va *** 2501,2507 gcc_assert (var-n_var_parts == 1); ! if (var-onepart == ONEPART_VALUE) { rtx x = dv_as_value (dv); void **slot; --- 2502,2509 gcc_assert (var-n_var_parts == 1); ! if (var-onepart == ONEPART_VALUE !local_get_addr_cache != NULL) { rtx x = dv_as_value (dv); void **slot; But I think this one is wrong. You don't want to treat a one-part value as if it wasn't one. If we have to discard locs and equivalences for a one-part value that doesn't have any (because we don't even have a local_get_addr_cache yet), you can *probably* just return right away, because your job is already done. So I'd try: if (var-onepart == ONEPART_VALUE) { if (local_get_addr_cache == NULL) return; ! /* Compute reverse top sord order of the inverted CFG so that the data-flow runs faster. */ sord? ! rc_order = XNEWVEC (int, n_basic_blocks_for_fn (cfun)); bb_order = XNEWVEC (int, last_basic_block_for_fn (cfun)); ! int num = inverted_post_order_compute (rc_order); ! for (i = 0; i num; i++) bb_order[rc_order[i]] = i; free (rc_order); This looks reasonable to me, but I'm a bit concerned that changes in the iteration order could break assumptions present in the code. I can't think of anything specific, but I haven't completed swapping this code back in ;-) -- Alexandre Oliva, freedom fighterhttp://FSFLA.org/~lxoliva/ You must be the change you wish to see in the world. -- Gandhi Be Free! -- http://FSFLA.org/ FSF Latin America board member Free Software Evangelist Red Hat Brazil Toolchain Engineer
Re: [PATCH] Fix PR59890, improve var-tracking compile-time
On Mon, Jan 20, 2014 at 06:24:36PM -0200, Alexandre Oliva wrote: ! if (var-onepart == ONEPART_VALUE) { rtx x = dv_as_value (dv); void **slot; --- 2502,2509 gcc_assert (var-n_var_parts == 1); ! if (var-onepart == ONEPART_VALUE !local_get_addr_cache != NULL) { rtx x = dv_as_value (dv); void **slot; But I think this one is wrong. You don't want to treat a one-part value as if it wasn't one. If we have to discard locs and equivalences for a one-part value that doesn't have any (because we don't even have a local_get_addr_cache yet), you can *probably* just return right away, because your job is already done. So I'd try: if (var-onepart == ONEPART_VALUE) { if (local_get_addr_cache == NULL) return; But when local_get_addr_cache is non-NULL, no matter if we find a slot there or don't, we still fall thru into the 3 loops etc. Jakub