Richard Biener <richard.guent...@gmail.com> writes:
> On Mon, Jul 8, 2019 at 4:41 PM Richard Sandiford
> <richard.sandif...@arm.com> wrote:
>>
>> Richard Biener <richard.guent...@gmail.com> writes:
>> > On Sun, Jul 7, 2019 at 9:07 PM Jeff Law <l...@redhat.com> wrote:
>> >>
>> >> On 7/7/19 3:45 AM, Richard Sandiford wrote:
>> >> > DCE tries to delete dead stores to local data and also tries to insert
>> >> > debug binds for simple cases:
>> >> >
>> >> >   /* If this is a store into a variable that is being optimized away,
>> >> >      add a debug bind stmt if possible.  */
>> >> >   if (MAY_HAVE_DEBUG_BIND_STMTS
>> >> >       && gimple_assign_single_p (stmt)
>> >> >       && is_gimple_val (gimple_assign_rhs1 (stmt)))
>> >> >     {
>> >> >       tree lhs = gimple_assign_lhs (stmt);
>> >> >       if ((VAR_P (lhs) || TREE_CODE (lhs) == PARM_DECL)
>> >> >           && !DECL_IGNORED_P (lhs)
>> >> >           && is_gimple_reg_type (TREE_TYPE (lhs))
>> >> >           && !is_global_var (lhs)
>> >> >           && !DECL_HAS_VALUE_EXPR_P (lhs))
>> >> >         {
>> >> >           tree rhs = gimple_assign_rhs1 (stmt);
>> >> >           gdebug *note
>> >> >             = gimple_build_debug_bind (lhs, unshare_expr (rhs), stmt);
>> >> >           gsi_insert_after (i, note, GSI_SAME_STMT);
>> >> >         }
>> >> >     }
>> >> >
>> >> > But this doesn't help for things like "print *ptr" when ptr points
>> >> > to the local variable (tests Og-dce-1.c and Og-dce-2.c).  It also tends
>> >> > to make the *live* -- and thus useful -- values optimised out, because
>> >> > we can't yet switch back to tracking the memory location as it evolves
>> >> > over time (test Og-dce-3.c).
>> >> >
>> >> > So for -Og I think it'd be better not to delete any stmts with
>> >> > vdefs for now.  This also means that we can avoid the potentially
>> >> > expensive vop walks (which already have a cut-off, but still).
>> >> >
>> >> > The patch also fixes the Og failures in gcc.dg/guality/pr54970.c
>> >> > (PR 86638).
>> >> >
>> >> > Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?
>> >> >
>> >> > Richard
>> >> >
>> >> >
>> >> > 2019-07-07  Richard Sandiford  <richard.sandif...@arm.com>
>> >> >
>> >> > gcc/
>> >> >       PR debug/86638
>> >> >       * tree-ssa-dce.c (keep_all_vdefs_p): New function.
>> >> >       (mark_stmt_if_obviously_necessary): Mark all stmts with vdefs as
>> >> >       necessary if keep_all_vdefs_p is true.
>> >> >       (mark_aliased_reaching_defs_necessary): Add a gcc_checking_assert
>> >> >       that keep_all_vdefs_p is false.
>> >> >       (mark_all_reaching_defs_necessary): Likewise.
>> >> >       (propagate_necessity): Skip the vuse scan if keep_all_vdefs_p is
>> >> >       true.
>> >> >
>> >> > gcc/testsuite/
>> >> >       * c-c++-common/guality/Og-dce-1.c: New test.
>> >> >       * c-c++-common/guality/Og-dce-2.c: Likewise.
>> >> >       * c-c++-common/guality/Og-dce-3.c: Likewise.
>> >> OK
>> >
>> > I wonder how code size (and compile-time) is affected by the DSE/DCE patch?
>> > Say just look at -Og built cc1?
>>
>> Overall I see a ~2.5% slowdown and a 4.7% increase in load size.
>> That comes almost entirely from the (RTL) DSE side; this patch
>> and gimple DSE part don't seem to make much difference.
>>
>> If I keep the gimple passes as-is and just disable RTL DSE, the slowdown
>> is still ~2.5% and there's a 4.4% increase in load size.
>>
>> These are all measuring cc1plus (built from post-patch sources)
>> and using -O2 -g tree-into-ssa.ii for the speed checks.
>>
>> > Can you restrict the keep-all-vdefs to user variables (and measure the
>> > difference this makes)?
>>
>> In order to avoid wrong debug for pointer dereferences, I think it would
>> have to be keep-all-vdefs for writes to either user variables or unknown
>> locations.  But as above, I can't measure a significant difference with
>> the patch.
>>
>> > Again I wonder if this makes C++ with -Og impractical runtime-wise.
>>
>> Got a particular test in mind?
>
> Nothing specific - there are a few C/C++ benchmarks in SPEC and there's
> also tramp3d-v4.  I guess SRA is much more important for the abstraction
> penalty than DSE - FRE should be able to remove the abstraction, just the
> dead stores will remain (but they'd probably nicely execute out-of-order).
>
> Anyway, the biggest runtime penalty from -Og is probably not running
> any loop optimization (invariant motion mostly).

Finally tried it on tramp3d-v4, and I see a slowdown of ~1.6%.
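
For anyone following along, the kind of case the patch is aimed at can be
sketched roughly as below.  This is a made-up reduction, not one of the
Og-dce-*.c tests from the patch; the function and variable names are
hypothetical, and whether a given compiler actually deletes the store
depends on its version and flags.

```c
/* Hypothetical illustration only; not taken from the GCC testsuite.  */

int
compute (int x)
{
  int local = x * 2;      /* Live store: read through ptr below.  */
  int *ptr = &local;      /* The debugger may later evaluate "*ptr".  */
  int result = *ptr + 1;

  local = 0;              /* Dead store: the program never reads it.  */

  /* Set a breakpoint on the return.  If the dead store is deleted and
     replaced by a debug bind, "print local" can still report 0, but
     "print *ptr" reads memory and may show the stale value x * 2.
     Keeping the store (as the patch does for -Og) keeps both views
     consistent.  */
  return result;
}
```

The return value is the same whether or not the store is kept; the only
observable difference is what the debugger reports at the breakpoint.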

Thanks,
Richard
