https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65076
Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed |Added
----------------------------------------------------------------------------
                 CC|        |rguenth at gcc dot gnu.org

--- Comment #14 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Jan Hubicka from comment #11)
> Sorry, the number of clobbers drops at DSE1, not during ehcleanup2 - I just
> messed up my grep.
>
> I tried the following patch:
>
> Index: passes.def
> ===================================================================
> --- passes.def  (revision 221541)
> +++ passes.def  (working copy)
> @@ -87,6 +87,7 @@ along with GCC; see the file COPYING3.
>       NEXT_PASS (pass_build_ealias);
>       NEXT_PASS (pass_fre);
>       NEXT_PASS (pass_merge_phi);
> +     NEXT_PASS (pass_dse);
>       NEXT_PASS (pass_cd_dce);
>       NEXT_PASS (pass_early_ipa_sra);
>       NEXT_PASS (pass_tail_recursion);
>
> This brings the number of CLOBBER statements at release_ssa time down to
> 7392 (a 50% reduction). A nice effect of this patch is that it often
> simplifies destructors down to empty, making them more inlinable:
>
> ObserverEvent::~ObserverEvent() (struct ObserverEvent * const this)
> {
>   <bb 2>:
> -  this_2(D)->_vptr.ObserverEvent = &MEM[(void *)&_ZTV13ObserverEvent + 16B];
>    MEM[(struct  &)this_2(D)] ={v} {CLOBBER};
>    return;
>
> and saves a lot of the clobbers:
>
> Engine<3, double, ExpressionTag<UnaryNode<FnNorm, BinaryNode<OpSubtract,
> Reference<Field<NoMesh<3>, Vector<3, double, Full>, ViewEngine<3,
> IndexFunction<GenericURM<MeshTraits<3, double, UniformRectilinearTag,
> CartesianTag, 3> >::PositionsFunctor> > > >, Scalar<Vector<3, double,
> Full> > > > > > >::~Engine() (struct Engine * const this)
> {
>   <bb 2>:
> -  MEM[(struct  &)this_2(D) + 32] ={v} {CLOBBER};
> -  MEM[(struct  &)this_2(D) + 32] ={v} {CLOBBER};
> -  MEM[(struct  &)this_2(D) + 8] ={v} {CLOBBER};
> -  MEM[(struct  &)this_2(D) + 8] ={v} {CLOBBER};
> -  MEM[(struct  &)this_2(D) + 8] ={v} {CLOBBER};
> -  MEM[(struct  &)this_2(D)] ={v} {CLOBBER};
> -  MEM[(struct  &)this_2(D)] ={v} {CLOBBER};
> -  MEM[(struct  &)this_2(D)] ={v} {CLOBBER};
> +  MEM[(struct  &)this_1(D)] ={v} {CLOBBER};
>    return;
>
> which is especially nice for LTO streaming.
>
> It also saves about 7% of code, apparently after inlining:
>
> $ wc -l *copyprop2
>   200189 tramp3d-v4.ii.085t.copyprop2
> $ wc -l ../5/*copyprop2
>   215060 ../5/tramp3d-v4.ii.084t.copyprop2
>
> even though the inlining decisions do not seem to change considerably
> (at least on tramp3d).

Yeah, clobbers don't account for anything in size/inline estimates (well, I
hope so!). And yes, doing DSE early is quite an old idea... we should revisit
it next stage1.

> On an unrelated note, I noticed PR65502.
>
> Still, I guess this does not really explain the origin of the regression in
> statement count relative to 4.9...

No idea. I'll have to look myself - the &X + 4 vs. &MEM[&X, 4] change is very
recent, so it can't be blamed for the regression. But it might be blamed for
the differences in statement counts - though only from the very beginning.
That is, I can't see how the difference shows up in .ssa but not in .cfg.