Hi!

On Wed, Aug 12, 2020 at 09:03:35PM +0200, Richard Biener wrote:
> On August 12, 2020 7:53:07 PM GMT+02:00, Jan Hubicka <hubi...@ucw.cz> wrote:
> >> From: Xiong Hu Luo <luo...@linux.ibm.com>
> >> 523.xalancbmk_r +1.32%
> >> 541.leela_r     +1.51%
> >> 548.exchange2_r +31.87%
> >> 507.cactuBSSN_r +0.80%
> >> 526.blender_r   +1.25%
> >> 538.imagick_r   +1.82%

> >> diff --git a/gcc/cgraph.h b/gcc/cgraph.h
> >> index 0211f08964f..11903ac1960 100644
> >> --- a/gcc/cgraph.h
> >> +++ b/gcc/cgraph.h
> >> @@ -3314,6 +3314,8 @@ cgraph_edge::recursive_p (void)
> >>    cgraph_node *c = callee->ultimate_alias_target ();
> >>    if (caller->inlined_to)
> >>      return caller->inlined_to->decl == c->decl;
> >> +  else if (caller->clone_of && c->clone_of)
> >> +    return caller->clone_of->decl == c->clone_of->decl;
> >>    else
> >>      return caller->decl == c->decl;
> >
> >If you clone the function so it is no longer self recursive, it does not
> >make much sense to lie to optimizers that the function is still
> >recursive.

Like Richard says below (if I understand him right, sorry if not), the
function still *is* recursive in its group of clones.
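
To make that concrete, here is a toy model of the check before and after
the patch.  It is only a sketch: Decl and Node are hypothetical stand-ins
for the real cgraph structures, the callee is assumed to be already
resolved through ultimate_alias_target, and each clone is assumed to have
its own decl.

```cpp
// Hypothetical, heavily simplified stand-ins for GCC's declaration and
// call graph node types; only the fields used by the check are modelled.
struct Decl {};

struct Node {
  Decl *decl;                  // assumed: every clone has its own decl
  Node *inlined_to = nullptr;  // function this node was inlined into, if any
  Node *clone_of = nullptr;    // node this one was cloned from, if any
};

// The check as it reads before the patch: an edge is recursive only when
// caller and callee share the very same decl.
bool recursive_p_old(const Node &caller, const Node &callee) {
  if (caller.inlined_to)
    return caller.inlined_to->decl == callee.decl;
  return caller.decl == callee.decl;
}

// The check with the two added lines: two clones of the same original
// also count as recursive.
bool recursive_p_patched(const Node &caller, const Node &callee) {
  if (caller.inlined_to)
    return caller.inlined_to->decl == callee.decl;
  else if (caller.clone_of && callee.clone_of)
    return caller.clone_of->decl == callee.clone_of->decl;
  return caller.decl == callee.decl;
}
```

With two clones of one self-recursive function, where the first calls the
second, the old check says "not recursive" and the patched one says
"recursive", which is the group-of-clones view.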

> >The inlining would be harmful even if the programmer did cloning by hand.
> >I guess main problem is the extreme register pressure issue combining
> >loop depth of 10 in caller with loop depth of 10 in callee just because
> >the function is called once.
> >
> >The negative effect is most likely also due to wrong profile estimate
> >which drives IRA to optimize wrong spot.  But I wonder if we simply
> >don't want to teach inlining function called once to not construct large
> >loop depths?  Something like do not inline if caller&callee loop depth
> >is over 3 or so?
> 
> I don't think that's good by itself (consider leaf functions and x86 xmm reg 
> ABI across calls). Even with large loop depth abstraction penalty removal can 
> make inlining worth it. For the testcase the recursiveness is what looks 
> special (recursion from a deeper loop nest level). 

Yes, the loop stuff / register pressure issues might help for the
exchange result, but what about the other five above?


Segher
