> On Mon, Jun 6, 2016 at 3:19 AM, Jan Hubicka <hubi...@ucw.cz> wrote:
> > Hi,
> > while looking into profile mismatches introduced by the backward threading 
> > pass
> > I noticed that the heuristics seems quite simplistics.  First it should be
> > profile sensitive and disallow duplication when optimizing cold paths. 
> > Second
> > it should use estimate_num_insns because gimple statement count is not 
> > really
> > very realistic estimate of final code size effect and third there seems to 
> > be
> > no reason to disable the pass for functions optimized for size.
> >
> > If we block duplication for more than 1 insns for size optimized paths the 
> > pass
> > is able to do majority of threading decisions that are for free and improve 
> > codegen.
> > The code size benefit was between 0.5% to 2.7% on testcases I tried 
> > (tramp3d,
> > GCC modules, xlanancbmk and some other stuff around my hd).
> >
> > Bootstrapped/regtested x86_64-linux, seems sane?
> >
> > The pass should also avoid calling cleanup_cfg when no trheading was done
> > and i do not see why it is guarded by expensive_optimizations. What are the
> > main compile time complexity limitations?
> 
> This patch caused a huge regression (~11%) on coremarks on ThunderX.
> I assume other targets too.
> Basically it looks like the path is no longer thread jumped.

Sorry for late reply. I checked our periodic testers and the patch seems more 
or less
performance neutral with some code size improvements. Can you point me to the 
path that
is no longer crossjumped? I added diag output, so you should see the reason why 
the
path was considered unprofitable - either it was cold or we exceeded the 
maximal size.
The size is largely untuned, so perhaps we can just adjust it.

Honza

Reply via email to