> On Mon, Jun 6, 2016 at 3:19 AM, Jan Hubicka <hubi...@ucw.cz> wrote:
> > Hi,
> > while looking into profile mismatches introduced by the backward
> > threading pass I noticed that the heuristics seem quite simplistic.
> > First, it should be profile sensitive and disallow duplication when
> > optimizing cold paths. Second, it should use estimate_num_insns,
> > because gimple statement count is not really a very realistic estimate
> > of the final code size effect. Third, there seems to be no reason to
> > disable the pass for functions optimized for size.
> >
> > If we block duplication of more than 1 insn for size optimized paths,
> > the pass is able to do the majority of threading decisions that are for
> > free and improve codegen.
> > The code size benefit was between 0.5% and 2.7% on testcases I tried
> > (tramp3d, GCC modules, xalancbmk and some other stuff around my hd).
> >
> > Bootstrapped/regtested x86_64-linux, seems sane?
> >
> > The pass should also avoid calling cleanup_cfg when no threading was
> > done and I do not see why it is guarded by expensive_optimizations.
> > What are the main compile time complexity limitations?
>
> This patch caused a huge regression (~11%) on coremarks on ThunderX.
> I assume other targets too.
> Basically it looks like the path is no longer thread jumped.
Sorry for the late reply. I checked our periodic testers and the patch
seems more or less performance neutral, with some code size improvements.
Can you point me to the path that is no longer jump threaded? I added
diagnostic output, so you should see the reason why the path was considered
unprofitable: either it was cold or it exceeded the maximal size. The size
limit is largely untuned, so perhaps we can just adjust it.

Honza
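
For readers following the thread, the decision described above can be sketched roughly as below. This is only an illustration of the two rejection reasons (cold path, size limit exceeded), not the code from the patch: `struct path_info`, `thread_path_verdict`, `thread_verdict_reason` and the `MAX_PATH_INSNS` value are all hypothetical names and numbers chosen for the example, and the instruction estimate is assumed to have been computed elsewhere (for instance with something like estimate_num_insns).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical size limit; the thread notes the real limit is untuned. */
#define MAX_PATH_INSNS 100

/* Possible outcomes of the profitability check (illustrative names). */
enum thread_verdict {
    THREAD_OK,
    THREAD_REJECT_COLD,
    THREAD_REJECT_SIZE
};

/* Hypothetical summary of a candidate jump-threading path. */
struct path_info {
    int  estimated_insns;    /* estimated instructions duplicated */
    bool optimize_for_speed; /* false for cold or size-optimized paths */
};

/* Decide whether duplicating the path is considered profitable. */
enum thread_verdict thread_path_verdict(const struct path_info *path)
{
    assert(path != NULL);

    /* Cold or size-optimized paths: allow only (nearly) free threading,
       i.e. duplication of at most one instruction. */
    if (!path->optimize_for_speed && path->estimated_insns > 1)
        return THREAD_REJECT_COLD;

    /* Hot paths: still cap the amount of duplicated code. */
    if (path->estimated_insns > MAX_PATH_INSNS)
        return THREAD_REJECT_SIZE;

    return THREAD_OK;
}

/* Human-readable reason, in the spirit of the diagnostic output. */
const char *thread_verdict_reason(enum thread_verdict verdict)
{
    switch (verdict) {
    case THREAD_REJECT_COLD:
        return "path is cold or optimized for size";
    case THREAD_REJECT_SIZE:
        return "path exceeds the maximal size";
    default:
        return "path is profitable";
    }
}
```

With this shape, a regression like the one reported would show up as a hot path returning THREAD_REJECT_SIZE, which is the case where raising the limit would help.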