> >I don't see static profile prediction to be very useful here to find
> >"really hot code" - neither in current implementation or future. The
> >problem of -O2 is that we kind of know that only 10% of code somewhere
> >matters for performance but we have no way to reliably identify it.
> 
> It's hard to do better than statically look at (ipa) loop depth. But 
> shouldn't that be good enough? 

Only if you assume that you have the whole program and understand indirect calls.
There are some stats on this here:
http://ieeexplore.ieee.org/document/717399/

It shows that propagating a static profile across the whole program (which is just
a tiny bit more fancy than counting loop depth) sort of works statistically.  I
really do not have very high hopes of this working reliably in a production
compiler.  We already have PRs for single-function benchmarks where a deep loop
nest is used in initialization or so and the actual hard-working part has a small
loop nest and gets identified as cold.
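
To make that failure mode concrete, here is a made-up sketch.  It is not taken
from any of the PRs; the names init_table and hot_sum and the size N are
hypothetical, chosen only to show how loop depth and actual run time can
disagree.

```c
#include <assert.h>
#include <stddef.h>

#define N 8  /* hypothetical table size, for illustration only */

static double table[N][N][N];

/* Loop depth 3, but it runs once: N*N*N iterations in total.  */
void init_table(void)
{
  for (size_t i = 0; i < N; i++)
    for (size_t j = 0; j < N; j++)
      for (size_t k = 0; k < N; k++)
        table[i][j][k] = (double)(i + j + k);
}

/* Loop depth 1, but the trip count is only known at run time and
   may be arbitrarily large.  */
double hot_sum(const double *v, size_t n)
{
  assert(v != NULL || n == 0);
  double s = 0.0;
  for (size_t i = 0; i < n; i++)
    s += v[i] * v[i];
  return s;
}
```

A heuristic that ranks purely by loop depth would put init_table (depth 3)
ahead of hot_sum (depth 1), even though the first executes 512 iterations once
and the second can dominate the run time when n is large.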

As soon as you start propagating in a whole-program context, such local mistakes
will become more common.
> 
> >
> >It would make sense to have less aggressive vectorization at -O2 and
> >more at -Ofast/-O3.
> 
> We tried that but the runtime effects were not offsetting the compile time 
> cost. 

Yep, I remember that.

Honza
