> On 1/8/20 11:35 AM, Jan Hubicka wrote:
> > Hi,
> > Just to explain better what I am worried about.  The overall sum of
> > counters in TOPN does not have very good meaning if you have more than N
> > target.
> > 
> > Lets for simplicity assume that we have TOPN for N=1 (i.e. old code). It
> > guarantees if target X is taken by more than 50% of times, it will win,
> > however its count can be arbitrarily low.
> > 
> > Consider following sequence of call targets
> > 
> > Y  Z  Z  Y  Y  Z  Z  X  X  X  X  X  X  X  X  X
> > The TOPN counter will behave as follows:
> > Y1 Y0 Z1 Z0 Y1 Y0 Z1 Z0 X1 X2 X3 X4 X5 X6 X7 X8
> > 
> > Now for sequence
> > 
> > Y  Y  Y  X  X  X  Z  Z  Z  X  X  X  Z  X  X  X
> > Y1 Y2 Y3 Y2 Y1 Y0 Z1 Z2 Z3 Z2 Z1 Z0 Z1 Z0 X1 X2
> > 
> > There are 16 calls, 9 of them calling X, 3 calls of Y and 4 calls of Z.
> > Depending on order the counter of X can be anywhere between 2 and 8. So
> > setting threshold without trashing a lot of valid cases would be hard
> > and if you have something like 40k of parallel invocations of clang I
> > would expect quite high divergence between the runs.
> 
> Thank you for the example. I'm aware of these sequences that can lead
> to a significant divergence in profiles.
> 
> > 
> > Similar sequence exists for N>1, you only need to populate other entries
> > by new targets.
> 
> Yes, but it's much more harder to rapidly decrease a dominant value if we
> want to assume that only 10% of samples were populated.
> 
> > 
> > However based on observation above we could probably scale up the
> > probabilites assuming that the missing part of histogram has pretty much
> > the same distribution as known part.
> 
> I hope so.
> 
> I'm also opened for a lower default param value. If you can provide any 
> numbers
> for clang I would appreciate that.

Well, my problem is that as soon as the parameter is non-0 you will
inherently get different counts on counters you accept. Even if you cut
them to differ by less than 10% they will still differ from run to run.
This will lead to different speculation probabilities in each build
which may or may not produce different code.  For complex callgraphs
even relatively small differences in weights leads to reordering of
inline decisions which may lead to different inline decision but even if
it does not it definitly leads to different order of callgraph,
different UIDs of decl which give different hashtable values in late
optimizers and make them to diverge.

If I want to get reproducible builds of very large code base (like
clang) i am not even sure how to test it: look for value which leads to
no difference in say 1000 builds and do it each time GCC or clang
updates?

I would still preffer invalidation before streaming (which is fully
deterministic) and possibly have option
-fno-deterministic-profile-merging (or something similar) for users who
do not care and for threaded programs where this is impossible
(which needs to trigger different merging in libgcov).

Honza

Reply via email to