On 1/8/20 11:35 AM, Jan Hubicka wrote:
Hi,
Just to explain better what I am worried about.  The overall sum of
counters in TOPN does not have very good meaning if you have more than N
target.

Lets for simplicity assume that we have TOPN for N=1 (i.e. old code). It
guarantees if target X is taken by more than 50% of times, it will win,
however its count can be arbitrarily low.

Consider following sequence of call targets

Y  Z  Z  Y  Y  Z  Z  X  X  X  X  X  X  X  X  X
The TOPN counter will behave as follows:
Y1 Y0 Z1 Z0 Y1 Y0 Z1 Z0 X1 X2 X3 X4 X5 X6 X7 X8

Now for sequence

Y  Y  Y  X  X  X  Z  Z  Z  X  X  X  Z  X  X  X
Y1 Y2 Y3 Y2 Y1 Y0 Z1 Z2 Z3 Z2 Z1 Z0 Z1 Z0 X1 X2

There are 16 calls, 9 of them calling X, 3 calls of Y and 4 calls of Z.
Depending on order the counter of X can be anywhere between 2 and 8. So
setting threshold without trashing a lot of valid cases would be hard
and if you have something like 40k of parallel invocations of clang I
would expect quite high divergence between the runs.

Thank you for the example. I'm aware of these sequences that can lead
to a significant divergence in profiles.


Similar sequence exists for N>1, you only need to populate other entries
by new targets.

Yes, but it's much more harder to rapidly decrease a dominant value if we
want to assume that only 10% of samples were populated.


However based on observation above we could probably scale up the
probabilites assuming that the missing part of histogram has pretty much
the same distribution as known part.

I hope so.

I'm also opened for a lower default param value. If you can provide any numbers
for clang I would appreciate that.

Martin


Honza


Reply via email to