> On 1/8/20 3:05 PM, Jan Hubicka wrote:
> > > 
> > > > 
> > > > I would still prefer invalidation before streaming (which is fully
> > > > deterministic) and possibly have an option
> > > 
> > > Do you mean __gcov_merge_topn?
> > 
> > I suggest we do the following:
> > 
> >   - have a non-deterministic and a deterministic version of the TOPN
> >     counter and a flag choosing between deterministic and
> >     non-deterministic instrumentation.
> >   - in the non-deterministic version, merge by dropping the least
> >     frequent values as done in your patch.
> 
> Good.
> 
> >   - in the deterministic version do the following
> > 
> >     in __gcov_merge_topn
> >     1) prune out all values in the measured counts whose associated
> >        counter is less than total_count/N (where N is the number of
> >        values we track)
> 
> Can you please explain to me how precisely this prune is useful?

The motivation is to keep everything deterministic.  At the end of a run
we have a deterministic set of counters with deterministic values, and
after this step we again have a deterministic set, just a smaller one
(since we removed the useless entries).

The idea was then to produce the union of all the sets from all the runs
(invalidating a counter on overflow), which is again deterministic
because union is associative and commutative, so it does not matter in
what order we perform it.
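
To make the prune concrete, here is a minimal sketch (the struct and the
function name are made up for illustration; this is not actual libgcov
code, only gcov_type is the real libgcov type):

struct topn_entry
{
  gcov_type value;
  gcov_type count;
};

/* Drop every tracked value whose count is below total/n, where total
   is the sum of all n counts.  The result depends only on the counts
   themselves, so the prune is deterministic.  */

static void
topn_prune (struct topn_entry *e, unsigned n)
{
  gcov_type total = 0;
  unsigned i;

  for (i = 0; i < n; i++)
    total += e[i].count;

  for (i = 0; i < n; i++)
    if (e[i].count < total / n)
      e[i].value = e[i].count = 0;  /* Mark the slot as free.  */
}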
> 
> >     2) prune out all values in the read counter the same way
> >     3) merge the values, ignoring any whose resulting counter is
> >        less than merged_total_count/N and invalidating the whole
> >        counter if there are too many surviving ones
> >     4) before using profiles in GCC prune out all values whose
> >        associated counter is less than total_count/N (they are not
> >        useful anyway).

However, it seems that I missed a problem here.  With steps 2 and 4 the
merge is no longer order-independent.

I added 2 and 4 trying to work around the fact that we have no
convenient place to do the pruning if merging is not performed (since
the merge functions are not called at all).  I think we need to invent
such a place - just do the pruning at the beginning of dump_one_gcov.
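
Assuming the topn_prune sketch from above and a layout of n (value,
count) pairs per TOPN counter, the hook could look roughly like this
(again just an illustration; neither the function nor its call from
dump_one_gcov exists yet):

/* Prune all TOPN counters in one counter section before it is
   streamed, so that pruning also happens when no merging is done.
   To be called at the beginning of dump_one_gcov for each TOPN
   counter section.  */

static void
topn_prune_section (gcov_type *values, unsigned num, unsigned n)
{
  unsigned i;

  /* Each TOPN counter occupies 2 * n adjacent slots.  */
  for (i = 0; i + 2 * n <= num; i += 2 * n)
    topn_prune ((struct topn_entry *) (values + i), n);
}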
> > 
> >     Here total_count can simply be the sum of all N counters; it would
> >     be better if it were the total number of runs, but that is not
> >     available at gcov_merge time unless we introduce an extra counter
> >     for misses.
> > 
> >     The observation is that since we never kick out a value because the
> >     counter is full (but rather invalidate the whole counter), this
> >     remains deterministic.  Hopefully pruning out the useless small
> >     values will prevent the Firefox-like examples from triggering too
> >     often.  Definitely this is stronger than simply invalidating all
> >     counters where the number of runs != sum of all counts, which is
> >     the other deterministic variant.
> > 
> >     2) and 4) are only needed since we currently have no convenient
> >     place to prune counters prior to streaming if merging is not done
> >     at all.  If we invent one, we can skip those steps.
> > 
> > 
> > Does it make sense to you?
> > 
> > I was also wondering: with TOPN it would be nice to have the property
> > that a target with probability greater than 1/(TOPN+1) gets into the
> > counter, but this is guaranteed only for N=1.
> > For N>1 one can populate N-1 counters with unimportant values and only
> > then start feeding in the frequent values, making sure that the
> > frequent targets always fight among themselves for the last slot
> > (exploiting the "let's decrease the least counter" heuristic).
> > 
> > Like for N=3
> > 
> > X X Y Y Z W Z W Z W Z W  ......
> > 
> > here the first two counters get occupied by X and Y with counts 2 and
> > the third counter will end up with W.  Z will be missed even though it
> > has probability 0.5 in the limit while both X and Y have probability 0
> > in the limit.
> > 
> > What about increasing the counter of a value by N on a hit and
> > decreasing all counters by 1 on a miss?  It will make sure that values
> > with frequency less than about 1/N get dropped, making space for those
> > with higher frequencies.
> 
> I like this approach. It looks smart. So then we'll divide all counter values
> by N, right?

Yes.
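
To spell out the rule I have in mind (a sketch with the same made-up
topn_entry type as above, not a finished implementation): bump the
matching entry by N on a hit, decay every entry by one on a full miss,
and let the next missing value claim any slot that has decayed to zero.
Dividing the surviving counts by N then gives back the approximate hit
counts.

static void
topn_profile (struct topn_entry *e, unsigned n, gcov_type value)
{
  unsigned i;

  /* Hit: reward the matching entry by N.  */
  for (i = 0; i < n; i++)
    if (e[i].count && e[i].value == value)
      {
        e[i].count += n;
        return;
      }

  /* Free slot: claim it.  */
  for (i = 0; i < n; i++)
    if (!e[i].count)
      {
        e[i].value = value;
        e[i].count = n;
        return;
      }

  /* Full miss: decay every entry by one.  A value with frequency much
     below 1/N cannot keep up and eventually frees its slot.  */
  for (i = 0; i < n; i++)
    e[i].count--;
}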
> 
> Note that we used to have the Google's profiling code for TOP N:
> https://github.com/gcc-mirror/gcc/blob/gcc-7-branch/libgcc/libgcov-profiler.c#L179

It seems not to implement the logic to decrease counts on a mismatch.
Instead, if the counter is full, the value overwrites the least
frequently used entry.  This can clearly degrade.  For N=3 one can do
X X Y Y Z W Z W Z W Z W ..
where Z and W will keep kicking each other out.

So they seem to wait until the number of such evictions grows to 3000
and then take the low values out of the histogram.

I do not see much advantage in this scheme except that the counts are
not decreased, and thus they correspond more closely to the actual
frequencies of the target (still not precisely, since the target might
have been kicked out earlier).
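
My paraphrase of what the linked code seems to do (the threshold
handling is a guess from a quick read; take this as a sketch, not as
the actual implementation):

static void
topn_profile_lfu (struct topn_entry *e, unsigned n,
                  gcov_type value, gcov_type *evictions)
{
  unsigned i, min = 0;

  for (i = 0; i < n; i++)
    if (e[i].count && e[i].value == value)
      {
        e[i].count++;
        return;
      }

  /* Miss: overwrite the least frequently used entry (an empty slot has
     count 0 and is picked first).  This is where the sequence
     X X Y Y Z W Z W ... makes Z and W keep evicting each other.  */
  for (i = 1; i < n; i++)
    if (e[i].count < e[min].count)
      min = i;
  e[min].value = value;
  e[min].count = 1;

  /* Once evictions pile up, take low values out of the histogram
     (the 3000 threshold mentioned above).  */
  if (++*evictions >= 3000)
    {
      for (i = 0; i < n; i++)
        if (e[i].count <= 1)
          e[i].value = e[i].count = 0;
      *evictions = 0;
    }
}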

Honza
> 
> It's somehow counting the number of evictions, and if that reaches a
> threshold, then low counter values are evicted.
> 
> Martin
> 
> > This should also improve behaviour WRT the problem above.
> > > 
> > > > -fno-deterministic-profile-merging (or something similar) for users who
> > > > do not care and for threaded programs where this is impossible
> > > > (which needs to trigger different merging in libgcov).
> > > 
> > > Fine, let's postpone that for GCC 11.
> > 
> > It is a regression, so I think we have the right to solve it for GCC 10.
> > 
> > Honza
> > > 
> > > Martin
> > > 
> > > > 
> > > > Honza
> > > > 
> > > 
> 
