> 
> 
> > Am 23.07.2025 um 16:42 schrieb Andrew Stubbs <a...@baylibre.com>:
> > 
> > On 23/07/2025 15:22, Michael Matz wrote:
> >> Hello,
> >> On Tue, 22 Jul 2025, Andrew Stubbs wrote:
> >>>> Don’t you need to instrument more (or at least different) edges when
> >>>> only having visited bits ?  With counters you can derive coverage of
> >>>> related edges by diffing counters (consider a simple diamond).
> >>> 
> >>> I have not yet got this deep into the analysis (apparently).  In my
> >>> head, if I just reduced every counter to a single bit (using saturated
> >>> add) then gcov would just present a report with "1" or "#####" all down
> >>> the left side.
> >>> 
> >>> Are you saying that it might actually report "2" (or more) for some
> >>> cases where the result is derived from multiple counters?  If so, this
> >>> is probably acceptable for a first implementation, but we might want to
> >>> do something about it.
> >> No, it will have to say "unknown" for some of the non-instrumented edges.
> >> For instance, in a branch:
> >>    -- e1 --> if (cond) -- e2 --> ...
> >>                       \-- e3 --> ...
> >> We will only instrument a subset of all edges, as c(e1)==c(e2)+c(e3).  The
> >> subset we select will be a minimal-cost edge graph cover of the CFG.
> >> Depending on surrounding code we may select (for demostrations sake) to
> >> instrument e1 and e2 (or perhaps we know c(e1) from still other edges).
> >> So we know c(e1) and c(e2), and can compute c(e3)=c(e1)-c(e2) correctly.
> >> With visited flags you loose that because the meeting operation
> >> (logical or) isn't invertible: if you know that e1 was visited, and you
> >> know that e2 was visited, you still don't know anything about e3.  It may,
> >> or may not, have been visited.  You won't know unless you do instrument
> >> e3.  Or select a different subset of edge (here, v(e1) could be computed
> >> from v(e2)|v(e3)).
> >> So, that need to instrument more edges offsets the storage savings.  I
> >> have no gut feeling for how bad that is, though.  You'd have to try :)
> >> Perhaps it still comes out positive.
> > 
> > Thank you for the careful explanation. It seems like all the "innermost" 
> > blocks would need to be instrumented, and in most cases the rest can be 
> > assumed. Unfortunately, that probably means instrumenting the hottest 
> > blocks, but hopefully/maybe/possibly there's no so many of them. I'll take 
> > a look at the selection algorithm and figure out how it's working, but I 
> > assume I will end up needing a matching gcov to go with the new 
> > instrumentation.
> 
> I think you can work on the instrumentation part by keeping the counter 
> format and just changing the update to set bit zero.

we build minimum spaning tree of the CFG and then instrument all
non-spanning edges.  Since typical out-degree of a basic block is 2 you
get approximately as many counters as basic blocks.  The counts on the
spaning tree are then determined by Kirhoff law.
https://dl.acm.org/doi/pdf/10.1145/183432.183527

If you want to track bits, you may simply instrument all edges and the
data size would be still approx 32times smaller at the expense of more
work in code segment.  You can optimize this. If BB has only one
sucessor S you know that if BB is executed, S is also executed. Not sure
if you can do much better.

I was also playing with idea of not streaming all the zeros as 64bit
values or possibly one can allocate counters on demand...

As for atomic operations, I think you will still need locking unless you
can set a given bit in memory without loading+storing the remaining bits
of the byte.

Honza

Reply via email to