Re: GCov space optimization

Andrew Stubbs Tue, 22 Jul 2025 09:30:24 -0700

On 22/07/2025 16:45, Richard Biener wrote:

Am 22.07.2025 um 16:56 schrieb Andrew Stubbs <[email protected]>:

Hi all,

Question: Would it be acceptable to introduce a new "counter" variety, together with 
options and UI support, that simply records a "visited" state?

Have there been any previous efforts in this space that I've missed?


I'm currently looking at the feasibility of reducing the memory footprint of 
coverage testing.  I don't have any figures or testcases yet, but the customer 
states that coverage testing works, except that the memory usage can be a 
problem on their devices, so I'm exploring what options we have to address this.

Looking at the implementation, we currently use one 32- or 64-bit counter (depending on 
the size of "long", I think) for each arc in the program. Obviously this memory 
overhead is a small proportion of the text size, but even so, it might be overkill for 
simple coverage testing (as opposed to profiling). However, the code inserted to 
atomically update the counter might end up being quite a large proportion of the text, so 
that's worth looking at too.

If we add a new "counter" that uses one byte per arc, then that's not really enough bits for a useful 
counter, but is more than enough for a "visited" flag. Required storage would reduce by a factor of 4 or 8, 
and since nothing ever transitions from "visited" to "unvisited" there will be no need for atomic 
updates, which should reduce the code overhead.

Alternatively, if we use one bit per arc then storage will reduce even further, 
but now we need to do read-modify-write again, so maybe the saving is entirely 
negated.


As for the file format, we can either support the new counter all the way through libgcc 
and gcov, or we can expand it to look like a regular counter on stream-out.  This second 
approach would be a lot less effort, and means we retain backwards compatibility with 
pre-installed gcov tools (and even PGO will still be able to locate the most unlikely 
branches?), but the output will be presented as "1" iteration and might 
possibly end up being misleading (at least to the unwary user).

Which approach is likely to be more acceptable to the maintainers?


Don’t you need to instrument more (or at least different) edges when only 
having visited bits ?  With counters you can derive coverage of related edges 
by diffing counters (consider a simple diamond).

I have not yet got this deep into the analysis (apparently). In myhead, if I just reduced every counter to a single bit (using saturatedadd) then gcov would just present a report with "1" or "#####" all downthe left side.

Are you saying that it might actually report "2" (or more) for somecases where the result is derived from multiple counters? If so, thisis probably acceptable for a first implementation, but we might want todo something about it.

That said, any trick that gives fewer counters will reduce overhead, ofboth size and time, and that sounds appealing.

But yes, bits loop appealing for both data and code size.

Richard



Thanks in advance

Andrew

Re: GCov space optimization

Reply via email to