https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99105

--- Comment #6 from Martin Liška <marxin at gcc dot gnu.org> ---
(In reply to Jan Hubicka from comment #5)
> > > So it effectively replaces gcov's own buffered I/O by stdio.  First I am
> > > not sure how safe it is (as we had a lot of fun about using malloc)
> > 
> > Why is it not safe? We use filesystem locking for the .gcda files.
> 
> Because user apps may do funny things with stdio, just as they do with
> malloc.  The less library stuff we rely on, the less likely we are to hit
> problems.  So I am not sure if simply fixing our own I/O isn't a better
> approach, but I do not know.

Sure. With the patch, we don't rely on any glibc feature. We will just
use the default read/write I/O (which does its buffering internally).
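To illustrate what I mean (just a sketch, not the actual libgcov code; the
helper name and the file name are made up), reading gcov data through stdio
looks roughly like this, with fread doing the buffering for us:

/* Sketch only: reading 32-bit gcov words through stdio, which buffers
   internally, instead of a hand-rolled buffer over read(2).  The name
   gcov_read_word and the file name are made up for this illustration.  */
#include <stdint.h>
#include <stdio.h>

static int
gcov_read_word (FILE *f, uint32_t *value)
{
  /* fread pulls a whole stdio buffer (typically several kB) from the
     kernel at once, so most calls never hit a syscall.  */
  return fread (value, sizeof *value, 1, f) == 1 ? 0 : -1;
}

int
main (void)
{
  FILE *f = fopen ("app.gcda", "rb");   /* file name is an assumption */
  if (!f)
    return 1;
  uint32_t magic;
  if (gcov_read_word (f, &magic) == 0)
    printf ("magic: %#x\n", (unsigned) magic);
  fclose (f);
  return 0;
}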

> > 
> > > also it adds a dependency on stdio, which is not necessarily a good idea
> > > for embedded targets. Not sure how often it is used there.
> > 
> > It was motivated by PR97834. Well, I think it's better to rely on the
> > system C library, as it provides a faster implementation of buffered I/O.
> > 
> > For embedded targets, I plan to implement hooks that can be used instead of
> > I/O:
> > https://gcc.gnu.org/pipermail/gcc-patches/2020-November/559342.html
> > 
> > > 
> > > But why is glibc stdio more effective? Is it because our buffer size of
> > > 1 kB is way too small (as it seems, judging from the profile, which is
> > > dominated by fread calls rather than open/lock/close)?
> > 
> > It behaved the same on my machine, but the impact on BSD was more significant.
> 
> Clang training seems to be a good extreme testcase and not that hard to
> set up. It is a relatively large testsuite and streaming clearly dominates
> everything else.

Sure, I'll set it up.

> 
> The profile also makes it quite clear that read dominates the other
> syscall overhead.
> > 
> > I'm planning to collect more detailed statistics about why a lot of small
> > I/Os are slower.
> 
> From the perf profile it seems that simply the syscall overhead plays an
> important role (about 20% on the kernel side, plus 9% on the glibc side),
> followed by some stupidity of the openSUSE setup - AppArmor and btrfs.

Yes, that's pretty obvious from the profile.
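A quick way to see that overhead in isolation (again just a sketch, not part
of any patch; the 1 kB buffer size and the default file name are assumptions)
is to drain the same file once with small read(2) calls and once through
stdio, and compare the times:

/* Sketch of a micro-benchmark: drain a file once with 1 kB read(2) calls
   (similar to the old gcov buffer size) and once with fread, and compare
   wall-clock times.  Sizes and the default file name are assumptions.  */
#include <fcntl.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>

static double
now (void)
{
  struct timespec ts;
  clock_gettime (CLOCK_MONOTONIC, &ts);
  return ts.tv_sec + ts.tv_nsec / 1e9;
}

int
main (int argc, char **argv)
{
  const char *path = argc > 1 ? argv[1] : "app.gcda";
  char buf[1024];

  /* Raw read(2) with a 1 kB buffer: one syscall per kB of data.  */
  int fd = open (path, O_RDONLY);
  if (fd < 0)
    return 1;
  double t0 = now ();
  while (read (fd, buf, sizeof buf) > 0)
    ;
  printf ("read(2), 1 kB buffer: %f s\n", now () - t0);
  close (fd);

  /* stdio: fread fills a larger internal buffer, so far fewer syscalls.  */
  FILE *f = fopen (path, "rb");
  if (!f)
    return 1;
  t0 = now ();
  while (fread (buf, 1, sizeof buf, f) > 0)
    ;
  printf ("fread, stdio buffering: %f s\n", now () - t0);
  fclose (f);
  return 0;
}

Note that the second pass benefits from the page cache warmed up by the first
one, so for a fair comparison each case should be run separately.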

> > 
> > In the case of Clang, I would expect 100s (or even 1000s) of object files.
> > During a profiling run (using all cores), I would expect each run to take
> > 100ms (or even seconds), so waiting for a file lock on an object file
> > should not block it much.
> 
> 2727 gcda files, 44MB overall, 4MB xz compressed tar file.
> I am actually surprised that the file count is quite small. Firefox has
> more...

To be honest, that's a very small amount of data. I would expect these files
to definitely live in the page cache.
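If we want to verify that, one option (just a sketch; the default file name
and the error handling are simplified for illustration) is to mmap a .gcda
file and ask the kernel which of its pages are resident via mincore(2):

/* Sketch: check how much of a .gcda file is resident in the page cache
   using mincore(2).  File name and error handling are simplified.  */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

int
main (int argc, char **argv)
{
  const char *path = argc > 1 ? argv[1] : "app.gcda";
  int fd = open (path, O_RDONLY);
  struct stat st;
  if (fd < 0 || fstat (fd, &st) != 0 || st.st_size == 0)
    {
      perror (path);
      return 1;
    }

  void *map = mmap (NULL, st.st_size, PROT_READ, MAP_SHARED, fd, 0);
  if (map == MAP_FAILED)
    {
      perror ("mmap");
      return 1;
    }

  long pagesize = sysconf (_SC_PAGESIZE);
  size_t pages = (st.st_size + pagesize - 1) / pagesize;
  unsigned char *vec = malloc (pages);
  if (!vec || mincore (map, st.st_size, vec) != 0)
    {
      perror ("mincore");
      return 1;
    }

  /* Count pages the kernel reports as resident in memory.  */
  size_t resident = 0;
  for (size_t i = 0; i < pages; i++)
    if (vec[i] & 1)
      resident++;

  printf ("%s: %zu of %zu pages resident in page cache\n",
          path, resident, pages);
  free (vec);
  munmap (map, st.st_size);
  close (fd);
  return 0;
}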

What type of disk do you use?
