Hi Mark,

On Tue, May 26, 2026 at 7:36 AM Mark Wielaard <[email protected]> wrote:
> OK, I think I agree with this analysis. So the libraries should
> guarantee thread-safety for any concurrent "read" operations on a
> specific handle (even if the underlying data structure is created
> lazily on first access, in which case we do use internal locking). But
> requires the user to use explicit/external locking when combining
> concurrent "read" and "write" operations on the same handle.
>
> What guarantees do we give on the "read" (sub)handles after any "write"
> operation? e.g when the code gets an Elf_Scn * from an Elf * and then
> an Elf_Data * from the Elf_Scn, are the Elf_Scn * and Elf_Data still
> valid after some "write" operation to the Elf *? Or are they
> invalidated and cannot be used after a "write" operation?

We should aim for writes to invalidate existing handles as little
as possible. For example, reads/writes to distinct Elf_Scn handles
should be safe to run concurrently before and after (but not during) a
call to elf_newscn, which appends the new Elf_Scn to the end of the
section list without invalidating the other sections.

However, some functions do invalidate a subhandle. For example,
elf_compress and elf_compress_gnu free Elf_Data->d_buf and allocate
a new one. Users need to be made aware of limitations like this.

One goal of this policy is to have a framework that lets us articulate
simple, intuitive guarantees and constraints to users and to avoid
burdening them with numerous special cases resulting from library
implementation details.  But some special cases will be necessary
and we will document them.

> [...] What if I want to write an parallel eu-strip, can I write out
> separate Elf_Scns in parallel? Do I need to hold a lock per Elf,
> per Elf_Scn or not at all to call elf_newdata? Or if I want to
> parallelize eu-elfcompress, can I create a thread per Elf_Scn to call
> elf_compress on them?

I looked over all the different eu-* tools and almost all of them could
have multithreading capabilities added to them in ways that are obviously
consistent with the proposed policy. eu-strip and eu-elfcompress are the
two most update-heavy tools so you're right to call attention to these.

I believe that a thread per Elf_Scn will work in both cases. Any lazy init
will be protected by an atomic flag and a lock.  This will be transparent to
the library caller, and once all sections are initialized, updates to
distinct Elf_Scns should not modify any shared data.
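
To make the "atomic flag and lock" idea concrete, here is a minimal
sketch of the pattern using C11 atomics and pthreads. It is not libelf
code: struct lazy_handle, lazy_handle_setup, lazy_handle_get and
run_readers are hypothetical names made up for illustration, and the
value 42 stands in for whatever the real lazy init would compute.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical handle for illustration only; not a libelf type.  */
struct lazy_handle
{
  atomic_bool ready;      /* Set once the lazy data is published.  */
  pthread_mutex_t lock;   /* Taken only while initializing.  */
  int value;              /* Stand-in for lazily created data.  */
  int init_count;         /* How many times the init body ran.  */
};

static void
lazy_handle_setup (struct lazy_handle *h)
{
  atomic_init (&h->ready, false);
  pthread_mutex_init (&h->lock, NULL);
  h->value = 0;
  h->init_count = 0;
}

static int
lazy_handle_get (struct lazy_handle *h)
{
  /* Fast path: one acquire load and no lock once initialized.  */
  if (!atomic_load_explicit (&h->ready, memory_order_acquire))
    {
      pthread_mutex_lock (&h->lock);
      /* Re-check under the lock; another thread may have won.  */
      if (!atomic_load_explicit (&h->ready, memory_order_relaxed))
        {
          h->value = 42;
          h->init_count++;
          /* Release store publishes value to the fast path.  */
          atomic_store_explicit (&h->ready, true, memory_order_release);
        }
      pthread_mutex_unlock (&h->lock);
    }
  return h->value;
}

#define MAX_READERS 64

static void *
reader (void *arg)
{
  struct lazy_handle *h = arg;
  assert (lazy_handle_get (h) == 42);
  return NULL;
}

/* Run up to MAX_READERS concurrent readers on one handle and
   return how many times the init body actually ran.  */
static int
run_readers (int nthreads)
{
  struct lazy_handle h;
  pthread_t tids[MAX_READERS];
  int started = 0;

  if (nthreads > MAX_READERS)
    nthreads = MAX_READERS;
  lazy_handle_setup (&h);
  for (int i = 0; i < nthreads; i++)
    {
      if (pthread_create (&tids[started], NULL, reader, &h) != 0)
        break;
      started++;
    }
  for (int i = 0; i < started; i++)
    pthread_join (tids[i], NULL);
  pthread_mutex_destroy (&h.lock);
  return h.init_count;
}
```

Calling run_readers (8) should report that the init body ran exactly
once no matter how the threads interleave. After that first call,
readers only pay for the acquire load, which is the property we would
want on a hot path.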

> It looks like just focusing on libelf seems to provide a good
> concurrency win even for applications also using libdw. Is that because
> libdw doesn't contain problematic locks? Or is libdw not really thread-
> safe already?

Bugs aside, libdw is basically thread safe. Benchmarking has not revealed
any problematic libdw locks yet. There could be some workflows we haven't
benchmarked that reveal problematic locking patterns.  dwarf_getalt may
need to be improved at some point for this reason.  But with this policy
accepted we should be able to replace a hot path lock with an atomic flag,
use a lock only during lazy init, and get the performance benefits we
are seeing from gelf_getsymshndx and others.

Aaron
