Re: [PATCH v4 00/11] Improve futex usage

Paolo Bonzini Tue, 27 May 2025 08:03:22 -0700

On Tue, May 27, 2025 at 5:01 AM Akihiko Odaki <akihiko.od...@daynix.com> wrote:
> I'd like to submit it with "[PATCH v4 05/11] qemu-thread: Avoid futex
> abstraction for non-Linux" because it aligns the implementations of
> Linux and non-Linux versions to rely on a store-release of EV_SET in
> qemu_event_set().


Ok, I see what you mean - you would like the xchg to be an
xchg_release essentially.

There is actually one case in which skipping the xchg has an effect.
If you have the following:

- one side does

  s.foo = 1;
  qemu_event_set(&s.ev);

- the other side never reaches the qemu_event_reset(&s.ev)

then skipping the xchg might allow the cacheline for ev to remain
shared. This is unlikely to *make* a difference, though it does
*exist* as a difference, so I will review the patch, but I really
prefer to place it last.  It's safer to take a known-working
algorithm, apply it to all OSes (or at least Linux and Windows), and
only then refine it. It also makes my queue shorter.
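
To make the difference concrete, here is a minimal sketch of the two
variants. It uses plain C11 atomics, not QEMU's atomic macros. The
struct, the function names and the numeric state values are
hypothetical, chosen only for illustration, and waking up waiters is
left out entirely. It is not the code from the patch.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical state values, for illustration only. */
enum { EV_SET = 0, EV_FREE = 1, EV_BUSY = 2 };

struct event_sketch {
    atomic_uint value;
};

/*
 * Variant A: always perform the read-modify-write.  The CPU has to
 * take the cache line exclusively even if the event is already set.
 */
static void set_always_xchg(struct event_sketch *ev)
{
    unsigned old = atomic_exchange(&ev->value, EV_SET);
    (void)old;  /* a real implementation would wake waiters if needed */
}

/*
 * Variant B: look first, and skip the xchg when the event is already
 * set.  A plain load does not need exclusive ownership, so the line
 * can stay shared between CPUs.  Returns true if a write was done.
 */
static bool set_skip_if_set(struct event_sketch *ev)
{
    /* Order earlier stores (e.g. s.foo = 1) before the load below. */
    atomic_thread_fence(memory_order_seq_cst);
    if (atomic_load_explicit(&ev->value, memory_order_relaxed) == EV_SET) {
        return false;
    }
    atomic_exchange(&ev->value, EV_SET);
    return true;
}
```

If the other side never resets the event, only the first call to
set_skip_if_set() writes to the cache line; every later call is a
fence plus a load.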

> > Do you think it's incorrect?  I'll wait for your answer before sending
> > out the actual pull request.
>
> It's correct, but I don't think it's worthwhile.
>
> This code path is only used by platforms without a futex wrapper.
> Currently we only have one for Linux and this series adds one for
> Windows, but FreeBSD[1] and OpenBSD[2] have their own futex. macOS also
> gained one with version 14.4.[3] We can add wrappers for them too if
> their performance really matters.
> So the only platforms listed in docs/about/build-platforms.rst that
> require the non-futex version are macOS older than 14.4 and NetBSD.
> macOS older than 14.4 will not be supported after June 5 since macOS 14
> was released June 5, 2023 and docs/about/build-platforms.rst says:
>
> There are too few relevant platforms to justify the effort potentially
> needed for quality assurance.

Ok, nice.  So it's really just NetBSD in the end.

> Moreover, qemu_event_reset() is often followed by qemu_event_wait() or
> other barriers so probably relaxing ordering here does not affect the
> overall ordering constraint (and performance) much.

Understood.  For me it wasn't really about performance, but more about
understanding exactly which reorderings can happen and what
synchronizes with what. Load-acquire/store-release are simpler to
understand in that respect, especially since this use of condvar,
without the mutex in reset, is different from everything else that
I've ever seen.
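
For reference, the pairing I have in mind is the usual message-passing
pattern, sketched here with C11 atomics. The names are hypothetical
(FLAG_SET stands in for EV_SET), and this shows only the ordering
argument, not the event implementation.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical values; FLAG_SET plays the role of EV_SET. */
enum { FLAG_SET = 0, FLAG_FREE = 1 };

struct shared_sketch {
    int foo;            /* plain data, published through the flag */
    atomic_uint flag;
};

/* Writer: the store-release orders the write to foo before the flag. */
static void publish(struct shared_sketch *s)
{
    s->foo = 1;
    atomic_store_explicit(&s->flag, FLAG_SET, memory_order_release);
}

/*
 * Reader: if the load-acquire observes FLAG_SET, it synchronizes with
 * the store-release above, so the read of foo is guaranteed to see 1.
 * Returns false if the flag has not been set yet.
 */
static bool try_consume(struct shared_sketch *s, int *out)
{
    if (atomic_load_explicit(&s->flag, memory_order_acquire) != FLAG_SET) {
        return false;
    }
    *out = s->foo;
    return true;
}
```

With this shape there is exactly one question to answer per access:
which store-release does a given load-acquire read from.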

Paolo
