On Tue, Jun 24, 2025 at 08:10:54PM -0700, Boqun Feng wrote:
> As its name suggests, simple hazard pointers (shazptr) is a
> simplification of hazard pointers [1]: it has only one hazard pointer
> slot per CPU and is targeted at simple use cases where the read side
> already has preemption disabled. It's a trade-off between the full
> features of a normal hazard pointer implementation (multiple slots,
> dynamic slot allocation, etc.) and the simple use scenario.
>
> Since there is only one slot per CPU, shazptr read-side critical
> section nesting is a problem that needs to be resolved, because at the
> very least interrupts and NMIs can introduce nested shazptr read-side
> critical sections. A SHAZPTR_WILDCARD is introduced to resolve this:
> SHAZPTR_WILDCARD is a special address value that blocks *all* shazptr
> waiters. In the interrupt-caused nesting case (i.e. an interrupt
> arrives while the per-CPU hazard pointer slot is in use and itself
> tries to acquire a hazard pointer), the inner critical section will
> switch the value of the hazard pointer slot to SHAZPTR_WILDCARD and
> let the outer critical section eventually zero the slot.
> SHAZPTR_WILDCARD still provides correct protection because it blocks
> all waiters.
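So if I read that right, the interrupt-nesting case on a single CPU goes
roughly like this (my own sketch of the above; a, b, outer and inner are
made-up names):

        /* task context, preemption disabled, slot == NULL */
        outer = shazptr_acquire(a);     /* slot = a */

        /* interrupt arrives, handler wants its own hazard pointer */
        inner = shazptr_acquire(b);     /* slot != NULL, so slot = SHAZPTR_WILDCARD,
                                         * inner.use_wildcard = true */
        shazptr_clear(inner);           /* use_wildcard set: slot left untouched */
        /* interrupt returns */

        shazptr_clear(outer);           /* !use_wildcard: slot = NULL */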
Don't we typically name such a thing a tombstone?

> It's true that once the wildcard mechanism is activated, the shazptr
> mechanism may be downgraded to something similar to RCU (and probably
> with a worse implementation), which generally has longer wait times
> and a larger memory footprint compared to a typical hazard pointer
> implementation. However, that can only happen with a lot of users
> using hazard pointers, and at that point it's reasonable to introduce
> the fully-featured hazard pointer implementation [2] and switch users
> to it.
>
> Note that shazptr_protect() may be added later; the current potential
> usage doesn't require it, and a shazptr_acquire(), which installs the
> protected value into the hazard pointer slot and provides the
> smp_mb(), is enough for now.
>
> [1]: M. M. Michael, "Hazard pointers: safe memory reclamation for
>      lock-free objects," in IEEE Transactions on Parallel and
>      Distributed Systems, vol. 15, no. 6, pp. 491-504, June 2004
>
> Link: https://lore.kernel.org/lkml/20240917143402.930114-1-boqun.f...@gmail.com/ [2]
> Signed-off-by: Boqun Feng <boqun.f...@gmail.com>
> ---
>  include/linux/shazptr.h  | 73 ++++++++++++++++++++++++++++++++++++++++
>  kernel/locking/Makefile  |  2 +-
>  kernel/locking/shazptr.c | 29 ++++++++++++++++
>  3 files changed, 103 insertions(+), 1 deletion(-)
>  create mode 100644 include/linux/shazptr.h
>  create mode 100644 kernel/locking/shazptr.c
>
> diff --git a/include/linux/shazptr.h b/include/linux/shazptr.h
> new file mode 100644
> index 000000000000..287cd04b4be9
> --- /dev/null
> +++ b/include/linux/shazptr.h
> @@ -0,0 +1,73 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Simple hazard pointers
> + *
> + * Copyright (c) 2025, Microsoft Corporation.
> + *
> + * Author: Boqun Feng <boqun.f...@gmail.com>
> + *
> + * A simple variant of hazard pointers: users must ensure preemption is
> + * already disabled when calling shazptr_acquire() to protect an address.
> + * If one shazptr_acquire() is called after another shazptr_acquire()
> + * without the corresponding shazptr_clear() having been called, the
> + * later shazptr_acquire() must be cleared first.
> + *
> + * The most suitable usage is when only one address needs to be protected
> + * in a preemption-disabled critical section.

It might be useful to have some example code included here to illustrate
how this is supposed to be used etc.; see the sketch at the end of this
mail for roughly what I have in mind.

> + */
> +
> +#ifndef _LINUX_SHAZPTR_H
> +#define _LINUX_SHAZPTR_H
> +
> +#include <linux/cleanup.h>
> +#include <linux/percpu.h>
> +
> +/* Make ULONG_MAX the wildcard value */
> +#define SHAZPTR_WILDCARD ((void *)(ULONG_MAX))

Right, I typically write that like: ((void *)-1L) or ((void *)~0UL).

> +
> +DECLARE_PER_CPU_SHARED_ALIGNED(void *, shazptr_slots);
> +
> +/* Represent a held hazard pointer slot */
> +struct shazptr_guard {
> +        void **slot;
> +        bool use_wildcard;
> +};

Natural alignment ensures the LSB of that pointer is 0, which is enough
space to stick that bool in, no?

> +
> +/*
> + * Acquire a hazptr slot and begin the hazard pointer critical section.
> + *
> + * Must be called with preemption disabled, and preemption must remain
> + * disabled until shazptr_clear().
> + */
> +static inline struct shazptr_guard shazptr_acquire(void *ptr)
> +{
> +        struct shazptr_guard guard = {
> +                /* Preemption is disabled. */
> +                .slot = this_cpu_ptr(&shazptr_slots),

What you're trying to say with that comment is that this_cpu_ptr() will
complain if preemption is not already disabled, and as such this verifies
the assumption?

You can also add lockdep_assert_preemption_disabled() at the start of this
function and then all these comments can go in the bin, no?
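I.e. something like this, completely untested, just the quoted function with
the assertion added:

        static inline struct shazptr_guard shazptr_acquire(void *ptr)
        {
                struct shazptr_guard guard;

                lockdep_assert_preemption_disabled();

                guard.slot = this_cpu_ptr(&shazptr_slots);
                guard.use_wildcard = false;

                if (likely(!READ_ONCE(*guard.slot))) {
                        WRITE_ONCE(*guard.slot, ptr);
                } else {
                        guard.use_wildcard = true;
                        WRITE_ONCE(*guard.slot, SHAZPTR_WILDCARD);
                }

                smp_mb(); /* Synchronize with smp_mb() at synchronize_shazptr(). */

                return guard;
        }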
> +                .use_wildcard = false,
> +        };
> +
> +        if (likely(!READ_ONCE(*guard.slot))) {
> +                WRITE_ONCE(*guard.slot, ptr);
> +        } else {
> +                guard.use_wildcard = true;
> +                WRITE_ONCE(*guard.slot, SHAZPTR_WILDCARD);
> +        }
> +
> +        smp_mb(); /* Synchronize with smp_mb() at synchronize_shazptr(). */
> +
> +        return guard;
> +}
> +
> +static inline void shazptr_clear(struct shazptr_guard guard)
> +{
> +        /* Only clear the slot when the outermost guard is released */
> +        if (likely(!guard.use_wildcard))
> +                smp_store_release(guard.slot, NULL); /* Pair with ACQUIRE at synchronize_shazptr() */
> +}
> +
> +void synchronize_shazptr(void *ptr);
> +
> +DEFINE_CLASS(shazptr, struct shazptr_guard, shazptr_clear(_T),
> +             shazptr_acquire(ptr), void *ptr);
> +#endif
> diff --git a/kernel/locking/Makefile b/kernel/locking/Makefile
> index a114949eeed5..1517076c98ec 100644
> --- a/kernel/locking/Makefile
> +++ b/kernel/locking/Makefile
> @@ -3,7 +3,7 @@
>  # and is generally not a function of system call inputs.
>  KCOV_INSTRUMENT := n
>  
> -obj-y += mutex.o semaphore.o rwsem.o percpu-rwsem.o
> +obj-y += mutex.o semaphore.o rwsem.o percpu-rwsem.o shazptr.o
>  
>  # Avoid recursion lockdep -> sanitizer -> ... -> lockdep & improve performance.
>  KASAN_SANITIZE_lockdep.o := n
> diff --git a/kernel/locking/shazptr.c b/kernel/locking/shazptr.c
> new file mode 100644
> index 000000000000..991fd1a05cfd
> --- /dev/null
> +++ b/kernel/locking/shazptr.c
> @@ -0,0 +1,29 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Simple hazard pointers
> + *
> + * Copyright (c) 2025, Microsoft Corporation.
> + *
> + * Author: Boqun Feng <boqun.f...@gmail.com>
> + */
> +
> +#include <linux/atomic.h>
> +#include <linux/cpumask.h>
> +#include <linux/shazptr.h>
> +
> +DEFINE_PER_CPU_SHARED_ALIGNED(void *, shazptr_slots);
> +EXPORT_PER_CPU_SYMBOL_GPL(shazptr_slots);
> +
> +void synchronize_shazptr(void *ptr)
> +{
> +        int cpu;

        lockdep_assert_preemption_enabled();

> +
> +        smp_mb(); /* Synchronize with the smp_mb() in shazptr_acquire(). */
> +        for_each_possible_cpu(cpu) {
> +                void **slot = per_cpu_ptr(&shazptr_slots, cpu);
> +                /* Pair with smp_store_release() in shazptr_clear(). */
> +                smp_cond_load_acquire(slot,
> +                                      VAL != ptr && VAL != SHAZPTR_WILDCARD);
> +        }
> +}
> +EXPORT_SYMBOL_GPL(synchronize_shazptr);
> -- 
> 2.39.5 (Apple Git-154)
> 
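And for completeness, the sort of usage example I was asking for above.
Entirely illustrative: struct foo, do_lookup(), use() and remove_object()
are made-up names, and it assumes the caller somehow knows the object
cannot be freed before shazptr_acquire() returns (the shazptr_protect()
style re-validation mentioned in the changelog is not shown):

        /* Reader; preemption already disabled. */
        struct foo *f = do_lookup();
        struct shazptr_guard guard = shazptr_acquire(f);

        use(f);                 /* f cannot be freed until shazptr_clear() */

        shazptr_clear(guard);

        /* Updater. */
        remove_object(f);       /* no new readers can find f */
        synchronize_shazptr(f); /* wait for readers currently protecting f */
        kfree(f);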