Re: [RFC PATCH v4 1/4] compiler.h: Introduce ptr_eq() to preserve address dependency

David Laight Thu, 18 Dec 2025 09:09:05 -0800

On Thu, 18 Dec 2025 14:27:36 +0000
Gary Guo <[email protected]> wrote:


> On Thu, 18 Dec 2025 09:03:13 +0000
> David Laight <[email protected]> wrote:
> 
> > On Wed, 17 Dec 2025 20:45:28 -0500
> > Mathieu Desnoyers <[email protected]> wrote:
> >   
> > > diff --git a/include/linux/compiler.h b/include/linux/compiler.h
> > > index 5b45ea7dff3e..c5ca3b54c112 100644
> > > --- a/include/linux/compiler.h
> > > +++ b/include/linux/compiler.h
> > > @@ -163,6 +163,69 @@ void ftrace_likely_update(struct ftrace_likely_data 
> > > *f, int val,
> > >   __asm__ ("" : "=r" (var) : "0" (var))
> > >  #endif
> > >  
> > > +/*
> > > + * Compare two addresses while preserving the address dependencies for
> > > + * later use of the address. It should be used when comparing an address
> > > + * returned by rcu_dereference().
> > > + *
> > > + * This is needed to prevent the compiler CSE and SSA GVN optimizations
> > > + * from using @a (or @b) in places where the source refers to @b (or @a)
> > > + * based on the fact that after the comparison, the two are known to be
> > > + * equal, which does not preserve address dependencies and allows the
> > > + * following misordering speculations:
> > > + *
> > > + * - If @b is a constant, the compiler can issue the loads which depend
> > > + *   on @a before loading @a.
> > > + * - If @b is a register populated by a prior load, weakly-ordered
> > > + *   CPUs can speculate loads which depend on @a before loading @a.
> > > + *
> > > + * The same logic applies with @a and @b swapped.
> > > + *
> > > + * Return value: true if pointers are equal, false otherwise.
> > > + *
> > > + * The compiler barrier() is ineffective at fixing this issue. It does
> > > + * not prevent the compiler CSE from losing the address dependency:
> > > + *
> > > + * int fct_2_volatile_barriers(void)
> > > + * {
> > > + *     int *a, *b;
> > > + *
> > > + *     do {
> > > + *         a = READ_ONCE(p);
> > > + *         asm volatile ("" : : : "memory");
> > > + *         b = READ_ONCE(p);
> > > + *     } while (a != b);
> > > + *     asm volatile ("" : : : "memory");  <-- barrier()
> > > + *     return *b;
> > > + * }
> > > + *
> > > + * With gcc 14.2 (arm64):
> > > + *
> > > + * fct_2_volatile_barriers:
> > > + *         adrp    x0, .LANCHOR0
> > > + *         add     x0, x0, :lo12:.LANCHOR0
> > > + * .L2:
> > > + *         ldr     x1, [x0]  <-- x1 populated by first load.
> > > + *         ldr     x2, [x0]
> > > + *         cmp     x1, x2
> > > + *         bne     .L2
> > > > + *         ldr     w0, [x1]  <-- x1 is used for access which should depend on b.
> > > + *         ret
> > > + *
> > > + * On weakly-ordered architectures, this lets CPU speculation use the
> > > + * result from the first load to speculate "ldr w0, [x1]" before
> > > + * "ldr x2, [x0]".
> > > + * Based on the RCU documentation, the control dependency does not
> > > + * prevent the CPU from speculating loads.    
> > 
> > > I'm not sure that example (of something that doesn't work) is really
> > > necessary.
> > > The simple example of, given:
> > >     return a == b ? *a : 0;
> > > the generated code might speculatively dereference 'b' (not a) before
> > > returning zero when the pointers are different.
> 
> I'm not sure I understand what you're saying.
> 
> `b` cannot be speculatively dereferenced by the compiler in code-path
> where pointers are different, as the compiler cannot ascertain that it is
> valid.

The 'validity' doesn't matter for speculative execution.

> The speculative execution on the processor side *does not* matter here as
> it needs to honour address dependency (unless you're Alpha, which is why we
> add a `mb()` in each `READ_ONCE`).

There isn't an 'address dependency', that is the problem.
The issue is that 'a == b ? *a : 0' and 'a == b ? *b : 0' always evaluate
to the same value and the compiler will (effectively) substitute one for the
other.
But sometimes you really do care which pointer is speculatively dereferenced
when they are different.
Memory barriers can only enforce the order of the reads of 'a', 'b' and '*a',
they won't change whether the generated code contains '*a' or '*b'.
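
As a rough sketch of the above (all function and macro names here are
hypothetical and not taken from the patch), the first two functions return
the same value for every input, so the compiler is free to emit either load.
The third hides copies of both pointers behind an empty asm, in the style of
the context line in the quoted diff, so the compiler can no longer infer that
'a' and 'b' are interchangeable after the comparison. Whether this matches
what the patch actually does is an assumption.

```c
#include <assert.h>

/* What the source says: dereference 'a' when the pointers match. */
static int deref_a_if_equal(int *a, int *b)
{
    return a == b ? *a : 0;
}

/* What the compiler may emit instead: same result, but the load goes via 'b'. */
static int deref_b_if_equal(int *a, int *b)
{
    return a == b ? *b : 0;
}

/* Hypothetical macro: empty asm that makes 'var' opaque to the optimizer. */
#define hide_var(var) __asm__ ("" : "=r" (var) : "0" (var))

/*
 * Compare hidden copies, then dereference the original 'a'. The compiler
 * cannot prove that 'ha' equals 'a', so the equality of 'ha' and 'hb' tells
 * it nothing that would let it replace '*a' with '*b'.
 */
static int deref_a_if_equal_hidden(int *a, int *b)
{
    int *ha = a, *hb = b;

    hide_var(ha);
    hide_var(hb);
    return ha == hb ? *a : 0;
}

/* All three variants agree on the value; only the pointer used may differ. */
static void check_same_result(int *a, int *b)
{
    int expected = deref_a_if_equal(a, b);

    assert(deref_b_if_equal(a, b) == expected);
    assert(deref_a_if_equal_hidden(a, b) == expected);
}
```

The value is identical in all three cases; the only thing that differs is
which pointer the generated load uses.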

        David


> 
> Best,
> Gary
> 
> 
>
