On 2024-09-25 12:45, Boqun Feng wrote:
On Wed, Sep 25, 2024 at 12:11:52PM +0200, Jonas Oberhauser wrote:


On 9/25/2024 at 12:02 PM, Boqun Feng wrote:
Hi Jonas,

Of course, if we are really worried about compilers being too "smart"

Ah, I see you know me better and better...

we can always do the comparison in asm code, then compilers don't know
anything of the equality between 'ptr' and 'head - head_offset'.
Yes, but then a simple compiler barrier between the comparison and returning
ptr would also do the trick, right? And maybe easier on the eyes.


The thing about putting a compiler barrier is that it will prevent all
compiler reorderings, and some of those reorderings may contribute to
better codegen. (I know in this case we have an smp_mb(), but compilers
can still move unrelated code up to the second load for optimization
purposes.) An asm comparison is cheaper in this respect. But TBH,
compilers should provide a way to compare pointer values without using
the result as a pointer-equality proof; if "convert to unsigned long"
doesn't work, some other way should.


Based on Documentation/RCU/rcu_dereference.rst:

-       Be very careful about comparing pointers obtained from
        rcu_dereference() against non-NULL values.  As Linus Torvalds
        explained, if the two pointers are equal, the compiler could
        substitute the pointer you are comparing against for the pointer
        obtained from rcu_dereference().  For example::

                p = rcu_dereference(gp);
                if (p == &default_struct)
                        do_default(p->a);

        Because the compiler now knows that the value of "p" is exactly
        the address of the variable "default_struct", it is free to
        transform this code into the following::

                p = rcu_dereference(gp);
                if (p == &default_struct)
                        do_default(default_struct.a);

        On ARM and Power hardware, the load from "default_struct.a"
        can now be speculated, such that it might happen before the
        rcu_dereference().  This could result in bugs due to misordering.

So I am not only concerned about compiler proofs here, as it appears
that the speculation done by the CPU can also cause issues on some
architectures.
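
To make the discussion concrete, here is a minimal user-space sketch of a
comparison that gives the compiler no equality proof, so the later load
still goes through "p" and keeps its address dependency. It assumes
GCC/Clang inline asm. The names hide_var(), ptr_eq_opaque(), load_gp() and
read_a() are hypothetical, chosen for illustration only; hide_var() is
similar in spirit to the kernel's OPTIMIZER_HIDE_VAR(), and load_gp() is
only a volatile-load stand-in for rcu_dereference(), not a real RCU
primitive.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Assumption: GCC/Clang inline asm. The empty asm statement claims to
 * rewrite "var", so the optimizer loses track of where its value came from.
 */
#define hide_var(var) __asm__ volatile("" : "+r"(var))

/*
 * Hypothetical helper: compare two pointers through hidden copies, so the
 * compiler cannot conclude that the caller's pointers are interchangeable.
 */
static inline bool ptr_eq_opaque(const volatile void *a, const volatile void *b)
{
	hide_var(a);
	hide_var(b);
	return a == b;
}

struct foo {
	int a;
};

static struct foo default_struct = { .a = 1 };
static struct foo *gp = &default_struct;

/* Stand-in for rcu_dereference(gp): a volatile load, no RCU machinery. */
static struct foo *load_gp(void)
{
	return *(struct foo * volatile *)&gp;
}

static int read_a(void)
{
	struct foo *p = load_gp();

	/* The compiler learns nothing about p from this comparison ... */
	if (ptr_eq_opaque(p, &default_struct))
		return p->a;	/* ... so this load is still issued through p. */
	return -1;
}
```

With a plain "p == &default_struct" the compiler would be free to load
default_struct.a directly, which is the transformation the documentation
warns about. Whether hiding both operands is enough on every compiler and
architecture is exactly the open question in this thread.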

Thanks,

Mathieu

Regards,
Boqun


Have fun,
    jonas


--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com

