On Thu, 26 Sept 2024 at 08:54, Jonas Oberhauser
<jonas.oberhau...@huaweicloud.com> wrote:
>
> No, the issue introduced by the compiler optimization (or by your
> original patch) is that the CPU can speculatively load from the first
> pointer as soon as it has completed the load of that pointer:

You mean the compiler can do it. The inline asm has no impact on what
the CPU does. The conditional isn't a barrier for the actual hardware.
But once the compiler doesn't try to do it, the data dependency on the
address does end up being an ordering constraint on the hardware too
(I'm happy to say that I haven't heard from the crazies that want
value prediction in a long time).

Just use a barrier. Or make sure to use the proper ordered memory
accesses when possible. Don't use an inline asm for the compare - we
don't even have anything insane like that as a portable helper, and we
shouldn't have it.

                 Linus
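
A minimal userspace sketch of the "use the proper ordered memory accesses" option, assuming C11 atomics rather than the kernel's own primitives (the rough kernel analogues would be smp_store_release() and smp_load_acquire()). All names here (struct item, published, publish, read_if_current) are hypothetical and exist only for illustration. The point it tries to show: with an acquire load, the ordering no longer depends on the compiler preserving the address dependency after the pointer comparison.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical payload type, for illustration only. */
struct item {
	int value;
};

static struct item slot_a = { .value = 1 };
static struct item slot_b = { .value = 2 };

/* Hypothetical shared pointer that a writer publishes and readers follow. */
static _Atomic(struct item *) published = &slot_a;

/* Writer: fill in the item first, then publish it with release semantics. */
static void publish(struct item *p, int value)
{
	p->value = value;
	atomic_store_explicit(&published, p, memory_order_release);
}

/*
 * Reader: load the pointer with acquire semantics, compare it, then
 * dereference. After the comparison the compiler is free to use
 * 'expected' instead of 'p' for the dereference, which would drop the
 * address dependency. The acquire load orders the later read of ->value
 * after the pointer load either way, so correctness does not rely on
 * that dependency surviving optimization.
 */
static int read_if_current(struct item *expected)
{
	struct item *p = atomic_load_explicit(&published, memory_order_acquire);

	if (p != expected)
		return -1;	/* pointer changed; caller would retry */

	return p->value;
}
```

Calling publish(&slot_b, 42) and then read_if_current(&slot_b) returns the published value, while read_if_current(&slot_a) returns -1 because the pointer no longer matches. An acquire load can cost more than a plain dependent load on weakly ordered hardware, which is the trade-off against the "just use a barrier" alternative.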