Re: [RFC PATCH 1/4] hazptr: Add initial implementation of hazard pointers

Jonas Oberhauser Thu, 19 Sep 2024 13:39:50 -0700




Am 9/19/2024 um 2:12 AM schrieb Jann Horn:

On Tue, Sep 17, 2024 at 4:33 PM Boqun Feng <boqun.f...@gmail.com> wrote:

Hazard pointers [1] provide a way to dynamically distribute refcounting
and can be used to improve the scalability of refcounting without
significant space cost.

+static inline void *__hazptr_tryprotect(hazptr_t *hzp,
+                                       void *const *p,
+                                       unsigned long head_offset)
+{
+       void *ptr;
+       struct callback_head *head;
+
+       ptr = READ_ONCE(*p);
+
+       if (ptr == NULL)
+               return NULL;
+
+       head = (struct callback_head *)(ptr + head_offset);
+
+       WRITE_ONCE(*hzp, head);
+       smp_mb();
+
+       ptr = READ_ONCE(*p); // read again
+
+       if (ptr + head_offset != head) { // pointer changed
+               WRITE_ONCE(*hzp, NULL);  // reset hazard pointer
+               return NULL;
+       } else
+               return ptr;
+}


I got nerdsniped by the Plumbers talk. So, about that smp_mb()...

I think you should be able to avoid the smp_mb() using relaxed atomics
(on architectures that have those), at the cost of something like a
cmpxchg-acquire sandwiched between a load-acquire and a relaxed load?
I'm not sure how their cost compares to an smp_mb() though.

We have done a similar scheme before, and on some architectures (notx86) the RMW is slightly cheaper than the mb.

Your reasoning is a bit simplified because it seems to assume a strongerconcept of ordering than LKMM has, but even with LKMM's ordering yourcode seems fine.

I feel it can even be simplified a little, the hazard bit does not seemnecessary.

Assuming atomic operations for everything racy, relaxed unless statedotherwise:


(R)eader:

old = read p // I don't think this needs acq, because of addressdependencies (*)

   haz ||=_acq old
   if (read p != old) retry;
   *old

(W)riter:

   p =_??? ... // assuming we don't set it to null this needs a rel
   --- mb ---
   haz ||= 0
   while (read_acq haz == old) retry;
   delete old

In order to get a use-after-free, both of the R:read p need to readbefore W:p = ... , so because of the W:mb, they execute before W:haz||=0 .

Also, for the use-after-free, W:read_acq haz needs to read before (thewrite part of) R:haz||=_acq old .

Then the W:haz ||= 0 also needs to read before (the write part of)R:haz||=_acq old because of coherence on the same location.

Since both of them are atomic RMW, the W:haz||= 0 also needs to writebefore (the write part of) R:haz||=_acq old , and in the absence ofnon-RMW writes (so assuming no thread will just store to the hazardpointer), this implies that the latter reads-from an rmw-sequence-laterstore than W:haz||=0 , which therefore executes before R:haz||=_acq old .

But because of the acquire barrier, R:haz||=_acq old executes beforethe second R:read p .



This gives a cycle
   (2nd)R:read p  ->xb  W:haz||=0  ->xb  R:haz||=_acq  ->xb  (2nd)R:read p

and therefore is forbidden.

Note that in this argument, the two R:read p are not necessarilyreading from the same store. Because of ABA, it can happen that theyread from distinct stores and see the same value.

It does require clearing hazard pointers with an RMW like atomic_and(0)(**). The combination of the two RMW (for setting & clearing the hazardpointer) might in total be slower again than an mb.

(I took the liberty to add an acquire barrier in the writer's while loopfor the case where we read from the (not shown) release store of thereader that clears the hazard pointer. It's arguable whether that acq isneeded since one could argue that a delete is a kind of store, in whichcase control dependency would handle it.)


have fun,
  jonas


(*

you talk about sandwiching, and the data dependency does guaranteesome weaker form of sandwiching than the acq, but I don't thinksandwiching is required at all. If you happened to be able to set thehazard pointer before reading the pointer, that would just extend theprotected area, wouldn't it?

)

(**

I think if you clear the pointer with a store, the hazard bit does nothelp. You could be overwriting the hazard bit, and the RMWs that set thehazard bit might never propagate to your CPU.

Also in your code you probably meant to clear the whole hazard pointerin the retry code of the reader, not just the hazard bit.)

Re: [RFC PATCH 1/4] hazptr: Add initial implementation of hazard pointers

Reply via email to