Questions about rwlock(9) memory ordering semantics:

1. Is rw_tryupgrade an acquire operation, so that all prior reader
   critical sections happen-before everything sequenced after it?  Or
   does it only order stores issued by the caller, so that loads in the
   following write section can be speculatively executed in the
   preceding read section?

2. Is rw_downgrade a release operation, so that whatever is sequenced
   before it happens-before any subsequent reader critical section?  Or
   does it only order stores issued by the caller, so that loads in the
   preceding write section can be deferred into the following read
   section?

The man page isn't clear on this.

Here's an example to illustrate the difference.  Consider the following
data structure:

	struct foo {
		krwlock_t lock;
		volatile unsigned nreaders;
		...
	};

Assume every reader increments nreaders on entry, and decrements
nreaders on exit:

	rw_enter(&foo->lock, RW_READER);
	atomic_inc_uint(&foo->nreaders);
	...
	atomic_dec_uint(&foo->nreaders);
	rw_exit(&foo->lock);

Under that premise, a writer always sees nreaders as zero.  So the
following assertions should never fail:

	rw_enter(&foo->lock, RW_WRITER);
	KASSERT(atomic_load_relaxed(&foo->nreaders) == 0);
	...
	KASSERT(atomic_load_relaxed(&foo->nreaders) == 0);
	rw_exit(&foo->lock);

I believe this is guaranteed as rwlock(9) is currently implemented.
(Whether it is _intended_ to be guaranteed, I'm not sure.)

However, very similar assertions may fail, in principle, when using
rw_tryupgrade or rw_downgrade, as rwlock(9) is currently implemented:

	/* upgrade reader to writer */
	rw_enter(&foo->lock, RW_READER);
	atomic_inc_uint(&foo->nreaders);
	...
	atomic_dec_uint(&foo->nreaders);
	if (rw_tryupgrade(&foo->lock)) {
		/* we're a writer now */
		KASSERT(atomic_load_relaxed(&foo->nreaders) == 0);
		...
	}
	rw_exit(&foo->lock);

	/* downgrade writer to reader */
	rw_enter(&foo->lock, RW_WRITER);
	...
	unsigned nreaders = atomic_load_relaxed(&foo->nreaders);
	rw_downgrade(&foo->lock);
	rw_exit(&foo->lock);
	KASSERT(nreaders == 0);

The reason these can fail is that rw_tryupgrade and rw_downgrade only
issue store-before-store barriers (membar_producer), which place no
restrictions on when the CPU can load foo->nreaders relative to
rw_tryupgrade or rw_downgrade.  So, as rwlock(9) is currently
implemented:

- In the upgrade example, the CPU can speculatively load foo->nreaders
  before rw_tryupgrade, and thereby witness a nonzero value from before
  the last reader has decremented it to zero and exited the read
  section (quickly enough for rw_tryupgrade to succeed, of course).

- In the downgrade example, the CPU can defer the load of foo->nreaders
  past rw_downgrade, and thereby witness a nonzero value from one of
  the readers that rw_downgrade allows to begin executing.

Even though the reader here is not technically _just_ a `reader', in
that it issues stores as well as loads, I think these possibilities are
extremely counterintuitive and possibly dangerous, especially for a
relatively high-level API like rwlock(9) that prioritizes ease of use
over maximal parallelism -- merely taking a read lock can lead to
shared memory contention, which is why we also have harder-to-use but
cheaper options like pserialize(9), psref(9), and localcount(9).

I'm open to other opinions; perhaps it is intended that loads in a
writer can bleed into an adjacent read section, and that readers aren't
supposed to issue stores anyway, and perhaps there's a good reason for
all this.  But absent a good reason, I think rw_downgrade should be a
release operation, and rw_tryupgrade should be an acquire operation,
just like rw_exit and rw_enter.  Which means they need to use
membar_release and membar_acquire inside, not membar_producer (and we
need to issue pullups).