chenBright opened a new pull request, #3286:
URL: https://github.com/apache/brpc/pull/3286
### What problem does this PR solve?
Issue Number: resolve #3051
Problem Summary:
The current bthread rwlock is implemented with a Go-like state machine in
which the reader count is bumped up front, on the optimistic assumption that
every later step succeeds. As pointed out in #3051, this state is **not
safely reversible on a partial failure**:
* When a read-lock attempt times out while a writer holds the lock, the
pre-incremented reader counter is left dangling. Subsequent writers can
never acquire the lock again, even after every legitimate reader has left.
* For the same structural reason, bthread rwlock cannot offer correct
`try_*` / `timed_*` APIs at all -- any cancellable wait risks corrupting
the lock's accounting.
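The failure mode above can be sketched in a few lines. This is illustrative code, not the brpc source: `std::atomic` stands in for the butex, and the names `OldStyleSketch` / `old_timedrdlock_sketch` are invented for the example.

```cpp
#include <atomic>
#include <cstdint>

// Sketch of the failure mode described above (illustrative, not brpc code):
// the old design bumps the reader count before waiting, so a timed-out wait
// returns without undoing the increment.
struct OldStyleSketch {
    std::atomic<int32_t> readers{0};
    bool writer_held = true;  // assume a writer currently owns the lock
};

// Hypothetical timed read-lock in the old, non-reversible style.
bool old_timedrdlock_sketch(OldStyleSketch& s) {
    s.readers.fetch_add(1);   // optimistic pre-increment
    if (s.writer_held) {
        // The wait times out and we bail out WITHOUT rolling the count
        // back -- the dangling reader blocks all future writers.
        return false;
    }
    return true;
}
```

After a timed-out call, `readers` is still 1 even though no reader holds the lock, which is exactly the dangling counter the PR removes.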
PR #1031 already proposed a writer-priority rwlock that meets this
requirement; this PR builds on that proposal.
### What is changed and the side effects?
Changed:
**Reimplement `bthread_rwlock_t` on top of a writer-priority algorithm
derived from PR #1031.** The lock state is split into:
* `lock_word` (butex): bit 31 = writer held, bits 0..30 = reader count;
* `writer_wait_count` (butex): in-flight writer counter used by readers
to honor writer-priority;
* `writer_queue_mutex`: serializes writers competing for `lock_word`.
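The three-part state can be sketched as follows. This is illustrative only: `std::atomic` stands in for brpc's butex, the commented-out mutex stands in for the bthread mutex, and the helper names are invented; the real layout lives in the bthread sources.

```cpp
#include <atomic>
#include <cstdint>

// Illustrative sketch of the split lock state (not the brpc source).
constexpr uint32_t kWriterBit  = 1u << 31;       // bit 31: writer holds the lock
constexpr uint32_t kReaderMask = kWriterBit - 1; // bits 0..30: reader count

struct RWLockStateSketch {
    std::atomic<uint32_t> lock_word{0};          // a butex in the real lock
    std::atomic<uint32_t> writer_wait_count{0};  // a butex in the real lock
    // bthread_mutex_t writer_queue_mutex;       // serializes competing writers
};

uint32_t reader_count(const RWLockStateSketch& s) {
    return s.lock_word.load() & kReaderMask;
}

bool writer_held(const RWLockStateSketch& s) {
    return (s.lock_word.load() & kWriterBit) != 0;
}
```

Packing the writer bit and the reader count into one word lets a single atomic operation observe (and wait on) both facts at once.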
Every wait/park step is fully reversible, so the lock now correctly
supports `try_*` / `timed_*` operations:
* `bthread_rwlock_tryrdlock` / `bthread_rwlock_timedrdlock`
* `bthread_rwlock_trywrlock` / `bthread_rwlock_timedwrlock`
* `bthread_rwlock_unlock` (auto-dispatches to read/write unlock by
inspecting `lock_word`).
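Since the new bthread calls mirror the POSIX rwlock surface, their try/timed semantics can be previewed with plain pthreads (assuming `bthread_rwlock_tryrdlock` behaves like `pthread_rwlock_tryrdlock`, and so on):

```cpp
#include <pthread.h>
#include <cerrno>

// Preview of the try-lock semantics using POSIX rwlocks: a trywrlock
// attempted while a reader holds the lock fails immediately with EBUSY
// instead of blocking -- the behavior bthread_rwlock_trywrlock now offers.
int try_write_while_read_held() {
    pthread_rwlock_t rw;
    pthread_rwlock_init(&rw, nullptr);
    pthread_rwlock_rdlock(&rw);
    int rc = pthread_rwlock_trywrlock(&rw); // EBUSY: a reader is inside
    pthread_rwlock_unlock(&rw);
    pthread_rwlock_destroy(&rw);
    return rc;
}
```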
On any failure path (`EBUSY`, `ETIMEDOUT`, or an `EINTR` that turns into a
failure), the side effects are rolled back via `rwlock_wrlock_cleanup`:
release the inner mutex if held, decrement `writer_wait_count`, and wake
parked readers if we were the last in-flight writer.
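The three rollback steps can be sketched like this. Again illustrative, not the brpc source: `std::mutex` stands in for `writer_queue_mutex`, and `wake_all_readers()` stands in for `butex_wake_all(writer_wait_count)`.

```cpp
#include <atomic>
#include <cstdint>
#include <mutex>

// Hand-written sketch of the rollback steps named above (not brpc code).
struct CleanupSketch {
    std::atomic<uint32_t> writer_wait_count{0};
    std::mutex writer_queue_mutex;
    bool readers_woken = false;
    void wake_all_readers() { readers_woken = true; }
};

void wrlock_cleanup_sketch(CleanupSketch& s, bool queue_mutex_held) {
    if (queue_mutex_held) {
        s.writer_queue_mutex.unlock();  // 1) release the inner mutex if held
    }
    // 2) undo our in-flight writer accounting; fetch_sub returns the old
    //    value, so a result of 1 means we were the last in-flight writer,
    // 3) in which case readers parked to honor writer-priority are woken.
    if (s.writer_wait_count.fetch_sub(1) == 1) {
        s.wake_all_readers();
    }
}
```

Because every step a failing writer took is undone here, a timeout leaves the lock in exactly the state it would have had if the writer had never arrived.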
**Replace the unconditional broadcast in unlock with selective wakeup**:
* `unrdlock`: only the last reader (`lock_word` 1 -> 0) does
`butex_wake(lock_word)` for the at-most-one writer parked on it
(writers are serialized by `writer_queue_mutex`, so a single wake
suffices). Earlier readers do not wake anyone.
* `unwrlock`: deliberately unlocks `writer_queue_mutex` *first* (so the
next queued writer -- which has already self-accounted in
`writer_wait_count` -- gets a chance), and only does
`butex_wake_all(writer_wait_count)` when *we* were the last in-flight
writer (`fetch_sub` returned 1). The intentional ordering preserves
strict writer-priority and is documented in detail in the source.
This eliminates the previous "wake everybody, let them re-contend" pattern
for both readers and writers, removing a wakeup stampede (thundering herd)
on contended workloads.
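Both selective-wakeup rules hinge on the value returned by the decrement. A sketch (illustrative, not the brpc source; the `wake_*` hooks stand in for `butex_wake(lock_word)` and `butex_wake_all(writer_wait_count)`, and the unwrlock sketch omits the preceding `writer_queue_mutex` unlock):

```cpp
#include <atomic>
#include <cstdint>

// Sketch of the two selective-wakeup rules (not the brpc source).
struct WakeupSketch {
    std::atomic<uint32_t> lock_word{0};          // readers currently inside
    std::atomic<uint32_t> writer_wait_count{0};  // in-flight writers
    int writer_wakes = 0;
    int reader_wakes = 0;
    void wake_one_writer() { ++writer_wakes; }   // stands in for butex_wake
    void wake_all_readers() { ++reader_wakes; }  // stands in for butex_wake_all
};

void unrdlock_sketch(WakeupSketch& s) {
    // fetch_sub returns the previous value: only the reader taking the
    // count 1 -> 0 wakes the at-most-one writer parked on lock_word;
    // earlier readers wake nobody.
    if (s.lock_word.fetch_sub(1) == 1) {
        s.wake_one_writer();
    }
}

void unwrlock_sketch(WakeupSketch& s) {
    // (writer_queue_mutex would be unlocked first; omitted here.)
    // Only the last in-flight writer wakes the readers parked to honor
    // writer-priority.
    if (s.writer_wait_count.fetch_sub(1) == 1) {
        s.wake_all_readers();
    }
}
```

With three readers unlocking, exactly one wake is issued instead of three broadcasts, which is where the stampede savings come from.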
Side effects:
- Performance effects:
- Breaking backward compatibility:
---
### Check List:
- Please make sure your changes are compilable.
- When providing us with a new feature, it is best to add related tests.
- Please follow [Contributor Covenant Code of
Conduct](https://github.com/apache/brpc/blob/master/CODE_OF_CONDUCT.md).
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]