On Mon, Jun 22, 2026 at 4:14 AM Xuneng Zhou <[email protected]> wrote:
>
> The direct regression appears to be 5310fac6e0f. It allows this interleaving:
>
> W: LockBufferForCleanup() holds buffer header lock
> W: observes refcount > 1
> P: releases the last competing pin with atomic fetch_sub
> P: old state does not contain BM_PIN_COUNT_WAITER, so no wakeup
> W: publishes BM_PIN_COUNT_WAITER
> W: sleeps in ProcWaitForSignal()
>
> At this point the condition W wanted is already true: refcount is 1,
> meaning only W's own pin remains. So W could sleep indefinitely, as no
> future unpin will wake it.
>
> We can fix this by checking the state returned by UnlockBufHdrExt()
> when publishing BM_PIN_COUNT_WAITER. If the refcount in that state is
> 1, do not enter the wait path. Instead, fall through to the existing
> waiter-bit cleanup and retry the loop to acquire the cleanup lock
> normally. The reproducer test passed after applying the patch.

Thanks for investigating!
Does the reproducer pass prior to 5310fac6e0f?

- Melanie

