Re: Buffer locking is special (hints, checksums, AIO writes)

Andres Freund Mon, 12 Jan 2026 15:22:58 -0800

Hi,

On 2026-01-12 17:27:30 -0500, Melanie Plageman wrote:
> On Mon, Jan 12, 2026 at 12:45 PM Andres Freund <[email protected]> wrote:
> >
> > Also working on doing comment polishing of the later patches, found a few
> > things, but not quite enough to be worth reposting yet.
>
> I looked at 0004 and 0005 and re-looked at 0006 - 0007.
>
> For 0004, I think you should clarify the commit message a bit. I had
> trouble understanding when WAKE_IN_PROGRESS is set. So, before
> RELEASE_OK was set all the time except when a process woke up and
> hadn't run yet. Now, WAKE_IN_PROGRESS is only set when a process is
> woken up but hasn't run yet. Personally, I just needed a bit more
> specificity (maybe even a bit more formality and grammatically
> correctness) from the commit message to get it.


Is this better?
    lwlock: Invert meaning of LW_FLAG_RELEASE_OK

    Previously, a flag was set to indicate that a lock release should wake up
    waiters. Since waking waiters is the default behavior in the majority of
    cases, this logic has been inverted. The new LW_FLAG_WAKE_IN_PROGRESS flag 
is
    now set iff wakeups are explicitly inhibited.

    The motivation for this change is that in an upcoming commit, content locks
    will be implemented independently of lwlocks, with the lock state stored as
    part of BufferDesc.state. As all of a buffer's flags are cleared when the
    buffer is invalidated, without this change we would have to re-add the
    RELEASE_OK flag after clearing the flags; otherwise, the next lock release
    would not wake waiters.

    It seems good to keep the implementation of lwlocks and buffer content locks
    as similar as reasonably possible.

    Discussion: 
https://postgr.es/m/4csodkvvfbfloxxjlkgsnl2lgfv2mtzdl7phqzd4jxjadxm4o5@usw7feyb5bzf



> I agree that separating 0005 is helpful.

Kewl.


> 0007 looks basically fine to me. I'd comb through it with an AI tool
> to catch a few nits I saw like an outdated reference to RELEASE_OK and
> a missing word in the commit message, etc.

Found a few that way and with some manual searching.


> Otherwise, I mostly looked to see if the wakeup semantics seemed right
> and if anything jumped out at me while skimming (i.e. I didn't go
> through every line with a fine-toothed comb).
>
> The two things I came up with were:
>
> I wondered why this was needed (i.e. why it wasn't needed before)

> @@ -6688,7 +7428,25 @@ ResOwnerReleaseBufferPin(Datum res)
>      if (BufferIsLocal(buffer))
>          UnpinLocalBufferNoOwner(buffer);
>      else
> +    {
> +        PrivateRefCountEntry *ref;
> +
> +        ref = GetPrivateRefCountEntry(buffer, false);
> +
> +        /*
> +         * If the buffer was locked at the time of the resowner release,
> +         * release the lock now. This should only happen after errors.
> +         */
> +        if (ref->data.lockmode != BUFFER_LOCK_UNLOCK)
> +        {
> +            BufferDesc *buf = GetBufferDescriptor(buffer - 1);
> +
> +            HOLD_INTERRUPTS();    /* match the upcoming RESUME_INTERRUPTS */
> +            BufferLockUnlock(buffer, buf);
> +        }
> +
>          UnpinBufferNoOwner(GetBufferDescriptor(buffer - 1));
> +    }
>  }

It's needed because previously content locks were released as part of the
LWLockReleaseAll() that are sprinkled across various error recovery paths. Now
that content locks aren't implemented via lwlocks anymore, something new is 
needed.


>  is it related to your comment in the commit message

> 2) Error recovery for content locks is implemented as part of the
> already existing private-refcount tracking mechanism in combination
> with resowners?

Yes.


> As for your FIXMEs,
> +    /*
> +     * FIXME: This is reusing the lwlock fields. That's not a correctness
> +     * issue, a backend can't wait for both an lwlock and a buffer content
> +     * lock at the same time. However, it seems pretty ugly, particularly
> +     * given that the field names have an lw* prefix. But duplicating the
> +     * fields also seems somewhat superfluous.
> +     */
>
> personally I can live with reusing the lwlock fields now that it's
> fairly well documented.

Cool. That's the conclusion I also came to. So unless somebody pipes up soon,
I'll remove the FIXME from the commit message and code.


> +    /* XXX: combine with fetch_and above? */
> +    UnlockBufHdr(buf_hdr);
>
> Are you thinking about adding a helper that stops waiting and unlocks?

I'm not sure what you mean by that? Just whether I plan to implement the
FIXME?



> > Perhaps move the locking code into a buffer_locking.h or such? Needs to be 
> > inline functions for efficiency unfortunately.
>
> So you mean put all of the static buffer locking functions you added
> to bufmgr.c inline into a header file?

Yes, that's what I was wondering about.


> bufmgr.c is super long anyway, so it's not like making it separate
> makes the file manageable. On the other hand, it's probably better to
> not keep making it worse.

Yea. OTOH I don't know if a header that's just included by one file is really
an improvement :/


> For example, I find it really annoying that the helper function prototypes
> for res owner and ref count related functions are grouped before their
> implementations and then below that there is another seemingly arbitrary
> group of prototypes and then their implementations. Like, what is the logic
> there?

I agree it's pretty awful this way. I don't know how the hell that happened,
despite probably being the party to blame (4b4b680c3d6d). Nobody in that
thread commented upon it, it was that way starting in the first version. Odd.
I guess I should propose fixing that :/

Greetings,

Andres Freund

Re: Buffer locking is special (hints, checksums, AIO writes)

Reply via email to