Hi,

On 2026-01-24 19:54:36 -0500, Tom Lane wrote:
> Andres Freund <[email protected]> writes:
> > On 2026-01-24 16:11:47 -0500, Tom Lane wrote:
> >> In the case at hand I think it is probably driven by two recursion levels
> >> trying to acquire free space out of the same buffer. SPGist is expecting
> >> the lower level to fail to get the lock and then go find some free space
> >> elsewhere. Yeah, we could probably re-code it to get that outcome in
> >> another way, but why?
>
> > Regardless of the assertion, it still feels like there may be something off
> > here. Why is the page marked as empty in the FSM, despite actually not being
> > empty?
>
> Who said anything about its being empty?
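To recap the pattern in question, it looks roughly like the sketch below -
a paraphrase of the SpGistNewBuffer() / GetFreeIndexPage() path, not the
actual code (the helper name is made up, and the real code loops over
multiple FSM candidates and re-verifies the page contents):

#include "postgres.h"

#include "storage/bufmgr.h"
#include "storage/indexfsm.h"
#include "utils/rel.h"

/*
 * Sketch only: try a page the FSM considers free, but take its content
 * lock conditionally; if the lock can't be had - e.g. because an upper
 * recursion level of this same backend already holds it - give up on that
 * page and extend the relation instead.
 */
static Buffer
get_free_index_buffer(Relation index)
{
	BlockNumber blkno = GetFreeIndexPage(index);

	if (blkno != InvalidBlockNumber)
	{
		Buffer		buffer = ReadBuffer(index, blkno);

		/* don't wait for whoever holds the lock, possibly ourselves */
		if (ConditionalLockBuffer(buffer))
			return buffer;		/* caller re-checks that the page is usable */

		ReleaseBuffer(buffer);
	}

	/* no usable free page known to the FSM, extend the relation */
	return ExtendBufferedRel(BMR_REL(index), MAIN_FORKNUM, NULL,
							 EB_LOCK_FIRST);
}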
In the crashes due to Alexander's repro that I have looked at, the page is
returned by SpGistNewBuffer()->GetFreeIndexPage(), which afaict should only
return empty pages. It's easy to see how that could happen with concurrency,
but there's none in the test.

I was just trying to repro this again while writing this message, and
interestingly I got the same issue in nbtree this time. Which a) confirms
Peter's statement that the "conditionally locking a buffer we already
locked" issue exists for nbtree and b) makes me suspect something odd is
happening around indexfsm.

Anyway, independent of that, the behavior clearly needs to be allowed.
Here's a proposed patch.

At first I was thinking of just removing the assertion without anything else
in place - but I think that's not quite right: We could e.g. be trying to
acquire a share or share-exclusive lock while holding a share lock (or the
reverse), but we currently don't keep track of two different lock modes for
the same lock. Therefore it seems safer to define it so that acquiring a
conditional lock on a buffer that is already locked by us will always fail,
regardless of what lock mode we already hold. I think all current callers
are good with that.

Does that sound reasonable?

We could add support for locking the same buffer multiple times, but I don't
think it'd be worth the complexity and the (small) overhead that would bring
with it? It also seems like allowing that would make it more likely for a
backend to trample over its own state higher up in the call tree.

Greetings,

Andres Freund
>From b90f7fbe9958ea785a1302395bb1a201a00f3bc5 Mon Sep 17 00:00:00 2001
From: Andres Freund <[email protected]>
Date: Thu, 29 Jan 2026 12:22:02 -0500
Subject: [PATCH v1] bufmgr: Allow conditionally locking an already locked
 buffer

Discussion: https://postgr.es/m/[email protected]
---
 src/backend/storage/buffer/bufmgr.c | 20 +++++++++++++++-----
 1 file changed, 15 insertions(+), 5 deletions(-)

diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index 6f935648ae9..f5602f4e7e1 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -5895,6 +5895,13 @@ BufferLockUnlock(Buffer buffer, BufferDesc *buf_hdr)
 
 /*
  * Acquire the content lock for the buffer, but only if we don't have to wait.
+ *
+ * It is allowed to try to conditionally acquire a lock on a buffer that this
+ * backend has already locked, but the lock acquisition will always fail, even
+ * if the new lock acquisition does not conflict with an already held lock
+ * (e.g. two share locks). This is because we don't track per-backend
+ * ownership of multiple lock levels. That is ok for the current uses of
+ * BufferLockConditional().
  */
 static bool
 BufferLockConditional(Buffer buffer, BufferDesc *buf_hdr, BufferLockMode mode)
@@ -5902,11 +5909,14 @@ BufferLockConditional(Buffer buffer, BufferDesc *buf_hdr, BufferLockMode mode)
 	PrivateRefCountEntry *entry = GetPrivateRefCountEntry(buffer, true);
 	bool		mustwait;
 
-	/*
-	 * We better not already hold a lock on the buffer.
-	 */
-	Assert(entry->data.lockmode == BUFFER_LOCK_UNLOCK);
-
+	/*
+	 * As described above, if we're trying to lock a buffer this backend
+	 * already has locked, return false, independent of the existing and
+	 * desired lock level.
+	 */
+	if (entry->data.lockmode != BUFFER_LOCK_UNLOCK)
+		return false;
+
 	/*
 	 * Lock out cancel/die interrupts until we exit the code section protected
 	 * by the content lock. This ensures that interrupts will not interfere
-- 
2.48.1.76.g4e746b1a31.dirty
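To spell out what that means for callers, here's an illustrative snippet
(not part of the patch; the function is made up, and it goes through the
existing public wrappers rather than BufferLockConditional() itself):

#include "postgres.h"

#include "storage/bufmgr.h"

/*
 * Sketch only: with the patch, a conditional lock attempt on a buffer this
 * backend already has locked is defined to return false; previously it
 * tripped the Assert(entry->data.lockmode == BUFFER_LOCK_UNLOCK) in
 * assert-enabled builds. Assumes 'buf' is already pinned by the caller.
 */
static void
self_conditional_lock_example(Buffer buf)
{
	LockBuffer(buf, BUFFER_LOCK_SHARE);

	if (!ConditionalLockBuffer(buf))
	{
		/*
		 * Expected with the patch: we only remember one lock mode per
		 * buffer and backend, so any re-acquisition attempt by the same
		 * backend fails and the caller takes its fallback path (e.g. go
		 * find free space elsewhere).
		 */
	}

	LockBuffer(buf, BUFFER_LOCK_UNLOCK);
}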
