Hi,

On 2026-01-29 13:33:02 -0500, Peter Geoghegan wrote:
> On Thu, Jan 29, 2026 at 1:06 PM Andres Freund <[email protected]> wrote:
> > Wonder if - independent of this
> > issue - it could make sense to update the FSM during nbtree WAL recovery...
> 
> Maybe that would make sense. But I tend to think that we should have a
> fully atomic, crash-safe approach to free space management.

I agree that would be nice, but realistically (as you also say below) that
would have to be embedded into the WAL records that use the page that was
acquired from the FSM.  Maybe we could accept a dedicated WAL record for the
index case, but certainly not in the heap case.

Given that we'd need to embed the record somehow anyway, just adding, for now,
a RecordUsedIndexPage() to the redo of XLOG_BTREE_SPLIT* and
XLOG_BTREE_NEWROOT or such could make sense...
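Roughly what I have in mind (just a sketch, not compilable as-is - the
rlocator/rightblk names and the exact place in btree_xlog_split() are made
up for illustration; the real code would need the block info from the
record):

```c
static void
btree_xlog_split(bool newitemonleft, XLogReaderState *record)
{
	...

	/*
	 * After replaying the split into the newly acquired right sibling,
	 * tell the FSM the page is in use, so a promoted standby's FSM
	 * doesn't keep advertising it as free.  Recording 0 bytes of free
	 * space effectively removes it from the index FSM, similar to what
	 * RecordUsedIndexPage() does outside of recovery.
	 */
	XLogRecordPageWithFreeSpace(rlocator, rightblk, 0);
}
```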

It doesn't seem great to have a completely outdated index FSM after a
failover. If the FSM on the newly promoted node was copied at a much earlier
time, when there still were a lot of free pages, _bt_allocbuf() could take
quite a while...

I'm somewhat surprised it doesn't cause more performance issues to keep btree
pages exclusively locked while extending the relation, particularly if the
extension has to write out pages and flush the WAL...


> Particularly in index AMs, where free space can only ever come in
> BLCKSZ units -- the data structure/concurrency rules can be a lot
> simpler if it only has to accommodate index AM requirements. Maybe the
> WAL-logging could be built into existing index AM record types.

Yea, I have my doubts that it makes sense to share code between the index and
heap use cases. I doubt that having one FSM implementation support a variable
amount of "space tracking granularity" really makes sense.

Greetings,

Andres Freund