Re: Correcting freeze conflict horizon calculation

Peter Geoghegan Mon, 09 Mar 2026 13:15:40 -0700

On Wed, Mar 4, 2026 at 11:02 AM Melanie Plageman
<[email protected]> wrote:
> > The important principle here is that we don't need a recovery conflict
> > to handle cleanup after an aborted update/delete, regardless of the
> > details. This is a logical consequence of the fact that an aborted
> > transaction "never existed in the logical database".
>
> > > Or are there other ways you can have an
> > > xmax older than OldestXmin?
> >
> > Again, are you talking about xmin or xmax? It's normal for
> > heap_prepare_freeze_tuple to see an xmax older than OldestXmin, last I
> > checked.
>
> Based on the following code in heap_prepare_freeze_tuple(), a normal
> xmax that is older than OldestXmin is assumed to be an aborted
> transaction -- which, as you say, does not need to affect recovery
> conflict at all.


I'm not sure that you need to add any new comments above
FreezeMultiXactId. The underlying principles that justify ignoring
xmax when it is a multi are exactly the same as those that apply when
xmax is a normal XID. I think that what you actually need is a single
comment block (maybe 2) near the start of or above
heap_prepare_freeze_tuple explaining your new snapshotConflictHorizon
maintenance code, mentioning:

* You always need to do such maintenance with an xmin < OldestXmin,
since it will always be frozen by the resulting freeze plan. This
relies on the existing assumption that heap_prepare_freeze_tuple can
never be passed a heap tuple created by an aborted transaction. (This
is obviously not true of an xmin >= OldestXmin, since those are not
eligible to be frozen.)

* You never need any specific snapshotConflictHorizon maintenance step
with *any* xmax, because:

1. If it is from an updater, it must have been from an updater that
aborted (otherwise, pruning would have removed the tuple, and
heap_prepare_freeze_tuple would never have seen it in the first place)
or from an updater that is still considered running (in which case we
shouldn't freeze xmax at all).

2. If it is from a locker, we don't need to consider queries running
on standbys at all, because they don't care about row-level locks held
on the primary. Such locks cannot affect tuple visibility on the
standby. (heap_prepare_freeze_tuple has to be careful not to remove
lockers that are still needed on the primary, of course, but that's
out of scope here.)
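
To make the two bullet points above concrete, here is a minimal
standalone sketch (not the actual heapam.c code). The TupleXids struct,
xid_precedes, and advance_conflict_horizon are hypothetical names
invented for illustration. It assumes normal XIDs only and that xmin is
known to have committed:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t Xid;               /* stand-in for TransactionId */
#define INVALID_XID ((Xid) 0)

/* Hypothetical, simplified view of one tuple seen by the freeze code. */
typedef struct TupleXids
{
    Xid  xmin;                /* inserter; assumed to have committed */
    Xid  xmax;                /* INVALID_XID if there is none */
    bool xmax_is_lock_only;   /* xmax is just a row locker */
} TupleXids;

/* Circular comparison in the style of TransactionIdPrecedes (normal XIDs). */
static bool
xid_precedes(Xid a, Xid b)
{
    return (int32_t) (a - b) < 0;
}

/*
 * Advance *conflict_horizon for one tuple. Only an xmin older than
 * oldest_xmin can move the horizon, since that xmin will be frozen.
 * xmax is deliberately ignored: it is an aborted updater, an updater
 * still considered running (not frozen), or a locker that standbys
 * do not care about.
 */
static void
advance_conflict_horizon(const TupleXids *tup, Xid oldest_xmin,
                         Xid *conflict_horizon)
{
    assert(tup->xmin != INVALID_XID);

    if (xid_precedes(tup->xmin, oldest_xmin) &&
        (*conflict_horizon == INVALID_XID ||
         xid_precedes(*conflict_horizon, tup->xmin)))
        *conflict_horizon = tup->xmin;
}
```

Note that xmax and xmax_is_lock_only are never read by
advance_conflict_horizon in this sketch; that omission is the point.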

You should perhaps note in passing that *both* points remain true even
if xmax is a multi. With a multi, you remove a subset (possibly all)
of the lockers and/or a single updater. It's easy to see why point 2
is still true. It's a bit harder to see why point 1 remains true,
because the code in FreezeMultiXactId looks quite different from the
ordinary XID handling in its heap_prepare_freeze_tuple caller. Here's
why point 1 still holds:

If you remove the single updater, it must be an aborted updater (for
the usual reason). If you keep the updater, then it must be an updater
that is >= OldestXmin/an updater that's still considered to be running
on the primary. (Actually, FreezeMultiXactId is a bit more precise
about the definition of "XID still running" when deciding whether to
remove or keep an XID -- it *can* remove an aborted updater >=
OldestXmin, at the cost of checking if the XID is still running
directly, though at a high level it's effectively the same condition.)
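
The per-member keep/remove decision described above could be sketched
roughly as follows. This is a simplified standalone illustration, not
the real FreezeMultiXactId: MultiMember, XactState, keep_member, and
filter_members are hypothetical names, and the transaction state is
passed in directly rather than looked up. It assumes normal XIDs only:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t Xid;               /* stand-in for TransactionId */

typedef enum { MEMBER_LOCKER, MEMBER_UPDATER } MemberKind;
typedef enum { XACT_IN_PROGRESS, XACT_COMMITTED, XACT_ABORTED } XactState;

/* Hypothetical, simplified multixact member. */
typedef struct MultiMember
{
    Xid        xid;
    MemberKind kind;
    XactState  state;   /* assumed to be already known to the caller */
} MultiMember;

/* Circular comparison in the style of TransactionIdPrecedes (normal XIDs). */
static bool
xid_precedes(Xid a, Xid b)
{
    return (int32_t) (a - b) < 0;
}

/* Decide whether one member must survive in the replacement xmax. */
static bool
keep_member(const MultiMember *m, Xid oldest_xmin)
{
    if (m->state == XACT_IN_PROGRESS)
        return true;            /* still needed on the primary */
    if (m->kind == MEMBER_LOCKER)
        return false;           /* finished locker: lock no longer matters */
    if (m->state == XACT_ABORTED)
        return false;           /* aborted updater "never existed" */

    /*
     * Committed updater: pruning would already have removed the tuple
     * if this XID were older than oldest_xmin.
     */
    assert(!xid_precedes(m->xid, oldest_xmin));
    return true;
}

/* Compact members[] in place, returning how many members were kept. */
static int
filter_members(MultiMember *members, int n, Xid oldest_xmin)
{
    int nkept = 0;

    for (int i = 0; i < n; i++)
    {
        if (keep_member(&members[i], oldest_xmin))
            members[nkept++] = members[i];
    }
    return nkept;
}
```

None of the branches touches a conflict horizon: every removed member
is either a locker or an aborted updater.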

This is all pretty closely related to how pruning works in general,
and how pruning generates snapshotConflictHorizon values in
particular. I wouldn't shy away from making that connection.

Maybe mention that we don't "freeze" an aborted xmax >= OldestXmin
when it happens to be an ordinary XID, though we really should, if only
to be as consistent as possible with what pruning *and*
FreezeMultiXactId already do. It would arguably be easier to
understand all this if heap_prepare_freeze_tuple removed every aborted
ordinary XID xmax uniformly (regardless of whether it came before
or after OldestXmin), but it doesn't work that way right now --
heap_prepare_freeze_tuple is the odd one out (kind of, it's also true
that FreezeMultiXactId is lazy about removing aborted updater XIDs >=
OldestXmin for performance reasons).

I'm trying not to be too prescriptive here; I just think that
emphasizing high-level logical database concepts over physical
database implementation details makes sense. I don't expect you to
follow what I've written here all too closely. I didn't have the time
to distill it down myself.

> > Don't forget about plain XIDs that end up as xmax due to a SELECT FOR
> > UPDATE. They usually don't result from aborted transactions.
>
> I assume that in the SELECT FOR UPDATE case, HEAP_XMAX_IS_LOCKED_ONLY
> would return true -- so this is a case where lockers don't affect the
> horizon (even though it is a normal xid and not a multi).

Right.

> I am trying to determine if I need to advance FreezePageConflictXid in
> the above case when freeze_xmax is true. So far, if the only xmaxes
> older than OldestXmin are from aborted update/deletes or SELECT FOR
> UPDATE, then it seems like I wouldn't need to advance the horizon when
> freeze_xmax is true.

Right. (Though of course it is *always* correct to remove an XID left
behind by an aborted xact, no matter how old or new that XID is.)

-- 
Peter Geoghegan
