Re: Revisiting {CREATE INDEX, REINDEX} CONCURRENTLY improvements

Matthias van de Meent Thu, 04 Dec 2025 08:33:17 -0800

On Thu, 4 Dec 2025 at 09:34, Antonin Houska <[email protected]> wrote:
>
> Matthias van de Meent <[email protected]> wrote:
>
> > On Mon, 1 Dec 2025 at 10:09, Antonin Houska <[email protected]> wrote:
> > >
> > > Matthias van de Meent <[email protected]> wrote:
> > >
> > > > I'm a bit worried, though, that LR may lose updates due to commit
> > > > order differences between WAL and PGPROC. I don't know how that's
> > > > handled in logical decoding, and can't find much literature about it
> > > > in the repo either.
> > >
> > > Can you please give me an example of this problem? I understand that two
> > > transactions do this
> > >
> > > T1: RecordTransactionCommit()
> > > T2: RecordTransactionCommit()
> > > T2: ProcArrayEndTransaction()
> > > T1: ProcArrayEndTransaction()
> > >
> > > but I'm failing to imagine this if both transactions are trying to update
> > > the same row.
> >
> > Correct, it doesn't have anything to do with two transactions updating
> > the same row; but instead the same transaction getting applied twice;
> > related to issues described in (among others) [0]:
> > Logical replication applies transactions in WAL commit order, but
> > (normal) snapshots on the primary use the transaction's persistence
> > requirements (and procarray lock acquisition) as commit order.
> >
> > This can cause the snapshot to see T2 as committed before T1, whilst
> > logical replication will apply transactions in T1 -> T2 order. This
> > can break the exactly-once expectations of commits, because a normal
> > snapshot taken between T2 and T1 on the primary (i.e., T2 is
> > considered committed, but T1 not) will have T2 already applied. LR
> > would have to apply changes of T1, which also implies it'd eventually
> > get to T2's commit and apply that too. Alternatively, it'd skip past
> > T2 because that's already present in the snapshot, and lose the
> > changes that were committed with T1.
>
> ISTM that what you consider a problem is copying the table using PGPROC-based
> snapshot and applying logically decoded commits to the result - is that what
> you mean?


Correct.
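
To make the scenario concrete, here is a toy C model of it. It is only a sketch under stated assumptions: none of the names below (ToySnapshot, copy_then_apply, and so on) exist in PostgreSQL, and the "WAL" is just a two-entry array. It models an initial copy taken with a PGPROC-style snapshot that sees T2 but not T1, followed by applying decoded commits in WAL order from some start position.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: none of these names are PostgreSQL APIs. */
enum { T1 = 0, T2 = 1, NXACTS = 2 };

/* WAL commit order: T1's commit record precedes T2's. */
static const int wal_commit_order[NXACTS] = { T1, T2 };

/* Hypothetical PGPROC-style snapshot: which xacts it considers committed. */
typedef struct ToySnapshot
{
    bool visible[NXACTS];
} ToySnapshot;

/* Snapshot taken after T2 left the procarray, but before T1 did. */
static ToySnapshot
snapshot_between_t2_and_t1(void)
{
    ToySnapshot snap = { { false, true } };   /* T1 invisible, T2 visible */
    return snap;
}

/*
 * Copy the table using 'snap', then apply every decoded commit from WAL
 * position 'start_pos' onwards.  applied[x] ends up as the number of
 * times xact x's changes landed in the result.
 */
static void
copy_then_apply(const ToySnapshot *snap, int start_pos, int applied[NXACTS])
{
    assert(start_pos >= 0 && start_pos <= NXACTS);

    /* Initial copy: everything the snapshot sees is already present. */
    for (int i = 0; i < NXACTS; i++)
        applied[i] = snap->visible[i] ? 1 : 0;

    /* Catch-up: apply commits strictly in WAL order. */
    for (int pos = start_pos; pos < NXACTS; pos++)
        applied[wal_commit_order[pos]]++;
}

/* True only if every xact's changes are present exactly once. */
static bool
exactly_once(const int applied[NXACTS])
{
    for (int i = 0; i < NXACTS; i++)
        if (applied[i] != 1)
            return false;
    return true;
}
```

In this model, starting the apply at position 0 (T1's commit) leaves T2 applied twice, while starting at position 2 (past T2) leaves T1 missing; no single start position gives exactly-once results. A snapshot built from WAL replay would avoid this by construction, since its notion of "committed" follows the same order as the apply.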

> In fact, LR (and also REPACK) uses snapshots generated by the logical decoding
> system. The information on running/committed transactions is based here on
> replaying WAL, not on PGPROC.

OK, that's good to know. For reference, do you know where this is
documented, explained, or implemented?

I'm asking because the code that I could find didn't seem to use any
special snapshot (tablesync.c uses
`PushActiveSnapshot(GetTransactionSnapshot())`), and the other
reference to LR's snapshots (snapbuild.c, and inside
`GetTransactionSnapshot()`) explicitly said that its snapshots are
only to be used for catalog lookups, never for general-purpose
queries.


Kind regards,

Matthias van de Meent
Databricks (https://www.databricks.com)
