Re: speed up a logical replica setup

Noah Misch Wed, 26 Jun 2024 12:21:33 -0700

On Tue, Jun 25, 2024 at 09:50:59PM -0300, Euler Taveira wrote:
> On Tue, Jun 25, 2024, at 3:24 AM, Amit Kapila wrote:
> > On Tue, Jun 25, 2024 at 3:38 AM Noah Misch <n...@leadboat.com> wrote:
> > > On Mon, Jun 24, 2024 at 05:20:21PM +0530, Amit Kapila wrote:
> > > > On Sun, Jun 23, 2024 at 11:52 AM Noah Misch <n...@leadboat.com> wrote:
> > > > > > +static void
> > > > > > +create_publication(PGconn *conn, struct LogicalRepInfo *dbinfo)
> > > > > > +{
> > > > >
> > > > > > +     appendPQExpBuffer(str, "CREATE PUBLICATION %s FOR ALL TABLES",
> > > > > > +                                       ipubname_esc);
> > > > >
> > > > > This tool's documentation says it "guarantees that no transaction 
> > > > > will be
> > > > > lost."  I tried to determine whether achieving that will require 
> > > > > something
> > > > > like the fix from
> > > > > https://postgr.es/m/flat/de52b282-1166-1180-45a2-8d8917ca7...@enterprisedb.com.
> > > > > (Not exactly the fix from that thread, since that thread has not 
> > > > > discussed the
> > > > > FOR ALL TABLES version of its race condition.)  I don't know.  On the 
> > > > > one
> > > > > hand, pg_createsubscriber benefits from creating a logical slot after 
> > > > > creating
> > > > > the publication.  That snapbuild.c process will wait for running 
> > > > > XIDs.  On the
> > > > > other hand, an INSERT/UPDATE/DELETE acquires its RowExclusiveLock and 
> > > > > builds
> > > > > its relcache entry before assigning an XID, so perhaps the 
> > > > > snapbuild.c process
> > >
> > > Correction: it doesn't matter how the original INSERT/UPDATE/DELETE 
> > > builds its
> > > relcache entry, just how pgoutput of the change builds the relcache entry 
> > > from
> > > the historic snapshot.
> > >
> > > > > isn't enough to prevent that thread's race condition.  What do you 
> > > > > think?
> > > >
> > > > I am not able to imagine how the race condition discussed in the
> > > > thread you quoted can impact this patch. The problem discussed is
> > > > mainly the interaction when we are processing the changes in logical
> > > > decoding w.r.t concurrent DDL (Alter Publication ... Add Table). The
> > > > problem happens because we use the old cache state.
> > >
> > > Right.  Taking the example from
> > > http://postgr.es/m/20231119021830.d6t6aaxtrkpn7...@awork3.anarazel.de, 
> > > LSNs
> > > between what that mail calls 4) and 5) are not safely usable as start 
> > > points.
> > > pg_createsubscriber evades that thread's problem if the consistent_lsn it
> > > passes to pg_replication_origin_advance() can't be in a bad-start-point 
> > > LSN
> > > span.  I cautiously bet the snapbuild.c process achieves that:
> > >
> > > > I am missing your
> > > > point about the race condition mentioned in the thread you quoted with
> > > > snapbuild.c. Can you please elaborate a bit more?
> > >
> > > When pg_createsubscriber calls pg_create_logical_replication_slot(), the 
> > > key
> > > part starts at:
> > >
> > >         /*
> > >          * If caller needs us to determine the decoding start point, do 
> > > so now.
> > >          * This might take a while.
> > >          */
> > >         if (find_startpoint)
> > >                 DecodingContextFindStartpoint(ctx);
> > >
> > > Two factors protect pg_createsubscriber.  First, (a) CREATE PUBLICATION
> > > committed before pg_create_logical_replication_slot() started.  Second, 
> > > (b)
> > > DecodingContextFindStartpoint() waits for running XIDs to complete, via 
> > > the
> > > process described at the snapbuild.c "starting up in several stages" 
> > > diagram.
> > > Hence, the consistent_lsn is not in a bad-start-point LSN span.  It's fine
> > > even if the original INSERT populated all caches before CREATE PUBLICATION
> > > started and managed to assign an XID only after consistent_lsn.  From the
> > > pgoutput perspective, that's indistinguishable from the transaction 
> > > starting
> > > at its first WAL record, after consistent_lsn.  The linked "long-standing 
> > > data
> > > loss bug in initial sync of logical replication" thread doesn't have (a),
> > > hence its bug.  How close is that to accurate?
> > 
> > Yeah, this theory sounds right to me. The key point is that no DML
> > (processing of WAL corresponding to DML) before CREATE PUBLICATION ...
> > command would have reached pgoutput level because we would have waited
> > for it during snapbuild.c. Can we conclude that the race condition
> > discussed in the other thread won't impact this patch?
> 
> As Noah said the key point is the CREATE PUBLICATION *before* creating the
> replication slots -- that wait transactions to complete.


Let's consider the transaction loss topic concluded.  Thanks for contemplating
it.  I've added an open item for the "dbname containing a space" topic.

Re: speed up a logical replica setup

Reply via email to