On Thu, Sep 28, 2023 at 10:44 AM Bharath Rupireddy
<bharath.rupireddyforpostg...@gmail.com> wrote:
>
> On Mon, Sep 25, 2023 at 2:06 PM Amit Kapila <amit.kapil...@gmail.com> wrote:
> >
> > > > [1] 
> > > > https://www.postgresql.org/message-id/CAA4eK1%2BLtWDKXvxS7gnJ562VX%2Bs3C6%2B0uQWamqu%3DUuD8hMfORg%40mail.gmail.com
> > >
> > > I see. IIUC, without that commit e0b2eed [1], it may happen that the
> > > slot's on-disk confirmed_flush LSN value can be higher than the WAL
> > > LSN that's flushed to disk, no?
> > >
> >
> > No, without that commit, there is a very high possibility that even if
> > we have sent the WAL to the subscriber and got the acknowledgment of
> > the same, we would miss updating it before shutdown. This would lead
> > to upgrade failures because upgrades have no way to later identify
> > whether the remaining WAL records are sent to the subscriber.
>
> Thanks for clarifying. I'm trying understand what happens without
> commit e0b2eed0 with an illustration:
>
> step 1: publisher - confirmed_flush LSN  in replication slot on disk
> structure is 80
> step 2: publisher - sends WAL at LSN 100
> step 3: subscriber - acknowledges the apply LSN or confirmed_flush LSN as 100
> step 4: publisher - shuts down without writing the new confirmed_flush
> LSN as 100 to disk, note that commit e0b2eed0 is not in place
> step 5: publisher - restarts
> step 6: subscriber - upon publisher restart, the subscriber requests
> WAL from publisher from LSN 100 as it tracks the last applied LSN in
> replication origin
>
> Now, if the pg_upgrade with the patch in this thread is run on
> publisher after step 4, it complains with "The slot \"%s\" has not
> consumed the WAL yet".
>
> Is my above understanding right?
>

Yes.


-- 
With Regards,
Amit Kapila.


Reply via email to