On Thu, Sep 28, 2023 at 10:44 AM Bharath Rupireddy <bharath.rupireddyforpostg...@gmail.com> wrote:
>
> On Mon, Sep 25, 2023 at 2:06 PM Amit Kapila <amit.kapil...@gmail.com> wrote:
> >
> > > > [1]
> > > > https://www.postgresql.org/message-id/CAA4eK1%2BLtWDKXvxS7gnJ562VX%2Bs3C6%2B0uQWamqu%3DUuD8hMfORg%40mail.gmail.com
> > >
> > > I see. IIUC, without that commit e0b2eed [1], it may happen that the
> > > slot's on-disk confirmed_flush LSN value can be higher than the WAL
> > > LSN that's flushed to disk, no?
> > >
> >
> > No, without that commit, there is a very high possibility that even if
> > we have sent the WAL to the subscriber and got the acknowledgment of
> > the same, we would miss updating it before shutdown. This would lead
> > to upgrade failures because upgrades have no way to later identify
> > whether the remaining WAL records are sent to the subscriber.
>
> Thanks for clarifying. I'm trying to understand what happens without
> commit e0b2eed0 with an illustration:
>
> step 1: publisher - confirmed_flush LSN in replication slot on disk
> structure is 80
> step 2: publisher - sends WAL at LSN 100
> step 3: subscriber - acknowledges the apply LSN or confirmed_flush LSN as 100
> step 4: publisher - shuts down without writing the new confirmed_flush
> LSN as 100 to disk, note that commit e0b2eed0 is not in place
> step 5: publisher - restarts
> step 6: subscriber - upon publisher restart, the subscriber requests
> WAL from publisher from LSN 100 as it tracks the last applied LSN in
> replication origin
>
> Now, if the pg_upgrade with the patch in this thread is run on
> publisher after step 4, it complains with "The slot \"%s\" has not
> consumed the WAL yet".
>
> Is my above understanding right?
>

Yes.

--
With Regards,
Amit Kapila.
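
The six steps in the illustration can be sketched as a small simulation. This is only a toy model, not PostgreSQL code: the `Publisher` class, `slot_has_consumed_wal`, and `run_scenario` are hypothetical names, the `flush_slot_at_shutdown` flag merely stands in for the behaviour attributed to commit e0b2eed0, and the check is an assumed simplification of the pg_upgrade test discussed in the thread (it ignores, for instance, any WAL written during shutdown itself).

```python
from dataclasses import dataclass


@dataclass
class Publisher:
    """Toy model of a publisher with a single logical replication slot."""
    disk_confirmed_flush: int             # value persisted in the slot's on-disk state
    mem_confirmed_flush: int              # value held in memory while running
    wal_end: int = 0                      # highest WAL position sent so far
    flush_slot_at_shutdown: bool = False  # assumption: models commit e0b2eed0

    def send_wal(self, upto: int) -> int:
        self.wal_end = max(self.wal_end, upto)
        return upto

    def receive_ack(self, lsn: int) -> None:
        # In this model an acknowledgment advances only the in-memory value.
        self.mem_confirmed_flush = max(self.mem_confirmed_flush, lsn)

    def shutdown(self) -> None:
        if self.flush_slot_at_shutdown:
            self.disk_confirmed_flush = self.mem_confirmed_flush

    def restart(self) -> None:
        # After a restart, only the on-disk value is available.
        self.mem_confirmed_flush = self.disk_confirmed_flush


def slot_has_consumed_wal(pub: Publisher) -> bool:
    """Hypothetical stand-in for the upgrade-time check on the slot."""
    return pub.disk_confirmed_flush >= pub.wal_end


def run_scenario(flush_slot_at_shutdown: bool):
    # step 1: on-disk confirmed_flush LSN is 80
    pub = Publisher(disk_confirmed_flush=80, mem_confirmed_flush=80,
                    flush_slot_at_shutdown=flush_slot_at_shutdown)
    sent = pub.send_wal(100)               # step 2: publisher sends WAL at LSN 100
    origin_lsn = sent                      # step 3: subscriber applies and
    pub.receive_ack(origin_lsn)            #         acknowledges LSN 100
    pub.shutdown()                         # step 4: publisher shuts down
    upgrade_ok = slot_has_consumed_wal(pub)  # upgrade check run after step 4
    pub.restart()                          # step 5: publisher restarts
    restart_from = origin_lsn              # step 6: subscriber resumes from its origin
    return pub.disk_confirmed_flush, upgrade_ok, restart_from


if __name__ == "__main__":
    print("without flush at shutdown:", run_scenario(False))
    print("with flush at shutdown:   ", run_scenario(True))
```

In this model the subscriber resumes from LSN 100 in both cases, because it tracks progress in its own replication origin; only the upgrade-time check differs, since it can see nothing but the slot's on-disk value.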