On Thu, Aug 17, 2023 at 10:31 PM Amit Kapila <amit.kapil...@gmail.com> wrote:
>
> On Thu, Aug 17, 2023 at 6:07 PM Masahiko Sawada <sawada.m...@gmail.com> wrote:
> >
> > On Tue, Aug 15, 2023 at 12:06 PM Amit Kapila <amit.kapil...@gmail.com> 
> > wrote:
> > >
> > > On Tue, Aug 15, 2023 at 7:51 AM Masahiko Sawada <sawada.m...@gmail.com> 
> > > wrote:
> > > >
> > > > On Mon, Aug 14, 2023 at 2:07 PM Amit Kapila <amit.kapil...@gmail.com> 
> > > > wrote:
> > > > >
> > > > > On Mon, Aug 14, 2023 at 7:57 AM Masahiko Sawada 
> > > > > <sawada.m...@gmail.com> wrote:
> > > > > > Another idea is (which might have already discussed thoguh) that we 
> > > > > > check if the latest shutdown checkpoint LSN in the control file 
> > > > > > matches the confirmed_flush_lsn in pg_replication_slots view. That 
> > > > > > way, we can ensure that the slot has consumed all WAL records 
> > > > > > before the last shutdown. We don't need to worry about WAL records 
> > > > > > generated after starting the old cluster during the upgrade, at 
> > > > > > least for logical replication slots.
> > > > > >
> > > > >
> > > > > Right, this is somewhat closer to what Patch is already doing. But
> > > > > remember in this case we need to remember and use the latest
> > > > > checkpoint from the control file before the old cluster is started
> > > > > because otherwise the latest checkpoint location could be even updated
> > > > > during the upgrade. So, instead of reading from WAL, we need to change
> > > > > so that we rely on the control file's latest LSN.
> > > >
> > > > Yes, I was thinking the same idea.
> > > >
> > > > But it works for only replication slots for logical replication. Do we
> > > > want to check if no meaningful WAL records are generated after the
> > > > latest shutdown checkpoint, for manually created slots (or non-logical
> > > > replication slots)? If so, we would need to have something reading WAL
> > > > records in the end.
> > > >
> > >
> > > This feature only targets logical replication slots. I don't see a
> > > reason to be different for manually created logical replication slots.
> > > Is there something particular that you think we could be missing?
> >
> > Sorry I was not clear. I meant the logical replication slots that are
> > *not* used by logical replication, i.e., are created manually and used
> > by third party tools that periodically consume decoded changes. As we
> > discussed before, these slots will never be able to pass that
> > confirmed_flush_lsn check.
> >
>
> I think normally one would have a background process to periodically
> consume changes. Won't one can use the walsender infrastructure for
> their plugins to consume changes probably by using replication
> protocol?

Not sure.

> Also, I feel it is the plugin author's responsibility to
> consume changes or advance slot to the required position before
> shutdown.

How does the plugin author ensure that the slot consumes all WAL
records including shutdown_checkpoint before shutdown?

>
> > After some thoughts, one thing we might
> > need to consider is that in practice, the upgrade project is performed
> > during the maintenance window and has a backup plan that revert the
> > upgrade process, in case something bad happens. If we require the
> > users to drop such logical replication slots, they cannot resume to
> > use the old cluster in that case, since they would need to create new
> > slots, missing some changes.
> >
>
> Can't one keep the backup before removing slots?

Yes, but restoring the back could take time.

>
> > Other checks in pg_upgrade seem to be
> > compatibility checks that would eventually be required for the upgrade
> > anyway. Do we need to consider this case? For example, we do that
> > confirmed_flush_lsn check for only the slots with pgoutput plugin.
> >
>
> I think one is allowed to use pgoutput plugin even for manually
> created slots. So, such a check may not work.

Right, but I thought it's a very rare case.

Since the slot's flushed_confirmed_lsn check is not a compatibility
check unlike the existing check, I wonder if we can make it optional.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com


Reply via email to