On Thu, Aug 17, 2023 at 6:07 PM Masahiko Sawada <sawada.m...@gmail.com> wrote: > > On Tue, Aug 15, 2023 at 12:06 PM Amit Kapila <amit.kapil...@gmail.com> wrote: > > > > On Tue, Aug 15, 2023 at 7:51 AM Masahiko Sawada <sawada.m...@gmail.com> > > wrote: > > > > > > On Mon, Aug 14, 2023 at 2:07 PM Amit Kapila <amit.kapil...@gmail.com> > > > wrote: > > > > > > > > On Mon, Aug 14, 2023 at 7:57 AM Masahiko Sawada <sawada.m...@gmail.com> > > > > wrote: > > > > > Another idea is (which might have already discussed thoguh) that we > > > > > check if the latest shutdown checkpoint LSN in the control file > > > > > matches the confirmed_flush_lsn in pg_replication_slots view. That > > > > > way, we can ensure that the slot has consumed all WAL records before > > > > > the last shutdown. We don't need to worry about WAL records generated > > > > > after starting the old cluster during the upgrade, at least for > > > > > logical replication slots. > > > > > > > > > > > > > Right, this is somewhat closer to what Patch is already doing. But > > > > remember in this case we need to remember and use the latest > > > > checkpoint from the control file before the old cluster is started > > > > because otherwise the latest checkpoint location could be even updated > > > > during the upgrade. So, instead of reading from WAL, we need to change > > > > so that we rely on the control file's latest LSN. > > > > > > Yes, I was thinking the same idea. > > > > > > But it works for only replication slots for logical replication. Do we > > > want to check if no meaningful WAL records are generated after the > > > latest shutdown checkpoint, for manually created slots (or non-logical > > > replication slots)? If so, we would need to have something reading WAL > > > records in the end. > > > > > > > This feature only targets logical replication slots. I don't see a > > reason to be different for manually created logical replication slots. > > Is there something particular that you think we could be missing? > > Sorry I was not clear. I meant the logical replication slots that are > *not* used by logical replication, i.e., are created manually and used > by third party tools that periodically consume decoded changes. As we > discussed before, these slots will never be able to pass that > confirmed_flush_lsn check. >
I think normally one would have a background process to periodically consume changes. Won't one can use the walsender infrastructure for their plugins to consume changes probably by using replication protocol? Also, I feel it is the plugin author's responsibility to consume changes or advance slot to the required position before shutdown. > After some thoughts, one thing we might > need to consider is that in practice, the upgrade project is performed > during the maintenance window and has a backup plan that revert the > upgrade process, in case something bad happens. If we require the > users to drop such logical replication slots, they cannot resume to > use the old cluster in that case, since they would need to create new > slots, missing some changes. > Can't one keep the backup before removing slots? > Other checks in pg_upgrade seem to be > compatibility checks that would eventually be required for the upgrade > anyway. Do we need to consider this case? For example, we do that > confirmed_flush_lsn check for only the slots with pgoutput plugin. > I think one is allowed to use pgoutput plugin even for manually created slots. So, such a check may not work. -- With Regards, Amit Kapila.