On Thu, Aug 17, 2023 at 10:31 PM Amit Kapila <amit.kapil...@gmail.com> wrote: > > On Thu, Aug 17, 2023 at 6:07 PM Masahiko Sawada <sawada.m...@gmail.com> wrote: > > > > On Tue, Aug 15, 2023 at 12:06 PM Amit Kapila <amit.kapil...@gmail.com> > > wrote: > > > > > > On Tue, Aug 15, 2023 at 7:51 AM Masahiko Sawada <sawada.m...@gmail.com> > > > wrote: > > > > > > > > On Mon, Aug 14, 2023 at 2:07 PM Amit Kapila <amit.kapil...@gmail.com> > > > > wrote: > > > > > > > > > > On Mon, Aug 14, 2023 at 7:57 AM Masahiko Sawada > > > > > <sawada.m...@gmail.com> wrote: > > > > > > Another idea is (which might have already discussed thoguh) that we > > > > > > check if the latest shutdown checkpoint LSN in the control file > > > > > > matches the confirmed_flush_lsn in pg_replication_slots view. That > > > > > > way, we can ensure that the slot has consumed all WAL records > > > > > > before the last shutdown. We don't need to worry about WAL records > > > > > > generated after starting the old cluster during the upgrade, at > > > > > > least for logical replication slots. > > > > > > > > > > > > > > > > Right, this is somewhat closer to what Patch is already doing. But > > > > > remember in this case we need to remember and use the latest > > > > > checkpoint from the control file before the old cluster is started > > > > > because otherwise the latest checkpoint location could be even updated > > > > > during the upgrade. So, instead of reading from WAL, we need to change > > > > > so that we rely on the control file's latest LSN. > > > > > > > > Yes, I was thinking the same idea. > > > > > > > > But it works for only replication slots for logical replication. Do we > > > > want to check if no meaningful WAL records are generated after the > > > > latest shutdown checkpoint, for manually created slots (or non-logical > > > > replication slots)? If so, we would need to have something reading WAL > > > > records in the end. > > > > > > > > > > This feature only targets logical replication slots. I don't see a > > > reason to be different for manually created logical replication slots. > > > Is there something particular that you think we could be missing? > > > > Sorry I was not clear. I meant the logical replication slots that are > > *not* used by logical replication, i.e., are created manually and used > > by third party tools that periodically consume decoded changes. As we > > discussed before, these slots will never be able to pass that > > confirmed_flush_lsn check. > > > > I think normally one would have a background process to periodically > consume changes. Won't one can use the walsender infrastructure for > their plugins to consume changes probably by using replication > protocol?
Not sure. > Also, I feel it is the plugin author's responsibility to > consume changes or advance slot to the required position before > shutdown. How does the plugin author ensure that the slot consumes all WAL records including shutdown_checkpoint before shutdown? > > > After some thoughts, one thing we might > > need to consider is that in practice, the upgrade project is performed > > during the maintenance window and has a backup plan that revert the > > upgrade process, in case something bad happens. If we require the > > users to drop such logical replication slots, they cannot resume to > > use the old cluster in that case, since they would need to create new > > slots, missing some changes. > > > > Can't one keep the backup before removing slots? Yes, but restoring the back could take time. > > > Other checks in pg_upgrade seem to be > > compatibility checks that would eventually be required for the upgrade > > anyway. Do we need to consider this case? For example, we do that > > confirmed_flush_lsn check for only the slots with pgoutput plugin. > > > > I think one is allowed to use pgoutput plugin even for manually > created slots. So, such a check may not work. Right, but I thought it's a very rare case. Since the slot's flushed_confirmed_lsn check is not a compatibility check unlike the existing check, I wonder if we can make it optional. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com