On Mon, Jun 2, 2025 at 10:55 AM Amit Kapila <amit.kapil...@gmail.com> wrote: > > On Fri, May 30, 2025 at 3:00 PM Nisha Moond <nisha.moond...@gmail.com> wrote: > > > > Agree that we need to cover the simple pg_dump and pg_restore with the > > patch. > > > > When pg_dump and pg_restore are used outside of pg_upgrade, there's no > > guarantee that the target system does not have any prepared > > transactions. In such cases, restoring a subscription with both > > two_phase and failover enabled could lead to the bug, so we should > > avoid allowing both options via pg_restore. > > > > Here are a few possible solutions: > > > > 1) Split the CREATE SUBSCRIPTION command of such subscriptions in > > pg_restore : > > first create the sub with two_phase: > > - CREATE SUBSCRIPTION ... with (two_phase = on) > > then enable failover > > - ALTER SUBSCRIPTION ... with (failover=true) > > > > This won't work because we always dump a subscription with the > 'connect' option as 'false'. As changing the subscription's failover > option, needs the connection above split won't work. > > > However, this approach adds complexity in pg_restore and may still > > fail if the slot's restart_lsn is earlier than two_phase_at. > > > > 2) Prevent dump of subscription in non-upgrade mode: > > Raise an error/warning during pg_dump when a subscription with both > > two_phase and failover is encountered (unless in binary upgrade mode). > > That is, either fail the pg_dump, or skip dumping subscriptions in > > such a case with a warning. > > We can include a helpful hint: > > "Disable failover for subscription <sub-name> before dumping and > > re-enable it after restore." > > > > 3) Dump such subscriptions with (two_phase=on, failover=on, > > create_slot=false) together. As pg_dump always sets "connect=false" > > for subscription, it is up to users to reactivate the subscription > > suitably. In this case, we can add some document to warn that it's the > > user's responsibility to ensure the remote slot is in a safe state. > > > > What do you think? > > > > Please let me know if you have any other suggestions for handling this > > in pg_dump/pg_restore. > > > > Yet another idea is to dump the subscription with two_phase = on and > failover = false. We should do this when both options are 'true' > during the dump. As we are documenting that we always dump > createsubscription with connect as false and let users take care (see > [1] (When dumping logical replication subscriptions ...)), a similar > reasoning could be given for the failover flag. >
+1 > The one more combination to consider is when someone takes a dump of > an older version and loads it into a newer version. For example, where > users dump from 17.5 and then restore in a newer version, say 17.6 > (which has our fix), the restore will fail due to newer restrictions > added by this patch. Do we need to do anything about it? > > [1] : https://www.postgresql.org/docs/devel/app-pgdump.html > ~~~ Attached v17 patches. Added a top-up patch 0002 implementing the idea suggested by Amit above. Thanks, Nisha
v17-0001-PG17-Approach-3-Fix-slot-synchronization-for-two.patch
Description: Binary data
v17-0002-Fix-for-pg_dump.patch
Description: Binary data