On Fri, Jun 16, 2023 at 3:26 PM Amit Kapila <amit.kapil...@gmail.com> wrote: > > On Mon, Apr 17, 2023 at 7:37 PM Drouvot, Bertrand > <bertranddrouvot...@gmail.com> wrote: > > > > Please find attached V5 (a rebase of V4 posted up-thread). > > > > In addition to the "rebasing" work, the TAP test adds a test about conflict > > handling (logical slot invalidation) > > relying on the work done in the "Minimal logical decoding on standby" patch > > series. > > > > I did not look more at the patch (than what's was needed for the rebase) > > but plan to do so. > > > > Are you still planning to continue working on this? Some miscellaneous > comments while going through this patch are as follows? > > 1. Can you please try to explain the functionality of the overall > patch somewhere in the form of comments and or commit message?
IIUC, there are 2 core ideas of the feature: 1) It will never let the logical replication subscribers go ahead of physical replication standbys specified in standby_slot_names. It implements this by delaying decoding of commit records on the walsenders corresponding to logical replication subscribers on the primary until all the specified standbys confirm receiving the commit LSN. 2) The physical replication standbys will synchronize data of the specified logical replication slots (in synchronize_slot_names) from the primary, creating the logical replication slots if necessary. Since the logical replication subscribers will never go out of physical replication standbys, the standbys can safely synchronize the slots and keep the data necessary for subscribers to connect to it and work seamlessly even after a failover. If my understanding is right, I have few thoughts here: 1. All the logical walsenders are delayed on the primary - per wait_for_standby_confirmation() despite the user being interested in only a few of them via synchronize_slot_names. Shouldn't the delay be for just the slots specified in synchronize_slot_names? 2. I think we can split the patch like this - 0001 can be the logical walsenders delaying decoding on the primary unless standbys confirm, 0002 standby synchronizing the logical slots. 3. I think we need to change the GUC standby_slot_names to better reflect what it is used for - wait_for_replication_slot_names or wait_for_ 4. It allows specifying logical slots in standby_slot_names, meaning, it can disallow logical slots getting ahead of other logical slots specified in standby_slot_names. Should we allow this case with the thinking that if there's anyone using logical replication for failover (well, will anybody do that in production?). 5. Similar to above, it allows specifying physical slots in synchronize_slot_names. Should we disallow? I'm attaching the v6 patch, a rebased version of v5. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
v6-0001-Synchronize-logical-replication-slots-from-primar.patch
Description: Binary data