On Wed, Dec 6, 2023 at 4:53 PM shveta malik <shveta.ma...@gmail.com> wrote: > > v43-001: > 1) Support of 'failover' dump in pg_dump. It was missing earlier. >
Review v43-0001 ================ 1. + * However, we do not enable failover for slots created by the table sync + * worker. This is because the table sync slot might not be fully synced on the + * standby. The reason for not enabling failover for table sync slots is not clearly mentioned. 2. During syncing, the local restart_lsn and/or local catalog_xmin of + * the newly created slot on the standby are typically ahead of those on the + * primary. Therefore, the standby needs to wait for the primary server's + * restart_lsn and catalog_xmin to catch up, which takes time. I think this part of the comment should be moved to 0002 patch. We can probably describe a bit more about why slot on standby will be ahead and about waiting time. 3. validate_standby_slots() { ... + slot = SearchNamedReplicationSlot(name, true); + + if (!slot) + goto ret_standby_slot_names_ng; + + if (!SlotIsPhysical(slot)) + { + GUC_check_errdetail("\"%s\" is not a physical replication slot", + name); + goto ret_standby_slot_names_ng; + } Why the first check (slot not found) doesn't have errdetail? The goto's in this function look a bit odd, can we try to avoid those? 4. + /* Verify syntax and parse string into list of identifiers */ + if (!SplitIdentifierString(rawname, ',', &elemlist)) + { + /* syntax error in name list */ + GUC_check_errdetail("List syntax is invalid."); ... ... + if (!SplitIdentifierString(standby_slot_names_cpy, ',', &standby_slots)) + { + /* This should not happen if GUC checked check_standby_slot_names. */ + elog(ERROR, "invalid list syntax"); Both are checking the same string but giving different error messages. I think the error message should be the same in both cases. The first one seems better. 5. In WalSndFilterStandbySlots(), the comments around else if checks should move inside the checks. It is hard to read the code in the current format. I have tried to change the same in the attached. Apart from the above, I have changed the comments and made some minor cosmetic changes in the attached. Kindly include in next version if you are fine with it. -- With Regards, Amit Kapila.
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c index 58e298af89..ef486ca92c 100644 --- a/src/backend/replication/walsender.c +++ b/src/backend/replication/walsender.c @@ -1674,33 +1674,38 @@ WalSndFilterStandbySlots(XLogRecPtr wait_for_lsn, List **standby_slots) continue; } - - /* - * It may happen that the slot specified in standby_slot_names GUC - * value is dropped, so let's skip over it. - */ else if (!slot) + { + /* + * It may happen that the slot specified in standby_slot_names GUC + * value is dropped, so let's skip over it. + */ warningfmt = _("replication slot \"%s\" specified in parameter \"%s\" does not exist, ignoring"); - - /* - * If a logical slot name is provided in standby_slot_names, issue a - * WARNING and skip it. Although logical slots are disallowed in the - * GUC check_hook(validate_standby_slots), it is still possible for a - * user to drop an existing physical slot and recreate a logical slot - * with the same name. Since it is harmless, a WARNING should be - * enough, no need to error-out. - */ + } else if (SlotIsLogical(slot)) + { + /* + * If a logical slot name is provided in standby_slot_names, issue + * a WARNING and skip it. Although logical slots are disallowed in + * the GUC check_hook(validate_standby_slots), it is still possible + * for a user to drop an existing physical slot and recreate a + * logical slot with the same name. Since it is harmless, a WARNING + * should be enough, no need to error-out. + */ warningfmt = _("cannot have logical replication slot \"%s\" in parameter \"%s\", ignoring"); - - /* - * Specified physical slot may have been invalidated, so no point in - * waiting for it. - */ + } else if (XLogRecPtrIsInvalid(restart_lsn) || invalidated) + { + /* + * Specified physical slot may have been invalidated, so no point in + * waiting for it. + */ warningfmt = _("physical slot \"%s\" specified in parameter \"%s\" has been invalidated, ignoring"); + } else + { Assert(restart_lsn >= wait_for_lsn); + } /* * Reaching here indicates that either the slot has passed the @@ -1768,9 +1773,9 @@ WalSndWaitForStandbyConfirmation(XLogRecPtr wait_for_lsn) /* * Wait till WAL < loc is flushed to disk so it can be safely sent to client. * - * If the walsender holds a logical slot that has enabled failover, the - * function also waits for all the specified streaming replication standby - * servers to confirm receipt of WAL up to RecentFlushPtr. + * If the walsender holds a logical slot that has enabled failover, we also + * wait for all the specified streaming replication standby servers to + * confirm receipt of WAL up to RecentFlushPtr. * * Returns end LSN of flushed WAL. Normally this will be >= loc, but if we * detect a shutdown request (either from postmaster or client) we will return @@ -1790,7 +1795,7 @@ WalSndWaitForWal(XLogRecPtr loc) /* * Check if all the standby servers have confirmed receipt of WAL up to - * RecentFlushPtr if we already know we have enough WAL available. + * RecentFlushPtr even when we already know we have enough WAL available. * * Note that we cannot directly return without checking the status of * standby servers because the standby_slot_names may have changed, which @@ -1892,7 +1897,10 @@ WalSndWaitForWal(XLogRecPtr loc) wait_for_standby = true; } else + { + /* already caught up and doesn't need to wait for standby_slots */ break; + } /* Waiting for new WAL. Since we need to wait, we're now caught up. */ WalSndCaughtUp = true;