Hi,

On 2022-02-07 13:38:38 +0530, Ashutosh Sharma wrote:
> Are you talking about this scenario - what if the logical replication
> slot on the publisher is dropped, but is being referenced by the
> standby where the slot is synchronized?
It's a bit hard to say, because neither in this thread nor in the patch have I found a clear description of what the syncing needs to & tries to guarantee. It might be that that was discussed in one of the precursor threads, but...

Generally I don't think we can permit scenarios where a slot can be in a "corrupt" state, i.e. missing required catalog entries, after "normal" administrative commands (i.e. not mucking around in catalog entries / on-disk files) - even if the sequence of commands may be a bit weird. All such cases need to be either prevented or detected.

As far as I can tell, the only way this patch keeps slots on physical replicas "valid" is by reorderbuffer.c blocking during replay via wait_for_standby_confirmation(). Which means that if e.g. the standby_slot_names GUC on the primary differs from synchronize_slot_names on the physical replica, the slots synchronized on the physical replica are not going to be valid. The same applies if the primary drops its logical slots.

> Should the redo function for the drop replication slot have the capability
> to drop it on standby and its subscribers (if any) as well?

Slots are not WAL logged (and shouldn't be). I think you pretty much need the recovery conflict handling infrastructure I referenced upthread, which recognizes during replay whether a record conflicts with a slot on a standby. And then on top of that you can build something like this patch.

Greetings,

Andres Freund
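[For illustration only - a minimal sketch of the GUC mismatch described above, assuming the patch's proposed standby_slot_names (on the primary) and synchronize_slot_names (on the standby) settings; all slot names here are hypothetical:]

```ini
# primary: postgresql.conf
# The patch makes logical decoding wait (in reorderbuffer.c, via
# wait_for_standby_confirmation()) for the standbys behind these
# physical slots before sending changes downstream.
standby_slot_names = 'standby_a_physical'

# physical replica: postgresql.conf
# Logical slots this standby copies from the primary. A slot synced
# here is only kept valid if the primary actually waits for this
# standby; if the lists diverge (e.g. the primary waits only for
# standby_a while standby_b also syncs), the synchronized copy can
# end up referencing already-removed catalog state.
synchronize_slot_names = 'logical_slot_1,logical_slot_2'
```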