Hi,

When testing streaming replication with a physical slot, I found an unexpected behavior: the walsender can use an invalidated physical slot for streaming.
This occurs when the slot on the primary is invalidated by reaching max_slot_wal_keep_size before streaming replication is initialized (e.g. before the standby is created). Attached is a draft script (test_invalidated_slot.sh) that reproduces this.

Once a slot is invalidated, it can no longer protect WAL or rows from removal, because invalidated slots are not considered in functions like ReplicationSlotsComputeRequiredXmin(). In addition, the walsender can advance the restart_lsn of an invalidated slot, after which the user can no longer tell whether the slot is actually valid, because the 'conflicting' column of the pg_replication_slots view could be set back to null.

So I think this is a bug. One idea for a fix is to check the validity of the physical slot in StartReplication() after acquiring the slot, as the attached patch does. What do you think?

Best Regards,
Hou Zhijie
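
To make the proposed behavior concrete, here is a minimal standalone model of the check. It is not the attached patch and does not use PostgreSQL's real data structures: ModelSlot, StartResult, start_replication_check() and advance_restart_lsn() are hypothetical names, and the 'invalidated' flag is an assumed simplification of however the server records invalidation. It only illustrates the idea that an invalidated slot should be rejected right after it is acquired, so its restart_lsn is never advanced.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a physical replication slot (not PostgreSQL's struct). */
typedef struct ModelSlot
{
    uint64_t restart_lsn;   /* oldest WAL position the slot asks to retain */
    bool     invalidated;   /* assumed flag: set once max_slot_wal_keep_size was exceeded */
} ModelSlot;

typedef enum
{
    START_OK,
    START_ERR_INVALIDATED
} StartResult;

/* The check made right after acquiring the slot, before any WAL is streamed. */
static StartResult
start_replication_check(const ModelSlot *slot)
{
    assert(slot != NULL);
    return slot->invalidated ? START_ERR_INVALIDATED : START_OK;
}

/*
 * Move restart_lsn forward only for a slot that passed the check.
 * Returns false if the slot was rejected, true otherwise.
 */
static bool
advance_restart_lsn(ModelSlot *slot, uint64_t flushed_lsn)
{
    if (start_replication_check(slot) != START_OK)
        return false;
    if (flushed_lsn > slot->restart_lsn)
        slot->restart_lsn = flushed_lsn;
    return true;
}
```

With this shape, an invalidated slot keeps its state untouched, so whatever the view reports about it stays consistent with the fact that it no longer protects anything.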
0001-Forbid-the-use-of-invalidated-slot-in-streaming-repl.patch
test_invalidated_slot.sh