We conducted stress testing of the patch with a setup of one primary node with 100 tables and five subscribers, each having 20 subscriptions. We then created three physical standbys that sync the logical replication slots from the primary node. All 100 slots were successfully synced on all three standbys. We then ran the load and monitored LSN convergence using the prescribed SQL checks. Once the standbys were failover-ready, we successfully promoted one of the standbys, and all the subscribers seamlessly migrated to the new primary node.
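For readers who want to reproduce this, the readiness and convergence checks we ran were along the following lines (a sketch, not the exact prescribed queries; it assumes the "synced" and "failover" columns that the slot-sync work adds to pg_replication_slots):

```sql
-- On a standby: verify the logical slots were synced from the primary
-- and are marked usable after failover.
SELECT slot_name, synced, failover, confirmed_flush_lsn
FROM pg_replication_slots
WHERE slot_type = 'logical';

-- On the primary: check LSN convergence, i.e. whether each standby has
-- replayed past every failover slot's confirmed_flush_lsn.
SELECT s.slot_name,
       r.application_name,
       r.replay_lsn >= s.confirmed_flush_lsn AS standby_caught_up
FROM pg_replication_slots s
CROSS JOIN pg_stat_replication r
WHERE s.failover;
```

A standby is failover-ready for the subscribers once every slot shows as synced and the convergence check returns true for all slots.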
We repeated the tests with 200 tables, creating 200 logical replication slots. All the tests completed successfully under the increased load. Two minor errors, neither caused by the patch, were observed during the tests:

1) When the load was run, the logical replication apply workers on the subscribers started failing due to timeouts. This is not related to the patch; it happened because the "wal_receiver_timeout" setting was too small for the load. To confirm, we ran the same load without the patch, and the same failure occurred.

2) There was a buffer overflow exception on the primary node in the 200-replication-slot case. It was also not related to the patch, as it was caused by an insufficient memory configuration.

All the tests were done on both Windows and Linux environments. Thank you Ajin for the stress test and analysis on Linux.
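For completeness, the timeout failure in 1) was avoided with a subscriber-side configuration change along these lines (the value shown is illustrative; the appropriate setting depends on the load):

```
# postgresql.conf on the subscriber: raise wal_receiver_timeout so the
# apply worker is not terminated while it falls behind under heavy load.
wal_receiver_timeout = '5min'    # default is 1min; 0 disables the timeout
```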