Dear Vignesh,

Thanks for raising the ideas!

> a) Introduce a new latch to handle worker attach and exit.

Just to confirm: there are three wait events for the launcher, so I feel we
could create one latch per wait event. Is there a reason to introduce only "a"
latch?

> b) Add a new GUC launcher_retry_time which gives more flexibility to
> users as suggested by Amit at [1]. Before 5a3a953, the
> wal_retrieve_retry_interval plays a similar role as the suggested new
> GUC launcher_retry_time, e.g. even if a worker is launched, the
> launcher only wait wal_retrieve_retry_interval time before next round.

Hmm. My concern is how users would estimate the value. Maybe the default will
be 3min, but should users change it? If so, to how long? I think that even if
it becomes tunable, they cannot control it well.

> c) Don't reset the latch at worker attach and allow launcher main to
> identify and handle it. For this there is a patch v6-0002 available at
> [2].

Does this mean that you want to remove ResetLatch() from
WaitForReplicationWorkerAttach()? If so, what about the following scenario?

1) The launcher is waiting in WaitForReplicationWorkerAttach() for the worker
   to attach, and
2) a subscription is created before the worker attaches.

In this case, the launcher would become unable to sleep because the latch is
set but won't be reset. That may waste CPU time.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/