On Fri, Apr 19, 2024 at 10:53 AM Bertrand Drouvot <bertranddrouvot...@gmail.com> wrote: > > Hi, > > On Thu, Apr 18, 2024 at 05:36:05PM +0530, shveta malik wrote: > > Please find v8 attached. Changes are: > > Thanks! > > A few comments:
Thanks for reviewing. > 1 === > > @@ -1440,7 +1461,7 @@ ReplSlotSyncWorkerMain(char *startup_data, size_t > startup_data_len) > * slotsync_worker_onexit() but that will need the connection to be > made > * global and we want to avoid introducing global for this purpose. > */ > - before_shmem_exit(slotsync_failure_callback, PointerGetDatum(wrconn)); > + before_shmem_exit(slotsync_worker_disconnect, > PointerGetDatum(wrconn)); > > The comment above this change still states "Register the failure callback once > we have the connection", I think it has to be reworded a bit now that v8 is > making use of slotsync_worker_disconnect(). > > 2 === > > + * Register slotsync_worker_onexit() before we register > + * ReplicationSlotShmemExit() in BaseInit(), to ensure that during > exit of > + * slot sync worker, ReplicationSlotShmemExit() is called first, > followed > + * by slotsync_worker_onexit(). Startup process during promotion > waits for > > Worth to mention in shmem_exit() (where it "while (--before_shmem_exit_index > >= 0)" > or before the shmem_exit() definition) that ReplSlotSyncWorkerMain() relies on > this LIFO behavior? (not sure if there is other "strong" LIFO requirement in > other part of the code). I see other modules as well relying on LIFO behavior. Please see applyparallelworker.c where 'before_shmem_exit(pa_shutdown)' is needed to be done after 'before_shmem_exit(logicalrep_worker_onexit)' (commit id 3d144c6). Also in postinit.c, I see such comments atop 'before_shmem_exit(ShutdownPostgres, 0)'. I feel we can skip adding this specific comment about ReplSlotSyncWorkerMain() in ipc.c, as none of the other modules has also not added any. I will address the rest of your comments in the next version. > 3 === > > + * Startup process during promotion waits for slot sync to finish and > it > + * does that by checking the 'syncing' flag. > > worth to mention ShutDownSlotSync()? > > 4 === > > I did a few tests manually (launching ShutDownSlotSync() through gdb / with > and > without sync worker and with / without pg_sync_replication_slots() running > concurrently) and it looks like it works as designed. Thanks for testing it. > Having said that, the logic that is in place to take care of the corner cases > described up-thread seems reasonable to me. thanks Shveta