Here are some review comments for patch v63-0003. ====== Commit Message
1. This patch attempts to start slot-sync worker as a special process which is neither a bgworker nor an Auxiliary process. The benefit we get here is we can control the start-conditions of the worker which further allows us to 'enable_syncslot' as PGC_SIGHUP which was otherwise a PGC_POSTMASTER GUC when slotsync worker was registered as bgworker. ~ missing word? /allows us to/allows us to define/ ====== src/backend/postmaster/postmaster.c 2. process_pm_child_exit + /* + * Was it the slot sync worker? Normal exit or FATAL exit (FATAL can + * be caused by libpqwalreceiver on receiving shutdown request by the + * startup process during promotion) can be ignored; we'll start a new + * one at the next iteration of the postmaster's main loop, if + * necessary. Any other exit condition is treated as a crash. + */ + if (pid == SlotSyncWorkerPID) + { + SlotSyncWorkerPID = 0; + if (!EXIT_STATUS_0(exitstatus) && !EXIT_STATUS_1(exitstatus)) + HandleChildCrash(pid, exitstatus, + _("Slotsync worker process")); + continue; + } 2a. I think the 2nd sentence is easier to read if written like: Normal exit or FATAL exit can be ignored (FATAL can be caused by libpqwalreceiver on receiving shutdown request by the startup process during promotion); ~ 2b. All other names nearby are lowercase so maybe change "Slotsync worker process" to ""slotsync worker process" or ""slot sync worker process". ====== src/backend/replication/logical/slotsync.c 3. check_primary_info if (!valid) - ereport(ERROR, + { + *primary_slot_invalid = true; + ereport(LOG, errcode(ERRCODE_INVALID_PARAMETER_VALUE), - errmsg("exiting from slot synchronization due to bad configuration"), + errmsg("skipping slot synchronization due to bad configuration"), /* translator: second %s is a GUC variable name */ errdetail("The primary server slot \"%s\" specified by %s is not valid.", PrimarySlotName, "primary_slot_name")); + } Somehow it seems more appropriate for the *caller* to decide what to do (e.g. "skipping...") when the primary slot is invalid. See also the next review comment #4b -- maybe just change this LOG to say "bad configuration for slot synchronization". ~~~ 4. /* * Check that all necessary GUCs for slot synchronization are set - * appropriately. If not, raise an ERROR. + * appropriately. If not, log the message and pass 'valid' as false + * to the caller. * * If all checks pass, extracts the dbname from the primary_conninfo GUC and * returns it. */ static char * -validate_parameters_and_get_dbname(void) +validate_parameters_and_get_dbname(bool *valid) 4a. This feels back-to-front. I think a "validate" function should return boolean. It can return the dbname as a side-effect only when it is valid. SUGGESTION static boolean validate_parameters_and_get_dbname(char *dbname) ~ 4b. It was a bit different when there were ERRORs but now they are LOGs somehow it seems wrong for this function to say what the *caller* will do. Maybe you can rewrite all the errmsg so the don't say "skipping" but they just say "bad configuration for slot synchronization" If valid is false then you can LOG "skipping" at the caller... ~~~ 5. wait_for_valid_params_and_get_dbname + dbname = validate_parameters_and_get_dbname(&valid); + if (valid) + break; + else This code will be simpler when the function is change to return boolean as suggested above in #4a. Also the 'else' is unnecessary. SUGGESTION if (validate_parameters_and_get_dbname(&dbname) break; ~ 6. + if (rc & WL_LATCH_SET) + ResetLatch(MyLatch); + + } + } Unnecessary blank line. ~~~ 7. slotsync_reread_config + if (old_enable_syncslot != enable_syncslot) + { + /* + * We have reached here, so old value must be true and new must be + * false. + */ + Assert(old_enable_syncslot); + Assert(!enable_syncslot); I felt it would be better just to say Assert(enable_syncslot); at the top of this function (before the ProcessConfigFile). Then none of this other comment/assert if really needed because it should be self-evident. ~~~ 8. StartSlotSyncWorker int StartSlotSyncWorker(void) { pid_t pid; #ifdef EXEC_BACKEND switch ((pid = slotsyncworker_forkexec())) #else switch ((pid = fork_process())) #endif { case -1: ereport(LOG, (errmsg("could not fork slot sync worker process: %m"))); return 0; #ifndef EXEC_BACKEND case 0: /* in postmaster child ... */ InitPostmasterChild(); /* Close the postmaster's sockets */ ClosePostmasterPorts(false); ReplSlotSyncWorkerMain(0, NULL); break; #endif default: return (int) pid; } /* shouldn't get here */ return 0; } The switch code can be rearranged so you don't need the #ifndef SUGGESTION #ifdef EXEC_BACKEND switch ((pid = slotsyncworker_forkexec())) { #else switch ((pid = fork_process())) { case 0: /* in postmaster child ... */ InitPostmasterChild(); /* Close the postmaster's sockets */ ClosePostmasterPorts(false); ReplSlotSyncWorkerMain(0, NULL); break; #endif case -1: ereport(LOG, (errmsg("could not fork slot sync worker process: %m"))); return 0; default: return (int) pid; } ====== src/backend/storage/lmgr/proc.c 9. InitProcess * this; it probably should.) + * + * Slot sync worker does not participate in it, see comments atop Backend. */ - if (IsUnderPostmaster && !IsAutoVacuumLauncherProcess()) + if (IsUnderPostmaster && !IsAutoVacuumLauncherProcess() && + !IsLogicalSlotSyncWorker()) MarkPostmasterChildActive(); 9a. /does not participate in it/also does not participate in it/ ~ 9b. It's not clear where "atop Backend" is referring to. ~~~ 10. * way, so tell the postmaster we've cleaned up acceptably well. (XXX * autovac launcher should be included here someday) */ - if (IsUnderPostmaster && !IsAutoVacuumLauncherProcess()) + if (IsUnderPostmaster && !IsAutoVacuumLauncherProcess() && + !IsLogicalSlotSyncWorker()) MarkPostmasterChildInactive(); Should this comment also be updated to mention slot sync worker? ====== src/backend/utils/activity/pgstat_io.c 11. pgstat_tracks_io_bktype case B_WAL_SENDER: + case B_SLOTSYNC_WORKER: return true; } Notice all the other enums were arrange in alphabetical order, so do the same here. ====== src/backend/utils/init/miscinit.c 12. GetBackendTypeDesc + case B_SLOTSYNC_WORKER: + backendDesc = "slotsyncworker"; + break; } All the other case are in alphabetical order, same as the enum values, so do the same here. ~~~ 13. InitializeSessionUserIdStandalone * This function should only be called in single-user mode, in autovacuum * workers, and in background workers. */ - Assert(!IsUnderPostmaster || IsAutoVacuumWorkerProcess() || IsBackgroundWorker); + Assert(!IsUnderPostmaster || IsAutoVacuumWorkerProcess() || + IsLogicalSlotSyncWorker() || IsBackgroundWorker); Looks like this Assert has a stale comment that should be updated. ====== src/include/miscadmin.h 14. GetBackendTypeDesc B_WAL_SUMMARIZER, + B_SLOTSYNC_WORKER, B_WAL_WRITER, } BackendType; It seems strange to jam this new value among the other B_WAL enums. Anyway, it looks like everything else is in alphabetical order, so we do that too. ====== Kind Regards, Peter Smith. Fujitsu Australia