Hi, On 2023-11-30 01:36:25 +0200, Heikki Linnakangas wrote: > - patch 1 splits CreateSharedMemoryAndSemaphores into two functions: > CreateSharedMemoryAndSemaphores is now only called at postmaster startup, > and a new function called AttachSharedMemoryStructs() is called in backends > in EXEC_BACKEND mode. I extracted the common parts of those functions to a > new static function. (Some of this refactoring used to be part of the 3rd > patch in the series, but it seems useful on its own, so I split it out.)
I like that idea. > From a96b6e92fdeaa947bf32774c425419b8f987b8e2 Mon Sep 17 00:00:00 2001 > From: Heikki Linnakangas <heikki.linnakan...@iki.fi> > Date: Thu, 30 Nov 2023 00:01:25 +0200 > Subject: [PATCH v3 1/7] Refactor CreateSharedMemoryAndSemaphores > > For clarity, have separate functions for *creating* the shared memory > and semaphores at postmaster or single-user backend startup, and > for *attaching* to existing shared memory structures in EXEC_BACKEND > case. CreateSharedMemoryAndSemaphores() is now called only at > postmaster startup, and a new AttachSharedMemoryStructs() function is > called at backend startup in EXEC_BACKEND mode. I assume CreateSharedMemoryAndSemaphores() is still called during crash restart? I wonder if it shouldn't split three ways: 1) create 2) initialize 3) attach > From 3478cafcf74a5c8d649e0287e6c72669a29c0e70 Mon Sep 17 00:00:00 2001 > From: Heikki Linnakangas <heikki.linnakan...@iki.fi> > Date: Thu, 30 Nov 2023 00:02:03 +0200 > Subject: [PATCH v3 2/7] Pass BackgroundWorker entry in the parameter file in > EXEC_BACKEND mode > > This makes it possible to move InitProcess later in SubPostmasterMain > (in next commit), as we no longer need to access shared memory to read > background worker entry. > static void read_backend_variables(char *id, Port *port); > @@ -4831,7 +4833,7 @@ SubPostmasterMain(int argc, char *argv[]) > strcmp(argv[1], "--forkavlauncher") == 0 || > strcmp(argv[1], "--forkavworker") == 0 || > strcmp(argv[1], "--forkaux") == 0 || > - strncmp(argv[1], "--forkbgworker=", 15) == 0) > + strncmp(argv[1], "--forkbgworker", 14) == 0) > PGSharedMemoryReAttach(); > else > PGSharedMemoryNoReAttach(); > @@ -4962,10 +4964,8 @@ SubPostmasterMain(int argc, char *argv[]) > > AutoVacWorkerMain(argc - 2, argv + 2); /* does not return */ > } > - if (strncmp(argv[1], "--forkbgworker=", 15) == 0) > + if (strncmp(argv[1], "--forkbgworker", 14) == 0) Now that we don't need to look at parameters anymore, these should probably be just a strcmp(), like the other cases? > From 0d071474e12a70ff8113c7b0731c5b97fec45007 Mon Sep 17 00:00:00 2001 > From: Heikki Linnakangas <heikki.linnakan...@iki.fi> > Date: Wed, 29 Nov 2023 23:47:25 +0200 > Subject: [PATCH v3 3/7] Refactor how InitProcess is called > > The order of process initialization steps is now more consistent > between !EXEC_BACKEND and EXEC_BACKEND modes. InitProcess() is called > at the same place in either mode. We can now also move the > AttachSharedMemoryStructs() call into InitProcess() itself. This > reduces the number of "#ifdef EXEC_BACKEND" blocks. Yay. > diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c > index cdfdd6fbe1d..6c708777dde 100644 > --- a/src/backend/storage/lmgr/proc.c > +++ b/src/backend/storage/lmgr/proc.c > @@ -461,6 +461,12 @@ InitProcess(void) > */ > InitLWLockAccess(); > InitDeadLockChecking(); > + > +#ifdef EXEC_BACKEND > + /* Attach process to shared data structures */ > + if (IsUnderPostmaster) > + AttachSharedMemoryStructs(); > +#endif > } > > /* > @@ -614,6 +620,12 @@ InitAuxiliaryProcess(void) > * Arrange to clean up at process exit. > */ > on_shmem_exit(AuxiliaryProcKill, Int32GetDatum(proctype)); > + > +#ifdef EXEC_BACKEND > + /* Attach process to shared data structures */ > + if (IsUnderPostmaster) > + AttachSharedMemoryStructs(); > +#endif > } Aside: Somewhat odd that InitAuxiliaryProcess() doesn't call InitLWLockAccess(). I think a short comment explaining why we can attach to shmem structs after already accessing shared memory earlier in the function would be worthwhile. > From ce51876f87f1e4317e25baf64184749448fcd033 Mon Sep 17 00:00:00 2001 > From: Heikki Linnakangas <heikki.linnakan...@iki.fi> > Date: Thu, 30 Nov 2023 00:07:34 +0200 > Subject: [PATCH v3 7/7] Refactor postmaster child process launching > > - Move code related to launching backend processes to new source file, > launch_backend.c > > - Introduce new postmaster_child_launch() function that deals with the > differences between EXEC_BACKEND and fork mode. > > - Refactor the mechanism of passing information from the parent to > child process. Instead of using different command-line arguments when > launching the child process in EXEC_BACKEND mode, pass a > variable-length blob of data along with all the global variables. The > contents of that blob depend on the kind of child process being > launched. In !EXEC_BACKEND mode, we use the same blob, but it's simply > inherited from the parent to child process. > > [...] > 33 files changed, 1787 insertions(+), 2002 deletions(-) Well, that's not small... I think it may be worth splitting some of the file renaming out into a separate commit, makes it harder to see what changed here. > +AutoVacLauncherMain(char *startup_data, size_t startup_data_len) > { > - pid_t AutoVacPID; > + sigjmp_buf local_sigjmp_buf; > > -#ifdef EXEC_BACKEND > - switch ((AutoVacPID = avlauncher_forkexec())) > -#else > - switch ((AutoVacPID = fork_process())) > -#endif > + /* Release postmaster's working memory context */ > + if (PostmasterContext) > { > - case -1: > - ereport(LOG, > - (errmsg("could not fork autovacuum > launcher process: %m"))); > - return 0; > - > -#ifndef EXEC_BACKEND > - case 0: > - /* in postmaster child ... */ > - InitPostmasterChild(); > - > - /* Close the postmaster's sockets */ > - ClosePostmasterPorts(false); > - > - AutoVacLauncherMain(0, NULL); > - break; > -#endif > - default: > - return (int) AutoVacPID; > + MemoryContextDelete(PostmasterContext); > + PostmasterContext = NULL; > } > > - /* shouldn't get here */ > - return 0; > -} This if (PostmasterContext) ... else "shouldn't get here" business seems pretty silly, more likely to hide problems than to help. > +/* > + * Information needed to launch different kinds of child processes. > + */ > +static const struct > +{ > + const char *name; > + void (*main_fn) (char *startup_data, size_t > startup_data_len); > + bool shmem_attach; > +} entry_kinds[] = { > + [PMC_BACKEND] = {"backend", BackendMain, true}, Personally I'd give the struct an actual name - makes the debugging experience a bit nicer than anonymous structs that you can't even reference by a typedef. > + [PMC_AV_LAUNCHER] = {"autovacuum launcher", AutoVacLauncherMain, true}, > + [PMC_AV_WORKER] = {"autovacuum worker", AutoVacWorkerMain, true}, > + [PMC_BGWORKER] = {"bgworker", BackgroundWorkerMain, true}, > + [PMC_SYSLOGGER] = {"syslogger", SysLoggerMain, false}, > + > + [PMC_STARTUP] = {"startup", StartupProcessMain, true}, > + [PMC_BGWRITER] = {"bgwriter", BackgroundWriterMain, true}, > + [PMC_ARCHIVER] = {"archiver", PgArchiverMain, true}, > + [PMC_CHECKPOINTER] = {"checkpointer", CheckpointerMain, true}, > + [PMC_WAL_WRITER] = {"wal_writer", WalWriterMain, true}, > + [PMC_WAL_RECEIVER] = {"wal_receiver", WalReceiverMain, true}, > +}; It feels like we have too many different ways of documenting the type of a process. This new PMC_ stuff, enum AuxProcType, enum BackendType. Which then leads to code like this: > -CheckpointerMain(void) > +CheckpointerMain(char *startup_data, size_t startup_data_len) > { > sigjmp_buf local_sigjmp_buf; > MemoryContext checkpointer_context; > > + Assert(startup_data_len == 0); > + > + MyAuxProcType = CheckpointerProcess; > + MyBackendType = B_CHECKPOINTER; > + AuxiliaryProcessInit(); > + For each type of child process. That seems a bit too redundant. Can't we unify this at least somewhat? Can't we just reuse BackendType here? Sure, there'd be pointless entry for B_INVALID, but that doesn't seem like a problem, could even be useful, by pointing it to a function raising an error. At the very least this shouldn't deviate from the naming pattern of BackendType. > +/* > + * SubPostmasterMain -- Get the fork/exec'd process into a state equivalent > + * to what it would be if we'd simply forked on Unix, and > then > + * dispatch to the appropriate place. > + * > + * The first two command line arguments are expected to be > "--forkchild=<name>", > + * where <name> indicates which postmaster child we are to become, and > + * the name of a variables file that we can read to load data that would > + * have been inherited by fork() on Unix. > + */ > +void > +SubPostmasterMain(int argc, char *argv[]) > +{ > + PostmasterChildType child_type; > + char *startup_data; > + size_t startup_data_len; > + char *entry_name; > + bool found = false; > + > + /* In EXEC_BACKEND case we will not have inherited these settings */ > + IsPostmasterEnvironment = true; > + whereToSendOutput = DestNone; > + > + /* Setup essential subsystems (to ensure elog() behaves sanely) */ > + InitializeGUCOptions(); > + > + /* Check we got appropriate args */ > + if (argc != 3) > + elog(FATAL, "invalid subpostmaster invocation"); > + > + if (strncmp(argv[1], "--forkchild=", 12) != 0) > + elog(FATAL, "invalid subpostmaster invocation (--forkchild > argument missing)"); > + entry_name = argv[1] + 12; > + found = false; > + for (int idx = 0; idx < lengthof(entry_kinds); idx++) > + { > + if (strcmp(entry_kinds[idx].name, entry_name) == 0) > + { > + child_type = idx; > + found = true; > + break; > + } > + } > + if (!found) > + elog(ERROR, "unknown child kind %s", entry_name); If we then have to search linearly, why don't we just pass the index into the array? > > -#define StartupDataBase() StartChildProcess(StartupProcess) > -#define StartArchiver() > StartChildProcess(ArchiverProcess) > -#define StartBackgroundWriter() StartChildProcess(BgWriterProcess) > -#define StartCheckpointer() StartChildProcess(CheckpointerProcess) > -#define StartWalWriter() StartChildProcess(WalWriterProcess) > -#define StartWalReceiver() StartChildProcess(WalReceiverProcess) > +#define StartupDataBase() StartChildProcess(PMC_STARTUP) > +#define StartArchiver() StartChildProcess(PMC_ARCHIVER) > +#define StartBackgroundWriter() StartChildProcess(PMC_BGWRITER) > +#define StartCheckpointer() StartChildProcess(PMC_CHECKPOINTER) > +#define StartWalWriter() StartChildProcess(PMC_WAL_WRITER) > +#define StartWalReceiver() StartChildProcess(PMC_WAL_RECEIVER) > + > +#define StartAutoVacLauncher() StartChildProcess(PMC_AV_LAUNCHER); > +#define StartAutoVacWorker() StartChildProcess(PMC_AV_WORKER); Obviously not your fault, but these macros are so pointless... Making it harder to find where we start child processes, all to save a a few characters in one place, while adding considerably more in others. > +void > +BackendMain(char *startup_data, size_t startup_data_len) > +{ Is there any remaining reason for this to live in postmaster.c? Given that other backend types don't, that seems oddly assymmetrical. Greetings, Andres Freund