Hi,

On 2023-11-30 01:36:25 +0200, Heikki Linnakangas wrote:
> - patch 1 splits CreateSharedMemoryAndSemaphores into two functions:
> CreateSharedMemoryAndSemaphores is now only called at postmaster startup,
> and a new function called AttachSharedMemoryStructs() is called in backends
> in EXEC_BACKEND mode. I extracted the common parts of those functions to a
> new static function. (Some of this refactoring used to be part of the 3rd
> patch in the series, but it seems useful on its own, so I split it out.)

I like that idea.



> From a96b6e92fdeaa947bf32774c425419b8f987b8e2 Mon Sep 17 00:00:00 2001
> From: Heikki Linnakangas <heikki.linnakan...@iki.fi>
> Date: Thu, 30 Nov 2023 00:01:25 +0200
> Subject: [PATCH v3 1/7] Refactor CreateSharedMemoryAndSemaphores
>
> For clarity, have separate functions for *creating* the shared memory
> and semaphores at postmaster or single-user backend startup, and
> for *attaching* to existing shared memory structures in EXEC_BACKEND
> case. CreateSharedMemoryAndSemaphores() is now called only at
> postmaster startup, and a new AttachSharedMemoryStructs() function is
> called at backend startup in EXEC_BACKEND mode.

I assume CreateSharedMemoryAndSemaphores() is still called during crash
restart?  I wonder if it shouldn't split three ways:
1) create
2) initialize
3) attach


> From 3478cafcf74a5c8d649e0287e6c72669a29c0e70 Mon Sep 17 00:00:00 2001
> From: Heikki Linnakangas <heikki.linnakan...@iki.fi>
> Date: Thu, 30 Nov 2023 00:02:03 +0200
> Subject: [PATCH v3 2/7] Pass BackgroundWorker entry in the parameter file in
>  EXEC_BACKEND mode
>
> This makes it possible to move InitProcess later in SubPostmasterMain
> (in next commit), as we no longer need to access shared memory to read
> background worker entry.

>  static void read_backend_variables(char *id, Port *port);
> @@ -4831,7 +4833,7 @@ SubPostmasterMain(int argc, char *argv[])
>               strcmp(argv[1], "--forkavlauncher") == 0 ||
>               strcmp(argv[1], "--forkavworker") == 0 ||
>               strcmp(argv[1], "--forkaux") == 0 ||
> -             strncmp(argv[1], "--forkbgworker=", 15) == 0)
> +             strncmp(argv[1], "--forkbgworker", 14) == 0)
>               PGSharedMemoryReAttach();
>       else
>               PGSharedMemoryNoReAttach();
> @@ -4962,10 +4964,8 @@ SubPostmasterMain(int argc, char *argv[])
>
>               AutoVacWorkerMain(argc - 2, argv + 2);  /* does not return */
>       }
> -     if (strncmp(argv[1], "--forkbgworker=", 15) == 0)
> +     if (strncmp(argv[1], "--forkbgworker", 14) == 0)


Now that we don't need to look at parameters anymore, these should probably be
just a strcmp(), like the other cases?


> From 0d071474e12a70ff8113c7b0731c5b97fec45007 Mon Sep 17 00:00:00 2001
> From: Heikki Linnakangas <heikki.linnakan...@iki.fi>
> Date: Wed, 29 Nov 2023 23:47:25 +0200
> Subject: [PATCH v3 3/7] Refactor how InitProcess is called
>
> The order of process initialization steps is now more consistent
> between !EXEC_BACKEND and EXEC_BACKEND modes. InitProcess() is called
> at the same place in either mode. We can now also move the
> AttachSharedMemoryStructs() call into InitProcess() itself. This
> reduces the number of "#ifdef EXEC_BACKEND" blocks.

Yay.


> diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c
> index cdfdd6fbe1d..6c708777dde 100644
> --- a/src/backend/storage/lmgr/proc.c
> +++ b/src/backend/storage/lmgr/proc.c
> @@ -461,6 +461,12 @@ InitProcess(void)
>        */
>       InitLWLockAccess();
>       InitDeadLockChecking();
> +
> +#ifdef EXEC_BACKEND
> +     /* Attach process to shared data structures */
> +     if (IsUnderPostmaster)
> +             AttachSharedMemoryStructs();
> +#endif
>  }
>
>  /*
> @@ -614,6 +620,12 @@ InitAuxiliaryProcess(void)
>        * Arrange to clean up at process exit.
>        */
>       on_shmem_exit(AuxiliaryProcKill, Int32GetDatum(proctype));
> +
> +#ifdef EXEC_BACKEND
> +     /* Attach process to shared data structures */
> +     if (IsUnderPostmaster)
> +             AttachSharedMemoryStructs();
> +#endif
>  }

Aside: Somewhat odd that InitAuxiliaryProcess() doesn't call
InitLWLockAccess().


I think a short comment explaining why we can attach to shmem structs after
already accessing shared memory earlier in the function would be worthwhile.


> From ce51876f87f1e4317e25baf64184749448fcd033 Mon Sep 17 00:00:00 2001
> From: Heikki Linnakangas <heikki.linnakan...@iki.fi>
> Date: Thu, 30 Nov 2023 00:07:34 +0200
> Subject: [PATCH v3 7/7] Refactor postmaster child process launching
>
> - Move code related to launching backend processes to new source file,
>   launch_backend.c
>
> - Introduce new postmaster_child_launch() function that deals with the
>   differences between EXEC_BACKEND and fork mode.
>
> - Refactor the mechanism of passing information from the parent to
>   child process. Instead of using different command-line arguments when
>   launching the child process in EXEC_BACKEND mode, pass a
>   variable-length blob of data along with all the global variables. The
>   contents of that blob depend on the kind of child process being
>   launched. In !EXEC_BACKEND mode, we use the same blob, but it's simply
>   inherited from the parent to child process.
>
>  [...]
>  33 files changed, 1787 insertions(+), 2002 deletions(-)

Well, that's not small...

I think it may be worth splitting some of the file renaming out into a
separate commit, makes it harder to see what changed here.


> +AutoVacLauncherMain(char *startup_data, size_t startup_data_len)
>  {
> -     pid_t           AutoVacPID;
> +     sigjmp_buf      local_sigjmp_buf;
>
> -#ifdef EXEC_BACKEND
> -     switch ((AutoVacPID = avlauncher_forkexec()))
> -#else
> -     switch ((AutoVacPID = fork_process()))
> -#endif
> +     /* Release postmaster's working memory context */
> +     if (PostmasterContext)
>       {
> -             case -1:
> -                     ereport(LOG,
> -                                     (errmsg("could not fork autovacuum 
> launcher process: %m")));
> -                     return 0;
> -
> -#ifndef EXEC_BACKEND
> -             case 0:
> -                     /* in postmaster child ... */
> -                     InitPostmasterChild();
> -
> -                     /* Close the postmaster's sockets */
> -                     ClosePostmasterPorts(false);
> -
> -                     AutoVacLauncherMain(0, NULL);
> -                     break;
> -#endif
> -             default:
> -                     return (int) AutoVacPID;
> +             MemoryContextDelete(PostmasterContext);
> +             PostmasterContext = NULL;
>       }
>
> -     /* shouldn't get here */
> -     return 0;
> -}

This if (PostmasterContext) ... else "shouldn't get here" business seems
pretty silly, more likely to hide problems than to help.


> +/*
> + * Information needed to launch different kinds of child processes.
> + */
> +static const struct
> +{
> +     const char *name;
> +     void            (*main_fn) (char *startup_data, size_t 
> startup_data_len);
> +     bool            shmem_attach;
> +}                    entry_kinds[] = {
> +     [PMC_BACKEND] = {"backend", BackendMain, true},

Personally I'd give the struct an actual name - makes the debugging experience
a bit nicer than anonymous structs that you can't even reference by a typedef.


> +     [PMC_AV_LAUNCHER] = {"autovacuum launcher", AutoVacLauncherMain, true},
> +     [PMC_AV_WORKER] = {"autovacuum worker", AutoVacWorkerMain, true},
> +     [PMC_BGWORKER] = {"bgworker", BackgroundWorkerMain, true},
> +     [PMC_SYSLOGGER] = {"syslogger", SysLoggerMain, false},
> +
> +     [PMC_STARTUP] = {"startup", StartupProcessMain, true},
> +     [PMC_BGWRITER] = {"bgwriter", BackgroundWriterMain, true},
> +     [PMC_ARCHIVER] = {"archiver", PgArchiverMain, true},
> +     [PMC_CHECKPOINTER] = {"checkpointer", CheckpointerMain, true},
> +     [PMC_WAL_WRITER] = {"wal_writer", WalWriterMain, true},
> +     [PMC_WAL_RECEIVER] = {"wal_receiver", WalReceiverMain, true},
> +};


It feels like we have too many different ways of documenting the type of a
process. This new PMC_ stuff, enum AuxProcType, enum BackendType. Which then
leads to code like this:


> -CheckpointerMain(void)
> +CheckpointerMain(char *startup_data, size_t startup_data_len)
>  {
>       sigjmp_buf      local_sigjmp_buf;
>       MemoryContext checkpointer_context;
>
> +     Assert(startup_data_len == 0);
> +
> +     MyAuxProcType = CheckpointerProcess;
> +     MyBackendType = B_CHECKPOINTER;
> +     AuxiliaryProcessInit();
> +

For each type of child process. That seems a bit too redundant.  Can't we
unify this at least somewhat? Can't we just reuse BackendType here? Sure,
there'd be pointless entry for B_INVALID, but that doesn't seem like a
problem, could even be useful, by pointing it to a function raising an error.

At the very least this shouldn't deviate from the naming pattern of
BackendType.


> +/*
> + * SubPostmasterMain -- Get the fork/exec'd process into a state equivalent
> + *                   to what it would be if we'd simply forked on Unix, and 
> then
> + *                   dispatch to the appropriate place.
> + *
> + * The first two command line arguments are expected to be 
> "--forkchild=<name>",
> + * where <name> indicates which postmaster child we are to become, and
> + * the name of a variables file that we can read to load data that would
> + * have been inherited by fork() on Unix.
> + */
> +void
> +SubPostmasterMain(int argc, char *argv[])
> +{
> +     PostmasterChildType child_type;
> +     char       *startup_data;
> +     size_t          startup_data_len;
> +     char       *entry_name;
> +     bool            found = false;
> +
> +     /* In EXEC_BACKEND case we will not have inherited these settings */
> +     IsPostmasterEnvironment = true;
> +     whereToSendOutput = DestNone;
> +
> +     /* Setup essential subsystems (to ensure elog() behaves sanely) */
> +     InitializeGUCOptions();
> +
> +     /* Check we got appropriate args */
> +     if (argc != 3)
> +             elog(FATAL, "invalid subpostmaster invocation");
> +
> +     if (strncmp(argv[1], "--forkchild=", 12) != 0)
> +             elog(FATAL, "invalid subpostmaster invocation (--forkchild 
> argument missing)");
> +     entry_name = argv[1] + 12;
> +     found = false;
> +     for (int idx = 0; idx < lengthof(entry_kinds); idx++)
> +     {
> +             if (strcmp(entry_kinds[idx].name, entry_name) == 0)
> +             {
> +                     child_type = idx;
> +                     found = true;
> +                     break;
> +             }
> +     }
> +     if (!found)
> +             elog(ERROR, "unknown child kind %s", entry_name);

If we then have to search linearly, why don't we just pass the index into the
array?

>
> -#define StartupDataBase()            StartChildProcess(StartupProcess)
> -#define StartArchiver()                      
> StartChildProcess(ArchiverProcess)
> -#define StartBackgroundWriter() StartChildProcess(BgWriterProcess)
> -#define StartCheckpointer()          StartChildProcess(CheckpointerProcess)
> -#define StartWalWriter()             StartChildProcess(WalWriterProcess)
> -#define StartWalReceiver()           StartChildProcess(WalReceiverProcess)
> +#define StartupDataBase()            StartChildProcess(PMC_STARTUP)
> +#define StartArchiver()                      StartChildProcess(PMC_ARCHIVER)
> +#define StartBackgroundWriter() StartChildProcess(PMC_BGWRITER)
> +#define StartCheckpointer()          StartChildProcess(PMC_CHECKPOINTER)
> +#define StartWalWriter()             StartChildProcess(PMC_WAL_WRITER)
> +#define StartWalReceiver()           StartChildProcess(PMC_WAL_RECEIVER)
> +
> +#define StartAutoVacLauncher()       StartChildProcess(PMC_AV_LAUNCHER);
> +#define StartAutoVacWorker() StartChildProcess(PMC_AV_WORKER);

Obviously not your fault, but these macros are so pointless... Making it
harder to find where we start child processes, all to save a a few characters
in one place, while adding considerably more in others.


> +void
> +BackendMain(char *startup_data, size_t startup_data_len)
> +{

Is there any remaining reason for this to live in postmaster.c? Given that
other backend types don't, that seems oddly assymmetrical.

Greetings,

Andres Freund


Reply via email to