At Thu, 7 Sep 2023 10:53:41 +0000, "Hayato Kuroda (Fujitsu)"
<[email protected]> wrote in
> My first idea is that to move the checking part to above, but this may not
> handle
> the case the postmaster is still alive (now sure this is real issue). Do we
> have to
> add a new indicator which ensures the identity of processes for windows?
> Please tell me how you feel.
It doesn't seem to work as expected. We still lose the relationship
between the PID file and the launched postmaster.
> > Now, I
> > also recall that the processes spawned by pg_ctl on Windows make the
> > status handling rather tricky to reason about..
>
> Did you say about the below comment? Currently I have no idea to make
> codes more proper, sorry.
>
> ```
> * On Windows, we may be checking the postmaster's parent
> shell, but
> * that's fine for this purpose.
> ```
Ditching cmd.exe seems like a big hassle. So, on the flip side, I
tried to identify the postmaster PID using the shell's PID, and it
seem to work. The APIs used are avaiable from XP/2003 onwards.
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
diff --git a/src/bin/pg_ctl/pg_ctl.c b/src/bin/pg_ctl/pg_ctl.c
index 807d7023a9..d54cc6ab54 100644
--- a/src/bin/pg_ctl/pg_ctl.c
+++ b/src/bin/pg_ctl/pg_ctl.c
@@ -32,6 +32,9 @@
#ifdef WIN32 /* on Unix, we don't need libpq */
#include "pqexpbuffer.h"
#endif
+#ifdef WIN32
+#include <tlhelp32.h>
+#endif
typedef enum
@@ -425,6 +428,91 @@ free_readfile(char **optlines)
* start/test/stop routines
*/
+#ifdef WIN32
+/*
+ * Find the PID of the launched postmaster.
+ *
+ * On Windows, the cmd.exe doesn't support the exec command. As a result, we
+ * don't directly get the postmaster's PID. This function identifies the PID of
+ * the postmaster started by the child cmd.exe.
+ *
+ * Returns the postmaster's PID. If the shell is alive but the postmaster is
+ * missing, returns 0. Otherwise terminates this command with an error.
+ */
+DWORD
+win32_find_postmaster_pid(int shell_pid)
+{
+ HANDLE hSnapshot;
+ DWORD flags;
+ DWORD p_pid = shell_pid;
+ PROCESSENTRY32 ppe;
+ DWORD pm_pid = 0;
+ bool shell_exists = false;
+ bool multiple_children = false;
+ DWORD last_error;
+
+ /* Create a process snapshot */
+ hSnapshot = CreateToolhelp32Snapshot(TH32CS_SNAPPROCESS, shell_pid);
+ if (hSnapshot == INVALID_HANDLE_VALUE)
+ {
+ write_stderr(_("%s: CreateToolhelp32Snapshot failed\n"),
+ progname);
+ return;
+ }
+
+ /* start iterating on the snapshot */
+ ppe.dwSize = sizeof(PROCESSENTRY32);
+ if (!Process32First(hSnapshot, &ppe))
+ {
+ write_stderr(_("%s: Process32First failed: errcode=%08x\n"),
+ progname, GetLastError());
+ exit(1);
+ }
+
+ /* iterate over the snapshot */
+ do
+ {
+ if (ppe.th32ProcessID == shell_pid)
+ shell_exists = true;
+ if (ppe.th32ParentProcessID == shell_pid)
+ {
+ if (pm_pid > 0)
+ multiple_children = true;
+ pm_pid = ppe.th32ProcessID;
+ }
+ }
+ while (Process32Next(hSnapshot, &ppe));
+
+ /* avoid multiple calls primary for clarity, not out of necessity */
+ last_error = GetLastError();
+ if (last_error != ERROR_NO_MORE_FILES)
+ {
+ write_stderr(_("%s: Process32Next failed: errcode=%08x\n"),
+ progname, last_error);
+ exit(1);
+ }
+ CloseHandle(hSnapshot);
+
+ /* assuming the launching shell executes a single process */
+ if (multiple_children)
+ {
+ write_stderr(_("%s: lancher shell executed multiple processes\n"),
+ progname);
+ exit(1);
+ }
+
+ /* Check if the process is gone for any reason */
+ if (!shell_exists)
+ {
+ /* Report the condition as server start failure */
+ write_stderr(_("%s: launcher shell died\n"), progname);
+ exit(1);
+ }
+
+ return pm_pid;
+}
+#endif
+
/*
* Start the postmaster and return its PID.
*
@@ -609,7 +697,11 @@ wait_for_postmaster_start(pid_t pm_pid, bool do_checkpoint)
/* File is complete enough for us, parse it */
pid_t pmpid;
time_t pmstart;
-
+#ifndef WIN32
+ pid_t wait_pid = pm_pid;
+#else
+ pid_t wait_pid = win32_find_postmaster_pid(pm_pid);
+#endif
/*
* Make sanity checks. If it's for the wrong PID, or the recorded
* start time is before pg_ctl started, then either we are looking
@@ -619,21 +711,15 @@ wait_for_postmaster_start(pid_t pm_pid, bool do_checkpoint)
*/
pmpid = atol(optlines[LOCK_FILE_LINE_PID - 1]);
pmstart = atol(optlines[LOCK_FILE_LINE_START_TIME - 1]);
- if (pmstart >= start_time - 2 &&
-#ifndef WIN32
- pmpid == pm_pid
-#else
- /* Windows can only reject standalone-backend PIDs */
- pmpid > 0
-#endif
- )
+
+ if (pmstart >= start_time - 2 && pmpid == wait_pid)
{
/*
* OK, seems to be a valid pidfile from our child. Check the
* status line (this assumes a v10 or later server).
*/
char *pmstatus = optlines[LOCK_FILE_LINE_PM_STATUS - 1];
-
+
if (strcmp(pmstatus, PM_STATUS_READY) == 0 ||
strcmp(pmstatus, PM_STATUS_STANDBY) == 0)
{