Hello.

At Thu, 25 Jun 2020 19:27:54 +0200, Jehan-Guillaume de Rorthais 
<j...@dalibo.com> wrote in 
> Here is a summary of my work during the last few days on this demote approach.
> 
> Please, find in attachment v2-0001-Demote-PoC.patch and the comments in the
> commit message and as FIXME in code.
> 
> The patch is not finished or bug-free yet, I'm still not very happy with the
> coding style, and it probably lacks some code documentation, but a lot has
> changed since v1. It's still a PoC to push the discussion a bit further after
> having been silent myself for some days.
> 
> The patch currently relies on a demote checkpoint. I understand that the
> overhead of a forced checkpoint can be massive and cause major wait/downtime,
> but I am keeping this for a later step. Maybe we should be able to cancel a
> running checkpoint? Or leave it to its syncing work but discard the result
> without writing it to XLog?

If we are going to get that close to a server shutdown anyway, we can
simply reuse the restart-after-crash path, which we can assume works
reliably. The attached patch is a quite rough sketch of that idea. For
convenience it hijacks the smart shutdown path, but it seems to work:
"pg_ctl -m s -W stop" makes the server demote.

> I hadn't time to investigate Robert's concern about shared memory for snapshot
> during recovery.

The patch does all the required cleanup of resources, including shared
memory, I believe.  Is that enough, provided we don't need to keep any
resources alive?
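
One caveat about resources in the sketch itself: the standby signal file
is closed by a FreeFile() call placed inside Assert(). Since Assert()
expands to nothing in builds without assertion checking, that close would
be skipped there, so a non-PoC version would want the calls outside the
assertion. A minimal standalone illustration follows. It uses plain
fopen()/fclose() as stand-ins for AllocateFile()/FreeFile(), and
create_signal_file() is a hypothetical helper name, not something in the
patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical helper: create an empty signal file at "path".
 * Returns true on success.  Both the open and the close happen
 * unconditionally, so they are not lost when assertions are disabled.
 */
static bool
create_signal_file(const char *path)
{
	FILE	   *fp;

	assert(path != NULL);		/* precondition only, no side effects */

	fp = fopen(path, "w");
	if (fp == NULL)
		return false;

	if (fclose(fp) != 0)
		return false;

	return true;
}
```

The caller can then decide how to report a failure, instead of relying on
an assertion to both perform and check the close.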

> The patch doesn't deal with prepared xacts yet. Testing
> "start->demote->promote" raises an assertion failure if any prepared xact
> exists. I suppose I will roll them back during demote in the next patch
> version.
> 
> I'm not sure how to divide this patch into multiple small independent steps. I
> suppose I can split it like:
> 
> 1. add demote checkpoint
> 2. support demote: mostly postmaster, startup/xlog and checkpointer related
>    code
> 3. cli using pg_ctl demote
> 
> ...But I'm not sure it's worth it.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index b4d475bb0b..a4adf3e587 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -2752,6 +2752,7 @@ SIGHUP_handler(SIGNAL_ARGS)
 /*
  * pmdie -- signal handler for processing various postmaster signals.
  */
+static bool		demoting = false;
 static void
 pmdie(SIGNAL_ARGS)
 {
@@ -2774,59 +2775,17 @@ pmdie(SIGNAL_ARGS)
 		case SIGTERM:
 
 			/*
-			 * Smart Shutdown:
+			 * XXX: Hijacked as DEMOTE
 			 *
-			 * Wait for children to end their work, then shut down.
+			 * Runs fast shutdown, then restarts as standby
 			 */
 			if (Shutdown >= SmartShutdown)
 				break;
 			Shutdown = SmartShutdown;
 			ereport(LOG,
-					(errmsg("received smart shutdown request")));
-
-			/* Report status */
-			AddToDataDirLockFile(LOCK_FILE_LINE_PM_STATUS, PM_STATUS_STOPPING);
-#ifdef USE_SYSTEMD
-			sd_notify(0, "STOPPING=1");
-#endif
-
-			if (pmState == PM_RUN || pmState == PM_RECOVERY ||
-				pmState == PM_HOT_STANDBY || pmState == PM_STARTUP)
-			{
-				/* autovac workers are told to shut down immediately */
-				/* and bgworkers too; does this need tweaking? */
-				SignalSomeChildren(SIGTERM,
-								   BACKEND_TYPE_AUTOVAC | BACKEND_TYPE_BGWORKER);
-				/* and the autovac launcher too */
-				if (AutoVacPID != 0)
-					signal_child(AutoVacPID, SIGTERM);
-				/* and the bgwriter too */
-				if (BgWriterPID != 0)
-					signal_child(BgWriterPID, SIGTERM);
-				/* and the walwriter too */
-				if (WalWriterPID != 0)
-					signal_child(WalWriterPID, SIGTERM);
-
-				/*
-				 * If we're in recovery, we can't kill the startup process
-				 * right away, because at present doing so does not release
-				 * its locks.  We might want to change this in a future
-				 * release.  For the time being, the PM_WAIT_READONLY state
-				 * indicates that we're waiting for the regular (read only)
-				 * backends to die off; once they do, we'll kill the startup
-				 * and walreceiver processes.
-				 */
-				pmState = (pmState == PM_RUN) ?
-					PM_WAIT_BACKUP : PM_WAIT_READONLY;
-			}
-
-			/*
-			 * Now wait for online backup mode to end and backends to exit. If
-			 * that is already the case, PostmasterStateMachine will take the
-			 * next step.
-			 */
-			PostmasterStateMachine();
-			break;
+					(errmsg("received demote request")));
+			demoting = true;
+			/* FALL THROUGH */
 
 		case SIGINT:
 
@@ -2839,8 +2798,10 @@ pmdie(SIGNAL_ARGS)
 			if (Shutdown >= FastShutdown)
 				break;
 			Shutdown = FastShutdown;
-			ereport(LOG,
-					(errmsg("received fast shutdown request")));
+
+			if (!demoting)
+				ereport(LOG,
+						(errmsg("received fast shutdown request")));
 
 			/* Report status */
 			AddToDataDirLockFile(LOCK_FILE_LINE_PM_STATUS, PM_STATUS_STOPPING);
@@ -2887,6 +2848,13 @@ pmdie(SIGNAL_ARGS)
 				pmState = PM_WAIT_BACKENDS;
 			}
 
+			/* create standby signal file */
+			{
+				FILE *standby_file = AllocateFile(STANDBY_SIGNAL_FILE, "w");
+
+				Assert (standby_file && !FreeFile(standby_file));
+			}
+
 			/*
 			 * Now wait for backends to exit.  If there are none,
 			 * PostmasterStateMachine will take the next step.
@@ -3958,7 +3926,7 @@ PostmasterStateMachine(void)
 	 * EOF on its input pipe, which happens when there are no more upstream
 	 * processes.
 	 */
-	if (Shutdown > NoShutdown && pmState == PM_NO_CHILDREN)
+	if (!demoting && Shutdown > NoShutdown && pmState == PM_NO_CHILDREN)
 	{
 		if (FatalError)
 		{
@@ -3996,13 +3964,23 @@ PostmasterStateMachine(void)
 		ExitPostmaster(1);
 
 	/*
-	 * If we need to recover from a crash, wait for all non-syslogger children
-	 * to exit, then reset shmem and StartupDataBase.
+	 * If we need to recover from a crash or are demoting, wait for all
+	 * non-syslogger children to exit, then reset shmem and StartupDataBase.
 	 */
-	if (FatalError && pmState == PM_NO_CHILDREN)
+	if ((demoting || FatalError) && pmState == PM_NO_CHILDREN)
 	{
-		ereport(LOG,
-				(errmsg("all server processes terminated; reinitializing")));
+		if (demoting)
+			ereport(LOG,
+					(errmsg("all server processes terminated; starting as standby")));
+		else
+			ereport(LOG,
+					(errmsg("all server processes terminated; reinitializing")));
+
+		if (demoting)
+		{
+			Shutdown = NoShutdown;
+			demoting = false;
+		}
 
 		/* allow background workers to immediately restart */
 		ResetBackgroundWorkerCrashTimes();
