Re: [HACKERS] Re: [COMMITTERS] pgsql: Start background writer during archive recovery.

Heikki Linnakangas Thu, 19 Feb 2009 12:30:41 -0800

Tom Lane wrote:

Heikki Linnakangas <heikki.linnakan...@enterprisedb.com> writes:
Tom Lane wrote:
Maybe the postmaster should wait for startup process exit before
deciding to open for business, instead of just a signal.
Not a bad idea. Although, there's nothing wrong with the current wayeither. The startup process does a proc_exit(0) right after sending thesignal ATM, so there's no real work left at that point.
The thing wrong with it is assuming that nothing interesting will happen
during proc_exit().  We hang enough stuff on on_proc_exit hooks that
that seems like a pretty shaky assumption.

I can't get too worried, given that proc_exit() is a very well-beatencode path. Admittedly not so much for an auxiliary process, but that'sjust a dumbed down version of what happens with a full-blown backend.

However I started looking into that idea anyway, and figured that itdoes simplify the logic in postmaster.c quite a bit, so I think it'sworth doing on those grounds alone. Attached is a patch against CVS HEADand also against a snapshot before the recovery infra patch, for easierreading. I'll give that some more testing and commit if I find no issues.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

*** a/src/backend/access/transam/xlog.c
--- b/src/backend/access/transam/xlog.c
***************
*** 7695,7701 **** StartupProcessMain(void)
  	BuildFlatFiles(false);
  
  	/* Let postmaster know that startup is finished */
! 	SendPostmasterSignal(PMSIGNAL_RECOVERY_COMPLETED);
  
  	/* exit normally */
  	proc_exit(0);
--- 7695,7701 ----
  	BuildFlatFiles(false);
  
  	/* Let postmaster know that startup is finished */
! 	SetPostmasterSignal(PMSIGNAL_RECOVERY_COMPLETED);
  
  	/* exit normally */
  	proc_exit(0);
*** a/src/backend/postmaster/postmaster.c
--- b/src/backend/postmaster/postmaster.c
***************
*** 227,240 **** static int	Shutdown = NoShutdown;
  static bool FatalError = false; /* T if recovering from backend crash */
  static bool RecoveryError = false; /* T if recovery failed */
  
- /* State of WAL redo */
- #define			NoRecovery			0
- #define			RecoveryStarted		1
- #define			RecoveryConsistent	2
- #define			RecoveryCompleted	3
- 
- static int	RecoveryStatus = NoRecovery;
- 
  /*
   * We use a simple state machine to control startup, shutdown, and
   * crash recovery (which is rather like shutdown followed by startup).
--- 227,232 ----
***************
*** 253,261 **** static int	RecoveryStatus = NoRecovery;
   * point, if we had the infrastructure to do that.
   *
   * When the WAL redo is finished, the startup process signals us the third
!  * time, and we switch to PM_RUN state. The startup process can also skip the
!  * recovery and consistent recovery phases altogether, as it will during
!  * normal startup when there's no recovery to be done, for example.
   *
   * Normal child backends can only be launched when we are in PM_RUN state.
   * (We also allow it in PM_WAIT_BACKUP state, but only for superusers.)
--- 245,256 ----
   * point, if we had the infrastructure to do that.
   *
   * When the WAL redo is finished, the startup process signals us the third
!  * time, and exits. We don't process the 3d signal immediately but when we
!  * see the that the startup process has exited, we check that we have
!  * received the signal. If everything is OK, we then switch to PM_RUN state.
!  * The startup process can also skip the recovery and consistent recovery
!  * phases altogether, as it will during normal startup when there's no
!  * recovery to be done, for example.
   *
   * Normal child backends can only be launched when we are in PM_RUN state.
   * (We also allow it in PM_WAIT_BACKUP state, but only for superusers.)
***************
*** 338,344 **** static void pmdie(SIGNAL_ARGS);
  static void reaper(SIGNAL_ARGS);
  static void sigusr1_handler(SIGNAL_ARGS);
  static void dummy_handler(SIGNAL_ARGS);
- static void CheckRecoverySignals(void);
  static void CleanupBackend(int pid, int exitstatus);
  static void HandleChildCrash(int pid, int exitstatus, const char *procname);
  static void LogChildExit(int lev, const char *procname,
--- 333,338 ----
***************
*** 2019,2025 **** pmdie(SIGNAL_ARGS)
  			ereport(LOG,
  					(errmsg("received smart shutdown request")));
  
! 			if (pmState == PM_RUN || pmState == PM_RECOVERY || pmState == PM_RECOVERY_CONSISTENT)
  			{
  				/* autovacuum workers are told to shut down immediately */
  				SignalAutovacWorkers(SIGTERM);
--- 2013,2020 ----
  			ereport(LOG,
  					(errmsg("received smart shutdown request")));
  
! 			if (pmState == PM_RUN || pmState == PM_RECOVERY ||
! 				pmState == PM_RECOVERY_CONSISTENT)
  			{
  				/* autovacuum workers are told to shut down immediately */
  				SignalAutovacWorkers(SIGTERM);
***************
*** 2159,2181 **** reaper(SIGNAL_ARGS)
  		 */
  		if (pid == StartupPID)
  		{
  			StartupPID = 0;
  
  			/*
! 			 * Check if we've received a signal from the startup process
! 			 * first. This can change pmState. If the startup process sends
! 			 * a signal and exits immediately after that, we might not have
! 			 * processed the signal yet. We need to know if it completed
! 			 * recovery before it exited.
  			 */
! 			CheckRecoverySignals();
  
  			/*
  			 * Unexpected exit of startup process (including FATAL exit)
  			 * during PM_STARTUP is treated as catastrophic. There is no
! 			 * other processes running yet.
  			 */
! 			if (pmState == PM_STARTUP)
  			{
  				LogChildExit(LOG, _("startup process"),
  							 pid, exitstatus);
--- 2154,2177 ----
  		 */
  		if (pid == StartupPID)
  		{
+ 			bool recoveryCompleted;
+ 
  			StartupPID = 0;
  
  			/*
! 			 * Check if the startup process completed recovery before exiting
  			 */
! 			if (CheckPostmasterSignal(PMSIGNAL_RECOVERY_COMPLETED))
! 				recoveryCompleted = true;
! 			else
! 				recoveryCompleted = false;
  
  			/*
  			 * Unexpected exit of startup process (including FATAL exit)
  			 * during PM_STARTUP is treated as catastrophic. There is no
! 			 * other processes running yet, so we can just exit.
  			 */
! 			if (pmState == PM_STARTUP && !recoveryCompleted)
  			{
  				LogChildExit(LOG, _("startup process"),
  							 pid, exitstatus);
***************
*** 2196,2212 **** reaper(SIGNAL_ARGS)
  				continue;
  			}
  			/*
  			 * Startup process exited normally, but didn't finish recovery.
  			 * This can happen if someone else than postmaster kills the
  			 * startup process with SIGTERM. Treat it like a crash.
  			 */
! 			if (pmState == PM_RECOVERY || pmState == PM_RECOVERY_CONSISTENT)
  			{
  				RecoveryError = true;
  				HandleChildCrash(pid, exitstatus,
  								 _("startup process"));
  				continue;
  			}
  		}
  
  		/*
--- 2192,2255 ----
  				continue;
  			}
  			/*
+ 			 * Startup process exited in response to a shutdown request (or
+ 			 * it finished normally regardless of the shutdown request).
+ 			 */
+ 			if (Shutdown > NoShutdown)
+ 			{
+ 				pmState = PM_WAIT_BACKENDS;
+ 				/* PostmasterStateMachine logic does the rest */
+ 				continue;
+ 			}
+ 			/*
  			 * Startup process exited normally, but didn't finish recovery.
  			 * This can happen if someone else than postmaster kills the
  			 * startup process with SIGTERM. Treat it like a crash.
  			 */
! 			if (!recoveryCompleted)
  			{
  				RecoveryError = true;
  				HandleChildCrash(pid, exitstatus,
  								 _("startup process"));
  				continue;
  			}
+ 
+ 			/*
+ 			 * Startup succeeded, commence normal operations
+ 			 */
+ 			pmState = PM_RUN;
+ 
+ 			/*
+ 			 * Load the flat authorization file into postmaster's cache. The
+ 			 * startup process has recomputed this from the database contents,
+ 			 * so we wait till it finishes before loading it.
+ 			 */
+ 			load_role();
+ 
+ 			/*
+ 			 * Crank up the background writer, if we didn't do that already
+ 			 * when we entered consistent recovery phase.  It doesn't matter
+ 			 * if this fails, we'll just try again later.
+ 			 */
+ 			if (BgWriterPID == 0)
+ 				BgWriterPID = StartBackgroundWriter();
+ 
+ 			/*
+ 			 * Likewise, start other special children as needed.  In a restart
+ 			 * situation, some of them may be alive already.
+ 			 */
+ 			if (WalWriterPID == 0)
+ 				WalWriterPID = StartWalWriter();
+ 			if (AutoVacuumingActive() && AutoVacPID == 0)
+ 				AutoVacPID = StartAutoVacLauncher();
+ 			if (XLogArchivingActive() && PgArchPID == 0)
+ 				PgArchPID = pgarch_start();
+ 			if (PgStatPID == 0)
+ 				PgStatPID = pgstat_start();
+ 
+ 			/* at this point we are really open for business */
+ 			ereport(LOG,
+ 				(errmsg("database system is ready to accept connections")));
  		}
  
  		/*
***************
*** 2622,2748 **** LogChildExit(int lev, const char *procname, int pid, int exitstatus)
  static void
  PostmasterStateMachine(void)
  {
- 	/* Startup states */
- 
- 	if (pmState == PM_STARTUP && RecoveryStatus > NoRecovery)
- 	{
- 		/* WAL redo has started. We're out of reinitialization. */
- 		FatalError = false;
- 
- 		/*
- 		 * Go to shutdown mode if a shutdown request was pending.
- 		 */
- 		if (Shutdown > NoShutdown)
- 		{
- 			pmState = PM_WAIT_BACKENDS;
- 			/* PostmasterStateMachine logic does the rest */
- 		}
- 		else
- 		{
- 			/*
- 			 * Crank up the background writer.	It doesn't matter if this
- 			 * fails, we'll just try again later.
- 			 */
- 			Assert(BgWriterPID == 0);
- 			BgWriterPID = StartBackgroundWriter();
- 
- 			pmState = PM_RECOVERY;
- 		}
- 	}
- 	if (pmState == PM_RECOVERY && RecoveryStatus >= RecoveryConsistent)
- 	{
- 		/*
- 		 * Go to shutdown mode if a shutdown request was pending.
- 		 */
- 		if (Shutdown > NoShutdown)
- 		{
- 			pmState = PM_WAIT_BACKENDS;
- 			/* PostmasterStateMachine logic does the rest */
- 		}
- 		else
- 		{
- 			/*
- 			 * Startup process has entered recovery. We consider that good
- 			 * enough to reset FatalError.
- 			 */
- 			pmState = PM_RECOVERY_CONSISTENT;
- 
- 			/*
- 			 * Load the flat authorization file into postmaster's cache. The
- 			 * startup process won't have recomputed this from the database yet,
- 			 * so we it may change following recovery. 
- 			 */
- 			load_role();
- 
- 			/*
- 			 * Likewise, start other special children as needed.
- 			 */
- 			Assert(PgStatPID == 0);
- 			PgStatPID = pgstat_start();
- 
- 			/* XXX at this point we could accept read-only connections */
- 			ereport(DEBUG1,
- 				 (errmsg("database system is in consistent recovery mode")));
- 		}
- 	}
- 	if ((pmState == PM_RECOVERY || 
- 		 pmState == PM_RECOVERY_CONSISTENT ||
- 		 pmState == PM_STARTUP) &&
- 		RecoveryStatus == RecoveryCompleted)
- 	{
- 		/*
- 		 * Startup succeeded.
- 		 *
- 		 * Go to shutdown mode if a shutdown request was pending.
- 		 */
- 		if (Shutdown > NoShutdown)
- 		{
- 			pmState = PM_WAIT_BACKENDS;
- 			/* PostmasterStateMachine logic does the rest */
- 		}
- 		else
- 		{
- 			/*
- 			 * Otherwise, commence normal operations.
- 			 */
- 			pmState = PM_RUN;
- 
- 			/*
- 			 * Load the flat authorization file into postmaster's cache. The
- 			 * startup process has recomputed this from the database contents,
- 			 * so we wait till it finishes before loading it.
- 			 */
- 			load_role();
- 
- 			/*
- 			 * Crank up the background writer, if we didn't do that already
- 			 * when we entered consistent recovery phase.  It doesn't matter
- 			 * if this fails, we'll just try again later.
- 			 */
- 			if (BgWriterPID == 0)
- 				BgWriterPID = StartBackgroundWriter();
- 
- 			/*
- 			 * Likewise, start other special children as needed.  In a restart
- 			 * situation, some of them may be alive already.
- 			 */
- 			if (WalWriterPID == 0)
- 				WalWriterPID = StartWalWriter();
- 			if (AutoVacuumingActive() && AutoVacPID == 0)
- 				AutoVacPID = StartAutoVacLauncher();
- 			if (XLogArchivingActive() && PgArchPID == 0)
- 				PgArchPID = pgarch_start();
- 			if (PgStatPID == 0)
- 				PgStatPID = pgstat_start();
- 
- 			/* at this point we are really open for business */
- 			ereport(LOG,
- 				(errmsg("database system is ready to accept connections")));
- 		}
- 	}
- 
- 	/* Shutdown states */
- 
  	if (pmState == PM_WAIT_BACKUP)
  	{
  		/*
--- 2665,2670 ----
***************
*** 2904,2911 **** PostmasterStateMachine(void)
  		shmem_exit(1);
  		reset_shared(PostPortNumber);
  
- 		RecoveryStatus = NoRecovery;
- 
  		StartupPID = StartupDataBase();
  		Assert(StartupPID != 0);
  		pmState = PM_STARTUP;
--- 2826,2831 ----
***************
*** 4010,4056 **** ExitPostmaster(int status)
  }
  
  /*
!  * common code used in sigusr1_handler() and reaper() to handle
!  * recovery-related signals from startup process
   */
  static void
! CheckRecoverySignals(void)
  {
! 	bool changed = false;
  
! 	if (CheckPostmasterSignal(PMSIGNAL_RECOVERY_STARTED))
! 	{
! 		Assert(pmState == PM_STARTUP);
  
! 		RecoveryStatus = RecoveryStarted;
! 		changed = true;
! 	}
! 	if (CheckPostmasterSignal(PMSIGNAL_RECOVERY_CONSISTENT))
  	{
! 		RecoveryStatus = RecoveryConsistent;
! 		changed = true;
  	}
! 	if (CheckPostmasterSignal(PMSIGNAL_RECOVERY_COMPLETED))
  	{
! 		RecoveryStatus = RecoveryCompleted;
! 		changed = true;
! 	}
! 
! 	if (changed)
! 		PostmasterStateMachine();
! }
  
! /*
!  * sigusr1_handler - handle signal conditions from child processes
!  */
! static void
! sigusr1_handler(SIGNAL_ARGS)
! {
! 	int			save_errno = errno;
  
! 	PG_SETMASK(&BlockSig);
  
! 	CheckRecoverySignals();
  
  	if (CheckPostmasterSignal(PMSIGNAL_PASSWORD_CHANGE))
  	{
--- 3930,3987 ----
  }
  
  /*
!  * sigusr1_handler - handle signal conditions from child processes
   */
  static void
! sigusr1_handler(SIGNAL_ARGS)
  {
! 	int			save_errno = errno;
  
! 	PG_SETMASK(&BlockSig);
  
! 	/*
! 	 * RECOVERY_STARTED and RECOVERY_CONSISTENT signals are ignored in
! 	 * unexpected states. If the startup process quickly starts up, completes
! 	 * recovery, exits, we might process the death of the startup process
! 	 * first. We don't want to go back to recovery in that case.
! 	 */
! 	if (CheckPostmasterSignal(PMSIGNAL_RECOVERY_STARTED) &&
! 		pmState == PM_STARTUP)
  	{
! 		/* WAL redo has started. We're out of reinitialization. */
! 		FatalError = false;
! 
! 		/*
! 		 * Crank up the background writer.	It doesn't matter if this
! 		 * fails, we'll just try again later.
! 		 */
! 		Assert(BgWriterPID == 0);
! 		BgWriterPID = StartBackgroundWriter();
! 
! 		pmState = PM_RECOVERY;
  	}
! 	if (CheckPostmasterSignal(PMSIGNAL_RECOVERY_CONSISTENT) &&
! 		pmState == PM_RECOVERY)
  	{
! 		/*
! 		 * Load the flat authorization file into postmaster's cache. The
! 		 * startup process won't have recomputed this from the database yet,
! 		 * so we it may change following recovery. 
! 		 */
! 		load_role();
  
! 		/*
! 		 * Likewise, start other special children as needed.
! 		 */
! 		Assert(PgStatPID == 0);
! 		PgStatPID = pgstat_start();
  
! 		/* XXX at this point we could accept read-only connections */
! 		ereport(DEBUG1,
! 				(errmsg("database system is in consistent recovery mode")));
  
! 		pmState = PM_RECOVERY_CONSISTENT;
! 	}
  
  	if (CheckPostmasterSignal(PMSIGNAL_PASSWORD_CHANGE))
  	{
*** a/src/backend/storage/ipc/ipc.c
--- b/src/backend/storage/ipc/ipc.c
***************
*** 76,81 **** static int	on_proc_exit_index,
--- 76,83 ----
  void
  proc_exit(int code)
  {
+ 	elog(LOG, "proc_exit: %d", on_proc_exit_index);
+ 
  	/*
  	 * Once we set this flag, we are committed to exit.  Any ereport() will
  	 * NOT send control back to the main loop, but right back here.
***************
*** 95,102 **** proc_exit(int code)
  	InterruptHoldoffCount = 1;
  	CritSectionCount = 0;
  
- 	elog(DEBUG3, "proc_exit(%d)", code);
- 
  	/* do our shared memory exits first */
  	shmem_exit(code);
  
--- 97,102 ----
*** a/src/backend/storage/ipc/pmsignal.c
--- b/src/backend/storage/ipc/pmsignal.c
***************
*** 72,77 **** SendPostmasterSignal(PMSignalReason reason)
--- 72,94 ----
  }
  
  /*
+  * SetPostmasterSignal - like SendPostmasterSignal, but don't wake up
+  *						 postmaster
+  *
+  * This is for signals that the postmaster polls with CheckPostmasterSignal()
+  * but isn't interested in processing immediately.
+  */
+ void
+ SetPostmasterSignal(PMSignalReason reason)
+ {
+ 	/* If called in a standalone backend, do nothing */
+ 	if (!IsUnderPostmaster)
+ 		return;
+ 	/* Atomically set the proper flag */
+ 	PMSignalFlags[reason] = true;
+ }
+ 
+ /*
   * CheckPostmasterSignal - check to see if a particular reason has been
   * signaled, and clear the signal flag.  Should be called by postmaster
   * after receiving SIGUSR1.
*** a/src/include/storage/pmsignal.h
--- b/src/include/storage/pmsignal.h
***************
*** 39,44 **** typedef enum
--- 39,45 ----
   */
  extern void PMSignalInit(void);
  extern void SendPostmasterSignal(PMSignalReason reason);
+ extern void SetPostmasterSignal(PMSignalReason reason);
  extern bool CheckPostmasterSignal(PMSignalReason reason);
  extern bool PostmasterIsAlive(bool amDirectChild);

*** a/src/backend/postmaster/postmaster.c
--- b/src/backend/postmaster/postmaster.c
***************
*** 225,235 **** static pid_t StartupPID = 0,
--- 225,257 ----
  static int	Shutdown = NoShutdown;
  
  static bool FatalError = false; /* T if recovering from backend crash */
+ static bool RecoveryError = false; /* T if recovery failed */
  
  /*
   * We use a simple state machine to control startup, shutdown, and
   * crash recovery (which is rather like shutdown followed by startup).
   *
+  * After doing all the postmaster initialization work, we enter PM_STARTUP
+  * state and the startup process is launched. The startup process begins by
+  * reading the control file and other preliminary initialization steps. When
+  * it's ready to start WAL redo, it signals postmaster, and we switch to
+  * PM_RECOVERY phase. The background writer is launched, while the startup
+  * process continues applying WAL. 
+  * 
+  * After reaching a consistent point in WAL redo, startup process signals
+  * us again, and we switch to PM_RECOVERY_CONSISTENT phase. There's currently
+  * no difference between PM_RECOVERY and PM_RECOVERY_CONSISTENT, but we
+  * could start accepting connections to perform read-only queries at this
+  * point, if we had the infrastructure to do that.
+  *
+  * When the WAL redo is finished, the startup process signals us the third
+  * time, and exits. We don't process the 3d signal immediately but when we
+  * see the that the startup process has exited, we check that we have
+  * received the signal. If everything is OK, we then switch to PM_RUN state.
+  * The startup process can also skip the recovery and consistent recovery
+  * phases altogether, as it will during normal startup when there's no
+  * recovery to be done, for example.
+  *
   * Normal child backends can only be launched when we are in PM_RUN state.
   * (We also allow it in PM_WAIT_BACKUP state, but only for superusers.)
   * In other states we handle connection requests by launching "dead_end"
***************
*** 245,259 **** static bool FatalError = false; /* T if recovering from backend crash */
   *
   * Notice that this state variable does not distinguish *why* we entered
   * states later than PM_RUN --- Shutdown and FatalError must be consulted
!  * to find that out.  FatalError is never true in PM_RUN state, nor in
!  * PM_SHUTDOWN states (because we don't enter those states when trying to
!  * recover from a crash).  It can be true in PM_STARTUP state, because we
!  * don't clear it until we've successfully recovered.
   */
  typedef enum
  {
  	PM_INIT,					/* postmaster starting */
  	PM_STARTUP,					/* waiting for startup subprocess */
  	PM_RUN,						/* normal "database is alive" state */
  	PM_WAIT_BACKUP,				/* waiting for online backup mode to end */
  	PM_WAIT_BACKENDS,			/* waiting for live backends to exit */
--- 267,285 ----
   *
   * Notice that this state variable does not distinguish *why* we entered
   * states later than PM_RUN --- Shutdown and FatalError must be consulted
!  * to find that out.  FatalError is never true in PM_RECOVERY_* or PM_RUN
!  * states, nor in PM_SHUTDOWN states (because we don't enter those states
!  * when trying to recover from a crash).  It can be true in PM_STARTUP state,
!  * because we don't clear it until we've successfully started WAL redo.
!  * Similarly, RecoveryError means that we have crashed during recovery, and
!  * should not try to restart.
   */
  typedef enum
  {
  	PM_INIT,					/* postmaster starting */
  	PM_STARTUP,					/* waiting for startup subprocess */
+ 	PM_RECOVERY,				/* in recovery mode */
+ 	PM_RECOVERY_CONSISTENT,		/* consistent recovery mode */
  	PM_RUN,						/* normal "database is alive" state */
  	PM_WAIT_BACKUP,				/* waiting for online backup mode to end */
  	PM_WAIT_BACKENDS,			/* waiting for live backends to exit */
***************
*** 1302,1308 **** ServerLoop(void)
  		 * state that prevents it, start one.  It doesn't matter if this
  		 * fails, we'll just try again later.
  		 */
! 		if (BgWriterPID == 0 && pmState == PM_RUN)
  			BgWriterPID = StartBackgroundWriter();
  
  		/*
--- 1328,1336 ----
  		 * state that prevents it, start one.  It doesn't matter if this
  		 * fails, we'll just try again later.
  		 */
! 		if (BgWriterPID == 0 &&
! 			(pmState == PM_RUN || pmState == PM_RECOVERY || 
! 			 pmState == PM_RECOVERY_CONSISTENT))
  			BgWriterPID = StartBackgroundWriter();
  
  		/*
***************
*** 1752,1758 **** canAcceptConnections(void)
  			return CAC_WAITBACKUP;	/* allow superusers only */
  		if (Shutdown > NoShutdown)
  			return CAC_SHUTDOWN;	/* shutdown is pending */
! 		if (pmState == PM_STARTUP && !FatalError)
  			return CAC_STARTUP; /* normal startup */
  		return CAC_RECOVERY;	/* else must be crash recovery */
  	}
--- 1780,1789 ----
  			return CAC_WAITBACKUP;	/* allow superusers only */
  		if (Shutdown > NoShutdown)
  			return CAC_SHUTDOWN;	/* shutdown is pending */
! 		if (!FatalError &&
! 			(pmState == PM_STARTUP ||
! 			 pmState == PM_RECOVERY ||
! 			 pmState == PM_RECOVERY_CONSISTENT))
  			return CAC_STARTUP; /* normal startup */
  		return CAC_RECOVERY;	/* else must be crash recovery */
  	}
***************
*** 1982,1988 **** pmdie(SIGNAL_ARGS)
  			ereport(LOG,
  					(errmsg("received smart shutdown request")));
  
! 			if (pmState == PM_RUN)
  			{
  				/* autovacuum workers are told to shut down immediately */
  				SignalAutovacWorkers(SIGTERM);
--- 2013,2020 ----
  			ereport(LOG,
  					(errmsg("received smart shutdown request")));
  
! 			if (pmState == PM_RUN || pmState == PM_RECOVERY ||
! 				pmState == PM_RECOVERY_CONSISTENT)
  			{
  				/* autovacuum workers are told to shut down immediately */
  				SignalAutovacWorkers(SIGTERM);
***************
*** 2019,2025 **** pmdie(SIGNAL_ARGS)
  
  			if (StartupPID != 0)
  				signal_child(StartupPID, SIGTERM);
! 			if (pmState == PM_RUN || pmState == PM_WAIT_BACKUP)
  			{
  				ereport(LOG,
  						(errmsg("aborting any active transactions")));
--- 2051,2064 ----
  
  			if (StartupPID != 0)
  				signal_child(StartupPID, SIGTERM);
! 			if (pmState == PM_RECOVERY)
! 			{
! 				/* only bgwriter is active in this state */
! 				pmState = PM_WAIT_BACKENDS;
! 			}
! 			if (pmState == PM_RUN ||
! 				pmState == PM_WAIT_BACKUP ||
! 				pmState == PM_RECOVERY_CONSISTENT)
  			{
  				ereport(LOG,
  						(errmsg("aborting any active transactions")));
***************
*** 2115,2125 **** reaper(SIGNAL_ARGS)
  		 */
  		if (pid == StartupPID)
  		{
  			StartupPID = 0;
- 			Assert(pmState == PM_STARTUP);
  
! 			/* FATAL exit of startup is treated as catastrophic */
! 			if (!EXIT_STATUS_0(exitstatus))
  			{
  				LogChildExit(LOG, _("startup process"),
  							 pid, exitstatus);
--- 2154,2177 ----
  		 */
  		if (pid == StartupPID)
  		{
+ 			bool recoveryCompleted;
+ 
  			StartupPID = 0;
  
! 			/*
! 			 * Check if the startup process completed recovery before exiting
! 			 */
! 			if (CheckPostmasterSignal(PMSIGNAL_RECOVERY_COMPLETED))
! 				recoveryCompleted = true;
! 			else
! 				recoveryCompleted = false;
! 
! 			/*
! 			 * Unexpected exit of startup process (including FATAL exit)
! 			 * during PM_STARTUP is treated as catastrophic. There is no
! 			 * other processes running yet, so we can just exit.
! 			 */
! 			if (pmState == PM_STARTUP && !recoveryCompleted)
  			{
  				LogChildExit(LOG, _("startup process"),
  							 pid, exitstatus);
***************
*** 2127,2141 **** reaper(SIGNAL_ARGS)
  				(errmsg("aborting startup due to startup process failure")));
  				ExitPostmaster(1);
  			}
- 
  			/*
! 			 * Startup succeeded - we are done with system startup or
! 			 * recovery.
  			 */
! 			FatalError = false;
! 
  			/*
! 			 * Go to shutdown mode if a shutdown request was pending.
  			 */
  			if (Shutdown > NoShutdown)
  			{
--- 2179,2199 ----
  				(errmsg("aborting startup due to startup process failure")));
  				ExitPostmaster(1);
  			}
  			/*
! 			 * Any unexpected exit (including FATAL exit) of the startup
! 			 * process is treated as a crash, except that we don't want
! 			 * to reinitialize.
  			 */
! 			if (!EXIT_STATUS_0(exitstatus))
! 			{
! 				RecoveryError = true;
! 				HandleChildCrash(pid, exitstatus,
! 								 _("startup process"));
! 				continue;
! 			}
  			/*
! 			 * Startup process exited in response to a shutdown request (or
! 			 * it finished normally regardless of the shutdown request).
  			 */
  			if (Shutdown > NoShutdown)
  			{
***************
*** 2143,2151 **** reaper(SIGNAL_ARGS)
  				/* PostmasterStateMachine logic does the rest */
  				continue;
  			}
  
  			/*
! 			 * Otherwise, commence normal operations.
  			 */
  			pmState = PM_RUN;
  
--- 2201,2221 ----
  				/* PostmasterStateMachine logic does the rest */
  				continue;
  			}
+ 			/*
+ 			 * Startup process exited normally, but didn't finish recovery.
+ 			 * This can happen if someone else than postmaster kills the
+ 			 * startup process with SIGTERM. Treat it like a crash.
+ 			 */
+ 			if (!recoveryCompleted)
+ 			{
+ 				RecoveryError = true;
+ 				HandleChildCrash(pid, exitstatus,
+ 								 _("startup process"));
+ 				continue;
+ 			}
  
  			/*
! 			 * Startup succeeded, commence normal operations
  			 */
  			pmState = PM_RUN;
  
***************
*** 2157,2167 **** reaper(SIGNAL_ARGS)
  			load_role();
  
  			/*
! 			 * Crank up the background writer.	It doesn't matter if this
! 			 * fails, we'll just try again later.
  			 */
! 			Assert(BgWriterPID == 0);
! 			BgWriterPID = StartBackgroundWriter();
  
  			/*
  			 * Likewise, start other special children as needed.  In a restart
--- 2227,2238 ----
  			load_role();
  
  			/*
! 			 * Crank up the background writer, if we didn't do that already
! 			 * when we entered consistent recovery phase.  It doesn't matter
! 			 * if this fails, we'll just try again later.
  			 */
! 			if (BgWriterPID == 0)
! 				BgWriterPID = StartBackgroundWriter();
  
  			/*
  			 * Likewise, start other special children as needed.  In a restart
***************
*** 2178,2186 **** reaper(SIGNAL_ARGS)
  
  			/* at this point we are really open for business */
  			ereport(LOG,
! 				 (errmsg("database system is ready to accept connections")));
! 
! 			continue;
  		}
  
  		/*
--- 2249,2255 ----
  
  			/* at this point we are really open for business */
  			ereport(LOG,
! 				(errmsg("database system is ready to accept connections")));
  		}
  
  		/*
***************
*** 2443,2448 **** HandleChildCrash(int pid, int exitstatus, const char *procname)
--- 2512,2529 ----
  		}
  	}
  
+ 	/* Take care of the startup process too */
+ 	if (pid == StartupPID)
+ 		StartupPID = 0;
+ 	else if (StartupPID != 0 && !FatalError)
+ 	{
+ 		ereport(DEBUG2,
+ 				(errmsg_internal("sending %s to process %d",
+ 								 (SendStop ? "SIGSTOP" : "SIGQUIT"),
+ 								 (int) StartupPID)));
+ 		signal_child(BgWriterPID, (SendStop ? SIGSTOP : SIGQUIT));
+ 	}
+ 
  	/* Take care of the bgwriter too */
  	if (pid == BgWriterPID)
  		BgWriterPID = 0;
***************
*** 2514,2520 **** HandleChildCrash(int pid, int exitstatus, const char *procname)
  
  	FatalError = true;
  	/* We now transit into a state of waiting for children to die */
! 	if (pmState == PM_RUN ||
  		pmState == PM_WAIT_BACKUP ||
  		pmState == PM_SHUTDOWN)
  		pmState = PM_WAIT_BACKENDS;
--- 2595,2603 ----
  
  	FatalError = true;
  	/* We now transit into a state of waiting for children to die */
! 	if (pmState == PM_RECOVERY ||
! 		pmState == PM_RECOVERY_CONSISTENT ||
! 		pmState == PM_RUN ||
  		pmState == PM_WAIT_BACKUP ||
  		pmState == PM_SHUTDOWN)
  		pmState = PM_WAIT_BACKENDS;
***************
*** 2723,2728 **** PostmasterStateMachine(void)
--- 2806,2820 ----
  	}
  
  	/*
+ 	 * If recovery failed, wait for all non-syslogger children to exit,
+ 	 * and then exit postmaster. We don't try to reinitialize when recovery
+ 	 * fails, because more than likely it will just fail again and we will
+ 	 * keep trying forever.
+ 	 */
+ 	if (RecoveryError && pmState == PM_NO_CHILDREN)
+ 		ExitPostmaster(1);		
+ 
+ 	/*
  	 * If we need to recover from a crash, wait for all non-syslogger
  	 * children to exit, then reset shmem and StartupDataBase.
  	 */
***************
*** 3847,3852 **** sigusr1_handler(SIGNAL_ARGS)
--- 3939,3988 ----
  
  	PG_SETMASK(&BlockSig);
  
+ 	/*
+ 	 * RECOVERY_STARTED and RECOVERY_CONSISTENT signals are ignored in
+ 	 * unexpected states. If the startup process quickly starts up, completes
+ 	 * recovery, exits, we might process the death of the startup process
+ 	 * first. We don't want to go back to recovery in that case.
+ 	 */
+ 	if (CheckPostmasterSignal(PMSIGNAL_RECOVERY_STARTED) &&
+ 		pmState == PM_STARTUP)
+ 	{
+ 		/* WAL redo has started. We're out of reinitialization. */
+ 		FatalError = false;
+ 
+ 		/*
+ 		 * Crank up the background writer.	It doesn't matter if this
+ 		 * fails, we'll just try again later.
+ 		 */
+ 		Assert(BgWriterPID == 0);
+ 		BgWriterPID = StartBackgroundWriter();
+ 
+ 		pmState = PM_RECOVERY;
+ 	}
+ 	if (CheckPostmasterSignal(PMSIGNAL_RECOVERY_CONSISTENT) &&
+ 		pmState == PM_RECOVERY)
+ 	{
+ 		/*
+ 		 * Load the flat authorization file into postmaster's cache. The
+ 		 * startup process won't have recomputed this from the database yet,
+ 		 * so we it may change following recovery. 
+ 		 */
+ 		load_role();
+ 
+ 		/*
+ 		 * Likewise, start other special children as needed.
+ 		 */
+ 		Assert(PgStatPID == 0);
+ 		PgStatPID = pgstat_start();
+ 
+ 		/* XXX at this point we could accept read-only connections */
+ 		ereport(DEBUG1,
+ 				(errmsg("database system is in consistent recovery mode")));
+ 
+ 		pmState = PM_RECOVERY_CONSISTENT;
+ 	}
+ 
  	if (CheckPostmasterSignal(PMSIGNAL_PASSWORD_CHANGE))
  	{
  		/*

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Re: [COMMITTERS] pgsql: Start background writer during archive recovery.

Reply via email to