On 2020/12/16 16:24, Kyotaro Horiguchi wrote:
At Tue, 15 Dec 2020 23:10:28 +0900, Fujii Masao <masao.fu...@oss.nttdata.com> 
wrote in
I pushed the following two patches.
- v1-use-standard-SIGHUP-hanlder-in-syslogger-process.patch
- v1-use-MyLatch-and-standard-SIGHUP-handler-in-startup-process.patch

As I told in other thread [1], I'm thinking to revert this patch
because this change could have bad side-effect on the startup
process waiting for recovery conflict.

Before applying the patch, the latch that the startup process
used to wait for recovery conflict was different from the latch
that SIGHUP signal handler or walreceiver process, etc set to
wake the startup process up. So SIGHUP or walreceiver didn't
wake the startup process waiting for recovery conflict up unnecessary.

But the patch got rid of the dedicated latch for signaling
the startup process. This change forced us to use the same latch
to make the startup process wait or wake up. Which caused SIGHUP
signal handler or walreceiver proces to wake the startup process
waiting on the latch for recovery conflict up unnecessarily
frequently.

While waiting for recovery conflict on buffer pin, deadlock needs
to be checked at least every deadlock_timeout. But that frequent
wakeups could prevent the deadlock timer from being triggered and
could delay that deadlock checks.

I thought that spurious wakeups don't harm. But actually
ResolveRecoveryConflictWithBufferPin doesn't consider spurious
wakeups.  Only the timer woke up ResolveRecoveryConflictWithBufferPin
before the patch comes. Currently SIGHUP and XLogFlush() (on
walreceiver) also wake up startup process.

For a moment I thought that ResolveRecoveryConflictWithBufferPin
should wake up at shutdown time by the old recovery latch but it's not
the case since it wakes up after all blockers go away.  It seems to me
simpler to revert the patches than making the function properly handle
spurious wakeups.

Therefore, I'm thinking to revert the commit ac22929a26 and
113d3591b8.

[1]
https://www.postgresql.org/message-id/a802f1c0-58d9-bd3f-bc3e-bdad54726...@oss.nttdata.com

As the result, I agree to revert them. But I think we need to add a
comment for the reason we don't use MyLatch for recovery-wakeup after
reverting them.

Agreed. Attached is the patch that reverts those patches and
adds the comments about why both procLatch and recoveryWakeupLatch
are necessary.

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION
diff --git a/src/backend/access/transam/xlog.c 
b/src/backend/access/transam/xlog.c
index 8dd225c2e1..b1e5d2dbff 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -681,8 +681,18 @@ typedef struct XLogCtlData
         * recoveryWakeupLatch is used to wake up the startup process to 
continue
         * WAL replay, if it is waiting for WAL to arrive or failover trigger 
file
         * to appear.
+        *
+        * Note that the startup process also uses another latch, its procLatch,
+        * to wait for recovery conflict. If we get rid of recoveryWakeupLatch 
for
+        * signaling the startup process in favor of using its procLatch, which
+        * comports better with possible generic signal handlers using that 
latch.
+        * But we should not do that because the startup process doesn't assume
+        * that it's waken up by walreceiver process or SIGHUP signal handler
+        * while it's waiting for recovery conflict. The separate latches,
+        * recoveryWakeupLatch and procLatch, should be used for inter-process
+        * communication for WAL replay and recovery conflict, respectively.
         */
-       Latch           *recoveryWakeupLatch;
+       Latch           recoveryWakeupLatch;
 
        /*
         * During recovery, we keep a copy of the latest checkpoint record here.
@@ -5186,6 +5196,7 @@ XLOGShmemInit(void)
        SpinLockInit(&XLogCtl->Insert.insertpos_lck);
        SpinLockInit(&XLogCtl->info_lck);
        SpinLockInit(&XLogCtl->ulsn_lck);
+       InitSharedLatch(&XLogCtl->recoveryWakeupLatch);
 }
 
 /*
@@ -6121,7 +6132,7 @@ recoveryApplyDelay(XLogReaderState *record)
 
        while (true)
        {
-               ResetLatch(MyLatch);
+               ResetLatch(&XLogCtl->recoveryWakeupLatch);
 
                /* might change the trigger file's location */
                HandleStartupProcInterrupts();
@@ -6140,7 +6151,7 @@ recoveryApplyDelay(XLogReaderState *record)
 
                elog(DEBUG2, "recovery apply delay %ld milliseconds", msecs);
 
-               (void) WaitLatch(MyLatch,
+               (void) WaitLatch(&XLogCtl->recoveryWakeupLatch,
                                                 WL_LATCH_SET | WL_TIMEOUT | 
WL_EXIT_ON_PM_DEATH,
                                                 msecs,
                                                 
WAIT_EVENT_RECOVERY_APPLY_DELAY);
@@ -6469,11 +6480,11 @@ StartupXLOG(void)
        }
 
        /*
-        * Advertise our latch that other processes can use to wake us up
-        * if we're going to sleep during recovery.
+        * Take ownership of the wakeup latch if we're going to sleep during
+        * recovery.
         */
        if (ArchiveRecoveryRequested)
-               XLogCtl->recoveryWakeupLatch = &MyProc->procLatch;
+               OwnLatch(&XLogCtl->recoveryWakeupLatch);
 
        /* Set up XLOG reader facility */
        MemSet(&private, 0, sizeof(XLogPageReadPrivate));
@@ -7484,11 +7495,11 @@ StartupXLOG(void)
                ResetUnloggedRelations(UNLOGGED_RELATION_INIT);
 
        /*
-        * We don't need the latch anymore. It's not strictly necessary to reset
-        * it to NULL, but let's do it for the sake of tidiness.
+        * We don't need the latch anymore. It's not strictly necessary to 
disown
+        * it, but let's do it for the sake of tidiness.
         */
        if (ArchiveRecoveryRequested)
-               XLogCtl->recoveryWakeupLatch = NULL;
+               DisownLatch(&XLogCtl->recoveryWakeupLatch);
 
        /*
         * We are now done reading the xlog from stream. Turn off streaming
@@ -12300,12 +12311,12 @@ WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool 
randAccess,
                                                wait_time = 
wal_retrieve_retry_interval -
                                                        
TimestampDifferenceMilliseconds(last_fail_time, now);
 
-                                               (void) WaitLatch(MyLatch,
+                                               (void) 
WaitLatch(&XLogCtl->recoveryWakeupLatch,
                                                                                
 WL_LATCH_SET | WL_TIMEOUT |
                                                                                
 WL_EXIT_ON_PM_DEATH,
                                                                                
 wait_time,
                                                                                
 WAIT_EVENT_RECOVERY_RETRIEVE_RETRY_INTERVAL);
-                                               ResetLatch(MyLatch);
+                                               
ResetLatch(&XLogCtl->recoveryWakeupLatch);
                                                now = GetCurrentTimestamp();
 
                                                /* Handle interrupt signals of 
startup process */
@@ -12559,11 +12570,11 @@ WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool 
randAccess,
                                         * to react to a trigger file promptly 
and to check if the
                                         * WAL receiver is still active.
                                         */
-                                       (void) WaitLatch(MyLatch,
+                                       (void) 
WaitLatch(&XLogCtl->recoveryWakeupLatch,
                                                                         
WL_LATCH_SET | WL_TIMEOUT |
                                                                         
WL_EXIT_ON_PM_DEATH,
                                                                         5000L, 
WAIT_EVENT_RECOVERY_WAL_STREAM);
-                                       ResetLatch(MyLatch);
+                                       
ResetLatch(&XLogCtl->recoveryWakeupLatch);
                                        break;
                                }
 
@@ -12735,8 +12746,7 @@ CheckPromoteSignal(void)
 void
 WakeupRecovery(void)
 {
-       if (XLogCtl->recoveryWakeupLatch)
-               SetLatch(XLogCtl->recoveryWakeupLatch);
+       SetLatch(&XLogCtl->recoveryWakeupLatch);
 }
 
 /*
diff --git a/src/backend/postmaster/startup.c b/src/backend/postmaster/startup.c
index eab9c8c4ed..64af7b8707 100644
--- a/src/backend/postmaster/startup.c
+++ b/src/backend/postmaster/startup.c
@@ -37,6 +37,7 @@
 /*
  * Flags set by interrupt handlers for later service in the redo loop.
  */
+static volatile sig_atomic_t got_SIGHUP = false;
 static volatile sig_atomic_t shutdown_requested = false;
 static volatile sig_atomic_t promote_signaled = false;
 
@@ -48,6 +49,7 @@ static volatile sig_atomic_t in_restore_command = false;
 
 /* Signal handlers */
 static void StartupProcTriggerHandler(SIGNAL_ARGS);
+static void StartupProcSigHupHandler(SIGNAL_ARGS);
 
 
 /* --------------------------------
@@ -62,7 +64,19 @@ StartupProcTriggerHandler(SIGNAL_ARGS)
        int                     save_errno = errno;
 
        promote_signaled = true;
-       SetLatch(MyLatch);
+       WakeupRecovery();
+
+       errno = save_errno;
+}
+
+/* SIGHUP: set flag to re-read config file at next convenient time */
+static void
+StartupProcSigHupHandler(SIGNAL_ARGS)
+{
+       int                     save_errno = errno;
+
+       got_SIGHUP = true;
+       WakeupRecovery();
 
        errno = save_errno;
 }
@@ -77,7 +91,7 @@ StartupProcShutdownHandler(SIGNAL_ARGS)
                proc_exit(1);
        else
                shutdown_requested = true;
-       SetLatch(MyLatch);
+       WakeupRecovery();
 
        errno = save_errno;
 }
@@ -123,9 +137,9 @@ HandleStartupProcInterrupts(void)
        /*
         * Process any requests or signals received recently.
         */
-       if (ConfigReloadPending)
+       if (got_SIGHUP)
        {
-               ConfigReloadPending = false;
+               got_SIGHUP = false;
                StartupRereadConfig();
        }
 
@@ -158,7 +172,7 @@ StartupProcessMain(void)
        /*
         * Properly accept or ignore signals the postmaster might send us.
         */
-       pqsignal(SIGHUP, SignalHandlerForConfigReload); /* reload config file */
+       pqsignal(SIGHUP, StartupProcSigHupHandler); /* reload config file */
        pqsignal(SIGINT, SIG_IGN);      /* ignore query cancel */
        pqsignal(SIGTERM, StartupProcShutdownHandler);  /* request shutdown */
        /* SIGQUIT handler was already set up by InitPostmasterChild */
diff --git a/src/backend/storage/ipc/standby.c 
b/src/backend/storage/ipc/standby.c
index 4ea3cf1f5c..92d9027776 100644
--- a/src/backend/storage/ipc/standby.c
+++ b/src/backend/storage/ipc/standby.c
@@ -519,7 +519,14 @@ ResolveRecoveryConflictWithBufferPin(void)
                enable_timeouts(timeouts, 2);
        }
 
-       /* Wait to be signaled by UnpinBuffer() */
+       /*
+        * Wait to be signaled by UnpinBuffer().
+        *
+        * We assume that only UnpinBuffer() and the timeout requests 
established
+        * above can wake us up here. WakeupRecovery() called by walreceiver or
+        * SIGHUP signal handler, etc cannot do that because it uses the 
different
+        * latch from that ProcWaitForSignal() waits on.
+        */
        ProcWaitForSignal(PG_WAIT_BUFFER_PIN);
 
        /*

Reply via email to