Thanks for your comments Bharath.

On Mon, Jul 27, 2020 at 10:13 AM Bharath Rupireddy
<bharath.rupireddyforpostg...@gmail.com> wrote:
> 1. Do we need "worker" as a function argument in
> update_parallel_worker_sigmask(BackgroundWorker *worker,.... ? Since
> MyBgworkerEntry is a global variable, can't we have a local variable
> instead?

Fixed, I have moved the worker check to the caller function.

> 2. Instead of update_parallel_worker_sigmask() serving only for
> parallel workers, can we make it generic, so that for any bgworker,
> given a signal it unblocks it, although there's no current use case
> for a bg worker unblocking a single signal other than a parallel
> worker doing it for SIGUSR1 for this hang issue. Please note that we
> have BackgroundWorkerBlockSignals() and
> BackgroundWorkerUnblockSignals().

Fixed. I have slightly modified the changes to break into
BackgroundWorkerRemoveBlockSignal & BackgroundWorkerAddBlockSignal.
This maintains the consistency similar to
BackgroundWorkerBlockSignals() and BackgroundWorkerUnblockSignals().

> If okay, with the BackgroundWorkerUpdateSignalMask() function, please
> note that we might have to add it in bgworker.sgml as well.

Included the documentation.

Attached the updated patch for the same.

Regards,
Vignesh
EnterpriseDB: http://www.enterprisedb.com
From f06a5966d754d6622a3425c6cac1ad72314832ef Mon Sep 17 00:00:00 2001
From: Vignesh C <vignes...@gmail.com>
Date: Tue, 28 Jul 2020 10:48:17 +0530
Subject: [PATCH v4] Fix for Parallel worker hangs while handling errors.

Worker is not able to receive the signals while processing error flow. Worker
hangs in this case because when the worker is started the signals will be
masked using sigprocmask. Unblocking of signals is done by calling
BackgroundWorkerUnblockSignals in ParallelWorkerMain. Now due to error
handling the worker has jumped to setjmp in StartBackgroundWorker function.
Here the signals are in blocked state, hence the signal is not received by the
worker process.

Authors: Vignesh C, Bharath Rupireddy
---
 doc/src/sgml/bgworker.sgml          |  6 +++++-
 src/backend/postmaster/bgworker.c   | 20 ++++++++++++++++++++
 src/backend/postmaster/postmaster.c | 17 +++++++++++++++++
 src/include/postmaster/bgworker.h   |  4 ++++
 4 files changed, 46 insertions(+), 1 deletion(-)

diff --git a/doc/src/sgml/bgworker.sgml b/doc/src/sgml/bgworker.sgml
index 6e1cf12..d900fc9 100644
--- a/doc/src/sgml/bgworker.sgml
+++ b/doc/src/sgml/bgworker.sgml
@@ -210,7 +210,11 @@ typedef struct BackgroundWorker
    allow the process to customize its signal handlers, if necessary.
    Signals can be unblocked in the new process by calling
    <function>BackgroundWorkerUnblockSignals</function> and blocked by calling
-   <function>BackgroundWorkerBlockSignals</function>.
+   <function>BackgroundWorkerBlockSignals</function>. A particular signal can
+   be unblocked in the new process by calling
+   <function>BackgroundWorkerRemoveBlockSignal(<parameter>int signum</parameter>)</function>
+   and blocked by calling
+   <function>BackgroundWorkerAddBlockSignal(<parameter>int signum</parameter>)</function>.
   </para>
 
   <para>
diff --git a/src/backend/postmaster/bgworker.c b/src/backend/postmaster/bgworker.c
index beb5e85..458b513 100644
--- a/src/backend/postmaster/bgworker.c
+++ b/src/backend/postmaster/bgworker.c
@@ -747,6 +747,17 @@ StartBackgroundWorker(void)
 	 */
 	if (sigsetjmp(local_sigjmp_buf, 1) != 0)
 	{
+		/*
+		 * In case of parallel workers, unblock SIGUSR1 signal, it was blocked
+		 * when the postmaster forked us. Leader process will send SIGUSR1 signal
+		 * to the worker process(worker process will be in waiting state as
+		 * there is no space available) to indicate shared memory space is freed
+		 * up. Once the signal is received worker process will start populating
+		 * the error message further.
+		 */
+		if ((worker->bgw_flags & BGWORKER_CLASS_PARALLEL) != 0)
+			BackgroundWorkerRemoveBlockSignal(SIGUSR1);
+
 		/* Since not using PG_TRY, must reset error stack by hand */
 		error_context_stack = NULL;
 
@@ -757,6 +768,15 @@ StartBackgroundWorker(void)
 		EmitErrorReport();
 
 		/*
+		 * Undo the unblocking of SIGUSR1 which was done above, as to
+		 * not cause any further issues from unblocking SIGUSR1 during
+		 * the execution of callbacks and other processing that will be
+		 * done during proc_exit().
+		 */
+		if ((worker->bgw_flags & BGWORKER_CLASS_PARALLEL) != 0)
+			BackgroundWorkerAddBlockSignal(SIGUSR1);
+
+		/*
 		 * Do we need more cleanup here?  For shmem-connected bgworkers, we
 		 * will call InitProcess below, which will install ProcKill as exit
 		 * callback.  That will take care of releasing locks, etc.
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index dec0258..751cd16 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -5765,6 +5765,23 @@ BackgroundWorkerUnblockSignals(void)
 	PG_SETMASK(&UnBlockSig);
 }
 
+/*
+ * Block/unblock a particular signal in a background worker.
+ */
+void
+BackgroundWorkerAddBlockSignal(int signum)
+{
+	sigaddset(&BlockSig, signum);
+	PG_SETMASK(&BlockSig);
+}
+
+void
+BackgroundWorkerRemoveBlockSignal(int signum)
+{
+	sigdelset(&BlockSig, signum);
+	PG_SETMASK(&BlockSig);
+}
+
 #ifdef EXEC_BACKEND
 static pid_t
 bgworker_forkexec(int shmem_slot)
diff --git a/src/include/postmaster/bgworker.h b/src/include/postmaster/bgworker.h
index 4c6ebaa..cb5423b 100644
--- a/src/include/postmaster/bgworker.h
+++ b/src/include/postmaster/bgworker.h
@@ -158,4 +158,8 @@ extern void BackgroundWorkerInitializeConnectionByOid(Oid dboid, Oid useroid, ui
 extern void BackgroundWorkerBlockSignals(void);
 extern void BackgroundWorkerUnblockSignals(void);
 
+/* Block/unblock a particular signal in a background worker process */
+extern void BackgroundWorkerAddBlockSignal(int signum);
+extern void BackgroundWorkerRemoveBlockSignal(int signum);
+
 #endif							/* BGWORKER_H */
-- 
1.8.3.1

Reply via email to