Re: Backends stuck in wait event IPC/MessageQueueInternal
On Sun, Aug 28, 2022 at 11:03 AM Thomas Munro wrote:
> On Sun, Jun 26, 2022 at 11:18 AM Thomas Munro wrote:
> > On Tue, May 17, 2022 at 3:31 PM Thomas Munro wrote:
> > > On Mon, May 16, 2022 at 3:45 PM Japin Li wrote:
> > > > Maybe use the __illumos__ macro more accurity.
> > > >
> > > > +#elif defined(WAIT_USE_EPOLL) && defined(HAVE_SYS_SIGNALFD_H) && \
> > > > +	!defined(__sun__)
> > >
> > > Thanks, updated, and with a new commit message.
> >
> > Pushed to master and REL_14_STABLE.
>
> FTR: I noticed that https://www.illumos.org/issues/13700 had been
> marked fixed, so I asked if we should remove our check[1].  Nope,
> another issue was opened at https://www.illumos.org/issues/14892,
> which I'll keep an eye on.  It seems we're pretty good at hitting
> poll/event-related kernel bugs in various OSes.

I happened to notice, in the release notes for OmniOS that Stephen posted in the nearby GSSAPI thread, that this has now been fixed.  I think there's no point in changing the back branches (hard to synchronise with kernel upgrades), but I also don't want to leave this weird wart in the code forever.  Shall we remove it in 16?  I don't personally care if it's 16 or 17, but I wanted to make a note of the cleanup opportunity either way, and will add this to the open commitfest.

0001-Trust-signalfd-on-illumos-again.patch
Description: Binary data
Re: Backends stuck in wait event IPC/MessageQueueInternal
On Sun, Jun 26, 2022 at 11:18 AM Thomas Munro wrote:
> On Tue, May 17, 2022 at 3:31 PM Thomas Munro wrote:
> > On Mon, May 16, 2022 at 3:45 PM Japin Li wrote:
> > > Maybe use the __illumos__ macro more accurity.
> > >
> > > +#elif defined(WAIT_USE_EPOLL) && defined(HAVE_SYS_SIGNALFD_H) && \
> > > +	!defined(__sun__)
> >
> > Thanks, updated, and with a new commit message.
>
> Pushed to master and REL_14_STABLE.

FTR: I noticed that https://www.illumos.org/issues/13700 had been marked fixed, so I asked if we should remove our check[1].  Nope, another issue was opened at https://www.illumos.org/issues/14892, which I'll keep an eye on.  It seems we're pretty good at hitting poll/event-related kernel bugs in various OSes.

[1] https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=3ab4fc5dcf30ebc90a23ad878342dc528e2d25ce
Re: Backends stuck in wait event IPC/MessageQueueInternal
On Tue, May 17, 2022 at 3:31 PM Thomas Munro wrote:
> On Mon, May 16, 2022 at 3:45 PM Japin Li wrote:
> > Maybe use the __illumos__ macro more accurity.
> >
> > +#elif defined(WAIT_USE_EPOLL) && defined(HAVE_SYS_SIGNALFD_H) && \
> > +	!defined(__sun__)
>
> Thanks, updated, and with a new commit message.

Pushed to master and REL_14_STABLE.  I'll email the illumos build farm animal owners to say that they should be able to remove -DWAIT_USE_POLL.

Theoretically, it might be useful that we've separated the WAIT_USE_SELF_PIPE code from WAIT_USE_POLL if someone eventually wants to complete the set of possible WaitEventSet implementations by adding /dev/poll (Solaris, HPUX) and pollset (AIX) support.  I don't think those have a nicer way to receive race-free signal wakeups.  Realistically, no one's likely to show up with a patch for those old proprietary Unixen at this point on the timeline; I just think it's interesting that every OS had something better than poll(), and we only need that fallback for lack of patches, not lack of kernel features.

Ironically, the typical monster AIX systems I've run into in the wild, with oodles of CPUs and NUMA nodes, are probably much more capable of suffering from poll() contention than all these puny x86 systems.  If someone *is* still interested in scalability on AIX, I'd recommend looking at pollset for latch.c, and also the stalled huge pages thing[1].

[1] https://www.postgresql.org/message-id/CA%2BhUKGJE4dq%2BNZHrm%3DpNSNCYwDCH%2BT6HtaWm5Lm8vZzygknPpA%40mail.gmail.com
Re: Backends stuck in wait event IPC/MessageQueueInternal
On Mon, May 16, 2022 at 3:45 PM Japin Li wrote:
> Maybe use the __illumos__ macro more accurity.
>
> +#elif defined(WAIT_USE_EPOLL) && defined(HAVE_SYS_SIGNALFD_H) && \
> +	!defined(__sun__)

Thanks, updated, and with a new commit message.

I don't know much about these OSes (though I used lots of Sun machines during the Jurassic period).  I know that there are three distributions of illumos (OmniOS, SmartOS and OpenIndiana), and that they share the same kernel and base system.  The off-list reports I received about hangs and kernel panics were from OpenIndiana animals hake and haddock, which are not currently reporting (I'll ask why); their owner defined -DWAIT_USE_POLL to clear that up while we waited for progress on his kernel panic bug report.  I see that OmniOS animal pollock is currently reporting and also uses -DWAIT_USE_POLL, but I couldn't find any discussion about that.

Of course, you might be hitting some completely different problem, given the lack of information.  I'd be interested in the output of "p *MyLatch" (= to see if the latch has already been set), and in whether "kill -URG PID" dislodges the stuck process.  But given the open kernel bug report that I've now been reminded of, I'm thinking about pushing this anyway.  Then we could ask the animal owners to remove -DWAIT_USE_POLL so that they'd effectively be running with -DWAIT_USE_EPOLL and -DWAIT_USE_SELF_PIPE, which would be more like PostgreSQL 13, while people who want to reproduce the problem on the illumos side could build with -DWAIT_USE_SIGNALFD.

From 9a2dd1ed57c3364c98fe459071e84702015c5814 Mon Sep 17 00:00:00 2001
From: Thomas Munro
Date: Sat, 14 May 2022 10:15:30 +1200
Subject: [PATCH v3] Don't trust signalfd() on illumos.

Hangs and kernel panics have been reported on illumos systems since we started using signalfd() to receive latch wakeup events.  A bug report exists at https://www.illumos.org/issues/13700 but no fix is available yet.
Provide a way to go back to using a self-pipe with -DWAIT_USE_SELF_PIPE, and make that the default on that platform. Users can explicitly provide -DWAIT_USE_SIGNALFD if required in case that's helpful to investigate what's going wrong. Back-patch to 14, where we started using signalfd(). Reported-by: Japin Li Reported-by: Olaf Bohlen (off-list) Reviewed-by: Japin Li Discussion: https://postgr.es/m/MEYP282MB1669C8D88F0997354C2313C1B6CA9%40MEYP282MB1669.AUSP282.PROD.OUTLOOK.COM --- src/backend/storage/ipc/latch.c | 58 +++-- 1 file changed, 40 insertions(+), 18 deletions(-) diff --git a/src/backend/storage/ipc/latch.c b/src/backend/storage/ipc/latch.c index 78c6a89271..fba8a9ea94 100644 --- a/src/backend/storage/ipc/latch.c +++ b/src/backend/storage/ipc/latch.c @@ -72,7 +72,7 @@ #if defined(WAIT_USE_EPOLL) || defined(WAIT_USE_POLL) || \ defined(WAIT_USE_KQUEUE) || defined(WAIT_USE_WIN32) /* don't overwrite manual choice */ -#elif defined(HAVE_SYS_EPOLL_H) && defined(HAVE_SYS_SIGNALFD_H) +#elif defined(HAVE_SYS_EPOLL_H) #define WAIT_USE_EPOLL #elif defined(HAVE_KQUEUE) #define WAIT_USE_KQUEUE @@ -84,6 +84,22 @@ #error "no wait set implementation available" #endif +/* + * By default, we use a self-pipe with poll() and a signalfd with epoll(), if + * available. We avoid signalfd on illumos for now based on problem reports. + * For testing the choice can also be manually specified. 
+ */ +#if defined(WAIT_USE_POLL) || defined(WAIT_USE_EPOLL) +#if defined(WAIT_USE_SELF_PIPE) || defined(WAIT_USE_SIGNALFD) +/* don't overwrite manual choice */ +#elif defined(WAIT_USE_EPOLL) && defined(HAVE_SYS_SIGNALFD_H) && \ + !defined(__illumos__) +#define WAIT_USE_SIGNALFD +#else +#define WAIT_USE_SELF_PIPE +#endif +#endif + /* typedef in latch.h */ struct WaitEventSet { @@ -146,12 +162,12 @@ static WaitEventSet *LatchWaitSet; static volatile sig_atomic_t waiting = false; #endif -#ifdef WAIT_USE_EPOLL +#ifdef WAIT_USE_SIGNALFD /* On Linux, we'll receive SIGURG via a signalfd file descriptor. */ static int signal_fd = -1; #endif -#if defined(WAIT_USE_POLL) +#ifdef WAIT_USE_SELF_PIPE /* Read and write ends of the self-pipe */ static int selfpipe_readfd = -1; static int selfpipe_writefd = -1; @@ -164,7 +180,7 @@ static void latch_sigurg_handler(SIGNAL_ARGS); static void sendSelfPipeByte(void); #endif -#if defined(WAIT_USE_POLL) || defined(WAIT_USE_EPOLL) +#if defined(WAIT_USE_SELF_PIPE) || defined(WAIT_USE_SIGNALFD) static void drain(void); #endif @@ -190,7 +206,7 @@ static inline int WaitEventSetWaitBlock(WaitEventSet *set, int cur_timeout, void InitializeLatchSupport(void) { -#if defined(WAIT_USE_POLL) +#if defined(WAIT_USE_SELF_PIPE) int pipefd[2]; if (IsUnderPostmaster) @@ -264,7 +280,7 @@ InitializeLatchSupport(void) pqsignal(SIGURG, latch_sigurg_handler); #endif -#ifdef WAIT_USE_EPOLL +#ifdef WAIT_USE_SIGNALFD sigset_t signalfd_mask; /* Block SIGURG, because we'll receive it through a signalfd. */ @@ -316,7
Re: Backends stuck in wait event IPC/MessageQueueInternal
On Sat, 14 May 2022 at 11:01, Thomas Munro wrote:
> On Sat, May 14, 2022 at 10:25 AM Thomas Munro wrote:
>> Japin, are you able to reproduce the problem reliably?  Did I guess
>> right, that you're on illumos?  Does this help?  I used
>> defined(__sun__) to select the option, but I don't remember if that's
>> the right way to detect that OS family, could you confirm that, or
>> adjust as required?
>
> Better version.  Now you can independently set -DWAIT_USE_{POLL,EPOLL}
> and -DWAIT_USE_{SELF_PIPE,SIGNALFD} for testing, and it picks a
> sensible default.

Thanks for your patch!  illumos already defines the following macros.

$ gcc -dM -E -
Re: Backends stuck in wait event IPC/MessageQueueInternal
On Fri, 13 May 2022 at 22:08, Robert Haas wrote:
> On Fri, May 13, 2022 at 6:16 AM Japin Li wrote:
>> The process cannot be terminated by pg_terminate_backend(), although
>> it returns true.
>
> pg_terminate_backend() just sends SIGINT.  What I'm wondering is what
> happens when the stuck process receives SIGINT.  It would be useful, I
> think, to check the value of the global variable InterruptHoldoffCount
> in the stuck process by attaching to it with gdb.  I would also try
> running "strace -p $PID" on the stuck process and then try terminating
> it again with pg_terminate_backend().  Either the system call in which
> it's currently stuck returns and then it makes the same system call
> again and hangs again ... or the signal doesn't dislodge it from the
> system call in which it's stuck in the first place.  It would be useful
> to know which of those two things is happening.
>
> One thing I find a bit curious is that the top of the stack in your
> case is ioctl().  And there are no calls to ioctl() anywhere in
> latch.c, nor have there ever been.  What operating system is this?  We
> have 4 different versions of WaitEventSetWaitBlock() that call
> epoll_wait(), kevent(), poll(), and WaitForMultipleObjects()
> respectively.  I wonder which of those we're using, and whether one of
> those calls is showing up as ioctl() in the stacktrace, or whether
> there's some other function being called in here that is somehow
> resulting in ioctl() getting called.

Thanks for your advice.  I will try this on Monday.

--
Regards,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.
Re: Backends stuck in wait event IPC/MessageQueueInternal
On Sat, 14 May 2022 at 11:01, Thomas Munro wrote:
> On Sat, May 14, 2022 at 10:25 AM Thomas Munro wrote:
>> Japin, are you able to reproduce the problem reliably?  Did I guess
>> right, that you're on illumos?  Does this help?  I used
>> defined(__sun__) to select the option, but I don't remember if that's
>> the right way to detect that OS family, could you confirm that, or
>> adjust as required?
>
> Better version.  Now you can independently set -DWAIT_USE_{POLL,EPOLL}
> and -DWAIT_USE_{SELF_PIPE,SIGNALFD} for testing, and it picks a
> sensible default.

Sorry for the late reply.  My bad!  It is actually SmartOS, which is based on illumos.

--
Regards,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.
Re: Backends stuck in wait event IPC/MessageQueueInternal
On Sat, May 14, 2022 at 10:25 AM Thomas Munro wrote:
> Japin, are you able to reproduce the problem reliably?  Did I guess
> right, that you're on illumos?  Does this help?  I used
> defined(__sun__) to select the option, but I don't remember if that's
> the right way to detect that OS family, could you confirm that, or
> adjust as required?

Better version.  Now you can independently set -DWAIT_USE_{POLL,EPOLL} and -DWAIT_USE_{SELF_PIPE,SIGNALFD} for testing, and it picks a sensible default.

From 4f3027b35d5f25da37836b68d81fbcfb077e41d5 Mon Sep 17 00:00:00 2001
From: Thomas Munro
Date: Sat, 14 May 2022 10:15:30 +1200
Subject: [PATCH v2] Don't trust signalfd() on illumos.

Allow the choice between signalfd vs self-pipe to be controlled independently of the choice between epoll() and poll().  Make the default choice the same as before (signalfd + epoll() for Linux systems, self-pipe + poll() for other Unixes that don't have kqueue), except on illumos where it's self-pipe + epoll().  We don't have a very good understanding of why, yet, but its signalfd doesn't seem to behave the same as Linux in some edge case that leads to lost wakeups.  This way, illumos users get a working default that doesn't give up the benefits of epoll() over poll().

Back-patch to 14, where use of signalfd() appeared.
Discussion: https://postgr.es/m/MEYP282MB1669C8D88F0997354C2313C1B6CA9%40MEYP282MB1669.AUSP282.PROD.OUTLOOK.COM --- src/backend/storage/ipc/latch.c | 58 +++-- 1 file changed, 40 insertions(+), 18 deletions(-) diff --git a/src/backend/storage/ipc/latch.c b/src/backend/storage/ipc/latch.c index 78c6a89271..c99dbb9f46 100644 --- a/src/backend/storage/ipc/latch.c +++ b/src/backend/storage/ipc/latch.c @@ -72,7 +72,7 @@ #if defined(WAIT_USE_EPOLL) || defined(WAIT_USE_POLL) || \ defined(WAIT_USE_KQUEUE) || defined(WAIT_USE_WIN32) /* don't overwrite manual choice */ -#elif defined(HAVE_SYS_EPOLL_H) && defined(HAVE_SYS_SIGNALFD_H) +#elif defined(HAVE_SYS_EPOLL_H) #define WAIT_USE_EPOLL #elif defined(HAVE_KQUEUE) #define WAIT_USE_KQUEUE @@ -84,6 +84,22 @@ #error "no wait set implementation available" #endif +/* + * By default, we use a self-pipe with poll() and a signalfd with epoll(), if + * available. We avoid signalfd on illumos because it doesn't seem to work + * reliably. For testing the choice can also be manually specified. + */ +#if defined(WAIT_USE_POLL) || defined(WAIT_USE_EPOLL) +#if defined(WAIT_USE_SELF_PIPE) || defined(WAIT_USE_SIGNALFD) +/* don't overwrite manual choice */ +#elif defined(WAIT_USE_EPOLL) && defined(HAVE_SYS_SIGNALFD_H) && \ + !defined(__sun__) +#define WAIT_USE_SIGNALFD +#else +#define WAIT_USE_SELF_PIPE +#endif +#endif + /* typedef in latch.h */ struct WaitEventSet { @@ -146,12 +162,12 @@ static WaitEventSet *LatchWaitSet; static volatile sig_atomic_t waiting = false; #endif -#ifdef WAIT_USE_EPOLL +#ifdef WAIT_USE_SIGNALFD /* On Linux, we'll receive SIGURG via a signalfd file descriptor. 
*/ static int signal_fd = -1; #endif -#if defined(WAIT_USE_POLL) +#ifdef WAIT_USE_SELF_PIPE /* Read and write ends of the self-pipe */ static int selfpipe_readfd = -1; static int selfpipe_writefd = -1; @@ -164,7 +180,7 @@ static void latch_sigurg_handler(SIGNAL_ARGS); static void sendSelfPipeByte(void); #endif -#if defined(WAIT_USE_POLL) || defined(WAIT_USE_EPOLL) +#if defined(WAIT_USE_SELF_PIPE) || defined(WAIT_USE_SIGNALFD) static void drain(void); #endif @@ -190,7 +206,7 @@ static inline int WaitEventSetWaitBlock(WaitEventSet *set, int cur_timeout, void InitializeLatchSupport(void) { -#if defined(WAIT_USE_POLL) +#if defined(WAIT_USE_SELF_PIPE) int pipefd[2]; if (IsUnderPostmaster) @@ -264,7 +280,7 @@ InitializeLatchSupport(void) pqsignal(SIGURG, latch_sigurg_handler); #endif -#ifdef WAIT_USE_EPOLL +#ifdef WAIT_USE_SIGNALFD sigset_t signalfd_mask; /* Block SIGURG, because we'll receive it through a signalfd. */ @@ -316,7 +332,7 @@ ShutdownLatchSupport(void) LatchWaitSet = NULL; } -#if defined(WAIT_USE_POLL) +#if defined(WAIT_USE_SELF_PIPE) close(selfpipe_readfd); close(selfpipe_writefd); selfpipe_readfd = -1; @@ -324,7 +340,7 @@ ShutdownLatchSupport(void) selfpipe_owner_pid = InvalidPid; #endif -#if defined(WAIT_USE_EPOLL) +#if defined(WAIT_USE_SIGNALFD) close(signal_fd); signal_fd = -1; #endif @@ -341,9 +357,12 @@ InitLatch(Latch *latch) latch->owner_pid = MyProcPid; latch->is_shared = false; -#if defined(WAIT_USE_POLL) +#if defined(WAIT_USE_SELF_PIPE) /* Assert InitializeLatchSupport has been called in this process */ Assert(selfpipe_readfd >= 0 && selfpipe_owner_pid == MyProcPid); +#elif defined(WAIT_USE_SIGNALFD) + /* Assert InitializeLatchSupport has been called in this process */ + Assert(signal_fd >= 0); #elif defined(WAIT_USE_WIN32) latch->event = CreateEvent(NULL, TRUE, FALSE, NULL); if (latch->event == NULL) @@ -405,9 +424,12 @@ OwnLatch(Latch *latch) /* Sanity checks */
Re: Backends stuck in wait event IPC/MessageQueueInternal
On Sat, May 14, 2022 at 9:25 AM Thomas Munro wrote:
> In short, I'd recommend -DWAIT_USE_POLL for now.  It's possible that
> we could do something to prevent the selection of WAIT_USE_EPOLL on
> that platform, or that we should have a halfway option epoll() but not
> signalfd() (= go back to using the self-pipe trick), patches welcome,
> but that feels kinda strange and would be very niche combination that
> isn't fun to maintain... the real solution is to fix the bug.

I felt a bit sad about writing that, so I took a crack at writing a patch that separates the signalfd/self-pipe choice from the epoll/poll choice.  Maybe it's not too bad.

Japin, are you able to reproduce the problem reliably?  Did I guess right, that you're on illumos?  Does this help?  I used defined(__sun__) to select the option, but I don't remember if that's the right way to detect that OS family; could you confirm that, or adjust as required?

From 316ec0895c7ab65102c8c100774b51fc0e86c859 Mon Sep 17 00:00:00 2001
From: Thomas Munro
Date: Sat, 14 May 2022 10:15:30 +1200
Subject: [PATCH] Don't trust signalfd() on illumos.

--- src/backend/storage/ipc/latch.c | 40 + 1 file changed, 26 insertions(+), 14 deletions(-) diff --git a/src/backend/storage/ipc/latch.c b/src/backend/storage/ipc/latch.c index 78c6a89271..8b0b31d100 100644 --- a/src/backend/storage/ipc/latch.c +++ b/src/backend/storage/ipc/latch.c @@ -84,6 +84,18 @@ #error "no wait set implementation available" #endif +/* + * We need the self-pipe trick for race-free wakeups when using poll(). + * + * XXX We also need it when using epoll() on illumos, because its signalfd() + * seems to be broken.
+ */ +#if defined(WAIT_USE_POLL) || (defined(WAIT_USE_EPOLL) && defined(__sun__)) +#define WAIT_USE_SELF_PIPE +#elif defined(WAIT_USE_EPOLL) +#define WAIT_USE_SIGNALFD +#endif + /* typedef in latch.h */ struct WaitEventSet { @@ -146,12 +158,12 @@ static WaitEventSet *LatchWaitSet; static volatile sig_atomic_t waiting = false; #endif -#ifdef WAIT_USE_EPOLL +#ifdef WAIT_USE_SIGNALFD /* On Linux, we'll receive SIGURG via a signalfd file descriptor. */ static int signal_fd = -1; #endif -#if defined(WAIT_USE_POLL) +#ifdef WAIT_USE_SELF_PIPE /* Read and write ends of the self-pipe */ static int selfpipe_readfd = -1; static int selfpipe_writefd = -1; @@ -164,7 +176,7 @@ static void latch_sigurg_handler(SIGNAL_ARGS); static void sendSelfPipeByte(void); #endif -#if defined(WAIT_USE_POLL) || defined(WAIT_USE_EPOLL) +#if defined(WAIT_USE_SELF_PIPE) || defined(WAIT_USE_SIGNALFD) static void drain(void); #endif @@ -190,7 +202,7 @@ static inline int WaitEventSetWaitBlock(WaitEventSet *set, int cur_timeout, void InitializeLatchSupport(void) { -#if defined(WAIT_USE_POLL) +#if defined(WAIT_USE_SELF_PIPE) int pipefd[2]; if (IsUnderPostmaster) @@ -264,7 +276,7 @@ InitializeLatchSupport(void) pqsignal(SIGURG, latch_sigurg_handler); #endif -#ifdef WAIT_USE_EPOLL +#ifdef WAIT_USE_SIGNALFD sigset_t signalfd_mask; /* Block SIGURG, because we'll receive it through a signalfd. 
*/ @@ -316,7 +328,7 @@ ShutdownLatchSupport(void) LatchWaitSet = NULL; } -#if defined(WAIT_USE_POLL) +#if defined(WAIT_USE_SELF_PIPE) close(selfpipe_readfd); close(selfpipe_writefd); selfpipe_readfd = -1; @@ -324,7 +336,7 @@ ShutdownLatchSupport(void) selfpipe_owner_pid = InvalidPid; #endif -#if defined(WAIT_USE_EPOLL) +#if defined(WAIT_USE_SIGNALFD) close(signal_fd); signal_fd = -1; #endif @@ -904,9 +916,9 @@ AddWaitEventToSet(WaitEventSet *set, uint32 events, pgsocket fd, Latch *latch, { set->latch = latch; set->latch_pos = event->pos; -#if defined(WAIT_USE_POLL) +#if defined(WAIT_USE_SELF_PIPE) event->fd = selfpipe_readfd; -#elif defined(WAIT_USE_EPOLL) +#elif defined(WAIT_USE_SIGNALFD) event->fd = signal_fd; #else event->fd = PGINVALID_SOCKET; @@ -2083,7 +2095,7 @@ GetNumRegisteredWaitEvents(WaitEventSet *set) return set->nevents; } -#if defined(WAIT_USE_POLL) +#if defined(WAIT_USE_SELF_PIPE) /* * SetLatch uses SIGURG to wake up the process waiting on the latch. @@ -2134,7 +2146,7 @@ retry: #endif -#if defined(WAIT_USE_POLL) || defined(WAIT_USE_EPOLL) +#if defined(WAIT_USE_SELF_PIPE) || defined(WAIT_USE_SIGNALFD) /* * Read all available data from self-pipe or signalfd. @@ -2150,7 +2162,7 @@ drain(void) int rc; int fd; -#ifdef WAIT_USE_POLL +#ifdef WAIT_USE_SELF_PIPE fd = selfpipe_readfd; #else fd = signal_fd; @@ -2168,7 +2180,7 @@ drain(void) else { waiting = false; -#ifdef WAIT_USE_POLL +#ifdef WAIT_USE_SELF_PIPE elog(ERROR, "read() on self-pipe failed: %m"); #else elog(ERROR, "read() on signalfd failed: %m"); @@ -2178,7 +2190,7 @@ drain(void) else if (rc == 0) { waiting = false; -#ifdef WAIT_USE_POLL +#ifdef WAIT_USE_SELF_PIPE elog(ERROR, "unexpected EOF on self-pipe"); #else elog(ERROR, "unexpected EOF on signalfd"); --
Re: Backends stuck in wait event IPC/MessageQueueInternal
On Sat, May 14, 2022 at 2:09 AM Robert Haas wrote:
> On Fri, May 13, 2022 at 6:16 AM Japin Li wrote:
> > The process cannot be terminated by pg_terminate_backend(), although
> > it returns true.
>
> One thing I find a bit curious is that the top of the stack in your
> case is ioctl().  And there are no calls to ioctl() anywhere in
> latch.c, nor have there ever been.  What operating system is this?  We
> have 4 different versions of WaitEventSetWaitBlock() that call
> epoll_wait(), kevent(), poll(), and WaitForMultipleObjects()
> respectively.  I wonder which of those we're using, and whether one of
> those calls is showing up as ioctl() in the stacktrace, or whether
> there's some other function being called in here that is somehow
> resulting in ioctl() getting called.

I guess this is really illumos (née OpenSolaris), not Solaris, using our epoll build mode, with illumos's emulation of epoll, which maps epoll onto Sun's /dev/poll driver:

https://github.com/illumos/illumos-gate/blob/master/usr/src/lib/libc/port/sys/epoll.c#L230

That'd explain:

fb7fef216f4a ioctl (d, d001, fb7fffdfa0e0)

That matches the value DP_POLL from:

https://github.com/illumos/illumos-gate/blob/master/usr/src/uts/common/sys/devpoll.h#L44

Or if it's really Solaris, huh, are people moving illumos code back into closed Solaris these days?

As for why it's hanging, I don't know, but one thing that we changed in 14 was that we started using signalfd() to receive latch signals on systems that have it, and illumos also has an emulation of signalfd() that our configure script finds:

https://github.com/illumos/illumos-gate/blob/master/usr/src/uts/common/io/signalfd.c

There were in fact a couple of unexplained hangs on the illumos build farm animals, and then they were changed to use -DWAIT_USE_POLL so that they wouldn't automatically choose epoll()/signalfd().
That is not very satisfactory, but as far as I know there is a bug in either epoll() or signalfd(), or at least some difference compared to the Linux implementations they are emulating.  I spent quite a bit of time ping-ponging emails back and forth with the owner of a hanging BF animal trying to get a minimal repro for a bug report, without success.  I mean, it's possible that the bug is in PostgreSQL (though no complaint has ever reached me about this stuff on Linux), but while trying to investigate it a kernel panic happened[1], which I think counts as a point against that theory...

(For what it's worth, WSL1 also emulates these two Linux interfaces and also apparently doesn't do so well enough for our purposes, also for reasons not understood by us.)

In short, I'd recommend -DWAIT_USE_POLL for now.  It's possible that we could do something to prevent the selection of WAIT_USE_EPOLL on that platform, or that we should have a halfway option of epoll() but not signalfd() (= go back to using the self-pipe trick), patches welcome, but that feels kinda strange and would be a very niche combination that isn't fun to maintain... the real solution is to fix the bug.

[1] https://www.illumos.org/issues/13700
Re: Backends stuck in wait event IPC/MessageQueueInternal
On Fri, May 13, 2022 at 6:16 AM Japin Li wrote:
> The process cannot be terminated by pg_terminate_backend(), although
> it returns true.

pg_terminate_backend() just sends SIGINT.  What I'm wondering is what happens when the stuck process receives SIGINT.  It would be useful, I think, to check the value of the global variable InterruptHoldoffCount in the stuck process by attaching to it with gdb.  I would also try running "strace -p $PID" on the stuck process and then try terminating it again with pg_terminate_backend().  Either the system call in which it's currently stuck returns and then it makes the same system call again and hangs again ... or the signal doesn't dislodge it from the system call in which it's stuck in the first place.  It would be useful to know which of those two things is happening.

One thing I find a bit curious is that the top of the stack in your case is ioctl().  And there are no calls to ioctl() anywhere in latch.c, nor have there ever been.  What operating system is this?  We have 4 different versions of WaitEventSetWaitBlock() that call epoll_wait(), kevent(), poll(), and WaitForMultipleObjects() respectively.  I wonder which of those we're using, and whether one of those calls is showing up as ioctl() in the stacktrace, or whether there's some other function being called in here that is somehow resulting in ioctl() getting called.

--
Robert Haas
EDB: http://www.enterprisedb.com
Re: Backends stuck in wait event IPC/MessageQueueInternal
On Fri, 13 May 2022 at 19:41, Justin Pryzby wrote:
> On Fri, May 13, 2022 at 06:16:23PM +0800, Japin Li wrote:
>> I had an incident on my Postgres 14 that queries hung in wait event
>> IPC / MessageQueueInternal, MessageQueueReceive.  It likes [1],
>> however, it doesn't have any discussions.
>
> If the process is still running, or if the problem recurs, I suggest to create
> a corefile with gcore, aka gdb generate-core-file.  Then, we can look at the
> backtrace at our leisure, even if the cluster needed to be restarted right
> away.

Thanks for your advice, I will try it later.

> What minor version of postgres is this, and what OS ?

PostgreSQL 14.2 and Solaris.

--
Regards,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.
Re: Backends stuck in wait event IPC/MessageQueueInternal
On Fri, May 13, 2022 at 06:16:23PM +0800, Japin Li wrote:
> I had an incident on my Postgres 14 that queries hung in wait event
> IPC / MessageQueueInternal, MessageQueueReceive.  It likes [1],
> however, it doesn't have any discussions.

If the process is still running, or if the problem recurs, I suggest creating a corefile with gcore, aka gdb generate-core-file.  Then we can look at the backtrace at our leisure, even if the cluster needed to be restarted right away.

What minor version of postgres is this, and what OS?

--
Justin
Backends stuck in wait event IPC/MessageQueueInternal
Hi, hackers

I had an incident on my Postgres 14 where queries hung in wait event IPC / MessageQueueInternal, MessageQueueReceive.  It looks like [1]; however, that thread doesn't have any discussion.

The process cannot be terminated by pg_terminate_backend(), although it returns true.

Here is the call stack from pstack:

485073: /opt/local/pgsql/14/bin/postgres
 fb7fef216f4a ioctl (d, d001, fb7fffdfa0e0)
 008b8ec2 WaitEventSetWait () + 112
 008b920f WaitLatch () + 6f
 008bf434 shm_mq_wait_internal () + 64
 008bff74 shm_mq_receive () + 2b4
 0079fdc8 TupleQueueReaderNext () + 28
 0077d8ca gather_merge_readnext () + 13a
 0077db25 ExecGatherMerge () + 215
 00790675 ExecNextLoop () + 175
 00790675 ExecNextLoop () + 175
 0076267d standard_ExecutorRun () + fd
 fb7fe3965fbd pgss_executorRun () + fd
 008df99b PortalRunSelect () + 1cb
 008e0dcf PortalRun () + 17f
 008ddacd PostgresMain () + 100d
 00857f62 ServerLoop () + cd2
 00858cee main () + 453
 005ab777 _start_crt () + 87
 005ab6d8 _start () + 18

Any suggestions?  Thanks in advance!

[1] https://www.postgresql.org/message-id/flat/E9FA92C2921F31408041863B74EE4C2001A479E590%40CCPMAILDAG03.cantab.local

--
Regards,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.