On Fri, 13 May 2022 at 22:08, Robert Haas <robertmh...@gmail.com> wrote:
> On Fri, May 13, 2022 at 6:16 AM Japin Li <japi...@hotmail.com> wrote:
>> The process cannot be terminated by pg_terminate_backend(), although
>> it returns true.
>
> pg_terminate_backend() just sends SIGINT. What I'm wondering is what
> happens when the stuck process receives SIGINT. It would be useful, I
> think, to check the value of the global variable InterruptHoldoffCount
> in the stuck process by attaching to it with gdb. I would also try
> running "strace -p $PID" on the stuck process and then try terminating
> it again with pg_terminate_backend(). Either the system call in which
> it's currently stuck returns and then it makes the same system call
> again and hangs again ... or the signal doesn't dislodge it from the
> system call in which it's stuck in the first place. It would be useful
> to know which of those two things is happening.
>
> One thing I find a bit curious is that the top of the stack in your
> case is ioctl(). And there are no calls to ioctl() anywhere in
> latch.c, nor have there ever been. What operating system is this? We
> have 4 different versions of WaitEventSetWaitBlock() that call
> epoll_wait(), kevent(), poll(), and WaitForMultipleObjects()
> respectively. I wonder which of those we're using, and whether one of
> those calls is showing up as ioctl() in the stacktrace, or whether
> there's some other function being called in here that is somehow
> resulting in ioctl() getting called.

Thanks for your advice. I will try this on Monday.

--
Regards,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.