Hello, I'm trying to solve a long-running problem whereby my Apache mod_perl processes get stuck in a "FUTEX_WAIT" state instead of exiting.
I believe this is the same issue as reported here:
http://www.gossamer-threads.com/lists/modperl/modperl/99879

The problem occurs fairly frequently following a burst of traffic, when Apache spawns new processes, then attempts to cull them afterward. It also occurred, before I disabled this, when Apache tried to cull a process upon reaching MaxRequestsPerChild.

Usually, from the child's point of view, this looks like this:

    $ strace -p 21764
    Process 21764 attached - interrupt to quit
    read(5, "!", 1) = 1
    tgkill(21764, 21791, SIGHUP) = 0
    tgkill(21764, 21791, SIG_0) = 0
    select(0, NULL, NULL, NULL, {0, 500000}) = ? ERESTARTNOHAND (To be restarted)
    --- SIGTERM (Terminated) @ 0 (0) ---
    rt_sigreturn(0xf) = -1 EINTR (Interrupted system call)
    munmap(0x7f9905750000, 8392704) = 0
    munmap(0x7f98f8736000, 8392704) = 0
    ...
    madvise(0x7f98e4021000, 73728, MADV_DONTNEED) = 0
    exit_group(0) = ?
    Process 21764 detached

However, every five or so attempts, it instead goes like this:

    $ strace -p 24133
    Process 24133 attached - interrupt to quit
    read(5, "!", 1) = 1
    tgkill(24133, 24164, SIGHUP) = 0
    tgkill(24133, 24164, SIG_0) = 0
    --- SIGTERM (Terminated) @ 0 (0) ---
    rt_sigreturn(0xf) = 0
    select(0, NULL, NULL, NULL, {0, 500000}) = 0 (Timeout)
    tgkill(24133, 24140, SIGUSR1) = 0
    futex(0x7f9904f4e9d0, FUTEX_WAIT, 24140, NULL

... and goes no further. Sometimes, after a few minutes of doing nothing, the process will suddenly free itself, spit out a bunch of "munmap" calls, and exit. But more often it hangs indefinitely. Given time, these hung children accumulate until they occupy all available RAM, which sends the box into swap and eventually crashes it.

This problem has occurred on various flavors of Apache & Ubuntu over the last two years. I'm currently seeing it regularly on the two boxes I manage, which are:

- Apache/2.2.17 (Ubuntu) mod_perl/2.0.4 Perl/v5.10.1 on Ubuntu 11.04 (2.6.38-11-generic #50-Ubuntu SMP x86_64).
- Apache/2.2.14 (Ubuntu) mod_perl/2.0.4 Perl/v5.10.1 on Ubuntu 10.04 (2.6.32-30-server #59-Ubuntu SMP x86_64).

The problem does not occur on Apache running without mod_perl.

I have tried to debug this problem for a long time, but don't know how to advance any further. Thanks in advance for any advice!

Max.