George-Cristian Bîrzan wrote:
On Mon, 2006-02-13 at 13:39 -0500, Alan DeKok wrote:
waitpid(22058, NULL, WNOHANG) = -1 ECHILD (No child processes)
22058 exists, and is another process that hung.
That's definitely a bug in your OS somewhere. If 22058 exists, and
is "defunct", then the parent process calling "waitpid" on it should
*never* get ECHILD.
It's not the parent, from what I can tell. In that case it should
return ECHILD, no?
No, it should not do that, and that's why you are having problems.
POSIX MANDATES that any thread in a thread group can reap any child
process, not just the thread that created the process. The Linux manpage
for waitpid says this was fixed in the 2.4 kernel, but clearly it was not.
Possibly the problem lies in the pthread library (LinuxThreads) you're
using, which is bundled with glibc. LinuxThreads runs each thread as a
separate kernel process, so as far as the kernel is concerned a thread
other than the one that forked is not the child's parent. There's also
the morass of SIGCHLD handling...
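
To check whether a given box honours that rule, a minimal sketch in C
along these lines might help. The main thread forks a child and a second
thread tries to reap it. The names reaper and reap_from_other_thread are
made up for illustration, and I am assuming SIGCHLD should be at its
default disposition for the check to mean anything.

```c
#include <errno.h>
#include <pthread.h>
#include <signal.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Thread body: reap a child that was forked by a different thread. */
static void *reaper(void *arg)
{
    pid_t child = *(pid_t *)arg;
    int status = 0;
    pid_t got;

    /* Retry if a signal interrupts the wait. */
    do {
        got = waitpid(child, &status, 0);
    } while (got == -1 && errno == EINTR);

    if (got == -1)
        perror("waitpid in reaper thread"); /* ECHILD here is the symptom */

    /* Hand the outcome back through the thread's return value. */
    return (void *)(intptr_t)(got == child ? 0 : -1);
}

/* Returns 0 if a thread other than the forking one could reap the child,
   -1 otherwise. */
int reap_from_other_thread(void)
{
    pthread_t tid;
    void *result = NULL;
    pid_t child;

    /* With SIGCHLD ignored, children are reaped automatically and
       waitpid() fails with ECHILD, so force the default disposition. */
    signal(SIGCHLD, SIG_DFL);

    child = fork();
    if (child == -1)
        return -1;
    if (child == 0)
        _exit(0); /* child exits at once and waits to be reaped */

    if (pthread_create(&tid, NULL, reaper, &child) != 0) {
        waitpid(child, NULL, 0); /* do not leave the child behind */
        return -1;
    }
    if (pthread_join(tid, &result) != 0)
        return -1;

    return (int)(intptr_t)result;
}
```

Call reap_from_other_thread() from main() and link with -pthread. A
return of 0 means the second thread reaped the child; an ECHILD reported
by the reaper thread would reproduce the symptom in the strace output.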
I would suggest trying a system with proper NPTL threading (2.6 kernel,
recent libc compiled appropriately) which will almost certainly work.
Upgrading your main system to that may be a little more... involved.
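
Before upgrading, it may be worth checking which thread library the
running glibc actually provides. A small sketch, assuming a glibc that
defines the _CS_GNU_LIBPTHREAD_VERSION extension to confstr();
thread_library_version and is_nptl are hypothetical helper names, and
other C libraries may not report anything at all.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Copy the thread library's name and version into buf.
   Returns 0 on success, -1 if the C library does not report it
   or the buffer is too small. */
int thread_library_version(char *buf, size_t len)
{
#ifdef _CS_GNU_LIBPTHREAD_VERSION
    /* confstr() returns the size needed, including the trailing NUL,
       or 0 if the name is not supported. */
    size_t n = confstr(_CS_GNU_LIBPTHREAD_VERSION, buf, len);
    return (n > 0 && n <= len) ? 0 : -1;
#else
    (void)buf;
    (void)len;
    return -1;
#endif
}

/* Returns 1 if the reported string starts with "NPTL", 0 otherwise. */
int is_nptl(const char *version)
{
    return strncmp(version, "NPTL", 4) == 0;
}
```

With glibc the reported string typically starts with either "NPTL" or
"linuxthreads", so is_nptl() on the result tells you which one the
process is really using.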