I've been observing a race in the ptrace(2) ATF tests. The function await_zombie() checks whether a process (traced by a debugger) has died and is now a zombie. The current code might return a single process twice, first from the allproc list and later from zombproc. In both cases the process that is returned twice has SDYING status.
Reproducer for NetBSD, FreeBSD, Linux:

http://netbsd.org/~kamil/kernel/attach2b.c

This code is stable on FreeBSD and Linux, but can break on NetBSD with return value -1 and errno ENOMEM.

Patch with a potential fix:

http://netbsd.org/~kamil/patch-00046-zombielist-marker-sysctl_doeproc.txt

I'm adding an additional marker at the head of the zombproc list and skipping zombie processes that were added to the list during the iteration over the processes. This approach opens up a potential scenario in which a process is silently moved to the zombproc list and thus becomes invisible in another race.

Ideally we would eliminate the proc_lock relock during the operation over all process lists. One solution would be to wire this buffer and rule out EFAULT scenarios (and potentially remove the need to relock proc_lock)... however, so far nothing in the kernel does this, and I'm not sure this is the right time to go down that rabbit hole rather than just find a quick solution.