The NetBSD shell (and, I suspect, many others, perhaps all others) waits for any terminated children (reaps them from the kernel) more or less as soon as they exit, then remembers the info in its internal jobs table so that it can report status later via "wait $pid" or "jobs" (or just at an interactive prompt) at the appropriate time.
This has the advantage that zombie processes are removed from the kernel's process table quickly, so it isn't cluttered with trash lying around because some script is running lots of background processes without waiting for any of them. The only cost is (or seems to be) some memory in the shell's jobs table (which the standard allows us to bound, if we desire).

However, I have been pondering a somewhat weird case (or more correctly, possibility, as I have never actually seen it happen).

Consider

    bg-process-1 & PID1=$!
    long-running-monster-fg-process
    bg-process-2 & PID2=$!

"long-running-monster-fg-process" is something like a complete system build, including lots of add-on utilities (imagine gnome and all that goes with it, and kde, and all associated with that ...). It doesn't really matter what it is, except that lots of processes are being run. It is irrelevant whether those are lots of children of the current shell, or whether it is a script (or "make" or something) that simply takes a long time to complete.

In this case, and with the shell strategy above, it is possible that PID1 and PID2 contain the same value: bg-process-1 has already been reaped, so the kernel is free to hand its pid out again.

In that case, if both background processes have exited, and the script then does

    wait $PID1

what are we supposed to do? How are we to distinguish that from

    wait $PID2

Does anyone know of a shell that correctly handles this now?

The only solutions I can see are to:

Only ever use waitpid() with an explicit pid for the particular process whose exit status we actually want, and leave all other completed processes as zombies until they are wanted (any of the other newer wait*() sys calls with similar functionality would do as well, of course). This would mean that while a pipeline is running, we would be unable to report the status of earlier completed elements of the pipe while the final (rightmost) process is still yet to complete, which would be annoying (but not actually fatal to anything).
It would also mean that there would be no way to retain the

    wait -p PID -n $PID1 $PID2 ...

command option that the NetBSD shell has, which waits for any one of the specified jobs to finish: either one that has already finished (in which case there is no actual wait, and an arbitrary one of the completed jobs is selected) or, if none were already done, the next of them that happens to finish. (The "-p PID" option names a variable in which the ID of the job that finished is placed: the same as the arg string if there is one or, with no pid args, the pid of the job (what $! was when the job started). The exit status of the wait command is the status of that job.) That relies on being able to wait for any child to exit.

Or:

We always use wait*() with the WNOWAIT flag when waiting for any random child to complete, and then wait*() again (without WNOWAIT, but with the explicit pid) when we want to clean up the jobs table entry for that job.

The problem with this (aside from WNOWAIT in the standard only applying to waitid() - in practice I suspect that all of the wait*() sys calls that take a flags arg implement the same set of flags - certainly NetBSD does) is that I see no way to prevent that child process being returned again and again every time we do an anonymous wait*() system call. That is, I see no way to wait for something not previously ever waited upon, which is what we would need here. The kernel would need a bunch more mechanism, and a new WXXXXX flag would be required. NetBSD has WNOZOMBIE ("Ignore zombies"), which only waits for some running process to change status, but that's no use: we want to get status from processes that have already exited (i.e. zombies) if there are any - just only once.

Of course, both of these "solutions" mean keeping zombies in the kernel process table - that's the point, as that prevents the kernel from re-using the process ID.
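
To make the WNOWAIT problem above concrete, here is a minimal C sketch. It is an illustration only, not code from the NetBSD shell or any other shell; the function name peek_twice_then_reap() and the exit status 7 are made up for the example, and it assumes a system where waitid() supports WEXITED | WNOWAIT with P_ALL. It forks one child, does two anonymous waits with WNOWAIT, and only then reaps the child by explicit pid.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper, for illustration only.
 * Forks one child that exits with status 7, peeks at it twice using
 * WNOWAIT, then reaps it by explicit pid.
 * Returns 1 if both peeks reported that same child and the final reap
 * saw exit status 7, 0 if not, -1 if a system call failed.
 */
int
peek_twice_then_reap(void)
{
	siginfo_t first, second;
	int status;
	pid_t pid = fork();

	if (pid == -1)
		return -1;
	if (pid == 0)
		_exit(7);

	/* Anonymous wait: collects the status, leaves the child a zombie. */
	if (waitid(P_ALL, 0, &first, WEXITED | WNOWAIT) == -1)
		return -1;
	/* A second anonymous wait reports the very same child again. */
	if (waitid(P_ALL, 0, &second, WEXITED | WNOWAIT) == -1)
		return -1;
	/* Only a wait without WNOWAIT frees the pid for re-use. */
	if (waitpid(pid, &status, 0) != pid)
		return -1;

	return first.si_pid == pid && second.si_pid == pid &&
	    WIFEXITED(status) && WEXITSTATUS(status) == 7;
}
```

With a single child the repeated report is harmless, but a shell with several exited children has no flag here to ask for "one I have not seen yet", which is the missing mechanism described above.
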
Or:

Every time the shell forks, before running any of the subshell code, the child could check whether the PID it was assigned is one that is still "live" in the jobs table, and if so, simply exit without doing anything. Simultaneously the parent does the same check using the new child's PID. Since the child is simply a fork() of the parent, the data structures the two see are identical, so both child and parent will answer that check the same way. When the check reports "still in use" the child simply exits (as mentioned). The parent simply does a waitpid(PID, ...) to clean up that child (without ever having entered it into any data structs) and then forks again, and the whole process repeats.

This is the solution I see with most promise, but it relies upon the kernel not simply assigning the same pid over and over again (even if there happens to be only one unused pid available to assign). To deal with this the parent shell would need something like a counter of attempts, and if we fail to get a new pid after a few attempts, give up and signal a fork error.

This looks kind of cumbersome and ugly to me, even though I don't currently see any other plausible solution that meets our goals.

I'd love to hear from anyone who has (or can even imagine, regardless of whether it is currently implemented anywhere) a better solution for this issue. Or, if for some reason that I am not seeing this isn't even a potential problem (it is certainly an extremely unlikely one), then why.

kre

ps: note that we don't currently have a problem with the kernel assigning the pid of a previously exited process which is still alive in the jobs table; the shell can cope with that. The issue only arises when that pid is communicated to the script, and then used by the script. A similar problem would arise if the script attempted

    kill $PID1

after bg-process-1 has finished (without the script realising that), which then ends up signalling $PID2 (the same value), which is still running.
Of course, a similar problem can happen here, without PID2 being involved - with the script simply signalling some unintended process. The only way of avoiding that would be to keep the zombies until the script has been made aware that the process is completed, after which it is simply a script bug if it tries to kill a process it knows is already complete.
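
For reference, the fork-and-retry idea from the main message (the third option) could be sketched in C roughly as follows. This is a sketch under stated assumptions, not an implementation from any shell: the names pid_is_live() and fork_unremembered() are hypothetical, the jobs table is reduced to a plain array of pids, and reporting exhaustion as EAGAIN is an arbitrary choice for the example.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for a lookup in the shell's jobs table. */
static int
pid_is_live(const pid_t *live, size_t nlive, pid_t pid)
{
	for (size_t i = 0; i < nlive; i++)
		if (live[i] == pid)
			return 1;
	return 0;
}

/*
 * fork() wrapper that refuses a pid still remembered in the table.
 * Returns like fork(): 0 in the child, the child's pid in the parent,
 * or -1 with errno set if fork() failed or every one of max_tries
 * attempts produced a pid that is still remembered.
 */
pid_t
fork_unremembered(const pid_t *live, size_t nlive, int max_tries)
{
	for (int attempt = 0; attempt < max_tries; attempt++) {
		pid_t pid = fork();

		if (pid == -1)
			return -1;
		if (pid == 0) {
			/* Child: same table as the parent, same answer. */
			if (pid_is_live(live, nlive, getpid()))
				_exit(0);
			return 0;
		}
		if (!pid_is_live(live, nlive, pid))
			return pid;
		/* Collision: reap the discarded child, then try again. */
		while (waitpid(pid, NULL, 0) == -1 && errno == EINTR)
			continue;
	}
	errno = EAGAIN;		/* give up and report a fork error */
	return -1;
}
```

The attempt counter is what guards against a kernel that keeps handing back the same pid; whether a few attempts are enough in practice depends on how the kernel allocates pids, which this sketch does not try to answer.
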