> Shouldn't kill -9 kill a process no matter what?
Ordinarily, this is true. However, a process can get stuck in a state
where even kill -9 can't kill it. I have seen this happen when a disk
device (for example) fails in a spectacular way. A process that was
blocked on a read from that device can end up in an uninterruptible
disk wait (ps shows it with a "D" in the process status column). The
signal just stays pending until the I/O completes, which may be never.
In fact, it recently happened to me...
I have FreeBSD on my laptop, and FreeBSD on another machine here at
home. On the server machine, I keep a copy of the FreeBSD CVS
repository, which I typically mount via NFS onto my other client
machines. I often suspend the laptop and take it to other locations
(when you're a consultant, you're often expected to provide your own
hardware). I resume, work, and suspend on a regular basis.
Now, at some point, I suspended while this NFS filesystem was still
mounted. Reaching a remote location, I discovered that the filesystem
was not available (obviously) and decided to unmount it. Much to my
dismay, umount got itself into a D state trying to unmount the
filesystem (presumably there were buffers waiting to be flushed or
some such).
Ultimately, I had to reboot the laptop (no big deal, I'm forced to
reboot periodically when the battery freaks out on me) to get rid of
the stuck umount process.
Note that this is totally different from a "Z" state or zombie process,
which might be what you were looking at. Did your unkillable processes
have a Z in the state column? If so, they were actually really dead,
but still holding a process table slot because they had not been
"waited" for yet. In Unix, when a process exits, it generally has some
type of exit status to report (usually zero, but occasionally some
other number, see /usr/include/sysexits.h for some examples). It is
the responsibility of the parent process to "wait" (by calling the wait
system call) for these deceased processes and retrieve their exit
status. Normally, your shell does this automatically for the processes
it spawns, but occasionally a parent process fails to wait for its
children (sloppy coding, generally) and these "zombie" processes hang
around in the process table.
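
Here is a rough sketch of that life cycle in Python, in case it helps to
see it run. The function names (pid_exists, make_and_reap_zombie) are my
own for illustration, and the half-second sleep is just an assumed
"long enough" delay for the child to exit. It forks a child that exits
with status 3, checks that the dead child still holds a process table
slot, then waits for it and checks again.

```python
import os
import time


def pid_exists(pid):
    """Return True if pid still occupies a process table slot."""
    try:
        os.kill(pid, 0)  # signal 0 delivers nothing; it only checks the pid
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists, we just may not signal it
    return True


def make_and_reap_zombie():
    """Fork a child, let it become a zombie, then reap it.

    Returns (existed_before_wait, exit_status, existed_after_wait).
    """
    pid = os.fork()
    if pid == 0:
        os._exit(3)  # child: exit at once with status 3

    time.sleep(0.5)  # parent: the child has exited but is not yet waited for
    before = pid_exists(pid)  # the zombie still holds its slot

    _, status = os.waitpid(pid, 0)  # "wait" for the child, freeing the slot
    exit_status = os.WEXITSTATUS(status)
    after = pid_exists(pid)
    return before, exit_status, after
```

Barring the pid being reused immediately by some other process, I would
expect make_and_reap_zombie() to return (True, 3, False). If you run ps
from another terminal during the sleep, the child should show a Z in
the state column.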
So how do you get rid of zombies without rebooting? ps -l will show
you the parent process id of each process, and you can then hunt down
the parent that is refusing to wait for its children and kill it.
Once a process that had children has been killed, those children are
"inherited" by init (process number one, the ancestor of every other
process on the system). init has the responsibility to wait for
everything it inherits, so the zombies get reaped.
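
If you would rather not eyeball the ps output, something like the
following Python sketch can do the hunting. The function names are
hypothetical, and I am assuming a ps that accepts -A and the "stat"
output keyword (the procps and BSD versions I have used do; a
stripped-down ps may not). Each -o is given separately because a
trailing "=" swallows the rest of the argument as the column header.

```python
import subprocess


def parse_zombies(ps_text):
    """Pick zombie rows out of 'pid ppid stat command' lines.

    Returns a list of (pid, ppid, command) tuples.
    """
    zombies = []
    for line in ps_text.splitlines():
        fields = line.split(None, 3)  # the command may contain spaces
        if len(fields) < 3:
            continue  # blank or malformed line
        pid, ppid, stat = fields[0], fields[1], fields[2]
        command = fields[3] if len(fields) > 3 else ""
        if stat.startswith("Z"):  # Z, Z+, Zs, ... are all zombies
            zombies.append((int(pid), int(ppid), command))
    return zombies


def list_zombies():
    """Run ps (assumed to support -A and -o stat=) and return the zombies."""
    result = subprocess.run(
        ["ps", "-A", "-o", "pid=", "-o", "ppid=", "-o", "stat=", "-o", "comm="],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_zombies(result.stdout)
```

Calling list_zombies() gives you each zombie's pid along with the pid
of the parent that has not waited for it, and that parent is the
process to go after.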
I hope this helps some people understand a little about how child
processes are reaped and what to do about the processes that just won't
die.
-jan-
--
Jan L. Peterson
Unemployed "Computer Facilitator"
http://www.peterson.ath.cx/~jlp/resume.html
____________________
BYU Unix Users Group
http://uug.byu.edu/