Re: kill: cannot kill some processes
On Sat, Mar 03, 2001 at 08:52:36AM -0500, Cory Snavely wrote: Right now on a big Solaris machine of mine I have about a dozen zombied Perls--parent process (Apache) long gone, and when I -9ed them, their PPIDs became 1 (init). Classic zombie. Hrrrm? Not quite. Init eventually inherits zombie children (when the parent dies), but init reaps the dead children. Perhaps your children aren't dead? Brian, you're right. Now that I look more closely, they're in sleep state. If I just knew why... Problem is, these Perls are running scripts off a software RAID, and thus have it locked. This happened before--when I reboot the server to get rid of the zombies, or some other reason, the filesystem won't unmount, won't get a clean flag, and therefore will force fsck on reboot. As it's over 100GB, a full fsck takes several hours. Now maybe there's something I don't know to recover from this cleanly, or maybe Linux handles it a different way, but it seems like this is an example of zombies causing a real problem. If anyone knows a way around it, I'd be real grateful! Doesn't sound like a zombie to me. A zombie has -no- open files and goes away as soon as init inherits it. A zombie is in state 'Z' on ps. What you describe sounds more like something in state 'D', which is waiting for IO to complete. (This can happen on NFS when things break in just the wrong way for some reason.) They're not zombies because they're not dead yet (they need to release their files before they are really dead). For processes stuck in a 'D' state, there is very little you can do about them. You may be able to sneak out of re-fscking by remounting the drive read-only before rebooting, though. Yeah, that's what I was thinking. Thanks.
Re: kill: cannot kill some processes
On Fri, 2 Mar 2001, Ron Peterson wrote: away. They don't consume any CPU time, or any other resources other than the slot in the process table and the less than 1K of memory required to ... Not entirely true. Init can inherit enough zombie processes that it hits its process limit (1024, if I remember correctly). Can you Well, like I said, they do still take up the slot in the process table. Zombies that *have* been inherited by init go away - it's those that are still waiting for their parent process to check their status that pile up. Init itself doesn't have any limit on the number of zombies it can clean up, otherwise it would be a problem with any long uptime system. 'shutdown'? Nope. Not unless you can free up a slot. And if something's going haywire and spawning zombies quickly, this can be a problem. Linux reserves processes for root so unless your haywire program is running as root you are at least partly shielded from this. control-alt-delete should still be able to reboot the system in such a case (or login as root on the console), if it comes to that. Not a common occurrence, though... This is true :}
Re: kill: cannot kill some processes
One thing about zombie processes: Don't worry about trying to make them go away. They don't consume any CPU time, or any other resources other than the slot in the process table and the less than 1K of memory required to hold their state information. They are not worth worrying about. Not entirely true. Init can inherit enough zombie processes that it hits its process limit (1024, if I remember correctly). Can you 'shutdown'? Nope. Not unless you can free up a slot. And if something's going haywire and spawning zombies quickly, this can be a problem. Not a common occurrence, though... Seconded, although for a different reason and based on an experience on Solaris. Right now on a big Solaris machine of mine I have about a dozen zombied Perls--parent process (Apache) long gone, and when I -9ed them, their PPIDs became 1 (init). Classic zombie. Problem is, these Perls are running scripts off a software RAID, and thus have it locked. This happened before--when I reboot the server to get rid of the zombies, or some other reason, the filesystem won't unmount, won't get a clean flag, and therefore will force fsck on reboot. As it's over 100GB, a full fsck takes several hours. Now maybe there's something I don't know to recover from this cleanly, or maybe Linux handles it a different way, but it seems like this is an example of zombies causing a real problem. If anyone knows a way around it, I'd be real grateful! c
Re: kill: cannot kill some processes
On Sat, Mar 03, 2001 at 08:52:36AM -0500, Cory Snavely wrote: Right now on a big Solaris machine of mine I have about a dozen zombied Perls--parent process (Apache) long gone, and when I -9ed them, their PPIDs became 1 (init). Classic zombie. Hrrrm? Not quite. Init eventually inherits zombie children (when the parent dies), but init reaps the dead children. Perhaps your children aren't dead? Problem is, these Perls are running scripts off a software RAID, and thus have it locked. This happened before--when I reboot the server to get rid of the zombies, or some other reason, the filesystem won't unmount, won't get a clean flag, and therefore will force fsck on reboot. As it's over 100GB, a full fsck takes several hours. Now maybe there's something I don't know to recover from this cleanly, or maybe Linux handles it a different way, but it seems like this is an example of zombies causing a real problem. If anyone knows a way around it, I'd be real grateful! Doesn't sound like a zombie to me. A zombie has -no- open files and goes away as soon as init inherits it. A zombie is in state 'Z' on ps. What you describe sounds more like something in state 'D', which is waiting for IO to complete. (This can happen on NFS when things break in just the wrong way for some reason.) They're not zombies because they're not dead yet (they need to release their files before they are really dead). For processes stuck in a 'D' state, there is very little you can do about them. You may be able to sneak out of re-fscking by remounting the drive read-only before rebooting, though. -- CueCat decoder .signature by Larry Wall: #!/usr/bin/perl -n printf Serial: %s Type: %s Code: %s\n, map { tr/a-zA-Z0-9+-/ -_/; $_ = unpack 'u', chr(32 + length()*3/4) . $_; s/\0+$//; $_ ^= C x length; } /\.([^.]+)/g;
Re: kill: cannot kill some processes
William T Wilson wrote: On Thu, 22 Feb 2001, brian moore wrote: does the process list Z under STAT ? if it is the process has gone zombied and i don't think there is much you can do. sometimes zombie'd processes die on their own eventually many times they will not die until you reboot .. Not quite true... zombies don't ever die: they're already dead. While the description of zombie processes is accurate, I think another likely situation is that the process is in uninterruptible sleep, i.e. the 'D' state. This happens when a process is blocked in a system call - it will be 'D' until the kernel function returns. Kernel bugs, hardware problems, and dead NFS mounts can cause these kernel functions to take a long time or forever. In such a case, you really are stuck; unless the resource the process is waiting for frees up, it's going to hang around until a reboot. One thing about zombie processes: Don't worry about trying to make them go away. They don't consume any CPU time, or any other resources other than the slot in the process table and the less than 1K of memory required to hold their state information. They are not worth worrying about. Not entirely true. Init can inherit enough zombie processes that it hits its process limit (1024, if I remember correctly). Can you 'shutdown'? Nope. Not unless you can free up a slot. And if something's going haywire and spawning zombies quickly, this can be a problem. Not a common occurrence, though... -Ron- GPG and other info at: http://www.yellowbank.com/
Re: kill: cannot kill some processes
On Thu, Feb 22, 2001 at 09:59:47PM -0800, Nate Amsden wrote:

Brian Stults wrote:

As the subject indicates, there are some processes that hang and cannot be killed. Specifically, occasionally dselect will hang while trying to install a package. After waiting for a long time, I try both Ctrl-c and Ctrl-z and neither will work. Then I try to kill the process from another xterm. It looks as if the kill worked, but when I do a ps, the job is still there. It also just happened with df. I use this to kill: kill -9 [pid] Any suggestions?

does the process list Z under STAT ? if it is the process has gone zombied and i don't think there is much you can do. sometimes zombie'd processes die on their own eventually many times they will not die until you reboot ..

Not quite true... zombies don't ever die: they're already dead.

So what are they doing there? Simple: every process returns some stuff to its parent: time used, and a bunch of other stuff, most importantly the value 'exit()' returns. A zombie is a process which has died, but its parent has not collected that information. It has to be held until the parent collects it.

A proper parent process will notice the child died, call wait(), and collect the return code and anything else it needs. If the parent dies without collecting the return status, the zombie is inherited by init. init is smart enough to collect the status and let the zombie go to the grave.

So the bug in a 'zombie' is in the process that spawned it. Kill that parent, and the zombie children will go away. Most importantly, though, it is a bug and should be fixed. It's not difficult to write code that correctly reaps dead children.

--
CueCat decoder .signature by Larry Wall: #!/usr/bin/perl -n printf Serial: %s Type: %s Code: %s\n, map { tr/a-zA-Z0-9+-/ -_/; $_ = unpack 'u', chr(32 + length()*3/4) . $_; s/\0+$//; $_ ^= C x length; } /\.([^.]+)/g;
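The point about reaping dead children can be sketched roughly as follows. This is only a minimal illustration in Python, assuming a Unix-like system where `os.fork()` is available; the function name `spawn_and_reap` is made up for the example and does not come from any program discussed in this thread.

```python
import os


def spawn_and_reap(n_children=3):
    """Fork n_children short-lived children, then reap each one.

    Hypothetical helper for illustration only. Returns a dict mapping
    each child PID to the exit status the parent collected.
    """
    pids = []
    for i in range(n_children):
        pid = os.fork()
        if pid == 0:
            # Child: exit immediately with a distinct status code.
            os._exit(i)
        pids.append(pid)

    statuses = {}
    for pid in pids:
        # waitpid() collects the exit status. Until the parent does this,
        # an exited child stays in the process table as a zombie.
        _, raw_status = os.waitpid(pid, 0)
        statuses[pid] = os.WEXITSTATUS(raw_status)
    return statuses


if __name__ == "__main__":
    for pid, code in spawn_and_reap().items():
        print(f"child {pid} exited with status {code}")
```

A long-running parent would more likely reap from a SIGCHLD handler or a periodic `os.waitpid(-1, os.WNOHANG)` loop rather than blocking like this, but the idea is the same: somebody has to call wait() for each child.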
Re: kill: cannot kill some processes
On Thu, 22 Feb 2001, brian moore wrote:

does the process list Z under STAT ? if it is the process has gone zombied and i don't think there is much you can do. sometimes zombie'd processes die on their own eventually many times they will not die until you reboot ..

Not quite true... zombies don't ever die: they're already dead.

While the description of zombie processes is accurate, I think another likely situation is that the process is in uninterruptible sleep, i.e. the 'D' state. This happens when a process is blocked in a system call - it will be 'D' until the kernel function returns. Kernel bugs, hardware problems, and dead NFS mounts can cause these kernel functions to take a long time or forever. In such a case, you really are stuck; unless the resource the process is waiting for frees up, it's going to hang around until a reboot.

One thing about zombie processes: Don't worry about trying to make them go away. They don't consume any CPU time, or any other resources other than the slot in the process table and the less than 1K of memory required to hold their state information. They are not worth worrying about.
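To make the 'Z' versus 'D' distinction concrete, here is a small sketch that maps the first letter of a `ps` STAT value to what it means and what can be done about it. The helper names `describe_stat` and `advice` are invented for this example, and the advice strings just paraphrase what has been said in this thread.

```python
# First letter of the STAT column as printed by ps.
STATE_MEANINGS = {
    "R": "running or runnable",
    "S": "interruptible sleep",
    "D": "uninterruptible sleep (usually waiting on I/O)",
    "Z": "zombie (exited, not yet reaped by its parent)",
    "T": "stopped",
}


def describe_stat(stat):
    """Return a description for a STAT value such as 'D+' or 'Z'."""
    return STATE_MEANINGS.get(stat[:1], "unknown state")


def advice(stat):
    """Return a hint about why kill -9 appears to do nothing."""
    letter = stat[:1]
    if letter == "Z":
        # Already dead; only the parent (or init) can reap it.
        return "already dead: fix or kill the parent so it gets reaped"
    if letter == "D":
        # Blocked inside the kernel; it stays until the call returns.
        return "blocked in the kernel: wait for the I/O or fix the resource"
    return "an ordinary signal should work"


if __name__ == "__main__":
    for stat in ("Z", "D+", "S"):
        print(stat, "->", describe_stat(stat), "|", advice(stat))
```

Running something like `ps -eo pid,stat,comm` and feeding the STAT column through these helpers would show at a glance which stuck processes are zombies and which are waiting on I/O.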
Re: kill: cannot kill some processes
"William" == William T Wilson [EMAIL PROTECTED] writes:

William> One thing about zombie processes: Don't worry about trying
William> to make them go away. They don't consume any CPU time,
William> or any other resources other than the slot in the process
William> table and the less than 1K of memory required to hold
William> their state information. They are not worth worrying
William> about.

My understanding is that zombie processes occur when the task dies, but the parent hasn't called waitpid(...) on that process yet. The Linux kernel needs to keep track of the process so it can return the process's status, in case the parent does call waitpid(...).

If you see a zombie persist, then the parent process is probably slow or buggy, as it has not yet done the right thing. Of course, you could argue that the penalty is small, but IMHO, it is still a bug if the parent doesn't clean up after its children.

--
Brian May [EMAIL PROTECTED]
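The window between a child dying and its parent calling waitpid() can be observed directly. The sketch below is Linux-only, since it assumes a `/proc/<pid>/stat` file whose state letter follows the closing parenthesis of the command name; the names `proc_state` and `observe_zombie` are made up for the example.

```python
import os
import time


def proc_state(pid):
    """Return the one-letter state of pid from /proc/<pid>/stat (Linux)."""
    with open(f"/proc/{pid}/stat") as f:
        data = f.read()
    # The command name is in parentheses and may itself contain spaces,
    # so split on the last ')' before taking the state field.
    return data.rsplit(")", 1)[1].split()[0]


def observe_zombie(timeout=2.0):
    """Fork a child that exits at once, watch it turn into a zombie,
    then reap it.

    Returns (state_seen_before_wait, gone_after_wait).
    """
    pid = os.fork()
    if pid == 0:
        os._exit(0)  # child exits immediately

    # Poll until the kernel reports the child as a zombie ('Z').
    deadline = time.monotonic() + timeout
    state = proc_state(pid)
    while state != "Z" and time.monotonic() < deadline:
        time.sleep(0.01)
        state = proc_state(pid)

    # Collecting the status is what releases the process table slot.
    os.waitpid(pid, 0)
    gone = not os.path.exists(f"/proc/{pid}")
    return state, gone


if __name__ == "__main__":
    state, gone = observe_zombie()
    print(f"state before waitpid: {state}, gone after waitpid: {gone}")
```

The child sits in state 'Z' for exactly as long as the parent delays the waitpid() call, which is why a persistent zombie points at the parent rather than at the zombie itself.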
Re: kill: cannot kill some processes
William T Wilson [EMAIL PROTECTED] wrote: On Thu, 22 Feb 2001, brian moore wrote: does the process list Z under STAT ? if it is the process has gone zombied and i don't think there is much you can do. sometimes zombie'd processes die on their own eventually many times they will not die until you reboot .. Not quite true... zombies don't ever die: they're already dead. While the description of zombie processes is accurate, I think another likely situation is that the process is in uninterruptible sleep, i.e. the 'D' state. This happens when a process is blocked in a system call - it will be 'D' until the kernel function returns. Kernel bugs, hardware problems, and dead NFS mounts can cause these kernel functions to take a long time or forever. A hint for NFS: mount your NFS filesystems with the '-o intr' mount option. That way you'll be able to interrupt system calls related to them. -- Colin Watson [EMAIL PROTECTED]
Re: kill: cannot kill some processes
William T Wilson wrote: On Thu, 22 Feb 2001, brian moore wrote: does the process list Z under STAT ? if it is the process has gone Not quite true... zombies don't ever die: they're already dead. While the description of zombie processes is accurate, I think another likely situation is that the process is in uninterruptible sleep, i.e. the 'D' state. This happens when a process is blocked in a system call - it will be 'D' until the kernel function returns. Kernel bugs, hardware problems, and dead NFS mounts can cause these kernel functions to take a long time or forever. Thanks to everyone for the help. It turns out the processes are indeed in uninterruptible sleep (or 'D' under STAT). And, as suggested, I believe it is due to a failed NFS mount. I will heed Colin Watson's instruction to mount these with the '-o intr' option and see if that helps. Thanks again! -- Brian J. Stults Doctoral Candidate Department of Sociology University at Albany - SUNY Phone: (518) 442-4652 Fax: (518) 442-4936 Web: http://www.albany.edu/~bs7452
kill: cannot kill some processes
As the subject indicates, there are some processes that hang and cannot be killed. Specifically, occasionally dselect will hang while trying to install a package. After waiting for a long time, I try both Ctrl-c and Ctrl-z and neither will work. Then I try to kill the process from another xterm. It looks as if the kill worked, but when I do a ps, the job is still there. It also just happened with df. I use this to kill: kill -9 [pid] Any suggestions? -- Brian J. Stults Doctoral Candidate Department of Sociology University at Albany - SUNY Phone: (518) 442-4652 Fax: (518) 442-4936 Web: http://www.albany.edu/~bs7452
Re: kill: cannot kill some processes
Brian Stults wrote: As the subject indicates, there are some processes that hang and cannot be killed. Specifically, occasionally dselect will hang while trying to install a package. After waiting for a long time, I try both Ctrl-c and Ctrl-z and neither will work. Then I try to kill the process from another xterm. It looks as if the kill worked, but when I do a ps, the job is still there. It also just happened with df. I use this to kill: kill -9 [pid] Any suggestions? does the process list Z under STAT ? if it is the process has gone zombied and i don't think there is much you can do. sometimes zombie'd processes die on their own eventually many times they will not die until you reboot .. its rare..but it can happen. nate -- ::: ICQ: 75132336 http://www.aphroland.org/ http://www.linuxpowered.net/ [EMAIL PROTECTED]