Re: Re: Unable to kill runaway app.
Rick Stevens wrote:
> Todd Denniston wrote:
> > Bob Goodwin wrote:
> > > I've added the option soft to the client /etc/fstab which may make it possible to interrupt things? That is, if I have done the right thing in the right place.
> > >
> > > Bob
> >
> > Assuming that after you reboot[1], the situation is better with soft, I would suggest going back to hard but use the intr[2] option, i.e.
> >
> > server:/usr/local/pub /pub nfs hard,intr
> >
> > I have seen soft loose data on networks that are somewhat loaded, without even giving you any error notifications. The probability seemed somewhat proportional to how many times larger the file you are writing is than the wsize parameter.
>
> It's "lose" (as in "lost") not "loose" (as in "running wild"). English lessons aside,

I sometimes dislike my 'mother' tongue.

> did you use TCP instead of the default UDP on that heavily loaded network?

TCP was not available on the server of that time (Solaris 2.6, or was it 2.5).

> > [1] so that the process that is currently stuck and CAN NOT be killed is finally terminated. :)
> > [2] man nfs|grep -3 EINTR or read the man page and search for intr

--
Todd Denniston
Crane Division, Naval Surface Warfare Center (NSWC Crane)
Harnessing the Power of Technology for the Warfighter

--
fedora-list mailing list
fedora-list@redhat.com
To unsubscribe: https://www.redhat.com/mailman/listinfo/fedora-list
Guidelines: http://fedoraproject.org/wiki/Communicate/MailingListGuidelines
Re: Unable to kill runaway app.
Todd Denniston wrote:
> Bob Goodwin wrote:
> > I've added the option soft to the client /etc/fstab which may make it possible to interrupt things? That is, if I have done the right thing in the right place.
> >
> > Bob
>
> Assuming that after you reboot[1], the situation is better with soft, I would suggest going back to hard but use the intr[2] option, i.e.
>
> server:/usr/local/pub /pub nfs hard,intr
>
> I have seen soft loose data on networks that are somewhat loaded, without even giving you any error notifications. The probability seemed somewhat proportional to how many times larger the file you are writing is than the wsize parameter.

It's "lose" (as in "lost") not "loose" (as in "running wild"). English lessons aside, did you use TCP instead of the default UDP on that heavily loaded network?

> [1] so that the process that is currently stuck and CAN NOT be killed is finally terminated. :)
> [2] man nfs|grep -3 EINTR or read the man page and search for intr

--
Rick Stevens, Systems Engineer    ri...@nerd.com
AIM/Skype: therps2    ICQ: 22643734    Yahoo: origrps2

Vegetarian: Old Indian word for "lousy hunter"
Re: Unable to kill runaway app.
Todd Denniston wrote:
> Bob Goodwin wrote:
> > I've added the option soft to the client /etc/fstab which may make it possible to interrupt things? That is, if I have done the right thing in the right place.
> >
> > Bob
>
> Assuming that after you reboot[1], the situation is better with soft, I would suggest going back to hard but use the intr[2] option, i.e.
>
> server:/usr/local/pub /pub nfs hard,intr
>
> I have seen soft loose data on networks that are somewhat loaded, without even giving you any error notifications. The probability seemed somewhat proportional to how many times larger the file you are writing is than the wsize parameter.
>
> [1] so that the process that is currently stuck and CAN NOT be killed is finally terminated. :)
> [2] man nfs|grep -3 EINTR or read the man page and search for intr

Yes, I saw "intr" in a page I found via Google while investigating "soft" and how to apply it. I wondered about it ... I will try it too. Thank you.

Bob
Re: Re: Unable to kill runaway app.
Bob Goodwin wrote:
> I've added the option soft to the client /etc/fstab which may make it possible to interrupt things? That is, if I have done the right thing in the right place.
>
> Bob

Assuming that after you reboot[1], the situation is better with soft, I would suggest going back to hard but use the intr[2] option, i.e.

server:/usr/local/pub /pub nfs hard,intr

I have seen soft loose data on networks that are somewhat loaded, without even giving you any error notifications. The probability seemed somewhat proportional to how many times larger the file you are writing is than the wsize parameter.

[1] so that the process that is currently stuck and CAN NOT be killed is finally terminated. :)
[2] man nfs|grep -3 EINTR or read the man page and search for intr

--
Todd Denniston
Crane Division, Naval Surface Warfare Center (NSWC Crane)
Harnessing the Power of Technology for the Warfighter
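For anyone who wants to double-check which of these options an fstab entry actually carries, here is a minimal Python sketch. It assumes the usual whitespace-separated fstab layout (device, mount point, type, options), and the helper names `parse_fstab_line` and `nfs_notes` are made up for illustration. The example entry is the one suggested above, read as four fields. Whether intr is still honoured depends on the kernel (the nfs man page on later kernels describes it as deprecated), so treat the notes as reminders rather than as a statement of what a given system will do.

```python
def parse_fstab_line(line):
    """Split one fstab entry into its first four whitespace-separated fields."""
    fields = line.split()
    if len(fields) < 4:
        raise ValueError("expected at least: device mountpoint fstype options")
    device, mount_point, fs_type, options = fields[:4]
    return {
        "device": device,
        "mount_point": mount_point,
        "fs_type": fs_type,
        "options": options.split(","),
    }


def nfs_notes(entry):
    """Return reminders about the soft/hard/intr choice for an NFS entry."""
    if not entry["fs_type"].startswith("nfs"):
        return []  # not an NFS mount, nothing to say
    opts = entry["options"]
    notes = []
    if "soft" in opts:
        notes.append("soft: a request that times out fails instead of being retried")
    elif "intr" not in opts:
        # hard is the default when neither soft nor hard is given
        notes.append("hard without intr: a stuck request is retried indefinitely")
    return notes


if __name__ == "__main__":
    # Entry from the thread, assuming it splits into these four fields.
    entry = parse_fstab_line("server:/usr/local/pub /pub nfs hard,intr")
    print(entry["options"], nfs_notes(entry))
```

Running it on the entry above prints the option list and an empty list of notes, since hard together with intr is the combination being recommended.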
Re: Unable to kill runaway app.
Patrick O'Callaghan wrote:
> On Thu, 2009-08-20 at 12:31 -0700, Peter Langfelder wrote:
> > As previously stated, use kill -9 . The kill command without the -9 only works if the process actually listens to signals, which is not likely if it's stuck in some (semi-)infinite loop.
>
> To be pedantic, even -9 will only work if the process is "listening". That's because signal-handling is done by the kernel side of the process itself. The point about -9 (SIGKILL) is that the process can't trap or mask it, but if it's stuck waiting on an uninterruptible kernel event ('D' state) there is nothing that will kill it short of rebooting.
>
> poc

Yes, I guess I've had that demonstrated to me.

I've added the option soft to the client /etc/fstab which may make it possible to interrupt things? That is, if I have done the right thing in the right place.

Bob
Re: Unable to kill runaway app.
On Thu, 2009-08-20 at 12:31 -0700, Peter Langfelder wrote:
> On Thu, Aug 20, 2009 at 12:11 PM, Bob Goodwin wrote:
> > I just had perhaps the third occurrence of this problem.
> >
> > I tried to shut down gthumb which was displaying a photo from the nfs server. It would not shut down, at least not in a reasonable amount of time. Gkrellm showed cup1 running at max. and top indicated the cup at 99.5%. Something did eventually time out but that did not calm the cup activity:
> >
> > 3487 bobg 20 0 2928 1068 932 R 99.5 0.0 445:55.55 gam_server
> >
> > Kill 3487 does not stop it. In fact nothing seems to. I told it to poweroff and it got as far as "halting system" and stayed there until I pressed the power button for five seconds or so.
> >
> > This happened once last night and it sat there saying it was busy, the power button was required to kill it then too.
> >
> > I don't expect anyone to troubleshoot the problem but would like to know what other commands I might try to restore things without shutting down and rebooting.
> >
> > This is an F-10 system pretty much up to date, certainly all security updates and perhaps all the rest, I've lost track at the moment. I suspect the problem is related to some horse photo files from my daughter's Mac. But I need a way to stop things when this happens ...
> >
> > Any help appreciated.
> >
> > Bob
>
> As previously stated, use kill -9 . The kill command without the -9 only works if the process actually listens to signals, which is not likely if it's stuck in some (semi-)infinite loop.

To be pedantic, even -9 will only work if the process is "listening". That's because signal-handling is done by the kernel side of the process itself. The point about -9 (SIGKILL) is that the process can't trap or mask it, but if it's stuck waiting on an uninterruptible kernel event ('D' state) there is nothing that will kill it short of rebooting.

poc
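To tell whether a process really is in the 'D' state described above, one option is to read its state letter from /proc. The following is only a rough Python sketch: it assumes a Linux /proc filesystem, the function names are invented for this example, and the sample stat line in the usage section is made up (only the PID and command name are borrowed from the thread). For what it's worth, the top line quoted in this thread shows state R rather than D.

```python
from pathlib import Path

# Common state letters as shown by ps and top.
STATE_NAMES = {
    "R": "running",
    "S": "interruptible sleep",
    "D": "uninterruptible sleep",
    "Z": "zombie",
    "T": "stopped",
}


def parse_state(stat_line):
    """Extract the state letter from the contents of /proc/<pid>/stat."""
    # The command name sits in parentheses and may itself contain spaces
    # or ')', so take whatever follows the LAST ')'.
    after_comm = stat_line[stat_line.rindex(")") + 1:]
    return after_comm.split()[0]


def process_state(pid):
    """Read the state letter of a live process (Linux only)."""
    return parse_state(Path(f"/proc/{pid}/stat").read_text())


if __name__ == "__main__":
    # Hypothetical stat line, shortened for illustration.
    sample = "3487 (gam_server) R 1 3487 3487 0 -1 4202496"
    letter = parse_state(sample)
    print(letter, STATE_NAMES.get(letter, "other"))
```

If the letter comes back as D, signals (including SIGKILL) stay pending until the kernel wait finishes; if it is R, the process is burning CPU in user or kernel code and the explanation has to be something else.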
Re: Unable to kill runaway app.
Howard Wilkinson wrote:
> Bob Goodwin wrote:
> > I just had perhaps the third occurrence of this problem.
> >
> > I tried to shut down gthumb which was displaying a photo from the nfs server. It would not shut down, at least not in a reasonable amount of time. Gkrellm showed cup1 running at max. and top indicated the cup at 99.5%. Something did eventually time out but that did not calm the cup activity:
> >
> > 3487 bobg 20 0 2928 1068 932 R 99.5 0.0 445:55.55 gam_server
> >
> > Kill 3487 does not stop it. In fact nothing seems to. I told it to poweroff and it got as far as "halting system" and stayed there until I pressed the power button for five seconds or so.
> >
> > This happened once last night and it sat there saying it was busy, the power button was required to kill it then too.
> >
> > I don't expect anyone to troubleshoot the problem but would like to know what other commands I might try to restore things without shutting down and rebooting.
> >
> > This is an F-10 system pretty much up to date, certainly all security updates and perhaps all the rest, I've lost track at the moment. I suspect the problem is related to some horse photo files from my daughter's Mac. But I need a way to stop things when this happens ...
> >
> > Any help appreciated.
> >
> > Bob
>
> Bob, what kernel version do you have loaded, and is the processor a multicore or multiprocessor unit?
>
> If the kernel version is a recent FC10 update and you are on an SMP motherboard then I have seen the same thing happen with other processes. The problem seems to be in the area where it interacts with the NFS code, BUT it looks like a kernel problem with the SMP system. I have not been able to get a dump to prove this but try downgrading to an older kernel and see if it goes away - I used the last FC9 kernel and it did. I have since upgraded to FC11 and this also does not exhibit the problem so it may just have been with one or two of the latest FC10 builds!
>
> Howard.

This is an older computer, certainly not ancient, a Dell GX280 I bought used a few months ago.

[b...@box9 ~]$ uname -a
Linux box9 2.6.27.29-170.2.79.fc10.i686 #1 SMP Fri Aug 14 21:11:41 EDT 2009 i686 i686 i386 GNU/Linux

I believe that's the most recent kernel, from a few days ago. Again, I don't recall exactly when, but I could try an earlier one. I usually save two older ones but never seem to need them.

dmidecode shows:

Handle 0x0400, DMI type 4, 32 bytes
Processor Information
    Socket Designation: Microprocessor
    Type: Central Processor
    Family: Pentium 4
    Manufacturer: Intel
    ID: 41 0F 00 00 FF FB EB BF
    Signature: Type 0, Family 15, Model 4, Stepping 1

and also:

Handle 0x0100, DMI type 1, 25 bytes
System Information
    Manufacturer: Dell Inc.
    Product Name: OptiPlex GX280
    Version: Not Specified
    Serial Number: 9HY0281
    UUID: 44454C4C-4800-1059-8030-B9C04F323831
    Wake-up Type: APM Timer

Handle 0x0200, DMI type 2, 8 bytes
Base Board Information
    Manufacturer: Dell Inc.
    Product Name: 0H7276
    Version:
    Serial Number: ..CN1374056S00IZ.

I guess that makes it a multicore processor.
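The "SMP" in the uname line describes how the kernel was built, not what hardware it is running on, and the dmidecode excerpt above does not list a core count, so neither settles Howard's question by itself. One way to check is to count entries in /proc/cpuinfo. The Python sketch below assumes the x86 layout where each "processor" stanza carries "physical id" and "core id" fields. The function name and the sample text are invented for illustration and are not taken from this machine.

```python
def cpu_topology(cpuinfo_text):
    """Count logical CPUs, physical packages and cores in /proc/cpuinfo text."""
    logical = 0
    packages = set()
    cores = set()
    physical_id = None
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "processor":
            logical += 1          # one stanza per logical CPU
            physical_id = None
        elif key == "physical id":
            physical_id = value
            packages.add(value)
        elif key == "core id":
            # a core is identified by its id within a given package
            cores.add((physical_id, value))
    # packages/cores stay at 0 if the kernel does not report those fields
    return {"logical_cpus": logical, "packages": len(packages), "cores": len(cores)}


# Made-up sample: one package, one core, two logical CPUs (hyperthreading).
EXAMPLE_CPUINFO = """\
processor   : 0
physical id : 0
siblings    : 2
core id     : 0
cpu cores   : 1

processor   : 1
physical id : 0
siblings    : 2
core id     : 0
cpu cores   : 1
"""

if __name__ == "__main__":
    print(cpu_topology(EXAMPLE_CPUINFO))
    # On a live system: cpu_topology(open("/proc/cpuinfo").read())
```

More than one logical CPU with a single core would point to hyperthreading rather than a multicore part; either way the SMP code paths in the kernel are in use.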
Re: Unable to kill runaway app.
Bob Goodwin wrote:
> I just had perhaps the third occurrence of this problem.
>
> I tried to shut down gthumb which was displaying a photo from the nfs server. It would not shut down, at least not in a reasonable amount of time. Gkrellm showed cup1 running at max. and top indicated the cup at 99.5%. Something did eventually time out but that did not calm the cup activity:
>
> 3487 bobg 20 0 2928 1068 932 R 99.5 0.0 445:55.55 gam_server
>
> Kill 3487 does not stop it. In fact nothing seems to. I told it to poweroff and it got as far as "halting system" and stayed there until I pressed the power button for five seconds or so.
>
> This happened once last night and it sat there saying it was busy, the power button was required to kill it then too.
>
> I don't expect anyone to troubleshoot the problem but would like to know what other commands I might try to restore things without shutting down and rebooting.
>
> This is an F-10 system pretty much up to date, certainly all security updates and perhaps all the rest, I've lost track at the moment. I suspect the problem is related to some horse photo files from my daughter's Mac. But I need a way to stop things when this happens ...
>
> Any help appreciated.
>
> Bob

Bob, what kernel version do you have loaded, and is the processor a multicore or multiprocessor unit?

If the kernel version is a recent FC10 update and you are on an SMP motherboard then I have seen the same thing happen with other processes. The problem seems to be in the area where it interacts with the NFS code, BUT it looks like a kernel problem with the SMP system. I have not been able to get a dump to prove this but try downgrading to an older kernel and see if it goes away - I used the last FC9 kernel and it did. I have since upgraded to FC11 and this also does not exhibit the problem so it may just have been with one or two of the latest FC10 builds!

Howard.
Re: Unable to kill runaway app.
Christopher K. Johnson wrote:
> Bob Goodwin wrote:
> > I just had perhaps the third occurrence of this problem.
> >
> > I tried to shut down gthumb which was displaying a photo from the nfs server. It would not shut down, at least not in a reasonable amount of time. Gkrellm showed cup1 running at max. and top indicated the cup at 99.5%. Something did eventually time out but that did not calm the cup activity:
> >
> > 3487 bobg 20 0 2928 1068 932 R 99.5 0.0 445:55.55 gam_server
> >
> > Kill 3487 does not stop it. In fact nothing seems to. I told it to poweroff and it got as far as "halting system" and stayed there until I pressed the power button for five seconds or so.
> >
> > This happened once last night and it sat there saying it was busy, the power button was required to kill it then too.
> >
> > I don't expect anyone to troubleshoot the problem but would like to know what other commands I might try to restore things without shutting down and rebooting.
> >
> > This is an F-10 system pretty much up to date, certainly all security updates and perhaps all the rest, I've lost track at the moment. I suspect the problem is related to some horse photo files from my daughter's Mac. But I need a way to stop things when this happens ...
> >
> > Any help appreciated.
> >
> > Bob
>
> Try "soft" option on the nfs mount in case the root cause is a problem with the nfs access to the image file.

Ok, I will try that. If I understand correctly, the soft option goes in the client /etc/fstab? Can it also be assigned a time value?

Thanks.

Bob
Re: Unable to kill runaway app.
Andras Simon wrote:
> On 8/20/09, Bob Goodwin wrote:
> > I don't expect anyone to troubleshoot the problem but would like to know what other commands I might try to restore things without shutting down and rebooting.
>
> kill -9 can be pretty effective.
>
> Andras

Yes, I tried kill -9 3487 and even -0. I have some trouble understanding the kill man page but those seemed like something to try.

And I must apologize for "cup" instead of cpu; my spell checker did that for me. I thought I told it to remember the word but must have clicked the wrong spot?

Bob
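Since the kill man page came up: plain kill sends SIGTERM (15), kill -9 sends SIGKILL, and signal 0 delivers nothing at all, it only checks whether the PID exists and may be signalled, which is why -0 had no visible effect. Here is a small Python sketch of the same distinctions using the standard os and signal modules. The helper names `pid_exists` and `signal_number` are made up for this example.

```python
import os
import signal


def pid_exists(pid):
    """Equivalent of `kill -0 <pid>`: test for existence, send nothing."""
    try:
        os.kill(pid, 0)  # signal 0 performs error checking only
    except ProcessLookupError:
        return False     # no such process
    except PermissionError:
        return True      # it exists, but belongs to another user
    return True


def signal_number(name):
    """Map a name such as 'KILL' or 'SIGTERM' to its numeric value."""
    if not name.startswith("SIG"):
        name = "SIG" + name
    return signal.Signals[name].value


if __name__ == "__main__":
    print("this process exists:", pid_exists(os.getpid()))
    print("TERM =", signal_number("TERM"), "KILL =", signal_number("KILL"))
```

A process can catch or ignore SIGTERM, but not SIGKILL, which is the reason -9 is the usual next step when plain kill does nothing.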
Re: Unable to kill runaway app.
On Thu, Aug 20, 2009 at 12:11 PM, Bob Goodwin wrote:
> I just had perhaps the third occurrence of this problem.
>
> I tried to shut down gthumb which was displaying a photo from the nfs server. It would not shut down, at least not in a reasonable amount of time. Gkrellm showed cup1 running at max. and top indicated the cup at 99.5%. Something did eventually time out but that did not calm the cup activity:
>
> 3487 bobg 20 0 2928 1068 932 R 99.5 0.0 445:55.55 gam_server
>
> Kill 3487 does not stop it. In fact nothing seems to. I told it to poweroff and it got as far as "halting system" and stayed there until I pressed the power button for five seconds or so.
>
> This happened once last night and it sat there saying it was busy, the power button was required to kill it then too.
>
> I don't expect anyone to troubleshoot the problem but would like to know what other commands I might try to restore things without shutting down and rebooting.
>
> This is an F-10 system pretty much up to date, certainly all security updates and perhaps all the rest, I've lost track at the moment. I suspect the problem is related to some horse photo files from my daughter's Mac. But I need a way to stop things when this happens ...
>
> Any help appreciated.
>
> Bob

As previously stated, use kill -9 . The kill command without the -9 only works if the process actually listens to signals, which is not likely if it's stuck in some (semi-)infinite loop.

HTH,
Peter
Re: Unable to kill runaway app.
Bob Goodwin wrote:
> I just had perhaps the third occurrence of this problem.
>
> I tried to shut down gthumb which was displaying a photo from the nfs server. It would not shut down, at least not in a reasonable amount of time. Gkrellm showed cup1 running at max. and top indicated the cup at 99.5%. Something did eventually time out but that did not calm the cup activity:
>
> 3487 bobg 20 0 2928 1068 932 R 99.5 0.0 445:55.55 gam_server
>
> Kill 3487 does not stop it. In fact nothing seems to. I told it to poweroff and it got as far as "halting system" and stayed there until I pressed the power button for five seconds or so.
>
> This happened once last night and it sat there saying it was busy, the power button was required to kill it then too.
>
> I don't expect anyone to troubleshoot the problem but would like to know what other commands I might try to restore things without shutting down and rebooting.
>
> This is an F-10 system pretty much up to date, certainly all security updates and perhaps all the rest, I've lost track at the moment. I suspect the problem is related to some horse photo files from my daughter's Mac. But I need a way to stop things when this happens ...
>
> Any help appreciated.
>
> Bob

Try the "soft" option on the nfs mount in case the root cause is a problem with the nfs access to the image file.
Re: Unable to kill runaway app.
On 8/20/09, Bob Goodwin wrote:
> I don't expect anyone to troubleshoot the problem but would like to know what other commands I might try to restore things without shutting down and rebooting.

kill -9 can be pretty effective.

Andras
Unable to kill runaway app.
I just had perhaps the third occurrence of this problem.

I tried to shut down gthumb which was displaying a photo from the nfs server. It would not shut down, at least not in a reasonable amount of time. Gkrellm showed cup1 running at max. and top indicated the cup at 99.5%. Something did eventually time out but that did not calm the cup activity:

3487 bobg 20 0 2928 1068 932 R 99.5 0.0 445:55.55 gam_server

Kill 3487 does not stop it. In fact nothing seems to. I told it to poweroff and it got as far as "halting system" and stayed there until I pressed the power button for five seconds or so.

This happened once last night and it sat there saying it was busy, the power button was required to kill it then too.

I don't expect anyone to troubleshoot the problem but would like to know what other commands I might try to restore things without shutting down and rebooting.

This is an F-10 system pretty much up to date, certainly all security updates and perhaps all the rest, I've lost track at the moment. I suspect the problem is related to some horse photo files from my daughter's Mac. But I need a way to stop things when this happens ...

Any help appreciated.

Bob
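For readers unsure what each number in the top line means, here is a short Python sketch that labels the fields. It assumes top's default column layout (PID, USER, PR, NI, VIRT, RES, SHR, S, %CPU, %MEM, TIME+, COMMAND); a customised top would need a different list. The function name `parse_top_row` is made up for this example.

```python
# Default column order of top's process list (an assumption about the
# configuration in use; a customised top may differ).
TOP_COLUMNS = [
    "PID", "USER", "PR", "NI", "VIRT", "RES", "SHR",
    "S", "%CPU", "%MEM", "TIME+", "COMMAND",
]


def parse_top_row(row):
    """Label the fields of one process row from top's default display."""
    # Split on whitespace, but keep any spaces inside COMMAND intact.
    values = row.split(None, len(TOP_COLUMNS) - 1)
    if len(values) != len(TOP_COLUMNS):
        raise ValueError("row does not match the default top layout")
    return dict(zip(TOP_COLUMNS, values))


if __name__ == "__main__":
    row = "3487 bobg 20 0 2928 1068 932 R 99.5 0.0 445:55.55 gam_server"
    fields = parse_top_row(row)
    for name in ("PID", "S", "%CPU", "TIME+", "COMMAND"):
        print(f"{name:8} {fields[name]}")
```

Read this way, the row says that gam_server (PID 3487) was in state R (running), using 99.5% of a CPU, with 445 minutes of accumulated CPU time, since TIME+ is shown as minutes:seconds.hundredths.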