Re: hard lock under 3.4-STABLE
On Fri, Feb 11, 2000 at 06:03:16PM -0800, Matthew Dillon wrote:
> I presume it's the client that is locking up? If you remove the
> server binary and the client takes a page fault on the binary,
> and does not have the page in the cache, what is supposed to happen
> is that the program is supposed to seg fault when the NFS read fails.
> It's quite possible that there is a bug in dealing with this situation
> and if you can get it repeatable we can probably fix it fairly easily.

I did some experiments with this sort of thing a few months ago. I think you can kill 3.X NFS client machines by truncating a binary on the NFS server. You can also make the machine extremely sluggish by catching SIGBUS and SIGSEGV in an executable and then causing one of these signals once the binary is modified. We see it quite frequently with people using MPI.

I'll see if I can reproduce any of these effects and let you know how to do it.

David.

To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-hackers" in the body of the message
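The signal-catching scenario described in the message above can be approximated on a local filesystem. The following is only a minimal sketch under stated assumptions, not the test case from the thread: it maps a scratch file instead of an NFS-backed executable, truncates it, and touches the mapping while handlers for SIGBUS and SIGSEGV are installed. The helper name `touch_after_truncate` and the scratch path are hypothetical. It does not reproduce the lockup; it only shows why a handler that merely returns can spin, since the faulting access is typically retried and faults again.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <setjmp.h>
#include <signal.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static sigjmp_buf fault_env;            /* where the handler jumps back to */
static volatile sig_atomic_t fault_sig; /* last signal seen by the handler */

/* Record the signal and escape. Simply returning would typically re-run the
 * faulting instruction, which faults again, giving a tight signal loop. */
static void on_fault(int sig)
{
    fault_sig = sig;
    siglongjmp(fault_env, 1);
}

/*
 * Hypothetical helper: create a one-page scratch file at `path`, map it,
 * truncate it to zero length, then read from the mapping.
 * Returns the signal number caught, 0 if the read succeeded,
 * or -1 if the setup failed.
 */
int touch_after_truncate(const char *path)
{
    struct sigaction sa, old_bus, old_segv;
    long page = sysconf(_SC_PAGESIZE);
    volatile int result = -1;
    char *map;
    int fd;

    assert(path != NULL);
    fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    if (page <= 0 || ftruncate(fd, (off_t)page) != 0) {
        close(fd);
        unlink(path);
        return -1;
    }
    map = mmap(NULL, (size_t)page, PROT_READ, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED) {
        close(fd);
        unlink(path);
        return -1;
    }

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_fault;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGBUS, &sa, &old_bus);
    sigaction(SIGSEGV, &sa, &old_segv);

    /* Local stand-in for the binary being truncated on the server. */
    if (ftruncate(fd, 0) == 0) {
        if (sigsetjmp(fault_env, 1) == 0) {
            volatile char c = *(volatile char *)map; /* fault expected here */
            (void)c;
            result = 0;
        } else {
            result = fault_sig;
        }
    }

    /* Restore the previous handlers and clean up the scratch file. */
    sigaction(SIGBUS, &old_bus, NULL);
    sigaction(SIGSEGV, &old_segv, NULL);
    munmap(map, (size_t)page);
    close(fd);
    unlink(path);
    return result;
}
```

Calling `touch_after_truncate("/tmp/trunc_demo.bin")` from a small `main` and printing the result should show which signal the local kernel delivers; SIGBUS is the usual answer for a mapped page past end of file. The sketch uses `siglongjmp` to leave the handler precisely because a handler that just returns would keep re-faulting, which may be related to the sluggishness reported above, though that is an assumption rather than something the thread confirms.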
Re: hard lock under 3.4-STABLE
:I am seeing a situation where a 3.4 system hard-locks while running 3.4
:(hard lock being that it does not respond to its serial console, nor is
:it pingable). I believe (perhaps) that it may be NFS related, with a
:program running on an NFS client when the executable itself is deleted
:from the server (although I haven't seen that style of panic in quite
:some time, and it usually has a couple of lines earlier in the output
:to the effect that it lost its backing store).
:
:I realize that information is sparse in this, but that is because the
:information that I have is equally sparse... I have no kernel messages,
:I cannot drop into the kernel debugger, and no crashdump is ever created
:(I need to hit the reset button to recover.)
:
:I am trying to reproduce a test case, but it is difficult not knowing what
:has caused the problems in the first place.
:
:--
:David Cross | email: [EMAIL PROTECTED]
:Acting Lab Director | NYSLP: FREEBSD

I presume it's the client that is locking up? If you remove the server binary and the client takes a page fault on the binary, and does not have the page in the cache, what is supposed to happen is that the program is supposed to seg fault when the NFS read fails. It's quite possible that there is a bug in dealing with this situation and if you can get it repeatable we can probably fix it fairly easily.

-Matt
Matthew Dillon <[EMAIL PROTECTED]>
hard lock under 3.4-STABLE
I am seeing a situation where a system running 3.4-STABLE hard-locks (hard lock being that it does not respond to its serial console, nor is it pingable). I believe (perhaps) that it may be NFS related, with a program running on an NFS client when the executable itself is deleted from the server (although I haven't seen that style of panic in quite some time, and it usually has a couple of lines earlier in the output to the effect that it lost its backing store).

I realize that information is sparse in this, but that is because the information that I have is equally sparse... I have no kernel messages, I cannot drop into the kernel debugger, and no crashdump is ever created (I need to hit the reset button to recover.)

I am trying to reproduce a test case, but it is difficult not knowing what has caused the problems in the first place.

--
David Cross                               | email: [EMAIL PROTECTED]
Acting Lab Director                       | NYSLP: FREEBSD
Systems Administrator/Research Programmer | Web: http://www.cs.rpi.edu/~crossd
Rensselaer Polytechnic Institute,         | Ph: 518.276.2860
Department of Computer Science            | Fax: 518.276.4033
I speak only for myself.                  | WinNT:Linux::Linux:FreeBSD