Re: Blocked process
On Thu, 20 Aug 2009, Daniel O'Connor wrote:
> Unfortunately the system is in Finland and I'm in Australia so I
> can't sit at the console :(

Someone visited the site and determined that the floppy drive cable was intermittently fouling the CPU fan.

I believe this was causing the CPU to overheat and be throttled by the BIOS.

-- 
Daniel O'Connor software and network engineer
for Genesis Software - http://www.gsoft.com.au
"The nice thing about standards is that there
are so many of them to choose from."
  -- Andrew Tanenbaum
GPG Fingerprint - 5596 B766 97C0 0E94 4347 295E E593 DC20 7B3F CE8C
Re: Blocked process
On Sat, 22 Aug 2009, Robert Watson wrote:
> On Sat, 22 Aug 2009, Daniel O'Connor wrote:
> > On Fri, 21 Aug 2009, CmdLnKid wrote:
> >> came back or the machine was rebooted. I continued for a while
> >> using /var/mail over NFS while setting or unset mail variables for
> >> the shell. You may also want to check into whether something is
> >> trying to acquire a lock on a file over that NFS mount which could
> >> accrue some extra time making it seem like a process is hung.
> >
> > We don't have any NFS mounts so I don't think that's it :(
>
> A number of issues were corrected over the course of the 6.x life
> span involving scheduling, including some relating to "lost wakeups".
> Many bug fixes relating to threading were also introduced (not sure
> if that's relevant to your workload). While it's never a
> particularly fun recommendation, I think I'd suggest sliding forward
> to the most recent 6.x kernel (but otherwise identical
> configuration), perhaps sticking with your current userspace, and
> seeing if that resolves the issue.

OK, should be feasible to do that I think.

Luckily people at this site don't need to drive for a few hours to fix the PC :)

Thanks.

-- 
Daniel O'Connor
Re: Blocked process
On Sat, 22 Aug 2009, Daniel O'Connor wrote:
> On Fri, 21 Aug 2009, CmdLnKid wrote:
> > came back or the machine was rebooted. I continued for a while
> > using /var/mail over NFS while setting or unset mail variables for
> > the shell. You may also want to check into whether something is
> > trying to acquire a lock on a file over that NFS mount which could
> > accrue some extra time making it seem like a process is hung.
>
> We don't have any NFS mounts so I don't think that's it :(

Hi Daniel--

A number of issues were corrected over the course of the 6.x life span involving scheduling, including some relating to "lost wakeups". Many bug fixes relating to threading were also introduced (not sure if that's relevant to your workload). While it's never a particularly fun recommendation, I think I'd suggest sliding forward to the most recent 6.x kernel (but otherwise identical configuration), perhaps sticking with your current userspace, and seeing if that resolves the issue.

Robert N M Watson
Computer Laboratory
University of Cambridge
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
Re: Blocked process
On Fri, 21 Aug 2009, CmdLnKid wrote:
> came back or the machine was rebooted. I continued for a while using
> /var/mail over NFS while setting or unset mail variables for the
> shell. You may also want to check into whether something is trying to
> acquire a lock on a file over that NFS mount which could accrue some
> extra time making it seem like a process is hung.

We don't have any NFS mounts so I don't think that's it :(

-- 
Daniel O'Connor
Re: Blocked process
On Thu, 20 Aug 2009 08:42 -, doconnor wrote:
> On Thu, 20 Aug 2009, Kostik Belousov wrote:
> > > Things like ls on the console might take several seconds to
> > > respond when the box didn't seem to be very busy (but wasn't
> > > idle, maybe serving a little NFS). It wasn't the shell getting
> > > swapped out or anything else obvious. This was on SMP, not using
> > > X. The problem went away with 6.4R (had to stay with 6.x for
> > > unrelated reasons).
> >
> > 6.1 was released with a bug in NFS server, causing serious slowdown
> > when non-MPSAFE fs was exported.
>
> Hmmm.. this is 6.2 (and a half) so I guess that's not my problem.
> Next! ;)

I had a problem like this once when an NFS mount stopped responding and any command that was issued seemed to hang. This was happening while I wasn't paying attention to the NFS mount, which covered various directories under /var, including /var/mail.

Digging a little deeper, I eventually found the cause, which made me feel pretty stupid: the "$SHELL", whether it be csh, ksh, bash or sh, checks for mail on command completion or invocation, so once the NFS mount stopped responding the process would hang until the mount came back or the machine was rebooted. I continued for a while using /var/mail over NFS while setting or unsetting mail variables for the shell.

You may also want to check whether something is trying to acquire a lock on a file over that NFS mount, which could add some extra time and make it seem like a process is hung.

-- 
- (2^(N-1))
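For anyone wanting to test the mail-spool theory above on their own box, here is a minimal sketch in Python. It is not something posted in the thread. The helper names `path_responds` and `mail_paths` are made up for illustration, and it assumes the shell takes its spool locations from the `MAIL` and `MAILPATH` environment variables, with `MAILPATH` entries separated by `:` and an optional `?message` suffix as in bash. csh keeps its `mail` setting in a shell variable instead, so it would not show up here.

```python
import os
import threading


def path_responds(path, timeout=2.0):
    """Return True if os.stat(path) finishes within `timeout` seconds.

    The stat runs in a daemon thread so that a call stuck on a dead
    mount cannot keep this script from exiting.
    """
    def probe():
        try:
            os.stat(path)
        except OSError:
            # A prompt error (e.g. no such file) still means the
            # filesystem answered, which is all we are checking.
            pass

    worker = threading.Thread(target=probe, daemon=True)
    worker.start()
    worker.join(timeout)
    return not worker.is_alive()


def mail_paths(env):
    """Collect spool paths named by MAIL and MAILPATH (assumed layout)."""
    paths = []
    if env.get("MAIL"):
        paths.append(env["MAIL"])
    for entry in env.get("MAILPATH", "").split(":"):
        if entry:
            # Drop an optional "?message" suffix from the entry.
            paths.append(entry.split("?", 1)[0])
    return paths


if __name__ == "__main__":
    for p in mail_paths(os.environ):
        state = "responds" if path_responds(p) else "no answer within timeout"
        print(f"{p}: {state}")
```

If a spool path reports no answer, the shell's own mail check would likely be blocking on the same path.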
Re: Blocked process
On Thu, 20 Aug 2009, Kostik Belousov wrote:
> > Things like ls on the console might take several seconds to respond
> > when the box didn't seem to be very busy (but wasn't idle, maybe
> > serving a little NFS). It wasn't the shell getting swapped out or
> > anything else obvious. This was on SMP, not using X. The problem
> > went away with 6.4R (had to stay with 6.x for unrelated reasons).
>
> 6.1 was released with a bug in NFS server, causing serious slowdown
> when non-MPSAFE fs was exported.

Hmmm.. this is 6.2 (and a half) so I guess that's not my problem.

Next! ;)

-- 
Daniel O'Connor
Re: Blocked process
On Thu, Aug 20, 2009 at 12:35:46PM +0100, Bob Bishop wrote:
>
> On 20 Aug 2009, at 12:06, Daniel O'Connor wrote:
>
> >On Thu, 20 Aug 2009, Daniel O'Connor wrote:
> >>On Thu, 20 Aug 2009, Bob Bishop wrote:
> >>>On 20 Aug 2009, at 08:25, Daniel O'Connor wrote:
> >>>>This is running in 6.2 ish using 4BSD, I was under the impression
> >>>>ULE wasn't very stable in 6.2.
> >>>>
> >>>>I could probably try it though...
> >>>
> >>>Hmm. ISTR having similar problems around the 6.1-2 era. You might
> >>>try 6.4 if that's possible for you.
> >>
> >>Someone is going to visit it, but if I can't solve it remotely I'll
> >>probably just update it to 7.2 or so.
> >
> >What sort of problems did you have BTW?
>
> Things like ls on the console might take several seconds to respond
> when the box didn't seem to be very busy (but wasn't idle, maybe
> serving a little NFS). It wasn't the shell getting swapped out or
> anything else obvious. This was on SMP, not using X. The problem went
> away with 6.4R (had to stay with 6.x for unrelated reasons).

6.1 was released with a bug in NFS server, causing serious slowdown when non-MPSAFE fs was exported.
Re: Blocked process
On 20 Aug 2009, at 12:06, Daniel O'Connor wrote:
> On Thu, 20 Aug 2009, Daniel O'Connor wrote:
> > On Thu, 20 Aug 2009, Bob Bishop wrote:
> > > On 20 Aug 2009, at 08:25, Daniel O'Connor wrote:
> > > > This is running in 6.2 ish using 4BSD, I was under the
> > > > impression ULE wasn't very stable in 6.2.
> > > >
> > > > I could probably try it though...
> > >
> > > Hmm. ISTR having similar problems around the 6.1-2 era. You might
> > > try 6.4 if that's possible for you.
> >
> > Someone is going to visit it, but if I can't solve it remotely I'll
> > probably just update it to 7.2 or so.
>
> What sort of problems did you have BTW?

Things like ls on the console might take several seconds to respond when the box didn't seem to be very busy (but wasn't idle, maybe serving a little NFS). It wasn't the shell getting swapped out or anything else obvious. This was on SMP, not using X. The problem went away with 6.4R (had to stay with 6.x for unrelated reasons).

-- 
Bob Bishop
r...@gid.co.uk
Re: Blocked process
On Thu, 20 Aug 2009, Daniel O'Connor wrote:
> On Thu, 20 Aug 2009, Bob Bishop wrote:
> > On 20 Aug 2009, at 08:25, Daniel O'Connor wrote:
> > > This is running in 6.2 ish using 4BSD, I was under the impression
> > > ULE wasn't very stable in 6.2.
> > >
> > > I could probably try it though...
> >
> > Hmm. ISTR having similar problems around the 6.1-2 era. You might
> > try 6.4 if that's possible for you.
>
> Someone is going to visit it, but if I can't solve it remotely I'll
> probably just update it to 7.2 or so.

What sort of problems did you have BTW?

-- 
Daniel O'Connor
Re: Blocked process
On Thu, 20 Aug 2009, Bob Bishop wrote:
> On 20 Aug 2009, at 08:25, Daniel O'Connor wrote:
> > This is running in 6.2 ish using 4BSD, I was under the impression
> > ULE wasn't very stable in 6.2.
> >
> > I could probably try it though...
>
> Hmm. ISTR having similar problems around the 6.1-2 era. You might try
> 6.4 if that's possible for you.

Someone is going to visit it, but if I can't solve it remotely I'll probably just update it to 7.2 or so.

-- 
Daniel O'Connor
Re: Blocked process
On 20 Aug 2009, at 08:25, Daniel O'Connor wrote:
> This is running in 6.2 ish using 4BSD, I was under the impression
> ULE wasn't very stable in 6.2.
>
> I could probably try it though...

Hmm. ISTR having similar problems around the 6.1-2 era. You might try 6.4 if that's possible for you.

-- 
Bob Bishop
r...@gid.co.uk
Re: Blocked process
On Thu, 20 Aug 2009, Bob Bishop wrote:
> Hi,
>
> On 20 Aug 2009, at 03:34, Daniel O'Connor wrote:
> > [...]
> > The problem appears to now be that the userland process that reads
> > data out of the kernel is being stalled for over 4 seconds. This
> > process reads from the kernel and does some minor processing and
> > then writes it out to a child process to do some more work on it.
> >
> > [...]
> > Given that renice'ing has an effect it seems to be a scheduler
> > problem, [etc]
>
> Which scheduler are you using? Have you tried the other one?

This is running in 6.2 ish using 4BSD, I was under the impression ULE wasn't very stable in 6.2.

I could probably try it though...

-- 
Daniel O'Connor
Re: Blocked process
Hi,

On 20 Aug 2009, at 03:34, Daniel O'Connor wrote:
> [...]
> The problem appears to now be that the userland process that reads
> data out of the kernel is being stalled for over 4 seconds. This
> process reads from the kernel and does some minor processing and
> then writes it out to a child process to do some more work on it.
>
> [...]
> Given that renice'ing has an effect it seems to be a scheduler
> problem, [etc]

Which scheduler are you using? Have you tried the other one?

-- 
Bob Bishop
r...@gid.co.uk
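One way to pin down stalls like the 4-second one described above is to timestamp each pass of the reader loop and flag any gap over a threshold. The sketch below is only an illustration of that idea, not the code from the system discussed in this thread. `find_stalls`, `StallWatch` and the 4.0 second default are hypothetical names and values chosen to match the figure quoted in the message.

```python
import time


def find_stalls(timestamps, threshold=4.0):
    """Return (index, gap) pairs where consecutive timestamps are more
    than `threshold` seconds apart."""
    stalls = []
    for i in range(1, len(timestamps)):
        gap = timestamps[i] - timestamps[i - 1]
        if gap > threshold:
            stalls.append((i, gap))
    return stalls


class StallWatch:
    """Call tick() once per loop iteration; it returns the gap in
    seconds when the time since the previous tick exceeds the
    threshold, otherwise None."""

    def __init__(self, threshold=4.0, clock=time.monotonic):
        self.threshold = threshold
        self.clock = clock  # injectable so the class can be tested
        self.last = None

    def tick(self):
        now = self.clock()
        gap = None if self.last is None else now - self.last
        self.last = now
        if gap is not None and gap > self.threshold:
            return gap
        return None
```

Calling `tick()` right after each read from the kernel and logging any non-None result, together with the wall-clock time, would show whether the stalls line up with other activity on the machine. `time.monotonic` is used so that clock adjustments do not produce false gaps.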