On Thu, 2012-04-26 at 10:02 +0200, Sven Hoexter wrote:
> On Thu, Apr 26, 2012 at 04:49:56AM +0100, Ben Hutchings wrote:
> > On Wed, 2012-04-25 at 10:36 +0200, Sven Hoexter wrote:
>
> Hi,
>
> > > Searching through munin graphs we could narrow down the starting
> > > point of this issue to the point when the hpet interrupts for one
> > > CPU core multiplied. Sometimes they multiplied by six. Looking
> > > further we've found the Kernel [events/$x] in state D where $x is
> > > the number of the CPU core which has the high number of hpet
> > > interrupts.
> > >
> > > When we started strace -f on the sshd master process everything
> > > works until you logout. Then you'll again see the forked sshd
> > > process hanging in state D.
> >
> > This is strange, because D state means uninterruptible sleep (not
> > handling signals). But perhaps the sshd process was repeatedly
> > changing between uninterruptible and interruptible state.
>
> Is it possible to gather such data? I guess grep'ing through ps output
> is not the right tool here.
>
> From a system currently suffering from this issue:
[...]

You can use 'echo w > /proc/sysrq-trigger' to get a traceback for all
the tasks in D state, which might provide some clues.

Ben.

-- 
Ben Hutchings
For every action, there is an equal and opposite criticism. - Harrison
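
A minimal sketch of how the suggestion above might be scripted, assuming
a procps-style ps and root access for the sysrq write. The function
names (filter_d_state, list_d_state, dump_blocked_tasks) and the default
of 200 log lines are made up for illustration; they do not come from
this thread. The traceback itself is written to the kernel log, so the
sketch reads it back with dmesg.

```shell
#!/bin/sh
# Hypothetical helpers for collecting data on tasks stuck in D state.
# All names and defaults here are illustrative assumptions.

# Keep only lines whose second column (process state) starts with "D".
# Expects "PID STAT WCHAN COMMAND" style lines on stdin.
filter_d_state() {
    awk '$2 ~ /^D/ { print }'
}

# List tasks currently in uninterruptible sleep, together with the
# kernel function they are waiting in (wchan). Assumes procps ps.
list_d_state() {
    ps -eo pid,stat,wchan:32,comm | filter_d_state
}

# Ask the kernel to dump tracebacks of blocked tasks, then show the
# end of the kernel log where the dump lands. Needs root.
#   $1: trigger file (default /proc/sysrq-trigger)
#   $2: number of log lines to show (assumed default: 200)
dump_blocked_tasks() {
    trigger="${1:-/proc/sysrq-trigger}"
    echo w > "$trigger" || return 1
    dmesg | tail -n "${2:-200}"
}
```

Running list_d_state a few times in a row gives a rough idea of whether
a task stays in D state or only passes through it; dump_blocked_tasks
then shows where in the kernel the blocked tasks are waiting.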