On Sat, Jun 24, 2006 at 02:57:27PM -0300, Marc G. Fournier wrote:
> On Sat, 24 Jun 2006, Kostik Belousov wrote:
>
> >On Sat, Jun 24, 2006 at 11:55:26AM +0400, Dmitry Morozovsky wrote:
> >>On Sat, 24 Jun 2006, Marc G. Fournier wrote:
> >>
> >>MGF> > 'b' stands for "blocked", not "busy".  Judging by your page fault rate
> >>MGF> > and the high number of frees and pages being scanned, you're probably
> >>MGF> > swapping tasks in and out and are waiting on disk.  Take a look at
> >>MGF> > "vmstat -s", and consider adding more RAM if this is correct...
> >>MGF>
> >>MGF> is there a way of finding out what processes are blocked?
> >>
> >>Aren't they in 'D' status by ps?
> >Use ps axlww. In this way, at least actual blocking points are shown.
>
> 'k, stupid question then ... what am I searching for?
>
> # ps axlww | awk '{print $9}' | sort | uniq -c | sort -nr
>  654 select
>  230 lockf
>  166 wait
>   85 -
>   80 piperd
>   71 nanslp
>   33 kserel
>   22 user
>   10 pause
>    9 ttyin
>    5 sbwait
>    3 psleep
>    3 accept
>    2 kqread
>    2 Giant
>    1 vlruwt
>    1 syncer
>    1 sdflus
>    1 ppwait
>    1 ktrace
>    1 MWCHAN
>
> According to vmstat, I'm holding at '4 blocked' for the most part ...
> sbwwait is socket related, not disk ... and none of the others look right
> ...

I would say, using a big magic crystal ball, that your problems are not
kernel-related. I see only two suspicious points:
1. A high number of pipe readers and of waiters for file locks. This may be
   normal for your load.
2. Two Giant holders/lockers. Is that constant? Are the processes
   holding/waiting for Giant always the same ones?

Anyway, in your shoes, I would start by looking at the applications.
Ah, and does dmesg show anything?
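
To see which processes sit on Giant, lockf or piperd, and whether they stay the same over time, something like the following might help. It is only a sketch: wchan_filter is a made-up helper name, and it assumes the wait channel is field 9, the PID field 2 and the command starts at field 13 of the ps axlww output, matching the awk one-liner quoted above. Check the header line on your system before trusting it.

```shell
# wchan_filter: hypothetical helper, not part of FreeBSD.
# Reads "ps axlww"-style output on stdin and prints "PID WCHAN COMMAND"
# for every process whose wait channel (assumed to be field 9) matches
# one of the names given as arguments. The header line is skipped.
wchan_filter() {
    awk -v want="$*" '
        BEGIN {
            # Turn the space-separated argument list into a lookup table.
            n = split(want, w, " ")
            for (i = 1; i <= n; i++) keep[w[i]] = 1
        }
        NR > 1 && ($9 in keep) {
            # Re-join the command, assumed to start at field 13.
            cmd = $13
            for (i = 14; i <= NF; i++) cmd = cmd " " $i
            printf "%s %s %s\n", $2, $9, cmd
        }'
}

# Example: ps axlww | wchan_filter Giant lockf piperd
```

Running it every few seconds and comparing the PIDs in the Giant lines should show whether the same processes are involved each time.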