Hi,
I've got a system that's behaving a bit oddly. It's running a classic
network service with one parent proc that spawns one child proc per
connection. It's fine with about 100 or so concurrent child procs, but
once the number goes higher, <defunct> procs start appearing.
Up to about 300 or so, the <defunct> procs appear and disappear so fast
that trying to preap them only catches a couple, with the rest going
away before they reach 60 seconds. Lately it's been getting to the
point where there are more <defunct> procs and not much work getting
done. Raising the limit on the number of child procs helped move a bit
more through, but that's also left me with something on the order of
600 <defunct> procs. My first thought was that it was running out of
resources, but the load stays down around 5 (on a 4-proc box) and
there's plenty of free memory. The traffic only runs at about
40-50 Kbits/sec.
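To make the setup concrete: the service's own source isn't shown here, so
this is only a minimal sketch of the fork-per-connection pattern and of how
a parent normally reaps exited children. The names spawn_child and
reap_children are made up for illustration, and the child does no real
work. A child shows up as <defunct> from the moment it exits until the
parent waits for it.

```python
import os
import time


def spawn_child(exit_code=0):
    """Fork a stand-in for a per-connection child (hypothetical helper).

    The child exits at once, so it stays <defunct> until the parent reaps it.
    """
    pid = os.fork()
    if pid == 0:
        os._exit(exit_code)  # child: exit without running parent cleanup
    return pid


def reap_children():
    """Collect every child that has already exited, without blocking."""
    reaped = []
    while True:
        try:
            pid, _status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            break  # the parent has no children left at all
        if pid == 0:
            break  # children exist, but none has exited yet
        reaped.append(pid)
    return reaped


if __name__ == "__main__":
    pending = {spawn_child() for _ in range(3)}
    while pending:
        pending -= set(reap_children())
        time.sleep(0.01)
    print("all children reaped")
```

Looping on waitpid with WNOHANG is the usual way to pick up every exited
child in one pass without stalling the parent.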
Running hotkernel from the DTrace Toolkit, I get something like:
FUNCTION                   COUNT   PCNT
unix`default_lock_delay     2105   0.7%
unix`mutex_exit             2408   0.8%
unix`generic_idle_cpu       2527   0.8%
unix`page_vpsub             2545   0.8%
unix`page_unlock            3245   1.0%
genunix`pvn_vplist_dirty    6644   2.1%
unix`mutex_delay_default    9892   3.2%
unix`page_lock_es          10404   3.4%
unix`mutex_enter           20079   6.5%
unix`page_trylock          21461   6.9%
unix`idle                  22421   7.3%
unix`page_vpadd            31324  10.1%
unix`disp_getwork          96444  31.2%
Checking for failing syscalls (errinfo) I get things like:
SYSCALL   ERR  COUNT  DESC
shutdown  134     13  Socket is not connected
pollsys     4    694  interrupted system call
c2audit    22   1428  Invalid argument
ioctl      25   2181  Inappropriate ioctl for device
open64      2   3090  No such file or directory
putmsg      9   4600  Bad file number
fcntl      22   6618  Invalid argument
stat64      2   9530  No such file or directory
lstat64     2  11892  No such file or directory
close       9  12871  Bad file number
stat        2  61254  No such file or directory
The number of failed stat calls is a bit high, but that's expected
because there's some silliness checking for .files.
And mpstat shows:
CPU minf mjf xcal intr ithr  csw icsw migr smtx srw syscl usr sys wt idl
  0  467   0 1535  185    5 2402  159  343 1404  11  5153   7  53  0  40
  1  455   0 1978 3140 3027 1874  132  307 1064   9  5898   9  53  0  38
  2  520   0 1314  837  693 2112  116  291 1288  10  4442   7  56  0  37
  3  533   0 1573 1047  767 3273  189  410 1281  11  4950   7  45  0  47
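As a quick sanity check on the per-CPU figures above, they can be averaged
with a few lines of Python. This is only a sketch over the sample pasted
above; the helper names (parse_mpstat, mean) are made up for illustration.

```python
MPSTAT = """\
CPU minf mjf xcal intr ithr csw icsw migr smtx srw syscl usr sys wt idl
0 467 0 1535 185 5 2402 159 343 1404 11 5153 7 53 0 40
1 455 0 1978 3140 3027 1874 132 307 1064 9 5898 9 53 0 38
2 520 0 1314 837 693 2112 116 291 1288 10 4442 7 56 0 37
3 533 0 1573 1047 767 3273 189 410 1281 11 4950 7 45 0 47
"""


def parse_mpstat(text):
    """Turn whitespace-separated mpstat output into one dict per CPU row."""
    lines = [line.split() for line in text.strip().splitlines()]
    header, rows = lines[0], lines[1:]
    return [dict(zip(header, map(int, row))) for row in rows]


def mean(rows, column):
    """Average one column across all CPU rows."""
    return sum(row[column] for row in rows) / len(rows)


rows = parse_mpstat(MPSTAT)
summary = {col: mean(rows, col) for col in ("usr", "sys", "idl")}
smtx_total = sum(row["smtx"] for row in rows)  # mutex spins across CPUs

print(summary, smtx_total)
```

For this sample it averages out to 7.5% usr, 51.75% sys and 40.5% idle,
with 5037 smtx in total across the four CPUs.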
I'm a bit short on ideas of where to go digging next, so any hints/ideas
would be greatly appreciated.
thanks,
/Mads
--
http://soulfood.dk