----- Original Message -----
From: "Eric Anderson" <[EMAIL PROTECTED]>
I saw this once before, a long time back, and every time I went through a debugging session, it came to some kind of lock on the
sysctl tree with regards to the geom info (maybe the XML kind of tree dump or something). I don't recall all the details, but
it was something like that.
Yep thats where I've traced it to its requesting: kern.geom.confxml
Which does:-
static int
sysctl_kern_geom_confxml(SYSCTL_HANDLER_ARGS)
{
int error;
struct sbuf *sb;
sb = sbuf_new(NULL, NULL, 0, SBUF_AUTOEXTEND);
g_waitfor_event(g_confxml, sb, M_WAITOK, NULL);
error = SYSCTL_OUT(req, sbuf_data(sb), sbuf_len(sb) + 1);
sbuf_delete(sb);
return error;
}
What I dont understand is why this would lock the entire machine.
I've enabled LOCK_PROFILING and reran and I get the following which
seems to indicate the culpret is: SYSCTL_LOCK()
From what I can tell g_waitfor_event is returning EAGAIN for a large
amount of time which means we get stuck in:-
userland_sysctl
...
SYSCTL_LOCK();
do {
req.oldidx = 0;
req.newidx = 0;
error = sysctl_root(0, name, namelen, &req);
} while (error == EAGAIN);
if (req.lock == REQ_WIRED && req.validlen > 0)
vsunlock(req.oldptr, req.validlen);
SYSCTL_UNLOCK();
...
The only reason I can see for returning EAGAIN is g_destroy_geom
calling g_cancel_event
To confirm this wasnt related to anything that top is doing and
we are seeing a total lockup I used:
sh -c 'while [ 1 ]; do sleep 1; date; done'
Fri Feb 1 06:04:47 GMT 2008
Fri Feb 1 06:04:48 GMT 2008
Fri Feb 1 06:04:49 GMT 2008 <= Lockup
Fri Feb 1 06:05:16 GMT 2008 <= Recover
Fri Feb 1 06:05:17 GMT 2008
Fri Feb 1 06:05:18 GMT 2008
Enabling geom debugging I get:-
Feb 1 06:04:45 geomtest kernel: g_post_event_x(0xffffffff802394c0,
0xffffff00010e6100, 2, 0)
Feb 1 06:04:45 geomtest kernel: ref 0xffffff00010e6100
Feb 1 06:04:45 geomtest kernel: g_post_event_x(0xffffffff802394c0,
0xffffff00014e6700, 2, 0)
Feb 1 06:04:45 geomtest kernel: ref 0xffffff00014e6700
Feb 1 06:04:49 geomtest kernel: g_post_event_x(0xffffffff80239270,
0xffffff00010e6100, 2, 0)
Feb 1 06:04:49 geomtest kernel: ref 0xffffff00010e6100
Feb 1 06:04:49 geomtest kernel: g_post_event_x(0xffffffff80239270,
0xffffff00014e6700, 2, 0)
Feb 1 06:04:49 geomtest kernel: ref 0xffffff00014e6700
Feb 1 06:04:49 geomtest kernel: mbr_taste(MBR,da0s3)
Feb 1 06:04:49 geomtest kernel: g_mbrext_taste(MBREXT,da0s3)
Feb 1 06:04:49 geomtest kernel: g_slice_spoiled(0xffffff0001b27180/da0s3)
Feb 1 06:04:49 geomtest kernel: g_wither_geom(0xffffff0001a33c00(da0s3))
Feb 1 06:04:49 geomtest kernel: g_part_taste(PART,da0s3)
Feb 1 06:04:56 geomtest kernel: g_post_event_x(0xffffffff80235b10,
0xffffff000144a9c0, 2, 262144)
Feb 1 06:05:00 geomtest kernel: g_wither_geom(0xffffff000158b300(da0s3))
Feb 1 06:05:00 geomtest kernel:
Feb 1 06:05:00 geomtest kernel: g_label_taste(LABEL, da0s3)
Feb 1 06:05:00 geomtest kernel:
Feb 1 06:05:16 geomtest kernel: GEOM_LABEL[1]: MSDOSFS: da0s3: FAT32 volume
not valid.
Feb 1 06:05:16 geomtest kernel: g_detach(0xffffff0001b23980)
Feb 1 06:05:16 geomtest kernel: g_destroy_consumer(0xffffff0001b23980)
Feb 1 06:05:16 geomtest kernel: g_destroy_geom(0xffffff0001b4be00(label:taste))
Feb 1 06:05:16 geomtest kernel: mbr_taste(MBR,da0s3g)
Feb 1 06:05:16 geomtest kernel: g_slice_spoiled(0xffffff0001a3c280/da0s3g)
Feb 1 06:05:16 geomtest kernel: g_wither_geom(0xffffff0001b4b600(da0s3g))
Feb 1 06:05:16 geomtest kernel: g_mbrext_taste(MBREXT,da0s3g)
Feb 1 06:05:16 geomtest kernel: bsd_taste(BSD,da0s3g)
Feb 1 06:05:16 geomtest kernel: g_slice_spoiled(0xffffff0001b27200/da0s3g)
Feb 1 06:05:16 geomtest kernel: g_wither_geom(0xffffff0001504c00(da0s3g))
Feb 1 06:05:16 geomtest kernel: g_part_taste(PART,da0s3g)
Feb 1 06:05:16 geomtest kernel: g_wither_geom(0xffffff0001527100(da0s3g))
Feb 1 06:05:16 geomtest kernel: g_label_taste(LABEL, da0s3g)
Feb 1 06:05:16 geomtest kernel: GEOM_LABEL[1]: UFS2 file system detected on
da0s3g.
Feb 1 06:05:16 geomtest kernel: GEOM_LABEL[1]: MSDOSFS: da0s3g: no FAT
signature found.
Feb 1 06:05:16 geomtest kernel: g_detach(0xffffff0001b23a80)
Feb 1 06:05:16 geomtest kernel: g_destroy_consumer(0xffffff0001b23a80)
Feb 1 06:05:16 geomtest kernel: g_destroy_geom(0xffffff0001b4be00(label:taste))
Feb 1 06:05:16 geomtest kernel: g_detach(0xffffff0001a3c280)
Feb 1 06:05:16 geomtest kernel: g_destroy_consumer(0xffffff0001a3c280)
Feb 1 06:05:16 geomtest kernel: g_destroy_geom(0xffffff0001b4b600(da0s3g))
Feb 1 06:05:16 geomtest kernel: g_detach(0xffffff0001b27180)
Feb 1 06:05:16 geomtest kernel: g_destroy_consumer(0xffffff0001b27180)
Feb 1 06:05:16 geomtest kernel: g_destroy_geom(0xffffff0001a33c00(da0s3))
Feb 1 06:05:16 geomtest kernel: g_detach(0xffffff0001b27200)
Feb 1 06:05:16 geomtest kernel: g_destroy_consumer(0xffffff0001b27200)
Feb 1 06:05:16 geomtest kernel: g_destroy_geom(0xffffff0001504c00(da0s3g))
Feb 1 06:05:16 geomtest kernel: g_detach(0xffffff0001adc580)
Feb 1 06:05:16 geomtest kernel: g_destroy_consumer(0xffffff0001adc580)
Feb 1 06:05:16 geomtest kernel: g_destroy_geom(0xffffff0001527100(da0s3g))
Feb 1 06:05:16 geomtest kernel: g_detach(0xffffff0001b23b00)
Feb 1 06:05:16 geomtest kernel: g_destroy_consumer(0xffffff0001b23b00)
Feb 1 06:05:16 geomtest kernel:
Feb 1 06:05:16 geomtest kernel: g_destroy_geom(0xffffff000158b300(da0s3))
So after all that I can see why the sysctl call is taking
so long to complete but the burning question is why does
the entire system lock because of this? What else is
waiting on the sysctl lock which is so critical?
debug.lock.prof.stats:
max total wait_total count avg wait_avg cnt_hold
cnt_lock name
153585139 181478925 0 112 1620347 0 0
0 kern/vfs_subr.c:2062 (lockmgr:ufs)
26298436 26298513 23195820 24 1095771 966492 1
1 kern/kern_sysctl.c:1396 (sx:sysctl lock)
2396114 5604484 0 596 9403 0 0
0 sys/buf.h:280 (lockmgr:bufwait)
214 76577 1837 12490 6 0 185 280 kern/subr_sleepqueue.c:232 (spin mutex:sleepq
chain)
1 34126 2 30415 1 0 2
1 kern/kern_clock.c:224 (spin mutex:sched lock 1)
2 34069 5 30415 1 0 3
2 kern/kern_clock.c:224 (spin mutex:sched lock 0)
1 34067 0 30415 1 0 0
0 kern/kern_clock.c:264 (spin mutex:callout)
54 23629 1 4122 5 0 6
1 kern/uipc_socket.c:2468 (sleep mutex:so_snd)
34 22599 0 9232 2 0 0 0 amd64/amd64/pmap.c:935 (sleep mutex:vm page
queue mutex)
34 22573 0 9232 2 0 0
0 amd64/amd64/pmap.c:936 (sleep mutex:pmap)
121 20165 0 586 34 0 0
0 kern/uipc_sockbuf.c:145 (sx:so_snd_sx)
50 19229 38 9401 2 0 3
3 geom/geom_io.c:68 (sleep mutex:bio queue)
117 16281 0 586 27 0 17
0 netinet/tcp_usrreq.c:781 (sleep mutex:inp)
50 14954 245 11968 1 0 15
62 kern/kern_conf.c:69 (sleep mutex:cdev)
52 13599 0 4122 3 0 0
0 kern/uipc_socket.c:2469 (sleep mutex:so_rcv)
227 11779 0 625 18 0 1 0 cam/cam_xpt.c:7153 (sleep mutex:arcmsr Q buffer
lock)
40 11779 0 639 18 0 50
0 kern/kern_conf.c:372 (sleep mutex:Giant)
52 11201 219 3536 3 0 3
47 kern/kern_conf.c:384 (sleep mutex:Giant)
11 10888 0 1770 6 0 0 0 kern/subr_taskqueue.c:71 (spin
mutex:fast_taskqueue)
41 10866 0 3041 3 0 0
0 netinet/tcp_subr.c:1426 (sleep mutex:isn_mtx)
50 9858 5 8243 1 0 1
3 kern/sys_generic.c:1079 (sleep mutex:sellck)
5 9545 7 3416 2 0 3
2 kern/sched_ule.c:1872 (spin mutex:sched lock 0)
2 9029 677 7356 1 0 103
319 kern/kern_timeout.c:419 (spin mutex:callout)
52 8916 0 1813 4 0 1
0 kern/sys_generic.c:759 (sleep mutex:sellck)
222 8318 0 625 13 0 0
0 geom/geom_disk.c:198 (sleep mutex:g_disk_done)
1 7843 3 6407 1 0 3
2 kern/kern_timeout.c:296 (spin mutex:callout)
5 7240 17 5090 1 0 5 7 kern/kern_switch.c:182 (spin mutex:sched lock
0)
64 6940 58 1276 5 0 0
3 kern/kern_timeout.c:241 (sleep mutex:Giant)
5 6924 2880 1730 4 1 2 1715 amd64/amd64/mp_machdep.c:966 (spin mutex:sched
lock 0)
13 6845 0 623 10 0 0 0 amd64/amd64/mp_machdep.c:845 (spin mutex:smp
rendezvous)
53 6279 0 1766 3 0 0
0 kern/sys_pipe.c:1331 (sleep mutex:pipe mutex)
5 6063 93 4189 1 0 2
27 kern/kern_intr.c:1131 (spin mutex:sched lock 0)
15 5910 25 660 8 0 7
10 kern/sys_generic.c:1127 (sleep mutex:sellck)
56 5290 727 357 14 2 0
17 netinet/tcp_input.c:479 (sleep mutex:inp)
144 5264 0 116 45 0 0
0 vm/vm_map.c:3111 (sx:user map)
7 4971 25 1552 3 0 9
11 kern/sched_ule.c:1872 (spin mutex:sched lock 1)
11 4959 0 625 7 0 0 0 dev/arcmsr/arcmsr.c:1129 (sleep mutex:arcmsr Q
buffer lock)
210 4901 56 3436 1 0 9
27 kern/sched_ule.c:979 (spin mutex:sched lock 0)
69 4899 0 357 13 0 0
0 netinet/tcp_input.c:400 (sleep mutex:tcp)
1 4698 0 4045 1 0 0
0 kern/kern_clock.c:412 (spin mutex:sched lock 1)
1 4697 0 4189 1 0 1
0 kern/kern_intr.c:706 (spin mutex:sched lock 0)
1 4696 0 4184 1 0 0
0 kern/kern_timeout.c:191 (spin mutex:callout)
45 4604 0 591 7 0 6
0 kern/vfs_bio.c:3790 (sleep mutex:bdone lock)
2 4550 3 4045 1 0 0
1 kern/kern_clock.c:412 (spin mutex:sched lock 0)
53 4384 0 1228 3 0 0
0 kern/sys_generic.c:750 (sleep mutex:sellck)
2 4189 3 3392 1 0 1 2 kern/subr_sleepqueue.c:331 (spin mutex:sched
lock 0)
3695 4042 0 4 1010 0 0
0 kern/vfs_vnops.c:515 (lockmgr:ufs)
8 3891 0 645 6 0 0 0 kern/subr_sleepqueue.c:409 (spin mutex:sleepq
chain)
10 3773 0 3103 1 0 1
0 kern/vfs_bio.c:3002 (sleep mutex:bdone lock)
55 3323 22 634 5 0 4
7 kern/kern_mutex.c:141 (sleep mutex:sellck)
50 3188 0 2469 1 0 0 0 kern/kern_descrip.c:2139 (sleep mutex:sleep
mtxpool)
1853 3083 0 4 770 0 0
0 kern/vfs_subr.c:1665 (lockmgr:syncer)
6 3074 0 1310 2 0 0 0 kern/kern_lock.c:200 (sleep mutex:lockbuilder
mtxpool)
10 3003 0 645 4 0 0 0 kern/subr_sleepqueue.c:390 (sleep mutex:process
lock)
32 2991 0 2448 1 0 0 0 kern/kern_descrip.c:2007 (sleep mutex:sleep
mtxpool)
50 2852 0 2401 1 0 0
0 kern/kern_sig.c:996 (sleep mutex:process lock)
94 2458 0 589 4 0 0
0 dev/em/if_em.c:980 (sleep mutex:em0)
54 2358 6 592 3 0 0
3 kern/kern_conf.c:360 (sleep mutex:Giant)
1 2343 35 1547 1 0 20 20 kern/subr_sleepqueue.c:331 (spin mutex:sched
lock 1)
2 2272 98 1540 1 0 238
39 kern/sched_ule.c:979 (spin mutex:sched lock 1)
1 2214 0 1897 1 0 0 0 kern/subr_sleepqueue.c:595 (spin mutex:sched
lock 0)
1 2194 4 1329 1 0 2
2 kern/kern_timeout.c:500 (spin mutex:callout)
1 2189 33 1526 1 0 0 4 kern/subr_sleepqueue.c:800 (spin mutex:sleepq
chain)
5 2160 1959 1173 1 1 0 1168 amd64/amd64/mp_machdep.c:966 (spin mutex:sched
lock 1)
1 2158 1 1660 1 0 0 1 kern/sys_generic.c:753 (spin mutex:sched lock
0)
8 1972 0 248 7 0 0 0 kern/kern_timeout.c:241 (sleep mutex:process
lock)
6 1968 11 1245 1 0 2 5 kern/kern_intr.c:1131 (spin mutex:sched lock 1)
================================================
This e.mail is private and confidential between Multiplay (UK) Ltd. and the person or entity to whom it is addressed. In the event of misdirection, the recipient is prohibited from using, copying, printing or otherwise disseminating it or any information contained in it.
In the event of misdirection, illegible or incomplete transmission please
telephone +44 845 868 1337
or return the E.mail to [EMAIL PROTECTED]
_______________________________________________
freebsd-performance@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-performance
To unsubscribe, send any mail to "[EMAIL PROTECTED]"