----- Original Message ----- From: "Eric Anderson" <[EMAIL PROTECTED]>

I saw this once before, a long time back, and every time I went through a debugging session, it came to some kind of lock on the sysctl tree with regards to the geom info (maybe the XML kind of tree dump or something). I don't recall all the details, but it was something like that.

Yep thats where I've traced it to its requesting: kern.geom.confxml

Which does:-
static int
sysctl_kern_geom_confxml(SYSCTL_HANDLER_ARGS)
{
   int error;
   struct sbuf *sb;

   sb = sbuf_new(NULL, NULL, 0, SBUF_AUTOEXTEND);
   g_waitfor_event(g_confxml, sb, M_WAITOK, NULL);
   error = SYSCTL_OUT(req, sbuf_data(sb), sbuf_len(sb) + 1);
   sbuf_delete(sb);
   return error;
}

What I dont understand is why this would lock the entire machine.

I've enabled LOCK_PROFILING and reran and I get the following which
seems to indicate the culpret is: SYSCTL_LOCK()

From what I can tell g_waitfor_event is returning EAGAIN for a large
amount of time which means we get stuck in:-
userland_sysctl
...
   SYSCTL_LOCK();

   do {
       req.oldidx = 0;
       req.newidx = 0;
       error = sysctl_root(0, name, namelen, &req);
   } while (error == EAGAIN);

   if (req.lock == REQ_WIRED && req.validlen > 0)
       vsunlock(req.oldptr, req.validlen);

   SYSCTL_UNLOCK();
...

The only reason I can see for returning EAGAIN is g_destroy_geom
calling g_cancel_event

To confirm this wasnt related to anything that top is doing and
we are seeing a total lockup I used:
sh -c 'while [ 1 ]; do sleep 1; date; done'
Fri Feb  1 06:04:47 GMT 2008
Fri Feb  1 06:04:48 GMT 2008
Fri Feb  1 06:04:49 GMT 2008 <= Lockup
Fri Feb  1 06:05:16 GMT 2008 <= Recover
Fri Feb  1 06:05:17 GMT 2008
Fri Feb  1 06:05:18 GMT 2008


Enabling geom debugging I get:-
Feb  1 06:04:45 geomtest kernel: g_post_event_x(0xffffffff802394c0, 
0xffffff00010e6100, 2, 0)
Feb  1 06:04:45 geomtest kernel: ref 0xffffff00010e6100
Feb  1 06:04:45 geomtest kernel: g_post_event_x(0xffffffff802394c0, 
0xffffff00014e6700, 2, 0)
Feb  1 06:04:45 geomtest kernel: ref 0xffffff00014e6700
Feb  1 06:04:49 geomtest kernel: g_post_event_x(0xffffffff80239270, 
0xffffff00010e6100, 2, 0)
Feb  1 06:04:49 geomtest kernel: ref 0xffffff00010e6100
Feb  1 06:04:49 geomtest kernel: g_post_event_x(0xffffffff80239270, 
0xffffff00014e6700, 2, 0)
Feb  1 06:04:49 geomtest kernel: ref 0xffffff00014e6700
Feb  1 06:04:49 geomtest kernel: mbr_taste(MBR,da0s3)
Feb  1 06:04:49 geomtest kernel: g_mbrext_taste(MBREXT,da0s3)
Feb  1 06:04:49 geomtest kernel: g_slice_spoiled(0xffffff0001b27180/da0s3)
Feb  1 06:04:49 geomtest kernel: g_wither_geom(0xffffff0001a33c00(da0s3))
Feb  1 06:04:49 geomtest kernel: g_part_taste(PART,da0s3)
Feb  1 06:04:56 geomtest kernel: g_post_event_x(0xffffffff80235b10, 
0xffffff000144a9c0, 2, 262144)
Feb  1 06:05:00 geomtest kernel: g_wither_geom(0xffffff000158b300(da0s3))
Feb  1 06:05:00 geomtest kernel:
Feb  1 06:05:00 geomtest kernel: g_label_taste(LABEL, da0s3)
Feb  1 06:05:00 geomtest kernel:
Feb  1 06:05:16 geomtest kernel: GEOM_LABEL[1]: MSDOSFS: da0s3: FAT32 volume 
not valid.
Feb  1 06:05:16 geomtest kernel: g_detach(0xffffff0001b23980)
Feb  1 06:05:16 geomtest kernel: g_destroy_consumer(0xffffff0001b23980)
Feb  1 06:05:16 geomtest kernel: g_destroy_geom(0xffffff0001b4be00(label:taste))
Feb  1 06:05:16 geomtest kernel: mbr_taste(MBR,da0s3g)
Feb  1 06:05:16 geomtest kernel: g_slice_spoiled(0xffffff0001a3c280/da0s3g)
Feb  1 06:05:16 geomtest kernel: g_wither_geom(0xffffff0001b4b600(da0s3g))
Feb  1 06:05:16 geomtest kernel: g_mbrext_taste(MBREXT,da0s3g)
Feb  1 06:05:16 geomtest kernel: bsd_taste(BSD,da0s3g)
Feb  1 06:05:16 geomtest kernel: g_slice_spoiled(0xffffff0001b27200/da0s3g)
Feb  1 06:05:16 geomtest kernel: g_wither_geom(0xffffff0001504c00(da0s3g))
Feb  1 06:05:16 geomtest kernel: g_part_taste(PART,da0s3g)
Feb  1 06:05:16 geomtest kernel: g_wither_geom(0xffffff0001527100(da0s3g))
Feb  1 06:05:16 geomtest kernel: g_label_taste(LABEL, da0s3g)
Feb  1 06:05:16 geomtest kernel: GEOM_LABEL[1]: UFS2 file system detected on 
da0s3g.
Feb  1 06:05:16 geomtest kernel: GEOM_LABEL[1]: MSDOSFS: da0s3g: no FAT 
signature found.
Feb  1 06:05:16 geomtest kernel: g_detach(0xffffff0001b23a80)
Feb  1 06:05:16 geomtest kernel: g_destroy_consumer(0xffffff0001b23a80)
Feb  1 06:05:16 geomtest kernel: g_destroy_geom(0xffffff0001b4be00(label:taste))
Feb  1 06:05:16 geomtest kernel: g_detach(0xffffff0001a3c280)
Feb  1 06:05:16 geomtest kernel: g_destroy_consumer(0xffffff0001a3c280)
Feb  1 06:05:16 geomtest kernel: g_destroy_geom(0xffffff0001b4b600(da0s3g))
Feb  1 06:05:16 geomtest kernel: g_detach(0xffffff0001b27180)
Feb  1 06:05:16 geomtest kernel: g_destroy_consumer(0xffffff0001b27180)
Feb  1 06:05:16 geomtest kernel: g_destroy_geom(0xffffff0001a33c00(da0s3))
Feb  1 06:05:16 geomtest kernel: g_detach(0xffffff0001b27200)
Feb  1 06:05:16 geomtest kernel: g_destroy_consumer(0xffffff0001b27200)
Feb  1 06:05:16 geomtest kernel: g_destroy_geom(0xffffff0001504c00(da0s3g))
Feb  1 06:05:16 geomtest kernel: g_detach(0xffffff0001adc580)
Feb  1 06:05:16 geomtest kernel: g_destroy_consumer(0xffffff0001adc580)
Feb  1 06:05:16 geomtest kernel: g_destroy_geom(0xffffff0001527100(da0s3g))
Feb  1 06:05:16 geomtest kernel: g_detach(0xffffff0001b23b00)
Feb  1 06:05:16 geomtest kernel: g_destroy_consumer(0xffffff0001b23b00)
Feb  1 06:05:16 geomtest kernel:
Feb  1 06:05:16 geomtest kernel: g_destroy_geom(0xffffff000158b300(da0s3))

So after all that I can see why the sysctl call is taking
so long to complete but the burning question is why does
the entire system lock because of this? What else is
waiting on the sysctl lock which is so critical?


debug.lock.prof.stats:
  max        total   wait_total       count   avg wait_avg     cnt_hold     
cnt_lock name
153585139    181478925            0         112 1620347     0            0      
      0 kern/vfs_subr.c:2062 (lockmgr:ufs)
26298436     26298513     23195820          24 1095771 966492            1      
      1 kern/kern_sysctl.c:1396 (sx:sysctl lock)
2396114      5604484            0         596  9403     0            0          
  0 sys/buf.h:280 (lockmgr:bufwait)
214 76577 1837 12490 6 0 185 280 kern/subr_sleepqueue.c:232 (spin mutex:sleepq chain)
    1        34126            2       30415     1     0            2            
1 kern/kern_clock.c:224 (spin mutex:sched lock 1)
    2        34069            5       30415     1     0            3            
2 kern/kern_clock.c:224 (spin mutex:sched lock 0)
    1        34067            0       30415     1     0            0            
0 kern/kern_clock.c:264 (spin mutex:callout)
   54        23629            1        4122     5     0            6            
1 kern/uipc_socket.c:2468 (sleep mutex:so_snd)
34 22599 0 9232 2 0 0 0 amd64/amd64/pmap.c:935 (sleep mutex:vm page queue mutex)
   34        22573            0        9232     2     0            0            
0 amd64/amd64/pmap.c:936 (sleep mutex:pmap)
  121        20165            0         586    34     0            0            
0 kern/uipc_sockbuf.c:145 (sx:so_snd_sx)
   50        19229           38        9401     2     0            3            
3 geom/geom_io.c:68 (sleep mutex:bio queue)
  117        16281            0         586    27     0           17            
0 netinet/tcp_usrreq.c:781 (sleep mutex:inp)
   50        14954          245       11968     1     0           15           
62 kern/kern_conf.c:69 (sleep mutex:cdev)
   52        13599            0        4122     3     0            0            
0 kern/uipc_socket.c:2469 (sleep mutex:so_rcv)
227 11779 0 625 18 0 1 0 cam/cam_xpt.c:7153 (sleep mutex:arcmsr Q buffer lock)
   40        11779            0         639    18     0           50            
0 kern/kern_conf.c:372 (sleep mutex:Giant)
   52        11201          219        3536     3     0            3           
47 kern/kern_conf.c:384 (sleep mutex:Giant)
11 10888 0 1770 6 0 0 0 kern/subr_taskqueue.c:71 (spin mutex:fast_taskqueue)
   41        10866            0        3041     3     0            0            
0 netinet/tcp_subr.c:1426 (sleep mutex:isn_mtx)
   50         9858            5        8243     1     0            1            
3 kern/sys_generic.c:1079 (sleep mutex:sellck)
    5         9545            7        3416     2     0            3            
2 kern/sched_ule.c:1872 (spin mutex:sched lock 0)
    2         9029          677        7356     1     0          103          
319 kern/kern_timeout.c:419 (spin mutex:callout)
   52         8916            0        1813     4     0            1            
0 kern/sys_generic.c:759 (sleep mutex:sellck)
  222         8318            0         625    13     0            0            
0 geom/geom_disk.c:198 (sleep mutex:g_disk_done)
    1         7843            3        6407     1     0            3            
2 kern/kern_timeout.c:296 (spin mutex:callout)
5 7240 17 5090 1 0 5 7 kern/kern_switch.c:182 (spin mutex:sched lock 0)
   64         6940           58        1276     5     0            0            
3 kern/kern_timeout.c:241 (sleep mutex:Giant)
5 6924 2880 1730 4 1 2 1715 amd64/amd64/mp_machdep.c:966 (spin mutex:sched lock 0) 13 6845 0 623 10 0 0 0 amd64/amd64/mp_machdep.c:845 (spin mutex:smp rendezvous)
   53         6279            0        1766     3     0            0            
0 kern/sys_pipe.c:1331 (sleep mutex:pipe mutex)
    5         6063           93        4189     1     0            2           
27 kern/kern_intr.c:1131 (spin mutex:sched lock 0)
   15         5910           25         660     8     0            7           
10 kern/sys_generic.c:1127 (sleep mutex:sellck)
   56         5290          727         357    14     2            0           
17 netinet/tcp_input.c:479 (sleep mutex:inp)
  144         5264            0         116    45     0            0            
0 vm/vm_map.c:3111 (sx:user map)
    7         4971           25        1552     3     0            9           
11 kern/sched_ule.c:1872 (spin mutex:sched lock 1)
11 4959 0 625 7 0 0 0 dev/arcmsr/arcmsr.c:1129 (sleep mutex:arcmsr Q buffer lock)
  210         4901           56        3436     1     0            9           
27 kern/sched_ule.c:979 (spin mutex:sched lock 0)
   69         4899            0         357    13     0            0            
0 netinet/tcp_input.c:400 (sleep mutex:tcp)
    1         4698            0        4045     1     0            0            
0 kern/kern_clock.c:412 (spin mutex:sched lock 1)
    1         4697            0        4189     1     0            1            
0 kern/kern_intr.c:706 (spin mutex:sched lock 0)
    1         4696            0        4184     1     0            0            
0 kern/kern_timeout.c:191 (spin mutex:callout)
   45         4604            0         591     7     0            6            
0 kern/vfs_bio.c:3790 (sleep mutex:bdone lock)
    2         4550            3        4045     1     0            0            
1 kern/kern_clock.c:412 (spin mutex:sched lock 0)
   53         4384            0        1228     3     0            0            
0 kern/sys_generic.c:750 (sleep mutex:sellck)
2 4189 3 3392 1 0 1 2 kern/subr_sleepqueue.c:331 (spin mutex:sched lock 0)
 3695         4042            0           4  1010     0            0            
0 kern/vfs_vnops.c:515 (lockmgr:ufs)
8 3891 0 645 6 0 0 0 kern/subr_sleepqueue.c:409 (spin mutex:sleepq chain)
   10         3773            0        3103     1     0            1            
0 kern/vfs_bio.c:3002 (sleep mutex:bdone lock)
   55         3323           22         634     5     0            4            
7 kern/kern_mutex.c:141 (sleep mutex:sellck)
50 3188 0 2469 1 0 0 0 kern/kern_descrip.c:2139 (sleep mutex:sleep mtxpool)
 1853         3083            0           4   770     0            0            
0 kern/vfs_subr.c:1665 (lockmgr:syncer)
6 3074 0 1310 2 0 0 0 kern/kern_lock.c:200 (sleep mutex:lockbuilder mtxpool) 10 3003 0 645 4 0 0 0 kern/subr_sleepqueue.c:390 (sleep mutex:process lock) 32 2991 0 2448 1 0 0 0 kern/kern_descrip.c:2007 (sleep mutex:sleep mtxpool)
   50         2852            0        2401     1     0            0            
0 kern/kern_sig.c:996 (sleep mutex:process lock)
   94         2458            0         589     4     0            0            
0 dev/em/if_em.c:980 (sleep mutex:em0)
   54         2358            6         592     3     0            0            
3 kern/kern_conf.c:360 (sleep mutex:Giant)
1 2343 35 1547 1 0 20 20 kern/subr_sleepqueue.c:331 (spin mutex:sched lock 1)
    2         2272           98        1540     1     0          238           
39 kern/sched_ule.c:979 (spin mutex:sched lock 1)
1 2214 0 1897 1 0 0 0 kern/subr_sleepqueue.c:595 (spin mutex:sched lock 0)
    1         2194            4        1329     1     0            2            
2 kern/kern_timeout.c:500 (spin mutex:callout)
1 2189 33 1526 1 0 0 4 kern/subr_sleepqueue.c:800 (spin mutex:sleepq chain) 5 2160 1959 1173 1 1 0 1168 amd64/amd64/mp_machdep.c:966 (spin mutex:sched lock 1) 1 2158 1 1660 1 0 0 1 kern/sys_generic.c:753 (spin mutex:sched lock 0) 8 1972 0 248 7 0 0 0 kern/kern_timeout.c:241 (sleep mutex:process lock) 6 1968 11 1245 1 0 2 5 kern/kern_intr.c:1131 (spin mutex:sched lock 1)

================================================
This e.mail is private and confidential between Multiplay (UK) Ltd. and the person or entity to whom it is addressed. In the event of misdirection, the recipient is prohibited from using, copying, printing or otherwise disseminating it or any information contained in it.
In the event of misdirection, illegible or incomplete transmission please 
telephone +44 845 868 1337
or return the E.mail to [EMAIL PROTECTED]

_______________________________________________
freebsd-performance@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-performance
To unsubscribe, send any mail to "[EMAIL PROTECTED]"

Reply via email to