Hi all,

I need your help debugging a hang on the latest build. Occasionally the system hangs: ping still works, but I/O to the rpool disks seems to be halted. The system has 42 disks. 40 of them are attached via two LSI 1086E controllers. The remaining two disks are attached via the motherboard (I don't know for sure which controller this is).

When the hang occurs the system itself continues to run. I/O to the disks attached via the LSI controllers still seems possible to some extent. However, I/O to the rpool disks (attached via the mainboard) seems to be completely blocked. Pinging the system still works, but login via SSH does not. Even running "new" commands does not work anymore.

To dig into the problem I booted the system with mdb (-k switch on boot) and waited for the next occurrence. This morning I was able to catch such a hang. I pressed F1-A on the console, which dropped me into mdb, and I was able to force a system dump. To my surprise, mdb was able to dump to the rpool disks. After the system had written the dump files I started mdb:

> $C
ffffff008bfe06b0 debug_enter+0x38(0)
ffffff008bfe06d0 abort_sequence_enter+0x35(0)
ffffff008bfe0720 kbtrans_streams_key+0x102(ffffff13767ebd00, 4, 0)
ffffff008bfe0750 conskbdlrput+0xe7(ffffff13771622a8, ffffff13d18cff60)
ffffff008bfe07c0 putnext+0x21e(ffffff1377343cf0, ffffff13d18cff60)
ffffff008bfe0800 kbtrans_queueevent+0x7c(ffffff1376ef8c80, ffffff008bfe0840)
ffffff008bfe0830 kbtrans_queuepress+0x7c(ffffff1376ef8c80, 4, ffffff008bfe0840)
ffffff008bfe0870 kbtrans_untrans_keypressed_raw+0x46(ffffff1376ef8c80, 4)
ffffff008bfe08a0 kbtrans_processkey+0x32(ffffff1376ef8c80, ffffffffc0014430, 4, 0)
ffffff008bfe08f0 kbtrans_streams_key+0x175(ffffff1376ef8c80, 4, 0)
ffffff008bfe0920 usbkbm_wrap_kbtrans+0x20(ffffff135edb5940, 4, 0)
ffffff008bfe0960 usbkbm_streams_callback+0x3c(ffffff135edb5940, 4, 0)
ffffff008bfe09e0 usbkbm_unpack_usb_packet+0x2f6(ffffff135edb5940, fffffffff7e43860, ffffff137720e070)
ffffff008bfe0a10 usbkbm_rput+0x84(ffffff1377343cf0, ffffff14057c6a20)
ffffff008bfe0a80 putnext+0x21e(ffffff1377162018, ffffff14057c6a20)
ffffff008bfe0ac0 hid_interrupt_pipe_callback+0x7c(ffffff1376ecb3e0, ffffff13b30daea0)
ffffff008bfe0b00 usba_req_normal_cb+0x155(ffffff13b30dae10)
ffffff008bfe0b60 hcdi_do_cb+0x133(ffffff1377157340, ffffff13b30dae10, ffffff13768b2398)
ffffff008bfe0ba0 hcdi_cb_thread+0xb2(ffffff1377157340)
ffffff008bfe0c40 taskq_thread+0x248(ffffff13770bfae0)
ffffff008bfe0c50 thread_start+8()

> ::msgbuf
[...]
smp4 at mpt4: wwn 5001c450000c4700
smp4 is /p...@0,0/pci8086,3...@5/pci1014,3...@0/s...@w5001c450000c4700
created version 22 pool bla using 22

panic[cpu0]/thread=ffffff008bfe0c60:
BAD TRAP: type=e (#pf Page fault) rp=ffffff008bfe05a0 addr=0 occurred in module "<unknown>" due to a NULL pointer dereference

sched:
#pf Page fault
Bad kernel fault at addr=0x0
pid=0, pc=0x0, sp=0xffffff008bfe0690, eflags=0x10002
cr0: 8005003b<pg,wp,ne,et,ts,mp,pe> cr4: 6f8<xmme,fxsr,pge,mce,pae,pse,de>
cr2: 0
cr3: 4000000
cr8: c

rdi: 286 rsi: 0 rdx: 0
rcx: 0 r8: 0 r9: ffffff135f0a5000
rax: 202 rbx: 0 rbp: ffffff008bfe0690
r10: ffffff13761e7040 r11: 0 r12: 0
r13: ffffff13767ebd00 r14: ffffff1376277a28 r15: ffffff13767ebe18
fsb: 0 gsb: fffffffffbc2fa60 ds: 4b
es: 4b fs: 0 gs: 1c3
trp: e err: 10 rip: 0
cs: 30 rfl: 10002 rsp: ffffff008bfe0690
ss: 38

ffffff008bfe0480 unix:die+dd ()
ffffff008bfe0590 unix:trap+177b ()
ffffff008bfe05a0 unix:cmntrap+e6 ()
ffffff008bfe0690 0 ()
ffffff008bfe06b0 unix:debug_enter+38 ()
ffffff008bfe06d0 unix:abort_sequence_enter+35 ()
ffffff008bfe0720 kbtrans:kbtrans_streams_key+102 ()
ffffff008bfe0750 conskbd:conskbdlrput+e7 ()
ffffff008bfe07c0 unix:putnext+21e ()
ffffff008bfe0800 kbtrans:kbtrans_queueevent+7c ()
ffffff008bfe0830 kbtrans:kbtrans_queuepress+7c ()
ffffff008bfe0870 kbtrans:kbtrans_untrans_keypressed_raw+46 ()
ffffff008bfe08a0 kbtrans:kbtrans_processkey+32 ()
ffffff008bfe08f0 kbtrans:kbtrans_streams_key+175 ()
ffffff008bfe0920 usbkbm:usbkbm_wrap_kbtrans+20 ()
ffffff008bfe0960 usbkbm:usbkbm_streams_callback+3c ()
ffffff008bfe09e0 usbkbm:usbkbm_unpack_usb_packet+2f6 ()
ffffff008bfe0a10 usbkbm:usbkbm_rput+84 ()
ffffff008bfe0a80 unix:putnext+21e ()
ffffff008bfe0ac0 hid:hid_interrupt_pipe_callback+7c ()
ffffff008bfe0b00 usba:usba_req_normal_cb+155 ()
ffffff008bfe0b60 usba:hcdi_do_cb+133 ()
ffffff008bfe0ba0 usba:hcdi_cb_thread+b2 ()
ffffff008bfe0c40 genunix:taskq_thread+248 ()
ffffff008bfe0c50 unix:thread_start+8 ()

syncing file systems... done
dumping to /dev/zvol/dsk/rpool/dump, offset 65536, content: all

> ::cpuinfo -v
 ID ADDR             FLG NRUN BSPL PRI RNRN KRNRN SWITCH THREAD           PROC
  0 fffffffffbc3a080  1b    2    0  60  yes    no t-1    ffffff008bfe0c60 sched
                      |    |
           RUNNING <--+    +-->  PRI THREAD           PROC
             READY                99 ffffff008bb81c60 sched
            EXISTS                60 ffffff008cbd4c60 sched
            ENABLE

 ID ADDR             FLG NRUN BSPL PRI RNRN KRNRN SWITCH THREAD           PROC
  1 ffffff13767fc080  1f    8    0  -1   no    no t-17   ffffff008b526c60 (idle)
                      |    |
           RUNNING <--+    +-->  PRI THREAD           PROC
             READY                99 ffffff008bb69c60 sched
          QUIESCED                60 ffffff008b915c60 sched
            EXISTS                60 ffffff008b017c60 sched
            ENABLE                60 ffffff008c877c60 sched
                                  59 ffffff1377476760 sendmail
                                  59 ffffff13817fe040 hald
                                  59 ffffff13817fb8e0 syslogd
                                  59 ffffff13a35f2a80 nsrexecd

 ID ADDR             FLG NRUN BSPL PRI RNRN KRNRN SWITCH THREAD           PROC
  2 ffffff13767f8000  1f    2    0  60   no    no t-0    ffffff008b9cfc60 fsflush
                      |    |
           RUNNING <--+    +-->  PRI THREAD           PROC
             READY                59 ffffff1377475c80 inetd
          QUIESCED                59 ffffff13a5bb5e80 iostat
            EXISTS
            ENABLE

 ID ADDR             FLG NRUN BSPL PRI RNRN KRNRN SWITCH THREAD           PROC
  3 ffffff1376a57ac0  1f    0    0  59   no    no t-0    ffffff13817e7520 utmpd
                      |
           RUNNING <--+
             READY
          QUIESCED
            EXISTS
            ENABLE

 ID ADDR             FLG NRUN BSPL PRI RNRN KRNRN SWITCH THREAD           PROC
  4 ffffff1376a52580  1f    3    0  -1   no    no t-0    ffffff008b775c60 (idle)
                      |    |
           RUNNING <--+    +-->  PRI THREAD           PROC
             READY                60 ffffff008b631c60 sched
          QUIESCED                60 ffffff008b34cc60 sched
            EXISTS                60 ffffff008cbecc60 sched
            ENABLE

 ID ADDR             FLG NRUN BSPL PRI RNRN KRNRN SWITCH THREAD           PROC
  5 ffffff1376a51080  1f    7    0  99   no    no t-0    ffffff008b131c60 sched
                      |    |
           RUNNING <--+    +-->  PRI THREAD           PROC
             READY                60 ffffff008cbe0c60 sched
          QUIESCED                60 ffffff008b01dc60 sched
            EXISTS                59 ffffff13a35fab00 smbd
            ENABLE                59 ffffff1377475900 ntpd
                                  59 ffffff1381a42ac0 devfsadm
                                  59 ffffff13a36043c0 smbd
                                  59 ffffff13a5bb3000 fmd

 ID ADDR             FLG NRUN BSPL PRI RNRN KRNRN SWITCH THREAD           PROC
  6 ffffff1376a4ba80  1f    2    0  59   no    no t-0    ffffff13a5bb81e0 intrd
                      |    |
           RUNNING <--+    +-->  PRI THREAD           PROC
             READY                59 ffffff13817e7c20 nscd
          QUIESCED                59 ffffff1376b378a0 svc.configd
            EXISTS
            ENABLE

 ID ADDR             FLG NRUN BSPL PRI RNRN KRNRN SWITCH THREAD           PROC
  7 ffffff1376a44540  1f    7    0  99   no    no t-0    ffffff008b137c60 sched
                      |    |
           RUNNING <--+    +-->  PRI THREAD           PROC
             READY                60 ffffff1381a42e40 mdb
          QUIESCED                60 ffffff13a35ee720 nfsd
            EXISTS                60 ffffff1377474400 bash
            ENABLE                60 ffffff008c1bec60 sched
                                  59 ffffff13817ef540 nscd
                                  59 ffffff13817fbc60 syslogd
                                  58 ffffff1381a38720 smbd

So, just judging from the stack trace ($C macro), the problem seems to be related to some kind of USB device? Can anyone help me?

CC to device-drivers list

Thanks in advance
Ronny Egner
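
The inspection steps described in this post can also be replayed non-interactively against the saved dump. This is a minimal sketch, assuming savecore has written unix.0 and vmcore.0 under /var/crash/<hostname>. DUMPDIR and CMDFILE are hypothetical names introduced here, and ::status is added only to print a summary of the dump. The other dcmds are the ones quoted above.

```shell
#!/bin/sh
# Sketch only: replay the dcmds quoted in this post against a saved crash dump.
# DUMPDIR, CMDFILE and the unix.0/vmcore.0 file names are assumptions;
# adjust them to wherever savecore placed the dump on your system.
DUMPDIR="${DUMPDIR:-/var/crash/$(uname -n)}"
CMDFILE="${CMDFILE:-${TMPDIR:-/tmp}/hang-dcmds.$$}"

# The dcmds from the post, plus ::status for a short summary of the dump.
# The quoted heredoc delimiter keeps $C literal.
cat > "$CMDFILE" <<'EOF'
::status
$C
::msgbuf
::cpuinfo -v
EOF

if command -v mdb >/dev/null 2>&1 && [ -f "$DUMPDIR/vmcore.0" ]; then
    # mdb reads dcmds from standard input when it is not attached to a terminal.
    (cd "$DUMPDIR" && mdb unix.0 vmcore.0 < "$CMDFILE")
else
    echo "mdb or $DUMPDIR/vmcore.0 not found; dcmds saved in $CMDFILE"
fi
```

Keeping the dcmds in a file makes it easy to attach the same output to a follow-up mail or to rerun them after the next hang.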