Thank you for the analysis... it sounds like I was eventually on the right track (last post to driver-discuss), though with my inexperience with mdb and crash dump analysis in general it would have taken me a long time to get there. I suspected the vbox driver, so I did upgrade to the latest and got the same hang (and a crash dump); the stack trace looked similar, and with your detailed explanation I think I should be able to confirm that it is the same problem. If so, I'll pursue this with the vbox folks and/or submit a patch if there isn't an existing bug.
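For my own notes (mdb and crash dump analysis are still new to me), your analysis below boils down to four dcmds, with the addresses taken from this particular dump:

    ::cpuinfo -v
    ffffff0007c49c60::findstack -v
    ffffff01e4840d98::mutex
    ::stacks -a -m vboxdrv

i.e. see what each CPU is running and which threads it has pinned, dump the stack of the spinning interrupt chain, confirm that the lock it is stuck on is a held spin mutex, and then list every thread with vboxdrv frames to find the holder.
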
On Sat, Sep 19, 2009 at 3:04 PM, Kerry Shu <[email protected]> wrote:
> Hi Jason,
>
> Thanks for sharing the core dumps.
>
> I've root-caused this issue as a deadlock. It's caused by using a
> spin lock for the cv_timedwait_sig()/cv_signal() calls in
> rtSemEventMultiWait()/RTSemEventSignal() in the vboxdrv module of
> VirtualBox. You may want to try the latest vbox to see whether this
> issue is still there. See the analysis below for details:
>
> >> ::cpuinfo -v
>  ID ADDR             FLG NRUN BSPL PRI RNRN KRNRN SWITCH THREAD           PROC
>   0 fffffffffbc36f80  1b    1    2  -1   no    no t-5    ffffff0007805c60 (idle)
>                        |    |    |
>             RUNNING <--+    |    +--> PIL THREAD
>               READY         |           2 ffffff00078cbc60
>              EXISTS         |
>              ENABLE         +--> PRI THREAD           PROC
>                                   60 ffffff0007e11c60 sched
>
>  ID ADDR             FLG NRUN BSPL PRI RNRN KRNRN SWITCH THREAD           PROC
>   1 ffffff01cbee7000  1f    0    0 168   no    no t-6968 ffffff0007c49c60 sched
>                        |    |
>             RUNNING <--+    +--> PIL THREAD
>               READY               9 ffffff0007c49c60
>            QUIESCED               6 ffffff0007c4fc60
>              EXISTS               - ffffff01d6d55740 VirtualBox
>              ENABLE
>
> Here we see that ffffff01d6d55740 is preempted by the e1000g interrupt (IPL 6),
> and the e1000g interrupt is in turn preempted by the nvidia graphics interrupt.
>
> >> ffffff0007c49c60::findstack -v
> stack pointer for thread ffffff0007c49c60: ffffff0007c49b30
>   ffffff0007c49bb0 rm_isr+0xaa()
>   ffffff0007c49c00 av_dispatch_autovect+0x7c(10)
>   ffffff0007c49c40 dispatch_hardint+0x33(10, 6)
>   ffffff0007c4f450 switch_sp_and_call+0x13()
>   ffffff0007c4f4a0 do_interrupt+0x9e(ffffff0007c4f4b0, b)
>   ffffff0007c4f4b0 _interrupt+0xba()
>   ffffff0007c4f5c0 default_lock_delay+0x8c(4)
>   ffffff0007c4f630 lock_set_spl_spin+0xc2(ffffff01e4840d99, b, ffffff01e4840d9c, 6)
>   ffffff0007c4f690 mutex_vector_enter+0x45e(ffffff01e4840d98)
>   ffffff0007c4f6c0 RTSemEventSignal+0x6a()
>   ffffff0007c4f740 0xfffffffff836c57b()
>   ffffff0007c4f770 0xfffffffff836d73a()
>   ffffff0007c4f830 vboxNetFltSolarisRecv+0x331()
>   ffffff0007c4f880 VBoxNetFltSolarisModReadPut+0x107()
>   ffffff0007c4f8f0 putnext+0x21e(ffffff01e425ed30, ffffff01ccb1a0a0)
>   ffffff0007c4f950 dld_str_rx_raw+0xb3(ffffff01ce74a6c8, 0, ffffff01ccb1a0a0, ffffff0007c4f980)
>   ffffff0007c4fa10 dls_rx_promisc+0x179(ffffff01ce74a6c8, 0, ffffff01ccb1a0a0, 0)
>   ffffff0007c4fa50 mac_promisc_dispatch_one+0x5f(ffffff01df15ee78, ffffff01d20f7640, 0)
>   ffffff0007c4fac0 mac_promisc_dispatch+0x105(ffffff01cdc5f788, ffffff01d20f7640, 0)
>   ffffff0007c4fb10 mac_rx+0x3e(ffffff01cdc5f788, ffffff01cdb86ee8, ffffff01d20f7640)
>   ffffff0007c4fb50 mac_rx_ring+0x4c(ffffff01cdc5f788, ffffff01cdb86ee8, ffffff01d20f7640, 1)
>   ffffff0007c4fbb0 e1000g_intr+0x17e(ffffff01cd3ec000)
>   ffffff0007c4fc00 av_dispatch_autovect+0x7c(30)
>   ffffff0007c4fc40 dispatch_hardint+0x33(30, 0)
>   ffffff0008b8e7d0 switch_sp_and_call+0x13()
>   ffffff0008b8e820 do_interrupt+0x9e(ffffff0008b8e830, 0)
>   ffffff0008b8e830 _interrupt+0xba()
>   ffffff0008b8e960 untimeout_generic+0xaf(17e3e41c8, 0)
>   ffffff0008b8e980 untimeout_default+0x15(17e3e41c8, 0)
>   ffffff0008b8ea00 cv_timedwait_sig_internal+0x1f8(ffffff01dd2a2910, ffffff01dd2a2908, 553c76, 0)
>   ffffff0008b8ea30 cv_timedwait_sig+0x1f(ffffff01dd2a2910, ffffff01dd2a2908, 553c76)
>   ffffff0008b8ea90 rtSemEventMultiWait+0xe2()
>   ffffff0008b8eae0 0xfffffffff8333a80()
>   ffffff0008b8eb10 0xfffffffff833e60c()
>   ffffff0008b8eb50 0xfffffffff833edeb()
>   ffffff0008b8ec30 supdrvIOCtl+0xd62()
>   ffffff0008b8ed00 VBoxDrvSolarisIOCtl+0x359()
>   ffffff0008b8ed40 cdev_ioctl+0x45(10a00000000, c0185687, fffffd7ffe55cda0, 202003, ffffff01dd4364b0, ffffff0008b8ee24)
>   ffffff0008b8ed80 spec_ioctl+0x83(ffffff01de3ca200, c0185687, fffffd7ffe55cda0, 202003, ffffff01dd4364b0, ffffff0008b8ee24, 0)
>   ffffff0008b8ee00 fop_ioctl+0x7b(ffffff01de3ca200, c0185687, fffffd7ffe55cda0, 202003, ffffff01dd4364b0, ffffff0008b8ee24, 0)
>   ffffff0008b8ef00 ioctl+0x18e(17, ffffffffc0185687, fffffd7ffe55cda0)
>   ffffff0008b8ef10 sys_syscall+0x17b()
>
> Anyway, the e1000g interrupt that preempted ffffff01d6d55740 is spinning on a spin lock:
>
> >> ffffff01e4840d98::mutex
>             ADDR  TYPE             HELD MINSPL OLDSPL WAITERS
> ffffff01e4840d98  spin              yes     11      0       -
>
> The question now is: who is holding the spin lock? Let's print out
> all the threads related to vboxdrv:
>
> >> ::stacks -a -m vboxdrv
> THREAD           STATE    SOBJTYPE COUNT
> ffffff01ccad6e60 SLEEP    MUTEX    1
>                  swtch+0x147
>                  turnstile_block+0x764
>                  mutex_vector_enter+0x261
>                  timeout_generic+0x87
>                  cv_timedwait_sig_internal+0xef
>                  cv_timedwait_sig+0x1f
>                  rtSemEventWait+0xe2
>                  0xfffffffff836d354
>                  0xfffffffff833e6ae
>                  0xfffffffff833edeb
>                  supdrvIOCtl+0xd62
>                  VBoxDrvSolarisIOCtl+0x359
>                  cdev_ioctl+0x45
>                  spec_ioctl+0x83
>                  fop_ioctl+0x7b
>                  ioctl+0x18e
>
> THREAD           STATE    SOBJTYPE COUNT
> ffffff0007c4fc60 ONPROC   <NONE>   1
>                  default_lock_delay+0x8c
>                  lock_set_spl_spin+0xc2
>                  mutex_vector_enter+0x45e
>                  RTSemEventSignal+0x6a
>                  0xfffffffff836c57b
>                  0xfffffffff836d73a
>                  vboxNetFltSolarisRecv+0x331
>                  VBoxNetFltSolarisModReadPut+0x107
>                  putnext+0x21e
>                  dld_str_rx_raw+0xb3
>                  dls_rx_promisc+0x179
>                  mac_promisc_dispatch_one+0x5f
>                  mac_promisc_dispatch+0x105
>                  mac_rx+0x3e
>                  mac_rx_ring+0x4c
>                  e1000g_intr+0x17e
>                  av_dispatch_autovect+0x7c
>                  dispatch_hardint+0x33
>
> THREAD           STATE    SOBJTYPE COUNT
> ffffff01d6d55740 ONPROC   <NONE>   1
>                  untimeout_generic+0xaf
>                  untimeout_default+0x15
>                  cv_timedwait_sig_internal+0x1f8
>                  cv_timedwait_sig+0x1f
>                  rtSemEventMultiWait+0xe2
>                  0xfffffffff8333a80
>                  0xfffffffff833e60c
>                  0xfffffffff833edeb
>                  supdrvIOCtl+0xd62
>                  VBoxDrvSolarisIOCtl+0x359
>                  cdev_ioctl+0x45
>                  spec_ioctl+0x83
>                  fop_ioctl+0x7b
>                  ioctl+0x18e
>
> Now we can see the deadlock scenario:
>
> - Thread A (ffffff01d6d55740) is cancelling a cv timeout with the callout
>   table lock (at untimeout_generic+0x54) held.
>
> - Thread B (ffffff01ccad6e60) is blocked, sleeping while it tries to get the
>   callout table lock to set up a timeout. However, it holds the
>   vboxdrv spin lock (taken at rtSemEventMultiWait+0x8b) and hasn't had a
>   chance to release it yet (at cv_timedwait_sig_internal+0x138).
>
> - The e1000g interrupt (thread ffffff0007c4fc60) preempted thread A and calls
>   RTSemEventSignal(), which needs to hold the vboxdrv spin lock, but that
>   spin lock is held by thread B. This causes the e1000g interrupt thread to
>   spin there forever on CPU1. (Only interrupts with a higher priority
>   than e1000g's, i.e. >6, can get scheduled on CPU1.)
>
> To solve this issue, vboxdrv needs to change to use an adaptive lock
> rather than a spin lock for the cv_timedwait_sig()/cv_signal() calls in
> rtSemEventMultiWait()/RTSemEventSignal().
>
> On the other hand, can we also improve the Solaris cv_* interfaces to avoid
> such issues? I CCed Eric and Madhavan for comments.
>
> Regards,
> Kerry
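
If I'm reading this right, the fix on the vboxdrv side amounts to backing the event semaphore with an adaptive mutex rather than a spin lock around those cv calls. Here is a minimal sketch of that direction using the stock DDI synchronization routines (the names and structure are mine, purely for illustration; I haven't looked at the actual rtSemEventMultiWait()/RTSemEventSignal() sources yet):

/*
 * Illustrative sketch only, not the actual vboxdrv code: an event
 * built on an adaptive mutex (MUTEX_DRIVER, NULL ibc) instead of a
 * spin lock, so the cv_timedwait_sig()/cv_signal() pair never holds a
 * spin lock across the callout (timeout/untimeout) path.
 */
#include <sys/types.h>
#include <sys/ksynch.h>
#include <sys/ddi.h>
#include <sys/sunddi.h>

typedef struct demo_event {
    kmutex_t    de_lock;        /* adaptive, not MUTEX_SPIN */
    kcondvar_t  de_cv;
    boolean_t   de_signaled;
} demo_event_t;

static void
demo_event_init(demo_event_t *ep)
{
    /*
     * The problematic variant (per the analysis above) would create
     * de_lock as MUTEX_SPIN with a high interrupt priority.  An
     * adaptive mutex lets a contending caller block instead of
     * spinning on the CPU at raised PIL.
     */
    mutex_init(&ep->de_lock, NULL, MUTEX_DRIVER, NULL);
    cv_init(&ep->de_cv, NULL, CV_DRIVER, NULL);
    ep->de_signaled = B_FALSE;
}

static int
demo_event_wait(demo_event_t *ep, clock_t tmo_usec)
{
    clock_t deadline = ddi_get_lbolt() + drv_usectohz(tmo_usec);
    int rc = 0;

    mutex_enter(&ep->de_lock);
    while (!ep->de_signaled && rc == 0) {
        /* <= 0: timed out (-1) or interrupted by a signal (0); give up. */
        if (cv_timedwait_sig(&ep->de_cv, &ep->de_lock, deadline) <= 0)
            rc = -1;
    }
    ep->de_signaled = B_FALSE;
    mutex_exit(&ep->de_lock);
    return (rc);
}

static void
demo_event_signal(demo_event_t *ep)
{
    /* With an adaptive lock this can block briefly instead of spinning. */
    mutex_enter(&ep->de_lock);
    ep->de_signaled = B_TRUE;
    cv_signal(&ep->de_cv);
    mutex_exit(&ep->de_lock);
}

Whether a NULL interrupt block cookie is appropriate here, given that the signal side also runs from e1000g's interrupt via vboxNetFltSolarisRecv(), is something I'd defer to the vbox folks.
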
>
> Jason King wrote:
>>
>> On Thu, Sep 17, 2009 at 11:55 AM, Dana H. Myers <[email protected]> wrote:
>>>
>>> Kerry Shu wrote:
>>>>
>>>> What you are looking for is 0x10, not 0x0a. Looks to me like here you have an
>>>> IRQ# 16 interrupt (might be either hci1394#0, uhci#3, uhci#0, or
>>>> nvidia#0) preempting the e1000g#0 interrupt. I guess such a situation happened
>>>> frequently, since you felt the system freeze. So are you running something
>>>> that keeps both e1000g0 and the other 4 driver instances at IRQ# 16 busy? For
>>>> example, are you putting heavy load on both the network and graphics?
>>>
>>> I don't understand what the VirtualBox/bridged-Ethernet configuration
>>> does here, but one comment about xHCIs - they tend to interrupt
>>> regularly at a reasonable rate, IIRC, even without a heavy
>>> load. Graphics interrupts periodically but not that often, I
>>> believe tied to vertical retrace/frame rate, which is 50-100 per
>>> second.
>>
>> Could it be possible that it's stuck in the e1000g ISR and the IRQ
>> 16 preemption is just a red herring?
>>
>> I'm not really familiar with the OpenSolaris interrupt implementation,
>> so I'm making some assumptions here (corrections are welcome) ---
>> looking at the stack, I'm guessing the preemption by IRQ 16 starts at
>> the _interrupt call. Since xHCIs interrupt regularly, if the e1000g
>> ISR (or really the vboxfilt stream module sitting on top of e1000g) is
>> deadlocked on a mutex, isn't that what you'd see? I'm assuming the
>> mutex call prior to _interrupt isn't part of the preemption process.
>
> _______________________________________________
> driver-discuss mailing list
> [email protected]
> http://mail.opensolaris.org/mailman/listinfo/driver-discuss
