Actually, here is a totally different possibility. If any one of those
device drivers doesn't properly return DDI_INTR_UNCLAIMED when its device
isn't the one interrupting, then badness can ensue (it looks like a stuck
interrupt). It might be worth checking the interrupt handling code in each
of those drivers.
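
For illustration only, here is a minimal sketch of the check a handler on a
shared interrupt line is expected to make. It is not taken from any of the
drivers listed in this thread. The device structure, the status bit, and the
name mydev_intr are hypothetical, the DDI return codes are redefined locally
so the sketch compiles outside the kernel, and void * stands in for the
caddr_t arguments a real handler receives.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins for the DDI return codes (assumption: userland build). */
#define DDI_INTR_UNCLAIMED 0
#define DDI_INTR_CLAIMED   1

/* Hypothetical "interrupt pending" bit in a hypothetical status register. */
#define MYDEV_INTR_PENDING 0x1u

/* Hypothetical per-device soft state. */
typedef struct mydev {
    volatile uint32_t status;  /* models the device interrupt status register */
    unsigned handled;          /* number of interrupts this driver serviced */
} mydev_t;

/*
 * Interrupt handler sketch. On a shared line the handler is called for
 * every interrupt on that line, so it must first ask its own device
 * whether it is the source, and decline the interrupt if it is not.
 */
unsigned int
mydev_intr(void *arg1, void *arg2)
{
    mydev_t *dev = arg1;

    (void)arg2;
    assert(dev != NULL);

    /* Not our device: tell the framework so it can try the next handler. */
    if ((dev->status & MYDEV_INTR_PENDING) == 0)
        return (DDI_INTR_UNCLAIMED);

    /* Our device: acknowledge the interrupt, then service it. */
    dev->status &= ~MYDEV_INTR_PENDING;
    dev->handled++;
    return (DDI_INTR_CLAIMED);
}
```

The thing to look for in each driver is the early return: a handler that
skips the "is it mine?" test, or that returns the wrong code from it, is the
kind of bug described above.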
- Garrett
Kerry Shu wrote:
Jason King wrote:
On Thu, Sep 17, 2009 at 10:16 AM, Garrett D'Amore <[email protected]> wrote:
Look closely at the stack. You'll notice that a PIL9 interrupt
*interrupted* e1000g while it was servicing an interrupt. I don't think
e1000g is at fault here. Something else is doing it.
This is probably my lack of knowledge about how Solaris handles
interrupts, but after doing a little digging:
0xffffff0007c49c60::findstack -v
stack pointer for thread ffffff0007c49c60: ffffff0007c49b30
ffffff0007c49bb0 rm_isr+0xaa()
ffffff0007c49c00 av_dispatch_autovect+0x7c(10)
ffffff0007c49c40 dispatch_hardint+0x33(10, 6)
ffffff0007c4f450 switch_sp_and_call+0x13()
ffffff0007c4f4a0 do_interrupt+0x9e(ffffff0007c4f4b0, b)
ffffff0007c4f4b0 _interrupt+0xba()
I'm assuming this portion of the stack dump is what you're talking
about... looking at the function signature for dispatch_hardint -- the
new vector is 10, and the old ipl is 6.
::interrupts -d
IRQ  Vect  IPL  Bus  Trg  Type   CPU  Share  APIC/INT#  Driver Name(s)
3    0xb1  12   ISA  Edg  Fixed  0    1      0x0/0x3    asy#1
4    0xb0  12   ISA  Edg  Fixed  0    1      0x0/0x4    asy#0
6    0x41  5    ISA  Edg  Fixed  0    1      0x0/0x6    fdc#0
7    0x42  5    ISA  Edg  Fixed  1    1      0x0/0x7    ecpp#0
9    0x81  9    PCI  Lvl  Fixed  1    1      0x0/0x9    acpi_wrapper_isr
15   0x43  5    ISA  Edg  Fixed  0    1      0x0/0xf    ata#1
16   0x83  9    PCI  Lvl  Fixed  1    4      0x0/0x10   hci1394#0, uhci#3, uhci#0, nvidia#0
17   0x87  8    PCI  Lvl  Fixed  0    1      0x0/0x11   audio810#0
18   0x86  9    PCI  Lvl  Fixed  1    1      0x0/0x12   pci-ide#1
19   0x85  9    PCI  Lvl  Fixed  0    1      0x0/0x13   uhci#1
23   0x84  9    PCI  Lvl  Fixed  1    1      0x0/0x17   ehci#0
26   0x40  5    PCI  Lvl  Fixed  1    1      0x1/0x2    aac#0
48   0x60  6    PCI  Lvl  Fixed  1    1      0x2/0x0    e1000g#0
72   0x82  7    PCI  Edg  MSI    0    1      -          pcie_pci#0
73   0x30  4    PCI  Edg  MSI    0    1      -          pcie_pci#2
74   0x44  5    PCI  Edg  MSI    0    1      -          adpu320#0
160  0xa0  0         Edg  IPI    all  0      -          poke_cpu
192  0xc0  13        Edg  IPI    all  1      -          xc_serv
208  0xd0  14        Edg  IPI    all  1      -          kcpc_hw_overflow_intr
209  0xd1  14        Edg  IPI    all  1      -          cbe_fire
210  0xd3  14        Edg  IPI    all  1      -          cbe_fire
240  0xe0  15        Edg  IPI    all  1      -          xc_serv
241  0xe1  15        Edg  IPI    all  1      -          apic_error_intr
That makes sense -- e1000g#0 is at IPL 6. However, shouldn't there then be
an entry somewhere in there with a Vect value of 0x0a and an IPL of 9?
Or do I still have more learning to do?
What you are looking for is 0x10, not 0x0a: mdb prints the stack
arguments in hex, so the 10 is decimal 16. It looks to me like you have
an IRQ 16 interrupt (from hci1394#0, uhci#3, uhci#0, or nvidia#0)
preempting the e1000g#0 interrupt. I guess this happens frequently, since
you noticed the system freezing. So are you running something that keeps
both e1000g0 and the four driver instances on IRQ 16 busy? For example,
are you putting heavy load on both the network and graphics?
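
As a rough sketch of that lookup: the code below reads the argument from the
stack trace as hex, finds the matching IRQ among a few of the ::interrupts -d
rows quoted above, and checks that its IPL is high enough to preempt
e1000g#0 at IPL 6. It follows the reading above that the argument is an IRQ
number. The type and function names are made up for this example, and the
preemption test is a simplification (a higher IPL interrupts a lower one).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* One row of ::interrupts -d output, reduced to the columns needed here. */
typedef struct irq_entry {
    unsigned irq;          /* IRQ column (decimal in the table) */
    unsigned ipl;          /* IPL column */
    const char *drivers;   /* Driver Name(s) column */
} irq_entry_t;

/* A subset of the rows from the table quoted earlier in this thread. */
static const irq_entry_t irq_table[] = {
    { 9,  9, "acpi_wrapper_isr" },
    { 16, 9, "hci1394#0, uhci#3, uhci#0, nvidia#0" },
    { 17, 8, "audio810#0" },
    { 48, 6, "e1000g#0" },
};

/*
 * Interpret a stack-trace argument such as "10" as hex and return the
 * table row with that IRQ number, or NULL if no row matches.
 */
const irq_entry_t *
lookup_irq_from_mdb_arg(const char *hex_arg)
{
    assert(hex_arg != NULL);

    unsigned long irq = strtoul(hex_arg, NULL, 16);

    for (size_t i = 0; i < sizeof (irq_table) / sizeof (irq_table[0]); i++) {
        if (irq_table[i].irq == irq)
            return (&irq_table[i]);
    }
    return (NULL);
}

/* Simplified rule: an interrupt preempts work running at a lower IPL. */
int
can_preempt(unsigned new_ipl, unsigned cur_ipl)
{
    return (new_ipl > cur_ipl);
}
```

With the argument "10" this finds the IRQ 16 row at IPL 9, which can preempt
IPL 6. Reading the same digits as decimal 10 (hex 0x0a) matches no row,
which is why the 0x0a search came up empty.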
Regards,
Kerry
_______________________________________________
driver-discuss mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/driver-discuss