Re: [driver-discuss] Am I understanding this correctly? -- potential e1000g bug

Garrett D'Amore Thu, 17 Sep 2009 09:52:00 -0700

Actually, a totally different possibility. If any one of those device drivers doesn't properly return DDI_INTR_UNCLAIMED when it isn't the one interrupting, then badness can ensue (looks like a stuck interrupt). Might be worth checking the driver code for interrupt handling for each of them.
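
To make the point about shared lines concrete, here is a minimal userland sketch, not real driver code. Everything prefixed sketch_ or SKETCH_ is hypothetical and invented for illustration: the two return codes stand in for DDI_INTR_UNCLAIMED and DDI_INTR_CLAIMED, and the status field stands in for a device's interrupt-status register. It assumes a dispatcher that calls every handler registered on the line and counts the claims.

```c
#include <stddef.h>

/* Stand-ins for the DDI return codes (assumption: a real driver would
 * use DDI_INTR_UNCLAIMED / DDI_INTR_CLAIMED from the DDI headers). */
#define SKETCH_INTR_UNCLAIMED 0
#define SKETCH_INTR_CLAIMED   1

/* Hypothetical bit meaning "this device is asserting its interrupt". */
#define SKETCH_STATUS_PENDING 0x1u

/* Hypothetical per-device state; status models a hardware register. */
struct sketch_dev {
    unsigned int status;
};

typedef int (*sketch_handler_t)(struct sketch_dev *);

/* Well-behaved handler: claim only if our own device is interrupting. */
int
sketch_isr(struct sketch_dev *dev)
{
    if ((dev->status & SKETCH_STATUS_PENDING) == 0)
        return (SKETCH_INTR_UNCLAIMED);     /* not ours, let others look */

    dev->status &= ~SKETCH_STATUS_PENDING;  /* acknowledge the device */
    return (SKETCH_INTR_CLAIMED);
}

/* Misbehaving handler: claims without checking its device at all. */
int
sketch_bad_isr(struct sketch_dev *dev)
{
    (void) dev;
    return (SKETCH_INTR_CLAIMED);
}

/* Call every handler sharing the line; return how many claimed it. */
int
sketch_dispatch(sketch_handler_t *handlers, struct sketch_dev *devs, size_t n)
{
    int claimed = 0;

    for (size_t i = 0; i < n; i++) {
        if (handlers[i](&devs[i]) == SKETCH_INTR_CLAIMED)
            claimed++;
    }
    return (claimed);
}
```

With two well-behaved handlers and only the second device pending, sketch_dispatch() reports one claim and a second pass reports zero. Swap in sketch_bad_isr() and the line is reported as claimed even when no device was pending, so the claim count no longer says anything about whether a device was actually serviced.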


   - Garrett


Kerry Shu wrote:



Jason King wrote:

On Thu, Sep 17, 2009 at 10:16 AM, Garrett D'Amore <[email protected]> wrote:

Look closely at the stack.  You'll notice that a PIL9 interrupt
*interrupted* e1000g while it was servicing an interrupt. I don't think
e1000g is at fault here.  Something else is doing it.


This is probably my lack of knowledge about how Solaris handles
interrupts, but with doing a little digging:

 0xffffff0007c49c60::findstack -v

stack pointer for thread ffffff0007c49c60: ffffff0007c49b30
  ffffff0007c49bb0 rm_isr+0xaa()
  ffffff0007c49c00 av_dispatch_autovect+0x7c(10)
  ffffff0007c49c40 dispatch_hardint+0x33(10, 6)
  ffffff0007c4f450 switch_sp_and_call+0x13()
  ffffff0007c4f4a0 do_interrupt+0x9e(ffffff0007c4f4b0, b)
  ffffff0007c4f4b0 _interrupt+0xba()

I'm assuming this portion of the stack dump is what you're talking
about... looking at the function signature for dispatch_hardint -- the
new vector is 10, and the old ipl is 6.

::interrupts -d

IRQ  Vect IPL Bus    Trg Type   CPU Share APIC/INT# Driver Name(s)
3    0xb1 12  ISA    Edg Fixed  0   1     0x0/0x3   asy#1
4    0xb0 12  ISA    Edg Fixed  0   1     0x0/0x4   asy#0
6    0x41 5   ISA    Edg Fixed  0   1     0x0/0x6   fdc#0
7    0x42 5   ISA    Edg Fixed  1   1     0x0/0x7   ecpp#0
9    0x81 9   PCI    Lvl Fixed  1   1     0x0/0x9   acpi_wrapper_isr
15   0x43 5   ISA    Edg Fixed  0   1     0x0/0xf   ata#1

16   0x83 9   PCI    Lvl Fixed  1   4     0x0/0x10  hci1394#0, uhci#3, uhci#0, nvidia#0
17   0x87 8   PCI    Lvl Fixed  0   1     0x0/0x11  audio810#0
18   0x86 9   PCI    Lvl Fixed  1   1     0x0/0x12  pci-ide#1
19   0x85 9   PCI    Lvl Fixed  0   1     0x0/0x13  uhci#1
23   0x84 9   PCI    Lvl Fixed  1   1     0x0/0x17  ehci#0
26   0x40 5   PCI    Lvl Fixed  1   1     0x1/0x2   aac#0
48   0x60 6   PCI    Lvl Fixed  1   1     0x2/0x0   e1000g#0
72   0x82 7   PCI    Edg MSI    0   1     -         pcie_pci#0
73   0x30 4   PCI    Edg MSI    0   1     -         pcie_pci#2
74   0x44 5   PCI    Edg MSI    0   1     -         adpu320#0
160  0xa0 0          Edg IPI    all 0     -         poke_cpu
192  0xc0 13         Edg IPI    all 1     -         xc_serv

208  0xd0 14         Edg IPI    all 1     -         kcpc_hw_overflow_intr

209  0xd1 14         Edg IPI    all 1     -         cbe_fire
210  0xd3 14         Edg IPI    all 1     -         cbe_fire
240  0xe0 15         Edg IPI    all 1     -         xc_serv
241  0xe1 15         Edg IPI    all 1     -         apic_error_intr

That makes sense -- e1000g#0 is IPL 6, however shouldn't there then be
an entry somewhere in there with a VECT value of 0x0a and an IPL of 9?
Or do I still have more learning to do?


What you are looking for is 0x10, not 0x0a; mdb prints the stack
arguments in hex. It looks to me as if an IRQ 16 interrupt (from any of
hci1394#0, uhci#3, uhci#0, or nvidia#0) is preempting the e1000g#0
interrupt. I guess this happened frequently, since you noticed the
system freezing. So are you running something that keeps both e1000g0
and the other four driver instances on IRQ 16 busy? For example, are
you putting heavy load on both network and graphics?
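
The lookup above can be sketched in a few lines of C. This is only an illustration: the sketch_ names are hypothetical, and the small table copies just two rows (IRQ 16 and IRQ 48) from the ::interrupts output quoted earlier. It assumes, as in the reading above, that the first argument to dispatch_hardint in the stack is an IRQ number printed in hex.

```c
#include <stddef.h>
#include <stdlib.h>

/* One row of the ::interrupts output (only the columns used here). */
struct sketch_irq_row {
    long irq;             /* IRQ column, decimal */
    int ipl;              /* IPL column */
    int share;            /* Share column: handlers on this line */
    const char *drivers;  /* Driver Name(s) column */
};

/* Two rows copied from the output quoted in this thread. */
static const struct sketch_irq_row sketch_rows[] = {
    { 16, 9, 4, "hci1394#0, uhci#3, uhci#0, nvidia#0" },
    { 48, 6, 1, "e1000g#0" },
};

/* Stack arguments are printed as bare hex, so "10" means 0x10 (16). */
long
sketch_parse_mdb_arg(const char *text)
{
    return (strtol(text, NULL, 16));
}

/* Find the table row for an IRQ, or NULL if it is not in the table. */
const struct sketch_irq_row *
sketch_find_irq(long irq)
{
    size_t n = sizeof (sketch_rows) / sizeof (sketch_rows[0]);

    for (size_t i = 0; i < n; i++) {
        if (sketch_rows[i].irq == irq)
            return (&sketch_rows[i]);
    }
    return (NULL);
}
```

Feeding the "10" from dispatch_hardint(10, 6) through sketch_parse_mdb_arg() gives 16, and sketch_find_irq(16) lands on the IPL 9 line shared by four driver instances, which is higher than the IPL 6 that e1000g#0 runs at.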

Regards,
Kerry
_______________________________________________
driver-discuss mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/driver-discuss

