Trying to debug interrupt flood after unbind

Rob Groner Tue, 31 May 2016 11:58:11 -0700

I am trying to load a driver for an Exar serial chip, but that chip is gobbled 
up by the 8250 driver on boot.  So, I use the "unbind" file in 
/sys/bus/pci/drivers/serial to remove the device from the clutches of 8250.  
Based on cobbled-together Google searches, I use the following, run from that 
directory, to unbind it (assuming the address listed in 
/sys/bus/pci/drivers/serial is 0000:04:00.0):


echo -n "0000:04:00.0" | sudo tee ./unbind

The address disappears from that dir when I do this command, so I'm assuming it 
works.
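
A minimal sketch of a more direct check than watching the directory listing, 
assuming the usual sysfs layout in which a bound PCI device has a "driver" 
symlink under /sys/bus/pci/devices/<address>/. The function name bound_driver 
is hypothetical, and the second argument exists only so the check can be 
pointed at a different directory:

```shell
# bound_driver is a hypothetical helper, not part of any existing tool.
# $1 = PCI address (e.g. 0000:04:00.0)
# $2 = PCI device directory (assumed default: /sys/bus/pci/devices)
bound_driver() {
    link="${2:-/sys/bus/pci/devices}/$1/driver"
    if [ -L "$link" ]; then
        # The symlink target's last path component is the driver name.
        basename "$(readlink "$link")"
    else
        echo "unbound"
    fi
}
```

Running "bound_driver 0000:04:00.0" before and after the unbind should show 
whether the device really left the "serial" driver.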

I then go and install the Exar-provided driver.  Within about 3 seconds, I get 
a system notification that the IRQ used by the Exar driver has been disabled.  
I can also look at /proc/interrupts and see a huge number of interrupts 
arriving on that IRQ.
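
To put a number on "huge", here is a rough sketch that sums the per-CPU 
counters for one IRQ line in /proc/interrupts and samples them one second 
apart. The function names irq_count and irq_rate are hypothetical, and the 
parsing assumes the usual layout of an "N:" label followed by one counter 
column per CPU:

```shell
# irq_count and irq_rate are hypothetical helpers.
# irq_count: $1 = IRQ number, $2 = file in /proc/interrupts format
#            (assumed default: /proc/interrupts)
irq_count() {
    awk -v irq="$1:" '
        $1 == irq {
            total = 0
            # Per-CPU counters are the leading all-digit fields after the label.
            for (i = 2; i <= NF && $i ~ /^[0-9]+$/; i++) total += $i
            printf "%.0f\n", total
            found = 1
        }
        END { if (!found) exit 1 }
    ' "${2:-/proc/interrupts}"
}

# irq_rate: $1 = IRQ number; prints the number of interrupts seen in one second.
irq_rate() {
    before=$(irq_count "$1") || return 1
    sleep 1
    after=$(irq_count "$1") || return 1
    echo $((after - before))
}
```

For example, "irq_rate 17" run right after the driver is inserted would give 
an approximate interrupts-per-second figure for IRQ 17.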

The kernel log shows this warning and backtrace:

[  167.938861] irq 17: nobody cared (try booting with the "irqpoll" option)
[  167.938868] CPU: 0 PID: 801 Comm: Xorg Tainted: G           O  3.16.6-2-desktop #1
[  167.938871] Hardware name: RTD Embedded Technologies, Inc CMA34CR/CMA34CR, BIOS v3.72.51.0009-1.1.85582 02/09/2015 09:39:58
[  167.938873]  ffff880148672cc4 ffffffff8161ab03 ffff880148672c00 ffffffff810b8acd
[  167.938877]  ffff880148672c00 0000000000000011 0000000000000000 ffffffff810b9011
[  167.938880]  0000000000000000 0000000000000000 0000000000000011 0000000000000000
[  167.938883] Call Trace:
[  167.938899]  [<ffffffff8100519e>] dump_trace+0x8e/0x350
[  167.938905]  [<ffffffff81005506>] show_stack_log_lvl+0xa6/0x190
[  167.938909]  [<ffffffff81006c01>] show_stack+0x21/0x50
[  167.938914]  [<ffffffff8161ab03>] dump_stack+0x49/0x6a
[  167.938922]  [<ffffffff810b8acd>] __report_bad_irq+0x2d/0xc0
[  167.938928]  [<ffffffff810b9011>] note_interrupt+0x241/0x290
[  167.938935]  [<ffffffff810b67f1>] handle_irq_event_percpu+0xa1/0x1d0
[  167.938940]  [<ffffffff810b695e>] handle_irq_event+0x3e/0x60
[  167.938945]  [<ffffffff810b9b58>] handle_fasteoi_irq+0x88/0x160
[  167.938949]  [<ffffffff810050fd>] handle_irq+0x1d/0x30
[  167.938955]  [<ffffffff81624549>] do_IRQ+0x49/0xe0
[  167.938959]  [<ffffffff816224ad>] common_interrupt+0x6d/0x6d
[  167.938967]  [<ffffffff81620dce>] _raw_spin_unlock_irqrestore+0xe/0x30
[  167.938974]  [<ffffffff815c3f05>] unix_poll+0x25/0xb0
[  167.938980]  [<ffffffff81513fa9>] sock_poll+0x49/0x110
[  167.938986]  [<ffffffff811caf40>] do_select+0x390/0x7a0
[  167.938991]  [<ffffffff811cb4e4>] core_sys_select+0x194/0x2b0
[  167.938995]  [<ffffffff811cb6aa>] SyS_select+0xaa/0xf0
[  167.938999]  [<ffffffff8162182d>] system_call_fastpath+0x1a/0x1f
[  167.939015]  [<00007f7563abda43>] 0x7f7563abda42
[  167.939016] handlers:
[  167.939020] [<ffffffffa0549110>] serialxr_interrupt [xr17v35x]
[  167.939023] Disabling IRQ #17


So, as near as I can tell, when the Exar driver is inserted, an interrupt flood 
occurs, and the Exar driver (the only interrupt handler on that IRQ) does not 
respond to any of them.  I put in some debug code and verified that the Exar 
interrupt handler is called... but the handler just returns with an IRQ_NONE 
value.
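
Since the handler returns IRQ_NONE, one thing worth ruling out is another PCI 
device wired to the same line with no driver attached to service it. A rough 
sketch, assuming each entry under /sys/bus/pci/devices exposes an "irq" 
attribute and a "driver" symlink when bound (with MSI in use the "irq" value 
may not reflect the legacy line). The function name devices_on_irq is 
hypothetical:

```shell
# devices_on_irq is a hypothetical helper.
# $1 = IRQ number
# $2 = PCI device directory (assumed default: /sys/bus/pci/devices)
devices_on_irq() {
    dir="${2:-/sys/bus/pci/devices}"
    for dev in "$dir"/*; do
        [ -r "$dev/irq" ] || continue
        if [ "$(cat "$dev/irq")" = "$1" ]; then
            if [ -L "$dev/driver" ]; then
                drv=$(basename "$(readlink "$dev/driver")")
            else
                drv="(no driver)"
            fi
            # One line per device: PCI address, then bound driver if any.
            echo "$(basename "$dev") $drv"
        fi
    done
}
```

"devices_on_irq 17" would then list every PCI device reporting IRQ 17, along 
with whichever driver (if any) is bound to it.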

I've tried:

1) Multiple CPUs from different families (Core i7, Core 2 Duo, AMD G-Series), 
   and it occurs with all of them.

2) Kernels 3.16 and 4.2, and it occurs with both of them.

3) Disabling ModemManager, and it still happens.

4) Contacting Exar about this.  They could not reproduce the problem.

5) openSUSE 13.2 and Fedora 20.  It happened with openSUSE, but NOT with 
   Fedora 20.

6) The reference Exar implementation vs. our implementation, and it occurs 
   with both of them.

Do you have any suggestions on how I can discover what is sending all of those 
interrupts?  Are there kernel tools specifically for that?

Thank you.


Rob Groner

