Re: [PATCH] PCI: Add Broadcom 4331 reset quirk to prevent IRQ storm
Thank you very much for your comments in your reply.

Actually the patch did work: I confirmed it was run and that the iomap call was successful by adding a pr_info() after the pci_iomap() success branch.

The only time I get the "irq 17: nobody cared" message is on suspend / resume. A fresh boot always stayed below the 100k interrupt threshold. I tried your new patch and the number is even lower, roughly 30,000 or fewer per boot. BUT on suspend / resume it was again 126856.

Do you have any insights on fixing the suspend to disk / resume paths, which presumably face the same issue of being passed live hardware on boot up?

On 13 April 2016 at 04:32, Lukas Wunner wrote:
> Hi Andrew,
>
> thank you for the extensive testing.
>
> On Sun, Apr 10, 2016 at 08:09:29PM +1000, Andrew Worsley wrote:
>> Further testing Broadcom 4331 reset quirk to prevent IRQ storm patch
>> testing reveals that:
>> 1. quirk is run on initial boot up and this time appears to have
>> vastly reduced the interrupts (only 81 this time):
>> cat /proc/interrupts| grep 17
>> 17: 81 0 0 0 0 0
>> 0 0 IO-APIC-fasteoi snd_hda_intel
>
> Something in the ballpark of 81 interrupt requests is fine.
>
> The kernel will print the error message about spurious interrupts and
> switch to polling at 100000 requests. But even 20000 is way too much.
> This just means that b43 loaded quickly enough to stop the interrupts
> before the kernel limit of 100000 was reached, but the wireless card
> wasn't reset early on as it should have been.
>
> It looks like the patch didn't work at all on your machine for some
> reason. Do you see a message "cannot iomap device, IRQ storm ahead"
> in dmesg?
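
For readers following the threshold discussion above, the accounting Lukas describes can be modelled roughly as below. This is only a simplified sketch, not the kernel code: the class and function names are hypothetical, and the constants (a window of 100000 interrupts, more than 99900 of them unhandled) are assumed from the "nobody cared" check in kernel/irq/spurious.c. The real code has further special cases that are left out here.

```python
from dataclasses import dataclass

WINDOW = 100_000          # interrupts per accounting window (assumed, see lead-in)
UNHANDLED_LIMIT = 99_900  # unhandled interrupts per window that trip the check


@dataclass
class IrqModel:
    """Toy model of spurious-interrupt accounting for a single IRQ line."""
    count: int = 0
    unhandled: int = 0
    disabled: bool = False

    def note_interrupt(self, handled: bool) -> None:
        if self.disabled:
            return  # line already switched to polling
        self.count += 1
        if not handled:
            self.unhandled += 1
        if self.count < WINDOW:
            return
        # End of a window: decide whether the line looks stuck.
        if self.unhandled > UNHANDLED_LIMIT:
            self.disabled = True  # corresponds to "Disabling IRQ #n"
        self.count = 0
        self.unhandled = 0


def storm(n_unhandled: int) -> IrqModel:
    """Feed n_unhandled interrupts that no handler claims."""
    irq = IrqModel()
    for _ in range(n_unhandled):
        irq.note_interrupt(handled=False)
    return irq
```

Under these assumptions a burst of about 30000 unclaimed interrupts never completes a window, so the line stays enabled, while a burst of 126856 crosses the 100000 mark and gets the line disabled.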
Result from two reboots with my 3.16 kernel and your new patch.

Three full boots (all around 30k interrupts or below):

17: 23978 0 0 0 0 0 0 0 IO-APIC-fasteoi snd_hda_intel
17: 30088 0 0 0 0 0 0 0 IO-APIC-fasteoi snd_hda_intel
17: 26853 0 0 0 0 0 0 0 IO-APIC-fasteoi snd_hda_intel

dmesg output showing the quirk running:

dmesg | grep -C 5 quirk
[3.270315] pci :00:1c.0: PCI bridge to [bus 03]
[3.270323] pci :00:1c.0: bridge window [mem 0xc1a0-0xc1af]
[3.270331] pci :00:1c.0: bridge window [mem 0xc180-0xc18f 64bit pref]
[3.270463] pci :04:00.0: [14e4:4331] type 00 class 0x028000
[3.270495] pci :04:00.0: reg 0x10: [mem 0xc190-0xc1903fff 64bit]
[3.270574] pci :04:00.0: b43 quirk: resetting controller
[3.270711] pci :04:00.0: supports D1 D2
[3.270712] pci :04:00.0: PME# supported from D0 D3hot D3cold
[3.270759] pci :04:00.0: System wakeup disabled by ACPI
[3.278239] pci :00:1c.1: PCI bridge to [bus 04]
[3.278251] pci :00:1c.1: bridge window [mem 0xc190-0xc19f]

Output after resume. Note: sometimes it looks like it can happen on the suspend to disk, but a new one is always present after the resume.

17: 126856 0 0 0 0 0 0 0 IO-APIC-fasteoi snd_hda_intel

[ 53.404157] xhci_hcd :00:14.0: xHCI xhci_drop_endpoint called with disabled ep 88045d495540
[ 53.468249] irq 17: nobody cared (try booting with the "irqpoll" option)
[ 53.468253] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G C O 3.16.7-ckt25-3.16-bcm4331-patch2 #7
[ 53.468254] Hardware name: Apple Inc. MacBookPro10,1/Mac-C3EC7CD22292981F, BIOS MBP101.88Z.00EE.B00.1205101839 05/10/2012
[ 53.468259] 81520370 88045a8a8c00 88045a8a8cc4
[ 53.468262] 810bfe7d 88045a8a8c00 0011
[ 53.468264] 810c022f 0011
[ 53.468265] Call Trace:
[ 53.468275] [] ? dump_stack+0x5d/0x78
[ 53.468282] [] ? __report_bad_irq+0x2d/0xd0
[ 53.468286] [] ? note_interrupt+0x25f/0x2b0
[ 53.468290] [] ? handle_irq_event_percpu+0x121/0x190
[ 53.468294] [] ? handle_irq_event+0x38/0x50
[ 53.468296] [] ? handle_fasteoi_irq+0x7f/0x150
[ 53.468302] [] ? handle_irq+0x1d/0x30
[ 53.468307] [] ? do_IRQ+0x48/0xe0
[ 53.468311] [] ? common_interrupt+0x6d/0x6d
[ 53.468317] [] ? cpuidle_enter_state+0x4c/0xc0
[ 53.468320] [] ? cpuidle_enter_state+0x42/0xc0
[ 53.468323] [] ? cpu_startup_entry+0x33a/0x460
[ 53.468326] [] ? start_kernel+0x473/0x47b
[ 53.468331] [] ? early_idt_handler_array+0x120/0x120
[ 53.468335] [] ? x86_64_start_kernel+0x14d/0x15c
[ 53.468336] handlers:
[ 53.468367] [] azx_interrupt [snd_hda_controller]
[ 53.468368] Disabling IRQ #17
[ 53.513740] usb 3-1: reset high-speed USB device
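
A side note on collecting these counts: a plain "grep 17" on /proc/interrupts matches any line that merely contains "17", so an unrelated line can slip in. A small parser that matches the IRQ label exactly avoids that. The following is only a sketch under assumptions: the function name is hypothetical, and the sample text is made up to resemble the layout shown in this thread rather than taken from a real machine.

```python
from typing import List


def irq_counts(text: str, irq: str) -> List[int]:
    """Return the per-CPU counts of the line whose label is exactly `irq`."""
    for line in text.splitlines():
        label, sep, rest = line.partition(":")
        if not sep or label.strip() != irq:
            continue  # header line, or a different IRQ
        counts = []
        for field in rest.split():
            if not field.isdigit():
                break  # reached the chip / handler description
            counts.append(int(field))
        return counts
    return []


# Illustrative input only: two CPUs, one device line and the local timer line.
SAMPLE = """\
           CPU0       CPU1
 17:      26853          0   IO-APIC-fasteoi   snd_hda_intel
LOC:       1157       1171   Local timer interrupts
"""

total_irq17 = sum(irq_counts(SAMPLE, "17"))
```

On a live system the text would come from reading /proc/interrupts; the LOC line in the sample contains "17" inside its counts, which is exactly the kind of line a bare grep would also print.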
Re: [PATCH] PCI: Add Broadcom 4331 reset quirk to prevent IRQ storm
Further testing of the Broadcom 4331 reset quirk to prevent IRQ storm patch reveals that:

1. The quirk is run on initial boot up and this time appears to have vastly reduced the interrupts (only 81 this time):

cat /proc/interrupts| grep 17
17: 81 0 0 0 0 0 0 0 IO-APIC-fasteoi snd_hda_intel

2. But it is apparently *NOT* run after a suspend/resume and we get the problem:

17: 100084 0 0 0 0 0 0 0 IO-APIC-fasteoi snd_hda_intel

Rebooting a further nine times shows the low number (below 100) only happens around 1/3 of the time:

boot #2
17: 38706 0 0 0 0 0 0 0 IO-APIC-fasteoi snd_hda_intel
boot #3
17: 87 0 0 0 0 0 0 0 IO-APIC-fasteoi snd_hda_intel
LOC: 2494 2031 2094 1831 1157 1171 1573 1271 Local timer interrupts
boot #4
17: 50616 0 0 0 0 0 0 0 IO-APIC-fasteoi snd_hda_intel
boot #5
17: 26454 0 0 0 0 0 0 0 IO-APIC-fasteoi snd_hda_intel
boot #6
17: 34440 0 0 0 0 0 0 0 IO-APIC-fasteoi snd_hda_intel
boot #7
17: 79 0 0 0 0 0 0 0 IO-APIC-fasteoi snd_hda_intel
boot #8
17: 84 0 0 0 0 0 0 0 IO-APIC-fasteoi snd_hda_intel
boot #9
17: 37054 0 0 0 0 0 0 0 IO-APIC-fasteoi snd_hda_intel
boot #10
17: 24648 0 0 0 0 0 0 0 IO-APIC-fasteoi snd_hda_intel

Is there an easy setpci command to stop this that we can add to grub?

Presently I have a grub workaround for a black screen, as described here:

http://askubuntu.com/questions/264247/proprietary-nvidia-drivers-with-efi-on-mac-to-prevent-overheating/613573#613573

which basically involves adding a grub scriptlet to enable PCI-E bus mastering on graphics cards.

In /etc/grub.d/01_enable_vga.conf:

setpci -s "00:01.0" 3e.b=8
setpci -s "01:00.0" 04.b=7

Can we do some similar magic setpci commands to disable 04:00.0, which is my BCM4331?

lspci | grep 4331
04:00.0 Network controller: Broadcom Corporation BCM4331 802.11a/b/g/n (rev 02)

On 7 April 2016 at 22:04, Andrew Worsley wrote:
> Sorry but testing the patch shows no difference.
>
> I have just compiled debian jessie kernel 3.16.7-ckt25 and booted it
> and hibernated it twice, then did the same with your patch applied.
> There appeared to be no difference

Thanks for any suggestions

Andrew
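
To make the two setpci writes in the scriptlet less magic, here is a small sketch that decodes them. It assumes the standard PCI configuration header layout (offset 0x04 is the command register, offset 0x3e on a bridge is the bridge control register); the helper names are hypothetical. It only explains the values used for the graphics workaround and does not show that an equivalent write would quieten the BCM4331.

```python
# Low byte of the PCI command register (config offset 0x04).
COMMAND_BITS = {
    0x01: "I/O space",
    0x02: "memory space",
    0x04: "bus master",
}

# Low byte of the bridge control register (config offset 0x3e on a bridge).
BRIDGE_CONTROL_BITS = {
    0x08: "VGA enable",
}


def decode(value: int, names: dict) -> list:
    """List the named bits that are set in value, lowest bit first."""
    return [name for bit, name in sorted(names.items()) if value & bit]


def without_bus_master(command: int) -> int:
    """Return the command byte with the bus master bit cleared."""
    return command & ~0x04 & 0xFF


gpu_command = decode(0x07, COMMAND_BITS)        # setpci -s "01:00.0" 04.b=7
bridge_ctl = decode(0x08, BRIDGE_CONTROL_BITS)  # setpci -s "00:01.0" 3e.b=8
```

So 04.b=7 turns on I/O decoding, memory decoding and bus mastering, and 3e.b=8 sets VGA enable on the bridge. Clearing the bus master bit, as without_bus_master() does on a plain integer, would be the inverse of the first write; whether that is enough to stop the interrupts from the wireless card is not established in this thread.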
Re: [PATCH] PCI: Add Broadcom 4331 reset quirk to prevent IRQ storm
Sorry but testing the patch shows no difference.

I have just compiled debian jessie kernel 3.16.7-ckt25 and booted it and hibernated it twice, then did the same with your patch applied. There appeared to be no difference.

On first boot I didn't get the "nobody cared" IRQ disabling problem, but after each hibernate I did. I did get 51130 IRQ 17 interrupts on the first boot, and after each hibernate restore I got 100000 extra interrupts in /proc/interrupts and the "irq 17: nobody cared" message.

I could not see any difference with or without the patch.

I boot with grub-efi using the linux/initrd commands.

So perhaps the hibernate-restore needs the fix?

Andrew

On 3 April 2016 at 21:49, Lukas Wunner wrote:
> Hi Andrew,
>
> On Sat, Apr 02, 2016 at 10:40:41PM +1100, Andrew Worsley wrote:
>> On 30 March 2016 at 04:41, Lukas Wunner wrote:
>> > Broadcom 4331 wireless cards built into Apple Macs unleash an IRQ storm
>> > on boot until they are reset, causing spurious interrupts if the IRQ is
>> > shared. Apparently the EFI bootloader enables the device and does not
>> > disable it before passing control to the OS. The bootloader contains a
>> > driver for the wireless card which allows it to phone home to Cupertino.
>> > This is used for Internet Recovery (download and install OS X images)
>> > and probably also for Back to My Mac (remote access, RFC 6281) and to
>> > discover stolen hardware.
>> >
>> > The issue is most pronounced on 2011 and 2012 MacBook Pros where the IRQ
>> > is shared with 3 other devices (Light Ridge Thunderbolt controller, SDXC
>> > reader, HDA card on discrete GPU). As soon as an interrupt handler is
>> > installed for one of these devices, the ensuing storm of spurious IRQs
>> > causes the kernel to disable the IRQ and switch to polling. This lasts
>> > until the b43 driver loads and resets the device.
>> >
>> > Loading the b43 driver first is not always an option, in particular with
>> > the Light Ridge Thunderbolt controller: The PCI hotplug IRQ handler gets
>> > installed early on because it is built in, unlike b43 which is usually
>> > a module.
>> >
>> > Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=79301
>> > Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=895951
>> > Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1009819
>> > Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1149632
>>
>> I do see an irq 17 problem on my macbook, but I thought grub is
>> supposed to stop the Broadcom wireless?
>>
>> Investigating grub2 git://git.savannah.gnu.org/grub.git I see this
>> patch rev 9d34bb8 which says it disables Broadcom wireless hardware
>> on Apples:
>
> Thanks for the pointer to the grub2 commit, I wasn't aware of that.
>
> The commit puts the wireless card in power state D3hot but that doesn't
> stop it from sending interrupts. I have just tested that. So it's
> perfectly plausible that you're still seeing spurious interrupts
> despite using grub. Please test the patch I've posted, the spurious
> interrupts should disappear. If you "cat /proc/interrupts", you should
> then only see a few hundred interrupts on IRQ 17. Without the patch it
> should be in the 100000+ range.
>
> Best regards,
>
> Lukas
>
>>
>> * commit 9d34bb8
>> | Author: Matthew Garrett
>> | Date: Thu May 3 17:26:55 2012 +0200
>> |
>> | Suspend broadcom cards in order to stop their DMA.
>> |
>> | * grub-core/Makefile.am (KERNEL_HEADER_FILES): Add pci.h on x86 EFI.
>> | * grub-core/Makefile.core.def (kernel): Add pci.c on x86 EFI.
>> | (pci): Don't build on x86 EFI.
>> | * grub-core/bus/pci.c (grub_pci_find_capability): New function.
>> | * grub-core/kern/efi/mm.c (stop_broadcom) [__i386__ || __x86_64__]:
>> | New function.
>> | (grub_efi_finish_boot_services) [__i386__ || __x86_64__]: Call
>> | stop_broadcom if running on EFI.
>> | * include/grub/pci.h (GRUB_PCI_CLASS_NETWORK): New enum value.
>> | (GRUB_PCI_CAP_POWER_MANAGEMENT): Likewise.
>> | (GRUB_PCI_VENDOR_BROADCOM): Likewise.
>> | (grub_pci_find_capability): New proto.
>> |
>> | Also-By: Vladimir Serbinenko
>> |
>> | M ChangeLog
>> | M grub-core/Makefile.am
>> | M grub-core/Makefile.core.def
>> | M grub-core/bus/pci.c
>> | M grub-core/kern/efi/mm.c
>> | M include/grub/pci.h
>>
>> But I run debian grub2-common 2.02~beta2-22+deb8u1 which has this fix &
Re: [PATCH] PCI: Add Broadcom 4331 reset quirk to prevent IRQ storm
That patch appears to be the grub 1 equivalent to grub2 git://git.savannah.gnu.org/grub.git rev 9d34bb8, which puts the network device into D3 power state.

I am running grub2 with that patch and it doesn't fix my irq 17 problem.

Can we not fix this in grub2 via something like Lukas's original patch, or disable any DMA transfers before the kernel starts?

Andrew

On 6 April 2016 at 05:59, Matthew Garrett wrote:
> On Tue, Apr 05, 2016 at 02:40:15PM -0500, Bjorn Helgaas wrote:
>
>> https://bugzilla.kernel.org/show_bug.cgi?id=111781 and
>> https://mjg59.dreamwidth.org/11235.html describe a sort of similar
>> issue, but with DMA. An interrupt from the device is probably to
>> signal a DMA completion, but these problem reports only mention the
>> "IRQ nobody cared" issue; I don't see anything about memory
>> corruption.
>
> I "fixed" this with
> https://github.com/mjg59/grub-fedora/commit/21fcd6d79b7601e4b20ad70c5408adff2dabbc1d
> - doing the same in the kernel EFI stub would probably be the best way
> to handle it. This way you're guaranteed to stop DMA before the kernel
> reclaims boot services memory, which guarantees you won't have any
> corruption.
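
For context on what "puts the network device into D3 power state" involves, the general mechanism can be sketched as follows: walk the PCI capability list to find the power management capability, then set the power state field of its control/status register to D3hot. This is a minimal sketch operating on a byte copy of configuration space, not the grub or kernel code. The function names and the sample configuration are hypothetical, and the offsets and masks are assumed from the standard PCI definitions. As reported elsewhere in this thread, D3hot alone did not stop the interrupts on the affected machines.

```python
PCI_STATUS = 0x06            # status register offset
PCI_STATUS_CAP_LIST = 0x10   # status bit: capability list present
PCI_CAPABILITY_LIST = 0x34   # offset of the first capability pointer
PCI_CAP_ID_PM = 0x01         # power management capability ID
PCI_PM_CTRL = 4              # PMCSR offset inside the PM capability
PCI_PM_CTRL_STATE_MASK = 0x0003
PCI_D3HOT = 3


def find_capability(cfg: bytes, cap_id: int) -> int:
    """Return the config-space offset of capability cap_id, or 0 if absent."""
    status = int.from_bytes(cfg[PCI_STATUS:PCI_STATUS + 2], "little")
    if not status & PCI_STATUS_CAP_LIST:
        return 0
    pos = cfg[PCI_CAPABILITY_LIST] & 0xFC
    for _ in range(48):  # bound the walk in case the list loops
        if pos < 0x40:
            return 0     # end of list (or a bogus pointer)
        if cfg[pos] == cap_id:
            return pos
        pos = cfg[pos + 1] & 0xFC
    return 0


def pmcsr_for_d3hot(pmcsr: int) -> int:
    """Return the PMCSR value with the power state field set to D3hot."""
    return (pmcsr & ~PCI_PM_CTRL_STATE_MASK & 0xFFFF) | PCI_D3HOT


def sample_config() -> bytearray:
    """Made-up 256-byte config space: MSI capability, then PM capability."""
    cfg = bytearray(256)
    cfg[PCI_STATUS] = PCI_STATUS_CAP_LIST
    cfg[PCI_CAPABILITY_LIST] = 0x40
    cfg[0x40:0x42] = bytes([0x05, 0x50])           # MSI, next at 0x50
    cfg[0x50:0x52] = bytes([PCI_CAP_ID_PM, 0x00])  # PM, end of list
    cfg[0x50 + PCI_PM_CTRL] = 0x08                 # D0, one unrelated bit set
    return cfg
```

On real hardware the new value would be written back to the device's configuration space at the capability offset plus PCI_PM_CTRL; the sketch stops short of that and only computes the value.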