Re: 157k interrupts per second causing 60% CPU load on idle system
On Sat, 14 Apr 2012 01:13:30 +0200, Matt Thyer matt.th...@gmail.com wrote:

On Apr 7, 2012 2:38 PM, Matt Thyer matt.th...@gmail.com wrote:

On 7 April 2012 14:31, Matt Thyer matt.th...@gmail.com wrote:

Since moving the SATA 3 disk to the onboard Intel SATA 2 controller I'm no longer having that disk evicted from the raidz2 pool with write errors, and I thought that the high interrupt rate issue had also been solved, but it's back again. This is on 8-STABLE at revision 230921 (before the new driver hit 8-STABLE). So now I need to go back to trying to determine what the cause is.

vmstat -i has shown that the issue was on irq 16. Unfortunately there seem to be a lot of things on irq 16:

$ dmesg | grep 'irq 16'
pcib1: PCI-PCI bridge irq 16 at device 1.0 on pci0
mps0: LSI SAS2008 port 0xee00-0xeeff mem 0xfbdfc000-0xfbdf,0xfbd8-0xfbdb irq 16 at device 0.0 on pci1
vgapci0: VGA-compatible display port 0xff00-0xff07 mem 0xfb40-0xfb7f,0xe000-0xefff irq 16 at device 2.0 on pci0
uhci0: UHCI (generic) USB controller port 0xfe00-0xfe1f irq 16 at device 26.0 on pci0
pcib2: ACPI PCI-PCI bridge irq 16 at device 28.0 on pci0
pcib3: ACPI PCI-PCI bridge irq 16 at device 28.4 on pci0
atapci0: JMicron JMB368 UDMA133 controller port 0xdf00-0xdf07,0xde00-0xde03,0xdd00-0xdd07,0xdc00-0xdc03,0xdb00-0xdb0f irq 16 at device 0.0 on pci3

Any idea how to isolate which bit of hardware could be triggering the interrupts? Unfortunately the only device I could remove would be the SuperMicro AOC-USAS2-L8i (so yes I could eliminate that). My biggest problem right now is not knowing how to trigger the issue. At this stage I'm going to upgrade to 9-STABLE and see if it returns.

The problem does not occur with 9-STABLE. Who knows what the problem was? USB maybe?

Do you still have the same hardware on the same interrupts on 9-STABLE? Are there changes in the use of MSI(-X)?
___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org
Re: 157k interrupts per second causing 60% CPU load on idle system
On Apr 15, 2012 6:27 PM, Ronald Klop ronald-freeb...@klop.yi.org wrote: The problem does not occur with 9-STABLE. Who knows what the problem was ? USB maybe ? Do you still have the same hardware on the same interrupts on 9-STABLE? Are there changes in the use of MSI(-X)? I made no hardware or BIOS changes and I'm running a GENERIC kernel in all testing. ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org
Re: 157k interrupts per second causing 60% CPU load on idle system
On Sun, 15 Apr 2012 15:09:34 +0200, Matt Thyer matt.th...@gmail.com wrote: On Apr 15, 2012 6:27 PM, Ronald Klop ronald-freeb...@klop.yi.org wrote: The problem does not occur with 9-STABLE. Who knows what the problem was ? USB maybe ? Do you still have the same hardware on the same interrupts on 9-STABLE? Are there changes in the use of MSI(-X)? I made no hardware or BIOS changes and I'm running a GENERIC kernel in all testing. That does not mean FreeBSD 9 can't put devices on other interrupts than 8 did. 'dmesg | grep irq' like you did before might show a difference with your previous output. I'm just guessing here for a clue on the result you are seeing, but without any data I cannot answer you (and I guess nobody can). Ronald. ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org
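Ronald's suggestion of comparing the IRQ assignments of the two releases could be sketched roughly as below. This is only an illustration, not something posted in the thread: the function names `irq_map` and `irq_changes` are hypothetical, and it assumes two saved boot logs (for example copies of /var/run/dmesg.boot taken under 8-STABLE and 9-STABLE) whose attach lines contain "irq N" as in the listings quoted here. It only sees the interrupt line printed at attach time, so a device that moved to MSI or MSI-X may not show up as a change.

```shell
#!/bin/sh
# Hypothetical helpers (not from the thread) for comparing IRQ assignments
# between two saved boot logs, e.g. copies of /var/run/dmesg.boot.

# irq_map FILE: print "device irq" pairs, one per line, sorted.
irq_map() {
    awk '{
        for (i = 1; i < NF; i++)
            if ($i == "irq") {
                dev = $1
                sub(/:.*/, "", dev)      # "mps0:" -> "mps0"
                print dev, $(i + 1)
                break
            }
    }' "$1" | sort
}

# irq_changes OLD NEW: print pairs that appear in NEW but not in OLD,
# i.e. devices that are new or were assigned a different IRQ.
irq_changes() {
    old_map=$(mktemp) || return 1
    new_map=$(mktemp) || { rm -f "$old_map"; return 1; }
    irq_map "$1" > "$old_map"
    irq_map "$2" > "$new_map"
    comm -13 "$old_map" "$new_map"   # lines unique to the second file
    rm -f "$old_map" "$new_map"
}

# Example (file names are placeholders):
#   irq_changes dmesg.boot.8-stable dmesg.boot.9-stable
```

Empty output would mean the attach-time assignments are identical, which matches what Matt later reports for irq 16.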
Re: 157k interrupts per second causing 60% CPU load on idle system
On Apr 16, 2012 5:42 AM, Ronald Klop ronald-freeb...@klop.yi.org wrote:

On Sun, 15 Apr 2012 15:09:34 +0200, Matt Thyer matt.th...@gmail.com wrote:

On Apr 15, 2012 6:27 PM, Ronald Klop ronald-freeb...@klop.yi.org wrote:

The problem does not occur with 9-STABLE. Who knows what the problem was? USB maybe?

Do you still have the same hardware on the same interrupts on 9-STABLE? Are there changes in the use of MSI(-X)?

I made no hardware or BIOS changes and I'm running a GENERIC kernel in all testing.

That does not mean FreeBSD 9 can't put devices on other interrupts than 8 did. 'dmesg | grep irq' like you did before might show a difference with your previous output. I'm just guessing here for a clue on the result you are seeing, but without any data I cannot answer you (and I guess nobody can).

Ronald,

The irqs seem to be the same:

$ grep irq\ 16 /var/run/dmesg.boot
pcib1: PCI-PCI bridge irq 16 at device 1.0 on pci0
mps0: LSI SAS2008 port 0xee00-0xeeff mem 0xfbdfc000-0xfbdf,0xfbd8-0xfbdb irq 16 at device 0.0 on pci1
vgapci0: VGA-compatible display port 0xff00-0xff07 mem 0xfb40-0xfb7f,0xe000-0xefff irq 16 at device 2.0 on pci0
uhci0: UHCI (generic) USB controller port 0xfe00-0xfe1f irq 16 at device 26.0 on pci0
pcib2: ACPI PCI-PCI bridge irq 16 at device 28.0 on pci0
pcib3: ACPI PCI-PCI bridge irq 16 at device 28.4 on pci0
atapci0: JMicron JMB368 UDMA133 controller port 0xdf00-0xdf07,0xde00-0xde03,0xdd00-0xdd07,0xdc00-0xdc03,0xdb00-0xdb0f irq 16 at device 0.0 on pci3
Re: 157k interrupts per second causing 60% CPU load on idle system
On Apr 7, 2012 2:38 PM, Matt Thyer matt.th...@gmail.com wrote:

On 7 April 2012 14:31, Matt Thyer matt.th...@gmail.com wrote:

Since moving the SATA 3 disk to the onboard Intel SATA 2 controller I'm no longer having that disk evicted from the raidz2 pool with write errors, and I thought that the high interrupt rate issue had also been solved, but it's back again. This is on 8-STABLE at revision 230921 (before the new driver hit 8-STABLE). So now I need to go back to trying to determine what the cause is.

vmstat -i has shown that the issue was on irq 16. Unfortunately there seem to be a lot of things on irq 16:

$ dmesg | grep 'irq 16'
pcib1: PCI-PCI bridge irq 16 at device 1.0 on pci0
mps0: LSI SAS2008 port 0xee00-0xeeff mem 0xfbdfc000-0xfbdf,0xfbd8-0xfbdb irq 16 at device 0.0 on pci1
vgapci0: VGA-compatible display port 0xff00-0xff07 mem 0xfb40-0xfb7f,0xe000-0xefff irq 16 at device 2.0 on pci0
uhci0: UHCI (generic) USB controller port 0xfe00-0xfe1f irq 16 at device 26.0 on pci0
pcib2: ACPI PCI-PCI bridge irq 16 at device 28.0 on pci0
pcib3: ACPI PCI-PCI bridge irq 16 at device 28.4 on pci0
atapci0: JMicron JMB368 UDMA133 controller port 0xdf00-0xdf07,0xde00-0xde03,0xdd00-0xdd07,0xdc00-0xdc03,0xdb00-0xdb0f irq 16 at device 0.0 on pci3

Any idea how to isolate which bit of hardware could be triggering the interrupts? Unfortunately the only device I could remove would be the SuperMicro AOC-USAS2-L8i (so yes I could eliminate that). My biggest problem right now is not knowing how to trigger the issue. At this stage I'm going to upgrade to 9-STABLE and see if it returns.

The problem does not occur with 9-STABLE. Who knows what the problem was? USB maybe?
Re: 157k interrupts per second causing 60% CPU load on idle system
On 5 April 2012 01:18, Freddie Cash fjwc...@gmail.com wrote: On Wed, Apr 4, 2012 at 5:19 AM, Matt Thyer matt.th...@gmail.com wrote: So it seems that both the old and new mps driver have a problem with the Western Digital WD20EARX SATA 3 drive on a SuperMicro AOC-USAS2-L8i (SAS 6G) controller (flashed with -IT firmware). I wouldn't say the driver has a problem with that specific drive. More that it might have a problem with a mixed SATA2/SATA3 setup. Sorry, that's what I meant to say but it now seems that the 157K interrupts per second is probably not due to the SuperMicro AOC-USAS2-L8i. Since moving the SATA 3 disk to the onboard Intel SATA 2 controller I'm no longer having that disk evicted from the raidz2 pool with write errors and I thought that the high interrupt rate issue had also been solved but it's back again. This is on 8-STABLE at revision 230921 (before the new driver hit 8-STABLE). So now I need to go back to trying to determine what the cause is. I'll stop posting in this thread as I don't think it's anything to do with either the old or new version of this driver. ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org
Re: 157k interrupts per second causing 60% CPU load on idle system
On 7 April 2012 14:31, Matt Thyer matt.th...@gmail.com wrote: On 5 April 2012 01:18, Freddie Cash fjwc...@gmail.com wrote: On Wed, Apr 4, 2012 at 5:19 AM, Matt Thyer matt.th...@gmail.com wrote: So it seems that both the old and new mps driver have a problem with the Western Digital WD20EARX SATA 3 drive on a SuperMicro AOC-USAS2-L8i (SAS 6G) controller (flashed with -IT firmware). I wouldn't say the driver has a problem with that specific drive. More that it might have a problem with a mixed SATA2/SATA3 setup. Sorry, that's what I meant to say but it now seems that the 157K interrupts per second is probably not due to the SuperMicro AOC-USAS2-L8i. Since moving the SATA 3 disk to the onboard Intel SATA 2 controller I'm no longer having that disk evicted from the raidz2 pool with write errors and I thought that the high interrupt rate issue had also been solved but it's back again. This is on 8-STABLE at revision 230921 (before the new driver hit 8-STABLE). So now I need to go back to trying to determine what the cause is. I'll stop posting in this thread as I don't think it's anything to do with either the old or new version of this driver. Oops... wrong thread I thought I was replying in -CURRENT. So on to the root cause. vmstat -i has shown that the issue was on irq 16. 
Unfortunately there seem to be a lot of things on irq 16:

$ dmesg | grep 'irq 16'
pcib1: PCI-PCI bridge irq 16 at device 1.0 on pci0
mps0: LSI SAS2008 port 0xee00-0xeeff mem 0xfbdfc000-0xfbdf,0xfbd8-0xfbdb irq 16 at device 0.0 on pci1
vgapci0: VGA-compatible display port 0xff00-0xff07 mem 0xfb40-0xfb7f,0xe000-0xefff irq 16 at device 2.0 on pci0
uhci0: UHCI (generic) USB controller port 0xfe00-0xfe1f irq 16 at device 26.0 on pci0
pcib2: ACPI PCI-PCI bridge irq 16 at device 28.0 on pci0
pcib3: ACPI PCI-PCI bridge irq 16 at device 28.4 on pci0
atapci0: JMicron JMB368 UDMA133 controller port 0xdf00-0xdf07,0xde00-0xde03,0xdd00-0xdd07,0xdc00-0xdc03,0xdb00-0xdb0f irq 16 at device 0.0 on pci3

Any idea how to isolate which bit of hardware could be triggering the interrupts? Unfortunately the only device I could remove would be the SuperMicro AOC-USAS2-L8i (so yes I could eliminate that). My biggest problem right now is not knowing how to trigger the issue. At this stage I'm going to upgrade to 9-STABLE and see if it returns.
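Because the rate column of vmstat -i is an average since boot, one possible first step before pulling hardware is to sample the totals twice and compute the rate over just that interval, to see when the storm is actually running. The sketch below is an assumption-laden illustration, not something from the thread: `irq_rates` is a hypothetical name, and it assumes each row of vmstat -i ends with the total and rate columns. It cannot tell apart devices that share one line such as irq 16; that would still take disabling or removing devices one at a time.

```shell
#!/bin/sh
# Hypothetical helper: per-source interrupt rate between two saved
# `vmstat -i` snapshots taken SECONDS apart (SECONDS must be non-zero).

# irq_rates BEFORE AFTER SECONDS
irq_rates() {
    awk -v secs="$3" '
        # Row name = every field before the last two columns (total, rate).
        function key(   i, k) {
            k = $1
            for (i = 2; i <= NF - 2; i++) k = k " " $i
            return k
        }
        $NF !~ /^[0-9]+$/ || NF < 3 { next }            # skip header/blank rows
        FNR == NR { before[key()] = $(NF - 1); next }   # first file: remember totals
        {
            k = key()
            if (k in before)
                printf "%s %d\n", k, ($(NF - 1) - before[k]) / secs
        }
    ' "$1" "$2"
}

# Example on the affected machine (paths are placeholders):
#   vmstat -i > /tmp/irq.a; sleep 10; vmstat -i > /tmp/irq.b
#   irq_rates /tmp/irq.a /tmp/irq.b 10
```

Running this periodically (for instance while exercising Samba) might also help find what triggers the problem, which Matt describes as his biggest obstacle.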
Re: 157k interrupts per second causing 60% CPU load on idle system
On 25 March 2012 22:26, Matt Thyer matt.th...@gmail.com wrote:

Does anyone know if and when this driver was merged from current to 8-STABLE? If I can work out what revision that occurred in I'll go back to just before then to confirm if the problem exists.

In the -CURRENT list I've been told that the new driver was introduced into 8-STABLE at revision 230921. I reverted to that revision but the problem was still apparent.

So I've now tried:

- Previous BIOS
- Updating the controller firmware from phase 7 to phase 11
- Going back to the old (pre LSI authored) mps driver

But all to no avail.

What has worked is to move the single SATA 3 (6 Gbps) drive (the other 7 drives are SATA 2) to the onboard SATA 2 Intel controller.

So it seems that both the old and new mps driver have a problem with the Western Digital WD20EARX SATA 3 drive on a SuperMicro AOC-USAS2-L8i (SAS 6G) controller (flashed with -IT firmware).

I'll continue this in the -CURRENT list in the thread about the new driver as that's where the main discussion has been.
RE: 157k interrupts per second causing 60% CPU load on idle system
-----Original Message-----
From: owner-freebsd-sta...@freebsd.org [mailto:owner-freebsd-sta...@freebsd.org] On Behalf Of Matt Thyer
Sent: Wednesday, April 04, 2012 5:50 PM
To: Mike Tancsa
Cc: freebsd-stable@freebsd.org
Subject: Re: 157k interrupts per second causing 60% CPU load on idle system

On 25 March 2012 22:26, Matt Thyer matt.th...@gmail.com wrote:

Does anyone know if and when this driver was merged from current to 8-STABLE? If I can work out what revision that occurred in I'll go back to just before then to confirm if the problem exists.

In the -CURRENT list I've been told that the new driver was introduced into 8-STABLE at revision 230921. I reverted to that revision but the problem was still apparent.

So I've now tried:

- Previous BIOS
- Updating the controller firmware from phase 7 to phase 11
- Going back to the old (pre LSI authored) mps driver

But all to no avail.

What has worked is to move the single SATA 3 (6 Gbps) drive (the other 7 drives are SATA 2) to the onboard SATA 2 Intel controller.

So it seems that both the old and new mps driver have a problem with the Western Digital WD20EARX SATA 3 drive on a SuperMicro AOC-USAS2-L8i (SAS 6G) controller (flashed with -IT firmware).

I'll continue this in the -CURRENT list in the thread about the new driver as that's where the main discussion has been.

Mike,

Did you purchase the LSI controller through the channel or an OEM? It would be difficult for developers to help you without any support channel involved. If possible, can you contact the LSI support channel?

~ Kashyap
Re: 157k interrupts per second causing 60% CPU load on idle system
On 4 April 2012 21:55, Desai, Kashyap kashyap.de...@lsi.com wrote:

Mike,

Did you purchase the LSI controller through the channel or an OEM? It would be difficult for developers to help you without any support channel involved. If possible, can you contact the LSI support channel?

Kashyap,

It's me, Matt, not Mike, reporting this problem.

I purchased the Super Micro card off eBay through a seller with a good reputation. I've bought a couple of these cards through him now.

I'm not sure why you think the lack of middle men would be a problem. I hope that Super Micro and LSI would both be interested in a detailed report of a problem.
RE: 157k interrupts per second causing 60% CPU load on idle system
Matt,

Sorry for addressing the wrong person! I work in the Device Driver group, and it is difficult to track requests that come in through email. We can resolve requests of minor or medium complexity over email. There are groups within LSI that have the expertise to root-cause issues and can provide better support than the developers. Sometimes an issue is not a driver issue (e.g. it can be a hardware or firmware issue), and such issues need customer interaction and sometimes reproduction. So I request that you contact the LSI support channel (at least for complex issues) to get better tracking and speed up resolution.

~ Kashyap

From: Matt Thyer [mailto:matt.th...@gmail.com]
Sent: Wednesday, April 04, 2012 6:08 PM
To: Desai, Kashyap
Cc: Mike Tancsa; freebsd-stable@freebsd.org; McConnell, Stephen
Subject: Re: 157k interrupts per second causing 60% CPU load on idle system

On 4 April 2012 21:55, Desai, Kashyap kashyap.de...@lsi.com wrote:

Mike,

Did you purchase the LSI controller through the channel or an OEM? It would be difficult for developers to help you without any support channel involved. If possible, can you contact the LSI support channel?

Kashyap,

It's me, Matt, not Mike, reporting this problem.

I purchased the Super Micro card off eBay through a seller with a good reputation. I've bought a couple of these cards through him now.

I'm not sure why you think the lack of middle men would be a problem. I hope that Super Micro and LSI would both be interested in a detailed report of a problem.
Re: 157k interrupts per second causing 60% CPU load on idle system
On Wed, Apr 4, 2012 at 5:19 AM, Matt Thyer matt.th...@gmail.com wrote:

So it seems that both the old and new mps driver have a problem with the Western Digital WD20EARX SATA 3 drive on a SuperMicro AOC-USAS2-L8i (SAS 6G) controller (flashed with -IT firmware).

I wouldn't say the driver has a problem with that specific drive. More that it might have a problem with a mixed SATA2/SATA3 setup.

I have a 9-STABLE box with 24x WDC WD2002FAEX SATA3 (6 Gbps) drives attached to 3 SuperMicro AOC-USAS2-L8i controllers, using the new mps(4) driver without any issues. I was actually amazed yesterday when I saw it doing writes just shy of 500 MBps to the ZFS pool, via zfs send/recv from another box. No issues with excessive interrupts. Using 10.0 firmware on the controllers.

--
Freddie Cash
fjwc...@gmail.com
Re: 157k interrupts per second causing 60% CPU load on idle system
On 23 March 2012 01:16, Mike Tancsa m...@sentex.net wrote:

Sorry, what I was getting at was that a bad BIOS (e.g. the latest could have introduced a regression) can cause the symptoms you are seeing. The BIOS change sure seemed to fix my problem.

I've updated the firmware of the SuperMicro AOC-USAS2-L8i to the latest -IT firmware on the Supermicro FTP site (this is version 11 and I was on 7 before) but this made no difference.

I then tried downgrading the motherboard BIOS to the F3 release that I was running previously but again this made no difference.

So it would seem that this problem is to do with a change in FreeBSD-STABLE between r225723 and r232477.

I'm wondering whether this is due to the new LSI authored driver for chips such as the LSI SAS2008 that are used in the SuperMicro AOC-USAS2-L8i. I know this driver is in CURRENT but do not know if it's in 8-STABLE.

Does anyone know if and when this driver was merged from current to 8-STABLE? If I can work out what revision that occurred in I'll go back to just before then to confirm if the problem exists.

Matt
Re: 157k interrupts per second causing 60% CPU load on idle system
On Mar 22, 2012 10:14 AM, Mike Tancsa m...@sentex.net wrote: On 3/20/2012 1:26 AM, Matt Thyer wrote: I've upgraded my FreeBSD-STABLE NAS from r225723 (22nd Sept 2011) to r232477 (4th Mar 2012) and am finding that a system process called intr is now constantly using about 60% of 1 CPU starting a short time after reboot (possibly triggered by use of the samba server). When this starts, systat -vm 1 says that the system is 85% idle and 14% interrupt handling. It says that there's around 157k interrupts per second. Any idea what could be the cause ? This sounds like the problem I had with my intel board. BIOS update fixed it. What chipset and MB vendor do you have ? The original post tells you this. I've already updated to the latest BIOS and this could have caused the problem. I think I should try updating the firmware of the SAS/SATA HBA but not until I've confirmed if a BIOS downgrade works. Unfortunately it's very hard to get downtime on this system so it may be some time before I can report any news. ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org
Re: 157k interrupts per second causing 60% CPU load on idle system
On 3/22/2012 10:33 AM, Matt Thyer wrote: The original post tells you this. I've already updated to the latest BIOS and this could have caused the problem. Sorry, what I was getting at was that a bad bios (eg latest could have introduced a regression) can cause the symptoms you are seeing. The bios change sure seemed to fix my problem. In the mean time, try dmidecode to extract some of the versions of the BIOS and hardware you are using. Perhaps it might tweak someone's memory to a similar problem as yours. ---Mike -- --- Mike Tancsa, tel +1 519 651 3400 Sentex Communications, m...@sentex.net Providing Internet services since 1994 www.sentex.net Cambridge, Ontario Canada http://www.tancsa.com/ ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org
Re: 157k interrupts per second causing 60% CPU load on idle system
On 3/20/2012 1:26 AM, Matt Thyer wrote: I've upgraded my FreeBSD-STABLE NAS from r225723 (22nd Sept 2011) to r232477 (4th Mar 2012) and am finding that a system process called intr is now constantly using about 60% of 1 CPU starting a short time after reboot (possibly triggered by use of the samba server). When this starts, systat -vm 1 says that the system is 85% idle and 14% interrupt handling. It says that there's around 157k interrupts per second. Any idea what could be the cause ? This sounds like the problem I had with my intel board. BIOS update fixed it. What chipset and MB vendor do you have ? /usr/ports/sysutils/dmidecode is handy for digging up BIOS and chipset info http://lists.freebsd.org/pipermail/freebsd-questions/2012-March/239394.html ---Mike -- --- Mike Tancsa, tel +1 519 651 3400 Sentex Communications, m...@sentex.net Providing Internet services since 1994 www.sentex.net Cambridge, Ontario Canada http://www.tancsa.com/ ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org
Re: 157k interrupts per second causing 60% CPU load on idle system
On 20/03/2012 06:26, Matt Thyer wrote:

I've upgraded my FreeBSD-STABLE NAS from r225723 (22nd Sept 2011) to r232477 (4th Mar 2012) and am finding that a system process called intr is now constantly using about 60% of 1 CPU starting a short time after reboot (possibly triggered by use of the samba server).

When this starts, systat -vm 1 says that the system is 85% idle and 14% interrupt handling. It says that there's around 157k interrupts per second.

Ok, but *which* interrupt is getting triggered? Please send the output of vmstat -i.
Re: 157k interrupts per second causing 60% CPU load on idle system
On 20 March 2012 21:12, Ivan Voras ivo...@freebsd.org wrote:

On 20/03/2012 06:26, Matt Thyer wrote:

I've upgraded my FreeBSD-STABLE NAS from r225723 (22nd Sept 2011) to r232477 (4th Mar 2012) and am finding that a system process called intr is now constantly using about 60% of 1 CPU starting a short time after reboot (possibly triggered by use of the samba server).

When this starts, systat -vm 1 says that the system is 85% idle and 14% interrupt handling. It says that there's around 157k interrupts per second.

Ok, but *which* interrupt is getting triggered? Please send the output of vmstat -i.

interrupt                          total       rate
irq16: uhci0+                 3392184862     126692
cpu0: timer                     53549677       1999
irq256: mps0                     2643187         98
irq257: re0                      5508108        205
irq258: ahci0                     160717          6
cpu1: timer                     53525300       1999
cpu2: timer                     53525300       1999
cpu3: timer                     53525296       1999
Total                         3614622447     134999
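Picking the busiest source out of vmstat -i output can be automated with a small filter. The following is a minimal sketch under the assumption that the last column of each row is the rate; the function name `top_irq` is made up for this example and does not come from the thread.

```shell
#!/bin/sh
# Hypothetical helper: read `vmstat -i` output on stdin and print the
# interrupt source with the highest rate, followed by that rate.
top_irq() {
    awk '
        $1 == "Total" || $NF !~ /^[0-9]+$/ { next }   # skip summary and header rows
        $NF + 0 > max {
            max = $NF + 0
            name = $1
            for (i = 2; i <= NF - 2; i++) name = name " " $i
        }
        END { if (name != "") printf "%s %d\n", name, max }
    '
}

# Example on a live system:
#   vmstat -i | top_irq
```

For the figures Matt posted, the irq16 row accounts for 126692 of the 134999 interrupts per second in the Total row, roughly 94%.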
Re: 157k interrupts per second causing 60% CPU load on idle system
On 20 March 2012 12:52, Matt Thyer matt.th...@gmail.com wrote:

On 20 March 2012 21:12, Ivan Voras ivo...@freebsd.org wrote:

On 20/03/2012 06:26, Matt Thyer wrote:

I've upgraded my FreeBSD-STABLE NAS from r225723 (22nd Sept 2011) to r232477 (4th Mar 2012) and am finding that a system process called intr is now constantly using about 60% of 1 CPU starting a short time after reboot (possibly triggered by use of the samba server).

When this starts, systat -vm 1 says that the system is 85% idle and 14% interrupt handling. It says that there's around 157k interrupts per second.

Ok, but *which* interrupt is getting triggered? Please send the output of vmstat -i.

interrupt                          total       rate
irq16: uhci0+                 3392184862     126692

Ok, something's probably wrong with USB. Can you disable it in BIOS?

cpu0: timer                     53549677       1999
irq256: mps0                     2643187         98
irq257: re0                      5508108        205
irq258: ahci0                     160717          6
cpu1: timer                     53525300       1999
cpu2: timer                     53525300       1999
cpu3: timer                     53525296       1999
Total                         3614622447     134999
Re: 157k interrupts per second causing 60% CPU load on idle system
On 20 March 2012 22:24, Ivan Voras ivo...@freebsd.org wrote: On 20 March 2012 12:52, Matt Thyer matt.th...@gmail.com wrote: On 20 March 2012 21:12, Ivan Voras ivo...@freebsd.org wrote: On 20/03/2012 06:26, Matt Thyer wrote: I've upgraded my FreeBSD-STABLE NAS from r225723 (22nd Sept 2011) to r232477 (4th Mar 2012) and am finding that a system process called intr is now constantly using about 60% of 1 CPU starting a short time after reboot (possibly triggered by use of the samba server). When this starts, systat -vm 1 says that the system is 85% idle and 14% interrupt handling. It says that there's around 157k interrupts per second. Ok, but *which* interrupt is getting triggered? Please send the output of vmstat -i. interrupt total rate irq16: uhci0+ 3392184862 126692 Ok, something's probably wrong with USB. Can you disable it in BIOS? cpu0: timer 53549677 1999 irq256: mps0 2643187 98 irq257: re0 5508108205 irq258: ahci0 160717 6 cpu1: timer 53525300 1999 cpu2: timer 53525300 1999 cpu3: timer 53525296 1999 Total 3614622447 134999 I did just update the BIOS at about the same time so the difference may be due to that change. I'll try a few things such as: - Unplugging any USB things (I've only got a keyboard plugged in). - Downgrade BIOS. I'll get back to you all soon. ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org
Re: 157k interrupts per second causing 60% CPU load on idle system
On Tue, Mar 20, 2012 at 11:10:10PM +1030, Matt Thyer wrote: On 20 March 2012 22:24, Ivan Voras ivo...@freebsd.org wrote: On 20 March 2012 12:52, Matt Thyer matt.th...@gmail.com wrote: On 20 March 2012 21:12, Ivan Voras ivo...@freebsd.org wrote: On 20/03/2012 06:26, Matt Thyer wrote: I've upgraded my FreeBSD-STABLE NAS from r225723 (22nd Sept 2011) to r232477 (4th Mar 2012) and am finding that a system process called intr is now constantly using about 60% of 1 CPU starting a short time after reboot (possibly triggered by use of the samba server). When this starts, systat -vm 1 says that the system is 85% idle and 14% interrupt handling. It says that there's around 157k interrupts per second. Ok, but *which* interrupt is getting triggered? Please send the output of vmstat -i. interrupt total rate irq16: uhci0+ 3392184862 126692 Ok, something's probably wrong with USB. Can you disable it in BIOS? cpu0: timer 53549677 1999 irq256: mps0 2643187 98 irq257: re0 5508108205 irq258: ahci0 160717 6 cpu1: timer 53525300 1999 cpu2: timer 53525300 1999 cpu3: timer 53525296 1999 Total 3614622447 134999 I did just update the BIOS at about the same time so the difference may be due to that change. I'll try a few things such as: - Unplugging any USB things (I've only got a keyboard plugged in). - Downgrade BIOS. I'll get back to you all soon. It would be interesting to know if there are other devices on irq16 also. grep 'irq 16' /var/run/dmesg.boot I think the '+' on the irq16 line from vmstat means the interrupt is shared, but the man page doesn't mention it so I'm not 100% sure Thanks, Gary ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org
Re: 157k interrupts per second causing 60% CPU load on idle system
On 21 March 2012 00:03, Gary Palmer gpal...@freebsd.org wrote: On Tue, Mar 20, 2012 at 11:10:10PM +1030, Matt Thyer wrote: On 20 March 2012 22:24, Ivan Voras ivo...@freebsd.org wrote: On 20 March 2012 12:52, Matt Thyer matt.th...@gmail.com wrote: On 20 March 2012 21:12, Ivan Voras ivo...@freebsd.org wrote: On 20/03/2012 06:26, Matt Thyer wrote: I've upgraded my FreeBSD-STABLE NAS from r225723 (22nd Sept 2011) to r232477 (4th Mar 2012) and am finding that a system process called intr is now constantly using about 60% of 1 CPU starting a short time after reboot (possibly triggered by use of the samba server). When this starts, systat -vm 1 says that the system is 85% idle and 14% interrupt handling. It says that there's around 157k interrupts per second. Ok, but *which* interrupt is getting triggered? Please send the output of vmstat -i. interrupt total rate irq16: uhci0+ 3392184862 126692 Ok, something's probably wrong with USB. Can you disable it in BIOS? cpu0: timer 53549677 1999 irq256: mps0 2643187 98 irq257: re0 5508108205 irq258: ahci0 160717 6 cpu1: timer 53525300 1999 cpu2: timer 53525300 1999 cpu3: timer 53525296 1999 Total 3614622447 134999 I did just update the BIOS at about the same time so the difference may be due to that change. I'll try a few things such as: - Unplugging any USB things (I've only got a keyboard plugged in). - Downgrade BIOS. I'll get back to you all soon. It would be interesting to know if there are other devices on irq16 also. grep 'irq 16' /var/run/dmesg.boot I think the '+' on the irq16 line from vmstat means the interrupt is shared, but the man page doesn't mention it so I'm not 100% sure Thanks, Gary Good point... 
pcib1: PCI-PCI bridge irq 16 at device 1.0 on pci0
mps0: LSI SAS2008 port 0xee00-0xeeff mem 0xfbdfc000-0xfbdf,0xfbd8-0xfbdb irq 16 at device 0.0 on pci1
vgapci0: VGA-compatible display port 0xff00-0xff07 mem 0xfb40-0xfb7f,0xe000-0xefff irq 16 at device 2.0 on pci0
uhci0: UHCI (generic) USB controller port 0xfe00-0xfe1f irq 16 at device 26.0 on pci0
pcib2: ACPI PCI-PCI bridge irq 16 at device 28.0 on pci0
pcib3: ACPI PCI-PCI bridge irq 16 at device 28.4 on pci0
atapci0: JMicron JMB368 UDMA133 controller port 0xdf00-0xdf07,0xde00-0xde03,0xdd00-0xdd07,0xdc00-0xdc03,0xdb00-0xdb0f irq 16 at device 0.0 on pci3

I'd suspect the SAS/SATA HBA using the mps0 driver as that's where I have the raidz2 on 8 drives.

Is this still the old driver or has the new LSI authored driver been added to -STABLE yet?
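To reduce a listing like this to just the driver instances sharing a line, a short filter can help. The sketch below is illustrative only: `irq_devices` is a hypothetical name, and it assumes attach lines of the form "name: description ... irq N ..." as shown above. Appearing in this list does not prove a device is raising that interrupt; in the vmstat -i output posted earlier in the thread, mps0 is counted separately under irq256.

```shell
#!/bin/sh
# Hypothetical helper: read dmesg-style text on stdin and print the driver
# instance (the text before the first colon) of every line mentioning
# "irq N", where N is the first argument.
irq_devices() {
    awk -v irq="$1" '{
        for (i = 1; i < NF; i++)
            if ($i == "irq" && $(i + 1) == irq) {
                dev = $1
                sub(/:.*/, "", dev)   # "uhci0:" -> "uhci0"
                print dev
                break
            }
    }'
}

# Example on a live system:
#   irq_devices 16 < /var/run/dmesg.boot
```

Each name printed is a candidate to unplug or disable in turn while watching the interrupt rate.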
157k interrupts per second causing 60% CPU load on idle system
I've upgraded my FreeBSD-STABLE NAS from r225723 (22nd Sept 2011) to r232477 (4th Mar 2012) and am finding that a system process called intr is now constantly using about 60% of 1 CPU starting a short time after reboot (possibly triggered by use of the samba server).

When this starts, systat -vm 1 says that the system is 85% idle and 14% interrupt handling. It says that there's around 157k interrupts per second.

After a reboot the system is back to its normal state doing between 3 and 250 or so interrupts per second.

The hardware is an Intel Core i3-530 (dual core @ 2.93 GHz with Hyperthreading) with 8 GB RAM (2x4GB) on a Gigabyte H55M-D2H rev 1.3 motherboard running the latest BIOS (F4).

The system runs a GENERIC kernel with the following significant items in /boot/loader.conf:

zfs_load=YES
aio_load=YES
ahci_load=YES
geom_mirror_load=YES
vfs.root.mountfrom=zfs:zroot
vboxdrv_load=YES

It has 2 x 300 GB disks for the system with GPT partitioning and zmirror for the OS ala http://wiki.freebsd.org/RootOnZFS/GPTZFSBoot/Mirror

I have swap on a gmirror as I want swap to survive the loss of one system disk.

The NAS data is on a raidz2 pool of 8 disks connected to a SuperMicro AOC-USAS2-L8i (flashed to behave as an AOC-USAS2-L8e).

The system is basically a CIFS NAS with ports/net/samba36 built with AIO_SUPPORT and configured like:

socket options = SO_RCVBUF=131072 SO_SNDBUF=131072 TCP_NODELAY
min receivefile size=16384
use sendfile=true
aio read size = 16384
aio write size = 16384
aio write behind = true

The only other interesting workload on the box is a Java Minecraft server using ports/java/jdk16.

I'm going to try to reproduce the problem in a VM and binary search down to the revision where it started as soon as I can work out a reliable way to trigger the behaviour (as it doesn't start at boot time).

Any idea what could be the cause?
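For a sense of how much work the planned binary search involves, the arithmetic can be sketched as below. This is only an illustration with made-up function names (`bisect_steps`, `bisect_mid`); it treats every revision number between the two endpoints as a candidate, which is an upper bound, since repository revision numbers are not all commits to the 8-STABLE branch.

```shell
#!/bin/sh
# Hypothetical helpers for planning a bisection between a known-good and a
# known-bad revision number.

# bisect_steps GOOD BAD: worst-case number of builds needed to find the
# first bad revision when the candidate range is halved each time.
bisect_steps() {
    span=$(( $2 - $1 ))
    steps=0
    while [ "$span" -gt 1 ]; do
        span=$(( (span + 1) / 2 ))   # assume the larger half remains
        steps=$(( steps + 1 ))
    done
    echo "$steps"
}

# bisect_mid GOOD BAD: the next revision to build and test.
bisect_mid() {
    echo $(( ($1 + $2) / 2 ))
}

bisect_steps 225723 232477   # prints 13
bisect_mid 225723 232477     # prints 229100
```

So at most 13 build-and-test cycles would cover the whole range, provided the problem can be triggered reliably on each build.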