Re: 157k interrupts per second causing 60% CPU load on idle system

2012-04-15 Thread Ronald Klop
On Sat, 14 Apr 2012 01:13:30 +0200, Matt Thyer matt.th...@gmail.com wrote:

 On Apr 7, 2012 2:38 PM, Matt Thyer matt.th...@gmail.com wrote:

  On 7 April 2012 14:31, Matt Thyer matt.th...@gmail.com wrote:

   Since moving the SATA 3 disk to the onboard Intel SATA 2 controller I'm
   no longer having that disk evicted from the raidz2 pool with write errors
   and I thought that the high interrupt rate issue had also been solved but
   it's back again.

   This is on 8-STABLE at revision 230921 (before the new driver hit
   8-STABLE).

   So now I need to go back to trying to determine what the cause is.

  vmstat -i has shown that the issue was on irq 16.

  Unfortunately there seems to be a lot of things on irq 16:

  $ dmesg | grep 'irq 16'
  pcib1: PCI-PCI bridge irq 16 at device 1.0 on pci0
  mps0: LSI SAS2008 port 0xee00-0xeeff mem
  0xfbdfc000-0xfbdf,0xfbd8-0xfbdb irq 16 at device 0.0 on pci1
  vgapci0: VGA-compatible display port 0xff00-0xff07 mem
  0xfb40-0xfb7f,0xe000-0xefff irq 16 at device 2.0 on pci0
  uhci0: UHCI (generic) USB controller port 0xfe00-0xfe1f irq 16 at
  device 26.0 on pci0
  pcib2: ACPI PCI-PCI bridge irq 16 at device 28.0 on pci0
  pcib3: ACPI PCI-PCI bridge irq 16 at device 28.4 on pci0
  atapci0: JMicron JMB368 UDMA133 controller port
  0xdf00-0xdf07,0xde00-0xde03,0xdd00-0xdd07,0xdc00-0xdc03,0xdb00-0xdb0f irq
  16 at device 0.0 on pci3

  Any idea how to isolate which bit of hardware could be triggering the
  interrupts ?

  Unfortunately the only device I could remove would be the SuperMicro
  AOC-USAS2-L8i (so yes I could eliminate that).

  My biggest problem right now is not knowing how to trigger the issue.

  At this stage I'm going to upgrade to 9-STABLE and see if it returns.

 The problem does not occur with 9-STABLE.

 Who knows what the problem was ? USB maybe ?

Do you still have the same hardware on the same interrupts on 9-STABLE?
Are there changes in the use of MSI(-X)?
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org


Re: 157k interrupts per second causing 60% CPU load on idle system

2012-04-15 Thread Matt Thyer
On Apr 15, 2012 6:27 PM, Ronald Klop ronald-freeb...@klop.yi.org wrote:

 The problem does not occur with 9-STABLE.

 Who knows what the problem was ? USB maybe ?


 Do you still have the same hardware on the same interrupts on 9-STABLE?
 Are there changes in the use of MSI(-X)?

I made no hardware or BIOS changes and I'm running a GENERIC kernel in all
testing.


Re: 157k interrupts per second causing 60% CPU load on idle system

2012-04-15 Thread Ronald Klop
On Sun, 15 Apr 2012 15:09:34 +0200, Matt Thyer matt.th...@gmail.com wrote:

 On Apr 15, 2012 6:27 PM, Ronald Klop ronald-freeb...@klop.yi.org wrote:

   The problem does not occur with 9-STABLE.

   Who knows what the problem was ? USB maybe ?

  Do you still have the same hardware on the same interrupts on 9-STABLE?
  Are there changes in the use of MSI(-X)?

 I made no hardware or BIOS changes and I'm running a GENERIC kernel in all
 testing.

That does not mean FreeBSD 9 can't put devices on other interrupts than 8
did.
'dmesg | grep irq' like you did before might show a difference with your
previous output.
I'm just guessing here for a clue on the result you are seeing, but
without any data I cannot answer you (and I guess nobody can).


Ronald.
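
One way to run the comparison suggested above is to keep a copy of
/var/run/dmesg.boot from each release and list only the devices whose IRQ
assignment differs. What follows is a minimal sketch and was not posted in
the thread: the function name irq_changes and its two file arguments are
made up for illustration, and it assumes every probe message sits on one
line of the saved file, with the IRQ given as "irq N".

```shell
#!/bin/sh
# irq_changes OLD NEW   (hypothetical helper, not a FreeBSD tool)
# OLD and NEW are saved copies of /var/run/dmesg.boot from two boots.
# Prints one line per device whose legacy IRQ differs or is new in NEW.
irq_changes() {
    awk '
        # Return the number following the word "irq" on the current line.
        function irq_of(    i) {
            for (i = 1; i < NF; i++)
                if ($i == "irq" && $(i + 1) ~ /^[0-9]+$/)
                    return $(i + 1)
            return ""
        }
        # Skip lines without an IRQ; otherwise remember device and IRQ.
        { irq = irq_of(); if (irq == "") next; dev = $1; sub(/:$/, "", dev) }
        # First file: record the old assignment.
        FNR == NR { old[dev] = irq; next }
        # Second file: report additions and changes only.
        !(dev in old) { printf "%s: new on irq %s\n", dev, irq; next }
        old[dev] != irq { printf "%s: irq %s -> %s\n", dev, old[dev], irq }
    ' "$1" "$2"
}
```

Empty output means the legacy IRQ assignments are identical. That would not
rule out a change in MSI(-X) use, which this sketch does not look at.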


Re: 157k interrupts per second causing 60% CPU load on idle system

2012-04-15 Thread Matt Thyer
On Apr 16, 2012 5:42 AM, Ronald Klop ronald-freeb...@klop.yi.org wrote:

 On Sun, 15 Apr 2012 15:09:34 +0200, Matt Thyer matt.th...@gmail.com wrote:

  On Apr 15, 2012 6:27 PM, Ronald Klop ronald-freeb...@klop.yi.org wrote:

    The problem does not occur with 9-STABLE.

    Who knows what the problem was ? USB maybe ?

   Do you still have the same hardware on the same interrupts on 9-STABLE?
   Are there changes in the use of MSI(-X)?

  I made no hardware or BIOS changes and I'm running a GENERIC kernel in all
  testing.

 That does not mean FreeBSD 9 can't put devices on other interrupts than 8
 did.
 'dmesg | grep irq' like you did before might show a difference with your
 previous output.
 I'm just guessing here for a clue on the result you are seeing, but
 without any data I cannot answer you (and I guess nobody can).

Ronald,

The irqs seem to be the same:

$ grep irq\ 16 /var/run/dmesg.boot
pcib1: PCI-PCI bridge irq 16 at device 1.0 on pci0
mps0: LSI SAS2008 port 0xee00-0xeeff mem
0xfbdfc000-0xfbdf,0xfbd8-0xfbdb irq 16 at device 0.0 on pci1
vgapci0: VGA-compatible display port 0xff00-0xff07 mem
0xfb40-0xfb7f,0xe000-0xefff irq 16 at device 2.0 on pci0
uhci0: UHCI (generic) USB controller port 0xfe00-0xfe1f irq 16 at device
26.0 on pci0
pcib2: ACPI PCI-PCI bridge irq 16 at device 28.0 on pci0
pcib3: ACPI PCI-PCI bridge irq 16 at device 28.4 on pci0
atapci0: JMicron JMB368 UDMA133 controller port
0xdf00-0xdf07,0xde00-0xde03,0xdd00-0xdd07,0xdc00-0xdc03,0xdb00-0xdb0f irq 16
at device 0.0 on pci3


Re: 157k interrupts per second causing 60% CPU load on idle system

2012-04-13 Thread Matt Thyer
On Apr 7, 2012 2:38 PM, Matt Thyer matt.th...@gmail.com wrote:

 On 7 April 2012 14:31, Matt Thyer matt.th...@gmail.com wrote:
 Since moving the SATA 3 disk to the onboard Intel SATA 2 controller I'm
no longer having that disk evicted from the raidz2 pool with write errors
and I thought that the high interrupt rate issue had also been solved but
it's back again.

 This is on 8-STABLE at revision 230921 (before the new driver hit
8-STABLE).

 So now I need to go back to trying to determine what the cause is.

 vmstat -i has shown that the issue was on irq 16.

 Unfortunately there seems to be a lot of things on irq 16:

 $ dmesg | grep 'irq 16'

 pcib1: PCI-PCI bridge irq 16 at device 1.0 on pci0
 mps0: LSI SAS2008 port 0xee00-0xeeff mem
0xfbdfc000-0xfbdf,0xfbd8-0xfbdb irq 16 at device 0.0 on pci1
 vgapci0: VGA-compatible display port 0xff00-0xff07 mem
0xfb40-0xfb7f,0xe000-0xefff irq 16 at device 2.0 on pci0
 uhci0: UHCI (generic) USB controller port 0xfe00-0xfe1f irq 16 at
device 26.0 on pci0
 pcib2: ACPI PCI-PCI bridge irq 16 at device 28.0 on pci0
 pcib3: ACPI PCI-PCI bridge irq 16 at device 28.4 on pci0
 atapci0: JMicron JMB368 UDMA133 controller port
0xdf00-0xdf07,0xde00-0xde03,0xdd00-0xdd07,0xdc00-0xdc03,0xdb00-0xdb0f irq
16 at device 0.0 on pci3

 Any idea how to isolate which bit of hardware could be triggering the
interrupts ?

 Unfortunately the only device I could remove would be the SuperMicro
AOC-USAS2-L8i (so yes I could eliminate that).

 My biggest problem right now is not knowing how to trigger the issue.

 At this stage I'm going to upgrade to 9-STABLE and see if it returns.

The problem does not occur with 9-STABLE.

Who knows what the problem was ? USB maybe ?


Re: 157k interrupts per second causing 60% CPU load on idle system

2012-04-06 Thread Matt Thyer
On 5 April 2012 01:18, Freddie Cash fjwc...@gmail.com wrote:

 On Wed, Apr 4, 2012 at 5:19 AM, Matt Thyer matt.th...@gmail.com wrote:
  So it seems that both the old and new mps driver have a problem with the
  Western Digital WD20EARX SATA 3 drive on a SuperMicro AOC-USAS2-L8i (SAS
  6G) controller (flashed with -IT firmware).

 I wouldn't say the driver has a problem with that specific drive.
 More that it might have a problem with a mixed SATA2/SATA3 setup.

 Sorry, that's what I meant to say but it now seems that the 157K
interrupts per second is probably not due to the SuperMicro AOC-USAS2-L8i.

Since moving the SATA 3 disk to the onboard Intel SATA 2 controller I'm no
longer having that disk evicted from the raidz2 pool with write errors and
I thought that the high interrupt rate issue had also been solved but it's
back again.

This is on 8-STABLE at revision 230921 (before the new driver hit 8-STABLE).

So now I need to go back to trying to determine what the cause is.

I'll stop posting in this thread as I don't think it's anything to do with
either the old or new version of this driver.


Re: 157k interrupts per second causing 60% CPU load on idle system

2012-04-06 Thread Matt Thyer
On 7 April 2012 14:31, Matt Thyer matt.th...@gmail.com wrote:

 On 5 April 2012 01:18, Freddie Cash fjwc...@gmail.com wrote:

 On Wed, Apr 4, 2012 at 5:19 AM, Matt Thyer matt.th...@gmail.com wrote:
  So it seems that both the old and new mps driver have a problem with the
  Western Digital WD20EARX SATA 3 drive on a SuperMicro AOC-USAS2-L8i (SAS
  6G) controller (flashed with -IT firmware).

 I wouldn't say the driver has a problem with that specific drive.
 More that it might have a problem with a mixed SATA2/SATA3 setup.

 Sorry, that's what I meant to say but it now seems that the 157K
 interrupts per second is probably not due to the SuperMicro AOC-USAS2-L8i.

 Since moving the SATA 3 disk to the onboard Intel SATA 2 controller I'm no
 longer having that disk evicted from the raidz2 pool with write errors and
 I thought that the high interrupt rate issue had also been solved but it's
 back again.

 This is on 8-STABLE at revision 230921 (before the new driver hit
 8-STABLE).

 So now I need to go back to trying to determine what the cause is.

 I'll stop posting in this thread as I don't think it's anything to do with
 either the old or new version of this driver.


Oops... wrong thread I thought I was replying in -CURRENT.

So on to the root cause.

vmstat -i has shown that the issue was on irq 16.

Unfortunately there seems to be a lot of things on irq 16:

$ dmesg | grep 'irq 16'
pcib1: PCI-PCI bridge irq 16 at device 1.0 on pci0
mps0: LSI SAS2008 port 0xee00-0xeeff mem
0xfbdfc000-0xfbdf,0xfbd8-0xfbdb irq 16 at device 0.0 on pci1
vgapci0: VGA-compatible display port 0xff00-0xff07 mem
0xfb40-0xfb7f,0xe000-0xefff irq 16 at device 2.0 on pci0
uhci0: UHCI (generic) USB controller port 0xfe00-0xfe1f irq 16 at device
26.0 on pci0
pcib2: ACPI PCI-PCI bridge irq 16 at device 28.0 on pci0
pcib3: ACPI PCI-PCI bridge irq 16 at device 28.4 on pci0
atapci0: JMicron JMB368 UDMA133 controller port
0xdf00-0xdf07,0xde00-0xde03,0xdd00-0xdd07,0xdc00-0xdc03,0xdb00-0xdb0f irq
16 at device 0.0 on pci3

Any idea how to isolate which bit of hardware could be triggering the
interrupts ?

Unfortunately the only device I could remove would be the SuperMicro
AOC-USAS2-L8i (so yes I could eliminate that).

My biggest problem right now is not knowing how to trigger the issue.

At this stage I'm going to upgrade to 9-STABLE and see if it returns.


Re: 157k interrupts per second causing 60% CPU load on idle system

2012-04-04 Thread Matt Thyer
On 25 March 2012 22:26, Matt Thyer matt.th...@gmail.com wrote:


 Does anyone know if and when this driver was merged from current to
 8-STABLE ?

 If I can work out what revision that occurred in I'll go back to just
 before then to confirm if the problem exists.


In the -CURRENT list I've been told that the new driver was introduced into
8-STABLE at revision 230921.
I reverted to that revision but the problem was still apparent.

So I've now tried:

- Previous BIOS
- Updating the controller firmware from phase 7 to phase 11
- Going back to the old (pre LSI authored) mps driver

But all to no avail.

What has worked is to move the single SATA 3 (6 Gbps) drive (the other 7
drives are SATA 2) to the onboard SATA 2 Intel controller.

So it seems that both the old and new mps driver have a problem with the
Western Digital WD20EARX SATA 3 drive on a SuperMicro AOC-USAS2-L8i (SAS
6G) controller (flashed with -IT firmware).

I'll continue this in the -CURRENT list in the thread about the new driver
as that's where the main discussion has been.


RE: 157k interrupts per second causing 60% CPU load on idle system

2012-04-04 Thread Desai, Kashyap


 -Original Message-
 From: owner-freebsd-sta...@freebsd.org
 [mailto:owner-freebsd-sta...@freebsd.org] On Behalf Of Matt Thyer
 Sent: Wednesday, April 04, 2012 5:50 PM
 To: Mike Tancsa
 Cc: freebsd-stable@freebsd.org
 Subject: Re: 157k interrupts per second causing 60% CPU load on idle
 system
 
 On 25 March 2012 22:26, Matt Thyer matt.th...@gmail.com wrote:
 
 
  Does anyone know if and when this driver was merged from current to
  8-STABLE ?
 
  If I can work out what revision that occurred in I'll go back to just
  before then to confirm if the problem exists.
 
 
 In the -CURRENT list I've been told that the new driver was introduced
 into
 8-STABLE at revision 230921.
 I reverted to that revision but the problem was still apparent.
 
 So I've now tried:
 
 - Previous BIOS
 - Updating the controller firmware from phase 7 to phase 11
 - Going back to the old (pre LSI authored) mps driver
 
 But all to no avail.
 
 What has worked is to move the single SATA 3 (6 Gbps) drive (the other 7
 drives are SATA 2) to the onboard SATA 2 Intel controller.
 
 So it seems that both the old and new mps driver have a problem with the
 Western Digital WD20EARX SATA 3 drive on a SuperMicro AOC-USAS2-L8i (SAS
 6G) controller (flashed with -IT firmware).
 
 I'll continue this in the -CURRENT list in the thread about the new
 driver
 as that's where the main discussion has been.

Mike,
Did you purchase the LSI controller through Channel or OEM?
It would be difficult for developers to help you without any support channel
involved.
If possible, can you contact the LSI support channel?

~ Kashyap




Re: 157k interrupts per second causing 60% CPU load on idle system

2012-04-04 Thread Matt Thyer
On 4 April 2012 21:55, Desai, Kashyap kashyap.de...@lsi.com wrote:

 Mike,
 Did you purchase the LSI controller through Channel or OEM?
 It would be difficult for developers to help you without any support
 channel involved.
 If possible, can you contact the LSI support channel?


Kashyap,

It's me, Matt, not Mike reporting this problem.
I purchased the Super Micro card off eBay through a seller with good
reputation.
I've bought a couple of said cards through him now.

I'm not sure why you are thinking the lack of middle men would be a problem.
I hope that Super Micro and LSI would both be interested in a detailed report
of a problem.


RE: 157k interrupts per second causing 60% CPU load on idle system

2012-04-04 Thread Desai, Kashyap
Matt, sorry for addressing the wrong person!

I work in the Device Driver group, and it is difficult to track requests that
come in through email. We can resolve requests of minor or medium complexity
by email.
There are groups in LSI with the expertise to root-cause an issue and provide
better support than the developers can. Sometimes an issue is not a driver
issue (e.g. it can be a hardware or firmware issue), and such issues need
customer interaction and sometimes reproduction.

So I request that you contact the LSI support channel (at least for complex
issues) to get better tracking and a faster resolution.

~ Kashyap

From: Matt Thyer [mailto:matt.th...@gmail.com]
Sent: Wednesday, April 04, 2012 6:08 PM
To: Desai, Kashyap
Cc: Mike Tancsa; freebsd-stable@freebsd.org; McConnell, Stephen
Subject: Re: 157k interrupts per second causing 60% CPU load on idle system

On 4 April 2012 21:55, Desai, Kashyap kashyap.de...@lsi.com wrote:
Mike,
Did you purchase the LSI controller through Channel or OEM?
It would be difficult for developers to help you without any support channel
involved.
If possible, can you contact the LSI support channel?

Kashyap,

It's me, Matt, not Mike reporting this problem.
I purchased the Super Micro card off eBay through a seller with good reputation.
I've bought a couple of said cards through him now.

I'm not sure why you are thinking the lack of middle men would be a problem.
I hope that Super Micro and LSI would both be interested in a detailed report
of a problem.


Re: 157k interrupts per second causing 60% CPU load on idle system

2012-04-04 Thread Freddie Cash
On Wed, Apr 4, 2012 at 5:19 AM, Matt Thyer matt.th...@gmail.com wrote:
 So it seems that both the old and new mps driver have a problem with the
 Western Digital WD20EARX SATA 3 drive on a SuperMicro AOC-USAS2-L8i (SAS
 6G) controller (flashed with -IT firmware).

I wouldn't say the driver has a problem with that specific drive.
More that it might have a problem with a mixed SATA2/SATA3 setup.

I have a 9-STABLE box with 24x WDC WD2002FAEX SATA3 (6 Gbps) drives
attached to 3 SuperMicro AOC-USAS2-L8i controllers, using the new
mps(4) driver without any issues.  I was actually amazed yesterday when
I saw it doing writes just shy of 500 MBps to the ZFS pool, via zfs
send/recv from another box.

No issues with excessive interrupts.  Using 10.0 firmware on the controllers.

-- 
Freddie Cash
fjwc...@gmail.com


Re: 157k interrupts per second causing 60% CPU load on idle system

2012-03-25 Thread Matt Thyer
On 23 March 2012 01:16, Mike Tancsa m...@sentex.net wrote:

 Sorry, what I was getting at was that a bad BIOS (e.g. the latest could have
 introduced a regression) can cause the symptoms you are seeing. The BIOS
 change sure seemed to fix my problem.


I've updated the firmware of the SuperMicro AOC-USAS2-L8i to the latest -IT
firmware on the Supermicro FTP site (this is version 11 and I was on 7
before) but this made no difference.

I then tried downgrading the motherboard BIOS to the F3 release that I was
running previously but again this made no difference.

So it would seem that this problem is to do with a change in
FreeBSD-STABLE between r225723 and r232477.

I'm wondering whether this is due to the new LSI authored driver for chips
such as the LSI SAS2008 that are used in the SuperMicro AOC-USAS2-L8i.

I know this driver is in CURRENT but do not know if it's in 8-STABLE.

Does anyone know if and when this driver was merged from current to
8-STABLE ?

If I can work out what revision that occurred in I'll go back to just
before then to confirm if the problem exists.

Matt


Re: 157k interrupts per second causing 60% CPU load on idle system

2012-03-22 Thread Matt Thyer
On Mar 22, 2012 10:14 AM, Mike Tancsa m...@sentex.net wrote:

 On 3/20/2012 1:26 AM, Matt Thyer wrote:
  I've upgraded my FreeBSD-STABLE NAS from r225723 (22nd Sept 2011) to
  r232477 (4th Mar 2012) and am finding that a system process called
intr
  is now constantly using about 60% of 1 CPU starting a short time after
  reboot (possibly triggered by use of the samba server).
 
  When this starts, systat -vm 1 says that the system is 85% idle and 14%
  interrupt handling.
  It says that there's around 157k interrupts per second.
 
  Any idea what could be the cause ?

 This sounds like the problem I had with my intel board.  BIOS update
 fixed it.  What chipset and MB vendor do you have ?

The original post tells you this.
I've already updated to the latest BIOS and this could have caused the
problem.

I think I should try updating the firmware of the SAS/SATA HBA but not
until I've confirmed if a BIOS downgrade works.

Unfortunately it's very hard to get downtime on this system so it may be
some time before I can report any news.


Re: 157k interrupts per second causing 60% CPU load on idle system

2012-03-22 Thread Mike Tancsa
On 3/22/2012 10:33 AM, Matt Thyer wrote:
 
 The original post tells you this.
 I've already updated to the latest BIOS and this could have caused the
 problem.


Sorry, what I was getting at was that a bad BIOS (e.g. the latest could have
introduced a regression) can cause the symptoms you are seeing. The BIOS
change sure seemed to fix my problem.

In the meantime, try dmidecode to extract some of the versions of the
BIOS and hardware you are using.  Perhaps it might jog someone's
memory of a similar problem to yours.

---Mike


-- 
---
Mike Tancsa, tel +1 519 651 3400
Sentex Communications, m...@sentex.net
Providing Internet services since 1994 www.sentex.net
Cambridge, Ontario Canada   http://www.tancsa.com/


Re: 157k interrupts per second causing 60% CPU load on idle system

2012-03-21 Thread Mike Tancsa
On 3/20/2012 1:26 AM, Matt Thyer wrote:
 I've upgraded my FreeBSD-STABLE NAS from r225723 (22nd Sept 2011) to
 r232477 (4th Mar 2012) and am finding that a system process called intr
 is now constantly using about 60% of 1 CPU starting a short time after
 reboot (possibly triggered by use of the samba server).
 
 When this starts, systat -vm 1 says that the system is 85% idle and 14%
 interrupt handling.
 It says that there's around 157k interrupts per second.
 
 Any idea what could be the cause ?

This sounds like the problem I had with my intel board.  BIOS update
fixed it.  What chipset and MB vendor do you have ?
/usr/ports/sysutils/dmidecode is handy for digging up BIOS and chipset info

http://lists.freebsd.org/pipermail/freebsd-questions/2012-March/239394.html

---Mike




-- 
---
Mike Tancsa, tel +1 519 651 3400
Sentex Communications, m...@sentex.net
Providing Internet services since 1994 www.sentex.net
Cambridge, Ontario Canada   http://www.tancsa.com/


Re: 157k interrupts per second causing 60% CPU load on idle system

2012-03-20 Thread Ivan Voras
On 20/03/2012 06:26, Matt Thyer wrote:
 I've upgraded my FreeBSD-STABLE NAS from r225723 (22nd Sept 2011) to
 r232477 (4th Mar 2012) and am finding that a system process called intr
 is now constantly using about 60% of 1 CPU starting a short time after
 reboot (possibly triggered by use of the samba server).
 
 When this starts, systat -vm 1 says that the system is 85% idle and 14%
 interrupt handling.
 It says that there's around 157k interrupts per second.
 

Ok, but *which* interrupt is getting triggered? Please send the output
of vmstat -i.






Re: 157k interrupts per second causing 60% CPU load on idle system

2012-03-20 Thread Matt Thyer
On 20 March 2012 21:12, Ivan Voras ivo...@freebsd.org wrote:

 On 20/03/2012 06:26, Matt Thyer wrote:
  I've upgraded my FreeBSD-STABLE NAS from r225723 (22nd Sept 2011) to
  r232477 (4th Mar 2012) and am finding that a system process called intr
  is now constantly using about 60% of 1 CPU starting a short time after
  reboot (possibly triggered by use of the samba server).
 
  When this starts, systat -vm 1 says that the system is 85% idle and 14%
  interrupt handling.
  It says that there's around 157k interrupts per second.
 

 Ok, but *which* interrupt is getting triggered? Please send the output
 of vmstat -i.


interrupt                          total       rate
irq16: uhci0+                 3392184862     126692
cpu0: timer                     53549677       1999
irq256: mps0                     2643187         98
irq257: re0                      5508108        205
irq258: ahci0                     160717          6
cpu1: timer                     53525300       1999
cpu2: timer                     53525300       1999
cpu3: timer                     53525296       1999
Total                         3614622447     134999
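
The rate column of vmstat -i is the total divided by the uptime, so it is an
average since boot and can understate a storm that only started some time
after the reboot (126692 here against the roughly 157k per second reported
by systat). A rough way to see the current rate instead is to save two
snapshots a known number of seconds apart and difference the totals. The
sketch below assumes exactly that; the function name intr_delta and the
snapshot file names are invented for illustration.

```shell
#!/bin/sh
# intr_delta BEFORE AFTER SECONDS   (hypothetical helper)
# BEFORE and AFTER hold `vmstat -i` output captured SECONDS apart.
# Prints "<interrupts per second> <source>" lines, busiest source first.
#
# On a live system the snapshots could be taken like this:
#   vmstat -i > /tmp/intr.a; sleep 10; vmstat -i > /tmp/intr.b
#   intr_delta /tmp/intr.a /tmp/intr.b 10
intr_delta() {
    awk -v secs="$3" '
        # Data lines end in two integers (total, rate); the header does not.
        NF >= 3 && $NF ~ /^[0-9]+$/ && $(NF - 1) ~ /^[0-9]+$/ {
            # The source name is everything before those two numbers.
            name = $1
            for (i = 2; i <= NF - 2; i++) name = name " " $i
            if (name == "Total") next
            # First file: remember the starting total for this source.
            if (FNR == NR) { before[name] = $(NF - 1); next }
            # Second file: print the rate over the sampling interval.
            if (name in before)
                printf "%d %s\n", ($(NF - 1) - before[name]) / secs, name
        }
    ' "$1" "$2" | sort -rn
}
```

Sources that appear only in the second snapshot are skipped, since no
starting total is known for them.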


Re: 157k interrupts per second causing 60% CPU load on idle system

2012-03-20 Thread Ivan Voras
On 20 March 2012 12:52, Matt Thyer matt.th...@gmail.com wrote:
 On 20 March 2012 21:12, Ivan Voras ivo...@freebsd.org wrote:

 On 20/03/2012 06:26, Matt Thyer wrote:
  I've upgraded my FreeBSD-STABLE NAS from r225723 (22nd Sept 2011) to
  r232477 (4th Mar 2012) and am finding that a system process called
  intr
  is now constantly using about 60% of 1 CPU starting a short time after
  reboot (possibly triggered by use of the samba server).
 
  When this starts, systat -vm 1 says that the system is 85% idle and 14%
  interrupt handling.
  It says that there's around 157k interrupts per second.
 

 Ok, but *which* interrupt is getting triggered? Please send the output
 of vmstat -i.


 interrupt  total   rate
 irq16: uhci0+ 3392184862 126692

Ok, something's probably wrong with USB. Can you disable it in BIOS?


 cpu0: timer 53549677   1999
 irq256: mps0 2643187 98
 irq257: re0  5508108    205
 irq258: ahci0 160717  6
 cpu1: timer 53525300   1999
 cpu2: timer 53525300   1999
 cpu3: timer 53525296   1999
 Total 3614622447 134999


Re: 157k interrupts per second causing 60% CPU load on idle system

2012-03-20 Thread Matt Thyer
On 20 March 2012 22:24, Ivan Voras ivo...@freebsd.org wrote:

 On 20 March 2012 12:52, Matt Thyer matt.th...@gmail.com wrote:
  On 20 March 2012 21:12, Ivan Voras ivo...@freebsd.org wrote:
 
  On 20/03/2012 06:26, Matt Thyer wrote:
   I've upgraded my FreeBSD-STABLE NAS from r225723 (22nd Sept 2011) to
   r232477 (4th Mar 2012) and am finding that a system process called
   intr
   is now constantly using about 60% of 1 CPU starting a short time after
   reboot (possibly triggered by use of the samba server).
  
   When this starts, systat -vm 1 says that the system is 85% idle and
 14%
   interrupt handling.
   It says that there's around 157k interrupts per second.
  
 
  Ok, but *which* interrupt is getting triggered? Please send the output
  of vmstat -i.
 
 
  interrupt  total   rate
  irq16: uhci0+ 3392184862 126692

 Ok, something's probably wrong with USB. Can you disable it in BIOS?


  cpu0: timer 53549677   1999
  irq256: mps0 2643187 98
  irq257: re0  5508108    205
  irq258: ahci0 160717  6
  cpu1: timer 53525300   1999
  cpu2: timer 53525300   1999
  cpu3: timer 53525296   1999
  Total 3614622447 134999
 


I did just update the BIOS at about the same time so the difference may be
due to that change.

I'll try a few things such as:

- Unplugging any USB things (I've only got a keyboard plugged in).
- Downgrade BIOS.

I'll get back to you all soon.


Re: 157k interrupts per second causing 60% CPU load on idle system

2012-03-20 Thread Gary Palmer
On Tue, Mar 20, 2012 at 11:10:10PM +1030, Matt Thyer wrote:
 On 20 March 2012 22:24, Ivan Voras ivo...@freebsd.org wrote:
 
  On 20 March 2012 12:52, Matt Thyer matt.th...@gmail.com wrote:
   On 20 March 2012 21:12, Ivan Voras ivo...@freebsd.org wrote:
  
   On 20/03/2012 06:26, Matt Thyer wrote:
I've upgraded my FreeBSD-STABLE NAS from r225723 (22nd Sept 2011) to
r232477 (4th Mar 2012) and am finding that a system process called
intr
is now constantly using about 60% of 1 CPU starting a short time after
reboot (possibly triggered by use of the samba server).
   
When this starts, systat -vm 1 says that the system is 85% idle and
  14%
interrupt handling.
It says that there's around 157k interrupts per second.
   
  
   Ok, but *which* interrupt is getting triggered? Please send the output
   of vmstat -i.
  
  
   interrupt  total   rate
   irq16: uhci0+ 3392184862 126692
 
  Ok, something's probably wrong with USB. Can you disable it in BIOS?
 
 
   cpu0: timer 53549677   1999
   irq256: mps0 2643187 98
   irq257: re0  5508108    205
   irq258: ahci0 160717  6
   cpu1: timer 53525300   1999
   cpu2: timer 53525300   1999
   cpu3: timer 53525296   1999
   Total 3614622447 134999
  
 
 
 I did just update the BIOS at about the same time so the difference may be
 due to that change.
 
 I'll try a few things such as:
 
 - Unplugging any USB things (I've only got a keyboard plugged in).
 - Downgrade BIOS.
 
 I'll get back to you all soon.

It would be interesting to know if there are other devices on irq16 also.

grep 'irq 16' /var/run/dmesg.boot

I think the '+' on the irq16 line from vmstat means the interrupt is
shared, but the man page doesn't mention it so I'm not 100% sure

Thanks,

Gary


Re: 157k interrupts per second causing 60% CPU load on idle system

2012-03-20 Thread Matt Thyer
On 21 March 2012 00:03, Gary Palmer gpal...@freebsd.org wrote:

 On Tue, Mar 20, 2012 at 11:10:10PM +1030, Matt Thyer wrote:
  On 20 March 2012 22:24, Ivan Voras ivo...@freebsd.org wrote:
 
   On 20 March 2012 12:52, Matt Thyer matt.th...@gmail.com wrote:
On 20 March 2012 21:12, Ivan Voras ivo...@freebsd.org wrote:
   
On 20/03/2012 06:26, Matt Thyer wrote:
 I've upgraded my FreeBSD-STABLE NAS from r225723 (22nd Sept 2011) to
 r232477 (4th Mar 2012) and am finding that a system process called intr
 is now constantly using about 60% of 1 CPU starting a short time after
 reboot (possibly triggered by use of the samba server).

 When this starts, systat -vm 1 says that the system is 85% idle and 14%
 interrupt handling.
 It says that there's around 157k interrupts per second.

   
Ok, but *which* interrupt is getting triggered? Please send the output
of vmstat -i.
   
   
interrupt          total   rate
irq16: uhci0+ 3392184862 126692

   Ok, something's probably wrong with USB. Can you disable it in BIOS?


cpu0: timer     53549677   1999
irq256: mps0     2643187     98
irq257: re0      5508108    205
irq258: ahci0     160717      6
cpu1: timer     53525300   1999
cpu2: timer     53525300   1999
cpu3: timer     53525296   1999
Total         3614622447 134999
   
  
 
  I did just update the BIOS at about the same time so the difference may be
  due to that change.
 
  I'll try a few things such as:
 
  - Unplugging any USB things (I've only got a keyboard plugged in).
  - Downgrade BIOS.
 
  I'll get back to you all soon.

 It would be interesting to know if there are other devices on irq16 also.

 grep 'irq 16' /var/run/dmesg.boot

 I think the '+' on the irq16 line from vmstat means the interrupt is
 shared, but the man page doesn't mention it so I'm not 100% sure

 Thanks,

 Gary


Good point...

pcib1: PCI-PCI bridge irq 16 at device 1.0 on pci0
mps0: LSI SAS2008 port 0xee00-0xeeff mem 0xfbdfc000-0xfbdf,0xfbd8-0xfbdb irq 16 at device 0.0 on pci1
vgapci0: VGA-compatible display port 0xff00-0xff07 mem 0xfb40-0xfb7f,0xe000-0xefff irq 16 at device 2.0 on pci0
uhci0: UHCI (generic) USB controller port 0xfe00-0xfe1f irq 16 at device 26.0 on pci0
pcib2: ACPI PCI-PCI bridge irq 16 at device 28.0 on pci0
pcib3: ACPI PCI-PCI bridge irq 16 at device 28.4 on pci0
atapci0: JMicron JMB368 UDMA133 controller port 0xdf00-0xdf07,0xde00-0xde03,0xdd00-0xdd07,0xdc00-0xdc03,0xdb00-0xdb0f irq 16 at device 0.0 on pci3
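
For a longer listing, the same grouping can be done mechanically. Below is a rough Python sketch that collects device names per IRQ from text like the above. It assumes the input has already been filtered to lines mentioning an irq (as the grep does) and it tolerates entries that the mailer wrapped across lines. The function name and regular expression are my own, not part of any FreeBSD tool.

```python
import re
from collections import defaultdict

# Three of the entries above, including the mailer's line wrapping.
SAMPLE = """\
pcib1: PCI-PCI bridge irq 16 at device 1.0 on pci0
mps0: LSI SAS2008 port 0xee00-0xeeff mem
0xfbdfc000-0xfbdf,0xfbd8-0xfbdb irq 16 at device 0.0 on pci1
uhci0: UHCI (generic) USB controller port 0xfe00-0xfe1f irq 16 at
device 26.0 on pci0
"""

# driver name + unit, description, irq number, device slot, parent bus
ENTRY = re.compile(r"([a-z]+\d+): (.*?) irq (\d+) at device (\S+) on (\S+)")

def devices_by_irq(dmesg_text):
    """Return {irq: [device, ...]} in the order the devices appear."""
    flat = " ".join(dmesg_text.split())  # undo line wrapping
    groups = defaultdict(list)
    for match in ENTRY.finditer(flat):
        groups[int(match.group(3))].append(match.group(1))
    return dict(groups)

if __name__ == "__main__":
    for irq, names in devices_by_irq(SAMPLE).items():
        print(f"irq {irq}: {', '.join(names)}")
```

Because the pattern is non-greedy and runs over flattened text, an entry without an "irq N at device" part would be merged into the next one, which is why the pre-filtering assumption matters.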

I'd suspect the SAS/SATA HBA using the mps0 driver as that's where I have
the raidz2 on 8 drives.

Is this still the old driver, or has the new LSI-authored driver been added
to -STABLE yet?


157k interrupts per second causing 60% CPU load on idle system

2012-03-19 Thread Matt Thyer
I've upgraded my FreeBSD-STABLE NAS from r225723 (22nd Sept 2011) to
r232477 (4th Mar 2012) and am finding that a system process called intr
is now constantly using about 60% of 1 CPU starting a short time after
reboot (possibly triggered by use of the samba server).

When this starts, systat -vm 1 says that the system is 85% idle and 14%
interrupt handling.
It says that there's around 157k interrupts per second.

After a reboot the system is back to its normal state, doing between 3 and
250 or so interrupts per second.

The hardware is an Intel Core i3-530 (dual core @ 2.93 GHz with
Hyperthreading) with 8 GB RAM (2x4GB) on a Gigabyte H55M-D2H rev 1.3
motherboard running the latest BIOS (F4).

The system runs a GENERIC kernel with the following significant items in
/boot/loader.conf:

zfs_load=YES
aio_load=YES
ahci_load=YES
geom_mirror_load=YES
vfs.root.mountfrom=zfs:zroot
vboxdrv_load=YES

It has 2 x 300 GB disks for the system, with GPT partitioning and a ZFS
mirror for the OS à la http://wiki.freebsd.org/RootOnZFS/GPTZFSBoot/Mirror
I have swap on a gmirror as I want swap to survive the loss of one system
disk.

The NAS data is on a raidz2 pool of 8 disks connected to a SuperMicro
AOC-USAS2-L8i (flashed to behave as an AOC-USAS2-L8e).

The system is basically a CIFS NAS with ports/net/samba36 built with
AIO_SUPPORT and configured like:

   socket options = SO_RCVBUF=131072 SO_SNDBUF=131072 TCP_NODELAY
   min receivefile size=16384
   use sendfile=true
   aio read size = 16384
   aio write size = 16384
   aio write behind = true

The only other interesting workload on the box is a Java Minecraft server
using ports/java/jdk16.

I'm going to try to reproduce the problem in a VM and binary search down to
the revision where it started as soon as I can work out a reliable way to
trigger the behaviour (as it doesn't start at boot time).
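
The binary search itself would look roughly like the Python sketch below. It assumes r225723 is good, r232477 is bad, and that every revision after the culprit also misbehaves. The is_bad callback is hypothetical: it stands in for "build that revision, boot it, and check the interrupt rate", which is the part I can't do reliably yet. The culprit revision in the demo is made up purely to exercise the search.

```python
def first_bad_revision(good, bad, is_bad):
    """Return the lowest revision in (good, bad] for which is_bad() is true.

    Assumes is_bad(good) is False, is_bad(bad) is True, and that the
    answer flips exactly once somewhere in between.
    """
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(mid):
            bad = mid   # culprit is at mid or earlier
        else:
            good = mid  # culprit is after mid
    return bad

if __name__ == "__main__":
    GOOD, BAD = 225723, 232477   # revisions from this report
    PRETEND_CULPRIT = 230000     # made-up value for the demo only
    tested = []

    def fake_is_bad(rev):
        tested.append(rev)
        return rev >= PRETEND_CULPRIT

    found = first_bad_revision(GOOD, BAD, fake_is_bad)
    print(f"first bad revision: r{found} after {len(tested)} builds")
```

With 6754 revisions between the two endpoints this needs at most 13 builds. Subversion revision numbers are repository-wide, so many midpoints will not touch the stable branch at all and can be treated as identical to the nearest earlier branch commit.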

Any idea what could be the cause ?