Bug#696059: linux: PATCH required for server interrupt load balancing/irqbalance (tested)

2013-04-19 Thread Holger Lüdecke

Dear maintainer,

I must agree with Henrique de Moraes Holschuh and Sebastian Kutsch.

I've tried the patch from Henrique for two months on a 3.2.35 kernel and it runs smoothly without any
problems. I use irqbalance (1.0.3-3) with the --powerthresh option.

Please include this patch in the next wheezy kernel.


Kind regards

Holger Lüdecke






Bug#696059: linux: PATCH required for server interrupt load balancing/irqbalance (tested)

2013-04-17 Thread Faustin Lammler
Dear maintainer,

I must agree with Sebastian Kutsch.

irqbalance may not be the best way to handle interrupt problems, but it
is doing its job quite well on a squeeze system (3.2.0-0.bpo.4-amd64).

Regards,
Faustin LAMMLER
http://www.falared.net


Bug#696059: linux: PATCH required for server interrupt load balancing/irqbalance (tested)

2013-02-08 Thread Sebastian Kutsch
Package: src:linux
Version: 3.2.35-2
Followup-For: Bug #696059

Dear Maintainer,

irqbalance keeps all ethX interrupts on cpu0. This is a major problem,
e.g. on NAT routers. I don't see why this bug is marked as resolved.
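
A quick way to check for this symptom is to compare the per-CPU counters in /proc/interrupts for the ethX rows. The following is a minimal sketch in C and not part of the original report: the function name parse_irq_counts and the sample row in the comment are made up for illustration, and it assumes the usual row layout of an IRQ number, a colon, and one counter per online CPU.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

/*
 * Parse one data row of /proc/interrupts, for example:
 *   " 53:     123456          0   PCI-MSI-edge      eth0-rx-0"
 * Stores the IRQ number in *irq and up to max_cpus per-CPU counters in
 * counts[]. Returns the number of counters parsed, or -1 for rows that
 * do not start with a numeric IRQ (header line, NMI, LOC, ...).
 */
static int parse_irq_counts(const char *line, int *irq,
			    unsigned long *counts, int max_cpus)
{
	char *end;
	long n;
	int ncpu = 0;

	assert(line != NULL && irq != NULL && counts != NULL);

	n = strtol(line, &end, 10);	/* skips leading blanks itself */
	if (end == line || *end != ':')
		return -1;
	*irq = (int)n;
	line = end + 1;

	while (ncpu < max_cpus) {
		unsigned long v;

		while (*line == ' ' || *line == '\t')
			line++;
		if (!isdigit((unsigned char)*line))
			break;		/* reached the chip/driver name */
		v = strtoul(line, &end, 10);
		if (*end != ' ' && *end != '\t' && *end != '\n' && *end != '\0')
			break;		/* token only starts with a digit */
		counts[ncpu++] = v;
		line = end;
	}
	return ncpu;
}
```

If every ethX row has its whole count in the first column and zeros elsewhere, the interrupts are staying on cpu0.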

Regards


-- Package-specific info:
** Version:
Linux version 3.2.0-4-amd64 (debian-kernel@lists.debian.org) (gcc
version 4.6.3 (Debian 4.6.3-14) ) #1 SMP Debian 3.2.35-2

** Command line:
BOOT_IMAGE=/boot/vmlinuz-3.2.0-4-amd64
root=UUID=ef90a883-ea1f-476f-9ad3-4db5010a70c1 ro quiet intel_iommu=on
elevator=deadline pci=assign-busses

** Not tainted

** Kernel log:
[  845.126052] pci-stub :02:10.0: irq 53 for MSI/MSI-X
[  845.126061] pci-stub :02:10.0: irq 54 for MSI/MSI-X
[  845.225735] pci-stub :02:10.0: irq 53 for MSI/MSI-X
[  845.225744] pci-stub :02:10.0: irq 54 for MSI/MSI-X
[  845.225751] pci-stub :02:10.0: irq 55 for MSI/MSI-X
[  845.353422] pci-stub :02:10.1: irq 56 for MSI/MSI-X
[  845.397181] pci-stub :02:10.1: irq 56 for MSI/MSI-X
[  845.397186] pci-stub :02:10.1: irq 57 for MSI/MSI-X
[  845.461093] pci-stub :02:10.1: irq 56 for MSI/MSI-X
[  845.461103] pci-stub :02:10.1: irq 57 for MSI/MSI-X
[  845.461110] pci-stub :02:10.1: irq 58 for MSI/MSI-X
[  901.244235] pci-stub :02:10.0: claimed by stub
[  901.633968] pci-stub :02:10.0: enabling device ( - 0002)
[  901.925235] assign device 0:2:10.0
[  901.925853] pci-stub :02:10.1: enabling device ( - 0002)
[  902.027209] assign device 0:2:10.1
[  910.375456] pci-stub :02:10.0: irq 53 for MSI/MSI-X
[  910.439130] pci-stub :02:10.0: irq 53 for MSI/MSI-X
[  910.439139] pci-stub :02:10.0: irq 54 for MSI/MSI-X
[  910.506952] pci-stub :02:10.0: irq 53 for MSI/MSI-X
[  910.506961] pci-stub :02:10.0: irq 54 for MSI/MSI-X
[  910.506968] pci-stub :02:10.0: irq 55 for MSI/MSI-X
[  910.642526] pci-stub :02:10.1: irq 56 for MSI/MSI-X
[  910.706364] pci-stub :02:10.1: irq 56 for MSI/MSI-X
[  910.706373] pci-stub :02:10.1: irq 57 for MSI/MSI-X
[  910.746285] pci-stub :02:10.1: irq 56 for MSI/MSI-X
[  910.746297] pci-stub :02:10.1: irq 57 for MSI/MSI-X
[  910.746304] pci-stub :02:10.1: irq 58 for MSI/MSI-X
[ 1131.883497] pci-stub :02:10.0: claimed by stub
[ 1132.221370] device vnet0 entered promiscuous mode
[ 1132.278792] br0: port 2(vnet0) entering forwarding state
[ 1132.278799] br0: port 2(vnet0) entering forwarding state
[ 1132.559756] pci-stub :02:10.0: enabling device ( - 0002)
[ 1132.851150] assign device 0:2:10.0
[ 1132.851819] pci-stub :02:10.1: enabling device ( - 0002)
[ 1132.955315] assign device 0:2:10.1
[ 1142.324399] vnet0: no IPv6 routers present
[ 1145.487791] pci-stub :02:10.0: irq 53 for MSI/MSI-X
[ 1145.555309] pci-stub :02:10.0: irq 53 for MSI/MSI-X
[ 1145.555318] pci-stub :02:10.0: irq 54 for MSI/MSI-X
[ 1145.623130] pci-stub :02:10.0: irq 53 for MSI/MSI-X
[ 1145.623134] pci-stub :02:10.0: irq 54 for MSI/MSI-X
[ 1145.623149] pci-stub :02:10.0: irq 55 for MSI/MSI-X
[ 1145.701110] pci-stub :02:10.1: irq 56 for MSI/MSI-X
[ 1145.738755] pci-stub :02:10.1: irq 56 for MSI/MSI-X
[ 1145.738765] pci-stub :02:10.1: irq 57 for MSI/MSI-X
[ 1145.802598] pci-stub :02:10.1: irq 56 for MSI/MSI-X
[ 1145.802608] pci-stub :02:10.1: irq 57 for MSI/MSI-X
[ 1145.802618] pci-stub :02:10.1: irq 58 for MSI/MSI-X
[ 1147.238369] br0: port 2(vnet0) entering forwarding state
[ 1663.558960] br0: port 2(vnet0) entering forwarding state
[ 1663.562002] br0: port 2(vnet0) entering disabled state
[ 1663.562105] device vnet0 left promiscuous mode
[ 1663.562109] br0: port 2(vnet0) entering disabled state
[ 1678.425223] Intel(R) Gigabit Virtual Function Network Driver - version 2.0.1-k
[ 1678.425227] Copyright (c) 2009 - 2011 Intel Corporation.
[ 1678.425252] igbvf :02:10.0: enabling device ( - 0002)
[ 1678.425265] igbvf :02:10.0: setting latency timer to 64
[ 1678.425327] igbvf :02:10.0: irq 53 for MSI/MSI-X
[ 1678.425335] igbvf :02:10.0: irq 54 for MSI/MSI-X
[ 1678.425341] igbvf :02:10.0: irq 55 for MSI/MSI-X
[ 1678.426502] igbvf :02:10.0: PF still in reset state, assigning new address. Is the PF interface up?
[ 1678.427675] igbvf :02:10.0: PF still resetting
[ 1678.428325] igbvf :02:10.0: Intel(R) 82576 Virtual Function
[ 1678.428328] igbvf :02:10.0: Address: 4e:44:7b:fc:4e:cc
[ 1727.515523] pci-stub :02:10.0: claimed by stub
[ 1727.918240] device vnet0 entered promiscuous mode
[ 1727.983639] br0: port 2(vnet0) entering forwarding state
[ 1727.983645] br0: port 2(vnet0) entering forwarding state
[ 1728.203550] pci-stub :02:10.0: enabling device ( - 0002)
[ 1728.499360] assign device 0:2:10.0
[ 1728.499975] pci-stub :02:10.1: enabling device ( - 0002)
[ 1728.600370] assign device 0:2:10.1
[ 1737.100224] pci-stub :02:10.0: irq 53 for MSI/MSI-X
[ 1737.136216] pci-stub :02:10.0: irq 53 for MSI/MSI-X
[ 1737.136226] pci-stub 

Bug#696059: linux: PATCH required for server interrupt load balancing/irqbalance (tested)

2012-12-17 Thread Henrique de Moraes Holschuh
On Sun, 16 Dec 2012, Ben Hutchings wrote:
> On Sun, 2012-12-16 at 11:24 -0200, Henrique de Moraes Holschuh wrote:
> > Package: linux
> > Version: 3.2.35-1
> > Severity: important
> > Tags: patch
> >
> > Please include the attached patch in Wheezy, without it, irqbalance fails to
> > do its job properly on any server (actually any multi-core system with
> > MSI/MSI-X irqs).
>
> > It is important to have irqbalance work properly out-of-the-box, since it is
> > the only trivial way to get better network/storage behaviour out of the
> > MSI-X capable NUMA systems that are 95% of the post-2010 server market.
>
> Would be nice, but it has been broken for so long that 'everyone knows'
> to disable irqbalance.

Yeah, and nobody knows how to use hwloc either, so they probably leave it at
whatever the kernel/BIOS/EFI default mapping is.  And since the kernel
doesn't irqbalance by itself anymore, it will either be round-robin if
you're lucky, or all-in-the-first-core if you're unlucky...
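
For reference, a manual mapping is normally set by writing a hexadecimal CPU bitmask to /proc/irq/N/smp_affinity. Below is a minimal sketch of building that mask for a single CPU. The helper name cpu_affinity_mask is hypothetical, and as a simplifying assumption it only handles CPUs 0 to 31 so the result stays one 32-bit hex word.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Format the affinity bitmask that selects exactly one CPU, as a hex
 * string suitable for /proc/irq/N/smp_affinity.
 * Returns 0 on success, -1 if cpu is out of the supported range or the
 * buffer is too small.
 */
static int cpu_affinity_mask(unsigned int cpu, char *buf, size_t len)
{
	int n;

	assert(buf != NULL);

	if (cpu > 31)
		return -1;	/* sketch limit: one 32-bit word only */

	n = snprintf(buf, len, "%x", 1u << cpu);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

Writing the resulting string to the smp_affinity file of a given IRQ (as root) pins that IRQ by hand, which is the per-IRQ work irqbalance is meant to automate.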

> > I've tested the attached patch on stock (kernel.org) 3.2.34 on production,
> > and it works fine.  The patch is very simple, it just publishes the MSI
> > IRQ vector information to sysfs, which irqbalance uses.
>
> Changes ABI, so will have to wait if we apply it at all.

It is a new ABI, actually, so it has no ill effects on existing
applications.  And this new ABI is already stable, too.

> > git commit upstream: b50cac55bf859d5b2fdcc1803a553a251b703456
> >
> > Alternatively, we might want to add it to -stable series upstream, on the
> > grounds that it is widely desired functionality (i.e. useful for all
> > distros).
>
> This doesn't suddenly become urgent because you just noticed it.

I don't recall claiming urgency anywhere, unless you mean the request
that it should be added to the Wheezy kernel.

-- 
  One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie. -- The Silicon Valley Tarot
  Henrique Holschuh





Bug#696059: linux: PATCH required for server interrupt load balancing/irqbalance (tested)

2012-12-16 Thread Henrique de Moraes Holschuh
Package: linux
Version: 3.2.35-1
Severity: important
Tags: patch

Please include the attached patch in Wheezy, without it, irqbalance fails to
do its job properly on any server (actually any multi-core system with
MSI/MSI-X irqs).

It is important to have irqbalance work properly out-of-the-box, since it is
the only trivial way to get better network/storage behaviour out of the
MSI-X capable NUMA systems that are 95% of the post-2010 server market.

I've tested the attached patch on stock (kernel.org) 3.2.34 on production,
and it works fine.  The patch is very simple, it just publishes the MSI
IRQ vector information to sysfs, which irqbalance uses.
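
As a rough illustration of what this makes visible to user space, the sketch below builds the path of a per-vector mode attribute and reads it. It assumes the layout documented in the attached patch (msi_irqs/N/mode). The helper names, the sysfs_root parameter and the PCI address in the usage note are illustrative assumptions, and this is not a description of how irqbalance itself is implemented.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Build "<sysfs_root>/bus/pci/devices/<pci_addr>/msi_irqs/<irq>/mode".
 * sysfs_root is normally "/sys". Returns 0 on success, -1 if buf is
 * too small.
 */
static int msi_mode_path(char *buf, size_t len, const char *sysfs_root,
			 const char *pci_addr, int irq)
{
	int n;

	assert(buf != NULL && sysfs_root != NULL && pci_addr != NULL);

	n = snprintf(buf, len, "%s/bus/pci/devices/%s/msi_irqs/%d/mode",
		     sysfs_root, pci_addr, irq);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Read the mode attribute ("msi" or "msix") into mode[] and strip the
 * trailing newline. Returns 0 on success, -1 if the file cannot be
 * read, e.g. on a kernel without the patch or for a non-MSI device.
 */
static int read_msi_mode(const char *path, char *mode, size_t len)
{
	FILE *f;

	assert(path != NULL && mode != NULL && len > 0);

	f = fopen(path, "r");
	if (!f)
		return -1;
	if (!fgets(mode, (int)len, f)) {
		fclose(f);
		return -1;
	}
	fclose(f);
	mode[strcspn(mode, "\n")] = '\0';
	return 0;
}
```

For a hypothetical device at 0000:02:10.0 with vector 53, msi_mode_path() yields /sys/bus/pci/devices/0000:02:10.0/msi_irqs/53/mode, and read_msi_mode() on that path would return "msi" or "msix" on a patched kernel.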

git commit upstream: b50cac55bf859d5b2fdcc1803a553a251b703456

Alternatively, we might want to add it to -stable series upstream, on the
grounds that it is widely desired functionality (i.e. useful for all
distros).

-- System Information:
Debian Release: wheezy/sid
  APT prefers testing
  APT policy: (990, 'testing'), (500, 'testing-proposed-updates'), (500, 'unstable'), (1, 'experimental')
Architecture: amd64 (x86_64)
Foreign Architectures: i386

Kernel: Linux 3.2.35+ (SMP w/8 CPU cores)
Locale: LANG=pt_BR.UTF-8, LC_CTYPE=pt_BR.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash

-- 
  One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie. -- The Silicon Valley Tarot
  Henrique Holschuh
From b50cac55bf859d5b2fdcc1803a553a251b703456 Mon Sep 17 00:00:00 2001
From: Neil Horman nhor...@tuxdriver.com
Date: Thu, 6 Oct 2011 14:08:18 -0400
Subject: [PATCH] PCI/sysfs: add per pci device msi[x] irq listing (v5)

This patch adds a per-pci-device subdirectory in sysfs called:
/sys/bus/pci/devices/<device>/msi_irqs

This sub-directory exports the set of msi vectors allocated by a given
pci device, by creating a numbered sub-directory for each vector beneath
msi_irqs.  For each vector various attributes can be exported.
Currently the only attribute is called mode, which tracks the
operational mode of that vector (msi vs. msix)

Acked-by: Greg Kroah-Hartman gre...@suse.de
Signed-off-by: Jesse Barnes jbar...@virtuousgeek.org
---
 Documentation/ABI/testing/sysfs-bus-pci |   18 +
 drivers/pci/msi.c                       |  111 +++
 include/linux/msi.h                     |    3 +
 include/linux/pci.h                     |    1 +
 4 files changed, 133 insertions(+)

diff --git a/Documentation/ABI/testing/sysfs-bus-pci b/Documentation/ABI/testing/sysfs-bus-pci
index 349ecf2..34f5110 100644
--- a/Documentation/ABI/testing/sysfs-bus-pci
+++ b/Documentation/ABI/testing/sysfs-bus-pci
@@ -66,6 +66,24 @@ Description:
 		re-discover previously removed devices.
 		Depends on CONFIG_HOTPLUG.
 
+What:		/sys/bus/pci/devices/.../msi_irqs/
+Date:		September, 2011
+Contact:	Neil Horman nhor...@tuxdriver.com
+Description:
+		The /sys/devices/.../msi_irqs directory contains a variable set
+		of sub-directories, with each sub-directory being named after a
+		corresponding msi irq vector allocated to that device.  Each
+		numbered sub-directory N contains attributes of that irq.
+		Note that this directory is not created for device drivers which
+		do not support msi irqs
+
+What:		/sys/bus/pci/devices/.../msi_irqs/N/mode
+Date:		September 2011
+Contact:	Neil Horman nhor...@tuxdriver.com
+Description:
+		This attribute indicates the mode that the irq vector named by
+		the parent directory is in (msi vs. msix)
+
 What:		/sys/bus/pci/devices/.../remove
 Date:		January 2009
 Contact:	Linux PCI developers linux-...@vger.kernel.org
diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
index 0e6d04d..e6b6b9c 100644
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -323,6 +323,8 @@ static void free_msi_irqs(struct pci_dev *dev)
 			if (list_is_last(&entry->list, &dev->msi_list))
 				iounmap(entry->mask_base);
 		}
+		kobject_del(&entry->kobj);
+		kobject_put(&entry->kobj);
 		list_del(&entry->list);
 		kfree(entry);
 	}
@@ -403,6 +405,98 @@ void pci_restore_msi_state(struct pci_dev *dev)
 }
 EXPORT_SYMBOL_GPL(pci_restore_msi_state);
 
+
+#define to_msi_attr(obj) container_of(obj, struct msi_attribute, attr)
+#define to_msi_desc(obj) container_of(obj, struct msi_desc, kobj)
+
+struct msi_attribute {
+	struct attribute	attr;
+	ssize_t (*show)(struct msi_desc *entry, struct msi_attribute *attr,
+			char *buf);
+	ssize_t (*store)(struct msi_desc *entry, struct msi_attribute *attr,
+			 const char *buf, size_t count);
+};
+
+static ssize_t show_msi_mode(struct msi_desc *entry, struct msi_attribute *atr,
+			     char *buf)
+{
+	return sprintf(buf, "%s\n", entry->msi_attrib.is_msix ? "msix" : "msi");
+}
+
+static ssize_t msi_irq_attr_show(struct kobject *kobj,
+				 struct attribute *attr, char *buf)
+{
+	struct msi_attribute *attribute = to_msi_attr(attr);
+	struct msi_desc *entry = to_msi_desc(kobj);
+
+	if (!attribute->show)
+		return -EIO;
+
+	return attribute->show(entry, 

Bug#696059: linux: PATCH required for server interrupt load balancing/irqbalance (tested)

2012-12-16 Thread Ben Hutchings
On Sun, 2012-12-16 at 11:24 -0200, Henrique de Moraes Holschuh wrote:
> Package: linux
> Version: 3.2.35-1
> Severity: important
> Tags: patch
>
> Please include the attached patch in Wheezy, without it, irqbalance fails to
> do its job properly on any server (actually any multi-core system with
> MSI/MSI-X irqs).

> It is important to have irqbalance work properly out-of-the-box, since it is
> the only trivial way to get better network/storage behaviour out of the
> MSI-X capable NUMA systems that are 95% of the post-2010 server market.

Would be nice, but it has been broken for so long that 'everyone knows'
to disable irqbalance.

> I've tested the attached patch on stock (kernel.org) 3.2.34 on production,
> and it works fine.  The patch is very simple, it just publishes the MSI
> IRQ vector information to sysfs, which irqbalance uses.

Changes ABI, so will have to wait if we apply it at all.

> git commit upstream: b50cac55bf859d5b2fdcc1803a553a251b703456
>
> Alternatively, we might want to add it to -stable series upstream, on the
> grounds that it is widely desired functionality (i.e. useful for all
> distros).

This doesn't suddenly become urgent because you just noticed it.

Ben.

-- 
Ben Hutchings
Always try to do things in chronological order;
it's less confusing that way.

