Bug#696059: linux: PATCH required for server interrupt load balancing/irqbalance (tested)
Dear maintainer,

I must agree with Henrique de Moraes Holschuh and Sebastian Kutsch. I have run Henrique's patch for two months on a 3.2.35 kernel, and it runs smoothly without any problems. I use irqbalance (1.0.3-3) with the --powerthresh option.

Please include this patch in the next wheezy kernel.

Kind regards,
Holger Lüdecke
Bug#696059: linux: PATCH required for server interrupt load balancing/irqbalance (tested)
Dear maintainer,

I must agree with Sebastian Kutsch. irqbalance may not be the best way to handle interrupt problems, but it is doing its job quite well on a squeeze system (3.2.0-0.bpo.4-amd64).

Regards,
Faustin LAMMLER
http://www.falared.net
Bug#696059: linux: PATCH required for server interrupt load balancing/irqbalance (tested)
Package: src:linux
Version: 3.2.35-2
Followup-For: Bug #696059

Dear Maintainer,

irqbalance keeps all ethX interrupts to cpu0. This is a major problem e.g. in NAT routers. Don't see why this bug is marked as resolved.

Regards

-- Package-specific info:
** Version:
Linux version 3.2.0-4-amd64 (debian-ker...@lists.debian.org) (gcc version 4.6.3 (Debian 4.6.3-14) ) #1 SMP Debian 3.2.35-2

** Command line:
BOOT_IMAGE=/boot/vmlinuz-3.2.0-4-amd64 root=UUID=ef90a883-ea1f-476f-9ad3-4db5010a70c1 ro quiet intel_iommu=on elevator=deadline pci=assign-busses

** Not tainted

** Kernel log:
[ 845.126052] pci-stub :02:10.0: irq 53 for MSI/MSI-X
[ 845.126061] pci-stub :02:10.0: irq 54 for MSI/MSI-X
[ 845.225735] pci-stub :02:10.0: irq 53 for MSI/MSI-X
[ 845.225744] pci-stub :02:10.0: irq 54 for MSI/MSI-X
[ 845.225751] pci-stub :02:10.0: irq 55 for MSI/MSI-X
[ 845.353422] pci-stub :02:10.1: irq 56 for MSI/MSI-X
[ 845.397181] pci-stub :02:10.1: irq 56 for MSI/MSI-X
[ 845.397186] pci-stub :02:10.1: irq 57 for MSI/MSI-X
[ 845.461093] pci-stub :02:10.1: irq 56 for MSI/MSI-X
[ 845.461103] pci-stub :02:10.1: irq 57 for MSI/MSI-X
[ 845.461110] pci-stub :02:10.1: irq 58 for MSI/MSI-X
[ 901.244235] pci-stub :02:10.0: claimed by stub
[ 901.633968] pci-stub :02:10.0: enabling device ( - 0002)
[ 901.925235] assign device 0:2:10.0
[ 901.925853] pci-stub :02:10.1: enabling device ( - 0002)
[ 902.027209] assign device 0:2:10.1
[ 910.375456] pci-stub :02:10.0: irq 53 for MSI/MSI-X
[ 910.439130] pci-stub :02:10.0: irq 53 for MSI/MSI-X
[ 910.439139] pci-stub :02:10.0: irq 54 for MSI/MSI-X
[ 910.506952] pci-stub :02:10.0: irq 53 for MSI/MSI-X
[ 910.506961] pci-stub :02:10.0: irq 54 for MSI/MSI-X
[ 910.506968] pci-stub :02:10.0: irq 55 for MSI/MSI-X
[ 910.642526] pci-stub :02:10.1: irq 56 for MSI/MSI-X
[ 910.706364] pci-stub :02:10.1: irq 56 for MSI/MSI-X
[ 910.706373] pci-stub :02:10.1: irq 57 for MSI/MSI-X
[ 910.746285] pci-stub :02:10.1: irq 56 for MSI/MSI-X
[ 910.746297] pci-stub :02:10.1: irq 57 for MSI/MSI-X
[ 910.746304] pci-stub :02:10.1: irq 58 for MSI/MSI-X
[ 1131.883497] pci-stub :02:10.0: claimed by stub
[ 1132.221370] device vnet0 entered promiscuous mode
[ 1132.278792] br0: port 2(vnet0) entering forwarding state
[ 1132.278799] br0: port 2(vnet0) entering forwarding state
[ 1132.559756] pci-stub :02:10.0: enabling device ( - 0002)
[ 1132.851150] assign device 0:2:10.0
[ 1132.851819] pci-stub :02:10.1: enabling device ( - 0002)
[ 1132.955315] assign device 0:2:10.1
[ 1142.324399] vnet0: no IPv6 routers present
[ 1145.487791] pci-stub :02:10.0: irq 53 for MSI/MSI-X
[ 1145.555309] pci-stub :02:10.0: irq 53 for MSI/MSI-X
[ 1145.555318] pci-stub :02:10.0: irq 54 for MSI/MSI-X
[ 1145.623130] pci-stub :02:10.0: irq 53 for MSI/MSI-X
[ 1145.623134] pci-stub :02:10.0: irq 54 for MSI/MSI-X
[ 1145.623149] pci-stub :02:10.0: irq 55 for MSI/MSI-X
[ 1145.701110] pci-stub :02:10.1: irq 56 for MSI/MSI-X
[ 1145.738755] pci-stub :02:10.1: irq 56 for MSI/MSI-X
[ 1145.738765] pci-stub :02:10.1: irq 57 for MSI/MSI-X
[ 1145.802598] pci-stub :02:10.1: irq 56 for MSI/MSI-X
[ 1145.802608] pci-stub :02:10.1: irq 57 for MSI/MSI-X
[ 1145.802618] pci-stub :02:10.1: irq 58 for MSI/MSI-X
[ 1147.238369] br0: port 2(vnet0) entering forwarding state
[ 1663.558960] br0: port 2(vnet0) entering forwarding state
[ 1663.562002] br0: port 2(vnet0) entering disabled state
[ 1663.562105] device vnet0 left promiscuous mode
[ 1663.562109] br0: port 2(vnet0) entering disabled state
[ 1678.425223] Intel(R) Gigabit Virtual Function Network Driver - version 2.0.1-k
[ 1678.425227] Copyright (c) 2009 - 2011 Intel Corporation.
[ 1678.425252] igbvf :02:10.0: enabling device ( - 0002)
[ 1678.425265] igbvf :02:10.0: setting latency timer to 64
[ 1678.425327] igbvf :02:10.0: irq 53 for MSI/MSI-X
[ 1678.425335] igbvf :02:10.0: irq 54 for MSI/MSI-X
[ 1678.425341] igbvf :02:10.0: irq 55 for MSI/MSI-X
[ 1678.426502] igbvf :02:10.0: PF still in reset state, assigning new address. Is the PF interface up?
[ 1678.427675] igbvf :02:10.0: PF still resetting
[ 1678.428325] igbvf :02:10.0: Intel(R) 82576 Virtual Function
[ 1678.428328] igbvf :02:10.0: Address: 4e:44:7b:fc:4e:cc
[ 1727.515523] pci-stub :02:10.0: claimed by stub
[ 1727.918240] device vnet0 entered promiscuous mode
[ 1727.983639] br0: port 2(vnet0) entering forwarding state
[ 1727.983645] br0: port 2(vnet0) entering forwarding state
[ 1728.203550] pci-stub :02:10.0: enabling device ( - 0002)
[ 1728.499360] assign device 0:2:10.0
[ 1728.499975] pci-stub :02:10.1: enabling device ( - 0002)
[ 1728.600370] assign device 0:2:10.1
[ 1737.100224] pci-stub :02:10.0: irq 53 for MSI/MSI-X
[ 1737.136216] pci-stub :02:10.0: irq 53 for MSI/MSI-X
[ 1737.136226] pci-stub
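A complaint such as "all ethX interrupts stay on cpu0" can be checked from the per-CPU counters in /proc/interrupts. The following is only a minimal sketch, not something taken from the report: the helper names parse_proc_interrupts and busiest_cpu_share are hypothetical, and the parser assumes the usual layout of a CPU header row followed by one "label: counts... description" row per interrupt.

```python
def parse_proc_interrupts(text):
    """Return {irq_label: (per_cpu_counts, description)} from /proc/interrupts text.

    Assumes the first line is the CPU header (CPU0 CPU1 ...) and every other
    line has the form "label: count count ... description".
    """
    lines = text.splitlines()
    if not lines:
        return {}
    n_cpus = len(lines[0].split())  # one header token per online CPU
    table = {}
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        if not sep:
            continue  # not an interrupt row
        fields = rest.split()
        counts = []
        for field in fields[:n_cpus]:
            if not field.isdigit():
                break  # rows such as ERR/MIS carry fewer counters
            counts.append(int(field))
        description = " ".join(fields[len(counts):])
        table[label.strip()] = (counts, description)
    return table


def busiest_cpu_share(counts):
    """Fraction of an IRQ's interrupts handled by its busiest CPU (0.0 if idle)."""
    total = sum(counts)
    return max(counts) / total if total else 0.0
```

To use it, read /proc/interrupts, pass the text to parse_proc_interrupts, and look at the rows whose description mentions the network interface. A share close to 1.0 on every such row, always for the same CPU, would match the behaviour described in the report above.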
Bug#696059: linux: PATCH required for server interrupt load balancing/irqbalance (tested)
On Sun, 16 Dec 2012, Ben Hutchings wrote:
> On Sun, 2012-12-16 at 11:24 -0200, Henrique de Moraes Holschuh wrote:
> > Package: linux
> > Version: 3.2.35-1
> > Severity: important
> > Tags: patch
> >
> > Please include the attached patch in Wheezy, without it, irqbalance fails to do its job properly on any server (actually any multi-core system with MSI/MSI-X irqs).
> >
> > It is important to have irqbalance work properly out-of-the-box, since it is the only trivial way to get better network/storage behaviour out of the MSI-X capable NUMA systems that are 95% of the post-2010 server market.
>
> Would be nice, but it has been broken for so long that 'everyone knows' to disable irqbalance.

Yeah, and nobody knows how to use hwloc either, so they probably leave it at whatever the kernel/BIOS/EFI default mapping is. And since the kernel doesn't irqbalance by itself anymore, it will either be round-robin if you're lucky, or all-in-the-first-core if you're unlucky...

> > I've tested the attached patch on stock (kernel.org) 3.2.34 on production, and it works fine. The patch is very simple, it just publishes the MSI IRQ vector information to sysfs, which irqbalance uses.
>
> Changes ABI, so will have to wait if we apply it at all.

It is a new ABI, actually, so it has no ill effects on existing applications. And this new ABI is already stable, too.

> > git commit upstream: b50cac55bf859d5b2fdcc1803a553a251b703456
> >
> > Alternatively, we might want to add it to -stable series upstream, on the grounds that it is widely desired functionality (i.e. useful for all distros).
>
> This doesn't suddenly become urgent because you just noticed it.

I don't recall claiming for urgency anywhere, unless you mean the request that it should be added to the Wheezy kernel.

--
One disk to rule them all, One disk to find them. One disk to bring them all and in the darkness grind them. In the Land of Redmond where the shadows lie. -- The Silicon Valley Tarot
Henrique Holschuh
Bug#696059: linux: PATCH required for server interrupt load balancing/irqbalance (tested)
Package: linux
Version: 3.2.35-1
Severity: important
Tags: patch

Please include the attached patch in Wheezy, without it, irqbalance fails to do its job properly on any server (actually any multi-core system with MSI/MSI-X irqs).

It is important to have irqbalance work properly out-of-the-box, since it is the only trivial way to get better network/storage behaviour out of the MSI-X capable NUMA systems that are 95% of the post-2010 server market.

I've tested the attached patch on stock (kernel.org) 3.2.34 on production, and it works fine. The patch is very simple, it just publishes the MSI IRQ vector information to sysfs, which irqbalance uses.

git commit upstream: b50cac55bf859d5b2fdcc1803a553a251b703456

Alternatively, we might want to add it to -stable series upstream, on the grounds that it is widely desired functionality (i.e. useful for all distros).

-- System Information:
Debian Release: wheezy/sid
  APT prefers testing
  APT policy: (990, 'testing'), (500, 'testing-proposed-updates'), (500, 'unstable'), (1, 'experimental')
Architecture: amd64 (x86_64)
Foreign Architectures: i386

Kernel: Linux 3.2.35+ (SMP w/8 CPU cores)
Locale: LANG=pt_BR.UTF-8, LC_CTYPE=pt_BR.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash

--
One disk to rule them all, One disk to find them. One disk to bring them all and in the darkness grind them. In the Land of Redmond where the shadows lie. -- The Silicon Valley Tarot
Henrique Holschuh

From b50cac55bf859d5b2fdcc1803a553a251b703456 Mon Sep 17 00:00:00 2001
From: Neil Horman nhor...@tuxdriver.com
Date: Thu, 6 Oct 2011 14:08:18 -0400
Subject: [PATCH] PCI/sysfs: add per pci device msi[x] irq listing (v5)

This patch adds a per-pci-device subdirectory in sysfs called:
/sys/bus/pci/devices/<device>/msi_irqs

This sub-directory exports the set of msi vectors allocated by a given
pci device, by creating a numbered sub-directory for each vector beneath
msi_irqs. For each vector various attributes can be exported.
Currently the only attribute is called mode, which tracks the
operational mode of that vector (msi vs. msix)

Acked-by: Greg Kroah-Hartman gre...@suse.de
Signed-off-by: Jesse Barnes jbar...@virtuousgeek.org
---
 Documentation/ABI/testing/sysfs-bus-pci |   18 +
 drivers/pci/msi.c                       |  111 +++
 include/linux/msi.h                     |    3 +
 include/linux/pci.h                     |    1 +
 4 files changed, 133 insertions(+)

diff --git a/Documentation/ABI/testing/sysfs-bus-pci b/Documentation/ABI/testing/sysfs-bus-pci
index 349ecf2..34f5110 100644
--- a/Documentation/ABI/testing/sysfs-bus-pci
+++ b/Documentation/ABI/testing/sysfs-bus-pci
@@ -66,6 +66,24 @@ Description:
 		re-discover previously removed devices.
 		Depends on CONFIG_HOTPLUG.
 
+What:		/sys/bus/pci/devices/.../msi_irqs/
+Date:		September, 2011
+Contact:	Neil Horman nhor...@tuxdriver.com
+Description:
+		The /sys/devices/.../msi_irqs directory contains a variable set
+		of sub-directories, with each sub-directory being named after a
+		corresponding msi irq vector allocated to that device. Each
+		numbered sub-directory N contains attributes of that irq.
+		Note that this directory is not created for device drivers which
+		do not support msi irqs
+
+What:		/sys/bus/pci/devices/.../msi_irqs/N/mode
+Date:		September 2011
+Contact:	Neil Horman nhor...@tuxdriver.com
+Description:
+		This attribute indicates the mode that the irq vector named by
+		the parent directory is in (msi vs. msix)
+
 What:		/sys/bus/pci/devices/.../remove
 Date:		January 2009
 Contact:	Linux PCI developers linux-...@vger.kernel.org
diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
index 0e6d04d..e6b6b9c 100644
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -323,6 +323,8 @@ static void free_msi_irqs(struct pci_dev *dev)
 			if (list_is_last(&entry->list, &dev->msi_list))
 				iounmap(entry->mask_base);
 		}
+		kobject_del(&entry->kobj);
+		kobject_put(&entry->kobj);
 		list_del(&entry->list);
 		kfree(entry);
 	}
@@ -403,6 +405,98 @@ void pci_restore_msi_state(struct pci_dev *dev)
 }
 EXPORT_SYMBOL_GPL(pci_restore_msi_state);
 
+
+#define to_msi_attr(obj) container_of(obj, struct msi_attribute, attr)
+#define to_msi_desc(obj) container_of(obj, struct msi_desc, kobj)
+
+struct msi_attribute {
+	struct attribute	attr;
+	ssize_t (*show)(struct msi_desc *entry, struct msi_attribute *attr,
+			char *buf);
+	ssize_t (*store)(struct msi_desc *entry, struct msi_attribute *attr,
+			 const char *buf, size_t count);
+};
+
+static ssize_t show_msi_mode(struct msi_desc *entry, struct msi_attribute *atr,
+			     char *buf)
+{
+	return sprintf(buf, "%s\n", entry->msi_attrib.is_msix ? "msix" : "msi");
+}
+
+static ssize_t msi_irq_attr_show(struct kobject *kobj,
+				 struct attribute *attr, char *buf)
+{
+	struct msi_attribute *attribute = to_msi_attr(attr);
+	struct msi_desc *entry = to_msi_desc(kobj);
+
+	if (!attribute->show)
+		return -EIO;
+
+	return attribute->show(entry,
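For readers who want to see what the new sysfs entries look like from user space, here is a rough sketch of walking the msi_irqs directories described in the patch's ABI documentation. It is not how irqbalance itself reads them; the function name list_msi_irqs is hypothetical, and the fallback for kernels that expose msi_irqs/N as a plain file rather than a directory with a mode attribute is an assumption about later kernel versions, not something stated in this patch.

```python
import os


def list_msi_irqs(pci_devices_dir="/sys/bus/pci/devices"):
    """Return {pci_address: {irq_number: mode}} for devices exposing msi_irqs."""
    result = {}
    if not os.path.isdir(pci_devices_dir):
        return result
    for address in sorted(os.listdir(pci_devices_dir)):
        msi_dir = os.path.join(pci_devices_dir, address, "msi_irqs")
        if not os.path.isdir(msi_dir):
            continue  # directory is absent when no MSI vectors are exported
        vectors = {}
        for name in os.listdir(msi_dir):
            if not name.isdigit():
                continue
            entry = os.path.join(msi_dir, name)
            # Layout documented by this patch: msi_irqs/N/mode.
            # Assumption: some later kernels make msi_irqs/N a plain file
            # that holds the mode string directly.
            if os.path.isdir(entry):
                mode_path = os.path.join(entry, "mode")
            else:
                mode_path = entry
            try:
                with open(mode_path) as handle:
                    vectors[int(name)] = handle.read().strip()
            except OSError:
                continue  # vector vanished or is unreadable; skip it
        if vectors:
            result[address] = vectors
    return result
```

Calling list_msi_irqs() on a kernel without this patch should simply return an empty dictionary, since no msi_irqs directories exist there.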
Bug#696059: linux: PATCH required for server interrupt load balancing/irqbalance (tested)
On Sun, 2012-12-16 at 11:24 -0200, Henrique de Moraes Holschuh wrote:
> Package: linux
> Version: 3.2.35-1
> Severity: important
> Tags: patch
>
> Please include the attached patch in Wheezy, without it, irqbalance fails to do its job properly on any server (actually any multi-core system with MSI/MSI-X irqs).
>
> It is important to have irqbalance work properly out-of-the-box, since it is the only trivial way to get better network/storage behaviour out of the MSI-X capable NUMA systems that are 95% of the post-2010 server market.

Would be nice, but it has been broken for so long that 'everyone knows' to disable irqbalance.

> I've tested the attached patch on stock (kernel.org) 3.2.34 on production, and it works fine. The patch is very simple, it just publishes the MSI IRQ vector information to sysfs, which irqbalance uses.

Changes ABI, so will have to wait if we apply it at all.

> git commit upstream: b50cac55bf859d5b2fdcc1803a553a251b703456
>
> Alternatively, we might want to add it to -stable series upstream, on the grounds that it is widely desired functionality (i.e. useful for all distros).

This doesn't suddenly become urgent because you just noticed it.

Ben.

--
Ben Hutchings
Always try to do things in chronological order; it's less confusing that way.