Re: [Xen-devel] xhci_hcd interrupt affinity in Dom0/DomU limited to single interrupt
>>> On 16.09.15 at 22:31, wrote:
> I think the lspci -v output is the same in both cases with the exception
> of the xhci_pci which is not present in the Native case lspci -v output.
> xhci_pci is built into the kernel. The same kernel/system is used with
> this system when booted with Dom0 and native cases. I could rebuild the
> kernel without it and see what happens?

No point in doing so; the only lines we're interested in here are ...

> Native:
>
> 00:14.0 USB controller: Intel Corporation 8 Series/C220 Series Chipset
> Family USB xHCI (rev 05) (prog-if 30 [XHCI])
>         Subsystem: Intel Corporation 8 Series/C220 Series Chipset Family USB xHCI
>         Flags: bus master, medium devsel, latency 0, IRQ 27
>         Memory at f7e2 (64-bit, non-prefetchable) [size=64K]
>         Capabilities: [70] Power Management version 2
>         Capabilities: [80] MSI: Enable+ Count=1/8 Maskable- 64bit+

... this and ...

>         Kernel driver in use: xhci_hcd
>
> Dom0:
>
> 00:14.0 USB controller: Intel Corporation 8 Series/C220 Series Chipset
> Family USB xHCI (rev 05) (prog-if 30 [XHCI])
>         Subsystem: Intel Corporation 8 Series/C220 Series Chipset Family USB xHCI
>         Flags: bus master, medium devsel, latency 0, IRQ 76
>         Memory at f7e2 (64-bit, non-prefetchable) [size=64K]
>         Capabilities: [70] Power Management version 2
>         Capabilities: [80] MSI: Enable+ Count=1/8 Maskable- 64bit+

... this. Them being identical proves what I suspected: the driver uses
only a single interrupt.

Jan
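[A quick way to make the same check on any system, assuming standard
pciutils and procfs; the device address is the one from the outputs above,
so this is a sketch rather than output from the thread:

  # "Count=1/8" means 1 MSI vector allocated out of the 8 the device supports.
  lspci -vv -s 00:14.0 | grep 'MSI:'
  # /proc/interrupts shows one line per allocated vector; a single xhci_hcd
  # line means the driver registered only one IRQ.
  grep xhci /proc/interrupts
]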
Re: [Xen-devel] xhci_hcd interrupt affinity in Dom0/DomU limited to single interrupt
On Fri, 2015-09-11 at 04:03 -0600, Jan Beulich wrote:
> >>> On 10.09.15 at 18:20, wrote:
> > On Wed, 2015-09-09 at 00:48 -0600, Jan Beulich wrote:
> >> >>> On 08.09.15 at 18:02, wrote:
> >> > I believe the driver does support use of multiple interrupts based on
> >> > the previous explanation of the lspci output where it was established
> >> > that the device could use up to 8 interrupts, which is what I see on
> >> > bare metal.
> >>
> >> Where is the proof of that? All I've seen is output like this
> >>
> >> Capabilities: [80] MSI: Enable+ Count=1/8 Maskable- 64bit+
> >>
> >> which says that one out of eight interrupts is being used. And
> >> if in the native case this would indeed be the case, I don't think
> >> you've provided complete hypervisor and kernel logs for the
> >> Xen case so far, which would allow us to look for respective error
> >> indications. And this (ignoring the line wrapping, which makes
> >> things hard to read - it would be appreciated if you could fix
> >> your mail client)...
> >>
> >> > Bare metal:
> >> >
> >> > cat /proc/interrupts
> >> >         CPU0    CPU1    CPU2   CPU3      CPU4    CPU5   CPU6   CPU7
> >> >  0:       36       0       0      0         0       0      0      0  IR-IO-APIC-edge  timer
> >> > [...]
> >> > 27:   337125   47893  708965   4049  53940667  263303  87847   4958  IR-PCI-MSI-edge  xhci_hcd
> >>
> >> ... also shows just a single interrupt being in use.
> >
> > Kernel logs for native and Dom0 with 'debug' appended to grub. xl-dmesg
> > with log_lvl=all guest_loglvl=all set. Please let me know if there are
> > other logs or log levels that I should provide.
>
> The native kernel log supports there only being a single interrupt
> in use. I'm still not seeing any proof of your claim for this to be
> different. Did you double check lspci output in the native case?
>
> Jan

Jan,

I think the lspci -v output is the same in both cases with the exception
of the xhci_pci which is not present in the Native case lspci -v output.
xhci_pci is built into the kernel. The same kernel/system is used with
this system when booted with Dom0 and native cases. I could rebuild the
kernel without it and see what happens?

Native:

00:14.0 USB controller: Intel Corporation 8 Series/C220 Series Chipset
Family USB xHCI (rev 05) (prog-if 30 [XHCI])
        Subsystem: Intel Corporation 8 Series/C220 Series Chipset Family USB xHCI
        Flags: bus master, medium devsel, latency 0, IRQ 27
        Memory at f7e2 (64-bit, non-prefetchable) [size=64K]
        Capabilities: [70] Power Management version 2
        Capabilities: [80] MSI: Enable+ Count=1/8 Maskable- 64bit+
        Kernel driver in use: xhci_hcd

Dom0:

00:14.0 USB controller: Intel Corporation 8 Series/C220 Series Chipset
Family USB xHCI (rev 05) (prog-if 30 [XHCI])
        Subsystem: Intel Corporation 8 Series/C220 Series Chipset Family USB xHCI
        Flags: bus master, medium devsel, latency 0, IRQ 76
        Memory at f7e2 (64-bit, non-prefetchable) [size=64K]
        Capabilities: [70] Power Management version 2
        Capabilities: [80] MSI: Enable+ Count=1/8 Maskable- 64bit+
        Kernel driver in use: xhci_hcd
        Kernel modules: xhci_pci

cat /boot/config-3.18.1-1.fc20.x86_64 | grep XHCI
CONFIG_USB_XHCI_HCD=y
CONFIG_USB_XHCI_PCI=y
Re: [Xen-devel] xhci_hcd interrupt affinity in Dom0/DomU limited to single interrupt
>>> On 10.09.15 at 18:20, wrote:
> On Wed, 2015-09-09 at 00:48 -0600, Jan Beulich wrote:
>> >>> On 08.09.15 at 18:02, wrote:
>> > I believe the driver does support use of multiple interrupts based on
>> > the previous explanation of the lspci output where it was established
>> > that the device could use up to 8 interrupts, which is what I see on
>> > bare metal.
>>
>> Where is the proof of that? All I've seen is output like this
>>
>> Capabilities: [80] MSI: Enable+ Count=1/8 Maskable- 64bit+
>>
>> which says that one out of eight interrupts is being used. And
>> if in the native case this would indeed be the case, I don't think
>> you've provided complete hypervisor and kernel logs for the
>> Xen case so far, which would allow us to look for respective error
>> indications. And this (ignoring the line wrapping, which makes
>> things hard to read - it would be appreciated if you could fix
>> your mail client)...
>>
>> > Bare metal:
>> >
>> > cat /proc/interrupts
>> >         CPU0    CPU1    CPU2   CPU3      CPU4    CPU5   CPU6   CPU7
>> >  0:       36       0       0      0         0       0      0      0  IR-IO-APIC-edge  timer
>> > [...]
>> > 27:   337125   47893  708965   4049  53940667  263303  87847   4958  IR-PCI-MSI-edge  xhci_hcd
>>
>> ... also shows just a single interrupt being in use.
>
> Kernel logs for native and Dom0 with 'debug' appended to grub. xl-dmesg
> with log_lvl=all guest_loglvl=all set. Please let me know if there are
> other logs or log levels that I should provide.

The native kernel log supports there only being a single interrupt
in use. I'm still not seeing any proof of your claim for this to be
different. Did you double check lspci output in the native case?

Jan
Re: [Xen-devel] xhci_hcd interrupt affinity in Dom0/DomU limited to single interrupt
>>> On 08.09.15 at 18:02, wrote:
> I believe the driver does support use of multiple interrupts based on
> the previous explanation of the lspci output where it was established
> that the device could use up to 8 interrupts, which is what I see on
> bare metal.

Where is the proof of that? All I've seen is output like this

Capabilities: [80] MSI: Enable+ Count=1/8 Maskable- 64bit+

which says that one out of eight interrupts is being used. And
if in the native case this would indeed be the case, I don't think
you've provided complete hypervisor and kernel logs for the
Xen case so far, which would allow us to look for respective error
indications. And this (ignoring the line wrapping, which makes
things hard to read - it would be appreciated if you could fix
your mail client)...

> Bare metal:
>
> cat /proc/interrupts
>         CPU0    CPU1    CPU2   CPU3      CPU4    CPU5   CPU6   CPU7
>  0:       36       0       0      0         0       0      0      0  IR-IO-APIC-edge  timer
> [...]
> 27:   337125   47893  708965   4049  53940667  263303  87847   4958  IR-PCI-MSI-edge  xhci_hcd

... also shows just a single interrupt being in use.

Jan
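[For contrast, a driver that had successfully enabled multi-vector MSI
would show the allocated count matching the supported count, along these
lines (illustrative output, not from this system):

  Capabilities: [80] MSI: Enable+ Count=8/8 Maskable- 64bit+

and /proc/interrupts would then carry eight xhci_hcd lines, one per vector,
each of which could be given its own affinity.]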
Re: [Xen-devel] xhci_hcd interrupt affinity in Dom0/DomU limited to single interrupt
>>> On 03.09.15 at 18:52, wrote:
> On Thu, 2015-09-03 at 09:04 -0600, Jan Beulich wrote:
>> >>> On 03.09.15 at 14:04, wrote:
>> > I am still confused as to whether any device, or in this case
>> > xhci_hcd, can use more than one cpu at any given time. My understanding
>> > based on David's response is that it cannot due to the event channel
>> > mapping. The device interrupt can be pinned to a specific cpu by
>> > specifying the affinity. I was hoping there was a way to allow the
>> > driver's interrupt to be scheduled to use more than 1 CPU at any given
>> > time.
>>
>> The problem is that you're mixing up two things: devices and
>> interrupts. Any individual interrupt can only be serviced by a single
>> CPU at a time, due to the way event channels get bound. Any
>> individual device can have more than one interrupt (MSI or MSI-X),
>> and then each of these interrupts can be serviced on different
>> CPUs.
>
> Thanks for clarifying. To the original question, with respect to my
> limited understanding of the event channels and interrupts, each
> interrupt can be serviced on a different CPU using irqbalance or setting
> the affinity manually, but the same interrupt cannot be serviced by more
> than 1 CPU at a time? If so, is there a way around the 1:1 binding when
> loading the Dom0 kernel - a flag or option to use the native interrupt
> scheduling for some set of or all 8 CPUs that the device can schedule
> interrupts on when not loading the Dom0? The xhci_hcd, as one example,
> seems to perform better when it is able to have interrupts serviced by
> multiple CPUs.

I don't follow - we tell you this doesn't work (multiple times, and by
different people), and you ask yet another time whether this can be made
to work? Just to make this very clear once again: Under Xen (and leaving
aside pure HVM guests), interrupt load from a single device can be spread
across CPUs only when the device uses multiple interrupts.

Jan
Re: [Xen-devel] xhci_hcd interrupt affinity in Dom0/DomU limited to single interrupt
>>> On 03.09.15 at 14:04, wrote:
> On 02.09.15 at 19:17, wrote:
>> From: Jan Beulich
>> Sent: Wednesday, September 2, 2015 4:58 AM
>> Justin Acker 09/02/15 1:14 AM >>>
>>> 00:14.0 USB controller: Intel Corporation 7 Series/C210 Series Chipset
>>> Family USB xHCI Host Controller (rev 04) (prog-if 30 [XHCI])
>>>         Subsystem: Dell Device 053e
>>>         Flags: bus master, medium devsel, latency 0, IRQ 78
>>>         Memory at f7f2 (64-bit, non-prefetchable) [size=64K]
>>>         Capabilities: [70] Power Management version 2
>>>         Capabilities: [80] MSI: Enable+ Count=1/8 Maskable- 64bit+
>>
>> This shows that the driver could use up to 8 MSI IRQs, but chose to use
>> just one. If this is the same under Xen and the native kernel, the
>> driver likely doesn't know any better. If under native more interrupts
>> are being used, there might be an issue with Xen specific code in the
>> kernel or hypervisor code. We'd need to see details to be able to tell.
>
> Please let me know what details I should provide.

I'd like to emphasize what I said in my previous reply:

> Please, first of all, get your reply style fixed. Just look at the above
> and tell me how a reader should figure out which parts of the text were
> written by whom.
>
> [...]

> I am still confused as to whether any device, or in this case xhci_hcd,
> can use more than one cpu at any given time. My understanding based on
> David's response is that it cannot due to the event channel mapping. The
> device interrupt can be pinned to a specific cpu by specifying the
> affinity. I was hoping there was a way to allow the driver's interrupt
> to be scheduled to use more than 1 CPU at any given time.

The problem is that you're mixing up two things: devices and
interrupts. Any individual interrupt can only be serviced by a single
CPU at a time, due to the way event channels get bound. Any
individual device can have more than one interrupt (MSI or MSI-X),
and then each of these interrupts can be serviced on different
CPUs.

Jan
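[To make the device/interrupt distinction concrete: if a driver did
allocate several MSI vectors, each vector gets its own event channel and
can be pinned to its own CPU, even under Xen. A sketch with hypothetical
IRQ numbers:

  echo 1 > /proc/irq/30/smp_affinity   # vector 0 -> CPU0
  echo 2 > /proc/irq/31/smp_affinity   # vector 1 -> CPU1
  echo 4 > /proc/irq/32/smp_affinity   # vector 2 -> CPU2
  echo 8 > /proc/irq/33/smp_affinity   # vector 3 -> CPU3

What cannot be arranged is a single one of those vectors being serviced by
two CPUs at once.]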
Re: [Xen-devel] xhci_hcd interrupt affinity in Dom0/DomU limited to single interrupt
(re-adding xen-devel)

>>> On 02.09.15 at 19:17, wrote:
> From: Jan Beulich
> Sent: Wednesday, September 2, 2015 4:58 AM
> Justin Acker 09/02/15 1:14 AM >>>
>> 00:14.0 USB controller: Intel Corporation 7 Series/C210 Series Chipset
>> Family USB xHCI Host Controller (rev 04) (prog-if 30 [XHCI])
>>         Subsystem: Dell Device 053e
>>         Flags: bus master, medium devsel, latency 0, IRQ 78
>>         Memory at f7f2 (64-bit, non-prefetchable) [size=64K]
>>         Capabilities: [70] Power Management version 2
>>         Capabilities: [80] MSI: Enable+ Count=1/8 Maskable- 64bit+
>
> This shows that the driver could use up to 8 MSI IRQs, but chose to use
> just one. If this is the same under Xen and the native kernel, the driver
> likely doesn't know any better. If under native more interrupts are being
> used, there might be an issue with Xen specific code in the kernel or
> hypervisor code. We'd need to see details to be able to tell.
>
> Please let me know what details I should provide.
>
> Jan

Please, first of all, get your reply style fixed. Just look at the above
and tell me how a reader should figure out which parts of the text were
written by whom.

Together with other replies you sent, I first of all wonder whether you've
understood what you've been told: Any interrupt delivered via the event
channel mechanism can't be delivered to more than one CPU unless it gets
moved between them by a tool or manually. When you set the affinity to
more than one (v)CPU, the kernel will pick one (usually the first) out of
the provided set and bind the event channel to that vCPU.

As to, in the XHCI case, using multi-vector MSI: Please tell us whether
the lspci output still left in context above was with a kernel running
natively or under Xen. In the former case, the driver may need improving.
In the latter case we'd need to see, for comparison, the same output with
a natively running kernel. If it matches the Xen one, same thing (driver
may need improving). If it doesn't match, maximum verbosity hypervisor and
kernel logs would be what we'd need to start with.

Jan
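[The single-vCPU binding described here can be observed directly from
procfs; a sketch using the Dom0 IRQ number reported earlier in this thread:

  echo ff > /proc/irq/78/smp_affinity   # request CPUs 0-7
  cat /proc/irq/78/smp_affinity         # reads back ff ...
  grep xhci /proc/interrupts            # ... yet only one CPU's counter grows
]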
Re: [Xen-devel] xhci_hcd interrupt affinity in Dom0/DomU limited to single interrupt
On 02/09/15 18:25, Justin Acker wrote:
>
> From: David Vrabel
> To: Justin Acker; "xen-devel@lists.xen.org"
> Sent: Wednesday, September 2, 2015 9:47 AM
> Subject: Re: [Xen-devel] xhci_hcd interrupt affinity in Dom0/DomU
> limited to single interrupt
>
> On 01/09/15 18:39, Justin Acker wrote:
>> Taking this to the dev list from users.
>>
>> Is there a way to force or enable pirq delivery to a set of cpus, as
>> opposed to a single device being assigned a single pirq, so that its
>> interrupt can be distributed across multiple cpus?
>
> No.
>
> PIRQs are delivered via event channels and these can only be bound to
> one VCPU at a time.
>
> Thanks David. This applies to Dom0 or Dom0/DomU?

Both.

David
Re: [Xen-devel] xhci_hcd interrupt affinity in Dom0/DomU limited to single interrupt
From: Ian Campbell
To: Konrad Rzeszutek Wilk; Justin Acker
Cc: "boris.ostrov...@oracle.com"; "xen-devel@lists.xen.org"
Sent: Wednesday, September 2, 2015 9:49 AM
Subject: Re: [Xen-devel] xhci_hcd interrupt affinity in Dom0/DomU limited
to single interrupt

On Wed, 2015-09-02 at 08:53 -0400, Konrad Rzeszutek Wilk wrote:
> On Tue, Sep 01, 2015 at 11:09:38PM +, Justin Acker wrote:
> >
> > From: Konrad Rzeszutek Wilk
> > To: Justin Acker
> > Cc: "xen-devel@lists.xen.org"; boris.ostrov...@oracle.com
> > Sent: Tuesday, September 1, 2015 4:56 PM
> > Subject: Re: [Xen-devel] xhci_hcd interrupt affinity in Dom0/DomU
> > limited to single interrupt
> >
> > On Tue, Sep 01, 2015 at 05:39:46PM +, Justin Acker wrote:
> > > Taking this to the dev list from users.
> > >
> > > Is there a way to force or enable pirq delivery to a set of cpus, as
> > > opposed to a single device being assigned a single pirq, so that its
> > > interrupt can be distributed across multiple cpus? I believe the
> > > device drivers do support multiple queues when run natively without
> > > the Dom0 loaded. The device in question is the xhci_hcd driver for
> > > which I/O transfers seem to be slowed when the Dom0 is loaded. The
> > > behavior seems to pass through to the DomU if pass through is
> > > enabled. I found some similar threads, but most relate to Ethernet
> > > controllers. I tried some of the x2apic and x2apic_phys dom0 kernel
> > > arguments, but none distributed the pirqs. Based on the reading
> > > relating to IRQs for Xen, I think pinning the pirqs to cpu0 is done
> > > to avoid an interrupt storm. I tried IRQ balance and when
> > > configured/adjusted it will balance individual pirqs, but not
> > > multiple interrupts.
> >
> > Yes. You can do it with smp affinity:
> >
> > https://cs.uwaterloo.ca/~brecht/servers/apic/SMP-affinity.txt
> >
> > Yes, this does allow for assigning a specific interrupt to a single
> > cpu, but it will not spread the interrupt load across a defined group
> > or all cpus. Is it possible to define a range of CPUs or spread the
> > interrupt load for a device across all cpus as it does with a native
> > kernel without the Dom0 loaded?
>
> It should be. Did you try giving it a mask that puts the interrupts on
> all the CPUs? (0xf)?
>
> > I don't follow the "behavior seems to pass through to the DomU if pass
> > through is enabled"?
> > The device interrupts are limited to a single pirq if the device is
> > used directly in the Dom0. If the device is passed through to a DomU -
> > i.e. the xhci_hcd controller - then the DomU cannot spread the
> > interrupt load across the cpus in the VM.
>
> Why? How are you seeing this? The method by which you use smp affinity
> should be exactly the same.
>
> And it looks to me that the device has a single pirq as well when
> booting as baremetal, right?
>
> So the issue here is that you want to spread the interrupt delivery to
> happen across all of the CPUs. The smp_affinity should do it. Did you
> try modifying it by hand (you may want to kill irqbalance when you do
> this just to make sure it does not write its own values in)?

It sounds then like the real issue is that under native irqbalance is
writing smp_affinity values with potentially multiple bits set, while
under Xen it is only setting a single bit?

Justin, are the contents of /proc/irq/<irq>/smp_affinity for the IRQ in
question under Native and Xen consistent with that supposition?

Ian, I think the mask is the same in both cases.

With irqbalance enabled, the interrupts are mapped - seemingly at random -
to various cpus, but only one cpu per interrupt in all cases. The results
below are with irqbalance disabled at boot and the same kernel version
used with Dom0 and baremetal.

With Dom0 loaded:

cat /proc/irq/78/smp_affinity
ff

Baremetal kernel:

cat /proc/irq/27/smp_affinity
ff
Re: [Xen-devel] xhci_hcd interrupt affinity in Dom0/DomU limited to single interrupt
From: Konrad Rzeszutek Wilk
To: Justin Acker
Cc: "xen-devel@lists.xen.org"; "boris.ostrov...@oracle.com"
Sent: Wednesday, September 2, 2015 8:53 AM
Subject: Re: [Xen-devel] xhci_hcd interrupt affinity in Dom0/DomU limited
to single interrupt

On Tue, Sep 01, 2015 at 11:09:38PM +, Justin Acker wrote:
>
> From: Konrad Rzeszutek Wilk
> To: Justin Acker
> Cc: "xen-devel@lists.xen.org"; boris.ostrov...@oracle.com
> Sent: Tuesday, September 1, 2015 4:56 PM
> Subject: Re: [Xen-devel] xhci_hcd interrupt affinity in Dom0/DomU
> limited to single interrupt
>
> On Tue, Sep 01, 2015 at 05:39:46PM +, Justin Acker wrote:
> > Taking this to the dev list from users.
> >
> > Is there a way to force or enable pirq delivery to a set of cpus, as
> > opposed to a single device being assigned a single pirq, so that its
> > interrupt can be distributed across multiple cpus? I believe the
> > device drivers do support multiple queues when run natively without
> > the Dom0 loaded. The device in question is the xhci_hcd driver for
> > which I/O transfers seem to be slowed when the Dom0 is loaded. The
> > behavior seems to pass through to the DomU if pass through is enabled.
> > I found some similar threads, but most relate to Ethernet controllers.
> > I tried some of the x2apic and x2apic_phys dom0 kernel arguments, but
> > none distributed the pirqs. Based on the reading relating to IRQs for
> > Xen, I think pinning the pirqs to cpu0 is done to avoid an interrupt
> > storm. I tried IRQ balance and when configured/adjusted it will
> > balance individual pirqs, but not multiple interrupts.
>
> Yes. You can do it with smp affinity:
>
> https://cs.uwaterloo.ca/~brecht/servers/apic/SMP-affinity.txt
>
> Yes, this does allow for assigning a specific interrupt to a single cpu,
> but it will not spread the interrupt load across a defined group or all
> cpus. Is it possible to define a range of CPUs or spread the interrupt
> load for a device across all cpus as it does with a native kernel
> without the Dom0 loaded?

It should be. Did you try giving it a mask that puts the interrupts on
all the CPUs? (0xf)?

> I don't follow the "behavior seems to pass through to the DomU if pass
> through is enabled"?
> The device interrupts are limited to a single pirq if the device is used
> directly in the Dom0. If the device is passed through to a DomU - i.e.
> the xhci_hcd controller - then the DomU cannot spread the interrupt load
> across the cpus in the VM.

Why? How are you seeing this? The method by which you use smp affinity
should be exactly the same.

And it looks to me that the device has a single pirq as well when booting
as baremetal, right?

On baremetal, it uses all 8 cpus for affinity as noted below (IRQ 27)
compared to (IRQ 78) in the Dom0.

         CPU0    CPU1      CPU2    CPU3        CPU4      CPU5       CPU6   CPU7

baremetal:
 27: 17977230  628258  44247270  120391  1597809883  14440991  152189328  73322  IR-PCI-MSI-edge  xhci_hcd

Dom0 or DomU passed through:
 78:    82521       0         0       0           0         0          0      0  xen-pirq-msi  xhci_hcd

So the issue here is that you want to spread the interrupt delivery to
happen across all of the CPUs. The smp_affinity should do it. Did you try
modifying it by hand (you may want to kill irqbalance when you do this
just to make sure it does not write its own values in)?

Yes, this would be great if there is a way to spread the affinity across
all cpus or a specified set of CPUs similar to the native kernel behavior.
I guess it would be spread across a set of or all pirqs? With irqbalance
disabled, I did try adjusting the interrupt affinity manually (i.e.
echo ff > /proc/irq/78/smp_affinity). The interrupt will move to the
specified CPU (0 through 7). Without specifying the affinity manually, it
does look like it's mapped to all cpus by default. With the Dom0 loaded,
cat /proc/irq/78/smp_affinity returns ff, but the interrupt never appears
to be scheduled on more than one cpu.

> > > With irqbalance enabled in Dom0:
> >
> > What version? There was a bug in it where it would never distribute
> > the IRQs properly across the CPUs.
>
> irqbalance version 1.0.7.
>
> Boris (CC-ed) might remember the upstream patch that made this work
> properly?

> > CPU0 CPU1 CPU2 CPU3 CPU4 CPU5 CPU6 CPU7
> > 76: 11304 0 149579 0 0 0
Re: [Xen-devel] xhci_hcd interrupt affinity in Dom0/DomU limited to single interrupt
On Wed, 2015-09-02 at 08:53 -0400, Konrad Rzeszutek Wilk wrote:
> On Tue, Sep 01, 2015 at 11:09:38PM +, Justin Acker wrote:
> >
> > From: Konrad Rzeszutek Wilk
> > To: Justin Acker
> > Cc: "xen-devel@lists.xen.org"; boris.ostrov...@oracle.com
> > Sent: Tuesday, September 1, 2015 4:56 PM
> > Subject: Re: [Xen-devel] xhci_hcd interrupt affinity in Dom0/DomU
> > limited to single interrupt
> >
> > On Tue, Sep 01, 2015 at 05:39:46PM +, Justin Acker wrote:
> > > Taking this to the dev list from users.
> > >
> > > Is there a way to force or enable pirq delivery to a set of cpus, as
> > > opposed to a single device being assigned a single pirq, so that its
> > > interrupt can be distributed across multiple cpus? I believe the
> > > device drivers do support multiple queues when run natively without
> > > the Dom0 loaded. The device in question is the xhci_hcd driver for
> > > which I/O transfers seem to be slowed when the Dom0 is loaded. The
> > > behavior seems to pass through to the DomU if pass through is
> > > enabled. I found some similar threads, but most relate to Ethernet
> > > controllers. I tried some of the x2apic and x2apic_phys dom0 kernel
> > > arguments, but none distributed the pirqs. Based on the reading
> > > relating to IRQs for Xen, I think pinning the pirqs to cpu0 is done
> > > to avoid an interrupt storm. I tried IRQ balance and when
> > > configured/adjusted it will balance individual pirqs, but not
> > > multiple interrupts.
> >
> > Yes. You can do it with smp affinity:
> >
> > https://cs.uwaterloo.ca/~brecht/servers/apic/SMP-affinity.txt
> >
> > Yes, this does allow for assigning a specific interrupt to a single
> > cpu, but it will not spread the interrupt load across a defined group
> > or all cpus. Is it possible to define a range of CPUs or spread the
> > interrupt load for a device across all cpus as it does with a native
> > kernel without the Dom0 loaded?
>
> It should be. Did you try giving it a mask that puts the interrupts on
> all the CPUs? (0xf)?
>
> > I don't follow the "behavior seems to pass through to the DomU if pass
> > through is enabled"?
> > The device interrupts are limited to a single pirq if the device is
> > used directly in the Dom0. If the device is passed through to a DomU -
> > i.e. the xhci_hcd controller - then the DomU cannot spread the
> > interrupt load across the cpus in the VM.
>
> Why? How are you seeing this? The method by which you use smp affinity
> should be exactly the same.
>
> And it looks to me that the device has a single pirq as well when
> booting as baremetal, right?
>
> So the issue here is that you want to spread the interrupt delivery to
> happen across all of the CPUs. The smp_affinity should do it. Did you
> try modifying it by hand (you may want to kill irqbalance when you do
> this just to make sure it does not write its own values in)?

It sounds then like the real issue is that under native irqbalance is
writing smp_affinity values with potentially multiple bits set, while
under Xen it is only setting a single bit?

Justin, are the contents of /proc/irq/<irq>/smp_affinity for the IRQ in
question under Native and Xen consistent with that supposition?

Ian.
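[One way to test that supposition (a sketch; stop irqbalance first so it
does not rewrite the value mid-check, and substitute the IRQ numbers from
the respective /proc/interrupts output):

  killall irqbalance               # or: systemctl stop irqbalance
  cat /proc/irq/27/smp_affinity    # native
  cat /proc/irq/78/smp_affinity    # Xen Dom0
]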
Re: [Xen-devel] xhci_hcd interrupt affinity in Dom0/DomU limited to single interrupt
On 01/09/15 18:39, Justin Acker wrote:
> Taking this to the dev list from users.
>
> Is there a way to force or enable pirq delivery to a set of cpus, as
> opposed to a single device being assigned a single pirq, so that its
> interrupt can be distributed across multiple cpus?

No.

PIRQs are delivered via event channels and these can only be bound to
one VCPU at a time.

David
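[For reference, the hypervisor's own view of these bindings can be dumped
from Dom0 with the 'i' debug key, which prints interrupt information to
the Xen console buffer (assuming the xl toolstack is in use; a sketch):

  xl debug-keys i    # ask Xen to dump its IRQ bindings
  xl dmesg           # read the dump from the hypervisor console buffer
]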
Re: [Xen-devel] xhci_hcd interrupt affinity in Dom0/DomU limited to single interrupt
On Tue, Sep 01, 2015 at 11:09:38PM +, Justin Acker wrote:
>
> From: Konrad Rzeszutek Wilk
> To: Justin Acker
> Cc: "xen-devel@lists.xen.org"; boris.ostrov...@oracle.com
> Sent: Tuesday, September 1, 2015 4:56 PM
> Subject: Re: [Xen-devel] xhci_hcd interrupt affinity in Dom0/DomU
> limited to single interrupt
>
> On Tue, Sep 01, 2015 at 05:39:46PM +, Justin Acker wrote:
> > Taking this to the dev list from users.
> >
> > Is there a way to force or enable pirq delivery to a set of cpus, as
> > opposed to a single device being assigned a single pirq, so that its
> > interrupt can be distributed across multiple cpus? I believe the
> > device drivers do support multiple queues when run natively without
> > the Dom0 loaded. The device in question is the xhci_hcd driver for
> > which I/O transfers seem to be slowed when the Dom0 is loaded. The
> > behavior seems to pass through to the DomU if pass through is enabled.
> > I found some similar threads, but most relate to Ethernet controllers.
> > I tried some of the x2apic and x2apic_phys dom0 kernel arguments, but
> > none distributed the pirqs. Based on the reading relating to IRQs for
> > Xen, I think pinning the pirqs to cpu0 is done to avoid an interrupt
> > storm. I tried IRQ balance and when configured/adjusted it will
> > balance individual pirqs, but not multiple interrupts.
>
> Yes. You can do it with smp affinity:
>
> https://cs.uwaterloo.ca/~brecht/servers/apic/SMP-affinity.txt
>
> Yes, this does allow for assigning a specific interrupt to a single cpu,
> but it will not spread the interrupt load across a defined group or all
> cpus. Is it possible to define a range of CPUs or spread the interrupt
> load for a device across all cpus as it does with a native kernel
> without the Dom0 loaded?

It should be. Did you try giving it a mask that puts the interrupts on
all the CPUs? (0xf)?

> I don't follow the "behavior seems to pass through to the DomU if pass
> through is enabled"?
> The device interrupts are limited to a single pirq if the device is used
> directly in the Dom0. If the device is passed through to a DomU - i.e.
> the xhci_hcd controller - then the DomU cannot spread the interrupt load
> across the cpus in the VM.

Why? How are you seeing this? The method by which you use smp affinity
should be exactly the same.

And it looks to me that the device has a single pirq as well when booting
as baremetal, right?

So the issue here is that you want to spread the interrupt delivery to
happen across all of the CPUs. The smp_affinity should do it. Did you try
modifying it by hand (you may want to kill irqbalance when you do this
just to make sure it does not write its own values in)?

> > > With irqbalance enabled in Dom0:
> >
> > What version? There was a bug in it where it would never distribute
> > the IRQs properly across the CPUs.
>
> irqbalance version 1.0.7.

Boris (CC-ed) might remember the upstream patch that made this work
properly?

> > CPU0 CPU1 CPU2 CPU3 CPU4 CPU5 CPU6 CPU7
> > 76: 11304 0 149579 0 0 0 0 0  xen-pirq-msi  :00:1f.2
> > 77: 1243 0 0 35447 0 0 0 0  xen-pirq-msi  radeon
> > 78: 82521 0 0 0 0 0 0 0  xen-pirq-msi  xhci_hcd
> > 79: 23 0 0 0 0 0 0 0  xen-pirq-msi  mei_me
> > 80: 11 0 0 0 0 741 0 0  xen-pirq-msi  em1
> > 81: 350 0 0 0 1671 0 0 0  xen-pirq-msi  iwlwifi
> > 82: 275 0 0 0 0 0 0 0  xen-pirq-msi  snd_hda_intel
> >
> > With native 3.19 kernel:
> >
> > Without Dom0 for the same system from the first message:
> >
> > # cat /proc/interrupts
> > CPU0 CPU1 CPU2 CPU3 CPU4 CPU5 CPU6 CPU7
> > 0: 33 0 0 0 0 0 0 0  IR-IO-APIC-edge  timer
> > 8: 0 0 0 0 0 0 1 0  IR-IO-APIC-edge  rtc0
> > 9:
Re: [Xen-devel] xhci_hcd interrupt affinity in Dom0/DomU limited to single interrupt
On 09/01/2015 04:56 PM, Konrad Rzeszutek Wilk wrote:
> On Tue, Sep 01, 2015 at 05:39:46PM +, Justin Acker wrote:
>> Taking this to the dev list from users.
>>
>> Is there a way to force or enable pirq delivery to a set of cpus, as
>> opposed to a single device being assigned a single pirq, so that its
>> interrupt can be distributed across multiple cpus? I believe the device
>> drivers do support multiple queues when run natively without the Dom0
>> loaded. The device in question is the xhci_hcd driver for which I/O
>> transfers seem to be slowed when the Dom0 is loaded. The behavior seems
>> to pass through to the DomU if pass through is enabled. I found some
>> similar threads, but most relate to Ethernet controllers. I tried some
>> of the x2apic and x2apic_phys dom0 kernel arguments, but none
>> distributed the pirqs. Based on the reading relating to IRQs for Xen, I
>> think pinning the pirqs to cpu0 is done to avoid an interrupt storm. I
>> tried IRQ balance and when configured/adjusted it will balance
>> individual pirqs, but not multiple interrupts.
>
> Yes. You can do it with smp affinity:
>
> https://cs.uwaterloo.ca/~brecht/servers/apic/SMP-affinity.txt
>
> I don't follow the "behavior seems to pass through to the DomU if pass
> through is enabled"?
>
>> With irqbalance enabled in Dom0:
>
> What version? There was a bug in it where it would never distribute the
> IRQs properly across the CPUs.
>
> Boris (CC-ed) might remember the upstream patch that made this work
> properly?

I think we ended up taking the latest version of irqbalance as the one
that added support for Xen guests was still not quite working. Besides,
that patch was for xen-dyn-events, not for pirqs.

-boris

>> CPU0 CPU1 CPU2 CPU3 CPU4 CPU5 CPU6 CPU7
>> 76: 11304 0 149579 0 0 0 0 0  xen-pirq-msi  :00:1f.2
>> 77: 1243 0 0 35447 0 0 0 0  xen-pirq-msi  radeon
>> 78: 82521 0 0 0 0 0 0 0  xen-pirq-msi  xhci_hcd
>> 79: 23 0 0 0 0 0 0 0  xen-pirq-msi  mei_me
>> 80: 11 0 0 0 0 741 0 0  xen-pirq-msi  em1
>> 81: 350 0 0 0 1671 0 0 0  xen-pirq-msi  iwlwifi
>> 82: 275 0 0 0 0 0 0 0  xen-pirq-msi  snd_hda_intel
>>
>> With native 3.19 kernel:
>>
>> Without Dom0 for the same system from the first message:
>>
>> # cat /proc/interrupts
>> CPU0 CPU1 CPU2 CPU3 CPU4 CPU5 CPU6 CPU7
>> 0: 33 0 0 0 0 0 0 0  IR-IO-APIC-edge  timer
>> 8: 0 0 0 0 0 0 1 0  IR-IO-APIC-edge  rtc0
>> 9: 20 0 0 0 0 1 1 1  IR-IO-APIC-fasteoi  acpi
>> 16: 15 0 8 1 4 1 1 1  IR-IO-APIC 16-fasteoi  ehci_hcd:usb3
>> 18: 703940 5678 1426226 1303 3938243 111477 757871 510  IR-IO-APIC 18-fasteoi  ath9k
>> 23: 11 2 3 0 0 17 2 0  IR-IO-APIC 23-fasteoi  ehci_hcd:usb4
>> 24: 0 0 0 0 0 0 0 0  DMAR_MSI-edge  dmar0
>> 25: 0 0 0 0 0 0 0 0  DMAR_MSI-edge  dmar1
>> 26: 20419 1609 26822 567 62281 5426 14928 395  IR-PCI-MSI-edge  :00:1f.2
>> 27: 17977230 628258 44247270 120391 1597809883 14440991 152189328 73322  IR-PCI-MSI-edge  xhci_hcd
>> 28: 563 0 0 0 1 0 6 0  IR-PCI-MSI-edge  i915
>> 29: 14 0 0 4 2 4 0 0  IR-PCI-MSI-edge  mei_me
>> 30: 39514 1744 60339 157 129956 19702 72140 83  IR-PCI-MSI-edge  eth0
>> 31: 3 0 0 1 54 0 0 2  IR-PCI-MSI-edge  snd_hda_intel
>> 32: 28145 284 53316 63 139165 4410 25760 27  IR-PCI-MSI-edge  eth1-rx-0
>> 33: 1032 43 2392 5 1797 265 1507 20  IR-PCI-MSI-edge  eth1-tx-0
>> 34: 0 1 0 0 0 1 2 0  IR-PCI-MSI-edge  eth1
>> 35: 5
Re: [Xen-devel] xhci_hcd interrupt affinity in Dom0/DomU limited to single interrupt
On Tue, Sep 01, 2015 at 05:39:46PM +, Justin Acker wrote:
> Taking this to the dev list from users.
>
> Is there a way to force or enable pirq delivery to a set of cpus, as
> opposed to a single device being assigned a single pirq, so that its
> interrupt can be distributed across multiple cpus? I believe the device
> drivers do support multiple queues when run natively without the Dom0
> loaded. The device in question is the xhci_hcd driver for which I/O
> transfers seem to be slowed when the Dom0 is loaded. The behavior seems
> to pass through to the DomU if pass through is enabled. I found some
> similar threads, but most relate to Ethernet controllers. I tried some
> of the x2apic and x2apic_phys dom0 kernel arguments, but none
> distributed the pirqs. Based on the reading relating to IRQs for Xen, I
> think pinning the pirqs to cpu0 is done to avoid an interrupt storm. I
> tried IRQ balance and when configured/adjusted it will balance
> individual pirqs, but not multiple interrupts.

Yes. You can do it with smp affinity:

https://cs.uwaterloo.ca/~brecht/servers/apic/SMP-affinity.txt

I don't follow the "behavior seems to pass through to the DomU if pass
through is enabled"?

> With irqbalance enabled in Dom0:

What version? There was a bug in it where it would never distribute the
IRQs properly across the CPUs.

Boris (CC-ed) might remember the upstream patch that made this work
properly?

> CPU0 CPU1 CPU2 CPU3 CPU4 CPU5 CPU6 CPU7
> 76: 11304 0 149579 0 0 0 0 0  xen-pirq-msi  :00:1f.2
> 77: 1243 0 0 35447 0 0 0 0  xen-pirq-msi  radeon
> 78: 82521 0 0 0 0 0 0 0  xen-pirq-msi  xhci_hcd
> 79: 23 0 0 0 0 0 0 0  xen-pirq-msi  mei_me
> 80: 11 0 0 0 0 741 0 0  xen-pirq-msi  em1
> 81: 350 0 0 0 1671 0 0 0  xen-pirq-msi  iwlwifi
> 82: 275 0 0 0 0 0 0 0  xen-pirq-msi  snd_hda_intel
>
> With native 3.19 kernel:
>
> Without Dom0 for the same system from the first message:
>
> # cat /proc/interrupts
> CPU0 CPU1 CPU2 CPU3 CPU4 CPU5 CPU6 CPU7
> 0: 33 0 0 0 0 0 0 0  IR-IO-APIC-edge  timer
> 8: 0 0 0 0 0 0 1 0  IR-IO-APIC-edge  rtc0
> 9: 20 0 0 0 0 1 1 1  IR-IO-APIC-fasteoi  acpi
> 16: 15 0 8 1 4 1 1 1  IR-IO-APIC 16-fasteoi  ehci_hcd:usb3
> 18: 703940 5678 1426226 1303 3938243 111477 757871 510  IR-IO-APIC 18-fasteoi  ath9k
> 23: 11 2 3 0 0 17 2 0  IR-IO-APIC 23-fasteoi  ehci_hcd:usb4
> 24: 0 0 0 0 0 0 0 0  DMAR_MSI-edge  dmar0
> 25: 0 0 0 0 0 0 0 0  DMAR_MSI-edge  dmar1
> 26: 20419 1609 26822 567 62281 5426 14928 395  IR-PCI-MSI-edge  :00:1f.2
> 27: 17977230 628258 44247270 120391 1597809883 14440991 152189328 73322  IR-PCI-MSI-edge  xhci_hcd
> 28: 563 0 0 0 1 0 6 0  IR-PCI-MSI-edge  i915
> 29: 14 0 0 4 2 4 0 0  IR-PCI-MSI-edge  mei_me
> 30: 39514 1744 60339 157 129956 19702 72140 83  IR-PCI-MSI-edge  eth0
> 31: 3 0 0 1 54 0 0 2  IR-PCI-MSI-edge  snd_hda_intel
> 32: 28145 284 53316 63 139165 4410 25760 27  IR-PCI-MSI-edge  eth1-rx-0
> 33: 1032 43 2392 5 1797 265 1507 20  IR-PCI-MSI-edge  eth1-tx-0
> 34: 0 1 0 0 0 1 2 0  IR-PCI-MSI-edge  eth1
> 35: 5 0 0 12 148 6 2 1  IR-PCI-MSI-edge  snd_hda_intel
>
> The
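[As an aside on the affinity interface the SMP-affinity link describes:
smp_affinity takes a hexadecimal CPU bitmask, and kernels of this era also
expose a list-format alias. A sketch using the Dom0 IRQ number from the
output above:

  echo ff > /proc/irq/78/smp_affinity         # bitmask: CPUs 0-7
  echo 0-7 > /proc/irq/78/smp_affinity_list   # equivalent range syntax
]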