Re: [Xen-devel] xhci_hcd interrupt affinity in Dom0/DomU limited to single interrupt

2015-09-21 Thread Jan Beulich
>>> On 16.09.15 at 22:31,  wrote:
> I think the lspci -v output is the same in both cases, with the exception
> of the xhci_pci entry, which is not present in the native lspci -v output.
> xhci_pci is built into the kernel. The same kernel is used on this system
> in both the Dom0 and native cases. I could rebuild the kernel without it
> and see what happens?

No point in doing so; the only lines we're interested in here are ...

> Native:
> 
> 00:14.0 USB controller: Intel Corporation 8 Series/C220 Series Chipset
> Family USB xHCI (rev 05) (prog-if 30 [XHCI])
>   Subsystem: Intel Corporation 8 Series/C220 Series Chipset Family USB
> xHCI
>   Flags: bus master, medium devsel, latency 0, IRQ 27
>   Memory at f7e2 (64-bit, non-prefetchable) [size=64K]
>   Capabilities: [70] Power Management version 2
>   Capabilities: [80] MSI: Enable+ Count=1/8 Maskable- 64bit+

... this and ...

>   Kernel driver in use: xhci_hcd
> 
> 
> 
> Dom0:
> 
> 00:14.0 USB controller: Intel Corporation 8 Series/C220 Series Chipset
> Family USB xHCI (rev 05) (prog-if 30 [XHCI])
>   Subsystem: Intel Corporation 8 Series/C220 Series Chipset Family USB
> xHCI
>   Flags: bus master, medium devsel, latency 0, IRQ 76
>   Memory at f7e2 (64-bit, non-prefetchable) [size=64K]
>   Capabilities: [70] Power Management version 2
>   Capabilities: [80] MSI: Enable+ Count=1/8 Maskable- 64bit+

... this. Their being identical confirms what I suspected: the driver
uses only a single interrupt.
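
A quick way to double-check this from the shell (BDF taken from your
lspci output above; paths assumed to be the same on your box):

  lspci -vs 00:14.0 | grep MSI     # Count=1/8 -> one vector enabled of eight
  grep xhci /proc/interrupts       # a single line -> a single interrupt in use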

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] xhci_hcd interrupt affinity in Dom0/DomU limited to single interrupt

2015-09-16 Thread Justin Acker


On Fri, 2015-09-11 at 04:03 -0600, Jan Beulich wrote:

> >>> On 10.09.15 at 18:20,  wrote:
> > On Wed, 2015-09-09 at 00:48 -0600, Jan Beulich wrote:
> >> >>> On 08.09.15 at 18:02,  wrote:
> >> > I believe the driver does support use of multiple interrupts based on
> >> > the previous explanation of the lspci output where it was established
> >> > that the device could use up to 8 interrupts which is what I see on bare
> >> > metal.
> >> 
> >> Where is the proof of that? All I've seen is output like this
> >> 
> >> Capabilities: [80] MSI: Enable+ Count=1/8 Maskable- 64bit+
> >> 
> >> which says that one out of eight interrupts is being used. And
> >> if in the native case this would indeed be the case, I don't think
> >> you've provided complete hypervisor and kernel logs for the
> >> Xen case so far, which would allow us to look for respective error
> >> indications. And this (ignoring the line wrapping, which makes
> >> things hard to read - it would be appreciated if you could fix
> >> your mail client)...
> >> 
> >> > Bare metal:
> >> > 
> >> > cat /proc/interrupts 
> >> >            CPU0    CPU1    CPU2   CPU3      CPU4    CPU5   CPU6   CPU7
> >> >   0:          36       0       0      0         0       0      0      0  IR-IO-APIC-edge  timer
> >> >    [...]
> >> >  27:      337125   47893  708965   4049  53940667  263303  87847   4958  IR-PCI-MSI-edge  xhci_hcd
> >> 
> >> ... also shows just a single interrupt being in use.
> > 
> > Kernel logs for native and Dom0 with 'debug' appended to grub. xl-dmesg
> > with log_lvl=all guest_loglvl=all set. Please let me know if there are
> > other logs or log levels that I should provide. 
> 
> The native kernel log supports there only being a single interrupt
> in use. I'm still not seeing any proof of your claim for this to be
> different. Did you double check lspci output in the native case?
> 
> Jan
> 


Jan,

I think the lspci -v output is the same in both cases, with the exception
of the xhci_pci entry, which is not present in the native lspci -v output.
xhci_pci is built into the kernel. The same kernel is used on this system
in both the Dom0 and native cases. I could rebuild the kernel without it
and see what happens?

Native:

00:14.0 USB controller: Intel Corporation 8 Series/C220 Series Chipset
Family USB xHCI (rev 05) (prog-if 30 [XHCI])
Subsystem: Intel Corporation 8 Series/C220 Series Chipset Family USB
xHCI
Flags: bus master, medium devsel, latency 0, IRQ 27
Memory at f7e2 (64-bit, non-prefetchable) [size=64K]
Capabilities: [70] Power Management version 2
Capabilities: [80] MSI: Enable+ Count=1/8 Maskable- 64bit+
Kernel driver in use: xhci_hcd



Dom0:

00:14.0 USB controller: Intel Corporation 8 Series/C220 Series Chipset
Family USB xHCI (rev 05) (prog-if 30 [XHCI])
Subsystem: Intel Corporation 8 Series/C220 Series Chipset Family USB
xHCI
Flags: bus master, medium devsel, latency 0, IRQ 76
Memory at f7e2 (64-bit, non-prefetchable) [size=64K]
Capabilities: [70] Power Management version 2
Capabilities: [80] MSI: Enable+ Count=1/8 Maskable- 64bit+
Kernel driver in use: xhci_hcd
Kernel modules: xhci_pci


cat /boot/config-3.18.1-1.fc20.x86_64 | grep XHCI
CONFIG_USB_XHCI_HCD=y
CONFIG_USB_XHCI_PCI=y

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] xhci_hcd interrupt affinity in Dom0/DomU limited to single interrupt

2015-09-11 Thread Jan Beulich
>>> On 10.09.15 at 18:20,  wrote:
> On Wed, 2015-09-09 at 00:48 -0600, Jan Beulich wrote:
>> >>> On 08.09.15 at 18:02,  wrote:
>> > I believe the driver does support use of multiple interrupts based on
>> > the previous explanation of the lspci output where it was established
>> > that the device could use up to 8 interrupts which is what I see on bare
>> > metal.
>> 
>> Where is the proof of that? All I've seen is output like this
>> 
>> Capabilities: [80] MSI: Enable+ Count=1/8 Maskable- 64bit+
>> 
>> which says that one out of eight interrupts is being used. And
>> if in the native case this would indeed be the case, I don't think
>> you've provided complete hypervisor and kernel logs for the
>> Xen case so far, which would allow us to look for respective error
>> indications. And this (ignoring the line wrapping, which makes
>> things hard to read - it would be appreciated if you could fix
>> your mail client)...
>> 
>> > Bare metal:
>> > 
>> > cat /proc/interrupts 
>> >            CPU0    CPU1    CPU2   CPU3      CPU4    CPU5   CPU6   CPU7
>> >   0:          36       0       0      0         0       0      0      0  IR-IO-APIC-edge  timer
>> >    [...]
>> >  27:      337125   47893  708965   4049  53940667  263303  87847   4958  IR-PCI-MSI-edge  xhci_hcd
>> 
>> ... also shows just a single interrupt being in use.
> 
> Kernel logs for native and Dom0 with 'debug' appended to grub. xl-dmesg
> with log_lvl=all guest_loglvl=all set. Please let me know if there are
> other logs or log levels that I should provide. 

The native kernel log supports there only being a single interrupt
in use. I'm still not seeing any proof of your claim for this to be
different. Did you double check lspci output in the native case?

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] xhci_hcd interrupt affinity in Dom0/DomU limited to single interrupt

2015-09-08 Thread Jan Beulich
>>> On 08.09.15 at 18:02,  wrote:
> I believe the driver does support use of multiple interrupts based on
> the previous explanation of the lspci output where it was established
> that the device could use up to 8 interrupts which is what I see on bare
> metal.

Where is the proof of that? All I've seen is output like this

Capabilities: [80] MSI: Enable+ Count=1/8 Maskable- 64bit+

which says that one out of eight interrupts is being used. And
if in the native case this would indeed be the case, I don't think
you've provided complete hypervisor and kernel logs for the
Xen case so far, which would allow us to look for respective error
indications. And this (ignoring the line wrapping, which makes
things hard to read - it would be appreciated if you could fix
your mail client)...

> Bare metal:
> 
> cat /proc/interrupts 
>            CPU0    CPU1    CPU2   CPU3      CPU4    CPU5   CPU6   CPU7
>   0:          36       0       0      0         0       0      0      0  IR-IO-APIC-edge  timer
>[...]
>  27:      337125   47893  708965   4049  53940667  263303  87847   4958  IR-PCI-MSI-edge  xhci_hcd

... also shows just a single interrupt being in use.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] xhci_hcd interrupt affinity in Dom0/DomU limited to single interrupt

2015-09-04 Thread Jan Beulich
>>> On 03.09.15 at 18:52,  wrote:
> On Thu, 2015-09-03 at 09:04 -0600, Jan Beulich wrote:
>> >>> On 03.09.15 at 14:04,  wrote:
>> >  I am still confused as to whether any device, or in this case 
>> > xhci_hcd, 
>> > can use more than one cpu at any given time. My understanding based on 
>> > David's response is that it cannot due to the event channel mapping. The 
>> > device interrupt can be pinned to a specific cpu by specifying the 
>> > affinity. I was hoping there was a way to allow the driver's interrupt to 
>> > be 
>> > scheduled to use more than 1 CPU at any given time. 
>> 
>> The problem is that you're mixing up two things: devices and
>> interrupts. Any individual interrupt can only be serviced by a single
>> CPU at a time, due to the way event channels get bound. Any
>> individual device can have more than one interrupt (MSI or MSI-X),
>> and then each of these interrupts can be serviced on different
>> CPUs.
> 
> Thanks for clarifying. To the original question, with respect to my
> limited understanding of the event channels and interrupts, each
> interrupt can be serviced on a different CPU using irqbalance or setting
> the affinity manually, but the same interrupt cannot be serviced by more
> than 1 CPU at a time? If so, is there a way around the 1:1 binding when
> loading the Dom0 kernel - a flag or option to use the native interrupt
> scheduling for some set of, or all, 8 CPUs that the device can schedule
> interrupts on when not loading the Dom0? The xhci_hcd, as one example,
> seems to perform better when it is able to have interrupts serviced by
> multiple CPUs. 

I don't follow - we have told you this doesn't work (multiple times,
and by different people), and you ask yet another time whether it can
be made to work? Just to make this very clear once again: Under Xen
(and leaving aside pure HVM guests), interrupt load from a single
device can be spread across CPUs only when the device uses
multiple interrupts.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] xhci_hcd interrupt affinity in Dom0/DomU limited to single interrupt

2015-09-03 Thread Jan Beulich
>>> On 03.09.15 at 14:04,  wrote:
 On 02.09.15 at 19:17,  wrote:
>>  From: Jan Beulich 
>>  Sent: Wednesday, September 2, 2015 4:58 AM
> Justin Acker  09/02/15 1:14 AM >>>
>>> 00:14.0 USB controller: Intel Corporation 7 Series/C210 Series Chipset 
> Family USB xHCI Host Controller (rev 04) (prog-if 30 [XHCI])
>>>Subsystem: Dell Device 053e
>>>Flags: bus master, medium devsel, latency 0, IRQ 78
>>>Memory at f7f2 (64-bit, non-prefetchable) [size=64K]
>>>Capabilities: [70] Power Management version 2
>>>Capabilities: [80] MSI: Enable+ Count=1/8 Maskable- 64bit+
>> 
>> This shows that the driver could use up to 8 MSI IRQs, but chose to use just 
> 
>> one. If
>> this is the same under Xen and the native kernel, the driver likely doesn't 
>> know any
>> better. If under native more interrupts are being used, there might be an 
>> issue with
>> Xen specific code in the kernel or hypervisor code. We'd need to see details 
> 
>> to be
>> able to tell.
>> 
>> Please let me know what details I should provide. 

I'd like to emphasize what I said in my previous reply:

> Please, first of all, get your reply style fixed. Just look at the above
> and tell me how a reader should figure which parts of the text were
> written by whom.
>
>[...]
>
>  I am still confused as to whether any device, or in this case xhci_hcd, 
> can use more than one cpu at any given time. My understanding based on 
> David's response is that it cannot due to the event channel mapping. The 
> device interrupt can be pinned to a specific cpu by specifying the 
> affinity. I was hoping there was a way to allow the driver's interrupt to be 
> scheduled to use more than 1 CPU at any given time. 

The problem is that you're mixing up two things: devices and
interrupts. Any individual interrupt can only be serviced by a single
CPU at a time, due to the way event channels get bound. Any
individual device can have more than one interrupt (MSI or MSI-X),
and then each of these interrupts can be serviced on different
CPUs.
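
As an illustration (device names taken from your native /proc/interrupts
output earlier in this thread):

  grep eth1 /proc/interrupts
   32: ...  IR-PCI-MSI-edge  eth1-rx-0
   33: ...  IR-PCI-MSI-edge  eth1-tx-0

Each of those lines is a separate interrupt with its own affinity; the
single xhci_hcd line cannot be split that way.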

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] xhci_hcd interrupt affinity in Dom0/DomU limited to single interrupt

2015-09-03 Thread Jan Beulich
(re-adding xen-devel)

>>> On 02.09.15 at 19:17,  wrote:
>   From: Jan Beulich 
>  Sent: Wednesday, September 2, 2015 4:58 AM
 Justin Acker  09/02/15 1:14 AM >>>
>> 00:14.0 USB controller: Intel Corporation 7 Series/C210 Series Chipset 
>> Family USB xHCI Host Controller (rev 04) (prog-if 30 [XHCI])
>>Subsystem: Dell Device 053e
>>Flags: bus master, medium devsel, latency 0, IRQ 78
>>Memory at f7f2 (64-bit, non-prefetchable) [size=64K]
>>Capabilities: [70] Power Management version 2
>>Capabilities: [80] MSI: Enable+ Count=1/8 Maskable- 64bit+
> 
> This shows that the driver could use up to 8 MSI IRQs, but chose to use just 
> one. If
> this is the same under Xen and the native kernel, the driver likely doesn't 
> know any
> better. If under native more interrupts are being used, there might be an 
> issue with
> Xen specific code in the kernel or hypervisor code. We'd need to see details 
> to be
> able to tell.
> 
> Please let me know what details I should provide. 
> 
> Jan

Please, first of all, get your reply style fixed. Just look at the above
and tell me how a reader should figure which parts of the text were
written by whom.

Together with other replies you sent, I first of all wonder whether
you've understood what you've been told: Any interrupt delivered
via the event channel mechanism can't be delivered to more than
one CPU at a time, unless it gets moved between CPUs by a tool or manually.
When you set the affinity to more than one (v)CPU, the kernel will
pick one (usually the first) out of the provided set and bind the
event channel to that vCPU.
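
You can observe this from the shell (IRQ number taken from your Dom0
listing; any IRQ bound through an event channel behaves the same way):

  echo ff > /proc/irq/78/smp_affinity   # request all 8 CPUs
  cat /proc/irq/78/smp_affinity         # reads back ff
  grep xhci /proc/interrupts            # yet only one CPU column grows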

As to using multi-vector MSI in the XHCI case: Please tell us
whether the lspci output still left in context above was with a
kernel running natively or under Xen. In the former case, the
driver may need improving. In the latter case we'd need to see,
for comparison, the same output with a natively running kernel. If
it matches the Xen one, same thing (driver may need improving).
If it doesn't match, maximum verbosity hypervisor and kernel logs
would be what we'd need to start with.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] xhci_hcd interrupt affinity in Dom0/DomU limited to single interrupt

2015-09-02 Thread David Vrabel
On 02/09/15 18:25, Justin Acker wrote:
> 
> 
> *From:* David Vrabel 
> *To:* Justin Acker ; "xen-devel@lists.xen.org"
> 
> *Sent:* Wednesday, September 2, 2015 9:47 AM
> *Subject:* Re: [Xen-devel] xhci_hcd interrupt affinity in Dom0/DomU
> limited to single interrupt
> 
> On 01/09/15 18:39, Justin Acker wrote:
> 
> 
> 
>> Taking this to the dev list from users.
>>
>> Is there a way to force or enable pirq delivery to a set of cpus, as
>> opposed to a single device being assigned a single pirq, so that its
>> interrupt can be distributed across multiple cpus?
> 
> 
> No.
> 
> PIRQs are delivered via event channels and these can only be bound to
> one VCPU at a time.
> 
> Thanks David. Does this apply to Dom0 only, or to both Dom0 and DomU?

Both.

David


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] xhci_hcd interrupt affinity in Dom0/DomU limited to single interrupt

2015-09-02 Thread Justin Acker

  From: Ian Campbell 
 To: Konrad Rzeszutek Wilk ; Justin Acker 
 
Cc: "boris.ostrov...@oracle.com" ; 
"xen-devel@lists.xen.org"  
 Sent: Wednesday, September 2, 2015 9:49 AM
 Subject: Re: [Xen-devel] xhci_hcd interrupt affinity in Dom0/DomU limited to 
single interrupt
   
On Wed, 2015-09-02 at 08:53 -0400, Konrad Rzeszutek Wilk wrote:
> On Tue, Sep 01, 2015 at 11:09:38PM +, Justin Acker wrote:
> > 
> >      From: Konrad Rzeszutek Wilk 
> >  To: Justin Acker  
> > Cc: "xen-devel@lists.xen.org" ; 
> > boris.ostrov...@oracle.com 
> >  Sent: Tuesday, September 1, 2015 4:56 PM
> >  Subject: Re: [Xen-devel] xhci_hcd interrupt affinity in Dom0/DomU 
> > limited to single interrupt
> >    
> > On Tue, Sep 01, 2015 at 05:39:46PM +, Justin Acker wrote:
> > > Taking this to the dev list from users. 
> > > 
> > > Is there a way to force or enable pirq delivery to a set of cpus, as 
> > > opposed to a single device being assigned a single pirq, so that 
> > > its interrupt can be distributed across multiple cpus? I believe the 
> > > device drivers do support multiple queues when run natively without 
> > > the Dom0 loaded. The device in question is the xhci_hcd driver for 
> > > which I/O transfers seem to be slowed when the Dom0 is loaded. The 
> > > behavior seems to pass through to the DomU if pass through is 
> > > enabled. I found some similar threads, but most relate to Ethernet 
> > > controllers. I tried some of the x2apic and x2apic_phys dom0 kernel 
> > > arguments, but none distributed the pirqs. Based on the reading 
> > > relating to IRQs for Xen, I think pinning the pirqs to cpu0 is done 
> > > to avoid an interrupt storm. I tried IRQ balance and when 
> > > configured/adjusted it will balance individual pirqs, but not 
> > > multiple interrupts.
> > 
> > Yes. You can do it with smp affinity:
> > 
> > https://cs.uwaterloo.ca/~brecht/servers/apic/SMP-affinity.txt
> > Yes, this does allow for assigning a specific interrupt to a single 
> > cpu, but it will not spread the interrupt load across a defined group 
> > or all cpus. Is it possible to define a range of CPUs or spread the 
> > interrupt load for a device across all cpus as it does with a native 
> > kernel without the Dom0 loaded?
> 
> It should be. Did you try giving it a mask that puts the interrupts on 
> all the CPUs?
> (0xf) ?
> > 
> > I don't follow the "behavior seems to pass through to the DomU if pass 
> > through is enabled" ?
> > The device interrupts are limited to a single pirq if the device is 
> > used directly in the Dom0. If the device is passed through to a DomU - 
> > i.e. the xhci_hcd controller - then the DomU cannot spread the 
> > interrupt load across the cpus in the VM. 
> 
> Why? How are you seeing this? The method by which you use smp affinity
> should be exactly the same.
> 
> And it looks to me that the device has a single pirq as well when booting 
> as baremetal right?
> 
> So the issue here is that you want to spread the interrupt delivery across
> all of the CPUs. The smp_affinity setting should do it. Did you try modifying
> it by hand (you may want to kill irqbalance when you do this, just to make
> sure it does not write its own values in)?

It sounds then like the real issue is that under native irqbalance is
writing smp_affinity values with potentially multiple bits set while under
Xen it is only setting a single bit?

Justin, are the contents of /proc/irq/<IRQ>/smp_affinity for the IRQ in
question under Native and Xen consistent with that supposition?


Ian, I think the mask is the same in both cases. With irqbalance enabled, the 
interrupts are mapped - seemingly at random - to various cpus, but only one 
cpu per interrupt in all cases.

With irqbalance disabled at boot, and the same kernel version used for both 
Dom0 and baremetal:
With Dom0 loaded:
cat /proc/irq/78/smp_affinity
ff

Baremetal kernel:
cat /proc/irq/27/smp_affinity
ff
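
(A simple way to watch which CPU actually services the IRQ despite the
ff mask - nothing Xen-specific assumed:

  watch -n1 -d 'grep xhci_hcd /proc/interrupts'

under Dom0 only one CPU column ever increments.)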


Ian.



  ___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] xhci_hcd interrupt affinity in Dom0/DomU limited to single interrupt

2015-09-02 Thread Justin Acker

  From: Konrad Rzeszutek Wilk 
 To: Justin Acker  
Cc: "xen-devel@lists.xen.org" ; 
"boris.ostrov...@oracle.com"  
 Sent: Wednesday, September 2, 2015 8:53 AM
 Subject: Re: [Xen-devel] xhci_hcd interrupt affinity in Dom0/DomU limited to 
single interrupt
   
On Tue, Sep 01, 2015 at 11:09:38PM +, Justin Acker wrote:
> 
>      From: Konrad Rzeszutek Wilk 
>  To: Justin Acker  
> Cc: "xen-devel@lists.xen.org" ; 
> boris.ostrov...@oracle.com 
>  Sent: Tuesday, September 1, 2015 4:56 PM
>  Subject: Re: [Xen-devel] xhci_hcd interrupt affinity in Dom0/DomU limited 
>to single interrupt
>    
> On Tue, Sep 01, 2015 at 05:39:46PM +, Justin Acker wrote:
> > Taking this to the dev list from users. 
> > 
> > Is there a way to force or enable pirq delivery to a set of cpus, as opposed 
> > to a single device being assigned a single pirq, so that its interrupt 
> > can be distributed across multiple cpus? I believe the device drivers do 
> > support multiple queues when run natively without the Dom0 loaded. The 
> > device in question is the xhci_hcd driver for which I/O transfers seem to 
> > be slowed when the Dom0 is loaded. The behavior seems to pass through to 
> > the DomU if pass through is enabled. I found some similar threads, but most 
> > relate to Ethernet controllers. I tried some of the x2apic and x2apic_phys 
> > dom0 kernel arguments, but none distributed the pirqs. Based on the reading 
> > relating to IRQs for Xen, I think pinning the pirqs to cpu0 is done to 
> > avoid an interrupt storm. I tried IRQ balance and when configured/adjusted 
> > it will balance individual pirqs, but not multiple interrupts.
> 
> Yes. You can do it with smp affinity:
> 
> https://cs.uwaterloo.ca/~brecht/servers/apic/SMP-affinity.txt
> Yes, this does allow for assigning a specific interrupt to a single cpu, but 
> it will not spread the interrupt load across a defined group or all cpus. Is 
> it possible to define a range of CPUs or spread the interrupt load for a 
> device across all cpus as it does with a native kernel without the Dom0 
> loaded?

It should be. Did you try giving it a mask that puts the interrupts on all the 
CPUs?
(0xf) ?
> 
> I don't follow the "behavior seems to pass through to the DomU if pass 
> through is enabled" ?
> The device interrupts are limited to a single pirq if the device is used 
> directly in the Dom0. If the device is passed through to a DomU - i.e. the 
> xhci_hcd controller - then the DomU cannot spread the interrupt load across 
> the cpus in the VM. 

Why? How are you seeing this? The method by which you use smp affinity should
be exactly the same.

And it looks to me that the device has a single pirq as well when booting as 
baremetal right?
On baremetal, it uses all 8 cpus for affinity as noted below (IRQ 27), compared 
to (IRQ 78) in the Dom0.

            CPU0     CPU1       CPU2     CPU3        CPU4       CPU5       CPU6   CPU7
Baremetal:
 27:    17977230   628258   44247270   120391  1597809883   14440991  152189328  73322  IR-PCI-MSI-edge  xhci_hcd
Dom0 or DomU passed through:
 78:       82521        0          0        0           0          0          0      0  xen-pirq-msi   xhci_hcd
So the issue here is that you want to spread the interrupt delivery across
all of the CPUs. The smp_affinity setting should do it. Did you try modifying
it by hand (you may want to kill irqbalance when you do this, just to make
sure it does not write its own values in)?


Yes, this would be great if there is a way to spread the affinity across all 
cpus or a specified set of CPUs, similar to the native kernel behavior. I guess 
it would be spread across a set of, or all, pirqs? With irqbalance disabled, I 
did try adjusting the interrupt affinity manually (i.e. echo ff > 
/proc/irq/78/smp_affinity). The interrupt will move to the specified CPU (0 
through 7). Without specifying the affinity manually, it does look like it's 
mapped to all cpus by default. With the Dom0 loaded, cat 
/proc/irq/78/smp_affinity returns ff, but the interrupt never appears to be 
scheduled on more than one cpu.

> 
> > 
> > 
> > 
> > With irqbalance enabled in Dom0:
> 
> What version? There was a bug in it where it would never distribute the IRQs 
> properly
> across the CPUs.
> irqbalance version 1.0.7.
> 
> Boris (CC-ed) might remember the upstream patch that made this work properly?
> 
> 
> > 
> >    CPU0   CPU1   CPU2   CPU3   CPU4   CPU5  
> >  CPU6   CPU7  
> >  76:  11304  0 149579  0  0  0  

Re: [Xen-devel] xhci_hcd interrupt affinity in Dom0/DomU limited to single interrupt

2015-09-02 Thread Ian Campbell
On Wed, 2015-09-02 at 08:53 -0400, Konrad Rzeszutek Wilk wrote:
> On Tue, Sep 01, 2015 at 11:09:38PM +, Justin Acker wrote:
> > 
> >   From: Konrad Rzeszutek Wilk 
> >  To: Justin Acker  
> > Cc: "xen-devel@lists.xen.org" ; 
> > boris.ostrov...@oracle.com 
> >  Sent: Tuesday, September 1, 2015 4:56 PM
> >  Subject: Re: [Xen-devel] xhci_hcd interrupt affinity in Dom0/DomU 
> > limited to single interrupt
> >
> > On Tue, Sep 01, 2015 at 05:39:46PM +, Justin Acker wrote:
> > > Taking this to the dev list from users. 
> > > 
> > > Is there a way to force or enable pirq delivery to a set of cpus, as 
> > > opposed to a single device being assigned a single pirq, so that 
> > > its interrupt can be distributed across multiple cpus? I believe the 
> > > device drivers do support multiple queues when run natively without 
> > > the Dom0 loaded. The device in question is the xhci_hcd driver for 
> > > which I/O transfers seem to be slowed when the Dom0 is loaded. The 
> > > behavior seems to pass through to the DomU if pass through is 
> > > enabled. I found some similar threads, but most relate to Ethernet 
> > > controllers. I tried some of the x2apic and x2apic_phys dom0 kernel 
> > > arguments, but none distributed the pirqs. Based on the reading 
> > > relating to IRQs for Xen, I think pinning the pirqs to cpu0 is done 
> > > to avoid an interrupt storm. I tried IRQ balance and when 
> > > configured/adjusted it will balance individual pirqs, but not 
> > > multiple interrupts.
> > 
> > Yes. You can do it with smp affinity:
> > 
> > https://cs.uwaterloo.ca/~brecht/servers/apic/SMP-affinity.txt
> > Yes, this does allow for assigning a specific interrupt to a single 
> > cpu, but it will not spread the interrupt load across a defined group 
> > or all cpus. Is it possible to define a range of CPUs or spread the 
> > interrupt load for a device across all cpus as it does with a native 
> > kernel without the Dom0 loaded?
> 
> It should be. Did you try giving it a mask that puts the interrupts on 
> all the CPUs?
> (0xf) ?
> > 
> > I don't follow the "behavior seems to pass through to the DomU if pass 
> > through is enabled" ?
> > The device interrupts are limited to a single pirq if the device is 
> > used directly in the Dom0. If the device is passed through to a DomU - 
> > i.e. the xhci_hcd controller - then the DomU cannot spread the 
> > interrupt load across the cpus in the VM. 
> 
> Why? How are you seeing this? The method by which you use smp affinity
> should be exactly the same.
> 
> And it looks to me that the device has a single pirq as well when booting 
> as baremetal right?
> 
> So the issue here is that you want to spread the interrupt delivery across
> all of the CPUs. The smp_affinity setting should do it. Did you try modifying
> it by hand (you may want to kill irqbalance when you do this, just to make
> sure it does not write its own values in)?

It sounds then like the real issue is that under native irqbalance is
writing smp_affinity values with potentially multiple bits set while under
Xen it is only setting a single bit?

Justin, are the contents of /proc/irq/<IRQ>/smp_affinity for the IRQ in
question under Native and Xen consistent with that supposition?

Ian.


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] xhci_hcd interrupt affinity in Dom0/DomU limited to single interrupt

2015-09-02 Thread David Vrabel
On 01/09/15 18:39, Justin Acker wrote:
> Taking this to the dev list from users.
> 
> Is there a way to force or enable pirq delivery to a set of cpus, as
> opposed to a single device being assigned a single pirq, so that its
> interrupt can be distributed across multiple cpus?

No.

PIRQs are delivered via event channels and these can only be bound to
one VCPU at a time.
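
If you want to see the binding, the hypervisor can dump it (xl toolstack
and debug keys assumed available in your Xen build):

  xl debug-keys e   # dump event-channel info, including the bound vcpu
  xl dmesg          # read the resulting output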

David

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] xhci_hcd interrupt affinity in Dom0/DomU limited to single interrupt

2015-09-02 Thread Konrad Rzeszutek Wilk
On Tue, Sep 01, 2015 at 11:09:38PM +, Justin Acker wrote:
> 
>   From: Konrad Rzeszutek Wilk 
>  To: Justin Acker  
> Cc: "xen-devel@lists.xen.org" ; 
> boris.ostrov...@oracle.com 
>  Sent: Tuesday, September 1, 2015 4:56 PM
>  Subject: Re: [Xen-devel] xhci_hcd interrupt affinity in Dom0/DomU limited 
> to single interrupt
>
> On Tue, Sep 01, 2015 at 05:39:46PM +, Justin Acker wrote:
> > Taking this to the dev list from users. 
> > 
> > Is there a way to force or enable pirq delivery to a set of cpus, as opposed 
> > to a single device being assigned a single pirq, so that its interrupt 
> > can be distributed across multiple cpus? I believe the device drivers do 
> > support multiple queues when run natively without the Dom0 loaded. The 
> > device in question is the xhci_hcd driver for which I/O transfers seem to 
> > be slowed when the Dom0 is loaded. The behavior seems to pass through to 
> > the DomU if pass through is enabled. I found some similar threads, but most 
> > relate to Ethernet controllers. I tried some of the x2apic and x2apic_phys 
> > dom0 kernel arguments, but none distributed the pirqs. Based on the reading 
> > relating to IRQs for Xen, I think pinning the pirqs to cpu0 is done to 
> > avoid an interrupt storm. I tried IRQ balance and when configured/adjusted 
> > it will balance individual pirqs, but not multiple interrupts.
> 
> Yes. You can do it with smp affinity:
> 
> https://cs.uwaterloo.ca/~brecht/servers/apic/SMP-affinity.txt
> Yes, this does allow for assigning a specific interrupt to a single cpu, but 
> it will not spread the interrupt load across a defined group or all cpus. Is 
> it possible to define a range of CPUs or spread the interrupt load for a 
> device across all cpus as it does with a native kernel without the Dom0 
> loaded?

It should be. Did you try giving it a mask that puts the interrupts on all the 
CPUs?
(0xf) ?
> 
> I don't follow the "behavior seems to pass through to the DomU if pass 
> through is enabled" ?
> The device interrupts are limited to a single pirq if the device is used 
> directly in the Dom0. If the device is passed through to a DomU - i.e. the 
> xhci_hcd controller - then the DomU cannot spread the interrupt load across 
> the cpus in the VM. 

Why? How are you seeing this? The method by which you use smp affinity should
be exactly the same.

And it looks to me that the device has a single pirq as well when booting as 
baremetal right?

So the issue here is that you want to spread the interrupt delivery across
all of the CPUs. The smp_affinity setting should do it. Did you try modifying
it by hand (you may want to kill irqbalance when you do this, just to make
sure it does not write its own values in)?
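
For example (service name assumed; with 8 CPUs the all-CPUs mask is ff,
not f):

  systemctl stop irqbalance             # or: killall irqbalance
  echo ff > /proc/irq/78/smp_affinity
  cat /proc/irq/78/smp_affinity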

> 
> > 
> > 
> > 
> > With irqbalance enabled in Dom0:
> 
> What version? There was a bug in it where it would never distribute the IRQs 
> properly
> across the CPUs.
> irqbalance version 1.0.7.
> 
> Boris (CC-ed) might remember the upstream patch that made this work properly?
> 
> 
> > 
> >            CPU0   CPU1    CPU2   CPU3   CPU4   CPU5   CPU6   CPU7
> >  76:      11304      0  149579      0      0      0      0      0  xen-pirq-msi   :00:1f.2
> >  77:       1243      0       0  35447      0      0      0      0  xen-pirq-msi   radeon
> >  78:      82521      0       0      0      0      0      0      0  xen-pirq-msi   xhci_hcd
> >  79:         23      0       0      0      0      0      0      0  xen-pirq-msi   mei_me
> >  80:         11      0       0      0      0    741      0      0  xen-pirq-msi   em1
> >  81:        350      0       0      0   1671      0      0      0  xen-pirq-msi   iwlwifi
> >  82:        275      0       0      0      0      0      0      0  xen-pirq-msi   snd_hda_intel
> > 
> > With native 3.19 kernel:
> > 
> > Without Dom0 for the same system from the first message:
> > 
> > # cat /proc/interrupts
> >            CPU0   CPU1    CPU2   CPU3   CPU4   CPU5   CPU6   CPU7
> >   0:         33      0       0      0      0      0      0      0  IR-IO-APIC-edge  timer
> >   8:          0      0       0      0      0      0      1      0  IR-IO-APIC-edge  rtc0
> >   9:         20      0       0      0      0      1      1      1  IR-IO-APIC-fasteoi   acpi

Re: [Xen-devel] xhci_hcd interrupt affinity in Dom0/DomU limited to single interrupt

2015-09-01 Thread Boris Ostrovsky

On 09/01/2015 04:56 PM, Konrad Rzeszutek Wilk wrote:

On Tue, Sep 01, 2015 at 05:39:46PM +, Justin Acker wrote:

Taking this to the dev list from users.

Is there a way to force or enable pirq delivery to a set of cpus, as opposed to 
a single device being assigned a single pirq, so that its interrupt can be 
distributed across multiple cpus? I believe the device drivers do support 
multiple queues when run natively without the Dom0 loaded. The device in 
question is the xhci_hcd driver for which I/O transfers seem to be slowed when 
the Dom0 is loaded. The behavior seems to pass through to the DomU if pass 
through is enabled. I found some similar threads, but most relate to Ethernet 
controllers. I tried some of the x2apic and x2apic_phys dom0 kernel arguments, 
but none distributed the pirqs. Based on the reading relating to IRQs for Xen, 
I think pinning the pirqs to cpu0 is done to avoid an interrupt storm. I tried 
IRQ balance and when configured/adjusted it will balance individual pirqs, but 
not multiple interrupts.

Yes. You can do it with smp affinity:

https://cs.uwaterloo.ca/~brecht/servers/apic/SMP-affinity.txt

I don't follow the "behavior seems to pass through to the DomU if pass through is 
enabled" ?




With irqbalance enabled in Dom0:

What version? There was a bug in it where it would never distribute the IRQs 
properly
across the CPUs.

Boris (CC-ed) might remember the upstream patch that made this work properly?


I think we ended up taking the latest version of irqbalance, as the one 
that added support for Xen guests was still not quite working. Besides, 
that patch was for xen-dyn-events, not for pirqs.


-boris



           CPU0   CPU1    CPU2   CPU3   CPU4   CPU5   CPU6   CPU7
 76:      11304      0  149579      0      0      0      0      0  xen-pirq-msi   :00:1f.2
 77:       1243      0       0  35447      0      0      0      0  xen-pirq-msi   radeon
 78:      82521      0       0      0      0      0      0      0  xen-pirq-msi   xhci_hcd
 79:         23      0       0      0      0      0      0      0  xen-pirq-msi   mei_me
 80:         11      0       0      0      0    741      0      0  xen-pirq-msi   em1
 81:        350      0       0      0   1671      0      0      0  xen-pirq-msi   iwlwifi
 82:        275      0       0      0      0      0      0      0  xen-pirq-msi   snd_hda_intel

With native 3.19 kernel:

Without Dom0 for the same system from the first message:

# cat /proc/interrupts
           CPU0     CPU1      CPU2    CPU3        CPU4      CPU5       CPU6   CPU7
  0:         33        0         0       0           0         0          0      0  IR-IO-APIC-edge  timer
  8:          0        0         0       0           0         0          1      0  IR-IO-APIC-edge  rtc0
  9:         20        0         0       0           0         1          1      1  IR-IO-APIC-fasteoi   acpi
 16:         15        0         8       1           4         1          1      1  IR-IO-APIC  16-fasteoi   ehci_hcd:usb3
 18:     703940     5678   1426226    1303     3938243    111477     757871    510  IR-IO-APIC  18-fasteoi   ath9k
 23:         11        2         3       0           0        17          2      0  IR-IO-APIC  23-fasteoi   ehci_hcd:usb4
 24:          0        0         0       0           0         0          0      0  DMAR_MSI-edge  dmar0
 25:          0        0         0       0           0         0          0      0  DMAR_MSI-edge  dmar1
 26:      20419     1609     26822     567       62281      5426      14928    395  IR-PCI-MSI-edge  :00:1f.2
 27:   17977230   628258  44247270  120391  1597809883  14440991  152189328  73322  IR-PCI-MSI-edge  xhci_hcd
 28:        563        0         0       0           1         0          6      0  IR-PCI-MSI-edge  i915
 29:         14        0         0       4           2         4          0      0  IR-PCI-MSI-edge  mei_me
 30:      39514     1744     60339     157      129956     19702      72140     83  IR-PCI-MSI-edge  eth0
 31:          3        0         0       1          54         0          0      2  IR-PCI-MSI-edge  snd_hda_intel
 32:      28145      284     53316      63      139165      4410      25760     27  IR-PCI-MSI-edge  eth1-rx-0
 33:       1032       43      2392       5        1797       265       1507     20  IR-PCI-MSI-edge  eth1-tx-0
 34:          0        1         0       0           0         1          2      0  IR-PCI-MSI-edge  eth1
 35:          5        0         0      12         148         6          2      1  IR-PCI-MSI-edge  snd_hda_intel

Re: [Xen-devel] xhci_hcd interrupt affinity in Dom0/DomU limited to single interrupt

2015-09-01 Thread Konrad Rzeszutek Wilk
On Tue, Sep 01, 2015 at 05:39:46PM +, Justin Acker wrote:
> Taking this to the dev list from users. 
> 
> Is there a way to force or enable pirq delivery to a set of cpus, as opposed 
> to a single device being assigned a single pirq, so that its interrupt 
> can be distributed across multiple cpus? I believe the device drivers do 
> support multiple queues when run natively without the Dom0 loaded. The device 
> in question is the xhci_hcd driver for which I/O transfers seem to be slowed 
> when the Dom0 is loaded. The behavior seems to pass through to the DomU if 
> pass through is enabled. I found some similar threads, but most relate to 
> Ethernet controllers. I tried some of the x2apic and x2apic_phys dom0 kernel 
> arguments, but none distributed the pirqs. Based on the reading relating to 
> IRQs for Xen, I think pinning the pirqs to cpu0 is done to avoid an interrupt 
> storm. I tried IRQ balance and when configured/adjusted it will balance 
> individual pirqs, but not multiple interrupts.

Yes. You can do it with smp affinity:

https://cs.uwaterloo.ca/~brecht/servers/apic/SMP-affinity.txt
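
In short, the mask is a hex CPU bitmap written to procfs (IRQ number
taken from your listing below):

  echo 4 > /proc/irq/78/smp_affinity    # bit 2 set -> CPU2 only
  echo ff > /proc/irq/78/smp_affinity   # CPU0-7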

I don't follow the "behavior seems to pass through to the DomU if pass through 
is enabled" ?

> 
> 
> 
> With irqbalance enabled in Dom0:

What version? There was a bug in it where it would never distribute the IRQs 
properly
across the CPUs.

Boris (CC-ed) might remember the upstream patch that made this work properly?
> 
>            CPU0   CPU1    CPU2   CPU3   CPU4   CPU5   CPU6   CPU7
>  76:      11304      0  149579      0      0      0      0      0  xen-pirq-msi   :00:1f.2
>  77:       1243      0       0  35447      0      0      0      0  xen-pirq-msi   radeon
>  78:      82521      0       0      0      0      0      0      0  xen-pirq-msi   xhci_hcd
>  79:         23      0       0      0      0      0      0      0  xen-pirq-msi   mei_me
>  80:         11      0       0      0      0    741      0      0  xen-pirq-msi   em1
>  81:        350      0       0      0   1671      0      0      0  xen-pirq-msi   iwlwifi
>  82:        275      0       0      0      0      0      0      0  xen-pirq-msi   snd_hda_intel
> 
> With native 3.19 kernel:
> 
> Without Dom0 for the same system from the first message:
> 
> # cat /proc/interrupts
>            CPU0     CPU1      CPU2    CPU3        CPU4      CPU5       CPU6   CPU7
>   0:         33        0         0       0           0         0          0      0  IR-IO-APIC-edge  timer
>   8:          0        0         0       0           0         0          1      0  IR-IO-APIC-edge  rtc0
>   9:         20        0         0       0           0         1          1      1  IR-IO-APIC-fasteoi   acpi
>  16:         15        0         8       1           4         1          1      1  IR-IO-APIC  16-fasteoi   ehci_hcd:usb3
>  18:     703940     5678   1426226    1303     3938243    111477     757871    510  IR-IO-APIC  18-fasteoi   ath9k
>  23:         11        2         3       0           0        17          2      0  IR-IO-APIC  23-fasteoi   ehci_hcd:usb4
>  24:          0        0         0       0           0         0          0      0  DMAR_MSI-edge  dmar0
>  25:          0        0         0       0           0         0          0      0  DMAR_MSI-edge  dmar1
>  26:      20419     1609     26822     567       62281      5426      14928    395  IR-PCI-MSI-edge  :00:1f.2
>  27:   17977230   628258  44247270  120391  1597809883  14440991  152189328  73322  IR-PCI-MSI-edge  xhci_hcd
>  28:        563        0         0       0           1         0          6      0  IR-PCI-MSI-edge  i915
>  29:         14        0         0       4           2         4          0      0  IR-PCI-MSI-edge  mei_me
>  30:      39514     1744     60339     157      129956     19702      72140     83  IR-PCI-MSI-edge  eth0
>  31:          3        0         0       1          54         0          0      2  IR-PCI-MSI-edge  snd_hda_intel
>  32:      28145      284     53316      63      139165      4410      25760     27  IR-PCI-MSI-edge  eth1-rx-0
>  33:       1032       43      2392       5        1797       265       1507     20  IR-PCI-MSI-edge  eth1-tx-0
>  34:          0        1         0       0           0         1          2      0  IR-PCI-MSI-edge  eth1
>  35:          5        0         0      12         148         6          2      1  IR-PCI-MSI-edge  snd_hda_intel
> 