Re: [PATCH 1/2] KVM: Implement default IRQ routing

2013-09-29 Thread Gleb Natapov
On Thu, Sep 26, 2013 at 10:00:59AM +1000, Paul Mackerras wrote:
 On Mon, Sep 23, 2013 at 09:34:01PM +0300, Gleb Natapov wrote:
  On Mon, Sep 23, 2013 at 09:24:14PM +1000, Paul Mackerras wrote:
   On Sun, Sep 22, 2013 at 03:32:53PM +0300, Gleb Natapov wrote:
On Tue, Sep 17, 2013 at 07:18:40PM +1000, Paul Mackerras wrote:
 This implements a simple way to express the case of IRQ routing where
 there is one in-kernel PIC and the system interrupts (GSIs) are routed
 1-1 to the corresponding pins of the PIC.  This is expressed by having
 kvm-irq_routing == NULL with a skeleton irq routing entry in the new
 kvm-default_irq_route field.
 
 This provides a convenient way to provide backwards compatibility when
 adding IRQ routing capability to an existing in-kernel PIC, such as 
 the
 XICS emulation on powerpc.
 
Why not create simple 1-1 irq routing table? It will take a little bit
more memory, but there will be no need for kvm-irq_routing == NULL
special handling.
   
   The short answer is that userspace wants to use interrupt source
   numbers (i.e. pin numbers for the inputs to the emulated XICS) that
   are scattered throughout a large space, since that mirrors what real
   hardware does.  More specifically, hardware divides up the interrupt
   source number into two fields, each of typically 12 bits, where the
   more significant field identifies an interrupt source unit (ISU) and
   the less significant field identifies an interrupt within the ISU.
   Each PCI host bridge would have an ISU, for example, and there can be
   ISUs associated with other things that attach directly to the
   interconnect fabric (coprocessors, cluster interconnects, etc.).
   
   Today, QEMU creates a virtual ISU numbered 1 for the emulated PCI host
   bridge, which means for example that virtio devices get interrupt pin
   numbers starting at 4096.
   
   So, I could have increased KVM_IRQCHIP_NUM_PINS to some multiple of
   4096, say 16384, which would allow for 3 ISUs.  But that would bloat
   out struct kvm_irq_routing_table to over 64kB, and if I wanted 1-1
   mappings between GSI and pins for all of them, the routing table would
   be over 960kB.
   
  Yes, this is not an option. GSI is just a cookie for anything but x86
  non MSI interrupts. So the way to use irq routing table to deliver XICS irqs
  is to register GSI-XICS irq mapping and by triggering GSI, which is
  just an arbitrary number, userspace tells kernel that XICS irq, that was
  registered for that GSI, should be injected.
 
 Yes, that's fine as far as it goes, but the trouble is that the
 existing data structures (specifically the chip[][] array in struct
 kvm_irq_routing_table) don't handle well the case where the pin
 numbers are large and/or sparse.
 
 In other words, using a small compact set of GSI numbers wouldn't
 help, because it's not the GSI - pin mapping that is the problem, it
 is the reverse pin - GSI mapping.
 
That's internal implementation detail that can be redesigned if needed,
we can let architecture define how irq is mapped back to a GSI (cookie).
Buts since you mentioned chip[][] here I have a question about patch 2: in
kvm_set_routing_entry() you check irq against KVM_IRQCHIP_NUM_PINS which
is defined to be 256 for powerpc, but according to what you described
above about 12 bit ISU/irq devision this is not enough to inject any
interrupt with ISU 1. You would have had at least 256 irqs in each ISU
if you reused irqchip to specify ISU, but you didn't do it. Why not extend
a union in kvm_irq_routing_entry with xics specific structure that will
specify ISU/irq withing ISU?

Who irq numbers inside ISU are assigned?

--
Gleb.
--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/2] KVM: Implement default IRQ routing

2013-09-25 Thread Paul Mackerras
On Mon, Sep 23, 2013 at 09:34:01PM +0300, Gleb Natapov wrote:
 On Mon, Sep 23, 2013 at 09:24:14PM +1000, Paul Mackerras wrote:
  On Sun, Sep 22, 2013 at 03:32:53PM +0300, Gleb Natapov wrote:
   On Tue, Sep 17, 2013 at 07:18:40PM +1000, Paul Mackerras wrote:
This implements a simple way to express the case of IRQ routing where
there is one in-kernel PIC and the system interrupts (GSIs) are routed
1-1 to the corresponding pins of the PIC.  This is expressed by having
kvm-irq_routing == NULL with a skeleton irq routing entry in the new
kvm-default_irq_route field.

This provides a convenient way to provide backwards compatibility when
adding IRQ routing capability to an existing in-kernel PIC, such as the
XICS emulation on powerpc.

   Why not create simple 1-1 irq routing table? It will take a little bit
   more memory, but there will be no need for kvm-irq_routing == NULL
   special handling.
  
  The short answer is that userspace wants to use interrupt source
  numbers (i.e. pin numbers for the inputs to the emulated XICS) that
  are scattered throughout a large space, since that mirrors what real
  hardware does.  More specifically, hardware divides up the interrupt
  source number into two fields, each of typically 12 bits, where the
  more significant field identifies an interrupt source unit (ISU) and
  the less significant field identifies an interrupt within the ISU.
  Each PCI host bridge would have an ISU, for example, and there can be
  ISUs associated with other things that attach directly to the
  interconnect fabric (coprocessors, cluster interconnects, etc.).
  
  Today, QEMU creates a virtual ISU numbered 1 for the emulated PCI host
  bridge, which means for example that virtio devices get interrupt pin
  numbers starting at 4096.
  
  So, I could have increased KVM_IRQCHIP_NUM_PINS to some multiple of
  4096, say 16384, which would allow for 3 ISUs.  But that would bloat
  out struct kvm_irq_routing_table to over 64kB, and if I wanted 1-1
  mappings between GSI and pins for all of them, the routing table would
  be over 960kB.
  
 Yes, this is not an option. GSI is just a cookie for anything but x86
 non MSI interrupts. So the way to use irq routing table to deliver XICS irqs
 is to register GSI-XICS irq mapping and by triggering GSI, which is
 just an arbitrary number, userspace tells kernel that XICS irq, that was
 registered for that GSI, should be injected.

Yes, that's fine as far as it goes, but the trouble is that the
existing data structures (specifically the chip[][] array in struct
kvm_irq_routing_table) don't handle well the case where the pin
numbers are large and/or sparse.

In other words, using a small compact set of GSI numbers wouldn't
help, because it's not the GSI - pin mapping that is the problem, it
is the reverse pin - GSI mapping.

  There is a compatibility concern too -- if I want existing userspace
  to run, I would have to create 1-1 default mappings for at least the
  first (say) 4200 pins or so, which would use up about 294kB.  That
  really doesn't seem worth it compared to just using the null routing
  table pointer to indicate an unlimited 1-1 mapping.
  
 Let me check that I understand you correctly. Exiting userspace already
 uses XICS irq directly and now you want to switch this interface to use
 irq routing. New userspace will register GSI-XICS irq mapping like
 described above, this is what [2/2] does. Is this correct? 

I don't particularly want irq routing, but it appears to be
unavoidable if one wants IRQFD, which I do want.  There is no
particular advantage to userspace in using a GSI - XICS irq mapping.
If userspace did set one up it would be the identity mapping.

Paul.
--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/2] KVM: Implement default IRQ routing

2013-09-23 Thread Paul Mackerras
On Sun, Sep 22, 2013 at 03:32:53PM +0300, Gleb Natapov wrote:
 On Tue, Sep 17, 2013 at 07:18:40PM +1000, Paul Mackerras wrote:
  This implements a simple way to express the case of IRQ routing where
  there is one in-kernel PIC and the system interrupts (GSIs) are routed
  1-1 to the corresponding pins of the PIC.  This is expressed by having
  kvm-irq_routing == NULL with a skeleton irq routing entry in the new
  kvm-default_irq_route field.
  
  This provides a convenient way to provide backwards compatibility when
  adding IRQ routing capability to an existing in-kernel PIC, such as the
  XICS emulation on powerpc.
  
 Why not create simple 1-1 irq routing table? It will take a little bit
 more memory, but there will be no need for kvm-irq_routing == NULL
 special handling.

The short answer is that userspace wants to use interrupt source
numbers (i.e. pin numbers for the inputs to the emulated XICS) that
are scattered throughout a large space, since that mirrors what real
hardware does.  More specifically, hardware divides up the interrupt
source number into two fields, each of typically 12 bits, where the
more significant field identifies an interrupt source unit (ISU) and
the less significant field identifies an interrupt within the ISU.
Each PCI host bridge would have an ISU, for example, and there can be
ISUs associated with other things that attach directly to the
interconnect fabric (coprocessors, cluster interconnects, etc.).

Today, QEMU creates a virtual ISU numbered 1 for the emulated PCI host
bridge, which means for example that virtio devices get interrupt pin
numbers starting at 4096.

So, I could have increased KVM_IRQCHIP_NUM_PINS to some multiple of
4096, say 16384, which would allow for 3 ISUs.  But that would bloat
out struct kvm_irq_routing_table to over 64kB, and if I wanted 1-1
mappings between GSI and pins for all of them, the routing table would
be over 960kB.

There is a compatibility concern too -- if I want existing userspace
to run, I would have to create 1-1 default mappings for at least the
first (say) 4200 pins or so, which would use up about 294kB.  That
really doesn't seem worth it compared to just using the null routing
table pointer to indicate an unlimited 1-1 mapping.

Paul.
--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/2] KVM: Implement default IRQ routing

2013-09-23 Thread Gleb Natapov
On Mon, Sep 23, 2013 at 09:24:14PM +1000, Paul Mackerras wrote:
 On Sun, Sep 22, 2013 at 03:32:53PM +0300, Gleb Natapov wrote:
  On Tue, Sep 17, 2013 at 07:18:40PM +1000, Paul Mackerras wrote:
   This implements a simple way to express the case of IRQ routing where
   there is one in-kernel PIC and the system interrupts (GSIs) are routed
   1-1 to the corresponding pins of the PIC.  This is expressed by having
   kvm-irq_routing == NULL with a skeleton irq routing entry in the new
   kvm-default_irq_route field.
   
   This provides a convenient way to provide backwards compatibility when
   adding IRQ routing capability to an existing in-kernel PIC, such as the
   XICS emulation on powerpc.
   
  Why not create simple 1-1 irq routing table? It will take a little bit
  more memory, but there will be no need for kvm-irq_routing == NULL
  special handling.
 
 The short answer is that userspace wants to use interrupt source
 numbers (i.e. pin numbers for the inputs to the emulated XICS) that
 are scattered throughout a large space, since that mirrors what real
 hardware does.  More specifically, hardware divides up the interrupt
 source number into two fields, each of typically 12 bits, where the
 more significant field identifies an interrupt source unit (ISU) and
 the less significant field identifies an interrupt within the ISU.
 Each PCI host bridge would have an ISU, for example, and there can be
 ISUs associated with other things that attach directly to the
 interconnect fabric (coprocessors, cluster interconnects, etc.).
 
 Today, QEMU creates a virtual ISU numbered 1 for the emulated PCI host
 bridge, which means for example that virtio devices get interrupt pin
 numbers starting at 4096.
 
 So, I could have increased KVM_IRQCHIP_NUM_PINS to some multiple of
 4096, say 16384, which would allow for 3 ISUs.  But that would bloat
 out struct kvm_irq_routing_table to over 64kB, and if I wanted 1-1
 mappings between GSI and pins for all of them, the routing table would
 be over 960kB.
 
Yes, this is not an option. GSI is just a cookie for anything but x86
non MSI interrupts. So the way to use irq routing table to deliver XICS irqs
is to register GSI-XICS irq mapping and by triggering GSI, which is
just an arbitrary number, userspace tells kernel that XICS irq, that was
registered for that GSI, should be injected.

 There is a compatibility concern too -- if I want existing userspace
 to run, I would have to create 1-1 default mappings for at least the
 first (say) 4200 pins or so, which would use up about 294kB.  That
 really doesn't seem worth it compared to just using the null routing
 table pointer to indicate an unlimited 1-1 mapping.
 
Let me check that I understand you correctly. Exiting userspace already
uses XICS irq directly and now you want to switch this interface to use
irq routing. New userspace will register GSI-XICS irq mapping like
described above, this is what [2/2] does. Is this correct? 

--
Gleb.
--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html