Re: [Xen-devel] VT-d Posted-interrupt (PI) design for XEN

2015-03-15 Thread Wu, Feng


 -Original Message-
 From: Tian, Kevin
 Sent: Tuesday, March 10, 2015 10:22 AM
 To: Wu, Feng; xen-devel@lists.xen.org
 Cc: Jan Beulich; Zhang, Yang Z
 Subject: RE: VT-d Posted-interrupt (PI) design for XEN
 
  From: Wu, Feng
  Sent: Wednesday, March 04, 2015 9:30 PM
 
  VT-d Posted-interrupt (PI) design for XEN
 
  Background
  ==
  With the development of virtualization, there are more and more device
  assignment requirements. However, today when a VM is running with
  assigned devices (such as a NIC), external interrupt handling for the assigned
  devices always needs VMM intervention.
 
  VT-d Posted-interrupt is an enhanced method to handle interrupts
  in the virtualization environment. Interrupt posting is the process by
  which an interrupt request is recorded in a memory-resident
  posted-interrupt-descriptor structure by the root-complex, followed by
  an optional notification event issued to the CPU complex.
 
  With VT-d Posted-interrupt we can get the following advantages:
  - Directly delivery of external interrupts to running vCPUs without VMM
  intervention
 
 Directly - Direct
 
  - Decrease the interrupt migration complexity. On vCPU migration, software
  can atomically co-migrate all interrupts targeting the migrating vCPU.
 
 Could you elaborate on this benefit? I didn't see any discussion of migration
 anywhere in the proposal.
 
 
 
  Posted-interrupt Introduction
  
  There are two components to the Posted-interrupt architecture:
  Processor Support and Root-Complex Support
 
  - Processor Support
  Posted-interrupt processing is a feature by which a processor processes
  the virtual interrupts by recording them as pending on the virtual-APIC
  page.
 
  Posted-interrupt processing is enabled by setting the "process posted
  interrupts" VM-execution control. The processing is performed in response
  to the arrival of an interrupt with the posted-interrupt notification vector.
  In response to such an interrupt, the processor processes virtual interrupts
  recorded in a data structure called a posted-interrupt descriptor.
 
  For more information about APICv and CPU-side Posted-interrupt, please refer
  to Chapter 29, and Section 29.6, in the Intel SDM:
 
 http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-manual-325462.pdf
 
  - Root-Complex Support
  Interrupt posting is the process by which an interrupt request (from IOAPIC
  or MSI/MSIx capable sources) is recorded in a memory-resident
  posted-interrupt-descriptor structure by the root-complex, followed by
  an optional notification event issued to the CPU complex. The interrupt
  request arriving at the root-complex carries the identity of the interrupt
  request source and a 'remapping-index'. The remapping-index is used to
  look-up an entry from the memory-resident interrupt-remap-table. Unlike
  with interrupt-remapping, the interrupt-remap-table-entry for a
  posted-interrupt specifies a virtual-vector and a pointer to the posted-interrupt
  descriptor. The virtual-vector specifies the vector of the interrupt to be
  recorded in the posted-interrupt descriptor. The posted-interrupt descriptor
  hosts storage for the virtual-vectors and contains the attributes of the
  notification event (interrupt) to be issued to the CPU complex to inform
  CPU/software about pending interrupts recorded in the posted-interrupt
  descriptor.
 
  For more information about VT-d PI, please refer to
 
 http://www.intel.com/content/www/us/en/intelligent-systems/intel-technology/vt-directed-io-spec.html
 
 
  Design Overview
  ==
  In this design, we will cover the following items:
  1. Add a variable to control whether VT-d posted-interrupt is enabled.
  2. VT-d PI feature detection.
  3. Extend posted-interrupt descriptor structure to cover VT-d PI specific stuff.
  4. Extend IRTE structure to support VT-d PI.
  5. Introduce a new global vector which is used for waking up the HLT'ed vCPU.
 
 HLT'ed - blocked
 
  6. Update IRTE when guest modifies the interrupt configuration (MSI/MSIx
  configuration).
  7. Update posted-interrupt descriptor during vCPU scheduling (when the state
  of the vCPU transitions among RUNSTATE_running / RUNSTATE_blocked /
  RUNSTATE_runnable / RUNSTATE_offline).
  8. New boot command-line option for Xen, which lets the user control the
  VT-d PI feature.
  9. Multicast/broadcast and lowest priority interrupts consideration.
 
 
 Add a step on the notification handler, as you described in another mail.
 
 
  Implementation details
  ===
  - New variable to control VT-d PI
  Like the variable 'iommu_intremap' for interrupt remapping, it is very
  straightforward to add a new one, 'iommu_intpost', for posted-interrupt.
  'iommu_intpost' is set only when interrupt remapping and VT-d
  posted-interrupt are both enabled.
 
  - VT-d PI feature detection.
  Bit 59 in VT-d Capability Register is used to report 

Re: [Xen-devel] VT-d Posted-interrupt (PI) design for XEN

2015-03-15 Thread Wu, Feng


 -Original Message-
 From: Tian, Kevin
 Sent: Tuesday, March 10, 2015 10:01 AM
 To: Andrew Cooper; Tim Deegan; Wu, Feng
 Cc: Zhang, Yang Z; Jan Beulich; xen-devel@lists.xen.org
 Subject: RE: [Xen-devel] VT-d Posted-interrupt (PI) design for XEN
 
  From: Andrew Cooper [mailto:andrew.coop...@citrix.com]
  Sent: Monday, March 09, 2015 7:46 PM
 
  On 09/03/15 10:33, Tim Deegan wrote:
   At 02:03 + on 09 Mar (1425863009), Wu, Feng wrote:
  
   -Original Message-
   From: Tim Deegan [mailto:t...@xen.org]
   Sent: Friday, March 06, 2015 5:44 PM
   To: Wu, Feng
   Cc: Jan Beulich; Zhang, Yang Z; Tian, Kevin; xen-devel@lists.xen.org
   Subject: Re: [Xen-devel] VT-d Posted-interrupt (PI) design for XEN
  
   At 02:07 + on 06 Mar (1425604054), Wu, Feng wrote:
   From: Tim Deegan [mailto:t...@xen.org]
   But I don't understand why we would need a new global vector for
   RUNSTATE_blocked rather than suppressing the posted interrupts as
 you
   suggest for RUNSTATE_runnable.  (Or conversely why not use the new
   global vector for RUNSTATE_runnable too?)
   If we suppress the posted-interrupts when vCPU is blocked, it cannot
   be unblocked by the external interrupts, this is not correct.
   OK, I don't understand at all now. :)  When the posted interrupt is
   suppressed, what happens to the interrupt?
   When the posted interrupt is suppressed, VT-d engine will not issue
   notification events.
  
   If it's just dropped, then we can't use that for _any_ cases.
   We can suppress the posted-interrupt when vCPU is waiting in the
 runqueue
   (vCPU is in RUNSTATE_runnable state), it is not needed to send 
   notification
   event when vCPU is in this state, since when interrupt happens, the
  interrupt
   information are not _dropped_, instead, they are stored in PIR, and this
 will
   be synced to vIRR before VM-Entry.
   So you think you can use the same system for RUNSTATE_runnable as
   RUNSTATE_blocked?  That seems like a good idea.
  
   I'll leave the details (e.g. single global vector + queue vs any other
   way to wake the vcpu) to people who know the x86 irq code better than
   I do. :)
 
  From my reading the relevant section in the VT-d spec, to the best of my
  understanding:
 
  We only need the second vector if Xen wishes to be informed that an
  interrupt has been queued for a vcpu.  The spec suggests that, for one
  usecase, this information should affect scheduling decisions.
 
  If we do not wish to make scheduling alterations based on interrupt
  delivery, the extra vector can be ignored.
 
  If we do wish to make scheduling alterations, we will need to be able to
  uniquely identify a vcpu from a vector, which will involve allocating
  one vector per vcpu.
 
 
  If my understanding is correct, I would suggest that Xen opt for not
  getting notifications.  Interrupting one guest to indicate that another
  vcpu has been interrupted scales progressively worse with the number of
  running VMs, and there are existing usecases which have already
  exhausted the x86 vector space completely.
 
  It might be sensible to have the option available as a per-domain opt-in
  option.  A usecase such as device driver domain could easily want to
  deal with its interrupts ahead of running the domains it is servicing.
 
 
 IMO we don't need such an option. A blocked vCPU may not be woken up
 when losing a virtual interrupt notification, and if you look at the earlier
 reply to Jan it's not necessary to have one-vector-per-vcpu. It's just
 a global vector, which when sent to a specific pcpu, the handler will
 walk through blocked vcpus on that pcpu to decide which one should
 be woken up. So only one new vector is required.
 
 from Feng's design, the notification may be disabled in one scenario,
 i.e. when vcpu is in runnable state. That works if real-time is not
 considered since we know runnable vcpu is already unblocked. Later
 when considering real-time, this notification will be required too.

Thanks for your clarification, Kevin!

Thanks,
Feng

 
 Thanks
 Kevin

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] VT-d Posted-interrupt (PI) design for XEN

2015-03-15 Thread Wu, Feng


 -Original Message-
 From: Andrew Cooper [mailto:andrew.coop...@citrix.com]
 Sent: Monday, March 09, 2015 7:46 PM
 To: Tim Deegan; Wu, Feng
 Cc: Zhang, Yang Z; Tian, Kevin; Jan Beulich; xen-devel@lists.xen.org
 Subject: Re: [Xen-devel] VT-d Posted-interrupt (PI) design for XEN
 
 On 09/03/15 10:33, Tim Deegan wrote:
  At 02:03 + on 09 Mar (1425863009), Wu, Feng wrote:
 
  -Original Message-
  From: Tim Deegan [mailto:t...@xen.org]
  Sent: Friday, March 06, 2015 5:44 PM
  To: Wu, Feng
  Cc: Jan Beulich; Zhang, Yang Z; Tian, Kevin; xen-devel@lists.xen.org
  Subject: Re: [Xen-devel] VT-d Posted-interrupt (PI) design for XEN
 
  At 02:07 + on 06 Mar (1425604054), Wu, Feng wrote:
  From: Tim Deegan [mailto:t...@xen.org]
  But I don't understand why we would need a new global vector for
  RUNSTATE_blocked rather than suppressing the posted interrupts as you
  suggest for RUNSTATE_runnable.  (Or conversely why not use the new
  global vector for RUNSTATE_runnable too?)
  If we suppress the posted-interrupts when vCPU is blocked, it cannot
  be unblocked by the external interrupts, this is not correct.
  OK, I don't understand at all now. :)  When the posted interrupt is
  suppressed, what happens to the interrupt?
  When the posted interrupt is suppressed, VT-d engine will not issue
  notification events.
 
  If it's just dropped, then we can't use that for _any_ cases.
  We can suppress the posted-interrupt when vCPU is waiting in the runqueue
  (vCPU is in RUNSTATE_runnable state), it is not needed to send notification
  event when vCPU is in this state, since when interrupt happens, the
 interrupt
  information are not _dropped_, instead, they are stored in PIR, and this 
  will
  be synced to vIRR before VM-Entry.
  So you think you can use the same system for RUNSTATE_runnable as
  RUNSTATE_blocked?  That seems like a good idea.
 
  I'll leave the details (e.g. single global vector + queue vs any other
  way to wake the vcpu) to people who know the x86 irq code better than
  I do. :)
 
 From my reading the relevant section in the VT-d spec, to the best of my
 understanding:
 
 We only need the second vector if Xen wishes to be informed that an
 interrupt has been queued for a vcpu.  The spec suggests that, for one
 usecase, this information should affect scheduling decisions.
 
 If we do not wish to make scheduling alterations based on interrupt
 delivery, the extra vector can be ignored.

As I mentioned in the previous mail in this thread, the second vector is used to
wake up the blocked vCPU when an external interrupt arrives for the vCPU.

Thanks,
Feng

 
 If we do wish to make scheduling alterations, we will need to be able to
 uniquely identify a vcpu from a vector, which will involve allocating
 one vector per vcpu.
 
 
 If my understanding is correct, I would suggest that Xen opt for not
 getting notifications.  Interrupting one guest to indicate that another
 vcpu has been interrupted scales progressively worse with the number of
 running VMs, and there are existing usecases which have already
 exhausted the x86 vector space completely.
 
 It might be sensible to have the option available as a per-domain opt-in
 option.  A usecase such as device driver domain could easily want to
 deal with its interrupts ahead of running the domains it is servicing.
 
 ~Andrew




Re: [Xen-devel] VT-d Posted-interrupt (PI) design for XEN

2015-03-09 Thread Andrew Cooper
On 09/03/15 10:33, Tim Deegan wrote:
 At 02:03 + on 09 Mar (1425863009), Wu, Feng wrote:

 -Original Message-
 From: Tim Deegan [mailto:t...@xen.org]
 Sent: Friday, March 06, 2015 5:44 PM
 To: Wu, Feng
 Cc: Jan Beulich; Zhang, Yang Z; Tian, Kevin; xen-devel@lists.xen.org
 Subject: Re: [Xen-devel] VT-d Posted-interrupt (PI) design for XEN

 At 02:07 + on 06 Mar (1425604054), Wu, Feng wrote:
 From: Tim Deegan [mailto:t...@xen.org]
 But I don't understand why we would need a new global vector for
 RUNSTATE_blocked rather than suppressing the posted interrupts as you
 suggest for RUNSTATE_runnable.  (Or conversely why not use the new
 global vector for RUNSTATE_runnable too?)
 If we suppress the posted-interrupts when vCPU is blocked, it cannot
 be unblocked by the external interrupts, this is not correct.
 OK, I don't understand at all now. :)  When the posted interrupt is
 suppressed, what happens to the interrupt? 
 When the posted interrupt is suppressed, VT-d engine will not issue
 notification events.

 If it's just dropped, then we can't use that for _any_ cases. 
 We can suppress the posted-interrupt when vCPU is waiting in the runqueue
 (vCPU is in RUNSTATE_runnable state), it is not needed to send notification
 event when vCPU is in this state, since when interrupt happens, the interrupt
 information are not _dropped_, instead, they are stored in PIR, and this will
 be synced to vIRR before VM-Entry.
 So you think you can use the same system for RUNSTATE_runnable as
 RUNSTATE_blocked?  That seems like a good idea. 

 I'll leave the details (e.g. single global vector + queue vs any other
 way to wake the vcpu) to people who know the x86 irq code better than
 I do. :)

From my reading of the relevant section in the VT-d spec, to the best of my
understanding:

We only need the second vector if Xen wishes to be informed that an
interrupt has been queued for a vcpu.  The spec suggests that, for one
usecase, this information should affect scheduling decisions.

If we do not wish to make scheduling alterations based on interrupt
delivery, the extra vector can be ignored.

If we do wish to make scheduling alterations, we will need to be able to
uniquely identify a vcpu from a vector, which will involve allocating
one vector per vcpu.


If my understanding is correct, I would suggest that Xen opt for not
getting notifications.  Interrupting one guest to indicate that another
vcpu has been interrupted scales progressively worse with the number of
running VMs, and there are existing usecases which have already
exhausted the x86 vector space completely.

It might be sensible to have the option available as a per-domain opt-in
option.  A usecase such as device driver domain could easily want to
deal with its interrupts ahead of running the domains it is servicing.

~Andrew




Re: [Xen-devel] VT-d Posted-interrupt (PI) design for XEN

2015-03-09 Thread Tian, Kevin
 From: Andrew Cooper [mailto:andrew.coop...@citrix.com]
 Sent: Monday, March 09, 2015 7:46 PM
 
 On 09/03/15 10:33, Tim Deegan wrote:
  At 02:03 + on 09 Mar (1425863009), Wu, Feng wrote:
 
  -Original Message-
  From: Tim Deegan [mailto:t...@xen.org]
  Sent: Friday, March 06, 2015 5:44 PM
  To: Wu, Feng
  Cc: Jan Beulich; Zhang, Yang Z; Tian, Kevin; xen-devel@lists.xen.org
  Subject: Re: [Xen-devel] VT-d Posted-interrupt (PI) design for XEN
 
  At 02:07 + on 06 Mar (1425604054), Wu, Feng wrote:
  From: Tim Deegan [mailto:t...@xen.org]
  But I don't understand why we would need a new global vector for
  RUNSTATE_blocked rather than suppressing the posted interrupts as you
  suggest for RUNSTATE_runnable.  (Or conversely why not use the new
  global vector for RUNSTATE_runnable too?)
  If we suppress the posted-interrupts when vCPU is blocked, it cannot
  be unblocked by the external interrupts, this is not correct.
  OK, I don't understand at all now. :)  When the posted interrupt is
  suppressed, what happens to the interrupt?
  When the posted interrupt is suppressed, VT-d engine will not issue
  notification events.
 
  If it's just dropped, then we can't use that for _any_ cases.
  We can suppress the posted-interrupt when vCPU is waiting in the runqueue
  (vCPU is in RUNSTATE_runnable state), it is not needed to send notification
  event when vCPU is in this state, since when interrupt happens, the
 interrupt
  information are not _dropped_, instead, they are stored in PIR, and this 
  will
  be synced to vIRR before VM-Entry.
  So you think you can use the same system for RUNSTATE_runnable as
  RUNSTATE_blocked?  That seems like a good idea.
 
  I'll leave the details (e.g. single global vector + queue vs any other
  way to wake the vcpu) to people who know the x86 irq code better than
  I do. :)
 
 From my reading the relevant section in the VT-d spec, to the best of my
 understanding:
 
 We only need the second vector if Xen wishes to be informed that an
 interrupt has been queued for a vcpu.  The spec suggests that, for one
 usecase, this information should affect scheduling decisions.
 
 If we do not wish to make scheduling alterations based on interrupt
 delivery, the extra vector can be ignored.
 
 If we do wish to make scheduling alterations, we will need to be able to
 uniquely identify a vcpu from a vector, which will involve allocating
 one vector per vcpu.
 
 
 If my understanding is correct, I would suggest that Xen opt for not
 getting notifications.  Interrupting one guest to indicate that another
 vcpu has been interrupted scales progressively worse with the number of
 running VMs, and there are existing usecases which have already
 exhausted the x86 vector space completely.
 
 It might be sensible to have the option available as a per-domain opt-in
 option.  A usecase such as device driver domain could easily want to
 deal with its interrupts ahead of running the domains it is servicing.
 

IMO we don't need such an option. A blocked vCPU may not be woken up
when losing a virtual interrupt notification, and if you look at the earlier
reply to Jan it's not necessary to have one-vector-per-vcpu. It's just
a global vector, which when sent to a specific pcpu, the handler will
walk through blocked vcpus on that pcpu to decide which one should
be woken up. So only one new vector is required.

from Feng's design, the notification may be disabled in one scenario,
i.e. when vcpu is in runnable state. That works if real-time is not
considered since we know runnable vcpu is already unblocked. Later
when considering real-time, this notification will be required too.
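
The per-runstate policy discussed in this thread can be sketched as follows. This is only a minimal illustration, assuming a simplified descriptor that holds just the notification vector (NV) and suppress-notification (SN) fields; the struct, the function name and the vector numbers are hypothetical and are not taken from the Xen tree.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical vector numbers, chosen only for illustration. */
#define POSTED_INTR_VECTOR 0xf2u
#define PI_WAKEUP_VECTOR   0xf1u

enum runstate {
    RUNSTATE_running,
    RUNSTATE_runnable,
    RUNSTATE_blocked,
    RUNSTATE_offline
};

/* Simplified view of the descriptor fields touched on a runstate change. */
struct pi_ctl {
    uint8_t nv;  /* notification vector */
    bool    sn;  /* suppress notification */
};

void pi_update_for_runstate(struct pi_ctl *pi, enum runstate rs)
{
    assert(pi != NULL);

    switch (rs) {
    case RUNSTATE_running:
        /* Notifications are consumed directly by the CPU in non-root mode. */
        pi->nv = POSTED_INTR_VECTOR;
        pi->sn = false;
        break;
    case RUNSTATE_runnable:
        /* Interrupts still land in PIR; no notification event is needed. */
        pi->sn = true;
        break;
    case RUNSTATE_blocked:
        /* An ordinary vector, so the hypervisor handler can wake the vCPU. */
        pi->nv = PI_WAKEUP_VECTOR;
        pi->sn = false;
        break;
    default:
        /* RUNSTATE_offline is not discussed in the thread; left unchanged. */
        break;
    }
}
```

If the runnable case later needs notifications for real-time scheduling, as noted above, only the RUNSTATE_runnable branch of such a function would have to change.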

Thanks
Kevin



Re: [Xen-devel] VT-d Posted-interrupt (PI) design for XEN

2015-03-08 Thread Wu, Feng


 -Original Message-
 From: Tim Deegan [mailto:t...@xen.org]
 Sent: Friday, March 06, 2015 5:44 PM
 To: Wu, Feng
 Cc: Jan Beulich; Zhang, Yang Z; Tian, Kevin; xen-devel@lists.xen.org
 Subject: Re: [Xen-devel] VT-d Posted-interrupt (PI) design for XEN
 
 At 02:07 + on 06 Mar (1425604054), Wu, Feng wrote:
   From: Tim Deegan [mailto:t...@xen.org]
   But I don't understand why we would need a new global vector for
   RUNSTATE_blocked rather than suppressing the posted interrupts as you
   suggest for RUNSTATE_runnable.  (Or conversely why not use the new
   global vector for RUNSTATE_runnable too?)
 
  If we suppress the posted-interrupts when vCPU is blocked, it cannot
  be unblocked by the external interrupts, this is not correct.
 
 OK, I don't understand at all now. :)  When the posted interrupt is
 suppressed, what happens to the interrupt? 

When the posted interrupt is suppressed, VT-d engine will not issue
notification events.

 If it's just dropped, then we can't use that for _any_ cases. 

We can suppress the posted-interrupt when the vCPU is waiting in the runqueue
(vCPU is in RUNSTATE_runnable state). There is no need to send a notification
event when the vCPU is in this state: when an interrupt happens, the interrupt
information is not _dropped_; instead, it is stored in PIR, and this will
be synced to vIRR before VM-Entry.

 If it goes through the old path,
 via the vlapic, that should be enough to wake any HLT'ed vcpu.  It
 sounds like it might be a little slower, but not necessarily once
 you've had to add a new list of potentially-HLT'd-and-wakeable vcpus,
 especially with many idle vcpus.


When posted-interrupt is used, how would the interrupt go through the old path?

Thanks,
Feng


 
 Tim.



Re: [Xen-devel] VT-d Posted-interrupt (PI) design for XEN

2015-03-06 Thread Tim Deegan
At 02:07 + on 06 Mar (1425604054), Wu, Feng wrote:
  From: Tim Deegan [mailto:t...@xen.org]
  But I don't understand why we would need a new global vector for
  RUNSTATE_blocked rather than suppressing the posted interrupts as you
  suggest for RUNSTATE_runnable.  (Or conversely why not use the new
  global vector for RUNSTATE_runnable too?)
 
 If we suppress the posted-interrupts when vCPU is blocked, it cannot
 be unblocked by the external interrupts, this is not correct.

OK, I don't understand at all now. :)  When the posted interrupt is
suppressed, what happens to the interrupt?  If it's just dropped, then
we can't use that for _any_ cases.  If it goes through the old path,
via the vlapic, that should be enough to wake any HLT'ed vcpu.  It
sounds like it might be a little slower, but not necessarily once
you've had to add a new list of potentially-HLT'd-and-wakeable vcpus,
especially with many idle vcpus.

Tim.



Re: [Xen-devel] VT-d Posted-interrupt (PI) design for XEN

2015-03-05 Thread Wu, Feng


 -Original Message-
 From: Tim Deegan [mailto:t...@xen.org]
 Sent: Thursday, March 05, 2015 8:03 PM
 To: Jan Beulich
 Cc: Wu, Feng; Zhang, Yang Z; Tian, Kevin; xen-devel@lists.xen.org
 Subject: Re: [Xen-devel] VT-d Posted-interrupt (PI) design for XEN
 
 Hi,
 
 At 08:52 + on 05 Mar (1425541947), Jan Beulich wrote:
   On 05.03.15 at 09:29, feng...@intel.com wrote:
   From: Jan Beulich [mailto:jbeul...@suse.com]
   Sent: Thursday, March 05, 2015 3:13 PM
   And if it can know, why couldn't the handler for
   posted_intr_vector not know either (i.e. after introducing a specific
   handler for it in place of the currently used event_check_interrupt)?
  
   Coming back to the above scenario: vCPU1 is running on pCPU0 while vCPU0
   is blocked, and we still use posted_intr_vector for the blocked vCPU0. If
   vCPU1 is running in non-root mode and external interrupts happen for it, the
   notification event will be handled by CPU hardware (in non-root mode)
   automatically, so we cannot get any control in the handler for
   posted_intr_vector.
 
  And how would this be different with your separate new vector? I
  feel I'm missing something, but I'm afraid I have to rely on you to
  point out what it is. Just again - please explain what it is you need
  two global vectors for that can't be done with one.
 
 I think the relevant detail is that the posted_intr_vector is consumed
 by the CPU's posted-interrupt logic and doesn't cause an exit to Xen.
 

Exactly!

 But I don't understand why we would need a new global vector for
 RUNSTATE_blocked rather than suppressing the posted interrupts as you
 suggest for RUNSTATE_runnable.  (Or conversely why not use the new
 global vector for RUNSTATE_runnable too?)

If we suppress the posted-interrupts when the vCPU is blocked, it cannot
be unblocked by external interrupts, which is not correct.

Thanks,
Feng

 
 Cheers,
 
 Tim.



Re: [Xen-devel] VT-d Posted-interrupt (PI) design for XEN

2015-03-05 Thread Wu, Feng


 -Original Message-
 From: Jan Beulich [mailto:jbeul...@suse.com]
 Sent: Thursday, March 05, 2015 6:15 PM
 To: Wu, Feng
 Cc: Tian, Kevin; Zhang, Yang Z; xen-devel@lists.xen.org
 Subject: RE: VT-d Posted-interrupt (PI) design for XEN
 
  On 05.03.15 at 10:07, feng...@intel.com wrote:
  From: Jan Beulich [mailto:jbeul...@suse.com]
  Sent: Thursday, March 05, 2015 4:52 PM
  And how would this be different with your separate new vector? I
  feel I'm missing something, but I'm afraid I have to rely on you to
  point out what it is. Just again - please explain what it is you need
  two global vectors for that can't be done with one.
 
  Still using the above scenario: vCPU1 is running in non-root mode
  and external interrupts happen for vCPU0 (which is HLT'ed).
 
  If 'posted_intr_vector' is used for vCPU0 and also for other vCPUs,
  including vCPU1, the VT-d engine will issue the notification event using
  this global vector, and this SPECIAL vector will be handled
  this way (from Section 29.6 in the Intel SDM:
 
 http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-manual-325462.pdf)
 
  1. The local APIC is acknowledged; this provides the processor core with an
  interrupt vector, called here the physical vector.
  2. If the physical vector equals the posted-interrupt notification vector,
  the logical processor continues to the next step. Otherwise, a VM exit occurs
  as it would normally due to an external interrupt; the vector is saved in the
  VM-exit interruption-information field.
  3. The processor clears the outstanding-notification bit in the
  posted-interrupt descriptor. This is done atomically so as to leave the
  remainder of the descriptor unmodified (e.g., with a locked AND operation).
  4. The processor writes zero to the EOI register in the local APIC; this
  dismisses the interrupt with the posted-interrupt notification vector from
  the local APIC.
  5. The logical processor performs a logical-OR of PIR into VIRR and clears
  PIR. No other agent can read or write a PIR bit (or group of bits) between
  the time it is read (to determine what to OR into VIRR) and when it is cleared.
  6. The logical processor sets RVI to be the maximum of the old value of RVI
  and the highest index of all bits that were set in PIR; if no bit was set in
  PIR, RVI is left unmodified.
  7. The logical processor evaluates pending virtual interrupts as described
  in Section 29.2.1.
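
Steps 3, 5 and 6 of the sequence above can be modelled in software as a rough illustration. This is a single-threaded sketch only: real hardware performs these updates atomically and without any software involvement, which is the point being made in this mail. The struct and function names are hypothetical, and the ON bit is modelled as a separate word for simplicity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PI_WORDS 8  /* 256 vectors, 32 bits per word */

/* Hypothetical model of the CPU-visible part of the descriptor. */
struct pid_cpu_view {
    uint32_t pir[PI_WORDS];  /* posted-interrupt requests, one bit per vector */
    uint32_t on;             /* outstanding-notification bit */
};

/* Hypothetical model of the virtual-APIC state touched by the sequence. */
struct vapic_model {
    uint32_t virr[PI_WORDS]; /* virtual interrupt-request register */
    uint8_t  rvi;            /* requesting virtual interrupt */
};

void pi_process_model(struct pid_cpu_view *pi, struct vapic_model *va)
{
    int highest = -1;

    assert(pi != NULL && va != NULL);

    pi->on = 0;                          /* step 3: clear ON */

    for (int w = 0; w < PI_WORDS; w++) {
        uint32_t bits = pi->pir[w];

        pi->pir[w] = 0;                  /* step 5: clear PIR ... */
        va->virr[w] |= bits;             /* ... and OR its old value into VIRR */

        /* Track the highest vector seen; later words hold higher vectors. */
        for (int b = 31; b >= 0; b--) {
            if (bits & (UINT32_C(1) << b)) {
                highest = w * 32 + b;
                break;
            }
        }
    }

    /* Step 6: RVI becomes max(old RVI, highest PIR bit), if any bit was set. */
    if (highest > (int)va->rvi)
        va->rvi = (uint8_t)highest;
}
```

Nothing in this sequence produces a VM exit, so a handler registered for the notification vector never runs while a guest is executing in non-root mode.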
 
  This is totally handled by CPU hardware, so we cannot get control in the
  handler for posted_intr_vector.
 
  OTOH, if using 'pi_wakeup_vector' for vCPU0, the VT-d engine will issue the
  notification event using this new vector. Since this new vector is not a
  SPECIAL one to the CPU, it is just a normal vector: the CPU simply receives a
  normal external interrupt, and we get control in the handler of this new
  vector. In this case, the hypervisor can do something in it, such as waking
  up the HLT'ed vCPU.
 
  Hope this can clarify your confusion.
 
 Thanks, yes - it is this vector-is-special-to-CPU that makes a second
 vector necessary. Please make sure this is being properly explained in
 the description and/or code comments of the patches to come (of
 course without need to quote the SDM, but a reference to the
 respective section may be useful).

Sure, I will add the description later!

So things are a little clearer now; could you please take some time to
review this design again and give more comments? Thanks a lot!!

Thanks,
Feng

 
 Jan




Re: [Xen-devel] VT-d Posted-interrupt (PI) design for XEN

2015-03-05 Thread Wu, Feng


 -Original Message-
 From: Jan Beulich [mailto:jbeul...@suse.com]
 Sent: Thursday, March 05, 2015 3:13 PM
 To: Wu, Feng
 Cc: Tian, Kevin; Zhang, Yang Z; xen-devel@lists.xen.org
 Subject: RE: VT-d Posted-interrupt (PI) design for XEN
 
  On 05.03.15 at 06:04, feng...@intel.com wrote:
  From: Jan Beulich [mailto:jbeul...@suse.com]
  Sent: Wednesday, March 04, 2015 11:19 PM
   On 04.03.15 at 14:30, feng...@intel.com wrote:
   - Introduce a new global vector which is used to wake up the HLT'ed vCPU.
   Currently, there is a global vector 'posted_intr_vector', which is used as
   the global notification vector for all vCPUs in the system. This vector is
   stored in the VMCS and the CPU considers it a special vector, using it to
   notify the related pCPU when an interrupt is recorded in the
   posted-interrupt descriptor.
  
   After having VT-d PI, the VT-d engine can issue a notification event when
   the assigned devices issue interrupts. We need to add a new global vector to
   wake up the HLT'ed vCPU; please refer to the following scenario for the
   usage of this new global vector:
  
   1. vCPU0 is running on pCPU0
   2. vCPU0 is HLT'ed and vCPU1 is currently running on pCPU0
   3. An external interrupt from an assigned device occurs for vCPU0. If we
   still use 'posted_intr_vector' as the notification vector for vCPU0, the
   notification event for vCPU0 (the event will go to pCPU0) will be consumed
   by vCPU1 incorrectly. The worst case is that vCPU0 will never be woken up
   again since the wakeup event for it is always consumed by other vCPUs
   incorrectly. So we need to introduce another global vector, named
   'pi_wakeup_vector', to wake up the HLT'ed vCPU.
 
  I'm afraid you describe a particular scenario here, but I don't see
  how this is related to the introduction of another global vector:
  Either the current (global) vector is sufficient, or another global
  vector also can't solve your problem. I'm sure I'm missing something
  here, so please be explicit.
 
 
  In fact, the new global vector is used for the above scenario. Let me
  explain this a bit more:
 
  After having VT-d PI, when an external interrupt from an assigned device
  happens, here is the hardware processing flow:
 
  1. Interrupts happen.
  2. Find the associated IRTE.
  3. Find the destination vCPU from the IRTE (from the posted-interrupt
  descriptor address).
  4. Sync the interrupt (stored in the IRTE as 'virtual vector') to the PIR
  field in the posted-interrupt descriptor.
  5. If needed (please refer to the VT-d spec for the condition for issuing a
  notification event),
  issue a notification event to the destination CPU which is stored in the
  posted-interrupt descriptor as 'NDST'
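
The flow above can be sketched as a small software model. It is an illustration only: the struct layouts and names are hypothetical, and the notification condition used here (ON clear and SN clear) is a simplification of the rule in the VT-d spec, which remains the authoritative reference.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the descriptor fields the VT-d engine uses. */
struct pid_vtd_view {
    uint32_t pir[8];  /* posted-interrupt requests, one bit per vector */
    bool     on;      /* outstanding notification */
    bool     sn;      /* suppress notification */
    uint8_t  nv;      /* notification vector */
    uint32_t ndst;    /* notification destination (physical CPU) */
};

/* Hypothetical model of a posted-format IRTE. */
struct irte_model {
    uint8_t virtual_vector;
    struct pid_vtd_view *pi_desc;
};

/* What the engine would send to the CPU complex, if anything. */
struct notification {
    bool     sent;
    uint32_t dest;
    uint8_t  vector;
};

struct notification vtd_post_model(const struct irte_model *irte)
{
    struct notification n = { false, 0, 0 };
    struct pid_vtd_view *pi;
    uint8_t v;

    assert(irte != NULL && irte->pi_desc != NULL);

    pi = irte->pi_desc;                           /* step 3: IRTE -> descriptor */
    v = irte->virtual_vector;
    pi->pir[v / 32] |= UINT32_C(1) << (v % 32);   /* step 4: record in PIR */

    /* Step 5 (simplified): notify only if none is outstanding or suppressed. */
    if (!pi->on && !pi->sn) {
        pi->on = true;
        n.sent = true;
        n.dest = pi->ndst;
        n.vector = pi->nv;
    }
    return n;
}
```

Note that the notification carries whatever vector is in the descriptor's NV field and goes to whatever CPU is in NDST; the engine does not know which vCPU is currently running there.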
 
  Back to the above scenario:
  1. vCPU0 is running on pCPU0, and the 'NDST' field of vCPU0's
  posted-interrupt descriptor is pCPU0.
  2. vCPU0 is HLT'ed and vCPU1 is currently running on pCPU0.
  3. An external interrupt from an assigned device happens. The destination of
  this interrupt is determined by the above flow (IRTE -> posted-interrupt
  descriptor address/vCPU -> notification event to 'NDST').
  If this external interrupt is for vCPU0, the notification event will be
  delivered to pCPU0, since the 'NDST' field of vCPU0's posted-interrupt
  descriptor is pCPU0. If we use the current (global) vector for the
  notification event for vCPU0 in the above case, since the current global
  vector (notification vector) is a special vector to the CPU, vCPU1 will
  consume it while vCPU1 is running on pCPU0, so we fail to wake up the
  HLT'ed vCPU0.
 
  Please refer to Section 29.6 in the Intel SDM about how the CPU handles this
  particular vector:
 
  http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-manual-325462.pdf
 
  After introducing a new global vector named 'pi_wakeup_vector', before the
  vCPU is HLT'ed we set the 'NV' field (Notification Vector) in the vCPU's
  posted-interrupt descriptor to 'pi_wakeup_vector'. This is a normal vector
  to the CPU, which does nothing special for it (unlike the current global
  vector). In the handler of this vector, we can wake up the HLT'ed vCPU.
 
 So suppose you have more than one vCPU which most recently ran on
 pCPU0 - how will the handler for the new vector know which of the
 vCPU-s it should kick? 

Oh, sorry, I thought I had described how to wake up the HLT'ed vCPU in this
design, but it seems I missed it. Here it is.

1. Define a per-cpu list 'blocked_vcpu_on_cpu_lock', which stores the vCPUs
blocked on that pCPU.
2. When a vCPU's state changes to RUNSTATE_blocked, insert the vCPU
into the per-cpu list belonging to the pCPU it was running on.
3. When the vCPU is unblocked, remove it from the related pCPU list.

In the handler of 'pi_wakeup_vector', we do:
1. Get the physical CPU.
2. Iterate over the list 'blocked_vcpu_on_cpu_lock' of the current pCPU; for
each vCPU whose 'ON' bit is set, unblock that vCPU.
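
The list handling and the wakeup handler described above could look roughly like the following. This is only a minimal sketch, not the proposed patch: the struct layout, NR_PCPUS, blocked_list, vcpu_block(), vcpu_unblock() and pi_wakeup_handler() are hypothetical names chosen for illustration, and the locking a real hypervisor would need around the per-pCPU list is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_PCPUS 4   /* hypothetical number of physical CPUs */

/* Simplified stand-in for a vCPU; only what the wakeup path needs. */
struct vcpu {
    bool on;            /* 'ON' bit of this vCPU's posted-interrupt descriptor */
    bool blocked;       /* true while the vCPU is in RUNSTATE_blocked */
    struct vcpu *next;  /* link in the per-pCPU blocked list */
};

/* One list head per pCPU, standing in for 'blocked_vcpu_on_cpu_lock'. */
static struct vcpu *blocked_list[NR_PCPUS];

/* The vCPU enters RUNSTATE_blocked on 'pcpu': add it to that pCPU's list. */
static void vcpu_block(struct vcpu *v, unsigned int pcpu)
{
    assert(pcpu < NR_PCPUS);
    v->blocked = true;
    v->next = blocked_list[pcpu];
    blocked_list[pcpu] = v;
}

/* The vCPU is unblocked: unlink it from the pCPU's list. */
static void vcpu_unblock(struct vcpu *v, unsigned int pcpu)
{
    struct vcpu **pp;

    assert(pcpu < NR_PCPUS);
    for (pp = &blocked_list[pcpu]; *pp != NULL; pp = &(*pp)->next) {
        if (*pp == v) {
            *pp = v->next;
            v->next = NULL;
            break;
        }
    }
    v->blocked = false;
}

/*
 * Handler for 'pi_wakeup_vector' on 'pcpu': walk the pCPU's blocked list
 * and unblock every vCPU whose 'ON' bit is set. Returns how many vCPUs
 * were woken.
 */
static unsigned int pi_wakeup_handler(unsigned int pcpu)
{
    unsigned int woken = 0;
    struct vcpu *v = blocked_list[pcpu];

    while (v != NULL) {
        struct vcpu *next = v->next;  /* saved first: unblocking unlinks v */

        if (v->on) {
            vcpu_unblock(v, pcpu);
            woken++;
        }
        v = next;
    }
    return woken;
}
```

Because the handler checks 'ON' per vCPU, several vCPUs blocked on the same pCPU can share the one wakeup vector.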

Re: [Xen-devel] VT-d Posted-interrupt (PI) design for XEN

2015-03-05 Thread Jan Beulich
 On 05.03.15 at 10:07, feng...@intel.com wrote:
 From: Jan Beulich [mailto:jbeul...@suse.com]
 Sent: Thursday, March 05, 2015 4:52 PM
 And how would this be different with your separate new vector? I
 feel I'm missing something, but I'm afraid I have to rely on you to
 point out what it is. Just again - please explain what it is you need
 two global vectors for that can't be done with one.
 
 Still using the above scenario, suppose vCPU1 is running in non-root mode
 and external interrupts arrive for vCPU0 (which is HLT'ed).
 
 If 'posted_intr_vector' is used for vCPU0 and also for other vCPUs,
 including vCPU1, the VT-d engine will issue the notification event using
 this global vector, and this SPECIAL vector will be handled
 this way (from Section 29.6 in the Intel SDM:
 http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-manual-325462.pdf):
 
 1. The local APIC is acknowledged; this provides the processor core with an 
 interrupt vector, called here the
 physical vector.
 2. If the physical vector equals the posted-interrupt notification vector, 
 the logical processor continues to the next
 step. Otherwise, a VM exit occurs as it would normally due to an external 
 interrupt; the vector is saved in the
 VM-exit interruption-information field.
 3. The processor clears the outstanding-notification bit in the 
 posted-interrupt descriptor. This is done atomically
 so as to leave the remainder of the descriptor unmodified (e.g., with a 
 locked AND operation).
 4. The processor writes zero to the EOI register in the local APIC; this 
 dismisses the interrupt with the posted-interrupt
 notification vector from the local APIC.
 5. The logical processor performs a logical-OR of PIR into VIRR and clears 
 PIR. No other agent can read or write a
 PIR bit (or group of bits) between the time it is read (to determine what to 
 OR into VIRR) and when it is cleared.
 6. The logical processor sets RVI to be the maximum of the old value of RVI 
 and the highest index of all bits that
 were set in PIR; if no bit was set in PIR, RVI is left unmodified.
 7. The logical processor evaluates pending virtual interrupts as described 
 in Section 29.2.1.
 
 This is totally handled by CPU hardware, so we cannot get control in the 
 handler for posted_intr_vector.
 
 OTOH, if 'pi_wakeup_vector' is used for vCPU0, the VT-d engine will issue
 the notification event using this new vector. Since this new vector is not
 SPECIAL to the CPU, the CPU just receives a normal external interrupt, and
 we get control in the handler of this new vector. There the hypervisor can
 do something, such as wake up the HLT'ed vCPU.
 
 Hope this can clarify your confusion.

Thanks, yes - it is this vector-is-special-to-CPU that makes a second
vector necessary. Please make sure this is being properly explained in
the description and/or code comments of the patches to come (of
course without need to quote the SDM, but a reference to the
respective section may be useful).

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] VT-d Posted-interrupt (PI) design for XEN

2015-03-04 Thread Jan Beulich
 On 04.03.15 at 14:30, feng...@intel.com wrote:
 - Introduce a new global vector which is used to wake up the HLT'ed vCPU.
 Currently, there is a global vector 'posted_intr_vector', which is used as 
 the
 global notification vector for all vCPUs in the system. This vector is 
 stored in
 VMCS and CPU considers it as a special vector, uses it to notify the related
 pCPU when an interrupt is recorded in the posted-interrupt descriptor.
 
 With VT-d PI, the VT-d engine can issue a notification event when the
 assigned devices issue interrupts. We need to add a new global vector to
 wake up the HLT'ed vCPU; please refer to the following scenario for the
 usage of this new global vector:
 
 1. vCPU0 is running on pCPU0
 2. vCPU0 is HLT'ed and vCPU1 is currently running on pCPU0
 3. An external interrupt from an assigned device occurs for vCPU0. If we
 still use 'posted_intr_vector' as the notification vector for vCPU0, the
 notification event for vCPU0 (the event will go to pCPU0) will be consumed
 by vCPU1 incorrectly. The worst case is that vCPU0 will never be woken up
 again since the wakeup event for it is always consumed by other vCPUs
 incorrectly. So we need to introduce another global vector, named
 'pi_wakeup_vector',
 to wake up the HLT'ed vCPU.

I'm afraid you describe a particular scenario here, but I don't see
how this is related to the introduction of another global vector:
Either the current (global) vector is sufficient, or another global
vector also can't solve your problem. I'm sure I'm missing something
here, so please be explicit.

 - Update posted-interrupt descriptor during vCPU scheduling
 The basic idea here is:
 1. When vCPU's state is RUNSTATE_running,
 - Set 'NV' to 'posted_intr_vector'.
 - Clear 'SN' to accept posted-interrupts.
 - Set 'NDST' to the pCPU on which the vCPU will be running.
[...]

This is pretty hard to read without knowing what the abbreviations
actually stand for, and suggesting to hunt for them in the spec isn't
very reader friendly either. Please explain these fields, at the very
least by way of comments on the structure fields presented earlier.

 On the Xen side, what is your opinion about supporting lowest-priority
 interrupts for VT-d PI?

I certainly think (as with every other virtualized piece of hardware)
that hardware behavior should be emulated as closely as possible.
I.e. yes, we should have it eventually. As to the two stage approach
mentioned for KVM - I've grown reservations against Intel people
making promises towards future implementation of something, i.e.
I'm kind of hesitant to agree to such an implementation model. Yet
you're to contribute the patches, and I'm surely not planning to veto
a stage-1-only implementation as it would be an improvement anyway.

Jan




Re: [Xen-devel] VT-d Posted-interrupt (PI) design for XEN

2015-03-04 Thread Andrew Cooper
On 04/03/15 13:30, Wu, Feng wrote:
 VT-d Posted-interrupt (PI) design for XEN

Thank you very much for this!


 Background
 ==
 With the development of virtualization, there are more and more device
 assignment requirements. However, today when a VM is running with
 assigned devices (such as, NIC), external interrupt handling for the assigned
 devices always needs VMM intervention.

 VT-d Posted-interrupt is a more enhanced method to handle interrupts
 in the virtualization environment. Interrupt posting is the process by
 which an interrupt request is recorded in a memory-resident
 posted-interrupt-descriptor structure by the root-complex, followed by
 an optional notification event issued to the CPU complex.

 With VT-d Posted-interrupt we can get the following advantages:
 - Directly delivery of external interrupts to running vCPUs without VMM
 intervention
 - Decease the interrupt migration complexity. On vCPU migration, software
 can atomically co-migrate all interrupts targeting the migrating vCPU.

I presume you mean Decrease ?

Decease means something quite different.



 Posted-interrupt Introduction
 
 There are two components to the Posted-interrupt architecture:
 Processor Support and Root-Complex Support

 - Processor Support
 Posted-interrupt processing is a feature by which a processor processes
 the virtual interrupts by recording them as pending on the virtual-APIC
 page.

 Posted-interrupt processing is enabled by setting the process posted
 interrupts VM-execution control. The processing is performed in response
 to the arrival of an interrupt with the posted-interrupt notification vector.
 In response to such an interrupt, the processor processes virtual interrupts
 recorded in a data structure called a posted-interrupt descriptor.

 More information about APICv and CPU-side Posted-interrupt, please refer
 to Chapter 29, and Section 29.6 in the Intel SDM:
 http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-manual-325462.pdf

 - Root-Complex Support
 Interrupt posting is the process by which an interrupt request (from IOAPIC
 or MSI/MSIx capable sources) is recorded in a memory-resident
 posted-interrupt-descriptor structure by the root-complex, followed by
 an optional notification event issued to the CPU complex. The interrupt
 request arriving at the root-complex carry the identity of the interrupt
 request source and a 'remapping-index'. The remapping-index is used to
 look-up an entry from the memory-resident interrupt-remap-table. Unlike
 with interrupt-remapping, the interrupt-remap-table-entry for a posted-
 interrupt, specifies a virtual-vector and a pointer to the posted-interrupt
 descriptor. The virtual-vector specifies the vector of the interrupt to be
 recorded in the posted-interrupt descriptor. The posted-interrupt descriptor
 hosts storage for the virtual-vectors and contains the attributes of the
 notification event (interrupt) to be issued to the CPU complex to inform
 CPU/software about pending interrupts recorded in the posted-interrupt
 descriptor.

 More information about VT-d PI, please refer to
 http://www.intel.com/content/www/us/en/intelligent-systems/intel-technology/vt-directed-io-spec.html


 Design Overview
 ==
 In this design, we will cover the following items:
 1. Add a variant to control whether enable VT-d posted-interrupt or not.
 2. VT-d PI feature detection.
 3. Extend posted-interrupt descriptor structure to cover VT-d PI specific 
 stuff.
 4. Extend IRTE structure to support VT-d PI.
 5. Introduce a new global vector which is used for waking up the HLT'ed vCPU.
 6. Update IRTE when guest modifies the interrupt configuration (MSI/MSIx 
 configuration).
 7. Update posted-interrupt descriptor during vCPU scheduling (when the state
 of the vCPU is transmitted among RUNSTATE_running / RUNSTATE_blocked/
 RUNSTATE_runnable / RUNSTATE_offline).
 8. New boot command line for Xen, which controls VT-d PI feature by user.
 9. Multicast/broadcast and lowest priority interrupts consideration.


 Implementation details
 ===
 - New variant to control VT-d PI

I know what you are trying to say, but New variant does not express
what you mean.

A new control relating to VT-d PI perhaps?

 Like variant 'iommu_intremap' for interrupt remapping, it is very 
 straightforward
 to add a new one 'iommu_intpost' for posted-interrupt. 'iommu_intpost' is set
 only when interrupt remapping and VT-d posted-interrupt are both enabled.

I would avoid mixing names such as PI and intpost.  If anything, it
should be iommu_postint to keep the naming consistent.  (Here and
elsewhere).


 - VT-d PI feature detection.
 Bit 59 in VT-d Capability Register is used to report VT-d Posted-interrupt 
 support.
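
As an illustration of the detection step and of the rule that 'iommu_intpost' is only set when interrupt remapping and VT-d PI are both available, a check could be sketched as follows. The helper names and the 'user_wants_pi' parameter (standing in for the boot option mentioned in the overview) are hypothetical; only the bit position comes from the text above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit 59 of the VT-d Capability Register reports posted-interrupt support. */
#define VTD_CAP_PI_BIT 59

/* Returns true if the given Capability Register value advertises VT-d PI. */
static bool vtd_cap_has_pi(uint64_t cap)
{
    return ((cap >> VTD_CAP_PI_BIT) & 1) != 0;
}

/*
 * Decide the value of the posted-interrupt control: it requires interrupt
 * remapping to be enabled, the user not to have disabled the feature, and
 * the hardware to report support.
 */
static bool compute_intpost(bool intremap_enabled, bool user_wants_pi,
                            uint64_t cap)
{
    return intremap_enabled && user_wants_pi && vtd_cap_has_pi(cap);
}
```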

 - Extend posted-interrupt descriptor structure to cover VT-d PI specific 
 stuff.
 Here is the new structure for posted-interrupt descriptor:

 struct pi_desc {
  

Re: [Xen-devel] VT-d Posted-interrupt (PI) design for XEN

2015-03-04 Thread Wu, Feng


 -Original Message-
 From: Andrew Cooper [mailto:andrew.coop...@citrix.com]
 Sent: Thursday, March 05, 2015 2:48 AM
 To: Wu, Feng; xen-devel@lists.xen.org
 Cc: Zhang, Yang Z; Tian, Kevin; Jan Beulich
 Subject: Re: [Xen-devel] VT-d Posted-interrupt (PI) design for XEN
 
 On 04/03/15 13:30, Wu, Feng wrote:
  VT-d Posted-interrupt (PI) design for XEN
 
 Thank you very much for this!
 
 
  Background
  ==
  With the development of virtualization, there are more and more device
  assignment requirements. However, today when a VM is running with
  assigned devices (such as, NIC), external interrupt handling for the 
  assigned
  devices always needs VMM intervention.
 
  VT-d Posted-interrupt is a more enhanced method to handle interrupts
  in the virtualization environment. Interrupt posting is the process by
  which an interrupt request is recorded in a memory-resident
  posted-interrupt-descriptor structure by the root-complex, followed by
  an optional notification event issued to the CPU complex.
 
  With VT-d Posted-interrupt we can get the following advantages:
  - Directly delivery of external interrupts to running vCPUs without VMM
  intervention
  - Decease the interrupt migration complexity. On vCPU migration, software
  can atomically co-migrate all interrupts targeting the migrating vCPU.
 
 I presume you mean Decrease ?

Yes!

 
 Decease means something quite different.

Sorry for the typo. 

 
 
 
  Posted-interrupt Introduction
  
  There are two components to the Posted-interrupt architecture:
  Processor Support and Root-Complex Support
 
  - Processor Support
  Posted-interrupt processing is a feature by which a processor processes
  the virtual interrupts by recording them as pending on the virtual-APIC
  page.
 
  Posted-interrupt processing is enabled by setting the process posted
  interrupts VM-execution control. The processing is performed in response
  to the arrival of an interrupt with the posted-interrupt notification 
  vector.
  In response to such an interrupt, the processor processes virtual interrupts
  recorded in a data structure called a posted-interrupt descriptor.
 
  More information about APICv and CPU-side Posted-interrupt, please refer
  to Chapter 29, and Section 29.6 in the Intel SDM:
 
 http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-manual-325462.pdf
 
  - Root-Complex Support
  Interrupt posting is the process by which an interrupt request (from IOAPIC
  or MSI/MSIx capable sources) is recorded in a memory-resident
  posted-interrupt-descriptor structure by the root-complex, followed by
  an optional notification event issued to the CPU complex. The interrupt
  request arriving at the root-complex carry the identity of the interrupt
  request source and a 'remapping-index'. The remapping-index is used to
  look-up an entry from the memory-resident interrupt-remap-table. Unlike
  with interrupt-remapping, the interrupt-remap-table-entry for a posted-
  interrupt, specifies a virtual-vector and a pointer to the posted-interrupt
  descriptor. The virtual-vector specifies the vector of the interrupt to be
  recorded in the posted-interrupt descriptor. The posted-interrupt descriptor
  hosts storage for the virtual-vectors and contains the attributes of the
  notification event (interrupt) to be issued to the CPU complex to inform
  CPU/software about pending interrupts recorded in the posted-interrupt
  descriptor.
 
  More information about VT-d PI, please refer to
 
 http://www.intel.com/content/www/us/en/intelligent-systems/intel-technology/vt-directed-io-spec.html
 
 
  Design Overview
  ==
  In this design, we will cover the following items:
  1. Add a variant to control whether enable VT-d posted-interrupt or not.
  2. VT-d PI feature detection.
  3. Extend posted-interrupt descriptor structure to cover VT-d PI specific 
  stuff.
  4. Extend IRTE structure to support VT-d PI.
  5. Introduce a new global vector which is used for waking up the HLT'ed 
  vCPU.
  6. Update IRTE when guest modifies the interrupt configuration (MSI/MSIx
 configuration).
  7. Update posted-interrupt descriptor during vCPU scheduling (when the
 state
  of the vCPU is transmitted among RUNSTATE_running / RUNSTATE_blocked/
  RUNSTATE_runnable / RUNSTATE_offline).
  8. New boot command line for Xen, which controls VT-d PI feature by user.
  9. Multicast/broadcast and lowest priority interrupts consideration.
 
 
  Implementation details
  ===
  - New variant to control VT-d PI
 
 I know what you are trying to say, but New variant does not express
 what you mean.
 
 A new control relating to VT-d PI perhaps?
 
  Like variant 'iommu_intremap' for interrupt remapping, it is very
 straightforward
  to add a new one 'iommu_intpost' for posted-interrupt. 'iommu_intpost' is
 set
  only when interrupt remapping and VT-d posted-interrupt are both enabled.
 
 I would

Re: [Xen-devel] VT-d Posted-interrupt (PI) design for XEN

2015-03-04 Thread Jan Beulich
 On 05.03.15 at 06:04, feng...@intel.com wrote:
 From: Jan Beulich [mailto:jbeul...@suse.com]
 Sent: Wednesday, March 04, 2015 11:19 PM
  On 04.03.15 at 14:30, feng...@intel.com wrote:
  - Introduce a new global vector which is used to wake up the HLT'ed vCPU.
  Currently, there is a global vector 'posted_intr_vector', which is used as
  the
  global notification vector for all vCPUs in the system. This vector is
  stored in
  VMCS and CPU considers it as a special vector, uses it to notify the 
  related
  pCPU when an interrupt is recorded in the posted-interrupt descriptor.
 
  With VT-d PI, the VT-d engine can issue a notification event when the
  assigned devices issue interrupts. We need to add a new global vector to
  wake up the HLT'ed vCPU; please refer to the following scenario for the
  usage of this new global vector:
 
  1. vCPU0 is running on pCPU0
  2. vCPU0 is HLT'ed and vCPU1 is currently running on pCPU0
  3. An external interrupt from an assigned device occurs for vCPU0. If we
  still use 'posted_intr_vector' as the notification vector for vCPU0, the
  notification event for vCPU0 (the event will go to pCPU0) will be consumed
  by vCPU1 incorrectly. The worst case is that vCPU0 will never be woken up
  again since the wakeup event for it is always consumed by other vCPUs
  incorrectly. So we need to introduce another global vector, named
  'pi_wakeup_vector',
  to wake up the HLT'ed vCPU.
 
 I'm afraid you describe a particular scenario here, but I don't see
 how this is related to the introduction of another global vector:
 Either the current (global) vector is sufficient, or another global
 vector also can't solve your problem. I'm sure I'm missing something
 here, so please be explicit.
 
 
 In fact, the new global vector is used for the above scenario. Let me
 explain this a bit more:
 
 With VT-d PI, when an external interrupt from an assigned device
 arrives, the hardware processes it as follows:
 
 1. The interrupt arrives.
 2. Find the associated IRTE.
 3. Find the destination vCPU from the IRTE (via the posted-interrupt
 descriptor address).
 4. Sync the interrupt (stored in the IRTE as the 'virtual vector') to the
 PIR field in the posted-interrupt descriptor.
 5. If needed (please refer to the VT-d spec for the conditions for issuing
 a notification event), issue a notification event to the destination CPU,
 which is stored in the posted-interrupt descriptor as 'NDST'.
 
 Back to the above scenario:
 1. vCPU0 is running on pCPU0, and the 'NDST' field of vCPU0's
 posted-interrupt descriptor is pCPU0.
 2. vCPU0 is HLT'ed and vCPU1 is currently running on pCPU0.
 3. An external interrupt from an assigned device arrives. Its destination
 is determined by the above flow (IRTE -> posted-interrupt descriptor
 address/vCPU -> notification event to 'NDST').
 If this external interrupt is for vCPU0, the notification event will be
 delivered to pCPU0, since the 'NDST' field of vCPU0's posted-interrupt
 descriptor is pCPU0. If we use the current (global) vector for the
 notification event for vCPU0 in this case, then, because the current
 global vector (notification vector) is special to the CPU, vCPU1 will
 consume it while running on pCPU0, and we fail to wake up the HLT'ed
 vCPU0.
 
 Please refer to Section 29.6 in the Intel SDM about how the CPU handles this
 particular vector:
 http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-manual-325462.pdf
 
 After introducing a new global vector named 'pi_wakeup_vector', before the
 vCPU is HLT'ed we set the 'NV' field (Notification Vector) in the vCPU's
 posted-interrupt descriptor to 'pi_wakeup_vector'. This is a normal vector
 to the CPU, which does nothing special for it (unlike the current global
 vector). In the handler of this vector, we can wake up the HLT'ed vCPU.

So suppose you have more than one vCPU which most recently ran on
pCPU0 - how will the handler for the new vector know which of the
vCPU-s it should kick? And if it can know, why couldn't the handler for
posted_intr_vector not know either (i.e. after introducing a specific
handler for it in place of the currently used event_check_interrupt)?
(One of the reasons I'm asking, i.e. apart from wanting to
understand the model, is the limited amount of vectors we have.)

Jan
