On 11/30/18 2:11 AM, David Gibson wrote:
> On Thu, Nov 29, 2018 at 04:27:31PM +0100, Cédric Le Goater wrote:
>> [ ... ] 
>>
>>>>>> +/*
>>>>>> + * The allocation of VP blocks is a complex operation in OPAL and the
>>>>>> + * VP identifiers have a relation with the number of HW chips, the
>>>>>> + * size of the VP blocks, VP grouping, etc. The QEMU sPAPR XIVE
>>>>>> + * controller model does not have the same constraints and can use a
>>>>>> + * simple mapping scheme of the CPU vcpu_id
>>>>>> + *
>>>>>> + * These identifiers are never returned to the OS.
>>>>>> + */
>>>>>> +
>>>>>> +#define SPAPR_XIVE_VP_BASE 0x400
>>>>>
>>>>> 0x400 == 1024.  Could we ever have the possibility of needing to
>>>>> consider both physical NVTs and PAPR NVTs at the same time?  
>>>>
>>>> They would not be in the same CAM line: OS ring vs. PHYS ring. 
>>>
>>> Hm.  They still inhabit the same NVT number space though, don't they?
>>
>> No. skiboot reserves the range of VPs for the HW at init.
>>
>> https://github.com/open-power/skiboot/blob/master/hw/xive.c#L1093
> 
> Uh.. I don't see how they're reserved is relevant.
> 
> What I mean is that the ENDs address the NVTs for HW endpoints by the
> same (block, index) tuples as the NVTs for virtualized endpoints, yes?

Ah. Yes. The (block, index) tuples, fields END_W6_NVT_BLOCK and 
END_W6_NVT_INDEX in the END structure, are all in the same number space.

skiboot defines some ranges though.


>>> I'm thinking about the END->NVT stage of the process here, rather than
>>> the NVT->TCTX stage.
>>>
>>> Oh, also, you're using "VP" here which IIUC == "NVT".  Can we
>>> standardize on one, please.
>>
>> VP is used in Linux/KVM Linux/Native and skiboot. Yes. it's a mess. 
>> Let's have consistent naming in QEMU and use NVT. 
> 
> Right.  And to cover any inevitable missed ones is why I'd like to see
> a cheatsheet giving both terms in the header comments somewhere.

yes. I have added a list of names in xive.h. 

I was wondering if I should put the diagram below somewhere in a .h file 
or under doc/specs/.

Thanks,

C.  


= XIVE =================================================================

The POWER9 processor comes with a new interrupt controller, called
XIVE as "eXternal Interrupt Virtualization Engine".


* Overall architecture


             XIVE Interrupt Controller
             +------------------------------------+      IPIs
             | +---------+ +---------+ +--------+ |    +-------+
             | |VC       | |CQ       | |PC      |----> | CORES |
             | |     esb | |         | |        |----> |       |
             | |     eas | |  Bridge | |   tctx |----> |       |
             | |SC   end | |         | |    nvt | |    |       |
 +------+    | +---------+ +----+----+ +--------+ |    +-+-+-+-+
 | RAM  |    +------------------|-----------------+      | | |
 |      |                       |                        | | |
 |      |                       |                        | | |
 |      |  +--------------------v------------------------v-v-v--+    other
 |      <--+                     Power Bus                      +--> chips
 |  esb |  +---------+-----------------------+------------------+
 |  eas |            |                       |
 |  end |        +---|-----+                 |
 |  nvt |       +----+----+|            +----+----+
 +------+       |SC       ||            |SC       |
                |         ||            |         |
                | PQ-bits ||            | PQ-bits |
                | local   |+            |  in VC  |
                +---------+             +---------+
                   PCIe                 NX,NPU,CAPI

                  SC: Source Controller (aka. IVSE)
                  VC: Virtualization Controller (aka. IVRE)
                  PC: Presentation Controller (aka. IVPE)
                  CQ: Common Queue (Bridge)

             PQ-bits: 2 bits source state machine (P:pending Q:queued)
                 esb: Event State Buffer (Array of PQ bits in an IVSE)
                 eas: Event Assignment Structure
                 end: Event Notification Descriptor
                 nvt: Notification Virtual Target
                tctx: Thread interrupt Context


The XIVE IC is composed of three sub-engines :

  - Interrupt Virtualization Source Engine (IVSE), or Source
    Controller (SC). These are found in PCI PHBs, in the PSI host
    bridge controller, but also inside the main controller for the
    core IPIs and other sub-chips (NX, CAP, NPU) of the
    chip/processor. They are configured to feed the IVRE with events.

  - Interrupt Virtualization Routing Engine (IVRE) or Virtualization
    Controller (VC). Its job is to match an event source with an Event
    Notification Descriptor (END).

  - Interrupt Virtualization Presentation Engine (IVPE) or Presentation
    Controller (PC). It maintains the interrupt context state of each
    thread and handles the delivery of the external exception to the
    thread.


* XIVE internal tables

Each of the sub-engines uses a set of tables to redirect exceptions
from event sources to CPU threads.

                                          +-------+
  User or OS                              |  EQ   |
      or                          +------>|entries|
  Hypervisor                      |       |  ..   |
    Memory                        |       +-------+
                                  |           ^
                                  |           |
             +-------------------------------------------------+
                                  |           |
  Hypervisor      +------+    +---+--+    +---+--+   +------+
    Memory        | ESB  |    | EAT  |    | ENDT |   | NVTT |
   (skiboot)      +----+-+    +----+-+    +----+-+   +------+
                    ^  |        ^  |        ^  |       ^
                    |  |        |  |        |  |       |
             +-------------------------------------------------+
                    |  |        |  |        |  |       |
                    |  |        |  |        |  |       |
               +----|--|--------|--|--------|--|-+   +-|-----+    +------+
               |    |  |        |  |        |  | |   | | tctx|    |Thread|
   IPI or   ---+    +  v        +  v        +  v |---| +  .. |----->     |
  HW events    |                                 |   |       |    |      |
               |             IVRE                |   | IVPE  |    +------+
               +---------------------------------+   +-------+
            


The IVSE have a 2-bits, P for pending and Q for queued, state machine
for each source that allows events to be triggered. They are stored in
an array, the Event State Buffer (ESB) and controlled by MMIOs.

If the event is let through, the IVRE looks up in the Event Assignment
Structure (EAS) table for an Event Notification Descriptor (END)
configured for the source. Each Event Notification Descriptor defines
a notification path to a CPU and an in-memory Event Queue, in which
will be pushed an EQ data for the OS to pull.

The IVPE determines if a Notification Virtual Target (NVT) can handle
the event by scanning the thread contexts of the VPs dispatched on the
processor HW threads. It maintains the interrupt context state of each
thread in a NVT table.


Reply via email to