On 11/30/18 2:11 AM, David Gibson wrote:
> On Thu, Nov 29, 2018 at 04:27:31PM +0100, Cédric Le Goater wrote:
>> [ ... ]
>>
>>>>>> +/*
>>>>>> + * The allocation of VP blocks is a complex operation in OPAL and the
>>>>>> + * VP identifiers have a relation with the number of HW chips, the
>>>>>> + * size of the VP blocks, VP grouping, etc. The QEMU sPAPR XIVE
>>>>>> + * controller model does not have the same constraints and can use a
>>>>>> + * simple mapping scheme of the CPU vcpu_id
>>>>>> + *
>>>>>> + * These identifiers are never returned to the OS.
>>>>>> + */
>>>>>> +
>>>>>> +#define SPAPR_XIVE_VP_BASE 0x400
>>>>>
>>>>> 0x400 == 1024. Could we ever have the possibility of needing to
>>>>> consider both physical NVTs and PAPR NVTs at the same time?
>>>>
>>>> They would not be in the same CAM line: OS ring vs. PHYS ring.
>>>
>>> Hm. They still inhabit the same NVT number space though, don't they?
>>
>> No. skiboot reserves the range of VPs for the HW at init.
>>
>> https://github.com/open-power/skiboot/blob/master/hw/xive.c#L1093
>
> Uh.. I don't see how they're reserved is relevant.
>
> What I mean is that the ENDs address the NVTs for HW endpoints by the
> same (block, index) tuples as the NVTs for virtualized endpoints, yes?

Ah. Yes. The (block, index) tuples, fields END_W6_NVT_BLOCK and
END_W6_NVT_INDEX in the END structure, are all in the same number
space. skiboot defines some ranges though.

>>> I'm thinking about the END->NVT stage of the process here, rather than
>>> the NVT->TCTX stage.
>>>
>>> Oh, also, you're using "VP" here which IIUC == "NVT". Can we
>>> standardize on one, please.
>>
>> VP is used in Linux/KVM Linux/Native and skiboot. Yes. it's a mess.
>> Let's have consistent naming in QEMU and use NVT.
>
> Right. And to cover any inevitable missed ones is why I'd like to see
> a cheatsheet giving both terms in the header comments somewhere.

Yes. I have added a list of names in xive.h.

I was wondering if I should put the diagram below somewhere in a .h
file or under doc/specs/.

Thanks,

C.


= XIVE =================================================================

The POWER9 processor comes with a new interrupt controller, called
XIVE, for "eXternal Interrupt Virtualization Engine".


* Overall architecture


              XIVE Interrupt Controller
              +------------------------------------+      IPIs
              | +---------+ +---------+ +--------+ |    +-------+
              | |VC       | |CQ       | |PC      |----> | CORES |
              | |     esb | |         | |        |----> |       |
              | |     eas | |  Bridge | |   tctx |----> |       |
              | |SC   end | |         | |    nvt | |    |       |
  +------+    | +---------+ +----+----+ +--------+ |    +-+-+-+-+
  | RAM  |    +------------------|-----------------+      | | |
  |      |                       |                        | | |
  |      |                       |                        | | |
  |      |  +--------------------v------------------------v-v-v--+    other
  |      <--+                     Power Bus                      +--> chips
  |  esb |  +---------+-----------------------+------------------+
  |  eas |            |                       |
  |  end |        +---|-----+                 |
  |  nvt |       +----+----+|            +----+----+
  +------+       |SC       ||            |SC       |
                 |         ||            |         |
                 | PQ-bits ||            | PQ-bits |
                 | local   |+            |  in VC  |
                 +---------+             +---------+
                    PCIe                 NX,NPU,CAPI

  SC: Source Controller (aka. IVSE)
  VC: Virtualization Controller (aka. IVRE)
  PC: Presentation Controller (aka. IVPE)
  CQ: Common Queue (Bridge)

  PQ-bits: 2 bits source state machine (P:pending Q:queued)
  esb: Event State Buffer (Array of PQ bits in an IVSE)
  eas: Event Assignment Structure
  end: Event Notification Descriptor
  nvt: Notification Virtual Target
  tctx: Thread interrupt Context


The XIVE IC is composed of three sub-engines:

  - Interrupt Virtualization Source Engine (IVSE), or Source
    Controller (SC). These are found in PCI PHBs, in the PSI host
    bridge controller, but also inside the main controller for the
    core IPIs and other sub-chips (NX, CAP, NPU) of the
    chip/processor. They are configured to feed the IVRE with events.

  - Interrupt Virtualization Routing Engine (IVRE), or Virtualization
    Controller (VC). Its job is to match an event source with an Event
    Notification Descriptor (END).

  - Interrupt Virtualization Presentation Engine (IVPE), or
    Presentation Controller (PC). It maintains the interrupt context
    state of each thread and handles the delivery of the external
    exception to the thread.


* XIVE internal tables

Each of the sub-engines uses a set of tables to redirect exceptions
from event sources to CPU threads.

                                           +-------+
   User or OS                              |  EQ   |
       or                          +------>|entries|
   Hypervisor                      |       |  ..   |
     Memory                        |       +-------+
                                   |           ^
                                   |           |
              +-------------------------------------------------+
                                   |           |
   Hypervisor      +------+    +---+--+    +---+--+    +------+
     Memory        | ESB  |    | EAT  |    | ENDT |    | NVTT |
    (skiboot)      +----+-+    +----+-+    +----+-+    +------+
                     ^  |        ^  |        ^  |       ^
                     |  |        |  |        |  |       |
              +-------------------------------------------------+
                     |  |        |  |        |  |       |
                     |  |        |  |        |  |       |
                +----|--|--------|--|--------|--|-+   +-|-----+    +------+
                |    |  |        |  |        |  | |   | | tctx|    |Thread|
   IPI or    ---+    +  v        +  v        +  v |---| +  .. |----->     |
  HW events     |                                 |   |       |    |      |
                |              IVRE               |   | IVPE  |    +------+
                +---------------------------------+   +-------+


Each IVSE has a 2-bit state machine per source, P for pending and Q
for queued, that gates the triggering of events. The bits are stored
in an array, the Event State Buffer (ESB), and are controlled by MMIOs.
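
The text above only says that the P and Q bits gate events; it does not spell out the transitions. A minimal sketch of such a 2-bit state machine follows, under stated assumptions: the encoding (P in bit 1, Q in bit 0), the trigger/EOI transitions and every name (`ESB_*`, `esb_trigger`, `esb_eoi`) are illustrative and not taken from this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed encoding, for illustration only: P is bit 1, Q is bit 0. */
#define ESB_VAL_P    0x2
#define ESB_VAL_Q    0x1

#define ESB_RESET    0x0                      /* idle, nothing pending */
#define ESB_OFF      ESB_VAL_Q                /* source is masked */
#define ESB_PENDING  ESB_VAL_P                /* one event was let through */
#define ESB_QUEUED   (ESB_VAL_P | ESB_VAL_Q)  /* a further event is latched */

/*
 * The source fires. Returns true if the event is let through to the
 * routing engine, false if it is latched or dropped.
 */
static bool esb_trigger(uint8_t *pq)
{
    assert(pq);

    switch (*pq & 0x3) {
    case ESB_RESET:
        *pq = ESB_PENDING;
        return true;
    case ESB_PENDING:
    case ESB_QUEUED:
        /* An event is already in flight: remember that more arrived. */
        *pq = ESB_QUEUED;
        return false;
    default: /* ESB_OFF: masked, state unchanged */
        return false;
    }
}

/*
 * End of interrupt. Returns true if a latched event must now be
 * forwarded, false otherwise.
 */
static bool esb_eoi(uint8_t *pq)
{
    assert(pq);

    switch (*pq & 0x3) {
    case ESB_QUEUED:
        *pq = ESB_PENDING;
        return true;
    case ESB_OFF:
        return false;
    default: /* ESB_RESET or ESB_PENDING */
        *pq = ESB_RESET;
        return false;
    }
}
```

With this sketch, a first trigger from the reset state is forwarded, a second one before the EOI is only latched in Q, and the EOI then replays it.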

If the event is let through, the IVRE looks up the Event Assignment
Structure (EAS) table to find the Event Notification Descriptor (END)
configured for the source. Each END defines a notification path to a
CPU and an in-memory Event Queue, into which the EQ data is pushed for
the OS to pull.

The IVPE determines if a Notification Virtual Target (NVT) can handle
the event by scanning the thread contexts of the VPs dispatched on the
processor HW threads. It maintains the interrupt context state of each
thread in an NVT table.
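
The lookup chain described above (source, then EAS, then END, then NVT) can be sketched as follows. This is only an illustration under stated assumptions: the `Eas` and `End` layouts, the queue depth and the `route_event` helper are hypothetical and do not reflect the real table formats. The only names borrowed from the discussion are the END_W6_NVT_BLOCK and END_W6_NVT_INDEX fields, which the `nvt_block` and `nvt_index` members stand in for.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define EQ_SIZE 4   /* illustrative queue depth, not a hardware value */

/* Event Assignment Structure: one per source, points at an END. */
typedef struct {
    bool     valid;
    uint32_t end_index;
    uint32_t data;        /* EQ data pushed for the OS to pull */
} Eas;

/* Event Notification Descriptor: an event queue plus its NVT target. */
typedef struct {
    bool     valid;
    uint8_t  nvt_block;   /* cf. END_W6_NVT_BLOCK */
    uint32_t nvt_index;   /* cf. END_W6_NVT_INDEX */
    uint32_t eq[EQ_SIZE];
    uint32_t eq_tail;
} End;

/*
 * Route one event: EAS lookup, END lookup, then EQ push.
 * Returns true and fills in the (block, index) tuple of the NVT to
 * notify, or false if the source is not routed.
 */
static bool route_event(const Eas *eat, size_t nr_eas,
                        End *endt, size_t nr_ends, uint32_t source,
                        uint8_t *nvt_block, uint32_t *nvt_index)
{
    assert(eat && endt && nvt_block && nvt_index);

    if (source >= nr_eas || !eat[source].valid) {
        return false;
    }
    const Eas *eas = &eat[source];

    if (eas->end_index >= nr_ends || !endt[eas->end_index].valid) {
        return false;
    }
    End *end = &endt[eas->end_index];

    /* Push the EQ data into the in-memory event queue (a ring). */
    end->eq[end->eq_tail] = eas->data;
    end->eq_tail = (end->eq_tail + 1) % EQ_SIZE;

    /* HW and virtualized NVTs share this one (block, index) space. */
    *nvt_block = end->nvt_block;
    *nvt_index = end->nvt_index;
    return true;
}
```

The presentation step (matching the returned NVT against the thread contexts) is deliberately left out, since it is the NVT->TCTX stage rather than the END->NVT stage discussed in this thread.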