Re: [Xen-devel] [Draft C] Xen on ARM vITS Handling
On Wed, 2015-06-03 at 12:55 +0530, Vijay Kilari wrote:
> On Mon, Jun 1, 2015 at 5:54 PM, Julien Grall wrote:
> > On 01/06/15 13:11, Ian Campbell wrote:
> ### Device ID (`ID`)
>
> This parameter is used by commands which manage a specific device and
> the interrupts associated with that device. Checking if a device is
> present and retrieving the data structure must be fast.
>
> The device identifiers may not be assigned contiguously and the maximum
> number is very high (2^32).
>
> XXX In the context of virtualised device ids this may not be the case,
> e.g. we can arrange for (mostly) contiguous device ids and we know the
> bound is significantly lower than 2^32
>
> Possible efficient data structures would be:
>
> 1. List: The lookup/deletion is in O(n) and the insertion will depend
>    if the device should be sorted following their identifier. The
>    memory overhead is 18 bytes per element.
> 2. Red-black tree: All the operations are O(log(n)). The memory
>    overhead is 24 bytes per element.
>
> How about using radix-tree instead of RB-tree?
>
> A Red-black tree seems the more suitable for having fast deviceID
> validation even though the memory overhead is a bit higher compare to
> the list.
> >>>
> >>> When PHYSDEVOP_pci_device_add is called, memory for its_device structure
> >>> and other needed structure for this device is allocated added to RB-tree
> >>> with all necessary information
> >>
> >> Sounds like a reasonable time to do it. I added something based on your
> >> words.
> >
> > Hmmm... The RB-tree suggested is per domain not the host and indexed
> > with the vDevID.
> >
> > This is the only way to know quickly if the domain is able to use the
> > device and retrieving a device. Indeed, the vDevID won't be equal to the
> > pDevID as the vBDF will be different to the pBDF.
>
> Yes, vBDF is converted to pBDF to match DevID
>
> >
> > PHYSDEVOP_pci_device_add is to ask Xen managing the PCI device. At that
> > time we don't know to which domain the device will be passthrough.
>
> PHYSDEVOP_pci_device_add will only add its_device to global radix tree list.
>
> When MAPD is received, its_device is removed from global list and added
> to per domain list. When domain releases the device, its_device is added back
> to global list. is it ok?

I suspect we might need two list (or tree) entries for each its_device,
one for the pDevice mapping and one for the vDevice mapping. We may even
want a third for vCollection membership, I'm not sure.

Either way I don't think it'll be a big deal, the need or not for each
of those will fall out in the wash from the rest of the design, I think.

Based on the amount of discussion on draftC and the fact that we are
still finding new areas of complexity I'm going to take a step back and
try something simpler and see if I can come up with something which we
can get done for 4.6.

I'll try and get a new draft reflecting that out ASAP. (I have my edits
from the feedback on draftC so far in git, so if it doesn't work we can
always take up this one again...)
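For illustration only, the idea of giving each its_device one entry per mapping (pDevice and vDevice) could be modelled roughly as below. This is a Python sketch rather than Xen code: `ItsDevice`, `DeviceIndex` and every method name are hypothetical, and plain dictionaries stand in for whatever list or tree is eventually chosen.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class ItsDevice:
    """Hypothetical stand-in for the its_device structure."""
    p_devid: int                     # physical DeviceID, unique on the host
    v_devid: Optional[int] = None    # virtual DeviceID, set on assignment
    domain: Optional[int] = None     # owning domain id, None while unassigned


class DeviceIndex:
    """One record per device, reachable through two separate indexes."""

    def __init__(self) -> None:
        # Host-wide index keyed by physical DeviceID.
        self.by_pdev: Dict[int, ItsDevice] = {}
        # Per-domain indexes keyed by virtual DeviceID.
        self.by_vdev: Dict[int, Dict[int, ItsDevice]] = {}

    def add_physical(self, p_devid: int) -> ItsDevice:
        dev = ItsDevice(p_devid)
        self.by_pdev[p_devid] = dev
        return dev

    def assign(self, p_devid: int, domain: int, v_devid: int) -> None:
        dev = self.by_pdev[p_devid]
        if dev.domain is not None:
            raise ValueError("device already assigned")
        dev.domain, dev.v_devid = domain, v_devid
        # The same object is linked into the second index; nothing is copied.
        self.by_vdev.setdefault(domain, {})[v_devid] = dev

    def unassign(self, p_devid: int) -> None:
        dev = self.by_pdev[p_devid]
        if dev.domain is not None:
            del self.by_vdev[dev.domain][dev.v_devid]
            dev.domain = dev.v_devid = None

    def lookup_virtual(self, domain: int, v_devid: int) -> Optional[ItsDevice]:
        return self.by_vdev.get(domain, {}).get(v_devid)
```

The point of the sketch is that the device stays in the physical index for its whole lifetime, while the virtual entry only exists while the device is assigned. A third index for vCollection membership would follow the same pattern.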
Re: [Xen-devel] [Draft C] Xen on ARM vITS Handling
On Mon, Jun 1, 2015 at 5:54 PM, Julien Grall wrote:
> On 01/06/15 13:11, Ian Campbell wrote:
### Device ID (`ID`)

This parameter is used by commands which manage a specific device and
the interrupts associated with that device. Checking if a device is
present and retrieving the data structure must be fast.

The device identifiers may not be assigned contiguously and the maximum
number is very high (2^32).

XXX In the context of virtualised device ids this may not be the case,
e.g. we can arrange for (mostly) contiguous device ids and we know the
bound is significantly lower than 2^32

Possible efficient data structures would be:

1. List: The lookup/deletion is in O(n) and the insertion will depend
   if the device should be sorted following their identifier. The
   memory overhead is 18 bytes per element.
2. Red-black tree: All the operations are O(log(n)). The memory
   overhead is 24 bytes per element.

How about using radix-tree instead of RB-tree?

A Red-black tree seems the more suitable for having fast deviceID
validation even though the memory overhead is a bit higher compare to
the list.
>>>
>>> When PHYSDEVOP_pci_device_add is called, memory for its_device structure
>>> and other needed structure for this device is allocated added to RB-tree
>>> with all necessary information
>>
>> Sounds like a reasonable time to do it. I added something based on your
>> words.
>
> Hmmm... The RB-tree suggested is per domain not the host and indexed
> with the vDevID.
>
> This is the only way to know quickly if the domain is able to use the
> device and retrieving a device. Indeed, the vDevID won't be equal to the
> pDevID as the vBDF will be different to the pBDF.

Yes, vBDF is converted to pBDF to match DevID

>
> PHYSDEVOP_pci_device_add is to ask Xen managing the PCI device. At that
> time we don't know to which domain the device will be passthrough.

PHYSDEVOP_pci_device_add will only add its_device to global radix tree list.

When MAPD is received, its_device is removed from global list and added
to per domain list. When domain releases the device, its_device is added back
to global list. is it ok?
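To make the proposed flow concrete, here is a minimal Python model of a device moving from a global pool to a per-domain pool on MAPD and back again on release. It is a sketch under assumptions, not Xen code: `DevicePools` and its methods are hypothetical names, dictionaries stand in for the radix tree, and the vDevID to pDevID translation is deliberately left out.

```python
from typing import Dict


class DevicePools:
    """A device record lives in exactly one pool at any time."""

    def __init__(self) -> None:
        # Devices known to the host but not mapped by any domain.
        self.global_pool: Dict[int, dict] = {}
        # Devices currently mapped, grouped by owning domain.
        self.per_domain: Dict[int, Dict[int, dict]] = {}

    def pci_device_add(self, devid: int) -> None:
        # Models PHYSDEVOP_pci_device_add: the record only enters the
        # global pool, since the target domain is not known yet.
        self.global_pool[devid] = {"devid": devid}

    def mapd(self, domain: int, devid: int) -> bool:
        # Models MAPD with Valid set: move the record to the domain.
        dev = self.global_pool.pop(devid, None)
        if dev is None:
            return False  # unknown device, or already owned by a domain
        self.per_domain.setdefault(domain, {})[devid] = dev
        return True

    def release(self, domain: int, devid: int) -> bool:
        # Models the domain giving the device up: move the record back.
        dev = self.per_domain.get(domain, {}).pop(devid, None)
        if dev is None:
            return False
        self.global_pool[devid] = dev
        return True
```

One property that falls out of this model is that a second domain cannot map a device that is already owned, because the record is no longer in the global pool. Whether that is the desired behaviour depends on the rest of the design.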
Re: [Xen-devel] [Draft C] Xen on ARM vITS Handling
On Tue, 2015-06-02 at 11:46 +0100, Julien Grall wrote: > Hi Ian, > > On 01/06/15 14:36, Ian Campbell wrote: > > On Fri, 2015-05-29 at 15:06 +0100, Julien Grall wrote: > >> Hi Vijay, > >> > >> On 27/05/15 17:44, Vijay Kilari wrote: > ## Command Translation > > Of the existing GICv3 ITS commands, `MAPC`, `MAPD`, `MAPVI`/`MAPI` are > potentially time consuming commands as these commands creates entry in > the Xen ITS structures, which are used to validate other ITS commands. > > `INVALL` and `SYNC` are global and potentially disruptive to other > guests and so need consideration. > > All other ITS command like `MOVI`, `DISCARD`, `INV`, `INT`, `CLEAR` > just validate and generate physical command. > > ### `MAPC` command translation > > Format: `MAPC vCID, vTA` > > >>>- The GITS_TYPER.PAtype is emulated as 0. Hence vTA is always > >>> represents > >>> vcpu number. Hence vTA is validated against physical Collection > >>> IDs by querying > >>> ITS driver and corresponding Physical Collection ID is retrieved. > >>>- Each vITS will have cid_map (struct cid_mapping) which holds > >>> mapping of > >> > >> Why do you speak about each vITS? The emulation is only related to one > >> vITS and not shared... > > > > And each vITS will have a cid_map, which is used. This seems like a > > reasonable way to express this concept in the context. > > This is rather strange when everything in the command emulation is per-vits. I'm afraid you are going to have to say more explicitly what you find strange here. > > Perhaps there is a need to include discussion of some of the secondary > > data structures alongside the defintion `cits_cq`. In which case we > > could talk about "its associated `cid_map`" and things. > > > >>> Virtual Collection ID(vCID), Virtual Target address(vTA) and > >>> Physical Collection ID (pCID). 
> >>> If vCID entry already exists in cid_map, then that particular > >>> mapping is updated with > >>> the new pCID and vTA else new entry is made in cid_map > >> > >> When you move a collection, you also have to make sure that all the > >> interrupts associated to it will be delivered to the new target. > >> > >> I'm not sure what you are suggesting for that... > > > > This is going to be rather painful I fear. > > > >>>- MAPC pCID, pTA physical ITS command is generated > >> > >> We should not send any MAPC command to the physical ITS. The collection > >> is already mapped during Xen boot and the guest should not be able to > >> move the physical collection (they are shared between all the guests and > >> Xen). > > > > This needs discussion in the background section, to describe the > > physical setup which the virtual stuff can make assumption of. > > I don't think this is a background section. The physical number of > collection is limited (the mandatory number of collections is nr_cpus + > 1). Those collection will likely be shared between Xen and the different > guests. Right, and this needs to be explained in the document as an assumption upon which other things can draw, so that the document is (so far as possible) a coherent whole... > If we let the guest moving the physical collection we will also move all > the interrupts which is wrong. ... and therefore things like this would become apparent. > - `MAPC pCID, pTA` physical ITS command is generated > > ### `MAPD` Command translation > > Format: `MAPD device, Valid, ITT IPA, ITT Size` > > `MAPD` is sent with `Valid` bit set if device needs to be added and reset > when device is removed. > > If `Valid` bit is set: > > - Allocate memory for `its_device` struct > - Validate ITT IPA & ITT size and update its_device struct > - Find number of vectors(nrvecs) for this device by querying PCI > helper function > - Allocate nrvecs number of LPI XXX nrvecs is a function of `ITT Size`? 
> - Allocate memory for `struct vlpi_map` for this device. This > `vlpi_map` holds mapping of Virtual LPI to Physical LPI and ID. > - Find physical ITS node with which this device is associated > - Call `p2m_lookup` on ITT IPA addr and get physical ITT address > - Validate ITT Size > - Generate/format physical ITS command: `MAPD, ITT PA, ITT Size` > > Here the overhead is with memory allocation for `its_device` and > `vlpi_map` > > XXX Suggestion was to preallocate some of those at device passthrough > setup time? > >>> > >>> If Validation bit is set: > >>>- Query its_device tree and get its_device structure for this device. > >>>- (XXX: If pci device is hidden from dom0, does this device is added > >>>with PHYSDEVOP_pci_device_add hypercall?) > >>>- If device does not exists return > >>>- If device exists in RB-tree then > >>> - Validate ITT IPA & ITT size and update its
Re: [Xen-devel] [Draft C] Xen on ARM vITS Handling
Hi Ian, On 01/06/15 14:36, Ian Campbell wrote: On Fri, 2015-05-29 at 15:06 +0100, Julien Grall wrote: Hi Vijay, On 27/05/15 17:44, Vijay Kilari wrote: ## Command Translation Of the existing GICv3 ITS commands, `MAPC`, `MAPD`, `MAPVI`/`MAPI` are potentially time consuming commands as these commands creates entry in the Xen ITS structures, which are used to validate other ITS commands. `INVALL` and `SYNC` are global and potentially disruptive to other guests and so need consideration. All other ITS command like `MOVI`, `DISCARD`, `INV`, `INT`, `CLEAR` just validate and generate physical command. ### `MAPC` command translation Format: `MAPC vCID, vTA` - The GITS_TYPER.PAtype is emulated as 0. Hence vTA is always represents vcpu number. Hence vTA is validated against physical Collection IDs by querying ITS driver and corresponding Physical Collection ID is retrieved. - Each vITS will have cid_map (struct cid_mapping) which holds mapping of Why do you speak about each vITS? The emulation is only related to one vITS and not shared... And each vITS will have a cid_map, which is used. This seems like a reasonable way to express this concept in the context. This is rather strange when everything in the command emulation is per-vits. Perhaps there is a need to include discussion of some of the secondary data structures alongside the defintion `cits_cq`. In which case we could talk about "its associated `cid_map`" and things. Virtual Collection ID(vCID), Virtual Target address(vTA) and Physical Collection ID (pCID). If vCID entry already exists in cid_map, then that particular mapping is updated with the new pCID and vTA else new entry is made in cid_map When you move a collection, you also have to make sure that all the interrupts associated to it will be delivered to the new target. I'm not sure what you are suggesting for that... This is going to be rather painful I fear. 
- MAPC pCID, pTA physical ITS command is generated We should not send any MAPC command to the physical ITS. The collection is already mapped during Xen boot and the guest should not be able to move the physical collection (they are shared between all the guests and Xen). This needs discussion in the background section, to describe the physical setup which the virtual stuff can make assumption of. I don't think this is a background section. The physical number of collection is limited (the mandatory number of collections is nr_cpus + 1). Those collection will likely be shared between Xen and the different guests. If we let the guest moving the physical collection we will also move all the interrupts which is wrong. Here there is no overhead, the cid_map entries are preallocated with size of nr_cpus in the platform. As said the number of collection should be at least nr_cpus + 1. FWIW I read this as "with size appropriate for nr_cpus", which leaves the +1 as implicit. I added the +1 nevertheless. I wanted to make clear. His implementation was only considering nr_cpus collections. - `MAPC pCID, pTA` physical ITS command is generated ### `MAPD` Command translation Format: `MAPD device, Valid, ITT IPA, ITT Size` `MAPD` is sent with `Valid` bit set if device needs to be added and reset when device is removed. If `Valid` bit is set: - Allocate memory for `its_device` struct - Validate ITT IPA & ITT size and update its_device struct - Find number of vectors(nrvecs) for this device by querying PCI helper function - Allocate nrvecs number of LPI XXX nrvecs is a function of `ITT Size`? - Allocate memory for `struct vlpi_map` for this device. This `vlpi_map` holds mapping of Virtual LPI to Physical LPI and ID. 
- Find physical ITS node with which this device is associated - Call `p2m_lookup` on ITT IPA addr and get physical ITT address - Validate ITT Size - Generate/format physical ITS command: `MAPD, ITT PA, ITT Size` Here the overhead is with memory allocation for `its_device` and `vlpi_map` XXX Suggestion was to preallocate some of those at device passthrough setup time? If Validation bit is set: - Query its_device tree and get its_device structure for this device. - (XXX: If pci device is hidden from dom0, does this device is added with PHYSDEVOP_pci_device_add hypercall?) - If device does not exists return - If device exists in RB-tree then - Validate ITT IPA & ITT size and update its_device struct To validate the ITT size you need to know the number of interrupt ID. Please could you get into the habit of making concrete suggestions for changes to the text. I've no idea what change I should make based on this observation. If not concrete suggestions please try and make the implications of what you are saying clear. The size of the ITT is based on the number of Interrupt supported by the device. The only way to validate the size getting the number of Interrupt before. i.e - Find the number of MS
Re: [Xen-devel] [Draft C] Xen on ARM vITS Handling
On Mon, 2015-06-01 at 16:29 +0100, Julien Grall wrote: > On 01/06/15 14:12, Ian Campbell wrote: > > On Fri, 2015-05-29 at 14:40 +0100, Julien Grall wrote: > >> Hi Ian, > > Hi Ian, > > >> NIT: You used my Linaro email which I think is de-activated now :). > > > > I keep finding new address books with that address in them! > > > >>> ## ITS Translation Table > >>> > >>> Message signalled interrupts are translated into an LPI via an ITS > >>> translation table which must be configured for each device which can > >>> generate an MSI. > >> > >> I'm not sure what is the ITS Table Table. Did you mean Interrupt > >> Translation Table? > > > > I don't think I wrote Table Table anywhere. > > Sorry I meant "ITS translation table" > > > I'm referring to the tables which are established by e.g. the MAPD > > command and friends, e.g. the thing shown in "4.9.12 Notional ITS Table > > Structure". > > On previous paragraph you are referring particularly to "Interrupt > Translation Table". This is the only table that is configured per device. I'm afraid I'm still not getting your point. Please quote the exact text which you think is wrong and if possible suggest an alternative. > [..] > > >>> XXX there are other aspects to virtualising the ITS (LPI collection > >>> management, assignment of LPI ranges to guests, device > >>> management). However these are not currently considered here. XXX > >>> Should they be/do they need to be? > >> > >> I think we began to cover these aspect with the section "command > >> emulation". > > > > Some aspects, yes. I went with: > > > > There are other aspects to virtualising the ITS (LPI collection > > management, assignment of LPI ranges to guests, device > > management). However these are only considered here to the extent > > needed for describing the vITS emulation. > > > >>> XXX In the context of virtualised device ids this may not be the case, > >>> e.g. 
we can arrange for (mostly) contiguous device ids and we know the > >>> bound is significantly lower than 2^32 > >> > >> Well, the deviceID is computed from the BDF and some DMA alias. As the > >> algorithm can't be tweaked, it's very likely that we will have > >> non-contiguous Device ID. See pci_for_each_dma_alias in Linux > >> (drivers/pci/search.c). > > > > The implication here is that deviceID is fixed in hardware and is used > > by driver domain software in contexts where we do not get the > > opportunity to translate is that right? What contexts are those? > > No, the driver domain software will always use a virtual DeviceID (based > on the vBDF and other things). The problem I wanted to raise is how to > translate back the vDeviceID to a physical deviceID/BDF. Right, so this goes back to my original point, which is that if we completely control the translation from vDeviceID to pDeviceID/BDF then the vDeviceId space need not be sparse and need not utilise the entire 2^32 space, at least for domU uses. > > Note that the BDF is also something which we could in principal > > virtualise (we already do for domU). Perhaps that is infeasible for dom0 > > though? > > For DOM0 the virtual BDF is equal to the physical BDF. So the both > deviceID (physical and virtual) will be the same. > > We may decide to do vBDF == pBDF for guest too in order to simplify the > code. It seems to me that choosing vBDF such that the vDeviceId space is to our liking would be a good idea. > > That gives me two thoughts. > > > > The first is that although device identifiers are not necessarily > > contiguous, they are generally at least grouped and not allocated at > > random through the 2^32 options. For example a PCI Host bridge typically > > has a range of device ids associated with it and each device has a > > device id derived from that. > > Usually it's one per (device, function). Yes, but my point is that they are generally grouped by bus. 
The bus is assigned a (contiguous) range and individual (device,function)=> device id mappings are based on a formula applied to the base address. i.e. for a given PCI bus the device ids are in the range 1000..1000+N, not N random number selected from the 2^32 space. > > > > > I'm not sure if we can leverage that into a more useful data structure > > than an R-B tree, or for example to arrange for the R-B to allow for the > > translation of a device within a span into the parent span and from > > there do the lookup. Specifically when looking up a device ID > > corresponding to a PCI device we could arrange to find the PCI host > > bridge and find the actual device from there. This would keep the RB > > tree much smaller and therefore perhaps quicker? Of course that depends > > on what the lookup from PCI host bridge to a device looked like. > > I'm not sure why you are speaking about PCI host bridge. AFAIK, the > guest doesn't have a physical host bridge. It has a virtual one provided by the pciif/pcifront+back thing. Any PCI bus is behind some sort of
Re: [Xen-devel] [Draft C] Xen on ARM vITS Handling
On 01/06/15 14:12, Ian Campbell wrote: > On Fri, 2015-05-29 at 14:40 +0100, Julien Grall wrote: >> Hi Ian, Hi Ian, >> NIT: You used my Linaro email which I think is de-activated now :). > > I keep finding new address books with that address in them! > >>> ## ITS Translation Table >>> >>> Message signalled interrupts are translated into an LPI via an ITS >>> translation table which must be configured for each device which can >>> generate an MSI. >> >> I'm not sure what is the ITS Table Table. Did you mean Interrupt >> Translation Table? > > I don't think I wrote Table Table anywhere. Sorry I meant "ITS translation table" > I'm referring to the tables which are established by e.g. the MAPD > command and friends, e.g. the thing shown in "4.9.12 Notional ITS Table > Structure". On previous paragraph you are referring particularly to "Interrupt Translation Table". This is the only table that is configured per device. [..] >>> XXX there are other aspects to virtualising the ITS (LPI collection >>> management, assignment of LPI ranges to guests, device >>> management). However these are not currently considered here. XXX >>> Should they be/do they need to be? >> >> I think we began to cover these aspect with the section "command emulation". > > Some aspects, yes. I went with: > > There are other aspects to virtualising the ITS (LPI collection > management, assignment of LPI ranges to guests, device > management). However these are only considered here to the extent > needed for describing the vITS emulation. > >>> XXX In the context of virtualised device ids this may not be the case, >>> e.g. we can arrange for (mostly) contiguous device ids and we know the >>> bound is significantly lower than 2^32 >> >> Well, the deviceID is computed from the BDF and some DMA alias. As the >> algorithm can't be tweaked, it's very likely that we will have >> non-contiguous Device ID. See pci_for_each_dma_alias in Linux >> (drivers/pci/search.c). 
> > The implication here is that deviceID is fixed in hardware and is used > by driver domain software in contexts where we do not get the > opportunity to translate is that right? What contexts are those? No, the driver domain software will always use a virtual DeviceID (based on the vBDF and other things). The problem I wanted to raise is how to translate back the vDeviceID to a physical deviceID/BDF. > Note that the BDF is also something which we could in principal > virtualise (we already do for domU). Perhaps that is infeasible for dom0 > though? For DOM0 the virtual BDF is equal to the physical BDF. So the both deviceID (physical and virtual) will be the same. We may decide to do vBDF == pBDF for guest too in order to simplify the code. > That gives me two thoughts. > > The first is that although device identifiers are not necessarily > contiguous, they are generally at least grouped and not allocated at > random through the 2^32 options. For example a PCI Host bridge typically > has a range of device ids associated with it and each device has a > device id derived from that. Usually it's one per (device, function). > > I'm not sure if we can leverage that into a more useful data structure > than an R-B tree, or for example to arrange for the R-B to allow for the > translation of a device within a span into the parent span and from > there do the lookup. Specifically when looking up a device ID > corresponding to a PCI device we could arrange to find the PCI host > bridge and find the actual device from there. This would keep the RB > tree much smaller and therefore perhaps quicker? Of course that depends > on what the lookup from PCI host bridge to a device looked like. I'm not sure why you are speaking about PCI host bridge. AFAIK, the guest doesn't have a physical host bridge. Although, this is an optimization that we can think about it later. The R-B will already be fast enough for a first implementation. 
My main point was about the translation vDeviceID => pDeviceID. > The second is that perhaps we can do something simpler for the domU > case, if we were willing to tolerate it being different from dom0. > >>> Possible efficient data structures would be: >>> >>> 1. List: The lookup/deletion is in O(n) and the insertion will depend >>> if the device should be sorted following their identifier. The >>> memory overhead is 18 bytes per element. >>> 2. Red-black tree: All the operations are O(log(n)). The memory >>> overhead is 24 bytes per element. >>> >>> A Red-black tree seems the more suitable for having fast deviceID >>> validation even though the memory overhead is a bit higher compare to >>> the list. >>> >>> ### Event ID (`vID`) >>> >>> This is the per-device Interrupt identifier (i.e. the MSI index). It >>> is configured by the device driver software. >>> >>> It is not necessary to translate a `vID`, however they may need to be >>> represented in various data structures given to the pITS. >>> >>> XXX i
Re: [Xen-devel] [Draft C] Xen on ARM vITS Handling
On Mon, 2015-06-01 at 13:24 +0100, Julien Grall wrote:
> On 01/06/15 13:11, Ian Campbell wrote:
> >>> ### Device ID (`ID`)
> >>>
> >>> This parameter is used by commands which manage a specific device and
> >>> the interrupts associated with that device. Checking if a device is
> >>> present and retrieving the data structure must be fast.
> >>>
> >>> The device identifiers may not be assigned contiguously and the maximum
> >>> number is very high (2^32).
> >>>
> >>> XXX In the context of virtualised device ids this may not be the case,
> >>> e.g. we can arrange for (mostly) contiguous device ids and we know the
> >>> bound is significantly lower than 2^32
> >>>
> >>> Possible efficient data structures would be:
> >>>
> >>> 1. List: The lookup/deletion is in O(n) and the insertion will depend
> >>>    if the device should be sorted following their identifier. The
> >>>    memory overhead is 18 bytes per element.
> >>> 2. Red-black tree: All the operations are O(log(n)). The memory
> >>>    overhead is 24 bytes per element.
> >>>
> >>> A Red-black tree seems the more suitable for having fast deviceID
> >>> validation even though the memory overhead is a bit higher compare to
> >>> the list.
> >>
> >> When PHYSDEVOP_pci_device_add is called, memory for its_device structure
> >> and other needed structure for this device is allocated added to RB-tree
> >> with all necessary information
> >
> > Sounds like a reasonable time to do it. I added something based on your
> > words.
>
> Hmmm... The RB-tree suggested is per domain not the host and indexed
> with the vDevID.

I added "The `ID` is per domain and therefore the data structure should
be too." before "Possible efficient..."

> This is the only way to know quickly if the domain is able to use the
> device and retrieving a device. Indeed, the vDevID won't be equal to the
> pDevID as the vBDF will be different to the pBDF.
>
> PHYSDEVOP_pci_device_add is to ask Xen managing the PCI device. At that
> time we don't know to which domain the device will be passthrough.

Yes, I suppose we can allocate at PHYSDEVOP_pci_device_add time, but
linking it into the R-B tree will have to happen at assignment time.

This section now ends:

    When `PHYSDEVOP_pci_device_add` is called, memory for its_device
    structure and other needed structure for this device is allocated.

    When `XEN_DOMCTL_assign_device` is called the device will be added to
    the per domain RB-tree with all necessary information.

Ian.
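As a rough illustration of the two-step lifecycle in the revised text (allocate when the device is added, link into the per-domain structure when it is assigned), here is a small Python model. All names are hypothetical. A sorted list searched with `bisect` stands in for the per-domain R-B tree: it gives the same O(log n) lookup by vDevID, although insertion into a Python list is O(n), unlike a real R-B tree.

```python
import bisect
from typing import Dict, List, Optional


class DomainDeviceIndex:
    """Per-domain index keyed by vDevID, kept sorted for binary search."""

    def __init__(self) -> None:
        self._keys: List[int] = []   # sorted vDevIDs
        self._devs: List[dict] = []  # device records, parallel to _keys

    def insert(self, v_devid: int, dev: dict) -> None:
        i = bisect.bisect_left(self._keys, v_devid)
        if i < len(self._keys) and self._keys[i] == v_devid:
            raise KeyError("vDevID already present")
        self._keys.insert(i, v_devid)
        self._devs.insert(i, dev)

    def lookup(self, v_devid: int) -> Optional[dict]:
        i = bisect.bisect_left(self._keys, v_devid)
        if i < len(self._keys) and self._keys[i] == v_devid:
            return self._devs[i]
        return None


# pDevID -> record allocated up front (models PHYSDEVOP_pci_device_add).
allocated: Dict[int, dict] = {}
# domain id -> per-domain index (populated at assignment time only).
domains: Dict[int, DomainDeviceIndex] = {}


def pci_device_add(p_devid: int) -> None:
    # Allocation happens here, before any domain is known.
    allocated[p_devid] = {"p_devid": p_devid, "domain": None}


def assign_device(domain: int, p_devid: int, v_devid: int) -> None:
    # Models XEN_DOMCTL_assign_device: no allocation, only linking.
    dev = allocated[p_devid]
    dev["domain"] = domain
    domains.setdefault(domain, DomainDeviceIndex()).insert(v_devid, dev)
```

With this split, command emulation only ever searches the index of the domain issuing the command, which is what makes the per-domain vDevID check cheap.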
Re: [Xen-devel] [Draft C] Xen on ARM vITS Handling
On Fri, 2015-05-29 at 15:06 +0100, Julien Grall wrote: > Hi Vijay, > > On 27/05/15 17:44, Vijay Kilari wrote: > >> ## Command Translation > >> > >> Of the existing GICv3 ITS commands, `MAPC`, `MAPD`, `MAPVI`/`MAPI` are > >> potentially time consuming commands as these commands creates entry in > >> the Xen ITS structures, which are used to validate other ITS commands. > >> > >> `INVALL` and `SYNC` are global and potentially disruptive to other > >> guests and so need consideration. > >> > >> All other ITS command like `MOVI`, `DISCARD`, `INV`, `INT`, `CLEAR` > >> just validate and generate physical command. > >> > >> ### `MAPC` command translation > >> > >> Format: `MAPC vCID, vTA` > >> > >- The GITS_TYPER.PAtype is emulated as 0. Hence vTA is always represents > > vcpu number. Hence vTA is validated against physical Collection > > IDs by querying > > ITS driver and corresponding Physical Collection ID is retrieved. > >- Each vITS will have cid_map (struct cid_mapping) which holds mapping > > of > > Why do you speak about each vITS? The emulation is only related to one > vITS and not shared... And each vITS will have a cid_map, which is used. This seems like a reasonable way to express this concept in the context. Perhaps there is a need to include discussion of some of the secondary data structures alongside the defintion `cits_cq`. In which case we could talk about "its associated `cid_map`" and things. > > Virtual Collection ID(vCID), Virtual Target address(vTA) and > > Physical Collection ID (pCID). > > If vCID entry already exists in cid_map, then that particular > > mapping is updated with > > the new pCID and vTA else new entry is made in cid_map > > When you move a collection, you also have to make sure that all the > interrupts associated to it will be delivered to the new target. > > I'm not sure what you are suggesting for that... This is going to be rather painful I fear. 
> >- MAPC pCID, pTA physical ITS command is generated > > We should not send any MAPC command to the physical ITS. The collection > is already mapped during Xen boot and the guest should not be able to > move the physical collection (they are shared between all the guests and > Xen). This needs discussion in the background section, to describe the physical setup which the virtual stuff can make assumption of. > >Here there is no overhead, the cid_map entries are preallocated > > with size of nr_cpus > >in the platform. > > As said the number of collection should be at least nr_cpus + 1. FWIW I read this as "with size appropriate for nr_cpus", which leaves the +1 as implicit. I added the +1 nevertheless. > >> - `MAPC pCID, pTA` physical ITS command is generated > >> > >> ### `MAPD` Command translation > >> > >> Format: `MAPD device, Valid, ITT IPA, ITT Size` > >> > >> `MAPD` is sent with `Valid` bit set if device needs to be added and reset > >> when device is removed. > >> > >> If `Valid` bit is set: > >> > >> - Allocate memory for `its_device` struct > >> - Validate ITT IPA & ITT size and update its_device struct > >> - Find number of vectors(nrvecs) for this device by querying PCI > >> helper function > >> - Allocate nrvecs number of LPI XXX nrvecs is a function of `ITT Size`? > >> - Allocate memory for `struct vlpi_map` for this device. This > >> `vlpi_map` holds mapping of Virtual LPI to Physical LPI and ID. > >> - Find physical ITS node with which this device is associated > >> - Call `p2m_lookup` on ITT IPA addr and get physical ITT address > >> - Validate ITT Size > >> - Generate/format physical ITS command: `MAPD, ITT PA, ITT Size` > >> > >> Here the overhead is with memory allocation for `its_device` and `vlpi_map` > >> > >> XXX Suggestion was to preallocate some of those at device passthrough > >> setup time? > > > > If Validation bit is set: > >- Query its_device tree and get its_device structure for this device. 
> >- (XXX: If pci device is hidden from dom0, does this device is added > >with PHYSDEVOP_pci_device_add hypercall?) > >- If device does not exists return > >- If device exists in RB-tree then > > - Validate ITT IPA & ITT size and update its_device struct > > To validate the ITT size you need to know the number of interrupt ID. Please could you get into the habit of making concrete suggestions for changes to the text. I've no idea what change I should make based on this observation. If not concrete suggestions please try and make the implications of what you are saying clear. > > > - Check if device is already assigned to the domain, > > if not then > >- Find number of vectors(nrvecs) for this device. > >- Allocate nrvecs number of LPI > >- Fetch vlpi_map for this device (preallocated at the > > time of adding > > this device to Xen). This vlpi_map holds mapping of > > Virtual LPI to > >
Re: [Xen-devel] [Draft C] Xen on ARM vITS Handling
On Fri, 2015-05-29 at 14:40 +0100, Julien Grall wrote: > Hi Ian, > > NIT: You used my Linaro email which I think is de-activated now :). I keep finding new address books with that address in them! > > ## ITS Translation Table > > > > Message signalled interrupts are translated into an LPI via an ITS > > translation table which must be configured for each device which can > > generate an MSI. > > I'm not sure what is the ITS Table Table. Did you mean Interrupt > Translation Table? I don't think I wrote Table Table anywhere. I'm referring to the tables which are established by e.g. the MAPD command and friends, e.g. the thing shown in "4.9.12 Notional ITS Table Structure". > > is _not_ guarenteed that a change to the LPI Configuration Table won't > > s/guarenteed/guaranteed/? Or may the first use of this word was wrong? guaranteed is correct, I can never remember it though. > > XXX there are other aspects to virtualising the ITS (LPI collection > > management, assignment of LPI ranges to guests, device > > management). However these are not currently considered here. XXX > > Should they be/do they need to be? > > I think we began to cover these aspect with the section "command emulation". Some aspects, yes. I went with: There are other aspects to virtualising the ITS (LPI collection management, assignment of LPI ranges to guests, device management). However these are only considered here to the extent needed for describing the vITS emulation. > > XXX In the context of virtualised device ids this may not be the case, > > e.g. we can arrange for (mostly) contiguous device ids and we know the > > bound is significantly lower than 2^32 > > Well, the deviceID is computed from the BDF and some DMA alias. As the > algorithm can't be tweaked, it's very likely that we will have > non-contiguous Device ID. See pci_for_each_dma_alias in Linux > (drivers/pci/search.c). 
The implication here is that deviceID is fixed in hardware and is used
by driver domain software in contexts where we do not get the
opportunity to translate; is that right? What contexts are those?

Note that the BDF is also something which we could in principle
virtualise (we already do for domU). Perhaps that is infeasible for dom0
though?

That gives me two thoughts.

The first is that although device identifiers are not necessarily
contiguous, they are generally at least grouped and not allocated at
random through the 2^32 options. For example a PCI Host bridge typically
has a range of device ids associated with it and each device has a
device id derived from that.

I'm not sure if we can leverage that into a more useful data structure
than an R-B tree, or for example arrange for the R-B tree to allow
for the translation of a device within a span into the parent span and
from there do the lookup. Specifically, when looking up a device ID
corresponding to a PCI device we could arrange to find the PCI host
bridge and find the actual device from there. This would keep the RB
tree much smaller and therefore perhaps quicker? Of course that depends
on what the lookup from PCI host bridge to a device looks like.

The second is that perhaps we can do something simpler for the domU
case, if we were willing to tolerate it being different from dom0.

> > Possible efficient data structures would be:
> >
> > 1. List: The lookup/deletion is in O(n) and the insertion will depend
> >    if the device should be sorted following their identifier. The
> >    memory overhead is 18 bytes per element.
> > 2. Red-black tree: All the operations are O(log(n)). The memory
> >    overhead is 24 bytes per element.
> >
> > A Red-black tree seems the more suitable for having fast deviceID
> > validation even though the memory overhead is a bit higher compare to
> > the list.
> >
> > ### Event ID (`vID`)
> >
> > This is the per-device Interrupt identifier (i.e. the MSI index).
It > > is configured by the device driver software. > > > > It is not necessary to translate a `vID`, however they may need to be > > represented in various data structures given to the pITS. > > > > XXX is any of this true? > > > Right, the vID will always be equal to the pID. Although you will need > to associate a physical LPI for every pair (vID, DevID). I think in the terms defined by this document that is (`ID`, `vID`) => an LPI. Right? Have we considered how this mapping will be tracked? > > ### Interrupt Collection (`vCID`) > > > > This parameter is used in commands which manage collections and > > interrupt in order to move them for one CPU to another. The ITS is > > only mandated to implement N + 1 collections where N is the number of > > processor on the platform (i.e max number of VCPUs for a given > > guest). Furthermore, the identifiers are always contiguous. > > > > If we decide to implement the strict minimum (i.e N + 1), an array is > > enough and will allow operations in O(1). > > > > XXX Could forgo array and go straight to vcpu_info/d
Re: [Xen-devel] [Draft C] Xen on ARM vITS Handling
On 01/06/15 13:11, Ian Campbell wrote:
>>> ### Device ID (`ID`)
>>>
>>> This parameter is used by commands which manage a specific device and
>>> the interrupts associated with that device. Checking if a device is
>>> present and retrieving the data structure must be fast.
>>>
>>> The device identifiers may not be assigned contiguously and the maximum
>>> number is very high (2^32).
>>>
>>> XXX In the context of virtualised device ids this may not be the case,
>>> e.g. we can arrange for (mostly) contiguous device ids and we know the
>>> bound is significantly lower than 2^32
>>>
>>> Possible efficient data structures would be:
>>>
>>> 1. List: The lookup/deletion is in O(n) and the insertion will depend
>>>    if the device should be sorted following their identifier. The
>>>    memory overhead is 18 bytes per element.
>>> 2. Red-black tree: All the operations are O(log(n)). The memory
>>>    overhead is 24 bytes per element.
>>>
>>> A Red-black tree seems the more suitable for having fast deviceID
>>> validation even though the memory overhead is a bit higher compare to
>>> the list.
>>
>> When PHYSDEVOP_pci_device_add is called, memory for its_device structure
>> and other needed structure for this device is allocated added to RB-tree
>> with all necessary information
>
> Sounds like a reasonable time to do it. I added something based on your
> words.

Hmmm... The RB-tree suggested is per domain, not per host, and is indexed
by the vDevID.

This is the only way to know quickly whether the domain is able to use
the device, and to retrieve the device. Indeed, the vDevID won't be equal
to the pDevID as the vBDF will be different from the pBDF.

PHYSDEVOP_pci_device_add asks Xen to manage the PCI device. At that
time we don't know which domain the device will be passed through to.

Regards,

-- 
Julien Grall
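
To make the per-domain, vDevID-indexed lookup described above concrete, here is a minimal sketch in C. It is not code from Xen: `struct domain_its`, `vits_find_device` and the field names are hypothetical, and a sorted array searched with the standard `bsearch` stands in for the red-black tree (both give O(log n) lookup; the array is only used to keep the example short).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical per-device record; field names are illustrative only. */
struct its_device {
    uint32_t vdevid;   /* device ID as seen by the guest (lookup key) */
    uint32_t pdevid;   /* device ID used on the physical ITS */
};

/* Hypothetical per-domain table, kept sorted by vdevid. */
struct domain_its {
    struct its_device *devs;
    size_t nr_devs;
};

/* Comparison callback for bsearch: key is a uint32_t vDevID. */
static int cmp_vdevid(const void *key, const void *elem)
{
    uint32_t k = *(const uint32_t *)key;
    uint32_t e = ((const struct its_device *)elem)->vdevid;

    return (k > e) - (k < e);
}

/*
 * O(log n) lookup by vDevID. A NULL result means the device is not
 * assigned to this domain, so the guest command must be rejected.
 */
static struct its_device *vits_find_device(const struct domain_its *d,
                                           uint32_t vdevid)
{
    assert(d != NULL);
    if (d->nr_devs == 0)
        return NULL;
    return bsearch(&vdevid, d->devs, d->nr_devs,
                   sizeof(*d->devs), cmp_vdevid);
}
```

Because the table belongs to one domain, a successful lookup both validates that the guest may use the device and yields the pDevID needed for the physical command.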
Re: [Xen-devel] [Draft C] Xen on ARM vITS Handling
On Wed, 2015-05-27 at 22:14 +0530, Vijay Kilari wrote: > > ## pITS Scheduling > > > > A pITS scheduling pass is attempted: > > > > * On write to any virtual `CWRITER` iff that write results in there > > being new outstanding requests for that vits; > >You mean, scheduling pass (softirq trigger) is triggered iff there is no > ongoing requests from that vits? Yes, this has changed with the switch to only a single outstanding batch. I went with: * On write to any virtual `CWRITER` iff that write results in there being new outstanding requests for that vits which could be consumed by the pits (i.e. subject to only a single batch only being permitted by the scheduler); Although implementationwise it may be OK to defer that decision to the scheduler, rather than try to figure it out in the mmio trap. > > > * On read from a virtual `CREADR` iff there are commands outstanding > > on that vits; > > * On receipt of an interrupt notification arising from Xen's own use > > of `INT`; (see discussion under Completion) > > * On any interrupt injection arising from a guests use of the `INT` > > command; (XXX perhaps, see discussion under Completion) > > > > This may result in lots of contention on the scheduler > > locking. Therefore we consider that in each case all which happens is > > triggering of a softirq which will be processed on return to guest, > > and just once even for multiple events. > > Is it required to have all the cases to trigger scheduling pass? > Just on CWRITER if no ongoing request and on Xen's own completion INT > is not sufficient? I think CREADR is needed too, so the guest sees up to date info. And on injection arising from the guest use of INT is marked as optional here and considered later on. Whether it is needed depends on the decision there. > [...] > > The second option is likely to be preferable if the issue of selecting > > a device ID can be addressed. 
> > > > A secondary question is when these `INT` commands should be inserted > > into the command stream: (Nb, this is a list of options, not a list of places where it must be done) > > > > * After each batch taken from a single `vits_cq`; > >Is this not enough? because Scheduling pass just sends a one batch of > command with Xen's INT command It is almost certainly _sufficient_, the question is more whether it is _necessary_ or whether we can reduce the number of interrupts which are required for correct emulation of a vits, iow can we get away with one of the other two options. The following text argues that only one Xen INT is needed in the stream at any given moment. > > ### Device ID (`ID`) > > > > This parameter is used by commands which manage a specific device and > > the interrupts associated with that device. Checking if a device is > > present and retrieving the data structure must be fast. > > > > The device identifiers may not be assigned contiguously and the maximum > > number is very high (2^32). > > > > XXX In the context of virtualised device ids this may not be the case, > > e.g. we can arrange for (mostly) contiguous device ids and we know the > > bound is significantly lower than 2^32 > > > > Possible efficient data structures would be: > > > > 1. List: The lookup/deletion is in O(n) and the insertion will depend > >if the device should be sorted following their identifier. The > >memory overhead is 18 bytes per element. > > 2. Red-black tree: All the operations are O(log(n)). The memory > >overhead is 24 bytes per element. > > > > A Red-black tree seems the more suitable for having fast deviceID > > validation even though the memory overhead is a bit higher compare to > > the list. > > When PHYSDEVOP_pci_device_add is called, memory for its_device structure > and other needed structure for this device is allocated added to RB-tree > with all necessary information Sounds like a reasonable time to do it. I added something based on your words. 
[...] > > Format: `MAPC vCID, vTA` > > >- The GITS_TYPER.PAtype is emulated as 0. ITYM `GITS_TYPER.PTA`? I've updated various introductory section to reflect the decision to emulate as 0. > > > - `MAPC pCID, pTA` physical ITS command is generated > > > > ### `MAPD` Command translation > > > > Format: `MAPD device, Valid, ITT IPA, ITT Size` > > > > `MAPD` is sent with `Valid` bit set if device needs to be added and reset > > when device is removed. > > > > If `Valid` bit is set: > > > > - Allocate memory for `its_device` struct > > - Validate ITT IPA & ITT size and update its_device struct > > - Find number of vectors(nrvecs) for this device by querying PCI > > helper function > > - Allocate nrvecs number of LPI XXX nrvecs is a function of `ITT Size`? > > - Allocate memory for `struct vlpi_map` for this device. This > > `vlpi_map` holds mapping of Virtual LPI to Physical LPI and ID. > > - Find physical ITS node with which this device is associated > > - Call `p2m_lookup
Re: [Xen-devel] [Draft C] Xen on ARM vITS Handling
Hi Vijay,

On 27/05/15 17:44, Vijay Kilari wrote:
>> ## Command Translation
>>
>> Of the existing GICv3 ITS commands, `MAPC`, `MAPD`, `MAPVI`/`MAPI` are
>> potentially time consuming commands as these commands creates entry in
>> the Xen ITS structures, which are used to validate other ITS commands.
>>
>> `INVALL` and `SYNC` are global and potentially disruptive to other
>> guests and so need consideration.
>>
>> All other ITS command like `MOVI`, `DISCARD`, `INV`, `INT`, `CLEAR`
>> just validate and generate physical command.
>>
>> ### `MAPC` command translation
>>
>> Format: `MAPC vCID, vTA`
>>
>    - The GITS_TYPER.PAtype is emulated as 0. Hence vTA is always represents
>      vcpu number. Hence vTA is validated against physical Collection
>      IDs by querying ITS driver and corresponding Physical Collection
>      ID is retrieved.
>    - Each vITS will have cid_map (struct cid_mapping) which holds mapping of

Why do you speak about each vITS? The emulation is only related to one
vITS and not shared...

>      Virtual Collection ID(vCID), Virtual Target address(vTA) and
>      Physical Collection ID (pCID).
>      If vCID entry already exists in cid_map, then that particular
>      mapping is updated with the new pCID and vTA else new entry is
>      made in cid_map

When you move a collection, you also have to make sure that all the
interrupts associated with it will be delivered to the new target. I'm
not sure what you are suggesting for that...

>    - MAPC pCID, pTA physical ITS command is generated

We should not send any MAPC command to the physical ITS. The collection
is already mapped during Xen boot and the guest should not be able to
move the physical collection (they are shared between all the guests and
Xen).

>
>    Here there is no overhead, the cid_map entries are preallocated
>    with size of nr_cpus in the platform.

As said, the number of collections should be at least nr_cpus + 1.
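
A minimal sketch of the `MAPC` bookkeeping discussed above, written in C under stated assumptions: the `cid_map` is a preallocated array of `nr_cpus + 1` entries indexed by vCID, `GITS_TYPER.PTA` is emulated as 0 so vTA is a vCPU number, and no `MAPC` is forwarded to the physical ITS. `struct vits`, `vits_mapc` and `pcid_for_vcpu` are hypothetical names, and the identity mapping in `pcid_for_vcpu` is a placeholder for whatever the ITS driver would actually report.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NR_CPUS        4u               /* illustrative value only */
#define NR_COLLECTIONS (NR_CPUS + 1u)   /* at least N + 1 collections */

struct cid_mapping {
    bool     valid;
    uint32_t vta;    /* vCPU number, since GITS_TYPER.PTA is emulated as 0 */
    uint32_t pcid;   /* physical collection, already mapped by Xen at boot */
};

struct vits {
    struct cid_mapping cid_map[NR_COLLECTIONS];   /* indexed by vCID */
};

/* Hypothetical helper: physical collection serving a vCPU (identity here). */
static uint32_t pcid_for_vcpu(uint32_t vcpu)
{
    return vcpu;
}

/*
 * Emulate `MAPC vCID, vTA`. Pure bookkeeping in O(1): the entry is
 * created or overwritten and nothing is sent to the physical ITS.
 * Returns 0 on success, -1 if vCID or vTA is out of range.
 */
static int vits_mapc(struct vits *v, uint32_t vcid, uint32_t vta)
{
    assert(v != NULL);
    if (vcid >= NR_COLLECTIONS || vta >= NR_CPUS)
        return -1;

    v->cid_map[vcid].vta   = vta;
    v->cid_map[vcid].pcid  = pcid_for_vcpu(vta);
    v->cid_map[vcid].valid = true;
    return 0;
}
```

The sketch deliberately leaves out the hard part raised in the reply: when an existing vCID is remapped, the interrupts already routed to that collection still have to be redirected to the new target.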
> >> - `MAPC pCID, pTA` physical ITS command is generated >> >> ### `MAPD` Command translation >> >> Format: `MAPD device, Valid, ITT IPA, ITT Size` >> >> `MAPD` is sent with `Valid` bit set if device needs to be added and reset >> when device is removed. >> >> If `Valid` bit is set: >> >> - Allocate memory for `its_device` struct >> - Validate ITT IPA & ITT size and update its_device struct >> - Find number of vectors(nrvecs) for this device by querying PCI >> helper function >> - Allocate nrvecs number of LPI XXX nrvecs is a function of `ITT Size`? >> - Allocate memory for `struct vlpi_map` for this device. This >> `vlpi_map` holds mapping of Virtual LPI to Physical LPI and ID. >> - Find physical ITS node with which this device is associated >> - Call `p2m_lookup` on ITT IPA addr and get physical ITT address >> - Validate ITT Size >> - Generate/format physical ITS command: `MAPD, ITT PA, ITT Size` >> >> Here the overhead is with memory allocation for `its_device` and `vlpi_map` >> >> XXX Suggestion was to preallocate some of those at device passthrough >> setup time? > > If Validation bit is set: >- Query its_device tree and get its_device structure for this device. >- (XXX: If pci device is hidden from dom0, does this device is added >with PHYSDEVOP_pci_device_add hypercall?) >- If device does not exists return >- If device exists in RB-tree then > - Validate ITT IPA & ITT size and update its_device struct To validate the ITT size you need to know the number of interrupt ID. > - Check if device is already assigned to the domain, > if not then >- Find number of vectors(nrvecs) for this device. >- Allocate nrvecs number of LPI >- Fetch vlpi_map for this device (preallocated at the > time of adding > this device to Xen). This vlpi_map holds mapping of > Virtual LPI to > Physical LPI and ID. 
>- Call p2m_lookup on ITT IPA addr and get physical ITT address >- Assign this device to this domain and mark as enabled > - If this device already exists with the domain (Domain is > remapping the device) >- Validate ITT IPA & ITT size and update its_device struct >- Call p2m_lookup on ITT IPA addr and get physical ITT address >- Disable all the LPIs of this device by searching > through vlpi_map and LPI > configuration table Disabling all the LPIs associated to a device can be time consuming because you have to unroute them and make sure that the physical ITS effectively disabled it before sending the MAPD command. Given that the software would be buggy if it send a MAPD command without releasing all the associated interrupt we could ignore the command if any interrupt is still enabled. > > - Generate/format physical ITS command: MAPD, ITT PA, ITT Size > >> >> If Validation bit is not set: >> >> - Validate if the device exits by checking vITS device li
Re: [Xen-devel] [Draft C] Xen on ARM vITS Handling
Hi Ian, NIT: You used my Linaro email which I think is de-activated now :). On 27/05/2015 13:48, Ian Campbell wrote: > Here follows draft C based on previous feedback. > > Also at: > > http://xenbits.xen.org/people/ianc/vits/draftC.{pdf,html} > > I think I've captured most of the previous discussion, except where > explicitly noted by XXX or in other replies, but please do point out > places where I've missed something. > > One area where I am pretty sure I've dropped the ball is on the > completion and update of `CREADR`. That conversation ended up > bifurcating along the 1:N vs N:N mapping scheme lines, and I didn't > manage to get the various proposals straight. Since we've now agreed on > N:N hopefully we can reach a conclusion (no pun intended) on the > completion aspect too (sorry that this probably means rehasing at least > a subset of the previous thread). > > Ian. > > % Xen on ARM vITS Handling > % Ian Campbell > % Draft C > > # Changelog > > ## Since Draft B > > * Details of command translation (thanks to Julien and Vijay) > * Added background on LPI Translation and Pending tablesd > * Added background on Collections > * Settled on `N:N` scheme for vITS:pITS mapping. > * Rejigged section nesting a bit. > * Since we now thing translation should be cheap, settle on >translation at scheduling time. > * Lazy `INVALL` and `SYNC` > > ## Since Draft A > > * Added discussion of when/where command translation occurs. > * Contention on scheduler lock, suggestion to use SOFTIRQ. > * Handling of domain shutdown. > * More detailed discussion of multiple vs single vits pros/cons. > > # Introduction > > ARM systems containing a GIC version 3 or later may contain one or > more ITS logical blocks. An ITS is used to route Message Signalled > interrupts from devices into an LPI injection on the processor. > > The following summarises the ITS hardware design and serves as a set > of assumptions for the vITS software design. 
(XXX it is entirely > possible I've horribly misunderstood how this stuff fits > together). For full details of the ITS see the "GIC Architecture > Specification". > > ## Device Identifiers > > Each device using the ITS is associated with a unique identifier. > > The device IDs are typically described via system firmware, e.g. the > ACPI IORT table or via device tree. > > The number of device ids is variable and can be discovered via > `GITS_TYPER.Devbits`. This field allows an ITS to have up to 2^32 > device. > > ## Interrupt Collections > > Each interrupt is a member of an Interrupt Collection. This allows > software to manage large numbers of physical interrupts with a small > number of commands rather than issuing one command per interrupt. > > On a system with N processors, the ITS must provide at least N+1 > collections. > > ## Target Addresses > > The Target Address correspond to a specific GIC re-distributor. The format > of this field depends on the value of the `GITS_TYPER.PTA` bit: > > * 1: the base address of the re-distributor target is used > * 0: a unique processor number is used. The mapping between the >processor affinity value (`MPIDR`) and the processor number is >discoverable via `GICR_TYPER.ProcessorNumber`. > > ## ITS Translation Table > > Message signalled interrupts are translated into an LPI via an ITS > translation table which must be configured for each device which can > generate an MSI. I'm not sure what is the ITS Table Table. Did you mean Interrupt Translation Table? > > The ITS translation table maps the device id of the originating devic s/devic/device/? > into an Interrupt Collection and then into a target address. > > ## ITS Configuration > > The ITS is configured and managed, including establishing and > configuring Translation Table for each device, via an in memory ring > shared between the CPU and the ITS controller. 
The ring is managed via > the `GITS_CBASER` register and indexed by `GITS_CWRITER` and > `GITS_CREADR` registers. > > A processor adds commands to the shared ring and then updates > `GITS_CWRITER` to make them visible to the ITS controller. > > The ITS controller processes commands from the ring and then updates > `GITS_CREADR` to indicate the the processor that the command has been > processed. > > Commands are processed sequentially. > > Commands sent on the ring include operational commands: > > * Routing interrupts to processors; > * Generating interrupts; > * Clearing the pending state of interrupts; > * Synchronising the command queue > > and maintenance commands: > > * Map device/collection/processor; > * Map virtual interrupt; > * Clean interrupts; > * Discard interrupts; > > The field `GITS_CBASER.Size` encodes the number of 4KB pages minus 0 > consisting of the command queue. This field is 8 bits which means the > maximum size is 2^8 * 4KB = 1MB. Given that each command is 32 bytes, > there is a maximum of 32768 commands in the queue. > > The ITS provides no specific completion notification > mechanism. Completion is
Re: [Xen-devel] [Draft C] Xen on ARM vITS Handling
On Wed, May 27, 2015 at 5:18 PM, Ian Campbell wrote: > Here follows draft C based on previous feedback. > > Also at: > > http://xenbits.xen.org/people/ianc/vits/draftC.{pdf,html} > > I think I've captured most of the previous discussion, except where > explicitly noted by XXX or in other replies, but please do point out > places where I've missed something. > > One area where I am pretty sure I've dropped the ball is on the > completion and update of `CREADR`. That conversation ended up > bifurcating along the 1:N vs N:N mapping scheme lines, and I didn't > manage to get the various proposals straight. Since we've now agreed on > N:N hopefully we can reach a conclusion (no pun intended) on the > completion aspect too (sorry that this probably means rehasing at least > a subset of the previous thread). > > Ian. > > % Xen on ARM vITS Handling > % Ian Campbell > % Draft C > > # Changelog > > ## Since Draft B > > * Details of command translation (thanks to Julien and Vijay) > * Added background on LPI Translation and Pending tablesd > * Added background on Collections > * Settled on `N:N` scheme for vITS:pITS mapping. > * Rejigged section nesting a bit. > * Since we now thing translation should be cheap, settle on > translation at scheduling time. > * Lazy `INVALL` and `SYNC` > > ## Since Draft A > > * Added discussion of when/where command translation occurs. > * Contention on scheduler lock, suggestion to use SOFTIRQ. > * Handling of domain shutdown. > * More detailed discussion of multiple vs single vits pros/cons. > > # Introduction > > ARM systems containing a GIC version 3 or later may contain one or > more ITS logical blocks. An ITS is used to route Message Signalled > interrupts from devices into an LPI injection on the processor. > > The following summarises the ITS hardware design and serves as a set > of assumptions for the vITS software design. (XXX it is entirely > possible I've horribly misunderstood how this stuff fits > together). 
For full details of the ITS see the "GIC Architecture > Specification". > > ## Device Identifiers > > Each device using the ITS is associated with a unique identifier. > > The device IDs are typically described via system firmware, e.g. the > ACPI IORT table or via device tree. > > The number of device ids is variable and can be discovered via > `GITS_TYPER.Devbits`. This field allows an ITS to have up to 2^32 > device. > > ## Interrupt Collections > > Each interrupt is a member of an Interrupt Collection. This allows > software to manage large numbers of physical interrupts with a small > number of commands rather than issuing one command per interrupt. > > On a system with N processors, the ITS must provide at least N+1 > collections. > > ## Target Addresses > > The Target Address correspond to a specific GIC re-distributor. The format > of this field depends on the value of the `GITS_TYPER.PTA` bit: > > * 1: the base address of the re-distributor target is used > * 0: a unique processor number is used. The mapping between the > processor affinity value (`MPIDR`) and the processor number is > discoverable via `GICR_TYPER.ProcessorNumber`. > > ## ITS Translation Table > > Message signalled interrupts are translated into an LPI via an ITS > translation table which must be configured for each device which can > generate an MSI. > > The ITS translation table maps the device id of the originating devic > into an Interrupt Collection and then into a target address. > > ## ITS Configuration > > The ITS is configured and managed, including establishing and > configuring Translation Table for each device, via an in memory ring > shared between the CPU and the ITS controller. The ring is managed via > the `GITS_CBASER` register and indexed by `GITS_CWRITER` and > `GITS_CREADR` registers. > > A processor adds commands to the shared ring and then updates > `GITS_CWRITER` to make them visible to the ITS controller. 
> > The ITS controller processes commands from the ring and then updates > `GITS_CREADR` to indicate the the processor that the command has been > processed. > > Commands are processed sequentially. > > Commands sent on the ring include operational commands: > > * Routing interrupts to processors; > * Generating interrupts; > * Clearing the pending state of interrupts; > * Synchronising the command queue > > and maintenance commands: > > * Map device/collection/processor; > * Map virtual interrupt; > * Clean interrupts; > * Discard interrupts; > > The field `GITS_CBASER.Size` encodes the number of 4KB pages minus 0 > consisting of the command queue. This field is 8 bits which means the > maximum size is 2^8 * 4KB = 1MB. Given that each command is 32 bytes, > there is a maximum of 32768 commands in the queue. > > The ITS provides no specific completion notification > mechanism. Completion is monitored by a combination of a `SYNC` > command and either polling `GITS_CREADR` or notification via an > interrupt generated via the `INT` command. > > Note that the inte
[Xen-devel] [Draft C] Xen on ARM vITS Handling
Here follows draft C based on previous feedback.

Also at:

http://xenbits.xen.org/people/ianc/vits/draftC.{pdf,html}

I think I've captured most of the previous discussion, except where
explicitly noted by XXX or in other replies, but please do point out
places where I've missed something.

One area where I am pretty sure I've dropped the ball is on the
completion and update of `CREADR`. That conversation ended up
bifurcating along the 1:N vs N:N mapping scheme lines, and I didn't
manage to get the various proposals straight. Since we've now agreed on
N:N hopefully we can reach a conclusion (no pun intended) on the
completion aspect too (sorry that this probably means rehashing at least
a subset of the previous thread).

Ian.

% Xen on ARM vITS Handling
% Ian Campbell
% Draft C

# Changelog

## Since Draft B

* Details of command translation (thanks to Julien and Vijay)
* Added background on LPI Translation and Pending tables
* Added background on Collections
* Settled on `N:N` scheme for vITS:pITS mapping.
* Rejigged section nesting a bit.
* Since we now think translation should be cheap, settle on
  translation at scheduling time.
* Lazy `INVALL` and `SYNC`

## Since Draft A

* Added discussion of when/where command translation occurs.
* Contention on scheduler lock, suggestion to use SOFTIRQ.
* Handling of domain shutdown.
* More detailed discussion of multiple vs single vits pros/cons.

# Introduction

ARM systems containing a GIC version 3 or later may contain one or
more ITS logical blocks. An ITS is used to route Message Signalled
interrupts from devices into an LPI injection on the processor.

The following summarises the ITS hardware design and serves as a set
of assumptions for the vITS software design. (XXX it is entirely
possible I've horribly misunderstood how this stuff fits
together). For full details of the ITS see the "GIC Architecture
Specification".

## Device Identifiers

Each device using the ITS is associated with a unique identifier.

The device IDs are typically described via system firmware, e.g. the
ACPI IORT table or via device tree.

The number of device ids is variable and can be discovered via
`GITS_TYPER.Devbits`. This field allows an ITS to have up to 2^32
devices.

## Interrupt Collections

Each interrupt is a member of an Interrupt Collection. This allows
software to manage large numbers of physical interrupts with a small
number of commands rather than issuing one command per interrupt.

On a system with N processors, the ITS must provide at least N+1
collections.

## Target Addresses

The Target Address corresponds to a specific GIC re-distributor. The
format of this field depends on the value of the `GITS_TYPER.PTA` bit:

* 1: the base address of the re-distributor target is used
* 0: a unique processor number is used. The mapping between the
  processor affinity value (`MPIDR`) and the processor number is
  discoverable via `GICR_TYPER.ProcessorNumber`.

## ITS Translation Table

Message signalled interrupts are translated into an LPI via an ITS
translation table which must be configured for each device which can
generate an MSI.

The ITS translation table maps the device id of the originating device
into an Interrupt Collection and then into a target address.

## ITS Configuration

The ITS is configured and managed, including establishing and
configuring a Translation Table for each device, via an in-memory ring
shared between the CPU and the ITS controller. The ring is managed via
the `GITS_CBASER` register and indexed by `GITS_CWRITER` and
`GITS_CREADR` registers.

A processor adds commands to the shared ring and then updates
`GITS_CWRITER` to make them visible to the ITS controller.

The ITS controller processes commands from the ring and then updates
`GITS_CREADR` to indicate to the processor that the command has been
processed.

Commands are processed sequentially.
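
The producer/consumer protocol described above can be sketched in C as follows. This is a toy software model, not driver code: the ring is only 8 slots, `struct cmd_ring`, `ring_post` and `ring_drain` are hypothetical names, the offsets are byte offsets that wrap at the end of the ring, and the "one slot kept empty to tell full from empty" rule is an assumption of this sketch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define ITS_CMD_SIZE 32u                        /* each ITS command is 32 bytes */
#define RING_CMDS    8u                         /* toy ring size, illustrative */
#define RING_BYTES   (RING_CMDS * ITS_CMD_SIZE)

struct its_cmd {
    uint64_t raw[4];                            /* 4 x 64 bits = 32 bytes */
};

struct cmd_ring {
    struct its_cmd slot[RING_CMDS];
    uint32_t cwriter;   /* byte offset of the next free slot (CPU side) */
    uint32_t creadr;    /* byte offset of the next command to process (ITS side) */
};

static void ring_init(struct cmd_ring *r)
{
    assert(sizeof(struct its_cmd) == ITS_CMD_SIZE);
    memset(r, 0, sizeof(*r));
}

/* Advance a byte offset by one command, wrapping at the end of the ring. */
static uint32_t ring_next(uint32_t off)
{
    return (off + ITS_CMD_SIZE) % RING_BYTES;
}

/* CPU side: copy a command in, then publish it by moving cwriter. */
static int ring_post(struct cmd_ring *r, const struct its_cmd *c)
{
    uint32_t next = ring_next(r->cwriter);

    if (next == r->creadr)
        return -1;                              /* ring full */
    r->slot[r->cwriter / ITS_CMD_SIZE] = *c;
    r->cwriter = next;
    return 0;
}

/* ITS side: consume commands strictly in order until creadr == cwriter. */
static unsigned int ring_drain(struct cmd_ring *r)
{
    unsigned int done = 0;

    while (r->creadr != r->cwriter) {
        /* A real ITS would execute r->slot[r->creadr / ITS_CMD_SIZE] here. */
        r->creadr = ring_next(r->creadr);
        done++;
    }
    return done;
}
```

In this model the CPU learns that its commands have completed by observing `creadr` catch up with `cwriter`, which is the polling form of completion.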

Commands sent on the ring include operational commands:

* Routing interrupts to processors;
* Generating interrupts;
* Clearing the pending state of interrupts;
* Synchronising the command queue

and maintenance commands:

* Map device/collection/processor;
* Map virtual interrupt;
* Clean interrupts;
* Discard interrupts;

The field `GITS_CBASER.Size` encodes the number of 4KB pages, minus
one, making up the command queue. This field is 8 bits which means the
maximum size is 2^8 * 4KB = 1MB. Given that each command is 32 bytes,
there is a maximum of 32768 commands in the queue.

The ITS provides no specific completion notification
mechanism. Completion is monitored by a combination of a `SYNC`
command and either polling `GITS_CREADR` or notification via an
interrupt generated via the `INT` command.

Note that the interrupt generation via `INT` requires an originating
device ID to be supplied (which is then translated via the ITS into an
LPI). No specific device ID is defined for this purpose and so the OS
software is expected to fabricate one.

Possible ways of inventing such a device ID are:

* Enume