[Xen-devel] [Draft Design v3] ACPI/IORT Support in Xen.
ACPI/IORT Support in Xen -- Draft 3

Revision History:

Changes since v2:
- Modified as per comments from Julien/Sameer/Andre.

Changes since v1:
- Modified IORT parsing data structures.
- Added RID-StreamID and RID-DeviceID maps as per Andre's suggestion.
- Added reference code which can be read along with this document.
- Removed the domctl for DomU; it will be covered in the PCI-PT design.

Introduction:

I had sent out a patch series [0] to hide the SMMU from the Dom0 IORT. This document is a rework of that series as it:
(a) extends the scope by parsing the IORT table once and storing it in in-memory data structures, which can then be used for querying. This eliminates the need to parse the complete IORT table multiple times.
(b) makes the generation of IORT tables for domains independent, using a set of helper routines.

Index
1. What is IORT? What are its components?
2. Current Support in Xen
3. IORT for Dom0
4. IORT for DomU
5. Parsing of IORT in Xen
6. Generation of IORT
7. Implementation Phases
8. References

1. IORT Structure

IORT refers to the Input Output Remapping Table. It is essentially used to find information about the IO topology (PCIRC-SMMU-ITS) and the relationships between devices.

A general structure of IORT [1]: it has nodes for PCI RC, SMMU, ITS and platform devices. Using an IORT table, the relationship RID -> StreamID -> DeviceID can be obtained. The IORT table describes which device is behind which SMMU and which interrupt controller. Some PCI RCs may not be behind an SMMU and directly map RID -> DeviceID.

A RID is a requester ID in the PCI context, a StreamID is the ID of the device in the SMMU context, and a DeviceID is the ID programmed into the ITS.

Each IORT node contains an ID map array to translate one ID into another:

    IDmap Entry {input_range, output_range, output_node_ref, id_count}

This array is associated with the PCI RC node, SMMU node and named component node, and can reference an SMMU or ITS node.

2. Support of IORT

It is proposed in this document to parse the IORT once and use the information to translate a RID without traversing the IORT again and again. Xen also prepares an IORT table for Dom0 based on the host IORT. For a DomU, an IORT table is required only in the case of device passthrough.

3. IORT for Dom0

The IORT for Dom0 is based on the host IORT. A few nodes could be removed or modified. For instance:
- Host SMMU nodes should not be present, as only Xen should touch the SMMU.
- Platform devices (named components) would be passed through as is.
The visibility criterion for Dom0 is TBD.

4. IORT for DomU

The IORT for a DomU should be generated by the toolstack. The IORT table is only present in the case of device passthrough. At a minimum, a DomU IORT should include a single PCIRC and an ITS group. A similar PCIRC can be added in the DSDT. The exact structure of the DomU IORT will be covered along with the PCI PT design.

5. Parsing of IORT in Xen

IORT nodes can be saved in structures so that IORT table parsing is done once and reused by all Xen subsystems (ITS/SMMU, domain creation, etc.). Proposed structures to hold the IORT information [4]:

struct rid_map_struct {
    void *pcirc_node;
    u16 input_base;
    u32 output_base;
    u16 id_count;
    struct list_head entry;
};

Two global variables hold the maps:

struct list_head rid_streamid_map;
struct list_head rid_deviceid_map;

5.1 Functions to query the StreamID and DeviceID from a RID:

void query_streamid(void *pcirc_node, u16 rid, u32 *streamid);
void query_deviceid(void *pcirc_node, u16 rid, u32 *deviceid);

Adding a mapping is done via helper functions:

int add_rid_streamid_map(void *pcirc_node, u32 ib, u32 ob, u32 idc);
int add_rid_deviceid_map(void *pcirc_node, u32 ib, u32 ob, u32 idc);

- The rid_streamid_map is straightforward and is created using the PCI RC's idmap.
- The rid_deviceid_map is created by translating StreamIDs to DeviceIDs; the fixup_rid_deviceid_map function does that. (See [6])

To keep the API similar to Linux, iort_node_map_rid maps to query_streamid.

6. IORT Generation

It is proposed to have a common helper library to generate the IORT for Dom0/DomU. Note: it is desirable to share the IORT generation code between the toolstack and Xen.

a. For Dom0, the rid_deviceid_map can be used directly to generate the Dom0 IORT table. The exclusion of nodes is still open for suggestions.
b. For DomU, the minimal structure is discussed in section 4. It will be discussed further in the context of the PCI PT design.

7. Implementation Phases

a. IORT parsing and RID query
b. IORT generation for Dom0
c. IORT generation for DomU

8. References

[0] https://www.mail-archive.com/xen-devel@lists.xen.org/msg121667.html
[1] ARM DEN0049C:
Re: [Xen-devel] [RFC] [Draft Design v2] ACPI/IORT Support in Xen.
On 11/16/2017 5:23 PM, Julien Grall wrote:

Hi Manish,

On 16/11/17 11:46, Manish Jaggi wrote:
On 11/16/2017 5:07 PM, Julien Grall wrote:
On 16/11/17 07:39, Manish Jaggi wrote:
On 11/14/2017 6:53 PM, Julien Grall wrote:

3. IORT for Dom0 - IORT for Dom0 is based on host iort. Few nodes could be removed or modified. For instance - Host SMMU nodes should not be present as Xen should only touch it. - platform nodes (named components) may be controlled by xen command line.

I am not sure where this example comes from? As I said, there is no plan to support platform device passthrough with ACPI. A better example here would be removing the PMCG.

It came from review comments on my previous IORT SMMU hiding patch. Andre suggested that platform nodes are needed: "After some brainstorming with Julien we found two problems: 1) This only covers RC nodes, but not "named components" (platform devices), which we will need. ..."
From: https://www.mail-archive.com/xen-devel@lists.xen.org/msg123434.html

I think you misunderstood my comment here... What I call "device passthrough" is giving access to a device to a domain other than the Hardware Domain. There is no plan for supporting platform device passthrough on ACPI, and I don't understand why you would like to control that using the command line. What Andre was saying is that your series was not covering the "named components" for the Hardware Domain.

Section 3 is IORT for Dom0, where I mentioned that some platform devices can be hidden from Dom0. So your comment on platform device passthrough might not be valid then, as that is for DomUs only. Regarding the visibility of a platform device for Dom0, I took my cue from your comment below.

Where did I ever mention the command line solution? Please stop trying to put words in my mouth. There are other reasons than passthrough to hide a device from the Hardware Domain.

Let's put some clarity on the below items, specifically for Dom0:
a. Can platform devices be part of the Dom0 IORT?
b. If (a) yes, then how to decide on a finer grain the visibility of platform devices for Dom0? Update ACPI tables to remove the device?
c. Is fine-grained visibility of platform devices for Dom0 to be covered in my current patchset?

"This has two benefits: ... 3) We could decide in a finer grain which devices (e.g platform device) Dom0 can see."
From: https://www.mail-archive.com/xen-devel@lists.xen.org/msg124534.html

Cheers,

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel
Re: [Xen-devel] [RFC] [Draft Design v2] ACPI/IORT Support in Xen.
On 11/14/2017 6:53 PM, Julien Grall wrote: Hi Manish, Hey Julien, On 08/11/17 14:38, Manish Jaggi wrote: ACPI/IORT Support in Xen. -- Draft 2 Revision History: Changes since v1- - Modified IORT Parsing data structures. - Added RID->StreamID and RID->DeviceID map as per Andre's suggestion. - Added reference code which can be read along with this document. - Removed domctl for DomU, it would be covered in PCI-PT design. Introduction: - I had sent out patch series [0] to hide smmu from Dom0 IORT. This document is a rework of the series as it: (a) extends scope by adding parsing of IORT table once and storing it in in-memory data structures, which can then be used for querying. This would eliminate the need to parse complete iort table multiple times. (b) Generation of IORT for domains be independent using a set of helper routines. Index 1. What is IORT. What are its components ? 2. Current Support in Xen 3. IORT for Dom0 4. IORT for DomU 5. Parsing of IORT in Xen 6. Generation of IORT 7. Implementation Phases 8. References 1. IORT Structure ? IORT refers to Input Output remapping table. It is essentially used to find information about the IO topology (PCIRC-SMMU-ITS) and relationships between devices. A general structure of IORT [1]: It has nodes for PCI RC, SMMU, ITS and Platform devices. Using an IORT table relationship between RID -> StreamID -> DeviceId can be obtained. Which device is behind which SMMU and which interrupt controller, topology is described in IORT Table. Some PCI RC may be not behind an SMMU, and directly map RID->DeviceID. RID is a requester ID in PCI context, StreamID is the ID of the device in SMMU context, DeviceID is the ID programmed in ITS. Each iort_node contains an ID map array to translate one ID into another. IDmap Entry {input_range, output_range, output_node_ref, id_count} This array is associated with PCI RC node, SMMU node, Named component node. and can reference to a SMMU or ITS node. 2. 
Current Support of IORT --- IORT is proposed to be used by Xen to setup SMMU's and platform devices and for translating RID->StreamID and RID->DeviceID. I am not sure to understand "to setup SMMU's and platform devices...". With IORT, a software can discover list of SMMUs and the IDs to configure the ITS and SMMUs for each device (e.g PCI, integrated...) on the platform. You will not be able to discover the list of platform devices through it. Also, it is not really "proposed". It is the only way to get those information from ACPI. ok, I will rephrase it. It is proposed in this document to parse iort once and use the information to translate RID without traversing IORT again and again. Also Xen prepares an IORT table for dom0 based on host IORT. For DomU IORT table proposed only in case of device passthrough. 3. IORT for Dom0 - IORT for Dom0 is based on host iort. Few nodes could be removed or modified. For instance - Host SMMU nodes should not be present as Xen should only touch it. - platform nodes (named components) may be controlled by xen command line. I am not sure where does this example come from? As I said, there are no plan to support Platform Device passthrough with ACPI. A better example here would removing PMCG. It came from review comments on my previous IORT SMMU hiding patch. Andre suggested that Platform Nodes are needed. After some brainstorming with Julien we found two problems: 1) This only covers RC nodes, but not "named components" (platform devices), which we will need. ... From: https://www.mail-archive.com/xen-devel@lists.xen.org/msg123434.html 4. IORT for DomU - IORT for DomU should be generated by toolstack. IORT table is only present in case of device passthrough. At a minimum domU IORT should include a single PCIRC and ITS Group. Similar PCIRC can be added in DSDT. The exact structure of DomU IORT would be covered along with PCI PT design. 5. 
Parsing of IORT in Xen -- IORT nodes can be saved in structures so that IORT table parsing can be done once and is reused by all xen subsystems like ITS / SMMU etc, domain creation. Proposed are the structures to hold IORT information. [4]

struct rid_map_struct {
    void *pcirc_node;
    u16 ib;  /* Input base */
    u32 ob;  /* Output base */
    u16 idc; /* Id Count */
    struct list_head entry;
};

struct iort_ref {
    struct list_head rid_streamId_map;
    struct list_head rid_deviceId_map;
} iortref;

5.1 Functions to query StreamID and DeviceID from RID.

void query_streamId(void *pcirc_node, u16 rid, u32 *streamId);
void query_deviceId(void *pcirc_node, u16 rid, u32 *deviceId);

Adding a mapping is done via helper functions:

int add_rid_streamId_map(void *pcirc_node, u32 ib, u32 ob, u32 idc)
int add_rid_deviceId_map(void *pcirc_node, u32 ib, u32 ob, u32 idc)
Re: [Xen-devel] [RFC v2 5/7] acpi:arm64: Add support for parsing IORT table
Hi Sameer

On 9/21/2017 6:07 AM, Sameer Goel wrote:

Add support for parsing IORT table to initialize SMMU devices.
* The code for creating an SMMU device has been modified, so that the SMMU device can be initialized.
* The NAMED NODE code has been commented out as this will need DOM0 kernel support.
* ITS code has been included but it has not been tested.

Signed-off-by: Sameer Goel

Followup of the discussions we had on iort parsing and querying streamID and deviceId based on RID. I have extended your patchset with a patch that provides an alternative way of parsing iort into maps: {rid-streamid}, {rid-deviceID}, which can directly be looked up when searching the streamId for a rid. This will remove the need to traverse the iort table again. The test patch just describes the proposed flow and how the parsing and query code might fit in. I have not tested it; the code only compiles.

https://github.com/mjaggi-cavium/xen-wip/commit/df006d64bdbb5c8344de5a710da8bf64c9e8edd5
(This repo has all 7 of your patches + the test code patch merged.)

Note: The commit text of the patch describes the basic flow / assumptions / usage of functions. Please see the code along with the v2 design draft:
[RFC] [Draft Design v2] ACPI/IORT Support in Xen.
https://lists.xen.org/archives/html/xen-devel/2017-11/msg00512.html

I seek your advice on this. Please provide your feedback.
Thanks Manish --- xen/arch/arm/setup.c | 3 + xen/drivers/acpi/Makefile | 1 + xen/drivers/acpi/arm/Makefile | 1 + xen/drivers/acpi/arm/iort.c| 173 + xen/drivers/passthrough/arm/smmu.c | 1 + xen/include/acpi/acpi_iort.h | 17 ++-- xen/include/asm-arm/device.h | 2 + xen/include/xen/acpi.h | 21 + xen/include/xen/pci.h | 8 ++ 9 files changed, 146 insertions(+), 81 deletions(-) create mode 100644 xen/drivers/acpi/arm/Makefile diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c index 92f173b..4ba09b2 100644 --- a/xen/arch/arm/setup.c +++ b/xen/arch/arm/setup.c @@ -49,6 +49,7 @@ #include #include #include +#include struct bootinfo __initdata bootinfo; @@ -796,6 +797,8 @@ void __init start_xen(unsigned long boot_phys_offset, tasklet_subsys_init(); +/* Parse the ACPI iort data */ +acpi_iort_init(); xsm_dt_init(); diff --git a/xen/drivers/acpi/Makefile b/xen/drivers/acpi/Makefile index 444b11d..e7ffd82 100644 --- a/xen/drivers/acpi/Makefile +++ b/xen/drivers/acpi/Makefile @@ -1,5 +1,6 @@ subdir-y += tables subdir-y += utilities +subdir-$(CONFIG_ARM) += arm subdir-$(CONFIG_X86) += apei obj-bin-y += tables.init.o diff --git a/xen/drivers/acpi/arm/Makefile b/xen/drivers/acpi/arm/Makefile new file mode 100644 index 000..7c039bb --- /dev/null +++ b/xen/drivers/acpi/arm/Makefile @@ -0,0 +1 @@ +obj-y += iort.o diff --git a/xen/drivers/acpi/arm/iort.c b/xen/drivers/acpi/arm/iort.c index 2e368a6..7f54062 100644 --- a/xen/drivers/acpi/arm/iort.c +++ b/xen/drivers/acpi/arm/iort.c @@ -14,17 +14,47 @@ * This file implements early detection/parsing of I/O mapping * reported to OS through firmware via I/O Remapping Table (IORT) * IORT document number: ARM DEN 0049A + * + * Based on Linux drivers/acpi/arm64/iort.c + * => commit ca78d3173cff3503bcd15723b049757f75762d15 + * + * Xen modification: + * Sameer Goel + * Copyright (C) 2017, The Linux Foundation, All rights reserved. 
+ * */ -#define pr_fmt(fmt) "ACPI: IORT: " fmt - -#include -#include -#include -#include -#include -#include -#include +#include +#include +#include +#include +#include +#include +#include + +#include + +/* Xen: Define compatibility functions */ +#define FW_BUG "[Firmware Bug]: " +#define pr_err(fmt, ...) printk(XENLOG_ERR fmt, ## __VA_ARGS__) +#define pr_warn(fmt, ...) printk(XENLOG_WARNING fmt, ## __VA_ARGS__) + +/* Alias to Xen allocation helpers */ +#define kfree xfree +#define kmalloc(size, flags)_xmalloc(size, sizeof(void *)) +#define kzalloc(size, flags)_xzalloc(size, sizeof(void *)) + +/* Redefine WARN macros */ +#undef WARN +#undef WARN_ON +#define WARN(condition, format...) ({ \ + int __ret_warn_on = !!(condition); \ + if (unlikely(__ret_warn_on))\ + printk(format); \ + unlikely(__ret_warn_on);\ +}) +#define WARN_TAINT(cond, taint, format...) WARN(cond, format) +#define WARN_ON(cond) (!!cond) #define IORT_TYPE_MASK(type) (1 << (type)) #define IORT_MSI_TYPE (1 << ACPI_IORT_NODE_ITS_GROUP) @@ -256,6 +286,13 @@ static acpi_status iort_match_node_callback(struct acpi_iort_node *node, acpi_status status; if (node->type ==
[Xen-devel] [RFC] [Draft Design v2] ACPI/IORT Support in Xen.
ACPI/IORT Support in Xen. -- Draft 2 Revision History: Changes since v1- - Modified IORT Parsing data structures. - Added RID->StreamID and RID->DeviceID map as per Andre's suggestion. - Added reference code which can be read along with this document. - Removed domctl for DomU, it would be covered in PCI-PT design. Introduction: - I had sent out patch series [0] to hide smmu from Dom0 IORT. This document is a rework of the series as it: (a) extends scope by adding parsing of IORT table once and storing it in in-memory data structures, which can then be used for querying. This would eliminate the need to parse complete iort table multiple times. (b) Generation of IORT for domains be independent using a set of helper routines. Index 1. What is IORT. What are its components ? 2. Current Support in Xen 3. IORT for Dom0 4. IORT for DomU 5. Parsing of IORT in Xen 6. Generation of IORT 7. Implementation Phases 8. References 1. IORT Structure ? IORT refers to Input Output remapping table. It is essentially used to find information about the IO topology (PCIRC-SMMU-ITS) and relationships between devices. A general structure of IORT [1]: It has nodes for PCI RC, SMMU, ITS and Platform devices. Using an IORT table relationship between RID -> StreamID -> DeviceId can be obtained. Which device is behind which SMMU and which interrupt controller, topology is described in IORT Table. Some PCI RC may be not behind an SMMU, and directly map RID->DeviceID. RID is a requester ID in PCI context, StreamID is the ID of the device in SMMU context, DeviceID is the ID programmed in ITS. Each iort_node contains an ID map array to translate one ID into another. IDmap Entry {input_range, output_range, output_node_ref, id_count} This array is associated with PCI RC node, SMMU node, Named component node. and can reference to a SMMU or ITS node. 2. 
Current Support of IORT --- IORT is proposed to be used by Xen to setup SMMU's and platform devices and for translating RID->StreamID and RID->DeviceID. It is proposed in this document to parse iort once and use the information to translate RID without traversing IORT again and again. Also Xen prepares an IORT table for dom0 based on host IORT. For DomU IORT table proposed only in case of device passthrough.

3. IORT for Dom0 - IORT for Dom0 is based on host iort. Few nodes could be removed or modified. For instance - Host SMMU nodes should not be present as Xen should only touch it. - platform nodes (named components) may be controlled by xen command line.

4. IORT for DomU - IORT for DomU should be generated by toolstack. IORT table is only present in case of device passthrough. At a minimum domU IORT should include a single PCIRC and ITS Group. Similar PCIRC can be added in DSDT. The exact structure of DomU IORT would be covered along with PCI PT design.

5. Parsing of IORT in Xen -- IORT nodes can be saved in structures so that IORT table parsing can be done once and is reused by all xen subsystems like ITS / SMMU etc, domain creation. Proposed are the structures to hold IORT information. [4]

struct rid_map_struct {
    void *pcirc_node;
    u16 ib;  /* Input base */
    u32 ob;  /* Output base */
    u16 idc; /* Id Count */
    struct list_head entry;
};

struct iort_ref {
    struct list_head rid_streamId_map;
    struct list_head rid_deviceId_map;
} iortref;

5.1 Functions to query StreamID and DeviceID from RID.

void query_streamId(void *pcirc_node, u16 rid, u32 *streamId);
void query_deviceId(void *pcirc_node, u16 rid, u32 *deviceId);

Adding a mapping is done via helper functions:

int add_rid_streamId_map(void *pcirc_node, u32 ib, u32 ob, u32 idc)
int add_rid_deviceId_map(void *pcirc_node, u32 ib, u32 ob, u32 idc)

- rid-streamId map is straightforward and is created using pci_rc's idmap
- rid-deviceId map is created by translating streamIds to deviceIds.
fixup_rid_deviceId_map function does that. (See [6]) It is proposed that query functions should replace functions like iort_node_map_rid which is currently used in linux and is imported in Xen in the patchset [2][5] 5.2 Proposed Flow of parsing The flow is based on the patchset in [5]. I have added a reference code on top of it which does IORT parsing as described in this section. The code is available at [6]. The commit also describes the code flow and assumptions. 6. IORT Generation --- It is proposed to have a common helper library to generate IORT for dom0/U. Note: it is desired to have IORT generation code sharing between toolstack and Xen. a. For Dom0 rid_deviceId_map can be used directly to generate dom0 IORT table. Exclusions of nodes is still open for suggestions. b. For DomU Minimal structure is discussed in section 4. It will be further discussed in the context of PCI
Re: [Xen-devel] [RFC] [Draft Design] ACPI/IORT Support in Xen.
On 10/31/2017 5:03 AM, Goel, Sameer wrote: On 10/12/2017 3:03 PM, Manish Jaggi wrote: ACPI/IORT Support in Xen. -- I had sent out patch series [0] to hide smmu from Dom0 IORT. Extending the scope and including all that is required to support ACPI/IORT in Xen. Presenting for review first _draft_ of design of ACPI/IORT support in Xen. Not complete though. Discussed is the parsing and generation of IORT table for Dom0 and DomUs. It is proposed that IORT be parsed and the information in saved into xen data-structure say host_iort_struct and is reused by all xen subsystems like ITS / SMMU etc. Since this is first draft is open to technical comments, modifications and suggestions. Please be open and feel free to add any missing points / additions. 1. What is IORT. What are its components ? 2. Current Support in Xen 3. IORT for Dom0 4. IORT for DomU 5. Parsing of IORT in Xen 6. Generation of IORT 7. Future Work and TODOs 1. What is IORT. What are its components ? IORT refers to Input Output remapping table. It is essentially used to find information about the IO topology (PCIRC-SMMU-ITS) and relationships between devices. A general structure of IORT is has nodes which have information about PCI RC, SMMU, ITS and Platform devices. Using an IORT table relationship between RID -> StreamID -> DeviceId can be obtained. More specifically which device is behind which SMMU and which interrupt controller, this topology is described in IORT Table. RID is a requester ID in PCI context, StreamID is the ID of the device in SMMU context, DeviceID is the ID programmed in ITS. For a non-pci device RID could be simply an ID. Each iort_node contains an ID map array to translate from one ID into another. IDmap Entry {input_range, output_range, output_node_ref, id_count} This array is present in PCI RC node,SMMU node, Named component node etc and can reference to a SMMU or ITS node. 2. Current Support of IORT --- Currently Xen passes host IORT table to dom0 without any modifications. 
For DomU no IORT table is passed.

3. IORT for Dom0 - IORT for Dom0 is prepared by xen and it is fairly similar to the host iort. However few nodes could be removed or modified. For instance
- host SMMU nodes should not be present
- ITS group nodes are same as host iort but, no stage2 mapping is done for them.
- platform nodes (named components) may be selectively present depending on the case where xen is using some. This could be controlled by xen command line.
- More items: TODO

4. IORT for DomU --- IORT for DomU is generated by the toolstack. IORT topology is different when DomU supports device passthrough. At a minimum domU IORT should include a single PCIRC and ITS Group. Similar PCIRC can be added in DSDT. Additional nodes can be added if a platform device is assigned to a domU. No extra node should be required for PCI device pass-through. It is proposed that the idrange of PCIRC and ITS group be constant for domUs. In case of PCI PT, using a domctl the toolstack can communicate physical RID: virtual RID, deviceID: virtual deviceID to xen. It is assumed that domU PCI config access would be trapped in Xen. The RID at which the assigned device is enumerated would be the one provided by the domctl, domctl_set_deviceid_mapping. TODO: device assign domctl i/f. Note: This should suffice for the virtual deviceID support pointed out by Andre. [4] We might not need this domctl if the assign_device hypercall is extended to provide this information.

5. Parsing of IORT in Xen -- I think a Linux-like approach will solve the following use cases:
1. Identify the SMMU devices and initialize the devices as needed.
2. API function to setup SMMUs in response to a discovery notification from DOM0
- We will still need a path for non-PCIe devices.
- I agree with Andre that the use cases for the named nodes in IORT should be treated the same as PCIe RC devices.
3. The concept of fwnode is still valid as per 4.14 and we can try to reuse most of the parsing code.
The idea is: parse once, use at multiple places.
- IORT creation for Dom0
- smmu init
- finding the smmu for a deviceID when pci_assign_device is called by dom0

Manish, I looked at your old patch and had a couple of questions before I comment more on this design. From an initial glance, it seems that you should be able to hide SMMUs by calling the already defined API functions in the iort.c implementation (for the most part :)).

Yes, some of the parsing functions can be replaced with APIs.

I am wondering if we really need to keep a list of parsed nodes. Or which use case apart from the hw dom IORT mandates this?

For all cases, I believe, where a mapping lookup of Devid-smmu-pcirc is required. IORT nodes can be saved in structures so that IORT table parsing can be done once and is reused by all xen subsystems like ITS / SMMU etc,
Re: [Xen-devel] [RFC] [Draft Design] ACPI/IORT Support in Xen.
On 10/27/2017 7:35 PM, Andre Przywara wrote: Hi, Hey Andre, On 25/10/17 09:22, Manish Jaggi wrote: On 10/23/2017 7:27 PM, Andre Przywara wrote: Hi Manish, On 12/10/17 22:03, Manish Jaggi wrote: ACPI/IORT Support in Xen. -- I had sent out patch series [0] to hide smmu from Dom0 IORT. Extending the scope and including all that is required to support ACPI/IORT in Xen. Presenting for review first _draft_ of design of ACPI/IORT support in Xen. Not complete though. Discussed is the parsing and generation of IORT table for Dom0 and DomUs. It is proposed that IORT be parsed and the information in saved into xen data-structure say host_iort_struct and is reused by all xen subsystems like ITS / SMMU etc. Since this is first draft is open to technical comments, modifications and suggestions. Please be open and feel free to add any missing points / additions. 1. What is IORT. What are its components ? 2. Current Support in Xen 3. IORT for Dom0 4. IORT for DomU 5. Parsing of IORT in Xen 6. Generation of IORT 7. Future Work and TODOs 1. What is IORT. What are its components ? IORT refers to Input Output remapping table. It is essentially used to find information about the IO topology (PCIRC-SMMU-ITS) and relationships between devices. A general structure of IORT is has nodes which have information about PCI RC, SMMU, ITS and Platform devices. Using an IORT table relationship between RID -> StreamID -> DeviceId can be obtained. More specifically which device is behind which SMMU and which interrupt controller, this topology is described in IORT Table. RID is a requester ID in PCI context, StreamID is the ID of the device in SMMU context, DeviceID is the ID programmed in ITS. For a non-pci device RID could be simply an ID. Each iort_node contains an ID map array to translate from one ID into another. IDmap Entry {input_range, output_range, output_node_ref, id_count} This array is present in PCI RC node,SMMU node, Named component node etc and can reference to a SMMU or ITS node. 2. 
Current Support of IORT --- Currently Xen passes host IORT table to dom0 without any modifications. For DomU no IORT table is passed. 3. IORT for Dom0 - IORT for Dom0 is prepared by xen and it is fairly similar to the host iort. However few nodes could be removed removed or modified. For instance - host SMMU nodes should not be present - ITS group nodes are same as host iort but, no stage2 mapping is done for them. What do you mean with stage2 mapping? Please ignore this line. Copy paste error. Read it as follows - ITS group nodes are same as host iort. (though I would modify the same as in next draft) - platform nodes (named components) may be selectively present depending on the case where xen is using some. This could be controlled by xen command line. Mmh, I am not so sure platform devices described in the IORT (those which use MSIs!) are so much different from PCI devices here. My understanding is those platform devices are network adapters, for instance, for which Xen has no use. ok. So I would translate "Named Components" or "platform devices" as devices just not using the PCIe bus (so no config space and no (S)BDF), but being otherwise the same from an ITS or SMMU point of view. Correct. - More items : TODO I think we agreed upon rewriting the IORT table instead of patching it? yes. In fact if you look at my patch v2 on IORT SMMU hiding, it was _rewriting_ most of Dom0 IORT and not patching it. I was just after the wording above: "IORT for Dom0 is prepared by xen and it is fairly similar to the host iort. However few nodes could be removed removed or modified." ... which sounds a bit like you alter the h/w IORT. It would be good to clarify this by explicitly mentioning the parsing/generation cycle, as this is a fundamental design decision. Sure will do that. Thanks for pointing that. We can have a IRC discussion on this. 
I think apart from rewriting, the other tasks which were required that are handled in this epic task - parse IORT and save in xen internal data structures - common code to generate IORT for dom0/domU - All xen code that parses IORT multiple times use now the xen internal data structures. Yes, that sounds about right. :) (I have explained this in this mail below) So to some degree your statements are true, but when we rewrite the IORT table without SMMUs (and possibly without other components like the PMUs), it would be kind of a stretch to call it "fairly similar to the host IORT". I think "based on the host IORT" would be more precise. Yes. Based on host IORT is better,thanks. 4. IORT for DomU - IORT for DomU is generated by the toolstack. IORT topology is different when DomU supports device passthrough. Can you elaborate on that? Different compared to what? My understanding i
Re: [Xen-devel] [RFC] [Draft Design] ACPI/IORT Support in Xen.
On 10/23/2017 8:26 PM, Julien Grall wrote:
Hi,
On 23/10/17 14:57, Andre Przywara wrote:
On 12/10/17 22:03, Manish Jaggi wrote:

It is proposed that the idrange of PCIRC and ITS group be constant for domUs.

"constant" is a bit confusing here. Maybe "arbitrary", "from scratch" or "independent from the actual h/w"?

I don't think we should tie it to anything here. IORT for DomU will get some input; it could be the same as the host or something generated (not necessarily constant). That's an implementation detail and might be up to the user.

In case of PCI PT, using a domctl the toolstack can communicate physical RID : virtual RID and deviceID : virtual deviceID to Xen. It is assumed that DomU PCI config access would be trapped in Xen. The RID at which the assigned device is enumerated would be the one provided by the domctl, domctl_set_deviceid_mapping.
TODO: device assign domctl i/f.
Note: This should suffice for the virtual deviceID support pointed out by Andre. [4]

Well, there's more to it. First thing: while I tried to allow virtual ITS deviceIDs to be different from physical ones, at the moment they are fixed to being mapped 1:1 in the code. So the first step would be to go over the ITS code and identify where "devid" refers to a virtual deviceID and where to a physical one (probably renaming them accordingly). Then we would need a function to translate between the two. At the moment this would be a dummy function (just return the input value). Later we would loop in the actual table.

We might not need this domctl if the assign_device hypercall is extended to provide this information.

Do we actually need a new interface or even extend the existing one? If I got Julien correctly, the existing interface is just fine?

In the first place, I am not sure to understand why a domctl is mentioned in this document.

I have answered this in reply to Andre's mail. Please refer to that. (Just avoiding duplication.)

I can understand why you want to describe the information used for DomU IORT.
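The virtual/physical deviceID split suggested above can be sketched as two steps. This is a hypothetical illustration, not an existing Xen interface: the struct, function names, and the one-entry demo map are all made up; only the two-step plan (1:1 dummy first, table lookup later) comes from the mail.

```c
#include <assert.h>
#include <stdint.h>

/* One entry of the per-domain map the toolstack would hand over,
 * e.g. via the proposed domctl_set_deviceid_mapping (hypothetical). */
struct devid_map_entry {
    uint32_t virt_devid; /* deviceID the guest uses               */
    uint32_t phys_devid; /* deviceID programmed into the host ITS */
};

/* Step 1: dummy translation, virtual == physical (today's 1:1 code). */
static uint32_t vdevid_to_pdevid_dummy(uint32_t virt_devid)
{
    return virt_devid;
}

/* Step 2 (later): consult the map set up at device-assignment time,
 * falling back to 1:1 when the ID was never remapped. */
static uint32_t vdevid_to_pdevid(const struct devid_map_entry *map,
                                 unsigned int nr, uint32_t virt_devid)
{
    for (unsigned int i = 0; i < nr; i++)
        if (map[i].virt_devid == virt_devid)
            return map[i].phys_devid;
    return vdevid_to_pdevid_dummy(virt_devid);
}

/* Self-test helper with a one-entry made-up map. */
static uint32_t demo_translate(uint32_t virt_devid)
{
    static const struct devid_map_entry map[] = { { 0x10, 0x4010 } };

    return vdevid_to_pdevid(map, 1, virt_devid);
}
```

The point of the split is that the vITS code only ever calls the translation function, so swapping the dummy for the real table is a local change.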
But it does not matter how this ties into the rest of the passthrough work.

Passthrough could be PCI device PT or platform device passthrough.

[...]

6. IORT Generation
---
There would be common code to generate the IORT table from iort_table_struct.

That sounds useful, but we would need to be careful with sharing code between Xen and the toolstack. Has this actually been done before?

Yes, see libelf for instance. But I think there is a terminology problem here. Skimming the rest of the e-mail I see: "populate a basic IORT in a buffer passed by toolstack (using a domctl : domctl_prepare_dom_iort)". By sharing code, I meant creating a library that would be compiled in both the hypervisor and the toolstack. It might need more work.

I have answered this in reply to Andre's mail. Please refer to that.

But as I said before, this is not the purpose now. The purpose is finally getting support of IORT in the hypervisor, with the generation of the IORT for Dom0 fully separated from the parsing.

That's not the only purpose; I have described the tasks in reply to Andre's mail. Please refer to that.

a. For Dom0 the structure (iort_table_struct) would be modified to remove SMMU nodes and update id_mappings: PCIRC idmap -> output reference to ITS group (RID -> DeviceID).
TODO: Describe the algorithm in the update_id_mapping function used to map RID -> DeviceID in my earlier patch [3].

If the above approach works, this would become a simple list iteration, creating PCI RC nodes with the appropriate pointer to the ITS nodes.

b. For DomU
- iort_table_struct would have a minimal 2 nodes (1 PCIRC and 1 ITS group)
- populate a basic IORT in a buffer passed by the toolstack (using a domctl : domctl_prepare_dom_iort)

I think we should reduce this to iterating the same data structure as for Dom0. Each passed-through PCI device would possibly create one struct instance, and later on we do the same iteration as we do for Dom0.
If that proves to be simple enough, we might even live with the code duplication between Xen and the toolstack.

I think you summarize quite well what I have been saying in the previous thread. Thank you :).

Cheers,
___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel
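The "common generation code" idea discussed in this message — one routine that walks the in-memory nodes and serialises them into a caller-provided buffer, usable from both Xen (Dom0) and the toolstack (DomU) — can be sketched as follows. The node layout is a made-up stand-in: real code would emit proper ACPI IORT records with headers and ID maps, and domctl_prepare_dom_iort is only a proposed name.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-in for one serialisable IORT node. */
struct iort_blob_node {
    uint8_t type;     /* 0 = ITS group, 1 = PCIRC, ... */
    uint32_t payload; /* e.g. an ID-map summary        */
};

/* Serialise the node list into a caller-owned buffer, matching the
 * "populate a basic IORT in a buffer passed by toolstack" flow.
 * Returns bytes written, or -1 if the buffer is too small. */
static int generate_iort(const struct iort_blob_node *nodes, unsigned int nr,
                         uint8_t *buf, size_t len)
{
    size_t need = (size_t)nr * sizeof(*nodes);

    if (need > len)
        return -1;
    memcpy(buf, nodes, need); /* placeholder for per-type record emission */
    return (int)need;
}

/* Self-test helper: a minimal DomU-style table (1 ITS group, 1 PCIRC),
 * serialised into a buffer capped at buflen bytes. */
static int demo_generate(size_t buflen)
{
    const struct iort_blob_node nodes[2] = { { 0, 0 }, { 1, 0x8000 } };
    uint8_t buf[64];

    return generate_iort(nodes, 2, buf,
                         buflen < sizeof(buf) ? buflen : sizeof(buf));
}
```

Because the routine only touches its arguments, the same source file could be compiled into both the hypervisor and libxl, which is the libelf-style sharing Julien refers to.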
Re: [Xen-devel] [RFC] [Draft Design] ACPI/IORT Support in Xen.
On 10/23/2017 7:27 PM, Andre Przywara wrote:
Hi Manish,
On 12/10/17 22:03, Manish Jaggi wrote:

ACPI/IORT Support in Xen.
--
I had sent out patch series [0] to hide the SMMU from the Dom0 IORT. This extends the scope, including all that is required to support ACPI/IORT in Xen. Presenting for review the first _draft_ of the design of ACPI/IORT support in Xen; not complete though. Discussed are the parsing and generation of the IORT table for Dom0 and DomUs.

It is proposed that the IORT be parsed once and the information saved into a Xen data structure, say host_iort_struct, which is reused by all Xen subsystems like ITS / SMMU etc. Since this is a first draft it is open to technical comments, modifications and suggestions. Please be open and feel free to add any missing points / additions.

1. What is IORT. What are its components ?
2. Current Support in Xen
3. IORT for Dom0
4. IORT for DomU
5. Parsing of IORT in Xen
6. Generation of IORT
7. Future Work and TODOs

1. What is IORT. What are its components ?

IORT refers to the Input Output Remapping Table. It is essentially used to find information about the IO topology (PCIRC-SMMU-ITS) and relationships between devices. A general structure of IORT: it has nodes which carry information about PCI RC, SMMU, ITS and platform devices. Using an IORT table the relationship RID -> StreamID -> DeviceID can be obtained. More specifically, which device is behind which SMMU and which interrupt controller - this topology is described in the IORT table.

RID is a requester ID in the PCI context, StreamID is the ID of the device in the SMMU context, DeviceID is the ID programmed in the ITS. For a non-PCI device the RID could simply be an ID.

Each iort_node contains an ID map array to translate from one ID into another:
IDmap Entry {input_range, output_range, output_node_ref, id_count}
This array is present in PCI RC nodes, SMMU nodes, Named component nodes etc. and can reference an SMMU or ITS node.

2.
Current Support of IORT
---
Currently Xen passes the host IORT table to Dom0 without any modifications. For DomU no IORT table is passed.

3. IORT for Dom0
-
IORT for Dom0 is prepared by Xen and it is fairly similar to the host IORT. However, a few nodes could be removed or modified. For instance:
- host SMMU nodes should not be present
- ITS group nodes are the same as in the host IORT but no stage2 mapping is done for them.

What do you mean with stage2 mapping?

Please ignore this line; copy-paste error. Read it as follows: ITS group nodes are the same as in the host IORT. (Though I would modify the same in the next draft.)

- platform nodes (named components) may be selectively present, depending on the case where Xen is using some. This could be controlled by the Xen command line.

Mmh, I am not so sure platform devices described in the IORT (those which use MSIs!) are so much different from PCI devices here.

My understanding is those platform devices are network adapters, for instance, for which Xen has no use.

ok. So I would translate "Named Components" or "platform devices" as devices just not using the PCIe bus (so no config space and no (S)BDF), but being otherwise the same from an ITS or SMMU point of view.

Correct.

- More items: TODO

I think we agreed upon rewriting the IORT table instead of patching it?

Yes. In fact, if you look at my patch v2 on IORT SMMU hiding, it was _rewriting_ most of the Dom0 IORT, not patching it. We can have an IRC discussion on this.

I think apart from rewriting, the other tasks handled in this epic are:
- parse the IORT and save it in Xen-internal data structures
- common code to generate the IORT for Dom0/DomU
- all Xen code that currently parses the IORT multiple times now uses the Xen-internal data structures.
(I have explained this in this mail below.)

So to some degree your statements are true, but when we rewrite the IORT table without SMMUs (and possibly without other components like the PMUs), it would be kind of a stretch to call it "fairly similar to the host IORT". I think "based on the host IORT" would be more precise.

Yes, "based on the host IORT" is better, thanks.

4. IORT for DomU
-
IORT for DomU is generated by the toolstack. IORT topology is different when DomU supports device passthrough.

Can you elaborate on that? Different compared to what? My understanding is that without device passthrough there would be no IORT in the first place?

I was exploring the possibility of having virtual devices for DomU. So if a virtual device is assigned to a guest, there needs to be some mapping in the IORT as well. This virtual device can be on a PCI bus or be a platform device. Device pass-through can be split into two parts:
a. platform device passthrough (not on a PCI bus)
b. PCI device PT
=> If we discount the possibility of a virtual device for DomU and platform device passthrough, then you are correct.
Re: [Xen-devel] [RFC v2 5/7] acpi:arm64: Add support for parsing IORT table
On 10/19/2017 8:30 PM, Goel, Sameer wrote: On 10/10/2017 6:36 AM, Manish Jaggi wrote: Hi Sameer, On 9/21/2017 6:07 AM, Sameer Goel wrote: Add support for parsing IORT table to initialize SMMU devices. * The code for creating an SMMU device has been modified, so that the SMMU device can be initialized. * The NAMED NODE code has been commented out as this will need DOM0 kernel support. * ITS code has been included but it has not been tested. Could you please refactor this patch into another set of two patches. I am planning to rebase my IORT for Dom0 Hiding patch rework on this patch. I will try to break this up. Lets discuss this a bit more next week. Please have a look at the draft design. [1] [1] https://www.mail-archive.com/xen-devel@lists.xen.org/msg125951.html Thanks, Manish Signed-off-by: Sameer Goel <sg...@codeaurora.org> --- xen/arch/arm/setup.c | 3 + xen/drivers/acpi/Makefile | 1 + xen/drivers/acpi/arm/Makefile | 1 + xen/drivers/acpi/arm/iort.c| 173 + xen/drivers/passthrough/arm/smmu.c | 1 + xen/include/acpi/acpi_iort.h | 17 ++-- xen/include/asm-arm/device.h | 2 + xen/include/xen/acpi.h | 21 + xen/include/xen/pci.h | 8 ++ 9 files changed, 146 insertions(+), 81 deletions(-) create mode 100644 xen/drivers/acpi/arm/Makefile diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c index 92f173b..4ba09b2 100644 --- a/xen/arch/arm/setup.c +++ b/xen/arch/arm/setup.c @@ -49,6 +49,7 @@ #include #include #include +#include struct bootinfo __initdata bootinfo; @@ -796,6 +797,8 @@ void __init start_xen(unsigned long boot_phys_offset, tasklet_subsys_init(); +/* Parse the ACPI iort data */ +acpi_iort_init(); xsm_dt_init(); diff --git a/xen/drivers/acpi/Makefile b/xen/drivers/acpi/Makefile index 444b11d..e7ffd82 100644 --- a/xen/drivers/acpi/Makefile +++ b/xen/drivers/acpi/Makefile @@ -1,5 +1,6 @@ subdir-y += tables subdir-y += utilities +subdir-$(CONFIG_ARM) += arm subdir-$(CONFIG_X86) += apei obj-bin-y += tables.init.o diff --git a/xen/drivers/acpi/arm/Makefile 
b/xen/drivers/acpi/arm/Makefile new file mode 100644 index 000..7c039bb --- /dev/null +++ b/xen/drivers/acpi/arm/Makefile @@ -0,0 +1 @@ +obj-y += iort.o diff --git a/xen/drivers/acpi/arm/iort.c b/xen/drivers/acpi/arm/iort.c index 2e368a6..7f54062 100644 --- a/xen/drivers/acpi/arm/iort.c +++ b/xen/drivers/acpi/arm/iort.c @@ -14,17 +14,47 @@ * This file implements early detection/parsing of I/O mapping * reported to OS through firmware via I/O Remapping Table (IORT) * IORT document number: ARM DEN 0049A + * + * Based on Linux drivers/acpi/arm64/iort.c + * => commit ca78d3173cff3503bcd15723b049757f75762d15 + * + * Xen modification: + * Sameer Goel <sg...@codeaurora.org> + * Copyright (C) 2017, The Linux Foundation, All rights reserved. + * */ -#define pr_fmt(fmt)"ACPI: IORT: " fmt - -#include -#include -#include -#include -#include -#include -#include +#include +#include +#include +#include +#include +#include +#include + +#include + +/* Xen: Define compatibility functions */ +#define FW_BUG"[Firmware Bug]: " +#define pr_err(fmt, ...) printk(XENLOG_ERR fmt, ## __VA_ARGS__) +#define pr_warn(fmt, ...) printk(XENLOG_WARNING fmt, ## __VA_ARGS__) + +/* Alias to Xen allocation helpers */ +#define kfree xfree +#define kmalloc(size, flags)_xmalloc(size, sizeof(void *)) +#define kzalloc(size, flags)_xzalloc(size, sizeof(void *)) + +/* Redefine WARN macros */ +#undef WARN +#undef WARN_ON +#define WARN(condition, format...) ({\ +int __ret_warn_on = !!(condition);\ +if (unlikely(__ret_warn_on))\ +printk(format);\ +unlikely(__ret_warn_on);\ +}) +#define WARN_TAINT(cond, taint, format...) 
WARN(cond, format) +#define WARN_ON(cond) (!!cond) #define IORT_TYPE_MASK(type)(1 << (type)) #define IORT_MSI_TYPE(1 << ACPI_IORT_NODE_ITS_GROUP) @@ -256,6 +286,13 @@ static acpi_status iort_match_node_callback(struct acpi_iort_node *node, acpi_status status; if (node->type == ACPI_IORT_NODE_NAMED_COMPONENT) { +status = AE_NOT_IMPLEMENTED; +/* + * We need the namespace object name from dsdt to match the iort node, this + * will need additions to the kernel xen bus notifiers. + * So, disabling the named node code till a proposal is approved. + */ +#if 0 struct acpi_buffer buf = { ACPI_ALLOCATE_BUFFER, NULL }; struct acpi_device *adev = to_acpi_device_node(dev->fwnode); struct acpi_iort_named_component *ncomp; @@ -275,11 +312,12 @@ static acpi_status iort_match_node_callback(struct acpi_iort_node *node,
Re: [Xen-devel] [PATCH v2 0/2] ARM: ACPI: IORT: Hide SMMU from hardware domain's IORT table
On 10/12/2017 5:14 PM, Julien Grall wrote:
On 12/10/17 12:22, Manish Jaggi wrote:

Hi Julien,
Why do you omit parts of the mail where I have asked a question? Please avoid skipping; that removes the context.

I believe I answered it just after, because you asked the same thing twice. So maybe I dropped the context, but the answer was there... For your convenience, here is the replicated answer:
"Why? The generation of IORT is fairly standalone. And again, this was a suggestion to share in the future and an expectation for this series. What I care the most about is the generation being fully separated from the rest."

I raised a valid point and it was totally ignored, and you asked me to explain the rationale on a later point. So if you choose to ignore my first point, how can I put any point?

Well, maybe you should read the e-mail more carefully, because your points have been addressed. If they are not, then please say so rather than accusing the reviewers of not spending enough time on your series...

[...]

Now if you see, both the codes are quite similar and there is redundancy in libxl and in Xen code for preparing ACPI tables for Dom0 and DomU. The point I am raising is quite clear: if all other tables like MADT, XSDT, RSDP, GTDT etc. do not share common generation code with Xen, what is so special about IORT? Either we move all generation into common code or keep redundancy for IORT. I hope I have shown the code and made the point quite clear. Please provide a technical answer rather than a simple "Why".

Why do you still continue arguing about how this is going to interact with libxl, when your only work now (as I stated in every single e-mail) is for Dom0? If the generation is generic enough, it will require little code to interface. After all, you only need:
- information (e.g. DeviceID, MasterID...)
- a buffer for writing the generated IORT
So now it is maybe time for you to suggest an interface we can discuss.

Sure. A quick draft is shared on the mailing list.
[1] https://marc.info/?l=xen-devel=150784236208192=2

Cheers,
[Xen-devel] [RFC] [Draft Design] ACPI/IORT Support in Xen.
ACPI/IORT Support in Xen.
--
I had sent out patch series [0] to hide the SMMU from the Dom0 IORT. This extends the scope, including all that is required to support ACPI/IORT in Xen. Presenting for review the first _draft_ of the design of ACPI/IORT support in Xen; not complete though. Discussed are the parsing and generation of the IORT table for Dom0 and DomUs.

It is proposed that the IORT be parsed once and the information saved into a Xen data structure, say host_iort_struct, which is reused by all Xen subsystems like ITS / SMMU etc. Since this is a first draft it is open to technical comments, modifications and suggestions. Please be open and feel free to add any missing points / additions.

1. What is IORT. What are its components ?
2. Current Support in Xen
3. IORT for Dom0
4. IORT for DomU
5. Parsing of IORT in Xen
6. Generation of IORT
7. Future Work and TODOs

1. What is IORT. What are its components ?

IORT refers to the Input Output Remapping Table. It is essentially used to find information about the IO topology (PCIRC-SMMU-ITS) and relationships between devices. A general structure of IORT: it has nodes which carry information about PCI RC, SMMU, ITS and platform devices. Using an IORT table the relationship RID -> StreamID -> DeviceID can be obtained. More specifically, which device is behind which SMMU and which interrupt controller - this topology is described in the IORT table.

RID is a requester ID in the PCI context, StreamID is the ID of the device in the SMMU context, DeviceID is the ID programmed in the ITS. For a non-PCI device the RID could simply be an ID.

Each iort_node contains an ID map array to translate from one ID into another:
IDmap Entry {input_range, output_range, output_node_ref, id_count}
This array is present in PCI RC nodes, SMMU nodes, Named component nodes etc. and can reference an SMMU or ITS node.

2. Current Support of IORT
---
Currently Xen passes the host IORT table to Dom0 without any modifications. For DomU no IORT table is passed.

3.
IORT for Dom0
-
IORT for Dom0 is prepared by Xen and it is fairly similar to the host IORT. However, a few nodes could be removed or modified. For instance:
- host SMMU nodes should not be present
- ITS group nodes are the same as in the host IORT but no stage2 mapping is done for them
- platform nodes (named components) may be selectively present, depending on the case where Xen is using some. This could be controlled by the Xen command line.
- More items: TODO

4. IORT for DomU
-
IORT for DomU is generated by the toolstack. IORT topology is different when DomU supports device passthrough. At a minimum the DomU IORT should include a single PCIRC and ITS group. A similar PCIRC can be added in the DSDT. An additional node can be added if a platform device is assigned to a DomU. No extra node should be required for PCI device pass-through.

It is proposed that the idrange of PCIRC and ITS group be constant for DomUs. In case of PCI PT, using a domctl the toolstack can communicate physical RID : virtual RID and deviceID : virtual deviceID to Xen. It is assumed that DomU PCI config access would be trapped in Xen. The RID at which the assigned device is enumerated would be the one provided by the domctl, domctl_set_deviceid_mapping.
TODO: device assign domctl i/f.
Note: This should suffice for the virtual deviceID support pointed out by Andre. [4]
We might not need this domctl if the assign_device hypercall is extended to provide this information.

5. Parsing of IORT in Xen
--
IORT nodes can be saved in structures so that IORT table parsing is done once and reused by all Xen subsystems like ITS / SMMU etc. and domain creation. Proposed are the structures to hold IORT information, very similar to the ACPI structures:

struct iort_id_map {
    range_t input_range;
    range_t output_range;
    void *output_reference;
    ...
};
=> output_reference points to an object of iort_node.

struct iort_node {
    struct list_head id_map;
    void *context;
    struct list_head list;
};
=> context could be a reference to an acpi_iort_node.
struct iort_table_struct {
    struct list_head pci_rc_nodes;
    struct list_head smmu_nodes;
    struct list_head plat_devices;
    struct list_head its_group;
};

This structure is created at the point the IORT table is parsed, say from acpi_iort_init. It is proposed to use this structure's information in iort_init_platform_devices. [2] [RFC v2 4/7] ACPI: arm: Support for IORT

6. IORT Generation
---
There would be common code to generate the IORT table from iort_table_struct.

a. For Dom0 the structure (iort_table_struct) would be modified to remove SMMU nodes and update id_mappings: PCIRC idmap -> output reference to ITS group (RID -> DeviceID).
TODO: Describe the algorithm in the update_id_mapping function used to map RID -> DeviceID in my earlier patch [3].

b. For DomU
- iort_table_struct would have a minimal 2 nodes (1 PCIRC and 1 ITS group)
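The parse-once flow that fills iort_table_struct can be sketched in a few lines. This is an illustrative userspace sketch under stated assumptions: ACPI node types are mocked as an enum, and a bare next pointer replaces Xen's struct list_head; the real acpi_iort_init would iterate the firmware table's node headers instead.

```c
#include <assert.h>
#include <stddef.h>

/* Mocked node types; the real code would use ACPI_IORT_NODE_* values. */
enum iort_node_type {
    IORT_ITS_GROUP,
    IORT_NAMED_COMPONENT,
    IORT_PCI_RC,
    IORT_SMMU,
};

struct iort_node {
    enum iort_node_type type;
    void *context;         /* would reference the acpi_iort_node */
    struct iort_node *next;
};

struct iort_table_struct {
    struct iort_node *pci_rc_nodes;
    struct iort_node *smmu_nodes;
    struct iort_node *plat_devices;
    struct iort_node *its_group;
};

/* One step of the single parse pass: push a node onto its type list,
 * so ITS/SMMU init and domain build never reparse the ACPI table. */
static void iort_sort_node(struct iort_table_struct *t, struct iort_node *n)
{
    struct iort_node **head;

    switch (n->type) {
    case IORT_PCI_RC:          head = &t->pci_rc_nodes; break;
    case IORT_SMMU:            head = &t->smmu_nodes;   break;
    case IORT_NAMED_COMPONENT: head = &t->plat_devices; break;
    default:                   head = &t->its_group;    break;
    }
    n->next = *head;
    *head = n;
}

/* Self-test helper: sort three mock nodes, count the SMMU list. */
static int demo_count_smmus(void)
{
    struct iort_node a = { IORT_SMMU, NULL, NULL };
    struct iort_node b = { IORT_PCI_RC, NULL, NULL };
    struct iort_node c = { IORT_SMMU, NULL, NULL };
    struct iort_table_struct t = { NULL, NULL, NULL, NULL };
    int n = 0;

    iort_sort_node(&t, &a);
    iort_sort_node(&t, &b);
    iort_sort_node(&t, &c);
    for (struct iort_node *i = t.smmu_nodes; i; i = i->next)
        n++;
    return n;
}
```

Dropping the SMMU list on the floor (and fixing up the ID maps) is then exactly the Dom0 generation step described in 6.a.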
Re: [Xen-devel] [PATCH v2 0/2] ARM: ACPI: IORT: Hide SMMU from hardware domain's IORT table
Hi Julien,

Why do you omit parts of the mail where I have asked a question? Please avoid skipping; that removes the context. I raised a valid point and it was totally ignored, and you asked me to explain the rationale on a later point. So if you choose to ignore my first point, how can I put any point?

This is what I have asked:

>> The ACPI tables for DomU are generated by the toolstack today. So I
>> don't see why we would change that to support IORT.
>>
>> However, you can have a file shared between the toolstack and Xen
>> that contains the generation of IORT.
>>
>> For instance, this is what we already do with libelf (see
>> common/libelf).

This will amount to adding a function make_iort in libxl__prepare_acpi, which would use the common code that is also used to generate the Dom0 IORT (domain_build.c). Correct? If we go by this logic, then libxl__prepare_acpi and domain_build.c should use common code for all ACPI tables. Are you suggesting we change that as well and make it part of common code?

The code in domain_build.c and in libxl__prepare_acpi is very similar; see the code below.

static int prepare_acpi(struct domain *d, struct kernel_info *kinfo)
{
    d->arch.efi_acpi_table = alloc_xenheap_pages(order, 0);
    ...
    rc = acpi_create_fadt(d, tbl_add);
    if ( rc != 0 )
        return rc;
    rc = acpi_create_madt(d, tbl_add);
    if ( rc != 0 )
        return rc;
    rc = acpi_create_stao(d, tbl_add);
    if ( rc != 0 )
        return rc;
    rc = acpi_create_xsdt(d, tbl_add);
    if ( rc != 0 )
        return rc;
    rc = acpi_create_rsdp(d, tbl_add);
    if ( rc != 0 )
        return rc;
    ...
}

int libxl__prepare_acpi(libxl__gc *gc, libxl_domain_build_info *info,
                        struct xc_dom_image *dom)
{
    ...
    rc = libxl__allocate_acpi_tables(gc, info, dom, acpitables);
    if (rc) goto out;
    make_acpi_rsdp(gc, dom, acpitables);
    make_acpi_xsdt(gc, dom, acpitables);
    make_acpi_gtdt(gc, dom, acpitables);
    rc = make_acpi_madt(gc, dom, info, acpitables);
    if (rc) goto out;
    make_acpi_fadt(gc, dom, acpitables);
    make_acpi_dsdt(gc, dom, acpitables);
out:
    return rc;
}

Now if you see, both the codes are quite similar and there is redundancy in libxl and in Xen code for preparing ACPI tables for Dom0 and DomU. The point I am raising is quite clear: if all other tables like MADT, XSDT, RSDP, GTDT etc. do not share common generation code with Xen, what is so special about IORT? Either we move all generation into common code or keep redundancy for IORT. I hope I have shown the code and made the point quite clear. Please provide a technical answer rather than a simple "Why".

Cheers!
Manish

On 10/12/2017 4:34 PM, Julien Grall wrote:
Hello,
On 12/10/17 07:11, Manish Jaggi wrote:
On 10/6/2017 7:54 PM, Julien Grall wrote:

I am not asking to write the DomU support, but at least have a full separation between the Parsing and Generation. So we could easily adapt and re-use the code when we get the DomU support.

I got your point, but as of today there is no code reuse for most of ACPI tables. So code reuse _only_ for IORT but not for ACPI is not the correct approach.

Why? The generation of IORT is fairly standalone. And again, this was a suggestion to share in the future and an expectation for this series. What I care the most about is the generation being fully separated from the rest.

Also this is part of the PCI passthrough flow, so that also might change a few things. But from the PoV of Dom0 SMMU hiding, it is a different flow and is coupled with PCI PT.

I think 1) can be solved using this series as a base. I have quite some comments ready for the patches; shall we follow this route? 2) obviously would change the game completely. We need to sit down and design this properly.
Probably this means that Xen parses the IORT and builds internal representations of the mappings.

Can you please add more detail on the internal representations of the mappings?

What exactly do you want? This is likely going to be decided once you have looked at what the expected interaction between IORT and Xen is.

More details on this line: "Probably this means that Xen parses the IORT and builds internal representations of the mappings."

I think you have enough meat in this thread to come up with a proposition based on the feedback. Once you send it, we can have a discussion and find agreement.

[...]

The IORT for the hardware domain is just a specific case, as it is based on pre-existing information. But because of removing nodes (e.g. SMMU nodes and probably the PMU nodes), it is basically a full re-write. So I would consider fully separating the logic of generating the IORT table from the host IORT table. By that I mean not browsing the host IORT when generating the host one.

By "the host" you mean the Dom0 IORT?

Yes. Something on the lines
Re: [Xen-devel] [PATCH v2 0/2] ARM: ACPI: IORT: Hide SMMU from hardware domain's IORT table
On 10/6/2017 7:54 PM, Julien Grall wrote:
Hello,
On 04/10/17 06:22, Manish Jaggi wrote:
On 10/4/2017 12:12 AM, Julien Grall wrote:
On 25/09/17 05:22, Manish Jaggi wrote:
On 9/22/2017 7:42 PM, Andre Przywara wrote:
Hi Manish,
On 11/09/17 22:33, mja...@caviumnetworks.com wrote:
From: Manish Jaggi <mja...@cavium.com>

The set is divided into two patches. The first one calculates the size of the IORT, while the second one writes the IORT table itself.

It would be good if you could give a quick introduction *why* this set is needed here (and introduce IORT to the casual reader). In general some more high-level documentation on your functions would be good, as it took me quite some time to understand what each function does.

ok, will add more documentation.

So my understanding is:
phase 1:
- go over each entry in each RC node

Rather than each entry (which could be a large number), I am taking the complete range and checking it with the same logic. If the ID range is a subset or a super-set of an ID range in the SMMU, a new ID range is created. So if a pci_rc node has an id map {p_input_base, p_output_base, p_out_ref, p_count} and it has an output reference to an SMMU node with id-map {s_input_base, s_output_base, s_out_ref, s_count}, then based on the s_count and s_input/p_output the new id-map is created as {p_input, s_output, s_out_ref, adjusted_count}. The update_id_mapping function does that. So I am following the same logic. We can chat over IRC / I can give a code walk-through...

- if that points to an SMMU node, go over each outgoing ITS entry and find overlaps with this RC entry
- for each overlap create a new entry in a list with this RC pointing to the ITS directly

phase 2, creating the new IORT:
- go over each RC node
  - if that points to an ITS, copy through IORT entries
  - if that points to an SMMU, replace with the remapped entries
- go over each ITS node
  - copy through IORT entries

That's exactly what this patch does. What are your comments on the current patch approach to hide SMMU nodes?
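The range-merge step described above (the heart of update_id_mapping) can be sketched as a window-folding function: intersect the PCIRC map's output range with the SMMU map's input range, and shift both bases by the same offset. This is a standalone sketch of that arithmetic, with the output node reference elided and all concrete ranges made up.

```c
#include <assert.h>
#include <stdint.h>

/* One ID-map window: [input_base, input_base+count) maps 1:1-offset to
 * [output_base, output_base+count). Mirrors the mail's
 * {input_base, output_base, out_ref, count} tuples, minus out_ref. */
struct id_window {
    uint32_t input_base;
    uint32_t output_base;
    uint32_t count;
};

/* Fold a PCIRC->SMMU window p through an SMMU->ITS window s to get a
 * direct PCIRC->ITS (RID -> DeviceID) window. On overlap, write the
 * folded window to out and return 1; otherwise return 0. */
static int fold_id_windows(const struct id_window *p,
                           const struct id_window *s,
                           struct id_window *out)
{
    uint32_t lo = p->output_base > s->input_base ? p->output_base
                                                 : s->input_base;
    uint32_t p_end = p->output_base + p->count;
    uint32_t s_end = s->input_base + s->count;
    uint32_t hi = p_end < s_end ? p_end : s_end;

    if (lo >= hi)
        return 0; /* PCIRC output IDs never reach this SMMU window */

    out->input_base  = p->input_base + (lo - p->output_base);
    out->output_base = s->output_base + (lo - s->input_base);
    out->count       = hi - lo; /* the "adjusted_count" of the mail */
    return 1;
}

/* Self-test helper: RIDs [0x0,0x80) -> StreamIDs [0x100,0x180), and an
 * SMMU mapping StreamIDs [0x140,0x240) -> DeviceIDs from 0x2000. */
static struct id_window demo_fold(void)
{
    const struct id_window p = { 0x00, 0x100, 0x80 };
    const struct id_window s = { 0x140, 0x2000, 0x100 };
    struct id_window out = { 0, 0, 0 };

    fold_id_windows(&p, &s, &out);
    return out;
}
```

Running every PCIRC window against every window of its referenced SMMU node yields the direct RID -> DeviceID map used when emitting the SMMU-less Dom0 IORT.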
I have answered your comments, see below.

I am not sure I understand the two sentences above. How are they related?

IMHO we can reuse most of the fixup code here.

That's your choice, as long as it is properly documented and fits the end goal.

So I believe this would do the trick and you end up with an efficient representation of the IORT without SMMUs - at least for RC nodes. After some brainstorming with Julien we found two problems:

1) This only covers RC nodes, but not "named components" (platform devices), which we will need. That should be fixable by removing the hardcoded IORT node types in the code and treating NC nodes like RC nodes.

Yes, so first we can take this as a base; once this is OK, I can add support for named components.

2) Eventually we will need *virtual* deviceID support, for DomUs. Now we could start introducing that already, also doing some virtual mapping for Dom0. The ITS code would then translate each virtual device ID that Dom0 requests into a hardware device ID. I agree that this means a lot more work, but we will need it anyway.

I am a bit surprised that you answered the e-mail but didn't provide any opinion on 2).

Apologies for that.

I am a bit surprised that you answered the e-mail but didn't provide any opinion on 2). Sorry to surprise you twice :) Damn, I moved the sentence but forgot to drop the original one.

IMHO it was a bit obvious for DomU and I was waiting to hear what others would say on this, as in (2) below. Moreover we need to discuss IORT generation for DomU - it could be done by the xl tools, or Xen should do it.

The ACPI tables for DomU are generated by the toolstack today. So I don't see why we would change that to support IORT. However, you can have a file shared between the toolstack and Xen that contains the generation of IORT. For instance, this is what we already do with libelf (see common/libelf).
This will amount to adding a function make_iort in libxl__prepare_acpi, which would use the common code that is also used to generate the Dom0 IORT (domain_build.c). Correct? If we go by this logic, then libxl__prepare_acpi and domain_build.c should use common code for all ACPI tables. Are you suggesting we change that as well and make it part of common code?

I am not asking to write the DomU support, but at least have a full separation between the Parsing and Generation. So we could easily adapt and re-use the code when we get the DomU support.

I got your point, but as of today there is no code reuse for most of ACPI tables. So code reuse _only_ for IORT but not for ACPI is not the correct approach. Also this is part of the PCI passthrough flow, so that also might change a few things. But from the PoV of Dom0 SMMU hiding, it is a different flow and is coupled with PCI PT.

I think 1) can be solved using this series as a base. I have quite some comments ready
Re: [Xen-devel] [PATCH v6 3/5] ARM: ITS: Deny hardware domain access to ITS
Hi Julien,

On 10/10/2017 7:09 PM, Julien Grall wrote:

Hi Manish,

On 10/10/17 13:52, mja...@caviumnetworks.com wrote:

From: Manish Jaggi <mja...@cavium.com>

This patch extends the gicv3_iomem_deny_access functionality by adding
support for the ITS region as well. Add function gicv3_its_deny_access.

Reviewed-by: Andre Przywara <andre.przyw...@arm.com>
Acked-by: Julien Grall <julien.gr...@arm.com>

Please state after "---" when you modified a patch, and keep the tags so a
reviewer can at least check whether he is still happy with it. It is one of
the reasons I like a changelog in each patch: it helps to know what changed
in a specific one, and it helps me decide whether I am happy with you
keeping my tag without fully reviewing the patch yet another time. In this
case, it is fine to keep it.

For this patch, please ack it.

Changelog: I have added
- a check on the return value of gicv3_its_deny_access(d);
- used its_data->size in place of GICV3_ITS_SIZE;
- removed an extra space in the printk.

Thanks,
manish

Signed-off-by: Manish Jaggi <mja...@cavium.com>
---
 xen/arch/arm/gic-v3-its.c        | 22 ++
 xen/arch/arm/gic-v3.c            |  4
 xen/include/asm-arm/gic_v3_its.h |  9 +
 3 files changed, 35 insertions(+)

diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
index 3023ee5..bd94308 100644
--- a/xen/arch/arm/gic-v3-its.c
+++ b/xen/arch/arm/gic-v3-its.c
@@ -21,6 +21,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -905,6 +906,27 @@ struct pending_irq *gicv3_assign_guest_event(struct domain *d,
     return pirq;
 }

+int gicv3_its_deny_access(const struct domain *d)
+{
+    int rc = 0;
+    unsigned long mfn, nr;
+    const struct host_its *its_data;
+
+    list_for_each_entry( its_data, &host_its_list, entry )
+    {
+        mfn = paddr_to_pfn(its_data->addr);
+        nr = PFN_UP(its_data->size);
+        rc = iomem_deny_access(d, mfn, mfn + nr);
+        if ( rc )
+        {
+            printk("iomem_deny_access failed for %lx:%lx \r\n", mfn, nr);
+            break;
+        }
+    }
+
+    return rc;
+}
+
 /*
  * Create the respective guest DT nodes from a list of host ITSes.
  * This copies the reg property, so the guest sees the ITS at the same address

diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
index 6f562f4..475e0d3 100644
--- a/xen/arch/arm/gic-v3.c
+++ b/xen/arch/arm/gic-v3.c
@@ -1308,6 +1308,10 @@ static int gicv3_iomem_deny_access(const struct domain *d)
     if ( rc )
         return rc;

+    rc = gicv3_its_deny_access(d);
+    if ( rc )
+        return rc;
+
     for ( i = 0; i < gicv3.rdist_count; i++ )
     {
         mfn = gicv3.rdist_regions[i].base >> PAGE_SHIFT;

diff --git a/xen/include/asm-arm/gic_v3_its.h b/xen/include/asm-arm/gic_v3_its.h
index 73d1fd1..73ee0ba 100644
--- a/xen/include/asm-arm/gic_v3_its.h
+++ b/xen/include/asm-arm/gic_v3_its.h
@@ -139,6 +139,10 @@ void gicv3_its_dt_init(const struct dt_device_node *node);
 #ifdef CONFIG_ACPI
 void gicv3_its_acpi_init(void);
 #endif
+
+/* Deny iomem access for its */
+int gicv3_its_deny_access(const struct domain *d);
+
 bool gicv3_its_host_has_its(void);

 unsigned int vgic_v3_its_count(const struct domain *d);
@@ -206,6 +210,11 @@ static inline void gicv3_its_acpi_init(void)
 {
 }
 #endif
+
+static inline int gicv3_its_deny_access(const struct domain *d)
+{
+    return 0;
+}
+
 static inline bool gicv3_its_host_has_its(void)
 {
     return false;

Cheers,

___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [RFC v2 5/7] acpi:arm64: Add support for parsing IORT table
Hi Sameer,

On 9/21/2017 6:07 AM, Sameer Goel wrote:

Add support for parsing the IORT table to initialize SMMU devices.
* The code for creating an SMMU device has been modified, so that the
  SMMU device can be initialized.
* The NAMED NODE code has been commented out as this will need Dom0
  kernel support.
* ITS code has been included but it has not been tested.

Could you please refactor this patch into another set of two patches? I am
planning to rebase my "IORT for Dom0 hiding" patch rework on this patch.

Thanks,
Manish

Signed-off-by: Sameer Goel
---
 xen/arch/arm/setup.c               |   3 +
 xen/drivers/acpi/Makefile          |   1 +
 xen/drivers/acpi/arm/Makefile      |   1 +
 xen/drivers/acpi/arm/iort.c        | 173 +
 xen/drivers/passthrough/arm/smmu.c |   1 +
 xen/include/acpi/acpi_iort.h       |  17 ++--
 xen/include/asm-arm/device.h       |   2 +
 xen/include/xen/acpi.h             |  21 +
 xen/include/xen/pci.h              |   8 ++
 9 files changed, 146 insertions(+), 81 deletions(-)
 create mode 100644 xen/drivers/acpi/arm/Makefile

diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c
index 92f173b..4ba09b2 100644
--- a/xen/arch/arm/setup.c
+++ b/xen/arch/arm/setup.c
@@ -49,6 +49,7 @@
 #include
 #include
 #include
+#include

 struct bootinfo __initdata bootinfo;
@@ -796,6 +797,8 @@ void __init start_xen(unsigned long boot_phys_offset,
     tasklet_subsys_init();

+    /* Parse the ACPI iort data */
+    acpi_iort_init();

     xsm_dt_init();

diff --git a/xen/drivers/acpi/Makefile b/xen/drivers/acpi/Makefile
index 444b11d..e7ffd82 100644
--- a/xen/drivers/acpi/Makefile
+++ b/xen/drivers/acpi/Makefile
@@ -1,5 +1,6 @@
 subdir-y += tables
 subdir-y += utilities
+subdir-$(CONFIG_ARM) += arm
 subdir-$(CONFIG_X86) += apei

 obj-bin-y += tables.init.o

diff --git a/xen/drivers/acpi/arm/Makefile b/xen/drivers/acpi/arm/Makefile
new file mode 100644
index 000..7c039bb
--- /dev/null
+++ b/xen/drivers/acpi/arm/Makefile
@@ -0,0 +1 @@
+obj-y += iort.o

diff --git a/xen/drivers/acpi/arm/iort.c b/xen/drivers/acpi/arm/iort.c
index 2e368a6..7f54062 100644
--- a/xen/drivers/acpi/arm/iort.c
+++ b/xen/drivers/acpi/arm/iort.c
@@ -14,17 +14,47 @@
  * This file implements early detection/parsing of I/O mapping
  * reported to OS through firmware via I/O Remapping Table (IORT)
  * IORT document number: ARM DEN 0049A
+ *
+ * Based on Linux drivers/acpi/arm64/iort.c
+ * => commit ca78d3173cff3503bcd15723b049757f75762d15
+ *
+ * Xen modification:
+ * Sameer Goel
+ * Copyright (C) 2017, The Linux Foundation, All rights reserved.
+ *
  */

-#define pr_fmt(fmt) "ACPI: IORT: " fmt
-
-#include
-#include
-#include
-#include
-#include
-#include
-#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+#include
+
+/* Xen: Define compatibility functions */
+#define FW_BUG "[Firmware Bug]: "
+#define pr_err(fmt, ...) printk(XENLOG_ERR fmt, ## __VA_ARGS__)
+#define pr_warn(fmt, ...) printk(XENLOG_WARNING fmt, ## __VA_ARGS__)
+
+/* Alias to Xen allocation helpers */
+#define kfree xfree
+#define kmalloc(size, flags) _xmalloc(size, sizeof(void *))
+#define kzalloc(size, flags) _xzalloc(size, sizeof(void *))
+
+/* Redefine WARN macros */
+#undef WARN
+#undef WARN_ON
+#define WARN(condition, format...) ({          \
+    int __ret_warn_on = !!(condition);         \
+    if (unlikely(__ret_warn_on))               \
+        printk(format);                        \
+    unlikely(__ret_warn_on);                   \
+})
+#define WARN_TAINT(cond, taint, format...) WARN(cond, format)
+#define WARN_ON(cond) (!!cond)

 #define IORT_TYPE_MASK(type) (1 << (type))
 #define IORT_MSI_TYPE        (1 << ACPI_IORT_NODE_ITS_GROUP)
@@ -256,6 +286,13 @@ static acpi_status iort_match_node_callback(struct acpi_iort_node *node,
     acpi_status status;

     if (node->type == ACPI_IORT_NODE_NAMED_COMPONENT) {
+        status = AE_NOT_IMPLEMENTED;
+/*
+ * We need the namespace object name from the DSDT to match the IORT node;
+ * this will need additions to the kernel xen bus notifiers.
+ * So, disabling the named node code till a proposal is approved.
+ */
+#if 0
         struct acpi_buffer buf = { ACPI_ALLOCATE_BUFFER, NULL };
         struct acpi_device *adev = to_acpi_device_node(dev->fwnode);
         struct acpi_iort_named_component *ncomp;
@@ -275,11 +312,12 @@ static acpi_status iort_match_node_callback(struct acpi_iort_node *node,
         status = !strcmp(ncomp->device_name, buf.pointer) ?
                             AE_OK : AE_NOT_FOUND;
         acpi_os_free(buf.pointer);
+#endif
     } else if (node->type ==
Re: [Xen-devel] [PATCH v5 4/5] ARM: Update Formula to compute MADT size using new callbacks in gic_hw_operations
On 10/10/2017 3:44 PM, Julien Grall wrote:

Hi Manish,

On 10/10/17 07:16, mja...@caviumnetworks.com wrote:

From: Manish Jaggi <mja...@cavium.com>

estimate_acpi_efi_size needs to be updated to provide the correct size of
the hardware domain's MADT, which now adds ITS information as well.
This patch updates the formula to compute the extra MADT size, as per
GICv2/3, by calling gic_get_hwdom_extra_madt_size

Missing full stop.

Oh, I missed it.

Signed-off-by: Manish Jaggi <mja...@cavium.com>
---
 xen/arch/arm/domain_build.c |  7 +--
 xen/arch/arm/gic-v2.c       |  6 ++
 xen/arch/arm/gic-v3.c       | 19 +++
 xen/arch/arm/gic.c          | 12
 xen/include/asm-arm/gic.h   |  3 +++
 5 files changed, 41 insertions(+), 6 deletions(-)

diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
index d6f9585..f17fcf1 100644
--- a/xen/arch/arm/domain_build.c
+++ b/xen/arch/arm/domain_build.c
@@ -1808,12 +1808,7 @@ static int estimate_acpi_efi_size(struct domain *d, struct kernel_info *kinfo)
     acpi_size = ROUNDUP(sizeof(struct acpi_table_fadt), 8);
     acpi_size += ROUNDUP(sizeof(struct acpi_table_stao), 8);

-    madt_size = sizeof(struct acpi_table_madt)
-                + sizeof(struct acpi_madt_generic_interrupt) * d->max_vcpus
-                + sizeof(struct acpi_madt_generic_distributor);
-    if ( d->arch.vgic.version == GIC_V3 )
-        madt_size += sizeof(struct acpi_madt_generic_redistributor)
-                     * d->arch.vgic.nr_regions;
+    madt_size = gic_get_hwdom_madt_size(d);
     acpi_size += ROUNDUP(madt_size, 8);

     addr = acpi_os_get_root_pointer();

diff --git a/xen/arch/arm/gic-v2.c b/xen/arch/arm/gic-v2.c
index cbe71a9..0123ea4 100644
--- a/xen/arch/arm/gic-v2.c
+++ b/xen/arch/arm/gic-v2.c
@@ -1012,6 +1012,11 @@ static int gicv2_iomem_deny_access(const struct domain *d)
     return iomem_deny_access(d, mfn, mfn + nr);
 }

+static unsigned long gicv2_get_hwdom_extra_madt_size(const struct domain *d)
+{
+    return 0;
+}
+
 #ifdef CONFIG_ACPI
 static int gicv2_make_hwdom_madt(const struct domain *d, u32 offset)
 {
@@ -1248,6 +1253,7 @@ const static struct gic_hw_operations gicv2_ops = {
     .read_apr            = gicv2_read_apr,
     .make_hwdom_dt_node  = gicv2_make_hwdom_dt_node,
     .make_hwdom_madt     = gicv2_make_hwdom_madt,
+    .get_hwdom_extra_madt_size = gicv2_get_hwdom_extra_madt_size,
     .map_hwdom_extra_mappings = gicv2_map_hwdown_extra_mappings,
     .iomem_deny_access   = gicv2_iomem_deny_access,
     .do_LPI              = gicv2_do_LPI,

diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
index b3d605d..447998d 100644
--- a/xen/arch/arm/gic-v3.c
+++ b/xen/arch/arm/gic-v3.c
@@ -1406,6 +1406,19 @@ static int gicv3_make_hwdom_madt(const struct domain *d, u32 offset)
     return table_len;
 }

+static unsigned long gicv3_get_hwdom_extra_madt_size(const struct domain *d)
+{
+    unsigned long size;
+
+    size = sizeof(struct acpi_madt_generic_redistributor)
+                  * d->arch.vgic.nr_regions;

Here you align the * with struct. But below, you align with sizeof. Please
stay consistent and always align with sizeof.

+
+    size += vgic_v3_its_count(d)
+            * sizeof(struct acpi_madt_generic_translator);

Same here.

Could you please point me to the specific section of the Xen coding style
guidelines on indentation when a line goes over 80 chars, which I am not
following in this case?

+
+    return size;
+}
+
 static int __init gic_acpi_parse_madt_cpu(struct acpi_subtable_header *header,
                                           const unsigned long end)
@@ -1597,6 +1610,11 @@ static int gicv3_make_hwdom_madt(const struct domain *d, u32 offset)
 {
     return 0;
 }
+
+static unsigned long gicv3_get_hwdom_extra_madt_size(const struct domain *d)
+{
+    return 0;
+}
 #endif

 /* Set up the GIC */
@@ -1698,6 +1716,7 @@ static const struct gic_hw_operations gicv3_ops = {
     .secondary_init      = gicv3_secondary_cpu_init,
     .make_hwdom_dt_node  = gicv3_make_hwdom_dt_node,
     .make_hwdom_madt     = gicv3_make_hwdom_madt,
+    .get_hwdom_extra_madt_size = gicv3_get_hwdom_extra_madt_size,
     .iomem_deny_access   = gicv3_iomem_deny_access,
     .do_LPI              = gicv3_do_LPI,
 };

diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
index 6c803bf..3c7b6df 100644
--- a/xen/arch/arm/gic.c
+++ b/xen/arch/arm/gic.c
@@ -851,6 +851,18 @@ int gic_make_hwdom_madt(const struct domain *d, u32 offset)
     return gic_hw_ops->make_hwdom_madt(d, offset);
 }

+unsigned long gic_get_hwdom_madt_size(const struct domain *d)
+{
+    unsigned long madt_size;
+
+    madt_size = sizeof(struct acpi_table_madt)
+                + sizeof(struct acpi_madt_generic_interrupt) * d->max_vcpus
+                + sizeof(struct acpi_madt_generic_distributor)
+                + gic_hw_ops->get_hwdom_extra_madt_size(d);
+
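The size formula in the patch above can be sanity-checked with a standalone sketch. Note the structure sizes below are illustrative placeholders, not the real `<acpi/actbl1.h>` values (which vary by ACPI revision); only the shape of the computation mirrors `gic_get_hwdom_madt_size()` plus the GICv3 "extra" callback.

```c
#include <assert.h>

/* Illustrative stand-ins for the ACPI structure sizes (assumed values). */
#define SZ_MADT_HEADER  44  /* struct acpi_table_madt               */
#define SZ_GIC_CPU_IF   80  /* struct acpi_madt_generic_interrupt   */
#define SZ_GIC_DIST     24  /* struct acpi_madt_generic_distributor */
#define SZ_GIC_REDIST   16  /* struct acpi_madt_generic_redistributor */
#define SZ_GIC_ITS      20  /* struct acpi_madt_generic_translator  */

/* Common part plus the per-GIC "extra" contribution
 * (GICv3: redistributor regions + ITSes; GICv2 contributes 0). */
static unsigned long hwdom_madt_size(unsigned int vcpus,
                                     unsigned int rdist_regions,
                                     unsigned int its_count)
{
    unsigned long size = SZ_MADT_HEADER
                         + SZ_GIC_CPU_IF * vcpus
                         + SZ_GIC_DIST;

    /* GICv3-specific extra, as in gicv3_get_hwdom_extra_madt_size(). */
    size += SZ_GIC_REDIST * rdist_regions
            + SZ_GIC_ITS * its_count;

    return size;
}
```

With 4 vCPUs, one redistributor region and two ITSes this gives 44 + 4*80 + 24 + 16 + 2*20 = 444 bytes, which the hardware domain's ACPI region estimate must round up and accommodate.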
Re: [Xen-devel] [PATCH v4 5/5] ARM: ITS: Expose ITS in the MADT table
Hi Andre,

On 10/3/2017 8:03 PM, Julien Grall wrote:

Hi Manish,

On 21/09/17 14:17, mja...@caviumnetworks.com wrote:

From: Manish Jaggi <mja...@cavium.com>

Add gicv3_its_make_hwdom_madt to update hwdom MADT ITS information.

Signed-off-by: Manish Jaggi <mja...@cavium.com>
---
 xen/arch/arm/gic-v3-its.c        | 19 +++
 xen/arch/arm/gic-v3.c            |  1 +
 xen/include/asm-arm/gic_v3_its.h |  8
 3 files changed, 28 insertions(+)

diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
index 8697e5b..e3e7e92 100644
--- a/xen/arch/arm/gic-v3-its.c
+++ b/xen/arch/arm/gic-v3-its.c
@@ -1062,6 +1062,25 @@ void gicv3_its_acpi_init(void)
     acpi_table_parse_madt(ACPI_MADT_TYPE_GENERIC_TRANSLATOR,
                           gicv3_its_acpi_probe, 0);
 }
+
+unsigned long gicv3_its_make_hwdom_madt(const struct domain *d, void *base_ptr)
+{
+    unsigned long i = 0;
+    void *fw_its;
+    struct acpi_madt_generic_translator *hwdom_its;
+
+    hwdom_its = base_ptr;
+
+    for ( i = 0; i < vgic_v3_its_count(d); i++ )
+    {
+        fw_its = acpi_table_get_entry_madt(ACPI_MADT_TYPE_GENERIC_TRANSLATOR,
+                                           i);
+        memcpy(hwdom_its, fw_its, sizeof(struct acpi_madt_generic_translator));
+        hwdom_its++;
+    }
+
+    return sizeof(struct acpi_madt_generic_translator) * vgic_v3_its_count(d);
+}
 #endif

 /*

diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
index 6e8d580..d29eea6 100644
--- a/xen/arch/arm/gic-v3.c
+++ b/xen/arch/arm/gic-v3.c
@@ -1403,6 +1403,7 @@ static int gicv3_make_hwdom_madt(const struct domain *d, u32 offset)
         table_len += size;
     }

+    table_len += gicv3_its_make_hwdom_madt(d, base_ptr + table_len);

Newline here please. I will leave Andre to comment on this patch, as he
suggested the rework.

Could you please provide comments on this patch so that I can send an
updated v5?

Cheers,

     return table_len;
 }

diff --git a/xen/include/asm-arm/gic_v3_its.h b/xen/include/asm-arm/gic_v3_its.h
index 31fca66..fc37776 100644
--- a/xen/include/asm-arm/gic_v3_its.h
+++ b/xen/include/asm-arm/gic_v3_its.h
@@ -138,6 +138,8 @@ void gicv3_its_dt_init(const struct dt_device_node *node);
 #ifdef CONFIG_ACPI
 void gicv3_its_acpi_init(void);
+unsigned long gicv3_its_make_hwdom_madt(const struct domain *d,
+                                        void *base_ptr);
 #endif

 /* Deny iomem access for its */
@@ -208,6 +210,12 @@ static inline void gicv3_its_dt_init(const struct dt_device_node *node)

 static inline void gicv3_its_acpi_init(void)
 {
 }
+
+static inline unsigned long gicv3_its_make_hwdom_madt(const struct domain *d,
+                                                      void *base_ptr)
+{
+    return 0;
+}
 #endif

 static inline int gicv3_its_deny_access(const struct domain *d)
Re: [Xen-devel] [PATCH v4 4/5] ARM: Introduce get_hwdom_madt_size in gic_hw_operations
Hi,

On 10/3/2017 8:01 PM, Julien Grall wrote:

Hi,

On 21/09/17 14:17, mja...@caviumnetworks.com wrote:

From: Manish Jaggi <mja...@cavium.com>

estimate_acpi_efi_size needs to be updated to provide the correct size of
the hardware domain's MADT, which now adds ITS information as well.
Introducing gic_get_hwdom_madt_size.

I think the commit title is misleading; the main purpose of this patch is
updating the formula to compute the MADT size for GICv3, not introducing
the callbacks. Most likely you want two patches here:
- Patch #1 adding the callbacks
- Patch #2 updating the formula for GICv3
For this time, I would be ok to have only one patch, provided the commit
message is updated.

OK, will update.

Cheers,
Re: [Xen-devel] [PATCH v4 2/5] ARM: ITS: Populate host_its_list from ACPI MADT Table
Hello Julien,

On 10/3/2017 7:17 PM, Julien Grall wrote:

Hi Manish,

On 21/09/17 14:17, mja...@caviumnetworks.com wrote:

From: Manish Jaggi <mja...@cavium.com>

Added gicv3_its_acpi_init to update host_its_list from the MADT table.
For ACPI, the host_its structure stores dt_node as NULL.

Signed-off-by: Manish Jaggi <mja...@cavium.com>
---
 xen/arch/arm/gic-v3-its.c        | 24
 xen/arch/arm/gic-v3.c            |  2 ++
 xen/include/asm-arm/gic_v3_its.h | 10 ++
 3 files changed, 36 insertions(+)

diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
index 0610991..0f662cf 100644
--- a/xen/arch/arm/gic-v3-its.c
+++ b/xen/arch/arm/gic-v3-its.c
@@ -18,6 +18,7 @@
  * along with this program; If not, see <http://www.gnu.org/licenses/>.
  */

+#include
 #include
 #include
 #include
@@ -1018,6 +1019,29 @@ void gicv3_its_dt_init(const struct dt_device_node *node)
     }
 }

+#ifdef CONFIG_ACPI
+static int gicv3_its_acpi_probe(struct acpi_subtable_header *header,
+                                const unsigned long end)
+{
+    struct acpi_madt_generic_translator *its;
+
+    its = (struct acpi_madt_generic_translator *)header;
+    if ( BAD_MADT_ENTRY(its, end) )
+        return -EINVAL;
+
+    add_to_host_its_list(its->base_address, GICV3_ITS_SIZE, NULL);

After the comment from Andre, I was expecting some rework to avoid storing
the size of the ITS in host_its. So what's the plan for that?

GICV3_ITS_SIZE is now 128K (previously 64K, see below), the same as what is
used in the Linux code; I think Andre mentioned the need to add an
additional 64K.

+
+    return 0;
+}
+
+void gicv3_its_acpi_init(void)
+{
+    /* Parse ITS information */
+    acpi_table_parse_madt(ACPI_MADT_TYPE_GENERIC_TRANSLATOR,
+            gicv3_its_acpi_probe, 0);

The indentation still looks wrong here.

Ah, ok.

+}
+#endif
+
 /*
  * Local variables:
  * mode: C

diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
index f990eae..6f562f4 100644
--- a/xen/arch/arm/gic-v3.c
+++ b/xen/arch/arm/gic-v3.c
@@ -1567,6 +1567,8 @@ static void __init gicv3_acpi_init(void)
     gicv3.rdist_stride = 0;

+    gicv3_its_acpi_init();
+
     /*
      * In ACPI, 0 is considered as the invalid address. However the rest
      * of the initialization rely on the invalid address to be

diff --git a/xen/include/asm-arm/gic_v3_its.h b/xen/include/asm-arm/gic_v3_its.h
index 1fac1c7..e1be33c 100644
--- a/xen/include/asm-arm/gic_v3_its.h
+++ b/xen/include/asm-arm/gic_v3_its.h
@@ -20,6 +20,7 @@
 #ifndef __ASM_ARM_ITS_H__
 #define __ASM_ARM_ITS_H__

+#define GICV3_ITS_SIZE SZ_128K

A less random place for this is close to the ITS_DOORBELL_OFFSET
definition.

Ok, will do :)

 #define GITS_CTLR  0x000
 #define GITS_IIDR  0x004
 #define GITS_TYPER 0x008
@@ -135,6 +136,9 @@ extern struct list_head host_its_list;

 /* Parse the host DT and pick up all host ITSes. */
 void gicv3_its_dt_init(const struct dt_device_node *node);
+#ifdef CONFIG_ACPI
+void gicv3_its_acpi_init(void);
+#endif

 bool gicv3_its_host_has_its(void);

 unsigned int vgic_v3_its_count(const struct domain *d);
@@ -196,6 +200,12 @@ static inline void gicv3_its_dt_init(const struct dt_device_node *node)
 {
 }

+#ifdef CONFIG_ACPI
+static inline void gicv3_its_acpi_init(void)
+{
+}
+#endif
+
 static inline bool gicv3_its_host_has_its(void)
 {
     return false;

Cheers,
Re: [Xen-devel] [PATCH v2 0/2] ARM: ACPI: IORT: Hide SMMU from hardware domain's IORT table
Hello Julien,

On 10/4/2017 12:12 AM, Julien Grall wrote:

Hello,

On 25/09/17 05:22, Manish Jaggi wrote:

On 9/22/2017 7:42 PM, Andre Przywara wrote:

Hi Manish,

On 11/09/17 22:33, mja...@caviumnetworks.com wrote:

From: Manish Jaggi <mja...@cavium.com>

The set is divided into two patches. The first one calculates the size of
the IORT while the second one writes the IORT table itself.

It would be good if you could give a quick introduction here on *why* this
set is needed (and introduce IORT to the casual reader). In general some
more high-level documentation on your functions would be good, as it took
me quite some time to understand what each function does.

Ok, will add more documentation.

So my understanding is:
phase 1:
- go over each entry in each RC node

Rather than each entry (which could be a large number), I am taking the
complete range and checking it with the same logic. If the ID range is a
subset or a superset of the ID range in the SMMU, a new ID range is
created. So if a pci_rc node has an ID map {p_input-base, p_output-base,
p_out_ref, p_count} and it has an output reference to an SMMU node with
ID map {s_input-base, s_output-base, s_out_ref, s_count}, then based on
the s_count and s_input/p_output the new ID map is created as {p_input,
s_output, s_out_ref, adjusted_count}. The update_id_mapping function does
that. So I am following the same logic. We can chat over IRC / I can give
a code walk-through ...

- if that points to an SMMU node, go over each outgoing ITS entry and find
  overlaps with this RC entry
- for each overlap create a new entry in a list with this RC pointing to
  the ITS directly
phase 2, creating the new IORT:
- go over each RC node
- if that points to an ITS, copy through IORT entries
- if that points to an SMMU, replace with the remapped entries
- go over each ITS node
- copy through IORT entries

That's exactly what this patch does. What are your comments on the current
patch approach to hide SMMU nodes?

I have answered your comments, see below.

IMHO we can reuse most of the fixup code here. So I believe this would do
the trick and you end up with an efficient representation of the IORT
without SMMUs - at least for RC nodes.

After some brainstorming with Julien we found two problems:

1) This only covers RC nodes, but not "named components" (platform
devices), which we will need. That should be fixable by removing the
hardcoded IORT node types in the code and treating NC nodes like RC nodes.

Yes, so first we can take this as a base; once this is ok, I can add
support for named components.

2) Eventually we will need *virtual* deviceID support, for DomUs. Now we

I am a bit surprised that you answered the e-mail but didn't provide any
opinion on 2).

Apologies for that.

could start introducing that already, also doing some virtual mapping for
Dom0. The ITS code would then translate each virtual device ID that Dom0
requests into a hardware device ID. I agree that this means a lot more
work, but we will need it anyway.

I am a bit surprised that you answered the e-mail but didn't provide any
opinion on 2).

Apologies for that. Sorry to surprise you twice :) IMHO it was a bit
obvious for DomU and I was waiting to hear what others would say on this,
as in (2) below. Moreover we need to discuss IORT generation for DomU -
it could be done by the xl tools, or Xen could do it. Also this is part
of the PCI passthrough flow, so that might also change a few things. But
from the PoV of Dom0 SMMU hiding, it is a different flow and is coupled
with PCI PT.

I think 1) can be solved using this series as a base. I have quite some
comments ready for the patches, shall we follow this route.

2) obviously would change the game completely. We need to sit down and
design this properly. Probably this means that Xen parses the IORT and
builds internal representations of the mappings,

Can you please add more detail on the internal representations of the
mappings?

IIUC the information is already there in the ACPI tables; would it not
add the extra overhead of abstractions to maintain? Enumeration of PCI
devices would generate a PCI list, which would be separate anyway.

which are consulted as needed when passing through devices. The guest's
(that would include Dom0) IORT would then be generated completely from
scratch.

I have a different opinion here: the Dom0 IORT would in most cases be
very close to the host IORT, sans SMMU nodes and a few platform devices.
And which platform devices to hide would probably depend on the Xen
command line. For instance, for Dom0 we would copy the ITS information,
while for DomU it would have to be generated, so "from scratch" applies
more to DomU. We could have common code for creating the IORT structure,
but it would be a bit complex, with a lot of abstractions and callbacks,
so I suggest that keeping the code simpler would be better. I would like
to hear your opinion on this.

I will try to discuss the feasibility of 2) with people at Connect. It
would be
Re: [Xen-devel] [PATCH v2 0/2] ARM: ACPI: IORT: Hide SMMU from hardware domain's IORT table
Hi Andre,

On 9/22/2017 7:42 PM, Andre Przywara wrote:

Hi Manish,

On 11/09/17 22:33, mja...@caviumnetworks.com wrote:

From: Manish Jaggi <mja...@cavium.com>

The set is divided into two patches. The first one calculates the size of
the IORT while the second one writes the IORT table itself.

It would be good if you could give a quick introduction here on *why* this
set is needed (and introduce IORT to the casual reader). In general some
more high-level documentation on your functions would be good, as it took
me quite some time to understand what each function does.

Ok, will add more documentation.

So my understanding is:
phase 1:
- go over each entry in each RC node

Rather than each entry (which could be a large number), I am taking the
complete range and checking it with the same logic. If the ID range is a
subset or a superset of the ID range in the SMMU, a new ID range is
created. So if a pci_rc node has an ID map {p_input-base, p_output-base,
p_out_ref, p_count} and it has an output reference to an SMMU node with
ID map {s_input-base, s_output-base, s_out_ref, s_count}, then based on
the s_count and s_input/p_output the new ID map is created as {p_input,
s_output, s_out_ref, adjusted_count}. The update_id_mapping function does
that. So I am following the same logic. We can chat over IRC / I can give
a code walk-through ...

- if that points to an SMMU node, go over each outgoing ITS entry and find
  overlaps with this RC entry
- for each overlap create a new entry in a list with this RC pointing to
  the ITS directly
phase 2, creating the new IORT:
- go over each RC node
- if that points to an ITS, copy through IORT entries
- if that points to an SMMU, replace with the remapped entries
- go over each ITS node
- copy through IORT entries

That's exactly what this patch does.

So I believe this would do the trick and you end up with an efficient
representation of the IORT without SMMUs - at least for RC nodes.

After some brainstorming with Julien we found two problems:

1) This only covers RC nodes, but not "named components" (platform
devices), which we will need. That should be fixable by removing the
hardcoded IORT node types in the code and treating NC nodes like RC nodes.

Yes, so first we can take this as a base; once this is ok, I can add
support for named components.

2) Eventually we will need *virtual* deviceID support, for DomUs. Now we
could start introducing that already, also doing some virtual mapping for
Dom0. The ITS code would then translate each virtual device ID that Dom0
requests into a hardware device ID. I agree that this means a lot more
work, but we will need it anyway.

I think 1) can be solved using this series as a base. I have quite some
comments ready for the patches, shall we follow this route.

2) obviously would change the game completely. We need to sit down and
design this properly. Probably this means that Xen parses the IORT and
builds internal representations of the mappings, which are consulted as
needed when passing through devices. The guest's (that would include
Dom0) IORT would then be generated completely from scratch.

I would like to hear your opinion on this. I will try to discuss the
feasibility of 2) with people at Connect. It would be good if we could
decide whether this is the way to go or we should use a solution based on
this series.

Cheers,
Andre.

patch 1: estimates the size of the hardware domain IORT table by parsing
all the PCIRC nodes and their idmaps, thereby calculating the size after
removing SMMU nodes. The hardware domain IORT table will have only ITS
and PCIRC nodes, and the PCIRC nodes' idmaps will have output references
to ITS group nodes.

patch 2: The steps are:
a. First, ITS group nodes are written and their offsets are saved along
   with the respective offsets from the firmware table. This is required
   when an SMMU node is hidden and the SMMU node still points to the old
   output_reference.
b. The PCIRC idmap is parsed and a list of idmaps is created which will
   map PCIRC idmaps -> ITS group nodes. Each idmap is written by
   resolving the ITS offset from the map saved in the previous step.

Changes wrt v1: No assumption is made wrt the format of IORT / hw support.

Manish Jaggi (2):
  ARM: ACPI: IORT: Estimate the size of hardware domain IORT table
  ARM: ACPI: IORT: Write Hardware domain's IORT table

 xen/arch/arm/acpi/Makefile  |   1 +
 xen/arch/arm/acpi/iort.c    | 414
 xen/arch/arm/domain_build.c |  49 +-
 xen/include/asm-arm/acpi.h  |   1 +
 xen/include/asm-arm/iort.h  |  17 ++
 5 files changed, 481 insertions(+), 1 deletion(-)
 create mode 100644 xen/arch/arm/acpi/iort.c
 create mode 100644 xen/include/asm-arm/iort.h
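The ID-range fixup described in this thread — intersecting a PCI RC idmap entry with the SMMU idmap entry it references, so that the RC maps straight to the ITS group — can be sketched in isolation. This is a hypothetical sketch, not the series' actual `update_id_mapping()`; names and the fixed-size struct are illustrative.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* One IORT-style ID mapping: [input_base, input_base + count) maps to
 * [output_base, output_base + count) on the node behind output_ref. */
struct id_map {
    uint32_t input_base;
    uint32_t output_base;
    uint32_t count;
    const void *output_ref;   /* SMMU or ITS group node */
};

/*
 * Given an RC->SMMU entry and an SMMU->ITS entry, emit a direct RC->ITS
 * entry covering their overlap (the "adjusted_count" case from the
 * discussion). Returns false when the ranges do not intersect.
 */
static bool flatten_id_map(const struct id_map *rc, const struct id_map *smmu,
                           struct id_map *out)
{
    /* The RC's outputs are StreamIDs, i.e. the SMMU map's inputs. */
    uint32_t lo = rc->output_base > smmu->input_base
                  ? rc->output_base : smmu->input_base;
    uint32_t rc_end   = rc->output_base + rc->count;
    uint32_t smmu_end = smmu->input_base + smmu->count;
    uint32_t hi = rc_end < smmu_end ? rc_end : smmu_end;

    if ( lo >= hi )
        return false;

    out->input_base  = rc->input_base + (lo - rc->output_base);   /* RID */
    out->output_base = smmu->output_base + (lo - smmu->input_base); /* DeviceID */
    out->count       = hi - lo;
    out->output_ref  = smmu->output_ref;  /* now references the ITS group */
    return true;
}
```

For example, an RC entry mapping RIDs 0x0-0x3F to StreamIDs 0x100-0x13F, composed with an SMMU entry mapping StreamIDs 0x100-0x1FF to DeviceIDs starting at 0x8000, yields a direct RID 0x0 -> DeviceID 0x8000 entry with count 0x40.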
Re: [Xen-devel] [RFC v2 0/7] SMMUv3 driver and the supporting framework
Hi Sameer,

On 9/21/2017 6:07 AM, Sameer Goel wrote:

This change incorporates most of the review comments from [1] and adds the
proposed SMMUv3 driver.

List of changes:
- Introduce the iommu_fwspec implementation - no change from the last RFC.
- IORT port from Linux. The differences are as under:
  * Modified the code for creating the SMMU devices. This code also
    initializes the discovered SMMU devices.
  * MSI code is left as is, but this code is untested.
  * IORT node data parsing is delegated to the driver. Looking for
    comments on enabling the code in the IORT driver. This will need a
    standard resource object. (Direct port from Linux, or a new define
    for Xen?)
  * Assumptions on PCI IORT SMMU interaction: PCI assign device will call
    iort_iommu_configure to set up the streamids. Then it will call SMMU
    assign device with the right struct device argument.
- SMMUv3 port from Linux. The list of changes is as under:
  * The Xen iommu_ops list is at parity with SMMUv2.
  * There is generally no need for an IOMMU group, but a dummy define is
    kept for now.
  * The S1 translation code is commented out.
  * MSI code is commented out.
  * Page table ops are commented out, as the driver shares the page
    tables with the CPU.
  * The list of SMMU devices is maintained from the driver code.

Open questions:
- IORT regeneration for DOM0. I was hoping to get some update on [2].

Please see the v2 patch set:
https://lists.xen.org/archives/html/xen-devel/2017-09/msg01143.html

- We also need a notification framework to get the Named node information
  from the DSDT.
- Should we port over the code for non-shared page tables from the kernel,
  or leverage [3]?

[1] "[RFC 0/6] IORT support and introduce fwspec"
[2] "[Xen-devel] [RFC] [PATCH] arm-acpi: Hide SMMU from IORT for hardware domain"
[3] "Non-shared" IOMMU support on ARM"

Sameer Goel (7):
  passthrough/arm: Modify SMMU driver to use generic device definition
  arm64: Add definitions for fwnode_handle
  xen/passthrough/arm: Introduce iommu_fwspec
  ACPI: arm: Support for IORT
  acpi:arm64: Add support for parsing IORT table
  Add verbatim copy of arm-smmu-v3.c from Linux
  xen/iommu: smmu-v3: Add Xen specific code to enable the ported driver

 xen/arch/arm/setup.c                  |    3 +
 xen/drivers/acpi/Makefile             |    1 +
 xen/drivers/acpi/arm/Makefile         |    1 +
 xen/drivers/acpi/arm/iort.c           |  986 ++
 xen/drivers/passthrough/arm/Makefile  |    1 +
 xen/drivers/passthrough/arm/iommu.c   |   66 +
 xen/drivers/passthrough/arm/smmu-v3.c | 3412 +
 xen/drivers/passthrough/arm/smmu.c    |   13 +-
 xen/include/acpi/acpi_iort.h          |   61 +
 xen/include/asm-arm/device.h          |    5 +
 xen/include/xen/acpi.h                |   21 +
 xen/include/xen/fwnode.h              |   33 +
 xen/include/xen/iommu.h               |   29 +
 xen/include/xen/pci.h                 |    8 +
 14 files changed, 4634 insertions(+), 6 deletions(-)
 create mode 100644 xen/drivers/acpi/arm/Makefile
 create mode 100644 xen/drivers/acpi/arm/iort.c
 create mode 100644 xen/drivers/passthrough/arm/smmu-v3.c
 create mode 100644 xen/include/acpi/acpi_iort.h
 create mode 100644 xen/include/xen/fwnode.h
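For readers unfamiliar with the iommu_fwspec concept this series ports over, here is a minimal standalone sketch of the idea, modeled loosely on Linux's per-device `struct iommu_fwspec`. The names, the fixed-size array and the helper are illustrative assumptions, not the actual Xen or Linux definitions.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Per-device container for firmware-provided IOMMU data: which IOMMU
 * the device sits behind, and the IDs (e.g. SMMU StreamIDs) it masters. */
struct fwspec {
    const void *iommu_node;   /* firmware node of the IOMMU (e.g. from IORT) */
    unsigned int num_ids;
    uint32_t ids[8];          /* fixed cap for the sketch; real code grows it */
};

/* Append newly discovered IDs for a device. In this model, IORT parsing
 * (iort_iommu_configure) would call this once per ID mapping that
 * resolves to an SMMU, before the SMMU driver's assign-device hook runs. */
static int fwspec_add_ids(struct fwspec *fw, const uint32_t *ids,
                          unsigned int n)
{
    if (fw->num_ids + n > sizeof(fw->ids) / sizeof(fw->ids[0]))
        return -1;  /* out of space in this simplified version */
    memcpy(&fw->ids[fw->num_ids], ids, n * sizeof(uint32_t));
    fw->num_ids += n;
    return 0;
}
```

The point of the structure is that table parsing and the SMMU driver only communicate through this per-device record, so the driver never has to re-walk the IORT to learn a device's StreamIDs.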
Re: [Xen-devel] [PATCH v3 3/5] ARM: ITS: Deny hardware domain access to ITS
On 9/7/2017 10:27 PM, Andre Przywara wrote:

Hi,

On 05/09/17 18:14, mja...@caviumnetworks.com wrote:

From: Manish Jaggi <mja...@cavium.com>

This patch extends the gicv3_iomem_deny_access functionality by adding
support for the ITS region as well. Add function gicv3_its_deny_access.

Signed-off-by: Manish Jaggi <mja...@cavium.com>
---
 xen/arch/arm/gic-v3-its.c        | 22 ++
 xen/arch/arm/gic-v3.c            |  3 +++
 xen/include/asm-arm/gic_v3_its.h |  9 +
 3 files changed, 34 insertions(+)

diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
index 536b48d..0ab1466 100644
--- a/xen/arch/arm/gic-v3-its.c
+++ b/xen/arch/arm/gic-v3-its.c
@@ -20,6 +20,7 @@
 #include
 #include
+#include
 #include
 #include
 #include
@@ -906,6 +907,27 @@ struct pending_irq *gicv3_assign_guest_event(struct domain *d,
     return pirq;
 }

+int gicv3_its_deny_access(const struct domain *d)
+{
+    int rc = 0;
+    unsigned long mfn, nr;
+    const struct host_its *its_data;
+
+    list_for_each_entry( its_data, &host_its_list, entry )
+    {
+        mfn = paddr_to_pfn(its_data->addr);
+        nr = PFN_UP(ACPI_GICV3_ITS_MEM_SIZE);

Shouldn't this not only cover the ITS register frame, but also the
following 64K page containing the doorbell address? Otherwise we leave the
doorbell address open, which seems to be asking for trouble ...

Cheers,
Andre.

Ok, I will fix the size in patch 2 as 128K, same as Linux. If no other
change is required in this patch, can you please ack it?

+        rc = iomem_deny_access(d, mfn, mfn + nr);
+        if ( rc )
+        {
+            printk( "iomem_deny_access failed for %lx:%lx \r\n", mfn, nr);
+            break;
+        }
+    }
+
+    return rc;
+}
+
 /*
  * Create the respective guest DT nodes from a list of host ITSes.
 * This copies the reg property, so the guest sees the ITS at the same address

diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
index 6f562f4..b3d605d 100644
--- a/xen/arch/arm/gic-v3.c
+++ b/xen/arch/arm/gic-v3.c
@@ -1308,6 +1308,9 @@ static int gicv3_iomem_deny_access(const struct domain *d)
     if ( rc )
         return rc;

+    if ( gicv3_its_deny_access(d) )
+        return rc;
+
     for ( i = 0; i < gicv3.rdist_count; i++ )
     {
         mfn = gicv3.rdist_regions[i].base >> PAGE_SHIFT;

diff --git a/xen/include/asm-arm/gic_v3_its.h b/xen/include/asm-arm/gic_v3_its.h
index 993819a..9cf18da 100644
--- a/xen/include/asm-arm/gic_v3_its.h
+++ b/xen/include/asm-arm/gic_v3_its.h
@@ -138,6 +138,10 @@ void gicv3_its_dt_init(const struct dt_device_node *node);
 #ifdef CONFIG_ACPI
 void gicv3_its_acpi_init(void);
 #endif
+
+/* Deny iomem access for its */
+int gicv3_its_deny_access(const struct domain *d);
+
 bool gicv3_its_host_has_its(void);

 unsigned int vgic_v3_its_count(const struct domain *d);
@@ -205,6 +209,11 @@ static inline void gicv3_its_acpi_init(void)
 {
 }
 #endif
+
+static inline int gicv3_its_deny_access(const struct domain *d)
+{
+    return 0;
+}
+
 static inline bool gicv3_its_host_has_its(void)
 {
     return false;
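Andre's doorbell point can be made concrete with a small sketch. Per the GICv3 architecture, an ITS exposes a 64K control register frame followed by a 64K translation frame containing GITS_TRANSLATER (at offset 0x40 of that second frame, i.e. 0x10040 from the ITS base), so a 64K deny window leaves the doorbell reachable while a 128K window covers both. The helper name below is hypothetical; the constants match the values discussed in the thread.

```c
#include <assert.h>

#define SZ_64K              0x10000UL
#define GICV3_ITS_SIZE      (2 * SZ_64K)  /* control + translation frames */
#define ITS_DOORBELL_OFFSET 0x10040UL     /* GITS_TRANSLATER, 2nd frame */
#define PAGE_SHIFT          12
#define PFN_UP(x)  (((x) + (1UL << PAGE_SHIFT) - 1) >> PAGE_SHIFT)

/* Number of 4K frames denied to the hardware domain per ITS, so that
 * both the register frame and the doorbell page are covered. */
static unsigned long its_deny_frames(void)
{
    return PFN_UP(GICV3_ITS_SIZE);
}
```

With only 64K denied, PFN_UP(SZ_64K) = 16 frames end exactly at the doorbell page, which is why the size was bumped to 128K (32 frames) in the later revision.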
Re: [Xen-devel] Next Xen ARM community call - Wednesday 13th September 2017
Hi All, On 8/25/2017 4:12 PM, Julien Grall wrote: Hi all, I would suggest to have the next community call on Wednesday 13th September 2017 5pm BST. Does it sound good? Do you have any specific topic you would like to discuss? Will it be possible to have a small discussion on the PCI passthrough support / _implementation timelines_ with all concerned people? -manish Cheers,
Re: [Xen-devel] [PATCH 0/4] ARM: ACPI: ITS: Add ITS Support for ACPI hardware domain
On 8/10/2017 6:44 PM, Julien Grall wrote: On 08/10/2017 02:00 PM, Manish Jaggi wrote: HI Julien, On 8/10/2017 5:43 PM, Julien Grall wrote: On 10/08/17 13:00, Manish Jaggi wrote: Hi Julien, On 8/10/2017 4:58 PM, Julien Grall wrote: On 10/08/17 12:21, Manish Jaggi wrote: Hi Julien, On 6/21/2017 6:53 PM, Julien Grall wrote: Hi Manish, On 21/06/17 02:01, Manish Jaggi wrote: This patch series adds the support of ITS for ACPI hardware domain. It is tested on staging branch with has ITS v12 patchset by Andre. I have tried to incorporate the review comments on the RFC v1/v2 patch. The single patch in RFC is now split into 4 patches. I will comment here rather than on each patches. Patch1: ARM: ITS: Add translation_id to host_its Adds translation_id in host_its data structure, which is populated from translation_id read from firmwar MADT. This value is then programmed into local MADT created for hardware domain in patch 4. I don't see any reason to store value that will only be used for generating the MADT which BTW is just a copy for the ITS. Instead we should copy over the MADT entries. There are two approaches, If I use the standard API acpi_table_parse_madt which would iterate over ACPI_MADT_TYPE_GENERIC_TRANSLATOR entries, I have to maintain the addr and translation_id in some data structure, to be filled later in the hwdomain copy of madt generic translator. If I don't use the standard API I have to add code to manually parse all the translator entries. There are a 3rd approach I suggested and ignored... The ITS entries for Dom0 is exactly the same as the host entries. Yes, and if not passed properly dom0 wont get device interrupts... So you only need to do a verbatim copy of the entry... Can you please check patch 4/2, the translation_id and address are passed verbatim, the other values are reserved in acpi_madt_generic_translator. For ACPI, we took the approach to only rewrite what's necessary and give the rest to Dom0 as it is. 
If newer version of ACPI re-used those fields, then they will be copied over to Dom0. I don't consider it as an issue because the problem would be the same if those fields have an important meaning for the platform. Few thoughts... If we follow this approach, few points needs to be considered - If ACPI may use the reserved information later it could be equally important for dom0 and Xen, so it might be useful to keep reserved in xen as well. I already covered that in my previous e-mail. Yes, I am just stating it again for xen. - For platforms which use dt, translation_id is not required to be stored in struct host_its, similarly for platforms which use acpi dt_node pointer might be of no use. So we can have struct host_its having a union with dt_device_node * for dt and acpi_madt_generic_translator * for acpi. IMHO this could be an approach we can take. struct host_its { struct list_head entry; -const struct dt_device_node *dt_node; + union { +const struct dt_device_node *dt_node; +const struct acpi_madt_generic_translator *acpi_its_entry; +}; paddr_t addr; What don't you get in my previous e-mail? A no is a no, full stop. This is not helping. Just do what we do in *_make_hwdom_madt. That will work here with no need of a union or anything else. The patchset provides two features (a) populates host_its list from ACPI tables, so ACPI xen can use ITS (b) provides a MADT with ITS information to dom0. What I am focusing with union is for (a) , and (b) code would be simpler if we use the union in (a). You seem to be discounting (a) in comments so far. why union? as I have mentioned before... It will make the host_its structure accommodate dt node and acpi_madt_generic_translator, both has same purpose. If one is valid why not other. please provide a technical reason for not doing it. Even the DT code can be reworked to avoid storing the node. we can have a separate patch for that. Cheers, Cheers! Sending next rev shortly. 
-manish
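For readers following the union sub-thread: the shape Manish proposes (and which Julien rejects above in favour of a plain verbatim MADT copy) is roughly the following. The list_head and remaining fields are omitted and the pointed-to types are opaque stand-ins, so treat this as a sketch of the proposal, not Xen code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct dt_device_node;               /* opaque stand-in */
struct acpi_madt_generic_translator; /* opaque stand-in */

/* Proposed (ultimately not taken) union: record where the ITS
 * description came from -- a DT node when booting from device tree,
 * or the MADT subtable when booting from ACPI. Only one source can
 * exist per ITS, hence the overlap. */
struct host_its {
    union {
        const struct dt_device_node *dt_node;
        const struct acpi_madt_generic_translator *acpi_its_entry;
    };
    uint64_t addr;
    uint64_t size;
};
```

The two members overlap at offset 0, so either boot path fills the same slot.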
Re: [Xen-devel] [PATCH 0/4] ARM: ACPI: ITS: Add ITS Support for ACPI hardware domain
HI Julien, On 8/10/2017 5:43 PM, Julien Grall wrote: On 10/08/17 13:00, Manish Jaggi wrote: Hi Julien, On 8/10/2017 4:58 PM, Julien Grall wrote: On 10/08/17 12:21, Manish Jaggi wrote: Hi Julien, On 6/21/2017 6:53 PM, Julien Grall wrote: Hi Manish, On 21/06/17 02:01, Manish Jaggi wrote: This patch series adds the support of ITS for ACPI hardware domain. It is tested on staging branch with has ITS v12 patchset by Andre. I have tried to incorporate the review comments on the RFC v1/v2 patch. The single patch in RFC is now split into 4 patches. I will comment here rather than on each patches. Patch1: ARM: ITS: Add translation_id to host_its Adds translation_id in host_its data structure, which is populated from translation_id read from firmwar MADT. This value is then programmed into local MADT created for hardware domain in patch 4. I don't see any reason to store value that will only be used for generating the MADT which BTW is just a copy for the ITS. Instead we should copy over the MADT entries. There are two approaches, If I use the standard API acpi_table_parse_madt which would iterate over ACPI_MADT_TYPE_GENERIC_TRANSLATOR entries, I have to maintain the addr and translation_id in some data structure, to be filled later in the hwdomain copy of madt generic translator. If I don't use the standard API I have to add code to manually parse all the translator entries. There are a 3rd approach I suggested and ignored... The ITS entries for Dom0 is exactly the same as the host entries. Yes, and if not passed properly dom0 wont get device interrupts... So you only need to do a verbatim copy of the entry... Can you please check patch 4/2, the translation_id and address are passed verbatim, the other values are reserved in acpi_madt_generic_translator. For ACPI, we took the approach to only rewrite what's necessary and give the rest to Dom0 as it is. If newer version of ACPI re-used those fields, then they will be copied over to Dom0. 
I don't consider it as an issue because the problem would be the same if those fields have an important meaning for the platform. Few thoughts... If we follow this approach, few points needs to be considered - If ACPI may use the reserved information later it could be equally important for dom0 and Xen, so it might be useful to keep reserved in xen as well. - For platforms which use dt, translation_id is not required to be stored in struct host_its, similarly for platforms which use acpi dt_node pointer might be of no use. So we can have struct host_its having a union with dt_device_node * for dt and acpi_madt_generic_translator * for acpi. IMHO this could be an approach we can take. struct host_its { struct list_head entry; -const struct dt_device_node *dt_node; + union { +const struct dt_device_node *dt_node; +const struct acpi_madt_generic_translator *acpi_its_entry; +}; paddr_t addr; Could you please detail 3rd approach and how different it is from approach 2. ACPI_MEMCPY(its, host_its, size); Cheers, ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
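The ACPI_MEMCPY one-liner Julien quotes is essentially the whole of the third approach: reproduce each host GIC ITS MADT subtable byte-for-byte in Dom0's MADT, reserved fields and all. A standalone sketch — the struct follows the ACPI 6.x generic translator subtable layout, but the helper name is hypothetical:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* ACPI 6.x MADT GIC ITS (generic translator) subtable, packed layout. */
struct acpi_madt_generic_translator {
    uint8_t  type;            /* ACPI_MADT_TYPE_GENERIC_TRANSLATOR (15) */
    uint8_t  length;          /* 20 */
    uint16_t reserved;
    uint32_t translation_id;
    uint64_t base_address;
    uint32_t reserved2;
} __attribute__((packed));

/* Verbatim copy of one host entry into the hwdom MADT buffer; returns
 * the new offset. Reserved fields travel along untouched, which is the
 * point: Dom0 sees exactly what firmware described, and nothing needs
 * to be cached in struct host_its for table generation. */
static uint32_t copy_its_subtable(uint8_t *madt, uint32_t offset,
                                  const struct acpi_madt_generic_translator *host)
{
    memcpy(madt + offset, host, sizeof(*host));
    return offset + sizeof(*host);
}
```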
Re: [Xen-devel] [PATCH 0/4] ARM: ACPI: ITS: Add ITS Support for ACPI hardware domain
Hi Julien, On 8/10/2017 4:58 PM, Julien Grall wrote: On 10/08/17 12:21, Manish Jaggi wrote: Hi Julien, On 6/21/2017 6:53 PM, Julien Grall wrote: Hi Manish, On 21/06/17 02:01, Manish Jaggi wrote: This patch series adds the support of ITS for ACPI hardware domain. It is tested on staging branch with has ITS v12 patchset by Andre. I have tried to incorporate the review comments on the RFC v1/v2 patch. The single patch in RFC is now split into 4 patches. I will comment here rather than on each patches. Patch1: ARM: ITS: Add translation_id to host_its Adds translation_id in host_its data structure, which is populated from translation_id read from firmwar MADT. This value is then programmed into local MADT created for hardware domain in patch 4. I don't see any reason to store value that will only be used for generating the MADT which BTW is just a copy for the ITS. Instead we should copy over the MADT entries. There are two approaches, If I use the standard API acpi_table_parse_madt which would iterate over ACPI_MADT_TYPE_GENERIC_TRANSLATOR entries, I have to maintain the addr and translation_id in some data structure, to be filled later in the hwdomain copy of madt generic translator. If I don't use the standard API I have to add code to manually parse all the translator entries. There are a 3rd approach I suggested and ignored... The ITS entries for Dom0 is exactly the same as the host entries. Yes, and if not passed properly dom0 wont get device interrupts... So you only need to do a verbatim copy of the entry... Can you please check patch 4/2, the translation_id and address are passed verbatim, the other values are reserved in acpi_madt_generic_translator. Could you please detail 3rd approach and how different it is from approach 2. Which of the two you find cleaner? This would also avoid to introduce a fake ID for DT as you currently do in patch #2. This can be avoided by storing translator_id only for acpi. 
+static int add_to_host_its_list(u64 addr, u64 size, + u32 translation_id, const void *node) +{ +struct host_its *its_data; +its_data = xzalloc(struct host_its); + +if ( !its_data ) +return -1; + +if ( node ) +its_data->dt_node = node; +else +its_data->translation_id = translation_id; + +its_data->addr = addr; +its_data->size = size; +printk("GICv3: Found ITS @0x%lx\n", addr); + +list_add_tail(&its_data->entry, &host_its_list); + +return 0; What do you think? I don't want to see the translation_id stored for no use at all but creating the DOM0 ACPI tables. Is that clearer? ok, I will remove it. Cheers,
Re: [Xen-devel] [PATCH 0/4] ARM: ACPI: ITS: Add ITS Support for ACPI hardware domain
Hi Julien, On 6/21/2017 6:53 PM, Julien Grall wrote: Hi Manish, On 21/06/17 02:01, Manish Jaggi wrote: This patch series adds the support of ITS for ACPI hardware domain. It is tested on staging branch with has ITS v12 patchset by Andre. I have tried to incorporate the review comments on the RFC v1/v2 patch. The single patch in RFC is now split into 4 patches. I will comment here rather than on each patches. Patch1: ARM: ITS: Add translation_id to host_its Adds translation_id in host_its data structure, which is populated from translation_id read from firmwar MADT. This value is then programmed into local MADT created for hardware domain in patch 4. I don't see any reason to store value that will only be used for generating the MADT which BTW is just a copy for the ITS. Instead we should copy over the MADT entries. There are two approaches, If I use the standard API acpi_table_parse_madt which would iterate over ACPI_MADT_TYPE_GENERIC_TRANSLATOR entries, I have to maintain the addr and translation_id in some data structure, to be filled later in the hwdomain copy of madt generic translator. If I don't use the standard API I have to add code to manually parse all the translator entries. Which of the two you find cleaner? This would also avoid to introduce a fake ID for DT as you currently do in patch #2. This can be avoided by storing translator_id only for acpi. +static int add_to_host_its_list(u64 addr, u64 size, + u32 translation_id, const void *node) +{ +struct host_its *its_data; +its_data = xzalloc(struct host_its); + +if ( !its_data ) +return -1; + +if ( node ) +its_data->dt_node = node; +else +its_data->translation_id = translation_id; + +its_data->addr = addr; +its_data->size = size; +printk("GICv3: Found ITS @0x%lx\n", addr); + +list_add_tail(_data->entry, _its_list); + +return 0; What do you think? 
Patch2: ARM: ITS: ACPI: Introduce gicv3_its_acpi_init Introduces function for its_acpi_init, which calls add_to_host_its_list which is a common function also called from _dt variant. Just reading at the description, there are a call for splitting this patch... Looking at the code, you mix code movement and code addition. Have a look at [1] to see how to break patches. Yes I will break into multiple patches patch 2 and 4. Patch3: ARM: ITS: Deny hardware domain access to its Extends the gicv3_iomem_deny to include its regions as well Patch4: ARM: ACPI: Add ITS to hardware domain MADT This patch adds ITS information in hardware domain's MADT table. Also this patch interoduces .get_hwdom_madt_size in gic_hw_operations, to return the complete size of MADT table for hardware domain. Same here. Yes. Manish Jaggi (4): ARM: ITS: Add translation_id to host_its ARM: ITS: ACPI: Introduce gicv3_its_acpi_init ARM: ITS: Deny hardware domain access to its ARM: ACPI: Add ITS to hardware domain MADT xen/arch/arm/domain_build.c | 7 +-- xen/arch/arm/gic-v2.c| 6 +++ xen/arch/arm/gic-v3-its.c| 102 +++ xen/arch/arm/gic-v3.c| 31 xen/arch/arm/gic.c | 11 + xen/include/asm-arm/gic.h| 3 ++ xen/include/asm-arm/gic_v3_its.h | 36 ++ 7 files changed, 180 insertions(+), 16 deletions(-) Cheers, [1] https://wiki.xenproject.org/wiki/Submitting_Xen_Project_Patches#Making_good_patches ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] ARM: SMMUv3 support
On 6/13/2017 10:19 AM, Manish Jaggi wrote: On 3/29/2017 5:30 AM, Goel, Sameer wrote: Sure, I will try to post something soon. Hi Sameer, Are you still working on SMMUv3? Can you please post the patches? Hi Sameer, Could you please post RFC patches for SMMUv3? We can provide feedback by testing on the ThunderX platform. Thanks manish Thanks Manish Thanks, Sameer On 3/27/2017 11:03 PM, Vijay Kilari wrote: On Mon, Mar 27, 2017 at 10:00 PM, Goel, Sameer <sg...@codeaurora.org> wrote: Hi, I am working on adding this support. The work is in initial stages and will target ACPI systems to start with. Do you have a specific requirement? Or even better: want to help with DT testing? :) Thanks Sameer. I don't have any specific requirement. I am also looking at ACPI support. Please share your RFC patches so that I can test on our platform. Thanks, Sameer On 3/20/2017 11:58 PM, Vijay Kilari wrote: Hi, Is there any effort put by anyone to get SMMUv3 support in Xen for ARM64? Would be glad to know. Regards Vijay -- Qualcomm Innovation Center, Inc. The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.
Re: [Xen-devel] Xen 4.10 Development Update
Hi Julien, On Mon, Jul 17, 2017 at 02:26:22PM +0100, Julien Grall wrote: This email only tracks big items for xen.git tree. Please reply for items you would like to see in 4.10 so that people have an idea what is going on and prioritise accordingly. You're welcome to provide description and use cases of the feature you're working on. = Timeline = We now adopt a fixed cut-off date scheme. We will release twice a year. The upcoming 4.10 timeline is as follows: * Last posting date: September 15th, 2017 * Hard code freeze: September 29th, 2017 * RC1: TBD * Release: December 2, 2017 Note that we don't have freeze exception scheme anymore. All patches that wish to go into 4.10 must be posted no later than the last posting date. All patches posted after that date will be automatically queued into next release. RCs will be arranged immediately after freeze. We recently introduced a jira instance to track all the tasks (not only big) for the project. See: https://xenproject.atlassian.net/projects/XEN/issues. Most of the tasks tracked by this e-mail also have a corresponding jira task referred by XEN-N. I have started to include the version number of series associated to each feature. Can each owner send an update on the version number if the series was posted upstream? = Projects = == Hypervisor == * Per-cpu tasklet - XEN-28 - Konrad Rzeszutek Wilk * Add support of rcu_idle_{enter,exit} - XEN-27 - Dario Faggioli === x86 === I am working on XEN-70, have already posted an RFC. [1] Also can you please add a xen-jira issue for the ITS ACPI support [2] v2 patches, which I have already sent and am working on the next rev. [1] https://www.mail-archive.com/xen-devel@lists.xen.org/msg110269.html [2] https://www.mail-archive.com/xen-devel@lists.xen.org/msg111342.html
Re: [Xen-devel] Notes from PCI Passthrough design discussion at Xen Summit
Hi Roger, On 7/20/2017 3:59 PM, Roger Pau Monné wrote: On Thu, Jul 20, 2017 at 03:02:19PM +0530, Manish Jaggi wrote: Hi Roger, On 7/20/2017 1:54 PM, Roger Pau Monné wrote: On Thu, Jul 20, 2017 at 09:24:36AM +0530, Manish Jaggi wrote: Hi Punit, On 7/19/2017 8:11 PM, Punit Agrawal wrote: I took some notes for the PCI Passthrough design discussion at Xen Summit. Due to the wide range of topics covered, the notes got sparser towards the end of the session. I've tried to attribute names against comments but have very likely got things mixed up. Apologies in advance. Was curious if any discussions happened on the RC Emu (config space emulation) as per slide 18 https://schd.ws/hosted_files/xendeveloperanddesignsummit2017/76/slides.pdf Part of this is already posted on the list (ATM for x86 only) but the PCI specification (and therefore the config space emulation) is not tied to any arch: https://lists.xenproject.org/archives/html/xen-devel/2017-06/msg03698.html From the summary, I have a questions on " - Roger: Registering config space with Xen before device discovery will allow the hypervisor to set access traps for certain functionality as appropriate" Traps will do emulation or something else ? Have you read the series? What else could the traps do? I'm not sure I understand the question. Is the config space emulation only for DomU or it for Dom0 as well ? Again, have you read the series? This is explained in the cover letter (0/9). On x86 this is initially for Dom0 only, DomU will continue to use QEMU until the implementation inside the hypervisor (vPCI) is complete enough to handle DomU securely. Slide 18 shows only for DomU ? ARM folks believe this is not needed for Dom0 in the ARM case, I don't have an opinion, I know it's certainly mandatory for x86 PVH Dom0. Julien clarified about Slide18. Roger. ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] Notes from PCI Passthrough design discussion at Xen Summit
HI Julien, On 7/20/2017 4:11 PM, Julien Grall wrote: On 20/07/17 10:32, Manish Jaggi wrote: Hi Roger, On 7/20/2017 1:54 PM, Roger Pau Monné wrote: On Thu, Jul 20, 2017 at 09:24:36AM +0530, Manish Jaggi wrote: Hi Punit, On 7/19/2017 8:11 PM, Punit Agrawal wrote: I took some notes for the PCI Passthrough design discussion at Xen Summit. Due to the wide range of topics covered, the notes got sparser towards the end of the session. I've tried to attribute names against comments but have very likely got things mixed up. Apologies in advance. Was curious if any discussions happened on the RC Emu (config space emulation) as per slide 18 https://schd.ws/hosted_files/xendeveloperanddesignsummit2017/76/slides.pdf Part of this is already posted on the list (ATM for x86 only) but the PCI specification (and therefore the config space emulation) is not tied to any arch: https://lists.xenproject.org/archives/html/xen-devel/2017-06/msg03698.html From the summary, I have a questions on " - Roger: Registering config space with Xen before device discovery will allow the hypervisor to set access traps for certain functionality as appropriate" Traps will do emulation or something else ? Is the config space emulation only for DomU or it for Dom0 as well ? Slide 18 shows only for DomU ? My slides are not meant to be read without the talk. In this particular case, this is only explaining how passthrough will work for DomU. Thanks for clarification. Ah ok, The single slide created confusion, It would be nice if you have added one more describing dom0 config access. I will wait for the video to get posted. Roger series is at the moment focusing on emulating a fully ECAM compliant hostbridge for the hardware domain. This is because Xen and the hardware domain should not access the configuration space at the same time. Yes as discussed on this topic on list few weeks back. 
We may also perform some tasks (i.e. MSI mapping, memory mapping) or sanitizing when the configuration space is updated by the hardware domain. Cheers,
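The "tasks or sanitizing" idea above can be pictured as a per-register write filter applied inside the config-space trap handler before the value reaches hardware. Everything below is a hypothetical sketch for illustration, not Xen's actual vPCI interface; the command-register policy shown is an invented example:

```c
#include <assert.h>
#include <stdint.h>

#define PCI_COMMAND        0x04
#define PCI_COMMAND_MEMORY 0x2   /* Memory Space Enable bit */

/* Hypothetical filter: the value the trap handler lets through to
 * hardware when the hardware domain writes a config register. Most
 * registers pass unmodified; registers Xen cares about are sanitized. */
static uint32_t cfg_write_filter(uint32_t reg, uint32_t val)
{
    switch ( reg )
    {
    case PCI_COMMAND:
        /* Example policy only: keep memory decoding enabled so mappings
         * Xen relies on are not silently torn down behind its back. */
        return val | PCI_COMMAND_MEMORY;
    default:
        return val; /* pass-through unchanged */
    }
}
```

A real implementation would also hook the MSI/MSI-X capability and BAR registers, which is where the "MSI mapping, memory mapping" tasks mentioned above would live.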
Re: [Xen-devel] Notes from PCI Passthrough design discussion at Xen Summit
Hi Roger, On 7/20/2017 1:54 PM, Roger Pau Monné wrote: On Thu, Jul 20, 2017 at 09:24:36AM +0530, Manish Jaggi wrote: Hi Punit, On 7/19/2017 8:11 PM, Punit Agrawal wrote: I took some notes for the PCI Passthrough design discussion at Xen Summit. Due to the wide range of topics covered, the notes got sparser towards the end of the session. I've tried to attribute names against comments but have very likely got things mixed up. Apologies in advance. Was curious if any discussions happened on the RC Emu (config space emulation) as per slide 18 https://schd.ws/hosted_files/xendeveloperanddesignsummit2017/76/slides.pdf Part of this is already posted on the list (ATM for x86 only) but the PCI specification (and therefore the config space emulation) is not tied to any arch: https://lists.xenproject.org/archives/html/xen-devel/2017-06/msg03698.html From the summary, I have a question on " - Roger: Registering config space with Xen before device discovery will allow the hypervisor to set access traps for certain functionality as appropriate" Will the traps do emulation or something else? Is the config space emulation only for DomU, or is it for Dom0 as well? Slide 18 shows it only for DomU? -manish Roger.
Re: [Xen-devel] Notes from PCI Passthrough design discussion at Xen Summit
Hi Punit, On 7/19/2017 8:11 PM, Punit Agrawal wrote: I took some notes for the PCI Passthrough design discussion at Xen Summit. Due to the wide range of topics covered, the notes got sparser towards the end of the session. I've tried to attribute names against comments but have very likely got things mixed up. Apologies in advance. Was curious if any discussions happened on the RC Emu (config space emulation) as per slide 18 https://schd.ws/hosted_files/xendeveloperanddesignsummit2017/76/slides.pdf Although the session was well attended, some of the more active discussions involved - Julien Grall, Stefano Stabellini, Roger Pau Monné, Jan Beulich, Vikram Sethi. I'm sure I am missing some folks here. Please do point out any mistakes I've made for the audience's benefit. * Discovery of PCI hostbridges - Dom0 will be responsible for scanning the ECAM for devices and registering them with Xen. This approach is chosen due to the variety of non-standard PCI controllers on ARM platforms and the desire to not duplicate driver code between Linux and Xen. - Jan, Roger: Bus scan needs to happen before device discovery, otherwise there is a small window where Xen doesn't know which host bridge the device is registered on (as it'll likely only refer to the segment number). - Roger: Registering config space with Xen before device discovery will allow the hypervisor to set access traps for certain functionality as appropriate. - Jan: Xen and Dom0 have to agree on the PCI segment number mapping to host bridges. This is so that for future calls, Dom0 and hypervisor can communicate using sBDF without ambiguity. - Julien: Dom0 will register config space address and segment number. mcfg_add will be used to pass the segment to Xen. - PCI segment - it's purely a software construct to identify different host bridges. - Some discussion on whether boot devices need to be on Segment 0. Technically, MCFG is only required to describe Segment 0 - other host bridges can be described in AML.
* Configuration accesses for non-ECAM compliant host bridge - Julien proposed these to be forwarded to Dom0 for handling. - Audience: What kind of non-compliance are we talking about? If they are simple, can they be implemented in Xen in a few lines of code? - A few different types - restrictions on access size, e.g., only certain sizes supported - register multiplexing via a window; similar to legacy x86 PCI access mechanism - ECAM compliant but with special casing for different devices * Support on 32bit platforms - Is there enough address space to map ECAM into Dom0? Maximum ECAM size is 256MB. * PCI ACS support - Vikram: Xen needs to be aware of the PCI device topology to correctly set up device groups for passthrough - Jan: Roger: IIRC, Xen is already aware of the device topology though it doesn't use ACS to work out which devices need to be passed to the guest as a group. - Stefano: There was support in xend (previous Xen toolstack) but the functionality has not yet been ported to libxl. * Implementation milestones - Julien provided a summary of breakdown - M0 - design document, currently under discussion on xen-devel - M1 - PCI support in Xen - Xen aware of PCI devices (via Dom0 registration) - M2 - Guest PCIe passthrough - Julien: Some complexity in dealing with Legacy interrupts as they can be shared. - Roger: MSIs mandatory for PCIe. So legacy interrupts can be tackled at a later stage. - M3 - testing - fuzzing. Jan: If implemented it'll be better than what x86 currently has.
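On the "communicate using sBDF without ambiguity" point above: a device reference is a segment plus the classic bus/device/function triple, which packs into a single 32-bit value. The sketch below uses the conventional seg:bus:dev:fn split (16/8/5/3 bits); the helper names are illustrative, not an existing Xen API:

```c
#include <assert.h>
#include <stdint.h>

/* seg[31:16] bus[15:8] dev[7:3] fn[2:0] */
static inline uint32_t sbdf_pack(uint16_t seg, uint8_t bus,
                                 uint8_t dev, uint8_t fn)
{
    return ((uint32_t)seg << 16) | ((uint32_t)bus << 8) |
           (((uint32_t)dev & 0x1f) << 3) | (fn & 0x7);
}

static inline uint16_t sbdf_seg(uint32_t s) { return s >> 16; }
static inline uint8_t  sbdf_bus(uint32_t s) { return (s >> 8) & 0xff; }
static inline uint8_t  sbdf_dev(uint32_t s) { return (s >> 3) & 0x1f; }
static inline uint8_t  sbdf_fn (uint32_t s) { return s & 0x7; }
```

Once Dom0 and Xen agree which host bridge owns which segment number, such a value names a device unambiguously in every subsequent hypercall.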
[Xen-devel] [PATCH] ARM: SMMUv2: Add compatible match entry for cavium smmuv2
This patch adds a cavium,smmu-v2 compatible match entry to the SMMU driver. Signed-off-by: Manish Jaggi <mja...@cavium.com> --- xen/drivers/passthrough/arm/smmu.c | 1 + 1 file changed, 1 insertion(+) diff --git a/xen/drivers/passthrough/arm/smmu.c b/xen/drivers/passthrough/arm/smmu.c index 1082fcf..887f874 100644 --- a/xen/drivers/passthrough/arm/smmu.c +++ b/xen/drivers/passthrough/arm/smmu.c @@ -2272,6 +2272,7 @@ static const struct of_device_id arm_smmu_of_match[] = { { .compatible = "arm,mmu-400", .data = (void *)ARM_SMMU_V1 }, { .compatible = "arm,mmu-401", .data = (void *)ARM_SMMU_V1 }, { .compatible = "arm,mmu-500", .data = (void *)ARM_SMMU_V2 }, + { .compatible = "cavium,smmu-v2", .data = (void *)ARM_SMMU_V2 }, { }, }; MODULE_DEVICE_TABLE(of, arm_smmu_of_match); -- 2.7.4
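How the new entry takes effect, in miniature: at probe time the driver walks the match table comparing each node's "compatible" string. The model below is trimmed down from the kernel-derived of_match machinery the Xen SMMU driver uses (only a few entries shown, NULL-terminated); treat it as a sketch:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { ARM_SMMU_V1 = 1, ARM_SMMU_V2 = 2 };

struct of_device_id {
    const char *compatible;
    const void *data;
};

static const struct of_device_id arm_smmu_of_match[] = {
    { "arm,mmu-400",    (void *)(uintptr_t)ARM_SMMU_V1 },
    { "arm,mmu-401",    (void *)(uintptr_t)ARM_SMMU_V1 },
    { "arm,mmu-500",    (void *)(uintptr_t)ARM_SMMU_V2 },
    { "cavium,smmu-v2", (void *)(uintptr_t)ARM_SMMU_V2 }, /* the new entry */
    { NULL, NULL },
};

/* Return the SMMU version the driver would bind with for a node's
 * compatible string, or 0 if the driver does not match it. */
static int smmu_version_for(const char *compat)
{
    const struct of_device_id *id;

    for ( id = arm_smmu_of_match; id->compatible; id++ )
        if ( !strcmp(id->compatible, compat) )
            return (int)(uintptr_t)id->data;

    return 0;
}
```

Without the added entry, a DT node whose compatible is only "cavium,smmu-v2" would never probe, even though the hardware is an SMMUv2.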
[Xen-devel] [PATCH 3/4] ARM: ITS: Deny hardware domain access to its
This patch extends the gicv3_iomem_deny_access functionality by adding support for its region as well. Added function gicv3_its_deny_access. Signed-off-by: Manish Jaggi <mja...@cavium.com> --- xen/arch/arm/gic-v3-its.c| 19 +++ xen/arch/arm/gic-v3.c| 7 +++ xen/include/asm-arm/gic_v3_its.h | 8 3 files changed, 34 insertions(+) diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c index e11f29a..98c8f46 100644 --- a/xen/arch/arm/gic-v3-its.c +++ b/xen/arch/arm/gic-v3-its.c @@ -20,6 +20,7 @@ #include #include +#include #include #include #include @@ -905,6 +906,24 @@ struct pending_irq *gicv3_assign_guest_event(struct domain *d, return pirq; } +int gicv3_its_deny_access(const struct domain *d) +{ +int rc = 0; +unsigned long mfn, nr; +const struct host_its *its_data; + +list_for_each_entry(its_data, _its_list, entry) +{ +mfn = paddr_to_pfn(its_data->addr); +nr = PFN_UP(ACPI_GICV3_ITS_MEM_SIZE); +rc = iomem_deny_access(d, mfn, mfn + nr); +if ( rc ) +break; +} + +return rc; +} + /* * Create the respective guest DT nodes from a list of host ITSes. 
* This copies the reg property, so the guest sees the ITS at the same address diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c index 558b32c..f6fbf2f 100644 --- a/xen/arch/arm/gic-v3.c +++ b/xen/arch/arm/gic-v3.c @@ -1308,6 +1308,13 @@ static int gicv3_iomem_deny_access(const struct domain *d) if ( rc ) return rc; +if ( gicv3_its_host_has_its() ) +{ +rc = gicv3_its_deny_access(d); +if ( rc ) +return rc; +} + for ( i = 0; i < gicv3.rdist_count; i++ ) { mfn = gicv3.rdist_regions[i].base >> PAGE_SHIFT; diff --git a/xen/include/asm-arm/gic_v3_its.h b/xen/include/asm-arm/gic_v3_its.h index bcfa181..84dbb9c 100644 --- a/xen/include/asm-arm/gic_v3_its.h +++ b/xen/include/asm-arm/gic_v3_its.h @@ -143,6 +143,9 @@ int gicv3_its_acpi_init(struct acpi_subtable_header *header, const unsigned long end); #endif +/* Deny iomem access for its */ +int gicv3_its_deny_access(const struct domain *d); + bool gicv3_its_host_has_its(void); unsigned int vgic_v3_its_count(const struct domain *d); @@ -212,6 +215,11 @@ static inline int gicv3_its_acpi_init(struct acpi_subtable_header *header, } #endif +static inline int gicv3_its_deny_access(const struct domain *d) +{ +return 0; +} + static inline bool gicv3_its_host_has_its(void) { return false; -- 2.7.4 ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
[Xen-devel] [PATCH 4/4] ARM: ACPI: Add ITS to hardware domain MADT
[Xen-devel] Hugepage support for Dom0
Hi,

Does Xen on arm64 support huge pages for Dom0? If yes, how can they be enabled? I found a wiki page on the topic (https://wiki.xenproject.org/wiki/Huge_Page_Support), but it is not up to date.

Thanks
-Manish

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel
[Xen-devel] [PATCH 4/4] ARM: ACPI: Add ITS to hardware domain MADT
This patch adds ITS information in the hardware domain's MADT table. It also
introduces .get_hwdom_madt_size in gic_hw_operations, to return the complete
size of the MADT table for the hardware domain.

Signed-off-by: Manish Jaggi <mja...@cavium.com>
---
 xen/arch/arm/domain_build.c      |  7 +--
 xen/arch/arm/gic-v2.c            |  6 ++
 xen/arch/arm/gic-v3-its.c        | 34 ++
 xen/arch/arm/gic-v3.c            | 18 ++
 xen/arch/arm/gic.c               | 11 +++
 xen/include/asm-arm/gic.h        |  3 +++
 xen/include/asm-arm/gic_v3_its.h | 12
 7 files changed, 85 insertions(+), 6 deletions(-)

diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
index 3abacc0..15c7f9b 100644
--- a/xen/arch/arm/domain_build.c
+++ b/xen/arch/arm/domain_build.c
@@ -1802,12 +1802,7 @@ static int estimate_acpi_efi_size(struct domain *d, struct kernel_info *kinfo)
     acpi_size = ROUNDUP(sizeof(struct acpi_table_fadt), 8);
     acpi_size += ROUNDUP(sizeof(struct acpi_table_stao), 8);
 
-    madt_size = sizeof(struct acpi_table_madt)
-        + sizeof(struct acpi_madt_generic_interrupt) * d->max_vcpus
-        + sizeof(struct acpi_madt_generic_distributor);
-    if ( d->arch.vgic.version == GIC_V3 )
-        madt_size += sizeof(struct acpi_madt_generic_redistributor)
-                     * d->arch.vgic.nr_regions;
+    madt_size = gic_get_hwdom_madt_size(d);
     acpi_size += ROUNDUP(madt_size, 8);
 
     addr = acpi_os_get_root_pointer();
diff --git a/xen/arch/arm/gic-v2.c b/xen/arch/arm/gic-v2.c
index ffbe47c..e92dc3d 100644
--- a/xen/arch/arm/gic-v2.c
+++ b/xen/arch/arm/gic-v2.c
@@ -1012,6 +1012,11 @@ static int gicv2_iomem_deny_access(const struct domain *d)
     return iomem_deny_access(d, mfn, mfn + nr);
 }
 
+static u32 gicv2_get_hwdom_madt_size(const struct domain *d)
+{
+    return 0;
+}
+
 #ifdef CONFIG_ACPI
 static int gicv2_make_hwdom_madt(const struct domain *d, u32 offset)
 {
@@ -1248,6 +1253,7 @@ const static struct gic_hw_operations gicv2_ops = {
     .read_apr            = gicv2_read_apr,
     .make_hwdom_dt_node  = gicv2_make_hwdom_dt_node,
     .make_hwdom_madt     = gicv2_make_hwdom_madt,
+    .get_hwdom_madt_size = gicv2_get_hwdom_madt_size,
     .map_hwdom_extra_mappings = gicv2_map_hwdown_extra_mappings,
     .iomem_deny_access   = gicv2_iomem_deny_access,
     .do_LPI              = gicv2_do_LPI,
diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
index 98c8f46..7f8ff34 100644
--- a/xen/arch/arm/gic-v3-its.c
+++ b/xen/arch/arm/gic-v3-its.c
@@ -924,6 +924,40 @@ int gicv3_its_deny_access(const struct domain *d)
     return rc;
 }
 
+#ifdef CONFIG_ACPI
+u32 gicv3_its_madt_generic_translator_size(void)
+{
+    const struct host_its *its_data;
+    u32 size = 0;
+
+    list_for_each_entry(its_data, &host_its_list, entry)
+        size += sizeof(struct acpi_madt_generic_translator);
+
+    return size;
+}
+
+u32 gicv3_its_make_hwdom_madt(u8 *base_ptr, u32 offset)
+{
+    struct acpi_madt_generic_translator *gic_its;
+    const struct host_its *its_data;
+    u32 table_len = offset, size;
+
+    /* Update GIC ITS information in hardware domain's MADT */
+    list_for_each_entry(its_data, &host_its_list, entry)
+    {
+        size = sizeof(struct acpi_madt_generic_translator);
+        gic_its = (struct acpi_madt_generic_translator *)(base_ptr
+                                                          + table_len);
+        gic_its->header.type = ACPI_MADT_TYPE_GENERIC_TRANSLATOR;
+        gic_its->header.length = size;
+        gic_its->base_address = its_data->addr;
+        gic_its->translation_id = its_data->translation_id;
+        table_len += size;
+    }
+
+    return table_len;
+}
+#endif
 /*
  * Create the respective guest DT nodes from a list of host ITSes.
  * This copies the reg property, so the guest sees the ITS at the same address
diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
index f6fbf2f..c7a8c1c 100644
--- a/xen/arch/arm/gic-v3.c
+++ b/xen/arch/arm/gic-v3.c
@@ -1407,9 +1407,21 @@ static int gicv3_make_hwdom_madt(const struct domain *d, u32 offset)
         table_len += size;
     }
 
+    table_len = gicv3_its_make_hwdom_madt(base_ptr, table_len);
     return table_len;
 }
 
+static u32 gicv3_get_hwdom_madt_size(const struct domain *d)
+{
+    u32 size;
+    size = sizeof(struct acpi_madt_generic_redistributor)
+           * d->arch.vgic.nr_regions;
+    if ( gicv3_its_host_has_its() )
+        size += gicv3_its_madt_generic_translator_size();
+
+    return size;
+}
+
 static int __init
 gic_acpi_parse_madt_cpu(struct acpi_subtable_header *header,
                         const unsigned long end)
@@ -1605,6 +1617,11 @@ static int gicv3_make_hwdom_madt(const struct domain *d, u32 offset)
 {
     return 0;
 }
+
+static u32 gicv3_get_hwdom_madt_size(const struct domain *d)
+{
+    return 0;
+}
 #endif
 
 /* Set up the GIC */
@@ -1706,6 +1723,7 @@ static cons
[Xen-devel] [PATCH 3/4] ARM: ITS: Deny hardware domain access to its region
This patch extends the gicv3_iomem_deny_access functionality by adding support
for the ITS region as well. Added function gicv3_its_deny_access.

Signed-off-by: Manish Jaggi <mja...@cavium.com>
---
 xen/arch/arm/gic-v3-its.c        | 19 +++
 xen/arch/arm/gic-v3.c            |  7 +++
 xen/include/asm-arm/gic_v3_its.h |  8
 3 files changed, 34 insertions(+)

diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
index e11f29a..98c8f46 100644
--- a/xen/arch/arm/gic-v3-its.c
+++ b/xen/arch/arm/gic-v3-its.c
@@ -20,6 +20,7 @@
 #include
 #include
+#include
 #include
 #include
 #include
@@ -905,6 +906,24 @@ struct pending_irq *gicv3_assign_guest_event(struct domain *d,
     return pirq;
 }
 
+int gicv3_its_deny_access(const struct domain *d)
+{
+    int rc = 0;
+    unsigned long mfn, nr;
+    const struct host_its *its_data;
+
+    list_for_each_entry(its_data, &host_its_list, entry)
+    {
+        mfn = paddr_to_pfn(its_data->addr);
+        nr = PFN_UP(ACPI_GICV3_ITS_MEM_SIZE);
+        rc = iomem_deny_access(d, mfn, mfn + nr);
+        if ( rc )
+            break;
+    }
+
+    return rc;
+}
+
 /*
  * Create the respective guest DT nodes from a list of host ITSes.
  * This copies the reg property, so the guest sees the ITS at the same address
diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
index 558b32c..f6fbf2f 100644
--- a/xen/arch/arm/gic-v3.c
+++ b/xen/arch/arm/gic-v3.c
@@ -1308,6 +1308,13 @@ static int gicv3_iomem_deny_access(const struct domain *d)
     if ( rc )
         return rc;
 
+    if ( gicv3_its_host_has_its() )
+    {
+        rc = gicv3_its_deny_access(d);
+        if ( rc )
+            return rc;
+    }
+
     for ( i = 0; i < gicv3.rdist_count; i++ )
     {
         mfn = gicv3.rdist_regions[i].base >> PAGE_SHIFT;
diff --git a/xen/include/asm-arm/gic_v3_its.h b/xen/include/asm-arm/gic_v3_its.h
index bcfa181..84dbb9c 100644
--- a/xen/include/asm-arm/gic_v3_its.h
+++ b/xen/include/asm-arm/gic_v3_its.h
@@ -143,6 +143,9 @@ int gicv3_its_acpi_init(struct acpi_subtable_header *header,
                         const unsigned long end);
 #endif
 
+/* Deny iomem access for its */
+int gicv3_its_deny_access(const struct domain *d);
+
 bool gicv3_its_host_has_its(void);
 
 unsigned int vgic_v3_its_count(const struct domain *d);
@@ -212,6 +215,11 @@ static inline int gicv3_its_acpi_init(struct acpi_subtable_header *header,
 }
 #endif
 
+static inline int gicv3_its_deny_access(const struct domain *d)
+{
+    return 0;
+}
+
 static inline bool gicv3_its_host_has_its(void)
 {
     return false;
-- 
2.7.4
[Xen-devel] [PATCH 2/4] ARM: ITS: ACPI: Introduce gicv3_its_acpi_init
This patch adds gicv3_its_acpi_init. To avoid duplicating the code for
initializing and adding to host_its_list, a common function
add_to_host_its_list is added, which is called by both _dt_init and
_acpi_init.

Signed-off-by: Manish Jaggi <mja...@cavium.com>
---
 xen/arch/arm/gic-v3-its.c        | 49
 xen/arch/arm/gic-v3.c            |  6 +
 xen/include/asm-arm/gic_v3_its.h | 14
 3 files changed, 59 insertions(+), 10 deletions(-)

diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
index 2d36030..e11f29a 100644
--- a/xen/arch/arm/gic-v3-its.c
+++ b/xen/arch/arm/gic-v3-its.c
@@ -33,6 +33,7 @@
 #define ITS_CMD_QUEUE_SZ                SZ_1M
+#define ACPI_GICV3_ITS_MEM_SIZE         (SZ_64K)
 
 /*
  * No lock here, as this list gets only populated upon boot while scanning
  * firmware tables for all host ITSes, and only gets iterated afterwards.
@@ -976,11 +977,35 @@ int gicv3_its_make_hwdom_dt_nodes(const struct domain *d,
     return res;
 }
 
+/* Common function for adding to host_its_list */
+static int add_to_host_its_list(u64 addr, u64 size,
+                                u32 translation_id, const void *node)
+{
+    struct host_its *its_data;
+    its_data = xzalloc(struct host_its);
+
+    if ( !its_data )
+        return -1;
+
+    if ( node )
+        its_data->dt_node = node;
+
+    its_data->addr = addr;
+    its_data->size = size;
+    its_data->translation_id = translation_id;
+    printk("GICv3: Found ITS @0x%lx\n", addr);
+
+    list_add_tail(&its_data->entry, &host_its_list);
+
+    return 0;
+}
+
 /* Scan the DT for any ITS nodes and create a list of host ITSes out of it. */
 void gicv3_its_dt_init(const struct dt_device_node *node)
 {
     const struct dt_device_node *its = NULL;
-    struct host_its *its_data;
+    static int its_id = 1;
 
     /*
      * Check for ITS MSI subnodes. If any, add the ITS register
@@ -996,19 +1021,23 @@ void gicv3_its_dt_init(const struct dt_device_node *node)
         if ( dt_device_get_address(its, 0, &addr, &size) )
             panic("GICv3: Cannot find a valid ITS frame address");
 
-        its_data = xzalloc(struct host_its);
-        if ( !its_data )
-            panic("GICv3: Cannot allocate memory for ITS frame");
+        if ( add_to_host_its_list(addr, size, its_id++, its) )
+            panic("GICV3: Adding Host ITS failed");
+    }
+}
 
-        its_data->addr = addr;
-        its_data->size = size;
-        its_data->dt_node = its;
+#ifdef CONFIG_ACPI
+int gicv3_its_acpi_init(struct acpi_subtable_header *header,
+                        const unsigned long end)
+{
+    struct acpi_madt_generic_translator *its_entry;
 
-        printk("GICv3: Found ITS @0x%lx\n", addr);
+    its_entry = (struct acpi_madt_generic_translator *)header;
 
-        list_add_tail(&its_data->entry, &host_its_list);
-    }
+    return add_to_host_its_list(its_entry->base_address,
+                                ACPI_GICV3_ITS_MEM_SIZE,
+                                its_entry->translation_id, NULL);
 }
+#endif
 
 /*
  * Local variables:
diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
index c927306..558b32c 100644
--- a/xen/arch/arm/gic-v3.c
+++ b/xen/arch/arm/gic-v3.c
@@ -1567,6 +1567,12 @@ static void __init gicv3_acpi_init(void)
 
     gicv3.rdist_stride = 0;
 
+    count = acpi_table_parse_madt(ACPI_MADT_TYPE_GENERIC_TRANSLATOR,
+                                  gicv3_its_acpi_init, 0);
+
+    if ( count <= 0 )
+        panic("GICv3: Can't get ITS entry");
+
     /*
      * In ACPI, 0 is considered as the invalid address. However the rest
      * of the initialization rely on the invalid address to be
diff --git a/xen/include/asm-arm/gic_v3_its.h b/xen/include/asm-arm/gic_v3_its.h
index 96b910b..bcfa181 100644
--- a/xen/include/asm-arm/gic_v3_its.h
+++ b/xen/include/asm-arm/gic_v3_its.h
@@ -105,6 +105,7 @@
 #include
 #include
+#include
 
 #define HOST_ITS_FLUSH_CMD_QUEUE        (1U << 0)
 #define HOST_ITS_USES_PTA               (1U << 1)
@@ -137,6 +138,11 @@ extern struct list_head host_its_list;
 
 /* Parse the host DT and pick up all host ITSes. */
 void gicv3_its_dt_init(const struct dt_device_node *node);
 
+#ifdef CONFIG_ACPI
+int gicv3_its_acpi_init(struct acpi_subtable_header *header,
+                        const unsigned long end);
+#endif
+
 bool gicv3_its_host_has_its(void);
 
 unsigned int vgic_v3_its_count(const struct domain *d);
@@ -198,6 +204,14 @@ static inline void gicv3_its_dt_init(const struct dt_device_node *node)
 {
 }
 
+#ifdef CONFIG_ACPI
+static inline int gicv3_its_acpi_init(struct acpi_subtable_header *header,
+                                      const unsigned long end)
+{
+    return false;
+}
+#endif
+
 static inline bool gicv3_its_host_has_its(void)
 {
     return false;
-- 
2.7.4
[Xen-devel] [PATCH 1/4] ARM: ITS: Add translation_id to host_its
This patch adds a translation_id to the host_its data structure. The value
stored in this id should be copied over to the hardware domain's MADT table.

Signed-off-by: Manish Jaggi <mja...@cavium.com>
---
 xen/include/asm-arm/gic_v3_its.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/xen/include/asm-arm/gic_v3_its.h b/xen/include/asm-arm/gic_v3_its.h
index 1fac1c7..96b910b 100644
--- a/xen/include/asm-arm/gic_v3_its.h
+++ b/xen/include/asm-arm/gic_v3_its.h
@@ -118,6 +118,8 @@ struct host_its {
     const struct dt_device_node *dt_node;
     paddr_t addr;
     paddr_t size;
+    /* A unique value to identify each ITS */
+    u32 translation_id;
     void __iomem *its_base;
     unsigned int devid_bits;
     unsigned int evid_bits;
-- 
2.7.4
[Xen-devel] [PATCH 0/4] ARM: ACPI: ITS: Add ITS Support for ACPI hardware domain
Hi,

This patch series adds support of ITS for the ACPI hardware domain. It is
tested on the staging branch, which has the ITS v12 patchset by Andre. I have
tried to incorporate the review comments on the RFC v1/v2 patch. The single
patch in the RFC is now split into 4 patches.

Patch 1: ARM: ITS: Add translation_id to host_its
  Adds translation_id to the host_its data structure, populated from the
  translation_id read from the firmware MADT. This value is then programmed
  into the local MADT created for the hardware domain in patch 4.

Patch 2: ARM: ITS: ACPI: Introduce gicv3_its_acpi_init
  Introduces a function for its_acpi_init, which calls add_to_host_its_list,
  a common function also called from the _dt variant.

Patch 3: ARM: ITS: Deny hardware domain access to its
  Extends gicv3_iomem_deny_access to cover the ITS regions as well.

Patch 4: ARM: ACPI: Add ITS to hardware domain MADT
  Adds ITS information in the hardware domain's MADT table. Also introduces
  .get_hwdom_madt_size in gic_hw_operations, to return the complete size of
  the MADT table for the hardware domain.

Manish Jaggi (4):
  ARM: ITS: Add translation_id to host_its
  ARM: ITS: ACPI: Introduce gicv3_its_acpi_init
  ARM: ITS: Deny hardware domain access to its
  ARM: ACPI: Add ITS to hardware domain MADT

 xen/arch/arm/domain_build.c      |   7 +--
 xen/arch/arm/gic-v2.c            |   6 +++
 xen/arch/arm/gic-v3-its.c        | 102 +++
 xen/arch/arm/gic-v3.c            |  31
 xen/arch/arm/gic.c               |  11 +
 xen/include/asm-arm/gic.h        |   3 ++
 xen/include/asm-arm/gic_v3_its.h |  36 ++
 7 files changed, 180 insertions(+), 16 deletions(-)

-- 
2.7.4
Re: [Xen-devel] [RFC v2][PATCH] arm-acpi: Add ITS Support for Dom0
On 6/13/2017 4:58 PM, Julien Grall wrote:
On 13/06/17 12:02, Manish Jaggi wrote:
Will the below code be ok?
If you noticed, I didn't say this code is wrong. Instead I asked why you use the same ID. Meaning, is there anything in the DSDT requiring this value?
+ int tras_id = 0;
unsigned.
+ list_for_each_entry(its_data, &host_its_list, entry)
+ {
+ gic_its->translation_id = ++trans_id;
You start the translation ID at 1. Why?
As per the ACPI spec the value should be unique to each GIC ITS unit. Does starting with 1 break anything? Or should I start with a magic number?
Rather than arguing about the start value here, you should have first answered the question regarding the usage of translation_id.
In v1 I assumed that it would be the same as read from the host ITS tables, so it would have a unique value as programmed by host firmware.
I understand that nobody is using it today. However, when I asked around, nobody ruled out any future usage of the GIC ITS ID, and they requested this to be kept as it is. This means that you can simply copy over the ACPI tables rather than regenerating them.
I don't follow your comment; I am a bit confused. In v1 you asked: "Please explain why you need to have the same ID as the host." Now you say to copy it over, so the translation_id would be the same as that of the host?
Cheers,
Re: [Xen-devel] [RFC v2][PATCH] arm-acpi: Add ITS Support for Dom0
Hi julien, On 6/9/2017 2:09 PM, Julien Grall wrote: On 09/06/2017 07:48, Manish Jaggi wrote: On 6/8/2017 7:28 PM, Julien Grall wrote: Hi, Hello Julien, Hello, +list_for_each_entry(its_data, _its_list, entry) +{ Pointless { +size += sizeof(struct acpi_madt_generic_translator); +} Just for readability of code. You have indentation for that. So I don't think it helps. ok i will fix it. Same here + add a newline. Sure. +return size; +} + +u32 gicv3_its_make_hwdom_madt(u8 *base_ptr, u32 offset) +{ +struct acpi_madt_generic_translator *gic_its; +const struct host_its *its_data; +u32 table_len = offset, size; + +/* Update GIC ITS information in hardware domain's MADT */ +list_for_each_entry(its_data, _its_list, entry) +{ +size = sizeof(struct acpi_madt_generic_translator); +gic_its = (struct acpi_madt_generic_translator *)(base_ptr + table_len); This line is likely too long. I will check it. +gic_its->header.type = ACPI_MADT_TYPE_GENERIC_TRANSLATOR; +gic_its->header.length = size; +gic_its->base_address = its_data->addr; On the previous patch you had: gic_its->translation_id = its_data->translation_id; I asked to explain why you need to have the same ID as the host. And now you dropped it. This does not match the spec (Table 5-67 in ACPI 6.1): "GIC ITS ID. In a system with multiple GIC ITS units, this value must be unique to each one." But here, the ITS ID will not be unique. So why did you dropped it? The reason I dropped it from its_data as I was not setting it. So it doesn't belong there. Where would it belong then? This function is used to generate ACPI tables for the hardware domain. Will the below code be ok? If you noticed, I didn't say this code is wrong. Instead I asked why you use the same ID. Meaning, is there anything in the DSDT requiring this value? + int tras_id = 0; unsigned. + list_for_each_entry(its_data, _its_list, entry) + { +gic_its->translation_id = ++trans_id; You start the translation ID at 1. Why? 
as per the ACPI spec the value should be unique to each GIC ITS unit. Does starting with 1 break anything? Or should I start with a magic number? +table_len += size; +} +return table_len; +} + /* * Create the respective guest DT nodes from a list of host ITSes. * This copies the reg property, so the guest sees the ITS at the same address @@ -992,6 +1045,26 @@ int gicv3_its_make_hwdom_dt_nodes(const struct domain *d, return res; } +int gicv3_its_acpi_init(struct acpi_subtable_header *header, const unsigned long end) ACPI is an option and is not able by default. Please make sure that this code build without ACPI. Likely this means surrounding with #ifdef CONFIG_ACPI. I will get compiled but not called. Do you still want to put ifdef, i can add that. All ACPIs functions are protected by ifdef. So this one should be as well. ok will do. +{ +struct acpi_madt_generic_translator *its_entry; +struct host_its *its_data; + +its_data = xzalloc(struct host_its); +if (!its_data) Coding style. Sure. +return -1; + +its_entry = (struct acpi_madt_generic_translator *)header; +its_data->addr = its_entry->base_address; +its_data->size = ACPI_GICV3_ITS_MEM_SIZE; + +spin_lock_init(_data->cmd_lock); + +printk("GICv3: Found ITS @0x%lx\n", its_data->addr); + +list_add_tail(_data->entry, _its_list); As said on v1, likely you could re-use factorize a part of gicv3_its_dt_init to avoid implementing twice the initialization. For this I have a different opinion. Why didn't you state it on the previous version? I usually interpret a non-answer as an acknowledgment. gicv3_its_dt_init has a loop dt_for_each_child_node(node, its) while gicv3_its_acpi_init is a callback. Moreover, apart from xzalloc and list_add_tail most of the code is different. so IMHO keeping them separate is better. You still set addr and size as in the DT counterpart. Also, this is a call to forget to initialize a field if we decided to extend the structure host_its. 
So I still don't see any reason to open-code it and take the risk to introduce bug in the future... ok Added. Also newline. +return 0; +} Newline here. Sure. /* Scan the DT for any ITS nodes and create a list of host ITSes out of it. */ void gicv3_its_dt_init(const struct dt_device_node *node) { diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c index c927306..f0f6d12 100644 --- a/xen/arch/arm/gic-v3.c +++ b/xen/arch/arm/gic-v3.c @@ -1333,9 +1333,8 @@ static int gicv3_iomem_deny_access(const struct domain *d) return iomem_deny_access(d, mfn, mfn + nr); } -return 0; +return gicv3_its_deny_access(d); Copying my answer from v1 for convenience: if ( vbase != INVALID_PADDR ) { mfn = vbase >> PAGE_S
Re: [Xen-devel] ARM: SMMUv3 support
On 3/29/2017 5:30 AM, Goel, Sameer wrote:
Sure, I will try to post something soon.
Hi Sameer,
Are you still working on SMMUv3? Can you please post the patches?
Thanks
Manish
Thanks, Sameer
On 3/27/2017 11:03 PM, Vijay Kilari wrote:
On Mon, Mar 27, 2017 at 10:00 PM, Goel, Sameer wrote:
Hi, I am working on adding this support. The work is in initial stages and will target ACPI systems to start with. Do you have a specific requirement? Or even better: want to help with DT testing? :)
Thanks Sameer. I don't have any specific requirement. I am also looking at ACPI support. Please share your RFC patches so that I can test them on our platform.
Thanks, Sameer
On 3/20/2017 11:58 PM, Vijay Kilari wrote:
Hi, Is there any effort by anyone to get SMMUv3 support into Xen for ARM64? Would be glad to know.
Regards
Vijay
-- 
Qualcomm Innovation Center, Inc. The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.
Re: [Xen-devel] [PATCH v11 00/34] arm64: Dom0 ITS emulation
On 6/9/2017 11:11 PM, Andre Przywara wrote: Hi, Hi Andre, Tested this patchset + my acpi ITS patch (https://lists.xen.org/archives/html/xen-devel/2017-06/msg00716.html) on our platform and it works. With v10 was not able to get interrupts. v9 was booting ok. WBR -Manish fixes to v10, with their number getting eventually smaller ;-) The same restriction as for the previous versions still apply: the locking is considered somewhat insufficient and will be fixed by an upcoming rework. Patch 01/34 was reworked to properly synchronize access to the priority in a lock-less fashion. This should be back-ported to 4.9. The former patch 12/32 ("enable ITS and LPIs on the host") was moved up-front and split to allow back-porting the new 02/34 to Xen 4.9, which is broken if the preliminary ITS support is configured in and the machine advertises an ITS in the device tree. No big changes this time: some bugs fixed (many thanks to Julien for proper testing!), some extended comments and some improvements to better protect parallel accesses. For a detailed changelog see below. I added Acked-by: and Reviewed-by: tags, but refrained from doing so for Julien's tags for patch 18/34 and 20/34, since I changed them slightly. Cheers, Andre -- This series adds support for emulation of an ARM GICv3 ITS interrupt controller. For hardware which relies on the ITS to provide interrupts for its peripherals this code is needed to get a machine booted into Dom0 at all. ITS emulation for DomUs is only really useful with PCI passthrough, which is not yet available for ARM. It is expected that this feature will be co-developed with the ITS DomU code. However this code drop here considered DomU emulation already, to keep later architectural changes to a minimum. This is technical preview version to allow early testing of the feature. Things not (properly) addressed in this release: - There is only support for Dom0 at the moment. 
DomU support is only really useful with PCI passthrough, which is not there yet for ARM. - The MOVALL command is not emulated. In our case there is really nothing to do here. We might need to revisit this in the future for DomU support. - The INVALL command might need some rework to be more efficient. Currently we iterate over all mapped LPIs, which might take a bit longer. - Indirect tables are not supported. This affects both the host and the virtual side. - The ITS tables inside (Dom0) guest memory cannot easily be protected at the moment (without restricting access to Xen as well). So for now we trust Dom0 not to touch this memory (which the spec forbids as well). - With malicious guests (DomUs) there is a possibility of an interrupt storm triggered by a device. We would need to investigate what that means for Xen and if there is a nice way to prevent this. Disabling the LPI on the host side would require command queuing, which has its downsides to be issued during runtime. - Dom0 should make sure that the ITS resources (number of LPIs, devices, events) later handed to a DomU are really limited, as a large number of them could mean much time spend in Xen to initialize, free or handle those. It is expected that the toolstack sets up a tailored ITS with just enough resources to accommodate the needs of the actual passthrough-ed device(s). - The command queue locking is currently suboptimal and should be made more fine-grained in the future, if possible. - Provide support for running with an IOMMU, to map the doorbell page to all devices. Some generic design principles: * The current GIC code statically allocates structures for each supported IRQ (both for the host and the guest), which due to the potentially millions of LPI interrupts is not feasible to copy for the ITS. So we refrain from introducing the ITS as a first class Xen interrupt controller, also we don't hold struct irq_desc's or struct pending_irq's for each possible LPI. 
Fortunately LPIs are only interesting to guests, so we get away with storing only the virtual IRQ number and the guest VCPU for each allocated host LPI, which can be stashed into one uint64_t. This data is stored in a two-level table, which is both memory efficient and quick to access. We hook into the existing IRQ handling and VGIC code to avoid accessing the normal structures, providing alternative methods for getting the needed information (priority, is enabled?) for LPIs. Whenever a guest maps a device, we allocate the maximum required number of struct pending_irq's, so that any triggering LPI can find its data structure. Upon the guest actually mapping the LPI, this pointer to the corresponding pending_irq gets entered into a radix tree, so that it can be quickly looked up. * On the guest side we (later will) have to deal with malicious guests trying to hog Xen with mapping requests for a lot of LPIs, for instance. As the ITS actually uses system memory for storing status information, we use this memory (which the guest has to
Re: [Xen-devel] [RFC] [PATCH] arm-acpi: Hide SMMU from IORT for hardware domain
Hi Julien, On 6/9/2017 2:53 PM, Julien Grall wrote: On 09/06/2017 08:13, Manish Jaggi wrote: On 6/8/2017 6:39 PM, Julien Grall wrote: Hi Manish, Hi Julien, Hello, On 08/06/17 13:38, Manish Jaggi wrote: Spurious line. This patch disables the smmu node in IORT table for hardware domain. Also patches the output_base of pci_rc id_array with output_base of smmu node id_array. I would have appreciated a bit more description in the commit message to explain your logic. I will add it. Signed-off-by: Manish Jaggi <mja...@cavium.com> --- xen/arch/arm/domain_build.c | 142 +++- domain_build.c is starting to be really big. I think it is time to move some acpi bits outside domain_build.c. You are right, I also thought that. How about 3 files: domain_build.c, acpi_domain_build.c, dt_domain_build.c? If you want to split the current code, then fine. But it is not strictly mandatory for this code. What I want is adding new code in separate files. But in this case they should be named: domain_build.c acpi/domain_build.c dt/domain_build.c This would keep the ACPI and DT firmware code separated and not polluting the arch/arm. I will follow this structure. xen/include/acpi/actbl2.h | 3 +- xen/include/asm-arm/acpi.h | 1 + 3 files changed, 144 insertions(+), 2 deletions(-) diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c index d6d6c94..9f41d0e 100644 --- a/xen/arch/arm/domain_build.c +++ b/xen/arch/arm/domain_build.c @@ -32,6 +32,7 @@ integer_param("dom0_max_vcpus", opt_dom0_max_vcpus); int dom0_11_mapping = 1; static u64 __initdata dom0_mem; +static u8 *iort_base_ptr; Looking at the code, I don't see any reason to have this global. If you look a bit closer, this is used in multiple places; see fixup_pcirc_node, hide_smmu_iort. My point stands... you could have passed iort_base_ptr as an extra parameter of the functions. Or even use kinfo. Anyway, at the moment I don't see any reason to have this global variable. OK, I will pass it as a parameter.
static void __init parse_dom0_mem(const char *s) { @@ -1336,6 +1337,96 @@ static int prepare_dtb(struct domain *d, struct kernel_info *kinfo) #ifdef CONFIG_ACPI #define ACPI_DOM0_FDT_MIN_SIZE 4096 +static void patch_output_ref(struct acpi_iort_id_mapping *pci_idmap, + struct acpi_iort_node *smmu_node) +{ +struct acpi_iort_id_mapping *idmap = NULL; +int i; Newline. Sure. +for (i=0; i < smmu_node->mapping_count; i++) { Please respect Xen coding style... I expect you to fix *all* the place in the next version. Also, there is a latent lack of comments within the patch to explain the logic. I will add detail comments. +if(!idmap) +idmap = (struct acpi_iort_id_mapping*)((u8*)smmu_node + + smmu_node->mapping_offset); +else +idmap++; + +if (pci_idmap->output_base == idmap->input_base) { +pci_idmap->output_base = idmap->output_base; +pci_idmap->output_reference = idmap->output_reference; As I pointed out on the previous thread, you assume that one PCI ID mapping will end up to be translated to one Device ID mapping and not split across multiple one. For instance: The assumption is based on the ACPI tables on two platforms ThunderX and ThunderX2. While the spec does not deny it but would there be a use case as such where a PCI node id array would split the range into the same smmu. May I remind you that the goal of Xen is to run on *all* the current and future platforms. If the spec says it is allowed, then we should do it unless there is a strong reason not to do it. RC A // doesn't use SMMU 0 so just outputs DeviceIDs to ITS GROUP 0 // Input ID --> Output reference: Output ID 0x-0x --> ITS GROUP 0 : 0x->0x This is not relevant as this code wont touch RC A. Can you avoid to dismiss any example that don't fit your solution? This is not helpful. Sure. I will add more description in that case. Describing the RC is relevant in my example to show a case that your solution will not handle. I will add my rationale here. 
Hiding smmu from IORT table would require setting device ID in the pci_rc id_array for RID and output reference as ITS group. For the RC idarray elements which don't have an output reference as smmu but a ITS group, there is no need to touch them. Based on this rationale I said this is not relevant. SMMU 0 // Note that range of StreamIDs that map to DeviceIDs excludes // the NIC 0 DeviceID as it does not generate MSIs // Input ID --> Output reference: Output ID 0x-0x01ff --> ITS GROUP 0 : 0x1->0x101ff 0x0200-0x --> ITS GROUP 0 : 0x2->0x207ff It can be from 2 different RC's and not from same RC. It is not my point in this example. My point is same RC with spli
Re: [Xen-devel] [RFC] [PATCH] arm-acpi: Hide SMMU from IORT for hardware domain
On 6/8/2017 6:39 PM, Julien Grall wrote: Hi Manish, Hi Julien, On 08/06/17 13:38, Manish Jaggi wrote: Spurious line. This patch disables the smmu node in IORT table for hardware domain. Also patches the output_base of pci_rc id_array with output_base of smmu node id_array. I would have appreciated a bit more description in the commit message to explain your logic. I will add it. Signed-off-by: Manish Jaggi <mja...@cavium.com> --- xen/arch/arm/domain_build.c | 142 +++- domain_build.c is starting to be really big. I think it is time to move some acpi bits outside domain_build.c. You are right, I also thought that How about 3 files domain_build.c acpi_domain_build.c dt_domain_build.c xen/include/acpi/actbl2.h | 3 +- xen/include/asm-arm/acpi.h | 1 + 3 files changed, 144 insertions(+), 2 deletions(-) diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c index d6d6c94..9f41d0e 100644 --- a/xen/arch/arm/domain_build.c +++ b/xen/arch/arm/domain_build.c @@ -32,6 +32,7 @@ integer_param("dom0_max_vcpus", opt_dom0_max_vcpus); int dom0_11_mapping = 1; static u64 __initdata dom0_mem; +static u8 *iort_base_ptr; Looking at the code, I don't see any reason to have this global. If you look a bit closer this is used at multiple places see fixup_pcirc_node, hide_smmu_iort. static void __init parse_dom0_mem(const char *s) { @@ -1336,6 +1337,96 @@ static int prepare_dtb(struct domain *d, struct kernel_info *kinfo) #ifdef CONFIG_ACPI #define ACPI_DOM0_FDT_MIN_SIZE 4096 +static void patch_output_ref(struct acpi_iort_id_mapping *pci_idmap, + struct acpi_iort_node *smmu_node) +{ +struct acpi_iort_id_mapping *idmap = NULL; +int i; Newline. Sure. +for (i=0; i < smmu_node->mapping_count; i++) { Please respect Xen coding style... I expect you to fix *all* the place in the next version. Also, there is a latent lack of comments within the patch to explain the logic. I will add detail comments. 
+if(!idmap) +idmap = (struct acpi_iort_id_mapping*)((u8*)smmu_node + + smmu_node->mapping_offset); +else +idmap++; + +if (pci_idmap->output_base == idmap->input_base) { +pci_idmap->output_base = idmap->output_base; +pci_idmap->output_reference = idmap->output_reference; As I pointed out on the previous thread, you assume that one PCI ID mapping will end up to be translated to one Device ID mapping and not split across multiple one. For instance: The assumption is based on the ACPI tables on two platforms ThunderX and ThunderX2. While the spec does not deny it but would there be a use case as such where a PCI node id array would split the range into the same smmu. RC A // doesn't use SMMU 0 so just outputs DeviceIDs to ITS GROUP 0 // Input ID --> Output reference: Output ID 0x-0x --> ITS GROUP 0 : 0x->0x This is not relevant as this code wont touch RC A. SMMU 0 // Note that range of StreamIDs that map to DeviceIDs excludes // the NIC 0 DeviceID as it does not generate MSIs // Input ID --> Output reference: Output ID 0x-0x01ff --> ITS GROUP 0 : 0x1->0x101ff 0x0200-0x --> ITS GROUP 0 : 0x2->0x207ff It can be from 2 different RC's and not from same RC. // SMMU 0 Control interrupt is MSI based // Input ID --> Output reference: Output ID N/A --> ITS GROUP 0 : 0x21 I still don't see anything in the spec preventing that. And I would like clarification from your side before going forward. *hint* The spec should be quoted *hint* Spec does not prevent that, but we need to see IMHO what all cases are practically possible and current platforms support it. Is there any platform which supports that ? I can add code for the combinations but how I will test it. [...] 
diff --git a/xen/include/acpi/actbl2.h b/xen/include/acpi/actbl2.h index 42beac4..f180ea5 100644 --- a/xen/include/acpi/actbl2.h +++ b/xen/include/acpi/actbl2.h @@ -591,7 +591,8 @@ enum acpi_iort_node_type { ACPI_IORT_NODE_NAMED_COMPONENT = 0x01, ACPI_IORT_NODE_PCI_ROOT_COMPLEX = 0x02, ACPI_IORT_NODE_SMMU = 0x03, -ACPI_IORT_NODE_SMMU_V3 = 0x04 +ACPI_IORT_NODE_SMMU_V3 = 0x04, +ACPI_IORT_NODE_RESERVED = 0xff This is likely a call to a separate patch. ok. }; struct acpi_iort_id_mapping { diff --git a/xen/include/asm-arm/acpi.h b/xen/include/asm-arm/acpi.h index 9f954d3..1cc0167 100644 --- a/xen/include/asm-arm/acpi.h +++ b/xen/include/asm-arm/acpi.h @@ -36,6 +36,7 @@ typedef enum { TBL_FADT, TBL_MADT, TBL_STAO, +TBL_IORT, TBL_XSDT, TBL_RSDP, TBL_EFIT, Cheers, ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [RFC v2][PATCH] arm-acpi: Add ITS Support for Dom0
On 6/8/2017 7:28 PM, Julien Grall wrote: Hi, Hello Julien, Please CC all relevant maintainers. Sure. Will do in the next patch rev. On 08/06/17 14:03, Manish Jaggi wrote: Spurious newline This patch supports ITS in hardware domain, supports ITS in Xen when booting with ACPI. Signed-off-by: Manish Jaggi <mja...@cavium.com> --- Changes since v1: - Moved its specific code to gic-v3-its.c - fixed macros It sounds like you haven't addressed all my comments. I will repeat them for this time. But next time, I will not bother reviewing your patch. *Thanks* for reviewing the patch, I will try to address _all_ the comments xen/arch/arm/domain_build.c | 6 ++-- xen/arch/arm/gic-v3-its.c| 75 +++- xen/arch/arm/gic-v3.c| 10 -- xen/include/asm-arm/gic_v3_its.h | 6 4 files changed, 91 insertions(+), 6 deletions(-) diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c index 3abacc0..d6d6c94 100644 --- a/xen/arch/arm/domain_build.c +++ b/xen/arch/arm/domain_build.c @@ -20,7 +20,7 @@ #include #include #include - Why did you drop this newline? I will fix it. +#include Nack. I asked on v1 to separate code between GICv3 and ITS, it is not for directly calling gicv3 code directly in the common code. If you need to call GICv3 specific code, then introduce a callback in gic_hw_operations. Good point, I will add it. #include #include #include @@ -1804,7 +1804,9 @@ static int estimate_acpi_efi_size(struct domain *d, struct kernel_info *kinfo) madt_size = sizeof(struct acpi_table_madt) + sizeof(struct acpi_madt_generic_interrupt) * d->max_vcpus -+ sizeof(struct acpi_madt_generic_distributor); ++ sizeof(struct acpi_madt_generic_distributor) ++ gicv3_its_madt_generic_translator_size(); See my comment above. Will address it. 
+ if ( d->arch.vgic.version == GIC_V3 ) madt_size += sizeof(struct acpi_madt_generic_redistributor) * d->arch.vgic.nr_regions; diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c index 1fb06ca..937b970 100644 --- a/xen/arch/arm/gic-v3-its.c +++ b/xen/arch/arm/gic-v3-its.c @@ -25,14 +25,18 @@ #include #include #include +#include The include are ordered alphabetically, please respect it. Sure. I will fix it. #include #include #include #include #include +#include +#include +#include Ditto. Sure. I will fix it. #define ITS_CMD_QUEUE_SZSZ_1M - Again, we don't drop newline for no reason. I will fix it. +#define ACPI_GICV3_ITS_MEM_SIZE (SZ_64K) /* * No lock here, as this list gets only populated upon boot while scanning * firmware tables for all host ITSes, and only gets iterated afterwards. @@ -920,6 +924,55 @@ int gicv3_lpi_change_vcpu(struct domain *d, paddr_t vdoorbell, return 0; } +int gicv3_its_deny_access(const struct domain *d) +{ +int rc = 0; +unsigned long mfn, nr; +const struct host_its *its_data; + +list_for_each_entry(its_data, _its_list, entry) +{ +mfn = paddr_to_pfn(its_data->addr); +nr = PFN_UP(ACPI_GICV3_ITS_MEM_SIZE); +rc = iomem_deny_access(d, mfn, mfn + nr); +if ( rc ) +goto end; Hmmm, why not using a break here rather than a goto? I can use break, np. +} +end: +return rc; +} + +u32 gicv3_its_madt_generic_translator_size(void) +{ +const struct host_its *its_data; +u32 size = 0; + +list_for_each_entry(its_data, _its_list, entry) +{ Pointless { +size += sizeof(struct acpi_madt_generic_translator); +} Just for readability of code. Same here + add a newline. Sure. 
+return size; +} + +u32 gicv3_its_make_hwdom_madt(u8 *base_ptr, u32 offset) +{ +struct acpi_madt_generic_translator *gic_its; +const struct host_its *its_data; +u32 table_len = offset, size; + +/* Update GIC ITS information in hardware domain's MADT */ +list_for_each_entry(its_data, _its_list, entry) +{ +size = sizeof(struct acpi_madt_generic_translator); +gic_its = (struct acpi_madt_generic_translator *)(base_ptr + table_len); This line is likely too long. I will check it. +gic_its->header.type = ACPI_MADT_TYPE_GENERIC_TRANSLATOR; +gic_its->header.length = size; +gic_its->base_address = its_data->addr; On the previous patch you had: gic_its->translation_id = its_data->translation_id; I asked to explain why you need to have the same ID as the host. And now you dropped it. This does not match the spec (Table 5-67 in ACPI 6.1): "GIC ITS ID. In a system with multiple GIC ITS units, this value must be unique to each one." But here, the ITS ID will not be unique. So why did you dropped it? The reason I dropped it from its_data as I was not setting it. So it doesn't b
[Xen-devel] [RFC v2][PATCH] arm-acpi: Add ITS Support for Dom0
This patch supports ITS in hardware domain, supports ITS in Xen when booting with ACPI. Signed-off-by: Manish Jaggi <mja...@cavium.com> --- Changes since v1: - Moved its specific code to gic-v3-its.c - fixed macros xen/arch/arm/domain_build.c | 6 ++-- xen/arch/arm/gic-v3-its.c| 75 +++- xen/arch/arm/gic-v3.c| 10 -- xen/include/asm-arm/gic_v3_its.h | 6 4 files changed, 91 insertions(+), 6 deletions(-) diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c index 3abacc0..d6d6c94 100644 --- a/xen/arch/arm/domain_build.c +++ b/xen/arch/arm/domain_build.c @@ -20,7 +20,7 @@ #include #include #include - +#include #include #include #include @@ -1804,7 +1804,9 @@ static int estimate_acpi_efi_size(struct domain *d, struct kernel_info *kinfo) madt_size = sizeof(struct acpi_table_madt) + sizeof(struct acpi_madt_generic_interrupt) * d->max_vcpus -+ sizeof(struct acpi_madt_generic_distributor); ++ sizeof(struct acpi_madt_generic_distributor) ++ gicv3_its_madt_generic_translator_size(); + if ( d->arch.vgic.version == GIC_V3 ) madt_size += sizeof(struct acpi_madt_generic_redistributor) * d->arch.vgic.nr_regions; diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c index 1fb06ca..937b970 100644 --- a/xen/arch/arm/gic-v3-its.c +++ b/xen/arch/arm/gic-v3-its.c @@ -25,14 +25,18 @@ #include #include #include +#include #include #include #include #include #include +#include +#include +#include #define ITS_CMD_QUEUE_SZSZ_1M - +#define ACPI_GICV3_ITS_MEM_SIZE (SZ_64K) /* * No lock here, as this list gets only populated upon boot while scanning * firmware tables for all host ITSes, and only gets iterated afterwards. 
@@ -920,6 +924,55 @@ int gicv3_lpi_change_vcpu(struct domain *d, paddr_t vdoorbell, return 0; } +int gicv3_its_deny_access(const struct domain *d) +{ +int rc = 0; +unsigned long mfn, nr; +const struct host_its *its_data; + +list_for_each_entry(its_data, _its_list, entry) +{ +mfn = paddr_to_pfn(its_data->addr); +nr = PFN_UP(ACPI_GICV3_ITS_MEM_SIZE); +rc = iomem_deny_access(d, mfn, mfn + nr); +if ( rc ) +goto end; +} +end: +return rc; +} + +u32 gicv3_its_madt_generic_translator_size(void) +{ +const struct host_its *its_data; +u32 size = 0; + +list_for_each_entry(its_data, _its_list, entry) +{ +size += sizeof(struct acpi_madt_generic_translator); +} +return size; +} + +u32 gicv3_its_make_hwdom_madt(u8 *base_ptr, u32 offset) +{ +struct acpi_madt_generic_translator *gic_its; +const struct host_its *its_data; +u32 table_len = offset, size; + +/* Update GIC ITS information in hardware domain's MADT */ +list_for_each_entry(its_data, _its_list, entry) +{ +size = sizeof(struct acpi_madt_generic_translator); +gic_its = (struct acpi_madt_generic_translator *)(base_ptr + table_len); +gic_its->header.type = ACPI_MADT_TYPE_GENERIC_TRANSLATOR; +gic_its->header.length = size; +gic_its->base_address = its_data->addr; +table_len += size; +} +return table_len; +} + /* * Create the respective guest DT nodes from a list of host ITSes. 
* This copies the reg property, so the guest sees the ITS at the same address @@ -992,6 +1045,26 @@ int gicv3_its_make_hwdom_dt_nodes(const struct domain *d, return res; } +int gicv3_its_acpi_init(struct acpi_subtable_header *header, const unsigned long end) +{ +struct acpi_madt_generic_translator *its_entry; +struct host_its *its_data; + +its_data = xzalloc(struct host_its); +if (!its_data) +return -1; + +its_entry = (struct acpi_madt_generic_translator *)header; +its_data->addr = its_entry->base_address; +its_data->size = ACPI_GICV3_ITS_MEM_SIZE; + +spin_lock_init(_data->cmd_lock); + +printk("GICv3: Found ITS @0x%lx\n", its_data->addr); + +list_add_tail(_data->entry, _its_list); +return 0; +} /* Scan the DT for any ITS nodes and create a list of host ITSes out of it. */ void gicv3_its_dt_init(const struct dt_device_node *node) { diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c index c927306..f0f6d12 100644 --- a/xen/arch/arm/gic-v3.c +++ b/xen/arch/arm/gic-v3.c @@ -1333,9 +1333,8 @@ static int gicv3_iomem_deny_access(const struct domain *d) return iomem_deny_access(d, mfn, mfn + nr); } -return 0; +return gicv3_its_deny_access(d); } - #ifdef CONFIG_ACPI static void __init gic_acpi_add_rdist_region(paddr_t base, paddr_t size, bool single_rdist) @@ -1374,6 +1373,7 @@ static int gicv3_make_hwdom_madt(const struct domain *d, u32 offset) for ( i =
[Xen-devel] [RFC] [PATCH] arm-acpi: Hide SMMU from IORT for hardware domain
This patch disables the smmu node in IORT table for hardware domain. Also patches the output_base of pci_rc id_array with output_base of smmu node id_array. Signed-off-by: Manish Jaggi <mja...@cavium.com> --- xen/arch/arm/domain_build.c | 142 +++- xen/include/acpi/actbl2.h | 3 +- xen/include/asm-arm/acpi.h | 1 + 3 files changed, 144 insertions(+), 2 deletions(-) diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c index d6d6c94..9f41d0e 100644 --- a/xen/arch/arm/domain_build.c +++ b/xen/arch/arm/domain_build.c @@ -32,6 +32,7 @@ integer_param("dom0_max_vcpus", opt_dom0_max_vcpus); int dom0_11_mapping = 1; static u64 __initdata dom0_mem; +static u8 *iort_base_ptr; static void __init parse_dom0_mem(const char *s) { @@ -1336,6 +1337,96 @@ static int prepare_dtb(struct domain *d, struct kernel_info *kinfo) #ifdef CONFIG_ACPI #define ACPI_DOM0_FDT_MIN_SIZE 4096 +static void patch_output_ref(struct acpi_iort_id_mapping *pci_idmap, + struct acpi_iort_node *smmu_node) +{ +struct acpi_iort_id_mapping *idmap = NULL; +int i; +for (i=0; i < smmu_node->mapping_count; i++) { +if(!idmap) +idmap = (struct acpi_iort_id_mapping*)((u8*)smmu_node + + smmu_node->mapping_offset); +else +idmap++; + +if (pci_idmap->output_base == idmap->input_base) { +pci_idmap->output_base = idmap->output_base; +pci_idmap->output_reference = idmap->output_reference; +} +} +} + +static void fixup_pcirc_node(struct acpi_iort_node *node) +{ +struct acpi_iort_id_mapping *idmap = NULL; +struct acpi_iort_node *onode; +int i=0; + +for (i=0; i < node->mapping_count; i++) { +if(!idmap) +idmap = (struct acpi_iort_id_mapping*)((u8*)node + + + node->mapping_offset); +else +idmap++; + +onode = (struct acpi_iort_node*)(iort_base_ptr + + idmap->output_reference); +switch (onode->type) +{ +case ACPI_IORT_NODE_ITS_GROUP: +continue; +case ACPI_IORT_NODE_SMMU: +case ACPI_IORT_NODE_SMMU_V3: + patch_output_ref(idmap, onode); +break; +} +} +} + +static int hide_smmu_iort(void) +{ +u32 i; +u32 node_offset = 0; 
+struct acpi_table_iort *iort_table; +struct acpi_iort_node *node = NULL; + +iort_table = (struct acpi_table_iort *)iort_base_ptr; + +for (i=0; i < iort_table->node_count; i++) { +if (!node){ +node = (struct acpi_iort_node *)(iort_base_ptr + + iort_table->node_offset); +node_offset = iort_table->node_offset; +} else { +node = (struct acpi_iort_node *)(iort_base_ptr + + node_offset); +} + +node_offset += node->length; +if (node->type == ACPI_IORT_NODE_PCI_ROOT_COMPLEX) +fixup_pcirc_node(node); +} + +node_offset = 0; +node = NULL; +for (i=0; i < iort_table->node_count; i++) { +if (!node){ +node = (struct acpi_iort_node *)(iort_base_ptr + + iort_table->node_offset); +node_offset = iort_table->node_offset; +} else { +node = (struct acpi_iort_node *)(iort_base_ptr + + node_offset); +} +node_offset += node->length; +if ((node->type == ACPI_IORT_NODE_SMMU) || + (node->type == ACPI_IORT_NODE_SMMU_V3)) +node->type = ACPI_IORT_NODE_RESERVED; +} + +return 0; +} + static int acpi_iomem_deny_access(struct domain *d) { acpi_status status; @@ -1348,7 +1439,12 @@ static int acpi_iomem_deny_access(struct domain *d) if ( rc ) return rc; -/* TODO: Deny MMIO access for SMMU, GIC ITS */ +/* Hide SMMU from IORT */ +rc = hide_smmu_iort(); +if (rc) +return rc; + +/* Deny MMIO access for GIC ITS */ status = acpi_get_table(ACPI_SIG_SPCR, 0, (struct acpi_table_header **)); @@ -1646,6 +1742,8 @@ static int acpi_create_xsdt(struct domain *d, struct membank tbl_add[]) ACPI_SIG_FADT, tbl_add[TBL_FADT].start); acpi_xsdt_modify_entry(xsdt->table_offset_entry, entry_count, ACPI_SIG_MADT, tbl_add[TBL_MADT].start); +acpi_xsdt_modify_entry(xsdt->table_offset_entry, entry_count, + ACPI_SIG_IORT, tbl_add[TBL_IORT].start); xsdt->table_offset_entry[entry_count] = tbl_add[TBL_STAO].start; xsdt->header.length = table_size; @@ -1794,11 +1892,23 @@ static int estimate_acpi_efi_size(struct domain *d, struct kernel_info *kinfo) { size
Re: [Xen-devel] xen/arm: Hiding SMMUs from Dom0 when using ACPI on Xen
On 5/19/2017 1:39 AM, Julien Grall wrote: On 18/05/2017 21:02, Manish Jaggi wrote: In the IORT table, using the PCI-RC node, SMMU node and ITS node, the RID->StreamID->DeviceID mapping can be generated. As per the IORT spec today, the same RID can be mapped to different StreamIDs using two ID Array elements with the same RID range but different output references. There exists no use case for such a scenario, hence a clarification is required in the IORT spec stating that RID ranges cannot overlap in the ID array. I understand that. With this clarification in place, it is straightforward to map a RID to a DeviceID by replacing the output of the SMMU with the output of the PCI-RC. I am not sure to follow your suggestion here. But I will wait for a patch before commenting. Please see [RFC] [PATCH] arm-acpi: Hide SMMU from IORT for hardware domain Cheers,
[Xen-devel] [RFC v2] [PATCH] arm64-its: Add ITS support for ACPI dom0
This patch supports ITS in hardware domain, supports ITS in Xen when booting with ACPI. Signed-off-by: Manish Jaggi <mja...@cavium.com> --- Changes since v1: - Moved its specific code to gic-v3-its.c - fixed macros xen/arch/arm/domain_build.c | 6 ++-- xen/arch/arm/gic-v3-its.c | 75 +++- xen/arch/arm/gic-v3.c | 10 -- xen/include/asm-arm/gic_v3_its.h | 6 4 files changed, 91 insertions(+), 6 deletions(-) diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c index 3abacc0..d6d6c94 100644 --- a/xen/arch/arm/domain_build.c +++ b/xen/arch/arm/domain_build.c @@ -20,7 +20,7 @@ #include #include #include - +#include #include #include #include @@ -1804,7 +1804,9 @@ static int estimate_acpi_efi_size(struct domain *d, struct kernel_info *kinfo) madt_size = sizeof(struct acpi_table_madt) + sizeof(struct acpi_madt_generic_interrupt) * d->max_vcpus - + sizeof(struct acpi_madt_generic_distributor); + + sizeof(struct acpi_madt_generic_distributor) + + gicv3_its_madt_generic_translator_size(); + if ( d->arch.vgic.version == GIC_V3 ) madt_size += sizeof(struct acpi_madt_generic_redistributor) * d->arch.vgic.nr_regions; diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c index 1fb06ca..937b970 100644 --- a/xen/arch/arm/gic-v3-its.c +++ b/xen/arch/arm/gic-v3-its.c @@ -25,14 +25,18 @@ #include #include #include +#include #include #include #include #include #include +#include +#include +#include #define ITS_CMD_QUEUE_SZ SZ_1M - +#define ACPI_GICV3_ITS_MEM_SIZE (SZ_64K) /* * No lock here, as this list gets only populated upon boot while scanning * firmware tables for all host ITSes, and only gets iterated afterwards. 
@@ -920,6 +924,55 @@ int gicv3_lpi_change_vcpu(struct domain *d, paddr_t vdoorbell, return 0; } +int gicv3_its_deny_access(const struct domain *d) +{ + int rc = 0; + unsigned long mfn, nr; + const struct host_its *its_data; + + list_for_each_entry(its_data, _its_list, entry) + { + mfn = paddr_to_pfn(its_data->addr); + nr = PFN_UP(ACPI_GICV3_ITS_MEM_SIZE); + rc = iomem_deny_access(d, mfn, mfn + nr); + if ( rc ) + goto end; + } +end: + return rc; +} + +u32 gicv3_its_madt_generic_translator_size(void) +{ + const struct host_its *its_data; + u32 size = 0; + + list_for_each_entry(its_data, _its_list, entry) + { + size += sizeof(struct acpi_madt_generic_translator); + } + return size; +} + +u32 gicv3_its_make_hwdom_madt(u8 *base_ptr, u32 offset) +{ + struct acpi_madt_generic_translator *gic_its; + const struct host_its *its_data; + u32 table_len = offset, size; + + /* Update GIC ITS information in hardware domain's MADT */ + list_for_each_entry(its_data, _its_list, entry) + { + size = sizeof(struct acpi_madt_generic_translator); + gic_its = (struct acpi_madt_generic_translator *)(base_ptr + table_len); + gic_its->header.type = ACPI_MADT_TYPE_GENERIC_TRANSLATOR; + gic_its->header.length = size; + gic_its->base_address = its_data->addr; + table_len += size; + } + return table_len; +} + /* * Create the respective guest DT nodes from a list of host ITSes. 
* This copies the reg property, so the guest sees the ITS at the same address @@ -992,6 +1045,26 @@ int gicv3_its_make_hwdom_dt_nodes(const struct domain *d, return res; } +int gicv3_its_acpi_init(struct acpi_subtable_header *header, const unsigned long end) +{ + struct acpi_madt_generic_translator *its_entry; + struct host_its *its_data; + + its_data = xzalloc(struct host_its); + if (!its_data) + return -1; + + its_entry = (struct acpi_madt_generic_translator *)header; + its_data->addr = its_entry->base_address; + its_data->size = ACPI_GICV3_ITS_MEM_SIZE; + + spin_lock_init(_data->cmd_lock); + + printk("GICv3: Found ITS @0x%lx\n", its_data->addr); + + list_add_tail(_data->entry, _its_list); + return 0; +} /* Scan the DT for any ITS nodes and create a list of host ITSes out of it. */ void gicv3_its_dt_init(const struct dt_device_node *node) { diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c index c927306..f0f6d12 100644 --- a/xen/arch/arm/gic-v3.c +++ b/xen/arch/arm/gic-v3.c @@ -1333,9 +1333,8 @@ static int gicv3_iomem_deny_access(const struct domain *d) return iomem_deny_access(d, mfn, mfn + nr); } - return 0; + return gicv3_its_deny_access(d); } - #ifdef CONFIG_ACPI static void __init gic_acpi_add_rdist_region(paddr_t base, paddr_t size, bool single_rdist) @@ -1374,6 +1373,7 @@ static int gicv3_make_hwdom_madt(const struct domain *d, u32 offset) for ( i = 0; i < d->max_vcpus; i++ ) { gicc = (struct acpi_madt_generic_interrupt *)(base_ptr + table_len); + ACPI_MEMCPY(gicc, host_gicc, size); gicc->cpu_interface_number = i; gicc->uid = i; @@ -1399,7 +1399,7 @@ static int gicv3_make_hwdom_madt(const struct domain *d, u32 offset) gicr->length = d->arch.vgic.rdist_regions[i].size; table_len += size; } - + table_len = gic
Re: [Xen-devel] [RFC] [PATCH] arm64-its: Add ITS support for ACPI dom0
Hi Julien, On 5/30/2017 4:07 PM, Julien Grall wrote: Hello Manish, On 30/05/17 07:07, Manish Jaggi wrote: This patch is an RFC on top of Andre's v10 series. https://www.mail-archive.com/xen-devel@lists.xen.org/msg109093.html This patch deny's access to ITS region for the guest and also updates s/deny's/denies/ the acpi tables for dom0. This patch is doing more than supporting ITS in the hardware domain. It also allows support of ITS in Xen when booting using ACPI. Signed-off-by: Manish Jaggi <mja...@cavium.com> --- xen/arch/arm/gic-v3.c| 49 xen/include/asm-arm/gic_v3_its.h | 1 + 2 files changed, 50 insertions(+) diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c index c927306..f496fc1 100644 --- a/xen/arch/arm/gic-v3.c +++ b/xen/arch/arm/gic-v3.c @@ -1301,6 +1301,7 @@ static int gicv3_iomem_deny_access(const struct domain *d) { int rc, i; unsigned long mfn, nr; +const struct host_its *its_data; mfn = dbase >> PAGE_SHIFT; nr = DIV_ROUND_UP(SZ_64K, PAGE_SIZE); @@ -1333,6 +1334,16 @@ static int gicv3_iomem_deny_access(const struct domain *d) return iomem_deny_access(d, mfn, mfn + nr); } If GICv2 is supported, the function will bail out as soon as the virtual base region is denied (see just above). Didn't get your point. gicv2 already has a similar function, gicv2_iomem_deny_access. Can you please elaborate? I am sending a v2 version of the patch incorporating other comments. +/* deny for ITS as well */ +list_for_each_entry(its_data, _its_list, entry) +{ +mfn = its_data->addr >> PAGE_SHIFT; Please don't open-code the shift; use paddr_to_pfn(...) instead. ok. +nr = DIV_ROUND_UP(SZ_128K, PAGE_SIZE); Please use PFN_UP rather than DIV_ROUND_UP(...). ok Also, where does the SZ_128K come from? +rc = iomem_deny_access(d, mfn, mfn + nr); +if ( rc ) +return rc; +} No implementation of ITS specific code in the GICv3 driver please. Instead introduce a helper for that.
+ return 0; } @@ -1357,8 +1368,10 @@ static int gicv3_make_hwdom_madt(const struct domain *d, u32 offset) struct acpi_subtable_header *header; struct acpi_madt_generic_interrupt *host_gicc, *gicc; struct acpi_madt_generic_redistributor *gicr; +struct acpi_madt_generic_translator *gic_its; u8 *base_ptr = d->arch.efi_acpi_table + offset; u32 i, table_len = 0, size; +const struct host_its *its_data; See my comment above regarding ITS specific code. /* Add Generic Interrupt */ header = acpi_table_get_entry_madt(ACPI_MADT_TYPE_GENERIC_INTERRUPT, 0); @@ -1374,6 +1387,7 @@ static int gicv3_make_hwdom_madt(const struct domain *d, u32 offset) for ( i = 0; i < d->max_vcpus; i++ ) { gicc = (struct acpi_madt_generic_interrupt *)(base_ptr + table_len); + Spurious change. ACPI_MEMCPY(gicc, host_gicc, size); gicc->cpu_interface_number = i; gicc->uid = i; @@ -1399,6 +1413,18 @@ static int gicv3_make_hwdom_madt(const struct domain *d, u32 offset) gicr->length = d->arch.vgic.rdist_regions[i].size; table_len += size; } + +/* Update GIC ITS information in dom0 madt */ s/dom0/hardware domain/ s/madt/MADT/ Also, likely you want to make sure you have space in efi_acpi_table (see estimate_acpi_efi_size). +list_for_each_entry(its_data, _its_list, entry) +{ +size = sizeof(struct acpi_madt_generic_translator); +gic_its = (struct acpi_madt_generic_translator *)(base_ptr + table_len); +gic_its->header.type = ACPI_MADT_TYPE_GENERIC_TRANSLATOR; +gic_its->header.length = size; +gic_its->base_address = its_data->addr; +gic_its->translation_id = its_data->translation_id; Please explain why you need to have the same ID as the host. +table_len += size; +} return table_len; } @@ -1511,6 +1537,25 @@ gic_acpi_get_madt_redistributor_num(struct acpi_subtable_header *header, */ return 0; } Newnline here. +#define ACPI_GICV3_ITS_MEM_SIZE (SZ_128K) + +int gicv3_its_acpi_init(struct acpi_subtable_header *header, const unsigned long end) Why this is not static? 
>> +{
>
> Same remark as above regarding ITS-specific code.
>
>> +    struct acpi_madt_generic_translator *its_entry;
>> +    struct host_its *its_data;
>> +
>> +    its_data = xzalloc(struct host_its);
>
> What if xzalloc fails?
>
>> +    its_entry = (struct acpi_madt_generic_translator *)header;
>> +    its_data->addr = its_entry->base_address;
>> +    its_data->size = ACPI_GICV3_ITS_MEM_SIZE;
>> +
>> +    spin_lock_init(&its_data->cmd_lock);
>> +
>> +    printk("GICv3: Found ITS @0x%lx\n", its_data->addr);
>> +
>> +    list_add_tail(&its_data->entry, &host_its_list);
>
> Likely you could re-use/factorize a part of gicv3_its_dt_init to avoid
> implementing the initialization twice.
[Xen-devel] [RFC] [PATCH] arm64-its: Add ITS support for ACPI dom0
This patch is an RFC on top of Andre's v10 series.
https://www.mail-archive.com/xen-devel@lists.xen.org/msg109093.html

This patch denies access to the ITS region for the guest and also updates
the ACPI tables for dom0.

Signed-off-by: Manish Jaggi <mja...@cavium.com>
---
 xen/arch/arm/gic-v3.c            | 49
 xen/include/asm-arm/gic_v3_its.h |  1 +
 2 files changed, 50 insertions(+)

diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
index c927306..f496fc1 100644
--- a/xen/arch/arm/gic-v3.c
+++ b/xen/arch/arm/gic-v3.c
@@ -1301,6 +1301,7 @@ static int gicv3_iomem_deny_access(const struct domain *d)
 {
     int rc, i;
     unsigned long mfn, nr;
+    const struct host_its *its_data;

     mfn = dbase >> PAGE_SHIFT;
     nr = DIV_ROUND_UP(SZ_64K, PAGE_SIZE);
@@ -1333,6 +1334,16 @@ static int gicv3_iomem_deny_access(const struct domain *d)
         return iomem_deny_access(d, mfn, mfn + nr);
     }

+    /* deny for ITS as well */
+    list_for_each_entry(its_data, &host_its_list, entry)
+    {
+        mfn = its_data->addr >> PAGE_SHIFT;
+        nr = DIV_ROUND_UP(SZ_128K, PAGE_SIZE);
+        rc = iomem_deny_access(d, mfn, mfn + nr);
+        if ( rc )
+            return rc;
+    }
+
     return 0;
 }

@@ -1357,8 +1368,10 @@ static int gicv3_make_hwdom_madt(const struct domain *d, u32 offset)
     struct acpi_subtable_header *header;
     struct acpi_madt_generic_interrupt *host_gicc, *gicc;
     struct acpi_madt_generic_redistributor *gicr;
+    struct acpi_madt_generic_translator *gic_its;
     u8 *base_ptr = d->arch.efi_acpi_table + offset;
     u32 i, table_len = 0, size;
+    const struct host_its *its_data;

     /* Add Generic Interrupt */
     header = acpi_table_get_entry_madt(ACPI_MADT_TYPE_GENERIC_INTERRUPT, 0);
@@ -1374,6 +1387,7 @@ static int gicv3_make_hwdom_madt(const struct domain *d, u32 offset)
     for ( i = 0; i < d->max_vcpus; i++ )
     {
         gicc = (struct acpi_madt_generic_interrupt *)(base_ptr + table_len);
+
         ACPI_MEMCPY(gicc, host_gicc, size);
         gicc->cpu_interface_number = i;
         gicc->uid = i;
@@ -1399,6 +1413,18 @@ static int gicv3_make_hwdom_madt(const struct domain *d, u32 offset)
         gicr->length =
 d->arch.vgic.rdist_regions[i].size;
         table_len += size;
     }
+
+    /* Update GIC ITS information in dom0 madt */
+    list_for_each_entry(its_data, &host_its_list, entry)
+    {
+        size = sizeof(struct acpi_madt_generic_translator);
+        gic_its = (struct acpi_madt_generic_translator *)(base_ptr + table_len);
+        gic_its->header.type = ACPI_MADT_TYPE_GENERIC_TRANSLATOR;
+        gic_its->header.length = size;
+        gic_its->base_address = its_data->addr;
+        gic_its->translation_id = its_data->translation_id;
+        table_len += size;
+    }

     return table_len;
 }

@@ -1511,6 +1537,25 @@ gic_acpi_get_madt_redistributor_num(struct acpi_subtable_header *header,
      */
     return 0;
 }
+#define ACPI_GICV3_ITS_MEM_SIZE (SZ_128K)
+
+int gicv3_its_acpi_init(struct acpi_subtable_header *header, const unsigned long end)
+{
+    struct acpi_madt_generic_translator *its_entry;
+    struct host_its *its_data;
+
+    its_data = xzalloc(struct host_its);
+    its_entry = (struct acpi_madt_generic_translator *)header;
+    its_data->addr = its_entry->base_address;
+    its_data->size = ACPI_GICV3_ITS_MEM_SIZE;
+
+    spin_lock_init(&its_data->cmd_lock);
+
+    printk("GICv3: Found ITS @0x%lx\n", its_data->addr);
+
+    list_add_tail(&its_data->entry, &host_its_list);
+    return 0;
+}

 static void __init gicv3_acpi_init(void)
 {
@@ -1567,6 +1612,9 @@ static void __init gicv3_acpi_init(void)
     gicv3.rdist_stride = 0;

+    acpi_table_parse_madt(ACPI_MADT_TYPE_GENERIC_TRANSLATOR,
+                          gicv3_its_acpi_init, 0);
+
     /*
      * In ACPI, 0 is considered as the invalid address.
 However the rest
      * of the initialization rely on the invalid address to be
@@ -1585,6 +1633,7 @@ static void __init gicv3_acpi_init(void)
     else
         vsize = GUEST_GICC_SIZE;
+
 }
 #else
 static void __init gicv3_acpi_init(void) { }

diff --git a/xen/include/asm-arm/gic_v3_its.h b/xen/include/asm-arm/gic_v3_its.h
index d2a3e53..c92cdb9 100644
--- a/xen/include/asm-arm/gic_v3_its.h
+++ b/xen/include/asm-arm/gic_v3_its.h
@@ -125,6 +125,7 @@ struct host_its {
     spinlock_t cmd_lock;
     void *cmd_buf;
     unsigned int flags;
+    u32 translation_id;
 };
--
2.7.4

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel
Re: [Xen-devel] [RFC] ARM PCI Passthrough design document
Hi Julien,

On 5/29/2017 11:44 PM, Julien Grall wrote:
On 05/29/2017 03:30 AM, Manish Jaggi wrote:
Hi Julien,

Hello Manish,

On 5/26/2017 10:44 PM, Julien Grall wrote:
PCI pass-through allows the guest to receive full control of physical PCI devices. This means the guest will have full and direct access to the PCI device. ARM is supporting a kind of guest that exploits as much as possible virtualization support in hardware. The guest will rely on PV drivers only for IO (e.g. block, network) and interrupts will come through the virtualized interrupt controller, therefore there are no big changes required within the kernel. As a consequence, it would be possible to replace PV drivers by assigning real devices to the guest for I/O access. Xen on ARM would therefore be able to run unmodified operating systems. To achieve this goal, it looks more sensible to go towards emulating the host bridge (there will be more details later).

IIUC this means that domU would have an emulated host bridge and dom0 will see the actual host bridge?

You don't want the hardware domain and Xen to access the configuration space at the same time. So if Xen is in charge of the host bridge, then an emulated host bridge should be exposed to the hardware domain.

I believe in the x86 case dom0 and Xen do access the config space, in the context of the PCI device add hypercall. That's when the pci_config_XXX functions in Xen are called.

Although, this depends on who is in charge of the host bridge. As you may have noticed, this design document is proposing two ways to handle configuration space access. At the moment any generic host bridge (see the definition in the design document) will be handled in Xen and the hardware domain will have an emulated host bridge.

So in case of a generic host bridge, Xen will manage the config space and provide an emulated interface to dom0, and accesses would be trapped by Xen. Essentially the goal is to scan all PCI devices and register them with Xen (which in turn will configure the SMMU).
For a generic host bridge, this can be done either in dom0 or Xen. The only doubt here is what extra benefit the emulated host bridge gives in the case of dom0.

If your host bridge is not a generic one, then the hardware domain will be in charge of the host bridge, and any configuration access from Xen will be forwarded to the hardware domain. At the moment, as part of the first implementation, we are only looking to implement a generic host bridge in Xen. We will decide on a case-by-case basis for all the other host bridges whether we want to have the driver in Xen.

agreed.

[...]

## IOMMU

The IOMMU will be used to isolate the PCI device when accessing the memory (e.g. DMA and MSI doorbells). Often the IOMMU will be configured using a MasterID (aka StreamID for ARM SMMU) that can be deduced from the SBDF with the help of the firmware tables (see below). Whilst in theory all the memory transactions issued by a PCI device should go through the IOMMU, on certain platforms some of the memory transactions may not reach the IOMMU because they are interpreted by the host bridge. For instance, this could happen if the MSI doorbell is built into the PCI host bridge or for P2P traffic. See [6] for more details.

XXX: I think this could be solved by using direct mapping (e.g. GFN == MFN); this would mean the guest memory layout would be similar to the host one when PCI devices will be passed through => Detail it.

In the example given in the IORT spec, for PCI devices not behind an SMMU, how would the writes from the device be protected?

I realize the XXX paragraph is quite confusing. I am not trying to solve the problem where PCI devices are not protected behind an SMMU, but platforms where some transactions (e.g. P2P or MSI doorbell access) are by-passing the SMMU.
You may still want to allow PCI passthrough in that case, because you know that P2P cannot be done (or can potentially be disabled) and MSI doorbell access is protected (for instance a write in the ITS doorbell will be tagged with the device by the hardware). In order to support such platforms you need to direct map the doorbell (e.g. GFN == MFN) and carve out the P2P region from the guest memory map. Hence the suggestion to re-use the host memory layout for the guest.

Note that it does not mean the RAM region will be direct mapped. It is only there to ease carving out memory regions by-passed by the SMMU.

[...]

## ACPI

### Host bridges

The static table MCFG (see 4.2 in [1]) will describe the host bridges available at boot and supporting ECAM. Unfortunately, there are platforms out there (see [2]) that re-use MCFG to describe host bridges that are not fully ECAM compatible. This means that Xen needs to account for possible quirks in the host bridge. The Linux community is working on a patch series for this, see [2] and [3], where quirks will be detected with:
* OEM ID
* OEM Table ID
* OEM Revision
* PCI Segment
* PCI bus number range
Re: [Xen-devel] [RFC] ARM PCI Passthrough design document
Hi Julien,

On 5/26/2017 10:44 PM, Julien Grall wrote:
Hi all,

The document below is an RFC version of a design proposal for PCI Passthrough in Xen on ARM. It aims to describe from a high-level perspective the interaction with the different subsystems and how guests will be able to discover and access PCI.

Currently on ARM, Xen does not have any knowledge about PCI devices. This means that IOMMU and interrupt controller (such as ITS) requiring specific configuration will not work with PCI even with DOM0.

The PCI Passthrough work could be divided in 2 phases:
* Phase 1: Register all PCI devices in Xen => will allow to use ITS and SMMU with PCI in Xen
* Phase 2: Assign devices to guests

This document aims to describe the 2 phases, but for now only phase 1 is fully described.

I think I was able to gather all of the feedback and come up with a solution that will satisfy all the parties. The design document has changed quite a lot compared to the early draft sent a few months ago. The major changes are:
* Provide more details how PCI works on ARM and the interactions with MSI controller and IOMMU
* Provide details on the existing host bridge implementations
* Give more explanation and justifications on the approach chosen
* Describe the hypercalls used and how they should be called

Feedback is welcome.

Cheers,

% PCI pass-through support on ARM
% Julien Grall
% Draft B

# Preface

This document aims to describe the components required to enable PCI pass-through on ARM. This is an early draft and some questions are still unanswered. When this is the case, the text will contain XXX.

# Introduction

PCI pass-through allows the guest to receive full control of physical PCI devices. This means the guest will have full and direct access to the PCI device. ARM is supporting a kind of guest that exploits as much as possible virtualization support in hardware.
The guest will rely on PV drivers only for IO (e.g. block, network) and interrupts will come through the virtualized interrupt controller, therefore there are no big changes required within the kernel. As a consequence, it would be possible to replace PV drivers by assigning real devices to the guest for I/O access. Xen on ARM would therefore be able to run unmodified operating systems. To achieve this goal, it looks more sensible to go towards emulating the host bridge (there will be more details later).

IIUC this means that domU would have an emulated host bridge and dom0 will see the actual host bridge?

A guest would be able to take advantage of the firmware tables, obviating the need for a specific driver for Xen. Thus, in this document we follow the emulated host bridge approach.

# PCI terminologies

Each PCI device under a host bridge is uniquely identified by its Requester ID (AKA RID). A Requester ID is a triplet of Bus number, Device number, and Function. When the platform has multiple host bridges, the software can add a fourth number called Segment (sometimes called Domain) to differentiate host bridges. A PCI device will then be uniquely identified by segment:bus:device:function (AKA SBDF).

So given a specific SBDF, it would be possible to find the host bridge and the RID associated to a PCI device. The pair (host bridge, RID) will often be used to find the relevant information for configuring the different subsystems (e.g. IOMMU, MSI controller). For convenience, the rest of the document will use SBDF to refer to the pair (host bridge, RID).

# PCI host bridge

A PCI host bridge enables data transfer between a host processor and PCI bus based devices. The bridge is used to access the configuration space of each PCI device and, on some platforms, may also act as an MSI controller.

## Initialization of the PCI host bridge

Whilst it would be expected that the bootloader takes care of initializing the PCI host bridge, on some platforms it is done in the Operating System.
This may include enabling/configuring the clocks that could be shared among multiple devices.

## Accessing PCI configuration space

Accessing the PCI configuration space can be divided in 2 categories:
* Indirect access, where the configuration spaces are multiplexed. An example would be the legacy method on x86 (e.g. 0xcf8 and 0xcfc). On ARM a similar method is used by the PCIe RCar root complex (see [12]).
* ECAM access, where each configuration space will have its own address space.

Whilst ECAM is a standard, some PCI host bridges will require specific fiddling when accessing the registers (see thunder-ecam [13]). In most of the cases, accessing all the PCI configuration spaces under a given PCI host bridge will be done the same way (i.e. either indirect access or ECAM access). However, there are a few cases, dependent on the PCI devices accessed, which will use different methods (see
Re: [Xen-devel] xen/arm: Hiding SMMUs from Dom0 when using ACPI on Xen
Hi Julien,

On 5/18/2017 8:27 PM, Julien Grall wrote:
Hello,

On 18/05/17 12:59, Manish Jaggi wrote:
On 2/27/2017 11:42 PM, Julien Grall wrote:
On 02/27/2017 04:58 PM, Shanker Donthineni wrote:
Hi Julien,

Hi Shanker,

Please don't drop people in CC. In my case, any e-mail I am not CCed on skips my inbox and I may not read it for a while.

On 02/27/2017 08:12 AM, Julien Grall wrote:
On 27/02/17 13:23, Vijay Kilari wrote:
Hi Julien,

Hello Vijay,

On Wed, Feb 22, 2017 at 7:40 PM, Julien Grall <julien.gr...@arm.com> wrote:
Hello,

There were a few discussions recently about hiding SMMUs from DOM0 when using ACPI. I thought it would be good to have a separate thread for this.

When using ACPI, the SMMUs will be described in the IO Remapping Table (IORT). The specification can be found on the ARM website [1]. For a brief summary, the IORT can be used to discover the SMMUs present on the platform and find, for a given device, the ID to configure components such as the ITS (DeviceID) and SMMU (StreamID). Appendix A in the specification gives an example of how DeviceID and StreamID can be found. For instance, when a PCI device is both protected by an SMMU and MSI-capable, the following translation will happen: RID -> StreamID -> DeviceID

Currently, SMMUs are hidden from DOM0 because they are being used by Xen and we don't support stage-1 SMMU. If we pass the IORT as it is, DOM0 will try to initialize the SMMU and crash. I first thought about using a Xen specific way (STAO) or extending a flag in IORT. But that is not ideal. So we would have to rewrite the IORT for DOM0. Given that a range of RIDs can be mapped to multiple ranges of DeviceIDs,

Do you envisage a scenario where the same RID can map to multiple StreamIDs belonging to different SMMUs?

we would have to translate RIDs one by one to find the associated DeviceID. I think this may end up in complex code and a big IORT table.

Why can't we replace the output base of the IORT PCI node with the SMMU output base?
I mean, similar to a PCI node without SMMU, why can't we replace the output base of the PCI node with the SMMU's output base?

Because I don't see anything in the spec preventing one RC ID mapping from producing multiple SMMU ID mappings. So which output base would you use?

Basically, remove SMMU nodes, and replace the output of the PCIe and named node ID mappings with ITS nodes. RID --> StreamID --> DeviceID --> ITS device id = RID --> DeviceID --> ITS device id

Can you detail it? You seem to assume that one RC ID mapping range will only produce one ID mapping range. AFAICT, this is not mandated by the spec.

You are correct that it is not mandated by the spec, but AFAIK there seems to be no valid use case for that.

Xen has to be compliant with the spec; if the spec says something then we should do it unless there is a strong reason not to. In this case, it is not too difficult to implement the suggestion I wrote a couple of months ago. So why would we try to put us in a corner? See below.

RID ranges should not overlap between ID Array entries.

I believe you misunderstood my point here. So let me give an example. My understanding of the spec is it is possible to have:

RC A            // doesn't use SMMU 0 so just outputs DeviceIDs to ITS GROUP 0
  // Input ID --> Output reference : Output ID
  0x-0x --> ITS GROUP 0 : 0x->0x

SMMU 0          // Note that the range of StreamIDs that map to DeviceIDs
                // excludes the NIC 0 DeviceID as it does not generate MSIs
  // Input ID --> Output reference : Output ID
  0x-0x01ff --> ITS GROUP 0 : 0x1->0x101ff
  0x0200-0x --> ITS GROUP 0 : 0x2->0x207ff
  // SMMU 0 Control interrupt is MSI based
  // Input ID --> Output reference : Output ID
  N/A --> ITS GROUP 0 : 0x21

I could have misunderstood, so I am stating my understanding so far .. please feel free to correct me :)

In the IORT table, using the PCI-RC node, SMMU node and ITS node, the RID->StreamID->DeviceID mapping can be generated.
As per the IORT spec today, the same RID can be mapped to different StreamIDs using two ID Array elements with the same RID range but different output references. There exists no use case for such a scenario, hence a clarification is required in the IORT spec stating that RID ranges cannot overlap in the ID array. With this clarification in place, it is straightforward to map a RID to a DeviceID by replacing the output of the SMMU with the output of the PCI-RC. I believe this would be updated in the next IORT spec revision.

Well, Xen should still support the current revision of IORT even if the next version adds more restrictions.

Cheers,
-Manish
Re: [Xen-devel] xen/arm: Hiding SMMUs from Dom0 when using ACPI on Xen
+Chales.

Hi Julien,

On 2/27/2017 11:42 PM, Julien Grall wrote:
On 02/27/2017 04:58 PM, Shanker Donthineni wrote:
Hi Julien,

Hi Shanker,

Please don't drop people in CC. In my case, any e-mail I am not CCed on skips my inbox and I may not read it for a while.

On 02/27/2017 08:12 AM, Julien Grall wrote:
On 27/02/17 13:23, Vijay Kilari wrote:
Hi Julien,

Hello Vijay,

On Wed, Feb 22, 2017 at 7:40 PM, Julien Grall wrote:
Hello,

There were a few discussions recently about hiding SMMUs from DOM0 when using ACPI. I thought it would be good to have a separate thread for this.

When using ACPI, the SMMUs will be described in the IO Remapping Table (IORT). The specification can be found on the ARM website [1]. For a brief summary, the IORT can be used to discover the SMMUs present on the platform and find, for a given device, the ID to configure components such as the ITS (DeviceID) and SMMU (StreamID). Appendix A in the specification gives an example of how DeviceID and StreamID can be found. For instance, when a PCI device is both protected by an SMMU and MSI-capable, the following translation will happen: RID -> StreamID -> DeviceID

Currently, SMMUs are hidden from DOM0 because they are being used by Xen and we don't support stage-1 SMMU. If we pass the IORT as it is, DOM0 will try to initialize the SMMU and crash. I first thought about using a Xen specific way (STAO) or extending a flag in IORT. But that is not ideal. So we would have to rewrite the IORT for DOM0. Given that a range of RIDs can be mapped to multiple ranges of DeviceIDs, we would have to translate RIDs one by one to find the associated DeviceID. I think this may end up in complex code and a big IORT table.

Why can't we replace the output base of the IORT PCI node with the SMMU output base? I mean, similar to a PCI node without SMMU, why can't we replace the output base of the PCI node with the SMMU's output base?

Because I don't see anything in the spec preventing one RC ID mapping from producing multiple SMMU ID mappings.
So which output base would you use?

Basically, remove SMMU nodes, and replace the output of the PCIe and named node ID mappings with ITS nodes. RID --> StreamID --> DeviceID --> ITS device id = RID --> DeviceID --> ITS device id

Can you detail it? You seem to assume that one RC ID mapping range will only produce one ID mapping range. AFAICT, this is not mandated by the spec.

You are correct that it is not mandated by the spec, but AFAIK there seems to be no valid use case for that. RID ranges should not overlap between ID Array entries. I believe this would be updated in the next IORT spec revision. I have started working on recreating the IORT for dom0 with this restriction. The issue I see is that the RID is [15:0] whereas the DeviceID is [17:0]. Actually, the DeviceID is a 32-bit field.

However, given that the DeviceID will be used by DOM0 only to configure the ITS, we have no need to have the DOM0 DeviceID equal to the host DeviceID. So I think we could simplify our life by generating a DeviceID for each RID range.

If DOM0 DeviceID != host DeviceID, then we cannot initialize the ITS using DOM0 ITS commands (MAPD). So, is it concluded that the ITS initializes all the devices with platform specific DeviceIDs in Xen?

Initializing the ITS using DOM0 ITS commands is a workaround until we get PCI passthrough done. It would still be possible to implement that with vDeviceID != pDeviceID, as Xen would likely have the mapping between the 2 DeviceIDs.

I believe mapping dom0 ITS commands to Xen ITS commands one to one is the better approach. The physical DeviceID is unique per ITS group, not a system wide unique ID.

As for the guest, you don't care about the virtual DeviceID for DOM0 as long as you are able to map it to the host ITS and host DeviceID.

> In case of direct VLPI, the LPI number has to be programmed whenever dom0/domU calls the MAPTI command but not at the time of PCIe device creation.

I am a bit confused. Why are you speaking about direct vLPI here? This has no relation with the IORT.
Cheers,
Re: [Xen-devel] [early RFC] ARM PCI Passthrough design document
Hello Julien,

On 01/25/2017 08:55 PM, Julien Grall wrote:
> Hello Manish,
>
> On 25/01/17 04:37, Manish Jaggi wrote:
>> On 01/24/2017 11:13 PM, Julien Grall wrote:
>>>
>>> On 19/01/17 05:09, Manish Jaggi wrote:
>>>> I think, PCI passthrough and DOM0 w/ACPI enumerating devices on PCI are
>>>> separate features.
>>>> Without Xen mapping the PCI config space region in stage2 of dom0, ACPI
>>>> dom0 won't boot. Currently for DT Xen does that.
>>>>
>>>> So can we have 2 design documents
>>>> a) PCI passthrough
>>>> b) ACPI dom0/domU support in Xen and Linux
>>>>    - this may include:
>>>>    b.1 Passing IORT to Dom0 without smmu
>>>>    b.2 Hypercall to map PCI config space in dom0
>>>>    b.3
>>>>
>>>> What do you think?
>>>
>>> I don't think ACPI should be treated in a separate design document.
>> As PCI passthrough support will take time to mature, why should we hold
>> the ACPI design?
>> If I can boot dom0/domU with ACPI as it works with DT today, it would be
>> a good milestone.
>
> The way PCI is working on DT today is a hack.

Can you please elaborate why it is a hack?

> There is no SMMU support

SMMU support can be turned on and off by iommu=0 and also by not having an smmu node in the device tree. So not having smmu support for dom0 is not a hack IMHO. domUs can continue with PV devices. And if you term "without smmu" as a hack, may I suggest we use this as a phase 0 for ACPI.

> and the first version of GICv3 ITS support will contain hardcoded DeviceID
> (or very similar).

I have a disagreement on this: why should it contain a hardcoded DeviceID; what prevents it today technically? Can you please elaborate. If you are ok to have a first limited version of GICv3 ITS, why not have a Phase 0 for ACPI?

> The current hack will introduce problems on platforms where a specific
> host controller is necessary to access the configuration space.

The specific host controller can be accessed by dom0 with Xen mapping stage2, then we don't need a driver, right?
Can you please elaborate on the problem?

> Indeed, at the beginning Xen may not have a driver available (this will
> depend on the contribution), but we still need to be able to use PCI with
> Xen.

ACPI dom0 boot can and should be done without smmu support.

> We chose this way on DT because we didn't know when the PCI passthrough
> would be added in Xen.

Not a technical argument.

> As mentioned in the introduction of the design document, I envision PCI
> passthrough implementation in 2 phases:
> - Phase 1: Register all PCI devices in Xen => will allow to use ITS and
>   SMMU with PCI in Xen
> - Phase 2: Assign devices to guests

I think 3 phases. Let's add phase 0:
- Phase 0: Dom0 ACPI without SMMU, DomU with PV devices, ITS in Xen

> This design document will cover both phases because they are tied
> together. But the implementation can be decoupled; it would be possible
> (and also my plan) to see the 2 phases upstreamed in different Xen
> releases.
>
> Phase 1 will cover anything necessary for Xen to discover and register
> PCI devices. This includes the ACPI support (IORT, ...).
>
> I see little point in having a temporary solution for ACPI that will
> require review bandwidth. It would be better to put this bandwidth into
> focusing on getting a good design document.

I disagree, it is not a temporary solution. There are several use cases where PCI pass-through is not required but ACPI is.

> When we brainstormed about PCI passthrough, we identified some tasks that
> could be done in parallel of the design document. The list I have in mind
> is:
> * SMMUv3: I am aware of a company working on this
> * GICv3-ITS: work done by ARM (see [2])
> * IORT: it is required to discover ITSes and SMMUs with ACPI. So it can
>   at least be parsed (I will speak about hiding some parts from DOM0
>   later)
> * PCI support for SMMUv2
>
> There are quite a few companies willing to contribute to PCI passthrough.
> So we need some coordination to avoid redundancy.
> Please get in touch with me if you are interested to work on one of these
> items.

Will mail you.

>> Later when the PCI passthrough design gets mature and implemented, the
>> support can be extended.
>>> The support of ACPI may affect some of the decisions (such as
>>> hypercalls) and we have to know them now.
>>>
>> Still it can be independent, with only dependent features implemented,
>> or placeholders can be added.
>>> Regarding the ECAM region not mapped. This is not related to PCI
>>> passthrough but how
Re: [Xen-devel] [early RFC] ARM PCI Passthrough design document
On 01/24/2017 11:13 PM, Julien Grall wrote:
>
> On 19/01/17 05:09, Manish Jaggi wrote:
>> Hi Julien,
>
> Hello Manish,

[snip]

>> I think, PCI passthrough and DOM0 w/ACPI enumerating devices on PCI are
>> separate features.
>> Without Xen mapping the PCI config space region in stage2 of dom0, ACPI
>> dom0 won't boot.
>> Currently for DT Xen does that.
>>
>> So can we have 2 design documents
>> a) PCI passthrough
>> b) ACPI dom0/domU support in Xen and Linux
>>    - this may include:
>>    b.1 Passing IORT to Dom0 without smmu
>>    b.2 Hypercall to map PCI config space in dom0
>>    b.3
>>
>> What do you think?
>
> I don't think ACPI should be treated in a separate design document.

As PCI passthrough support will take time to mature, why should we hold the ACPI design? If I can boot dom0/domU with ACPI as it works with DT today, it would be a good milestone. Later when the PCI passthrough design gets mature and implemented, the support can be extended.

> The support of ACPI may affect some of the decisions (such as hypercalls)
> and we have to know them now.

Still it can be independent, with only dependent features implemented, or placeholders can be added.

> Regarding the ECAM region not mapped. This is not related to PCI
> passthrough but how MMIO are mapped with ACPI. This is a separate subject
> already in discussion (see [1]).

What about IORT generation for Dom0 without smmu? I believe it is not dependent on [1].

> Cheers,
>
> [1] https://lists.xenproject.org/archives/html/xen-devel/2017-01/msg01607.html
Re: [Xen-devel] [early RFC] ARM PCI Passthrough design document
Hi Julien/Stefano,

On 01/24/2017 07:58 PM, Julien Grall wrote:
> Hi Stefano,
>
> On 04/01/17 00:24, Stefano Stabellini wrote:
>> On Thu, 29 Dec 2016, Julien Grall wrote:
>
> [...]
>
>>> # Introduction
>>>
>>> PCI passthrough allows to give control of physical PCI devices to
>>> guests. This means that the guest will have full and direct access to
>>> the PCI device.
>>>
>>> ARM is supporting one kind of guest that is exploiting as much as
>>> possible virtualization support in hardware. The guest will rely on PV
>>> drivers only for IO (e.g block, network), interrupts will come through
>>> the virtualized interrupt controller. This means that there are no big
>>> changes required within the kernel.
>>>
>>> By consequence, it would be possible to replace PV drivers by assigning real
>> ^ As a consequence
>
> I will fix all the typos in the next version.
>
>>> devices to the guest for I/O access. Xen on ARM would therefore be able
>>> to run unmodified operating systems.
>
> [...]
>
>>> Instantiation of a specific driver for the host controller can be
>>> easily done if Xen has the information to detect it. However, those
>>> drivers may require resources described in ASL (see [4] for instance).

Q: Would these drivers (like ecam/pem) be added in Xen? If yes, how would Xen have the information to detect the host controller compatible? Should it be passed in the physdev_pci_host_bridge_add hypercall below?

>>> XXX: Need more investigation to know whether the missing information
>>> should be passed by DOM0 or hardcoded in the driver.
>>
>> Given that we are talking about quirks here, it would be better to just
>> hardcode them in the drivers, if possible.
>
> Indeed, hardcoded would be the preferred way to avoid introducing a new
> hypercall for quirks.
>
> For instance, in the case of Thunder-X (see commit 44f22bd "PCI: Add MCFG
> quirks for Cavium ThunderX pass2.x host controller) some regions are read
> from ACPI.
> What I'd like to understand is whether this could be hardcoded or can it
> change between platforms? If it can change, is there a way in ACPI to
> differentiate 2 platforms?
>
> Maybe this is a question that Cavium can answer? (in CC).

I think it is ok to hardcode. You might need to see 648d93f "PCI: Add MCFG quirks for Cavium ThunderX pass1.x host controller" as well.

> [...]
>
>>> ## Discovering and registering host bridges
>>>
>>> Both ACPI and Device Tree do not provide enough information to fully
>>> instantiate a host bridge driver. In the case of ACPI, some data may
>>> come from ASL,
>>
>> The data available from ASL is just to initialize quirks and non-ECAM
>> controllers, right? Given that SBSA mandates ECAM, and we assume that
>> ACPI is mostly (if not only) for servers, then I think it is safe to say
>> that in the case of ACPI we should have all the info to fully
>> instantiate a host bridge driver.
>
> From the spec, the MCFG will only describe host bridges available at boot
> (see 4.2 in "PCI firmware specification, rev 3.2"). All the other host
> bridges will be described in ASL.
>
> So we need DOM0 to feed Xen about the latter host bridges.
>
>>> whilst for Device Tree the segment number is not available.
>>>
>>> So Xen needs to rely on DOM0 to discover the host bridges and notify
>>> Xen with all the relevant information. This will be done via a new
>>> hypercall PHYSDEVOP_pci_host_bridge_add. The layout of the structure
>>> will be:
>>
>> I understand that the main purpose of this hypercall is to get Xen and
>> Dom0 to agree on the segment numbers, but why is it necessary? If Dom0
>> has an emulated controller like any other guest, do we care what segment
>> numbers Dom0 will use?
>
> I was not planning to have an emulated controller for DOM0. The physical
> one is not necessarily ECAM compliant, so we would have to either emulate
> the physical one (meaning multiple different emulations) or an ECAM
> compliant one.
> > The latter is not possible because you don't know if there is enough free
> MMIO space for the emulation.
>
> In the case of ARM, I don't see much point in emulating the host bridge for
> DOM0. The only thing we need in Xen is to access the configuration space; we
> don't care about driving the host bridge. So I would let DOM0 deal with that.
>
> Also, I don't see any reason for ARM to trap DOM0 configuration space access.
> The MSI will be configured using the interrupt controller and it is a trusted
> domain.
>
>>> struct physdev_pci_host_bridge_add
>>> {
>>>     /* IN */
>>>     uint16_t seg;
>>>     /* Range of bus supported by the host bridge */
>>>     uint8_t bus_start;
>>>     uint8_t bus_nr;
>>>     uint32_t res0; /* Padding */
>>>     /* Information about the configuration space region */
>>>     uint64_t cfg_base;
>>>     uint64_t cfg_size;
>>> }
>>>
>>> DOM0 will issue the hypercall PHYSDEVOP_pci_host_bridge_add for each host
>>> bridge available on the platform. When Xen is receiving the
Re: [Xen-devel] [early RFC] ARM PCI Passthrough design document
Hi Julien, On 12/29/2016 07:34 PM, Julien Grall wrote: > Hi all, > > The document below is an early version of a design > proposal for PCI Passthrough in Xen. It aims to > describe from an high level perspective the interaction > with the different subsystems and how guest will be able > to discover and access PCI. > > I am aware that a similar design has been posted recently > by Cavium (see [1]), however the approach to expose PCI > to guest is different. We have request to run unmodified > baremetal OS on Xen, a such guest would directly > access the devices and no PV drivers will be used. > > That's why this design is based on emulating a root controller. > This also has the advantage to have the VM interface as close > as baremetal allowing the guest to use firmware tables to discover > the devices. > > Currently on ARM, Xen does not have any knowledge about PCI devices. > This means that IOMMU and interrupt controller (such as ITS) > requiring specific configuration will not work with PCI even with > DOM0. > > The PCI Passthrough work could be divided in 2 phases: > * Phase 1: Register all PCI devices in Xen => will allow > to use ITS and SMMU with PCI in Xen > * Phase 2: Assign devices to guests > > This document aims to describe the 2 phases, but for now only phase > 1 is fully described. > > I have sent the design document to start to gather feedback on > phase 1. > > Cheers, > > [1] https://lists.xen.org/archives/html/xen-devel/2016-12/msg00224.html > > > % PCI pass-through support on ARM > % Julien Grall> % Draft A > > # Preface > > This document aims to describe the components required to enable PCI > passthrough on ARM. > > This is an early draft and some questions are still unanswered, when this is > the case the text will contain XXX. > > # Introduction > > PCI passthrough allows to give control of physical PCI devices to guest. This > means that the guest will have full and direct access to the PCI device. 
> > ARM is supporting one kind of guest that is exploiting as much as possible
> virtualization support in hardware. The guest will rely on PV drivers only
> for IO (e.g. block, network); interrupts will come through the virtualized
> interrupt controller. This means that there are no big changes required
> within the kernel.
>
> By consequence, it would be possible to replace PV drivers by assigning real
> devices to the guest for I/O access. Xen on ARM would therefore be able to
> run unmodified operating systems.
>
> To achieve this goal, it looks more sensible to go towards emulating the
> host bridge (we will go into more details later). A guest would be able
> to take advantage of the firmware tables, obviating the need for a specific
> driver for Xen.
>
> Thus in this document we follow the emulated host bridge approach.
>
> # PCI terminologies
>
> Each PCI device under a host bridge is uniquely identified by its Requester ID
> (AKA RID). A Requester ID is a triplet of Bus number, Device number, and
> Function.
>
> When the platform has multiple host bridges, the software can add a fourth
> number called Segment to differentiate host bridges. A PCI device will
> then be uniquely identified by segment:bus:device:function (AKA SBDF).
>
> So given a specific SBDF, it would be possible to find the host bridge and the
> RID associated with a PCI device.
>
> # Interaction of the PCI subsystem with other subsystems
>
> In order to have a PCI device fully working, Xen will need to configure
> other subsystems such as the SMMU and the Interrupt Controller.
>
> The interaction expected between the PCI subsystem and the others is:
> * Add a device
> * Remove a device
> * Assign a device to a guest
> * Deassign a device from a guest
>
> XXX: Detail the interaction when assigning/deassigning a device
>
> The following subsections will briefly describe the interaction from a
> higher-level perspective. Implementation details (callback, structure...)
> are out of scope.
> > ## SMMU > > The SMMU will be used to isolate the PCI device when accessing the memory > (for instance DMA and MSI Doorbells). Often the SMMU will be configured using > a StreamID (SID) that can be deduced from the RID with the help of the > firmware > tables (see below). > > Whilst in theory all the memory transaction issued by a PCI device should > go through the SMMU, on certain platforms some of the memory transaction may > not reach the SMMU because they are interpreted by the host bridge. For > instance this could happen if the MSI doorbell is built into the PCI host > bridge. See [6] for more details. > > XXX: I think this could be solved by using the host memory layout when > creating a guest with PCI devices => Detail it. > > ## Interrupt controller > > PCI supports three kind of interrupts: legacy interrupt, MSI and MSI-X. On ARM > legacy interrupts will be mapped to SPIs. MSI and MSI-x will be
[Xen-devel] ARM PCI Pass through Design Draft 5
- | PCI Pass-through in Xen ARM | - manish.ja...@cavium.com - Draft-5 - Introduction - This document describes the design for the PCI passthrough support in Xen ARM. The target system is an ARM 64bit SoC with GICv3 and SMMU and PCIe devices. It is assumed that the PVH guests will have its msi controller support and a Virtual ITS in Xen would redirect device interrupts to Guest. This document is limited to dt based pci, It will evolve to add ACPI - Revision History - Changes from Draft-1: - a) map_mmio hypercall removed from earlier draft b) device bar mapping into guest not 1:1 c) Reserved Area in guest address space for mapping PCI-EP BARs in Stage2. d) Xenstore Update: For each PCI-EP BAR (IPA-PA mapping info). Changes from Draft-2: - a) DomU boot information updated with boot-time device assignment and hotplug. b) SMMU description added c) Mapping between streamID - bdf - deviceID. d) assign_device hypercall to include virtual(guest) sbdf. Toolstack to generate guest sbdf rather than pciback. Changes from Draft-3: - a) Fixed typos and added more description b) NUMA and PCI passthrough description removed for now. 
c) Added example from Ian's Mail Changes from Draft-4: a) Added Hypercall PHYSDEVOP_pci_dev_map_msi_specifier b) The design takes into account Linux PCI msi-map support c) Added Xen internal to get streamID from pci_dev d) Added few examples and dts/code snippets - Index - (1) Background (2) Basic PCI Support in Xen ARM (2.1) pci_hostbridge and pci_hostbridge_ops (2.2) PHYSDEVOP_HOSTBRIDGE_ADD hypercall (2.3) XEN Internal API (3) SMMU programming (3.1) Additions for PCI Passthrough (3.2) Mapping between streamID - deviceID - pci sbdf - requesterID (4) Assignment of PCI device (4.1) Dom0 (4.1.1) Stage 2 Mapping of GITS_ITRANSLATER space (4k) (4.1.1.1) For Dom0 (4.1.1.2) For DomU (4.1.1.2.1) Hypercall Details: XEN_DOMCTL_get_itranslater_space (4.2) DomU (4.2.1) Reserved Areas in guest memory space (4.2.2) Xenstore Update: For each PCI-EP BAR (IPA-PA mapping info). (4.2.3) Hypercall Modification for bdf mapping notification to xen (5) DomU FrontEnd Bus Changes (5.1) Change in Linux PCI frontend bus and gicv3-its node binding for domU (6) Glossary (7) References - 1.Background - Passthrough refers to assigning a PCI device to a guest domain (domU) such that the guest has full control over the device. The MMIO space / interrupts are managed by the guest itself, close to how a bare kernel manages a device. Device's access to guest address space needs to be isolated and protected. SMMU (System MMU - IOMMU in ARM) is programmed by xen hypervisor to allow device access guest memory for data transfer and sending MSI/X interrupts. PCI devices generated message signaled interrupt writes are within guest address spaces which are also translated using SMMU. 1.1 PCI device Id in Dom0 -- As per the bindings document [6], msi-specifier is generated from msi-map property such the msi-specifier [32:16] bits would come from the msi-map namespace and [15:0] would be same as RID. 
There could be multiple pci nodes in device tree with same msi-map property pci@84a0 { compatible = "pci-host-ecam-generic"; device_type = "pci"; msi-map = <0x0 0x6f 0x2 0x1>; bus-range = <0x0 0x1f>; reg = <0x84a0 0x0 0x0 0x200>; ... }; pci@87e0c200 { compatible = "cavium,pci-host-thunder-pem"; device_type = "pci"; msi-map = <0x0 0x6f 0x1 0x1>; bus-range = <0x8f 0xc7>; reg = <0x8880 0x8f00 0x0 0x3900 0x87e0 0xc200 0x0 0x100>; ... } pci@8490 { compatible = "pci-host-ecam-generic"; device_type = "pci"; msi-map = <0x0
Re: [Xen-devel] PCI Pass-through in Xen ARM: Draft 4
On Wednesday 16 September 2015 06:28 PM, Julien Grall wrote: On 15/09/15 19:58, Jaggi, Manish wrote: I can see 2 different solutions:
1) Let DOM0 pass the first requester ID when registering the bus
Pros:
* Less per-platform code in Xen
Cons:
* Assumes that the requester IDs are contiguous. (Is it really a con?)
* Still requires quirks for buggy devices (i.e. requester ID not correct)
2) Do it in Xen
Pros:
* We are not relying on DOM0 giving the requester ID => not assuming contiguous requester IDs
Cons:
* Per-PCI-bridge code to handle the mapping
We can have (3): when PHYSDEVOP_pci_add_device is called, the sbdf and requesterID are both passed in the hypercall. The name of the physdev operation is PHYSDEVOP_pci_device_add and not PHYSDEVOP_pci_add_device. Please rename all its usages in the design doc. Although, we can't modify PHYSDEVOP_pci_device_add because it's part of the ABI, which is stable. Based on David's mail, the requester ID of a given device can be found using base + devfn, where base is the first requesterID of the bus. IIRC, this also matches the IORT ACPI spec. So for now, I would extend the physdev op you've introduced to add a host bridge (PHYSDEVOP_pci_host_bridge_add) to pass the base requesterID. The requester-ID is derived from the Node# and ECAM# as per David. I guess the ECAM# and Node# can be derived from the cfg_addr. Each ECAM has a cfg_addr in Thunder, which is mentioned in the pci node in the device tree. For Thunder I think we don't need to pass the requester-ID in the physdevop. We can think later about introducing a new physdev op to add PCI if we ever require unique requesterIDs (i.e. non-contiguous under the same bridge). Regards, --- Julien Grall ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Re: [Xen-devel] PCI Pass-through in Xen ARM: Draft 4
On Tuesday 01 September 2015 01:02 PM, Jan Beulich wrote: On 31.08.15 at 14:36, <mja...@caviumnetworks.com> wrote: On Thursday 13 August 2015 03:12 PM, Manish Jaggi wrote: 4.2.1 Mapping BAR regions in guest address space - When a PCI-EP device is assigned to a domU the toolstack will read the pci configuration space BAR registers. Toolstack allocates a virtual BAR region for each BAR region, from the area reserved in guest address space for mapping BARs referred to as Guest BAR area. This area is defined in public/arch-arm.h /* For 32bit BARs*/ #define GUEST_BAR_BASE_32 <<>> #define GUEST_BAR_SIZE_32 <<>> /* For 64bit BARs*/ #define GUEST_BAR_BASE_64 <<>> #define GUEST_BAR_SIZE_64 <<>> Toolstack then invokes domctl xc_domain_memory_mapping to map in stage2 translation. If a BAR region address is 32b BASE_32 area would be used, otherwise 64b. If a combination of both is required the support is TODO. Toolstack manages these areas and allocate from these area. The allocation and deallocation is done using APIs similar to malloc and free. To implement this feature in xl tools there is required to have a malloc and free from the reserved area. Can we have the XEN_DOMCTL_memory_mapping extended with a flag say ALLOCATE/FREE_FROM_BAR_AREA. When this flag is passed xen would add or remove the stage2 mapping for the domain. This will make use of the code already present in xen. Above it was said that the tool stack manages this area (including allocations from it). Why would this require a new hypercall? As a rule xl tools should manage the guest memory map. Now if it does by itself or initiates it is another thing. Allocating an area for PCI BAR and freeing it reserved area would require adding allocator code in xl tools. Since xen already knows about the area (as it is defined in public header file) and there exists code in xen, i believe it make sense to use that rather than adding the same in xl tools. 
Jan
Re: [Xen-devel] PCI Pass-through in Xen ARM: Draft 4
On Thursday 13 August 2015 03:12 PM, Manish Jaggi wrote: - | PCI Pass-through in Xen ARM | - manish.ja...@caviumnetworks.com --- Draft-4 [snip] 4.2.1 Mapping BAR regions in guest address space - When a PCI-EP device is assigned to a domU the toolstack will read the pci configuration space BAR registers. Toolstack allocates a virtual BAR region for each BAR region, from the area reserved in guest address space for mapping BARs referred to as Guest BAR area. This area is defined in public/arch-arm.h /* For 32bit BARs*/ #define GUEST_BAR_BASE_32 <<>> #define GUEST_BAR_SIZE_32 <<>> /* For 64bit BARs*/ #define GUEST_BAR_BASE_64 <<>> #define GUEST_BAR_SIZE_64 <<>> Toolstack then invokes domctl xc_domain_memory_mapping to map in stage2 translation. If a BAR region address is 32b BASE_32 area would be used, otherwise 64b. If a combination of both is required the support is TODO. Toolstack manages these areas and allocate from these area. The allocation and deallocation is done using APIs similar to malloc and free. To implement this feature in xl tools there is required to have a malloc and free from the reserved area. Can we have the XEN_DOMCTL_memory_mapping extended with a flag say ALLOCATE/FREE_FROM_BAR_AREA. When this flag is passed xen would add or remove the stage2 mapping for the domain. This will make use of the code already present in xen. Any reservations with this approach ? ___ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
[Xen-devel] PCI Pass-through in Xen ARM: Draft 4
- | PCI Pass-through in Xen ARM | - manish.ja...@caviumnetworks.com --- Draft-4 - Introduction - This document describes the design for the PCI passthrough support in Xen ARM. The target system is an ARM 64bit SoC with GICv3 and SMMU v2 and PCIe devices. - Revision History - Changes from Draft-1: - a) map_mmio hypercall removed from earlier draft b) device bar mapping into guest not 1:1 c) Reserved Area in guest address space for mapping PCI-EP BARs in Stage2. d) Xenstore Update: For each PCI-EP BAR (IPA-PA mapping info). Changes from Draft-2: - a) DomU boot information updated with boot-time device assignment and hotplug. b) SMMU description added c) Mapping between streamID - bdf - deviceID. d) assign_device hypercall to include virtual(guest) sbdf. Toolstack to generate guest sbdf rather than pciback. Changes from Draft-3: - a) Fixed typos and added more description b) NUMA and PCI passthrough description removed for now. c) Added example from Ian's Mail - Index - (1) Background (2) Basic PCI Support in Xen ARM (2.1) pci_hostbridge and pci_hostbridge_ops (2.2) PHYSDEVOP_HOSTBRIDGE_ADD hypercall (2.3) XEN Internal API (3) SMMU programming (3.1) Additions for PCI Passthrough (3.2) Mapping between streamID - deviceID - pci sbdf - requesterID (4) Assignment of PCI device (4.1) Dom0 (4.1.1) Stage 2 Mapping of GITS_ITRANSLATER space (4k) (4.1.1.1) For Dom0 (4.1.1.2) For DomU (4.1.1.2.1) Hypercall Details: XEN_DOMCTL_get_itranslater_space (4.2) DomU (4.2.1) Reserved Areas in guest memory space (4.2.2) Xenstore Update: For each PCI-EP BAR (IPA-PA mapping info). (4.2.3) Hypercall Modification for bdf mapping notification to xen (5) DomU FrontEnd Bus Changes (5.1) Change in Linux PCI frontend bus and gicv3-its node binding for domU (6) Glossary (7) References - 1.Background of PCI passthrough - Passthrough refers to assigning a PCI device to a guest domain (domU) such that the guest has full control over the device. 
The MMIO space / interrupts are managed by the guest itself, close to how a bare kernel manages a device. Device's access to guest address space needs to be isolated and protected. SMMU (System MMU - IOMMU in ARM) is programmed by xen hypervisor to allow device access guest memory for data transfer and sending MSI/X interrupts. PCI devices generated message signalled interrupt writes are within guest address spaces which are also translated using SMMU. For this reason the GITS (ITS address space) Interrupt Translation Register space is mapped in the guest address space. 2.Basic PCI Support for ARM - The APIs to read write from PCI configuration space are based on segment:bdf. How the sbdf is mapped to a physical address is under the realm of the PCI host controller. ARM PCI support in Xen, introduces PCI host controller similar to what exists in Linux. Host controller drivers registers callbacks, which are invoked on matching the compatible property in pci device tree node. Note: as pci devices are enumerated the pci node in device tree refers to the host controller. (TODO: for ACPI unimplemented) 2.1pci_hostbridge and pci_hostbridge_ops - The init function in the PCI host driver calls to register hostbridge callbacks: int pci_hostbridge_register(pci_hostbridge_t *pcihb); struct pci_hostbridge_ops { u32 (*pci_conf_read)(struct pci_hostbridge*, u32 bus, u32 devfn, u32 reg, u32 bytes); void (*pci_conf_write)(struct pci_hostbridge*, u32 bus, u32 devfn, u32 reg, u32 bytes, u32 val); }; struct pci_hostbridge{ u32 segno; paddr_t cfg_base; paddr_t cfg_size; struct dt_device_node *dt_node; struct pci_hostbridge_ops ops; struct list_head list; }; A PCI conf_read function would internally be as follows: u32 pcihb_conf_read(u32 seg, u32 bus, u32 devfn,u32
Re: [Xen-devel] PCI Passthrough Design - Draft 3
Below are the comments. I will also send a Draft 4 taking account of the comments. On Wednesday 12 August 2015 02:04 AM, Konrad Rzeszutek Wilk wrote: On Tue, Aug 04, 2015 at 05:57:24PM +0530, Manish Jaggi wrote: - | PCI Pass-through in Xen ARM | - manish.ja...@caviumnetworks.com --- Draft-3 ... [snip] 2.2PHYSDEVOP_pci_host_bridge_add hypercall -- Xen code accesses PCI configuration space based on the sbdf received from the guest. The order in which the pci device tree node appear may not be the same order of device enumeration in dom0. Thus there needs to be a mechanism to bind the segment number assigned by dom0 to the pci host controller. The hypercall is introduced: Why can't we extend the existing hypercall to have the segment value? Oh wait, PHYSDEVOP_manage_pci_add_ext does it already! It doesn’t pass the cfg_base and size to xen And have the hypercall (and Xen) be able to deal with introduction of PCI devices that are out of sync? Maybe I am confused but aren't PCI host controllers also 'uploaded' to Xen? I need to add one more line here to be more descriptive. The binding is between the segment number (domain number in linux) used by dom0 and the pci config space address in the pci node of device tree (reg property). The hypercall was introduced to cater the fact that the dom0 may process pci nodes in the device tree in any order. By this binding it is a clear ABI. #define PHYSDEVOP_pci_host_bridge_add44 struct physdev_pci_host_bridge_add { /* IN */ uint16_t seg; uint64_t cfg_base; uint64_t cfg_size; }; This hypercall is invoked before dom0 invokes the PHYSDEVOP_pci_device_add hypercall. The handler code invokes to update segment number in pci_hostbridge: int pci_hostbridge_setup(uint32_t segno, uint64_t cfg_base, uint64_t cfg_size); Subsequent calls to pci_conf_read/write are completed by the pci_hostbridge_ops of the respective pci_hostbridge. 
This design sounds like it is added to deal with having to pre-allocate the amount host controllers structure before the PCI devices are streaming in? Instead of having the PCI devices and PCI host controllers be updated as they are coming in? Why can't the second option be done? If you are referring to ACPI, we have to add the support. PCI Host controllers are pci nodes in device tree. 2.3Helper Functions a) pci_hostbridge_dt_node(pdev-seg); Returns the device tree node pointer of the pci node from which the pdev got enumerated. 3.SMMU programming --- 3.1.Additions for PCI Passthrough --- 3.1.1 - add_device in iommu_ops is implemented. This is called when PHYSDEVOP_pci_add_device is called from dom0. Or for PHYSDEVOP_manage_pci_add_ext ? Not sure but it seems logical for this also. .add_device = arm_smmu_add_dom0_dev, static int arm_smmu_add_dom0_dev(u8 devfn, struct device *dev) { if (dev_is_pci(dev)) { struct pci_dev *pdev = to_pci_dev(dev); return arm_smmu_assign_dev(pdev-domain, devfn, dev); } return -1; } What about removal? What if the device is removed (hot-unplugged?? .remove_device = arm_smmu_remove_device(). would be called. Will update in Draft4 3.1.2 dev_get_dev_node is modified for pci devices. - The function is modified to return the dt_node of the pci hostbridge from the device tree. This is required as non-dt devices need a way to find on which smmu they are attached. static struct arm_smmu_device *find_smmu_for_device(struct device *dev) { struct device_node *dev_node = dev_get_dev_node(dev); static struct device_node *dev_get_dev_node(struct device *dev) { if (dev_is_pci(dev)) { struct pci_dev *pdev = to_pci_dev(dev); return pci_hostbridge_dt_node(pdev-seg); } ... 3.2.Mapping between streamID - deviceID - pci sbdf - requesterID - For a simpler case all should be equal to BDF. But there are some devices that use the wrong requester ID for DMA transactions. Linux kernel has pci quirks for these. 
(s/pci/PCI/) How the same can be implemented in Xen, or whether a different approach has to be taken, is a TODO here. Till that time, for a basic implementation it is assumed that all are equal to the BDF. 4. Assignment of PCI device - 4.1 Dom0 All PCI devices are assigned to dom0 unless hidden by pci-hide bootargs in dom0. 'pci-hide' in dom0? Grepping in Documentation/kernel-parameters.txt I don't see anything. %s/pci-hide/pciback.hide/ Dom0 enumerates the PCI devices. For each device
[Xen-devel] PCI Passthrough Design - Draft 3
- | PCI Pass-through in Xen ARM | - manish.ja...@caviumnetworks.com --- Draft-3 --- Introduction --- This document describes the design for the PCI passthrough support in Xen ARM. The target system is an ARM 64bit Soc with GICv3 and SMMU v2 and PCIe devices. --- Revision History --- Changes from Draft-1: - a) map_mmio hypercall removed from earlier draft b) device bar mapping into guest not 1:1 c) holes in guest address space 32bit / 64bit for MMIO virtual BARs d) xenstore device's BAR info addition. Changes from Draft-2: - a) DomU boot information updated with boot-time device assignment and hotplug. b) SMMU description added c) Mapping between streamID - bdf - deviceID. d) assign_device hypercall to include virtual(guest) sbdf. Toolstack to generate guest sbdf rather than pciback. --- Index --- (1) Background (2) Basic PCI Support in Xen ARM (2.1)pci_hostbridge and pci_hostbridge_ops (2.2)PHYSDEVOP_HOSTBRIDGE_ADD hypercall (3) SMMU programming (3.1) Additions for PCI Passthrough (3.2)Mapping between streamID - deviceID - pci sbdf (4) Assignment of PCI device (4.1) Dom0 (4.1.1) Stage 2 Mapping of GITS_ITRANSLATER space (4k) (4.1.1.1) For Dom0 (4.1.1.2) For DomU (4.1.1.2.1) Hypercall Details: XEN_DOMCTL_get_itranslater_space (4.2) DomU (4.2.1) Reserved Areas in guest memory space (4.2.2) New entries in xenstore for device BARs (4.2.4) Hypercall Modification for bdf mapping notification to xen (5) DomU FrontEnd Bus Changes (5.1)Change in Linux PCI FrontEnd - backend driver for MSI/X programming (5.2)Frontend bus and interrupt parent vITS (6) NUMA and PCI passthrough --- 1.Background of PCI passthrough -- Passthrough refers to assigning a pci device to a guest domain (domU) such that the guest has full control over the device. The MMIO space and interrupts are managed by the guest itself, close to how a bare kernel manages a device. Device's access to guest address space needs to be isolated and protected. 
SMMU (System MMU - the IOMMU in ARM) is programmed by the xen hypervisor to allow a device to access guest memory for data transfer and for sending MSI/X interrupts. PCI device generated message signalled interrupt writes are within the guest address space, which is also translated using the SMMU. For this reason the GITS (ITS address space) Interrupt Translation Register space is mapped in the guest address space.
2. Basic PCI Support for ARM --
The APIs to read/write from PCI configuration space are based on segment:bdf. How the sbdf is mapped to a physical address is under the realm of the PCI host controller. ARM PCI support in Xen introduces a PCI host controller similar to what exists in Linux. Each driver registers callbacks, which are invoked on matching the compatible property in the pci device tree node.
2.1 pci_hostbridge and pci_hostbridge_ops --
The init function in the pci host driver calls the following to register hostbridge callbacks:

int pci_hostbridge_register(pci_hostbridge_t *pcihb);

struct pci_hostbridge_ops {
    u32 (*pci_conf_read)(struct pci_hostbridge*, u32 bus, u32 devfn,
                         u32 reg, u32 bytes);
    void (*pci_conf_write)(struct pci_hostbridge*, u32 bus, u32 devfn,
                           u32 reg, u32 bytes, u32 val);
};

struct pci_hostbridge {
    u32 segno;
    paddr_t cfg_base;
    paddr_t cfg_size;
    struct dt_device_node *dt_node;
    struct pci_hostbridge_ops ops;
    struct list_head list;
};

A pci conf read function would internally be as follows:

u32 pcihb_conf_read(u32 seg, u32 bus, u32 devfn, u32 reg, u32 bytes)
{
    pci_hostbridge_t *pcihb;
    list_for_each_entry(pcihb, &pci_hostbridge_list, list)
    {
        if ( pcihb->segno == seg )
            return pcihb->ops.pci_conf_read(pcihb, bus, devfn, reg, bytes);
    }
    return -1;
}

2.2 PHYSDEVOP_pci_host_bridge_add hypercall --
Xen code accesses PCI configuration space based on the sbdf received from the guest. The order in which the pci device tree nodes appear may not be the same as the order of device enumeration in dom0. Thus
Re: [Xen-devel] PCI Pass-through in Xen ARM - Draft 2.
On 31/07/15 4:49 pm, Ian Campbell wrote: On Fri, 2015-07-31 at 16:37 +0530, Manish Jaggi wrote: On Friday 31 July 2015 01:35 PM, Ian Campbell wrote: On Fri, 2015-07-31 at 13:16 +0530, Manish Jaggi wrote: Secondly, the vdev-X entry is created async by dom0 watching on event. So how the tools could read back and call assign device again. Perhaps by using a xenstore watch on that node to wait for the assignment from pciback to occur. As per the flow in the do_pci_add function, assign_device is called first and based on the success xenstore entry is created. Are you suggesting to change the sequence. Perhaps that is what it would take, yes, or maybe some other refactoring (e.g. splitting assign_device into two stages) might be the answer. The hypercall from xenpciback (what I implemented) is actually making the assign device in 2 stages. I think the point of contention is the second stage should be from toolstack. I think calling xc_assign_device after xenstore from the watch callback is the only option. Only if you ignore the other option I proposed. One question is how to split the code for ARM and x86 as this is the common code. Would #ifdef CONFIG_ARM64 ok with maintainers. No. arch hooks in libxl_$ARCH.c (with nop implementations where necessary) would be the way to approach this. However I still am not convinced this is the approach we should be taking. My current preference is for the suggestion below which is to let the toolstack pick the vdevfn and have pciback honour it. That would duplicate code for dev-fn generation into toolstack from __xen_pcibk_add_pci_dev. IMHO the toolstack is the correct place for this code, at least for ARM guests. The toolstack is, in general, responsible for all aspects of the guest layout. I don't think delegating the PCI bus parts of that to the dom0 kernel makes sense. Ok, i will implement the same from pciback to toolstack. I am not sure about the complexity but will give it a try. 
With this xen-pciback will not create the vdev-X entry at all. I'd not be surprised if the same turns out to be true for x86/PVH guests too. Ian.
Re: [Xen-devel] PCI Pass-through in Xen ARM - Draft 2.
On 31/07/15 8:26 pm, Julien Grall wrote: On 31/07/15 15:33, Manish Jaggi wrote: Hi Julien, On 31/07/15 6:29 pm, Julien Grall wrote: Hi Manish, On 31/07/15 13:50, Manish Jaggi wrote: Ok, I will implement the same from pciback to toolstack. I am not sure about the complexity but will give it a try. With this xen-pciback will not create the vdev-X entry at all. Can you send a new draft before continuing to implement PCI support in Xen? I am working on the Draft 3 and addressing comments in draft 2. I am doing a feasibility check of the stuff I put in draft 3. Well, I don't think that anything we said within this thread was impossible to do. As long as we are not agreed about it, I thought I was trying to discuss the same. If you have any point please raise it. What I meant is, this is a 40-message thread with lots of discussion on it. A new draft containing a summary of what was said would benefit everyone and help us get to a design that we think is good. You lose your time trying to implement something that can drastically change in the next revision. I am only putting the stuff in Draft 3 which *can* be implemented later. But nothing prevents someone in the discussion on Draft 3 from saying this is wrong and it has to be done in a different way. Usually the time between two drafts should be pretty short in order to get a sane base for discussion. For now, we are talking about a small portion of the design and speculating/trying to remember what was agreed in other sub-threads. OK, I will send draft 3 with the points on this topic marked as under discussion. Is that fine? Regards,
Re: [Xen-devel] PCI Pass-through in Xen ARM - Draft 2.
Hi Julien, On 31/07/15 6:29 pm, Julien Grall wrote: Hi Manish, On 31/07/15 13:50, Manish Jaggi wrote: Ok, I will implement the same from pciback to toolstack. I am not sure about the complexity but will give it a try. With this xen-pciback will not create the vdev-X entry at all. Can you send a new draft before continuing to implement PCI support in Xen? I am working on the Draft 3 and addressing comments in draft 2. I am doing a feasibility check of the stuff I put in draft 3. As long as we are not agreed about it, I thought I was trying to discuss the same. If you have any point please raise it. You lose your time trying to implement something that can drastically change in the next revision. I am only putting the stuff in Draft 3 which *can* be implemented later. Regards,
Re: [Xen-devel] PCI Pass-through in Xen ARM - Draft 2.
On Friday 31 July 2015 01:35 PM, Ian Campbell wrote: On Fri, 2015-07-31 at 13:16 +0530, Manish Jaggi wrote: Secondly, the vdev-X entry is created asynchronously by dom0 watching on an event. So how could the tools read back and call assign device again? Perhaps by using a xenstore watch on that node to wait for the assignment from pciback to occur. As per the flow in the do_pci_add function, assign_device is called first and based on its success the xenstore entry is created. Are you suggesting to change the sequence? Perhaps that is what it would take, yes, or maybe some other refactoring (e.g. splitting assign_device into two stages) might be the answer. The hypercall from xenpciback (what I implemented) is actually making the assign device in 2 stages. I think the point of contention is that the second stage should be from the toolstack. I think calling xc_assign_device from the xenstore watch callback is the only option. One question is how to split the code for ARM and x86 as this is common code. Would #ifdef CONFIG_ARM64 be ok with the maintainers? My current preference is for the suggestion below which is to let the toolstack pick the vdevfn and have pciback honour it. That would duplicate code for dev-fn generation into the toolstack from __xen_pcibk_add_pci_dev. We can discuss this more on #xenarm IRC. Sorry I missed your ping yesterday, I had already gone home. Or you could change things such that vdevfn is always chosen by the toolstack for ARM, not optionally like it is on x86. For this one, the struct libxl_device_pci has a field vdevfn, which is supposed to allow the user to specify a specific vdevfn. I'm not sure how that happens or fits together but libxl could undertake to set that on ARM in the case where the user hasn't done so, effectively taking control of the PCI bus assignment. Ian. 
Re: [Xen-devel] PCI Pass-through in Xen ARM - Draft 2.
On Thursday 30 July 2015 03:24 PM, Ian Campbell wrote: On Wed, 2015-07-29 at 15:07 +0530, Manish Jaggi wrote: On Monday 06 July 2015 03:50 PM, Ian Campbell wrote: On Mon, 2015-07-06 at 15:36 +0530, Manish Jaggi wrote: On Monday 06 July 2015 02:41 PM, Ian Campbell wrote: On Sun, 2015-07-05 at 11:25 +0530, Manish Jaggi wrote: On Monday 29 June 2015 04:01 PM, Julien Grall wrote: Hi Manish, On 28/06/15 19:38, Manish Jaggi wrote: 4.1 Holes in guest memory space Holes are added in the guest memory space for mapping pci device's BAR regions. These are defined in arch-arm.h /* For 32bit */ GUEST_MMIO_HOLE0_BASE, GUEST_MMIO_HOLE0_SIZE /* For 64bit */ GUEST_MMIO_HOLE1_BASE, GUEST_MMIO_HOLE1_SIZE The memory layout for 32bit and 64bit are exactly the same. Why do you need to differ here? I think Ian has already replied. I will change the name of the macro. 4.2 New entries in xenstore for device BARs toolkit also updates the xenstore information for the device (virtualbar:physical bar). This information is read by xenpciback and returned to the pcifront driver configuration space accesses. Can you detail what you plan to put in xenstore and how? It is implementation detail. But I plan to put it under the domU / device / hierarchy. Actually, xenstore is an API of sorts which needs to be maintained going forward (since front and backend can evolve separately), so it does need some level of design and documentation. What about the expansion ROM? Do you want to put some restriction on not using expansion ROM as a passthrough device. expansion ROM as a passthrough device doesn't make sense to me, passthrough devices may _have_ an expansion ROM. The expansion ROM is just another BAR. I don't know how pcifront/back deal with those today on PV x86, but I see no reason for ARM to deviate. 
4.3 Hypercall for bdf mapping notification to xen --- #define PHYSDEVOP_map_sbdf 43 typedef struct { u32 s; u8 b; u8 df; u16 res; } sbdf_t; struct physdev_map_sbdf { int domain_id; sbdf_t sbdf; sbdf_t gsbdf; }; Each domain has a pdev list, which contains the list of all pci devices. The pdev structure already has the sbdf information. The arch_pci_dev is updated to contain the gsbdf information. (gs- guest segment id) Whenever there is a trap from the guest or an interrupt has to be injected, the pdev list is iterated to find the gsbdf. Can you give more background for this section? i.e: - Why do you need this? - How will xen translate the gsbdf to a vDeviceID? In the context of the hypercall processing. - Who will call this hypercall? - Why not set the gsbdf when the device is assigned? Can the maintainer of the pciback suggest an alternate. That's not me, but I don't think this belongs here, I think it can be done from the toolstack. If you think not then please explain what information the toolstack doesn't have in its possession which prevents this mapping from being done there. The toolstack does not have the guest sbdf information. I could only find it in xenpciback. Are you sure? The sbdf relates to the physical device, correct? If so then surely the toolstack knows it -- it's written in the config file and is the primary parameter to all of the related libxl passthrough APIs. The toolstack wouldn't be able to do anything about passing through a given device without knowing which device it should be passing through. Perhaps this info needs plumbing through to some new bit of the toolstack, but it is surely available somewhere. If you meant the virtual SBDF then that is in libxl_device_pci.vdevfn. I added prints in libxl__device_pci_add. vdevfn is always 0 so this may not be the right variable to use. Can you please recheck. Also the vdev-X entry in xenstore appears to be created from pciback code and not from xl. Check function xen_pcibk_publish_pci_dev. 
So I have to send a hypercall from pciback only. I don't think that necessarily follows. You could have the tools read the vdev-X node back on plug. I have been trying to get the flow of the caller of libxl__device_pci_add during pci device assignment from the cfg file (cold boot). It should be called from the xl create flow. Is it called from C code or Python code? libxl__device_pci_add calls xc_assign_device. Secondly, the vdev-X entry is created asynchronously by dom0 watching on an event. So how could the tools read back and call assign device again? static void xen_pcibk_be_watch(struct xenbus_watch *watch, const char **vec, unsigned int len) { ... switch (xenbus_read_driver_state(pdev->xdev->nodename)) { case XenbusStateInitWait: xen_pcibk_setup_backend(pdev); break; } Or you could change things such that vdevfn is always chosen by the toolstack for ARM
Re: [Xen-devel] PCI Pass-through in Xen ARM - Draft 2.
On Monday 06 July 2015 03:50 PM, Ian Campbell wrote: On Mon, 2015-07-06 at 15:36 +0530, Manish Jaggi wrote: On Monday 06 July 2015 02:41 PM, Ian Campbell wrote: On Sun, 2015-07-05 at 11:25 +0530, Manish Jaggi wrote: On Monday 29 June 2015 04:01 PM, Julien Grall wrote: Hi Manish, On 28/06/15 19:38, Manish Jaggi wrote: 4.1 Holes in guest memory space Holes are added in the guest memory space for mapping pci device's BAR regions. These are defined in arch-arm.h /* For 32bit */ GUEST_MMIO_HOLE0_BASE, GUEST_MMIO_HOLE0_SIZE /* For 64bit */ GUEST_MMIO_HOLE1_BASE, GUEST_MMIO_HOLE1_SIZE The memory layout for 32bit and 64bit are exactly the same. Why do you need to differ here? I think Ian has already replied. I will change the name of the macro. 4.2 New entries in xenstore for device BARs toolkit also updates the xenstore information for the device (virtualbar:physical bar). This information is read by xenpciback and returned to the pcifront driver configuration space accesses. Can you detail what you plan to put in xenstore and how? It is implementation detail. But I plan to put it under the domU / device / hierarchy. Actually, xenstore is an API of sorts which needs to be maintained going forward (since front and backend can evolve separately), so it does need some level of design and documentation. What about the expansion ROM? Do you want to put some restriction on not using expansion ROM as a passthrough device. expansion ROM as a passthrough device doesn't make sense to me, passthrough devices may _have_ an expansion ROM. The expansion ROM is just another BAR. I don't know how pcifront/back deal with those today on PV x86, but I see no reason for ARM to deviate. 4.3 Hypercall for bdf mapping notification to xen --- #define PHYSDEVOP_map_sbdf 43 typedef struct { u32 s; u8 b; u8 df; u16 res; } sbdf_t; struct physdev_map_sbdf { int domain_id; sbdf_t sbdf; sbdf_t gsbdf; }; Each domain has a pdev list, which contains the list of all pci devices. 
The pdev structure already has the sbdf information. The arch_pci_dev is updated to contain the gsbdf information. (gs- guest segment id) Whenever there is a trap from the guest or an interrupt has to be injected, the pdev list is iterated to find the gsbdf. Can you give more background for this section? i.e: - Why do you need this? - How will xen translate the gsbdf to a vDeviceID? In the context of the hypercall processing. - Who will call this hypercall? - Why not set the gsbdf when the device is assigned? Can the maintainer of the pciback suggest an alternate. That's not me, but I don't think this belongs here, I think it can be done from the toolstack. If you think not then please explain what information the toolstack doesn't have in its possession which prevents this mapping from being done there. The toolstack does not have the guest sbdf information. I could only find it in xenpciback. Are you sure? The sbdf relates to the physical device, correct? If so then surely the toolstack knows it -- it's written in the config file and is the primary parameter to all of the related libxl passthrough APIs. The toolstack wouldn't be able to do anything about passing through a given device without knowing which device it should be passing through. Perhaps this info needs plumbing through to some new bit of the toolstack, but it is surely available somewhere. If you meant the virtual SBDF then that is in libxl_device_pci.vdevfn. I added prints in libxl__device_pci_add. vdevfn is always 0 so this may not be the right variable to use. Can you please recheck. Also the vdev-X entry in xenstore appears to be created from pciback code and not from xl. Check function xen_pcibk_publish_pci_dev. So I have to send a hypercall from pciback only. Ian.
Re: [Xen-devel] PCI Pass-through in Xen ARM - Draft 2.
On Tuesday 14 July 2015 11:31 PM, Stefano Stabellini wrote: On Tue, 14 Jul 2015, Julien Grall wrote: Hi Stefano, On 14/07/2015 18:46, Stefano Stabellini wrote: Linux provides a function (pci_for_each_dma_alias) which will return a requester ID for a given PCI device. It appears that the BDF (the 's' of sBDF is only internal to Linux and not part of the hardware) is equal to the requester ID on your platform but we can't assume it for anyone else. The PCI Express Base Specification states that the requester ID is "The combination of a Requester's Bus Number, Device Number, and Function Number that uniquely identifies the Requester." I think it is safe to assume BDF = requester ID on all platforms. With the catch that in case of ARI devices (http://pcisig.com/sites/default/files/specification_documents/ECN-alt-rid-interpretation-070604.pdf), BDF is actually BF because the device number is always 0 and the function number is 8 bits. And some other problems such as broken PCI devices... Both Xen x86 (domain_context_mapping in drivers/passthrough/vtd/iommu.c) and Linux (pci_for_each_dma_alias) use code more complex than requesterID = BDF. So I don't think we can use requesterID = BDF in a physdev op unless we are *strictly* sure this is valid. The spec is quite clear about it, but I guess there might be hardware quirks. Can we keep this open, and for now, till there is agreement, make requesterID = BDF? If you are ok, I will update and send Draft 3. Although, based on the x86 code, Xen should be able to translate the BDF into the requester ID... Yes, that is a good point.
Re: [Xen-devel] PCI Pass-through in Xen ARM - Draft 2.
On Tuesday 07 July 2015 04:54 PM, Ian Campbell wrote: On Tue, 2015-07-07 at 14:16 +0530, Manish Jaggi wrote: As I asked you in the previous mail, can you please prove it? The function used to get the requester ID (pci_for_each_dma_alias) is more complex than a simple return sbdf. I am not sure what you would like me to prove. As of the ThunderX Xen code we have assumed sbdf == deviceID. Please remember that you are not writing ThunderX Xen code here, you are writing generic Xen code which you happen to be testing on ThunderX. The design and implementation does need to consider the more generic case I'm afraid. In particular if this is going to be a PHYSDEVOP then it needs to be designed to be future proof, since PHYSDEVOP is a stable API i.e. it is hard to change in the future. I think I did ask elsewhere _why_ this was a physdev op, since I can't see why it can't be done by the toolstack, and therefore why it can't be a domctl. If it can be done in a domctl I prefer that. Will get back on this. If this was a domctl there might be scope for accepting an implementation which made assumptions such as sbdf == deviceid. However I'd still like to see this topic given proper treatment in the design and not just glossed over with "this is how ThunderX does things". I got your point. Or maybe the solution is simple and we should just do it now -- i.e. can we add a new field to the PHYSDEVOP_pci_host_bridge_add argument struct which contains the base deviceid for that bridge? deviceID would be the same as sbdf, as we don't have a way to translate sbdf to deviceID. What about the SMMU streamID, can we also have sbdf = deviceID = smmuid_bdf? FYI: In ThunderX each RC is on a separate SMMU. Can we take that as step 1? (since I believe both DT and ACPI IORT assume a simple linear mapping [citation needed])? I am ok with the approach but then we have to put something similar to IORT in the device tree. Currently it is not there. 
If we take that route of creating IORT for host / guest it would be an altogether different effort. Ian.
Re: [Xen-devel] PCI Pass-through in Xen ARM - Draft 2.
On Thursday 09 July 2015 01:38 PM, Julien Grall wrote: Hi Manish, On 09/07/2015 08:13, Manish Jaggi wrote: If this was a domctl there might be scope for accepting an implementation which made assumptions such as sbdf == deviceid. However I'd still like to see this topic given proper treatment in the design and not just glossed over with "this is how ThunderX does things". I got your point. Or maybe the solution is simple and we should just do it now -- i.e. can we add a new field to the PHYSDEVOP_pci_host_bridge_add argument struct which contains the base deviceid for that bridge? deviceID would be the same as sbdf, as we don't have a way to translate sbdf to deviceID. I think we have to be clear in this design document about the different meanings. When the Device Tree is used, it's assumed that the deviceID will be equal to the requester ID and not the sbdf. Does SMMUv2 have a concept of requesterID? I see the requesterID term in SMMUv3. Linux provides a function (pci_for_each_dma_alias) which will return a requester ID for a given PCI device. It appears that the BDF (the 's' of sBDF is only internal to Linux and not part of the hardware) is equal to the requester ID on your platform but we can't assume it for anyone else. so you mean requesterID = pci_for_each_dma_alias(sbdf)? When we have a PCI device in hand, we have to find the requester ID for this device. That is the question: how to map requesterID to sbdf. Once we have it we can deduce the streamID and the deviceID. The way to do it will depend on whether we use device tree or ACPI: - For device tree, the streamID and deviceID will be equal to the requester ID. What do you think should be the streamID when a device is a PCI EP and is enumerated? Also per the ARM SMMU 2.0 spec, StreamID is implementation specific. As per the SMMUv3 spec: "For PCI, it is intended that StreamID is generated from the PCI RequesterID." 
The generation function may be 1:1 where one Root Complex is hosted by one SMMU. - For ACPI, we would have to look up in the ACPI IORT. For the latter, I think they are static tables and therefore can be parsed in Xen. So we wouldn't need PHYSDEVOP_pci_host_bridge_add to pass an offset. This will also avoid any assumption that deviceIDs for a given root complex are always contiguous and makes it extendable for any new hardware requiring a different *ID. So what we really care about is the requester ID. Although, I'm not sure if you can find it in Xen. If not, we may need to customize (i.e adding a new PHYSDEVOP) PCI add device to take a requesterID as parameter. Now, in the case of the guest, as we are only supporting device tree, we could make the assumption that requesterID == deviceID as long as this is exposed in a DOMCTL to allow us flexibility. It would make sense to extend DOMCTL_assign_device to take the vBDF (or requesterID?) as parameter. Regards,
Re: [Xen-devel] PCI Pass-through in Xen ARM - Draft 2.
On Tuesday 07 July 2015 01:48 PM, Julien Grall wrote: Hi Manish, On 07/07/2015 08:10, Manish Jaggi wrote: On Monday 06 July 2015 05:15 PM, Julien Grall wrote: On 06/07/15 12:09, Manish Jaggi wrote: On Monday 06 July 2015 04:13 PM, Julien Grall wrote: On 05/07/15 06:55, Manish Jaggi wrote: 4.3 Hypercall for bdf mapping notification to xen --- #define PHYSDEVOP_map_sbdf 43 typedef struct { u32 s; u8 b; u8 df; u16 res; } sbdf_t; struct physdev_map_sbdf { int domain_id; sbdf_t sbdf; sbdf_t gsbdf; }; Each domain has a pdev list, which contains the list of all pci devices. The pdev structure already has the sbdf information. The arch_pci_dev is updated to contain the gsbdf information. (gs- guest segment id) Whenever there is a trap from the guest or an interrupt has to be injected, the pdev list is iterated to find the gsbdf. Can you give more background for this section? i.e: - Why do you need this? - How will xen translate the gsbdf to a vDeviceID? In the context of the hypercall processing. That wasn't my question. I asked, how will Xen find the mapping between the gsbdf and vDeviceID? It doesn't have access to the firmware table and is therefore not able to find the right one. I believe gsbdf and vDeviceID would be the same. Xen and the guest need to translate the gsbdf the same way. If this is clearly defined by a spec, then you should give a link to it. They are the same, will change sbdf -> DeviceID and gsbdf -> vDeviceID. As I asked you in the previous mail, can you please prove it? The function used to get the requester ID (pci_for_each_dma_alias) is more complex than a simple return sbdf. I am not sure what you would like me to prove. As of the ThunderX Xen code we have assumed sbdf == deviceID. We are not using ACPI as of now. This is our implementation. It cannot be outright wrong. Can you please suggest what could be the other approach. Furthermore, AFAICT, the IORT Table (from ACPI) [1] is used to specify the relationships between the requester ID and the DeviceID. 
So it's not obvious that sbdf == DeviceID. If not, you have to explain in this design doc how you plan to have xen and the guest using the same vdevID for a given gsbdf. Regards, [1] http://infocenter.arm.com/help/topic/com.arm.doc.den0049a/DEN0049A_IO_Remapping_Table.pdf
Re: [Xen-devel] PCI Pass-through in Xen ARM - Draft 2.
On Monday 06 July 2015 05:15 PM, Julien Grall wrote: On 06/07/15 12:09, Manish Jaggi wrote: On Monday 06 July 2015 04:13 PM, Julien Grall wrote: On 05/07/15 06:55, Manish Jaggi wrote: 4.3 Hypercall for bdf mapping notification to xen --- #define PHYSDEVOP_map_sbdf 43 typedef struct { u32 s; u8 b; u8 df; u16 res; } sbdf_t; struct physdev_map_sbdf { int domain_id; sbdf_t sbdf; sbdf_t gsbdf; }; Each domain has a pdev list, which contains the list of all pci devices. The pdev structure already has the sbdf information. The arch_pci_dev is updated to contain the gsbdf information. (gs- guest segment id) Whenever there is a trap from the guest or an interrupt has to be injected, the pdev list is iterated to find the gsbdf. Can you give more background for this section? i.e: - Why do you need this? - How will xen translate the gsbdf to a vDeviceID? In the context of the hypercall processing. That wasn't my question. I asked, how will Xen find the mapping between the gsbdf and vDeviceID? It doesn't have access to the firmware table and is therefore not able to find the right one. I believe gsbdf and vDeviceID would be the same. Xen and the guest need to translate the gsbdf the same way. If this is clearly defined by a spec, then you should give a link to it. They are the same, will change sbdf -> DeviceID and gsbdf -> vDeviceID. If not, you have to explain in this design doc how you plan to have xen and the guest using the same vdevID for a given gsbdf. Regards,
Re: [Xen-devel] PCI Pass-through in Xen ARM - Draft 2.
On Tuesday 07 July 2015 02:16 PM, Manish Jaggi wrote: On Tuesday 07 July 2015 01:48 PM, Julien Grall wrote: Hi Manish, On 07/07/2015 08:10, Manish Jaggi wrote: On Monday 06 July 2015 05:15 PM, Julien Grall wrote: On 06/07/15 12:09, Manish Jaggi wrote: On Monday 06 July 2015 04:13 PM, Julien Grall wrote: On 05/07/15 06:55, Manish Jaggi wrote: 4.3 Hypercall for bdf mapping notification to xen --- #define PHYSDEVOP_map_sbdf 43 typedef struct { u32 s; u8 b; u8 df; u16 res; } sbdf_t; struct physdev_map_sbdf { int domain_id; sbdf_t sbdf; sbdf_t gsbdf; }; Each domain has a pdev list, which contains the list of all pci devices. The pdev structure already has the sbdf information. The arch_pci_dev is updated to contain the gsbdf information. (gs- guest segment id) Whenever there is a trap from the guest or an interrupt has to be injected, the pdev list is iterated to find the gsbdf. Can you give more background for this section? i.e: - Why do you need this? - How will xen translate the gsbdf to a vDeviceID? In the context of the hypercall processing. That wasn't my question. I asked, how will Xen find the mapping between the gsbdf and vDeviceID? It doesn't have access to the firmware table and is therefore not able to find the right one. I believe gsbdf and vDeviceID would be the same. Xen and the guest need to translate the gsbdf the same way. If this is clearly defined by a spec, then you should give a link to it. They are the same, will change sbdf -> DeviceID and gsbdf -> vDeviceID. As I asked you in the previous mail, can you please prove it? The function used to get the requester ID (pci_for_each_dma_alias) is more complex than a simple return sbdf. I am not sure what you would like me to prove. As of the ThunderX Xen code we have assumed sbdf == deviceID. We are not using ACPI as of now. This is our implementation. It cannot be outright wrong. Can you please suggest what could be the other approach. 
Furthermore, AFAICT, the IORT Table (from ACPI) [1] is used to specify the relationships between the requester ID and the DeviceID. So it's not obvious that sbdf == DeviceID. If not, you have to explain in this design doc how you plan to have xen and the guest using the same vdevID for a given gsbdf. Regards, [1] http://infocenter.arm.com/help/topic/com.arm.doc.den0049a/DEN0049A_IO_Remapping_Table.pdf If ACPI is not used, the IORT (sbdf -> StreamID -> DeviceID mapping) has to be done in the device tree. Can we add this as a TODO, so that the first series of patches can be accepted with StreamID == DeviceID == sbdf?
Re: [Xen-devel] PCI Pass-through in Xen ARM - Draft 2.
On Monday 06 July 2015 04:13 PM, Julien Grall wrote: On 05/07/15 06:55, Manish Jaggi wrote: 4.3 Hypercall for bdf mapping notification to xen --- #define PHYSDEVOP_map_sbdf 43 typedef struct { u32 s; u8 b; u8 df; u16 res; } sbdf_t; struct physdev_map_sbdf { int domain_id; sbdf_t sbdf; sbdf_t gsbdf; }; Each domain has a pdev list, which contains the list of all pci devices. The pdev structure already has the sbdf information. The arch_pci_dev is updated to contain the gsbdf information. (gs- guest segment id) Whenever there is a trap from the guest or an interrupt has to be injected, the pdev list is iterated to find the gsbdf. Can you give more background for this section? i.e: - Why do you need this? - How will xen translate the gsbdf to a vDeviceID? In the context of the hypercall processing. That wasn't my question. I asked, how will Xen find the mapping between the gsbdf and vDeviceID? It doesn't have access to the firmware table and is therefore not able to find the right one. I believe gsbdf and vDeviceID would be the same. In the hypercall processing its_assign_device would be called with params its_assign_device(sbdf, gsbdf, domid). Regards,
Re: [Xen-devel] PCI Pass-through in Xen ARM - Draft 2.
On Monday 06 July 2015 02:41 PM, Ian Campbell wrote: On Sun, 2015-07-05 at 11:25 +0530, Manish Jaggi wrote: On Monday 29 June 2015 04:01 PM, Julien Grall wrote: Hi Manish, On 28/06/15 19:38, Manish Jaggi wrote: 4.1 Holes in guest memory space Holes are added in the guest memory space for mapping pci device's BAR regions. These are defined in arch-arm.h /* For 32bit */ GUEST_MMIO_HOLE0_BASE, GUEST_MMIO_HOLE0_SIZE /* For 64bit */ GUEST_MMIO_HOLE1_BASE, GUEST_MMIO_HOLE1_SIZE The memory layout for 32bit and 64bit are exactly the same. Why do you need to differ here? I think Ian has already replied. I will change the name of the macro. 4.2 New entries in xenstore for device BARs toolkit also updates the xenstore information for the device (virtualbar:physical bar). This information is read by xenpciback and returned to the pcifront driver configuration space accesses. Can you detail what you plan to put in xenstore and how? It is implementation detail. But I plan to put it under the domU / device / hierarchy. Actually, xenstore is an API of sorts which needs to be maintained going forward (since front and backend can evolve separately), so it does need some level of design and documentation. What about the expansion ROM? Do you want to put some restriction on not using expansion ROM as a passthrough device. expansion ROM as a passthrough device doesn't make sense to me, passthrough devices may _have_ an expansion ROM. The expansion ROM is just another BAR. I don't know how pcifront/back deal with those today on PV x86, but I see no reason for ARM to deviate. 4.3 Hypercall for bdf mapping notification to xen --- #define PHYSDEVOP_map_sbdf 43 typedef struct { u32 s; u8 b; u8 df; u16 res; } sbdf_t; struct physdev_map_sbdf { int domain_id; sbdf_t sbdf; sbdf_t gsbdf; }; Each domain has a pdev list, which contains the list of all pci devices. The pdev structure already has the sbdf information. The arch_pci_dev is updated to contain the gsbdf information. 
(gs- guest segment id) Whenever there is a trap from the guest or an interrupt has to be injected, the pdev list is iterated to find the gsbdf. Can you give more background for this section? i.e: - Why do you need this? - How will xen translate the gsbdf to a vDeviceID? In the context of the hypercall processing. - Who will call this hypercall? - Why not set the gsbdf when the device is assigned? Can the maintainer of the pciback suggest an alternate. That's not me, but I don't think this belongs here, I think it can be done from the toolstack. If you think not then please explain what information the toolstack doesn't have in its possession which prevents this mapping from being done there. The toolstack does not have the guest sbdf information. I could only find it in xenpciback. The answer to your question is that the only place I have found to issue the hypercall, where all the information can be located, is the function __xen_pcibk_add_pci_dev in drivers/xen/xen-pciback/vpci.c unlock: ... kfree(dev_entry); + /* Issue Hypercall here */ +#ifdef CONFIG_ARM64 + map_sbdf.domain_id = pdev->xdev->otherend_id; + map_sbdf.sbdf_s = dev->bus->domain_nr; + map_sbdf.sbdf_b = dev->bus->number; + map_sbdf.sbdf_d = dev->devfn >> 3; + map_sbdf.sbdf_f = dev->devfn & 0x7; + map_sbdf.gsbdf_s = 0; + map_sbdf.gsbdf_b = 0; + map_sbdf.gsbdf_d = slot; + map_sbdf.gsbdf_f = dev->devfn & 0x7; + pr_info("## sbdf = %d:%d:%d.%d g_sbdf %d:%d:%d.%d domain_id=%d ##\r\n", + map_sbdf.sbdf_s, + map_sbdf.sbdf_b, + map_sbdf.sbdf_d, + map_sbdf.sbdf_f, + map_sbdf.gsbdf_s, + map_sbdf.gsbdf_b, + map_sbdf.gsbdf_d, + map_sbdf.gsbdf_f, + map_sbdf.domain_id); + + err = HYPERVISOR_physdev_op(PHYSDEVOP_map_sbdf, &map_sbdf); + if (err) + printk(KERN_ERR "Xen Error PHYSDEVOP_map_sbdf"); +#endif --- Regards,
Re: [Xen-devel] PCI Pass-through in Xen ARM - Draft 2.
On Sunday 05 July 2015 11:25 AM, Manish Jaggi wrote: On Monday 29 June 2015 04:01 PM, Julien Grall wrote: Hi Manish, On 28/06/15 19:38, Manish Jaggi wrote: 4.1 Holes in guest memory space Holes are added in the guest memory space for mapping pci device's BAR regions. These are defined in arch-arm.h /* For 32bit */ GUEST_MMIO_HOLE0_BASE, GUEST_MMIO_HOLE0_SIZE /* For 64bit */ GUEST_MMIO_HOLE1_BASE, GUEST_MMIO_HOLE1_SIZE The memory layout for 32bit and 64bit are exactly the same. Why do you need to differ here? I think Ian has already replied. I will change the name of the macro. 4.2 New entries in xenstore for device BARs toolkit also updates the xenstore information for the device (virtualbar:physical bar). This information is read by xenpciback and returned to the pcifront driver configuration space accesses. Can you detail what you plan to put in xenstore and how? It is implementation detail. But I plan to put it under the domU / device / hierarchy. What about the expansion ROM? Do you want to put some restriction on not using expansion ROM as a passthrough device. 4.3 Hypercall for bdf mapping notification to xen --- #define PHYSDEVOP_map_sbdf 43 typedef struct { u32 s; u8 b; u8 df; u16 res; } sbdf_t; struct physdev_map_sbdf { int domain_id; sbdf_t sbdf; sbdf_t gsbdf; }; Each domain has a pdev list, which contains the list of all pci devices. The pdev structure already has the sbdf information. The arch_pci_dev is updated to contain the gsbdf information. (gs- guest segment id) Whenever there is a trap from the guest or an interrupt has to be injected, the pdev list is iterated to find the gsbdf. Can you give more background for this section? i.e: - Why do you need this? - How will xen translate the gsbdf to a vDeviceID? In the context of the hypercall processing. The hypercall handler in xen would call its_assign_device(sbdf, gsbdf, domid); - Who will call this hypercall? - Why not set the gsbdf when the device is assigned? Can the maintainer of the pciback suggest an alternate. 
The answer to your question is that the only place I have found to issue the hypercall, where all the information can be located, is the function __xen_pcibk_add_pci_dev in drivers/xen/xen-pciback/vpci.c unlock: ... kfree(dev_entry); + /* Issue Hypercall here */ +#ifdef CONFIG_ARM64 + map_sbdf.domain_id = pdev->xdev->otherend_id; + map_sbdf.sbdf_s = dev->bus->domain_nr; + map_sbdf.sbdf_b = dev->bus->number; + map_sbdf.sbdf_d = dev->devfn >> 3; + map_sbdf.sbdf_f = dev->devfn & 0x7; + map_sbdf.gsbdf_s = 0; + map_sbdf.gsbdf_b = 0; + map_sbdf.gsbdf_d = slot; + map_sbdf.gsbdf_f = dev->devfn & 0x7; + pr_info("## sbdf = %d:%d:%d.%d g_sbdf %d:%d:%d.%d domain_id=%d ##\r\n", + map_sbdf.sbdf_s, + map_sbdf.sbdf_b, + map_sbdf.sbdf_d, + map_sbdf.sbdf_f, + map_sbdf.gsbdf_s, + map_sbdf.gsbdf_b, + map_sbdf.gsbdf_d, + map_sbdf.gsbdf_f, + map_sbdf.domain_id); + + err = HYPERVISOR_physdev_op(PHYSDEVOP_map_sbdf, &map_sbdf); + if (err) + printk(KERN_ERR "Xen Error PHYSDEVOP_map_sbdf"); +#endif --- Regards,
Re: [Xen-devel] PCI Pass-through in Xen ARM - Draft 2
Ian Campbell wrote: On Mon, 2015-06-29 at 00:08 +0530, Manish Jaggi wrote: PCI Pass-through in Xen ARM -- Draft 2 Index 1. Background 2. Basic PCI Support in Xen ARM 2.1 pci_hostbridge and pci_hostbridge_ops 2.2 PHYSDEVOP_HOSTBRIDGE_ADD hypercall 3. Dom0 Access PCI devices 4. DomU assignment of PCI device 4.1 Holes in guest memory space 4.2 New entries in xenstore for device BARs 4.3 Hypercall for bdf mapping notification to xen 4.4 Change in Linux PCI FrontEnd - backend driver for MSI/X programming 5. NUMA and PCI passthrough 6. DomU pci device attach flow Revision History Changes from Draft 1 a) map_mmio hypercall removed from earlier draft b) device bar mapping into guest not 1:1 c) holes in guest address space 32bit / 64bit for MMIO virtual BARs d) xenstore device's BAR info addition. 1. Background of PCI passthrough Passthrough refers to assigning a pci device to a guest domain (domU) such that the guest has full control over the device. The MMIO space and interrupts are managed by the guest itself, close to how a bare-metal kernel manages a device. The device's access to guest address space needs to be isolated and protected. The SMMU (System MMU - IOMMU in ARM) is programmed by the xen hypervisor to allow the device to access guest memory for data transfer and for sending MSI/X interrupts. In case of MSI/X the device writes to the GITS (ITS address space) Interrupt Translation Register. 2. Basic PCI Support for ARM The APIs to read/write the pci configuration space are based on segment:bdf. How the sbdf is mapped to a physical address is under the realm of the pci host controller. ARM PCI support in Xen introduces pci host controller drivers similar to what exists in Linux. Each driver registers callbacks, which are invoked on matching the compatible property in the pci device tree node.
2.1: The init function in the pci host driver calls the following to register hostbridge callbacks: int pci_hostbridge_register(pci_hostbridge_t *pcihb); struct pci_hostbridge_ops { u32 (*pci_conf_read)(struct pci_hostbridge*, u32 bus, u32 devfn, u32 reg, u32 bytes); void (*pci_conf_write)(struct pci_hostbridge*, u32 bus, u32 devfn, u32 reg, u32 bytes, u32 val); }; struct pci_hostbridge { u32 segno; paddr_t cfg_base; paddr_t cfg_size; struct dt_device_node *dt_node; struct pci_hostbridge_ops ops; struct list_head list; }; A pci conf read function would internally be as follows: u32 pcihb_conf_read(u32 seg, u32 bus, u32 devfn, u32 reg, u32 bytes) { pci_hostbridge_t *pcihb; list_for_each_entry(pcihb, pci_hostbridge_list, list) { if (pcihb->segno == seg) return pcihb->ops.pci_conf_read(pcihb, bus, devfn, reg, bytes); } return -1; } 2.2 PHYSDEVOP_pci_host_bridge_add hypercall Xen code accesses PCI configuration space based on the sbdf received from the guest. The order in which the pci device tree nodes appear may not be the same as the order of device enumeration in dom0. Thus there needs to be a mechanism to bind the segment number assigned by dom0 to the pci host controller. The hypercall is introduced: #define PHYSDEVOP_pci_host_bridge_add 44 struct physdev_pci_host_bridge_add { /* IN */ uint16_t seg; uint64_t cfg_base; uint64_t cfg_size; }; This hypercall is invoked before dom0 invokes the PHYSDEVOP_pci_device_add hypercall. The handler code invokes the following to update the segment number in pci_hostbridge: int pci_hostbridge_setup(uint32_t segno, uint64_t cfg_base, uint64_t cfg_size); Subsequent calls to pci_conf_read/write are completed by the pci_hostbridge_ops of the respective pci_hostbridge. 3. Dom0 access PCI device - As per the design of the xen hypervisor, dom0 enumerates the PCI devices. For each device the MMIO space has to be mapped in the Stage2 translation for dom0. Here device is really host bridge, isn't it? i.e.
this is done by mapping the entire MMIO window of each host bridge, not the individual BAR registers of each device one at a time. No, the device means the PCIe EP device, not the RC. IOW this is functionality of the pci host driver's initial setup, not something which is driven from the dom0 enumeration of the bus. For dom0 xen maps the ranges in pci nodes in stage 2 translation. GITS_ITRANSLATER space (4K) must be programmed in Stage2 translation so that MSI/X will work. This is done in vits initialization in dom0/domU. This also happens at start of day, but what isn't mentioned is that (AIUI) the SMMU will need to be programmed to map each SBDF to the dom0 p2m as the devices are discovered and reported. Right? Yes, I will add an SMMU section in Draft3. 4. DomU access / assignment PCI device -- In the flow of pci-attach device, the toolkit I assume you mean toolstack throughout? If so
[Xen-devel] PCI Pass-through in Xen ARM - Draft 2.
PCI Pass-through in Xen ARM -- Draft 2 Index 1. Background 2. Basic PCI Support in Xen ARM 2.1 pci_hostbridge and pci_hostbridge_ops 2.2 PHYSDEVOP_HOSTBRIDGE_ADD hypercall 3. Dom0 Access PCI devices 4. DomU assignment of PCI device 4.1 Holes in guest memory space 4.2 New entries in xenstore for device BARs 4.3 Hypercall for bdf mapping notification to xen 4.4 Change in Linux PCI FrontEnd - backend driver for MSI/X programming 5. NUMA and PCI passthrough 6. DomU pci device attach flow Revision History Changes from Draft 1 a) map_mmio hypercall removed from earlier draft b) device bar mapping into guest not 1:1 c) holes in guest address space 32bit / 64bit for MMIO virtual BARs d) xenstore device's BAR info addition. 1. Background of PCI passthrough Passthrough refers to assigning a pci device to a guest domain (domU) such that the guest has full control over the device. The MMIO space and interrupts are managed by the guest itself, close to how a bare-metal kernel manages a device. The device's access to guest address space needs to be isolated and protected. The SMMU (System MMU - IOMMU in ARM) is programmed by the xen hypervisor to allow the device to access guest memory for data transfer and for sending MSI/X interrupts. In case of MSI/X the device writes to the GITS (ITS address space) Interrupt Translation Register. 2. Basic PCI Support for ARM The APIs to read/write the pci configuration space are based on segment:bdf. How the sbdf is mapped to a physical address is under the realm of the pci host controller. ARM PCI support in Xen introduces pci host controller drivers similar to what exists in Linux. Each driver registers callbacks, which are invoked on matching the compatible property in the pci device tree node.
2.1: The init function in the pci host driver calls the following to register hostbridge callbacks: int pci_hostbridge_register(pci_hostbridge_t *pcihb); struct pci_hostbridge_ops { u32 (*pci_conf_read)(struct pci_hostbridge*, u32 bus, u32 devfn, u32 reg, u32 bytes); void (*pci_conf_write)(struct pci_hostbridge*, u32 bus, u32 devfn, u32 reg, u32 bytes, u32 val); }; struct pci_hostbridge { u32 segno; paddr_t cfg_base; paddr_t cfg_size; struct dt_device_node *dt_node; struct pci_hostbridge_ops ops; struct list_head list; }; A pci conf read function would internally be as follows: u32 pcihb_conf_read(u32 seg, u32 bus, u32 devfn, u32 reg, u32 bytes) { pci_hostbridge_t *pcihb; list_for_each_entry(pcihb, pci_hostbridge_list, list) { if (pcihb->segno == seg) return pcihb->ops.pci_conf_read(pcihb, bus, devfn, reg, bytes); } return -1; } 2.2 PHYSDEVOP_pci_host_bridge_add hypercall Xen code accesses PCI configuration space based on the sbdf received from the guest. The order in which the pci device tree nodes appear may not be the same as the order of device enumeration in dom0. Thus there needs to be a mechanism to bind the segment number assigned by dom0 to the pci host controller. The hypercall is introduced: #define PHYSDEVOP_pci_host_bridge_add 44 struct physdev_pci_host_bridge_add { /* IN */ uint16_t seg; uint64_t cfg_base; uint64_t cfg_size; }; This hypercall is invoked before dom0 invokes the PHYSDEVOP_pci_device_add hypercall. The handler code invokes the following to update the segment number in pci_hostbridge: int pci_hostbridge_setup(uint32_t segno, uint64_t cfg_base, uint64_t cfg_size); Subsequent calls to pci_conf_read/write are completed by the pci_hostbridge_ops of the respective pci_hostbridge. 3. Dom0 access PCI device - As per the design of the xen hypervisor, dom0 enumerates the PCI devices. For each device the MMIO space has to be mapped in the Stage2 translation for dom0. For dom0 xen maps the ranges in pci nodes in stage 2 translation.
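To make the segment binding concrete, here is a standalone mock of how pci_hostbridge_register and pci_hostbridge_setup could interact. The list handling, the "segment unknown" sentinel, and matching the bridge by cfg_base/cfg_size are assumptions for illustration, not the definitive implementation.

```c
/* Standalone sketch: a host bridge driver registers at init time without
 * knowing its segment; the PHYSDEVOP_pci_host_bridge_add handler later
 * binds dom0's segment number to the bridge that owns that config window. */
#include <stdint.h>
#include <stddef.h>

typedef uint64_t paddr_t;

#define SEG_UNKNOWN 0xffffffffu

typedef struct pci_hostbridge {
    uint32_t segno;              /* unset until dom0 reports it */
    paddr_t cfg_base;
    paddr_t cfg_size;
    struct pci_hostbridge *next; /* stand-in for struct list_head */
} pci_hostbridge_t;

static pci_hostbridge_t *hb_list;

/* Called by the host bridge driver from its init function. */
int pci_hostbridge_register(pci_hostbridge_t *hb)
{
    hb->segno = SEG_UNKNOWN;     /* segment not known at driver init time */
    hb->next = hb_list;
    hb_list = hb;
    return 0;
}

/* Called from the PHYSDEVOP_pci_host_bridge_add handler: bind dom0's
 * segment number to the bridge whose ECAM window matches. */
int pci_hostbridge_setup(uint32_t segno, paddr_t cfg_base, paddr_t cfg_size)
{
    pci_hostbridge_t *hb;
    for (hb = hb_list; hb; hb = hb->next) {
        if (hb->cfg_base == cfg_base && hb->cfg_size == cfg_size) {
            hb->segno = segno;
            return 0;
        }
    }
    return -1; /* dom0 reported a bridge xen does not know about */
}
```

After this binding, the pcihb_conf_read dispatch shown above can route any sbdf-based access to the right bridge by comparing segno.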
GITS_ITRANSLATER space (4K) must be programmed in Stage2 translation so that MSI/X will work. This is done in vits initialization in dom0/domU. 4. DomU access / assignment PCI device -- In the flow of pci-attach device, the toolkit will read the pci configuration space BAR registers. The toolkit has the guest memory map and the information about the MMIO holes. When the first pci device is assigned to domU, the toolkit allocates a virtual BAR region from the MMIO hole area. The toolkit then sends the domctl xc_domain_memory_mapping to map it in stage2 translation. 4.1 Holes in guest memory space Holes are added in the guest memory space for mapping pci device's BAR regions. These are defined in arch-arm.h /* For 32bit */ GUEST_MMIO_HOLE0_BASE, GUEST_MMIO_HOLE0_SIZE /* For 64bit */ GUEST_MMIO_HOLE1_BASE, GUEST_MMIO_HOLE1_SIZE 4.2 New entries in xenstore for device BARs toolkit also updates the xenstore information for the
Re: [Xen-devel] PCI Passthrough ARM Design : Draft1
On Friday 26 June 2015 01:02 PM, Ian Campbell wrote: On Fri, 2015-06-26 at 07:37 +0530, Manish Jaggi wrote: On Thursday 25 June 2015 10:56 PM, Konrad Rzeszutek Wilk wrote: On Thu, Jun 25, 2015 at 01:21:28PM +0100, Ian Campbell wrote: On Thu, 2015-06-25 at 17:29 +0530, Manish Jaggi wrote: On Thursday 25 June 2015 02:41 PM, Ian Campbell wrote: On Thu, 2015-06-25 at 13:14 +0530, Manish Jaggi wrote: On Wednesday 17 June 2015 07:59 PM, Ian Campbell wrote: On Wed, 2015-06-17 at 07:14 -0700, Manish Jaggi wrote: On Wednesday 17 June 2015 06:43 AM, Ian Campbell wrote: On Wed, 2015-06-17 at 13:58 +0100, Stefano Stabellini wrote: Yes, pciback is already capable of doing that, see drivers/xen/xen-pciback/conf_space.c I am not sure if the pci-back driver can query the guest memory map. Is there an existing hypercall ? No, that is missing. I think it would be OK for the virtual BAR to be initialized to the same value as the physical BAR. But I would let the guest change the virtual BAR address and map the MMIO region wherever it wants in the guest physical address space with XENMEM_add_to_physmap_range. I disagree; given that we've apparently survived for years with x86 PV guests not being able to write to the BARs, I think it would be far simpler to extend this to ARM and x86 PVH too than to allow guests to start writing BARs, which has various complex questions around it. All that's needed is for the toolstack to set everything up and write some new xenstore nodes in the per-device directory with the BAR address/size. Also most guests apparently don't reassign the PCI bus by default, so using a 1:1 by default and allowing it to be changed would require modifying the guests to reassign. Easy on Linux, but I don't know about others and I imagine some OSes (especially simpler/embedded ones) are assuming the firmware sets up something sane by default.
Does the flow below capture all points? a) When assigning a device to domU, toolstack creates a node in the per-device directory with virtual BAR address/size Option1: b) toolstack using some hypercall asks xen to create the p2m mapping { virtual BAR : physical BAR } for domU While implementing I think rather than the toolstack, the pciback driver in dom0 can send the hypercall to map the physical bar to the virtual bar. Thus no xenstore entry is required for BARs. pciback doesn't (and shouldn't) have sufficient knowledge of the guest address space layout to determine what the virtual BAR should be. The toolstack is the right place for that decision to be made. Yes, the point is the pciback driver reads the physical BAR regions on request from domU. So it sends a hypercall to map the physical bars into stage2 translation for the domU through xen. Xen would use the holes left in IPA for MMIO. I still think it is the toolstack which should do this, that's where these sorts of layout decisions belong. can the xl tools read pci conf space ? Yes, via sysfs (possibly abstracted via libpci), just like lspci and friends do. Using some xen hypercall or a xl-dom0 ioctl ? No, using normal pre-existing Linux functionality. If not then there is no other way but xenpciback Also I need to introduce a hypercall which would tell the toolkit the available holes for virtualBAR mapping. Much simpler is to let xen allocate a virtualBAR and return it to the caller. At init - sure. But when the guest is running and doing those sort of things? Unless you want guest -> pciback -> xenstore -> libxl -> hypercall -> send ack on xenstore -> pciback -> guest. That would entail adding some pciback -> user-space tickle mechanism and another back. Much simpler to do all of this in xenpciback I think? I agree. If xenpciback sends a hypercall on every BAR read access, the mapping in xen would already have been done, so xen would simply be doing a PA->IPA lookup. No xenstore lookup is required.
The xenstore read would happen once on device attach, at the same time you are reading the rest of the dev-NNN stuff relating to the just attached device. Doing a xenstore transaction on every BAR read would indeed be silly and doing a hypercall would not be much better. There is no need for either a xenstore read or a hypercall during the cfg space access itself, you just read the value from a pciback datastructure. Add to that the fact that any new hypercall made from dom0 needs to be added as a stable interface, and I can't see any reason to go with such a model. I think you are overlooking a point, which is: from what region should the virtual BAR be allocated? One way is for xen to keep a hole for domains where the bar regions can be mapped. This is not there as of now. How would the tools know about this hole ? A domctl is required ? For this reason I was suggesting a hypercall to xen to map the physical BARs and return the virtualBARs. Ian.
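Ian's read-once model can be sketched as follows. This is a standalone mock; the structure and helper names (vbar_cache_init, vbar_cache_read) are invented for illustration and are not the actual xen-pciback code. The point is only that config-space reads are served from a datastructure filled once at attach time, with no xenstore transaction or hypercall per access.

```c
/* Standalone sketch: pciback caches the toolstack-chosen virtual BARs
 * (read once from xenstore on device attach) and serves config-space
 * reads of the BAR registers from that cache. */
#include <stdint.h>
#include <stddef.h>

#define PCI_NUM_BARS 6

struct vbar_cache {
    uint64_t vbar[PCI_NUM_BARS]; /* guest-visible BAR addresses */
    uint64_t size[PCI_NUM_BARS]; /* BAR sizes, for sanity checks */
};

/* Fill the cache from the values the toolstack wrote to the per-device
 * xenstore directory; called once when the device is attached. */
void vbar_cache_init(struct vbar_cache *c, const uint64_t *addrs,
                     const uint64_t *sizes)
{
    size_t i;
    for (i = 0; i < PCI_NUM_BARS; i++) {
        c->vbar[i] = addrs[i];
        c->size[i] = sizes[i];
    }
}

/* Config-space read intercept for a BAR: no xenstore, no hypercall,
 * just return the cached virtual value. */
uint64_t vbar_cache_read(const struct vbar_cache *c, int bar)
{
    if (bar < 0 || bar >= PCI_NUM_BARS)
        return 0;
    return c->vbar[bar];
}
```

This is exactly the "read the value from a pciback datastructure" step in the quoted argument; the p2m mapping { virtual BAR : physical BAR } is set up separately by the toolstack.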
Re: [Xen-devel] PCI Passthrough ARM Design : Draft1
On Friday 26 June 2015 02:39 PM, Ian Campbell wrote: On Fri, 2015-06-26 at 14:20 +0530, Manish Jaggi wrote: On Friday 26 June 2015 01:02 PM, Ian Campbell wrote: On Fri, 2015-06-26 at 07:37 +0530, Manish Jaggi wrote: On Thursday 25 June 2015 10:56 PM, Konrad Rzeszutek Wilk wrote: On Thu, Jun 25, 2015 at 01:21:28PM +0100, Ian Campbell wrote: On Thu, 2015-06-25 at 17:29 +0530, Manish Jaggi wrote: On Thursday 25 June 2015 02:41 PM, Ian Campbell wrote: On Thu, 2015-06-25 at 13:14 +0530, Manish Jaggi wrote: On Wednesday 17 June 2015 07:59 PM, Ian Campbell wrote: On Wed, 2015-06-17 at 07:14 -0700, Manish Jaggi wrote: On Wednesday 17 June 2015 06:43 AM, Ian Campbell wrote: On Wed, 2015-06-17 at 13:58 +0100, Stefano Stabellini wrote: Yes, pciback is already capable of doing that, see drivers/xen/xen-pciback/conf_space.c I am not sure if the pci-back driver can query the guest memory map. Is there an existing hypercall ? No, that is missing. I think it would be OK for the virtual BAR to be initialized to the same value as the physical BAR. But I would let the guest change the virtual BAR address and map the MMIO region wherever it wants in the guest physical address space with XENMEM_add_to_physmap_range. I disagree; given that we've apparently survived for years with x86 PV guests not being able to write to the BARs, I think it would be far simpler to extend this to ARM and x86 PVH too than to allow guests to start writing BARs, which has various complex questions around it. All that's needed is for the toolstack to set everything up and write some new xenstore nodes in the per-device directory with the BAR address/size. Also most guests apparently don't reassign the PCI bus by default, so using a 1:1 by default and allowing it to be changed would require modifying the guests to reassign. Easy on Linux, but I don't know about others and I imagine some OSes (especially simpler/embedded ones) are assuming the firmware sets up something sane by default.
Does the flow below capture all points? a) When assigning a device to domU, toolstack creates a node in the per-device directory with virtual BAR address/size Option1: b) toolstack using some hypercall asks xen to create the p2m mapping { virtual BAR : physical BAR } for domU While implementing I think rather than the toolstack, the pciback driver in dom0 can send the hypercall to map the physical bar to the virtual bar. Thus no xenstore entry is required for BARs. pciback doesn't (and shouldn't) have sufficient knowledge of the guest address space layout to determine what the virtual BAR should be. The toolstack is the right place for that decision to be made. Yes, the point is the pciback driver reads the physical BAR regions on request from domU. So it sends a hypercall to map the physical bars into stage2 translation for the domU through xen. Xen would use the holes left in IPA for MMIO. I still think it is the toolstack which should do this, that's where these sorts of layout decisions belong. can the xl tools read pci conf space ? Yes, via sysfs (possibly abstracted via libpci), just like lspci and friends do. Will implement that. Using some xen hypercall or a xl-dom0 ioctl ? No, using normal pre-existing Linux functionality. If not then there is no other way but xenpciback Also I need to introduce a hypercall which would tell the toolkit the available holes for virtualBAR mapping. Much simpler is to let xen allocate a virtualBAR and return it to the caller. At init - sure. But when the guest is running and doing those sort of things? Unless you want guest -> pciback -> xenstore -> libxl -> hypercall -> send ack on xenstore -> pciback -> guest. That would entail adding some pciback -> user-space tickle mechanism and another back. Much simpler to do all of this in xenpciback I think? I agree. If xenpciback sends a hypercall on every BAR read access, the mapping in xen would already have been done, so xen would simply be doing a PA->IPA lookup. No xenstore lookup is required.
The xenstore read would happen once on device attach, at the same time you are reading the rest of the dev-NNN stuff relating to the just attached device. Doing a xenstore transaction on every BAR read would indeed be silly and doing a hypercall would not be much better. There is no need for either a xenstore read or a hypercall during the cfg space access itself, you just read the value from a pciback datastructure. Add to that the fact that any new hypercall made from dom0 needs to be added as a stable interface, and I can't see any reason to go with such a model. I think you are overlooking a point, which is: from what region should the virtual BAR be allocated? One way is for xen to keep a hole for domains where the bar regions can be mapped. This is not there as of now. How would the tools know about this hole ? I think you've overlooked the point that _only_ the tools know enough about the overall guest address space layout to know about this hole. Xen has no need to know anything
Re: [Xen-devel] PCI Passthrough ARM Design : Draft1
On Wednesday 17 June 2015 07:59 PM, Ian Campbell wrote: On Wed, 2015-06-17 at 07:14 -0700, Manish Jaggi wrote: On Wednesday 17 June 2015 06:43 AM, Ian Campbell wrote: On Wed, 2015-06-17 at 13:58 +0100, Stefano Stabellini wrote: Yes, pciback is already capable of doing that, see drivers/xen/xen-pciback/conf_space.c I am not sure if the pci-back driver can query the guest memory map. Is there an existing hypercall ? No, that is missing. I think it would be OK for the virtual BAR to be initialized to the same value as the physical BAR. But I would let the guest change the virtual BAR address and map the MMIO region wherever it wants in the guest physical address space with XENMEM_add_to_physmap_range. I disagree; given that we've apparently survived for years with x86 PV guests not being able to write to the BARs, I think it would be far simpler to extend this to ARM and x86 PVH too than to allow guests to start writing BARs, which has various complex questions around it. All that's needed is for the toolstack to set everything up and write some new xenstore nodes in the per-device directory with the BAR address/size. Also most guests apparently don't reassign the PCI bus by default, so using a 1:1 by default and allowing it to be changed would require modifying the guests to reassign. Easy on Linux, but I don't know about others and I imagine some OSes (especially simpler/embedded ones) are assuming the firmware sets up something sane by default. Does the flow below capture all points? a) When assigning a device to domU, toolstack creates a node in the per-device directory with virtual BAR address/size Option1: b) toolstack using some hypercall asks xen to create the p2m mapping { virtual BAR : physical BAR } for domU While implementing I think rather than the toolstack, the pciback driver in dom0 can send the hypercall to map the physical bar to the virtual bar. Thus no xenstore entry is required for BARs. Moreover a pci driver would read BARs only once.
c) domU will not at any time update the BARs; if it does then it is a fault, till we decide how to handle it As Julien has noted pciback already deals with this correctly: because sizing a BAR involves a write, it implements a scheme which allows either the hardcoded virtual BAR to be written or all 1s (needed for size detection). d) when domU queries the BAR address from pci-back the virtual BAR address is provided. Option2: b) domU will not at any time update the BARs; if it does then it is a fault, till we decide how to handle it c) when domU queries the BAR address from pci-back the virtual BAR address is provided. d) domU sends a hypercall to map virtual BARs, e) xen pci code reads the BAR and maps { virtual BAR : physical BAR } for domU Which option is better? I think Ian is for (2) and Stefano may be (1) In fact I'm now (after Julien pointed out the current behaviour of pciback) in favour of (1), although I'm not sure if Stefano is too. (I was never in favour of (2), FWIW; I previously was in favour of (3), which is like (2) except pciback makes the hypercall to map the virtual bars to the guest. I'd still favour that over (2) but (1) is now my preference) Ian.
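The BAR write scheme Julien points at (fixed virtual BAR, but all-1s honoured for size probing) can be sketched like this. It is a standalone mock with invented names; real 32-bit BARs also carry type/prefetch flag bits in the low bits, which are ignored here for clarity.

```c
/* Standalone sketch of the pciback-style BAR emulation: a write of all 1s
 * arms the standard size probe, any other write is ignored and the
 * toolstack-assigned virtual BAR value stays fixed. */
#include <stdint.h>

struct vbar {
    uint32_t value;  /* toolstack-assigned virtual BAR address */
    uint32_t size;   /* BAR size, a power of two */
    int sizing;      /* guest wrote all 1s and is probing the size */
};

void vbar_write(struct vbar *b, uint32_t val)
{
    if (val == 0xffffffffu)
        b->sizing = 1;  /* next read must report the size mask */
    else
        b->sizing = 0;  /* guest wrote something else: ignore it */
}

uint32_t vbar_read(struct vbar *b)
{
    if (b->sizing)
        return ~(b->size - 1); /* standard BAR size-probe encoding */
    return b->value;           /* the fixed virtual BAR */
}
```

This is why option (1) needs no guest-driven remapping: the guest's size probe works, but the address it reads back is always the one the toolstack chose.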
Re: [Xen-devel] PCI Passthrough ARM Design : Draft1
On Thursday 25 June 2015 10:56 PM, Konrad Rzeszutek Wilk wrote: On Thu, Jun 25, 2015 at 01:21:28PM +0100, Ian Campbell wrote: On Thu, 2015-06-25 at 17:29 +0530, Manish Jaggi wrote: On Thursday 25 June 2015 02:41 PM, Ian Campbell wrote: On Thu, 2015-06-25 at 13:14 +0530, Manish Jaggi wrote: On Wednesday 17 June 2015 07:59 PM, Ian Campbell wrote: On Wed, 2015-06-17 at 07:14 -0700, Manish Jaggi wrote: On Wednesday 17 June 2015 06:43 AM, Ian Campbell wrote: On Wed, 2015-06-17 at 13:58 +0100, Stefano Stabellini wrote: Yes, pciback is already capable of doing that, see drivers/xen/xen-pciback/conf_space.c I am not sure if the pci-back driver can query the guest memory map. Is there an existing hypercall ? No, that is missing. I think it would be OK for the virtual BAR to be initialized to the same value as the physical BAR. But I would let the guest change the virtual BAR address and map the MMIO region wherever it wants in the guest physical address space with XENMEM_add_to_physmap_range. I disagree; given that we've apparently survived for years with x86 PV guests not being able to write to the BARs, I think it would be far simpler to extend this to ARM and x86 PVH too than to allow guests to start writing BARs, which has various complex questions around it. All that's needed is for the toolstack to set everything up and write some new xenstore nodes in the per-device directory with the BAR address/size. Also most guests apparently don't reassign the PCI bus by default, so using a 1:1 by default and allowing it to be changed would require modifying the guests to reassign. Easy on Linux, but I don't know about others and I imagine some OSes (especially simpler/embedded ones) are assuming the firmware sets up something sane by default.
Does the flow below capture all points? a) When assigning a device to domU, toolstack creates a node in the per-device directory with virtual BAR address/size Option1: b) toolstack using some hypercall asks xen to create the p2m mapping { virtual BAR : physical BAR } for domU While implementing I think rather than the toolstack, the pciback driver in dom0 can send the hypercall to map the physical bar to the virtual bar. Thus no xenstore entry is required for BARs. pciback doesn't (and shouldn't) have sufficient knowledge of the guest address space layout to determine what the virtual BAR should be. The toolstack is the right place for that decision to be made. Yes, the point is the pciback driver reads the physical BAR regions on request from domU. So it sends a hypercall to map the physical bars into stage2 translation for the domU through xen. Xen would use the holes left in IPA for MMIO. I still think it is the toolstack which should do this, that's where these sorts of layout decisions belong. can the xl tools read pci conf space ? Using some xen hypercall or a xl-dom0 ioctl ? If not then there is no other way but xenpciback Also I need to introduce a hypercall which would tell the toolkit the available holes for virtualBAR mapping. Much simpler is to let xen allocate a virtualBAR and return it to the caller. At init - sure. But when the guest is running and doing those sort of things? Unless you want guest -> pciback -> xenstore -> libxl -> hypercall -> send ack on xenstore -> pciback -> guest. That would entail adding some pciback -> user-space tickle mechanism and another back. Much simpler to do all of this in xenpciback I think? I agree. If xenpciback sends a hypercall on every BAR read access, the mapping in xen would already have been done, so xen would simply be doing a PA->IPA lookup. No xenstore lookup is required. Xen would return the IPA for pci-back to return to the request to domU. Moreover a pci driver would read BARs only once.
You can't assume that though; a driver can do whatever it likes, or the module might be unloaded and reloaded in the guest, etc.

Are you going to send out a second draft based on the discussion so far?

Yes, I was working on that only. I was traveling this week; 24-hour flights, jetlag...

Ian.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel
Re: [Xen-devel] PCI Passthrough ARM Design : Draft1
On Wednesday 17 June 2015 06:43 AM, Ian Campbell wrote:
On Wed, 2015-06-17 at 13:58 +0100, Stefano Stabellini wrote:

Does the flow below capture all the points?

a) When assigning a device to domU, the toolstack creates a node in the per-device directory with the virtual BAR address/size.

Option 1:
b) The toolstack, using some hypercall, asks Xen to create the p2m mapping { virtual BAR : physical BAR } for domU.
c) domU will not at any time update the BARs; if it does, then it is a fault, till we decide how to handle it.
d) When domU queries the BAR address from pci-back, the virtual BAR address is provided.
Option 2:
b) domU will not at any time update the BARs; if it does, then it is a fault, till we decide how to handle it.
c) When domU queries the BAR address from pci-back, the virtual BAR address is provided.
d) domU sends a hypercall to map the virtual BARs.
e) Xen PCI code reads the BAR and maps { virtual BAR : physical BAR } for domU.

Which option is better? I think Ian is for (2) and Stefano may be for (1).

Ian.