[Xen-devel] [Draft Design v3] ACPI/IORT Support in Xen.

2017-11-20 Thread Manish Jaggi


 ACPI/IORT Support in Xen.
 --
  Draft 3

 Revision History:

 Changes since v2:
 - Modified as per comments from Julien/Sameer/Andre.

 Changes since v1:
 - Modified IORT Parsing data structures.
 - Added RID-StreamID and RID-DeviceID map as per Andre's suggestion.
 - Added reference code which can be read along with this document.
 - Removed domctl for DomU, it would be covered in PCI-PT design.

 Introduction:
 -

 I had sent out patch series [0] to hide the SMMU from the Dom0 IORT.
 This document is a rework of that series, as it:
 (a) extends the scope by parsing the IORT table once and storing it
 in in-memory data structures, which can then be used for querying.
 This eliminates the need to parse the complete IORT table multiple
 times.

 (b) makes generation of IORT for domains independent, using a set of
 helper routines.

 Index
 
 1. What is IORT and what are its components?
 2. Current Support in Xen
 3. IORT for Dom0
 4. IORT for DomU
 5. Parsing of IORT in Xen
 6. Generation of IORT
 7. Implementation Phases
 8. References

 1. IORT Structure
 -----------------
 IORT refers to the I/O Remapping Table. It is essentially used to find
 information about the IO topology (PCIRC-SMMU-ITS) and relationships
 between devices.

 A general structure of IORT [1]:
 It has nodes for PCI RC, SMMU, ITS and platform devices. Using an IORT
 table, the relationship RID -> StreamID -> DeviceID can be obtained.
 Which device is behind which SMMU and which interrupt controller, i.e.
 the topology, is described in the IORT table.

 Some PCI RCs may not be behind an SMMU, and map RID -> DeviceID directly.

 RID is a requester ID in PCI context,
 StreamID is the ID of the device in SMMU context,
 DeviceID is the ID programmed in ITS.

 Each iort_node contains an ID map array to translate one ID into another:
 IDmap Entry {input_range, output_range, output_node_ref, id_count}
 This array is associated with the PCI RC node, SMMU node and Named
 component node, and can reference an SMMU or ITS node.
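
 To illustrate, an ID map entry could be modelled as below. This is a
 sketch only: the struct and field names here are mine, not taken from
 the spec or the reference code. The translation itself is linear; an
 input ID in [input_base, input_base + id_count) maps to output_base
 plus the same offset.

 struct idmap_entry {
     u32 input_base;   /* first input ID covered by this entry */
     u32 output_base;  /* output ID corresponding to input_base */
     u32 id_count;     /* number of IDs covered */
     void *output_ref; /* referenced node: an SMMU or an ITS group */
 };

 /* Linear translation; only valid when id lies inside the entry. */
 static inline u32 idmap_translate(const struct idmap_entry *e, u32 id)
 {
     return e->output_base + (id - e->input_base);
 }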

 2. Current Support of IORT
 --------------------------
 It is proposed in this document to parse the IORT once and use the
 information to translate RIDs without traversing the IORT repeatedly.

 Also, Xen prepares an IORT table for Dom0 based on the host IORT.
 For a DomU, an IORT table is required only in the case of device
 passthrough.

 3. IORT for Dom0
 ----------------
 IORT for Dom0 is based on the host IORT. A few nodes could be removed
 or modified.

   For instance:
 - Host SMMU nodes should not be present, as only Xen should touch them.
 - Platform devices (named components) would be passed as is. The
   visibility criterion for Dom0 is TBD.

 4. IORT for DomU
 ----------------
 IORT for DomU should be generated by the toolstack. The IORT table is
 only present in the case of device passthrough.

 At a minimum, a DomU IORT should include a single PCIRC and ITS Group.
 A corresponding PCIRC can be added in the DSDT.
 The exact structure of the DomU IORT will be covered along with the
 PCI PT design.


 5. Parsing of IORT in Xen
 -------------------------
 IORT nodes can be saved in structures so that IORT table parsing is
 done once and reused by all Xen subsystems (ITS, SMMU, domain
 creation, etc.).

 The structures proposed to hold IORT information are below. [4]

 struct rid_map_struct {
    void *pcirc_node;
    u16 input_base;
    u32 output_base;
    u16 id_count;
    struct list_head entry;
 };

Two global variables would hold the maps.
  struct list_head rid_streamid_map;
  struct list_head rid_deviceid_map;

 5.1 Functions to query StreamID and DeviceID from RID.

 void query_streamid(void *pcirc_node, u16 rid, u32 *streamid);
 void query_deviceid(void *pcirc_node, u16 rid, u32 *deviceid);
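
 As a minimal sketch of the lookup (assuming the global rid_streamid_map
 list holds struct rid_map_struct entries as defined above), a query is
 a linear walk of the map:

 void query_streamid(void *pcirc_node, u16 rid, u32 *streamid)
 {
     struct rid_map_struct *rmap;

     list_for_each_entry(rmap, &rid_streamid_map, entry)
     {
         if ( rmap->pcirc_node == pcirc_node &&
              rid >= rmap->input_base &&
              rid < rmap->input_base + rmap->id_count )
         {
             /* Same linear offset rule as the IORT ID mappings. */
             *streamid = rmap->output_base + (rid - rmap->input_base);
             return;
         }
     }
     /* No match; a real implementation would report an error here. */
 }

 query_deviceid would walk rid_deviceid_map in the same way.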

 Adding a mapping is done via helper functions:

 int add_rid_streamid_map(void *pcirc_node, u32 ib, u32 ob, u32 idc);
 int add_rid_deviceid_map(void *pcirc_node, u32 ib, u32 ob, u32 idc);

 - The rid-streamid map is straightforward and is created using the
   pci_rc's idmap.
 - The rid-deviceid map is created by translating streamids to
   deviceids; the fixup_rid_deviceid_map function does that. (See [6])
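
 A sketch of the add helper, assuming Xen's xzalloc and the global
 lists above (the reference code in [6] is authoritative):

 int add_rid_streamid_map(void *pcirc_node, u32 ib, u32 ob, u32 idc)
 {
     struct rid_map_struct *rmap = xzalloc(struct rid_map_struct);

     if ( !rmap )
         return -ENOMEM;

     rmap->pcirc_node = pcirc_node;
     rmap->input_base = ib;
     rmap->output_base = ob;
     rmap->id_count = idc;
     list_add_tail(&rmap->entry, &rid_streamid_map);

     return 0;
 }

 Conceptually, fixup_rid_deviceid_map then walks each rid->streamid
 entry whose output node is an SMMU, translates that streamid range
 through the SMMU node's own idmap to get the deviceid range, and
 records the resulting rid->deviceid entry.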

To keep the API similar to Linux, iort_node_map_rid would be mapped to
query_streamid.

6. IORT Generation
---
It is proposed to have a common helper library to generate IORT for dom0/U.
Note: it is desired to have IORT generation code sharing between toolstack
and Xen.

a. For Dom0
 rid_deviceid_map can be used directly to generate the Dom0 IORT table.
 Exclusion of nodes is still open for suggestions.
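
 A sketch of that iteration, using the map entries from section 5;
 write_pcirc_idmap_entry is a hypothetical helper standing in for the
 code that emits one ID mapping into the IORT being built:

 static void build_dom0_pcirc_idmaps(void *pcirc_node)
 {
     struct rid_map_struct *rmap;

     list_for_each_entry(rmap, &rid_deviceid_map, entry)
     {
         if ( rmap->pcirc_node != pcirc_node )
             continue;

         /* One map entry becomes one ID mapping: RID range -> ITS group. */
         write_pcirc_idmap_entry(rmap->input_base, rmap->output_base,
                                 rmap->id_count);
     }
 }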

b. For DomU
 Minimal structure is discussed in section 4. It will be further discussed
 in the context of PCI PT design.

7. Implementation Phases
-
a. IORT Parsing and RID Query
b. IORT Generation for Dom0
c. IORT Generation for DomU.

8. References:
-
[0] https://www.mail-archive.com/xen-devel@lists.xen.org/msg121667.html
[1] ARM DEN0049C: IO Remapping Table (IORT)

Re: [Xen-devel] [RFC] [Draft Design v2] ACPI/IORT Support in Xen.

2017-11-16 Thread Manish Jaggi



On 11/16/2017 5:23 PM, Julien Grall wrote:

Hi Manish,

On 16/11/17 11:46, Manish Jaggi wrote:



On 11/16/2017 5:07 PM, Julien Grall wrote:



On 16/11/17 07:39, Manish Jaggi wrote:

On 11/14/2017 6:53 PM, Julien Grall wrote:

3. IORT for Dom0
-
IORT for Dom0 is based on host iort. Few nodes could be removed 
or modified.

  For instance
- Host SMMU nodes should not be present as Xen should only touch it.
- platform nodes (named components) may be controlled by xen 
command line.


I am not sure where this example comes from. As I said, there are
no plans to support Platform Device passthrough with ACPI. A
better example here would be removing the PMCG.


It came from review comments on my previous IORT SMMU hiding patch. 
Andre suggested that Platform Nodes are needed.


After some brainstorming with Julien we found two problems:
1) This only covers RC nodes, but not "named components" (platform
devices), which we will need. ...

From: 
https://www.mail-archive.com/xen-devel@lists.xen.org/msg123434.html


I think you misunderstood my comment here... What I call "device 
passthrough" is giving access to a device to a domain other than the 
Hardware Domain


There are no plans for supporting platform device passthrough on ACPI
and I don't understand why you would like to control that using the
command line.


What Andre was saying is your series was not covering the "named 
components" for the Hardware Domain.


Section 3 is IORT for Dom0, where I mentioned that some platform
devices can be hidden from dom0.
So your comment on platform device passthrough might not be valid
then, as it is for domUs only.


Regarding the visibility of a platform device for dom0, I took a cue
from your comment below


Where did I ever mention the command line solution? Please stop trying 
to put words in my mouth.


There are other reasons than passthrough to hide a device from the
Hardware Domain.



Let's put some clarity on the below items, specifically for dom0:
a. Can platform devices be part of the dom0 IORT?
b. If (a) is yes, then how do we decide, at a finer grain, the
visibility of platform devices for dom0?

    Update ACPI tables to remove the device?
c. Is fine-grained visibility of platform devices for dom0 to be
covered in my current patchset?



This has two benefits:

...
3) We could decide in a finer grain which devices (e.g. platform
device) Dom0 can see.
From: 
https://www.mail-archive.com/xen-devel@lists.xen.org/msg124534.html




Cheers,












Re: [Xen-devel] [RFC] [Draft Design v2] ACPI/IORT Support in Xen.

2017-11-15 Thread Manish Jaggi



On 11/14/2017 6:53 PM, Julien Grall wrote:

Hi Manish,

Hey Julien,


On 08/11/17 14:38, Manish Jaggi wrote:

ACPI/IORT Support in Xen.
--
 Draft 2

Revision History:

Changes since v1-
- Modified IORT Parsing data structures.
- Added RID->StreamID and RID->DeviceID map as per Andre's suggestion.
- Added reference code which can be read along with this document.
- Removed domctl for DomU, it would be covered in PCI-PT design.

Introduction:
-

I had sent out patch series [0] to hide smmu from Dom0 IORT.
This document is a rework of the series as it:
(a) extends scope by adding parsing of IORT table once
and storing it in in-memory data structures, which can then be used
for querying. This would eliminate the need to parse complete iort
table multiple times.

(b) Generation of IORT for domains be independent using a set of
helper routines.

Index


1. What is IORT. What are its components ?
2. Current Support in Xen
3. IORT for Dom0
4. IORT for DomU
5. Parsing of IORT in Xen
6. Generation of IORT
7. Implementation Phases
8. References

1. IORT Structure ?

IORT refers to Input Output remapping table. It is essentially used
to find information about the IO topology (PCIRC-SMMU-ITS) and
relationships between devices.

A general structure of IORT [1]:
It has nodes for PCI RC, SMMU, ITS and Platform devices. Using an
IORT table, relationship between RID -> StreamID -> DeviceId can be
obtained. Which device is behind which SMMU and which interrupt
controller, topology is described in IORT Table.

Some PCI RC may be not behind an SMMU, and directly map RID->DeviceID.

RID is a requester ID in PCI context,
StreamID is the ID of the device in SMMU context,
DeviceID is the ID programmed in ITS.

Each iort_node contains an ID map array to translate one ID into
another.
IDmap Entry {input_range, output_range, output_node_ref, id_count}
This array is associated with PCI RC node, SMMU node, Named component
node, and can reference to a SMMU or ITS node.

2. Current Support of IORT
---
IORT is proposed to be used by Xen to setup SMMU's and platform devices
and for translating RID->StreamID and RID->DeviceID.


I am not sure to understand "to setup SMMU's and platform devices...".
With IORT, software can discover the list of SMMUs and the IDs to
configure the ITS and SMMUs for each device (e.g. PCI, integrated...)
on the platform. You will not be able to discover the list of platform
devices through it.


Also, it is not really "proposed". It is the only way to get that
information from ACPI.

ok, I will rephrase it.




It is proposed in this document to parse iort once and use the
information to translate RID without traversing IORT again and again.

Also Xen prepares an IORT table for dom0 based on host IORT.
For DomU IORT table proposed only in case of device passthrough.

3. IORT for Dom0
-
IORT for Dom0 is based on host iort. Few nodes could be removed or 
modified.

  For instance
- Host SMMU nodes should not be present as Xen should only touch it.
- platform nodes (named components) may be controlled by xen command 
line.


I am not sure where this example comes from. As I said, there are
no plans to support Platform Device passthrough with ACPI. A better
example here would be removing the PMCG.


It came from review comments on my previous IORT SMMU hiding patch. 
Andre suggested that Platform Nodes are needed.


After some brainstorming with Julien we found two problems:
1) This only covers RC nodes, but not "named components" (platform
devices), which we will need. ...

From: https://www.mail-archive.com/xen-devel@lists.xen.org/msg123434.html



4. IORT for DomU
-
IORT for DomU should be generated by toolstack. IORT table is only
present in case of device passthrough.

At a minimum domU IORT should include a single PCIRC and ITS Group.
Similar PCIRC can be added in DSDT.
The exact structure of DomU IORT would be covered along with PCI PT 
design.


5. Parsing of IORT in Xen
--
IORT nodes can be saved in structures so that IORT table parsing can 
be done
once and is reused by all xen subsystems like ITS / SMMU etc, domain 
creation.

Proposed are the structures to hold IORT information. [4]

struct rid_map_struct {
 void *pcirc_node;
 u16 ib; /* Input base */
 u32 ob; /* Output base */
 u16 idc; /* Id Count */
  struct list_head entry;
};

struct iort_ref
{
 struct list_head rid_streamId_map;
 struct list_head rid_deviceId_map;
}iortref;

5.1 Functions to query StreamID and DeviceID from RID.

void query_streamId(void *pcirc_node, u16 rid, u32 *streamId);
void query_deviceId(void *pcirc_node, u16 rid, u32 *deviceId);

Adding a mapping is done via helper functions

int add_rid_streamId_map(void *pcirc_node, u32 ib, u32 ob, u32 idc)
int add_rid_deviceId_map(void *pcirc_node, u32 ib, u32 ob, u32 idc)

Re: [Xen-devel] [RFC v2 5/7] acpi:arm64: Add support for parsing IORT table

2017-11-08 Thread Manish Jaggi

Hi Sameer

On 9/21/2017 6:07 AM, Sameer Goel wrote:

Add support for parsing IORT table to initialize SMMU devices.
* The code for creating an SMMU device has been modified, so that the SMMU
device can be initialized.
* The NAMED NODE code has been commented out as this will need DOM0 kernel
support.
* ITS code has been included but it has not been tested.

Signed-off-by: Sameer Goel 
Followup of the discussions we had on iort parsing and querying streamID 
and deviceId based on RID.

I have extended your patchset with a patch that provides an alternative
way of parsing iort into maps: {rid-streamid}, {rid-deviceid},
which can be looked up directly to find the streamId for a rid. This
will remove the need to traverse the iort table again.

The test patch just describes the proposed flow and how the parsing and
query code might fit in. I have not tested it.
The code only compiles.

https://github.com/mjaggi-cavium/xen-wip/commit/df006d64bdbb5c8344de5a710da8bf64c9e8edd5
(This repo has all 7 of your patches + the test code patch merged.)

Note: The commit text of the patch describes the basic flow /assumptions 
/ usage of functions.

Please see the code along with the v2 design draft.
[RFC] [Draft Design v2] ACPI/IORT Support in Xen.
https://lists.xen.org/archives/html/xen-devel/2017-11/msg00512.html

I seek your advice on this. Please provide your feedback.

Thanks
Manish



---
  xen/arch/arm/setup.c   |   3 +
  xen/drivers/acpi/Makefile  |   1 +
  xen/drivers/acpi/arm/Makefile  |   1 +
  xen/drivers/acpi/arm/iort.c| 173 +
  xen/drivers/passthrough/arm/smmu.c |   1 +
  xen/include/acpi/acpi_iort.h   |  17 ++--
  xen/include/asm-arm/device.h   |   2 +
  xen/include/xen/acpi.h |  21 +
  xen/include/xen/pci.h  |   8 ++
  9 files changed, 146 insertions(+), 81 deletions(-)
  create mode 100644 xen/drivers/acpi/arm/Makefile

diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c
index 92f173b..4ba09b2 100644
--- a/xen/arch/arm/setup.c
+++ b/xen/arch/arm/setup.c
@@ -49,6 +49,7 @@
  #include 
  #include 
  #include 
+#include 
  
  struct bootinfo __initdata bootinfo;
  
@@ -796,6 +797,8 @@ void __init start_xen(unsigned long boot_phys_offset,
  
  tasklet_subsys_init();
  
+/* Parse the ACPI iort data */
+acpi_iort_init();
  
  xsm_dt_init();
  
diff --git a/xen/drivers/acpi/Makefile b/xen/drivers/acpi/Makefile

index 444b11d..e7ffd82 100644
--- a/xen/drivers/acpi/Makefile
+++ b/xen/drivers/acpi/Makefile
@@ -1,5 +1,6 @@
  subdir-y += tables
  subdir-y += utilities
+subdir-$(CONFIG_ARM) += arm
  subdir-$(CONFIG_X86) += apei
  
  obj-bin-y += tables.init.o

diff --git a/xen/drivers/acpi/arm/Makefile b/xen/drivers/acpi/arm/Makefile
new file mode 100644
index 000..7c039bb
--- /dev/null
+++ b/xen/drivers/acpi/arm/Makefile
@@ -0,0 +1 @@
+obj-y += iort.o
diff --git a/xen/drivers/acpi/arm/iort.c b/xen/drivers/acpi/arm/iort.c
index 2e368a6..7f54062 100644
--- a/xen/drivers/acpi/arm/iort.c
+++ b/xen/drivers/acpi/arm/iort.c
@@ -14,17 +14,47 @@
   * This file implements early detection/parsing of I/O mapping
   * reported to OS through firmware via I/O Remapping Table (IORT)
   * IORT document number: ARM DEN 0049A
+ *
+ * Based on Linux drivers/acpi/arm64/iort.c
+ * => commit ca78d3173cff3503bcd15723b049757f75762d15
+ *
+ * Xen modification:
+ * Sameer Goel 
+ * Copyright (C) 2017, The Linux Foundation, All rights reserved.
+ *
   */
  
-#define pr_fmt(fmt)	"ACPI: IORT: " fmt

-
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+
+/* Xen: Define compatibility functions */
+#define FW_BUG "[Firmware Bug]: "
+#define pr_err(fmt, ...) printk(XENLOG_ERR fmt, ## __VA_ARGS__)
+#define pr_warn(fmt, ...) printk(XENLOG_WARNING fmt, ## __VA_ARGS__)
+
+/* Alias to Xen allocation helpers */
+#define kfree xfree
+#define kmalloc(size, flags) _xmalloc(size, sizeof(void *))
+#define kzalloc(size, flags) _xzalloc(size, sizeof(void *))
+
+/* Redefine WARN macros */
+#undef WARN
+#undef WARN_ON
+#define WARN(condition, format...) ({  \
+   int __ret_warn_on = !!(condition);  \
+   if (unlikely(__ret_warn_on))\
+   printk(format); \
+   unlikely(__ret_warn_on);\
+})
+#define WARN_TAINT(cond, taint, format...) WARN(cond, format)
+#define WARN_ON(cond)  (!!cond)
  
  #define IORT_TYPE_MASK(type)	(1 << (type))

  #define IORT_MSI_TYPE (1 << ACPI_IORT_NODE_ITS_GROUP)
@@ -256,6 +286,13 @@ static acpi_status iort_match_node_callback(struct 
acpi_iort_node *node,
acpi_status status;
  
  	if (node->type == 

[Xen-devel] [RFC] [Draft Design v2] ACPI/IORT Support in Xen.

2017-11-08 Thread Manish Jaggi

ACPI/IORT Support in Xen.
--
Draft 2

Revision History:

Changes since v1-
- Modified IORT Parsing data structures.
- Added RID->StreamID and RID->DeviceID map as per Andre's suggestion.
- Added reference code which can be read along with this document.
- Removed domctl for DomU, it would be covered in PCI-PT design.

Introduction:
-

I had sent out patch series [0] to hide smmu from Dom0 IORT.
This document is a rework of the series as it:
(a) extends scope by adding parsing of IORT table once
and storing it in in-memory data structures, which can then be used
for querying. This would eliminate the need to parse complete iort
table multiple times.

(b) Generation of IORT for domains be independent using a set of
helper routines.

Index


1. What is IORT. What are its components ?
2. Current Support in Xen
3. IORT for Dom0
4. IORT for DomU
5. Parsing of IORT in Xen
6. Generation of IORT
7. Implementation Phases
8. References

1. IORT Structure ?

IORT refers to Input Output remapping table. It is essentially used to find
information about the IO topology (PCIRC-SMMU-ITS) and relationships between
devices.

A general structure of IORT [1]:
It has nodes for PCI RC, SMMU, ITS and Platform devices. Using an IORT table
relationship between RID -> StreamID -> DeviceId can be obtained.
Which device is behind which SMMU and which interrupt controller, topology
is described in IORT Table.

Some PCI RC may be not behind an SMMU, and directly map RID->DeviceID.

RID is a requester ID in PCI context,
StreamID is the ID of the device in SMMU context,
DeviceID is the ID programmed in ITS.

Each iort_node contains an ID map array to translate one ID into another.
IDmap Entry {input_range, output_range, output_node_ref, id_count}
This array is associated with PCI RC node, SMMU node, Named component node.
and can reference to a SMMU or ITS node.

2. Current Support of IORT
---
IORT is proposed to be used by Xen to setup SMMU's and platform devices
and for translating RID->StreamID and RID->DeviceID.

It is proposed in this document to parse iort once and use the information
to translate RID without traversing IORT again and again.

Also Xen prepares an IORT table for dom0 based on host IORT.
For DomU IORT table proposed only in case of device passthrough.

3. IORT for Dom0
-
IORT for Dom0 is based on host iort. Few nodes could be removed or modified.
 For instance
- Host SMMU nodes should not be present as Xen should only touch it.
- platform nodes (named components) may be controlled by xen command line.

4. IORT for DomU
-
IORT for DomU should be generated by toolstack. IORT table is only present
in case of device passthrough.

At a minimum domU IORT should include a single PCIRC and ITS Group.
Similar PCIRC can be added in DSDT.
The exact structure of DomU IORT would be covered along with PCI PT design.

5. Parsing of IORT in Xen
--
IORT nodes can be saved in structures so that IORT table parsing can be done
once and is reused by all xen subsystems like ITS / SMMU etc, domain creation.
Proposed are the structures to hold IORT information. [4]

struct rid_map_struct {
void *pcirc_node;
u16 ib; /* Input base */
u32 ob; /* Output base */
u16 idc; /* Id Count */
struct list_head entry;
};

struct iort_ref
{
struct list_head rid_streamId_map;
struct list_head rid_deviceId_map;
}iortref;

5.1 Functions to query StreamID and DeviceID from RID.

void query_streamId(void *pcirc_node, u16 rid, u32 *streamId);
void query_deviceId(void *pcirc_node, u16 rid, u32 *deviceId);

Adding a mapping is done via helper functions

int add_rid_streamId_map(void *pcirc_node, u32 ib, u32 ob, u32 idc)
int add_rid_deviceId_map(void *pcirc_node, u32 ib, u32 ob, u32 idc)

- rid-streamId map is straight forward and is created using pci_rc's idmap

- rid-deviceId map is created by translating streamIds to deviceIds.
fixup_rid_deviceId_map function does that. (See [6])

It is proposed that query functions should replace functions like
iort_node_map_rid which is currently used in linux and is imported in Xen
in the patchset [2][5]

5.2 Proposed Flow of parsing
The flow is based on the patchset in [5]. I have added a reference code on
top of it which does IORT parsing as described in this section. The code
is available at [6].

The commit also describes the code flow and assumptions.

6. IORT Generation
---
It is proposed to have a common helper library to generate IORT for dom0/U.
Note: it is desired to have IORT generation code sharing between toolstack
and Xen.

a. For Dom0
 rid_deviceId_map can be used directly to generate dom0 IORT table.
 Exclusions of nodes is still open for suggestions.

b. For DomU
 Minimal structure is discussed in section 4. It will be further discussed
 in the context of PCI 

Re: [Xen-devel] [RFC] [Draft Design] ACPI/IORT Support in Xen.

2017-10-31 Thread Manish Jaggi



On 10/31/2017 5:03 AM, Goel, Sameer wrote:

On 10/12/2017 3:03 PM, Manish Jaggi wrote:

ACPI/IORT Support in Xen.
--

I had sent out patch series [0] to hide smmu from Dom0 IORT. Extending the scope
and including all that is required to support ACPI/IORT in Xen. Presenting for
review first _draft_ of design of ACPI/IORT support in Xen. Not complete though.

Discussed is the parsing and generation of IORT table for Dom0 and DomUs.
It is proposed that IORT be parsed and the information saved into a xen
data structure, say host_iort_struct, and reused by all xen subsystems
like ITS / SMMU etc.

Since this is the first draft, it is open to technical comments, modifications
and suggestions. Please be open and feel free to add any missing points / 
additions.

1. What is IORT. What are its components ?
2. Current Support in Xen
3. IORT for Dom0
4. IORT for DomU
5. Parsing of IORT in Xen
6. Generation of IORT
7. Future Work and TODOs

1. What is IORT. What are its components ?

IORT refers to Input Output remapping table. It is essentially used to find
information about the IO topology (PCIRC-SMMU-ITS) and relationships between
devices.

A general structure of IORT has nodes which have information about PCI RC,
SMMU, ITS and Platform devices. Using an IORT table relationship between
RID -> StreamID -> DeviceId can be obtained. More specifically which device is
behind which SMMU and which interrupt controller, this topology is described in
IORT Table.

RID is a requester ID in PCI context,
StreamID is the ID of the device in SMMU context,
DeviceID is the ID programmed in ITS.

For a non-pci device RID could be simply an ID.

Each iort_node contains an ID map array to translate from one ID into another.
IDmap Entry {input_range, output_range, output_node_ref, id_count}
This array is present in PCI RC node,SMMU node, Named component node etc
and can reference to a SMMU or ITS node.

2. Current Support of IORT
---
Currently Xen passes host IORT table to dom0 without any modifications.
For DomU no IORT table is passed.

3. IORT for Dom0
-
IORT for Dom0 is prepared by xen and it is fairly similar to the host iort.
However few nodes could be removed or modified. For instance
- host SMMU nodes should not be present
- ITS group nodes are same as host iort but, no stage2 mapping is done for them.
- platform nodes (named components) may be selectively present depending on the
case where xen is using some. This could be controlled by xen command line.
- More items : TODO

4. IORT for DomU
---
  

IORT for DomU is generated by the toolstack. IORT topology is different when
DomU supports device passthrough.

At a minimum domU IORT should include a single PCIRC and ITS Group.
Similar PCIRC can be added in DSDT.
Additional node can be added if platform device is assigned to domU.
No extra node should be required for PCI device pass-through.

It is proposed that the idrange of PCIRC and ITS group be constant for domUs.
In case if PCI PT,using a domctl toolstack can communicate
physical RID: virtual RID, deviceID: virtual deviceID to xen.

It is assumed that domU PCI Config access would be trapped in Xen. The RID at
which assigned device is enumerated would be the one provided by the domctl,
domctl_set_deviceid_mapping

TODO: device assign domctl i/f.

Note: This should suffice the virtual deviceID support pointed by Andre. [4]
We might not need this domctl if assign_device hypercall is extended to provide 
this information.

5. Parsing of IORT in Xen
--

I think a Linux like approach will solve the following use cases:
1. Identify the SMMU devices and initialize the devices as needed.
2. API function to setup SMMUs in response to a discovery notification from DOM0
- We will still need a path for non pcie devices.
- I agree with Andre that the use cases for the named nodes in IORT should 
be treated the same as PCIe RC devices.
3. The concept of fwnode is still valid as per 4.14 and we can try reuse most 
of the parsing code.

The idea is: parse once, use at multiple places.
- IORT creation for Dom0
- smmu init
- finding smmu for a deviceID when pci_assign_device is called by dom0


Manish, I looked at your old patch and had a couple of questions before I 
comment more on this design. From an initial
glance, it seems that you should be able to hide SMMUs by calling the already 
defined API functions in the iort.c implementation
(for most part :)).

Yes some of the parsing functions can be replaced with APIs.



I am wondering if we really need to keep a list of parsed nodes. Or which use 
case apart from hw dom IORT mandates this?
For all cases, I believe, where a mapping lookup of Devid-smmu-pcirc is
required.

IORT nodes can be saved in structures so that IORT table parsing can be done
once and is reused by all xen subsystems like ITS / SMMU etc, 

Re: [Xen-devel] [RFC] [Draft Design] ACPI/IORT Support in Xen.

2017-10-31 Thread Manish Jaggi



On 10/27/2017 7:35 PM, Andre Przywara wrote:

Hi,

Hey Andre,


On 25/10/17 09:22, Manish Jaggi wrote:


On 10/23/2017 7:27 PM, Andre Przywara wrote:

Hi Manish,

On 12/10/17 22:03, Manish Jaggi wrote:

ACPI/IORT Support in Xen.
--

I had sent out patch series [0] to hide smmu from Dom0 IORT. Extending
the scope
and including all that is required to support ACPI/IORT in Xen.
Presenting for review
first _draft_ of design of ACPI/IORT support in Xen. Not complete
though.

Discussed is the parsing and generation of IORT table for Dom0 and
DomUs.
It is proposed that IORT be parsed and the information in saved into xen
data-structure
say host_iort_struct and is reused by all xen subsystems like ITS / SMMU
etc.

Since this is first draft is open to technical comments, modifications
and suggestions. Please be open and feel free to add any missing points
/ additions.

1. What is IORT. What are its components ?
2. Current Support in Xen
3. IORT for Dom0
4. IORT for DomU
5. Parsing of IORT in Xen
6. Generation of IORT
7. Future Work and TODOs

1. What is IORT. What are its components ?

IORT refers to Input Output remapping table. It is essentially used
to find
information about the IO topology (PCIRC-SMMU-ITS) and relationships
between
devices.

A general structure of IORT is has nodes which have information about
PCI RC,
SMMU, ITS and Platform devices. Using an IORT table relationship between
RID -> StreamID -> DeviceId can be obtained. More specifically which
device is
behind which SMMU and which interrupt controller, this topology is
described in
IORT Table.

RID is a requester ID in PCI context,
StreamID is the ID of the device in SMMU context,
DeviceID is the ID programmed in ITS.

For a non-pci device RID could be simply an ID.

Each iort_node contains an ID map array to translate from one ID into
another.
IDmap Entry {input_range, output_range, output_node_ref, id_count}
This array is present in PCI RC node,SMMU node, Named component node etc
and can reference to a SMMU or ITS node.

2. Current Support of IORT
---
Currently Xen passes host IORT table to dom0 without any modifications.
For DomU no IORT table is passed.

3. IORT for Dom0
-
IORT for Dom0 is prepared by xen and it is fairly similar to the host
iort.
However few nodes could be removed or modified. For instance
- host SMMU nodes should not be present
- ITS group nodes are same as host iort but, no stage2 mapping is done
for them.

What do you mean with stage2 mapping?

Please ignore this line. Copy paste error. Read it as follows

- ITS group nodes are same as host iort.
(though I would modify the same as in next draft)


- platform nodes (named components) may be selectively present depending
on the case where xen is using some. This could be controlled by xen
command line.

Mmh, I am not so sure platform devices described in the IORT (those
which use MSIs!) are so much different from PCI devices here. My
understanding is those platform devices are network adapters, for
instance, for which Xen has no use.

ok.

So I would translate "Named Components" or "platform devices" as devices
just not using the PCIe bus (so no config space and no (S)BDF), but
being otherwise the same from an ITS or SMMU point of view.

Correct.

- More items : TODO

I think we agreed upon rewriting the IORT table instead of patching it?

yes. In fact if you look at my patch v2 on IORT SMMU hiding, it was
_rewriting_ most of Dom0 IORT and not patching it.

I was just after the wording above:
"IORT for Dom0 is prepared by xen and it is fairly similar to the host
iort. However few nodes could be removed removed or modified."
... which sounds a bit like you alter the h/w IORT.
It would be good to clarify this by explicitly mentioning the
parsing/generation cycle, as this is a fundamental design decision.

Sure will do that. Thanks for pointing that.

We can have a IRC discussion on this.

I think apart from rewriting, the other required tasks that are
handled in this epic:
- parse IORT and save in xen internal data structures
- common code to generate IORT for dom0/domU
- All xen code that parses IORT multiple times use now the xen internal
data structures.

Yes, that sounds about right.

:)



(I have explained this in this mail below)

So to some degree your statements are true, but when we rewrite the IORT
table without SMMUs (and possibly without other components like the
PMUs), it would be kind of a stretch to call it "fairly similar to the
host IORT". I think "based on the host IORT" would be more precise.

Yes. Based on host IORT is better,thanks.

4. IORT for DomU
-
IORT for DomU is generated by the toolstack. IORT topology is different
when DomU supports device passthrough.

Can you elaborate on that? Different compared to what? My understanding
i

Re: [Xen-devel] [RFC] [Draft Design] ACPI/IORT Support in Xen.

2017-10-25 Thread Manish Jaggi



On 10/23/2017 8:26 PM, Julien Grall wrote:

Hi,

On 23/10/17 14:57, Andre Przywara wrote:

On 12/10/17 22:03, Manish Jaggi wrote:

It is proposed that the idrange of PCIRC and ITS group be constant for
domUs.


"constant" is a bit confusing here. Maybe "arbitrary", "from scratch" or
"independent from the actual h/w"?


I don't think we should tie to anything here. IORT for DomU will get 
some input, it could be same as the host or something generated (not 
necessarily constant). That's an implementation detail and might be up
to the user.





In case if PCI PT,using a domctl toolstack can communicate
physical RID: virtual RID, deviceID: virtual deviceID to xen.

It is assumed that domU PCI Config access would be trapped in Xen. The
RID at which assigned device is enumerated would be the one provided
by the domctl, domctl_set_deviceid_mapping

TODO: device assign domctl i/f.
Note: This should suffice the virtual deviceID support pointed out by
Andre. [4]


Well, there's more to it. First thing: while I tried to include virtual
ITS deviceIDs to be different from physical ones, at the moment they
are fixed to being mapped 1:1 in the code.

So the first step would be to go over the ITS code and identify where
"devid" refers to a virtual deviceID and where to a physical one
(probably renaming them accordingly). Then we would need a function to
translate between the two. At the moment this would be a dummy function
(just return the input value). Later we would loop in the actual table.
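
A minimal sketch of that dummy translation (the function name is
hypothetical, not something present in the tree):

static inline u32 virt_to_phys_devid(u32 virt_devid)
{
    /* 1:1 for now; later this would look up the actual mapping table. */
    return virt_devid;
}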


We might not need this domctl if assign_device hypercall is extended to
provide this information.


Do we actually need a new interface or even extend the existing one?
If I got Julien correctly, the existing interface is just fine?


In the first place, I am not sure to understand why Domctl is 
mentioned in this document.

I have answered this in reply to Andre's mail. Please refer to that.
(Just avoiding duplication)
I can understand why you want to describe the information used for
DomU IORT. But it does not matter how this ties into the rest of
the passthrough work.



Passthrough could be PCI Device PT or platform device passthrough.


[...]



6. IORT Generation
---
There would be a common code to generate IORT table from 
iort_table_struct.


That sounds useful, but we would need to be careful with sharing code
between Xen and the tool stack. Has this actually been done before?


Yes, see libelf for instance. But I think there is a terminology 
problem here.


Skimming the rest of the e-mail I see: "populate a basic IORT in a 
buffer passed by toolstack (using a domctl : domctl_prepare_dom_iort)".
By sharing code, I meant creating a library that would be compiled in 
both the hypervisor and the toolstack.
It might need more work. I have answered this in reply to Andre's mail.
Please refer to that.


But as I said before, this is not the purpose now. The purpose is 
finally getting support of IORT in the hypervisor with the generation 
of the IORT for Dom0 fully separated from the parsing.


That's not the only purpose; I have described the tasks in reply to
Andre's mail. Please refer to that.

a. For Dom0
 the structure (iort_table_struct) be modified to remove smmu nodes
 and update id_mappings.
 PCIRC idmap -> output reference to ITS group.
 (RID -> DeviceID).

 TODO: Describe algo in the update_id_mapping function to map RID ->
 DeviceID, used in my earlier patch [3]


If the above approach works, this would become a simple list iteration,
creating PCI rc nodes with the appropriate pointer to the ITS nodes.


b. For DomU
 - iort_table_struct would have minimal 2 nodes (1 PCIRC and 1 ITS
   group)
 - populate a basic IORT in a buffer passed by toolstack (using a
   domctl: domctl_prepare_dom_iort)


I think we should reduce this to iterating the same data structure as
for Dom0. Each pass-through-ed PCI device would possibly create one
struct instance, and later on we do the same iteration as we do for
Dom0. If that proves to be simple enough, we might even live with the
code duplication between Xen and the toolstack.


I think you summarize quite what I have been saying in the previous 
thread. Thank you :).


Cheers,






Re: [Xen-devel] [RFC] [Draft Design] ACPI/IORT Support in Xen.

2017-10-25 Thread Manish Jaggi



On 10/23/2017 7:27 PM, Andre Przywara wrote:

Hi Manish,

On 12/10/17 22:03, Manish Jaggi wrote:

ACPI/IORT Support in Xen.
--

I had sent out patch series [0] to hide smmu from Dom0 IORT. Extending
the scope
and including all that is required to support ACPI/IORT in Xen.
Presenting for review
first _draft_ of design of ACPI/IORT support in Xen. Not complete though.

Discussed is the parsing and generation of IORT table for Dom0 and DomUs.
It is proposed that IORT be parsed and the information in saved into xen
data-structure
say host_iort_struct and is reused by all xen subsystems like ITS / SMMU
etc.

Since this is first draft is open to technical comments, modifications
and suggestions. Please be open and feel free to add any missing points
/ additions.

1. What is IORT. What are its components ?
2. Current Support in Xen
3. IORT for Dom0
4. IORT for DomU
5. Parsing of IORT in Xen
6. Generation of IORT
7. Future Work and TODOs

1. What is IORT. What are its components ?

IORT refers to Input Output remapping table. It is essentially used to find
information about the IO topology (PCIRC-SMMU-ITS) and relationships
between
devices.

A general structure of IORT is has nodes which have information about
PCI RC,
SMMU, ITS and Platform devices. Using an IORT table relationship between
RID -> StreamID -> DeviceId can be obtained. More specifically which
device is
behind which SMMU and which interrupt controller, this topology is
described in
IORT Table.

RID is a requester ID in PCI context,
StreamID is the ID of the device in SMMU context,
DeviceID is the ID programmed in ITS.

For a non-pci device RID could be simply an ID.

Each iort_node contains an ID map array to translate from one ID into
another.
IDmap Entry {input_range, output_range, output_node_ref, id_count}
This array is present in PCI RC node,SMMU node, Named component node etc
and can reference to a SMMU or ITS node.

2. Current Support of IORT
---
Currently Xen passes host IORT table to dom0 without any modifications.
For DomU no IORT table is passed.

3. IORT for Dom0
-
IORT for Dom0 is prepared by xen and it is fairly similar to the host iort.
However few nodes could be removed or modified. For instance
- host SMMU nodes should not be present
- ITS group nodes are same as host iort but, no stage2 mapping is done
for them.

What do you mean with stage2 mapping?

Please ignore this line. Copy paste error. Read it as follows

- ITS group nodes are same as host iort.
(though I would modify the same as in next draft)




- platform nodes (named components) may be selectively present depending
on the case where xen is using some. This could be controlled by xen command
line.
line.

Mmh, I am not so sure platform devices described in the IORT (those
which use MSIs!) are so much different from PCI devices here. My
understanding is those platform devices are network adapters, for
instance, for which Xen has no use.

ok.

So I would translate "Named Components" or "platform devices" as devices
just not using the PCIe bus (so no config space and no (S)BDF), but
being otherwise the same from an ITS or SMMU point of view.

Correct.

- More items : TODO

I think we agreed upon rewriting the IORT table instead of patching it?
yes. In fact if you look at my patch v2 on IORT SMMU hiding, it was 
_rewriting_ most of Dom0 IORT and not patching it.

We can have a IRC discussion on this.

I think apart from rewriting, the other required tasks that are
handled in this epic:

- parse IORT and save in xen internal data structures
- common code to generate IORT for dom0/domU
- All xen code that parses IORT multiple times use now the xen internal 
data structures.

(I have explained this in this mail below)

So to some degree your statements are true, but when we rewrite the IORT
table without SMMUs (and possibly without other components like the
PMUs), it would be kind of a stretch to call it "fairly similar to the
host IORT". I think "based on the host IORT" would be more precise.

Yes. Based on host IORT is better,thanks.



4. IORT for DomU
-
IORT for DomU is generated by the toolstack. IORT topology is different
when DomU supports device passthrough.

Can you elaborate on that? Different compared to what? My understanding
is that without device passthrough there would be no IORT in the first
place?

I was exploring the possibility of having virtual devices for DomU.
So if a virtual device is assigned to a guest, there needs to be some
mapping in IORT as well.

This virtual device can be on a PCI bus / or as a platform device.

Device Pass-through can be split into two parts
a. platform device passthrough (not on PCI bus)
b. PCI device PT

=> If we discount the possibility of a virtual device for domU and
platform device passthrough, then you are correct

Re: [Xen-devel] [RFC v2 5/7] acpi:arm64: Add support for parsing IORT table

2017-10-20 Thread Manish Jaggi



On 10/19/2017 8:30 PM, Goel, Sameer wrote:

On 10/10/2017 6:36 AM, Manish Jaggi wrote:

Hi Sameer,
On 9/21/2017 6:07 AM, Sameer Goel wrote:

Add support for parsing IORT table to initialize SMMU devices.
* The code for creating an SMMU device has been modified, so that the SMMU
device can be initialized.
* The NAMED NODE code has been commented out as this will need DOM0 kernel
support.
* ITS code has been included but it has not been tested.

Could you please refactor this patch into another set of two patches.
I am planning to rebase my IORT for Dom0 Hiding patch rework on this patch.

I will try to break this up. Lets discuss this a bit more next week.

Please have a look at the draft design. [1]
[1] https://www.mail-archive.com/xen-devel@lists.xen.org/msg125951.html

Thanks,
Manish

Signed-off-by: Sameer Goel <sg...@codeaurora.org>
---
   xen/arch/arm/setup.c   |   3 +
   xen/drivers/acpi/Makefile  |   1 +
   xen/drivers/acpi/arm/Makefile  |   1 +
   xen/drivers/acpi/arm/iort.c| 173 +
   xen/drivers/passthrough/arm/smmu.c |   1 +
   xen/include/acpi/acpi_iort.h   |  17 ++--
   xen/include/asm-arm/device.h   |   2 +
   xen/include/xen/acpi.h |  21 +
   xen/include/xen/pci.h  |   8 ++
   9 files changed, 146 insertions(+), 81 deletions(-)
   create mode 100644 xen/drivers/acpi/arm/Makefile

diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c
index 92f173b..4ba09b2 100644
--- a/xen/arch/arm/setup.c
+++ b/xen/arch/arm/setup.c
@@ -49,6 +49,7 @@
 #include 
 #include 
 #include 
+#include 

 struct bootinfo __initdata bootinfo;

@@ -796,6 +797,8 @@ void __init start_xen(unsigned long boot_phys_offset,

 tasklet_subsys_init();

+/* Parse the ACPI iort data */
+acpi_iort_init();

 xsm_dt_init();
   diff --git a/xen/drivers/acpi/Makefile b/xen/drivers/acpi/Makefile
index 444b11d..e7ffd82 100644
--- a/xen/drivers/acpi/Makefile
+++ b/xen/drivers/acpi/Makefile
@@ -1,5 +1,6 @@
   subdir-y += tables
   subdir-y += utilities
+subdir-$(CONFIG_ARM) += arm
   subdir-$(CONFIG_X86) += apei
 obj-bin-y += tables.init.o
diff --git a/xen/drivers/acpi/arm/Makefile b/xen/drivers/acpi/arm/Makefile
new file mode 100644
index 000..7c039bb
--- /dev/null
+++ b/xen/drivers/acpi/arm/Makefile
@@ -0,0 +1 @@
+obj-y += iort.o
diff --git a/xen/drivers/acpi/arm/iort.c b/xen/drivers/acpi/arm/iort.c
index 2e368a6..7f54062 100644
--- a/xen/drivers/acpi/arm/iort.c
+++ b/xen/drivers/acpi/arm/iort.c
@@ -14,17 +14,47 @@
* This file implements early detection/parsing of I/O mapping
* reported to OS through firmware via I/O Remapping Table (IORT)
* IORT document number: ARM DEN 0049A
+ *
+ * Based on Linux drivers/acpi/arm64/iort.c
+ * => commit ca78d3173cff3503bcd15723b049757f75762d15
+ *
+ * Xen modification:
+ * Sameer Goel <sg...@codeaurora.org>
+ * Copyright (C) 2017, The Linux Foundation, All rights reserved.
+ *
*/
   -#define pr_fmt(fmt)"ACPI: IORT: " fmt
-
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+
+/* Xen: Define compatibility functions */
+#define FW_BUG"[Firmware Bug]: "
+#define pr_err(fmt, ...) printk(XENLOG_ERR fmt, ## __VA_ARGS__)
+#define pr_warn(fmt, ...) printk(XENLOG_WARNING fmt, ## __VA_ARGS__)
+
+/* Alias to Xen allocation helpers */
+#define kfree xfree
+#define kmalloc(size, flags)_xmalloc(size, sizeof(void *))
+#define kzalloc(size, flags)_xzalloc(size, sizeof(void *))
+
+/* Redefine WARN macros */
+#undef WARN
+#undef WARN_ON
+#define WARN(condition, format...) ({  \
+   int __ret_warn_on = !!(condition);  \
+   if (unlikely(__ret_warn_on))\
+   printk(format); \
+   unlikely(__ret_warn_on);\
+})
+#define WARN_TAINT(cond, taint, format...) WARN(cond, format)
+#define WARN_ON(cond)  (!!cond)
 #define IORT_TYPE_MASK(type)   (1 << (type))
 #define IORT_MSI_TYPE          (1 << ACPI_IORT_NODE_ITS_GROUP)
@@ -256,6 +286,13 @@ static acpi_status iort_match_node_callback(struct 
acpi_iort_node *node,
   acpi_status status;
 if (node->type == ACPI_IORT_NODE_NAMED_COMPONENT) {
+    status = AE_NOT_IMPLEMENTED;
+    /*
+     * We need the namespace object name from dsdt to match the iort node,
+     * this will need additions to the kernel xen bus notifiers.
+     * So, disabling the named node code till a proposal is approved.
+     */
+#if 0
     struct acpi_buffer buf = { ACPI_ALLOCATE_BUFFER, NULL };
     struct acpi_device *adev = to_acpi_device_node(dev->fwnode);
     struct acpi_iort_named_component *ncomp;
@@ -275,11 +312,12 @@ static acpi_status iort_match_node_callback(struct 
acpi_iort_node *node,
  

Re: [Xen-devel] [PATCH v2 0/2] ARM: ACPI: IORT: Hide SMMU from hardware domain's IORT table

2017-10-12 Thread Manish Jaggi



On 10/12/2017 5:14 PM, Julien Grall wrote:



On 12/10/17 12:22, Manish Jaggi wrote:

Hi Julien,

Why do you omit parts of the mail where I have asked a question? Please
avoid skipping; that removes the context.


I believe I answered it just after, because you asked the same thing
twice. So maybe I dropped the context, but the answer was there...


For your convenience here the replicated answer.

"Why? The generation of IORT is fairly standalone.

And again, this was a suggestion to share in the future and an
expectation for this series. What I care about the most is for the
generation to be fully separated from the rest."


I raised a valid point and it was totally ignored and you asked me to 
explain the rationale on a later point.

So if you choose to ignore my first point, how can I put any point.


Well, maybe you should read the e-mail more carefully, because your
points have been addressed. If they are not, then please say so rather
than accusing the reviewers of not spending enough time on your series...


[...]

Now if you see both the codes are quite similar and there is 
redundancy in libxl and in xen code for preparing ACPI tables for 
dom0 and domU.
The point I am raising is quite clear: if all other tables like MADT,
XSDT, RSDP, GTDT etc. do not share common generation code with xen,
what is so special about IORT?
Either we move all generation into common code or keep redundancy
for IORT.


I hope I have shown the code and made the point quite clear.
Please provide a technical answer rather than a simple "Why".


Why do you still continue arguing on how this is going to interact 
with libxl when your only work now (as I stated in every single 
e-mail) is for Dom0.


If the generation is generic enough, it will require little code to 
interface. After all, you only need:

- information (e.g. DeviceID, MasterID...)
- buffer for writing the generated IORT

So now it is maybe time for you to suggest an interface we can discuss 
on.

Sure. A quick draft is shared on mailing list. [1]

[1] https://marc.info/?l=xen-devel&m=150784236208192&w=2


Cheers,






[Xen-devel] [RFC] [Draft Design] ACPI/IORT Support in Xen.

2017-10-12 Thread Manish Jaggi

ACPI/IORT Support in Xen.
--

I had sent out patch series [0] to hide smmu from Dom0 IORT. Extending
the scope and including all that is required to support ACPI/IORT in
Xen. Presenting for review first _draft_ of design of ACPI/IORT
support in Xen. Not complete though.

Discussed is the parsing and generation of IORT table for Dom0 and DomUs.
It is proposed that IORT be parsed and the information saved into a xen
data structure, say host_iort_struct, and reused by all xen subsystems
like ITS / SMMU etc.


Since this is the first draft, it is open to technical comments, modifications
and suggestions. Please be open and feel free to add any missing points 
/ additions.


1. What is IORT. What are its components ?
2. Current Support in Xen
3. IORT for Dom0
4. IORT for DomU
5. Parsing of IORT in Xen
6. Generation of IORT
7. Future Work and TODOs

1. What is IORT. What are its components ?

IORT refers to Input Output remapping table. It is essentially used to find
information about the IO topology (PCIRC-SMMU-ITS) and relationships between
devices.

A general structure of IORT has nodes which have information about
PCI RC, SMMU, ITS and Platform devices. Using an IORT table,
relationship between RID -> StreamID -> DeviceId can be obtained.
More specifically, which device is behind which SMMU and which
interrupt controller, this topology is described in IORT Table.

RID is a requester ID in PCI context,
StreamID is the ID of the device in SMMU context,
DeviceID is the ID programmed in ITS.

For a non-pci device RID could be simply an ID.

Each iort_node contains an ID map array to translate from one ID into 
another.

IDmap Entry {input_range, output_range, output_node_ref, id_count}
This array is present in PCI RC node,SMMU node, Named component node etc
and can reference to a SMMU or ITS node.

2. Current Support of IORT
---
Currently Xen passes host IORT table to dom0 without any modifications.
For DomU no IORT table is passed.

3. IORT for Dom0
-
IORT for Dom0 is prepared by xen and it is fairly similar to the host iort.
However few nodes could be removed or modified. For instance
- host SMMU nodes should not be present
- ITS group nodes are same as host iort but, no stage2 mapping is done
for them.
- platform nodes (named components) may be selectively present depending
on the case where xen is using some. This could be controlled by xen
command line.
- More items: TODO

4. IORT for DomU
-
IORT for DomU is generated by the toolstack. IORT topology is different when
DomU supports device passthrough.

At a minimum domU IORT should include a single PCIRC and ITS Group.
Similar PCIRC can be added in DSDT.
Additional node can be added if platform device is assigned to domU.
No extra node should be required for PCI device pass-through.

It is proposed that the idrange of PCIRC and ITS group be constant for 
domUs.

In case of PCI PT, using a domctl the toolstack can communicate
physical RID: virtual RID, deviceID: virtual deviceID to xen.

It is assumed that domU PCI Config access would be trapped in Xen. The
RID at which assigned device is enumerated would be the one provided
by the domctl, domctl_set_deviceid_mapping

TODO: device assign domctl i/f.

Note: This should suffice the virtual deviceID support pointed out by Andre. [4]
We might not need this domctl if assign_device hypercall is extended to 
provide this information.


5. Parsing of IORT in Xen
--
IORT nodes can be saved in structures so that IORT table parsing can be done
once and is reused by all xen subsystems like ITS / SMMU etc, domain 
creation.

Proposed are the structures to hold IORT information, very similar to ACPI
structures.

iort_id_map {
    range_t input_range;
    range_t output_range;
    void *output_reference;
    ...
}
=> output_reference points to an object of iort_node.

struct iort_node {
    struct list_head id_map;
    void *context;
    struct list_head list;
}
=> context could be a reference to acpi_iort_node.

struct iort_table_struct {
    struct list_head pci_rc_nodes;
    struct list_head smmu_nodes;
    struct list_head plat_devices;
    struct list_head its_group;
}

This structure is created at the point IORT table is parsed say from 
acpi_iort_init.
It is proposed to use this structure information in 
iort_init_platform_devices.

[2] [RFC v2 4/7] ACPI: arm: Support for IORT

6. IORT Generation
---
There would be a common code to generate IORT table from iort_table_struct.

a. For Dom0
the structure (iort_table_struct) be modified to remove smmu nodes
and update id_mappings.
PCIRC idmap -> output reference to ITS group.
(RID -> DeviceID).

TODO: Describe algo in the update_id_mapping function to map RID ->
DeviceID, used in my earlier patch [3]

b. For DomU
- iort_table_struct would have minimal 2 nodes (1 PCIRC 

Re: [Xen-devel] [PATCH v2 0/2] ARM: ACPI: IORT: Hide SMMU from hardware domain's IORT table

2017-10-12 Thread Manish Jaggi

Hi Julien,

Why do you omit parts of the mail where I have asked a question? Please
avoid skipping; that removes the context.
I raised a valid point and it was totally ignored and you asked me to 
explain the rationale on a later point.

So if you choose to ignore my first point, how can I put any point.

This is what I have asked


>> The ACPI tables for DomU are generated by the toolstack today. So I
>> don't see why we would change that to support IORT.
>>
>> However, you can have a file shared between the toolstack and Xen
>> that contains the generation of IORT.
>>
>> For instance, this is what we already do with libelf (see
>> common/libelf).


This will amount to adding a function make_iort in libxl__prepare_acpi,
which would use the common code that is also used to generate the dom0
IORT (domain_build.c).

Correct?

If we go by this logic, then libxl_prepare_acpi and domain_build.c 
should use a common code for all acpi tables.
- Are you suggesting we change that as well and make it part of common
code?


The code in domain_build.c and in libxl__prepare_acpi is very similar, 
see the code below.


static int prepare_acpi(struct domain *d, struct kernel_info *kinfo)
{
    d->arch.efi_acpi_table = alloc_xenheap_pages(order, 0);
    ...

    rc = acpi_create_fadt(d, tbl_add);
    if ( rc != 0 )
        return rc;

    rc = acpi_create_madt(d, tbl_add);
    if ( rc != 0 )
        return rc;

    rc = acpi_create_stao(d, tbl_add);
    if ( rc != 0 )
        return rc;

    rc = acpi_create_xsdt(d, tbl_add);
    if ( rc != 0 )
        return rc;

    rc = acpi_create_rsdp(d, tbl_add);
    if ( rc != 0 )
        return rc;

    ...
}

int libxl__prepare_acpi(libxl__gc *gc, libxl_domain_build_info *info,
                        struct xc_dom_image *dom)
{
    ...

    rc = libxl__allocate_acpi_tables(gc, info, dom, acpitables);
    if (rc)
        goto out;

    make_acpi_rsdp(gc, dom, acpitables);
    make_acpi_xsdt(gc, dom, acpitables);
    make_acpi_gtdt(gc, dom, acpitables);
    rc = make_acpi_madt(gc, dom, info, acpitables);
    if (rc)
        goto out;

    make_acpi_fadt(gc, dom, acpitables);
    make_acpi_dsdt(gc, dom, acpitables);

out:
    return rc;
}

As you can see, both pieces of code are quite similar, and there is
redundancy between the libxl and Xen code for preparing ACPI tables for
dom0 and domU.
The point I am raising is quite clear: if all the other tables (MADT,
XSDT, RSDP, GTDT, etc.) do not share generation code with Xen, what is so
special about IORT?
Either we move all table generation into common code, or we keep the
redundancy for IORT. A sketch of what such a shared file might look like
follows.


I hope I have shown the code and made the point quite clear.
Please provide a technical answer rather than a simple "Why".

Cheers!

Manish

On 10/12/2017 4:34 PM, Julien Grall wrote:

Hello,

On 12/10/17 07:11, Manish Jaggi wrote:

On 10/6/2017 7:54 PM, Julien Grall wrote:
I am not asking to write the DomU support, but at least have a full 
separation between the Parsing and Generation. So we could easily 
adapt and re-use the code when we get the DomU support.


I got your point, but as of today there is no code reuse for most of the
ACPI tables. So code reuse _only_ for IORT but not for the rest of ACPI is
not the correct approach.


Why? The generation of IORT is fairly standalone.

And again, this was a suggestion to share in the future and an
expectation for this series. What I care about the most is that the
generation be fully separated from the rest.




Also, this is part of the PCI passthrough flow, so that might also change
a few things.


But from the PoV of dom0 SMMU hiding, it is a different flow from the one
coupled with PCI PT.






I think 1) can be solved using this series as a base. I have quite some
comments ready for the patches; shall we follow this route?

2) obviously would change the game completely. We need to sit 
down and
design this properly. Probably this means that Xen parses the 
IORT and
builds internal representations of the mappings, 
Can you please add more detail on the internal representations of
the mappings?


What exactly do you want? This is likely going to be decided once
you have looked at the expected interaction between IORT and Xen.
More details, please, on this line: "Probably this means that Xen parses
the IORT and builds internal representations of the mappings,"


I think you have enough meat in this thread to come up with a 
proposition based on the feedback. Once you send it, we can have a 
discussion and find agreement.


[...]

The IORT for the hardware domain is just a specific case, as it is
based on pre-existing information. But because of removing nodes (e.g.
SMMU nodes and probably the PMU nodes), it is basically a full
re-write.


So I would consider fully separating the logic of generating the IORT
table from the host IORT table. By that I mean not browsing the host
IORT when generating "the host".



by "the host" you mean dom0 IORT  ?


yes.


Something on the lines

Re: [Xen-devel] [PATCH v2 0/2] ARM: ACPI: IORT: Hide SMMU from hardware domain's IORT table

2017-10-12 Thread Manish Jaggi



On 10/6/2017 7:54 PM, Julien Grall wrote:

Hello,

On 04/10/17 06:22, Manish Jaggi wrote:

On 10/4/2017 12:12 AM, Julien Grall wrote:

On 25/09/17 05:22, Manish Jaggi wrote:

On 9/22/2017 7:42 PM, Andre Przywara wrote:

Hi Manish,

On 11/09/17 22:33, mja...@caviumnetworks.com wrote:

From: Manish Jaggi <mja...@cavium.com>

The set is divided into two patches. The first one calculates the size of
the IORT, while the second one writes the IORT table itself.
It would be good if you could give a quick introduction *why* this 
set

is needed here (and introduce IORT to the casual reader).
In general some more high-level documentation on your functions 
would be
good, as it took me quite some time to understand what each 
function does.

ok, will add more documentation.

So my understanding is:
phase 1:
- go over each entry in each RC node
Rather than each entry (of which there could be a large number), I am
taking the complete range and checking it with the same logic.
If the ID range is a subset or a superset of an ID range in the SMMU, a
new ID range is created.


So if a pci_rc node has an ID map
{p_input_base, p_output_base, p_out_ref, p_count} and it has an output
reference to an SMMU node with ID map
{s_input_base, s_output_base, s_out_ref, s_count}, then based on s_count
and the s_input/p_output overlap a new ID map is created as
{p_input, s_output, s_out_ref, adjusted_count}


update_id_mapping function does that.
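
To make the arithmetic concrete, here is a hypothetical example (the
numbers are invented purely for illustration):

PCIRC map: {input 0x0000, output 0x0100, ref smmu, count 0x80}
SMMU map:  {input 0x0120, output 0x8000, ref its,  count 0x40}

The overlap in the SMMU input space is [0x0120, 0x0160), so the merged
map is {input 0x0020, output 0x8000, ref its, count 0x40}: RID 0x20 now
translates directly to DeviceID 0x8000.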

So I am following the same logic. We can chat over IRC / I can give 
a code walk-through ...


-   if that points to an SMMU node, go over each outgoing ITS 
entry and

find overlaps with this RC entry
- for each overlap create a new entry in a list with this RC
pointing to the ITS directly

phase 2, creating the new IORT
- go over each RC node
-   if that points to an ITS, copy through IORT entries
-   if that points to an SMMU, replace with the remapped entries
- go over each ITS node
-   copy through IORT entries

That's exactly what this patch does.

What are your comments on the current patch's approach to hiding SMMU nodes?
I have answered your comments; see below.


I am not sure I understand the 2 sentences above. What are they related to?


IMHO we can reuse most of the fixup code here.


That's your choice as long as it is properly documented and fits the 
end goal.





So I believe this would do the trick and you end up with an efficient
representation of the IORT without SMMUs - at least for RC nodes.

After some brainstorming with Julien we found two problems:
1) This only covers RC nodes, but not "named components" (platform
devices), which we will need. That should be fixable by removing the
hardcoded IORT node types in the code and treating NC nodes like 
RC nodes.
Yes, so first we can take this as a base, once this is ok, I can 
add support for named components.
2) Eventually we will need *virtual* deviceID support, for DomUs. 
Now we
I am a bit surprised that you answered the e-mail but didn't
provide any opinion on 2).

Apologies for that.

could start introducing that already, also doing some virtual mapping
for Dom0. The ITS code would then translate each virtual device ID 
that

Dom0 requests into a hardware device ID.
I agree that this means a lot more work, but we will need it anyway.


I am a bit surprised that you answered the e-mail but didn't
provide any opinion on 2).

Apologies for that. Sorry to surprise you twice :)


Damn, I moved the sentence but forgot to drop the original one.



IMHO it was a bit obvious for DomU, and I was waiting to hear what
others would say on this.

as (2) below.
Moreover, we need to discuss IORT generation for DomU:
- it could be done by the xl tools, or
- Xen could do it.


The ACPI tables for DomU are generated by the toolstack today. So I 
don't see why we would change that to support IORT.


However, you can have a file shared between the toolstack and Xen that
contains the generation of IORT.


For instance, this is what we already do with libelf (see
common/libelf).

This will amount to adding a make_iort function in libxl__prepare_acpi,
which would use the common code that is also used to generate the dom0
IORT (domain_build.c).

Correct?

If we go by this logic, then libxl__prepare_acpi and domain_build.c
should use common code for all ACPI tables.

- Are you suggesting we change that as well and make it part of common code?



I am not asking to write the DomU support, but at least have a full 
separation between the Parsing and Generation. So we could easily 
adapt and re-use the code when we get the DomU support.


I got your point, but as of today there is no code reuse for most of the
ACPI tables. So code reuse _only_ for IORT but not for the rest of ACPI is
not the correct approach.


Also, this is part of the PCI passthrough flow, so that might also change
a few things.


But from the PoV of dom0 SMMU hiding, it is a different flow from the one
coupled with PCI PT.






I think 1) can be solved using this series as a base. I have quite 
some

comments ready

Re: [Xen-devel] [PATCH v6 3/5] ARM: ITS: Deny hardware domain access to ITS

2017-10-10 Thread Manish Jaggi

Hi Julien,

On 10/10/2017 7:09 PM, Julien Grall wrote:

Hi Manish,

On 10/10/17 13:52, mja...@caviumnetworks.com wrote:

From: Manish Jaggi <mja...@cavium.com>

This patch extends the gicv3_iomem_deny_access functionality by adding
support for ITS region as well. Add function gicv3_its_deny_access.

Reviewed-by: Andre Przywara <andre.przyw...@arm.com>
Acked-by: Julien Grall <julien.gr...@arm.com>


Please state after "---" what you modified in a patch, and keep the tags,
so the reviewer can at least check whether they are still happy with it.


It is one of the reasons I like a changelog in each patch. It helps to
know what changed in a specific one, and it helps me decide whether I am
happy with you keeping my tag, avoiding yet another full review of the
patch.


In that case, it is fine to keep it.

In that case, please ack this patch.
Changelog:
I have added
- a check on return value for gicv3_its_deny_access(d);
- used its_data->size in place of GICV3_ITS_SIZE
- remove extra space in printk

Thanks
manish



Signed-off-by: Manish Jaggi <mja...@cavium.com> > ---
  xen/arch/arm/gic-v3-its.c| 22 ++
  xen/arch/arm/gic-v3.c|  4 
  xen/include/asm-arm/gic_v3_its.h |  9 +
  3 files changed, 35 insertions(+)

diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
index 3023ee5..bd94308 100644
--- a/xen/arch/arm/gic-v3-its.c
+++ b/xen/arch/arm/gic-v3-its.c
@@ -21,6 +21,7 @@
  #include 
  #include 
  #include 
+#include <xen/iocap.h>
  #include 
  #include 
  #include 
@@ -905,6 +906,27 @@ struct pending_irq 
*gicv3_assign_guest_event(struct domain *d,

  return pirq;
  }
  +int gicv3_its_deny_access(const struct domain *d)
+{
+int rc = 0;
+unsigned long mfn, nr;
+const struct host_its *its_data;
+
+list_for_each_entry( its_data, &host_its_list, entry )
+{
+mfn = paddr_to_pfn(its_data->addr);
+nr = PFN_UP(its_data->size);
+rc = iomem_deny_access(d, mfn, mfn + nr);
+if ( rc )
+{
+printk("iomem_deny_access failed for %lx:%lx \r\n", mfn, 
nr);

+break;
+}
+}
+
+return rc;
+}
+
  /*
   * Create the respective guest DT nodes from a list of host ITSes.
   * This copies the reg property, so the guest sees the ITS at the 
same address

diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
index 6f562f4..475e0d3 100644
--- a/xen/arch/arm/gic-v3.c
+++ b/xen/arch/arm/gic-v3.c
@@ -1308,6 +1308,10 @@ static int gicv3_iomem_deny_access(const 
struct domain *d)

  if ( rc )
  return rc;
  +rc = gicv3_its_deny_access(d);
+if ( rc )
+return rc;
+
  for ( i = 0; i < gicv3.rdist_count; i++ )
  {
  mfn = gicv3.rdist_regions[i].base >> PAGE_SHIFT;
diff --git a/xen/include/asm-arm/gic_v3_its.h 
b/xen/include/asm-arm/gic_v3_its.h

index 73d1fd1..73ee0ba 100644
--- a/xen/include/asm-arm/gic_v3_its.h
+++ b/xen/include/asm-arm/gic_v3_its.h
@@ -139,6 +139,10 @@ void gicv3_its_dt_init(const struct 
dt_device_node *node);

  #ifdef CONFIG_ACPI
  void gicv3_its_acpi_init(void);
  #endif
+
+/* Deny iomem access for its */
+int gicv3_its_deny_access(const struct domain *d);
+
  bool gicv3_its_host_has_its(void);
unsigned int vgic_v3_its_count(const struct domain *d);
@@ -206,6 +210,11 @@ static inline void gicv3_its_acpi_init(void)
  }
  #endif
  +static inline int gicv3_its_deny_access(const struct domain *d)
+{
+return 0;
+}
+
  static inline bool gicv3_its_host_has_its(void)
  {
  return false;



Cheers,






Re: [Xen-devel] [RFC v2 5/7] acpi:arm64: Add support for parsing IORT table

2017-10-10 Thread Manish Jaggi

Hi Sameer,
On 9/21/2017 6:07 AM, Sameer Goel wrote:

Add support for parsing IORT table to initialize SMMU devices.
* The code for creating an SMMU device has been modified, so that the SMMU
device can be initialized.
* The NAMED NODE code has been commented out as this will need DOM0 kernel
support.
* ITS code has been included but it has not been tested.

Could you please refactor this patch into another set of two patches?
I am planning to rebase my "IORT for Dom0 hiding" patch rework on this patch.
Thanks,
Manish

Signed-off-by: Sameer Goel 
---
  xen/arch/arm/setup.c   |   3 +
  xen/drivers/acpi/Makefile  |   1 +
  xen/drivers/acpi/arm/Makefile  |   1 +
  xen/drivers/acpi/arm/iort.c| 173 +
  xen/drivers/passthrough/arm/smmu.c |   1 +
  xen/include/acpi/acpi_iort.h   |  17 ++--
  xen/include/asm-arm/device.h   |   2 +
  xen/include/xen/acpi.h |  21 +
  xen/include/xen/pci.h  |   8 ++
  9 files changed, 146 insertions(+), 81 deletions(-)
  create mode 100644 xen/drivers/acpi/arm/Makefile

diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c
index 92f173b..4ba09b2 100644
--- a/xen/arch/arm/setup.c
+++ b/xen/arch/arm/setup.c
@@ -49,6 +49,7 @@
  #include 
  #include 
  #include 
+#include <acpi/acpi_iort.h>
  
  struct bootinfo __initdata bootinfo;
  
@@ -796,6 +797,8 @@ void __init start_xen(unsigned long boot_phys_offset,
  
  tasklet_subsys_init();
  
+/* Parse the ACPI iort data */

+acpi_iort_init();
  
  xsm_dt_init();
  
diff --git a/xen/drivers/acpi/Makefile b/xen/drivers/acpi/Makefile

index 444b11d..e7ffd82 100644
--- a/xen/drivers/acpi/Makefile
+++ b/xen/drivers/acpi/Makefile
@@ -1,5 +1,6 @@
  subdir-y += tables
  subdir-y += utilities
+subdir-$(CONFIG_ARM) += arm
  subdir-$(CONFIG_X86) += apei
  
  obj-bin-y += tables.init.o

diff --git a/xen/drivers/acpi/arm/Makefile b/xen/drivers/acpi/arm/Makefile
new file mode 100644
index 000..7c039bb
--- /dev/null
+++ b/xen/drivers/acpi/arm/Makefile
@@ -0,0 +1 @@
+obj-y += iort.o
diff --git a/xen/drivers/acpi/arm/iort.c b/xen/drivers/acpi/arm/iort.c
index 2e368a6..7f54062 100644
--- a/xen/drivers/acpi/arm/iort.c
+++ b/xen/drivers/acpi/arm/iort.c
@@ -14,17 +14,47 @@
   * This file implements early detection/parsing of I/O mapping
   * reported to OS through firmware via I/O Remapping Table (IORT)
   * IORT document number: ARM DEN 0049A
+ *
+ * Based on Linux drivers/acpi/arm64/iort.c
+ * => commit ca78d3173cff3503bcd15723b049757f75762d15
+ *
+ * Xen modification:
+ * Sameer Goel 
+ * Copyright (C) 2017, The Linux Foundation, All rights reserved.
+ *
   */
  
-#define pr_fmt(fmt)	"ACPI: IORT: " fmt

-
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+
+/* Xen: Define compatibility functions */
+#define FW_BUG "[Firmware Bug]: "
+#define pr_err(fmt, ...) printk(XENLOG_ERR fmt, ## __VA_ARGS__)
+#define pr_warn(fmt, ...) printk(XENLOG_WARNING fmt, ## __VA_ARGS__)
+
+/* Alias to Xen allocation helpers */
+#define kfree xfree
+#define kmalloc(size, flags) _xmalloc(size, sizeof(void *))
+#define kzalloc(size, flags) _xzalloc(size, sizeof(void *))
+
+/* Redefine WARN macros */
+#undef WARN
+#undef WARN_ON
+#define WARN(condition, format...) ({  \
+   int __ret_warn_on = !!(condition);  \
+   if (unlikely(__ret_warn_on))\
+   printk(format); \
+   unlikely(__ret_warn_on);\
+})
+#define WARN_TAINT(cond, taint, format...) WARN(cond, format)
+#define WARN_ON(cond)  (!!cond)
  
  #define IORT_TYPE_MASK(type)	(1 << (type))

  #define IORT_MSI_TYPE (1 << ACPI_IORT_NODE_ITS_GROUP)
@@ -256,6 +286,13 @@ static acpi_status iort_match_node_callback(struct 
acpi_iort_node *node,
acpi_status status;
  
  	if (node->type == ACPI_IORT_NODE_NAMED_COMPONENT) {

+   status = AE_NOT_IMPLEMENTED;
+/*
+ * We need the namespace object name from dsdt to match the iort node, this
+ * will need additions to the kernel xen bus notifiers.
+ * So, disabling the named node code till a proposal is approved.
+ */
+#if 0
struct acpi_buffer buf = { ACPI_ALLOCATE_BUFFER, NULL };
struct acpi_device *adev = to_acpi_device_node(dev->fwnode);
struct acpi_iort_named_component *ncomp;
@@ -275,11 +312,12 @@ static acpi_status iort_match_node_callback(struct 
acpi_iort_node *node,
status = !strcmp(ncomp->device_name, buf.pointer) ?
AE_OK : AE_NOT_FOUND;
acpi_os_free(buf.pointer);
+#endif
} else if (node->type == 

Re: [Xen-devel] [PATCH v5 4/5] ARM: Update Formula to compute MADT size using new callbacks in gic_hw_operations

2017-10-10 Thread Manish Jaggi



On 10/10/2017 3:44 PM, Julien Grall wrote:

Hi Manish,

On 10/10/17 07:16, mja...@caviumnetworks.com wrote:

From: Manish Jaggi <mja...@cavium.com>

estimate_acpi_efi_size needs to be updated to provide the correct size of
the hardware domain's MADT, which now adds ITS information as well.

This patch updates the formula to compute the extra MADT size, as per
GICv2/3, by calling gic_get_hwdom_extra_madt_size


Missing full stop.

Oh, I missed it.




Signed-off-by: Manish Jaggi <mja...@cavium.com>
---
  xen/arch/arm/domain_build.c |  7 +--
  xen/arch/arm/gic-v2.c   |  6 ++
  xen/arch/arm/gic-v3.c   | 19 +++
  xen/arch/arm/gic.c  | 12 
  xen/include/asm-arm/gic.h   |  3 +++
  5 files changed, 41 insertions(+), 6 deletions(-)

diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
index d6f9585..f17fcf1 100644
--- a/xen/arch/arm/domain_build.c
+++ b/xen/arch/arm/domain_build.c
@@ -1808,12 +1808,7 @@ static int estimate_acpi_efi_size(struct 
domain *d, struct kernel_info *kinfo)

  acpi_size = ROUNDUP(sizeof(struct acpi_table_fadt), 8);
  acpi_size += ROUNDUP(sizeof(struct acpi_table_stao), 8);
  -madt_size = sizeof(struct acpi_table_madt)
-+ sizeof(struct acpi_madt_generic_interrupt) * d->max_vcpus
-+ sizeof(struct acpi_madt_generic_distributor);
-if ( d->arch.vgic.version == GIC_V3 )
-madt_size += sizeof(struct acpi_madt_generic_redistributor)
- * d->arch.vgic.nr_regions;
+madt_size = gic_get_hwdom_madt_size(d);
  acpi_size += ROUNDUP(madt_size, 8);
addr = acpi_os_get_root_pointer();
diff --git a/xen/arch/arm/gic-v2.c b/xen/arch/arm/gic-v2.c
index cbe71a9..0123ea4 100644
--- a/xen/arch/arm/gic-v2.c
+++ b/xen/arch/arm/gic-v2.c
@@ -1012,6 +1012,11 @@ static int gicv2_iomem_deny_access(const 
struct domain *d)

  return iomem_deny_access(d, mfn, mfn + nr);
  }
  +static unsigned long gicv2_get_hwdom_extra_madt_size(const struct 
domain *d)

+{
+return 0;
+}
+
  #ifdef CONFIG_ACPI
  static int gicv2_make_hwdom_madt(const struct domain *d, u32 offset)
  {
@@ -1248,6 +1253,7 @@ const static struct gic_hw_operations gicv2_ops 
= {

  .read_apr= gicv2_read_apr,
  .make_hwdom_dt_node  = gicv2_make_hwdom_dt_node,
  .make_hwdom_madt = gicv2_make_hwdom_madt,
+.get_hwdom_extra_madt_size = gicv2_get_hwdom_extra_madt_size,
  .map_hwdom_extra_mappings = gicv2_map_hwdown_extra_mappings,
  .iomem_deny_access   = gicv2_iomem_deny_access,
  .do_LPI  = gicv2_do_LPI,
diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
index b3d605d..447998d 100644
--- a/xen/arch/arm/gic-v3.c
+++ b/xen/arch/arm/gic-v3.c
@@ -1406,6 +1406,19 @@ static int gicv3_make_hwdom_madt(const struct 
domain *d, u32 offset)

  return table_len;
  }
  +static unsigned long gicv3_get_hwdom_extra_madt_size(const struct 
domain *d)

+{
+unsigned long size;
+
+size  = sizeof(struct acpi_madt_generic_redistributor)
+* d->arch.vgic.nr_regions;


Here you align the * with struct. But below, you align with sizeof. 
Please stay consistent and always align with sizeof.



+
+size  += vgic_v3_its_count(d)
+* sizeof(struct acpi_madt_generic_translator);


Same here.

Could you please point me to the specific section of the Xen coding style
guidelines on indentation for lines over 80 characters, which I am not
following in this case? (An illustrative alignment sketch follows after
the diff below.)

+
+return size;
+}
+
  static int __init
  gic_acpi_parse_madt_cpu(struct acpi_subtable_header *header,
  const unsigned long end)
@@ -1597,6 +1610,11 @@ static int gicv3_make_hwdom_madt(const struct 
domain *d, u32 offset)

  {
  return 0;
  }
+
+static unsigned long gicv3_get_hwdom_extra_madt_size(const struct 
domain *d)

+{
+return 0;
+}
  #endif
/* Set up the GIC */
@@ -1698,6 +1716,7 @@ static const struct gic_hw_operations gicv3_ops 
= {

  .secondary_init  = gicv3_secondary_cpu_init,
  .make_hwdom_dt_node  = gicv3_make_hwdom_dt_node,
  .make_hwdom_madt = gicv3_make_hwdom_madt,
+.get_hwdom_extra_madt_size = gicv3_get_hwdom_extra_madt_size,
  .iomem_deny_access   = gicv3_iomem_deny_access,
  .do_LPI  = gicv3_do_LPI,
  };
diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
index 6c803bf..3c7b6df 100644
--- a/xen/arch/arm/gic.c
+++ b/xen/arch/arm/gic.c
@@ -851,6 +851,18 @@ int gic_make_hwdom_madt(const struct domain *d, 
u32 offset)

  return gic_hw_ops->make_hwdom_madt(d, offset);
  }
  +unsigned long gic_get_hwdom_madt_size(const struct domain *d)
+{
+unsigned long madt_size;
+
+madt_size = sizeof(struct acpi_table_madt)
++ sizeof(struct acpi_madt_generic_interrupt) * d->max_vcpus
++ sizeof(struct acpi_madt_generic_distributor)
++ gic_hw_ops->get_hwdom_extra_madt_size(d);
+
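
For reference, the alignment being asked for above is a continuation line
lined up under the sizeof (or the first operand), e.g. (illustrative only):

size  = sizeof(struct acpi_madt_generic_redistributor)
        * d->arch.vgic.nr_regions;

size += vgic_v3_its_count(d)
        * sizeof(struct acpi_madt_generic_translator);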

Re: [Xen-devel] [PATCH v4 5/5] ARM: ITS: Expose ITS in the MADT table

2017-10-05 Thread Manish Jaggi

Hi Andre,

On 10/3/2017 8:03 PM, Julien Grall wrote:

Hi Manish,

On 21/09/17 14:17, mja...@caviumnetworks.com wrote:

From: Manish Jaggi <mja...@cavium.com>

Add gicv3_its_make_hwdom_madt to update hwdom MADT ITS information.

Signed-off-by: Manish Jaggi <mja...@cavium.com>
---
  xen/arch/arm/gic-v3-its.c| 19 +++
  xen/arch/arm/gic-v3.c|  1 +
  xen/include/asm-arm/gic_v3_its.h |  8 
  3 files changed, 28 insertions(+)

diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
index 8697e5b..e3e7e92 100644
--- a/xen/arch/arm/gic-v3-its.c
+++ b/xen/arch/arm/gic-v3-its.c
@@ -1062,6 +1062,25 @@ void gicv3_its_acpi_init(void)
  acpi_table_parse_madt(ACPI_MADT_TYPE_GENERIC_TRANSLATOR,
  gicv3_its_acpi_probe, 0);
  }
+
+unsigned long gicv3_its_make_hwdom_madt(const struct domain *d, void 
*base_ptr)

+{
+unsigned long i = 0;
+void *fw_its;
+struct acpi_madt_generic_translator *hwdom_its;
+
+hwdom_its = base_ptr;
+
+for ( i = 0; i < vgic_v3_its_count(d); i++ )
+{
+fw_its = acpi_table_get_entry_madt(ACPI_MADT_TYPE_GENERIC_TRANSLATOR,
+   i);
+memcpy(hwdom_its, fw_its, sizeof(struct acpi_madt_generic_translator));
+hwdom_its++;
+}
+
+return sizeof(struct acpi_madt_generic_translator) * vgic_v3_its_count(d);
+}
  #endif
/*
diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
index 6e8d580..d29eea6 100644
--- a/xen/arch/arm/gic-v3.c
+++ b/xen/arch/arm/gic-v3.c
@@ -1403,6 +1403,7 @@ static int gicv3_make_hwdom_madt(const struct 
domain *d, u32 offset)

  table_len += size;
  }
  +table_len += gicv3_its_make_hwdom_madt(d, base_ptr + table_len);


Newline here please.

I will leave Andre to comment on this patch as he suggested the rework.
Could you please provide comments on this patch so that I can send an 
updated v5.


Cheers,


  return table_len;
  }
  diff --git a/xen/include/asm-arm/gic_v3_its.h 
b/xen/include/asm-arm/gic_v3_its.h

index 31fca66..fc37776 100644
--- a/xen/include/asm-arm/gic_v3_its.h
+++ b/xen/include/asm-arm/gic_v3_its.h
@@ -138,6 +138,8 @@ void gicv3_its_dt_init(const struct 
dt_device_node *node);

#ifdef CONFIG_ACPI
  void gicv3_its_acpi_init(void);
+unsigned long gicv3_its_make_hwdom_madt(const struct domain *d,
+void *base_ptr);
  #endif
/* Deny iomem access for its */
@@ -208,6 +210,12 @@ static inline void gicv3_its_dt_init(const 
struct dt_device_node *node)

  static inline void gicv3_its_acpi_init(void)
  {
  }
+
+static inline unsigned long gicv3_its_make_hwdom_madt(const struct 
domain *d,

+  void *base_ptr)
+{
+return 0;
+}
  #endif
static inline int gicv3_its_deny_access(const struct domain *d)








Re: [Xen-devel] [PATCH v4 4/5] ARM: Introduce get_hwdom_madt_size in gic_hw_operations

2017-10-03 Thread Manish Jaggi

Hi

On 10/3/2017 8:01 PM, Julien Grall wrote:

Hi,

On 21/09/17 14:17, mja...@caviumnetworks.com wrote:

From: Manish Jaggi <mja...@cavium.com>

estimate_acpi_efi_size needs to be updated to provide the correct size of
the hardware domain's MADT, which now adds ITS information as well.

Introducing gic_get_hwdom_madt_size.


I think the commit title is misleading, the main purpose of this patch 
is updating the formula to compute the MADT size for GICv3. Not 
introducing the callbacks.


But likely, you want two patches here:
- Patch #1 adding the callbacks
- Patch #2 updating the formula for GICv3

For this time, I would be OK with having only one patch, provided the
commit message is updated.



ok, will update..

Cheers,




___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v4 2/5] ARM: ITS: Populate host_its_list from ACPI MADT Table

2017-10-03 Thread Manish Jaggi

Hello Julien,

On 10/3/2017 7:17 PM, Julien Grall wrote:

Hi Manish,

On 21/09/17 14:17, mja...@caviumnetworks.com wrote:

From: Manish Jaggi <mja...@cavium.com>

Added gicv3_its_acpi_init to update host_its_list from MADT table.
For ACPI, host_its structure  stores dt_node as NULL.

Signed-off-by: Manish Jaggi <mja...@cavium.com>
---
  xen/arch/arm/gic-v3-its.c| 24 
  xen/arch/arm/gic-v3.c|  2 ++
  xen/include/asm-arm/gic_v3_its.h | 10 ++
  3 files changed, 36 insertions(+)

diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
index 0610991..0f662cf 100644
--- a/xen/arch/arm/gic-v3-its.c
+++ b/xen/arch/arm/gic-v3-its.c
@@ -18,6 +18,7 @@
   * along with this program; If not, see 
<http://www.gnu.org/licenses/>.

   */
  +#include <xen/acpi.h>
  #include 
  #include 
  #include 
@@ -1018,6 +1019,29 @@ void gicv3_its_dt_init(const struct 
dt_device_node *node)

  }
  }
  +#ifdef CONFIG_ACPI
+static int gicv3_its_acpi_probe(struct acpi_subtable_header *header,
+const unsigned long end)
+{
+struct acpi_madt_generic_translator *its;
+
+its = (struct acpi_madt_generic_translator *)header;
+if ( BAD_MADT_ENTRY(its, end) )
+return -EINVAL;
+
+add_to_host_its_list(its->base_address, GICV3_ITS_SIZE, NULL);


After the comment from Andre, I was expecting some rework to avoid storing
the size of the ITS in host_its. So what's the plan for that?
GICV3_ITS_SIZE is now 128K (previously 64K, see below), the same as what is
used in the Linux code; I think Andre mentioned that we need to add an
additional 64K.



+
+return 0;
+}
+
+void gicv3_its_acpi_init(void)
+{
+/* Parse ITS information */
+acpi_table_parse_madt(ACPI_MADT_TYPE_GENERIC_TRANSLATOR,
+gicv3_its_acpi_probe, 0);


The indentation still looks wrong here.

ah.. ok.



+}
+#endif
+
  /*
   * Local variables:
   * mode: C
diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
index f990eae..6f562f4 100644
--- a/xen/arch/arm/gic-v3.c
+++ b/xen/arch/arm/gic-v3.c
@@ -1567,6 +1567,8 @@ static void __init gicv3_acpi_init(void)
gicv3.rdist_stride = 0;
  +gicv3_its_acpi_init();
+
  /*
   * In ACPI, 0 is considered as the invalid address. However the 
rest

   * of the initialization rely on the invalid address to be
diff --git a/xen/include/asm-arm/gic_v3_its.h 
b/xen/include/asm-arm/gic_v3_its.h

index 1fac1c7..e1be33c 100644
--- a/xen/include/asm-arm/gic_v3_its.h
+++ b/xen/include/asm-arm/gic_v3_its.h
@@ -20,6 +20,7 @@
  #ifndef __ASM_ARM_ITS_H__
  #define __ASM_ARM_ITS_H__
  +#define GICV3_ITS_SIZE  SZ_128K


A less random place for this is close to the ITS_DOORBELL_OFFSET 
definition.

ok will do :)



  #define GITS_CTLR 0x000
  #define GITS_IIDR   0x004
  #define GITS_TYPER  0x008
@@ -135,6 +136,9 @@ extern struct list_head host_its_list;
  /* Parse the host DT and pick up all host ITSes. */
  void gicv3_its_dt_init(const struct dt_device_node *node);
  +#ifdef CONFIG_ACPI
+void gicv3_its_acpi_init(void);
+#endif
  bool gicv3_its_host_has_its(void);
unsigned int vgic_v3_its_count(const struct domain *d);
@@ -196,6 +200,12 @@ static inline void gicv3_its_dt_init(const 
struct dt_device_node *node)

  {
  }
  +#ifdef CONFIG_ACPI
+static inline void gicv3_its_acpi_init(void)
+{
+}
+#endif
+
  static inline bool gicv3_its_host_has_its(void)
  {
  return false;



Cheers,






Re: [Xen-devel] [PATCH v2 0/2] ARM: ACPI: IORT: Hide SMMU from hardware domain's IORT table

2017-10-03 Thread Manish Jaggi

Hello Julien,

On 10/4/2017 12:12 AM, Julien Grall wrote:

Hello,

On 25/09/17 05:22, Manish Jaggi wrote:

On 9/22/2017 7:42 PM, Andre Przywara wrote:

Hi Manish,

On 11/09/17 22:33, mja...@caviumnetworks.com wrote:

From: Manish Jaggi <mja...@cavium.com>

The set is divided into two patches. First one calculates the size 
of IORT

while second one writes the IORT table itself.

It would be good if you could give a quick introduction *why* this set
is needed here (and introduce IORT to the casual reader).
In general some more high-level documentation on your functions 
would be
good, as it took me quite some time to understand what each function 
does.

ok, will add more documentation.

So my understanding is:
phase 1:
- go over each entry in each RC node
Rather than each entry (of which there could be a large number), I am
taking the complete range and checking it with the same logic.
If the ID range is a subset or a superset of an ID range in the SMMU, a
new ID range is created.


So if a pci_rc node has an ID map
{p_input_base, p_output_base, p_out_ref, p_count} and it has an output
reference to an SMMU node with ID map
{s_input_base, s_output_base, s_out_ref, s_count}, then based on s_count
and the s_input/p_output overlap a new ID map is created as
{p_input, s_output, s_out_ref, adjusted_count}


update_id_mapping function does that.

So I am following the same logic. We can chat over IRC / I can give a 
code walk-through ...



-   if that points to an SMMU node, go over each outgoing ITS entry and
find overlaps with this RC entry
- for each overlap create a new entry in a list with this RC
pointing to the ITS directly

phase 2, creating the new IORT
- go over each RC node
-   if that points to an ITS, copy through IORT entries
-   if that points to an SMMU, replace with the remapped entries
- go over each ITS node
-   copy through IORT entries

That's exactly what this patch does.

What are your comments on the current patch's approach to hiding SMMU nodes?
I have answered your comments; see below.
IMHO we can reuse most of the fixup code here.


So I believe this would do the trick and you end up with an efficient
representation of the IORT without SMMUs - at least for RC nodes.

After some brainstorming with Julien we found two problems:
1) This only covers RC nodes, but not "named components" (platform
devices), which we will need. That should be fixable by removing the
hardcoded IORT node types in the code and treating NC nodes like RC 
nodes.
Yes, so first we can take this as a base, once this is ok, I can add 
support for named components.
2) Eventually we will need *virtual* deviceID support, for DomUs. 
Now we
I am a bit surprised that you answered the e-mail but didn't
provide any opinion on 2).

Apologies for that.

could start introducing that already, also doing some virtual mapping
for Dom0. The ITS code would then translate each virtual device ID that
Dom0 requests into a hardware device ID.
I agree that this means a lot more work, but we will need it anyway.


I am a bit surprised that you answered the e-mail but didn't
provide any opinion on 2).

Apologies for that. Sorry to surprise you twice :)

IMHO it was a bit obvious for DomU, and I was waiting to hear what others
would say on this.

as (2) below.
Moreover, we need to discuss IORT generation for DomU:
- it could be done by the xl tools, or
- Xen could do it.

Also, this is part of the PCI passthrough flow, so that might also change
a few things.


But from the PoV of dom0 SMMU hiding, it is a different flow from the one
coupled with PCI PT.






I think 1) can be solved using this series as a base. I have quite some
comments ready for the patches; shall we follow this route?

2) obviously would change the game completely. We need to sit down and
design this properly. Probably this means that Xen parses the IORT and
builds internal representations of the mappings, 
Can you please add more detail on the internal representations of the
mappings?
IIUC the information is already there in the ACPI tables; would it not add
the extra overhead of abstractions to maintain?
Enumeration of PCI devices would generate a PCI list, which would be
separate anyway.

which are consulted as
needed when passing through devices. The guest's (that would include
Dom0) IORT would then be generated completely from scratch.

I have a different opinion here: the dom0 IORT would in most cases be very
close to the host IORT, sans the SMMU nodes and a few platform devices.
Which platform devices to hide would probably depend on the Xen command
line.
For instance, for dom0 we would copy the ITS information, while for domU
it would have to be generated, so "from scratch" applies more to domU.
We could have common code for creating the IORT structure, but it would be
fairly complex, with a lot of abstractions and callbacks, so I suggest
that keeping the code simpler would be better.

I would like to hear your opinion on this. I will try to discuss the
feasibility of 2) with people at Connect. It would be 

Re: [Xen-devel] [PATCH v2 0/2] ARM: ACPI: IORT: Hide SMMU from hardware domain's IORT table

2017-09-24 Thread Manish Jaggi

Hi Andre,

On 9/22/2017 7:42 PM, Andre Przywara wrote:

Hi Manish,

On 11/09/17 22:33, mja...@caviumnetworks.com wrote:

From: Manish Jaggi <mja...@cavium.com>

The set is divided into two patches. The first one calculates the size of
the IORT, while the second one writes the IORT table itself.

It would be good if you could give a quick introduction *why* this set
is needed here (and introduce IORT to the casual reader).
In general some more high-level documentation on your functions would be
good, as it took me quite some time to understand what each function does.

ok, will add more documentation.

So my understanding is:
phase 1:
- go over each entry in each RC node
Rather than each entry (of which there could be a large number), I am
taking the complete range and checking it with the same logic.
If the ID range is a subset or a superset of an ID range in the SMMU, a
new ID range is created.


So if a pci_rc node has an ID map
{p_input_base, p_output_base, p_out_ref, p_count} and it has an output
reference to an SMMU node with ID map
{s_input_base, s_output_base, s_out_ref, s_count}, then based on s_count
and the s_input/p_output overlap a new ID map is created as
{p_input, s_output, s_out_ref, adjusted_count}


update_id_mapping function does that.

So I am following the same logic. We can chat over IRC / I can give a 
code walk-through ...



-   if that points to an SMMU node, go over each outgoing ITS entry and
find overlaps with this RC entry
- for each overlap create a new entry in a list with this RC
pointing to the ITS directly

phase 2, creating the new IORT
- go over each RC node
-   if that points to an ITS, copy through IORT entries
-   if that points to an SMMU, replace with the remapped entries
- go over each ITS node
-   copy through IORT entries

That's exactly what this patch does.

So I believe this would do the trick and you end up with an efficient
representation of the IORT without SMMUs - at least for RC nodes.

After some brainstorming with Julien we found two problems:
1) This only covers RC nodes, but not "named components" (platform
devices), which we will need. That should be fixable by removing the
hardcoded IORT node types in the code and treating NC nodes like RC nodes.
Yes, so first we can take this as a base, once this is ok, I can add 
support for named components.

2) Eventually we will need *virtual* deviceID support, for DomUs. Now we
could start introducing that already, also doing some virtual mapping
for Dom0. The ITS code would then translate each virtual device ID that
Dom0 requests into a hardware device ID.
I agree that this means a lot more work, but we will need it anyway.

I think 1) can be solved using this series as a base. I have quite some
comments ready for the patches; shall we follow this route?

2) obviously would change the game completely. We need to sit down and
design this properly. Probably this means that Xen parses the IORT and
builds internal representations of the mappings, which are consulted as
needed when passing through devices. The guest's (that would include
Dom0) IORT would then be generated completely from scratch.

I would like to hear your opinion on this. I will try to discuss the
feasibility of 2) with people at Connect. It would be good if we could
decide whether this is the way to go or we should use a solution based
on this series.

Cheers,
Andre.



patch 1: estimates the size of the hardware domain IORT table by parsing
all the PCIRC nodes and their idmaps, thereby calculating the size with
the SMMU nodes removed.

The hardware domain IORT table will have only ITS and PCIRC nodes, and the
PCIRC nodes' idmaps will have output references to ITS group nodes.

patch 2: The steps are (see the sketch after this list):
a. First, the ITS group nodes are written and their new offsets are saved
alongside the respective offsets from the firmware table.
This is required because, with the SMMU node hidden, the PCIRC maps still
point to the old output_reference offsets.

b. The PCIRC idmaps are parsed and a list of idmaps is created mapping
PCIRC idmaps -> ITS group nodes.
Each idmap is written by resolving the ITS offset from the map saved in
the previous step.
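
Illustratively, step (a)'s bookkeeping could be as small as the sketch
below (names are hypothetical and error handling is elided):

/* Remember where each firmware ITS group node lands in the hardware
 * domain's table, so that PCIRC ID maps can later resolve their
 * output_reference against the new layout. */
struct its_offset_map {
    u32 fw_offset;      /* offset of the ITS group in the firmware IORT */
    u32 hwdom_offset;   /* offset of its copy in the hwdom IORT */
};

static u32 resolve_its_offset(const struct its_offset_map *map,
                              unsigned int nr, u32 fw_offset)
{
    unsigned int i;

    for ( i = 0; i < nr; i++ )
        if ( map[i].fw_offset == fw_offset )
            return map[i].hwdom_offset;

    return 0;   /* callers treat 0 as "not found" */
}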

Changes wrt v1:
No assumption is made wrt format of IORT / hw support

Manish Jaggi (2):
   ARM: ACPI: IORT: Estimate the size of hardware domain IORT table
   ARM: ACPI: IORT: Write Hardware domain's IORT table

  xen/arch/arm/acpi/Makefile  |   1 +
  xen/arch/arm/acpi/iort.c| 414 
  xen/arch/arm/domain_build.c |  49 +-
  xen/include/asm-arm/acpi.h  |   1 +
  xen/include/asm-arm/iort.h  |  17 ++
  5 files changed, 481 insertions(+), 1 deletion(-)
  create mode 100644 xen/arch/arm/acpi/iort.c
  create mode 100644 xen/include/asm-arm/iort.h






Re: [Xen-devel] [RFC v2 0/7] SMMUv3 driver and the supporting framework

2017-09-20 Thread Manish Jaggi

Hi Sameer,

On 9/21/2017 6:07 AM, Sameer Goel wrote:

This change incorporates most of the review comments from [1] and adds the
proposed SMMUv3 driver.

List of changes:
- Introduce the iommu_fwspec implementation - No change from the last RFC
- IORT port from linux. The differences are as under:
* Modified the code for creating the SMMU devices. This code also
  initializes the discoverd SMMU devices.
* MSI code is left as is, but this code is untested.
* IORT node data parsing is delegated to the driver. Looking for 
comments
   on enabling the code in IORT driver. This will need a standard resource
   object. (Direct port from Linux or a new define for Xen?)
 * Assumptions on PCI IORT SMMU interaction. PCI assign device will call
   iort_iommu_configure to setup the streamids.Then it will call SMMU
   assign device with the right struct device argument.
- SMMUv3 port from Linux. The list of changes are as under:
* The Xen iommu_ops list is at parity with SMMUv2.
* There is generally no need for an IOMMU group, but have kept a dummy
  define for now.
* Have commented out the S1 translation code.
* MSI code is commented out.
* Page table ops are commented out as the driver shares the page tables
  with the cpu.
* The list of SMMU devices is maintained from the driver code.

Open questions:
- IORT regeneration for DOM0. I was hoping to get some update on [2].

Please see v2 patch set
https://lists.xen.org/archives/html/xen-devel/2017-09/msg01143.html

- We also need a notification framework to get the Named node information from 
DSDT.
- Should we port over code for non-shared page tables from the kernel or 
leverage [3].


[1] "[RFC 0/6] IORT support and introduce fwspec"
[2] "[Xen-devel] [RFC] [PATCH] arm-acpi: Hide SMMU from IORT for hardware 
domain"
[3] "Non-shared" IOMMU support on ARM"

Sameer Goel (7):
   passthrough/arm: Modify SMMU driver to use generic device definition
   arm64: Add definitions for fwnode_handle
   xen/passthrough/arm: Introduce iommu_fwspec
   ACPI: arm: Support for IORT
   acpi:arm64: Add support for parsing IORT table
   Add verbatim copy of arm-smmu-v3.c from Linux
   xen/iommu: smmu-v3: Add Xen specific code to enable the ported driver

  xen/arch/arm/setup.c  |3 +
  xen/drivers/acpi/Makefile |1 +
  xen/drivers/acpi/arm/Makefile |1 +
  xen/drivers/acpi/arm/iort.c   |  986 ++
  xen/drivers/passthrough/arm/Makefile  |1 +
  xen/drivers/passthrough/arm/iommu.c   |   66 +
  xen/drivers/passthrough/arm/smmu-v3.c | 3412 +
  xen/drivers/passthrough/arm/smmu.c|   13 +-
  xen/include/acpi/acpi_iort.h  |   61 +
  xen/include/asm-arm/device.h  |5 +
  xen/include/xen/acpi.h|   21 +
  xen/include/xen/fwnode.h  |   33 +
  xen/include/xen/iommu.h   |   29 +
  xen/include/xen/pci.h |8 +
  14 files changed, 4634 insertions(+), 6 deletions(-)
  create mode 100644 xen/drivers/acpi/arm/Makefile
  create mode 100644 xen/drivers/acpi/arm/iort.c
  create mode 100644 xen/drivers/passthrough/arm/smmu-v3.c
  create mode 100644 xen/include/acpi/acpi_iort.h
  create mode 100644 xen/include/xen/fwnode.h






Re: [Xen-devel] [PATCH v3 3/5] ARM: ITS: Deny hardware domain access to ITS

2017-09-20 Thread Manish Jaggi


On 9/7/2017 10:27 PM, Andre Przywara wrote:

Hi,

On 05/09/17 18:14, mja...@caviumnetworks.com wrote:

From: Manish Jaggi <mja...@cavium.com>

This patch extends the gicv3_iomem_deny_access functionality by adding
support for ITS region as well. Add function gicv3_its_deny_access.

Signed-off-by: Manish Jaggi <mja...@cavium.com>
---
  xen/arch/arm/gic-v3-its.c| 22 ++
  xen/arch/arm/gic-v3.c|  3 +++
  xen/include/asm-arm/gic_v3_its.h |  9 +
  3 files changed, 34 insertions(+)

diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
index 536b48d..0ab1466 100644
--- a/xen/arch/arm/gic-v3-its.c
+++ b/xen/arch/arm/gic-v3-its.c
@@ -20,6 +20,7 @@
  
  #include 

  #include 
+#include <xen/iocap.h>
  #include 
  #include 
  #include 
@@ -906,6 +907,27 @@ struct pending_irq *gicv3_assign_guest_event(struct domain 
*d,
  return pirq;
  }
  
+int gicv3_its_deny_access(const struct domain *d)

+{
+int rc = 0;
+unsigned long mfn, nr;
+const struct host_its *its_data;
+
+list_for_each_entry( its_data, &host_its_list, entry )
+{
+mfn = paddr_to_pfn(its_data->addr);
+nr = PFN_UP(ACPI_GICV3_ITS_MEM_SIZE);

Shouldn't this not only cover the ITS register frame, but also the
following 64K page containing the doorbell address? Otherwise we leave
the doorbell address open, which seems to be asking for trouble ...

Cheers,
Andre.

OK, I will fix the size in patch 2 as 128K, same as Linux.
If no other change is required in this patch, can you please ack it?



+rc = iomem_deny_access(d, mfn, mfn + nr);
+if ( rc )
+{
+printk( "iomem_deny_access failed for %lx:%lx \r\n", mfn, nr);
+break;
+}
+}
+
+return rc;
+}
+
  /*
   * Create the respective guest DT nodes from a list of host ITSes.
   * This copies the reg property, so the guest sees the ITS at the same address
diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
index 6f562f4..b3d605d 100644
--- a/xen/arch/arm/gic-v3.c
+++ b/xen/arch/arm/gic-v3.c
@@ -1308,6 +1308,9 @@ static int gicv3_iomem_deny_access(const struct domain *d)
  if ( rc )
  return rc;
  
+if ( gicv3_its_deny_access(d) )
+return rc;
+
  for ( i = 0; i < gicv3.rdist_count; i++ )
  {
  mfn = gicv3.rdist_regions[i].base >> PAGE_SHIFT;
diff --git a/xen/include/asm-arm/gic_v3_its.h b/xen/include/asm-arm/gic_v3_its.h
index 993819a..9cf18da 100644
--- a/xen/include/asm-arm/gic_v3_its.h
+++ b/xen/include/asm-arm/gic_v3_its.h
@@ -138,6 +138,10 @@ void gicv3_its_dt_init(const struct dt_device_node *node);
  #ifdef CONFIG_ACPI
  void gicv3_its_acpi_init(void);
  #endif
+
+/* Deny iomem access for its */
+int gicv3_its_deny_access(const struct domain *d);
+
  bool gicv3_its_host_has_its(void);
  
  unsigned int vgic_v3_its_count(const struct domain *d);

@@ -205,6 +209,11 @@ static inline void gicv3_its_acpi_init(void)
  }
  #endif
  
+static inline int gicv3_its_deny_access(const struct domain *d)

+{
+return 0;
+}
+
  static inline bool gicv3_its_host_has_its(void)
  {
  return false;






Re: [Xen-devel] Next Xen ARM community call - Wednesday 13th September 2017

2017-09-06 Thread Manish Jaggi

Hi All,

On 8/25/2017 4:12 PM, Julien Grall wrote:

Hi all,

I would suggest to have the next community call on Wednesday 13th 
September 2017 5pm BST. Does it sound good?


Do you have any specific topic you would like to discuss?
Will it be possible to have a small discussion on the PCI passthrough 
support / _implementation timelines_ with all concerned people?


-manish


Cheers,






Re: [Xen-devel] [PATCH 0/4] ARM: ACPI: ITS: Add ITS Support for ACPI hardware domain

2017-08-10 Thread Manish Jaggi



On 8/10/2017 6:44 PM, Julien Grall wrote:



On 08/10/2017 02:00 PM, Manish Jaggi wrote:

Hi Julien,

On 8/10/2017 5:43 PM, Julien Grall wrote:



On 10/08/17 13:00, Manish Jaggi wrote:

Hi Julien,

On 8/10/2017 4:58 PM, Julien Grall wrote:



On 10/08/17 12:21, Manish Jaggi wrote:

Hi Julien,

On 6/21/2017 6:53 PM, Julien Grall wrote:

Hi Manish,

On 21/06/17 02:01, Manish Jaggi wrote:
This patch series adds the support of ITS for ACPI hardware 
domain.

It is tested on the staging branch, which has the ITS v12 patchset by Andre.

I have tried to incorporate the review comments on the RFC v1/v2
patch.
The single patch in RFC is now split into 4 patches.


I will comment here rather than on each patches.



Patch1: ARM: ITS: Add translation_id to host_its
 Adds translation_id in host_its data structure, which is 
populated

from
 translation_id read from firmware MADT. This value is then 
programmed

into
 local MADT created for hardware domain in patch 4.


I don't see any reason to store a value that will only be used for
generating the MADT, which BTW is just a copy for the ITS. Instead we
should copy over the MADT entries.


There are two approaches,

If I use the standard API  acpi_table_parse_madt which would iterate
over ACPI_MADT_TYPE_GENERIC_TRANSLATOR entries, I have to 
maintain the
addr and translation_id in some data structure, to be filled 
later in

the hwdomain copy of madt generic translator.

If I don't use the standard API I have to add code to manually 
parse all

the translator entries.


There is a 3rd approach, which I suggested and which was ignored... The
ITS entries for Dom0 are exactly the same as the host entries.

Yes, and if not passed properly dom0 won't get device interrupts...

So you only need to do a verbatim copy of the entry...


Can you please check patch 4/2, the translation_id and address are
passed verbatim, the other values are reserved in
acpi_madt_generic_translator.


For ACPI, we took the approach to only rewrite what's necessary and 
give the rest to Dom0 as it is. If newer version of ACPI re-used 
those fields, then they will be copied over to Dom0. I don't 
consider it as an issue because the problem would be the same if 
those fields have an important meaning for the platform.

A few thoughts...
If we follow this approach, a few points need to be considered:
- If ACPI may use the reserved information later, it could be equally
important for dom0 and Xen, so it might be useful to keep the reserved
fields in Xen as well.


I already covered that in my previous e-mail.


Yes, I am just stating it again for xen.


- For platforms which use DT, translation_id is not required to be
stored in struct host_its; similarly, for platforms which use ACPI the
dt_node pointer might be of no use.

So we can have struct host_its carry a union of dt_device_node * for DT
and acpi_madt_generic_translator * for ACPI.

IMHO this could be an approach we can take.

struct host_its {
  struct list_head entry;
-const struct dt_device_node *dt_node;
+   union {
+const struct dt_device_node *dt_node;
+const struct acpi_madt_generic_translator *acpi_its_entry;
+};
 paddr_t addr;


What don't you get in my previous e-mail? A no is a no, full stop.

This is not helping.



Just do what we do in *_make_hwdom_madt. That will work here with no 
need of a union or anything else.

The patchset provides two features:
 (a) populating the host_its list from the ACPI tables, so ACPI Xen can use the ITS;
 (b) providing a MADT with ITS information to dom0.

What I am focusing on with the union is (a),
and the (b) code would be simpler if we use the union in (a).

You seem to be discounting (a) in the comments so far.

Why a union? As I have mentioned before, it will make the host_its
structure accommodate both the DT node and the
acpi_madt_generic_translator; both have the same purpose.

If one is valid, why not the other?

Please provide a technical reason for not doing it.


Even the DT code can be reworked to avoid storing the node.


we can have a separate patch for that.



Cheers,


Cheers!
Sending next rev shortly.

-manish



Re: [Xen-devel] [PATCH 0/4] ARM: ACPI: ITS: Add ITS Support for ACPI hardware domain

2017-08-10 Thread Manish Jaggi

Hi Julien,

On 8/10/2017 5:43 PM, Julien Grall wrote:



On 10/08/17 13:00, Manish Jaggi wrote:

Hi Julien,

On 8/10/2017 4:58 PM, Julien Grall wrote:



On 10/08/17 12:21, Manish Jaggi wrote:

Hi Julien,

On 6/21/2017 6:53 PM, Julien Grall wrote:

Hi Manish,

On 21/06/17 02:01, Manish Jaggi wrote:

This patch series adds the support of ITS for ACPI hardware domain.
It is tested on the staging branch, which has the ITS v12 patchset by Andre.

I have tried to incorporate the review comments on the RFC v1/v2
patch.
The single patch in RFC is now split into 4 patches.


I will comment here rather than on each patches.



Patch1: ARM: ITS: Add translation_id to host_its
 Adds translation_id in host_its data structure, which is populated
from
 translation_id read from firmware MADT. This value is then 
programmed

into
 local MADT created for hardware domain in patch 4.


I don't see any reason to store a value that will only be used for
generating the MADT, which BTW is just a copy for the ITS. Instead we
should copy over the MADT entries.


There are two approaches,

If I use the standard API  acpi_table_parse_madt which would iterate
over ACPI_MADT_TYPE_GENERIC_TRANSLATOR entries, I have to maintain the
addr and translation_id in some data structure, to be filled later in
the hwdomain copy of madt generic translator.

If I don't use the standard API I have to add code to manually 
parse all

the translator entries.


There is a 3rd approach, which I suggested and which was ignored... The
ITS entries for Dom0 are exactly the same as the host entries.

Yes, and if not passed properly dom0 won't get device interrupts...

So you only need to do a verbatim copy of the entry...


Can you please check patch 4/2, the translation_id and address are
passed verbatim, the other values are reserved in
acpi_madt_generic_translator.


For ACPI, we took the approach to only rewrite what's necessary and 
give the rest to Dom0 as it is. If newer version of ACPI re-used those 
fields, then they will be copied over to Dom0. I don't consider it as 
an issue because the problem would be the same if those fields have an 
important meaning for the platform.

A few thoughts...
If we follow this approach, a few points need to be considered:
- If ACPI may use the reserved information later, it could be equally
important for dom0 and Xen, so it might be useful to keep the reserved
fields in Xen as well.

- For platforms which use DT, translation_id is not required to be
stored in struct host_its; similarly, for platforms which use ACPI the
dt_node pointer might be of no use.

So we can have struct host_its carry a union of dt_device_node * for DT
and acpi_madt_generic_translator * for ACPI.

IMHO this could be an approach we can take.

struct host_its {
 struct list_head entry;
-const struct dt_device_node *dt_node;
+   union {
+const struct dt_device_node *dt_node;
+const struct acpi_madt_generic_translator *acpi_its_entry;
+};
paddr_t addr;






Could you please detail the 3rd approach and how it differs from
approach 2?


ACPI_MEMCPY(its, host_its, size);

Cheers,






Re: [Xen-devel] [PATCH 0/4] ARM: ACPI: ITS: Add ITS Support for ACPI hardware domain

2017-08-10 Thread Manish Jaggi

Hi Julien,

On 8/10/2017 4:58 PM, Julien Grall wrote:



On 10/08/17 12:21, Manish Jaggi wrote:

Hi Julien,

On 6/21/2017 6:53 PM, Julien Grall wrote:

Hi Manish,

On 21/06/17 02:01, Manish Jaggi wrote:

This patch series adds the support of ITS for ACPI hardware domain.
It is tested on the staging branch, which has the ITS v12 patchset by Andre.

I have tried to incorporate the review comments on the RFC v1/v2 
patch.

The single patch in RFC is now split into 4 patches.


I will comment here rather than on each patches.



Patch1: ARM: ITS: Add translation_id to host_its
 Adds translation_id in host_its data structure, which is populated 
from

 translation_id read from firmwar MADT. This value is then programmed
into
 local MADT created for hardware domain in patch 4.


I don't see any reason to store a value that will only be used for
generating the MADT, which BTW is just a copy for the ITS. Instead we
should copy over the MADT entries.


There are two approaches,

If I use the standard API  acpi_table_parse_madt which would iterate
over ACPI_MADT_TYPE_GENERIC_TRANSLATOR entries, I have to maintain the
addr and translation_id in some data structure, to be filled later in
the hwdomain copy of madt generic translator.

If I don't use the standard API I have to add code to manually parse all
the translator entries.


There is a 3rd approach, which I suggested and which was ignored... The
ITS entries for Dom0 are exactly the same as the host entries.

Yes, and if not passed properly dom0 won't get device interrupts...

So you only need to do a verbatim copy of the entry...

Can you please check patch 4/2, the translation_id and address are 
passed verbatim, the other values are reserved in 
acpi_madt_generic_translator.


Could you please detail the 3rd approach and how it differs from
approach 2?

Which of the two do you find cleaner?

This would also avoid introducing a fake ID for DT, as you currently
do in patch #2.


This can be avoided by storing translator_id only for acpi.

+static int add_to_host_its_list(u64 addr, u64 size,
+  u32 translation_id, const void *node)
+{
+struct host_its *its_data;
+its_data = xzalloc(struct host_its);
+
+if ( !its_data )
+return -1;
+
+if ( node )
+its_data->dt_node = node;
+else
+its_data->translation_id = translation_id;
+
+its_data->addr = addr;
+its_data->size = size;
+printk("GICv3: Found ITS @0x%lx\n", addr);
+
+list_add_tail(&its_data->entry, &host_its_list);
+
+return 0;

What do you think?


I don't want to see the translation_id stored for no use at all but 
creating the DOM0 ACPI tables. Is that clearer?

ok, I will remove it.


Cheers,






Re: [Xen-devel] [PATCH 0/4] ARM: ACPI: ITS: Add ITS Support for ACPI hardware domain

2017-08-10 Thread Manish Jaggi

Hi Julien,

On 6/21/2017 6:53 PM, Julien Grall wrote:

Hi Manish,

On 21/06/17 02:01, Manish Jaggi wrote:

This patch series adds the support of ITS for ACPI hardware domain.
It is tested on the staging branch, which has the ITS v12 patchset by Andre.

I have tried to incorporate the review comments on the RFC v1/v2 patch.
The single patch in RFC is now split into 4 patches.


I will comment here rather than on each patches.



Patch1: ARM: ITS: Add translation_id to host_its
 Adds translation_id in host_its data structure, which is populated from
 translation_id read from firmware MADT. This value is then programmed 
into

 local MADT created for hardware domain in patch 4.


I don't see any reason to store a value that will only be used for
generating the MADT, which BTW is just a copy for the ITS. Instead we
should copy over the MADT entries.



There are two approaches,

If I use the standard API  acpi_table_parse_madt which would iterate 
over ACPI_MADT_TYPE_GENERIC_TRANSLATOR entries, I have to maintain the 
addr and translation_id in some data structure, to be filled later in 
the hwdomain copy of madt generic translator.


If I don't use the standard API I have to add code to manually parse all 
the translator entries.

Which of the two do you find cleaner?
This would also avoid introducing a fake ID for DT, as you currently
do in patch #2.



This can be avoided by storing translator_id only for acpi.

+static int add_to_host_its_list(u64 addr, u64 size,
+  u32 translation_id, const void *node)
+{
+struct host_its *its_data;
+its_data = xzalloc(struct host_its);
+
+if ( !its_data )
+return -1;
+
+if ( node )
+its_data->dt_node = node;
+else
+its_data->translation_id = translation_id;
+
+its_data->addr = addr;
+its_data->size = size;
+printk("GICv3: Found ITS @0x%lx\n", addr);
+
+list_add_tail(&its_data->entry, &host_its_list);
+
+return 0;

What do you think?


Patch2: ARM: ITS: ACPI: Introduce gicv3_its_acpi_init
 Introduces gicv3_its_acpi_init, which calls add_to_host_its_list,
 a common function also called from the _dt variant.


Just reading the description, there is a call for splitting this 
patch... Looking at the code, you mix code movement and code addition.


Have a look at [1] to see how to break patches.


Yes, I will break patches 2 and 4 into multiple patches.


Patch3: ARM: ITS: Deny hardware domain access to its
 Extends gicv3_iomem_deny_access to include the ITS regions as well

Patch4: ARM: ACPI: Add ITS to hardware domain MADT
 This patch adds ITS information in hardware domain's MADT table.
 Also this patch introduces .get_hwdom_madt_size in gic_hw_operations,
 to return the complete size of MADT table for hardware domain.


Same here.

Yes.





Manish Jaggi (4):
  ARM: ITS: Add translation_id to host_its
  ARM: ITS: ACPI: Introduce gicv3_its_acpi_init
  ARM: ITS: Deny hardware domain access to its
  ARM: ACPI: Add ITS to hardware domain MADT

 xen/arch/arm/domain_build.c  |   7 +--
 xen/arch/arm/gic-v2.c|   6 +++
 xen/arch/arm/gic-v3-its.c| 102 +++

 xen/arch/arm/gic-v3.c|  31 
 xen/arch/arm/gic.c   |  11 +
 xen/include/asm-arm/gic.h|   3 ++
 xen/include/asm-arm/gic_v3_its.h |  36 ++
 7 files changed, 180 insertions(+), 16 deletions(-)



Cheers,

[1] 
https://wiki.xenproject.org/wiki/Submitting_Xen_Project_Patches#Making_good_patches





___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] ARM: SMMUv3 support

2017-07-31 Thread Manish Jaggi



On 6/13/2017 10:19 AM, Manish Jaggi wrote:



On 3/29/2017 5:30 AM, Goel, Sameer wrote:

Sure, I will try to post something soon.

Hi Sameer,
Are you still working on SMMUv3? Can you please post patches.


Hi Sameer,
Could you please post the RFC patches for SMMUv3? I can provide feedback 
by testing on the ThunderX platform.


Thanks
manish

Thanks
Manish

Thanks,
Sameer

On 3/27/2017 11:03 PM, Vijay Kilari wrote:
On Mon, Mar 27, 2017 at 10:00 PM, Goel, Sameer 
<sg...@codeaurora.org> wrote:

Hi,
  I am working on adding this support. The work is in initial 
stages and will target ACPI systems to start with. Do you have a 
specific requirement? Or even better: want to help with DT testing 
? :)

Thanks Sameer. I don't have any specific requirement. I am also
looking with ACPI support.
Please share your RFC patches so that I can test on our platform.


Thanks,
Sameer

On 3/20/2017 11:58 PM, Vijay Kilari wrote:

Hi,

  Is there any effort put by anyone to get SMMUv3 support in 
Xen for ARM64?

Would be glad to know.

Regards
Vijay

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


--
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora 
Forum,

a Linux Foundation Collaborative Project.



___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel





Re: [Xen-devel] Xen 4.10 Development Update

2017-07-20 Thread Manish Jaggi

Hi Julien,

On Mon, Jul 17, 2017 at 02:26:22PM +0100, Julien Grall wrote:

This email only tracks big items for xen.git tree. Please reply for items you
would like to see in 4.10 so that people have an idea what is going on and
prioritise accordingly.

You're welcome to provide description and use cases of the feature you're
working on.

= Timeline =

We now adopt a fixed cut-off date scheme. We will release twice a
year. The upcoming 4.10 timeline is as follows:

* Last posting date: September 15th, 2017
* Hard code freeze: September 29th, 2017
* RC1: TBD
* Release: December 2, 2017

Note that we don't have a freeze exception scheme anymore. All patches
that wish to go into 4.10 must be posted no later than the last posting
date. All patches posted after that date will be automatically queued
into the next release.

RCs will be arranged immediately after freeze.

We recently introduced a jira instance to track all the tasks (not only big)
for the project. See: https://xenproject.atlassian.net/projects/XEN/issues.

Most of the tasks tracked by this e-mail also have a corresponding jira task
referred by XEN-N.

I have started to include the version number of series associated to each
feature. Can each owner send an update on the version number if the series
was posted upstream?

= Projects =

== Hypervisor ==

*  Per-cpu tasklet
   -  XEN-28
   -  Konrad Rzeszutek Wilk

*  Add support of rcu_idle_{enter,exit}
   -  XEN-27
   -  Dario Faggioli

=== x86 ===

I am working on XEN-70 and have already posted an RFC [1].

Also, can you please add a Jira issue for the ITS ACPI support v2 
patches [2],
which I have already sent and for which I am working on the next revision.

[1] https://www.mail-archive.com/xen-devel@lists.xen.org/msg110269.html
[2] https://www.mail-archive.com/xen-devel@lists.xen.org/msg111342.html




___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel





Re: [Xen-devel] Notes from PCI Passthrough design discussion at Xen Summit

2017-07-20 Thread Manish Jaggi

Hi Roger,

On 7/20/2017 3:59 PM, Roger Pau Monné wrote:

On Thu, Jul 20, 2017 at 03:02:19PM +0530, Manish Jaggi wrote:

Hi Roger,

On 7/20/2017 1:54 PM, Roger Pau Monné wrote:

On Thu, Jul 20, 2017 at 09:24:36AM +0530, Manish Jaggi wrote:

Hi Punit,

On 7/19/2017 8:11 PM, Punit Agrawal wrote:

I took some notes for the PCI Passthrough design discussion at Xen
Summit. Due to the wide range of topics covered, the notes got sparser
towards the end of the session. I've tried to attribute names against
comments but have very likely got things mixed up. Apologies in advance.

Was curious if any discussions happened on the RC Emu (config space
emulation) as per slide 18
https://schd.ws/hosted_files/xendeveloperanddesignsummit2017/76/slides.pdf

Part of this is already posted on the list (ATM for x86 only) but the
PCI specification (and therefore the config space emulation) is not
tied to any arch:

https://lists.xenproject.org/archives/html/xen-devel/2017-06/msg03698.html

 From the summary, I have a question on
"
  - Roger: Registering config space with Xen before device discovery
   will allow the hypervisor to set access traps for certain
  functionality as appropriate"

Will the traps do emulation or something else?

Have you read the series?

What else could the traps do? I'm not sure I understand the question.


  Is the config space emulation only for DomU, or is it for Dom0 as well?

Again, have you read the series? This is explained in the cover letter
(0/9).

On x86 this is initially for Dom0 only, DomU will continue to use QEMU
until the implementation inside the hypervisor (vPCI) is complete
enough to handle DomU securely.


Slide 18 shows it only for DomU?

ARM folks believe this is not needed for Dom0 in the ARM case. I don't
have an opinion; I know it's certainly mandatory for x86 PVH Dom0.

Julien clarified slide 18.

Roger.



___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] Notes from PCI Passthrough design discussion at Xen Summit

2017-07-20 Thread Manish Jaggi

HI Julien,

On 7/20/2017 4:11 PM, Julien Grall wrote:



On 20/07/17 10:32, Manish Jaggi wrote:

Hi Roger,

On 7/20/2017 1:54 PM, Roger Pau Monné wrote:

On Thu, Jul 20, 2017 at 09:24:36AM +0530, Manish Jaggi wrote:

Hi Punit,

On 7/19/2017 8:11 PM, Punit Agrawal wrote:

I took some notes for the PCI Passthrough design discussion at Xen
Summit. Due to the wide range of topics covered, the notes got 
sparser

towards the end of the session. I've tried to attribute names against
comments but have very likely got things mixed up. Apologies in
advance.

Was curious if any discussions happened on the RC Emu (config space
emulation) as per slide 18
https://schd.ws/hosted_files/xendeveloperanddesignsummit2017/76/slides.pdf 




Part of this is already posted on the list (ATM for x86 only) but the
PCI specification (and therefore the config space emulation) is not
tied to any arch:

https://lists.xenproject.org/archives/html/xen-devel/2017-06/msg03698.html 




From the summary, I have a question on
"
 - Roger: Registering config space with Xen before device discovery
  will allow the hypervisor to set access traps for certain
 functionality as appropriate"

Will the traps do emulation or something else?
 Is the config space emulation only for DomU, or is it for Dom0 as well?
Slide 18 shows it only for DomU?


My slides are not meant to be read without the talk. In this 
particular case, this is only explaining how passthrough will work for 
DomU.



Thanks for the clarification.
Ah OK, the single slide created confusion. It would have been nice to add 
one more describing Dom0 config access. I will wait for the video 
to get posted.
Roger's series is at the moment focusing on emulating a fully 
ECAM-compliant hostbridge for the hardware domain. This is because Xen and 
the hardware domain should not access the configuration space at the 
same time. 

Yes, as discussed on this topic on the list a few weeks back.
We may also perform some tasks (i.e. MSI mapping, memory mapping) or 
sanitizing when the configuration space is updated by the hardware 
domain.
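
As a rough sketch of what such a trapped write could do (every name below
is hypothetical, not taken from Roger's series):

/* Hypothetical handler for a trapped hardware domain config-space write. */
static void hwdom_cfg_write(pci_sbdf_t sbdf, unsigned int reg,
                            unsigned int size, uint32_t val)
{
    if ( is_msi_cap(sbdf, reg) )          /* hypothetical helper */
        map_hwdom_msi(sbdf, val);         /* MSI mapping */
    else if ( is_bar_reg(sbdf, reg) )     /* hypothetical helper */
        map_hwdom_bar(sbdf, reg, val);    /* memory mapping */

    /* Forward the (possibly sanitized) write to the real device. */
    pci_conf_write(sbdf, reg, size, val);
}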


Cheers,




___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] Notes from PCI Passthrough design discussion at Xen Summit

2017-07-20 Thread Manish Jaggi

Hi Roger,

On 7/20/2017 1:54 PM, Roger Pau Monné wrote:

On Thu, Jul 20, 2017 at 09:24:36AM +0530, Manish Jaggi wrote:

Hi Punit,

On 7/19/2017 8:11 PM, Punit Agrawal wrote:

I took some notes for the PCI Passthrough design discussion at Xen
Summit. Due to the wide range of topics covered, the notes got sparser
towards the end of the session. I've tried to attribute names against
comments but have very likely got things mixed up. Apologies in advance.

Was curious if any discussions happened on the RC Emu (config space
emulation) as per slide 18
https://schd.ws/hosted_files/xendeveloperanddesignsummit2017/76/slides.pdf

Part of this is already posted on the list (ATM for x86 only) but the
PCI specification (and therefore the config space emulation) is not
tied to any arch:

https://lists.xenproject.org/archives/html/xen-devel/2017-06/msg03698.html

From the summary, I have a question on
"
 - Roger: Registering config space with Xen before device discovery
  will allow the hypervisor to set access traps for certain
 functionality as appropriate"

Will the traps do emulation or something else?
 Is the config space emulation only for DomU, or is it for Dom0 as well?
Slide 18 shows it only for DomU?

-manish


Roger.



___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] Notes from PCI Passthrough design discussion at Xen Summit

2017-07-19 Thread Manish Jaggi

Hi Punit,

On 7/19/2017 8:11 PM, Punit Agrawal wrote:

I took some notes for the PCI Passthrough design discussion at Xen
Summit. Due to the wide range of topics covered, the notes got sparser
towards the end of the session. I've tried to attribute names against
comments but have very likely got things mixed up. Apologies in advance.
Was curious if any discussions happened on the RC Emu (config space 
emulation) as per slide 18

https://schd.ws/hosted_files/xendeveloperanddesignsummit2017/76/slides.pdf

Although the session was well attended, some of the more active
discussions involved - Julien Grall, Stefano Stabellini, Roger Pau
Monné, Jan Beulich, Vikram Sethi. I'm sure I am missing some folks here.

Please do point out any mistakes I've made for the audience's benefit.

* Discovery of PCI hostbridges
   - Dom0 will be responsible for scanning the ECAM for devices and
 register them with Xen. This approach is chosen due to the variety of
 non-standard PCI controllers on ARM platforms and the desire to
 not duplicate driver code between Linux and Xen.
   - Jan, Roger: Bus scan needs to happen before device discovery,
 otherwise there is a small window where Xen doesn't know which host bridge
 the device is registered on (as it'll likely only refer to the
 segment number).
   - Roger: Registering config space with Xen before device discovery
 will allow the hypervisor to set access traps for certain
 functionality as appropriate.
   - Jan: Xen and Dom0 have to agree on the PCI segment number mapping
 to host bridges. This is so that for future calls, Dom0 and
 hypervisor can communicate using sBDF without ambiguity.
   - Julien: Dom0 will register the config space address and segment
 number. mcfg_add will be used to pass the segment to Xen (a
 registration sketch follows this section).
   - PCI segment - it's purely a software construct to identify
 different host bridges.
   - Some discussion on whether boot devices need to be on
 Segment 0. Technically, MCFG is only required to describe Segment
 0 - other host bridges can be described in AML.
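
For reference, x86 Dom0 already reports ECAM (MCFG) regions to Xen through
a physdev op; the registration could look roughly like this sketch (error
handling omitted; whether ARM reuses this exact interface was left open in
the discussion):

struct physdev_pci_mcfg_reserved r = {
    .address   = ecam_base,   /* physical base of the ECAM window */
    .segment   = seg,         /* segment number agreed with Xen */
    .start_bus = 0,
    .end_bus   = 255,
};

rc = HYPERVISOR_physdev_op(PHYSDEVOP_pci_mcfg_reserved, &r);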

* Configuration accesses for non-ecam compliant host bridge
   - Julien proposed these to be forwarded to Dom0 for handling.
   - Audience: What kind of non-compliance are we talking about? If
 they are simple, can they be implemented in Xen in a few lines of
 code?
   - A few different types
 - restrictions on access size, e.g., only certain sizes supported
 - register multiplexing via a window; similar to legacy x86 PCI
   access mechanism
 - ECAM compliant but with special casing for different devices

* Support on 32bit platforms
   - Is there enough address space to map the ECAM into Dom0? The maximum
 ECAM size is 256MB (see the arithmetic sketch below).
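
The 256MB figure follows directly from the ECAM layout, which gives every
PCI function 4KB of config space:

/* Per PCI segment: 256 buses x 32 devices x 8 functions x 4KB each. */
#define ECAM_FN_SIZE    4096UL
#define ECAM_MAX_SIZE   (256UL * 32 * 8 * ECAM_FN_SIZE) /* 0x10000000 = 256MB */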

* PCI ACS support
   - Vikram: Xen needs to be aware of the PCI device topology to
 correctly setup device groups for passthrough
   - Jan: Roger: IIRC, Xen is already aware of the device topology
 though it doesn't use ACS to work out which devices need to be
 passed to a guest as a group.
   - Stefano: There was support in xend (previous Xen toolstack) but the
 functionality has not yet been ported to libxl.

* Implementation milestones
   - Julien provided a summary of breakdown
 - M0 - design document, currently under discussion on xen-devel
 - M1 - PCI support in Xen
   - Xen aware of PCI devices (via Dom0 registration)
 - M2 - Guest PCIe passthrough
   - Julien: Some complexity in dealing with Legacy interrupts as they can 
be shared.
   - Roger: MSIs mandatory for PCIe. So legacy interrupts can be
 tackled at a later stage.
 - M3 - testing
   - fuzzing. Jan: If implemented it'll be better than what x86
 currently have.



___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [PATCH] ARM: SMMUv2: Add compatible match entry for cavium smmuv2

2017-06-20 Thread Manish Jaggi
This patch adds a cavium,smmu-v2 compatible match entry in the SMMU driver

Signed-off-by: Manish Jaggi <mja...@cavium.com>
---
 xen/drivers/passthrough/arm/smmu.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/xen/drivers/passthrough/arm/smmu.c 
b/xen/drivers/passthrough/arm/smmu.c
index 1082fcf..887f874 100644
--- a/xen/drivers/passthrough/arm/smmu.c
+++ b/xen/drivers/passthrough/arm/smmu.c
@@ -2272,6 +2272,7 @@ static const struct of_device_id arm_smmu_of_match[] = {
{ .compatible = "arm,mmu-400", .data = (void *)ARM_SMMU_V1 },
{ .compatible = "arm,mmu-401", .data = (void *)ARM_SMMU_V1 },
{ .compatible = "arm,mmu-500", .data = (void *)ARM_SMMU_V2 },
+   { .compatible = "cavium,smmu-v2", .data = (void *)ARM_SMMU_V2 },
{ },
 };
 MODULE_DEVICE_TABLE(of, arm_smmu_of_match);
-- 
2.7.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [PATCH 3/4] ARM: ITS: Deny hardware domain access to its

2017-06-20 Thread Manish Jaggi
This patch extends the gicv3_iomem_deny_access functionality by adding support
for the ITS regions as well. It adds the function gicv3_its_deny_access.

Signed-off-by: Manish Jaggi <mja...@cavium.com>
---
 xen/arch/arm/gic-v3-its.c| 19 +++
 xen/arch/arm/gic-v3.c|  7 +++
 xen/include/asm-arm/gic_v3_its.h |  8 
 3 files changed, 34 insertions(+)

diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
index e11f29a..98c8f46 100644
--- a/xen/arch/arm/gic-v3-its.c
+++ b/xen/arch/arm/gic-v3-its.c
@@ -20,6 +20,7 @@
 
 #include 
 #include 
+#include <xen/iocap.h>
 #include 
 #include 
 #include 
@@ -905,6 +906,24 @@ struct pending_irq *gicv3_assign_guest_event(struct domain 
*d,
 return pirq;
 }
 
+int gicv3_its_deny_access(const struct domain *d)
+{
+int rc = 0;
+unsigned long mfn, nr;
+const struct host_its *its_data;
+
+list_for_each_entry(its_data, &host_its_list, entry)
+{
+mfn = paddr_to_pfn(its_data->addr);
+nr = PFN_UP(ACPI_GICV3_ITS_MEM_SIZE);
+rc = iomem_deny_access(d, mfn, mfn + nr);
+if ( rc )
+break;
+}
+
+return rc;
+}
+
 /*
  * Create the respective guest DT nodes from a list of host ITSes.
  * This copies the reg property, so the guest sees the ITS at the same address
diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
index 558b32c..f6fbf2f 100644
--- a/xen/arch/arm/gic-v3.c
+++ b/xen/arch/arm/gic-v3.c
@@ -1308,6 +1308,13 @@ static int gicv3_iomem_deny_access(const struct domain 
*d)
 if ( rc )
 return rc;
 
+if ( gicv3_its_host_has_its() )
+{
+rc = gicv3_its_deny_access(d);
+if ( rc )
+return rc;
+}
+
 for ( i = 0; i < gicv3.rdist_count; i++ )
 {
 mfn = gicv3.rdist_regions[i].base >> PAGE_SHIFT;
diff --git a/xen/include/asm-arm/gic_v3_its.h b/xen/include/asm-arm/gic_v3_its.h
index bcfa181..84dbb9c 100644
--- a/xen/include/asm-arm/gic_v3_its.h
+++ b/xen/include/asm-arm/gic_v3_its.h
@@ -143,6 +143,9 @@ int gicv3_its_acpi_init(struct acpi_subtable_header *header,
 const unsigned long end);
 #endif
 
+/* Deny iomem access for its */
+int gicv3_its_deny_access(const struct domain *d);
+
 bool gicv3_its_host_has_its(void);
 
 unsigned int vgic_v3_its_count(const struct domain *d);
@@ -212,6 +215,11 @@ static inline int gicv3_its_acpi_init(struct 
acpi_subtable_header *header,
 }
 #endif
 
+static inline int gicv3_its_deny_access(const struct domain *d)
+{
+return 0;
+}
+
 static inline bool gicv3_its_host_has_its(void)
 {
 return false;
-- 
2.7.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [PATCH 4/4] ARM: ACPI: Add ITS to hardware domain MADT

2017-06-20 Thread Manish Jaggi
This patch adds ITS information in hardware domain's MADT table.
Also this patch interoduces .get_hwdom_madt_size in gic_hw_operations,
to return the complete size of MADT table for hardware domain.

Signed-off-by: Manish Jaggi <mja...@cavium.com>
---
 xen/arch/arm/domain_build.c  |  7 +--
 xen/arch/arm/gic-v2.c|  6 ++
 xen/arch/arm/gic-v3-its.c| 34 ++
 xen/arch/arm/gic-v3.c| 18 ++
 xen/arch/arm/gic.c   | 11 +++
 xen/include/asm-arm/gic.h|  3 +++
 xen/include/asm-arm/gic_v3_its.h | 12 
 7 files changed, 85 insertions(+), 6 deletions(-)

diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
index 3abacc0..15c7f9b 100644
--- a/xen/arch/arm/domain_build.c
+++ b/xen/arch/arm/domain_build.c
@@ -1802,12 +1802,7 @@ static int estimate_acpi_efi_size(struct domain *d, 
struct kernel_info *kinfo)
 acpi_size = ROUNDUP(sizeof(struct acpi_table_fadt), 8);
 acpi_size += ROUNDUP(sizeof(struct acpi_table_stao), 8);
 
-madt_size = sizeof(struct acpi_table_madt)
-+ sizeof(struct acpi_madt_generic_interrupt) * d->max_vcpus
-+ sizeof(struct acpi_madt_generic_distributor);
-if ( d->arch.vgic.version == GIC_V3 )
-madt_size += sizeof(struct acpi_madt_generic_redistributor)
- * d->arch.vgic.nr_regions;
+madt_size = gic_get_hwdom_madt_size(d);
 acpi_size += ROUNDUP(madt_size, 8);
 
 addr = acpi_os_get_root_pointer();
diff --git a/xen/arch/arm/gic-v2.c b/xen/arch/arm/gic-v2.c
index ffbe47c..e92dc3d 100644
--- a/xen/arch/arm/gic-v2.c
+++ b/xen/arch/arm/gic-v2.c
@@ -1012,6 +1012,11 @@ static int gicv2_iomem_deny_access(const struct domain 
*d)
 return iomem_deny_access(d, mfn, mfn + nr);
 }
 
+static u32 gicv2_get_hwdom_madt_size(const struct domain *d)
+{
+return 0;
+}
+
 #ifdef CONFIG_ACPI
 static int gicv2_make_hwdom_madt(const struct domain *d, u32 offset)
 {
@@ -1248,6 +1253,7 @@ const static struct gic_hw_operations gicv2_ops = {
 .read_apr= gicv2_read_apr,
 .make_hwdom_dt_node  = gicv2_make_hwdom_dt_node,
 .make_hwdom_madt = gicv2_make_hwdom_madt,
+.get_hwdom_madt_size = gicv2_get_hwdom_madt_size,
 .map_hwdom_extra_mappings = gicv2_map_hwdown_extra_mappings,
 .iomem_deny_access   = gicv2_iomem_deny_access,
 .do_LPI  = gicv2_do_LPI,
diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
index 98c8f46..7f8ff34 100644
--- a/xen/arch/arm/gic-v3-its.c
+++ b/xen/arch/arm/gic-v3-its.c
@@ -924,6 +924,40 @@ int gicv3_its_deny_access(const struct domain *d)
 return rc;
 }
 
+#ifdef CONFIG_ACPI
+u32 gicv3_its_madt_generic_translator_size(void)
+{
+const struct host_its *its_data;
+u32 size = 0;
+
+list_for_each_entry(its_data, &host_its_list, entry)
+size += sizeof(struct acpi_madt_generic_translator);
+
+return size;
+}
+
+u32 gicv3_its_make_hwdom_madt(u8 *base_ptr, u32 offset)
+{
+struct acpi_madt_generic_translator *gic_its;
+const struct host_its *its_data;
+u32 table_len = offset, size;
+
+/* Update GIC ITS information in hardware domain's MADT */
+list_for_each_entry(its_data, &host_its_list, entry)
+{
+size = sizeof(struct acpi_madt_generic_translator);
+gic_its = (struct acpi_madt_generic_translator *)(base_ptr
+   + table_len);
+gic_its->header.type = ACPI_MADT_TYPE_GENERIC_TRANSLATOR;
+gic_its->header.length = size;
+gic_its->base_address = its_data->addr;
+gic_its->translation_id = its_data->translation_id;
+table_len +=  size;
+}
+
+return table_len;
+}
+#endif
 /*
  * Create the respective guest DT nodes from a list of host ITSes.
  * This copies the reg property, so the guest sees the ITS at the same address
diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
index f6fbf2f..c7a8c1c 100644
--- a/xen/arch/arm/gic-v3.c
+++ b/xen/arch/arm/gic-v3.c
@@ -1407,9 +1407,21 @@ static int gicv3_make_hwdom_madt(const struct domain *d, 
u32 offset)
 table_len += size;
 }
 
+table_len = gicv3_its_make_hwdom_madt(base_ptr, table_len);
 return table_len;
 }
 
+static u32 gicv3_get_hwdom_madt_size(const struct domain *d)
+{
+u32 size;
+size  = sizeof(struct acpi_madt_generic_redistributor)
+ * d->arch.vgic.nr_regions;
+if ( gicv3_its_host_has_its() )
+size  += gicv3_its_madt_generic_translator_size();
+
+return size;
+}
+
 static int __init
 gic_acpi_parse_madt_cpu(struct acpi_subtable_header *header,
 const unsigned long end)
@@ -1605,6 +1617,11 @@ static int gicv3_make_hwdom_madt(const struct domain *d, 
u32 offset)
 {
 return 0;
 }
+
+static u32 gicv3_get_hwdom_madt_size(const struct domain *d)
+{
+return 0;
+}
 #endif
 
 /* Set up the GIC */
@@ -1706,6 +1723,7 @@ s

[Xen-devel] [PATCH 0/4] ARM: ACPI: ITS: Add ITS Support for ACPI hardware domain

2017-06-20 Thread Manish Jaggi
This patch series adds the support of ITS for ACPI hardware domain.
It is tested on the staging branch which has the ITS v12 patchset by Andre.

I have tried to incorporate the review comments on the RFC v1/v2 patch.
The single patch in RFC is now split into 4 patches. 

Patch1: ARM: ITS: Add translation_id to host_its
 Adds translation_id in host_its data structure, which is populated from 
 translation_id read from the firmware MADT. This value is then programmed into
 local MADT created for hardware domain in patch 4.

Patch2: ARM: ITS: ACPI: Introduce gicv3_its_acpi_init
 Introduces function for its_acpi_init, which calls add_to_host_its_list
 which is a common function also called from _dt variant.

Patch3: ARM: ITS: Deny hardware domain access to its
 Extends gicv3_iomem_deny_access to include the ITS regions as well

Patch4: ARM: ACPI: Add ITS to hardware domain MADT
 This patch adds ITS information in hardware domain's MADT table. 
 Also this patch introduces .get_hwdom_madt_size in gic_hw_operations,
 to return the complete size of MADT table for hardware domain.


Manish Jaggi (4):
  ARM: ITS: Add translation_id to host_its
  ARM: ITS: ACPI: Introduce gicv3_its_acpi_init
  ARM: ITS: Deny hardware domain access to its
  ARM: ACPI: Add ITS to hardware domain MADT

 xen/arch/arm/domain_build.c  |   7 +--
 xen/arch/arm/gic-v2.c|   6 +++
 xen/arch/arm/gic-v3-its.c| 102 +++
 xen/arch/arm/gic-v3.c|  31 
 xen/arch/arm/gic.c   |  11 +
 xen/include/asm-arm/gic.h|   3 ++
 xen/include/asm-arm/gic_v3_its.h |  36 ++
 7 files changed, 180 insertions(+), 16 deletions(-)

-- 
2.7.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [PATCH 2/4] ARM: ITS: ACPI: Introduce gicv3_its_acpi_init

2017-06-20 Thread Manish Jaggi
This patch adds gicv3_its_acpi_init. To avoid duplicate code for
initializing and adding to host_its_list a common function
add_to_host_its_list is added which is called by both _dt_init and _acpi_init.

Signed-off-by: Manish Jaggi <mja...@cavium.com>
---
 xen/arch/arm/gic-v3-its.c| 49 
 xen/arch/arm/gic-v3.c|  6 +
 xen/include/asm-arm/gic_v3_its.h | 14 
 3 files changed, 59 insertions(+), 10 deletions(-)

diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
index 2d36030..e11f29a 100644
--- a/xen/arch/arm/gic-v3-its.c
+++ b/xen/arch/arm/gic-v3-its.c
@@ -33,6 +33,7 @@
 
 #define ITS_CMD_QUEUE_SZ        SZ_1M
 
+#define ACPI_GICV3_ITS_MEM_SIZE (SZ_64K)
 /*
  * No lock here, as this list gets only populated upon boot while scanning
  * firmware tables for all host ITSes, and only gets iterated afterwards.
@@ -976,11 +977,35 @@ int gicv3_its_make_hwdom_dt_nodes(const struct domain *d,
 return res;
 }
 
+/* Common function for adding to host_its_list */
+static int add_to_host_its_list(u64 addr, u64 size,
+  u32 translation_id, const void *node)
+{
+struct host_its *its_data;
+its_data = xzalloc(struct host_its);
+
+if ( !its_data )
+return -1;
+
+if ( node )
+its_data->dt_node = node;
+
+its_data->addr = addr;
+its_data->size = size;
+its_data->translation_id = translation_id;
+printk("GICv3: Found ITS @0x%lx\n", addr);
+
+list_add_tail(&its_data->entry, &host_its_list);
+
+return 0;
+}
+
 /* Scan the DT for any ITS nodes and create a list of host ITSes out of it. */
 void gicv3_its_dt_init(const struct dt_device_node *node)
 {
 const struct dt_device_node *its = NULL;
-struct host_its *its_data;
+static int its_id = 1;
 
 /*
  * Check for ITS MSI subnodes. If any, add the ITS register
@@ -996,19 +1021,23 @@ void gicv3_its_dt_init(const struct dt_device_node *node)
 if ( dt_device_get_address(its, 0, &addr, &size) )
 panic("GICv3: Cannot find a valid ITS frame address");
 
-its_data = xzalloc(struct host_its);
-if ( !its_data )
-panic("GICv3: Cannot allocate memory for ITS frame");
+if ( add_to_host_its_list(addr, size, its_id++, its) )
+panic("GICV3: Adding Host ITS failed ");
+}
+}
 
-its_data->addr = addr;
-its_data->size = size;
-its_data->dt_node = its;
+#ifdef CONFIG_ACPI
+int gicv3_its_acpi_init(struct acpi_subtable_header *header, const unsigned long end)
+{
+struct acpi_madt_generic_translator *its_entry;
 
-printk("GICv3: Found ITS @0x%lx\n", addr);
+its_entry = (struct acpi_madt_generic_translator *)header;
 
-list_add_tail(&its_data->entry, &host_its_list);
-}
+return add_to_host_its_list(its_entry->base_address,
+ACPI_GICV3_ITS_MEM_SIZE,
+its_entry->translation_id, NULL);
 }
+#endif
 
 /*
  * Local variables:
diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
index c927306..558b32c 100644
--- a/xen/arch/arm/gic-v3.c
+++ b/xen/arch/arm/gic-v3.c
@@ -1567,6 +1567,12 @@ static void __init gicv3_acpi_init(void)
 
 gicv3.rdist_stride = 0;
 
+count = acpi_table_parse_madt(ACPI_MADT_TYPE_GENERIC_TRANSLATOR,
+  gicv3_its_acpi_init, 0);
+
+if ( count <= 0 )
+panic("GICv3: Can't get ITS entry");
+
 /*
  * In ACPI, 0 is considered as the invalid address. However the rest
  * of the initialization rely on the invalid address to be
diff --git a/xen/include/asm-arm/gic_v3_its.h b/xen/include/asm-arm/gic_v3_its.h
index 96b910b..bcfa181 100644
--- a/xen/include/asm-arm/gic_v3_its.h
+++ b/xen/include/asm-arm/gic_v3_its.h
@@ -105,6 +105,7 @@
 
 #include 
 #include 
+#include <xen/acpi.h>
 
 #define HOST_ITS_FLUSH_CMD_QUEUE    (1U << 0)
 #define HOST_ITS_USES_PTA   (1U << 1)
@@ -137,6 +138,11 @@ extern struct list_head host_its_list;
 /* Parse the host DT and pick up all host ITSes. */
 void gicv3_its_dt_init(const struct dt_device_node *node);
 
+#ifdef CONFIG_ACPI
+int gicv3_its_acpi_init(struct acpi_subtable_header *header,
+const unsigned long end);
+#endif
+
 bool gicv3_its_host_has_its(void);
 
 unsigned int vgic_v3_its_count(const struct domain *d);
@@ -198,6 +204,14 @@ static inline void gicv3_its_dt_init(const struct 
dt_device_node *node)
 {
 }
 
+#ifdef CONFIG_ACPI
+static inline int gicv3_its_acpi_init(struct acpi_subtable_header *header,
+const unsigned long end)
+{
+return false;
+}
+#endif
+
 static inline bool gicv3_its_host_has_its(void)
 {
 return false;
-- 
2.7.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [PATCH 1/4] ARM: ITS: Add translation_id to host_its

2017-06-20 Thread Manish Jaggi
This patch adds a translation_id to host_its data structure.
Value stored in this id should be copied over to hardware domains
MADT table.

Signed-off-by: Manish Jaggi <mja...@cavium.com>
---
 xen/include/asm-arm/gic_v3_its.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/xen/include/asm-arm/gic_v3_its.h b/xen/include/asm-arm/gic_v3_its.h
index 1fac1c7..96b910b 100644
--- a/xen/include/asm-arm/gic_v3_its.h
+++ b/xen/include/asm-arm/gic_v3_its.h
@@ -118,6 +118,8 @@ struct host_its {
 const struct dt_device_node *dt_node;
 paddr_t addr;
 paddr_t size;
+/* A unique value to identify each ITS */
+u32 translation_id;
 void __iomem *its_base;
 unsigned int devid_bits;
 unsigned int evid_bits;
-- 
2.7.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] Hugepage support for Dom0

2017-06-19 Thread Manish Jaggi

Hi,

Does Xen arm64 support hugepages for Dom0? If yes, how does one enable them?
Found a wiki page on it: 
https://wiki.xenproject.org/wiki/Huge_Page_Support but it is not up to date.


Thanks
-Manish


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [PATCH 4/4] ARM: ACPI: Add ITS to hardware domain MADT

2017-06-16 Thread Manish Jaggi

This patch adds ITS information in hardware domain's MADT table.
Also this patch introduces .get_hwdom_madt_size in gic_hw_operations,
to return the complete size of MADT table for hardware domain.

Signed-off-by: Manish Jaggi <mja...@cavium.com>
---
 xen/arch/arm/domain_build.c  |  7 +--
 xen/arch/arm/gic-v2.c|  6 ++
 xen/arch/arm/gic-v3-its.c| 34 ++
 xen/arch/arm/gic-v3.c| 18 ++
 xen/arch/arm/gic.c   | 11 +++
 xen/include/asm-arm/gic.h|  3 +++
 xen/include/asm-arm/gic_v3_its.h | 12 
 7 files changed, 85 insertions(+), 6 deletions(-)

diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
index 3abacc0..15c7f9b 100644
--- a/xen/arch/arm/domain_build.c
+++ b/xen/arch/arm/domain_build.c
@@ -1802,12 +1802,7 @@ static int estimate_acpi_efi_size(struct domain 
*d, struct kernel_info *kinfo)

 acpi_size = ROUNDUP(sizeof(struct acpi_table_fadt), 8);
 acpi_size += ROUNDUP(sizeof(struct acpi_table_stao), 8);

-madt_size = sizeof(struct acpi_table_madt)
-+ sizeof(struct acpi_madt_generic_interrupt) * d->max_vcpus
-+ sizeof(struct acpi_madt_generic_distributor);
-if ( d->arch.vgic.version == GIC_V3 )
-madt_size += sizeof(struct acpi_madt_generic_redistributor)
- * d->arch.vgic.nr_regions;
+madt_size = gic_get_hwdom_madt_size(d);
 acpi_size += ROUNDUP(madt_size, 8);

 addr = acpi_os_get_root_pointer();
diff --git a/xen/arch/arm/gic-v2.c b/xen/arch/arm/gic-v2.c
index ffbe47c..e92dc3d 100644
--- a/xen/arch/arm/gic-v2.c
+++ b/xen/arch/arm/gic-v2.c
@@ -1012,6 +1012,11 @@ static int gicv2_iomem_deny_access(const struct 
domain *d)

 return iomem_deny_access(d, mfn, mfn + nr);
 }

+static u32 gicv2_get_hwdom_madt_size(const struct domain *d)
+{
+return 0;
+}
+
 #ifdef CONFIG_ACPI
 static int gicv2_make_hwdom_madt(const struct domain *d, u32 offset)
 {
@@ -1248,6 +1253,7 @@ const static struct gic_hw_operations gicv2_ops = {
 .read_apr= gicv2_read_apr,
 .make_hwdom_dt_node  = gicv2_make_hwdom_dt_node,
 .make_hwdom_madt = gicv2_make_hwdom_madt,
+.get_hwdom_madt_size = gicv2_get_hwdom_madt_size,
 .map_hwdom_extra_mappings = gicv2_map_hwdown_extra_mappings,
 .iomem_deny_access   = gicv2_iomem_deny_access,
 .do_LPI  = gicv2_do_LPI,
diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
index 98c8f46..7f8ff34 100644
--- a/xen/arch/arm/gic-v3-its.c
+++ b/xen/arch/arm/gic-v3-its.c
@@ -924,6 +924,40 @@ int gicv3_its_deny_access(const struct domain *d)
 return rc;
 }

+#ifdef CONFIG_ACPI
+u32 gicv3_its_madt_generic_translator_size(void)
+{
+const struct host_its *its_data;
+u32 size = 0;
+
+list_for_each_entry(its_data, &host_its_list, entry)
+size += sizeof(struct acpi_madt_generic_translator);
+
+return size;
+}
+
+u32 gicv3_its_make_hwdom_madt(u8 *base_ptr, u32 offset)
+{
+struct acpi_madt_generic_translator *gic_its;
+const struct host_its *its_data;
+u32 table_len = offset, size;
+
+/* Update GIC ITS information in hardware domain's MADT */
+list_for_each_entry(its_data, &host_its_list, entry)
+{
+size = sizeof(struct acpi_madt_generic_translator);
+gic_its = (struct acpi_madt_generic_translator *)(base_ptr
+   + table_len);
+gic_its->header.type = ACPI_MADT_TYPE_GENERIC_TRANSLATOR;
+gic_its->header.length = size;
+gic_its->base_address = its_data->addr;
+gic_its->translation_id = its_data->translation_id;
+table_len +=  size;
+}
+
+return table_len;
+}
+#endif
 /*
  * Create the respective guest DT nodes from a list of host ITSes.
  * This copies the reg property, so the guest sees the ITS at the same 
address

diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
index f6fbf2f..c7a8c1c 100644
--- a/xen/arch/arm/gic-v3.c
+++ b/xen/arch/arm/gic-v3.c
@@ -1407,9 +1407,21 @@ static int gicv3_make_hwdom_madt(const struct 
domain *d, u32 offset)

 table_len += size;
 }

+table_len = gicv3_its_make_hwdom_madt(base_ptr, table_len);
 return table_len;
 }

+static u32 gicv3_get_hwdom_madt_size(const struct domain *d)
+{
+u32 size;
+size  = sizeof(struct acpi_madt_generic_redistributor)
+ * d->arch.vgic.nr_regions;
+if ( gicv3_its_host_has_its() )
+size  += gicv3_its_madt_generic_translator_size();
+
+return size;
+}
+
 static int __init
 gic_acpi_parse_madt_cpu(struct acpi_subtable_header *header,
 const unsigned long end)
@@ -1605,6 +1617,11 @@ static int gicv3_make_hwdom_madt(const struct 
domain *d, u32 offset)

 {
 return 0;
 }
+
+static u32 gicv3_get_hwdom_madt_size(const struct domain *d)
+{
+return 0;
+}
 #endif

 /* Set up the GIC */
@@ -1706,6 +1723,7 @@ static cons

[Xen-devel] [PATCH 3/4] ARM: ITS: Deny hardware domain access to its region

2017-06-16 Thread Manish Jaggi
This patch extends the gicv3_iomem_deny_access functionality by adding 
support

for its region as well. Added function gicv3_its_deny_access.

Signed-off-by: Manish Jaggi <mja...@cavium.com>
---
 xen/arch/arm/gic-v3-its.c| 19 +++
 xen/arch/arm/gic-v3.c|  7 +++
 xen/include/asm-arm/gic_v3_its.h |  8 
 3 files changed, 34 insertions(+)

diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
index e11f29a..98c8f46 100644
--- a/xen/arch/arm/gic-v3-its.c
+++ b/xen/arch/arm/gic-v3-its.c
@@ -20,6 +20,7 @@

 #include 
 #include 
+#include <xen/iocap.h>
 #include 
 #include 
 #include 
@@ -905,6 +906,24 @@ struct pending_irq *gicv3_assign_guest_event(struct 
domain *d,

 return pirq;
 }

+int gicv3_its_deny_access(const struct domain *d)
+{
+int rc = 0;
+unsigned long mfn, nr;
+const struct host_its *its_data;
+
+list_for_each_entry(its_data, &host_its_list, entry)
+{
+mfn = paddr_to_pfn(its_data->addr);
+nr = PFN_UP(ACPI_GICV3_ITS_MEM_SIZE);
+rc = iomem_deny_access(d, mfn, mfn + nr);
+if ( rc )
+break;
+}
+
+return rc;
+}
+
 /*
  * Create the respective guest DT nodes from a list of host ITSes.
  * This copies the reg property, so the guest sees the ITS at the same 
address

diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
index 558b32c..f6fbf2f 100644
--- a/xen/arch/arm/gic-v3.c
+++ b/xen/arch/arm/gic-v3.c
@@ -1308,6 +1308,13 @@ static int gicv3_iomem_deny_access(const struct 
domain *d)

 if ( rc )
 return rc;

+if ( gicv3_its_host_has_its() )
+{
+rc = gicv3_its_deny_access(d);
+if ( rc )
+return rc;
+}
+
 for ( i = 0; i < gicv3.rdist_count; i++ )
 {
 mfn = gicv3.rdist_regions[i].base >> PAGE_SHIFT;
diff --git a/xen/include/asm-arm/gic_v3_its.h 
b/xen/include/asm-arm/gic_v3_its.h

index bcfa181..84dbb9c 100644
--- a/xen/include/asm-arm/gic_v3_its.h
+++ b/xen/include/asm-arm/gic_v3_its.h
@@ -143,6 +143,9 @@ int gicv3_its_acpi_init(struct acpi_subtable_header 
*header,

 const unsigned long end);
 #endif

+/* Deny iomem access for its */
+int gicv3_its_deny_access(const struct domain *d);
+
 bool gicv3_its_host_has_its(void);

 unsigned int vgic_v3_its_count(const struct domain *d);
@@ -212,6 +215,11 @@ static inline int gicv3_its_acpi_init(struct 
acpi_subtable_header *header,

 }
 #endif

+static inline int gicv3_its_deny_access(const struct domain *d)
+{
+return 0;
+}
+
 static inline bool gicv3_its_host_has_its(void)
 {
 return false;
--
2.7.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [PATCH 2/4] ARM: ITS: ACPI: Introduce gicv3_its_acpi_init

2017-06-16 Thread Manish Jaggi

This patch adds gicv3_its_acpi_init. To avoid duplicate code for
initializing and adding to host_its_list a common function
add_to_host_its_list is added which is called by both _dt_init and 
_acpi_init.


Signed-off-by: Manish Jaggi <mja...@cavium.com>
---
 xen/arch/arm/gic-v3-its.c| 49 


 xen/arch/arm/gic-v3.c|  6 +
 xen/include/asm-arm/gic_v3_its.h | 14 
 3 files changed, 59 insertions(+), 10 deletions(-)

diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
index 2d36030..e11f29a 100644
--- a/xen/arch/arm/gic-v3-its.c
+++ b/xen/arch/arm/gic-v3-its.c
@@ -33,6 +33,7 @@

 #define ITS_CMD_QUEUE_SZSZ_1M

+#define ACPI_GICV3_ITS_MEM_SIZE (SZ_64K)
 /*
  * No lock here, as this list gets only populated upon boot while scanning
  * firmware tables for all host ITSes, and only gets iterated afterwards.
@@ -976,11 +977,35 @@ int gicv3_its_make_hwdom_dt_nodes(const struct 
domain *d,

 return res;
 }

+/* Common function for adding to host_its_list */
+static int add_to_host_its_list(u64 addr, u64 size,
+  u32 translation_id, const void *node)
+{
+struct host_its *its_data;
+its_data = xzalloc(struct host_its);
+
+if ( !its_data )
+return -1;
+
+if ( node )
+its_data->dt_node = node;
+
+its_data->addr = addr;
+its_data->size = size;
+its_data->translation_id = translation_id;
+printk("GICv3: Found ITS @0x%lx\n", addr);
+
+list_add_tail(&its_data->entry, &host_its_list);
+
+return 0;
+}
+
 /* Scan the DT for any ITS nodes and create a list of host ITSes out of it. */

 void gicv3_its_dt_init(const struct dt_device_node *node)
 {
 const struct dt_device_node *its = NULL;
-struct host_its *its_data;
+static int its_id = 1;

 /*
  * Check for ITS MSI subnodes. If any, add the ITS register
@@ -996,19 +1021,23 @@ void gicv3_its_dt_init(const struct 
dt_device_node *node)

 if ( dt_device_get_address(its, 0, &addr, &size) )
 panic("GICv3: Cannot find a valid ITS frame address");

-its_data = xzalloc(struct host_its);
-if ( !its_data )
-panic("GICv3: Cannot allocate memory for ITS frame");
+if ( add_to_host_its_list(addr, size, its_id++, its) )
+panic("GICV3: Adding Host ITS failed ");
+}
+}

-its_data->addr = addr;
-its_data->size = size;
-its_data->dt_node = its;
+#ifdef CONFIG_ACPI
+int gicv3_its_acpi_init(struct acpi_subtable_header *header, const unsigned long end)
+{
+struct acpi_madt_generic_translator *its_entry;

-printk("GICv3: Found ITS @0x%lx\n", addr);
+its_entry = (struct acpi_madt_generic_translator *)header;

-list_add_tail(&its_data->entry, &host_its_list);
-}
+return add_to_host_its_list(its_entry->base_address,
+ACPI_GICV3_ITS_MEM_SIZE,
+its_entry->translation_id, NULL);
 }
+#endif

 /*
  * Local variables:
diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
index c927306..558b32c 100644
--- a/xen/arch/arm/gic-v3.c
+++ b/xen/arch/arm/gic-v3.c
@@ -1567,6 +1567,12 @@ static void __init gicv3_acpi_init(void)

 gicv3.rdist_stride = 0;

+count = acpi_table_parse_madt(ACPI_MADT_TYPE_GENERIC_TRANSLATOR,
+  gicv3_its_acpi_init, 0);
+
+if ( count <= 0 )
+panic("GICv3: Can't get ITS entry");
+
 /*
  * In ACPI, 0 is considered as the invalid address. However the rest
  * of the initialization rely on the invalid address to be
diff --git a/xen/include/asm-arm/gic_v3_its.h 
b/xen/include/asm-arm/gic_v3_its.h

index 96b910b..bcfa181 100644
--- a/xen/include/asm-arm/gic_v3_its.h
+++ b/xen/include/asm-arm/gic_v3_its.h
@@ -105,6 +105,7 @@

 #include 
 #include 
+#include <xen/acpi.h>

 #define HOST_ITS_FLUSH_CMD_QUEUE    (1U << 0)
 #define HOST_ITS_USES_PTA   (1U << 1)
@@ -137,6 +138,11 @@ extern struct list_head host_its_list;
 /* Parse the host DT and pick up all host ITSes. */
 void gicv3_its_dt_init(const struct dt_device_node *node);

+#ifdef CONFIG_ACPI
+int gicv3_its_acpi_init(struct acpi_subtable_header *header,
+const unsigned long end);
+#endif
+
 bool gicv3_its_host_has_its(void);

 unsigned int vgic_v3_its_count(const struct domain *d);
@@ -198,6 +204,14 @@ static inline void gicv3_its_dt_init(const struct 
dt_device_node *node)

 {
 }

+#ifdef CONFIG_ACPI
+static inline int gicv3_its_acpi_init(struct acpi_subtable_header *header,
+const unsigned long end)
+{
+return false;
+}
+#endif
+
 static inline bool gicv3_its_host_has_its(void)
 {
 return false;
--
2.7.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [PATCH 1/4] ARM: ITS: Add translation_id to host_its

2017-06-16 Thread Manish Jaggi

This patch adds a translation_id to host_its data structure.
Value stored in this id should be copied over to hardware domains
MADT table.

Signed-off-by: Manish Jaggi <mja...@cavium.com>
---
 xen/include/asm-arm/gic_v3_its.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/xen/include/asm-arm/gic_v3_its.h 
b/xen/include/asm-arm/gic_v3_its.h

index 1fac1c7..96b910b 100644
--- a/xen/include/asm-arm/gic_v3_its.h
+++ b/xen/include/asm-arm/gic_v3_its.h
@@ -118,6 +118,8 @@ struct host_its {
 const struct dt_device_node *dt_node;
 paddr_t addr;
 paddr_t size;
+/* A unique value to identify each ITS */
+u32 translation_id;
 void __iomem *its_base;
 unsigned int devid_bits;
 unsigned int evid_bits;
--
2.7.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [PATCH 0/4] ARM: ACPI: ITS: Add ITS Support for ACPI hardware domain

2017-06-16 Thread Manish Jaggi

Hi,

This patch series adds the support of ITS for ACPI hardware domain.
It is tested on the staging branch which has the ITS v12 patchset by Andre.

I have tried to incorporate the review comments on the RFC v1/v2 patch.
The single patch in RFC is now split into 4 patches.

Patch1: ARM: ITS: Add translation_id to host_its
 Adds translation_id in host_its data structure, which is populated from
 translation_id read from firmware MADT. This value is then programmed into
 local MADT created for hardware domain in patch 4.

Patch2: ARM: ITS: ACPI: Introduce gicv3_its_acpi_init
 Introduces function for its_acpi_init, which calls add_to_host_its_list
 which is a common function also called from _dt variant.

Patch3: ARM: ITS: Deny hardware domain access to its
 Extends gicv3_iomem_deny_access to include the ITS regions as well

Patch4: ARM: ACPI: Add ITS to hardware domain MADT
 This patch adds ITS information in hardware domain's MADT table.
 Also this patch introduces .get_hwdom_madt_size in gic_hw_operations,
 to return the complete size of MADT table for hardware domain.

Manish Jaggi (4):
  ARM: ITS: Add translation_id to host_its
  ARM: ITS: ACPI: Introduce gicv3_its_acpi_init
  ARM: ITS: Deny hardware domain access to its
  ARM: ACPI: Add ITS to hardware domain MADT

 xen/arch/arm/domain_build.c  |   7 +--
 xen/arch/arm/gic-v2.c|   6 +++
 xen/arch/arm/gic-v3-its.c| 102 +++

 xen/arch/arm/gic-v3.c|  31 
 xen/arch/arm/gic.c   |  11 +
 xen/include/asm-arm/gic.h|   3 ++
 xen/include/asm-arm/gic_v3_its.h |  36 ++
 7 files changed, 180 insertions(+), 16 deletions(-)

--
2.7.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC v2][PATCH] arm-acpi: Add ITS Support for Dom0

2017-06-13 Thread Manish Jaggi



On 6/13/2017 4:58 PM, Julien Grall wrote:

On 13/06/17 12:02, Manish Jaggi wrote:

Will the below code be ok?


If you noticed, I didn't say this code is wrong. Instead I asked why
you use the same ID. Meaning, is there anything in the DSDT requiring
this value?


+ int trans_id = 0;


unsigned.


+ list_for_each_entry(its_data, &host_its_list, entry)
+ {
+gic_its->translation_id = ++trans_id;


You start the translation ID at 1. Why?


as per the ACPI spec the value should be unique to each GIC ITS unit.
Does starting with 1 break anything? Or should I start with a magic 
number?


Rather than arguing about the start value here, you should have first 
answered the question regarding the usage of translation_id.
In v1 I assumed that it would be the same as read from the host ITS tables, 
so it would have a unique value as programmed by the host firmware.


I understand that nobody is using it today. However, when I asked 
around me nobody ruled out to any future usage of GIC ITS ID and 
request this to be kept as it is.


This means that you can simply copy over the ACPI tables, rather than 
regenerating them.



I don't follow your comment; I am a bit confused.
In v1 you mentioned "Please explain why you need to have the same 
ID as the host."
Now when you say we copy over, would the translation_id be the same as that 
of the host?



Cheers,




___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC v2][PATCH] arm-acpi: Add ITS Support for Dom0

2017-06-13 Thread Manish Jaggi

Hi julien,

On 6/9/2017 2:09 PM, Julien Grall wrote:



On 09/06/2017 07:48, Manish Jaggi wrote:


On 6/8/2017 7:28 PM, Julien Grall wrote:

Hi,

Hello Julien,


Hello,


+list_for_each_entry(its_data, &host_its_list, entry)
+{


Pointless {


+size += sizeof(struct acpi_madt_generic_translator);
+}

Just for readability of code.


You have indentation for that. So I don't think it helps.

ok i will fix it.




Same here + add a newline.


Sure.

+return size;
+}
+
+u32 gicv3_its_make_hwdom_madt(u8 *base_ptr, u32 offset)
+{
+struct acpi_madt_generic_translator *gic_its;
+const struct host_its *its_data;
+u32 table_len = offset, size;
+
+/* Update GIC ITS information in hardware domain's MADT */
+list_for_each_entry(its_data, &host_its_list, entry)
+{
+size = sizeof(struct acpi_madt_generic_translator);
+gic_its = (struct acpi_madt_generic_translator *)(base_ptr + table_len);


This line is likely too long.


I will check it.

+gic_its->header.type = ACPI_MADT_TYPE_GENERIC_TRANSLATOR;
+gic_its->header.length = size;
+gic_its->base_address = its_data->addr;


On the previous patch you had:

gic_its->translation_id = its_data->translation_id;

I asked to explain why you need to have the same ID as the host. And
now you dropped it. This does not match the spec (Table 5-67 in ACPI
6.1):

"GIC ITS ID. In a system with multiple GIC ITS units, this value must
be unique to each one."

But here, the ITS ID will not be unique. So why did you dropped it?


The reason I dropped it from its_data is that I was not setting it. So it
doesn't belong there.


Where would it belong then?

This function is used to generate ACPI tables for the hardware domain.



Will the below code be ok?


If you noticed, I didn't say this code is wrong. Instead I asked why 
you use the same ID. Meaning, is there anything in the DSDT requiring 
this value?



+ int trans_id = 0;


unsigned.


+ list_for_each_entry(its_data, &host_its_list, entry)
+ {
+gic_its->translation_id = ++trans_id;


You start the translation ID at 1. Why?


as per the ACPI spec the value should be unique to each GIC ITS unit.
Does starting with 1 break anything? Or should I start with a magic number?



+table_len +=  size;
+}
+return table_len;
+}
+
 /*
  * Create the respective guest DT nodes from a list of host ITSes.
  * This copies the reg property, so the guest sees the ITS at the 
same

address
@@ -992,6 +1045,26 @@ int gicv3_its_make_hwdom_dt_nodes(const struct
domain *d,
 return res;
 }

+int gicv3_its_acpi_init(struct acpi_subtable_header *header, const
unsigned long end)


ACPI is an option and is not enabled by default. Please make sure that
this code builds without ACPI. Likely this means surrounding it with
#ifdef CONFIG_ACPI.

It will get compiled but not called. Do you still want the ifdef? I
can add that.


All ACPIs functions are protected by ifdef. So this one should be as 
well.

ok will do.





+{
+struct acpi_madt_generic_translator *its_entry;
+struct host_its *its_data;
+
+its_data = xzalloc(struct host_its);
+if (!its_data)


Coding style.


Sure.

+return -1;
+
+its_entry = (struct acpi_madt_generic_translator *)header;
+its_data->addr  = its_entry->base_address;
+its_data->size = ACPI_GICV3_ITS_MEM_SIZE;
+
+spin_lock_init(&its_data->cmd_lock);
+
+printk("GICv3: Found ITS @0x%lx\n", its_data->addr);
+
+list_add_tail(&its_data->entry, &host_its_list);


As said on v1, likely you could re-use/factorize a part of
gicv3_its_dt_init to avoid implementing the initialization twice.


For this I have a different opinion.


Why didn't you state it on the previous version? I usually interpret a 
non-answer as an acknowledgment.



gicv3_its_dt_init has a loop dt_for_each_child_node(node, its) while
gicv3_its_acpi_init is a callback.
Moreover, apart from xzalloc and list_add_tail most of the code is
different. So IMHO keeping them separate is better.


You still set addr and size as in the DT counterpart. Also, this is 
asking for a field to be forgotten if we decide to extend the 
structure host_its. So I still don't see any reason to open-code it 
and take the risk of introducing bugs in the future...

ok Added.



Also newline.


+return 0;
+}


Newline here.

Sure.


 /* Scan the DT for any ITS nodes and create a list of host ITSes 
out of

it. */
 void gicv3_its_dt_init(const struct dt_device_node *node)
 {
diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
index c927306..f0f6d12 100644
--- a/xen/arch/arm/gic-v3.c
+++ b/xen/arch/arm/gic-v3.c
@@ -1333,9 +1333,8 @@ static int gicv3_iomem_deny_access(const struct
domain *d)
 return iomem_deny_access(d, mfn, mfn + nr);
 }

-return 0;
+return gicv3_its_deny_access(d);


Copying my answer from v1 for convenience:

if ( vbase != INVALID_PADDR )
{
mfn = vbase >> PAGE_S

Re: [Xen-devel] ARM: SMMUv3 support

2017-06-12 Thread Manish Jaggi



On 3/29/2017 5:30 AM, Goel, Sameer wrote:

Sure, I will try to post something soon.

Hi Sameer,
Are you still working on SMMUv3? Can you please post patches.

Thanks
Manish

Thanks,
Sameer

On 3/27/2017 11:03 PM, Vijay Kilari wrote:

On Mon, Mar 27, 2017 at 10:00 PM, Goel, Sameer  wrote:

Hi,
  I am working on adding this support. The work is in initial stages and will 
target ACPI systems to start with. Do you have a specific requirement? Or even 
better: want to help with DT testing ? :)

Thanks Sameer. I don't have any specific requirement. I am also
looking with ACPI support.
Please share your RFC patches so that I can test on our platform.


Thanks,
Sameer

On 3/20/2017 11:58 PM, Vijay Kilari wrote:

Hi,

  Is there any effort put by anyone to get SMMUv3 support in Xen for ARM64?
Would be glad to know.

Regards
Vijay

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


--
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project.



___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v11 00/34] arm64: Dom0 ITS emulation

2017-06-11 Thread Manish Jaggi



On 6/9/2017 11:11 PM, Andre Przywara wrote:

Hi,

Hi Andre,
Tested this patchset + my ACPI ITS patch 
(https://lists.xen.org/archives/html/xen-devel/2017-06/msg00716.html) on 
our platform and it works.

With v10 I was not able to get interrupts; v9 was booting OK.

WBR
-Manish

fixes to v10, with their number getting eventually smaller ;-)
The same restrictions as for the previous versions still apply: the locking
is considered somewhat insufficient and will be fixed by an upcoming rework.

Patch 01/34 was reworked to properly synchronize access to the priority
in a lock-less fashion. This should be back-ported to 4.9.
The former patch 12/32 ("enable ITS and LPIs on the host") was moved up-front
and split to allow back-porting the new 02/34 to Xen 4.9, which is broken
if the preliminary ITS support is configured in and the machine advertises
an ITS in the device tree.

No big changes this time: some bugs fixed (many thanks to Julien for
proper testing!), some extended comments and some improvements to better
protect parallel accesses. For a detailed changelog see below.

I added Acked-by: and Reviewed-by: tags, but refrained from doing so
for Julien's tags for patch 18/34 and 20/34, since I changed them slightly.

Cheers,
Andre

--
This series adds support for emulation of an ARM GICv3 ITS interrupt
controller. For hardware which relies on the ITS to provide interrupts for
its peripherals this code is needed to get a machine booted into Dom0 at
all. ITS emulation for DomUs is only really useful with PCI passthrough,
which is not yet available for ARM. It is expected that this feature
will be co-developed with the ITS DomU code. However this code drop here
considered DomU emulation already, to keep later architectural changes
to a minimum.

This is technical preview version to allow early testing of the feature.
Things not (properly) addressed in this release:
- There is only support for Dom0 at the moment. DomU support is only really
useful with PCI passthrough, which is not there yet for ARM.
- The MOVALL command is not emulated. In our case there is really nothing
to do here. We might need to revisit this in the future for DomU support.
- The INVALL command might need some rework to be more efficient. Currently
we iterate over all mapped LPIs, which might take a bit longer.
- Indirect tables are not supported. This affects both the host and the
virtual side.
- The ITS tables inside (Dom0) guest memory cannot easily be protected
at the moment (without restricting access to Xen as well). So for now
we trust Dom0 not to touch this memory (which the spec forbids as well).
- With malicious guests (DomUs) there is a possibility of an interrupt
storm triggered by a device. We would need to investigate what that means
for Xen and if there is a nice way to prevent this. Disabling the LPI on
the host side would require command queuing, which has its downsides to
be issued during runtime.
- Dom0 should make sure that the ITS resources (number of LPIs, devices,
events) later handed to a DomU are really limited, as a large number of
them could mean much time spend in Xen to initialize, free or handle those.
It is expected that the toolstack sets up a tailored ITS with just enough
resources to accommodate the needs of the actual passthrough-ed device(s).
- The command queue locking is currently suboptimal and should be made more
fine-grained in the future, if possible.
- Provide support for running with an IOMMU, to map the doorbell page
to all devices.


Some generic design principles:

* The current GIC code statically allocates structures for each supported
IRQ (both for the host and the guest), which due to the potentially
millions of LPI interrupts is not feasible to copy for the ITS.
So we refrain from introducing the ITS as a first class Xen interrupt
controller, also we don't hold struct irq_desc's or struct pending_irq's
for each possible LPI.
Fortunately LPIs are only interesting to guests, so we get away with
storing only the virtual IRQ number and the guest VCPU for each allocated
host LPI, which can be stashed into one uint64_t. This data is stored in
a two-level table, which is both memory efficient and quick to access.
We hook into the existing IRQ handling and VGIC code to avoid accessing
the normal structures, providing alternative methods for getting the
needed information (priority, is enabled?) for LPIs.
Whenever a guest maps a device, we allocate the maximum required number
of struct pending_irq's, so that any triggering LPI can find its data
structure. Upon the guest actually mapping the LPI, this pointer to the
corresponding pending_irq gets entered into a radix tree, so that it can
be quickly looked up.
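
As an illustration of that encoding (the exact field layout is my
assumption for the sketch; the series' actual union may differ):

/* One entry of the two-level host-LPI table: everything needed to route
 * a triggering host LPI, packed into a single uint64_t. */
union host_lpi {
    uint64_t data;
    struct {
        uint32_t virt_lpi;   /* virtual LPI number injected into the guest */
        uint16_t dom_id;     /* owning domain */
        uint16_t vcpu_id;    /* target VCPU within that domain */
    };
};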

* On the guest side we (later will) have to deal with malicious guests
trying to hog Xen with mapping requests for a lot of LPIs, for instance.
As the ITS actually uses system memory for storing status information,
we use this memory (which the guest has to 

Re: [Xen-devel] [RFC] [PATCH] arm-acpi: Hide SMMU from IORT for hardware domain

2017-06-09 Thread Manish Jaggi

Hi Julien,

On 6/9/2017 2:53 PM, Julien Grall wrote:



On 09/06/2017 08:13, Manish Jaggi wrote:

On 6/8/2017 6:39 PM, Julien Grall wrote:

Hi Manish,


Hi Julien,


Hello,


On 08/06/17 13:38, Manish Jaggi wrote:




Spurious line.


This patch disables the smmu node in IORT table for hardware domain.
Also patches the output_base of pci_rc id_array with output_base of
smmu node id_array.


I would have appreciated a bit more description in the commit message
to explain your logic.


I will add it.



Signed-off-by: Manish Jaggi <mja...@cavium.com>
---
 xen/arch/arm/domain_build.c | 142
+++-


domain_build.c is starting to be really big. I think it is time to
move some acpi bits outside domain_build.c.


You are right, I also thought that.
How about 3 files:
domain_build.c
acpi_domain_build.c
dt_domain_build.c


If you want to split the current code, then fine. But it is not 
strictly mandatory for this code. What I want is adding new code in 
separate files. But in this case they should be named:


domain_build.c
acpi/domain_build.c
dt/domain_build.c

This would keep the ACPI and DT firmware code separated and avoid 
polluting arch/arm.

I will follow this structure.



 xen/include/acpi/actbl2.h   |   3 +-
 xen/include/asm-arm/acpi.h  |   1 +
 3 files changed, 144 insertions(+), 2 deletions(-)

diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
index d6d6c94..9f41d0e 100644
--- a/xen/arch/arm/domain_build.c
+++ b/xen/arch/arm/domain_build.c
@@ -32,6 +32,7 @@ integer_param("dom0_max_vcpus", opt_dom0_max_vcpus);
 int dom0_11_mapping = 1;

 static u64 __initdata dom0_mem;
+static u8 *iort_base_ptr;


Looking at the code, I don't see any reason to have this global.

If you look a bit closer, this is used in multiple places;
see fixup_pcirc_node and hide_smmu_iort.


My point stands... you could have passed iort_base_ptr as an extra 
parameter of the functions. Or even use kinfo.


Anyway, at the moment I don't see any reason to have this global 
variable.



ok, I will pass it as a parameter.




 static void __init parse_dom0_mem(const char *s)
 {
@@ -1336,6 +1337,96 @@ static int prepare_dtb(struct domain *d, struct
kernel_info *kinfo)
 #ifdef CONFIG_ACPI
 #define ACPI_DOM0_FDT_MIN_SIZE 4096

+static void patch_output_ref(struct acpi_iort_id_mapping *pci_idmap,
+  struct acpi_iort_node *smmu_node)
+{
+struct acpi_iort_id_mapping *idmap = NULL;
+int i;


Newline.

Sure.



+for (i=0; i < smmu_node->mapping_count; i++) {


Please respect Xen coding style... I expect you to fix *all* the places
in the next version.

Also, there is a latent lack of comments within the patch to explain
the logic.


I will add detail comments.

+if(!idmap)
+idmap = (struct acpi_iort_id_mapping*)((u8*)smmu_node
+  + smmu_node->mapping_offset);

+else
+idmap++;
+
+if (pci_idmap->output_base == idmap->input_base) {
+pci_idmap->output_base = idmap->output_base;
+pci_idmap->output_reference = idmap->output_reference;


As I pointed out on the previous thread, you assume that one PCI ID
mapping will end up to be translated to one Device ID mapping and not
split across multiple one. For instance:


The assumption is based on the ACPI tables of two platforms, ThunderX
and ThunderX2.
While the spec does not forbid it, would there be a use case where a
PCI node's ID array would split a range going to the same SMMU?


May I remind you that the goal of Xen is to run on *all* the current 
and future platforms. If the spec says it is allowed, then we should 
do it unless there is a strong reason not to do it.





RC A
 // doesn't use SMMU 0 so just outputs DeviceIDs to ITS GROUP 0
 // Input ID --> Output reference: Output ID
0x-0x --> ITS GROUP 0 : 0x->0x


This is not relevant as this code won't touch RC A.


Can you avoid dismissing any example that doesn't fit your solution?
This is not helpful.

Sure. I will add more description in that case.


Describing the RC is relevant in my example to show a case that your 
solution will not handle.
I will add my rationale here. Hiding the SMMU from the IORT table
requires setting, in the PCI RC ID array, the DeviceID output for each
RID range and the output reference to the ITS group.
RC ID-array elements whose output reference is already an ITS group
rather than an SMMU do not need to be touched.

Based on this rationale I said this is not relevant.
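
To make the split-range case concrete, here is a rough sketch (not the
posted patch) of how one RC->SMMU mapping could be fanned out into one
RC->ITS mapping per overlapping SMMU range. It treats id_count as a
plain count for readability (the IORT spec encodes "number of IDs minus
one"), and assumes the caller provides room in out[]:

static unsigned int split_rc_mapping(const struct acpi_iort_id_mapping *rc,
                                     const struct acpi_iort_node *smmu,
                                     struct acpi_iort_id_mapping *out)
{
    const struct acpi_iort_id_mapping *map =
        (const struct acpi_iort_id_mapping *)
            ((const u8 *)smmu + smmu->mapping_offset);
    unsigned int i, n = 0;

    for ( i = 0; i < smmu->mapping_count; i++, map++ )
    {
        /* Overlap of the RC output (StreamID) range with this SMMU
         * input range. */
        u32 first = max(rc->output_base, map->input_base);
        u32 last = min(rc->output_base + rc->id_count,
                       map->input_base + map->id_count);

        if ( first >= last )
            continue;

        /* Emit one RC mapping for this overlap, pointing at the ITS. */
        out[n].input_base = rc->input_base + (first - rc->output_base);
        out[n].id_count = last - first;
        out[n].output_base = map->output_base + (first - map->input_base);
        out[n].output_reference = map->output_reference;
        out[n].flags = rc->flags;
        n++;
    }

    return n;   /* number of mappings written */
}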



SMMU 0
// Note that range of StreamIDs that map to DeviceIDs excludes
// the NIC 0 DeviceID as it does not generate MSIs
 // Input ID --> Output reference: Output ID
0x-0x01ff --> ITS GROUP 0 : 0x1->0x101ff
0x0200-0x --> ITS GROUP 0 : 0x2->0x207ff


These can come from two different RCs, not necessarily from the same RC.


It is not my point in this example. My point is same RC with spli

Re: [Xen-devel] [RFC] [PATCH] arm-acpi: Hide SMMU from IORT for hardware domain

2017-06-09 Thread Manish Jaggi

On 6/8/2017 6:39 PM, Julien Grall wrote:

Hi Manish,


Hi Julien,

On 08/06/17 13:38, Manish Jaggi wrote:




Spurious line.


This patch disables the smmu node in IORT table for hardware domain.
Also patches the output_base of pci_rc id_array with output_base of
smmu node id_array.


I would have appreciated a bit more description in the commit message 
to explain your logic.



I will add it.



Signed-off-by: Manish Jaggi <mja...@cavium.com>
---
 xen/arch/arm/domain_build.c | 142
+++-


domain_build.c is starting to be really big. I think it is time to 
move some acpi bits outside domain_build.c.



You are right, I also thought that.
How about 3 files:
domain_build.c
acpi_domain_build.c
dt_domain_build.c

 xen/include/acpi/actbl2.h   |   3 +-
 xen/include/asm-arm/acpi.h  |   1 +
 3 files changed, 144 insertions(+), 2 deletions(-)

diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
index d6d6c94..9f41d0e 100644
--- a/xen/arch/arm/domain_build.c
+++ b/xen/arch/arm/domain_build.c
@@ -32,6 +32,7 @@ integer_param("dom0_max_vcpus", opt_dom0_max_vcpus);
 int dom0_11_mapping = 1;

 static u64 __initdata dom0_mem;
+static u8 *iort_base_ptr;


Looking at the code, I don't see any reason to have this global.

If you look a bit closer this is used at multiple places
see fixup_pcirc_node, hide_smmu_iort.




 static void __init parse_dom0_mem(const char *s)
 {
@@ -1336,6 +1337,96 @@ static int prepare_dtb(struct domain *d, struct
kernel_info *kinfo)
 #ifdef CONFIG_ACPI
 #define ACPI_DOM0_FDT_MIN_SIZE 4096

+static void patch_output_ref(struct acpi_iort_id_mapping *pci_idmap,
+  struct acpi_iort_node *smmu_node)
+{
+struct acpi_iort_id_mapping *idmap = NULL;
+int i;


Newline.

Sure.



+for (i=0; i < smmu_node->mapping_count; i++) {


Please respect Xen coding style... I expect you to fix *all* the places
in the next version.


Also, there is a latent lack of comments within the patch to explain 
the logic.



I will add detail comments.

+if(!idmap)
+idmap = (struct acpi_iort_id_mapping*)((u8*)smmu_node
+  + smmu_node->mapping_offset);
+else
+idmap++;
+
+if (pci_idmap->output_base == idmap->input_base) {
+pci_idmap->output_base = idmap->output_base;
+pci_idmap->output_reference = idmap->output_reference;


As I pointed out on the previous thread, you assume that one PCI ID 
mapping will end up to be translated to one Device ID mapping and not 
split across multiple one. For instance:


The assumption is based on the ACPI tables of two platforms, ThunderX
and ThunderX2.
While the spec does not forbid it, would there be a use case where a
PCI node's ID array would split a range going to the same SMMU?



RC A
 // doesn't use SMMU 0 so just outputs DeviceIDs to ITS GROUP 0
 // Input ID --> Output reference: Output ID
0x-0x --> ITS GROUP 0 : 0x->0x


This is not relevant as this code won't touch RC A.

SMMU 0
// Note that range of StreamIDs that map to DeviceIDs excludes
// the NIC 0 DeviceID as it does not generate MSIs
 // Input ID --> Output reference: Output ID
0x-0x01ff --> ITS GROUP 0 : 0x1->0x101ff
0x0200-0x --> ITS GROUP 0 : 0x2->0x207ff


These can come from two different RCs, not necessarily from the same RC.

// SMMU 0 Control interrupt is MSI based
 // Input ID --> Output reference: Output ID
N/A --> ITS GROUP 0 : 0x21

I still don't see anything in the spec preventing that. And I would 
like clarification from your side before going forward. *hint* The 
spec should be quoted *hint*


The spec does not prevent that, but IMHO we need to see which cases are
practically possible and which current platforms support.
Is there any platform which supports that? I can add code for the
combinations, but how will I test it?

[...]





diff --git a/xen/include/acpi/actbl2.h b/xen/include/acpi/actbl2.h
index 42beac4..f180ea5 100644
--- a/xen/include/acpi/actbl2.h
+++ b/xen/include/acpi/actbl2.h
@@ -591,7 +591,8 @@ enum acpi_iort_node_type {
 ACPI_IORT_NODE_NAMED_COMPONENT = 0x01,
 ACPI_IORT_NODE_PCI_ROOT_COMPLEX = 0x02,
 ACPI_IORT_NODE_SMMU = 0x03,
-ACPI_IORT_NODE_SMMU_V3 = 0x04
+ACPI_IORT_NODE_SMMU_V3 = 0x04,
+ACPI_IORT_NODE_RESERVED = 0xff


This is likely a call to a separate patch.


ok.

 };

 struct acpi_iort_id_mapping {
diff --git a/xen/include/asm-arm/acpi.h b/xen/include/asm-arm/acpi.h
index 9f954d3..1cc0167 100644
--- a/xen/include/asm-arm/acpi.h
+++ b/xen/include/asm-arm/acpi.h
@@ -36,6 +36,7 @@ typedef enum {
 TBL_FADT,
 TBL_MADT,
 TBL_STAO,
+TBL_IORT,
 TBL_XSDT,
 TBL_RSDP,
 TBL_EFIT,


Cheers,




___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC v2][PATCH] arm-acpi: Add ITS Support for Dom0

2017-06-09 Thread Manish Jaggi


On 6/8/2017 7:28 PM, Julien Grall wrote:

Hi,

Hello Julien,


Please CC all relevant maintainers.

Sure. Will do in the next patch rev.


On 08/06/17 14:03, Manish Jaggi wrote:




Spurious newline


This patch supports ITS in hardware domain, supports ITS in Xen
when booting with ACPI.

Signed-off-by: Manish Jaggi <mja...@cavium.com>
---
Changes since v1:
- Moved its specific code to gic-v3-its.c
- fixed macros


It sounds like you haven't addressed all my comments. I will repeat
them this time. But next time, I will not bother reviewing your
patch.

*Thanks* for reviewing the patch, I will try to address _all_ the comments




 xen/arch/arm/domain_build.c  |  6 ++--
 xen/arch/arm/gic-v3-its.c| 75
+++-
 xen/arch/arm/gic-v3.c| 10 --
 xen/include/asm-arm/gic_v3_its.h |  6 
 4 files changed, 91 insertions(+), 6 deletions(-)

diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
index 3abacc0..d6d6c94 100644
--- a/xen/arch/arm/domain_build.c
+++ b/xen/arch/arm/domain_build.c
@@ -20,7 +20,7 @@
 #include 
 #include 
 #include 
-


Why did you drop this newline?

I will fix it.



+#include 


Nack. I asked on v1 to separate code between GICv3 and ITS; it is not
for calling gicv3 code directly in the common code.


If you need to call GICv3 specific code, then introduce a callback in 
gic_hw_operations.



Good point, I will add it.

 #include 
 #include 
 #include 
@@ -1804,7 +1804,9 @@ static int estimate_acpi_efi_size(struct domain
*d, struct kernel_info *kinfo)

 madt_size = sizeof(struct acpi_table_madt)
 + sizeof(struct acpi_madt_generic_interrupt) *
d->max_vcpus
-+ sizeof(struct acpi_madt_generic_distributor);
++ sizeof(struct acpi_madt_generic_distributor)
++ gicv3_its_madt_generic_translator_size();


See my comment above.

Will address it.



+
 if ( d->arch.vgic.version == GIC_V3 )
 madt_size += sizeof(struct acpi_madt_generic_redistributor)
  * d->arch.vgic.nr_regions;
diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
index 1fb06ca..937b970 100644
--- a/xen/arch/arm/gic-v3-its.c
+++ b/xen/arch/arm/gic-v3-its.c
@@ -25,14 +25,18 @@
 #include 
 #include 
 #include 
+#include 


The includes are ordered alphabetically, please respect it.


Sure. I will fix it.

 #include 
 #include 
 #include 
 #include 
 #include 
+#include 
+#include 
+#include 


Ditto.


Sure. I will fix it.


 #define ITS_CMD_QUEUE_SZ SZ_1M
-


Again, we don't drop newline for no reason.

I will fix it.



+#define ACPI_GICV3_ITS_MEM_SIZE (SZ_64K)
 /*
  * No lock here, as this list gets only populated upon boot while 
scanning
  * firmware tables for all host ITSes, and only gets iterated 
afterwards.

@@ -920,6 +924,55 @@ int gicv3_lpi_change_vcpu(struct domain *d, paddr_t
vdoorbell,
 return 0;
 }

+int gicv3_its_deny_access(const struct domain *d)
+{
+int rc = 0;
+unsigned long mfn, nr;
+const struct host_its *its_data;
+
+list_for_each_entry(its_data, &host_its_list, entry)
+{
+mfn = paddr_to_pfn(its_data->addr);
+nr = PFN_UP(ACPI_GICV3_ITS_MEM_SIZE);
+rc = iomem_deny_access(d, mfn, mfn + nr);
+if ( rc )
+goto end;


Hmmm, why not using a break here rather than a goto?

I can use break, np.



+}
+end:
+return rc;
+}
+
+u32 gicv3_its_madt_generic_translator_size(void)
+{
+const struct host_its *its_data;
+u32 size = 0;
+
+list_for_each_entry(its_data, &host_its_list, entry)
+{


Pointless {


+size += sizeof(struct acpi_madt_generic_translator);
+}

Just for readability of code.


Same here + add a newline.


Sure.

+return size;
+}
+
+u32 gicv3_its_make_hwdom_madt(u8 *base_ptr, u32 offset)
+{
+struct acpi_madt_generic_translator *gic_its;
+const struct host_its *its_data;
+u32 table_len = offset, size;
+
+/* Update GIC ITS information in hardware domain's MADT */
+list_for_each_entry(its_data, &host_its_list, entry)
+{
+size = sizeof(struct acpi_madt_generic_translator);
+gic_its = (struct acpi_madt_generic_translator *)(base_ptr +
table_len);


This line is likely too long.


I will check it.

+gic_its->header.type = ACPI_MADT_TYPE_GENERIC_TRANSLATOR;
+gic_its->header.length = size;
+gic_its->base_address = its_data->addr;


On the previous patch you had:

gic_its->translation_id = its_data->translation_id;

I asked to explain why you need to have the same ID as the host. And 
now you dropped it. This does not match the spec (Table 5-67 in ACPI 
6.1):


"GIC ITS ID. In a system with multiple GIC ITS units, this value must
be unique to each one."

But here, the ITS ID will not be unique. So why did you dropped it?

The reason I dropped it from its_data is that I was not setting it. So it 
doesn't b

[Xen-devel] [RFC v2][PATCH] arm-acpi: Add ITS Support for Dom0

2017-06-08 Thread Manish Jaggi


This patch supports ITS in hardware domain, supports ITS in Xen
when booting with ACPI.

Signed-off-by: Manish Jaggi <mja...@cavium.com>
---
Changes since v1:
- Moved its specific code to gic-v3-its.c
- fixed macros

 xen/arch/arm/domain_build.c  |  6 ++--
 xen/arch/arm/gic-v3-its.c| 75 
+++-

 xen/arch/arm/gic-v3.c| 10 --
 xen/include/asm-arm/gic_v3_its.h |  6 
 4 files changed, 91 insertions(+), 6 deletions(-)

diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
index 3abacc0..d6d6c94 100644
--- a/xen/arch/arm/domain_build.c
+++ b/xen/arch/arm/domain_build.c
@@ -20,7 +20,7 @@
 #include 
 #include 
 #include 
-
+#include 
 #include 
 #include 
 #include 
@@ -1804,7 +1804,9 @@ static int estimate_acpi_efi_size(struct domain 
*d, struct kernel_info *kinfo)


 madt_size = sizeof(struct acpi_table_madt)
 + sizeof(struct acpi_madt_generic_interrupt) * 
d->max_vcpus

-+ sizeof(struct acpi_madt_generic_distributor);
++ sizeof(struct acpi_madt_generic_distributor)
++ gicv3_its_madt_generic_translator_size();
+
 if ( d->arch.vgic.version == GIC_V3 )
 madt_size += sizeof(struct acpi_madt_generic_redistributor)
  * d->arch.vgic.nr_regions;
diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
index 1fb06ca..937b970 100644
--- a/xen/arch/arm/gic-v3-its.c
+++ b/xen/arch/arm/gic-v3-its.c
@@ -25,14 +25,18 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
 #include 
 #include 
+#include 
+#include 
+#include 

 #define ITS_CMD_QUEUE_SZ SZ_1M
-
+#define ACPI_GICV3_ITS_MEM_SIZE (SZ_64K)
 /*
  * No lock here, as this list gets only populated upon boot while scanning
  * firmware tables for all host ITSes, and only gets iterated afterwards.
@@ -920,6 +924,55 @@ int gicv3_lpi_change_vcpu(struct domain *d, paddr_t 
vdoorbell,

 return 0;
 }

+int gicv3_its_deny_access(const struct domain *d)
+{
+int rc = 0;
+unsigned long mfn, nr;
+const struct host_its *its_data;
+
+list_for_each_entry(its_data, &host_its_list, entry)
+{
+mfn = paddr_to_pfn(its_data->addr);
+nr = PFN_UP(ACPI_GICV3_ITS_MEM_SIZE);
+rc = iomem_deny_access(d, mfn, mfn + nr);
+if ( rc )
+goto end;
+}
+end:
+return rc;
+}
+
+u32 gicv3_its_madt_generic_translator_size(void)
+{
+const struct host_its *its_data;
+u32 size = 0;
+
+list_for_each_entry(its_data, &host_its_list, entry)
+{
+size += sizeof(struct acpi_madt_generic_translator);
+}
+return size;
+}
+
+u32 gicv3_its_make_hwdom_madt(u8 *base_ptr, u32 offset)
+{
+struct acpi_madt_generic_translator *gic_its;
+const struct host_its *its_data;
+u32 table_len = offset, size;
+
+/* Update GIC ITS information in hardware domain's MADT */
+list_for_each_entry(its_data, &host_its_list, entry)
+{
+size = sizeof(struct acpi_madt_generic_translator);
+gic_its = (struct acpi_madt_generic_translator *)(base_ptr + 
table_len);

+gic_its->header.type = ACPI_MADT_TYPE_GENERIC_TRANSLATOR;
+gic_its->header.length = size;
+gic_its->base_address = its_data->addr;
+table_len +=  size;
+}
+return table_len;
+}
+
 /*
  * Create the respective guest DT nodes from a list of host ITSes.
  * This copies the reg property, so the guest sees the ITS at the same 
address
@@ -992,6 +1045,26 @@ int gicv3_its_make_hwdom_dt_nodes(const struct 
domain *d,

 return res;
 }

+int gicv3_its_acpi_init(struct acpi_subtable_header *header, const 
unsigned long end)

+{
+struct acpi_madt_generic_translator *its_entry;
+struct host_its *its_data;
+
+its_data = xzalloc(struct host_its);
+if (!its_data)
+return -1;
+
+its_entry = (struct acpi_madt_generic_translator *)header;
+its_data->addr  = its_entry->base_address;
+its_data->size = ACPI_GICV3_ITS_MEM_SIZE;
+
+spin_lock_init(&its_data->cmd_lock);
+
+printk("GICv3: Found ITS @0x%lx\n", its_data->addr);
+
+list_add_tail(&its_data->entry, &host_its_list);
+return 0;
+}
 /* Scan the DT for any ITS nodes and create a list of host ITSes out 
of it. */

 void gicv3_its_dt_init(const struct dt_device_node *node)
 {
diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
index c927306..f0f6d12 100644
--- a/xen/arch/arm/gic-v3.c
+++ b/xen/arch/arm/gic-v3.c
@@ -1333,9 +1333,8 @@ static int gicv3_iomem_deny_access(const struct 
domain *d)

 return iomem_deny_access(d, mfn, mfn + nr);
 }

-return 0;
+return gicv3_its_deny_access(d);
 }
-
 #ifdef CONFIG_ACPI
 static void __init
 gic_acpi_add_rdist_region(paddr_t base, paddr_t size, bool single_rdist)
@@ -1374,6 +1373,7 @@ static int gicv3_make_hwdom_madt(const struct 
domain *d, u32 offset)

 for ( i =

[Xen-devel] [RFC] [PATCH] arm-acpi: Hide SMMU from IORT for hardware domain

2017-06-08 Thread Manish Jaggi


This patch disables the smmu node in IORT table for hardware domain.
Also patches the output_base of pci_rc id_array with output_base of
smmu node id_array.

Signed-off-by: Manish Jaggi <mja...@cavium.com>
---
 xen/arch/arm/domain_build.c | 142 
+++-

 xen/include/acpi/actbl2.h   |   3 +-
 xen/include/asm-arm/acpi.h  |   1 +
 3 files changed, 144 insertions(+), 2 deletions(-)

diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
index d6d6c94..9f41d0e 100644
--- a/xen/arch/arm/domain_build.c
+++ b/xen/arch/arm/domain_build.c
@@ -32,6 +32,7 @@ integer_param("dom0_max_vcpus", opt_dom0_max_vcpus);
 int dom0_11_mapping = 1;

 static u64 __initdata dom0_mem;
+static u8 *iort_base_ptr;

 static void __init parse_dom0_mem(const char *s)
 {
@@ -1336,6 +1337,96 @@ static int prepare_dtb(struct domain *d, struct 
kernel_info *kinfo)

 #ifdef CONFIG_ACPI
 #define ACPI_DOM0_FDT_MIN_SIZE 4096

+static void patch_output_ref(struct acpi_iort_id_mapping *pci_idmap,
+  struct acpi_iort_node *smmu_node)
+{
+struct acpi_iort_id_mapping *idmap = NULL;
+int i;
+for (i=0; i < smmu_node->mapping_count; i++) {
+if(!idmap)
+idmap = (struct acpi_iort_id_mapping*)((u8*)smmu_node
+  + smmu_node->mapping_offset);
+else
+idmap++;
+
+if (pci_idmap->output_base == idmap->input_base) {
+pci_idmap->output_base = idmap->output_base;
+pci_idmap->output_reference = idmap->output_reference;
+}
+}
+}
+
+static void fixup_pcirc_node(struct acpi_iort_node *node)
+{
+struct acpi_iort_id_mapping *idmap = NULL;
+struct acpi_iort_node *onode;
+int i=0;
+
+for (i=0; i < node->mapping_count; i++) {
+if(!idmap)
+idmap = (struct acpi_iort_id_mapping*)((u8*)node +
+  + node->mapping_offset);
+else
+idmap++;
+
+onode = (struct acpi_iort_node*)(iort_base_ptr +
+ idmap->output_reference);
+switch (onode->type)
+{
+case ACPI_IORT_NODE_ITS_GROUP:
+continue;
+case ACPI_IORT_NODE_SMMU:
+case ACPI_IORT_NODE_SMMU_V3:
+ patch_output_ref(idmap, onode);
+break;
+}
+}
+}
+
+static int hide_smmu_iort(void)
+{
+u32 i;
+u32 node_offset = 0;
+struct acpi_table_iort *iort_table;
+struct acpi_iort_node *node = NULL;
+
+iort_table = (struct acpi_table_iort *)iort_base_ptr;
+
+for (i=0; i < iort_table->node_count; i++) {
+if (!node){
+node = (struct acpi_iort_node *)(iort_base_ptr +
+ iort_table->node_offset);
+node_offset =  iort_table->node_offset;
+} else {
+node = (struct acpi_iort_node *)(iort_base_ptr +
+ node_offset);
+}
+
+node_offset +=  node->length;
+if (node->type == ACPI_IORT_NODE_PCI_ROOT_COMPLEX)
+fixup_pcirc_node(node);
+}
+
+node_offset = 0;
+node = NULL;
+for (i=0; i < iort_table->node_count; i++) {
+if (!node){
+node = (struct acpi_iort_node *)(iort_base_ptr +
+ iort_table->node_offset);
+node_offset =  iort_table->node_offset;
+} else {
+node = (struct acpi_iort_node *)(iort_base_ptr +
+ node_offset);
+}
+node_offset +=  node->length;
+if ((node->type == ACPI_IORT_NODE_SMMU) ||
+ (node->type == ACPI_IORT_NODE_SMMU_V3))
+node->type = ACPI_IORT_NODE_RESERVED;
+}
+
+return 0;
+}
+
 static int acpi_iomem_deny_access(struct domain *d)
 {
 acpi_status status;
@@ -1348,7 +1439,12 @@ static int acpi_iomem_deny_access(struct domain *d)
 if ( rc )
 return rc;

-/* TODO: Deny MMIO access for SMMU, GIC ITS */
+/* Hide SMMU from IORT */
+rc = hide_smmu_iort();
+if (rc)
+return rc;
+
+/* Deny MMIO access for GIC ITS */
 status = acpi_get_table(ACPI_SIG_SPCR, 0,
 (struct acpi_table_header **));

@@ -1646,6 +1742,8 @@ static int acpi_create_xsdt(struct domain *d, 
struct membank tbl_add[])

ACPI_SIG_FADT, tbl_add[TBL_FADT].start);
 acpi_xsdt_modify_entry(xsdt->table_offset_entry, entry_count,
ACPI_SIG_MADT, tbl_add[TBL_MADT].start);
+acpi_xsdt_modify_entry(xsdt->table_offset_entry, entry_count,
+   ACPI_SIG_IORT, tbl_add[TBL_IORT].start);
 xsdt->table_offset_entry[entry_count] = tbl_add[TBL_STAO].start;

 xsdt->header.length = table_size;
@@ -1794,11 +1892,23 @@ static int estimate_acpi_efi_size(struct domain 
*d, struct kernel_info *kinfo)

 {
 size

Re: [Xen-devel] xen/arm: Hiding SMMUs from Dom0 when using ACPI on Xen

2017-06-08 Thread Manish Jaggi



On 5/19/2017 1:39 AM, Julien Grall wrote:



On 18/05/2017 21:02, Manish Jaggi wrote:

In the IORT table, using the PCI-RC node, SMMU node and ITS node, the
RID->StreamID->DeviceID mapping can be generated.
As per the IORT spec today, the same RID can be mapped to different
StreamIDs using two ID Array elements with the same RID range but
different output references.
There exists no use case for such a scenario, hence a clarification is
required in the IORT spec stating that RID ranges cannot overlap in the
ID array.


I understand that.



With this clarification in place, it is straightforward to map a RID to a
DeviceID by replacing the PCI-RC mapping's output with the corresponding
SMMU mapping's output.


I am not sure to follow your suggestion here. But I will wait a patch 
before commenting.



Please see [RFC] [PATCH] arm-acpi: Hide SMMU from IORT for hardware domain

Cheers,




___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel



Re: [Xen-devel] [RFC] [PATCH] arm64-its: Add ITS support for ACPI dom0

2017-06-08 Thread Manish Jaggi

Hi Julien,

On 5/30/2017 4:07 PM, Julien Grall wrote:

Hello Manish,

On 30/05/17 07:07, Manish Jaggi wrote:

This patch is an RFC on top of Andre's v10 series.
https://www.mail-archive.com/xen-devel@lists.xen.org/msg109093.html

This patch deny's access to ITS region for the guest and also updates


s/deny's/denies/


the acpi tables for dom0.


This patch is doing more than supporting ITS in the hardware domain.
It also allows support of ITS in Xen when booting using ACPI.




Signed-off-by: Manish Jaggi <mja...@cavium.com>
---
 xen/arch/arm/gic-v3.c| 49

 xen/include/asm-arm/gic_v3_its.h |  1 +
 2 files changed, 50 insertions(+)

diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
index c927306..f496fc1 100644
--- a/xen/arch/arm/gic-v3.c
+++ b/xen/arch/arm/gic-v3.c
@@ -1301,6 +1301,7 @@ static int gicv3_iomem_deny_access(const struct
domain *d)
 {
 int rc, i;
 unsigned long mfn, nr;
+const struct host_its *its_data;

 mfn = dbase >> PAGE_SHIFT;
 nr = DIV_ROUND_UP(SZ_64K, PAGE_SIZE);
@@ -1333,6 +1334,16 @@ static int gicv3_iomem_deny_access(const struct
domain *d)
 return iomem_deny_access(d, mfn, mfn + nr);
 }


If GICv2 is supported, the function will bail out as soon as the virtual
base region is denied (see just above).

Didn't get your point; gicv2 already has a similar function,
gicv2_iomem_deny_access. Can you please elaborate?

I am sending a v2 version on patch incorporating other comments.


+/* deny for ITS as well */
+list_for_each_entry(its_data, _its_list, entry)
+{
+mfn = its_data->addr >> PAGE_SHIFT;


Please don't open-code the shift and using paddr_to_pfn(...).

ok.



+nr = DIV_ROUND_UP(SZ_128K, PAGE_SIZE);


Please use PFN_UP rather than DIV_ROUND_UP(...).

ok


Also, where does the SZ_128K come from?


+rc = iomem_deny_access(d, mfn, mfn + nr);
+if ( rc )
+return rc;
+}


No implementation of ITS specific code in the GICv3 driver please.
Instead introduce a helper for that.


+
 return 0;
 }

@@ -1357,8 +1368,10 @@ static int gicv3_make_hwdom_madt(const struct
domain *d, u32 offset)
 struct acpi_subtable_header *header;
 struct acpi_madt_generic_interrupt *host_gicc, *gicc;
 struct acpi_madt_generic_redistributor *gicr;
+struct acpi_madt_generic_translator *gic_its;
 u8 *base_ptr = d->arch.efi_acpi_table + offset;
 u32 i, table_len = 0, size;
+const struct host_its *its_data;


See my comment above regarding ITS specific code.


 /* Add Generic Interrupt */
 header =
acpi_table_get_entry_madt(ACPI_MADT_TYPE_GENERIC_INTERRUPT, 0);
@@ -1374,6 +1387,7 @@ static int gicv3_make_hwdom_madt(const struct
domain *d, u32 offset)
 for ( i = 0; i < d->max_vcpus; i++ )
 {
 gicc = (struct acpi_madt_generic_interrupt *)(base_ptr + 
table_len);

+


Spurious change.


 ACPI_MEMCPY(gicc, host_gicc, size);
 gicc->cpu_interface_number = i;
 gicc->uid = i;
@@ -1399,6 +1413,18 @@ static int gicv3_make_hwdom_madt(const struct
domain *d, u32 offset)
 gicr->length = d->arch.vgic.rdist_regions[i].size;
 table_len += size;
 }
+
+/* Update GIC ITS information in dom0 madt */


s/dom0/hardware domain/
s/madt/MADT/

Also, likely you want to make sure you have space in efi_acpi_table 
(see estimate_acpi_efi_size).



+list_for_each_entry(its_data, _its_list, entry)
+{
+size = sizeof(struct acpi_madt_generic_translator);
+gic_its = (struct acpi_madt_generic_translator *)(base_ptr +
table_len);
+gic_its->header.type = ACPI_MADT_TYPE_GENERIC_TRANSLATOR;
+gic_its->header.length = size;
+gic_its->base_address = its_data->addr;
+gic_its->translation_id = its_data->translation_id;


Please explain why you need to have the same ID as the host.


+table_len +=  size;
+}

 return table_len;
 }
@@ -1511,6 +1537,25 @@ gic_acpi_get_madt_redistributor_num(struct
acpi_subtable_header *header,
  */
 return 0;
 }


Newline here.


+#define ACPI_GICV3_ITS_MEM_SIZE (SZ_128K)
+
+int  gicv3_its_acpi_init(struct acpi_subtable_header *header, const 
unsigned long end)


Why this is not static?


+{


Same remark as above regarding ITS specific code.


+struct acpi_madt_generic_translator *its_entry;
+struct host_its *its_data;
+
+its_data = xzalloc(struct host_its);


What if xzalloc fails?



+its_entry = (struct acpi_madt_generic_translator *)header;
+its_data->addr  = its_entry->base_address;
+its_data->size = ACPI_GICV3_ITS_MEM_SIZE;
+
+spin_lock_init(_data->cmd_lock);
+
+printk("GICv3: Found ITS @0x%lx\n", its_data->addr);
+
+list_add_tail(_data->entry, _its_list);


Likely you could re-use/factorize a part of gicv3_its_dt_init to avoid 
implementing twice the initi

[Xen-devel] [RFC] [PATCH] arm64-its: Add ITS support for ACPI dom0

2017-05-30 Thread Manish Jaggi

This patch is an RFC on top of Andre's v10 series.
https://www.mail-archive.com/xen-devel@lists.xen.org/msg109093.html

This patch deny's access to ITS region for the guest and also updates
the acpi tables for dom0.

Signed-off-by: Manish Jaggi <mja...@cavium.com>
---
 xen/arch/arm/gic-v3.c| 49 
 xen/include/asm-arm/gic_v3_its.h |  1 +
 2 files changed, 50 insertions(+)

diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
index c927306..f496fc1 100644
--- a/xen/arch/arm/gic-v3.c
+++ b/xen/arch/arm/gic-v3.c
@@ -1301,6 +1301,7 @@ static int gicv3_iomem_deny_access(const struct domain *d)
 {
 int rc, i;
 unsigned long mfn, nr;
+const struct host_its *its_data;
 
 mfn = dbase >> PAGE_SHIFT;

 nr = DIV_ROUND_UP(SZ_64K, PAGE_SIZE);
@@ -1333,6 +1334,16 @@ static int gicv3_iomem_deny_access(const struct domain 
*d)
 return iomem_deny_access(d, mfn, mfn + nr);
 }
 
+/* deny for ITS as well */

+list_for_each_entry(its_data, &host_its_list, entry)
+{
+mfn = its_data->addr >> PAGE_SHIFT;
+nr = DIV_ROUND_UP(SZ_128K, PAGE_SIZE);
+rc = iomem_deny_access(d, mfn, mfn + nr);
+if ( rc )
+return rc;
+}
+
 return 0;
 }
 
@@ -1357,8 +1368,10 @@ static int gicv3_make_hwdom_madt(const struct domain *d, u32 offset)

 struct acpi_subtable_header *header;
 struct acpi_madt_generic_interrupt *host_gicc, *gicc;
 struct acpi_madt_generic_redistributor *gicr;
+struct acpi_madt_generic_translator *gic_its;
 u8 *base_ptr = d->arch.efi_acpi_table + offset;
 u32 i, table_len = 0, size;
+const struct host_its *its_data;
 
 /* Add Generic Interrupt */

 header = acpi_table_get_entry_madt(ACPI_MADT_TYPE_GENERIC_INTERRUPT, 0);
@@ -1374,6 +1387,7 @@ static int gicv3_make_hwdom_madt(const struct domain *d, 
u32 offset)
 for ( i = 0; i < d->max_vcpus; i++ )
 {
 gicc = (struct acpi_madt_generic_interrupt *)(base_ptr + table_len);
+
 ACPI_MEMCPY(gicc, host_gicc, size);
 gicc->cpu_interface_number = i;
 gicc->uid = i;
@@ -1399,6 +1413,18 @@ static int gicv3_make_hwdom_madt(const struct domain *d, 
u32 offset)
 gicr->length = d->arch.vgic.rdist_regions[i].size;
 table_len += size;
 }
+
+/* Update GIC ITS information in dom0 madt */
+list_for_each_entry(its_data, &host_its_list, entry)
+{
+size = sizeof(struct acpi_madt_generic_translator);
+gic_its = (struct acpi_madt_generic_translator *)(base_ptr + 
table_len);
+gic_its->header.type = ACPI_MADT_TYPE_GENERIC_TRANSLATOR;
+gic_its->header.length = size;
+gic_its->base_address = its_data->addr;
+gic_its->translation_id = its_data->translation_id;
+table_len +=  size;
+}
 
 return table_len;

 }
@@ -1511,6 +1537,25 @@ gic_acpi_get_madt_redistributor_num(struct 
acpi_subtable_header *header,
  */
 return 0;
 }
+#define ACPI_GICV3_ITS_MEM_SIZE (SZ_128K)
+
+int  gicv3_its_acpi_init(struct acpi_subtable_header *header, const unsigned 
long end)
+{
+struct acpi_madt_generic_translator *its_entry;
+struct host_its *its_data;
+
+its_data = xzalloc(struct host_its);
+its_entry = (struct acpi_madt_generic_translator *)header;
+its_data->addr  = its_entry->base_address;
+its_data->size = ACPI_GICV3_ITS_MEM_SIZE;
+
+spin_lock_init(&its_data->cmd_lock);
+
+printk("GICv3: Found ITS @0x%lx\n", its_data->addr);
+
+list_add_tail(&its_data->entry, &host_its_list);
+return 0;
+}
 
 static void __init gicv3_acpi_init(void)

 {
@@ -1567,6 +1612,9 @@ static void __init gicv3_acpi_init(void)
 
 gicv3.rdist_stride = 0;
 
+acpi_table_parse_madt(ACPI_MADT_TYPE_GENERIC_TRANSLATOR,

+  gicv3_its_acpi_init, 0);
+
 /*
  * In ACPI, 0 is considered as the invalid address. However the rest
  * of the initialization rely on the invalid address to be
@@ -1585,6 +1633,7 @@ static void __init gicv3_acpi_init(void)
 else
 vsize = GUEST_GICC_SIZE;
 
+

 }
 #else
 static void __init gicv3_acpi_init(void) { }
diff --git a/xen/include/asm-arm/gic_v3_its.h b/xen/include/asm-arm/gic_v3_its.h
index d2a3e53..c92cdb9 100644
--- a/xen/include/asm-arm/gic_v3_its.h
+++ b/xen/include/asm-arm/gic_v3_its.h
@@ -125,6 +125,7 @@ struct host_its {
 spinlock_t cmd_lock;
 void *cmd_buf;
 unsigned int flags;
+u32 translation_id;
 };
 
 
--

2.7.4



___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC] ARM PCI Passthrough design document

2017-05-29 Thread Manish Jaggi

Hi Julien,

On 5/29/2017 11:44 PM, Julien Grall wrote:



On 05/29/2017 03:30 AM, Manish Jaggi wrote:

Hi Julien,


Hello Manish,


On 5/26/2017 10:44 PM, Julien Grall wrote:
PCI pass-through allows the guest to receive full control of 
physical PCI
devices. This means the guest will have full and direct access to 
the PCI

device.

ARM is supporting a kind of guest that exploits as much as possible
virtualization support in hardware. The guest will rely on PV driver 
only

for IO (e.g block, network) and interrupts will come through the
virtualized
interrupt controller, therefore there are no big changes required
within the
kernel.

As a consequence, it would be possible to replace PV drivers by
assigning real
devices to the guest for I/O access. Xen on ARM would therefore be
able to
run unmodified operating system.

To achieve this goal, it looks more sensible to go towards emulating 
the

host bridge (there will be more details later).

IIUC this means that domU would have an emulated host bridge and dom0
will see the actual host bridge?


You don't want the hardware domain and Xen accessing the configuration 
space at the same time. So if Xen is in charge of the host bridge, 
then an emulated host bridge should be exposed to the hardware domain.
I believe in the x86 case both dom0 and Xen do access the config space, 
in the context of the PCI device add hypercall.

That's when the pci_config_XXX functions in Xen are called.


Although, this is depending on who is in charge of the the host 
bridge. As you may have noticed, this design document is proposing two 
ways to handle configuration space access. At the moment any generic 
host bridge (see the definition in the design document) will be 
handled in Xen and the hardware domain will have an emulated host bridge.


So in the case of a generic host bridge, Xen will manage the config space 
and provide an emulated interface to dom0, and accesses would be trapped 
by Xen.
Essentially the goal is to scan all PCI devices and register them with 
Xen (which in turn will configure the SMMU); a sketch follows below.
For a generic host bridge, this can be done in either dom0 or Xen. The 
only doubt here is what extra benefit the emulated host bridge gives in 
the dom0 case.
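
For illustration, a minimal sketch of that phase-1 registration from the
hardware domain's side. It assumes ARM would reuse the existing
PHYSDEVOP_pci_device_add hypercall known from x86 (an assumption; the
design document leaves the exact interface open):

/* Sketch only: assumes ARM reuses the x86 PHYSDEVOP_pci_device_add
 * hypercall; error handling trimmed. */
static int xen_register_pci_device(uint16_t seg, uint8_t bus, uint8_t devfn)
{
    struct physdev_pci_device_add add = {
        .seg = seg,
        .bus = bus,
        .devfn = devfn,
    };

    /* Xen can then derive the StreamID/DeviceID for this device from
     * the host IORT and program the SMMU and the ITS accordingly. */
    return HYPERVISOR_physdev_op(PHYSDEVOP_pci_device_add, &add);
}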


If your host bridge is not a generic one, then the hardware domain 
will be in charge of the host bridge, and any configuration access from 
Xen will be forwarded to the hardware domain.


At the moment, as part of the first implementation, we are only 
looking to implement a generic host bridge in Xen. We will decide on 
case by case basis for all the other host bridges whether we want to 
have the driver in Xen.

agreed.


[...]


## IOMMU

The IOMMU will be used to isolate the PCI device when accessing the
memory (e.g
DMA and MSI Doorbells). Often the IOMMU will be configured using a
MasterID
(aka StreamID for ARM SMMU)  that can be deduced from the SBDF with
the help
of the firmware tables (see below).

Whilst in theory, all the memory transactions issued by a PCI device
should
go through the IOMMU, on certain platforms some of the memory
transaction may
not reach the IOMMU because they are interpreted by the host bridge. 
For

instance, this could happen if the MSI doorbell is built into the PCI
host
bridge or for P2P traffic. See [6] for more details.

XXX: I think this could be solved by using direct mapping (e.g GFN ==
MFN),
this would mean the guest memory layout would be similar to the host
one when
PCI devices will be pass-throughed => Detail it.
In the example given in the IORT spec, for PCI devices not behind an 
SMMU,

how would the writes from the device be protected?


I realize the XXX paragraph is quite confusing. I am not trying to 
solve the problem where PCI devices are not protected behind an SMMU 
but platform where some transactions (e.g P2P or MSI doorbell access) 
are by-passing the SMMU.


You may still want to allow PCI passthrough in that case, because you 
know that P2P cannot be done (or can be disabled) and MSI 
doorbell access is protected (for instance a write to the ITS doorbell 
will be tagged with the device by the hardware). In order to support 
such platforms you need to direct map the doorbell (e.g. GFN == MFN) and 
carve out the P2P region from the guest memory map. Hence the 
suggestion to re-use the host memory layout for the guest.


Note that it does not mean the RAM region will be direct mapped. It is 
only there to ease carving out memory region by-passed by the SMMU.
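
A minimal sketch of that direct mapping for a single doorbell page,
assuming Xen's map_mmio_regions() helper (the GFN == MFN identity is
the point being made above):

/* Sketch: map the MSI doorbell page at GFN == MFN so that writes
 * which by-pass the SMMU still land where the guest expects them. */
static int direct_map_doorbell(struct domain *d, paddr_t doorbell)
{
    return map_mmio_regions(d, _gfn(paddr_to_pfn(doorbell)),
                            1, _mfn(paddr_to_pfn(doorbell)));
}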


[...]


## ACPI

### Host bridges

The static table MCFG (see 4.2 in [1]) will describe the host bridges
available
at boot and supporting ECAM. Unfortunately, there are platforms out 
there

(see [2]) that re-use MCFG to describe host bridge that are not fully
ECAM
compatible.

This means that Xen needs to account for possible quirks in the host
bridge.
The Linux community are working on a patch series for this, see [2]
and [3],
where quirks will be detected with:
 * OEM ID
 * OEM Table ID
 * OEM Revision
 * PCI Segment
 * PCI bus number range
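
A sketch of what such a quirk-match table could look like (names are
illustrative, loosely modelled on the Linux series in [2]/[3], not an
existing Xen structure):

struct mcfg_quirk {
    /* Match keys, taken from the MCFG table header. */
    char oem_id[7];                 /* ACPI OEM ID + NUL */
    char oem_table_id[9];           /* ACPI OEM table ID + NUL */
    u32 oem_revision;
    /* Scope of the quirk. */
    u16 segment;                    /* PCI segment */
    u8 bus_start, bus_end;          /* PCI bus number range */
    /* Replacement config space accessors (hypothetical ops type). */
    const struct pci_config_ops *ops;
};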

Re: [Xen-devel] [RFC] ARM PCI Passthrough design document

2017-05-28 Thread Manish Jaggi

Hi Julien,

On 5/26/2017 10:44 PM, Julien Grall wrote:

Hi all,

The document below is an RFC version of a design proposal for PCI
Passthrough in Xen on ARM. It aims to describe from an high level perspective
the interaction with the different subsystems and how guest will be able
to discover and access PCI.

Currently on ARM, Xen does not have any knowledge about PCI devices. This
means that IOMMU and interrupt controller (such as ITS) requiring specific
configuration will not work with PCI even with DOM0.

The PCI Passthrough work could be divided in 2 phases:
 * Phase 1: Register all PCI devices in Xen => will allow
to use ITS and SMMU with PCI in Xen
 * Phase 2: Assign devices to guests

This document aims to describe the 2 phases, but for now only phase
1 is fully described.


I think I was able to gather all of the feedbacks and come up with a solution
that will satisfy all the parties. The design document has changed quite a lot
compare to the early draft sent few months ago. The major changes are:
* Provide more details how PCI works on ARM and the interactions with
MSI controller and IOMMU
* Provide details on the existing host bridge implementations
* Give more explanation and justifications on the approach chosen
* Describing the hypercalls used and how they should be called

Feedbacks are welcomed.

Cheers,



% PCI pass-through support on ARM
% Julien Grall 
% Draft B

# Preface

This document aims to describe the components required to enable the PCI
pass-through on ARM.

This is an early draft and some questions are still unanswered. When this is
the case, the text will contain XXX.

# Introduction

PCI pass-through allows the guest to receive full control of physical PCI
devices. This means the guest will have full and direct access to the PCI
device.

ARM is supporting a kind of guest that exploits as much as possible
virtualization support in hardware. The guest will rely on PV driver only
for IO (e.g block, network) and interrupts will come through the virtualized
interrupt controller, therefore there are no big changes required within the
kernel.

As a consequence, it would be possible to replace PV drivers by assigning real
devices to the guest for I/O access. Xen on ARM would therefore be able to
run unmodified operating system.

To achieve this goal, it looks more sensible to go towards emulating the
host bridge (there will be more details later).
IIUC this means that domU would have an emulated host bridge and dom0 
will see the actual host bridge?

  A guest would be able to take
advantage of the firmware tables, obviating the need for a specific driver
for Xen.

Thus, in this document we follow the emulated host bridge approach.

# PCI terminologies

Each PCI device under a host bridge is uniquely identified by its Requester ID
(AKA RID). A Requester ID is a triplet of Bus number, Device number, and
Function.

When the platform has multiple host bridges, the software can add a fourth
number called Segment (sometimes called Domain) to differentiate host bridges.
A PCI device will then be uniquely identified by segment:bus:device:function (AKA SBDF).

So given a specific SBDF, it would be possible to find the host bridge and the
RID associated to a PCI device. The pair (host bridge, RID) will often be used
to find the relevant information for configuring the different subsystems (e.g
IOMMU, MSI controller). For convenience, the rest of the document will use
SBDF to refer to the pair (host bridge, RID).

# PCI host bridge

A PCI host bridge enables data transfer between a host processor and PCI bus
based devices. The bridge is used to access the configuration space of each
PCI device and, on some platforms, may also act as an MSI controller.

## Initialization of the PCI host bridge

Whilst it would be expected that the bootloader takes care of initializing
the PCI host bridge, on some platforms it is done in the Operating System.

This may include enabling/configuring the clocks that could be shared among
multiple devices.

## Accessing PCI configuration space

Accessing the PCI configuration space can be divided into 2 categories:
 * Indirect access, where the configuration spaces are multiplexed. An
 example would be the legacy method on x86 (e.g. 0xcf8 and 0xcfc). On ARM a
 similar method is used by the PCIe RCar root complex (see [12]).
 * ECAM access, each configuration space will have its own address space.

Whilst ECAM is a standard, some PCI host bridges will require specific
fiddling when accessing the registers (see thunder-ecam [13]).
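
For reference, a sketch of the standard ECAM address computation; note
that the per-function offset is simply the RID shifted left by 12 bits:

/* ECAM: each function gets a 4K configuration space. */
static void *ecam_config_addr(void *base, uint8_t bus, uint8_t dev,
                              uint8_t fn, uint16_t reg)
{
    uint32_t rid = ((uint32_t)bus << 8) | ((uint32_t)dev << 3) | fn;

    return (uint8_t *)base + (((uint32_t)rid << 12) | (reg & 0xfff));
}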

In most of the cases, accessing all the PCI configuration spaces under a
given PCI host will be done the same way (i.e either indirect access or ECAM
access). However, there are a few cases, dependent on the PCI devices accessed,
which will use different methods (see 

Re: [Xen-devel] xen/arm: Hiding SMMUs from Dom0 when using ACPI on Xen

2017-05-18 Thread Manish Jaggi

Hi Julien,

On 5/18/2017 8:27 PM, Julien Grall wrote:

Hello,

On 18/05/17 12:59, Manish Jaggi wrote:

On 2/27/2017 11:42 PM, Julien Grall wrote:

On 02/27/2017 04:58 PM, Shanker Donthineni wrote:

Hi Julien,


Hi Shanker,

Please don't drop people in CC. In my case, any e-mail I am not CCed
are skipping my inbox and I may not read them for a while.



On 02/27/2017 08:12 AM, Julien Grall wrote:



On 27/02/17 13:23, Vijay Kilari wrote:

Hi Julien,


Hello Vijay,


On Wed, Feb 22, 2017 at 7:40 PM, Julien Grall <julien.gr...@arm.com>
wrote:

Hello,

There were a few discussions recently about hiding SMMUs from DOM0 
when

using
ACPI. I thought it would be good to have a separate thread for 
this.


When using ACPI, the SMMUs will be described in the IO Remapping
Table
(IORT). The specification can be found on the ARM website [1].

For a brief summary, the IORT can be used to discover the SMMUs
present on
the platform and find for a given device the ID to configure
components such
as ITS (DeviceID) and SMMU (StreamID).

The appendix A in the specification gives an example how 
DeviceID and

StreamID can be found. For instance, when a PCI device is both
protected by
an SMMU and MSI-capable the following translation will happen:
RID -> StreamID -> DeviceID

Currently, SMMUs are hidden from DOM0 because they are being used by
Xen and
we don't support stage-1 SMMU. If we pass the IORT as it is, DOM0
will try
to initialize SMMU and crash.

I first thought about using a Xen specific way (STAO) or 
extending a

flag in
IORT. But that is not ideal.

So we would have to rewrite the IORT for DOM0. Given that a 
range of

RID can
be mapped to multiple ranges of DeviceID,
Do you envisage a scenario where the same RID can map to different StreamIDs 
belonging to different SMMUs?

we would have to translate
RID one by
one to find the associated DeviceID. I think this may end up in
complex code
and have a big IORT table.


Why can't we replace the Output base of the IORT PCI node with the SMMU
output base?
I mean, similar to a PCI node without an SMMU, why can't we replace the
output base of the PCI node with
the SMMU's output base?


Because I don't see anything in the spec preventing one RC ID mapping
to produce multiple SMMU ID mapping. So which output base would you
use?



Basically, remove SMMU nodes, and replace the output of the PCIe and
named nodes' ID mappings with ITS nodes.

RID --> StreamID --> DeviceID --> ITS device id = RID --> DeviceID -->
ITS device id


Can you detail it? You seem to assume that one RC ID mapping range
will only produce one ID mapping range. AFAICT, this is not mandated by
the spec.


You are correct that it is not mandated by the spec, but AFAIK there
seems to be no valid use case for that.


Xen has to be compliant with the spec, if the spec says something then 
we should do it unless there is a strong reason not to.


In this case, it is not too difficult to implement the suggestion I 
wrote a couple of months ago. So why would we try to put us in a corner?



See below


RID range should not overlap between ID Array entries.


I believe you misunderstood my point here. So let me give an example. 
My understanding of the spec is it is possible to have:


RC A
 // doesn't use SMMU 0 so just outputs DeviceIDs to ITS GROUP 0
 // Input ID --> Output reference: Output ID
0x-0x --> ITS GROUP 0 : 0x->0x

SMMU 0
// Note that range of StreamIDs that map to DeviceIDs excludes
// the NIC 0 DeviceID as it does not generate MSIs
 // Input ID --> Output reference: Output ID
0x-0x01ff --> ITS GROUP 0 : 0x1->0x101ff
0x0200-0x --> ITS GROUP 0 : 0x2->0x207ff

// SMMU 0 Control interrupt is MSI based
 // Input ID --> Output reference: Output ID
N/A --> ITS GROUP 0 : 0x21
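
(For reference, a minimal sketch of one step of this ID translation, as
it would apply to any IORT node's ID mapping array. It treats the count
as a plain number of IDs for readability; the IORT spec encodes "number
of IDs minus one".)

/* Translate input_id through one IORT node's ID mapping array.
 * Returns true and sets the output ID/node reference on a match. */
static bool iort_translate_id(const struct acpi_iort_node *node,
                              u32 input_id, u32 *output_id,
                              u32 *output_ref)
{
    const struct acpi_iort_id_mapping *map =
        (const struct acpi_iort_id_mapping *)
            ((const u8 *)node + node->mapping_offset);
    unsigned int i;

    for ( i = 0; i < node->mapping_count; i++, map++ )
    {
        if ( input_id >= map->input_base &&
             input_id < map->input_base + map->id_count )
        {
            *output_id = map->output_base + (input_id - map->input_base);
            *output_ref = map->output_reference;   /* next node to visit */
            return true;
        }
    }

    return false;
}

With the table above, StreamID 0x0100 falls into SMMU 0's first range and
translates to DeviceID 0x10000 + 0x0100 = 0x10100 at ITS GROUP 0.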

I could have misunderstood so I am stating my understanding so far .. 
please feel free to correct me :)


In the IORT table, using the PCI-RC node, SMMU node and ITS node, the 
RID->StreamID->DeviceID mapping can be generated.
As per the IORT spec today, the same RID can be mapped to different 
StreamIDs using two ID Array elements with the same RID range but 
different output references.
There exists no use case for such a scenario, hence a clarification is 
required in the IORT spec stating that RID ranges cannot overlap in the 
ID array.


With this clarification in place, it is straightforward to map a RID to a 
DeviceID by replacing the PCI-RC mapping's output with the corresponding 
SMMU mapping's output.



I believe this would be updated in the next IORT spec revision.


Well, Xen should still support current revision of IORT even if the 
next version add more restriction.


Cheers,



-Manish


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] xen/arm: Hiding SMMUs from Dom0 when using ACPI on Xen

2017-05-18 Thread Manish Jaggi

+Chales.

Hi Julien,

On 2/27/2017 11:42 PM, Julien Grall wrote:

On 02/27/2017 04:58 PM, Shanker Donthineni wrote:

Hi Julien,


Hi Shanker,

Please don't drop people in CC. In my case, any e-mail I am not CCed 
are skipping my inbox and I may not read them for a while.




On 02/27/2017 08:12 AM, Julien Grall wrote:



On 27/02/17 13:23, Vijay Kilari wrote:

Hi Julien,


Hello Vijay,


On Wed, Feb 22, 2017 at 7:40 PM, Julien Grall 
wrote:

Hello,

There were a few discussions recently about hiding SMMUs from DOM0 when
using
ACPI. I thought it would be good to have a separate thread for this.

When using ACPI, the SMMUs will be described in the IO Remapping 
Table

(IORT). The specification can be found on the ARM website [1].

For a brief summary, the IORT can be used to discover the SMMUs
present on
the platform and find for a given device the ID to configure
components such
as ITS (DeviceID) and SMMU (StreamID).

The appendix A in the specification gives an example how DeviceID and
StreamID can be found. For instance, when a PCI device is both
protected by
an SMMU and MSI-capable the following translation will happen:
RID -> StreamID -> DeviceID

Currently, SMMUs are hidden from DOM0 because they are being used by
Xen and
we don't support stage-1 SMMU. If we pass the IORT as it is, DOM0
will try
to initialize SMMU and crash.

I first thought about using a Xen specific way (STAO) or extending a
flag in
IORT. But that is not ideal.

So we would have to rewrite the IORT for DOM0. Given that a range of
RID can
be mapped to multiple ranges of DeviceID, we would have to translate
RID one by
one to find the associated DeviceID. I think this may end up in
complex code
and have a big IORT table.


Why can't we replace the Output base of the IORT PCI node with the SMMU output
base?
I mean, similar to a PCI node without an SMMU, why can't we replace the output
base of the PCI node with
the SMMU's output base?


Because I don't see anything in the spec preventing one RC ID mapping
to produce multiple SMMU ID mapping. So which output base would you 
use?




Basically, remove SMMU nodes, and replace the output of the PCIe and named
nodes' ID mappings with ITS nodes.

RID --> StreamID --> DeviceID --> ITS device id = RID --> DeviceID -->
ITS device id


Can you detail it? You seem to assume that one RC ID mapping range 
will only produce one ID mapping range. AFAICT, this is not mandated by 
the spec.


You are correct that it is not mandated by the spec, but AFAIK there 
seems to be no valid use case for that.


RID range should not overlap between ID Array entries.
I believe this would be updated in the next IORT spec revision.

I have started working on recreating the IORT for dom0 with this restriction.




The issue I see is that RID is [15:0] whereas DeviceID is [17:0].


Actually, the DeviceID is a 32-bit field.



However, given that the DeviceID will be used by DOM0 only to configure
the ITS,
we have no need to have the DOM0 DeviceID equal to the host
DeviceID.
So I think we could simplify our life by generating DeviceID for
each RID
range.


If DOM0 DeviceID != host Device ID, then we cannot initialize ITS
using DOM0
ITS commands (MAPD). So, is it concluded that ITS initializes all the
devices
with platform-specific DeviceIDs in Xen?


Initializing the ITS using DOM0 ITS commands is a workaround until we get
PCI passthrough done. It would still be possible to implement that
with vDeviceID != pDeviceID as Xen would likely have the mapping
between the 2 DeviceIDs.



I believe mapping dom0 ITS commands to XEN ITS commands one to one is
the better approach.  Physical DeviceID is unique per ITS group, not a
system wide unique ID.


As for guest, you don't care about the virtual DeviceID for DOM0 as 
long as you are able to map it to the host ITS and host DeviceID.


> In case of direct VLPI,  LPI number has to be

programmed whenever dom0/domU calls the MAPTI command but not at the
time of PCIe device creation.


I am a bit confused. Why are you speaking about direct vLPI here? This 
has no relation with the IORT.


Cheers,




___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [early RFC] ARM PCI Passthrough design document

2017-01-29 Thread Manish Jaggi
Hello Julien,

On 01/25/2017 08:55 PM, Julien Grall wrote:
> Hello Manish,
> 
> On 25/01/17 04:37, Manish Jaggi wrote:
>> On 01/24/2017 11:13 PM, Julien Grall wrote:
>>>
>>>
>>> On 19/01/17 05:09, Manish Jaggi wrote:
>>>> I think, PCI passthrough and DOM0 w/ACPI enumerating devices on PCI are 
>>>> separate features.
>>>> Without Xen mapping the PCI config space region in stage-2 for dom0, ACPI dom0 
>>>> won't boot.
>>>> Currently for DT, Xen does that.
>>>>
>>>> So can we have 2 design documents
>>>> a) PCI passthrough
>>>> b) ACPI dom0/domU support in Xen and Linux
>>>> - this may include:
>>>> b.1 Passing IORT to Dom0 without smmu
>>>> b.2 Hypercall to map PCI config space in dom0
>>>> b.3 
>>>>
>>>> What do you think?
>>>
>>> I don't think ACPI should be treated in a separate design document.
>> As PCI passthrough support will take time to mature, why should we hold the 
>> ACPI design?
>> If I can boot dom0/domU with ACPI as it works with DT today, it would be a 
>> good milestone.
> 
> The way PCI is working on DT today is a hack.
Can you please elaborate on why it is a hack?
> There is no SMMU support
SMMU support can be turned on and off with iommu=0, and also by not having an SMMU 
node in the device tree.
So not having SMMU support for dom0 is not a hack IMHO.
DomUs can continue with PV devices.

And if you consider running without an SMMU a hack, may I suggest we use this as 
phase 0 for ACPI.

> and the first version of GICv3 ITS support will contain hardcoded DeviceID 
> (or very similar). 
I have a disagreement on this: why should it contain hardcoded DeviceIDs? What 
prevents it today, technically?
Can you please elaborate?
If you are OK with having a first, limited version of GICv3 ITS, why not have a 
phase 0 for ACPI?

> 
> The current hack will introduce problem on platform where a specific host 
> controller is necessary to access the configuration space.
The specific host controller can be accessed by dom0 with Xen mapping it in stage-2; 
then we don't need a driver, right?
Can you please elaborate on the problem?
> Indeed, at the beginning Xen may not have a driver available (this
> will depend on the contribution), but we still need to be able to use PCI 
> with Xen. 
ACPI dom0 boot can and should be done without smmu support.

> We chose this way on DT because we didn't know when the PCI passthrough will 
> be added in Xen.
not a technical argument.

> 
> As mentioned in the introduction of the design document, I envision PCI 
> passthrough implementation in 2 phases:
> - Phase 1: Register all PCI devices in Xen => will allow to use ITS and 
> SMMU with PCI in Xen
> - Phase 2: Assign devices to guests
> 
I think 3 phases, Lets add phase 0.
- Phase 0: Dom0 ACPI without SMMU, DomU with PV devices, ITS in Xen

> This design document will cover both phases because they are tight together. 
> But the implementation can be decoupled, it would be possible (and also my 
> plan) to see the 2 phases upstreamed in
> different Xen release.
> 
> Phase 1, will cover anything necessary for Xen to discover and register PCI 
> devices. This includes the ACPI support (IORT,...).
> 
> I see little point in having a temporary solution for ACPI that will require 
> review bandwidth. It would be better to put this bandwidth into 
> getting a good design document.
I disagree, it is not a temporary solution. There are several use cases where 
PCI pass-through is not required but ACPI is.
> 
> When we brainstormed about PCI passthrough, we identified some tasks that 
> could be done in parallel of the design document. The list I have in mind is:
> * SMMUv3: I am aware of a company working on this
> * GICv3-ITS: work done by ARM (see [2])
> * IORT: it is required to discover ITSes and SMMU with ACPI. So it can at 
> least be parsed (I will speak about hiding some part to DOM0 later)
> * PCI support for SMMUv2
> 
> There are quite a few companies willing to contribute to PCI passthrough. So 
> we need some coordination to avoid redundancy. Please get in touch with me if 
> you are interested to work on one of these
> items.
> 
Will mail you.
>> Later when the PCI passthrough design matures and is implemented, the support 
>> can be extended.
>>> The support of ACPI may affect some of the decisions (such as hypercall) 
>>> and we have to know them now.
>>>
>> Still, it can be independent, with only the dependent features implemented, or 
>> placeholders can be added.
>>> Regarding the ECAM region not mapped. This is not related to PCI 
>>> passthrough but how

Re: [Xen-devel] [early RFC] ARM PCI Passthrough design document

2017-01-24 Thread Manish Jaggi


On 01/24/2017 11:13 PM, Julien Grall wrote:
> 
> 
> On 19/01/17 05:09, Manish Jaggi wrote:
>> Hi Julien,
> 
> Hello Manish,
[snip]

>> I think, PCI passthrough and DOM0 w/ACPI enumerating devices on PCI are 
>> separate features.
>> Without Xen mapping the PCI config space region in stage-2 for dom0, ACPI dom0 
>> won't boot.
>> Currently for DT, Xen does that.
>>
>> So can we have 2 design documents
>> a) PCI passthrough
>> b) ACPI dom0/domU support in Xen and Linux
>> - this may include:
>> b.1 Passing IORT to Dom0 without smmu
>> b.2 Hypercall to map PCI config space in dom0
>> b.3 
>>
>> What do you think?
> 
> I don't think ACPI should be treated in a separate design document.
As PCI passthrough support will take time to mature, why should we hold the 
ACPI design?
If I can boot dom0/domU with ACPI as it works with DT today, it would be a good 
milestone.
Later when the PCI passthrough design matures and is implemented, the support can 
be extended.
> The support of ACPI may affect some of the decisions (such as hypercall) and 
> we have to know them now.
> 
Still, it can be independent, with only the dependent features implemented, or 
placeholders can be added.
> Regarding the ECAM region not mapped. This is not related to PCI passthrough 
> but how MMIO are mapped with ACPI. This is a separate subject already in 
> discussion (see [1]).
> 
What about IORT generation for Dom0 without an SMMU?
I believe it is not dependent on [1].
> Cheers,
> 
> [1] https://lists.xenproject.org/archives/html/xen-devel/2017-01/msg01607.html
> 



Re: [Xen-devel] [early RFC] ARM PCI Passthrough design document

2017-01-24 Thread Manish Jaggi
Hi Julien/Stefano,

On 01/24/2017 07:58 PM, Julien Grall wrote:
> Hi Stefano,
> 
> On 04/01/17 00:24, Stefano Stabellini wrote:
>> On Thu, 29 Dec 2016, Julien Grall wrote:
> 
> [...]
> 
>>> # Introduction
>>>
>>> PCI passthrough allows to give control of physical PCI devices to guest. 
>>> This
>>> means that the guest will have full and direct access to the PCI device.
>>>
>>> ARM is supporting one kind of guest that is exploiting as much as possible
>>> virtualization support in hardware. The guest will rely on PV driver only
>>> for IO (e.g block, network), interrupts will come through the virtualized
>>> interrupt controller. This means that there are no big changes required
>>> within the kernel.
>>>
>>> By consequence, it would be possible to replace PV drivers by assigning real
>>   ^ As a consequence
> 
> I will fix all the typoes in the next version.
> 
>>
>>
>>> devices to the guest for I/O access. Xen on ARM would therefore be able to
>>> run unmodified operating system.
> 
> [...]
> 
>>> Instantiation of a specific driver for the host controller can be easily 
>>> done
>>> if Xen has the information to detect it. However, those drivers may require
>>> resources described in ASL (see [4] for instance).
Q: would these drivers (like ECAM/PEM) be added in Xen?
If yes, how would Xen have the information to detect the host controller compatible string?
Should it be passed in the hypercall physdev_pci_host_bridge_add below?
>>>
>>> XXX: Need more investigation to know whether the missing information should
>>> be passed by DOM0 or hardcoded in the driver.
>>
>> Given that we are talking about quirks here, it would be better to just
>> hardcode them in the drivers, if possible.
> 
> Indeed, hardcoding would be the preferred way to avoid introducing a new hypercall 
> for quirks.
> 
> For instance, in the case of Thunder-X (see commit 44f22bd "PCI: Add MCFG 
> quirks for Cavium ThunderX pass2.x host controller") some regions are read from 
> ACPI. What I'd like to understand is whether
> this could be hardcoded, or can it change between platforms? If it can change, 
> is there a way in ACPI to differentiate 2 platforms?
> 
> Maybe this is a question that Cavium can answer? (in CC).
> 
I think it is ok to hardcode.
You might need to see 648d93f "PCI: Add MCFG quirks for Cavium ThunderX pass1.x 
host controller" as well.

> 
> [...]
> 
>>> ## Discovering and register hostbridge
>>>
>>> Both ACPI and Device Tree do not provide enough information to fully
>>> instantiate an host bridge driver. In the case of ACPI, some data may come
>>> from ASL,
>>
>> The data available from ASL is just to initialize quirks and non-ECAM
>> controllers, right? Given that SBSA mandates ECAM, and we assume that
>> ACPI is mostly (if not only) for servers, then I think it is safe to say
>> that in the case of ACPI we should have all the info to fully
>> instantiate an host bridge driver.
> 
> From the spec, the MCFG will only describe the host bridges available at boot (see 
> 4.2 in "PCI firmware specification, rev 3.2"). All the other host bridges 
> will be described in ASL.
> 
> So we need DOM0 to feed Xen about the latter host bridges.
> 
>>
>>
>>> whilst for Device Tree the segment number is not available.
>>>
>>> So Xen needs to rely on DOM0 to discover the host bridges and notify Xen
>>> with all the relevant information. This will be done via a new hypercall
>>> PHYSDEVOP_pci_host_bridge_add. The layout of the structure will be:
>>
>> I understand that the main purpose of this hypercall is to get Xen and Dom0 
>> to
>> agree on the segment numbers, but why is it necessary? If Dom0 has an
>> emulated contoller like any other guest, do we care what segment numbers
>> Dom0 will use?
> 
> I was not planning to have an emulated controller for DOM0. The physical one 
> is not necessarily ECAM compliant, so we would have to either emulate the 
> physical one (meaning multiple different emulations)
> or an ECAM-compliant one.
> 
> The latter is not possible because you don't know if there is enough free 
> MMIO space for the emulation.
> 
> In the case of ARM, I don't see much point in emulating the host bridge for 
> DOM0. The only thing we need in Xen is to access the configuration space; we 
> don't care about driving the host bridge. So
> I would let DOM0 deal with that.
> 
> Also, I don't see any reason for ARM to trap DOM0 configuration space access. 
> The MSI will be configured using the interrupt controller and it is a trusted 
> Domain.
> 
>>
>>
>>> struct physdev_pci_host_bridge_add
>>> {
>>> /* IN */
>>> uint16_t seg;
>>> /* Range of bus supported by the host bridge */
>>> uint8_t  bus_start;
>>> uint8_t  bus_nr;
>>> uint32_t res0;  /* Padding */
>>> /* Information about the configuration space region */
>>> uint64_t cfg_base;
>>> uint64_t cfg_size;
>>> }
>>>
>>> DOM0 will issue the hypercall PHYSDEVOP_pci_host_bridge_add for each host
>>> bridge available on the platform. When Xen is receiving the 

Re: [Xen-devel] [early RFC] ARM PCI Passthrough design document

2017-01-18 Thread Manish Jaggi
Hi Julien,

On 12/29/2016 07:34 PM, Julien Grall wrote:
> Hi all,
> 
> The document below is an early version of a design
> proposal for PCI Passthrough in Xen. It aims to
> describe from a high-level perspective the interaction
> with the different subsystems and how guests will be able
> to discover and access PCI.
> 
> I am aware that a similar design has been posted recently
> by Cavium (see [1]), however the approach to expose PCI
> to guests is different. We have requests to run unmodified
> baremetal OSes on Xen; such a guest would directly
> access the devices and no PV drivers would be used.
> 
> That's why this design is based on emulating a root controller.
> This also has the advantage of keeping the VM interface as close
> to baremetal as possible, allowing the guest to use firmware tables to discover
> the devices.
> 
> Currently on ARM, Xen does not have any knowledge about PCI devices.
> This means that IOMMU and interrupt controller (such as ITS)
> requiring specific configuration will not work with PCI even with
> DOM0.
> 
> The PCI Passthrough work could be divided in 2 phases:
>   * Phase 1: Register all PCI devices in Xen => will allow
>  to use ITS and SMMU with PCI in Xen
> * Phase 2: Assign devices to guests
> 
> This document aims to describe the 2 phases, but for now only phase
> 1 is fully described.
> 
> I have sent the design document to start to gather feedback on
> phase 1.
> 
> Cheers,
> 
> [1] https://lists.xen.org/archives/html/xen-devel/2016-12/msg00224.html 
> 
> 
> % PCI pass-through support on ARM
> % Julien Grall 
> % Draft A
> 
> # Preface
> 
> This document aims to describe the components required to enable PCI
> passthrough on ARM.
> 
> This is an early draft and some questions are still unanswered, when this is
> the case the text will contain XXX.
> 
> # Introduction
> 
> PCI passthrough allows to give control of physical PCI devices to guest. This
> means that the guest will have full and direct access to the PCI device.
> 
> ARM is supporting one kind of guest that is exploiting as much as possible
> virtualization support in hardware. The guest will rely on PV driver only
> for IO (e.g block, network), interrupts will come through the virtualized
> interrupt controller. This means that there are no big changes required
> within the kernel.
> 
> By consequence, it would be possible to replace PV drivers by assigning real
> devices to the guest for I/O access. Xen on ARM would therefore be able to
> run unmodified operating system.
> 
> To achieve this goal, it looks more sensible to go towards emulating the
> host bridge (we will go into more details later). A guest would be able
> to take advantage of the firmware tables, obviating the need for a specific
> driver for Xen.
> 
> Thus in this document we follow the emulated host bridge approach.
> 
> # PCI terminologies
> 
> Each PCI device under a host bridge is uniquely identified by its Requester ID
> (AKA RID). A Requester ID is a triplet of Bus number, Device number, and
> Function.
> 
> When the platform has multiple host bridges, the software can add a fourth
> number called Segment to differentiate host bridges. A PCI device will
> then be uniquely identified by segment:bus:device:function (AKA SBDF).
> 
> So given a specific SBDF, it would be possible to find the host bridge and the
> RID associated to a PCI device.
> 
> # Interaction of the PCI subsystem with other subsystems
> 
> In order to have a PCI device fully working, Xen will need to configure
> other subsystems such as the SMMU and the Interrupt Controller.
> 
> The interaction expected between the PCI subsystem and the other is:
> * Add a device
> * Remove a device
> * Assign a device to a guest
> * Deassign a device from a guest
> 
> XXX: Detail the interaction when assigning/deassigning device
> 
> The following subsections will briefly describe the interaction from a
> higher-level perspective. Implementation details (callbacks, structures, ...)
> are out of scope.
> 
> ## SMMU
> 
> The SMMU will be used to isolate the PCI device when accessing the memory
> (for instance DMA and MSI Doorbells). Often the SMMU will be configured using
> a StreamID (SID) that can be deduced from the RID with the help of the 
> firmware
> tables (see below).
> 
> Whilst in theory all the memory transactions issued by a PCI device should
> go through the SMMU, on certain platforms some of the memory transactions may
> not reach the SMMU because they are interpreted by the host bridge. For
> instance this could happen if the MSI doorbell is built into the PCI host
> bridge. See [6] for more details.
> 
> XXX: I think this could be solved by using the host memory layout when
> creating a guest with PCI devices => Detail it.
> 
> ## Interrupt controller
> 
> PCI supports three kinds of interrupts: legacy interrupts, MSI and MSI-X. On ARM
> legacy interrupts will be mapped to SPIs. MSI and MSI-X will be

[Xen-devel] ARM PCI Pass through Design Draft 5

2016-12-02 Thread Manish Jaggi

  -----------------------------
 | PCI Pass-through in Xen ARM |
  -----------------------------
   manish.ja...@cavium.com
  -----------------------------

  Draft-5


 -
 Introduction
 -
 This document describes the design for the PCI passthrough support in Xen
 ARM. The target system is an ARM 64bit SoC with GICv3 and SMMU and PCIe
 devices.

 It is assumed that PVH guests will have their own MSI controller support and
 that a virtual ITS in Xen will redirect device interrupts to the guest.

 This document is limited to DT-based PCI; it will evolve to add ACPI.

 -
 Revision History
 -
 Changes from Draft-1:
 -
 a) map_mmio hypercall removed from earlier draft
 b) device bar mapping into guest not 1:1
 c) Reserved Area in guest address space for mapping PCI-EP BARs in Stage2.
 d) Xenstore Update: For each PCI-EP BAR (IPA-PA mapping info).

 Changes from Draft-2:
 -
 a) DomU boot information updated with boot-time device assignment and
 hotplug.
 b) SMMU description added
 c) Mapping between streamID - bdf - deviceID.
 d) assign_device hypercall to include virtual(guest) sbdf.
 Toolstack to generate guest sbdf rather than pciback.

 Changes from Draft-3:
 -
 a) Fixed typos and added more description
 b) NUMA and PCI passthrough description removed for now.
 c) Added example from Ian's Mail

 Changes from Draft-4:
 
 a) Added Hypercall PHYSDEVOP_pci_dev_map_msi_specifier
 b) The design takes into account Linux PCI msi-map support
 c) Added Xen internal to get streamID from pci_dev
 d) Added few examples and dts/code snippets

 -
 Index
 -
   (1) Background

   (2) Basic PCI Support in Xen ARM
   (2.1) pci_hostbridge and pci_hostbridge_ops
   (2.2) PHYSDEVOP_HOSTBRIDGE_ADD hypercall
   (2.3) XEN Internal API

   (3) SMMU programming
   (3.1) Additions for PCI Passthrough
   (3.2) Mapping between streamID - deviceID - pci sbdf - requesterID

   (4) Assignment of PCI device
   (4.1) Dom0
   (4.1.1) Stage 2 Mapping of GITS_ITRANSLATER space (4k)
   (4.1.1.1) For Dom0
   (4.1.1.2) For DomU
   (4.1.1.2.1) Hypercall Details: XEN_DOMCTL_get_itranslater_space

   (4.2) DomU
   (4.2.1) Reserved Areas in guest memory space
   (4.2.2) Xenstore Update: For each PCI-EP BAR (IPA-PA mapping info).
   (4.2.3) Hypercall Modification for bdf mapping notification to xen

   (5) DomU FrontEnd Bus Changes
   (5.1) Change in Linux PCI frontend bus and gicv3-its node binding for domU

   (6) Glossary

   (7) References
 -

 1.Background
 -
 Passthrough refers to assigning a PCI device to a guest domain (domU) such
 that the guest has full control over the device. The MMIO space / interrupts
 are managed by the guest itself, close to how a bare kernel manages a device.

 A device's access to the guest address space needs to be isolated and protected.
 The SMMU (System MMU - the IOMMU on ARM) is programmed by the Xen hypervisor to
 allow the device to access guest memory for data transfers and for sending MSI/X
 interrupts. Message signaled interrupt writes generated by PCI devices target
 guest address space and are also translated using the SMMU.


 1.1 PCI device Id in Dom0
 --
 As per the bindings document [6], the msi-specifier is generated from the
 msi-map property such that the msi-specifier's [32:16] bits come from the
 msi-map namespace and [15:0] are the same as the RID.
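
 The translation the binding describes boils down to a base-plus-offset
 computation; a sketch (parameter names are illustrative and the values are
 not taken from the nodes quoted below):

 /* msi-map = <rid-base msi-parent msi-base length>;
  * a RID within [rid_base, rid_base + length) maps to
  * msi_base + (rid - rid_base). */
 static uint32_t rid_to_msi_specifier(uint32_t rid, uint32_t rid_base,
                                      uint32_t msi_base, uint32_t length)
 {
     if ( rid < rid_base || rid - rid_base >= length )
         return ~0u; /* RID not covered by this msi-map entry */

     /* [15:0] stay equal to the RID when msi_base's low 16 bits are zero */
     return msi_base + (rid - rid_base);
 }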

 There could be multiple PCI nodes in the device tree with the same msi-map property:
pci@84a0 {
compatible = "pci-host-ecam-generic";
device_type = "pci";
msi-map = <0x0 0x6f 0x2 0x1>;
bus-range = <0x0 0x1f>;
reg = <0x84a0 0x0 0x0 0x200>;
...
};

pci@87e0c200 {
compatible = "cavium,pci-host-thunder-pem";
device_type = "pci";
msi-map = <0x0 0x6f 0x1 0x1>;
bus-range = <0x8f 0xc7>;
reg = <0x8880 0x8f00 0x0 0x3900 0x87e0 0xc200
 0x0 0x100>;
...
}

pci@8490 {
compatible = "pci-host-ecam-generic";
device_type = "pci";
msi-map = <0x0 

Re: [Xen-devel] PCI Pass-through in Xen ARM: Draft 4

2015-09-19 Thread Manish Jaggi



On Wednesday 16 September 2015 06:28 PM, Julien Grall wrote:

On 15/09/15 19:58, Jaggi, Manish wrote:

I can see 2 different solutions:
 1) Let DOM0 pass the first requester ID when registering the bus
Pros:
 * Less per-platform code in Xen
Cons:
 * Assume that the requester ID are contiguous. (Is it really a
cons?)
 * Still require quirk for buggy device (i.e requester ID not
correct)
 2) Do it in Xen
Pros:
 * We are not relying on DOM0 giving the requester ID
 => Not assuming contiguous requester ID
 Cons:
 * Per PCI bridge code to handle the mapping


   We can have (3): when PHYSDEVOP_pci_add_device is called, both the sbdf
and the requesterID are passed in the hypercall.

The name of the physdev operation is PHYSDEVOP_pci_device_add and not
PHYSDEVOP_pci_add_device. Please rename all the usages in the design doc.

Although, we can't modify PHYSDEVOP_pci_device_add because it's part of
the ABI which is stable.

Based on David's mail, the requester ID of a given device can be found
using base + devfn where base is the first requesterID of the bus.

IIRC, this also matches the IORT ACPI spec.
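
In other words the lookup would be trivial (a sketch of the assumption under
discussion, not existing Xen code):

/* base: first requester ID of the bus, registered by Dom0;
 * devfn: device/function number within that bus.
 * Assumes contiguous requester IDs, as discussed above. */
static inline uint32_t pdev_requester_id(uint32_t base, uint8_t devfn)
{
    return base + devfn;
}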

So for now, I would extend the physdev op you've introduced to add a host
bridge (PHYSDEV_pci_host_bridge_add) to pass the base requesterID.
The requester ID is derived from the Node# and ECAM# as per David. I 
guess the ECAM# and Node# can be derived from the cfg_addr.
Each ECAM has a cfg_addr in Thunder, which is mentioned in the pci node 
in the device tree.

For Thunder I think we don't need to pass the requester ID in the physdev op.


We can think later to introduce a new physdev op to add PCI if we ever
require unique requesterID (i.e non-contiguous under the same bridge).

Regards,

---
Julien Grall



___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] PCI Pass-through in Xen ARM: Draft 4

2015-09-02 Thread Manish Jaggi



On Tuesday 01 September 2015 01:02 PM, Jan Beulich wrote:

On 31.08.15 at 14:36, <mja...@caviumnetworks.com> wrote:

On Thursday 13 August 2015 03:12 PM, Manish Jaggi wrote:

  4.2.1 Mapping BAR regions in guest address space
  -

  When a PCI-EP device is assigned to a domU the toolstack will read the PCI
  configuration space BAR registers. The toolstack allocates a virtual BAR
  region for each BAR region, from the area reserved in the guest address
  space for mapping BARs, referred to as the Guest BAR area. This area is
  defined in public/arch-arm.h

  /* For 32bit BARs*/
  #define GUEST_BAR_BASE_32 <<>>
  #define GUEST_BAR_SIZE_32 <<>>

  /* For 64bit BARs*/
  #define GUEST_BAR_BASE_64 <<>>
  #define GUEST_BAR_SIZE_64 <<>>

  Toolstack then invokes the domctl xc_domain_memory_mapping to map it in
  stage-2 translation. If a BAR region address is 32-bit, the BASE_32 area is
  used, otherwise the 64-bit one. If a combination of both is required, the
  support is TODO.

  The toolstack manages these areas and allocates from them. Allocation and
  deallocation are done using APIs similar to malloc and free.


To implement this feature in the xl tools, a malloc and free from the
reserved area are required.
Can we have XEN_DOMCTL_memory_mapping extended with a flag, say
ALLOCATE/FREE_FROM_BAR_AREA?
When this flag is passed, Xen would add or remove the stage-2 mapping for
the domain.
This will make use of the code already present in Xen.

Above it was said that the tool stack manages this area (including
allocations from it). Why would this require a new hypercall?
As a rule the xl tools should manage the guest memory map. Whether they do
it by themselves or initiate it is another matter.
Allocating an area for a PCI BAR and freeing it from the reserved area would
require adding allocator code to the xl tools.
Since Xen already knows about the area (as it is defined in the public
header file) and the code already exists in Xen, I believe it makes sense
to use that rather than adding the same to the xl tools.
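
A hypothetical sketch of what such an extension could look like; this is not
an existing Xen interface, and the field and flag names are made up purely
to illustrate the proposal:

#define BAR_AREA_ALLOCATE (1u << 0) /* pick first_gfn from the BAR area */
#define BAR_AREA_FREE     (1u << 1) /* return the range to the BAR area */

struct xen_domctl_memory_mapping_ext {
    uint64_t first_gfn;   /* IN, or OUT when BAR_AREA_ALLOCATE is set */
    uint64_t first_mfn;   /* IN: start of the physical BAR */
    uint64_t nr_mfns;     /* IN: number of frames to map */
    uint32_t add_mapping; /* IN: add (1) or remove (0) the stage-2 mapping */
    uint32_t flags;       /* IN: BAR_AREA_* */
};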



Jan




Re: [Xen-devel] PCI Pass-through in Xen ARM: Draft 4

2015-08-31 Thread Manish Jaggi



On Thursday 13 August 2015 03:12 PM, Manish Jaggi wrote:

  -----------------------------
 | PCI Pass-through in Xen ARM |
  -----------------------------
  manish.ja...@caviumnetworks.com
  -----------------------------

  Draft-4
[snip]

 4.2.1 Mapping BAR regions in guest address space
 ------------------------------------------------

 When a PCI-EP device is assigned to a domU the toolstack will read the PCI
 configuration space BAR registers. The toolstack allocates a virtual BAR
 region for each BAR region, from the area reserved in the guest address
 space for mapping BARs, referred to as the Guest BAR area. This area is
 defined in public/arch-arm.h

 /* For 32bit BARs*/
 #define GUEST_BAR_BASE_32 <<>>
 #define GUEST_BAR_SIZE_32 <<>>

 /* For 64bit BARs*/
 #define GUEST_BAR_BASE_64 <<>>
 #define GUEST_BAR_SIZE_64 <<>>

 Toolstack then invokes the domctl xc_domain_memory_mapping to map it in
 stage-2 translation. If a BAR region address is 32-bit, the BASE_32 area is
 used, otherwise the 64-bit one. If a combination of both is required, the
 support is TODO.

 The toolstack manages these areas and allocates from them. Allocation and
 deallocation are done using APIs similar to malloc and free.



To implement this feature in the xl tools, a malloc and free from the
reserved area are required.
Can we have XEN_DOMCTL_memory_mapping extended with a flag, say
ALLOCATE/FREE_FROM_BAR_AREA?
When this flag is passed, Xen would add or remove the stage-2 mapping for
the domain.

This will make use of the code already present in Xen.

Any reservations with this approach?



[Xen-devel] PCI Pass-through in Xen ARM: Draft 4

2015-08-13 Thread Manish Jaggi

  -----------------------------
 | PCI Pass-through in Xen ARM |
  -----------------------------
  manish.ja...@caviumnetworks.com
  -----------------------------

  Draft-4


 -
 Introduction
 -
 This document describes the design for the PCI passthrough support in Xen
 ARM. The target system is an ARM 64bit SoC with GICv3 and SMMU v2 and PCIe
 devices.

 -
 Revision History
 -
 Changes from Draft-1:
 -
 a) map_mmio hypercall removed from earlier draft
 b) device bar mapping into guest not 1:1
 c) Reserved Area in guest address space for mapping PCI-EP BARs in Stage2.
 d) Xenstore Update: For each PCI-EP BAR (IPA-PA mapping info).

 Changes from Draft-2:
 -
 a) DomU boot information updated with boot-time device assignment and
 hotplug.
 b) SMMU description added
 c) Mapping between streamID - bdf - deviceID.
 d) assign_device hypercall to include virtual(guest) sbdf.
 Toolstack to generate guest sbdf rather than pciback.

 Changes from Draft-3:
 -
 a) Fixed typos and added more description
 b) NUMA and PCI passthrough description removed for now.
 c) Added example from Ian's Mail

 -
 Index
 -
   (1) Background

   (2) Basic PCI Support in Xen ARM
   (2.1) pci_hostbridge and pci_hostbridge_ops
   (2.2) PHYSDEVOP_HOSTBRIDGE_ADD hypercall
   (2.3) XEN Internal API

   (3) SMMU programming
   (3.1) Additions for PCI Passthrough
   (3.2) Mapping between streamID - deviceID - pci sbdf - requesterID

   (4) Assignment of PCI device
   (4.1) Dom0
   (4.1.1) Stage 2 Mapping of GITS_ITRANSLATER space (4k)
   (4.1.1.1) For Dom0
   (4.1.1.2) For DomU
   (4.1.1.2.1) Hypercall Details: XEN_DOMCTL_get_itranslater_space

   (4.2) DomU
   (4.2.1) Reserved Areas in guest memory space
   (4.2.2) Xenstore Update: For each PCI-EP BAR (IPA-PA mapping info).
   (4.2.3) Hypercall Modification for bdf mapping notification to xen

   (5) DomU FrontEnd Bus Changes
   (5.1) Change in Linux PCI frontend bus and gicv3-its node binding 
for domU


   (6) Glossary

   (7) References
 -

 1.Background of PCI passthrough
 -
 Passthrough refers to assigning a PCI device to a guest domain (domU) such
 that the guest has full control over the device. The MMIO space / interrupts
 are managed by the guest itself, close to how a bare kernel manages a device.

 A device's access to the guest address space needs to be isolated and protected.
 The SMMU (System MMU - the IOMMU on ARM) is programmed by the Xen hypervisor to
 allow the device to access guest memory for data transfers and for sending MSI/X
 interrupts. Message signalled interrupt writes generated by PCI devices target
 guest address space and are also translated using the SMMU.

 For this reason the GITS (ITS address space) Interrupt Translation Register
 space is mapped in the guest address space.

 2.Basic PCI Support for ARM
 -
 The APIs to read/write the PCI configuration space are based on segment:bdf.

 How the sbdf is mapped to a physical address is under the realm of the PCI
 host controller.

 ARM PCI support in Xen introduces PCI host controllers similar to what
 exists in Linux. Host controller drivers register callbacks, which are
 invoked on matching the compatible property in the PCI device tree node.

 Note: as PCI devices are enumerated, the pci node in the device tree refers
 to the host controller.

 (TODO: for ACPI unimplemented)

 2.1 pci_hostbridge and pci_hostbridge_ops
 -
 The init function in the PCI host driver calls to register hostbridge
 callbacks:

 int pci_hostbridge_register(pci_hostbridge_t *pcihb);

 struct pci_hostbridge_ops {
 u32 (*pci_conf_read)(struct pci_hostbridge*, u32 bus, u32 devfn,
 u32 reg, u32 bytes);
 void (*pci_conf_write)(struct pci_hostbridge*, u32 bus, u32 devfn,
 u32 reg, u32 bytes, u32 val);
 };

 struct pci_hostbridge{
 u32 segno;
 paddr_t cfg_base;
 paddr_t cfg_size;
 struct dt_device_node *dt_node;
 struct pci_hostbridge_ops ops;
 struct list_head list;
 };

 A PCI conf_read function would internally be as follows:
 u32 pcihb_conf_read(u32 seg, u32 bus, u32 devfn,u32 

Re: [Xen-devel] PCI Passthrough Design - Draft 3

2015-08-12 Thread Manish Jaggi

Below are the comments. I will also send a Draft 4 taking account of the 
comments.


On Wednesday 12 August 2015 02:04 AM, Konrad Rzeszutek Wilk wrote:

On Tue, Aug 04, 2015 at 05:57:24PM +0530, Manish Jaggi wrote:

  -
 | PCI Pass-through in Xen ARM |
  -
 manish.ja...@caviumnetworks.com
 ---

  Draft-3
...
[snip]
2.2 PHYSDEVOP_pci_host_bridge_add hypercall
--------------------------------------------
Xen code accesses PCI configuration space based on the sbdf received from the
guest. The order in which the pci device tree nodes appear may not be the same
as the order of device enumeration in dom0. Thus there needs to be a mechanism
to bind the segment number assigned by dom0 to the pci host controller. The
hypercall is introduced:

Why can't we extend the existing hypercall to have the segment value?

Oh wait, PHYSDEVOP_manage_pci_add_ext does it already!

It doesn’t pass the cfg_base and size to xen


And have the hypercall (and Xen) be able to deal with introduction of PCI
devices that are out of sync?

Maybe I am confused but aren't PCI host controllers also 'uploaded' to
Xen?

I need to add one more line here to be more descriptive. The binding is
between the segment number (domain number in Linux) used by dom0 and the PCI
config space address in the pci node of the device tree (reg property).
The hypercall was introduced to cater for the fact that dom0 may process
pci nodes in the device tree in any order.
This binding makes it a clear ABI.

#define PHYSDEVOP_pci_host_bridge_add 44
struct physdev_pci_host_bridge_add {
    /* IN */
    uint16_t seg;
    uint64_t cfg_base;
    uint64_t cfg_size;
};

This hypercall is invoked before dom0 invokes the PHYSDEVOP_pci_device_add
hypercall. The handler code updates the segment number in the
pci_hostbridge:

int pci_hostbridge_setup(uint32_t segno, uint64_t cfg_base, uint64_t
cfg_size);

Subsequent calls to pci_conf_read/write are completed by the
pci_hostbridge_ops of the respective pci_hostbridge.
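
A sketch of what the handler side could look like, reusing the
pci_hostbridge structure from section 2.1 and matching the bridge on the
config space address from the reg property (error handling elided):

int pci_hostbridge_setup(uint32_t segno, uint64_t cfg_base, uint64_t cfg_size)
{
    pci_hostbridge_t *pcihb;

    list_for_each_entry ( pcihb, &pci_hostbridge_list, list )
    {
        if ( pcihb->cfg_base == cfg_base && pcihb->cfg_size == cfg_size )
        {
            pcihb->segno = segno; /* bind dom0's segment number */
            return 0;
        }
    }

    return -ENODEV; /* no bridge registered at this config address */
}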

This design sounds like it was added to deal with having to pre-allocate the
host controller structures before the PCI devices start streaming in?

Instead of having the PCI devices and PCI host controllers be updated
as they come in?

Why can't the second option be done?

If you are referring to ACPI, we have to add the support.
PCI Host controllers are pci nodes in device tree.

2.3 Helper Functions
--------------------
a) pci_hostbridge_dt_node(pdev->seg);
Returns the device tree node pointer of the pci node from which the pdev was
enumerated.
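
A possible shape for this helper, again only a sketch built on the
structures above: look the bridge up by segment number and return its
device tree node.

struct dt_device_node *pci_hostbridge_dt_node(uint32_t segno)
{
    pci_hostbridge_t *pcihb;

    list_for_each_entry ( pcihb, &pci_hostbridge_list, list )
        if ( pcihb->segno == segno )
            return pcihb->dt_node;

    return NULL;
}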

3.SMMU programming
---

3.1.Additions for PCI Passthrough
---
3.1.1 - add_device in iommu_ops is implemented.

This is called when PHYSDEVOP_pci_add_device is called from dom0.

Or for PHYSDEVOP_manage_pci_add_ext ?

Not sure but it seems logical for this also.

.add_device = arm_smmu_add_dom0_dev,
static int arm_smmu_add_dom0_dev(u8 devfn, struct device *dev)
{
    if (dev_is_pci(dev)) {
        struct pci_dev *pdev = to_pci_dev(dev);
        return arm_smmu_assign_dev(pdev->domain, devfn, dev);
    }
    return -1;
}


What about removal?

What if the device is removed (hot-unplugged??

.remove_device  = arm_smmu_remove_device(). would be called.
Will update in Draft4


3.1.2 dev_get_dev_node is modified for pci devices.
-
The function is modified to return the dt_node of the pci hostbridge from
the device tree. This is required as non-dt devices need a way to find
which SMMU they are attached to.

static struct arm_smmu_device *find_smmu_for_device(struct device *dev)
{
    struct device_node *dev_node = dev_get_dev_node(dev);
...

static struct device_node *dev_get_dev_node(struct device *dev)
{
    if (dev_is_pci(dev)) {
        struct pci_dev *pdev = to_pci_dev(dev);
        return pci_hostbridge_dt_node(pdev->seg);
    }
...


3.2. Mapping between streamID - deviceID - pci sbdf - requesterID
-
For the simpler case all should be equal to the BDF. But there are some
devices that use the wrong requester ID for DMA transactions. The Linux
kernel has PCI quirks for these. Whether the same is implemented in Xen or
a different approach is taken is a TODO here.

s/pci/PCI/

Till that time, for the basic implementation it is assumed that all are
equal to the BDF.


4.Assignment of PCI device
-

4.1Dom0

All PCI devices are assigned to dom0 unless hidden by pci-hide bootargs in
dom0.

'pci-hide' in dom0? Grepping in Documentation/kernel-parameters.txt I don't
see anything.

s/pci-hide/pciback.hide/
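
For reference, hiding a device from Dom0 via pciback typically looks like
the following on the Dom0 kernel command line (the BDF is illustrative):

xen-pciback.hide=(0000:01:00.0)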

Dom0 enumerates the PCI devices. For each device

[Xen-devel] PCI Passthrough Design - Draft 3

2015-08-04 Thread Manish Jaggi

 -----------------------------
| PCI Pass-through in Xen ARM |
 -----------------------------
manish.ja...@caviumnetworks.com
 -----------------------------

 Draft-3


---
Introduction
---
This document describes the design for the PCI passthrough support in Xen ARM.
The target system is an ARM 64-bit SoC with GICv3, SMMU v2 and PCIe devices.


---
Revision History
---
Changes from Draft-1:
-
a) map_mmio hypercall removed from earlier draft
b) device bar mapping into guest not 1:1
c) holes in guest address space 32bit / 64bit for MMIO virtual BARs
d) xenstore device's BAR info addition.

Changes from Draft-2:
-
a) DomU boot information updated with boot-time device assignment and 
hotplug.

b) SMMU description added
c) Mapping between streamID - bdf - deviceID.
d) assign_device hypercall to include virtual(guest) sbdf.
Toolstack to generate guest sbdf rather than pciback.

---
Index
---
  (1) Background

  (2) Basic PCI Support in Xen ARM
  (2.1)pci_hostbridge and pci_hostbridge_ops
  (2.2)PHYSDEVOP_HOSTBRIDGE_ADD hypercall

  (3) SMMU programming
  (3.1) Additions for PCI Passthrough
  (3.2)Mapping between streamID - deviceID - pci sbdf

  (4) Assignment of PCI device

  (4.1) Dom0
  (4.1.1) Stage 2 Mapping of GITS_ITRANSLATER space (4k)
  (4.1.1.1) For Dom0
  (4.1.1.2) For DomU
  (4.1.1.2.1) Hypercall Details: XEN_DOMCTL_get_itranslater_space

  (4.2) DomU
  (4.2.1) Reserved Areas in guest memory space
  (4.2.2) New entries in xenstore for device BARs
  (4.2.4) Hypercall Modification for bdf mapping notification to xen

  (5) DomU FrontEnd Bus Changes
  (5.1)Change in Linux PCI FrontEnd - backend driver for MSI/X 
programming

  (5.2)Frontend bus and interrupt parent vITS

  (6) NUMA and PCI passthrough
---

1.Background of PCI passthrough
--
Passthrough refers to assigning a PCI device to a guest domain (domU) such that
the guest has full control over the device. The MMIO space and interrupts are
managed by the guest itself, close to how a bare kernel manages a device.

A device's access to the guest address space needs to be isolated and protected.
The SMMU (System MMU - the IOMMU on ARM) is programmed by the Xen hypervisor to
allow the device to access guest memory for data transfers and for sending MSI/X
interrupts. Message signalled interrupt writes generated by PCI devices target
guest address space and are also translated using the SMMU.
For this reason the GITS (ITS address space) Interrupt Translation Register
space is mapped in the guest address space.

2.Basic PCI Support for ARM
--
The APIs to read/write the PCI configuration space are based on segment:bdf.

How the sbdf is mapped to a physical address is under the realm of the pci
host controller.

ARM PCI support in Xen introduces pci host controllers similar to what exists
in Linux. Each driver registers callbacks, which are invoked on matching the
compatible property in the pci device tree node.

2.1 pci_hostbridge and pci_hostbridge_ops
--
The init function in the pci host driver calls to register hostbridge 
callbacks:

int pci_hostbridge_register(pci_hostbridge_t *pcihb);

struct pci_hostbridge_ops {
u32 (*pci_conf_read)(struct pci_hostbridge*, u32 bus, u32 devfn,
u32 reg, u32 bytes);
void (*pci_conf_write)(struct pci_hostbridge*, u32 bus, u32 devfn,
u32 reg, u32 bytes, u32 val);
};

struct pci_hostbridge{
u32 segno;
paddr_t cfg_base;
paddr_t cfg_size;
struct dt_device_node *dt_node;
struct pci_hostbridge_ops ops;
struct list_head list;
};

A pci conf read function would internally be as follows:
u32 pcihb_conf_read(u32 seg, u32 bus, u32 devfn, u32 reg, u32 bytes)
{
    pci_hostbridge_t *pcihb;
    list_for_each_entry(pcihb, &pci_hostbridge_list, list)
    {
        if ( pcihb->segno == seg )
            return pcihb->ops.pci_conf_read(pcihb, bus, devfn, reg, bytes);
    }
    return -1;
}

2.2 PHYSDEVOP_pci_host_bridge_add hypercall
--
Xen code accesses PCI configuration space based on the sbdf received from the
guest. The order in which the pci device tree nodes appear may not be the same
as the order of device enumeration in dom0. Thus 

Re: [Xen-devel] PCI Pass-through in Xen ARM - Draft 2.

2015-07-31 Thread Manish Jaggi



On 31/07/15 4:49 pm, Ian Campbell wrote:

On Fri, 2015-07-31 at 16:37 +0530, Manish Jaggi wrote:

On Friday 31 July 2015 01:35 PM, Ian Campbell wrote:

On Fri, 2015-07-31 at 13:16 +0530, Manish Jaggi wrote:

Secondly, the vdev-X entry is created asynchronously by dom0 watching on an
event.
So how could the tools read it back and call assign_device again?

Perhaps by using a xenstore watch on that node to wait for the
assignment
from pciback to occur.

As per the flow in the do_pci_add function, assign_device is called
first and, based on its success, the xenstore entry is created.
Are you suggesting changing the sequence?

Perhaps that is what it would take, yes, or maybe some other
refactoring
(e.g. splitting assign_device into two stages) might be the answer.

The hypercall from xenpciback (what I implemented) is actually doing
the device assignment in 2 stages.
I think the point of contention is whether the second stage should be done
by the toolstack.

I think calling xc_assign_device after the xenstore write, from the watch
callback, is the only option.

Only if you ignore the other option I proposed.


One question is how to split the code for ARM and x86, as this is
common code.
Would #ifdef CONFIG_ARM64 be OK with maintainers?

No. arch hooks in libxl_$ARCH.c (with nop implementations where necessary)
would be the way to approach this. However I still am not convinced this is
the approach we should be taking.


My current preference is for the suggestion below which is to let the
toolstack pick the vdevfn and have pciback honour it.

That would duplicate the devfn generation code from
__xen_pcibk_add_pci_dev into the toolstack.

IMHO the toolstack is the correct place for this code, at least for ARM
guests. The toolstack is, in general, responsible for all aspects of the
guest layout. I don't think delegating the PCI bus parts of that to the
dom0 kernel makes sense.
Ok, I will implement the same move from pciback to the toolstack. I am not 
sure about the complexity but will give it a try.

With this xen-pciback will not create the vdev-X entry at all.


I'd not be surprised if the same turns out to be true for x86/PVH guests
too.
Ian.



Re: [Xen-devel] PCI Pass-through in Xen ARM - Draft 2.

2015-07-31 Thread Manish Jaggi



On 31/07/15 8:26 pm, Julien Grall wrote:

On 31/07/15 15:33, Manish Jaggi wrote:

Hi Julien,

On 31/07/15 6:29 pm, Julien Grall wrote:

Hi Manish,

On 31/07/15 13:50, Manish Jaggi wrote:

Ok, I will implement the same move from pciback to the toolstack. I am not sure
about the complexity but will give it a try.
With this xen-pciback will not create the vdev-X entry at all.

Can you send a new draft before continuing to implement PCI support in
Xen?

I am working on Draft 3 and addressing the comments on draft 2. I am
doing a feasibility check of the stuff I put in draft 3.

Well, I don't think that anything we say within this thread was
impossible to do.


As long as we do not agree about it,

I thought I was trying to discuss the same. If you have any point please
raise it.

What I meant is, this is a 40-messages thread with lots of discussions
on it.

A new draft containing a summary on what was said would benefits
everyone and help us to get on a design that we think is good.


   you lose your time trying to
implement something that can drastically change in the next revision.

I am only putting the stuff in the Draft3 which *can* be implemented later.

But nothing prevent someone in the discussion on Draft3 to say this is
wrong and it has to be done in a different way.

Usually the time between two draft should be pretty short in order to
get sane base for discussion. For now, we are talking about small
portion of design and speculating/trying to remember what was agreed on
other sub-thread.
OK, will send draft 3 with the points on this topic marked as under
discussion. Is that fine?


Regards,






Re: [Xen-devel] PCI Pass-through in Xen ARM - Draft 2.

2015-07-31 Thread Manish Jaggi

Hi Julien,

On 31/07/15 6:29 pm, Julien Grall wrote:

Hi Manish,

On 31/07/15 13:50, Manish Jaggi wrote:

Ok, I will implement the same move from pciback to the toolstack. I am not sure
about the complexity but will give it a try.
With this xen-pciback will not create the vdev-X entry at all.

Can you send a new draft before continuing to implement PCI support in Xen?
I am working on Draft 3 and addressing the comments on draft 2. I am 
doing a feasibility check of the stuff I put in draft 3.

As long as we do not agree about it,
I thought I was trying to discuss the same. If you have any point please 
raise it.

  you lose your time trying to
implement something that can drastically change in the next revision.

I am only putting the stuff in the Draft3 which *can* be implemented later.
Regards,




Re: [Xen-devel] PCI Pass-through in Xen ARM - Draft 2.

2015-07-31 Thread Manish Jaggi



On Friday 31 July 2015 01:35 PM, Ian Campbell wrote:

On Fri, 2015-07-31 at 13:16 +0530, Manish Jaggi wrote:

Secondly, the vdev-X entry is created asynchronously by dom0 watching on an
event.
So how could the tools read it back and call assign_device again?

Perhaps by using a xenstore watch on that node to wait for the
assignment
from pciback to occur.

As per the flow in the do_pci_add function, assign_device is called
first and, based on its success, the xenstore entry is created.
Are you suggesting changing the sequence?

Perhaps that is what it would take, yes, or maybe some other refactoring
(e.g. splitting assign_device into two stages) might be the answer.
The hypercall from xenpciback (what I implemented) is actually doing 
the device assignment in 2 stages.
I think the point of contention is whether the second stage should be done 
by the toolstack.


I think calling xc_assign_device after the xenstore write, from the watch 
callback, is the only option.
One question is how to split the code for ARM and x86, as this is 
common code.

Would #ifdef CONFIG_ARM64 be OK with maintainers?


My current preference is for the suggestion below which is to let the
toolstack pick the vdevfn and have pciback honour it.
That would duplicate the devfn generation code from 
__xen_pcibk_add_pci_dev into the toolstack.



We can discuss this more on #xenarm irc

Sorry I missed your ping yesterday, I had already gone home.


Or you could change things such that vdevfn is always chosen by the
toolstack for ARM, not optionally like it is on x86.

For this one, the struct libxl_device_pci has a field vdevfn, which
is
supposed to allow the user to specify a specific vdevfn. I'm not sure
how
that happens or fits together but libxl could undertake to set that on
ARM
in the case where the user hasn't done so, effectively taking control
of
the PCI bus assignment.

Ian.



Re: [Xen-devel] PCI Pass-through in Xen ARM - Draft 2.

2015-07-30 Thread Manish Jaggi



On Thursday 30 July 2015 03:24 PM, Ian Campbell wrote:

On Wed, 2015-07-29 at 15:07 +0530, Manish Jaggi wrote:

On Monday 06 July 2015 03:50 PM, Ian Campbell wrote:

On Mon, 2015-07-06 at 15:36 +0530, Manish Jaggi wrote:

On Monday 06 July 2015 02:41 PM, Ian Campbell wrote:

On Sun, 2015-07-05 at 11:25 +0530, Manish Jaggi wrote:

On Monday 29 June 2015 04:01 PM, Julien Grall wrote:

Hi Manish,

On 28/06/15 19:38, Manish Jaggi wrote:

4.1 Holes in guest memory space

Holes are added in the guest memory space for mapping pci
device's BAR
regions.
These are defined in arch-arm.h

/* For 32bit */
GUEST_MMIO_HOLE0_BASE, GUEST_MMIO_HOLE0_SIZE
 
/* For 64bit */

GUEST_MMIO_HOLE1_BASE , GUEST_MMIO_HOLE1_SIZE

The memory layout for 32bit and 64bit are exactly the same. Why
do you
need to differ here?

I think Ian has already replied. I will change the name of macro

4.2 New entries in xenstore for device BARs

The toolstack also updates the xenstore information for the device
(virtualbar:physical bar).
This information is read by xenpciback and returned to the pcifront
driver on configuration space accesses.

Can you details what do you plan to put in xenstore and how?

It is an implementation detail. But I plan to put it under the
domU / device / hierarchy.

Actually, xenstore is an API of sorts which needs to be maintained going
forward (since front and backend can evolve separately), so it does need
some level of design and documentation.


What about the expansion ROM?

Do you want to put some restriction on not using expansion ROM as
a
passthrough device.

expansion ROM as a passthrough device doesn't make sense to me,
passthrough devices may _have_ an expansion ROM.

The expansion ROM is just another BAR. I don't know how
pcifront/back
deal with those today on PV x86, but I see no reason for ARM to
deviate.



4.3 Hypercall for bdf mapping notification to xen
---
#define PHYSDEVOP_map_sbdf  43
typedef struct {
    u32 s;
    u8 b;
    u8 df;
    u16 res;
} sbdf_t;
struct physdev_map_sbdf {
    int domain_id;
    sbdf_t sbdf;
    sbdf_t gsbdf;
};

Each domain has a pdev list, which contains the list of all pci devices.
The pdev structure already has the sbdf information. The arch_pci_dev is
updated to contain the gsbdf information. (gs = guest segment id)

Whenever there is a trap from the guest or an interrupt has to be injected,
the pdev list is iterated to find the gsbdf.

Can you give more background for this section? i.e:
- Why do you need this?
- How xen will translate the gbdf to a vDeviceID?

In the context of the hypercall processing.

- Who will call this hypercall?
- Why not setting the gsbdf when the device is
assigned?

Can the maintainer of the pciback suggest an alternate.

That's not me, but I don't think this belongs here, I think it can
be
done from the toolstack. If you think not then please explain what
information the toolstack doesn't have in its possession which
prevents
this mapping from being done there.

The toolstack does not have the guest sbdf information. I could only
find it in xenpciback.

Are you sure? The sbdf relates to the physical device, correct? If so
then surely the toolstack knows it -- it's written in the config file
and is the primary parameter to all of the related libxl passthrough
APIs. The toolstack wouldn't be able to do anything about passing
through a given device without knowing which device it should be
passing
through.

Perhaps this info needs plumbing through to some new bit of the
toolstack, but it is surely available somewhere.

If you meant the virtual SBDF then that is in libxl_device_pci.vdevfn.

I added prints in libxl__device_pci_add. vdevfn is always 0 so this may
not be the right variable to use.
Can you please recheck.

Also the vdev-X entry in xenstore appears to be created from pciback
code and not from xl.
Check function xen_pcibk_publish_pci_dev.

So I have to send a hypercall from pciback only.

I don't think the necessarily follows.

You could have the tools read the vdev-X node back on plug.
I have been trying to get the flow of the caller of libxl__device_pci_add 
during pci device assignment from the cfg file (cold boot).
It should be called from the xl create flow. Is it called from C code or 
Python code?


libxl__device_pci_add calls xc_assign_device


Secondly, the vdev-X entry is created async by dom0 watching on event. 
So how the tools could read back and call assign device again.


static void xen_pcibk_be_watch(struct xenbus_watch *watch,
                               const char **vec, unsigned int len)
{
    ...
    switch (xenbus_read_driver_state(pdev->xdev->nodename)) {
    case XenbusStateInitWait:
        xen_pcibk_setup_backend(pdev);
        break;
    }


Or you could change things such that vdevfn is always chosen by the
toolstack for ARM

Re: [Xen-devel] PCI Pass-through in Xen ARM - Draft 2.

2015-07-29 Thread Manish Jaggi



On Monday 06 July 2015 03:50 PM, Ian Campbell wrote:

On Mon, 2015-07-06 at 15:36 +0530, Manish Jaggi wrote:

On Monday 06 July 2015 02:41 PM, Ian Campbell wrote:

On Sun, 2015-07-05 at 11:25 +0530, Manish Jaggi wrote:

On Monday 29 June 2015 04:01 PM, Julien Grall wrote:

Hi Manish,

On 28/06/15 19:38, Manish Jaggi wrote:

4.1 Holes in guest memory space

Holes are added in the guest memory space for mapping pci device's BAR
regions.
These are defined in arch-arm.h

/* For 32bit */
GUEST_MMIO_HOLE0_BASE, GUEST_MMIO_HOLE0_SIZE

/* For 64bit */

GUEST_MMIO_HOLE1_BASE , GUEST_MMIO_HOLE1_SIZE

The memory layout for 32bit and 64bit are exactly the same. Why do you
need to differ here?

I think Ian has already replied. I will change the name of macro

4.2 New entries in xenstore for device BARs

The toolstack also updates the xenstore information for the device
(virtualbar:physical bar).
This information is read by xenpciback and returned to the pcifront
driver on configuration space accesses.

Can you details what do you plan to put in xenstore and how?

It is an implementation detail. But I plan to put it under the domU / device / hierarchy.

Actually, xenstore is an API of sorts which needs to be maintained going
forward (since front and backend can evolve separately), so it does need
some level of design and documentation.


What about the expansion ROM?

Do you want to put some restriction on not using expansion ROM as a
passthrough device.

expansion ROM as a passthrough device doesn't make sense to me,
passthrough devices may _have_ an expansion ROM.

The expansion ROM is just another BAR. I don't know how pcifront/back
deal with those today on PV x86, but I see no reason for ARM to deviate.



4.3 Hypercall for bdf mapping notification to xen
---
#define PHYSDEVOP_map_sbdf  43
typedef struct {
   u32 s;
   u8 b;
   u8 df;
   u16 res;
} sbdf_t;
struct physdev_map_sbdf {
   int domain_id;
   sbdf_t sbdf;
   sbdf_t gsbdf;
};

Each domain has a pdev list, which contains the list of all pci devices.
The pdev structure already has the sbdf information. The arch_pci_dev is
updated to contain the gsbdf information. (gs = guest segment id)

Whenever there is a trap from the guest or an interrupt has to be injected,
the pdev list is iterated to find the gsbdf.

Can you give more background for this section? i.e:
- Why do you need this?
- How xen will translate the gbdf to a vDeviceID?

In the context of the hypercall processing.

- Who will call this hypercall?
- Why not setting the gsbdf when the device is assigned?

Can the maintainer of the pciback suggest an alternate.

That's not me, but I don't think this belongs here, I think it can be
done from the toolstack. If you think not then please explain what
information the toolstack doesn't have in its possession which prevents
this mapping from being done there.

The toolstack does not have the guest sbdf information. I could only
find it in xenpciback.

Are you sure? The sbdf relates to the physical device, correct? If so
then surely the toolstack knows it -- it's written in the config file
and is the primary parameter to all of the related libxl passthrough
APIs. The toolstack wouldn't be able to do anything about passing
through a given device without knowing which device it should be passing
through.

Perhaps this info needs plumbing through to some new bit of the
toolstack, but it is surely available somewhere.

If you meant the virtual SBDF then that is in libxl_device_pci.vdevfn.
I added prints in libxl__device_pci_add. vdevfn is always 0 so this may 
not be the right variable to use.

Can you please recheck.

Also the vdev-X entry in xenstore appears to be created from pciback 
code and not from xl.

Check function xen_pcibk_publish_pci_dev.

So I have to send a hypercall from pciback only.

Ian.




Re: [Xen-devel] PCI Pass-through in Xen ARM - Draft 2.

2015-07-21 Thread Manish Jaggi



On Tuesday 14 July 2015 11:31 PM, Stefano Stabellini wrote:

On Tue, 14 Jul 2015, Julien Grall wrote:

Hi Stefano,

On 14/07/2015 18:46, Stefano Stabellini wrote:

Linux provides a function (pci_for_each_dma_alias) which will return a
requester ID for a given PCI device. It appears that the BDF (the 's' of
sBDF
is only internal to Linux and not part of the hardware) is equal to the
requester ID on your platform but we can't assume it for anyone else.

The PCI Express Base Specification states that the requester ID is The
combination of a Requester's Bus Number, Device Number, and Function
Number that uniquely identifies the Requester.

I think it is safe to assume BDF = requester ID on all platforms.

With the catch that in case of ARI devices
(http://pcisig.com/sites/default/files/specification_documents/ECN-alt-rid-interpretation-070604.pdf),
BDF is actually BF because the device number is always 0 and the
function number is 8 bits.

And some other problems, such as broken PCI devices...
Both Xen x86 (domain_context_mapping in drivers/passthrough/vtd/iommu.c) and
Linux (pci_dma_for_each_alias) use code more complex than requesterID = BDF.

So I don't think we can use requesterID = BDF in a physdev op unless we are
*strictly* sure this is valid.

The spec is quite clear about it, but I guess there might be hardware quirks.
Can we keep this open for now and, till there is agreement, make
requesterID = BDF?

If you are OK, I will update and send Draft 3.




Although, based on the x86 code, Xen should be able to translate the BDF into
the requester ID...

Yes, that is a good point.





Re: [Xen-devel] PCI Pass-through in Xen ARM - Draft 2.

2015-07-09 Thread Manish Jaggi



On Tuesday 07 July 2015 04:54 PM, Ian Campbell wrote:

On Tue, 2015-07-07 at 14:16 +0530, Manish Jaggi wrote:

As asked you in the previous mail, can you please prove it? The
function used to get the requester ID (pci_for_each_dma_alias) is more
complex than a simple return sbdf.

I am not sure what you would like me to prove.
As of ThunderX Xen code we have assumed sbdf == deviceID.

Please remember that you are not writing ThunderX Xen code here, you
are writing generic Xen code which you happen to be testing on Thunder
X. The design and implementation does need to consider the more generic
case I'm afraid.

In particular if this is going to be a PHYSDEVOP then it needs to be
designed to be future proof, since PHYSDEVOP is a stable API i.e. it is
hard to change in the future.

I think I did ask elsewhere _why_ this was a physdev op, since I can't
see why it can't be done by the toolstack, and therefore why it can't be
a domctl.

If it can be done in a domctl I prefer that. Will get back on this.


If this was a domctl there might be scope for accepting an
implementation which made assumptions such as sbdf == deviceid. However
I'd still like to see this topic given proper treatment in the design
and not just glossed over with "this is how ThunderX does things".

I got your point.

Or maybe the solution is simple and we should just do it now -- i.e. can
we add a new field to the PHYSDEVOP_pci_host_bridge_add argument struct
which contains the base deviceid for that bridge
deviceID would be the same as sbdf, as we don't have a way to translate
sbdf to deviceID.

What about the SMMU streamID, can we also have sbdf = deviceID = smmuid_bdf?
FYI: in Thunder each RC is on a separate SMMU.

Can we take that as step 1?

  (since I believe both
DT and ACPI IORT assume a simple linear mapping[citation needed])?
I am ok with the approach, but then we have to put something similar to
IORT in the device tree.

Currently it is not there.
If we take that route of creating an IORT for host / guest it would be an
altogether different effort.

Ian.


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel





Re: [Xen-devel] PCI Pass-through in Xen ARM - Draft 2.

2015-07-09 Thread Manish Jaggi



On Thursday 09 July 2015 01:38 PM, Julien Grall wrote:

Hi Manish,

On 09/07/2015 08:13, Manish Jaggi wrote:


If this was a domctl there might be scope for accepting an
implementation which made assumptions such as sbdf == deviceid. However
I'd still like to see this topic given proper treatment in the design
and not just glossed over with "this is how ThunderX does things".

I got your point.
Or maybe the solution is simple and we should just do it now -- i.e. 
can

we add a new field to the PHYSDEVOP_pci_host_bridge_add argument struct
which contains the base deviceid for that bridge

deviceID would be the same as sbdf, as we don't have a way to translate sbdf
to deviceID.


I think we have to be clear in this design document about the
different meanings.


When the Device Tree is used, it's assumed that the deviceID will be 
equal to the requester ID and not the sbdf.

Does SMMUv2 have a concept of requesterID?
I see the requesterID term in SMMUv3.


Linux provides a function (pci_for_each_dma_alias) which will return a 
requester ID for a given PCI device. It appears that the BDF (the 's' 
of sBDF is only internal to Linux and not part of the hardware) is 
equal to the requester ID on your platform but we can't assume it for 
anyone else.

So you mean requesterID = pci_for_each_dma_alias(sbdf)?


When we have a PCI device in hand, we have to find the requester ID for
it.

That is the question. How to map requesterID to sbdf?

Once we have it we can deduce the streamID and the deviceID. The way to do
it will depend on whether we use device tree or ACPI:
- For device tree, the streamID and deviceID will be equal to the
requester ID
What do you think should be the streamID when a device is a PCI EP and is
enumerated? Also, per the ARM SMMU 2.0 spec, StreamID is implementation specific.

As per the SMMUv3 spec: "For PCI, it is intended that StreamID is generated
from the PCI RequesterID. The generation function may be 1:1", e.g. where
one Root Complex is hosted by one SMMU.


- For ACPI, we would have to look up in the ACPI IORT.

For the latter, I think they are static tables and can therefore be
parsed in Xen. So we wouldn't need PHYSDEVOP_pci_host_bridge_add to
pass an offset. This will also avoid any assumption that the deviceIDs for
a given root complex are always contiguous, and makes it extendable to any
new hardware requiring a different *ID.
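
For illustration, once the table has been parsed the lookup is mechanical:
each IORT node carries an array of ID mappings, and translation is a range
check plus an offset. A sketch under that assumption (the structure below is
illustrative, with id_count holding the number of IDs in the range):

struct iort_id_map {
    u32 input_base;   /* first input ID (e.g. RID) covered */
    u32 output_base;  /* corresponding first output ID */
    u32 id_count;     /* number of IDs in the range */
    void *output_ref; /* referenced SMMU or ITS group node */
};

static bool iort_translate_id(const struct iort_id_map *map, unsigned int nr,
                              u32 in, u32 *out, void **ref)
{
    unsigned int i;

    for ( i = 0; i < nr; i++ )
    {
        if ( in >= map[i].input_base &&
             in - map[i].input_base < map[i].id_count )
        {
            *out = map[i].output_base + (in - map[i].input_base);
            *ref = map[i].output_ref;
            return true;
        }
    }
    return false;
}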


So what we really care about is the requester ID. Although, I'm not sure if
you can find it in Xen. If not, we may need to customize PCI device add
(i.e. by adding a new PHYSDEVOP) to take a requesterID as a parameter.
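
If Xen cannot derive it, a sketch of how the dom0 side could compute the
requester ID with the existing Linux helper before issuing such a
(hypothetical) PHYSDEVOP:

static int __collect_alias(struct pci_dev *pdev, u16 alias, void *data)
{
    /* The last alias visited is the one seen at the root/IOMMU. */
    *(u16 *)data = alias;
    return 0;
}

static u16 dev_requester_id(struct pci_dev *dev)
{
    u16 rid = PCI_DEVID(dev->bus->number, dev->devfn);

    pci_for_each_dma_alias(dev, __collect_alias, &rid);
    return rid;
}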


Now, in the case of the guest, as we are only supporting device tree, 
we could make the assumption that requesterID == deviceID as long as 
this is exposed in a DOMCTL to allow us flexibility.


It would make sense to extend DOMCTL_assign_device to take the vBDF
(or requesterID?) as a parameter.


Regards,




___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] PCI Pass-through in Xen ARM - Draft 2.

2015-07-07 Thread Manish Jaggi



On Tuesday 07 July 2015 01:48 PM, Julien Grall wrote:

Hi Manish,

On 07/07/2015 08:10, Manish Jaggi wrote:

On Monday 06 July 2015 05:15 PM, Julien Grall wrote:

On 06/07/15 12:09, Manish Jaggi wrote:


On Monday 06 July 2015 04:13 PM, Julien Grall wrote:

On 05/07/15 06:55, Manish Jaggi wrote:

4.3 Hypercall for bdf mapping notification to xen
---
#define PHYSDEVOP_map_sbdf  43
typedef struct {
   u32 s;
   u8 b;
   u8 df;
   u16 res;
} sbdf_t;
struct physdev_map_sbdf {
   int domain_id;
   sbdf_t sbdf;
   sbdf_t gsbdf;
};

Each domain has a pdev list, which contains the list of all pci
devices.
The
pdev structure already has a sbdf information. The arch_pci_dev is
updated to
contain the gsbdf information. (gs- guest segment id)

Whenever there is trap from guest or an interrupt has to be
injected,
the pdev
list is iterated to find the gsbdf.

Can you give more background for this section? i.e:
  - Why do you need this?
  - How xen will translate the gbdf to a vDeviceID?

In the context of the hypercall processing.
That wasn't my question. I asked, how will Xen find the mapping between
the gbdf and vDeviceID? It doesn't have access to the firmware table and
is therefore not able to find the right one.

I believe gsbdf and vDeviceID would be the same.

Xen and the guest need to translate the gsbdf the same way. If this is
clearly defined by a spec, then you should give a link to it.

They are the same; I will change sbdf -> DeviceID and gsbdf -> vDeviceID.


As I asked you in the previous mail, can you please prove it? The
function used to get the requester ID (pci_for_each_dma_alias) is more
complex than a simple "return sbdf".

I am not sure what you would like me to prove.
As of the ThunderX Xen code we have assumed sbdf == deviceID. We are not
using ACPI as of now. This is our implementation; it cannot be outright
wrong.

Can you please suggest what could be another approach?



Furthermore, AFAICT, the IORT Table (from ACPI) [1] is used to specify 
the relationships between the requester ID and the DeviceID. So it's 
not obvious that sbdf == DeviceID.



If not, you have to explain in this design doc how you plan to have xen
and the guest using the same vdevID for a given gsbdf.

Regards,





[1] 
http://infocenter.arm.com/help/topic/com.arm.doc.den0049a/DEN0049A_IO_Remapping_Table.pdf






___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] PCI Pass-through in Xen ARM - Draft 2.

2015-07-07 Thread Manish Jaggi



On Monday 06 July 2015 05:15 PM, Julien Grall wrote:

On 06/07/15 12:09, Manish Jaggi wrote:


On Monday 06 July 2015 04:13 PM, Julien Grall wrote:

On 05/07/15 06:55, Manish Jaggi wrote:

4.3 Hypercall for bdf mapping notification to xen
---
#define PHYSDEVOP_map_sbdf  43
typedef struct {
   u32 s;
   u8 b;
   u8 df;
   u16 res;
} sbdf_t;
struct physdev_map_sbdf {
   int domain_id;
   sbdf_t sbdf;
   sbdf_t gsbdf;
};

Each domain has a pdev list, which contains the list of all pci
devices.
The
pdev structure already has a sbdf information. The arch_pci_dev is
updated to
contain the gsbdf information. (gs- guest segment id)

Whenever there is trap from guest or an interrupt has to be injected,
the pdev
list is iterated to find the gsbdf.

Can you give more background for this section? i.e:
  - Why do you need this?
  - How xen will translate the gbdf to a vDeviceID?

In the context of the hypercall processing.

That wasn't my question. I asked, how will Xen find the mapping between
the gbdf and vDeviceID? It doesn't have access to the firmware table and
is therefore not able to find the right one.

I believe gsbdf and vDeviceID would be the same.

Xen and the guest need to translate the gsbdf the same way. If this is
clearly defined by a spec, then you should give a link to it.

They are the same; I will change sbdf -> DeviceID and gsbdf -> vDeviceID.

If not, you have to explain in this design doc how you plan to have xen
and the guest using the same vdevID for a given gsbdf.

Regards,




___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] PCI Pass-through in Xen ARM - Draft 2.

2015-07-07 Thread Manish Jaggi



On Tuesday 07 July 2015 02:16 PM, Manish Jaggi wrote:



On Tuesday 07 July 2015 01:48 PM, Julien Grall wrote:

Hi Manish,

On 07/07/2015 08:10, Manish Jaggi wrote:

On Monday 06 July 2015 05:15 PM, Julien Grall wrote:

On 06/07/15 12:09, Manish Jaggi wrote:


On Monday 06 July 2015 04:13 PM, Julien Grall wrote:

On 05/07/15 06:55, Manish Jaggi wrote:

4.3 Hypercall for bdf mapping notification to xen
---
#define PHYSDEVOP_map_sbdf  43
typedef struct {
   u32 s;
   u8 b;
   u8 df;
   u16 res;
} sbdf_t;
struct physdev_map_sbdf {
   int domain_id;
   sbdf_t sbdf;
   sbdf_t gsbdf;
};

Each domain has a pdev list, which contains the list of all pci
devices.
The
pdev structure already has a sbdf information. The 
arch_pci_dev is

updated to
contain the gsbdf information. (gs- guest segment id)

Whenever there is trap from guest or an interrupt has to be
injected,
the pdev
list is iterated to find the gsbdf.

Can you give more background for this section? i.e:
  - Why do you need this?
  - How xen will translate the gbdf to a vDeviceID?

In the context of the hypercall processing.
That wasn't my question. I asked, how will Xen find the mapping between
the gbdf and vDeviceID? It doesn't have access to the firmware table and
is therefore not able to find the right one.

I believe gsbdf and vDeviceID would be the same.

Xen and the guest need to translate the gsbdf the same way. If this is
clearly defined by a spec, then you should give a link to it.

They are the same; I will change sbdf -> DeviceID and gsbdf -> vDeviceID.


As I asked you in the previous mail, can you please prove it? The
function used to get the requester ID (pci_for_each_dma_alias) is
more complex than a simple "return sbdf".

I am not sure what you would like me to prove.
As of the ThunderX Xen code we have assumed sbdf == deviceID. We are not
using ACPI as of now. This is our implementation; it cannot be outright
wrong.

Can you please suggest what could be another approach?



Furthermore, AFAICT, the IORT Table (from ACPI) [1] is used to 
specify the relationships between the requester ID and the DeviceID. 
So it's not obvious that sbdf == DeviceID.


If not, you have to explain in this design doc how you plan to have 
xen

and the guest using the same vdevID for a given gsbdf.

Regards,





[1] 
http://infocenter.arm.com/help/topic/com.arm.doc.den0049a/DEN0049A_IO_Remapping_Table.pdf


If ACPI is not used, the IORT (sbdf -> StreamID -> deviceID) mapping has to
be described in the device tree.
Can we add this as a TODO, so that the first series of patches can be
accepted with StreamID == DeviceID == sbdf?





___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel





Re: [Xen-devel] PCI Pass-through in Xen ARM - Draft 2.

2015-07-06 Thread Manish Jaggi



On Monday 06 July 2015 04:13 PM, Julien Grall wrote:

On 05/07/15 06:55, Manish Jaggi wrote:

4.3 Hypercall for bdf mapping notification to xen
---
#define PHYSDEVOP_map_sbdf  43
typedef struct {
  u32 s;
  u8 b;
  u8 df;
  u16 res;
} sbdf_t;
struct physdev_map_sbdf {
  int domain_id;
  sbdf_t sbdf;
  sbdf_t gsbdf;
};

Each domain has a pdev list, which contains the list of all pci devices.
The
pdev structure already has a sbdf information. The arch_pci_dev is
updated to
contain the gsbdf information. (gs- guest segment id)

Whenever there is trap from guest or an interrupt has to be injected,
the pdev
list is iterated to find the gsbdf.

Can you give more background for this section? i.e:
 - Why do you need this?
 - How xen will translate the gbdf to a vDeviceID?

In the context of the hypercall processing.

That wasn't my question. I asked, how will Xen find the mapping between
the gbdf and vDeviceID? It doesn't have access to the firmware table and
is therefore not able to find the right one.
I believe gsbdf and vDeviceID would be the same. In the hypercall processing
its_assign_device would be called

with params its_assign_device(sbdf, gsbdf, domid).



Regards,




___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] PCI Pass-through in Xen ARM - Draft 2.

2015-07-06 Thread Manish Jaggi



On Monday 06 July 2015 02:41 PM, Ian Campbell wrote:

On Sun, 2015-07-05 at 11:25 +0530, Manish Jaggi wrote:

On Monday 29 June 2015 04:01 PM, Julien Grall wrote:

Hi Manish,

On 28/06/15 19:38, Manish Jaggi wrote:

4.1 Holes in guest memory space

Holes are added in the guest memory space for mapping pci device's BAR
regions.
These are defined in arch-arm.h

/* For 32bit */
GUEST_MMIO_HOLE0_BASE, GUEST_MMIO_HOLE0_SIZE
   
/* For 64bit */

GUEST_MMIO_HOLE1_BASE , GUEST_MMIO_HOLE1_SIZE

The memory layout for 32bit and 64bit are exactly the same. Why do you
need to differ here?

I think Ian has already replied. I will change the name of the macro.

4.2 New entries in xenstore for device BARs

toolkit also updates the xenstore information for the device
(virtualbar:physical bar).
This information is read by xenpciback and returned to the pcifront
driver configuration
space accesses.

Can you detail what you plan to put in xenstore and how?

It is implementation-specific. But I plan to put it under the domU / device / hierarchy

Actually, xenstore is an API of sorts which needs to be maintained going
forward (since the frontend and backend can evolve separately), so it does
need some level of design and documentation.


What about the expansion ROM?

Do you want to put some restriction on not using the expansion ROM as a
passthrough device?

expansion ROM as a passthrough device doesn't make sense to me,
passthrough devices may _have_ an expansion ROM.

The expansion ROM is just another BAR. I don't know how pcifront/back
deal with those today on PV x86, but I see no reason for ARM to deviate.



4.3 Hypercall for bdf mapping notification to xen
---
#define PHYSDEVOP_map_sbdf  43
typedef struct {
  u32 s;
  u8 b;
  u8 df;
  u16 res;
} sbdf_t;
struct physdev_map_sbdf {
  int domain_id;
  sbdf_t sbdf;
  sbdf_t gsbdf;
};

Each domain has a pdev list, which contains the list of all pci devices.
The
pdev structure already has a sbdf information. The arch_pci_dev is
updated to
contain the gsbdf information. (gs- guest segment id)

Whenever there is trap from guest or an interrupt has to be injected,
the pdev
list is iterated to find the gsbdf.

Can you give more background for this section? i.e:
- Why do you need this?
- How xen will translate the gbdf to a vDeviceID?

In the context of the hypercall processing.

- Who will call this hypercall?
- Why not setting the gsbdf when the device is assigned?

Can the maintainer of pciback suggest an alternative?

That's not me, but I don't think this belongs here, I think it can be
done from the toolstack. If you think not then please explain what
information the toolstack doesn't have in its possession which prevents
this mapping from being done there.
The toolstack does not have the guest sbdf information. I could only 
find it in xenpciback.



The answer to your question is that the only place I have found, where
all the information needed to issue the hypercall is available, is the
function
__xen_pcibk_add_pci_dev


drivers/xen/xen-pciback/vpci.c

unlock:
...
  kfree(dev_entry);

+   /* Issue Hypercall here */
+#ifdef CONFIG_ARM64
+   map_sbdf.domain_id = pdev->xdev->otherend_id;
+   map_sbdf.sbdf_s = dev->bus->domain_nr;
+   map_sbdf.sbdf_b = dev->bus->number;
+   map_sbdf.sbdf_d = dev->devfn >> 3;
+   map_sbdf.sbdf_f = dev->devfn & 0x7;
+   map_sbdf.gsbdf_s = 0;
+   map_sbdf.gsbdf_b = 0;
+   map_sbdf.gsbdf_d = slot;
+   map_sbdf.gsbdf_f = dev->devfn & 0x7;
+   pr_info("## sbdf = %d:%d:%d.%d g_sbdf %d:%d:%d.%d "
+           "domain_id=%d ##\r\n",
+           map_sbdf.sbdf_s,
+           map_sbdf.sbdf_b,
+           map_sbdf.sbdf_d,
+           map_sbdf.sbdf_f,
+           map_sbdf.gsbdf_s,
+           map_sbdf.gsbdf_b,
+           map_sbdf.gsbdf_d,
+           map_sbdf.gsbdf_f,
+           map_sbdf.domain_id);
+
+   err = HYPERVISOR_physdev_op(PHYSDEVOP_map_sbdf, &map_sbdf);
+   if (err)
+       printk(KERN_ERR "Xen Error PHYSDEVOP_map_sbdf\n");
+#endif
---
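
For completeness, a sketch of the matching Xen-side handler (names such as
arch.gsbdf are illustrative and pdev-list locking is elided;
its_assign_device is the call discussed elsewhere in this thread):

case PHYSDEVOP_map_sbdf: {
    struct physdev_map_sbdf info;
    struct pci_dev *pdev;

    ret = -EFAULT;
    if ( copy_from_guest(&info, arg, 1) )
        break;

    pdev = pci_get_pdev(info.sbdf.s, info.sbdf.b, info.sbdf.df);
    if ( !pdev )
    {
        ret = -ENODEV;
        break;
    }

    pdev->arch.gsbdf = info.gsbdf;  /* remember the guest's view */
    ret = its_assign_device(info.sbdf, info.gsbdf, info.domain_id);
    break;
}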


Regards,




___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel





Re: [Xen-devel] PCI Pass-through in Xen ARM - Draft 2.

2015-07-06 Thread Manish Jaggi



On Sunday 05 July 2015 11:25 AM, Manish Jaggi wrote:



On Monday 29 June 2015 04:01 PM, Julien Grall wrote:

Hi Manish,

On 28/06/15 19:38, Manish Jaggi wrote:

4.1 Holes in guest memory space

Holes are added in the guest memory space for mapping pci device's BAR
regions.
These are defined in arch-arm.h

/* For 32bit */
GUEST_MMIO_HOLE0_BASE, GUEST_MMIO_HOLE0_SIZE
  /* For 64bit */
GUEST_MMIO_HOLE1_BASE , GUEST_MMIO_HOLE1_SIZE

The memory layout for 32bit and 64bit are exactly the same. Why do you
need to differ here?

I think Ian has already replied. I will change the name of the macro.

4.2 New entries in xenstore for device BARs

toolkit also updates the xenstore information for the device
(virtualbar:physical bar).
This information is read by xenpciback and returned to the pcifront
driver configuration
space accesses.

Can you detail what you plan to put in xenstore and how?

It is implementation-specific. But I plan to put it under the domU / device / hierarchy

What about the expansion ROM?
Do you want to put some restriction on not using the expansion ROM as a
passthrough device?



4.3 Hypercall for bdf mapping notification to xen
---
#define PHYSDEVOP_map_sbdf  43
typedef struct {
 u32 s;
 u8 b;
 u8 df;
 u16 res;
} sbdf_t;
struct physdev_map_sbdf {
 int domain_id;
  sbdf_t sbdf;
  sbdf_t gsbdf;
};

Each domain has a pdev list, which contains the list of all pci 
devices.

The
pdev structure already has a sbdf information. The arch_pci_dev is
updated to
contain the gsbdf information. (gs- guest segment id)

Whenever there is trap from guest or an interrupt has to be injected,
the pdev
list is iterated to find the gsbdf.

Can you give more background for this section? i.e:
- Why do you need this?
- How xen will translate the gbdf to a vDeviceID?

In the context of the hypercall processing.
The hypercall handler in xen would call its_assign_device(sbdf, gsbdf,
domid);

- Who will call this hypercall?
- Why not setting the gsbdf when the device is assigned?

Can the maintainer of pciback suggest an alternative?
The answer to your question is that the only place I have found, where
all the information needed to issue the hypercall is available, is the
function
__xen_pcibk_add_pci_dev


drivers/xen/xen-pciback/vpci.c

unlock:
...
kfree(dev_entry);

+   /* Issue Hypercall here */
+#ifdef CONFIG_ARM64
+   map_sbdf.domain_id = pdev->xdev->otherend_id;
+   map_sbdf.sbdf_s = dev->bus->domain_nr;
+   map_sbdf.sbdf_b = dev->bus->number;
+   map_sbdf.sbdf_d = dev->devfn >> 3;
+   map_sbdf.sbdf_f = dev->devfn & 0x7;
+   map_sbdf.gsbdf_s = 0;
+   map_sbdf.gsbdf_b = 0;
+   map_sbdf.gsbdf_d = slot;
+   map_sbdf.gsbdf_f = dev->devfn & 0x7;
+   pr_info("## sbdf = %d:%d:%d.%d g_sbdf %d:%d:%d.%d "
+           "domain_id=%d ##\r\n",
+           map_sbdf.sbdf_s,
+           map_sbdf.sbdf_b,
+           map_sbdf.sbdf_d,
+           map_sbdf.sbdf_f,
+           map_sbdf.gsbdf_s,
+           map_sbdf.gsbdf_b,
+           map_sbdf.gsbdf_d,
+           map_sbdf.gsbdf_f,
+           map_sbdf.domain_id);
+
+   err = HYPERVISOR_physdev_op(PHYSDEVOP_map_sbdf, &map_sbdf);
+   if (err)
+       printk(KERN_ERR "Xen Error PHYSDEVOP_map_sbdf\n");
+#endif
---


Regards,




___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel





Re: [Xen-devel] PCI Pass-through in Xen ARM - Draft 2

2015-07-05 Thread Manish Jaggi

Ian Campbell Wrote:

On Mon, 2015-06-29 at 00:08 +0530, Manish Jaggi wrote:
PCI Pass-through in Xen ARM
--

Draft 2

Index

1. Background

2. Basic PCI Support in Xen ARM
2.1 pci_hostbridge and pci_hostbridge_ops
2.2 PHYSDEVOP_HOSTBRIDGE_ADD hypercall

3. Dom0 Access PCI devices

4. DomU assignment of PCI device
4.1 Holes in guest memory space
4.2 New entries in xenstore for device BARs
4.3 Hypercall for bdf mapping notification to xen
4.4 Change in Linux PCI FrontEnd - backend driver
  for MSI/X programming

5. NUMA and PCI passthrough

6. DomU pci device attach flow


Revision History

Changes from Draft 1
a) map_mmio hypercall removed from earlier draft
b) device bar mapping into guest not 1:1
c) holes in guest address space 32bit / 64bit for MMIO virtual BARs
d) xenstore device's BAR info addition.


1. Background of PCI passthrough

Passthrough refers to assigning a pci device to a guest domain (domU) such
that
the guest has full control over the device. The MMIO space and interrupts are
managed by the guest itself, close to how a bare-metal kernel manages a device.

The device's access to the guest address space needs to be isolated and
protected. The SMMU
(System MMU - the IOMMU on ARM) is programmed by the xen hypervisor to allow
the device to access guest memory for data transfer and to send MSI/X
interrupts. In the case of MSI/X the device writes to the GITS (ITS address
space) Interrupt Translation Register.

2. Basic PCI Support for ARM

The APIs to read/write the pci configuration space are based on segment:bdf.
How the sbdf is mapped to a physical address is in the realm of the pci
host controller.

ARM PCI support in Xen introduces pci host controller drivers similar to what
exists in Linux. Each driver registers callbacks, which are invoked on
matching the compatible property in the pci device tree node.

2.1:
The init function in the pci host driver calls to register hostbridge
callbacks:
int pci_hostbridge_register(pci_hostbridge_t *pcihb);

struct pci_hostbridge_ops {
 u32 (*pci_conf_read)(struct pci_hostbridge*, u32 bus, u32 devfn,
 u32 reg, u32 bytes);
 void (*pci_conf_write)(struct pci_hostbridge*, u32 bus, u32 devfn,
 u32 reg, u32 bytes, u32 val);
};

struct pci_hostbridge{
 u32 segno;
 paddr_t cfg_base;
 paddr_t cfg_size;
 struct dt_device_node *dt_node;
 struct pci_hostbridge_ops ops;
 struct list_head list;
};

A pci conf read function would internally be as follows:
u32 pcihb_conf_read(u32 seg, u32 bus, u32 devfn, u32 reg, u32 bytes)
{
    pci_hostbridge_t *pcihb;

    list_for_each_entry(pcihb, &pci_hostbridge_list, list)
    {
        if ( pcihb->segno == seg )
            return pcihb->ops.pci_conf_read(pcihb, bus, devfn, reg, bytes);
    }
    return -1;
}

2.2 PHYSDEVOP_pci_host_bridge_add hypercall

Xen code accesses PCI configuration space based on the sbdf received from the
guest. The order in which the pci device tree nodes appear may not be the
same as the order of device enumeration in dom0. Thus there needs to be a
mechanism to bind
the segment number assigned by dom0 to the pci host controller. The following
hypercall is introduced:

#define PHYSDEVOP_pci_host_bridge_add 44
struct physdev_pci_host_bridge_add {
 /* IN */
 uint16_t seg;
 uint64_t cfg_base;
 uint64_t cfg_size;
};

This hypercall is invoked before dom0 invokes the PHYSDEVOP_pci_device_add
hypercall. The handler code invokes the following to update the segment
number in the pci_hostbridge:

int pci_hostbridge_setup(uint32_t segno, uint64_t cfg_base, uint64_t
cfg_size);

Subsequent calls to pci_conf_read/write are completed by the
pci_hostbridge_ops
of the respective pci_hostbridge.

3. Dom0 access PCI device
-
As per the design of xen hypervisor, dom0 enumerates the PCI devices. For each
device the MMIO space has to be mapped in the Stage2 translation for dom0.


Here "device" is really the host bridge, isn't it? i.e. this is done by
mapping the entire MMIO window of each host bridge, not the individual
BAR registers of each device one at a time.


No, the device means the PCIe EP device, not the RC.



IOW this is functionality of the pci host driver's initial setup, not
something which is driven from the dom0 enumeration of the bus.


 For
dom0 xen maps the ranges in pci nodes in stage 2 translation.

The GITS_ITRANSLATER space (4K) must be programmed in the Stage2 translation
so that MSI/X
will work. This is done in vits initialization for dom0/domU.


This also happens at start of day, but what isn't mentioned is that
(AIUI) the SMMU will need to be programmed to map each SBDF to the dom0
p2m as the devices are discovered and reported. Right?


Yes, I will add an SMMU section in Draft 3.


4. DomU access / assignment PCI device
--
In the flow of pci-attach device, the toolkit


I assume you mean toolstack throughout? If so

Re: [Xen-devel] PCI Pass-through in Xen ARM - Draft 2.

2015-07-04 Thread Manish Jaggi



On Monday 29 June 2015 04:01 PM, Julien Grall wrote:

Hi Manish,

On 28/06/15 19:38, Manish Jaggi wrote:

4.1 Holes in guest memory space

Holes are added in the guest memory space for mapping pci device's BAR
regions.
These are defined in arch-arm.h

/* For 32bit */
GUEST_MMIO_HOLE0_BASE, GUEST_MMIO_HOLE0_SIZE
  
/* For 64bit */

GUEST_MMIO_HOLE1_BASE , GUEST_MMIO_HOLE1_SIZE

The memory layout for 32bit and 64bit are exactly the same. Why do you
need to differ here?

I think Ian has already replied. I will change the name of the macro.

4.2 New entries in xenstore for device BARs

toolkit also updates the xenstore information for the device
(virtualbar:physical bar).
This information is read by xenpciback and returned to the pcifront
driver configuration
space accesses.

Can you detail what you plan to put in xenstore and how?

It is implementation-specific. But I plan to put it under the domU / device / hierarchy

What about the expansion ROM?
Do you want to put some restriction on not using the expansion ROM as a
passthrough device?



4.3 Hypercall for bdf mapping notification to xen
---
#define PHYSDEVOP_map_sbdf  43
typedef struct {
 u32 s;
 u8 b;
 u8 df;
 u16 res;
} sbdf_t;
struct physdev_map_sbdf {
 int domain_id;
  sbdf_t sbdf;
  sbdf_t gsbdf;
};

Each domain has a pdev list, which contains the list of all pci devices.
The
pdev structure already has a sbdf information. The arch_pci_dev is
updated to
contain the gsbdf information. (gs- guest segment id)

Whenever there is trap from guest or an interrupt has to be injected,
the pdev
list is iterated to find the gsbdf.

Can you give more background for this section? i.e:
- Why do you need this?
- How xen will translate the gbdf to a vDeviceID?

In the context of the hypercall processing.

- Who will call this hypercall?
- Why not setting the gsbdf when the device is assigned?

Can the maintainer of pciback suggest an alternative?
The answer to your question is that the only place I have found, where
all the information needed to issue the hypercall is available, is the
function
__xen_pcibk_add_pci_dev


drivers/xen/xen-pciback/vpci.c

unlock:
...
kfree(dev_entry);

+   /* Issue Hypercall here */
+#ifdef CONFIG_ARM64
+   map_sbdf.domain_id = pdev->xdev->otherend_id;
+   map_sbdf.sbdf_s = dev->bus->domain_nr;
+   map_sbdf.sbdf_b = dev->bus->number;
+   map_sbdf.sbdf_d = dev->devfn >> 3;
+   map_sbdf.sbdf_f = dev->devfn & 0x7;
+   map_sbdf.gsbdf_s = 0;
+   map_sbdf.gsbdf_b = 0;
+   map_sbdf.gsbdf_d = slot;
+   map_sbdf.gsbdf_f = dev->devfn & 0x7;
+   pr_info("## sbdf = %d:%d:%d.%d g_sbdf %d:%d:%d.%d "
+           "domain_id=%d ##\r\n",
+           map_sbdf.sbdf_s,
+           map_sbdf.sbdf_b,
+           map_sbdf.sbdf_d,
+           map_sbdf.sbdf_f,
+           map_sbdf.gsbdf_s,
+           map_sbdf.gsbdf_b,
+           map_sbdf.gsbdf_d,
+           map_sbdf.gsbdf_f,
+           map_sbdf.domain_id);
+
+   err = HYPERVISOR_physdev_op(PHYSDEVOP_map_sbdf, &map_sbdf);
+   if (err)
+       printk(KERN_ERR "Xen Error PHYSDEVOP_map_sbdf\n");
+#endif
---


Regards,




___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


[Xen-devel] PCI Pass-through in Xen ARM - Draft 2.

2015-06-28 Thread Manish Jaggi

PCI Pass-through in Xen ARM
--

Draft 2

Index

1. Background

2. Basic PCI Support in Xen ARM
2.1 pci_hostbridge and pci_hostbridge_ops
2.2 PHYSDEVOP_HOSTBRIDGE_ADD hypercall

3. Dom0 Access PCI devices

4. DomU assignment of PCI device
4.1 Holes in guest memory space
4.2 New entries in xenstore for device BARs
4.3 Hypercall for bdf mapping notification to xen
4.4 Change in Linux PCI FrontEnd - backend driver
 for MSI/X programming

5. NUMA and PCI passthrough

6. DomU pci device attach flow


Revision History

Changes from Draft 1
a) map_mmio hypercall removed from earlier draft
b) device bar mapping into guest not 1:1
c) holes in guest address space 32bit / 64bit for MMIO virtual BARs
d) xenstore device's BAR info addition.


1. Background of PCI passthrough

Passthrough refers to assigning a pci device to a guest domain (domU) such that
the guest has full control over the device. The MMIO space and interrupts are
managed by the guest itself, close to how a bare-metal kernel manages a device.

The device's access to the guest address space needs to be isolated and
protected. The SMMU (System MMU - the IOMMU on ARM) is programmed by the xen
hypervisor to allow the device to access guest memory for data transfer and
to send MSI/X interrupts. In the case of MSI/X the device writes to the GITS
(ITS address space) Interrupt Translation Register.

2. Basic PCI Support for ARM

The APIs to read/write the pci configuration space are based on segment:bdf.
How the sbdf is mapped to a physical address is in the realm of the pci
host controller.

ARM PCI support in Xen introduces pci host controller drivers similar to what
exists in Linux. Each driver registers callbacks, which are invoked on
matching the compatible property in the pci device tree node.

2.1:
The init function in the pci host driver calls to register hostbridge callbacks:
int pci_hostbridge_register(pci_hostbridge_t *pcihb);

struct pci_hostbridge_ops {
u32 (*pci_conf_read)(struct pci_hostbridge*, u32 bus, u32 devfn,
u32 reg, u32 bytes);
void (*pci_conf_write)(struct pci_hostbridge*, u32 bus, u32 devfn,
u32 reg, u32 bytes, u32 val);
};

struct pci_hostbridge{
u32 segno;
paddr_t cfg_base;
paddr_t cfg_size;
struct dt_device_node *dt_node;
struct pci_hostbridge_ops ops;
struct list_head list;
};

A pci conf read function would internally be as follows:
u32 pcihb_conf_read(u32 seg, u32 bus, u32 devfn, u32 reg, u32 bytes)
{
    pci_hostbridge_t *pcihb;

    list_for_each_entry(pcihb, &pci_hostbridge_list, list)
    {
        if ( pcihb->segno == seg )
            return pcihb->ops.pci_conf_read(pcihb, bus, devfn, reg, bytes);
    }
    return -1;
}
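
As an illustration of the registration side, a sketch of how a host-bridge
driver's init could use this API (the thunder_* names are made up for the
example):

static u32 thunder_conf_read(struct pci_hostbridge *pcihb, u32 bus,
                             u32 devfn, u32 reg, u32 bytes);
static void thunder_conf_write(struct pci_hostbridge *pcihb, u32 bus,
                               u32 devfn, u32 reg, u32 bytes, u32 val);

static int thunder_pcie_init(struct dt_device_node *node)
{
    pci_hostbridge_t *pcihb = xzalloc(pci_hostbridge_t);

    if ( !pcihb )
        return -ENOMEM;

    pcihb->dt_node = node;
    /* cfg_base/cfg_size would come from the node's "reg" property. */
    pcihb->ops.pci_conf_read = thunder_conf_read;
    pcihb->ops.pci_conf_write = thunder_conf_write;

    return pci_hostbridge_register(pcihb);
}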

2.2 PHYSDEVOP_pci_host_bridge_add hypercall

Xen code accesses PCI configuration space based on the sbdf received from the
guest. The order in which the pci device tree nodes appear may not be the
same as the order of device enumeration in dom0. Thus there needs to be a
mechanism to bind
the segment number assigned by dom0 to the pci host controller. The following
hypercall is introduced:

#define PHYSDEVOP_pci_host_bridge_add 44
struct physdev_pci_host_bridge_add {
/* IN */
uint16_t seg;
uint64_t cfg_base;
uint64_t cfg_size;
};

This hypercall is invoked before dom0 invokes the PHYSDEVOP_pci_device_add
hypercall. The handler code invokes the following to update the segment number in the pci_hostbridge:

int pci_hostbridge_setup(uint32_t segno, uint64_t cfg_base, uint64_t cfg_size);

Subsequent calls to pci_conf_read/write are completed by the pci_hostbridge_ops
of the respective pci_hostbridge.
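
A sketch of what pci_hostbridge_setup could do with the structures above,
assuming the config window dom0 reports uniquely identifies the bridge
(illustrative only):

int pci_hostbridge_setup(uint32_t segno, uint64_t cfg_base, uint64_t cfg_size)
{
    pci_hostbridge_t *pcihb;

    list_for_each_entry(pcihb, &pci_hostbridge_list, list)
    {
        /* Match the bridge by the config window dom0 is describing. */
        if ( pcihb->cfg_base == cfg_base && pcihb->cfg_size == cfg_size )
        {
            pcihb->segno = segno;
            return 0;
        }
    }
    return -ENODEV;
}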

3. Dom0 access PCI device
-
As per the design of xen hypervisor, dom0 enumerates the PCI devices. For each
device the MMIO space has to be mapped in the Stage2 translation for dom0. For
dom0 xen maps the ranges in pci nodes in stage 2 translation.

The GITS_ITRANSLATER space (4K) must be programmed in the Stage2 translation
so that MSI/X
will work. This is done in vits initialization for dom0/domU.

4. DomU access / assignment PCI device
--
In the flow of pci-attach device, the toolkit will read the pci configuration
space BAR registers. The toolkit has the guest memory map and the information
of the MMIO holes.

When the first pci device is assigned to domU, the toolkit allocates a
virtual
BAR region from the MMIO hole area. The toolkit then sends the domctl
xc_domain_memory_mapping
to map it in the stage2 translation, as sketched below.
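
For reference, a sketch of that call as the toolstack might make it
(xc_domain_memory_mapping is the existing libxc wrapper; the frame numbers
are illustrative):

/*
 * Map nr_pages of the physical BAR (machine frame bar_mfn) into the
 * guest's stage-2 at the chosen virtual BAR (guest frame vbar_gfn).
 */
rc = xc_domain_memory_mapping(ctx->xch, domid,
                              vbar_gfn, bar_mfn, nr_pages,
                              DPCI_ADD_MAPPING);
if (rc)
    fprintf(stderr, "stage-2 mapping of BAR failed: %d\n", rc);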

4.1 Holes in guest memory space

Holes are added in the guest memory space for mapping pci device's BAR regions.
These are defined in arch-arm.h

/* For 32bit */
GUEST_MMIO_HOLE0_BASE, GUEST_MMIO_HOLE0_SIZE
 
/* For 64bit */

GUEST_MMIO_HOLE1_BASE , GUEST_MMIO_HOLE1_SIZE

4.2 New entries in xenstore for device BARs

toolkit also updates the xenstore information for the 

Re: [Xen-devel] PCI Passthrough ARM Design : Draft1

2015-06-26 Thread Manish Jaggi



On Friday 26 June 2015 01:02 PM, Ian Campbell wrote:

On Fri, 2015-06-26 at 07:37 +0530, Manish Jaggi wrote:

On Thursday 25 June 2015 10:56 PM, Konrad Rzeszutek Wilk wrote:

On Thu, Jun 25, 2015 at 01:21:28PM +0100, Ian Campbell wrote:

On Thu, 2015-06-25 at 17:29 +0530, Manish Jaggi wrote:

On Thursday 25 June 2015 02:41 PM, Ian Campbell wrote:

On Thu, 2015-06-25 at 13:14 +0530, Manish Jaggi wrote:

On Wednesday 17 June 2015 07:59 PM, Ian Campbell wrote:

On Wed, 2015-06-17 at 07:14 -0700, Manish Jaggi wrote:

On Wednesday 17 June 2015 06:43 AM, Ian Campbell wrote:

On Wed, 2015-06-17 at 13:58 +0100, Stefano Stabellini wrote:

Yes, pciback is already capable of doing that, see
drivers/xen/xen-pciback/conf_space.c


I am not sure if the pci-back driver can query the guest memory map. Is there 
an existing hypercall ?

No, that is missing.  I think it would be OK for the virtual BAR to be
initialized to the same value as the physical BAR.  But I would let the
guest change the virtual BAR address and map the MMIO region wherever it
wants in the guest physical address space with
XENMEM_add_to_physmap_range.

I disagree, given that we've apparently survived for years with x86 PV
guests not being able to write to the BARs I think it would be far
simpler to extend this to ARM and x86 PVH too than to allow guests to
start writing BARs which has various complex questions around it.
All that's needed is for the toolstack to set everything up and write
some new xenstore nodes in the per-device directory with the BAR
address/size.

Also most guests apparently don't reassign the PCI bus by default, so
using a 1:1 by default and allowing it to be changed would require
modifying the guests to reassign. Easy on Linux, but I don't know about
others and I imagine some OSes (especially simpler/embedded ones) are
assuming the firmware sets up something sane by default.

Does the flow below capture all the points?
a) When assigning a device to domU, toolstack creates a node in per
device directory with virtual BAR address/size

Option1:
b) toolstack using some hypercall asks xen to create p2m mapping {
virtual BAR : physical BAR } for domU

While implementing, I think that rather than the toolstack, the pciback
driver in
dom0 can send the
hypercall to map the physical bar to the virtual bar.
Thus no xenstore entry is required for BARs.

pciback doesn't (and shouldn't) have sufficient knowledge of the guest
address space layout to determine what the virtual BAR should be. The
toolstack is the right place for that decision to be made.

Yes, the point is the pciback driver reads the physical BAR regions on
request from domU.
So it sends a hypercall to map the physical bars into stage2 translation
for the domU through xen.
Xen would use the holes left in IPA for MMIO.

I still think it is the toolstack which should do this, that's where
these sorts of layout decisions belong.

can the xl tools read pci conf space ?

Yes, via sysfs (possibly abstracted via libpci). Just like lspci and
friends do.


Using some xen hypercall or a xl-dom0 ioctl ?

No, using normal pre-existing Linux functionality.


If not then there is no otherway but xenpciback

Also I need to introduce a hypercall which would tell the toolkit the
available holes for virtualBAR mapping.
Much simpler is to let xen allocate a virtualBAR and return it to the caller.

At init - sure. But not when the guest is running and doing those sorts
of things. Unless you want guest -> pciback -> xenstore -> libxl ->
hypercall -> send ack on xenstore -> pciback -> guest.

That would entail adding some pciback -> user-space tickle mechanism
and another one back. Much simpler to do all of this in xenpciback I think?

I agree. If the xenpciback sends a hypercall whenever there is a BAR read
access,
the mapping
in xen would already have been done, so xen would simply be doing a
PA -> IPA lookup.
No xenstore lookup is required.

The xenstore read would happen once on device attach, at the same time
you are reading the rest of the dev-NNN stuff relating to the just
attached device.

Doing a xenstore transaction on every BAR read would indeed be silly and
doing a hypercall would not be much better. There is no need for either
a xenstore read or a hypercall during the cfg space access itself, you
just read the value from a pciback datastructure.
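
For reference, a condensed sketch of that data structure read, following
drivers/xen/xen-pciback/conf_space_header.c from memory (details may
differ):

struct pci_bar_info {
    u32 val;      /* the value the guest sees (the virtual BAR) */
    u32 len_val;  /* the value returned while the guest sizes the BAR */
    int which;    /* selects between val and len_val */
};

static int bar_read(struct pci_dev *dev, int offset, u32 *value, void *data)
{
    struct pci_bar_info *bar = data;

    /* No hypercall, no xenstore: just return the cached view. */
    *value = bar->which ? bar->len_val : bar->val;
    bar->which = 0;
    return 0;
}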

Add to that the fact that any new hypercall made from dom0 needs to be
added as a stable interface, and I can't see any reason to go with such a
model.
I think you are overlooking a point, which is: from what region should the
virtual BAR be allocated?
One way is for xen to keep a hole for domains where the bar regions can be
mapped. This is not there as of now.


How would the tools know about this hole?
Is a domctl required?
For this reason I was suggesting a hypercall to xen to map the physical
BARs and return the virtualBARs.


Ian.


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Re: [Xen-devel] PCI Passthrough ARM Design : Draft1

2015-06-26 Thread Manish Jaggi



On Friday 26 June 2015 02:39 PM, Ian Campbell wrote:

On Fri, 2015-06-26 at 14:20 +0530, Manish Jaggi wrote:

On Friday 26 June 2015 01:02 PM, Ian Campbell wrote:

On Fri, 2015-06-26 at 07:37 +0530, Manish Jaggi wrote:

On Thursday 25 June 2015 10:56 PM, Konrad Rzeszutek Wilk wrote:

On Thu, Jun 25, 2015 at 01:21:28PM +0100, Ian Campbell wrote:

On Thu, 2015-06-25 at 17:29 +0530, Manish Jaggi wrote:

On Thursday 25 June 2015 02:41 PM, Ian Campbell wrote:

On Thu, 2015-06-25 at 13:14 +0530, Manish Jaggi wrote:

On Wednesday 17 June 2015 07:59 PM, Ian Campbell wrote:

On Wed, 2015-06-17 at 07:14 -0700, Manish Jaggi wrote:

On Wednesday 17 June 2015 06:43 AM, Ian Campbell wrote:

On Wed, 2015-06-17 at 13:58 +0100, Stefano Stabellini wrote:

Yes, pciback is already capable of doing that, see
drivers/xen/xen-pciback/conf_space.c


I am not sure if the pci-back driver can query the guest memory map. Is there 
an existing hypercall ?

No, that is missing.  I think it would be OK for the virtual BAR to be
initialized to the same value as the physical BAR.  But I would let the
guest change the virtual BAR address and map the MMIO region wherever it
wants in the guest physical address space with
XENMEM_add_to_physmap_range.

I disagree, given that we've apparently survived for years with x86 PV
guests not being able to write to the BARs I think it would be far
simpler to extend this to ARM and x86 PVH too than to allow guests to
start writing BARs which has various complex questions around it.
All that's needed is for the toolstack to set everything up and write
some new xenstore nodes in the per-device directory with the BAR
address/size.

Also most guests apparently don't reassign the PCI bus by default, so
using a 1:1 by default and allowing it to be changed would require
modifying the guests to reassign. Easy on Linux, but I don't know about
others and I imagine some OSes (especially simpler/embedded ones) are
assuming the firmware sets up something sane by default.

Does the flow below capture all the points?
a) When assigning a device to domU, toolstack creates a node in per
device directory with virtual BAR address/size

Option1:
b) toolstack using some hypercall asks xen to create p2m mapping {
virtual BAR : physical BAR } for domU

While implementing, I think that rather than the toolstack, the pciback
driver in
dom0 can send the
hypercall to map the physical bar to the virtual bar.
Thus no xenstore entry is required for BARs.

pciback doesn't (and shouldn't) have sufficient knowledge of the guest
address space layout to determine what the virtual BAR should be. The
toolstack is the right place for that decision to be made.

Yes, the point is the pciback driver reads the physical BAR regions on
request from domU.
So it sends a hypercall to map the physical bars into stage2 translation
for the domU through xen.
Xen would use the holes left in IPA for MMIO.

I still think it is the toolstack which should do this, that's where
these sorts of layout decisions belong.

can the xl tools read pci conf space ?

Yes, via sysfs (possibly abstracted via libpci). Just like lspci and
friends do.

Will implement that.

Using some xen hypercall or a xl-dom0 ioctl ?

No, using normal pre-existing Linux functionality.


If not then there is no otherway but xenpciback

Also I need to introduce a hypercall which would tell the toolkit the
available holes for virtualBAR mapping.
Much simpler is to let xen allocate a virtualBAR and return it to the caller.

At init - sure. But not when the guest is running and doing those sorts
of things. Unless you want guest -> pciback -> xenstore -> libxl ->
hypercall -> send ack on xenstore -> pciback -> guest.

That would entail adding some pciback -> user-space tickle mechanism
and another one back. Much simpler to do all of this in xenpciback I think?

I agree. If the xenpciback sends a hypercall whenever there is a BAR read
access,
the mapping
in xen would already have been done, so xen would simply be doing a
PA -> IPA lookup.
No xenstore lookup is required.

The xenstore read would happen once on device attach, at the same time
you are reading the rest of the dev-NNN stuff relating to the just
attached device.

Doing a xenstore transaction on every BAR read would indeed be silly and
doing a hypercall would not be much better. There is no need for either
a xenstore read or a hypercall during the cfg space access itself, you
just read the value from a pciback datastructure.

Add to that the fact that any new hypercall made from dom0 needs to be
added as a stable interface, and I can't see any reason to go with such a
model.

I think you are overlooking a point, which is: from what region should the
virtual BAR be allocated?
One way is for xen to keep a hole for domains where the bar regions can be
mapped. This is not there as of now.

How would the tools know about this hole?

I think you've overlooked the point that _only_ the tools know enough
about the overall guest address space layout to know about this hole.
Xen has no need to know anything

Re: [Xen-devel] PCI Passthrough ARM Design : Draft1

2015-06-25 Thread Manish Jaggi



On Wednesday 17 June 2015 07:59 PM, Ian Campbell wrote:

On Wed, 2015-06-17 at 07:14 -0700, Manish Jaggi wrote:

On Wednesday 17 June 2015 06:43 AM, Ian Campbell wrote:

On Wed, 2015-06-17 at 13:58 +0100, Stefano Stabellini wrote:

Yes, pciback is already capable of doing that, see
drivers/xen/xen-pciback/conf_space.c


I am not sure if the pci-back driver can query the guest memory map. Is there 
an existing hypercall ?

No, that is missing.  I think it would be OK for the virtual BAR to be
initialized to the same value as the physical BAR.  But I would let the
guest change the virtual BAR address and map the MMIO region wherever it
wants in the guest physical address space with
XENMEM_add_to_physmap_range.

I disagree, given that we've apparently survived for years with x86 PV
guests not being able to write to the BARs I think it would be far
simpler to extend this to ARM and x86 PVH too than to allow guests to
start writing BARs which has various complex questions around it.
All that's needed is for the toolstack to set everything up and write
some new xenstore nodes in the per-device directory with the BAR
address/size.

Also most guests apparently don't reassign the PCI bus by default, so
using a 1:1 by default and allowing it to be changed would require
modifying the guests to reassign. Easy on Linux, but I don't know about
others and I imagine some OSes (especially simpler/embedded ones) are
assuming the firmware sets up something sane by default.

Does the flow below capture all the points?
a) When assigning a device to domU, toolstack creates a node in per
device directory with virtual BAR address/size

Option1:
b) toolstack using some hypercall asks xen to create p2m mapping {
virtual BAR : physical BAR } for domU
While implementing, I think that rather than the toolstack, the pciback
driver in
dom0 can send the
hypercall to map the physical bar to the virtual bar.
Thus no xenstore entry is required for BARs. Moreover a pci driver would 
read BARs only once.

c) domU will not at any time update the BARs; if it does then it is a fault,
until we decide how to handle it

As Julien has noted, pciback already deals with this correctly: because
sizing a BAR involves a write, it implements a scheme which allows
either the hardcoded virtual BAR to be written or all 1s (needed for
size detection).
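
A condensed sketch of that sizing scheme, reusing the pci_bar_info
structure sketched earlier (again simplified and from memory):

static int bar_write(struct pci_dev *dev, int offset, u32 value, void *data)
{
    struct pci_bar_info *bar = data;

    if (value == ~0U) {
        /* Size probe: the next read must return the encoded length. */
        bar->which = 1;
    } else {
        /* Only the virtual value set up at attach time is accepted;
         * the write is never forwarded to the real hardware. */
        bar->which = 0;
    }
    return 0;
}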


d) when domU queries BAR address from pci-back the virtual BAR address
is provided.

Option2:
b) domU will not at any time update the BARs; if it does then it is a fault,
until we decide how to handle it
c) when domU queries BAR address from pci-back the virtual BAR address
is provided.
d) domU sends a hypercall to map virtual BARs,
e) xen pci code reads the BAR and maps { virtual BAR : physical BAR }
for domU

Which option is better? I think Ian is for (2) and Stefano may be for (1).

In fact I'm now (after Julien pointed out the current behaviour of
pciback) in favour of (1), although I'm not sure if Stefano is too.

(I was never in favour of (2), FWIW, I previously was in favour of (3)
which is like (2) except pciback makes the hypercall to map the virtual
bars to the guest, I'd still favour that over (2) but (1) is now my
preference)

Ian.


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel





Re: [Xen-devel] PCI Passthrough ARM Design : Draft1

2015-06-25 Thread Manish Jaggi



On Thursday 25 June 2015 02:41 PM, Ian Campbell wrote:

On Thu, 2015-06-25 at 13:14 +0530, Manish Jaggi wrote:

On Wednesday 17 June 2015 07:59 PM, Ian Campbell wrote:

On Wed, 2015-06-17 at 07:14 -0700, Manish Jaggi wrote:

On Wednesday 17 June 2015 06:43 AM, Ian Campbell wrote:

On Wed, 2015-06-17 at 13:58 +0100, Stefano Stabellini wrote:

Yes, pciback is already capable of doing that, see
drivers/xen/xen-pciback/conf_space.c


I am not sure if the pci-back driver can query the guest memory map. Is there 
an existing hypercall ?

No, that is missing.  I think it would be OK for the virtual BAR to be
initialized to the same value as the physical BAR.  But I would let the
guest change the virtual BAR address and map the MMIO region wherever it
wants in the guest physical address space with
XENMEM_add_to_physmap_range.

I disagree, given that we've apparently survived for years with x86 PV
guests not being able to write to the BARs I think it would be far
simpler to extend this to ARM and x86 PVH too than to allow guests to
start writing BARs which has various complex questions around it.
All that's needed is for the toolstack to set everything up and write
some new xenstore nodes in the per-device directory with the BAR
address/size.

Also most guests apparently don't reassign the PCI bus by default, so
using a 1:1 by default and allowing it to be changed would require
modifying the guests to reasssign. Easy on Linux, but I don't know about
others and I imagine some OSes (especially simpler/embedded ones) are
assuming the firmware sets up something sane by default.

Does the flow below capture all the points?
a) When assigning a device to domU, toolstack creates a node in per
device directory with virtual BAR address/size

Option1:
b) toolstack using some hypercall asks xen to create p2m mapping {
virtual BAR : physical BAR } for domU

While implementing, I think that rather than the toolstack, the pciback
driver in
dom0 can send the
hypercall to map the physical bar to the virtual bar.
Thus no xenstore entry is required for BARs.

pciback doesn't (and shouldn't) have sufficient knowledge of the guest
address space layout to determine what the virtual BAR should be. The
toolstack is the right place for that decision to be made.
Yes, the point is the pciback driver reads the physical BAR regions on 
request from domU.
So it sends a hypercall to map the physical bars into stage2 translation 
for the domU through xen.

Xen would use the holes left in IPA for MMIO.
Xen would return the IPA for pci-back to return in its response to domU.

Moreover a pci driver would read BARs only once.

You can't assume that though, a driver can do whatever it likes, or the
module might be unloaded and reloaded in the guest etc etc.

Are you going to send out a second draft based on the discussion so far?
yes, I was working on that. I was traveling this week; 24-hour
flights, jetlag...


Ian.


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel





Re: [Xen-devel] PCI Passthrough ARM Design : Draft1

2015-06-25 Thread Manish Jaggi



On Thursday 25 June 2015 10:56 PM, Konrad Rzeszutek Wilk wrote:

On Thu, Jun 25, 2015 at 01:21:28PM +0100, Ian Campbell wrote:

On Thu, 2015-06-25 at 17:29 +0530, Manish Jaggi wrote:

On Thursday 25 June 2015 02:41 PM, Ian Campbell wrote:

On Thu, 2015-06-25 at 13:14 +0530, Manish Jaggi wrote:

On Wednesday 17 June 2015 07:59 PM, Ian Campbell wrote:

On Wed, 2015-06-17 at 07:14 -0700, Manish Jaggi wrote:

On Wednesday 17 June 2015 06:43 AM, Ian Campbell wrote:

On Wed, 2015-06-17 at 13:58 +0100, Stefano Stabellini wrote:

Yes, pciback is already capable of doing that, see
drivers/xen/xen-pciback/conf_space.c


I am not sure if the pci-back driver can query the guest memory map. Is there 
an existing hypercall ?

No, that is missing.  I think it would be OK for the virtual BAR to be
initialized to the same value as the physical BAR.  But I would let the
guest change the virtual BAR address and map the MMIO region wherever it
wants in the guest physical address space with
XENMEM_add_to_physmap_range.

I disagree, given that we've apparently survived for years with x86 PV
guests not being able to write to the BARs I think it would be far
simpler to extend this to ARM and x86 PVH too than to allow guests to
start writing BARs which has various complex questions around it.
All that's needed is for the toolstack to set everything up and write
some new xenstore nodes in the per-device directory with the BAR
address/size.

Also most guests apparently don't reassign the PCI bus by default, so
using a 1:1 by default and allowing it to be changed would require
modifying the guests to reasssign. Easy on Linux, but I don't know about
others and I imagine some OSes (especially simpler/embedded ones) are
assuming the firmware sets up something sane by default.

Does the flow below capture all the points?
a) When assigning a device to domU, toolstack creates a node in per
device directory with virtual BAR address/size

Option1:
b) toolstack using some hypercall asks xen to create p2m mapping {
virtual BAR : physical BAR } for domU

While implementing, I think that rather than the toolstack, the pciback
driver in
dom0 can send the
hypercall to map the physical bar to the virtual bar.
Thus no xenstore entry is required for BARs.

pciback doesn't (and shouldn't) have sufficient knowledge of the guest
address space layout to determine what the virtual BAR should be. The
toolstack is the right place for that decision to be made.

Yes, the point is the pciback driver reads the physical BAR regions on
request from domU.
So it sends a hypercall to map the physical bars into stage2 translation
for the domU through xen.
Xen would use the holes left in IPA for MMIO.

I still think it is the toolstack which should do this, that's where
these sorts of layout decisions belong.

can the xl tools read pci conf space ?
Using some xen hypercall or a xl-dom0 ioctl ?
If not then there is no otherway but xenpciback

Also I need to introduce a hypercall which would tell the toolkit the
available holes for virtualBAR mapping.

Much simpler is to let xen allocate a virtualBAR and return it to the caller.

At init - sure. But not when the guest is running and doing those sorts
of things. Unless you want guest -> pciback -> xenstore -> libxl ->
hypercall -> send ack on xenstore -> pciback -> guest.

That would entail adding some pciback -> user-space tickle mechanism
and another one back. Much simpler to do all of this in xenpciback I think?
I agree. If the xenpciback sends a hypercall whenever there is a BAR read
access,
the mapping
in xen would already have been done, so xen would simply be doing a
PA -> IPA lookup.

No xenstore lookup is required.

Xen would return the IPA for pci-back to return in its response to domU.

Moreover a pci driver would read BARs only once.

You can't assume that though, a driver can do whatever it likes, or the
module might be unloaded and reloaded in the guest etc etc.

Are you going to send out a second draft based on the discussion so far?

yes, I was working on that. I was traveling this week; 24-hour
flights, jetlag...

Ian.


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel





Re: [Xen-devel] PCI Passthrough ARM Design : Draft1

2015-06-17 Thread Manish Jaggi



On Wednesday 17 June 2015 06:43 AM, Ian Campbell wrote:

On Wed, 2015-06-17 at 13:58 +0100, Stefano Stabellini wrote:

Yes, pciback is already capable of doing that, see
drivers/xen/xen-pciback/conf_space.c


I am not sure if the pci-back driver can query the guest memory map. Is there 
an existing hypercall ?

No, that is missing.  I think it would be OK for the virtual BAR to be
initialized to the same value as the physical BAR.  But I would let the
guest change the virtual BAR address and map the MMIO region wherever it
wants in the guest physical address space with
XENMEM_add_to_physmap_range.

I disagree, given that we've apparently survived for years with x86 PV
guests not being able to write to the BARs I think it would be far
simpler to extend this to ARM and x86 PVH too than to allow guests to
start writing BARs which has various complex questions around it.
All that's needed is for the toolstack to set everything up and write
some new xenstore nodes in the per-device directory with the BAR
address/size.

Also most guests apparently don't reassign the PCI bus by default, so
using a 1:1 by default and allowing it to be changed would require
modifying the guests to reassign. Easy on Linux, but I don't know about
others and I imagine some OSes (especially simpler/embedded ones) are
assuming the firmware sets up something sane by default.

Does the flow below capture all the points?
a) When assigning a device to domU, toolstack creates a node in per 
device directory with virtual BAR address/size


Option1:
b) toolstack using some hypercall asks xen to create p2m mapping {
virtual BAR : physical BAR } for domU
c) domU will not at any time update the BARs; if it does then it is a fault,
until we decide how to handle it
d) when domU queries BAR address from pci-back the virtual BAR address 
is provided.


Option2:
b) domU will not at any time update the BARs; if it does then it is a fault,
until we decide how to handle it
c) when domU queries BAR address from pci-back the virtual BAR address 
is provided.

d) domU sends a hypercall to map virtual BARs,
e) xen pci code reads the BAR and maps { virtual BAR : physical BAR } 
for domU


Which option is better? I think Ian is for (2) and Stefano may be for (1).


Ian.


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel




