Re: [Xen-devel] [PATCH v5 09/30] ARM: GICv3 ITS: introduce host LPI array

2017-04-06 Thread Julien Grall

Hi Andre,

On 06/04/17 11:31, Andre Przywara wrote:

On 06/04/17 00:37, Stefano Stabellini wrote:

On Thu, 6 Apr 2017, Andre Przywara wrote:

The number of LPIs on a host can be potentially huge (millions),
although in practise will be mostly reasonable. So prematurely allocating
an array of struct irq_desc's for each LPI is not an option.
However Xen itself does not care about LPIs, as every LPI will be injected
into a guest (Dom0 for now).
Create a dense data structure (8 Bytes) for each LPI which holds just
enough information to determine the virtual IRQ number and the VCPU into
which the LPI needs to be injected.
Also to not artificially limit the number of LPIs, we create a 2-level
table for holding those structures.
This patch introduces functions to initialize these tables and to
create, lookup and destroy entries for a given LPI.
By using the naturally atomic access guarantee the native uint64_t data
type gives us, we allocate and access LPI information in a way that does
not require a lock.

Signed-off-by: Andre Przywara 
---
 xen/arch/arm/gic-v3-its.c|  60 +++
 xen/arch/arm/gic-v3-lpi.c| 227 +++
 xen/include/asm-arm/gic_v3_its.h |   6 ++
 xen/include/asm-arm/irq.h|   8 ++
 4 files changed, 301 insertions(+)

diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
index 1ecd63b..eb47c9d 100644
--- a/xen/arch/arm/gic-v3-its.c
+++ b/xen/arch/arm/gic-v3-its.c


[ ... ]


+
+void gicv3_free_host_lpi_block(uint32_t first_lpi)
+{
+union host_lpi *hlpi, empty_lpi = { .dom_id = DOMID_INVALID };
+int i;
+
+hlpi = gic_get_host_lpi(first_lpi);
+if ( !hlpi )
+return; /* Nothing to free here. */


We should check that first_lpi is actually the first lpi in a block
before calling gic_get_host_lpi.


Are you thinking about an:
ASSERT((first_lpi % LPI_BLOCK) == 0);
here? Or even a BUG_ON?

Technically the only two callers of gicv3_free_host_lpi_block() always
take this value from the host_lpi_blocks array, which should be safe.


Well, maybe today. But without documentation someone could misuse this 
function in the future...


The check would be here for safety

Cheers,

--
Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v5 09/30] ARM: GICv3 ITS: introduce host LPI array

2017-04-06 Thread Andre Przywara
Hi,

On 06/04/17 00:37, Stefano Stabellini wrote:
> On Thu, 6 Apr 2017, Andre Przywara wrote:
>> The number of LPIs on a host can be potentially huge (millions),
>> although in practise will be mostly reasonable. So prematurely allocating
>> an array of struct irq_desc's for each LPI is not an option.
>> However Xen itself does not care about LPIs, as every LPI will be injected
>> into a guest (Dom0 for now).
>> Create a dense data structure (8 Bytes) for each LPI which holds just
>> enough information to determine the virtual IRQ number and the VCPU into
>> which the LPI needs to be injected.
>> Also to not artificially limit the number of LPIs, we create a 2-level
>> table for holding those structures.
>> This patch introduces functions to initialize these tables and to
>> create, lookup and destroy entries for a given LPI.
>> By using the naturally atomic access guarantee the native uint64_t data
>> type gives us, we allocate and access LPI information in a way that does
>> not require a lock.
>>
>> Signed-off-by: Andre Przywara 
>> ---
>>  xen/arch/arm/gic-v3-its.c|  60 +++
>>  xen/arch/arm/gic-v3-lpi.c| 227 
>> +++
>>  xen/include/asm-arm/gic_v3_its.h |   6 ++
>>  xen/include/asm-arm/irq.h|   8 ++
>>  4 files changed, 301 insertions(+)
>>
>> diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
>> index 1ecd63b..eb47c9d 100644
>> --- a/xen/arch/arm/gic-v3-its.c
>> +++ b/xen/arch/arm/gic-v3-its.c

[ ... ]

>> +
>> +void gicv3_free_host_lpi_block(uint32_t first_lpi)
>> +{
>> +union host_lpi *hlpi, empty_lpi = { .dom_id = DOMID_INVALID };
>> +int i;
>> +
>> +hlpi = gic_get_host_lpi(first_lpi);
>> +if ( !hlpi )
>> +return; /* Nothing to free here. */
> 
> We should check that first_lpi is actually the first lpi in a block
> before calling gic_get_host_lpi.

Are you thinking about an:
ASSERT((first_lpi % LPI_BLOCK) == 0);
here? Or even a BUG_ON?

Technically the only two callers of gicv3_free_host_lpi_block() always
take this value from the host_lpi_blocks array, which should be safe.

Cheers,
Andre.

>> +spin_lock(_data.host_lpis_lock);
>> +
>> +for ( i = 0; i < LPI_BLOCK; i++ )
>> +write_u64_atomic([i].data, empty_lpi.data);
>> +
>> +/*
>> + * Make sure the next allocation can reuse this block, as we do only
>> + * forward scanning when finding an unused block.
>> + */
>> +if ( lpi_data.next_free_lpi > first_lpi )
>> +lpi_data.next_free_lpi = first_lpi;
>> +
>> +spin_unlock(_data.host_lpis_lock);
>> +
>> +return;
>> +}
>> +

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v5 09/30] ARM: GICv3 ITS: introduce host LPI array

2017-04-05 Thread Stefano Stabellini
On Thu, 6 Apr 2017, Andre Przywara wrote:
> The number of LPIs on a host can be potentially huge (millions),
> although in practise will be mostly reasonable. So prematurely allocating
> an array of struct irq_desc's for each LPI is not an option.
> However Xen itself does not care about LPIs, as every LPI will be injected
> into a guest (Dom0 for now).
> Create a dense data structure (8 Bytes) for each LPI which holds just
> enough information to determine the virtual IRQ number and the VCPU into
> which the LPI needs to be injected.
> Also to not artificially limit the number of LPIs, we create a 2-level
> table for holding those structures.
> This patch introduces functions to initialize these tables and to
> create, lookup and destroy entries for a given LPI.
> By using the naturally atomic access guarantee the native uint64_t data
> type gives us, we allocate and access LPI information in a way that does
> not require a lock.
> 
> Signed-off-by: Andre Przywara 
> ---
>  xen/arch/arm/gic-v3-its.c|  60 +++
>  xen/arch/arm/gic-v3-lpi.c| 227 
> +++
>  xen/include/asm-arm/gic_v3_its.h |   6 ++
>  xen/include/asm-arm/irq.h|   8 ++
>  4 files changed, 301 insertions(+)
> 
> diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
> index 1ecd63b..eb47c9d 100644
> --- a/xen/arch/arm/gic-v3-its.c
> +++ b/xen/arch/arm/gic-v3-its.c
> @@ -157,6 +157,20 @@ static int its_send_cmd_sync(struct host_its *its, 
> unsigned int cpu)
>  return its_send_command(its, cmd);
>  }
>  
> +static int its_send_cmd_mapti(struct host_its *its,
> +  uint32_t deviceid, uint32_t eventid,
> +  uint32_t pintid, uint16_t icid)
> +{
> +uint64_t cmd[4];
> +
> +cmd[0] = GITS_CMD_MAPTI | ((uint64_t)deviceid << 32);
> +cmd[1] = eventid | ((uint64_t)pintid << 32);
> +cmd[2] = icid;
> +cmd[3] = 0x00;
> +
> +return its_send_command(its, cmd);
> +}
> +
>  static int its_send_cmd_mapc(struct host_its *its, uint32_t collection_id,
>   unsigned int cpu)
>  {
> @@ -171,6 +185,19 @@ static int its_send_cmd_mapc(struct host_its *its, 
> uint32_t collection_id,
>  return its_send_command(its, cmd);
>  }
>  
> +static int its_send_cmd_inv(struct host_its *its,
> +uint32_t deviceid, uint32_t eventid)
> +{
> +uint64_t cmd[4];
> +
> +cmd[0] = GITS_CMD_INV | ((uint64_t)deviceid << 32);
> +cmd[1] = eventid;
> +cmd[2] = 0x00;
> +cmd[3] = 0x00;
> +
> +return its_send_command(its, cmd);
> +}
> +
>  /* Set up the (1:1) collection mapping for the given host CPU. */
>  int gicv3_its_setup_collection(unsigned int cpu)
>  {
> @@ -450,6 +477,39 @@ int gicv3_its_init(void)
>  return 0;
>  }
>  
> +/*
> + * On the host ITS @its, map @nr_events consecutive LPIs.
> + * The mapping connects a device @devid and event @eventid pair to LPI @lpi,
> + * increasing both @eventid and @lpi to cover the number of requested LPIs.
> + */
> +static int gicv3_its_map_host_events(struct host_its *its,
> + uint32_t devid, uint32_t eventid,
> + uint32_t lpi, uint32_t nr_events)
> +{
> +uint32_t i;
> +int ret;
> +
> +for ( i = 0; i < nr_events; i++ )
> +{
> +/* For now we map every host LPI to host CPU 0 */
> +ret = its_send_cmd_mapti(its, devid, eventid + i, lpi + i, 0);
> +if ( ret )
> +return ret;
> +
> +ret = its_send_cmd_inv(its, devid, eventid + i);
> +if ( ret )
> +return ret;
> +}
> +
> +/* TODO: Consider using INVALL here. Didn't work on the model, though. */
> +
> +ret = its_send_cmd_sync(its, 0);
> +if ( ret )
> +return ret;
> +
> +return gicv3_its_wait_commands(its);
> +}
> +
>  /* Scan the DT for any ITS nodes and create a list of host ITSes out of it. 
> */
>  void gicv3_its_dt_init(const struct dt_device_node *node)
>  {
> diff --git a/xen/arch/arm/gic-v3-lpi.c b/xen/arch/arm/gic-v3-lpi.c
> index 9d3df7f..0785701 100644
> --- a/xen/arch/arm/gic-v3-lpi.c
> +++ b/xen/arch/arm/gic-v3-lpi.c
> @@ -20,14 +20,37 @@
>  
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
> +#include 
> +#include 
>  #include 
>  #include 
>  #include 
>  #include 
>  #include 
>  
> +/*
> + * There could be a lot of LPIs on the host side, and they always go to
> + * a guest. So having a struct irq_desc for each of them would be wasteful
> + * and useless.
> + * Instead just store enough information to find the right VCPU to inject
> + * those LPIs into, which just requires the virtual LPI number.
> + * To avoid a global lock on this data structure, this is using a lockless
> + * approach relying on the architectural atomicity of native data types:
> + * We read or write the "data" view of this union atomically, then can
> + * access the 

[Xen-devel] [PATCH v5 09/30] ARM: GICv3 ITS: introduce host LPI array

2017-04-05 Thread Andre Przywara
The number of LPIs on a host can be potentially huge (millions),
although in practise will be mostly reasonable. So prematurely allocating
an array of struct irq_desc's for each LPI is not an option.
However Xen itself does not care about LPIs, as every LPI will be injected
into a guest (Dom0 for now).
Create a dense data structure (8 Bytes) for each LPI which holds just
enough information to determine the virtual IRQ number and the VCPU into
which the LPI needs to be injected.
Also to not artificially limit the number of LPIs, we create a 2-level
table for holding those structures.
This patch introduces functions to initialize these tables and to
create, lookup and destroy entries for a given LPI.
By using the naturally atomic access guarantee the native uint64_t data
type gives us, we allocate and access LPI information in a way that does
not require a lock.

Signed-off-by: Andre Przywara 
---
 xen/arch/arm/gic-v3-its.c|  60 +++
 xen/arch/arm/gic-v3-lpi.c| 227 +++
 xen/include/asm-arm/gic_v3_its.h |   6 ++
 xen/include/asm-arm/irq.h|   8 ++
 4 files changed, 301 insertions(+)

diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
index 1ecd63b..eb47c9d 100644
--- a/xen/arch/arm/gic-v3-its.c
+++ b/xen/arch/arm/gic-v3-its.c
@@ -157,6 +157,20 @@ static int its_send_cmd_sync(struct host_its *its, 
unsigned int cpu)
 return its_send_command(its, cmd);
 }
 
+static int its_send_cmd_mapti(struct host_its *its,
+  uint32_t deviceid, uint32_t eventid,
+  uint32_t pintid, uint16_t icid)
+{
+uint64_t cmd[4];
+
+cmd[0] = GITS_CMD_MAPTI | ((uint64_t)deviceid << 32);
+cmd[1] = eventid | ((uint64_t)pintid << 32);
+cmd[2] = icid;
+cmd[3] = 0x00;
+
+return its_send_command(its, cmd);
+}
+
 static int its_send_cmd_mapc(struct host_its *its, uint32_t collection_id,
  unsigned int cpu)
 {
@@ -171,6 +185,19 @@ static int its_send_cmd_mapc(struct host_its *its, 
uint32_t collection_id,
 return its_send_command(its, cmd);
 }
 
+static int its_send_cmd_inv(struct host_its *its,
+uint32_t deviceid, uint32_t eventid)
+{
+uint64_t cmd[4];
+
+cmd[0] = GITS_CMD_INV | ((uint64_t)deviceid << 32);
+cmd[1] = eventid;
+cmd[2] = 0x00;
+cmd[3] = 0x00;
+
+return its_send_command(its, cmd);
+}
+
 /* Set up the (1:1) collection mapping for the given host CPU. */
 int gicv3_its_setup_collection(unsigned int cpu)
 {
@@ -450,6 +477,39 @@ int gicv3_its_init(void)
 return 0;
 }
 
+/*
+ * On the host ITS @its, map @nr_events consecutive LPIs.
+ * The mapping connects a device @devid and event @eventid pair to LPI @lpi,
+ * increasing both @eventid and @lpi to cover the number of requested LPIs.
+ */
+static int gicv3_its_map_host_events(struct host_its *its,
+ uint32_t devid, uint32_t eventid,
+ uint32_t lpi, uint32_t nr_events)
+{
+uint32_t i;
+int ret;
+
+for ( i = 0; i < nr_events; i++ )
+{
+/* For now we map every host LPI to host CPU 0 */
+ret = its_send_cmd_mapti(its, devid, eventid + i, lpi + i, 0);
+if ( ret )
+return ret;
+
+ret = its_send_cmd_inv(its, devid, eventid + i);
+if ( ret )
+return ret;
+}
+
+/* TODO: Consider using INVALL here. Didn't work on the model, though. */
+
+ret = its_send_cmd_sync(its, 0);
+if ( ret )
+return ret;
+
+return gicv3_its_wait_commands(its);
+}
+
 /* Scan the DT for any ITS nodes and create a list of host ITSes out of it. */
 void gicv3_its_dt_init(const struct dt_device_node *node)
 {
diff --git a/xen/arch/arm/gic-v3-lpi.c b/xen/arch/arm/gic-v3-lpi.c
index 9d3df7f..0785701 100644
--- a/xen/arch/arm/gic-v3-lpi.c
+++ b/xen/arch/arm/gic-v3-lpi.c
@@ -20,14 +20,37 @@
 
 #include 
 #include 
+#include 
 #include 
 #include 
+#include 
+#include 
 #include 
 #include 
 #include 
 #include 
 #include 
 
+/*
+ * There could be a lot of LPIs on the host side, and they always go to
+ * a guest. So having a struct irq_desc for each of them would be wasteful
+ * and useless.
+ * Instead just store enough information to find the right VCPU to inject
+ * those LPIs into, which just requires the virtual LPI number.
+ * To avoid a global lock on this data structure, this is using a lockless
+ * approach relying on the architectural atomicity of native data types:
+ * We read or write the "data" view of this union atomically, then can
+ * access the broken-down fields in our local copy.
+ */
+union host_lpi {
+uint64_t data;
+struct {
+uint32_t virt_lpi;
+uint16_t dom_id;
+uint16_t vcpu_id;
+};
+};
+
 #define LPI_PROPTABLE_NEEDS_FLUSHING(1U << 0)
 
 /* Global state */
@@ -35,12 +58,23 @@ static struct {
 /* The global LPI property