Re: [Xen-devel] [RFC 02/16] hack: drop GIC v3 support

2018-11-29 Thread Andre Przywara
On Wed, 28 Nov 2018 23:31:57 +0200
Andrii Anisov  wrote:

Hi,

> From: Andrii Anisov 
> 
> This reduces some code and conditions in an IRQ processing path,
> and opens way to further code reduction.

While I understand that this is some sort of a hack, I am commenting
just on this patch to demonstrate some issues I see all over the series.
So please apply those ideas to the other patches as well if applicable.

> Also insert compilation errors into gicv3 code, because it would
> not compile or work with following patches.
> 
> Signed-off-by: Andrii Anisov 
> ---
>  xen/arch/arm/configs/arm64_defconfig |  1 +
>  xen/arch/arm/gic-v3-lpi.c|  2 ++
>  xen/arch/arm/gic-v3.c|  2 ++
>  xen/arch/arm/gic-vgic.c  | 13 -
>  xen/arch/arm/gic.c   |  6 ++
>  xen/arch/arm/vgic-v3-its.c   |  1 +
>  xen/arch/arm/vgic.c  | 20 
>  xen/arch/arm/vgic/vgic.c |  2 ++
>  xen/include/asm-arm/irq.h|  2 ++
>  xen/include/asm-arm/vgic.h   | 17 +++--
>  10 files changed, 59 insertions(+), 7 deletions(-)
> 
> diff --git a/xen/arch/arm/configs/arm64_defconfig
> b/xen/arch/arm/configs/arm64_defconfig index e69de29..452c7ac 100644
> --- a/xen/arch/arm/configs/arm64_defconfig
> +++ b/xen/arch/arm/configs/arm64_defconfig
> @@ -0,0 +1 @@
> +# CONFIG_GICV3 is not set
> diff --git a/xen/arch/arm/gic-v3-lpi.c b/xen/arch/arm/gic-v3-lpi.c
> index e8c6e15..be64e17 100644
> --- a/xen/arch/arm/gic-v3-lpi.c
> +++ b/xen/arch/arm/gic-v3-lpi.c
> @@ -32,6 +32,8 @@
>  #include 
>  #include 
>  
> +#error "The current gic/vgic/domain code does not support GICv3"
> +

I didn't check too thoroughly, but why do you need this? Your code
changes below seem to be guarded by #ifdef CONFIG_GICV3, so this file
(and the next) wouldn't even be build.

>  /*
>   * There could be a lot of LPIs on the host side, and they always go
> to
>   * a guest. So having a struct irq_desc for each of them would be
> wasteful diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
> index 6fbc106..8e835b5 100644
> --- a/xen/arch/arm/gic-v3.c
> +++ b/xen/arch/arm/gic-v3.c
> @@ -44,6 +44,8 @@
>  #include 
>  #include 
>  
> +#error "The current gic/vgic/domain code does not support GICv3"
> +
>  /* Global state */
>  static struct {
>  void __iomem *map_dbase;  /* Mapped address of distributor
> registers */ diff --git a/xen/arch/arm/gic-vgic.c
> b/xen/arch/arm/gic-vgic.c index 990399c..869987a 100644
> --- a/xen/arch/arm/gic-vgic.c
> +++ b/xen/arch/arm/gic-vgic.c
> @@ -36,7 +36,9 @@ static inline void gic_set_lr(int lr, struct
> pending_irq *p, {
>  ASSERT(!local_irq_is_enabled());
>  
> +#ifdef CONFIG_GICV3
>  clear_bit(GIC_IRQ_GUEST_PRISTINE_LPI, >status);
> +#endif
>  
>  gic_hw_ops->update_lr(lr, p->irq, p->priority,
>p->desc ? p->desc->irq : INVALID_IRQ,
> state); @@ -77,9 +79,11 @@ void gic_raise_inflight_irq(struct vcpu
> *v, unsigned int virtual_irq) {
>  struct pending_irq *n = irq_to_pending(v, virtual_irq);
>  
> +#ifdef CONFIG_GICV3
>  /* If an LPI has been removed meanwhile, there is nothing left
> to raise. */ if ( unlikely(!n) )
>  return;
> +#endif

So are you sure that this really saves you something? This check boils
down to a:
cbz x0, 670 
in assembly, which is basically free on modern CPUs.
Surprisingly removing this removes some more instructions from the
function, it would be interesting to know why.

In general I get the impression you optimise things just by looking at
the C code. Does saving a few instructions here and there really make a
difference? For instance I'd say that any non-memory instructions are
not worth to think about, and any stack accesses are probably caught in
the L1 cache.

In general I believe your performance drop is due to *exits* from the
guests due to the h/w interrupts, not necessarily because of the VGIC
code. So to reduce latency I think it would be more worthwhile to see
if we can reduce the world switch time, by avoiding unnessary
save/restores.
At least that saved a lot in KVM (with VHE), although the conditions
there are quite different.

>  ASSERT(spin_is_locked(>arch.vgic.lock));
>  
> @@ -112,13 +116,14 @@ static unsigned int gic_find_unused_lr(struct
> vcpu *v, {
>  unsigned int nr_lrs = gic_get_nr_lrs();
>  unsigned long *lr_mask = (unsigned long *) _cpu(lr_mask);
> -struct gic_lr lr_val;
>  
>  ASSERT(spin_is_locked(>arch.vgic.lock));
>  
> +#ifdef CONFIG_GICV3

I am not against removing GICv3 code from the code path if that is
not configured, but obviously those #ifdef's directly in the code are
hideous.
If you can prove that this saves you something, you could think
factoring out those functions, and define them as empty if CONFIG_GICV3
is not defined.
In this case you could do (outside of this function):

#ifdef CONFIG_GICV3
#define lpi_pristine_bit_set(p) \

[Xen-devel] [RFC 02/16] hack: drop GIC v3 support

2018-11-28 Thread Andrii Anisov
From: Andrii Anisov 

This reduces some code and conditions in an IRQ processing path,
and opens way to further code reduction.

Also insert compilation errors into gicv3 code, because it would
not compile or work with following patches.

Signed-off-by: Andrii Anisov 
---
 xen/arch/arm/configs/arm64_defconfig |  1 +
 xen/arch/arm/gic-v3-lpi.c|  2 ++
 xen/arch/arm/gic-v3.c|  2 ++
 xen/arch/arm/gic-vgic.c  | 13 -
 xen/arch/arm/gic.c   |  6 ++
 xen/arch/arm/vgic-v3-its.c   |  1 +
 xen/arch/arm/vgic.c  | 20 
 xen/arch/arm/vgic/vgic.c |  2 ++
 xen/include/asm-arm/irq.h|  2 ++
 xen/include/asm-arm/vgic.h   | 17 +++--
 10 files changed, 59 insertions(+), 7 deletions(-)

diff --git a/xen/arch/arm/configs/arm64_defconfig 
b/xen/arch/arm/configs/arm64_defconfig
index e69de29..452c7ac 100644
--- a/xen/arch/arm/configs/arm64_defconfig
+++ b/xen/arch/arm/configs/arm64_defconfig
@@ -0,0 +1 @@
+# CONFIG_GICV3 is not set
diff --git a/xen/arch/arm/gic-v3-lpi.c b/xen/arch/arm/gic-v3-lpi.c
index e8c6e15..be64e17 100644
--- a/xen/arch/arm/gic-v3-lpi.c
+++ b/xen/arch/arm/gic-v3-lpi.c
@@ -32,6 +32,8 @@
 #include 
 #include 
 
+#error "The current gic/vgic/domain code does not support GICv3"
+
 /*
  * There could be a lot of LPIs on the host side, and they always go to
  * a guest. So having a struct irq_desc for each of them would be wasteful
diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
index 6fbc106..8e835b5 100644
--- a/xen/arch/arm/gic-v3.c
+++ b/xen/arch/arm/gic-v3.c
@@ -44,6 +44,8 @@
 #include 
 #include 
 
+#error "The current gic/vgic/domain code does not support GICv3"
+
 /* Global state */
 static struct {
 void __iomem *map_dbase;  /* Mapped address of distributor registers */
diff --git a/xen/arch/arm/gic-vgic.c b/xen/arch/arm/gic-vgic.c
index 990399c..869987a 100644
--- a/xen/arch/arm/gic-vgic.c
+++ b/xen/arch/arm/gic-vgic.c
@@ -36,7 +36,9 @@ static inline void gic_set_lr(int lr, struct pending_irq *p,
 {
 ASSERT(!local_irq_is_enabled());
 
+#ifdef CONFIG_GICV3
 clear_bit(GIC_IRQ_GUEST_PRISTINE_LPI, >status);
+#endif
 
 gic_hw_ops->update_lr(lr, p->irq, p->priority,
   p->desc ? p->desc->irq : INVALID_IRQ, state);
@@ -77,9 +79,11 @@ void gic_raise_inflight_irq(struct vcpu *v, unsigned int 
virtual_irq)
 {
 struct pending_irq *n = irq_to_pending(v, virtual_irq);
 
+#ifdef CONFIG_GICV3
 /* If an LPI has been removed meanwhile, there is nothing left to raise. */
 if ( unlikely(!n) )
 return;
+#endif
 
 ASSERT(spin_is_locked(>arch.vgic.lock));
 
@@ -112,13 +116,14 @@ static unsigned int gic_find_unused_lr(struct vcpu *v,
 {
 unsigned int nr_lrs = gic_get_nr_lrs();
 unsigned long *lr_mask = (unsigned long *) _cpu(lr_mask);
-struct gic_lr lr_val;
 
 ASSERT(spin_is_locked(>arch.vgic.lock));
 
+#ifdef CONFIG_GICV3
 if ( unlikely(test_bit(GIC_IRQ_GUEST_PRISTINE_LPI, >status)) )
 {
 unsigned int used_lr;
+struct gic_lr lr_val;
 
 for_each_set_bit(used_lr, lr_mask, nr_lrs)
 {
@@ -127,6 +132,7 @@ static unsigned int gic_find_unused_lr(struct vcpu *v,
 return used_lr;
 }
 }
+#endif
 
 lr = find_next_zero_bit(lr_mask, nr_lrs, lr);
 
@@ -142,9 +148,11 @@ void gic_raise_guest_irq(struct vcpu *v, unsigned int 
virtual_irq,
 
 ASSERT(spin_is_locked(>arch.vgic.lock));
 
+#ifdef CONFIG_GICV3
 if ( unlikely(!p) )
 /* An unmapped LPI does not need to be raised. */
 return;
+#endif
 
 if ( v == current && list_empty(>arch.vgic.lr_pending) )
 {
@@ -172,6 +180,8 @@ static void gic_update_one_lr(struct vcpu *v, int i)
 gic_hw_ops->read_lr(i, _val);
 irq = lr_val.virq;
 p = irq_to_pending(v, irq);
+
+#ifdef CONFIG_GICV3
 /*
  * An LPI might have been unmapped, in which case we just clean up here.
  * If that LPI is marked as PRISTINE, the information in the LR is bogus,
@@ -188,6 +198,7 @@ static void gic_update_one_lr(struct vcpu *v, int i)
 
 return;
 }
+#endif
 
 if ( lr_val.active )
 {
diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
index 6cc7dec..77fc06f 100644
--- a/xen/arch/arm/gic.c
+++ b/xen/arch/arm/gic.c
@@ -136,7 +136,9 @@ int gic_route_irq_to_guest(struct domain *d, unsigned int 
virq,
 /* Caller has already checked that the IRQ is an SPI */
 ASSERT(virq >= 32);
 ASSERT(virq < vgic_num_irqs(d));
+#ifdef CONFIG_GICV3
 ASSERT(!is_lpi(virq));
+#endif
 
 /*
  * When routing an IRQ to guest, the virtual state is not synced
@@ -168,7 +170,9 @@ int gic_remove_irq_from_guest(struct domain *d, unsigned 
int virq,
 
 ASSERT(spin_is_locked(>lock));
 ASSERT(test_bit(_IRQ_GUEST, >status));
+#ifdef CONFIG_GICV3
 ASSERT(!is_lpi(virq));
+#endif
 
 /*
  * Removing an interrupt while the domain is