Re: [PATCH v2] arm/arm64: KVM: Properly account for guest CPU time

2015-06-08 Thread Marc Zyngier
Hi Christoffer,

On 28/05/15 19:49, Christoffer Dall wrote:
> Until now we have been calling kvm_guest_exit after re-enabling
> interrupts when we come back from the guest, but this has the
> unfortunate effect that CPU time accounting done in the context of timer
> interrupts occurring while the guest is running doesn't properly notice
> that the time since the last tick was spent in the guest.
> 
> Inspired by the comment in the x86 code, move the kvm_guest_exit() call
> below the local_irq_enable() call and change __kvm_guest_exit() to
> kvm_guest_exit(), because we are now calling this function with
> interrupts enabled.  We have to now explicitly disable preemption and
> not enable preemption before we've called kvm_guest_exit(), since
> otherwise we could be preempted and everything happening before we
> eventually get scheduled again would be accounted for as guest time.
> 
> At the same time, move the trace_kvm_exit() call outside of the atomic
> section, since there is no reason for us to do that with interrupts
> disabled.
> 
> Signed-off-by: Christoffer Dall 
> ---
> This patch is based on kvm/queue, because it has the kvm_guest_enter/exit
> rework recently posted by Christian Borntraeger.  I hope I got the logic
> of this right, there were 2 slightly worrying facts about this:
> 
> First, we now enable and disable and enable interrupts on each exit
> path, but I couldn't see any performance overhead on hackbench - yes the
> only benchmark we care about.
> 
> Second, looking at the ppc and mips code, they seem to also call
> kvm_guest_exit() before enabling interrupts, so I don't understand how
> guest CPU time accounting works on those architectures.
> 
> Changes since v1:
>  - Tweak comment and commit text based on Marc's feedback.
>  - Explicitly disable preemption and enable it only after kvm_guest_exit().
> 
>  arch/arm/kvm/arm.c | 21 +
>  1 file changed, 17 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
> index e41cb11..fe8028d 100644
> --- a/arch/arm/kvm/arm.c
> +++ b/arch/arm/kvm/arm.c
> @@ -532,6 +532,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
>   kvm_vgic_flush_hwstate(vcpu);
>   kvm_timer_flush_hwstate(vcpu);
>  
> + preempt_disable();
>   local_irq_disable();
>  
>   /*
> @@ -544,6 +545,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
>  
>   if (ret <= 0 || need_new_vmid_gen(vcpu->kvm)) {
>   local_irq_enable();
> + preempt_enable();
>   kvm_timer_sync_hwstate(vcpu);
>   kvm_vgic_sync_hwstate(vcpu);
>   continue;
> @@ -559,8 +561,10 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
>   ret = kvm_call_hyp(__kvm_vcpu_run, vcpu);
>  
>   vcpu->mode = OUTSIDE_GUEST_MODE;
> - __kvm_guest_exit();
> - trace_kvm_exit(kvm_vcpu_trap_get_class(vcpu), *vcpu_pc(vcpu));
> + /*
> +  * Back from guest
> +  */
> +
>   /*
>* We may have taken a host interrupt in HYP mode (ie
>* while executing the guest). This interrupt is still
> @@ -574,8 +578,17 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
>   local_irq_enable();
>  
>   /*
> -  * Back from guest
> -  */
> +  * We do local_irq_enable() before calling kvm_guest_exit() so
> +  * that if a timer interrupt hits while running the guest we
> +  * account that tick as being spent in the guest.  We enable
> +  * preemption after calling kvm_guest_exit() so that if we get
> +  * preempted we make sure ticks after that are not counted as
> +  * guest time.
> +  */
> + kvm_guest_exit();
> + trace_kvm_exit(kvm_vcpu_trap_get_class(vcpu), *vcpu_pc(vcpu));
> + preempt_enable();
> +
>  
>   kvm_timer_sync_hwstate(vcpu);
>   kvm_vgic_sync_hwstate(vcpu);
> 

I've been thinking about this a bit more, and I wonder if we can
simplify it a bit. At the moment, we disable the interrupts around the
HYP entry. But now that you have introduced preempt_disable, it looks
like we could move the local_irq_disable call to just after
__kvm_guest_enter, and not bother with having such a long critical section.

This is possible because entering HYP mode automatically masks
interrupts, and we restore PSTATE on exception return.

I think this would slightly reduce the amount of code we run on the host
that gets accounted to the guest.

Thoughts?

Thanks,

M.

-- 
Jazz is not dead. It just smells funny...

Re: [PATCH 00/13] arm64: KVM: GICv3 ITS emulation

2015-06-08 Thread Marc Zyngier
On 08/06/15 11:54, Pavel Fedin wrote:
>  Hi!
> 
>> I'm afraid this is not enough. A write to GITS_TRANSLATER (DID+EID)
>> results in a (LPI,CPU) pair. Can you easily express the CPU part in
>> irqfd (this is a genuine question, I'm not familiar enough with that
>> part of the core)?
> 
>  But... As far as I could understand, an LPI is added to a collection as
> a part of setup. And a collection actually represents a destination CPU,
> doesn't it? And we can't have multiple LPIs sharing the same number and
> going to different CPUs. Or am I wrong? Unfortunately I don't have the
> GICv3 arch reference manual.

This is true to some extent. But the point is that the result of the
translation is both an LPI and a CPU. My question was how you would
convey the notion of a target vcpu when using irqfd. As far as
I know this doesn't really fit, unless we start introducing the dreaded
GSI routing...

Do we really want to go down that road?
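
For reference, that road would mean userspace binding each irqfd to a GSI
and describing the MSI behind it with the existing routing uapi, roughly
like this (a sketch; the doorbell and event_id names are illustrative, and
whether a DID/EID pair can be squeezed into address/data for the ITS is
exactly the open question):

	struct kvm_irq_routing_entry e = {
		.gsi   = gsi,			/* what the irqfd is bound to */
		.type  = KVM_IRQ_ROUTING_MSI,
		.u.msi = {
			.address_lo = doorbell & 0xffffffff,
			.address_hi = doorbell >> 32,
			.data       = event_id,
		},
	};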

>> Another concern
>> would be the support of GICv4, which relies on the command queue
>> handling to be handled in the kernel
> 
>  Wow, i didn't know about GICv4.

I wish I didn't know about it.

M.
-- 
Jazz is not dead. It just smells funny...


[PATCH 07/10] KVM: arm/arm64: vgic: Allow HW interrupts to be queued to a guest

2015-06-08 Thread Marc Zyngier
To allow a HW interrupt to be injected into a guest, we look up the
guest virtual interrupt in the irq_phys_map rbtree, and if we have
a match, encode both interrupts in the LR.

We also mark the interrupt as "active" at the host distributor level.

On guest EOI on the virtual interrupt, the host interrupt will be
deactivated.

Signed-off-by: Marc Zyngier 
---
 virt/kvm/arm/vgic.c | 71 ++---
 1 file changed, 68 insertions(+), 3 deletions(-)

diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
index c6604f2..495ac7d 100644
--- a/virt/kvm/arm/vgic.c
+++ b/virt/kvm/arm/vgic.c
@@ -1120,6 +1120,26 @@ static void vgic_queue_irq_to_lr(struct kvm_vcpu *vcpu, int irq,
if (!vgic_irq_is_edge(vcpu, irq))
vlr.state |= LR_EOI_INT;
 
+   if (vlr.irq >= VGIC_NR_SGIS) {
+   struct irq_phys_map *map;
+   map = vgic_irq_map_search(vcpu, irq);
+
+   if (map) {
+   int ret;
+
+   BUG_ON(!map->active);
+   vlr.hwirq = map->phys_irq;
+   vlr.state |= LR_HW;
+   vlr.state &= ~LR_EOI_INT;
+
+   ret = irq_set_irqchip_state(map->irq,
+   IRQCHIP_STATE_ACTIVE,
+   true);
+   vgic_irq_set_queued(vcpu, irq);
+   WARN_ON(ret);
+   }
+   }
+
vgic_set_lr(vcpu, lr_nr, vlr);
vgic_sync_lr_elrsr(vcpu, lr_nr, vlr);
 }
@@ -1344,6 +1364,35 @@ static bool vgic_process_maintenance(struct kvm_vcpu *vcpu)
return level_pending;
 }
 
+/* Return 1 if HW interrupt went from active to inactive, and 0 otherwise */
+static int vgic_sync_hwirq(struct kvm_vcpu *vcpu, struct vgic_lr vlr)
+{
+   struct irq_phys_map *map;
+   int ret;
+
+   if (!(vlr.state & LR_HW))
+   return 0;
+
+   map = vgic_irq_map_search(vcpu, vlr.irq);
+   BUG_ON(!map || !map->active);
+
+   ret = irq_get_irqchip_state(map->irq,
+   IRQCHIP_STATE_ACTIVE,
+   &map->active);
+
+   WARN_ON(ret);
+
+   if (map->active) {
+   ret = irq_set_irqchip_state(map->irq,
+   IRQCHIP_STATE_ACTIVE,
+   false);
+   WARN_ON(ret);
+   return 0;
+   }
+
+   return 1;
+}
+
 /* Sync back the VGIC state after a guest run */
 static void __kvm_vgic_sync_hwstate(struct kvm_vcpu *vcpu)
 {
@@ -1358,14 +1407,30 @@ static void __kvm_vgic_sync_hwstate(struct kvm_vcpu *vcpu)
elrsr = vgic_get_elrsr(vcpu);
elrsr_ptr = u64_to_bitmask(&elrsr);
 
-   /* Clear mappings for empty LRs */
-   for_each_set_bit(lr, elrsr_ptr, vgic->nr_lr) {
+   /* Deal with HW interrupts, and clear mappings for empty LRs */
+   for (lr = 0; lr < vgic->nr_lr; lr++) {
struct vgic_lr vlr;
 
-   if (!test_and_clear_bit(lr, vgic_cpu->lr_used))
+   if (!test_bit(lr, vgic_cpu->lr_used))
continue;
 
vlr = vgic_get_lr(vcpu, lr);
+   if (vgic_sync_hwirq(vcpu, vlr)) {
+   /*
+* So this is a HW interrupt that the guest
+* EOI-ed. Clean the LR state and allow the
+* interrupt to be queued again.
+*/
+   vlr.state &= ~LR_HW;
+   vlr.hwirq = 0;
+   vgic_set_lr(vcpu, lr, vlr);
+   vgic_irq_clear_queued(vcpu, vlr.irq);
+   }
+
+   if (!test_bit(lr, elrsr_ptr))
+   continue;
+
+   clear_bit(lr, vgic_cpu->lr_used);
 
BUG_ON(vlr.irq >= dist->nr_irqs);
vgic_cpu->vgic_irq_lr_map[vlr.irq] = LR_EMPTY;
-- 
2.1.4



[PATCH 10/10] KVM: arm/arm64: vgic: Allow non-shared device HW interrupts

2015-06-08 Thread Marc Zyngier
So far, the only use of the HW interrupt facility is the timer,
implying that the active state is context-switched for each vcpu,
as the device is shared across all vcpus.

This does not work for a device that has been assigned to a VM,
as the guest is entirely in control of that device (the HW is
not shared). In that case, it makes sense to bypass the whole
active state switching, and only track the deactivation of the
interrupt.

Signed-off-by: Marc Zyngier 
---
 include/kvm/arm_vgic.h|  5 +++--
 virt/kvm/arm/arch_timer.c |  2 +-
 virt/kvm/arm/vgic.c   | 37 -
 3 files changed, 28 insertions(+), 16 deletions(-)

diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
index 1c653c1..5d47d60 100644
--- a/include/kvm/arm_vgic.h
+++ b/include/kvm/arm_vgic.h
@@ -164,7 +164,8 @@ struct irq_phys_map {
u32 virt_irq;
u32 phys_irq;
u32 irq;
-   boolactive;
+   boolshared;
+   boolactive; /* Only valid if shared */
 };
 
 struct vgic_dist {
@@ -347,7 +348,7 @@ void vgic_v3_dispatch_sgi(struct kvm_vcpu *vcpu, u64 reg);
 int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu);
 int kvm_vgic_vcpu_active_irq(struct kvm_vcpu *vcpu);
 struct irq_phys_map *vgic_map_phys_irq(struct kvm_vcpu *vcpu,
-  int virt_irq, int irq);
+  int virt_irq, int irq, bool shared);
 int vgic_unmap_phys_irq(struct kvm_vcpu *vcpu, struct irq_phys_map *map);
 bool vgic_get_phys_irq_active(struct irq_phys_map *map);
 void vgic_set_phys_irq_active(struct irq_phys_map *map, bool active);
diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
index b9fff78..9544d79 100644
--- a/virt/kvm/arm/arch_timer.c
+++ b/virt/kvm/arm/arch_timer.c
@@ -202,7 +202,7 @@ void kvm_timer_vcpu_reset(struct kvm_vcpu *vcpu,
 * Tell the VGIC that the virtual interrupt is tied to a
 * physical interrupt. We do that once per VCPU.
 */
-   timer->map = vgic_map_phys_irq(vcpu, irq->irq, host_vtimer_irq);
+   timer->map = vgic_map_phys_irq(vcpu, irq->irq, host_vtimer_irq, true);
WARN_ON(!timer->map);
 }
 
diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
index f376b56..4223166 100644
--- a/virt/kvm/arm/vgic.c
+++ b/virt/kvm/arm/vgic.c
@@ -1125,18 +1125,21 @@ static void vgic_queue_irq_to_lr(struct kvm_vcpu *vcpu, int irq,
map = vgic_irq_map_search(vcpu, irq);
 
if (map) {
-   int ret;
-
-   BUG_ON(!map->active);
vlr.hwirq = map->phys_irq;
vlr.state |= LR_HW;
vlr.state &= ~LR_EOI_INT;
 
-   ret = irq_set_irqchip_state(map->irq,
-   IRQCHIP_STATE_ACTIVE,
-   true);
vgic_irq_set_queued(vcpu, irq);
-   WARN_ON(ret);
+
+   if (map->shared) {
+   int ret;
+
+   BUG_ON(!map->active);
+   ret = irq_set_irqchip_state(map->irq,
+   IRQCHIP_STATE_ACTIVE,
+   true);
+   WARN_ON(ret);
+   }
}
}
 
@@ -1368,21 +1371,28 @@ static bool vgic_process_maintenance(struct kvm_vcpu *vcpu)
 static int vgic_sync_hwirq(struct kvm_vcpu *vcpu, struct vgic_lr vlr)
 {
struct irq_phys_map *map;
+   bool active;
int ret;
 
if (!(vlr.state & LR_HW))
return 0;
 
map = vgic_irq_map_search(vcpu, vlr.irq);
-   BUG_ON(!map || !map->active);
+   BUG_ON(!map);
+   BUG_ON(map->shared && !map->active);
 
ret = irq_get_irqchip_state(map->irq,
IRQCHIP_STATE_ACTIVE,
-   &map->active);
+   &active);
 
WARN_ON(ret);
 
-   if (map->active) {
+   if (!map->shared)
+   return !active;
+
+   map->active = active;
+
+   if (active) {
ret = irq_set_irqchip_state(map->irq,
IRQCHIP_STATE_ACTIVE,
false);
@@ -1663,7 +1673,7 @@ static struct rb_root *vgic_get_irq_phys_map(struct kvm_vcpu *vcpu,
 }
 
 struct irq_phys_map *vgic_map_phys_irq(struct kvm_vcpu *vcpu,
-  int virt_irq, int irq)
+  int virt_irq, int irq, bool shared)
 {
struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
struct rb_root *root = vgic_get_irq_phys_map(vcpu, virt_irq);

[PATCH 08/10] KVM: arm/arm64: vgic: Add vgic_{get, set}_phys_irq_active

2015-06-08 Thread Marc Zyngier
In order to control the active state of an interrupt, introduce
a pair of accessors allowing the state to be set/queried.

This only affects the logical state, and the HW state will only be
applied at world-switch time.

Signed-off-by: Marc Zyngier 
---
 include/kvm/arm_vgic.h |  2 ++
 virt/kvm/arm/vgic.c| 12 
 2 files changed, 14 insertions(+)

diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
index 33d121a..1c653c1 100644
--- a/include/kvm/arm_vgic.h
+++ b/include/kvm/arm_vgic.h
@@ -349,6 +349,8 @@ int kvm_vgic_vcpu_active_irq(struct kvm_vcpu *vcpu);
 struct irq_phys_map *vgic_map_phys_irq(struct kvm_vcpu *vcpu,
   int virt_irq, int irq);
 int vgic_unmap_phys_irq(struct kvm_vcpu *vcpu, struct irq_phys_map *map);
+bool vgic_get_phys_irq_active(struct irq_phys_map *map);
+void vgic_set_phys_irq_active(struct irq_phys_map *map, bool active);
 
 #define irqchip_in_kernel(k)   (!!((k)->arch.vgic.in_kernel))
#define vgic_initialized(k) (!!((k)->arch.vgic.nr_cpus))
diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
index 495ac7d..f376b56 100644
--- a/virt/kvm/arm/vgic.c
+++ b/virt/kvm/arm/vgic.c
@@ -1744,6 +1744,18 @@ static struct irq_phys_map *vgic_irq_map_search(struct kvm_vcpu *vcpu,
return this;
 }
 
+bool vgic_get_phys_irq_active(struct irq_phys_map *map)
+{
+   BUG_ON(!map);
+   return map->active;
+}
+
+void vgic_set_phys_irq_active(struct irq_phys_map *map, bool active)
+{
+   BUG_ON(!map);
+   map->active = active;
+}
+
 int vgic_unmap_phys_irq(struct kvm_vcpu *vcpu, struct irq_phys_map *map)
 {
struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
-- 
2.1.4



[PATCH 09/10] KVM: arm/arm64: timer: Allow the timer to control the active state

2015-06-08 Thread Marc Zyngier
In order to remove the crude hack where we sneak the masked bit
into the timer's control register, make use of the phys_irq_map
API to control the active state of the interrupt.

Signed-off-by: Marc Zyngier 
---
 include/kvm/arm_arch_timer.h |  3 +++
 virt/kvm/arm/arch_timer.c| 13 +++--
 2 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/include/kvm/arm_arch_timer.h b/include/kvm/arm_arch_timer.h
index e596675..9feebf1 100644
--- a/include/kvm/arm_arch_timer.h
+++ b/include/kvm/arm_arch_timer.h
@@ -52,6 +52,9 @@ struct arch_timer_cpu {
 
/* Timer IRQ */
const struct kvm_irq_level  *irq;
+
+   /* VGIC mapping */
+   struct irq_phys_map *map;
 };
 
 int kvm_timer_hyp_init(void);
diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
index 98c95f2..b9fff78 100644
--- a/virt/kvm/arm/arch_timer.c
+++ b/virt/kvm/arm/arch_timer.c
@@ -64,7 +64,7 @@ static void kvm_timer_inject_irq(struct kvm_vcpu *vcpu)
int ret;
struct arch_timer_cpu *timer = &vcpu->arch.timer_cpu;
 
-   timer->cntv_ctl |= ARCH_TIMER_CTRL_IT_MASK;
+   vgic_set_phys_irq_active(timer->map, true);
ret = kvm_vgic_inject_irq(vcpu->kvm, vcpu->vcpu_id,
  timer->irq->irq,
  timer->irq->level);
@@ -117,7 +117,8 @@ bool kvm_timer_should_fire(struct kvm_vcpu *vcpu)
cycle_t cval, now;
 
if ((timer->cntv_ctl & ARCH_TIMER_CTRL_IT_MASK) ||
-   !(timer->cntv_ctl & ARCH_TIMER_CTRL_ENABLE))
+   !(timer->cntv_ctl & ARCH_TIMER_CTRL_ENABLE) ||
+   vgic_get_phys_irq_active(timer->map))
return false;
 
cval = timer->cntv_cval;
@@ -196,6 +197,13 @@ void kvm_timer_vcpu_reset(struct kvm_vcpu *vcpu,
 * vcpu timer irq number when the vcpu is reset.
 */
timer->irq = irq;
+
+   /*
+* Tell the VGIC that the virtual interrupt is tied to a
+* physical interrupt. We do that once per VCPU.
+*/
+   timer->map = vgic_map_phys_irq(vcpu, irq->irq, host_vtimer_irq);
+   WARN_ON(!timer->map);
 }
 
 void kvm_timer_vcpu_init(struct kvm_vcpu *vcpu)
@@ -335,6 +343,7 @@ void kvm_timer_vcpu_terminate(struct kvm_vcpu *vcpu)
struct arch_timer_cpu *timer = &vcpu->arch.timer_cpu;
 
timer_disarm(timer);
+   vgic_unmap_phys_irq(vcpu, timer->map);
 }
 
 void kvm_timer_enable(struct kvm *kvm)
-- 
2.1.4



[PATCH 05/10] KVM: arm/arm64: vgic: Relax vgic_can_sample_irq for edge IRQs

2015-06-08 Thread Marc Zyngier
We only set the irq_queued flag for level interrupts, meaning
that "!vgic_irq_is_queued(vcpu, irq)" is a good enough predicate
for all interrupts.

This will allow us to inject edge HW interrupts, for which the
state ACTIVE+PENDING is not allowed.

Signed-off-by: Marc Zyngier 
---
 virt/kvm/arm/vgic.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
index 78fb820..59ed7a3 100644
--- a/virt/kvm/arm/vgic.c
+++ b/virt/kvm/arm/vgic.c
@@ -377,7 +377,7 @@ void vgic_cpu_irq_clear(struct kvm_vcpu *vcpu, int irq)
 
 static bool vgic_can_sample_irq(struct kvm_vcpu *vcpu, int irq)
 {
-   return vgic_irq_is_edge(vcpu, irq) || !vgic_irq_is_queued(vcpu, irq);
+   return !vgic_irq_is_queued(vcpu, irq);
 }
 
 /**
-- 
2.1.4



[PATCH 06/10] KVM: arm/arm64: vgic: Allow dynamic mapping of physical/virtual interrupts

2015-06-08 Thread Marc Zyngier
In order to be able to feed physical interrupts to a guest, we need
to be able to establish the virtual-physical mapping between the two
worlds.

The mapping is kept in an rbtree, indexed by virtual interrupts.

Signed-off-by: Marc Zyngier 
---
 include/kvm/arm_vgic.h |  18 
 virt/kvm/arm/vgic.c| 110 +
 2 files changed, 128 insertions(+)

diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
index 4f9fa1d..33d121a 100644
--- a/include/kvm/arm_vgic.h
+++ b/include/kvm/arm_vgic.h
@@ -159,6 +159,14 @@ struct vgic_io_device {
struct kvm_io_device dev;
 };
 
+struct irq_phys_map {
+   struct rb_node  node;
+   u32 virt_irq;
+   u32 phys_irq;
+   u32 irq;
+   boolactive;
+};
+
 struct vgic_dist {
spinlock_t  lock;
boolin_kernel;
@@ -256,6 +264,10 @@ struct vgic_dist {
struct vgic_vm_ops  vm_ops;
struct vgic_io_device   dist_iodev;
struct vgic_io_device   *redist_iodevs;
+
+   /* Virtual irq to hwirq mapping */
+   spinlock_t  irq_phys_map_lock;
+   struct rb_root  irq_phys_map;
 };
 
 struct vgic_v2_cpu_if {
@@ -307,6 +319,9 @@ struct vgic_cpu {
struct vgic_v2_cpu_if   vgic_v2;
struct vgic_v3_cpu_if   vgic_v3;
};
+
+   /* Protected by the distributor's irq_phys_map_lock */
+   struct rb_root  irq_phys_map;
 };
 
 #define LR_EMPTY   0xff
@@ -331,6 +346,9 @@ int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int irq_num,
 void vgic_v3_dispatch_sgi(struct kvm_vcpu *vcpu, u64 reg);
 int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu);
 int kvm_vgic_vcpu_active_irq(struct kvm_vcpu *vcpu);
+struct irq_phys_map *vgic_map_phys_irq(struct kvm_vcpu *vcpu,
+  int virt_irq, int irq);
+int vgic_unmap_phys_irq(struct kvm_vcpu *vcpu, struct irq_phys_map *map);
 
 #define irqchip_in_kernel(k)   (!!((k)->arch.vgic.in_kernel))
#define vgic_initialized(k) (!!((k)->arch.vgic.nr_cpus))
diff --git a/virt/kvm/arm/vgic.c b/virt/kvm/arm/vgic.c
index 59ed7a3..c6604f2 100644
--- a/virt/kvm/arm/vgic.c
+++ b/virt/kvm/arm/vgic.c
@@ -24,6 +24,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 #include 
@@ -84,6 +85,8 @@ static void vgic_retire_disabled_irqs(struct kvm_vcpu *vcpu);
 static void vgic_retire_lr(int lr_nr, int irq, struct kvm_vcpu *vcpu);
 static struct vgic_lr vgic_get_lr(const struct kvm_vcpu *vcpu, int lr);
 static void vgic_set_lr(struct kvm_vcpu *vcpu, int lr, struct vgic_lr lr_desc);
+static struct irq_phys_map *vgic_irq_map_search(struct kvm_vcpu *vcpu,
+   int virt_irq);
 
 static const struct vgic_ops *vgic_ops;
 static const struct vgic_params *vgic;
@@ -1585,6 +1588,112 @@ static irqreturn_t vgic_maintenance_handler(int irq, void *data)
return IRQ_HANDLED;
 }
 
+static struct rb_root *vgic_get_irq_phys_map(struct kvm_vcpu *vcpu,
+int virt_irq)
+{
+   if (virt_irq < VGIC_NR_PRIVATE_IRQS)
+   return &vcpu->arch.vgic_cpu.irq_phys_map;
+   else
+   return &vcpu->kvm->arch.vgic.irq_phys_map;
+}
+
+struct irq_phys_map *vgic_map_phys_irq(struct kvm_vcpu *vcpu,
+  int virt_irq, int irq)
+{
+   struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
+   struct rb_root *root = vgic_get_irq_phys_map(vcpu, virt_irq);
+   struct rb_node **new = &root->rb_node, *parent = NULL;
+   struct irq_phys_map *new_map;
+   struct irq_desc *desc;
+   struct irq_data *data;
+   int phys_irq;
+
+   desc = irq_to_desc(irq);
+   if (!desc) {
+   kvm_err("kvm_arch_timer: can't obtain interrupt descriptor\n");
+   return NULL;
+   }
+
+   data = irq_desc_get_irq_data(desc);
+   while (data->parent_data)
+   data = data->parent_data;
+
+   phys_irq = data->hwirq;
+
+   spin_lock(&dist->irq_phys_map_lock);
+
+   /* Boilerplate rb_tree code */
+   while (*new) {
+   struct irq_phys_map *this;
+
+   this = container_of(*new, struct irq_phys_map, node);
+   parent = *new;
+   if (this->virt_irq < virt_irq)
+   new = &(*new)->rb_left;
+   else if (this->virt_irq > virt_irq)
+   new = &(*new)->rb_right;
+   else {
+   new_map = this;
+   goto out;
+   }
+   }
+
+   new_map = kzalloc(sizeof(*new_map), GFP_KERNEL);
+   if (!new_map)
+   goto out;
+
+   new_map->virt_irq = virt_irq;
+   new_map->phys_irq = phys_irq;
+   new_map->irq = irq;
+
+   rb_link_node(&new_map->node, parent, new);
+   rb_insert_color(&new_map->node, root);
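
(The archive truncates the patch here.) For completeness, the lookup side
is the standard rbtree walk, mirroring the comparison order used by the
insertion above (a sketch, not the verbatim patch):

	static struct irq_phys_map *vgic_irq_map_search(struct kvm_vcpu *vcpu,
							int virt_irq)
	{
		struct rb_root *root = vgic_get_irq_phys_map(vcpu, virt_irq);
		struct rb_node *node = root->rb_node;

		while (node) {
			struct irq_phys_map *this;

			this = container_of(node, struct irq_phys_map, node);

			if (this->virt_irq < virt_irq)
				node = node->rb_left;
			else if (this->virt_irq > virt_irq)
				node = node->rb_right;
			else
				return this;	/* found the mapping */
		}

		return NULL;
	}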

[PATCH 01/10] arm/arm64: KVM: Fix ordering of timer/GIC on guest entry

2015-06-08 Thread Marc Zyngier
As we now inject the timer interrupt when we're about to enter
the guest, it makes a lot more sense to make sure this happens
before the vgic code queues the pending interrupts.

Otherwise, we get the interrupt on the following exit, which is
not great for latency (and leads to all kinds of bizarre issues
when used with active interrupts at the HW level).

Signed-off-by: Marc Zyngier 
---
 arch/arm/kvm/arm.c | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
index d9631ec..46db690 100644
--- a/arch/arm/kvm/arm.c
+++ b/arch/arm/kvm/arm.c
@@ -529,9 +529,10 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
if (vcpu->arch.pause)
vcpu_pause(vcpu);
 
-   kvm_vgic_flush_hwstate(vcpu);
kvm_timer_flush_hwstate(vcpu);
 
+   kvm_vgic_flush_hwstate(vcpu);
+
local_irq_disable();
 
/*
@@ -544,8 +545,8 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
 
if (ret <= 0 || need_new_vmid_gen(vcpu->kvm)) {
local_irq_enable();
-   kvm_timer_sync_hwstate(vcpu);
kvm_vgic_sync_hwstate(vcpu);
+   kvm_timer_sync_hwstate(vcpu);
continue;
}
 
@@ -577,9 +578,10 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
 * Back from guest
 */
 
-   kvm_timer_sync_hwstate(vcpu);
kvm_vgic_sync_hwstate(vcpu);
 
+   kvm_timer_sync_hwstate(vcpu);
+
ret = handle_exit(vcpu, run, ret);
}
 
-- 
2.1.4



[PATCH 04/10] KVM: arm/arm64: vgic: Allow HW irq to be encoded in LR

2015-06-08 Thread Marc Zyngier
Now that struct vgic_lr supports the LR_HW bit and carries a hwirq
field, we can encode that information into the list registers.

This patch provides implementations for both GICv2 and GICv3.

Signed-off-by: Marc Zyngier 
---
 include/linux/irqchip/arm-gic-v3.h |  3 +++
 include/linux/irqchip/arm-gic.h|  3 ++-
 virt/kvm/arm/vgic-v2.c | 16 +++-
 virt/kvm/arm/vgic-v3.c | 21 ++---
 4 files changed, 38 insertions(+), 5 deletions(-)

diff --git a/include/linux/irqchip/arm-gic-v3.h b/include/linux/irqchip/arm-gic-v3.h
index ffbc034..cf637d6 100644
--- a/include/linux/irqchip/arm-gic-v3.h
+++ b/include/linux/irqchip/arm-gic-v3.h
@@ -268,9 +268,12 @@
 
 #define ICH_LR_EOI (1UL << 41)
 #define ICH_LR_GROUP   (1UL << 60)
+#define ICH_LR_HW  (1UL << 61)
 #define ICH_LR_STATE   (3UL << 62)
 #define ICH_LR_PENDING_BIT (1UL << 62)
 #define ICH_LR_ACTIVE_BIT  (1UL << 63)
+#define ICH_LR_PHYS_ID_SHIFT   32
+#define ICH_LR_PHYS_ID_MASK(0x3ffUL << ICH_LR_PHYS_ID_SHIFT)
 
 #define ICH_MISR_EOI   (1 << 0)
 #define ICH_MISR_U (1 << 1)
diff --git a/include/linux/irqchip/arm-gic.h b/include/linux/irqchip/arm-gic.h
index 9de976b..ca88dad 100644
--- a/include/linux/irqchip/arm-gic.h
+++ b/include/linux/irqchip/arm-gic.h
@@ -71,11 +71,12 @@
 
 #define GICH_LR_VIRTUALID  (0x3ff << 0)
 #define GICH_LR_PHYSID_CPUID_SHIFT (10)
-#define GICH_LR_PHYSID_CPUID   (7 << GICH_LR_PHYSID_CPUID_SHIFT)
+#define GICH_LR_PHYSID_CPUID   (0x3ff << GICH_LR_PHYSID_CPUID_SHIFT)
 #define GICH_LR_STATE  (3 << 28)
 #define GICH_LR_PENDING_BIT(1 << 28)
 #define GICH_LR_ACTIVE_BIT (1 << 29)
 #define GICH_LR_EOI(1 << 19)
+#define GICH_LR_HW (1 << 31)
 
 #define GICH_VMCR_CTRL_SHIFT   0
 #define GICH_VMCR_CTRL_MASK(0x21f << GICH_VMCR_CTRL_SHIFT)
diff --git a/virt/kvm/arm/vgic-v2.c b/virt/kvm/arm/vgic-v2.c
index f9b9c7c..8d7b04d 100644
--- a/virt/kvm/arm/vgic-v2.c
+++ b/virt/kvm/arm/vgic-v2.c
@@ -48,6 +48,10 @@ static struct vgic_lr vgic_v2_get_lr(const struct kvm_vcpu *vcpu, int lr)
lr_desc.state |= LR_STATE_ACTIVE;
if (val & GICH_LR_EOI)
lr_desc.state |= LR_EOI_INT;
+   if (val & GICH_LR_HW) {
+   lr_desc.state |= LR_HW;
+   lr_desc.hwirq = (val & GICH_LR_PHYSID_CPUID) >> GICH_LR_PHYSID_CPUID_SHIFT;
+   }
 
return lr_desc;
 }
@@ -55,7 +59,9 @@ static struct vgic_lr vgic_v2_get_lr(const struct kvm_vcpu *vcpu, int lr)
 static void vgic_v2_set_lr(struct kvm_vcpu *vcpu, int lr,
   struct vgic_lr lr_desc)
 {
-   u32 lr_val = (lr_desc.source << GICH_LR_PHYSID_CPUID_SHIFT) | lr_desc.irq;
+   u32 lr_val;
+
+   lr_val = lr_desc.irq;
 
if (lr_desc.state & LR_STATE_PENDING)
lr_val |= GICH_LR_PENDING_BIT;
@@ -64,6 +70,14 @@ static void vgic_v2_set_lr(struct kvm_vcpu *vcpu, int lr,
if (lr_desc.state & LR_EOI_INT)
lr_val |= GICH_LR_EOI;
 
+   if (lr_desc.state & LR_HW) {
+   lr_val |= GICH_LR_HW;
+   lr_val |= (u32)lr_desc.hwirq << GICH_LR_PHYSID_CPUID_SHIFT;
+   }
+
+   if (lr_desc.irq < VGIC_NR_SGIS)
+   lr_val |= (lr_desc.source << GICH_LR_PHYSID_CPUID_SHIFT);
+
vcpu->arch.vgic_cpu.vgic_v2.vgic_lr[lr] = lr_val;
 }
 
diff --git a/virt/kvm/arm/vgic-v3.c b/virt/kvm/arm/vgic-v3.c
index dff0602..afbf925 100644
--- a/virt/kvm/arm/vgic-v3.c
+++ b/virt/kvm/arm/vgic-v3.c
@@ -67,6 +67,10 @@ static struct vgic_lr vgic_v3_get_lr(const struct kvm_vcpu *vcpu, int lr)
lr_desc.state |= LR_STATE_ACTIVE;
if (val & ICH_LR_EOI)
lr_desc.state |= LR_EOI_INT;
+   if (val & ICH_LR_HW) {
+   lr_desc.state |= LR_HW;
+   lr_desc.hwirq = (val >> ICH_LR_PHYS_ID_SHIFT) & GENMASK(9, 0);
+   }
 
return lr_desc;
 }
@@ -84,10 +88,17 @@ static void vgic_v3_set_lr(struct kvm_vcpu *vcpu, int lr,
 * Eventually we want to make this configurable, so we may revisit
 * this in the future.
 */
-   if (vcpu->kvm->arch.vgic.vgic_model == KVM_DEV_TYPE_ARM_VGIC_V3)
+   switch (vcpu->kvm->arch.vgic.vgic_model) {
+   case KVM_DEV_TYPE_ARM_VGIC_V3:
lr_val |= ICH_LR_GROUP;
-   else
-   lr_val |= (u32)lr_desc.source << GICH_LR_PHYSID_CPUID_SHIFT;
+   break;
+   case KVM_DEV_TYPE_ARM_VGIC_V2:
+   if (lr_desc.irq < VGIC_NR_SGIS)
+   lr_val |= (u32)lr_desc.source << GICH_LR_PHYSID_CPUID_SHIFT;
+   break;
+   default:
+   BUG();
+   }
 
if (lr_desc.state & LR_STATE_PENDING)
lr_val |= ICH_LR_PENDING_BIT;
@@ -95,6 +106,10
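
(The archive truncates the patch here.) As a concrete illustration of the
GICv2 encoding above, a mapped HW interrupt, say the virtual timer PPI 27
backed by physical IRQ 27, would be built as follows (values picked for
illustration, using only the #defines from this patch):

	u32 lr_val = 27					/* virtual ID, bits [9:0]     */
		   | (27 << GICH_LR_PHYSID_CPUID_SHIFT)	/* physical ID, bits [19:10]  */
		   | GICH_LR_HW				/* bit 31: HW interrupt       */
		   | GICH_LR_PENDING_BIT;		/* bit 28: pending            */

On guest EOI, the GIC then deactivates physical IRQ 27 directly; note that
vgic_queue_irq_to_lr() clears LR_EOI_INT for such interrupts, so no
maintenance interrupt is requested.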

[PATCH 03/10] KVM: arm/arm64: vgic: Convert struct vgic_lr to use bitfields

2015-06-08 Thread Marc Zyngier
As we're about to cram more information in the vgic_lr structure
(HW interrupt number and additional state information), we switch
to a layout similar to the HW's:

- use bitfields to save space (we don't need more than 10 bits
  to represent the irq numbers)
- source CPU and HW interrupt can share the same field, as
  an SGI doesn't have a physical line.

Signed-off-by: Marc Zyngier 
---
 include/kvm/arm_vgic.h | 10 +++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
index 133ea00..4f9fa1d 100644
--- a/include/kvm/arm_vgic.h
+++ b/include/kvm/arm_vgic.h
@@ -95,11 +95,15 @@ enum vgic_type {
#define LR_STATE_ACTIVE (1 << 1)
 #define LR_STATE_MASK  (3 << 0)
 #define LR_EOI_INT (1 << 2)
+#define LR_HW  (1 << 3)
 
 struct vgic_lr {
-   u16 irq;
-   u8  source;
-   u8  state;
+   unsigned irq:10;
+   union {
+   unsigned hwirq:10;
+   unsigned source:8;
+   };
+   unsigned state:4;
 };
 
 struct vgic_vmcr {
-- 
2.1.4



[PATCH 00/10] arm/arm64: KVM: Active interrupt state switching for shared devices

2015-06-08 Thread Marc Zyngier
From day 1, our timer code has been using a terrible hack: whenever
the guest is scheduled with a timer interrupt pending (i.e. the HW
timer has expired), we restore the timer state with the MASK bit set,
in order to prevent the physical interrupt from firing again. And again. And
again...

This is absolutely silly, for at least two reasons:

- This relies on the device (the timer) having a mask bit that we can
  play with. Not all devices are built like this.

- This expects some behaviour of the guest that only works because
  both the kernel timer code and the KVM counterpart have been written
  by the same idiot (the idiot being me).

The One True Way is to set the GIC active bit when injecting the
interrupt, and to context-switch across the world switch. This is what
this series implements.

We introduce a relatively simple infrastructure enabling the mapping
of a virtual interrupt with its physical counterpart:

- Whenever a virtual interrupt is injected, we look it up in an
  rbtree. If we have a match, the interrupt is injected with the HW
  bit set in the LR, together with the physical interrupt.

- Across the world switch, we save/restore the active state for these
  interrupts using the irqchip_state API.

- On guest EOI, the HW interrupt is automagically deactivated by the
  GIC, allowing the interrupt to be resampled.

The timer code is slightly modified to set the active state at the
same time as the injection.
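
For reference, the irqchip_state accessors used across the world switch
(declared in include/linux/interrupt.h) boil down to the following; a
sketch of the save/restore idea, with error handling omitted:

	bool active;

	/* restore: mark the physical interrupt active before entering
	 * the guest, so it cannot fire while the guest owns it */
	irq_set_irqchip_state(map->irq, IRQCHIP_STATE_ACTIVE, true);

	/* ... world switch, guest runs, guest may EOI ... */

	/* save: on exit, query whether the guest left it active */
	irq_get_irqchip_state(map->irq, IRQCHIP_STATE_ACTIVE, &active);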

The last patch also allows non-shared devices to have their interrupt
deactivated the same way (in this case we do not context-switch the
active state). This is the first step in the long overdue direction of
the mythical IRQ forwarding thing...

This series is based on v4.1-rc7, and has been tested on Juno (GICv2)
and the FVP Base model (GICv3 host, both GICv2 and GICv3 guests). I'd
appreciate any form of testing, especially in the context of guest
migration (there is obviously some interesting stuff there...).

The code is otherwise available at
git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms.git 
kvm-arm64/active-timer

Marc Zyngier (10):
  arm/arm64: KVM: Fix ordering of timer/GIC on guest entry
  arm/arm64: KVM: Move vgic handling to a non-preemptible section
  KVM: arm/arm64: vgic: Convert struct vgic_lr to use bitfields
  KVM: arm/arm64: vgic: Allow HW irq to be encoded in LR
  KVM: arm/arm64: vgic: Relax vgic_can_sample_irq for edge IRQs
  KVM: arm/arm64: vgic: Allow dynamic mapping of physical/virtual
interrupts
  KVM: arm/arm64: vgic: Allow HW interrupts to be queued to a guest
  KVM: arm/arm64: vgic: Add vgic_{get,set}_phys_irq_active
  KVM: arm/arm64: timer: Allow the timer to control the active state
  KVM: arm/arm64: vgic: Allow non-shared device HW interrupts

 arch/arm/kvm/arm.c |  21 +++-
 include/kvm/arm_arch_timer.h   |   3 +
 include/kvm/arm_vgic.h |  31 +-
 include/linux/irqchip/arm-gic-v3.h |   3 +
 include/linux/irqchip/arm-gic.h|   3 +-
 virt/kvm/arm/arch_timer.c  |  13 ++-
 virt/kvm/arm/vgic-v2.c |  16 ++-
 virt/kvm/arm/vgic-v3.c |  21 +++-
 virt/kvm/arm/vgic.c| 206 -
 9 files changed, 300 insertions(+), 17 deletions(-)

-- 
2.1.4



[PATCH 02/10] arm/arm64: KVM: Move vgic handling to a non-preemptible section

2015-06-08 Thread Marc Zyngier
As we're about to introduce some serious GIC-poking to the vgic code,
it is important to make sure that we're going to poke the part of
the GIC that belongs to the CPU we're about to run on (otherwise,
we'd end up with some unexpected interrupts firing)...

Introducing a non-preemptible section in kvm_arch_vcpu_ioctl_run
prevents the problem from occurring.

Signed-off-by: Marc Zyngier 
---
 arch/arm/kvm/arm.c | 13 +
 1 file changed, 13 insertions(+)

diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
index 46db690..4986300 100644
--- a/arch/arm/kvm/arm.c
+++ b/arch/arm/kvm/arm.c
@@ -529,8 +529,18 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
if (vcpu->arch.pause)
vcpu_pause(vcpu);
 
+   /*
+* Disarming the timer must be done in a
+* preemptible context, as this call may sleep.
+*/
kvm_timer_flush_hwstate(vcpu);
 
+   /*
+* Preparing the interrupts to be injected also
+* involves poking the GIC, which must be done in a
+* non-preemptible context.
+*/
+   preempt_disable();
kvm_vgic_flush_hwstate(vcpu);
 
local_irq_disable();
@@ -546,6 +556,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
if (ret <= 0 || need_new_vmid_gen(vcpu->kvm)) {
local_irq_enable();
kvm_vgic_sync_hwstate(vcpu);
+   preempt_enable();
kvm_timer_sync_hwstate(vcpu);
continue;
}
@@ -580,6 +591,8 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
 
kvm_vgic_sync_hwstate(vcpu);
 
+   preempt_enable();
+
kvm_timer_sync_hwstate(vcpu);
 
ret = handle_exit(vcpu, run, ret);
-- 
2.1.4



Re: [PATCH v2] arm/arm64: KVM: Properly account for guest CPU time

2015-06-08 Thread Christoffer Dall
On Fri, Jun 05, 2015 at 05:24:07AM -0700, Mario Smarduch wrote:
> On 06/02/2015 02:27 AM, Christoffer Dall wrote:
> > On Mon, Jun 01, 2015 at 08:48:22AM -0700, Mario Smarduch wrote:
> >> On 05/30/2015 11:59 PM, Christoffer Dall wrote:
> >>> Hi Mario,
> >>>
> >>> On Fri, May 29, 2015 at 03:34:47PM -0700, Mario Smarduch wrote:
>  On 05/28/2015 11:49 AM, Christoffer Dall wrote:
> > Until now we have been calling kvm_guest_exit after re-enabling
> > interrupts when we come back from the guest, but this has the
> > unfortunate effect that CPU time accounting done in the context of timer
> > interrupts occurring while the guest is running doesn't properly notice
> > that the time since the last tick was spent in the guest.
> >
> > Inspired by the comment in the x86 code, move the kvm_guest_exit() call
> > below the local_irq_enable() call and change __kvm_guest_exit() to
> > kvm_guest_exit(), because we are now calling this function with
> > interrupts enabled.  We have to now explicitly disable preemption and
> > not enable preemption before we've called kvm_guest_exit(), since
> > otherwise we could be preempted and everything happening before we
> > eventually get scheduled again would be accounted for as guest time.
> >
> > At the same time, move the trace_kvm_exit() call outside of the atomic
> > section, since there is no reason for us to do that with interrupts
> > disabled.
> >
> > Signed-off-by: Christoffer Dall 
> > ---
> > This patch is based on kvm/queue, because it has the kvm_guest_enter/exit
> > rework recently posted by Christian Borntraeger.  I hope I got the logic
> > of this right, there were 2 slightly worrying facts about this:
> >
> > First, we now enable and disable and enable interrupts on each exit
> > path, but I couldn't see any performance overhead on hackbench - yes the
> > only benchmark we care about.
> >
> > Second, looking at the ppc and mips code, they seem to also call
> > kvm_guest_exit() before enabling interrupts, so I don't understand how
> > guest CPU time accounting works on those architectures.
> >
> > Changes since v1:
> >  - Tweak comment and commit text based on Marc's feedback.
> >  - Explicitly disable preemption and enable it only after kvm_guest_exit().
> >
> >  arch/arm/kvm/arm.c | 21 +
> >  1 file changed, 17 insertions(+), 4 deletions(-)
> >
> > diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
> > index e41cb11..fe8028d 100644
> > --- a/arch/arm/kvm/arm.c
> > +++ b/arch/arm/kvm/arm.c
> > @@ -532,6 +532,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
> > kvm_vgic_flush_hwstate(vcpu);
> > kvm_timer_flush_hwstate(vcpu);
> >  
> > +   preempt_disable();
> > local_irq_disable();
> >  
> > /*
> > @@ -544,6 +545,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
> >  
> > if (ret <= 0 || need_new_vmid_gen(vcpu->kvm)) {
> > local_irq_enable();
> > +   preempt_enable();
> > kvm_timer_sync_hwstate(vcpu);
> > kvm_vgic_sync_hwstate(vcpu);
> > continue;
> > @@ -559,8 +561,10 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
> > ret = kvm_call_hyp(__kvm_vcpu_run, vcpu);
> >  
> > vcpu->mode = OUTSIDE_GUEST_MODE;
> > -   __kvm_guest_exit();
> > -   trace_kvm_exit(kvm_vcpu_trap_get_class(vcpu), *vcpu_pc(vcpu));
> > +   /*
> > +* Back from guest
> > +*/
> > +
> > /*
> >  * We may have taken a host interrupt in HYP mode (ie
> >  * while executing the guest). This interrupt is still
> > @@ -574,8 +578,17 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
> > local_irq_enable();
> >  
> > /*
> > -* Back from guest
> > -*/
> > +* We do local_irq_enable() before calling kvm_guest_exit() so
> > +* that if a timer interrupt hits while running the guest we
> > +* account that tick as being spent in the guest.  We enable
> > +* preemption after calling kvm_guest_exit() so that if we get
> > +* preempted we make sure ticks after that are not counted as
> > +* guest time.
> > +*/

RE: [PATCH 00/13] arm64: KVM: GICv3 ITS emulation

2015-06-08 Thread Pavel Fedin
 Hi!

> I'm afraid this is not enough. A write to GITS_TRANSLATER (DID+EID)
> results in a (LPI,CPU) pair. Can you easily express the CPU part in
> irqfd (this is a genuine question, I'm not familiar enough with that
> part of the core)?

 But... As far as I could understand, an LPI is added to a collection as a
part of setup. And a collection actually represents a destination CPU,
doesn't it? And we can't have multiple LPIs sharing the same number and
going to different CPUs. Or am I wrong? Unfortunately I don't have the
GICv3 arch reference manual.

> Another concern
> would be the support of GICv4, which relies on the command queue
> handling to be handled in the kernel

 Wow, i didn't know about GICv4.

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia




should KVM or userspace be the one which decides what MPIDR/affinity values to assign to vcpus?

2015-06-08 Thread Peter Maydell
On 8 June 2015 at 11:32, Igor Mammedov  wrote:
> On Thu, 4 Jun 2015 18:17:39 +0100
> Peter Maydell  wrote:
>> On 4 June 2015 at 17:40, Shlomo Pongratz  wrote:
>>  In order for it to work correctly we must use MPIDR values in
>>  the device tree which match the MPIDR values the kernel has picked
>>  for the vcpus, so ask KVM what those values are.

> Could we set QEMU's generated mpidr in the kernel instead of pulling it
> from the kernel, like we do with the APIC ID on x86, and fix the kernel
> not to reset it to its own value (i.e. untie mpidr from vcpuid)?
>
> Then later we could move setting mpidr from kvm_arch_init_vcpu() into
> board code which should be setting it, since it knows/defines what topology it has.

This is a question better asked on the kvmarm list (which I have cc'd),
because that is where the kernel folks hang out...

-- PMM


Re: [PATCH 00/13] arm64: KVM: GICv3 ITS emulation

2015-06-08 Thread Marc Zyngier
Hi Pavel,

On 08/06/15 07:53, Pavel Fedin wrote:
>  Hello everybody!
> 
>> The GICv3 ITS (Interrupt Translation Service) is a part of the
>> ARM GICv3 interrupt controller used for implementing MSIs.
>> It specifies a new kind of interrupts (LPIs), which are mapped to
>> establish a connection between a device, its MSI payload value and
>> the target processor the IRQ is eventually delivered to.
>> In order to allow using MSIs in an ARM64 KVM guest, we emulate this
>> ITS widget in the kernel.
> 
>  I have tested the patch and got some more ideas for future extension...
> 
>  First of all, it would be nice to have the possibility to directly
> inject LPIs by number. This will be useful for irqfd support in qemu.

Well, that poses the question of what we emulate. We expose the
emulation of an ITS, hence no direct access to the LPI space. What we
could do would be to allow LPI injection if no ITS is instantiated in
the kernel. But a mix of the two is likely to be in contradiction with
the architecture.

>  Next, irqfd support currently poses a problem. We need to somehow know
> the IRQ number from MSI-X data (device ID plus event ID). ITS has all
> this information, so it would be nice to be able to query for the
> translation from within userspace. The question is - how to do it?
> Should we add some ioctl for this purpose? Currently I am experimenting
> with an extra KVM_TRANSLATE_MSI ioctl which, given MSI data, would
> return the LPI number.

I'm afraid this is not enough. A write to GITS_TRANSLATER (DID+EID)
results in a (LPI,CPU) pair. Can you easily express the CPU part in
irqfd (this is a genuine question, I'm not familiar enough with that
part of the core)?

>  Actually, before your patch came out I had almost done the same thing.
> But instead I decided to implement the ITS in qemu while leaving LPI
> handling to the kernel. In this case my qemu would have everything
> needed.
>  By the way, why did you decide to put everything into the kernel? Yes,
> in-kernel emulation is faster, but the ITS is not accessed frequently.

It may be interesting to find out what the implications would be if we
were to put it in userspace.

The obvious one would be that we'd have to duplicate the code in both
QEMU and kvmtool, and I don't think anyone fancies that. Another concern
would be the support of GICv4, which relies on the command queue
handling to be handled in the kernel (the GICv4 handling is basically a
command translation system, and I'm not prepared to let userspace inject
commands in the host ITS).

Thanks,

M.
-- 
Jazz is not dead. It just smells funny...