Re: [PATCH] KVM: arm64: fix potential bug

2019-03-11 Thread Gong John
Hi Andre,
On Tue, Mar 12, 2019 at 9:44 AM André Przywara  wrote:
>
> On 12/03/2019 00:32, John Gong wrote:
> Hi,
>
> > Since intid is always >= VGIC_NR_PRIVATE_IRQS,
>
> How so? The PMU and the arch timer emulation use PPIs, so intid is
> definitely < VGIC_NR_PRIVATE_IRQS there.
>
> > even when vcpu == NULL, it never returns -EINVAL.
>
> I am not sure I follow.
> To uniquely identify an SPI interrupt, we just need the interrupt ID
> (which is always >= 32). For PPIs and SGIs, we additionally need the
> vCPU ID this private interrupt belongs to, as there are multiple
> interrupts with the same INTID (one per VCPU).
> The VCPU ID passed in for SPIs is just a dummy value (because we use the
> same function to inject private and shared interrupts), so we don't need
> to check for its validity.
>
> Cheers,
> Andre.
>
Thanks for your explanation. It was my mistake not to consider PPI
and SGI injection.
Sorry for polluting the mailing list.

Cheers,
John Gong
> >
> > Signed-off-by: Shengmin Gong 
> > Signed-off-by: John Gong 
> > ---
> >  virt/kvm/arm/vgic/vgic.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/virt/kvm/arm/vgic/vgic.c b/virt/kvm/arm/vgic/vgic.c
> > index abd9c7352677..d3cb1ce880e2 100644
> > --- a/virt/kvm/arm/vgic/vgic.c
> > +++ b/virt/kvm/arm/vgic/vgic.c
> > @@ -424,7 +424,7 @@ int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int intid,
> >   return ret;
> >
> >   vcpu = kvm_get_vcpu(kvm, cpuid);
> > - if (!vcpu && intid < VGIC_NR_PRIVATE_IRQS)
> > + if (!vcpu)
> >   return -EINVAL;
> >
> >   irq = vgic_get_irq(kvm, vcpu, intid);
> >
>


Re: [PATCH] KVM: arm64: fix potential bug

2019-03-11 Thread André Przywara
On 12/03/2019 00:32, John Gong wrote:
Hi,

> Since intid is always >= VGIC_NR_PRIVATE_IRQS,

How so? The PMU and the arch timer emulation use PPIs, so intid is
definitely < VGIC_NR_PRIVATE_IRQS there.

> even when vcpu == NULL, it never returns -EINVAL.

I am not sure I follow.
To uniquely identify an SPI interrupt, we just need the interrupt ID
(which is always >= 32). For PPIs and SGIs, we additionally need the
vCPU ID this private interrupt belongs to, as there are multiple
interrupts with the same INTID (one per VCPU).
The VCPU ID passed in for SPIs is just a dummy value (because we use the
same function to inject private and shared interrupts), so we don't need
to check for its validity.
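
In code, this boils down to a simple INTID range check. As a rough
sketch (illustrative only, not the actual vgic code):

    /*
     * SGIs (INTIDs 0-15) and PPIs (16-31) are banked per vCPU, so
     * injecting one of those needs a valid vcpu. SPIs (INTID >= 32,
     * i.e. >= VGIC_NR_PRIVATE_IRQS) are shared by all vCPUs, so the
     * vcpu argument is only a dummy for them.
     */
    static bool vgic_intid_needs_vcpu(unsigned int intid)
    {
            return intid < VGIC_NR_PRIVATE_IRQS;    /* == 32 */
    }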

Cheers,
Andre.

> 
> Signed-off-by: Shengmin Gong 
> Signed-off-by: John Gong 
> ---
>  virt/kvm/arm/vgic/vgic.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/virt/kvm/arm/vgic/vgic.c b/virt/kvm/arm/vgic/vgic.c
> index abd9c7352677..d3cb1ce880e2 100644
> --- a/virt/kvm/arm/vgic/vgic.c
> +++ b/virt/kvm/arm/vgic/vgic.c
> @@ -424,7 +424,7 @@ int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int intid,
>   return ret;
>  
>   vcpu = kvm_get_vcpu(kvm, cpuid);
> - if (!vcpu && intid < VGIC_NR_PRIVATE_IRQS)
> + if (!vcpu)
>   return -EINVAL;
>  
>   irq = vgic_get_irq(kvm, vcpu, intid);
> 



[PATCH] KVM: arm64: fix potential bug

2019-03-11 Thread John Gong
Since intid is always >= VGIC_NR_PRIVATE_IRQS, even when vcpu == NULL
it never returns -EINVAL.

Signed-off-by: Shengmin Gong 
Signed-off-by: John Gong 
---
 virt/kvm/arm/vgic/vgic.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/virt/kvm/arm/vgic/vgic.c b/virt/kvm/arm/vgic/vgic.c
index abd9c7352677..d3cb1ce880e2 100644
--- a/virt/kvm/arm/vgic/vgic.c
+++ b/virt/kvm/arm/vgic/vgic.c
@@ -424,7 +424,7 @@ int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int intid,
return ret;
 
vcpu = kvm_get_vcpu(kvm, cpuid);
-   if (!vcpu && intid < VGIC_NR_PRIVATE_IRQS)
+   if (!vcpu)
return -EINVAL;
 
irq = vgic_get_irq(kvm, vcpu, intid);
-- 
2.17.1



[PATCH AUTOSEL 4.14 11/27] KVM: arm/arm64: Reset the VCPU without preemption and vcpu state loaded

2019-03-11 Thread Sasha Levin
From: Christoffer Dall 

[ Upstream commit e761a927bc9a7ee6ceb7c4f63d5922dbced87f0d ]

We have two ways to reset a vcpu:
- either through VCPU_INIT
- or through a PSCI_ON call

The first one is easy to reason about. The second one is implemented
in a more bizarre way, as it is the vcpu that handles PSCI_ON that
resets the vcpu being powered on. As we need to turn the logic
around and have the target vcpu reset itself, we must take some
preliminary steps.

Resetting the VCPU state modifies the system register state in memory,
but this may interact with vcpu_load/vcpu_put if running with preemption
disabled, which in turn may lead to corrupted system register state.

Address this by disabling preemption and doing put/load if required
around the reset logic.

Reviewed-by: Andrew Jones 
Signed-off-by: Christoffer Dall 
Signed-off-by: Marc Zyngier 
Signed-off-by: Sasha Levin 
---
 arch/arm64/kvm/reset.c | 26 --
 1 file changed, 24 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
index a74311beda35..c1c5a57249d2 100644
--- a/arch/arm64/kvm/reset.c
+++ b/arch/arm64/kvm/reset.c
@@ -95,16 +95,33 @@ int kvm_arch_dev_ioctl_check_extension(struct kvm *kvm, long ext)
  * This function finds the right table above and sets the registers on
  * the virtual CPU struct to their architecturally defined reset
  * values.
+ *
+ * Note: This function can be called from two paths: The KVM_ARM_VCPU_INIT
+ * ioctl or as part of handling a request issued by another VCPU in the PSCI
+ * handling code.  In the first case, the VCPU will not be loaded, and in the
+ * second case the VCPU will be loaded.  Because this function operates purely
+ * on the memory-backed values of system registers, we want to do a full put if
+ * we were loaded (handling a request) and load the values back at the end of
+ * the function.  Otherwise we leave the state alone.  In both cases, we
+ * disable preemption around the vcpu reset as we would otherwise race with
+ * preempt notifiers which also call put/load.
  */
 int kvm_reset_vcpu(struct kvm_vcpu *vcpu)
 {
const struct kvm_regs *cpu_reset;
+   int ret = -EINVAL;
+   bool loaded;
+
+   preempt_disable();
+   loaded = (vcpu->cpu != -1);
+   if (loaded)
+   kvm_arch_vcpu_put(vcpu);
 
switch (vcpu->arch.target) {
default:
if (test_bit(KVM_ARM_VCPU_EL1_32BIT, vcpu->arch.features)) {
if (!cpu_has_32bit_el1())
-   return -EINVAL;
+   goto out;
cpu_reset = &default_regs_reset32;
} else {
cpu_reset = &default_regs_reset;
@@ -127,5 +144,10 @@ int kvm_reset_vcpu(struct kvm_vcpu *vcpu)
vcpu->arch.workaround_flags |= VCPU_WORKAROUND_2_FLAG;
 
/* Reset timer */
-   return kvm_timer_vcpu_reset(vcpu);
+   ret = kvm_timer_vcpu_reset(vcpu);
+out:
+   if (loaded)
+   kvm_arch_vcpu_load(vcpu, smp_processor_id());
+   preempt_enable();
+   return ret;
 }
-- 
2.19.1



[PATCH AUTOSEL 4.19 19/44] KVM: arm/arm64: vgic: Always initialize the group of private IRQs

2019-03-11 Thread Sasha Levin
From: Christoffer Dall 

[ Upstream commit ab2d5eb03dbb7b37a1c6356686fb48626ab0c93e ]

We currently initialize the group of private IRQs during
kvm_vgic_vcpu_init, and the value of the group depends on the GIC model
we are emulating.  However, CPUs created before creating (and
initializing) the VGIC might end up with the wrong group if the VGIC
is created as GICv3 later.

Since we have no enforced ordering of creating the VGIC and creating
VCPUs, we can end up with part of the VCPUs being properly initialized and
the remaining incorrectly initialized.  That also means that we have no
single place to do the per-cpu data structure initialization which
depends on knowing the emulated GIC model (which is only the group
field).

This patch removes the incorrect comment from kvm_vgic_vcpu_init and
initializes the group of all previously created VCPUs' private
interrupts in vgic_init in addition to the existing initialization in
kvm_vgic_vcpu_init.

Signed-off-by: Christoffer Dall 
Signed-off-by: Marc Zyngier 
Signed-off-by: Sasha Levin 
---
 virt/kvm/arm/vgic/vgic-init.c | 22 ++
 1 file changed, 14 insertions(+), 8 deletions(-)

diff --git a/virt/kvm/arm/vgic/vgic-init.c b/virt/kvm/arm/vgic/vgic-init.c
index 33e7ee814f7b..8196e4f8731f 100644
--- a/virt/kvm/arm/vgic/vgic-init.c
+++ b/virt/kvm/arm/vgic/vgic-init.c
@@ -231,13 +231,6 @@ int kvm_vgic_vcpu_init(struct kvm_vcpu *vcpu)
irq->config = VGIC_CONFIG_LEVEL;
}
 
-   /*
-* GICv3 can only be created via the KVM_DEVICE_CREATE API and
-* so we always know the emulation type at this point as it's
-* either explicitly configured as GICv3, or explicitly
-* configured as GICv2, or not configured yet which also
-* implies GICv2.
-*/
if (dist->vgic_model == KVM_DEV_TYPE_ARM_VGIC_V3)
irq->group = 1;
else
@@ -281,7 +274,7 @@ int vgic_init(struct kvm *kvm)
 {
struct vgic_dist *dist = &kvm->arch.vgic;
struct kvm_vcpu *vcpu;
-   int ret = 0, i;
+   int ret = 0, i, idx;
 
if (vgic_initialized(kvm))
return 0;
@@ -298,6 +291,19 @@ int vgic_init(struct kvm *kvm)
if (ret)
goto out;
 
+   /* Initialize groups on CPUs created before the VGIC type was known */
+   kvm_for_each_vcpu(idx, vcpu, kvm) {
+   struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
+
+   for (i = 0; i < VGIC_NR_PRIVATE_IRQS; i++) {
+   struct vgic_irq *irq = &vgic_cpu->private_irqs[i];
+   if (dist->vgic_model == KVM_DEV_TYPE_ARM_VGIC_V3)
+   irq->group = 1;
+   else
+   irq->group = 0;
+   }
+   }
+
if (vgic_has_its(kvm)) {
ret = vgic_v4_init(kvm);
if (ret)
-- 
2.19.1



[PATCH AUTOSEL 4.19 18/44] arm/arm64: KVM: Don't panic on failure to properly reset system registers

2019-03-11 Thread Sasha Levin
From: Marc Zyngier 

[ Upstream commit 20589c8cc47dce5854c8bf1b44a9fc63d798d26d ]

Failing to properly reset system registers is pretty bad. But not
quite as bad as bringing the whole machine down... So warn loudly,
but slightly more gracefully.

Signed-off-by: Marc Zyngier 
Acked-by: Christoffer Dall 
Signed-off-by: Sasha Levin 
---
 arch/arm/kvm/coproc.c | 4 ++--
 arch/arm64/kvm/sys_regs.c | 8 +---
 2 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/arch/arm/kvm/coproc.c b/arch/arm/kvm/coproc.c
index cb094e55dc5f..fd6cde23bb5d 100644
--- a/arch/arm/kvm/coproc.c
+++ b/arch/arm/kvm/coproc.c
@@ -1450,6 +1450,6 @@ void kvm_reset_coprocs(struct kvm_vcpu *vcpu)
reset_coproc_regs(vcpu, table, num);
 
for (num = 1; num < NR_CP15_REGS; num++)
-   if (vcpu_cp15(vcpu, num) == 0x42424242)
-   panic("Didn't reset vcpu_cp15(vcpu, %zi)", num);
+   WARN(vcpu_cp15(vcpu, num) == 0x42424242,
+"Didn't reset vcpu_cp15(vcpu, %zi)", num);
 }
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index 22fbbdbece3c..fe18e68f9a20 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -2586,7 +2586,9 @@ void kvm_reset_sys_regs(struct kvm_vcpu *vcpu)
table = get_target_table(vcpu->arch.target, true, &num);
reset_sys_reg_descs(vcpu, table, num);
 
-   for (num = 1; num < NR_SYS_REGS; num++)
-   if (__vcpu_sys_reg(vcpu, num) == 0x4242424242424242)
-   panic("Didn't reset __vcpu_sys_reg(%zi)", num);
+   for (num = 1; num < NR_SYS_REGS; num++) {
+   if (WARN(__vcpu_sys_reg(vcpu, num) == 0x4242424242424242,
+"Didn't reset __vcpu_sys_reg(%zi)\n", num))
+   break;
+   }
 }
-- 
2.19.1



[PATCH AUTOSEL 4.19 07/44] KVM: arm/arm64: vgic: Make vgic_dist->lpi_list_lock a raw_spinlock

2019-03-11 Thread Sasha Levin
From: Julien Thierry 

[ Upstream commit fc3bc475231e12e9c0142f60100cf84d077c79e1 ]

vgic_dist->lpi_list_lock must always be taken with interrupts disabled as
it is used in interrupt context.

For configurations such as PREEMPT_RT_FULL, this means that it should
be a raw_spinlock since RT spinlocks are interruptible.

Signed-off-by: Julien Thierry 
Acked-by: Christoffer Dall 
Acked-by: Marc Zyngier 
Signed-off-by: Christoffer Dall 
Signed-off-by: Sasha Levin 
---
 include/kvm/arm_vgic.h|  2 +-
 virt/kvm/arm/vgic/vgic-init.c |  2 +-
 virt/kvm/arm/vgic/vgic-its.c  |  8 
 virt/kvm/arm/vgic/vgic.c  | 10 +-
 4 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
index 4f31f96bbfab..90ac450745f1 100644
--- a/include/kvm/arm_vgic.h
+++ b/include/kvm/arm_vgic.h
@@ -256,7 +256,7 @@ struct vgic_dist {
u64 propbaser;
 
/* Protects the lpi_list and the count value below. */
-   spinlock_t  lpi_list_lock;
+   raw_spinlock_t  lpi_list_lock;
struct list_headlpi_list_head;
int lpi_list_count;
 
diff --git a/virt/kvm/arm/vgic/vgic-init.c b/virt/kvm/arm/vgic/vgic-init.c
index c0c0b88af1d5..33e7ee814f7b 100644
--- a/virt/kvm/arm/vgic/vgic-init.c
+++ b/virt/kvm/arm/vgic/vgic-init.c
@@ -64,7 +64,7 @@ void kvm_vgic_early_init(struct kvm *kvm)
struct vgic_dist *dist = &kvm->arch.vgic;
 
INIT_LIST_HEAD(&dist->lpi_list_head);
-   spin_lock_init(&dist->lpi_list_lock);
+   raw_spin_lock_init(&dist->lpi_list_lock);
 }
 
 /* CREATION */
diff --git a/virt/kvm/arm/vgic/vgic-its.c b/virt/kvm/arm/vgic/vgic-its.c
index 12502251727e..f376c82afb61 100644
--- a/virt/kvm/arm/vgic/vgic-its.c
+++ b/virt/kvm/arm/vgic/vgic-its.c
@@ -73,7 +73,7 @@ static struct vgic_irq *vgic_add_lpi(struct kvm *kvm, u32 intid,
irq->target_vcpu = vcpu;
irq->group = 1;
 
-   spin_lock_irqsave(&dist->lpi_list_lock, flags);
+   raw_spin_lock_irqsave(&dist->lpi_list_lock, flags);
 
/*
 * There could be a race with another vgic_add_lpi(), so we need to
@@ -101,7 +101,7 @@ static struct vgic_irq *vgic_add_lpi(struct kvm *kvm, u32 intid,
dist->lpi_list_count++;
 
 out_unlock:
-   spin_unlock_irqrestore(&dist->lpi_list_lock, flags);
+   raw_spin_unlock_irqrestore(&dist->lpi_list_lock, flags);
 
/*
 * We "cache" the configuration table entries in our struct vgic_irq's.
@@ -339,7 +339,7 @@ int vgic_copy_lpi_list(struct kvm *kvm, struct kvm_vcpu *vcpu, u32 **intid_ptr)
if (!intids)
return -ENOMEM;
 
-   spin_lock_irqsave(&dist->lpi_list_lock, flags);
+   raw_spin_lock_irqsave(&dist->lpi_list_lock, flags);
list_for_each_entry(irq, &dist->lpi_list_head, lpi_list) {
if (i == irq_count)
break;
@@ -348,7 +348,7 @@ int vgic_copy_lpi_list(struct kvm *kvm, struct kvm_vcpu *vcpu, u32 **intid_ptr)
continue;
intids[i++] = irq->intid;
}
-   spin_unlock_irqrestore(&dist->lpi_list_lock, flags);
+   raw_spin_unlock_irqrestore(&dist->lpi_list_lock, flags);
 
*intid_ptr = intids;
return i;
diff --git a/virt/kvm/arm/vgic/vgic.c b/virt/kvm/arm/vgic/vgic.c
index f884a54b2601..c5165e3b80cb 100644
--- a/virt/kvm/arm/vgic/vgic.c
+++ b/virt/kvm/arm/vgic/vgic.c
@@ -72,7 +72,7 @@ static struct vgic_irq *vgic_get_lpi(struct kvm *kvm, u32 intid)
struct vgic_irq *irq = NULL;
unsigned long flags;
 
-   spin_lock_irqsave(&dist->lpi_list_lock, flags);
+   raw_spin_lock_irqsave(&dist->lpi_list_lock, flags);
 
list_for_each_entry(irq, &dist->lpi_list_head, lpi_list) {
if (irq->intid != intid)
@@ -88,7 +88,7 @@ static struct vgic_irq *vgic_get_lpi(struct kvm *kvm, u32 intid)
irq = NULL;
 
 out_unlock:
-   spin_unlock_irqrestore(&dist->lpi_list_lock, flags);
+   raw_spin_unlock_irqrestore(&dist->lpi_list_lock, flags);
 
return irq;
 }
@@ -138,15 +138,15 @@ void vgic_put_irq(struct kvm *kvm, struct vgic_irq *irq)
if (irq->intid < VGIC_MIN_LPI)
return;
 
-   spin_lock_irqsave(&dist->lpi_list_lock, flags);
+   raw_spin_lock_irqsave(&dist->lpi_list_lock, flags);
if (!kref_put(&irq->refcount, vgic_irq_release)) {
-   spin_unlock_irqrestore(&dist->lpi_list_lock, flags);
+   raw_spin_unlock_irqrestore(&dist->lpi_list_lock, flags);
return;
};
 
list_del(&irq->lpi_list);
dist->lpi_list_count--;
-   spin_unlock_irqrestore(&dist->lpi_list_lock, flags);
+   raw_spin_unlock_irqrestore(&dist->lpi_list_lock, flags);
 
kfree(irq);
 }
-- 
2.19.1



[PATCH AUTOSEL 4.19 17/44] arm/arm64: KVM: Allow a VCPU to fully reset itself

2019-03-11 Thread Sasha Levin
From: Marc Zyngier 

[ Upstream commit 358b28f09f0ab074d781df72b8a671edb1547789 ]

The current kvm_psci_vcpu_on implementation will directly try to
manipulate the state of the VCPU to reset it.  However, since this is
not done on the thread that runs the VCPU, we can end up in a strangely
corrupted state when the source and target VCPUs are running at the same
time.

Fix this by factoring out all reset logic from the PSCI implementation
and forwarding the required information along with a request to the
target VCPU.

Reviewed-by: Andrew Jones 
Signed-off-by: Marc Zyngier 
Signed-off-by: Christoffer Dall 
Signed-off-by: Sasha Levin 
---
 arch/arm/include/asm/kvm_host.h   | 10 +
 arch/arm/kvm/reset.c  | 24 +
 arch/arm64/include/asm/kvm_host.h | 11 ++
 arch/arm64/kvm/reset.c| 24 +
 virt/kvm/arm/arm.c| 10 +
 virt/kvm/arm/psci.c   | 36 ++-
 6 files changed, 95 insertions(+), 20 deletions(-)

diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
index 3ad482d2f1eb..d0d0227fc70d 100644
--- a/arch/arm/include/asm/kvm_host.h
+++ b/arch/arm/include/asm/kvm_host.h
@@ -48,6 +48,7 @@
 #define KVM_REQ_SLEEP \
KVM_ARCH_REQ_FLAGS(0, KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP)
 #define KVM_REQ_IRQ_PENDINGKVM_ARCH_REQ(1)
+#define KVM_REQ_VCPU_RESET KVM_ARCH_REQ(2)
 
 DECLARE_STATIC_KEY_FALSE(userspace_irqchip_in_use);
 
@@ -147,6 +148,13 @@ struct kvm_cpu_context {
 
 typedef struct kvm_cpu_context kvm_cpu_context_t;
 
+struct vcpu_reset_state {
+   unsigned long   pc;
+   unsigned long   r0;
+   boolbe;
+   boolreset;
+};
+
 struct kvm_vcpu_arch {
struct kvm_cpu_context ctxt;
 
@@ -186,6 +194,8 @@ struct kvm_vcpu_arch {
/* Cache some mmu pages needed inside spinlock regions */
struct kvm_mmu_memory_cache mmu_page_cache;
 
+   struct vcpu_reset_state reset_state;
+
/* Detect first run of a vcpu */
bool has_run_once;
 };
diff --git a/arch/arm/kvm/reset.c b/arch/arm/kvm/reset.c
index 5ed0c3ee33d6..e53327912adc 100644
--- a/arch/arm/kvm/reset.c
+++ b/arch/arm/kvm/reset.c
@@ -26,6 +26,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 
@@ -69,6 +70,29 @@ int kvm_reset_vcpu(struct kvm_vcpu *vcpu)
/* Reset CP15 registers */
kvm_reset_coprocs(vcpu);
 
+   /*
+* Additional reset state handling that PSCI may have imposed on us.
+* Must be done after all the sys_reg reset.
+*/
+   if (READ_ONCE(vcpu->arch.reset_state.reset)) {
+   unsigned long target_pc = vcpu->arch.reset_state.pc;
+
+   /* Gracefully handle Thumb2 entry point */
+   if (target_pc & 1) {
+   target_pc &= ~1UL;
+   vcpu_set_thumb(vcpu);
+   }
+
+   /* Propagate caller endianness */
+   if (vcpu->arch.reset_state.be)
+   kvm_vcpu_set_be(vcpu);
+
+   *vcpu_pc(vcpu) = target_pc;
+   vcpu_set_reg(vcpu, 0, vcpu->arch.reset_state.r0);
+
+   vcpu->arch.reset_state.reset = false;
+   }
+
/* Reset arch_timer context */
return kvm_timer_vcpu_reset(vcpu);
 }
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 3d6d7336f871..6abe4002945f 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -48,6 +48,7 @@
 #define KVM_REQ_SLEEP \
KVM_ARCH_REQ_FLAGS(0, KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP)
 #define KVM_REQ_IRQ_PENDINGKVM_ARCH_REQ(1)
+#define KVM_REQ_VCPU_RESET KVM_ARCH_REQ(2)
 
 DECLARE_STATIC_KEY_FALSE(userspace_irqchip_in_use);
 
@@ -206,6 +207,13 @@ struct kvm_cpu_context {
 
 typedef struct kvm_cpu_context kvm_cpu_context_t;
 
+struct vcpu_reset_state {
+   unsigned long   pc;
+   unsigned long   r0;
+   boolbe;
+   boolreset;
+};
+
 struct kvm_vcpu_arch {
struct kvm_cpu_context ctxt;
 
@@ -295,6 +303,9 @@ struct kvm_vcpu_arch {
/* Virtual SError ESR to restore when HCR_EL2.VSE is set */
u64 vsesr_el2;
 
+   /* Additional reset state */
+   struct vcpu_reset_state reset_state;
+
/* True when deferrable sysregs are loaded on the physical CPU,
 * see kvm_vcpu_load_sysregs and kvm_vcpu_put_sysregs. */
bool sysregs_loaded_on_cpu;
diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
index 644dd0050766..18b9a522a2b3 100644
--- a/arch/arm64/kvm/reset.c
+++ b/arch/arm64/kvm/reset.c
@@ -31,6 +31,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 /*
@@ -140,6 +141,29 @@ int kvm_reset_vcpu(struct kvm_vcpu *vcpu)
/* Reset system registers */
kvm_reset_sys_regs(vcpu);
 
+   /*
+* Additional reset state handling that PSCI may have imposed on us.
+ 

[PATCH AUTOSEL 4.19 16/44] KVM: arm/arm64: Reset the VCPU without preemption and vcpu state loaded

2019-03-11 Thread Sasha Levin
From: Christoffer Dall 

[ Upstream commit e761a927bc9a7ee6ceb7c4f63d5922dbced87f0d ]

We have two ways to reset a vcpu:
- either through VCPU_INIT
- or through a PSCI_ON call

The first one is easy to reason about. The second one is implemented
in a more bizarre way, as it is the vcpu that handles PSCI_ON that
resets the vcpu being powered on. As we need to turn the logic
around and have the target vcpu reset itself, we must take some
preliminary steps.

Resetting the VCPU state modifies the system register state in memory,
but this may interact with vcpu_load/vcpu_put if running with preemption
disabled, which in turn may lead to corrupted system register state.

Address this by disabling preemption and doing put/load if required
around the reset logic.

Reviewed-by: Andrew Jones 
Signed-off-by: Christoffer Dall 
Signed-off-by: Marc Zyngier 
Signed-off-by: Sasha Levin 
---
 arch/arm64/kvm/reset.c | 26 --
 1 file changed, 24 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
index e37c78bbe1ca..644dd0050766 100644
--- a/arch/arm64/kvm/reset.c
+++ b/arch/arm64/kvm/reset.c
@@ -99,16 +99,33 @@ int kvm_arch_dev_ioctl_check_extension(struct kvm *kvm, long ext)
  * This function finds the right table above and sets the registers on
  * the virtual CPU struct to their architecturally defined reset
  * values.
+ *
+ * Note: This function can be called from two paths: The KVM_ARM_VCPU_INIT
+ * ioctl or as part of handling a request issued by another VCPU in the PSCI
+ * handling code.  In the first case, the VCPU will not be loaded, and in the
+ * second case the VCPU will be loaded.  Because this function operates purely
+ * on the memory-backed values of system registers, we want to do a full put if
+ * we were loaded (handling a request) and load the values back at the end of
+ * the function.  Otherwise we leave the state alone.  In both cases, we
+ * disable preemption around the vcpu reset as we would otherwise race with
+ * preempt notifiers which also call put/load.
  */
 int kvm_reset_vcpu(struct kvm_vcpu *vcpu)
 {
const struct kvm_regs *cpu_reset;
+   int ret = -EINVAL;
+   bool loaded;
+
+   preempt_disable();
+   loaded = (vcpu->cpu != -1);
+   if (loaded)
+   kvm_arch_vcpu_put(vcpu);
 
switch (vcpu->arch.target) {
default:
if (test_bit(KVM_ARM_VCPU_EL1_32BIT, vcpu->arch.features)) {
if (!cpu_has_32bit_el1())
-   return -EINVAL;
+   goto out;
cpu_reset = &default_regs_reset32;
} else {
cpu_reset = &default_regs_reset;
@@ -131,5 +148,10 @@ int kvm_reset_vcpu(struct kvm_vcpu *vcpu)
vcpu->arch.workaround_flags |= VCPU_WORKAROUND_2_FLAG;
 
/* Reset timer */
-   return kvm_timer_vcpu_reset(vcpu);
+   ret = kvm_timer_vcpu_reset(vcpu);
+out:
+   if (loaded)
+   kvm_arch_vcpu_load(vcpu, smp_processor_id());
+   preempt_enable();
+   return ret;
 }
-- 
2.19.1



[PATCH AUTOSEL 4.19 20/44] KVM: arm64: Forbid kprobing of the VHE world-switch code

2019-03-11 Thread Sasha Levin
From: James Morse 

[ Upstream commit 7d82602909ed9c73b34ad26f05d10db4850a4f8c ]

On systems with VHE the kernel and KVM's world-switch code run at the
same exception level. Code that is only used on a VHE system does not
need to be annotated as __hyp_text as it can reside anywhere in the
kernel text.

__hyp_text was also used to prevent kprobes from patching breakpoint
instructions into this region, as this code runs at a different
exception level. While this is no longer true with VHE, KVM still
switches VBAR_EL1, meaning a kprobe's breakpoint executed in the
world-switch code will cause a hyp-panic.

echo "p:weasel sysreg_save_guest_state_vhe" > 
/sys/kernel/debug/tracing/kprobe_events
echo 1 > /sys/kernel/debug/tracing/events/kprobes/weasel/enable
lkvm run -k /boot/Image --console serial -p "console=ttyS0 earlycon=uart,mmio,0x3f8"

  # lkvm run -k /boot/Image -m 384 -c 3 --name guest-1474
  Info: Placing fdt at 0x8fe0 - 0x8fff
  Info: virtio-mmio.devices=0x200@0x1:36
  Info: virtio-mmio.devices=0x200@0x10200:37
  Info: virtio-mmio.devices=0x200@0x10400:38

[  614.178186] Kernel panic - not syncing: HYP panic:
[  614.178186] PS:404003c9 PC:100d70e0 ESR:f204
[  614.178186] FAR:8008 HPFAR:00800800 PAR:1d7edbadc0de
[  614.178186] VCPU:f8de32f1
[  614.178383] CPU: 2 PID: 1482 Comm: kvm-vcpu-0 Not tainted 5.0.0-rc2 #10799
[  614.178446] Call trace:
[  614.178480]  dump_backtrace+0x0/0x148
[  614.178567]  show_stack+0x24/0x30
[  614.178658]  dump_stack+0x90/0xb4
[  614.178710]  panic+0x13c/0x2d8
[  614.178793]  hyp_panic+0xac/0xd8
[  614.178880]  kvm_vcpu_run_vhe+0x9c/0xe0
[  614.178958]  kvm_arch_vcpu_ioctl_run+0x454/0x798
[  614.179038]  kvm_vcpu_ioctl+0x360/0x898
[  614.179087]  do_vfs_ioctl+0xc4/0x858
[  614.179174]  ksys_ioctl+0x84/0xb8
[  614.179261]  __arm64_sys_ioctl+0x28/0x38
[  614.179348]  el0_svc_common+0x94/0x108
[  614.179401]  el0_svc_handler+0x38/0x78
[  614.179487]  el0_svc+0x8/0xc
[  614.179558] SMP: stopping secondary CPUs
[  614.179661] Kernel Offset: disabled
[  614.179695] CPU features: 0x003,2a80aa38
[  614.179758] Memory Limit: none
[  614.179858] ---[ end Kernel panic - not syncing: HYP panic:
[  614.179858] PS:404003c9 PC:100d70e0 ESR:f204
[  614.179858] FAR:8008 HPFAR:00800800 PAR:1d7edbadc0de
[  614.179858] VCPU:f8de32f1 ]---

Annotate the VHE world-switch functions that aren't marked
__hyp_text using NOKPROBE_SYMBOL().

Signed-off-by: James Morse 
Fixes: 3f5c90b890ac ("KVM: arm64: Introduce VHE-specific kvm_vcpu_run")
Acked-by: Masami Hiramatsu 
Signed-off-by: Marc Zyngier 
Signed-off-by: Sasha Levin 
---
 arch/arm64/kvm/hyp/switch.c| 5 +
 arch/arm64/kvm/hyp/sysreg-sr.c | 5 +
 2 files changed, 10 insertions(+)

diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c
index a1c32c1f2267..6290a4e81d57 100644
--- a/arch/arm64/kvm/hyp/switch.c
+++ b/arch/arm64/kvm/hyp/switch.c
@@ -23,6 +23,7 @@
 #include 
 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -107,6 +108,7 @@ static void activate_traps_vhe(struct kvm_vcpu *vcpu)
 
write_sysreg(kvm_get_hyp_vector(), vbar_el1);
 }
+NOKPROBE_SYMBOL(activate_traps_vhe);
 
 static void __hyp_text __activate_traps_nvhe(struct kvm_vcpu *vcpu)
 {
@@ -146,6 +148,7 @@ static void deactivate_traps_vhe(void)
write_sysreg(CPACR_EL1_DEFAULT, cpacr_el1);
write_sysreg(vectors, vbar_el1);
 }
+NOKPROBE_SYMBOL(deactivate_traps_vhe);
 
 static void __hyp_text __deactivate_traps_nvhe(void)
 {
@@ -529,6 +532,7 @@ int kvm_vcpu_run_vhe(struct kvm_vcpu *vcpu)
 
return exit_code;
 }
+NOKPROBE_SYMBOL(kvm_vcpu_run_vhe);
 
 /* Switch to the guest for legacy non-VHE systems */
 int __hyp_text __kvm_vcpu_run_nvhe(struct kvm_vcpu *vcpu)
@@ -636,6 +640,7 @@ static void __hyp_call_panic_vhe(u64 spsr, u64 elr, u64 par,
  read_sysreg_el2(esr),   read_sysreg_el2(far),
  read_sysreg(hpfar_el2), par, vcpu);
 }
+NOKPROBE_SYMBOL(__hyp_call_panic_vhe);
 
 void __hyp_text __noreturn hyp_panic(struct kvm_cpu_context *host_ctxt)
 {
diff --git a/arch/arm64/kvm/hyp/sysreg-sr.c b/arch/arm64/kvm/hyp/sysreg-sr.c
index 9ce223944983..963d669ae3a2 100644
--- a/arch/arm64/kvm/hyp/sysreg-sr.c
+++ b/arch/arm64/kvm/hyp/sysreg-sr.c
@@ -18,6 +18,7 @@
 #include 
 #include 
 
+#include 
 #include 
 #include 
 #include 
@@ -98,12 +99,14 @@ void sysreg_save_host_state_vhe(struct kvm_cpu_context *ctxt)
 {
__sysreg_save_common_state(ctxt);
 }
+NOKPROBE_SYMBOL(sysreg_save_host_state_vhe);
 
 void sysreg_save_guest_state_vhe(struct kvm_cpu_context *ctxt)
 {
__sysreg_save_common_state(ctxt);
__sysreg_save_el2_return_state(ctxt);
 }
+NOKPROBE_SYMBOL(sysreg_save_guest_state_vhe);
 
 static void __hyp_text __sysreg_restore_common_state(struct kvm_cpu_context *ctxt)
 {
@@ -171,12 +174,14 @@ void sysreg_restore_host_state_vhe(struct kvm_cpu_context *ctxt)
 {
__sysreg_restore_common_sta

[PATCH AUTOSEL 4.20 18/52] KVM: arm/arm64: Reset the VCPU without preemption and vcpu state loaded

2019-03-11 Thread Sasha Levin
From: Christoffer Dall 

[ Upstream commit e761a927bc9a7ee6ceb7c4f63d5922dbced87f0d ]

We have two ways to reset a vcpu:
- either through VCPU_INIT
- or through a PSCI_ON call

The first one is easy to reason about. The second one is implemented
in a more bizarre way, as it is the vcpu that handles PSCI_ON that
resets the vcpu being powered on. As we need to turn the logic
around and have the target vcpu reset itself, we must take some
preliminary steps.

Resetting the VCPU state modifies the system register state in memory,
but this may interact with vcpu_load/vcpu_put if running with preemption
disabled, which in turn may lead to corrupted system register state.

Address this by disabling preemption and doing put/load if required
around the reset logic.

Reviewed-by: Andrew Jones 
Signed-off-by: Christoffer Dall 
Signed-off-by: Marc Zyngier 
Signed-off-by: Sasha Levin 
---
 arch/arm64/kvm/reset.c | 26 --
 1 file changed, 24 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
index b72a3dd56204..f21a2a575939 100644
--- a/arch/arm64/kvm/reset.c
+++ b/arch/arm64/kvm/reset.c
@@ -105,16 +105,33 @@ int kvm_arch_vm_ioctl_check_extension(struct kvm *kvm, long ext)
  * This function finds the right table above and sets the registers on
  * the virtual CPU struct to their architecturally defined reset
  * values.
+ *
+ * Note: This function can be called from two paths: The KVM_ARM_VCPU_INIT
+ * ioctl or as part of handling a request issued by another VCPU in the PSCI
+ * handling code.  In the first case, the VCPU will not be loaded, and in the
+ * second case the VCPU will be loaded.  Because this function operates purely
+ * on the memory-backed values of system registers, we want to do a full put if
+ * we were loaded (handling a request) and load the values back at the end of
+ * the function.  Otherwise we leave the state alone.  In both cases, we
+ * disable preemption around the vcpu reset as we would otherwise race with
+ * preempt notifiers which also call put/load.
  */
 int kvm_reset_vcpu(struct kvm_vcpu *vcpu)
 {
const struct kvm_regs *cpu_reset;
+   int ret = -EINVAL;
+   bool loaded;
+
+   preempt_disable();
+   loaded = (vcpu->cpu != -1);
+   if (loaded)
+   kvm_arch_vcpu_put(vcpu);
 
switch (vcpu->arch.target) {
default:
if (test_bit(KVM_ARM_VCPU_EL1_32BIT, vcpu->arch.features)) {
if (!cpu_has_32bit_el1())
-   return -EINVAL;
+   goto out;
cpu_reset = &default_regs_reset32;
} else {
cpu_reset = &default_regs_reset;
@@ -137,7 +154,12 @@ int kvm_reset_vcpu(struct kvm_vcpu *vcpu)
vcpu->arch.workaround_flags |= VCPU_WORKAROUND_2_FLAG;
 
/* Reset timer */
-   return kvm_timer_vcpu_reset(vcpu);
+   ret = kvm_timer_vcpu_reset(vcpu);
+out:
+   if (loaded)
+   kvm_arch_vcpu_load(vcpu, smp_processor_id());
+   preempt_enable();
+   return ret;
 }
 
 void kvm_set_ipa_limit(void)
-- 
2.19.1



[PATCH AUTOSEL 4.20 21/52] KVM: arm/arm64: vgic: Always initialize the group of private IRQs

2019-03-11 Thread Sasha Levin
From: Christoffer Dall 

[ Upstream commit ab2d5eb03dbb7b37a1c6356686fb48626ab0c93e ]

We currently initialize the group of private IRQs during
kvm_vgic_vcpu_init, and the value of the group depends on the GIC model
we are emulating.  However, CPUs created before creating (and
initializing) the VGIC might end up with the wrong group if the VGIC
is created as GICv3 later.

Since we have no enforced ordering of creating the VGIC and creating
VCPUs, we can end up with part of the VCPUs being properly initialized and
the remaining incorrectly initialized.  That also means that we have no
single place to do the per-cpu data structure initialization which
depends on knowing the emulated GIC model (which is only the group
field).

This patch removes the incorrect comment from kvm_vgic_vcpu_init and
initializes the group of all previously created VCPUs' private
interrupts in vgic_init in addition to the existing initialization in
kvm_vgic_vcpu_init.

Signed-off-by: Christoffer Dall 
Signed-off-by: Marc Zyngier 
Signed-off-by: Sasha Levin 
---
 virt/kvm/arm/vgic/vgic-init.c | 22 ++
 1 file changed, 14 insertions(+), 8 deletions(-)

diff --git a/virt/kvm/arm/vgic/vgic-init.c b/virt/kvm/arm/vgic/vgic-init.c
index 33e7ee814f7b..8196e4f8731f 100644
--- a/virt/kvm/arm/vgic/vgic-init.c
+++ b/virt/kvm/arm/vgic/vgic-init.c
@@ -231,13 +231,6 @@ int kvm_vgic_vcpu_init(struct kvm_vcpu *vcpu)
irq->config = VGIC_CONFIG_LEVEL;
}
 
-   /*
-* GICv3 can only be created via the KVM_DEVICE_CREATE API and
-* so we always know the emulation type at this point as it's
-* either explicitly configured as GICv3, or explicitly
-* configured as GICv2, or not configured yet which also
-* implies GICv2.
-*/
if (dist->vgic_model == KVM_DEV_TYPE_ARM_VGIC_V3)
irq->group = 1;
else
@@ -281,7 +274,7 @@ int vgic_init(struct kvm *kvm)
 {
struct vgic_dist *dist = &kvm->arch.vgic;
struct kvm_vcpu *vcpu;
-   int ret = 0, i;
+   int ret = 0, i, idx;
 
if (vgic_initialized(kvm))
return 0;
@@ -298,6 +291,19 @@ int vgic_init(struct kvm *kvm)
if (ret)
goto out;
 
+   /* Initialize groups on CPUs created before the VGIC type was known */
+   kvm_for_each_vcpu(idx, vcpu, kvm) {
+   struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
+
+   for (i = 0; i < VGIC_NR_PRIVATE_IRQS; i++) {
+   struct vgic_irq *irq = &vgic_cpu->private_irqs[i];
+   if (dist->vgic_model == KVM_DEV_TYPE_ARM_VGIC_V3)
+   irq->group = 1;
+   else
+   irq->group = 0;
+   }
+   }
+
if (vgic_has_its(kvm)) {
ret = vgic_v4_init(kvm);
if (ret)
-- 
2.19.1



[PATCH AUTOSEL 4.20 19/52] arm/arm64: KVM: Allow a VCPU to fully reset itself

2019-03-11 Thread Sasha Levin
From: Marc Zyngier 

[ Upstream commit 358b28f09f0ab074d781df72b8a671edb1547789 ]

The current kvm_psci_vcpu_on implementation will directly try to
manipulate the state of the VCPU to reset it.  However, since this is
not done on the thread that runs the VCPU, we can end up in a strangely
corrupted state when the source and target VCPUs are running at the same
time.

Fix this by factoring out all reset logic from the PSCI implementation
and forwarding the required information along with a request to the
target VCPU.

Reviewed-by: Andrew Jones 
Signed-off-by: Marc Zyngier 
Signed-off-by: Christoffer Dall 
Signed-off-by: Sasha Levin 
---
 arch/arm/include/asm/kvm_host.h   | 10 +
 arch/arm/kvm/reset.c  | 24 +
 arch/arm64/include/asm/kvm_host.h | 11 ++
 arch/arm64/kvm/reset.c| 24 +
 virt/kvm/arm/arm.c| 10 +
 virt/kvm/arm/psci.c   | 36 ++-
 6 files changed, 95 insertions(+), 20 deletions(-)

diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
index 5ca5d9af0c26..4ecd426e9b1c 100644
--- a/arch/arm/include/asm/kvm_host.h
+++ b/arch/arm/include/asm/kvm_host.h
@@ -48,6 +48,7 @@
 #define KVM_REQ_SLEEP \
KVM_ARCH_REQ_FLAGS(0, KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP)
 #define KVM_REQ_IRQ_PENDINGKVM_ARCH_REQ(1)
+#define KVM_REQ_VCPU_RESET KVM_ARCH_REQ(2)
 
 DECLARE_STATIC_KEY_FALSE(userspace_irqchip_in_use);
 
@@ -147,6 +148,13 @@ struct kvm_cpu_context {
 
 typedef struct kvm_cpu_context kvm_cpu_context_t;
 
+struct vcpu_reset_state {
+   unsigned long   pc;
+   unsigned long   r0;
+   boolbe;
+   boolreset;
+};
+
 struct kvm_vcpu_arch {
struct kvm_cpu_context ctxt;
 
@@ -186,6 +194,8 @@ struct kvm_vcpu_arch {
/* Cache some mmu pages needed inside spinlock regions */
struct kvm_mmu_memory_cache mmu_page_cache;
 
+   struct vcpu_reset_state reset_state;
+
/* Detect first run of a vcpu */
bool has_run_once;
 };
diff --git a/arch/arm/kvm/reset.c b/arch/arm/kvm/reset.c
index 5ed0c3ee33d6..e53327912adc 100644
--- a/arch/arm/kvm/reset.c
+++ b/arch/arm/kvm/reset.c
@@ -26,6 +26,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 
@@ -69,6 +70,29 @@ int kvm_reset_vcpu(struct kvm_vcpu *vcpu)
/* Reset CP15 registers */
kvm_reset_coprocs(vcpu);
 
+   /*
+* Additional reset state handling that PSCI may have imposed on us.
+* Must be done after all the sys_reg reset.
+*/
+   if (READ_ONCE(vcpu->arch.reset_state.reset)) {
+   unsigned long target_pc = vcpu->arch.reset_state.pc;
+
+   /* Gracefully handle Thumb2 entry point */
+   if (target_pc & 1) {
+   target_pc &= ~1UL;
+   vcpu_set_thumb(vcpu);
+   }
+
+   /* Propagate caller endianness */
+   if (vcpu->arch.reset_state.be)
+   kvm_vcpu_set_be(vcpu);
+
+   *vcpu_pc(vcpu) = target_pc;
+   vcpu_set_reg(vcpu, 0, vcpu->arch.reset_state.r0);
+
+   vcpu->arch.reset_state.reset = false;
+   }
+
/* Reset arch_timer context */
return kvm_timer_vcpu_reset(vcpu);
 }
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 52fbc823ff8c..4fc7c8f0b717 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -48,6 +48,7 @@
 #define KVM_REQ_SLEEP \
KVM_ARCH_REQ_FLAGS(0, KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP)
 #define KVM_REQ_IRQ_PENDINGKVM_ARCH_REQ(1)
+#define KVM_REQ_VCPU_RESET KVM_ARCH_REQ(2)
 
 DECLARE_STATIC_KEY_FALSE(userspace_irqchip_in_use);
 
@@ -208,6 +209,13 @@ struct kvm_cpu_context {
 
 typedef struct kvm_cpu_context kvm_cpu_context_t;
 
+struct vcpu_reset_state {
+   unsigned long   pc;
+   unsigned long   r0;
+   boolbe;
+   boolreset;
+};
+
 struct kvm_vcpu_arch {
struct kvm_cpu_context ctxt;
 
@@ -297,6 +305,9 @@ struct kvm_vcpu_arch {
/* Virtual SError ESR to restore when HCR_EL2.VSE is set */
u64 vsesr_el2;
 
+   /* Additional reset state */
+   struct vcpu_reset_state reset_state;
+
/* True when deferrable sysregs are loaded on the physical CPU,
 * see kvm_vcpu_load_sysregs and kvm_vcpu_put_sysregs. */
bool sysregs_loaded_on_cpu;
diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
index f21a2a575939..f16a5f8ff2b4 100644
--- a/arch/arm64/kvm/reset.c
+++ b/arch/arm64/kvm/reset.c
@@ -32,6 +32,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 /* Maximum phys_shift supported for any VM on this host */
@@ -146,6 +147,29 @@ int kvm_reset_vcpu(struct kvm_vcpu *vcpu)
/* Reset system registers */
kvm_reset_sys_regs(vcpu);
 
+   /*
+* Additional reset

[PATCH AUTOSEL 4.20 20/52] arm/arm64: KVM: Don't panic on failure to properly reset system registers

2019-03-11 Thread Sasha Levin
From: Marc Zyngier 

[ Upstream commit 20589c8cc47dce5854c8bf1b44a9fc63d798d26d ]

Failing to properly reset system registers is pretty bad. But not
quite as bad as bringing the whole machine down... So warn loudly,
but slightly more gracefully.

Signed-off-by: Marc Zyngier 
Acked-by: Christoffer Dall 
Signed-off-by: Sasha Levin 
---
 arch/arm/kvm/coproc.c | 4 ++--
 arch/arm64/kvm/sys_regs.c | 8 +---
 2 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/arch/arm/kvm/coproc.c b/arch/arm/kvm/coproc.c
index cb094e55dc5f..fd6cde23bb5d 100644
--- a/arch/arm/kvm/coproc.c
+++ b/arch/arm/kvm/coproc.c
@@ -1450,6 +1450,6 @@ void kvm_reset_coprocs(struct kvm_vcpu *vcpu)
reset_coproc_regs(vcpu, table, num);
 
for (num = 1; num < NR_CP15_REGS; num++)
-   if (vcpu_cp15(vcpu, num) == 0x42424242)
-   panic("Didn't reset vcpu_cp15(vcpu, %zi)", num);
+   WARN(vcpu_cp15(vcpu, num) == 0x42424242,
+"Didn't reset vcpu_cp15(vcpu, %zi)", num);
 }
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index 22fbbdbece3c..fe18e68f9a20 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -2586,7 +2586,9 @@ void kvm_reset_sys_regs(struct kvm_vcpu *vcpu)
table = get_target_table(vcpu->arch.target, true, &num);
reset_sys_reg_descs(vcpu, table, num);
 
-   for (num = 1; num < NR_SYS_REGS; num++)
-   if (__vcpu_sys_reg(vcpu, num) == 0x4242424242424242)
-   panic("Didn't reset __vcpu_sys_reg(%zi)", num);
+   for (num = 1; num < NR_SYS_REGS; num++) {
+   if (WARN(__vcpu_sys_reg(vcpu, num) == 0x4242424242424242,
+"Didn't reset __vcpu_sys_reg(%zi)\n", num))
+   break;
+   }
 }
-- 
2.19.1



[PATCH AUTOSEL 4.20 22/52] KVM: arm64: Forbid kprobing of the VHE world-switch code

2019-03-11 Thread Sasha Levin
From: James Morse 

[ Upstream commit 7d82602909ed9c73b34ad26f05d10db4850a4f8c ]

On systems with VHE the kernel and KVM's world-switch code run at the
same exception level. Code that is only used on a VHE system does not
need to be annotated as __hyp_text as it can reside anywhere in the
kernel text.

__hyp_text was also used to prevent kprobes from patching breakpoint
instructions into this region, as this code runs at a different
exception level. While this is no longer true with VHE, KVM still
switches VBAR_EL1, meaning a kprobe's breakpoint executed in the
world-switch code will cause a hyp-panic.

echo "p:weasel sysreg_save_guest_state_vhe" > 
/sys/kernel/debug/tracing/kprobe_events
echo 1 > /sys/kernel/debug/tracing/events/kprobes/weasel/enable
lkvm run -k /boot/Image --console serial -p "console=ttyS0 earlycon=uart,mmio,0x3f8"

  # lkvm run -k /boot/Image -m 384 -c 3 --name guest-1474
  Info: Placing fdt at 0x8fe0 - 0x8fff
  Info: virtio-mmio.devices=0x200@0x1:36
  Info: virtio-mmio.devices=0x200@0x10200:37
  Info: virtio-mmio.devices=0x200@0x10400:38

[  614.178186] Kernel panic - not syncing: HYP panic:
[  614.178186] PS:404003c9 PC:100d70e0 ESR:f204
[  614.178186] FAR:8008 HPFAR:00800800 PAR:1d7edbadc0de
[  614.178186] VCPU:f8de32f1
[  614.178383] CPU: 2 PID: 1482 Comm: kvm-vcpu-0 Not tainted 5.0.0-rc2 #10799
[  614.178446] Call trace:
[  614.178480]  dump_backtrace+0x0/0x148
[  614.178567]  show_stack+0x24/0x30
[  614.178658]  dump_stack+0x90/0xb4
[  614.178710]  panic+0x13c/0x2d8
[  614.178793]  hyp_panic+0xac/0xd8
[  614.178880]  kvm_vcpu_run_vhe+0x9c/0xe0
[  614.178958]  kvm_arch_vcpu_ioctl_run+0x454/0x798
[  614.179038]  kvm_vcpu_ioctl+0x360/0x898
[  614.179087]  do_vfs_ioctl+0xc4/0x858
[  614.179174]  ksys_ioctl+0x84/0xb8
[  614.179261]  __arm64_sys_ioctl+0x28/0x38
[  614.179348]  el0_svc_common+0x94/0x108
[  614.179401]  el0_svc_handler+0x38/0x78
[  614.179487]  el0_svc+0x8/0xc
[  614.179558] SMP: stopping secondary CPUs
[  614.179661] Kernel Offset: disabled
[  614.179695] CPU features: 0x003,2a80aa38
[  614.179758] Memory Limit: none
[  614.179858] ---[ end Kernel panic - not syncing: HYP panic:
[  614.179858] PS:404003c9 PC:100d70e0 ESR:f204
[  614.179858] FAR:8008 HPFAR:00800800 PAR:1d7edbadc0de
[  614.179858] VCPU:f8de32f1 ]---

Annotate the VHE world-switch functions that aren't marked
__hyp_text using NOKPROBE_SYMBOL().

Signed-off-by: James Morse 
Fixes: 3f5c90b890ac ("KVM: arm64: Introduce VHE-specific kvm_vcpu_run")
Acked-by: Masami Hiramatsu 
Signed-off-by: Marc Zyngier 
Signed-off-by: Sasha Levin 
---
 arch/arm64/kvm/hyp/switch.c| 5 +
 arch/arm64/kvm/hyp/sysreg-sr.c | 5 +
 2 files changed, 10 insertions(+)

diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c
index f6e02cc4d856..c9f4b25f67d9 100644
--- a/arch/arm64/kvm/hyp/switch.c
+++ b/arch/arm64/kvm/hyp/switch.c
@@ -23,6 +23,7 @@
 #include 
 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -107,6 +108,7 @@ static void activate_traps_vhe(struct kvm_vcpu *vcpu)
 
write_sysreg(kvm_get_hyp_vector(), vbar_el1);
 }
+NOKPROBE_SYMBOL(activate_traps_vhe);
 
 static void __hyp_text __activate_traps_nvhe(struct kvm_vcpu *vcpu)
 {
@@ -146,6 +148,7 @@ static void deactivate_traps_vhe(void)
write_sysreg(CPACR_EL1_DEFAULT, cpacr_el1);
write_sysreg(vectors, vbar_el1);
 }
+NOKPROBE_SYMBOL(deactivate_traps_vhe);
 
 static void __hyp_text __deactivate_traps_nvhe(void)
 {
@@ -529,6 +532,7 @@ int kvm_vcpu_run_vhe(struct kvm_vcpu *vcpu)
 
return exit_code;
 }
+NOKPROBE_SYMBOL(kvm_vcpu_run_vhe);
 
 /* Switch to the guest for legacy non-VHE systems */
 int __hyp_text __kvm_vcpu_run_nvhe(struct kvm_vcpu *vcpu)
@@ -636,6 +640,7 @@ static void __hyp_call_panic_vhe(u64 spsr, u64 elr, u64 par,
  read_sysreg_el2(esr),   read_sysreg_el2(far),
  read_sysreg(hpfar_el2), par, vcpu);
 }
+NOKPROBE_SYMBOL(__hyp_call_panic_vhe);
 
 void __hyp_text __noreturn hyp_panic(struct kvm_cpu_context *host_ctxt)
 {
diff --git a/arch/arm64/kvm/hyp/sysreg-sr.c b/arch/arm64/kvm/hyp/sysreg-sr.c
index 68d6f7c3b237..b426e2cf973c 100644
--- a/arch/arm64/kvm/hyp/sysreg-sr.c
+++ b/arch/arm64/kvm/hyp/sysreg-sr.c
@@ -18,6 +18,7 @@
 #include 
 #include 
 
+#include 
 #include 
 #include 
 #include 
@@ -98,12 +99,14 @@ void sysreg_save_host_state_vhe(struct kvm_cpu_context *ctxt)
 {
__sysreg_save_common_state(ctxt);
 }
+NOKPROBE_SYMBOL(sysreg_save_host_state_vhe);
 
 void sysreg_save_guest_state_vhe(struct kvm_cpu_context *ctxt)
 {
__sysreg_save_common_state(ctxt);
__sysreg_save_el2_return_state(ctxt);
 }
+NOKPROBE_SYMBOL(sysreg_save_guest_state_vhe);
 
 static void __hyp_text __sysreg_restore_common_state(struct kvm_cpu_context *ctxt)
 {
@@ -188,12 +191,14 @@ void sysreg_restore_host_state_vhe(struct kvm_cpu_context *ctxt)
 {
__sysreg_restore_common_sta

[PATCH AUTOSEL 4.20 07/52] KVM: arm/arm64: vgic: Make vgic_dist->lpi_list_lock a raw_spinlock

2019-03-11 Thread Sasha Levin
From: Julien Thierry 

[ Upstream commit fc3bc475231e12e9c0142f60100cf84d077c79e1 ]

vgic_dist->lpi_list_lock must always be taken with interrupts disabled as
it is used in interrupt context.

For configurations such as PREEMPT_RT_FULL, this means that it should
be a raw_spinlock since RT spinlocks are interruptible.

Signed-off-by: Julien Thierry 
Acked-by: Christoffer Dall 
Acked-by: Marc Zyngier 
Signed-off-by: Christoffer Dall 
Signed-off-by: Sasha Levin 
---
 include/kvm/arm_vgic.h|  2 +-
 virt/kvm/arm/vgic/vgic-init.c |  2 +-
 virt/kvm/arm/vgic/vgic-its.c  |  8 
 virt/kvm/arm/vgic/vgic.c  | 10 +-
 4 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
index 4f31f96bbfab..90ac450745f1 100644
--- a/include/kvm/arm_vgic.h
+++ b/include/kvm/arm_vgic.h
@@ -256,7 +256,7 @@ struct vgic_dist {
u64 propbaser;
 
/* Protects the lpi_list and the count value below. */
-   spinlock_t  lpi_list_lock;
+   raw_spinlock_t  lpi_list_lock;
struct list_headlpi_list_head;
int lpi_list_count;
 
diff --git a/virt/kvm/arm/vgic/vgic-init.c b/virt/kvm/arm/vgic/vgic-init.c
index c0c0b88af1d5..33e7ee814f7b 100644
--- a/virt/kvm/arm/vgic/vgic-init.c
+++ b/virt/kvm/arm/vgic/vgic-init.c
@@ -64,7 +64,7 @@ void kvm_vgic_early_init(struct kvm *kvm)
struct vgic_dist *dist = &kvm->arch.vgic;
 
INIT_LIST_HEAD(&dist->lpi_list_head);
-   spin_lock_init(&dist->lpi_list_lock);
+   raw_spin_lock_init(&dist->lpi_list_lock);
 }
 
 /* CREATION */
diff --git a/virt/kvm/arm/vgic/vgic-its.c b/virt/kvm/arm/vgic/vgic-its.c
index eb2a390a6c86..e99fb31582fb 100644
--- a/virt/kvm/arm/vgic/vgic-its.c
+++ b/virt/kvm/arm/vgic/vgic-its.c
@@ -73,7 +73,7 @@ static struct vgic_irq *vgic_add_lpi(struct kvm *kvm, u32 intid,
irq->target_vcpu = vcpu;
irq->group = 1;
 
-   spin_lock_irqsave(&dist->lpi_list_lock, flags);
+   raw_spin_lock_irqsave(&dist->lpi_list_lock, flags);
 
/*
 * There could be a race with another vgic_add_lpi(), so we need to
@@ -101,7 +101,7 @@ static struct vgic_irq *vgic_add_lpi(struct kvm *kvm, u32 intid,
dist->lpi_list_count++;
 
 out_unlock:
-   spin_unlock_irqrestore(&dist->lpi_list_lock, flags);
+   raw_spin_unlock_irqrestore(&dist->lpi_list_lock, flags);
 
/*
 * We "cache" the configuration table entries in our struct vgic_irq's.
@@ -332,7 +332,7 @@ int vgic_copy_lpi_list(struct kvm *kvm, struct kvm_vcpu *vcpu, u32 **intid_ptr)
if (!intids)
return -ENOMEM;
 
-   spin_lock_irqsave(&dist->lpi_list_lock, flags);
+   raw_spin_lock_irqsave(&dist->lpi_list_lock, flags);
list_for_each_entry(irq, &dist->lpi_list_head, lpi_list) {
if (i == irq_count)
break;
@@ -341,7 +341,7 @@ int vgic_copy_lpi_list(struct kvm *kvm, struct kvm_vcpu *vcpu, u32 **intid_ptr)
continue;
intids[i++] = irq->intid;
}
-   spin_unlock_irqrestore(&dist->lpi_list_lock, flags);
+   raw_spin_unlock_irqrestore(&dist->lpi_list_lock, flags);
 
*intid_ptr = intids;
return i;
diff --git a/virt/kvm/arm/vgic/vgic.c b/virt/kvm/arm/vgic/vgic.c
index f884a54b2601..c5165e3b80cb 100644
--- a/virt/kvm/arm/vgic/vgic.c
+++ b/virt/kvm/arm/vgic/vgic.c
@@ -72,7 +72,7 @@ static struct vgic_irq *vgic_get_lpi(struct kvm *kvm, u32 intid)
struct vgic_irq *irq = NULL;
unsigned long flags;
 
-   spin_lock_irqsave(&dist->lpi_list_lock, flags);
+   raw_spin_lock_irqsave(&dist->lpi_list_lock, flags);
 
list_for_each_entry(irq, &dist->lpi_list_head, lpi_list) {
if (irq->intid != intid)
@@ -88,7 +88,7 @@ static struct vgic_irq *vgic_get_lpi(struct kvm *kvm, u32 intid)
irq = NULL;
 
 out_unlock:
-   spin_unlock_irqrestore(&dist->lpi_list_lock, flags);
+   raw_spin_unlock_irqrestore(&dist->lpi_list_lock, flags);
 
return irq;
 }
@@ -138,15 +138,15 @@ void vgic_put_irq(struct kvm *kvm, struct vgic_irq *irq)
if (irq->intid < VGIC_MIN_LPI)
return;
 
-   spin_lock_irqsave(&dist->lpi_list_lock, flags);
+   raw_spin_lock_irqsave(&dist->lpi_list_lock, flags);
if (!kref_put(&irq->refcount, vgic_irq_release)) {
-   spin_unlock_irqrestore(&dist->lpi_list_lock, flags);
+   raw_spin_unlock_irqrestore(&dist->lpi_list_lock, flags);
return;
};
 
list_del(&irq->lpi_list);
dist->lpi_list_count--;
-   spin_unlock_irqrestore(&dist->lpi_list_lock, flags);
+   raw_spin_unlock_irqrestore(&dist->lpi_list_lock, flags);
 
kfree(irq);
 }
-- 
2.19.1



[RFC] Question about TLB flush while setting Stage-2 huge pages

2019-03-11 Thread Zheng Xiang
Hi all,

When a page is merged into a transparent huge page, KVM will invalidate the
Stage-2 entry for the base address of the huge page and the whole of Stage-1.
However, this only invalidates the first page within the huge page; the other
pages are not invalidated, see below:

+-----+-----+-----+-----+-----+
|  a  |  b  |  c  |  d  |  e  |   2MB page
+-----+-----+-----+-----+-----+

TLB before setting new pmd:
+------+----------+
|  VA  | PAGESIZE |
+------+----------+
|  a   |   4KB    |
|  b   |   4KB    |
|  c   |   4KB    |
|  d   |   4KB    |
+------+----------+

TLB after setting new pmd:
+------+----------+
|  VA  | PAGESIZE |
+------+----------+
|  a   |   2MB    |
|  b   |   4KB    |
|  c   |   4KB    |
|  d   |   4KB    |
+------+----------+

When the VM accesses address *b*, it can hit both the stale 4KB entry and the
new 2MB entry, resulting in TLB conflict aborts or other potential exceptions.

For example, we need to keep tracking dirty pages of VM memory when the VM is
in live migration, so KVM will set the memslot READONLY and split the huge pages.
After live migration is canceled or aborted, the pages will be merged into THPs
again. Later accesses to these pages through the stale READONLY entries will
cause level-3 Permission Faults until those entries are invalidated.

So should we invalidate the TLB entries for all related pages (e.g. a, b, c, d),
as __flush_tlb_range() does? Or we could call __kvm_tlb_flush_vmid() to
invalidate all TLB entries for the VMID.
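
For reference, here is a rough sketch of the range-based variant
(illustrative only: kvm_tlb_flush_vmid_ipa() is the existing per-IPA
helper in virt/kvm/arm/mmu.c, while the function name and placement
below are assumptions, not a tested fix):

    /*
     * Invalidate every 4KB entry covered by the new 2MB block
     * mapping, instead of only the entry for its base address.
     */
    static void stage2_flush_ptes_of_pmd(struct kvm *kvm, phys_addr_t addr)
    {
            phys_addr_t end = addr + PMD_SIZE;

            for (; addr < end; addr += PAGE_SIZE)
                    kvm_tlb_flush_vmid_ipa(kvm, addr);
    }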



-- 

Thanks,
Xiang



Re: Question: KVM: Failed to bind vfio with PCI-e / SMMU on Juno-r2

2019-03-11 Thread Leo Yan
Hi Eric,

[ + Mark Rutland ]

On Mon, Mar 11, 2019 at 10:47:22AM +0100, Auger Eric wrote:

[...]

> >>> P.s. I also checked the sysfs node and found it doesn't contain node
> >>> 'iommu_group':
> >>>
> >>> # ls /sys/bus/pci/devices/\:08\:00.0/iommu_group
> >>> ls: cannot access '/sys/bus/pci/devices/:08:00.0/iommu_group': No
> >>> such file or directory
> >>
> >> please can you give the output of the following command:
> >> find /sys/kernel/iommu_groups/
> > 
> > I get below result on Juno board:
> > 
> > root@debian:~# find /sys/kernel/iommu_groups/
> > /sys/kernel/iommu_groups/
> > /sys/kernel/iommu_groups/1
> > /sys/kernel/iommu_groups/1/devices
> > /sys/kernel/iommu_groups/1/devices/2007.etr
> > /sys/kernel/iommu_groups/1/type
> > /sys/kernel/iommu_groups/1/reserved_regions
> > /sys/kernel/iommu_groups/0
> > /sys/kernel/iommu_groups/0/devices
> > /sys/kernel/iommu_groups/0/devices/7ffb.ohci
> > /sys/kernel/iommu_groups/0/devices/7ffc.ehci
> > /sys/kernel/iommu_groups/0/type
> > /sys/kernel/iommu_groups/0/reserved_regions
> > 
> > So the 'iommu_groups' is not created for pci-e devices, right?
> 
> Yes that's correct.

The missing 'iommu_group' for Juno's PCIe devices is caused by the DT: the
dts file [1] sets 'status = "disabled"' for the 'smmu_pcie' node, so the
SMMU device is never enabled for the PCIe devices.

@Mark Rutland, could you have a chance to confirm whether this should be
fixed in juno-base.dtsi or in juno-r1.dts/juno-r2.dts?
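
For reference, the DT side of such a fix would presumably just flip the
node's status (illustrative fragment, assuming an 'smmu_pcie' label is
available for the node in juno-base.dtsi):

    &smmu_pcie {
            status = "okay";
    };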

> > Will debug into dt binding and related code and keep posted at here.
>
> OK

So now I have made some progress: the networking card is passed through to
the guest OS, though the card now reports errors. Below are the detailed
steps and info:

- Bind devices in the same IOMMU group to vfio driver:

  echo :03:00.0 > /sys/bus/pci/devices/\:03\:00.0/driver/unbind
  echo 1095 3132 > /sys/bus/pci/drivers/vfio-pci/new_id

  echo :08:00.0 > /sys/bus/pci/devices/\:08\:00.0/driver/unbind
  echo 11ab 4380 > /sys/bus/pci/drivers/vfio-pci/new_id

- Enable 'allow_unsafe_interrupts=1' for module vfio_iommu_type1;
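
  One way to do this (assuming vfio_iommu_type1 is built as a module;
  the parameter is also writable at runtime through sysfs):

  modprobe vfio_iommu_type1 allow_unsafe_interrupts=1
  # or, if the module is already loaded:
  echo Y > /sys/module/vfio_iommu_type1/parameters/allow_unsafe_interrupts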

- Use qemu to launch guest OS:

  qemu-system-aarch64 \
-cpu host -M virt,accel=kvm -m 4096 -nographic \
-kernel /root/virt/Image -append root=/dev/vda2 \
-net none -device vfio-pci,host=08:00.0 \
-drive if=virtio,file=/root/virt/qemu/debian.img \
-append 'loglevel=8 root=/dev/vda2 rw console=ttyAMA0 earlyprintk ip=dhcp'

- Host log:

[  188.329861] vfio-pci :08:00.0: enabling device ( -> 0003)

- Below is the guest log; it shows the driver has been registered, but it
  reports a PCI hardware error and a timeout for the interrupt.

  So is this caused by very 'slow' interrupt forwarding?  The Juno board
  uses GICv2 (I think it has the GICv2m extension).

[...]

[1.024483] sky2 :00:01.0 eth0: enabling interface
[1.026822] sky2 :00:01.0: error interrupt status=0x8000
[1.029155] sky2 :00:01.0: PCI hardware error (0x1010)
[4.000699] sky2 :00:01.0 eth0: Link is up at 1000 Mbps, full duplex, flow control both
[4.026116] Sending DHCP requests .
[4.026201] sky2 :00:01.0: error interrupt status=0x8000
[4.030043] sky2 :00:01.0: PCI hardware error (0x1010)
[6.546111] ..
[   14.118106] [ cut here ]
[   14.120672] NETDEV WATCHDOG: eth0 (sky2): transmit queue 0 timed out
[   14.123555] WARNING: CPU: 0 PID: 0 at net/sched/sch_generic.c:461 dev_watchdog+0x2b4/0x2c0
[   14.127082] Modules linked in:
[   14.128631] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.0.0-rc8-00061-ga98f9a047756-dirty #
[   14.132800] Hardware name: linux,dummy-virt (DT)
[   14.135082] pstate: 6005 (nZCv daif -PAN -UAO)
[   14.137459] pc : dev_watchdog+0x2b4/0x2c0
[   14.139457] lr : dev_watchdog+0x2b4/0x2c0
[   14.141351] sp : 10003d70
[   14.142924] x29: 10003d70 x28: 112f60c0
[   14.145433] x27: 0140 x26: 8000fa6eb3b8
[   14.147936] x25:  x24: 8000fa7a7c80
[   14.150428] x23: 8000fa6eb39c x22: 8000fa6eafb8
[   14.152934] x21: 8000fa6eb000 x20: 112f7000
[   14.155437] x19:  x18: 
[   14.157929] x17:  x16: 
[   14.160432] x15: 112fd6c8 x14: 90003a97
[   14.162927] x13: 10003aa5 x12: 11315878
[   14.165428] x11: 11315000 x10: 05f5e0ff
[   14.167935] x9 : ffd0 x8 : 64656d6974203020
[   14.170430] x7 : 6575657571207469 x6 : 00e3
[   14.172935] x5 :  x4 : 
[   14.175443] x3 :  x2 : 113158a8
[   14.177938] x1 : f2db9128b1f08600 x0 : 
[   14.180443] Call trace:
[   14.181625]  dev_watchdog+0x2b4/0x2c0
[   14.183377]  call_timer_fn+0x20/0x78
[   14.185078]  expire_timers+0xa4/0xb0
[   14.186777]  run_timer_softirq+0xa0/0x190
[   14.188687]  __do_softirq+0x108/0x234
[ 

Re: [PATCH v11 6/8] arm64: KVM: Enable VHE support for :G/:H perf event modifiers

2019-03-11 Thread Andrew Murray
On Mon, Mar 11, 2019 at 09:39:19AM +, Julien Thierry wrote:
> Hi Andrew,
> 
> On 08/03/2019 12:07, Andrew Murray wrote:
> > With VHE different exception levels are used between the host (EL2) and
> > guest (EL1) with a shared exception level for userspace (EL0). We can take
> > advantage of this and use the PMU's exception level filtering to avoid
> > enabling/disabling counters in the world-switch code. Instead we just
> > modify the counter type to include or exclude EL0 at vcpu_{load,put} time.
> > 
> > We also ensure that trapped PMU system register writes do not re-enable
> > EL0 when reconfiguring the backing perf events.
> > 
> > This approach completely avoids blackout windows seen with !VHE.
> > 
> > Suggested-by: Christoffer Dall 
> > Signed-off-by: Andrew Murray 
> > ---
> >  arch/arm/include/asm/kvm_host.h   |  3 ++
> >  arch/arm64/include/asm/kvm_host.h |  5 +-
> >  arch/arm64/kernel/perf_event.c|  6 ++-
> >  arch/arm64/kvm/pmu.c  | 87 ++-
> >  arch/arm64/kvm/sys_regs.c |  3 ++
> >  virt/kvm/arm/arm.c|  2 +
> >  6 files changed, 102 insertions(+), 4 deletions(-)
> > 
> > diff --git a/arch/arm/include/asm/kvm_host.h 
> > b/arch/arm/include/asm/kvm_host.h
> > index a358cb15bb0d..3ce429954306 100644
> > --- a/arch/arm/include/asm/kvm_host.h
> > +++ b/arch/arm/include/asm/kvm_host.h
> > @@ -326,6 +326,9 @@ static inline void kvm_arch_vcpu_load_fp(struct 
> > kvm_vcpu *vcpu) {}
> >  static inline void kvm_arch_vcpu_ctxsync_fp(struct kvm_vcpu *vcpu) {}
> >  static inline void kvm_arch_vcpu_put_fp(struct kvm_vcpu *vcpu) {}
> >  
> > +static inline void kvm_vcpu_pmu_restore_guest(struct kvm_vcpu *vcpu) {}
> > +static inline void kvm_vcpu_pmu_restore_host(struct kvm_vcpu *vcpu) {}
> > +
> >  static inline void kvm_arm_vhe_guest_enter(void) {}
> >  static inline void kvm_arm_vhe_guest_exit(void) {}
> >  
> > diff --git a/arch/arm64/include/asm/kvm_host.h 
> > b/arch/arm64/include/asm/kvm_host.h
> > index 7ca4e094626d..d631528898b5 100644
> > --- a/arch/arm64/include/asm/kvm_host.h
> > +++ b/arch/arm64/include/asm/kvm_host.h
> > @@ -487,7 +487,7 @@ void kvm_arch_vcpu_put_fp(struct kvm_vcpu *vcpu);
> >  
> >  static inline bool kvm_pmu_counter_defered(struct perf_event_attr *attr)
> >  {
> > -   return attr->exclude_host;
> > +   return (!has_vhe() && attr->exclude_host);
> >  }
> >  
> >  #ifdef CONFIG_KVM /* Avoid conflicts with core headers if CONFIG_KVM=n */
> > @@ -501,6 +501,9 @@ void kvm_clr_pmu_events(u32 clr);
> >  
> >  void __pmu_switch_to_host(struct kvm_cpu_context *host_ctxt);
> >  bool __pmu_switch_to_guest(struct kvm_cpu_context *host_ctxt);
> > +
> > +void kvm_vcpu_pmu_restore_guest(struct kvm_vcpu *vcpu);
> > +void kvm_vcpu_pmu_restore_host(struct kvm_vcpu *vcpu);
> >  #else
> >  static inline void kvm_set_pmu_events(u32 set, struct perf_event_attr 
> > *attr) {}
> >  static inline void kvm_clr_pmu_events(u32 clr) {}
> > diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
> > index 64f02a9fd7cd..a121a82fc54c 100644
> > --- a/arch/arm64/kernel/perf_event.c
> > +++ b/arch/arm64/kernel/perf_event.c
> > @@ -847,8 +847,12 @@ static int armv8pmu_set_event_filter(struct 
> > hw_perf_event *event,
> >  * with other architectures (x86 and Power).
> >  */
> > if (is_kernel_in_hyp_mode()) {
> > -   if (!attr->exclude_kernel)
> > +   if (!attr->exclude_kernel && !attr->exclude_host)
> > config_base |= ARMV8_PMU_INCLUDE_EL2;
> > +   if (attr->exclude_guest)
> > +   config_base |= ARMV8_PMU_EXCLUDE_EL1;
> > +   if (attr->exclude_host)
> > +   config_base |= ARMV8_PMU_EXCLUDE_EL0;
> > } else {
> > if (!attr->exclude_hv && !attr->exclude_host)
> > config_base |= ARMV8_PMU_INCLUDE_EL2;
> > diff --git a/arch/arm64/kvm/pmu.c b/arch/arm64/kvm/pmu.c
> > index a1cee7919953..a0830c70ece5 100644
> > --- a/arch/arm64/kvm/pmu.c
> > +++ b/arch/arm64/kvm/pmu.c
> > @@ -12,11 +12,19 @@
> >  DECLARE_PER_CPU(kvm_host_data_t, kvm_host_data);
> >  
> >  /*
> > - * Given the exclude_{host,guest} attributes, determine if we are going
> > - * to need to switch counters at guest entry/exit.
> > + * Given the perf event attributes and system type, determine
> > + * if we are going to need to switch counters at guest entry/exit.
> >   */
> >  static bool kvm_pmu_switch_needed(struct perf_event_attr *attr)
> >  {
> > +   /**
> > +* With VHE the guest kernel runs at EL1 and the host at EL2,
> > +* where user (EL0) is excluded then we have no reason to switch
> > +* counters.
> > +*/
> > +   if (has_vhe() && attr->exclude_user)
> > +   return false;
> > +
> > /* Only switch if attributes are different */
> > return (attr->exclude_host ^ attr->exclude_guest);
> >  }
> > @@ -87,3 +95,78 @@ void __hyp_text __pmu_switch_to_host(struct 
> > kvm_cpu_context *host_ctxt)
> [...]

Re: [PATCH v11 3/8] arm64: KVM: add accessors to track guest/host only counters

2019-03-11 Thread Andrew Murray
On Mon, Mar 11, 2019 at 11:00:04AM +, Suzuki K Poulose wrote:
> Hi Andrew,
> 
> 
> On 08/03/2019 12:07, Andrew Murray wrote:
> > In order to efficiently switch events_{guest,host} perf counters at
> > guest entry/exit we add bitfields to kvm_cpu_context for guest and host
> > events as well as accessors for updating them.
> > 
> > A function is also provided which allows the PMU driver to determine
> > if a counter should start counting when it is enabled. With exclude_host,
> > events on !VHE we may only start counting when entering the guest.
> 
> Some minor comments below.
> 
> > 
> > Signed-off-by: Andrew Murray 
> > ---
> >   arch/arm64/include/asm/kvm_host.h | 17 +++
> >   arch/arm64/kvm/Makefile   |  2 +-
> >   arch/arm64/kvm/pmu.c  | 49 +++
> >   3 files changed, 67 insertions(+), 1 deletion(-)
> >   create mode 100644 arch/arm64/kvm/pmu.c
> > 
> > diff --git a/arch/arm64/include/asm/kvm_host.h 
> > b/arch/arm64/include/asm/kvm_host.h
> > index 1d36619d6650..4b7219128f2d 100644
> > --- a/arch/arm64/include/asm/kvm_host.h
> > +++ b/arch/arm64/include/asm/kvm_host.h
> > @@ -207,8 +207,14 @@ struct kvm_cpu_context {
> > struct kvm_vcpu *__hyp_running_vcpu;
> >   };
> > +struct kvm_pmu_events {
> > +   u32 events_host;
> > +   u32 events_guest;
> > +};
> > +
> >   struct kvm_host_data {
> > struct kvm_cpu_context host_ctxt;
> > +   struct kvm_pmu_events pmu_events;
> >   };
> >   typedef struct kvm_host_data kvm_host_data_t;
> > @@ -479,11 +485,22 @@ void kvm_arch_vcpu_load_fp(struct kvm_vcpu *vcpu);
> >   void kvm_arch_vcpu_ctxsync_fp(struct kvm_vcpu *vcpu);
> >   void kvm_arch_vcpu_put_fp(struct kvm_vcpu *vcpu);
> > +static inline bool kvm_pmu_counter_defered(struct perf_event_attr *attr)
> 
> nit: s/defered/deferred/

Thanks.

> 
> > +{
> > +   return attr->exclude_host;
> > +}
> > +
> 
> Does it need a definition for !CONFIG_KVM case, to make sure that
> the events are always enabled for that case ? i.e, does this introduce
> a change in behavior for !CONFIG_KVM case ?

I think this hunk is correct. It makes sense not to count with exclude_host
regardless of whether there are guests or not. (By the way, v10 has this test;
we've just moved it to this file.)

Later in this series we update the hunk to condition it on !has_vhe(); this
is still OK - it means that for VHE we always enable the counter (despite
exclude_host), but that's fine because on VHE with exclude_host we exclude EL2
and EL0 (armv8pmu_set_event_filter). This does mean that we're enabling a
counter that doesn't count anything, but then from the user's perspective it
is a bit pointless to use exclude_host on a system without KVM or guests.
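
To illustrate with the perf tool (a sketch; the :G modifier sets
exclude_host, so on a VHE host with no guest running the event is filtered
to exclude both EL2 and EL0 and counts essentially nothing):

  perf stat -e instructions:G -a sleep 1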

> 
> >   #ifdef CONFIG_KVM /* Avoid conflicts with core headers if CONFIG_KVM=n */
> >   static inline int kvm_arch_vcpu_run_pid_change(struct kvm_vcpu *vcpu)
> >   {
> > return kvm_arch_vcpu_run_map_fp(vcpu);
> >   }
> > +
> > +void kvm_set_pmu_events(u32 set, struct perf_event_attr *attr);
> > +void kvm_clr_pmu_events(u32 clr);
> > +#else
> > +static inline void kvm_set_pmu_events(u32 set, struct perf_event_attr 
> > *attr) {}
> > +static inline void kvm_clr_pmu_events(u32 clr) {}
> >   #endif
> >   static inline void kvm_arm_vhe_guest_enter(void)
> > diff --git a/arch/arm64/kvm/Makefile b/arch/arm64/kvm/Makefile
> > index 0f2a135ba15b..f34cb49b66ae 100644
> > --- a/arch/arm64/kvm/Makefile
> > +++ b/arch/arm64/kvm/Makefile
> > @@ -19,7 +19,7 @@ kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/psci.o 
> > $(KVM)/arm/perf.o
> >   kvm-$(CONFIG_KVM_ARM_HOST) += inject_fault.o regmap.o va_layout.o
> >   kvm-$(CONFIG_KVM_ARM_HOST) += hyp.o hyp-init.o handle_exit.o
> >   kvm-$(CONFIG_KVM_ARM_HOST) += guest.o debug.o reset.o sys_regs.o 
> > sys_regs_generic_v8.o
> > -kvm-$(CONFIG_KVM_ARM_HOST) += vgic-sys-reg-v3.o fpsimd.o
> > +kvm-$(CONFIG_KVM_ARM_HOST) += vgic-sys-reg-v3.o fpsimd.o pmu.o
> >   kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/aarch32.o
> >   kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/vgic/vgic.o
> > diff --git a/arch/arm64/kvm/pmu.c b/arch/arm64/kvm/pmu.c
> > new file mode 100644
> > index ..43965a3cc0f4
> > --- /dev/null
> > +++ b/arch/arm64/kvm/pmu.c
> > @@ -0,0 +1,49 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * arch/arm64/kvm/pmu.c: Switching between guest and host counters
> 
> minor nit: You don't need the file name, it is superfluous.
> 
> > + *
> > + * Copyright 2019 Arm Limited
> > + * Author: Andrew Murray 
> > + */
> > +#include 
> > +#include 
> > +
> 
> > +DECLARE_PER_CPU(kvm_host_data_t, kvm_host_data);
> 
> nit: Do we really need this ? This is already in asm/kvm_host.h.

No we don't - I'll remove it.

> 
> > +
> > +/*
> > + * Given the exclude_{host,guest} attributes, determine if we are going
> > + * to need to switch counters at guest entry/exit.
> > + */
> > +static bool kvm_pmu_switch_needed(struct perf_event_attr *attr)
> > +{
> > +   /* Only switch if attributes are different */
> > +   return (attr->exclude_host ^ attr->exclude_guest);

Re: [PATCH v11 3/8] arm64: KVM: add accessors to track guest/host only counters

2019-03-11 Thread Andrew Murray
On Fri, Mar 08, 2019 at 04:40:51PM +, Julien Thierry wrote:
> Hi Andrew,
> 
> On 08/03/2019 12:07, Andrew Murray wrote:
> > In order to efficiently switch events_{guest,host} perf counters at
> > guest entry/exit we add bitfields to kvm_cpu_context for guest and host
> > events as well as accessors for updating them.
> > 
> > A function is also provided which allows the PMU driver to determine
> > if a counter should start counting when it is enabled. With exclude_host,
> > events on !VHE we may only start counting when entering the guest.
> > 
> 
> I might have missed something here. Why is that true only for !VHE? Is
> it because with VHE we can just exclude EL1?

That's correct. For VHE we never defer counting and can rely on the PMU
filtering capabilities (even though for EL0 we have to reconfigure the
filtering upon entering/exiting the guest).
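
For example, with the perf tool (a sketch; <vmm-pid> is a placeholder for
the process of a running guest's VMM):

  # count guest-only vs host-only cycles for a running KVM VM
  perf stat -e cycles:G -e cycles:H -p <vmm-pid> sleep 10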

> (It's also a bit confusing since the patch does not seem to contain any
> VHE/nVHE distinction)

This is updated in the later patches of this series. I felt the series
would be easier to understand if I add the special VHE case last.

I will update the commit message such that it reads "With exclude_host, we
may only start counting when entering the guest.". I.e. the changes here
are valid for both VHE and !VHE until the later patches change it for VHE.

> 
> > Signed-off-by: Andrew Murray 
> > ---
> >  arch/arm64/include/asm/kvm_host.h | 17 +++
> >  arch/arm64/kvm/Makefile   |  2 +-
> >  arch/arm64/kvm/pmu.c  | 49 +++
> >  3 files changed, 67 insertions(+), 1 deletion(-)
> >  create mode 100644 arch/arm64/kvm/pmu.c
> > 
> > diff --git a/arch/arm64/include/asm/kvm_host.h 
> > b/arch/arm64/include/asm/kvm_host.h
> > index 1d36619d6650..4b7219128f2d 100644
> > --- a/arch/arm64/include/asm/kvm_host.h
> > +++ b/arch/arm64/include/asm/kvm_host.h
> > @@ -207,8 +207,14 @@ struct kvm_cpu_context {
> > struct kvm_vcpu *__hyp_running_vcpu;
> >  };
> >  
> > +struct kvm_pmu_events {
> > +   u32 events_host;
> > +   u32 events_guest;
> > +};
> > +
> >  struct kvm_host_data {
> > struct kvm_cpu_context host_ctxt;
> > +   struct kvm_pmu_events pmu_events;
> >  };
> >  
> >  typedef struct kvm_host_data kvm_host_data_t;
> > @@ -479,11 +485,22 @@ void kvm_arch_vcpu_load_fp(struct kvm_vcpu *vcpu);
> >  void kvm_arch_vcpu_ctxsync_fp(struct kvm_vcpu *vcpu);
> >  void kvm_arch_vcpu_put_fp(struct kvm_vcpu *vcpu);
> >  
> > +static inline bool kvm_pmu_counter_defered(struct perf_event_attr *attr)
> > +{
> > +   return attr->exclude_host;
> > +}
> > +
> >  #ifdef CONFIG_KVM /* Avoid conflicts with core headers if CONFIG_KVM=n */
> >  static inline int kvm_arch_vcpu_run_pid_change(struct kvm_vcpu *vcpu)
> >  {
> > return kvm_arch_vcpu_run_map_fp(vcpu);
> >  }
> > +
> > +void kvm_set_pmu_events(u32 set, struct perf_event_attr *attr);
> > +void kvm_clr_pmu_events(u32 clr);
> > +#else
> > +static inline void kvm_set_pmu_events(u32 set, struct perf_event_attr 
> > *attr) {}
> > +static inline void kvm_clr_pmu_events(u32 clr) {}
> >  #endif
> >  
> >  static inline void kvm_arm_vhe_guest_enter(void)
> > diff --git a/arch/arm64/kvm/Makefile b/arch/arm64/kvm/Makefile
> > index 0f2a135ba15b..f34cb49b66ae 100644
> > --- a/arch/arm64/kvm/Makefile
> > +++ b/arch/arm64/kvm/Makefile
> > @@ -19,7 +19,7 @@ kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/psci.o 
> > $(KVM)/arm/perf.o
> >  kvm-$(CONFIG_KVM_ARM_HOST) += inject_fault.o regmap.o va_layout.o
> >  kvm-$(CONFIG_KVM_ARM_HOST) += hyp.o hyp-init.o handle_exit.o
> >  kvm-$(CONFIG_KVM_ARM_HOST) += guest.o debug.o reset.o sys_regs.o 
> > sys_regs_generic_v8.o
> > -kvm-$(CONFIG_KVM_ARM_HOST) += vgic-sys-reg-v3.o fpsimd.o
> > +kvm-$(CONFIG_KVM_ARM_HOST) += vgic-sys-reg-v3.o fpsimd.o pmu.o
> >  kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/aarch32.o
> >  
> >  kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/vgic/vgic.o
> > diff --git a/arch/arm64/kvm/pmu.c b/arch/arm64/kvm/pmu.c
> > new file mode 100644
> > index ..43965a3cc0f4
> > --- /dev/null
> > +++ b/arch/arm64/kvm/pmu.c
> > @@ -0,0 +1,49 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * arch/arm64/kvm/pmu.c: Switching between guest and host counters
> > + *
> > + * Copyright 2019 Arm Limited
> > + * Author: Andrew Murray 
> > + */
> > +#include 
> > +#include 
> > +
> > +DECLARE_PER_CPU(kvm_host_data_t, kvm_host_data);
> > +
> > +/*
> > + * Given the exclude_{host,guest} attributes, determine if we are going
> > + * to need to switch counters at guest entry/exit.
> > + */
> > +static bool kvm_pmu_switch_needed(struct perf_event_attr *attr)
> > +{
> > +   /* Only switch if attributes are different */
> > +   return (attr->exclude_host ^ attr->exclude_guest);
> 
> Nit:
> 
> Is there any benefit to this rather than doing "attr->exclude_host !=
> attr->exclude_guest" ? The code generated is most likely the same, I
> just find the latter slightly more straightforward.

Nope, I'll change it. Not sure why I ended up with that.

Re: [PATCH v11 3/8] arm64: KVM: add accessors to track guest/host only counters

2019-03-11 Thread Suzuki K Poulose

Hi Andrew,


On 08/03/2019 12:07, Andrew Murray wrote:

In order to efficiently switch events_{guest,host} perf counters at
guest entry/exit we add bitfields to kvm_cpu_context for guest and host
events as well as accessors for updating them.

A function is also provided which allows the PMU driver to determine
if a counter should start counting when it is enabled. With exclude_host,
events on !VHE we may only start counting when entering the guest.


Some minor comments below.



Signed-off-by: Andrew Murray 
---
  arch/arm64/include/asm/kvm_host.h | 17 +++
  arch/arm64/kvm/Makefile   |  2 +-
  arch/arm64/kvm/pmu.c  | 49 +++
  3 files changed, 67 insertions(+), 1 deletion(-)
  create mode 100644 arch/arm64/kvm/pmu.c

diff --git a/arch/arm64/include/asm/kvm_host.h 
b/arch/arm64/include/asm/kvm_host.h
index 1d36619d6650..4b7219128f2d 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -207,8 +207,14 @@ struct kvm_cpu_context {
struct kvm_vcpu *__hyp_running_vcpu;
  };
  
+struct kvm_pmu_events {

+   u32 events_host;
+   u32 events_guest;
+};
+
  struct kvm_host_data {
struct kvm_cpu_context host_ctxt;
+   struct kvm_pmu_events pmu_events;
  };
  
  typedef struct kvm_host_data kvm_host_data_t;

@@ -479,11 +485,22 @@ void kvm_arch_vcpu_load_fp(struct kvm_vcpu *vcpu);
  void kvm_arch_vcpu_ctxsync_fp(struct kvm_vcpu *vcpu);
  void kvm_arch_vcpu_put_fp(struct kvm_vcpu *vcpu);
  
+static inline bool kvm_pmu_counter_defered(struct perf_event_attr *attr)


nit: s/defered/deferred/


+{
+   return attr->exclude_host;
+}
+


Does it need a definition for !CONFIG_KVM case, to make sure that
the events are always enabled for that case ? i.e, does this introduce
a change in behavior for !CONFIG_KVM case ?


  #ifdef CONFIG_KVM /* Avoid conflicts with core headers if CONFIG_KVM=n */
  static inline int kvm_arch_vcpu_run_pid_change(struct kvm_vcpu *vcpu)
  {
return kvm_arch_vcpu_run_map_fp(vcpu);
  }
+
+void kvm_set_pmu_events(u32 set, struct perf_event_attr *attr);
+void kvm_clr_pmu_events(u32 clr);
+#else
+static inline void kvm_set_pmu_events(u32 set, struct perf_event_attr *attr) {}
+static inline void kvm_clr_pmu_events(u32 clr) {}
  #endif
  
  static inline void kvm_arm_vhe_guest_enter(void)

diff --git a/arch/arm64/kvm/Makefile b/arch/arm64/kvm/Makefile
index 0f2a135ba15b..f34cb49b66ae 100644
--- a/arch/arm64/kvm/Makefile
+++ b/arch/arm64/kvm/Makefile
@@ -19,7 +19,7 @@ kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/psci.o 
$(KVM)/arm/perf.o
  kvm-$(CONFIG_KVM_ARM_HOST) += inject_fault.o regmap.o va_layout.o
  kvm-$(CONFIG_KVM_ARM_HOST) += hyp.o hyp-init.o handle_exit.o
  kvm-$(CONFIG_KVM_ARM_HOST) += guest.o debug.o reset.o sys_regs.o 
sys_regs_generic_v8.o
-kvm-$(CONFIG_KVM_ARM_HOST) += vgic-sys-reg-v3.o fpsimd.o
+kvm-$(CONFIG_KVM_ARM_HOST) += vgic-sys-reg-v3.o fpsimd.o pmu.o
  kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/aarch32.o
  
  kvm-$(CONFIG_KVM_ARM_HOST) += $(KVM)/arm/vgic/vgic.o

diff --git a/arch/arm64/kvm/pmu.c b/arch/arm64/kvm/pmu.c
new file mode 100644
index ..43965a3cc0f4
--- /dev/null
+++ b/arch/arm64/kvm/pmu.c
@@ -0,0 +1,49 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * arch/arm64/kvm/pmu.c: Switching between guest and host counters


minor nit: You don't need the file name, it is superfluous.


+ *
+ * Copyright 2019 Arm Limited
+ * Author: Andrew Murray 
+ */
+#include 
+#include 
+



+DECLARE_PER_CPU(kvm_host_data_t, kvm_host_data);


nit: Do we really need this ? This is already in asm/kvm_host.h.


+
+/*
+ * Given the exclude_{host,guest} attributes, determine if we are going
+ * to need to switch counters at guest entry/exit.
+ */
+static bool kvm_pmu_switch_needed(struct perf_event_attr *attr)
+{
+   /* Only switch if attributes are different */
+   return (attr->exclude_host ^ attr->exclude_guest);


I back Julien's suggestion to keep this simple.


+}
+
+/*
+ * Add events to track that we may want to switch at guest entry/exit
+ * time.
+ */
+void kvm_set_pmu_events(u32 set, struct perf_event_attr *attr)
+{
+   struct kvm_host_data *ctx = this_cpu_ptr(&kvm_host_data);
+
+   if (!kvm_pmu_switch_needed(attr))
+   return;
+
+   if (!attr->exclude_host)
+   ctx->pmu_events.events_host |= set;
+   if (!attr->exclude_guest)
+   ctx->pmu_events.events_guest |= set;
+}
+
+/*
+ * Stop tracking events
+ */
+void kvm_clr_pmu_events(u32 clr)
+{
+   struct kvm_host_data *ctx = this_cpu_ptr(&kvm_host_data);
+
+   ctx->pmu_events.events_host &= ~clr;
+   ctx->pmu_events.events_guest &= ~clr;
+}



Rest looks fine.

Suzuki
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: Question: KVM: Failed to bind vfio with PCI-e / SMMU on Juno-r2

2019-03-11 Thread Auger Eric
Hi Leo,

On 3/11/19 10:39 AM, Leo Yan wrote:
> Hi Auger,
> 
> On Mon, Mar 11, 2019 at 09:23:20AM +0100, Auger Eric wrote:
> 
> [...]
> 
>>> P.s. I also checked the sysfs node and found it doesn't contain node
>>> 'iommu_group':
>>>
>>> # ls /sys/bus/pci/devices/0000\:08\:00.0/iommu_group
>>> ls: cannot access '/sys/bus/pci/devices/0000:08:00.0/iommu_group': No
>>> such file or directory
>>
>> please can you give the output of the following command:
>> find /sys/kernel/iommu_groups/
> 
> I get below result on Juno board:
> 
> root@debian:~# find /sys/kernel/iommu_groups/
> /sys/kernel/iommu_groups/
> /sys/kernel/iommu_groups/1
> /sys/kernel/iommu_groups/1/devices
> /sys/kernel/iommu_groups/1/devices/20070000.etr
> /sys/kernel/iommu_groups/1/type
> /sys/kernel/iommu_groups/1/reserved_regions
> /sys/kernel/iommu_groups/0
> /sys/kernel/iommu_groups/0/devices
> /sys/kernel/iommu_groups/0/devices/7ffb0000.ohci
> /sys/kernel/iommu_groups/0/devices/7ffc0000.ehci
> /sys/kernel/iommu_groups/0/type
> /sys/kernel/iommu_groups/0/reserved_regions
> 
> So no 'iommu_group' is created for the PCI-e devices, right?

Yes that's correct.
> 
> Will debug into the dt binding and related code and keep posted here.

OK

Thanks

Eric
> 
>> when booting your host without noiommu=true
>>
>> At first sight I would say you have trouble with your iommu groups.
> 
> Thanks a lot for guidance.
> Leo Yan
> 
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: Question: KVM: Failed to bind vfio with PCI-e / SMMU on Juno-r2

2019-03-11 Thread Leo Yan
Hi Auger,

On Mon, Mar 11, 2019 at 09:23:20AM +0100, Auger Eric wrote:

[...]

> > P.s. I also checked the sysfs node and found it doesn't contain node
> > 'iommu_group':
> > 
> > # ls /sys/bus/pci/devices/0000\:08\:00.0/iommu_group
> > ls: cannot access '/sys/bus/pci/devices/0000:08:00.0/iommu_group': No
> > such file or directory
> 
> please can you give the output of the following command:
> find /sys/kernel/iommu_groups/

I get below result on Juno board:

root@debian:~# find /sys/kernel/iommu_groups/
/sys/kernel/iommu_groups/
/sys/kernel/iommu_groups/1
/sys/kernel/iommu_groups/1/devices
/sys/kernel/iommu_groups/1/devices/20070000.etr
/sys/kernel/iommu_groups/1/type
/sys/kernel/iommu_groups/1/reserved_regions
/sys/kernel/iommu_groups/0
/sys/kernel/iommu_groups/0/devices
/sys/kernel/iommu_groups/0/devices/7ffb0000.ohci
/sys/kernel/iommu_groups/0/devices/7ffc0000.ehci
/sys/kernel/iommu_groups/0/type
/sys/kernel/iommu_groups/0/reserved_regions

So no 'iommu_group' is created for the PCI-e devices, right?

Will debug into the dt binding and related code and keep posted here.
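
As a first step I will probably check whether the SMMU probes at all and
whether any node in the running DT references it (a rough sketch):

  dmesg | grep -i smmu
  find /proc/device-tree -name 'iommus*' -o -name 'iommu-map*'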

> when booting your host without noiommu=true
> 
> At first sight I would say you have trouble with your iommu groups.

Thanks a lot for guidance.
Leo Yan
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [PATCH v11 6/8] arm64: KVM: Enable VHE support for :G/:H perf event modifiers

2019-03-11 Thread Julien Thierry
Hi Andrew,

On 08/03/2019 12:07, Andrew Murray wrote:
> With VHE different exception levels are used between the host (EL2) and
> guest (EL1) with a shared exception level for userspace (EL0). We can take
> advantage of this and use the PMU's exception level filtering to avoid
> enabling/disabling counters in the world-switch code. Instead we just
> modify the counter type to include or exclude EL0 at vcpu_{load,put} time.
> 
> We also ensure that trapped PMU system register writes do not re-enable
> EL0 when reconfiguring the backing perf events.
> 
> This approach completely avoids blackout windows seen with !VHE.
> 
> Suggested-by: Christoffer Dall 
> Signed-off-by: Andrew Murray 
> ---
>  arch/arm/include/asm/kvm_host.h   |  3 ++
>  arch/arm64/include/asm/kvm_host.h |  5 +-
>  arch/arm64/kernel/perf_event.c|  6 ++-
>  arch/arm64/kvm/pmu.c  | 87 ++-
>  arch/arm64/kvm/sys_regs.c |  3 ++
>  virt/kvm/arm/arm.c|  2 +
>  6 files changed, 102 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
> index a358cb15bb0d..3ce429954306 100644
> --- a/arch/arm/include/asm/kvm_host.h
> +++ b/arch/arm/include/asm/kvm_host.h
> @@ -326,6 +326,9 @@ static inline void kvm_arch_vcpu_load_fp(struct kvm_vcpu 
> *vcpu) {}
>  static inline void kvm_arch_vcpu_ctxsync_fp(struct kvm_vcpu *vcpu) {}
>  static inline void kvm_arch_vcpu_put_fp(struct kvm_vcpu *vcpu) {}
>  
> +static inline void kvm_vcpu_pmu_restore_guest(struct kvm_vcpu *vcpu) {}
> +static inline void kvm_vcpu_pmu_restore_host(struct kvm_vcpu *vcpu) {}
> +
>  static inline void kvm_arm_vhe_guest_enter(void) {}
>  static inline void kvm_arm_vhe_guest_exit(void) {}
>  
> diff --git a/arch/arm64/include/asm/kvm_host.h 
> b/arch/arm64/include/asm/kvm_host.h
> index 7ca4e094626d..d631528898b5 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -487,7 +487,7 @@ void kvm_arch_vcpu_put_fp(struct kvm_vcpu *vcpu);
>  
>  static inline bool kvm_pmu_counter_defered(struct perf_event_attr *attr)
>  {
> - return attr->exclude_host;
> + return (!has_vhe() && attr->exclude_host);
>  }
>  
>  #ifdef CONFIG_KVM /* Avoid conflicts with core headers if CONFIG_KVM=n */
> @@ -501,6 +501,9 @@ void kvm_clr_pmu_events(u32 clr);
>  
>  void __pmu_switch_to_host(struct kvm_cpu_context *host_ctxt);
>  bool __pmu_switch_to_guest(struct kvm_cpu_context *host_ctxt);
> +
> +void kvm_vcpu_pmu_restore_guest(struct kvm_vcpu *vcpu);
> +void kvm_vcpu_pmu_restore_host(struct kvm_vcpu *vcpu);
>  #else
>  static inline void kvm_set_pmu_events(u32 set, struct perf_event_attr *attr) 
> {}
>  static inline void kvm_clr_pmu_events(u32 clr) {}
> diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
> index 64f02a9fd7cd..a121a82fc54c 100644
> --- a/arch/arm64/kernel/perf_event.c
> +++ b/arch/arm64/kernel/perf_event.c
> @@ -847,8 +847,12 @@ static int armv8pmu_set_event_filter(struct 
> hw_perf_event *event,
>* with other architectures (x86 and Power).
>*/
>   if (is_kernel_in_hyp_mode()) {
> - if (!attr->exclude_kernel)
> + if (!attr->exclude_kernel && !attr->exclude_host)
>   config_base |= ARMV8_PMU_INCLUDE_EL2;
> + if (attr->exclude_guest)
> + config_base |= ARMV8_PMU_EXCLUDE_EL1;
> + if (attr->exclude_host)
> + config_base |= ARMV8_PMU_EXCLUDE_EL0;
>   } else {
>   if (!attr->exclude_hv && !attr->exclude_host)
>   config_base |= ARMV8_PMU_INCLUDE_EL2;
> diff --git a/arch/arm64/kvm/pmu.c b/arch/arm64/kvm/pmu.c
> index a1cee7919953..a0830c70ece5 100644
> --- a/arch/arm64/kvm/pmu.c
> +++ b/arch/arm64/kvm/pmu.c
> @@ -12,11 +12,19 @@
>  DECLARE_PER_CPU(kvm_host_data_t, kvm_host_data);
>  
>  /*
> - * Given the exclude_{host,guest} attributes, determine if we are going
> - * to need to switch counters at guest entry/exit.
> + * Given the perf event attributes and system type, determine
> + * if we are going to need to switch counters at guest entry/exit.
>   */
>  static bool kvm_pmu_switch_needed(struct perf_event_attr *attr)
>  {
> + /**
> +  * With VHE the guest kernel runs at EL1 and the host at EL2,
> +  * where user (EL0) is excluded then we have no reason to switch
> +  * counters.
> +  */
> + if (has_vhe() && attr->exclude_user)
> + return false;
> +
>   /* Only switch if attributes are different */
>   return (attr->exclude_host ^ attr->exclude_guest);
>  }
> @@ -87,3 +95,78 @@ void __hyp_text __pmu_switch_to_host(struct 
> kvm_cpu_context *host_ctxt)
>   write_sysreg(pmu->events_host, pmcntenset_el0);
>  }
>  
> +/*
> + * Modify ARMv8 PMU events to include EL0 counting
> + */
> +static void kvm_vcpu_pmu_enable_el0(unsigned long events)
> +{
> + u64 typer;
> + u32 counter;
> +
> [...]

Re: Question: KVM: Failed to bind vfio with PCI-e / SMMU on Juno-r2

2019-03-11 Thread Auger Eric
Hi Leo,

On 3/11/19 7:42 AM, Leo Yan wrote:
> Hi all,
> 
> I am trying to enable PCI-e device pass-through mode with KVM. Since the
> Juno-r2 board has a PCI-e bus, I first tried to use vfio to pass through
> the network card on the PCI-e bus.
> 
> According to the Juno-r2 board TRM [1], there is a CoreLink MMU-401 (SMMU)
> between the PCI-e devices and the CCI bus; IIUC, the PCI-e device and the
> SMMU can be used by vfio for address isolation, and from a hardware
> perspective this is sufficient to support pass-through mode.
> 
> I followed Eric's blog [2] for 'VFIO-PCI driver binding', so I
> executed the below commands on the Juno-r2 board:
> 
>   echo vfio-pci > /sys/bus/pci/devices/0000\:08\:00.0/driver_override
>   echo 0000:08:00.0 > /sys/bus/pci/drivers/sky2/unbind
>   echo 0000:08:00.0 > /sys/bus/pci/drivers_probe
> 
> But at the last command, for vfio probing, it reports a failure as below:
> 
> [   21.553889] sky2 0000:08:00.0 enp8s0: disabling interface
> [   21.616720] vfio-pci: probe of 0000:08:00.0 failed with error -22
> 
> I looked into the code: though 'dev->bus->iommu_ops' points to the
> data structure 'arm_smmu_ops', 'dev->iommu_group' is NULL, thus the
> probe function returns failure with the flow below:
> 
>   vfio_pci_probe()
> `-> vfio_iommu_group_get()
>   `-> iommu_group_get()
> `-> return NULL;
> 
> Alternatively, if I enable the kconfig CONFIG_VFIO_NOIOMMU and set the
> global variable 'noiommu' = true, the probe function still returns an
> error, since iommu_present(dev->bus) is true ('arm_smmu_ops' is set for
> the bus) and the code runs into the logic below:
> 
> vfio_iommu_group_get()
> {
>   group = iommu_group_get(dev);
> 
> #ifdef CONFIG_VFIO_NOIOMMU
> 
>   /*
>* With noiommu enabled, an IOMMU group will be created for a device
>* that doesn't already have one and doesn't have an iommu_ops on their
>* bus.  We set iommudata simply to be able to identify these groups
>* as special use and for reclamation later.
>*/
>   if (group || !noiommu || iommu_present(dev->bus))
>   return group;   ==> returns 'group', which here is NULL
> 
>   [...]
> }
> 
> So neither with the SMMU nor with the kernel config CONFIG_VFIO_NOIOMMU
> can I bind the vfio driver for the network card device on the Juno-r2 board.
> 
> P.s. I also checked the sysfs node and found it doesn't contain node
> 'iommu_group':
> 
> # ls /sys/bus/pci/devices/0000\:08\:00.0/iommu_group
> ls: cannot access '/sys/bus/pci/devices/0000:08:00.0/iommu_group': No
> such file or directory

please can you give the output of the following command:
find /sys/kernel/iommu_groups/

when booting your host without noiommu=true

At first sight I would say you have trouble with your iommu groups.

Thanks

Eric
> 
> Could you give some suggestions for this so that I can proceed?  I would
> very much appreciate any comments.
> 
> Thanks,
> Leo Yan
> 
> [1] 
> http://infocenter.arm.com/help/topic/com.arm.doc.ddi0515f/DDI0515F_juno_arm_development_platform_soc_trm.pdf
> [2] https://www.linaro.org/blog/kvm-pciemsi-passthrough-armarm64/
> 
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm