Re: [PATCH 0/6] Introduce ID_PFR2 and other CPU feature changes

2020-04-06 Thread Will Deacon
On Tue, Jan 28, 2020 at 06:09:03PM +0530, Anshuman Khandual wrote:
> This series is primarily motivated by an ad hoc list from Mark Rutland
> during our ID_ISAR6 discussion [1]. Besides that, it also includes a patch
> which replaces various open-coded bit shift encodings in various CPU ID
> registers with macros. This series is based on linux-next 20200124.
> 
> [1] https://patchwork.kernel.org/patch/11287805/
> 
> If there is anything else apart from these changes which can be accommodated
> in this series, please do let me know. Thank you.

The latest Arm ARM also talks about DFR1 and MMFR5. Please can you include
those too? Might also be worth checking to see if anything is missing on
the 64-bit side as well (I didn't look).

Will
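
For context, describing a new 32-bit ID register to the arm64 cpufeature
framework follows a fixed pattern. A minimal, hypothetical sketch for ID_DFR1
is shown below; the SYS_ID_DFR1_EL1 define, the field layout and the
visibility/safe-value choices are illustrative assumptions, not the eventual
patch:

/* Hypothetical sketch for arch/arm64/kernel/cpufeature.c */
static const struct arm64_ftr_bits ftr_id_dfr1[] = {
	/* Assumed: MTPMU in bits [3:0] of ID_DFR1 */
	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, 0, 4, 0),
	ARM64_FTR_END,
};

/* ... plus a matching entry in the arm64_ftr_regs[] table: */
	ARM64_FTR_REG(SYS_ID_DFR1_EL1, ftr_id_dfr1),

ID_MMFR5 would need an equivalent table; such a series typically also touches
the struct cpuinfo_arm64 snapshot and the KVM sysreg emulation of these
registers.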


Re: I{S,C}ACTIVER implementation question

2020-04-06 Thread Marc Zyngier

Hi Julien,

Thanks for the heads up.

On 2020-04-06 14:16, Julien Grall wrote:

Hi,

Xen community is currently reviewing a new implementation for reading
I{S,C}ACTIVER registers (see [1]).

The implementation is based on vgic_mmio_read_active() in KVM, i.e. the
active state of the interrupts is based on the vGIC state stored in
memory.

While reviewing the patch on xen-devel, I noticed a potential deadlock,
at least with the Xen implementation. I know that the Xen vGIC and KVM
vGIC are quite different, so I looked at the KVM implementation to see
how this is dealt with.

With my limited knowledge of KVM, I wasn't able to rule it out. I am
curious to know if I missed anything.

vCPU A may read the active state of an interrupt routed to vCPU B.
When vCPU A is reading the state, it will read the state stored in
memory.

The only way the memory state can get synced with the HW state is when
vCPU B exits guest context.

AFAICT, vCPU B will not exit when deactivating HW mapped interrupts
and virtual edge interrupts. So vCPU B may run for an arbitrarily long
time before exiting and syncing the memory state with the HW state.


So while I agree that this is definitely not ideal, I don't think we end up
with a deadlock (or rather a livelock) either. That's because we are
guaranteed to exit eventually, if only because the kernel's own timer
interrupt (or any other host interrupt routed to the same physical CPU)
will fire and get us out of there. On its own, this is enough to allow the
polling vcpu to make forward progress.

Now, it is obvious that we should improve on the current situation. I just
hacked together a patch that provides the same guarantees as the one we
already have on the write side (kick all vcpus out of the guest, snapshot
the state, kick everyone back in). I boot-tested it, so it is obviously
perfect and won't eat your data at all! ;-)
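
For reference, the write-side guarantee mentioned above comes from the
existing GIC{D,R}_I{S,C}ACTIVER write handlers. A from-memory sketch of that
pattern in virt/kvm/arm/vgic/vgic-mmio.c follows (details not re-checked
here, so treat it as an approximation rather than the exact code):

void vgic_mmio_write_sactive(struct kvm_vcpu *vcpu,
			     gpa_t addr, unsigned int len,
			     unsigned long val)
{
	u32 intid = VGIC_ADDR_TO_INTID(addr, 1);

	/* Kick all vcpus out of the guest (for SPIs and GICv3 PPIs/SGIs) */
	mutex_lock(&vcpu->kvm->lock);
	vgic_change_active_prepare(vcpu, intid);

	/* Update the in-memory active state while nothing is in the LRs */
	__vgic_mmio_write_sactive(vcpu, addr, len, val);

	/* Let everyone back in */
	vgic_change_active_finish(vcpu, intid);
	mutex_unlock(&vcpu->kvm->lock);
}

The patch below mirrors this structure on the read side.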

Thanks,

M.

+
+/*
+ * If we are fiddling with an IRQ's active state, we have to make sure the IRQ
+ * is not queued on some running VCPU's LRs, because then the change to the
+ * active state can be overwritten when the VCPU's state is synced coming back
+ * from the guest.
+ *
+ * For shared interrupts as well as GICv3 private interrupts, we have to
+ * stop all the VCPUs because interrupts can be migrated while we don't hold
+ * the IRQ locks and we don't want to be chasing moving targets.
+ *
+ * For GICv2 private interrupts we don't have to do anything because
+ * userspace accesses to the VGIC state already require all VCPUs to be
+ * stopped, and only the VCPU itself can modify its private interrupts
+ * active state, which guarantees that the VCPU is not running.
+ */
+static void vgic_access_active_prepare(struct kvm_vcpu *vcpu, u32 intid)
+{
+	if (vcpu->kvm->arch.vgic.vgic_model == KVM_DEV_TYPE_ARM_VGIC_V3 ||
+	    intid > VGIC_NR_PRIVATE_IRQS)
+		kvm_arm_halt_guest(vcpu->kvm);
+}
+
+/* See vgic_access_active_prepare */
+static void vgic_access_active_finish(struct kvm_vcpu *vcpu, u32 intid)
+{
+	if (vcpu->kvm->arch.vgic.vgic_model == KVM_DEV_TYPE_ARM_VGIC_V3 ||
+	    intid > VGIC_NR_PRIVATE_IRQS)
+		kvm_arm_resume_guest(vcpu->kvm);
+}
+
+static unsigned long __vgic_mmio_read_active(struct kvm_vcpu *vcpu,
+					     gpa_t addr, unsigned int len)
 {
 	u32 intid = VGIC_ADDR_TO_INTID(addr, 1);
 	u32 value = 0;
@@ -359,6 +390,10 @@ unsigned long vgic_mmio_read_active(struct kvm_vcpu *vcpu,
 	for (i = 0; i < len * 8; i++) {
 		struct vgic_irq *irq = vgic_get_irq(vcpu->kvm, vcpu, intid + i);
 
+		/*
+		 * Even for HW interrupts, don't evaluate the HW state as
+		 * all the guest is interested in is the virtual state.
+		 */
 		if (irq->active)
 			value |= (1U << i);
 
@@ -368,6 +403,29 @@ unsigned long vgic_mmio_read_active(struct kvm_vcpu *vcpu,
 	return value;
 }
 
+unsigned long vgic_mmio_read_active(struct kvm_vcpu *vcpu,
+				    gpa_t addr, unsigned int len)
+{
+	u32 intid = VGIC_ADDR_TO_INTID(addr, 1);
+	u32 val;
+
+	mutex_lock(&vcpu->kvm->lock);
+	vgic_access_active_prepare(vcpu, intid);
+
+	val = __vgic_mmio_read_active(vcpu, addr, len);
+
+	vgic_access_active_finish(vcpu, intid);
+	mutex_unlock(&vcpu->kvm->lock);
+
+	return val;
+}
+
+unsigned long vgic_uaccess_read_active(struct kvm_vcpu *vcpu,
+				       gpa_t addr, unsigned int len)
+{
+	return __vgic_mmio_read_active(vcpu, addr, len);
+}
+
 /* Must be called with irq->irq_lock held */
 static void vgic_hw_irq_change_active(struct kvm_vcpu *vcpu, struct vgic_irq *irq,
 				      bool active, bool is_uaccess)
@@ -426,36 +484,6 @@ static void vgic_mmio_change_active(struct kvm_vcpu *vcpu, struct vgic_irq *irq,
 	raw_spin_unlock_irqresto

[PATCH] KVM: arm64: arch_timer shouldn't assume the vcpu is loaded

2020-04-06 Thread James Morse
kvm_arch_timer_get_input_level() needs to get the arch_timer_context for
a particular vcpu, and uses kvm_get_running_vcpu() to find it.

kvm_arch_timer_get_input_level() may be called to handle a user-space
write to the redistributor, where the vcpu is not loaded. This causes
kvm_get_running_vcpu() to return NULL:
| Unable to handle kernel paging request at virtual address 1ec0
| Mem abort info:
|   ESR = 0x9604
|   EC = 0x25: DABT (current EL), IL = 32 bits
|   SET = 0, FnV = 0
|   EA = 0, S1PTW = 0
| Data abort info:
|   ISV = 0, ISS = 0x0004
|   CM = 0, WnR = 0
| user pgtable: 4k pages, 48-bit VAs, pgdp=3cbf9000
| [1ec0] pgd=
| Internal error: Oops: 9604 [#1] PREEMPT SMP
| Modules linked in: r8169 realtek efivarfs ip_tables x_tables
| CPU: 1 PID: 2615 Comm: qemu-system-aar Not tainted 5.6.0-rc7 #30
| Hardware name: Marvell mvebu_armada-37xx/mvebu_armada-37xx, BIOS 2018.03-devel-18.12.3-gc9aa92c-armbian 02/20/2019
| pstate: 0085 (nzcv daIf -PAN -UAO)
| pc : kvm_arch_timer_get_input_level+0x1c/0x68
| lr : kvm_arch_timer_get_input_level+0x1c/0x68

| Call trace:
|  kvm_arch_timer_get_input_level+0x1c/0x68
|  vgic_get_phys_line_level+0x3c/0x90
|  vgic_mmio_write_senable+0xe4/0x130
|  vgic_uaccess+0xe0/0x100
|  vgic_v3_redist_uaccess+0x5c/0x80
|  vgic_v3_attr_regs_access+0xf0/0x200
|  nvgic_v3_set_attr+0x234/0x250
|  kvm_device_ioctl_attr+0xa4/0xf8
|  kvm_device_ioctl+0x7c/0xc0
|  ksys_ioctl+0x1fc/0xc18
|  __arm64_sys_ioctl+0x24/0x30
|  do_el0_svc+0x7c/0x148
|  el0_sync_handler+0x138/0x258
|  el0_sync+0x140/0x180
| Code: 910003fd f9000bf3 2a0003f3 97ff650c (b95ec001)
| ---[ end trace 81287612d93f1e70 ]---
| note: qemu-system-aar[2615] exited with preempt_count 1

Loading the vcpu doesn't make a lot of sense for handling a device ioctl(),
so instead pass the vcpu through to kvm_arch_timer_get_input_level(). It's
not clear that an intid makes much sense without the paired vcpu.

Suggested-by: Andre Przywara 
Signed-off-by: James Morse 
---
 include/kvm/arm_arch_timer.h  | 2 +-
 include/kvm/arm_vgic.h| 8 +++-
 virt/kvm/arm/arch_timer.c | 3 +--
 virt/kvm/arm/vgic/vgic-mmio.c | 2 +-
 virt/kvm/arm/vgic/vgic-v2.c   | 2 +-
 virt/kvm/arm/vgic/vgic-v3.c   | 2 +-
 virt/kvm/arm/vgic/vgic.c  | 8 
 virt/kvm/arm/vgic/vgic.h  | 2 +-
 8 files changed, 13 insertions(+), 16 deletions(-)

diff --git a/include/kvm/arm_arch_timer.h b/include/kvm/arm_arch_timer.h
index d120e6c..42a016a 100644
--- a/include/kvm/arm_arch_timer.h
+++ b/include/kvm/arm_arch_timer.h
@@ -92,7 +92,7 @@ void kvm_timer_vcpu_put(struct kvm_vcpu *vcpu);
 
 void kvm_timer_init_vhe(void);
 
-bool kvm_arch_timer_get_input_level(int vintid);
+bool kvm_arch_timer_get_input_level(int vintid, struct kvm_vcpu *vcpu);
 
 #define vcpu_timer(v)	(&(v)->arch.timer_cpu)
 #define vcpu_get_timer(v,t)	(&vcpu_timer(v)->timers[(t)])
diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
index 9d53f54..41e91b3 100644
--- a/include/kvm/arm_vgic.h
+++ b/include/kvm/arm_vgic.h
@@ -130,11 +130,9 @@ struct vgic_irq {
 	 * state of the input level of mapped level-triggered IRQ faster than
 	 * peaking into the physical GIC.
 	 *
-	 * Always called in non-preemptible section and the functions can use
-	 * kvm_arm_get_running_vcpu() to get the vcpu pointer for private
-	 * IRQs.
+	 * Always called in non-preemptible section.
 	 */
-	bool (*get_input_level)(int vintid);
+	bool (*get_input_level)(int vintid, struct kvm_vcpu *vcpu);
 
 	void *owner;			/* Opaque pointer to reserve an interrupt
 					   for in-kernel devices. */
@@ -344,7 +342,7 @@ void kvm_vgic_init_cpu_hardware(void);
 int kvm_vgic_inject_irq(struct kvm *kvm, int cpuid, unsigned int intid,
 			bool level, void *owner);
 int kvm_vgic_map_phys_irq(struct kvm_vcpu *vcpu, unsigned int host_irq,
-			  u32 vintid, bool (*get_input_level)(int vindid));
+			  u32 vintid, bool (*get_input_level)(int vindid, struct kvm_vcpu *vcpu));
 int kvm_vgic_unmap_phys_irq(struct kvm_vcpu *vcpu, unsigned int vintid);
 bool kvm_vgic_map_is_active(struct kvm_vcpu *vcpu, unsigned int vintid);
 
diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
index 0d9438e..ca0e87b 100644
--- a/virt/kvm/arm/arch_timer.c
+++ b/virt/kvm/arm/arch_timer.c
@@ -1021,9 +1021,8 @@ static bool timer_irqs_are_valid(struct kvm_vcpu *vcpu)
 	return true;
 }
 
-bool kvm_arch_timer_get_input_level(int vintid)
+bool kvm_arch_timer_get_input_level(int vintid, struct kvm_vcpu *vcpu)
 {
-	struct kvm_vcpu *vcpu = kvm_get_running_vcpu();
 	struct arch_timer_context *timer;
 
 	if (vintid == vcpu_vtimer(vcpu)->irq.irq)
diff --git a/virt/kvm/arm/vgic/vgic-mmio.c b/virt/kvm/arm/vgic/vgic-mmio.c
index 97fb2a4..37ee2f8 100644
--- a/virt/kvm/arm/vgic/vgic-mmio.c
+++ b/

I{S,C}ACTIVER implementation question

2020-04-06 Thread Julien Grall

Hi,

Xen community is currently reviewing a new implementation for reading 
I{S,C}ACTIVER registers (see [1]).


The implementation is based on vgic_mmio_read_active() in KVM, i.e. the 
active state of the interrupts is based on the vGIC state stored in memory.


While reviewing the patch on xen-devel, I noticed a potential deadlock, 
at least with the Xen implementation. I know that the Xen vGIC and KVM vGIC 
are quite different, so I looked at the KVM implementation to see how this 
is dealt with.


With my limited knowledge of KVM, I wasn't able to rule it out. I am 
curious to know if I missed anything.


vCPU A may read the active state of an interrupt routed to vCPU B. When 
vCPU A is reading the state, it will read the state stored in memory.


The only way the memory state can get synced with the HW state is when 
vCPU B exits guest context.


AFAICT, vCPU B will not exit when deactivating HW mapped interrupts and 
virtual edge interrupts. So vCPU B may run for an arbitrarily long time 
before exiting and syncing the memory state with the HW state.


Looking at how Linux (5.4 and onwards) uses the active state, vCPU A would 
loop until the interrupt is not active anymore. So wouldn't the task on 
vCPU A be blocked for an arbitrarily long time?
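
To make the guest side of this concrete, the polling in question amounts to
something like the following bare-metal style sketch (the 0x08000000
distributor base is the QEMU "virt" layout and the helper name is made up;
it is only meant to illustrate the trap pattern):

#include <stdint.h>

#define GICD_BASE        0x08000000UL  /* assumption: QEMU "virt" GICD base */
#define GICD_ISACTIVER   0x0300UL      /* GICD_ISACTIVER<n>, one bit per INTID */

static inline uint32_t readl(uintptr_t addr)
{
	return *(volatile uint32_t *)addr;
}

/*
 * Spin until the distributor reports INTID as inactive. Every read traps
 * to the hypervisor and is answered from the in-memory vGIC state, not
 * from the hardware GIC.
 */
static void wait_for_irq_inactive(unsigned int intid)
{
	uintptr_t reg = GICD_BASE + GICD_ISACTIVER + (intid / 32) * 4;
	uint32_t mask = 1U << (intid % 32);

	while (readl(reg) & mask)
		;
}

If vCPU B never exits, the in-memory state never changes and the loop above
never terminates on its own.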


Cheers,

[1] 
https://lists.xenproject.org/archives/html/xen-devel/2020-03/msg01844.html


--
Julien Grall