RE: [PATCH] arm64/kvm: correct the error report in kvm_handle_guest_abort

2021-01-27 Thread Jianyong Wu
Hi Marc,

> 
>  From 8f2a919d6f13d36445974794c76821fbb6b40f88 Mon Sep 17 00:00:00 2001
>  From: Marc Zyngier 
> Date: Sat, 16 Jan 2021 10:53:21 +
> Subject: [PATCH] CMO on RO memslot
> 
> Signed-off-by: Marc Zyngier 
> ---
>   arch/arm64/kvm/mmu.c | 51 +++++++++++++++++++++++++++++++++++++++------------
>   1 file changed, 39 insertions(+), 12 deletions(-)
> 
> diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c index
> 7d2257cc5438..3c176b5b0a28 100644
> --- a/arch/arm64/kvm/mmu.c
> +++ b/arch/arm64/kvm/mmu.c
> @@ -760,7 +760,15 @@ static int user_mem_abort(struct kvm_vcpu *vcpu,
> phys_addr_t fault_ipa,
>   struct kvm_pgtable *pgt;
> 
>   fault_granule = 1UL <<
> ARM64_HW_PGTABLE_LEVEL_SHIFT(fault_level);
> - write_fault = kvm_is_write_fault(vcpu);
> + /*
> +  * Treat translation faults on CMOs as read faults. Should
> +  * this further generate a permission fault, it will be caught
> +  * in kvm_handle_guest_abort(), with prejudice...
> +  */
> + if (fault_status == FSC_FAULT && kvm_vcpu_dabt_is_cm(vcpu))
> + write_fault = false;
> + else
> + write_fault = kvm_is_write_fault(vcpu);
>   exec_fault = kvm_vcpu_trap_is_exec_fault(vcpu);
>   VM_BUG_ON(write_fault && exec_fault);
> 
> @@ -1013,19 +1021,37 @@ int kvm_handle_guest_abort(struct kvm_vcpu
> *vcpu)
>   }
> 
>   /*
> -  * Check for a cache maintenance operation. Since we
> -  * ended-up here, we know it is outside of any memory
> -  * slot. But we can't find out if that is for a device,
> -  * or if the guest is just being stupid. The only thing
> -  * we know for sure is that this range cannot be cached.
> +  * Check for a cache maintenance operation. Two cases:
> +  *
> +  * - It is outside of any memory slot. But we can't find out
> +  *   if that is for a device, or if the guest is just being
> +  *   stupid. The only thing we know for sure is that this
> +  *   range cannot be cached.  So let's assume that the guest
> +  *   is just being cautious, and skip the instruction.
> +  *
> +  * - Otherwise, check whether this is a permission fault.
> +  *   If so, that's a DC IVAC on a R/O memslot, which is a
> +  *   pretty bad idea, and we tell the guest so.
>*
> -  * So let's assume that the guest is just being
> -  * cautious, and skip the instruction.
> +  * - If this wasn't a permission fault, pass it along for
> +  *   further handling (including faulting the page in if it
> +  *   was a translation fault).
>*/
> - if (kvm_is_error_hva(hva) && kvm_vcpu_dabt_is_cm(vcpu)) {
> - kvm_incr_pc(vcpu);
> - ret = 1;
> - goto out_unlock;
> + if (kvm_vcpu_dabt_is_cm(vcpu)) {
> + if (kvm_is_error_hva(hva)) {
> + kvm_incr_pc(vcpu);
> + ret = 1;
> + goto out_unlock;
> + }
> +
> + if (fault_status == FSC_PERM) {
> + /* DC IVAC on a R/O memslot */
> + kvm_inject_dabt(vcpu, kvm_vcpu_get_hfar(vcpu));

One question:
In general, the "DC" ops show up very early in the guest. So what if the guest
does this before interrupt initialization? If so, the guest may get stuck here.

Thanks
Jianyong

> + ret = 1;
> + goto out_unlock;
> + }
> +
> + goto handle_access;
>   }
> 
>   /*
> @@ -1039,6 +1065,7 @@ int kvm_handle_guest_abort(struct kvm_vcpu
> *vcpu)
>   goto out_unlock;
>   }
> 
> +handle_access:
>   /* Userspace should not be able to register out-of-bounds IPAs */
>   VM_BUG_ON(fault_ipa >= kvm_phys_size(vcpu->kvm));
> 
> --
> 2.29.2
> 
> 
> --
> Jazz is not dead. It just smells funny...


Re: [PATCH v3 2/2] vfio/iommu_type1: Fix some sanity checks in detach group

2021-01-27 Thread Alex Williamson
On Fri, 22 Jan 2021 17:26:35 +0800
Keqian Zhu  wrote:

> vfio_sanity_check_pfn_list() is used to check whether the pfn_list and
> notifier are empty when removing the external domain, so it makes the
> wrong assumption that only the external domain will use the pinning
> interface.
> 
> Now we apply the pfn_list check when a vfio_dma is removed and apply
> the notifier check when all domains are removed.
> 
> Fixes: a54eb55045ae ("vfio iommu type1: Add support for mediated devices")
> Signed-off-by: Keqian Zhu 
> ---
>  drivers/vfio/vfio_iommu_type1.c | 33 ++---
>  1 file changed, 10 insertions(+), 23 deletions(-)
> 
> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
> index 161725395f2f..d8c10f508321 100644
> --- a/drivers/vfio/vfio_iommu_type1.c
> +++ b/drivers/vfio/vfio_iommu_type1.c
> @@ -957,6 +957,7 @@ static long vfio_unmap_unpin(struct vfio_iommu *iommu, 
> struct vfio_dma *dma,
>  
>  static void vfio_remove_dma(struct vfio_iommu *iommu, struct vfio_dma *dma)
>  {
> + WARN_ON(!RB_EMPTY_ROOT(&dma->pfn_list));
>   vfio_unmap_unpin(iommu, dma, true);
>   vfio_unlink_dma(iommu, dma);
>   put_task_struct(dma->task);
> @@ -2250,23 +2251,6 @@ static void vfio_iommu_unmap_unpin_reaccount(struct 
> vfio_iommu *iommu)
>   }
>  }
>  
> -static void vfio_sanity_check_pfn_list(struct vfio_iommu *iommu)
> -{
> - struct rb_node *n;
> -
> - n = rb_first(&iommu->dma_list);
> - for (; n; n = rb_next(n)) {
> - struct vfio_dma *dma;
> -
> - dma = rb_entry(n, struct vfio_dma, node);
> -
> - if (WARN_ON(!RB_EMPTY_ROOT(&dma->pfn_list)))
> - break;
> - }
> - /* mdev vendor driver must unregister notifier */
> - WARN_ON(iommu->notifier.head);
> -}
> -
>  /*
>   * Called when a domain is removed in detach. It is possible that
>   * the removed domain decided the iova aperture window. Modify the
> @@ -2366,10 +2350,10 @@ static void vfio_iommu_type1_detach_group(void 
> *iommu_data,
>   kfree(group);
>  
> if (list_empty(&iommu->external_domain->group_list)) {
> - vfio_sanity_check_pfn_list(iommu);
> -
> - if (!IS_IOMMU_CAP_DOMAIN_IN_CONTAINER(iommu))
> + if (!IS_IOMMU_CAP_DOMAIN_IN_CONTAINER(iommu)) {
> + WARN_ON(iommu->notifier.head);
>   vfio_iommu_unmap_unpin_all(iommu);
> + }
>  
>   kfree(iommu->external_domain);
>   iommu->external_domain = NULL;
> @@ -2403,10 +2387,12 @@ static void vfio_iommu_type1_detach_group(void 
> *iommu_data,
>*/
> if (list_empty(&domain->group_list)) {
> if (list_is_singular(&iommu->domain_list)) {
> - if (!iommu->external_domain)
> + if (!iommu->external_domain) {
> + WARN_ON(iommu->notifier.head);
>   vfio_iommu_unmap_unpin_all(iommu);
> - else
> + } else {
>   vfio_iommu_unmap_unpin_reaccount(iommu);
> + }
>   }
>   iommu_domain_free(domain->domain);
> list_del(&domain->next);
> @@ -2488,9 +2474,10 @@ static void vfio_iommu_type1_release(void *iommu_data)
>   struct vfio_iommu *iommu = iommu_data;
>   struct vfio_domain *domain, *domain_tmp;
>  
> + WARN_ON(iommu->notifier.head);

I don't see that this does any harm, but isn't it actually redundant?
It seems vfio-core only calls the iommu backend release function after
removing all the groups, so the tests in _detach_group should catch all
cases.  We're expecting the vfio bus/mdev driver to remove the notifier
when a device is closed, which necessarily occurs before detaching the
group.  Thanks,

Alex

> +
>   if (iommu->external_domain) {
>   vfio_release_domain(iommu->external_domain, true);
> - vfio_sanity_check_pfn_list(iommu);
>   kfree(iommu->external_domain);
>   }
>  



Re: [PATCH v2] KVM: x86/mmu: consider the hva in mmu_notifier retry

2021-01-27 Thread Sean Christopherson
On Wed, Jan 27, 2021, David Stevens wrote:
> From: David Stevens 
> 
> Track the range being invalidated by mmu_notifier and skip page fault
> retries if the fault address is not affected by the in-progress
> invalidation. Handle concurrent invalidations by finding the minimal
> range which includes all ranges being invalidated. Although the combined
> range may include unrelated addresses and cannot be shrunk as individual
> invalidation operations complete, it is unlikely the marginal gains of
> proper range tracking are worth the additional complexity.
> 
> The primary benefit of this change is the reduction in the likelihood of
> extreme latency when handling a page fault due to another thread having
> been preempted while modifying host virtual addresses.
> 
> Signed-off-by: David Stevens 
> ---
> v1 -> v2:
>  - improve handling of concurrent invalidation requests by unioning
>ranges, instead of just giving up and using [0, ULONG_MAX).

Ooh, even better.

>  - add lockdep check
>  - code comments and formatting
> 
>  arch/powerpc/kvm/book3s_64_mmu_hv.c|  2 +-
>  arch/powerpc/kvm/book3s_64_mmu_radix.c |  2 +-
>  arch/x86/kvm/mmu/mmu.c | 16 --
>  arch/x86/kvm/mmu/paging_tmpl.h |  7 ---
>  include/linux/kvm_host.h   | 27 +++-
>  virt/kvm/kvm_main.c| 29 ++
>  6 files changed, 67 insertions(+), 16 deletions(-)
> 

...

> @@ -3717,7 +3720,8 @@ static int direct_page_fault(struct kvm_vcpu *vcpu, 
> gpa_t gpa, u32 error_code,
>   mmu_seq = vcpu->kvm->mmu_notifier_seq;
>   smp_rmb();
>  
> - if (try_async_pf(vcpu, prefault, gfn, gpa, &pfn, write, &map_writable))
> + if (try_async_pf(vcpu, prefault, gfn, gpa, &pfn, &hva,
> +  write, &map_writable))
>   return RET_PF_RETRY;
>  
>   if (handle_abnormal_pfn(vcpu, is_tdp ? 0 : gpa, gfn, pfn, ACC_ALL, ))
> @@ -3725,7 +3729,7 @@ static int direct_page_fault(struct kvm_vcpu *vcpu, 
> gpa_t gpa, u32 error_code,
>  
>   r = RET_PF_RETRY;
> spin_lock(&vcpu->kvm->mmu_lock);
> - if (mmu_notifier_retry(vcpu->kvm, mmu_seq))
> + if (mmu_notifier_retry_hva(vcpu->kvm, mmu_seq, hva))

'hva' will be uninitialized at this point if the gfn did not resolve to a
memslot, i.e. when handling an MMIO page fault.  On the plus side, that's an
opportunity for another optimization as there is no need to retry MMIO page
faults on mmu_notifier invalidations.  Including the attached patch as a prereq
to this will avoid consuming an uninitialized 'hva'.


>   goto out_unlock;
>   r = make_mmu_pages_available(vcpu);
>   if (r)

...

>  void kvm_release_pfn_clean(kvm_pfn_t pfn);
>  void kvm_release_pfn_dirty(kvm_pfn_t pfn);
> @@ -1203,6 +1206,28 @@ static inline int mmu_notifier_retry(struct kvm *kvm, 
> unsigned long mmu_seq)
>   return 1;
>   return 0;
>  }
> +
> +static inline int mmu_notifier_retry_hva(struct kvm *kvm,
> +  unsigned long mmu_seq,
> +  unsigned long hva)
> +{
> +#ifdef CONFIG_LOCKDEP
> + lockdep_is_held(&kvm->mmu_lock);

No need to manually do the #ifdef, just use lockdep_assert_held instead of
lockdep_is_held.
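For reference, a minimal sketch of the suggested form (lockdep_assert_held()
already compiles away when CONFIG_LOCKDEP is not set, so the explicit #ifdef
is unnecessary):

	static inline int mmu_notifier_retry_hva(struct kvm *kvm,
						 unsigned long mmu_seq,
						 unsigned long hva)
	{
		/* Replaces the open-coded #ifdef CONFIG_LOCKDEP block above. */
		lockdep_assert_held(&kvm->mmu_lock);
		...
	}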

> +#endif
> + /*
> +  * If mmu_notifier_count is non-zero, then the range maintained by
> +  * kvm_mmu_notifier_invalidate_range_start contains all addresses that
> +  * might be being invalidated. Note that it may include some false
> +  * positives, due to shortcuts when handling concurrent invalidations.
> +  */
> + if (unlikely(kvm->mmu_notifier_count) &&
> + kvm->mmu_notifier_range_start <= hva &&
> + hva < kvm->mmu_notifier_range_end)

Uber nit: I find this easier to read if 'hva' is on the left-hand side for both
checks, i.e.

if (unlikely(kvm->mmu_notifier_count) &&
hva >= kvm->mmu_notifier_range_start &&
hva < kvm->mmu_notifier_range_end)

> + return 1;
> + if (kvm->mmu_notifier_seq != mmu_seq)
> + return 1;
> + return 0;
> +}
>  #endif
>  
>  #ifdef CONFIG_HAVE_KVM_IRQ_ROUTING
From a1bfdc6fe16582440815cfecc656313dff993003 Mon Sep 17 00:00:00 2001
From: Sean Christopherson 
Date: Wed, 27 Jan 2021 10:04:45 -0800
Subject: [PATCH] KVM: x86/mmu: Skip mmu_notifier check when handling MMIO page
 fault

Don't retry a page fault due to an mmu_notifier invalidation when
handling a page fault for a GPA that did not resolve to a memslot, i.e.
an MMIO page fault.  Invalidations from the mmu_notifier signal a change
in a host virtual address (HVA) mapping; without a memslot, there is no
HVA and thus no possibility that the invalidation is relevant to the
page fault being handled.

Note, the MMIO vs. memslot generation checks handle the case where a
pending memslot will create a memslot overlapping the faulting GPA.  The
mmu_notifier checks are orthogonal to memslot updates.

Signed-off-by: Sean Christopherson 
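A rough, hypothetical sketch of the idea described above (not the attached
patch itself), assuming the try_async_pf() signature from the quoted v2 hunks:
leave 'hva' poisoned when the gfn has no memslot and skip the hva-based retry
check in that case.

	/* In direct_page_fault(), before the slow path: */
	unsigned long hva = KVM_HVA_ERR_BAD;	/* stays "bad" for MMIO faults */

	if (try_async_pf(vcpu, prefault, gfn, gpa, &pfn, &hva,
			 write, &map_writable))
		return RET_PF_RETRY;
	...
	r = RET_PF_RETRY;
	spin_lock(&vcpu->kvm->mmu_lock);
	/* No memslot -> no hva -> no mmu_notifier range can be relevant. */
	if (!kvm_is_error_hva(hva) &&
	    mmu_notifier_retry_hva(vcpu->kvm, mmu_seq, hva))
		goto out_unlock;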

Re: [PATCH v2 7/7] KVM: arm64: Use symbolic names for the PMU versions

2021-01-27 Thread Auger Eric
Hi Marc,

On 1/25/21 1:26 PM, Marc Zyngier wrote:
> Instead of using a bunch of magic numbers, use the existing definitions
> that have been added since 8673e02e58410 ("arm64: perf: Add support
> for ARMv8.5-PMU 64-bit counters")
> 
> Signed-off-by: Marc Zyngier 
> ---
>  arch/arm64/kvm/pmu-emul.c | 8 
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/arm64/kvm/pmu-emul.c b/arch/arm64/kvm/pmu-emul.c
> index 72cd704a8368..cb16ca2eee92 100644
> --- a/arch/arm64/kvm/pmu-emul.c
> +++ b/arch/arm64/kvm/pmu-emul.c
> @@ -23,11 +23,11 @@ static void kvm_pmu_stop_counter(struct kvm_vcpu *vcpu, 
> struct kvm_pmc *pmc);
>  static u32 kvm_pmu_event_mask(struct kvm *kvm)
>  {
>   switch (kvm->arch.pmuver) {
> - case 1: /* ARMv8.0 */
> + case ID_AA64DFR0_PMUVER_8_0:
>   return GENMASK(9, 0);
> - case 4: /* ARMv8.1 */
> - case 5: /* ARMv8.4 */
> - case 6: /* ARMv8.5 */
> + case ID_AA64DFR0_PMUVER_8_1:
> + case ID_AA64DFR0_PMUVER_8_4:
> + case ID_AA64DFR0_PMUVER_8_5:
>   return GENMASK(15, 0);
>   default:/* Shouldn't be here, just for sanity */
>   WARN_ONCE(1, "Unknown PMU version %d\n", kvm->arch.pmuver);
> 
Reviewed-by: Eric Auger 

Thanks

Eric



Re: [PATCH v2 6/7] KVM: arm64: Upgrade PMU support to ARMv8.4

2021-01-27 Thread Auger Eric
Hi Marc,

On 1/25/21 1:26 PM, Marc Zyngier wrote:
> Upgrading the PMU code from ARMv8.1 to ARMv8.4 turns out to be
> pretty easy. All that is required is support for PMMIR_EL1, which
> is read-only, and for which returning 0 is a valid option as long
> as we don't advertise STALL_SLOT as an implemented event.
> 
> Let's just do that and adjust what we return to the guest.
> 
> Signed-off-by: Marc Zyngier 
> ---
>  arch/arm64/include/asm/sysreg.h |  3 +++
>  arch/arm64/kvm/pmu-emul.c   |  6 ++
>  arch/arm64/kvm/sys_regs.c   | 11 +++
>  3 files changed, 16 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
> index 8b5e7e5c3cc8..2fb3f386588c 100644
> --- a/arch/arm64/include/asm/sysreg.h
> +++ b/arch/arm64/include/asm/sysreg.h
> @@ -846,7 +846,10 @@
>  
>  #define ID_DFR0_PERFMON_SHIFT24
>  
> +#define ID_DFR0_PERFMON_8_0  0x3
>  #define ID_DFR0_PERFMON_8_1  0x4
> +#define ID_DFR0_PERFMON_8_4  0x5
> +#define ID_DFR0_PERFMON_8_5  0x6
>  
>  #define ID_ISAR4_SWP_FRAC_SHIFT  28
>  #define ID_ISAR4_PSR_M_SHIFT 24
> diff --git a/arch/arm64/kvm/pmu-emul.c b/arch/arm64/kvm/pmu-emul.c
> index 398f6df1bbe4..72cd704a8368 100644
> --- a/arch/arm64/kvm/pmu-emul.c
> +++ b/arch/arm64/kvm/pmu-emul.c
> @@ -795,6 +795,12 @@ u64 kvm_pmu_get_pmceid(struct kvm_vcpu *vcpu, bool 
> pmceid1)
>   base = 0;
>   } else {
>   val = read_sysreg(pmceid1_el0);
> + /*
> +  * Don't advertise STALL_SLOT, as PMMIR_EL0 is handled
> +  * as RAZ
> +  */
> + if (vcpu->kvm->arch.pmuver >= ID_AA64DFR0_PMUVER_8_4)
> + val &= ~BIT_ULL(ARMV8_PMUV3_PERFCTR_STALL_SLOT - 32);
what about the STALL_SLOT_BACKEND and FRONTEND events then?
>   base = 32;
>   }
>  
> diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
> index 8f79ec1fffa7..5da536ab738d 100644
> --- a/arch/arm64/kvm/sys_regs.c
> +++ b/arch/arm64/kvm/sys_regs.c
> @@ -1051,16 +1051,16 @@ static u64 read_id_reg(const struct kvm_vcpu *vcpu,
>   /* Limit debug to ARMv8.0 */
>   val &= ~FEATURE(ID_AA64DFR0_DEBUGVER);
>   val |= FIELD_PREP(FEATURE(ID_AA64DFR0_DEBUGVER), 6);
> - /* Limit guests to PMUv3 for ARMv8.1 */
> + /* Limit guests to PMUv3 for ARMv8.4 */
>   val = cpuid_feature_cap_perfmon_field(val,
> ID_AA64DFR0_PMUVER_SHIFT,
> -   kvm_vcpu_has_pmu(vcpu) ? 
> ID_AA64DFR0_PMUVER_8_1 : 0);
> +   kvm_vcpu_has_pmu(vcpu) ? 
> ID_AA64DFR0_PMUVER_8_4 : 0);
>   break;
>   case SYS_ID_DFR0_EL1:
> - /* Limit guests to PMUv3 for ARMv8.1 */
> + /* Limit guests to PMUv3 for ARMv8.4 */
>   val = cpuid_feature_cap_perfmon_field(val,
> ID_DFR0_PERFMON_SHIFT,
> -   kvm_vcpu_has_pmu(vcpu) ? 
> ID_DFR0_PERFMON_8_1 : 0);
> +   kvm_vcpu_has_pmu(vcpu) ? 
> ID_DFR0_PERFMON_8_4 : 0);
>   break;
>   }
>  
> @@ -1496,6 +1496,7 @@ static const struct sys_reg_desc sys_reg_descs[] = {
>  
>   { SYS_DESC(SYS_PMINTENSET_EL1), access_pminten, reset_unknown, 
> PMINTENSET_EL1 },
>   { SYS_DESC(SYS_PMINTENCLR_EL1), access_pminten, reset_unknown, 
> PMINTENSET_EL1 },
"KVM: arm64: Hide PMU registers from userspace when not available"
changed the above, didn't it?
> + { SYS_DESC(SYS_PMMIR_EL1), trap_raz_wi },
>  
>   { SYS_DESC(SYS_MAIR_EL1), access_vm_reg, reset_unknown, MAIR_EL1 },
>   { SYS_DESC(SYS_AMAIR_EL1), access_vm_reg, reset_amair_el1, AMAIR_EL1 },
> @@ -1918,6 +1919,8 @@ static const struct sys_reg_desc cp15_regs[] = {
>   { Op1( 0), CRn( 9), CRm(14), Op2( 3), access_pmovs },
>   { AA32(HI), Op1( 0), CRn( 9), CRm(14), Op2( 4), access_pmceid },
>   { AA32(HI), Op1( 0), CRn( 9), CRm(14), Op2( 5), access_pmceid },
> + /* PMMIR */
> + { Op1( 0), CRn( 9), CRm(14), Op2( 6), trap_raz_wi },
>  
>   /* PRRR/MAIR0 */
>   { AA32(LO), Op1( 0), CRn(10), CRm( 2), Op2( 0), access_vm_reg, NULL, 
> MAIR_EL1 },
> 
Thanks

Eric
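Purely as an illustration of the question above (whether these events need the
same treatment is exactly what is being asked, not something the quoted patch
decides), the extra PMCEID1 bits could be cleared in the same hunk, assuming
the ARMV8_PMUV3_PERFCTR_STALL_SLOT_{FRONTEND,BACKEND} event defines
(0x3E/0x3D) from asm/perf_event.h:

	val = read_sysreg(pmceid1_el0);
	/* Hypothetical variant: also hide the FRONTEND/BACKEND stall events. */
	if (vcpu->kvm->arch.pmuver >= ID_AA64DFR0_PMUVER_8_4)
		val &= ~(BIT_ULL(ARMV8_PMUV3_PERFCTR_STALL_SLOT - 32) |
			 BIT_ULL(ARMV8_PMUV3_PERFCTR_STALL_SLOT_FRONTEND - 32) |
			 BIT_ULL(ARMV8_PMUV3_PERFCTR_STALL_SLOT_BACKEND - 32));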



Re: [PATCH v2 6/7] KVM: arm64: Upgrade PMU support to ARMv8.4

2021-01-27 Thread Alexandru Elisei
Hi Marc,

Had another look at the patch, comments below.

On 1/25/21 12:26 PM, Marc Zyngier wrote:
> Upgrading the PMU code from ARMv8.1 to ARMv8.4 turns out to be
> pretty easy. All that is required is support for PMMIR_EL1, which
> is read-only, and for which returning 0 is a valid option as long
> as we don't advertise STALL_SLOT as an implemented event.
>
> Let's just do that and adjust what we return to the guest.
>
> Signed-off-by: Marc Zyngier 
> ---
>  arch/arm64/include/asm/sysreg.h |  3 +++
>  arch/arm64/kvm/pmu-emul.c   |  6 ++
>  arch/arm64/kvm/sys_regs.c   | 11 +++
>  3 files changed, 16 insertions(+), 4 deletions(-)
>
> diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
> index 8b5e7e5c3cc8..2fb3f386588c 100644
> --- a/arch/arm64/include/asm/sysreg.h
> +++ b/arch/arm64/include/asm/sysreg.h
> @@ -846,7 +846,10 @@
>  
>  #define ID_DFR0_PERFMON_SHIFT24
>  
> +#define ID_DFR0_PERFMON_8_0  0x3
>  #define ID_DFR0_PERFMON_8_1  0x4
> +#define ID_DFR0_PERFMON_8_4  0x5
> +#define ID_DFR0_PERFMON_8_5  0x6
>  
>  #define ID_ISAR4_SWP_FRAC_SHIFT  28
>  #define ID_ISAR4_PSR_M_SHIFT 24
> diff --git a/arch/arm64/kvm/pmu-emul.c b/arch/arm64/kvm/pmu-emul.c
> index 398f6df1bbe4..72cd704a8368 100644
> --- a/arch/arm64/kvm/pmu-emul.c
> +++ b/arch/arm64/kvm/pmu-emul.c
> @@ -795,6 +795,12 @@ u64 kvm_pmu_get_pmceid(struct kvm_vcpu *vcpu, bool 
> pmceid1)
>   base = 0;
>   } else {
>   val = read_sysreg(pmceid1_el0);
> + /*
> +  * Don't advertise STALL_SLOT, as PMMIR_EL0 is handled
> +  * as RAZ
> +  */
> + if (vcpu->kvm->arch.pmuver >= ID_AA64DFR0_PMUVER_8_4)
> + val &= ~BIT_ULL(ARMV8_PMUV3_PERFCTR_STALL_SLOT - 32);

This is confusing to me. We have kvm->arch.pmuver set to the hardware PMU
version
(as set by __armv8pmu_probe_pmu()), but we ignore it when reporting the PMU
version to the guest. Why do we do that? We limit the event number in
kvm_pmu_event_mask() based on the hardware PMU version, so even if we advertise
Armv8.4 PMU, support for all those extra events added by Armv8.1 PMU is missing
(I
hope I understood the code correctly).

I looked at commit c854188ea010 ("KVM: arm64: limit PMU version to PMUv3 for
ARMv8.1") which changed read_id_reg() to report PMUv3 for Armv8.1 
unconditionally,
and there's no explanation why PMUv3 for Armv8.1 was chosen instead of plain 
PMUv3
(PMUVer = 0b0100).

Thanks,

Alex

>   base = 32;
>   }
>  
> diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
> index 8f79ec1fffa7..5da536ab738d 100644
> --- a/arch/arm64/kvm/sys_regs.c
> +++ b/arch/arm64/kvm/sys_regs.c
> @@ -1051,16 +1051,16 @@ static u64 read_id_reg(const struct kvm_vcpu *vcpu,
>   /* Limit debug to ARMv8.0 */
>   val &= ~FEATURE(ID_AA64DFR0_DEBUGVER);
>   val |= FIELD_PREP(FEATURE(ID_AA64DFR0_DEBUGVER), 6);
> - /* Limit guests to PMUv3 for ARMv8.1 */
> + /* Limit guests to PMUv3 for ARMv8.4 */
>   val = cpuid_feature_cap_perfmon_field(val,
> ID_AA64DFR0_PMUVER_SHIFT,
> -   kvm_vcpu_has_pmu(vcpu) ? 
> ID_AA64DFR0_PMUVER_8_1 : 0);
> +   kvm_vcpu_has_pmu(vcpu) ? 
> ID_AA64DFR0_PMUVER_8_4 : 0);
>   break;
>   case SYS_ID_DFR0_EL1:
> - /* Limit guests to PMUv3 for ARMv8.1 */
> + /* Limit guests to PMUv3 for ARMv8.4 */
>   val = cpuid_feature_cap_perfmon_field(val,
> ID_DFR0_PERFMON_SHIFT,
> -   kvm_vcpu_has_pmu(vcpu) ? 
> ID_DFR0_PERFMON_8_1 : 0);
> +   kvm_vcpu_has_pmu(vcpu) ? 
> ID_DFR0_PERFMON_8_4 : 0);
>   break;
>   }
>  
> @@ -1496,6 +1496,7 @@ static const struct sys_reg_desc sys_reg_descs[] = {
>  
>   { SYS_DESC(SYS_PMINTENSET_EL1), access_pminten, reset_unknown, 
> PMINTENSET_EL1 },
>   { SYS_DESC(SYS_PMINTENCLR_EL1), access_pminten, reset_unknown, 
> PMINTENSET_EL1 },
> + { SYS_DESC(SYS_PMMIR_EL1), trap_raz_wi },
>  
>   { SYS_DESC(SYS_MAIR_EL1), access_vm_reg, reset_unknown, MAIR_EL1 },
>   { SYS_DESC(SYS_AMAIR_EL1), access_vm_reg, reset_amair_el1, AMAIR_EL1 },
> @@ -1918,6 +1919,8 @@ static const struct sys_reg_desc cp15_regs[] = {
>   { Op1( 0), CRn( 9), CRm(14), Op2( 3), access_pmovs },
>   { AA32(HI), Op1( 0), CRn( 9), CRm(14), Op2( 4), access_pmceid },
>   { AA32(HI), Op1( 0), CRn( 9), CRm(14), Op2( 5), access_pmceid },
> + /* PMMIR */
> + { Op1( 0), CRn( 9), CRm(14), Op2( 6), trap_raz_wi },
>  
>   /* PRRR/MAIR0 */
>   { AA32(LO), Op1( 0), CRn(10), CRm( 2), Op2( 0), 

Re: [PATCH v2 6/7] KVM: arm64: Upgrade PMU support to ARMv8.4

2021-01-27 Thread Marc Zyngier

On 2021-01-27 17:00, Alexandru Elisei wrote:

Hi Marc,

On 1/27/21 2:35 PM, Marc Zyngier wrote:

Hi Alex,

On 2021-01-27 14:09, Alexandru Elisei wrote:

Hi Marc,

On 1/25/21 12:26 PM, Marc Zyngier wrote:

Upgrading the PMU code from ARMv8.1 to ARMv8.4 turns out to be
pretty easy. All that is required is support for PMMIR_EL1, which
is read-only, and for which returning 0 is a valid option as long
as we don't advertise STALL_SLOT as an implemented event.


According to ARM DDI 0487F.b, page D7-2743:

"If ARMv8.4-PMU is implemented:
- If STALL_SLOT is not implemented, it is IMPLEMENTATION DEFINED
whether the PMMIR
System registers are implemented.
- If STALL_SLOT is implemented, then the PMMIR System registers are
implemented."

I tried to come up with a reason why PMMIR is emulated instead of 
being left
undefined, but I couldn't figure it out. Would you mind adding a 
comment or

changing the commit message to explain that?


The main reason is that PMMIR gets new fields down the line,
and doing the bare minimum in terms of implementation allows
us to gently ease into it.
I think I understand what you are saying - add a bare minimum emulation 
of the
PMMIR register now so it's less work when we do decide to support the 
STALL_SLOT

event for a guest.


We could also go for the full PMMIR reporting on homogeneous
systems too, as a further improvement.

What do you think?


I don't have an opinion either way. But if you do decide to add full
emulation for
STALL_SLOT, I would like to help with reviewing the patches (I'm 
curious to see

how KVM will detect that it's running on a homogeneous system).


I'm not sure we can *detect* it. We'd need some more information
from the core arch code and firmware. To be honest, PMU emulation
is a joke on BL, so maybe we shouldn't really care and expose
what we have.

M.
--
Jazz is not dead. It just smells funny...


Re: [PATCH v2 6/7] KVM: arm64: Upgrade PMU support to ARMv8.4

2021-01-27 Thread Alexandru Elisei
Hi Marc,

On 1/27/21 2:35 PM, Marc Zyngier wrote:
> Hi Alex,
>
> On 2021-01-27 14:09, Alexandru Elisei wrote:
>> Hi Marc,
>>
>> On 1/25/21 12:26 PM, Marc Zyngier wrote:
>>> Upgrading the PMU code from ARMv8.1 to ARMv8.4 turns out to be
>>> pretty easy. All that is required is support for PMMIR_EL1, which
>>> is read-only, and for which returning 0 is a valid option as long
>>> as we don't advertise STALL_SLOT as an implemented event.
>>
>> According to ARM DDI 0487F.b, page D7-2743:
>>
>> "If ARMv8.4-PMU is implemented:
>> - If STALL_SLOT is not implemented, it is IMPLEMENTATION DEFINED
>> whether the PMMIR
>> System registers are implemented.
>> - If STALL_SLOT is implemented, then the PMMIR System registers are
>> implemented."
>>
>> I tried to come up with a reason why PMMIR is emulated instead of being left
>> undefined, but I couldn't figure it out. Would you mind adding a comment or
>> changing the commit message to explain that?
>
> The main reason is that PMMIR gets new fields down the line,
> and doing the bare minimum in terms of implementation allows
> us to gently ease into it.
I think I understand what you are saying - add a bare minimum emulation of the
PMMIR register now so it's less work when we do decide to support the STALL_SLOT
event for a guest.
>
> We could also go for the full PMMIR reporting on homogeneous
> systems too, as a further improvement.
>
> What do you think?

I don't have an opinion either way. But if you do decide to add full emulation 
for
STALL_SLOT, I would like to help with reviewing the patches (I'm curious to see
how KVM will detect that it's running on a homogeneous system).

Thanks,
Alex


Re: [kvm-unit-tests PATCH v2 08/12] arm/arm64: gic: Split check_acked() into two functions

2021-01-27 Thread Alexandru Elisei
Hi Andre,

On 1/27/21 3:10 PM, Andre Przywara wrote:
> On Mon, 25 Jan 2021 17:27:35 +
> Alexandru Elisei  wrote:
>
> Hi Alex,
>
>> On 12/18/20 3:52 PM, André Przywara wrote:
>>> On 17/12/2020 14:13, Alexandru Elisei wrote:  
 check_acked() has several peculiarities: it is the only function among the
 check_* functions which calls report() directly, it does two things
 (waits for interrupts and checks for misfired interrupts) and it also
 mixes printf, report_info and report calls.

 check_acked() also reports a pass and returns as soon as all the target CPUs
 have received interrupts. However, a CPU not having received an interrupt
 *now* does not guarantee not receiving an erroneous interrupt if we wait
 long enough.

 Rework the function by splitting it into two separate functions, each with
 a single responsibility: wait_for_interrupts(), which waits for the
 expected interrupts to fire, and check_acked() which checks that interrupts
 have been received as expected.

 wait_for_interrupts() also waits an extra 100 milliseconds after the
 expected interrupts have been received in an effort to make sure we don't
 miss misfiring interrupts.

 Splitting check_acked() into two functions will also allow us to
 customize the behavior of each function in the future more easily
 without using an unnecessarily long list of arguments for check_acked().  
>>> Yes, splitting this up looks much better, in general this is a nice
>>> cleanup, thank you!
>>>
>>> Some comments below:
>>>  
 CC: Andre Przywara 
 Signed-off-by: Alexandru Elisei 
 ---
  arm/gic.c | 73 +++
  1 file changed, 47 insertions(+), 26 deletions(-)

 diff --git a/arm/gic.c b/arm/gic.c
 index ec733719c776..a9ef1a5def56 100644
 --- a/arm/gic.c
 +++ b/arm/gic.c
 @@ -62,41 +62,42 @@ static void stats_reset(void)
}
  }
  
 -static void check_acked(const char *testname, cpumask_t *mask)
 +static void wait_for_interrupts(cpumask_t *mask)
  {
 -  int missing = 0, extra = 0, unexpected = 0;
int nr_pass, cpu, i;
 -  bool bad = false;
  
/* Wait up to 5s for all interrupts to be delivered */
 -  for (i = 0; i < 50; ++i) {
 +  for (i = 0; i < 50; i++) {
mdelay(100);
nr_pass = 0;
for_each_present_cpu(cpu) {
 +  /*
 +   * A CPU having received more than one interrupts will
 +   * show up in check_acked(), and no matter how long we
 +   * wait it cannot un-receive it. Consider at least one
 +   * interrupt as a pass.
 +   */
nr_pass += cpumask_test_cpu(cpu, mask) ?
 -  acked[cpu] == 1 : acked[cpu] == 0;
 -  smp_rmb(); /* pairs with smp_wmb in ipi_handler */
 -
 -  if (bad_sender[cpu] != -1) {
 -  printf("cpu%d received IPI from wrong sender 
 %d\n",
 -  cpu, bad_sender[cpu]);
 -  bad = true;
 -  }
 -
 -  if (bad_irq[cpu] != -1) {
 -  printf("cpu%d received wrong irq %d\n",
 -  cpu, bad_irq[cpu]);
 -  bad = true;
 -  }
 +  acked[cpu] >= 1 : acked[cpu] == 0;  
>>> I wonder if this logic was already flawed to begin with: For interrupts
>>> we expect to fire, we wait for up to 5 seconds (really that long?), but
>>> for interrupts we expect *not* to fire we are OK if they don't show up
>>> in the first 100 ms. That does not sound consistent.  
>> There are two ways that I see to fix this:
>>
>> - Have the caller wait for however long it sees fit, and *after* that waiting
>> period call wait_for_interrupts().
>>
>> - Pass a flag to wait_for_interrupts() to specify that the behaviour should 
>> be to
>> wait for the entire duration instead of until the expected interrupts have 
>> been
>> received.
>>
>> Neither sounds appealing to me for inclusion in this patch set, since I want 
>> to
>> concentrate on reworking check_acked() while keeping much of the current 
>> behaviour
>> intact.
>>
>>> I am wondering if we should *not* have the initial 100ms wait at all,
>>> since most interrupts will fire immediately (especially in KVM). And
>>> then have *one* extra wait for, say 100ms, to cover latecomers and
>>> spurious interrupts.  
>> I don't think it really matters where the 100 millisecond delay is in the 
>> loop.
> I think it does. I ran tests with 256 vCPUs, I think we support even
> more, and running k-u-t on those setups is one of the cases where it
> really matters and we can find real bugs.

Re: [kvm-unit-tests PATCH v2 08/12] arm/arm64: gic: Split check_acked() into two functions

2021-01-27 Thread Andre Przywara
On Mon, 25 Jan 2021 17:27:35 +
Alexandru Elisei  wrote:

Hi Alex,

> On 12/18/20 3:52 PM, André Przywara wrote:
> > On 17/12/2020 14:13, Alexandru Elisei wrote:  
> >> check_acked() has several peculiarities: it is the only function among the
> >> check_* functions which calls report() directly, it does two things
> >> (waits for interrupts and checks for misfired interrupts) and it also
> >> mixes printf, report_info and report calls.
> >>
> >> check_acked() also reports a pass and returns as soon as all the target CPUs
> >> have received interrupts. However, a CPU not having received an interrupt
> >> *now* does not guarantee not receiving an erroneous interrupt if we wait
> >> long enough.
> >>
> >> Rework the function by splitting it into two separate functions, each with
> >> a single responsibility: wait_for_interrupts(), which waits for the
> >> expected interrupts to fire, and check_acked() which checks that interrupts
> >> have been received as expected.
> >>
> >> wait_for_interrupts() also waits an extra 100 milliseconds after the
> >> expected interrupts have been received in an effort to make sure we don't
> >> miss misfiring interrupts.
> >>
> >> Splitting check_acked() into two functions will also allow us to
> >> customize the behavior of each function in the future more easily
> >> without using an unnecessarily long list of arguments for check_acked().  
> > Yes, splitting this up looks much better, in general this is a nice
> > cleanup, thank you!
> >
> > Some comments below:
> >  
> >> CC: Andre Przywara 
> >> Signed-off-by: Alexandru Elisei 
> >> ---
> >>  arm/gic.c | 73 +++
> >>  1 file changed, 47 insertions(+), 26 deletions(-)
> >>
> >> diff --git a/arm/gic.c b/arm/gic.c
> >> index ec733719c776..a9ef1a5def56 100644
> >> --- a/arm/gic.c
> >> +++ b/arm/gic.c
> >> @@ -62,41 +62,42 @@ static void stats_reset(void)
> >>}
> >>  }
> >>  
> >> -static void check_acked(const char *testname, cpumask_t *mask)
> >> +static void wait_for_interrupts(cpumask_t *mask)
> >>  {
> >> -  int missing = 0, extra = 0, unexpected = 0;
> >>int nr_pass, cpu, i;
> >> -  bool bad = false;
> >>  
> >>/* Wait up to 5s for all interrupts to be delivered */
> >> -  for (i = 0; i < 50; ++i) {
> >> +  for (i = 0; i < 50; i++) {
> >>mdelay(100);
> >>nr_pass = 0;
> >>for_each_present_cpu(cpu) {
> >> +  /*
> >> +   * A CPU having received more than one interrupts will
> >> +   * show up in check_acked(), and no matter how long we
> >> +   * wait it cannot un-receive it. Consider at least one
> >> +   * interrupt as a pass.
> >> +   */
> >>nr_pass += cpumask_test_cpu(cpu, mask) ?
> >> -  acked[cpu] == 1 : acked[cpu] == 0;
> >> -  smp_rmb(); /* pairs with smp_wmb in ipi_handler */
> >> -
> >> -  if (bad_sender[cpu] != -1) {
> >> -  printf("cpu%d received IPI from wrong sender 
> >> %d\n",
> >> -  cpu, bad_sender[cpu]);
> >> -  bad = true;
> >> -  }
> >> -
> >> -  if (bad_irq[cpu] != -1) {
> >> -  printf("cpu%d received wrong irq %d\n",
> >> -  cpu, bad_irq[cpu]);
> >> -  bad = true;
> >> -  }
> >> +  acked[cpu] >= 1 : acked[cpu] == 0;  
> >
> > I wonder if this logic was already flawed to begin with: For interrupts
> > we expect to fire, we wait for up to 5 seconds (really that long?), but
> > for interrupts we expect *not* to fire we are OK if they don't show up
> > in the first 100 ms. That does not sound consistent.  
> 
> There are two ways that I see to fix this:
> 
> - Have the caller wait for however long it sees fit, and *after* that waiting
> period call wait_for_interrupts().
> 
> - Pass a flag to wait_for_interrupts() to specify that the behaviour should 
> be to
> wait for the entire duration instead of until the expected interrupts have 
> been
> received.
> 
> Neither sounds appealing to me for inclusion in this patch set, since I want 
> to
> concentrate on reworking check_acked() while keeping much of the current 
> behaviour
> intact.
> 
> >
> > I am wondering if we should *not* have the initial 100ms wait at all,
> > since most interrupts will fire immediately (especially in KVM). And
> > then have *one* extra wait for, say 100ms, to cover latecomers and
> > spurious interrupts.  
> 
> I don't think it really matters where the 100 millisecond delay is in the 
> loop.

I think it does. I ran tests with 256 vCPUs, I think we support even
more, and running k-u-t on those setups is one of the cases where it
really matters and we can find real bugs.
So 100ms on their own does not sound much, but it means we wait at 

Re: [PATCH v2 6/7] KVM: arm64: Upgrade PMU support to ARMv8.4

2021-01-27 Thread Marc Zyngier

Hi Alex,

On 2021-01-27 14:09, Alexandru Elisei wrote:

Hi Marc,

On 1/25/21 12:26 PM, Marc Zyngier wrote:

Upgrading the PMU code from ARMv8.1 to ARMv8.4 turns out to be
pretty easy. All that is required is support for PMMIR_EL1, which
is read-only, and for which returning 0 is a valid option as long
as we don't advertise STALL_SLOT as an implemented event.


According to ARM DDI 0487F.b, page D7-2743:

"If ARMv8.4-PMU is implemented:
- If STALL_SLOT is not implemented, it is IMPLEMENTATION DEFINED
whether the PMMIR
System registers are implemented.
- If STALL_SLOT is implemented, then the PMMIR System registers are
implemented."

I tried to come up with a reason why PMMIR is emulated instead of being 
left
undefined, but I couldn't figure it out. Would you mind adding a 
comment or

changing the commit message to explain that?


The main reason is that PMMIR gets new fields down the line,
and doing the bare minimum in terms of implementation allows
us to gently ease into it.

We could also go for the full PMMIR reporting on homogeneous
systems too, as a further improvement.

What do you think?

M.
--
Jazz is not dead. It just smells funny...


Re: [PATCH v2 7/7] KVM: arm64: Use symbolic names for the PMU versions

2021-01-27 Thread Alexandru Elisei
Hi Marc,

This is a nice cleanup. Checked that the defines have the same value as the
constants they are replacing:

Reviewed-by: Alexandru Elisei 

On 1/25/21 12:26 PM, Marc Zyngier wrote:
> Instead of using a bunch of magic numbers, use the existing definitions
> that have been added since 8673e02e58410 ("arm64: perf: Add support
> for ARMv8.5-PMU 64-bit counters")
>
> Signed-off-by: Marc Zyngier 
> ---
>  arch/arm64/kvm/pmu-emul.c | 8 
>  1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/arch/arm64/kvm/pmu-emul.c b/arch/arm64/kvm/pmu-emul.c
> index 72cd704a8368..cb16ca2eee92 100644
> --- a/arch/arm64/kvm/pmu-emul.c
> +++ b/arch/arm64/kvm/pmu-emul.c
> @@ -23,11 +23,11 @@ static void kvm_pmu_stop_counter(struct kvm_vcpu *vcpu, 
> struct kvm_pmc *pmc);
>  static u32 kvm_pmu_event_mask(struct kvm *kvm)
>  {
>   switch (kvm->arch.pmuver) {
> - case 1: /* ARMv8.0 */
> + case ID_AA64DFR0_PMUVER_8_0:
>   return GENMASK(9, 0);
> - case 4: /* ARMv8.1 */
> - case 5: /* ARMv8.4 */
> - case 6: /* ARMv8.5 */
> + case ID_AA64DFR0_PMUVER_8_1:
> + case ID_AA64DFR0_PMUVER_8_4:
> + case ID_AA64DFR0_PMUVER_8_5:
>   return GENMASK(15, 0);
>   default:/* Shouldn't be here, just for sanity */
>   WARN_ONCE(1, "Unknown PMU version %d\n", kvm->arch.pmuver);


Re: [PATCH v2 6/7] KVM: arm64: Upgrade PMU support to ARMv8.4

2021-01-27 Thread Alexandru Elisei
Hi Marc,

On 1/25/21 12:26 PM, Marc Zyngier wrote:
> Upgrading the PMU code from ARMv8.1 to ARMv8.4 turns out to be
> pretty easy. All that is required is support for PMMIR_EL1, which
> is read-only, and for which returning 0 is a valid option as long
> as we don't advertise STALL_SLOT as an implemented event.

According to ARM DDI 0487F.b, page D7-2743:

"If ARMv8.4-PMU is implemented:
- If STALL_SLOT is not implemented, it is IMPLEMENTATION DEFINED whether the 
PMMIR
System registers are implemented.
- If STALL_SLOT is implemented, then the PMMIR System registers are 
implemented."

I tried to come up with a reason why PMMIR is emulated instead of being left
undefined, but I couldn't figure it out. Would you mind adding a comment or
changing the commit message to explain that?

Thanks,
Alex
> Let's just do that and adjust what we return to the guest.
>
> Signed-off-by: Marc Zyngier 
> ---
>  arch/arm64/include/asm/sysreg.h |  3 +++
>  arch/arm64/kvm/pmu-emul.c   |  6 ++
>  arch/arm64/kvm/sys_regs.c   | 11 +++
>  3 files changed, 16 insertions(+), 4 deletions(-)
>
> diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
> index 8b5e7e5c3cc8..2fb3f386588c 100644
> --- a/arch/arm64/include/asm/sysreg.h
> +++ b/arch/arm64/include/asm/sysreg.h
> @@ -846,7 +846,10 @@
>  
>  #define ID_DFR0_PERFMON_SHIFT24
>  
> +#define ID_DFR0_PERFMON_8_0  0x3
>  #define ID_DFR0_PERFMON_8_1  0x4
> +#define ID_DFR0_PERFMON_8_4  0x5
> +#define ID_DFR0_PERFMON_8_5  0x6
>  
>  #define ID_ISAR4_SWP_FRAC_SHIFT  28
>  #define ID_ISAR4_PSR_M_SHIFT 24
> diff --git a/arch/arm64/kvm/pmu-emul.c b/arch/arm64/kvm/pmu-emul.c
> index 398f6df1bbe4..72cd704a8368 100644
> --- a/arch/arm64/kvm/pmu-emul.c
> +++ b/arch/arm64/kvm/pmu-emul.c
> @@ -795,6 +795,12 @@ u64 kvm_pmu_get_pmceid(struct kvm_vcpu *vcpu, bool 
> pmceid1)
>   base = 0;
>   } else {
>   val = read_sysreg(pmceid1_el0);
> + /*
> +  * Don't advertise STALL_SLOT, as PMMIR_EL0 is handled
> +  * as RAZ
> +  */
> + if (vcpu->kvm->arch.pmuver >= ID_AA64DFR0_PMUVER_8_4)
> + val &= ~BIT_ULL(ARMV8_PMUV3_PERFCTR_STALL_SLOT - 32);
>   base = 32;
>   }
>  
> diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
> index 8f79ec1fffa7..5da536ab738d 100644
> --- a/arch/arm64/kvm/sys_regs.c
> +++ b/arch/arm64/kvm/sys_regs.c
> @@ -1051,16 +1051,16 @@ static u64 read_id_reg(const struct kvm_vcpu *vcpu,
>   /* Limit debug to ARMv8.0 */
>   val &= ~FEATURE(ID_AA64DFR0_DEBUGVER);
>   val |= FIELD_PREP(FEATURE(ID_AA64DFR0_DEBUGVER), 6);
> - /* Limit guests to PMUv3 for ARMv8.1 */
> + /* Limit guests to PMUv3 for ARMv8.4 */
>   val = cpuid_feature_cap_perfmon_field(val,
> ID_AA64DFR0_PMUVER_SHIFT,
> -   kvm_vcpu_has_pmu(vcpu) ? 
> ID_AA64DFR0_PMUVER_8_1 : 0);
> +   kvm_vcpu_has_pmu(vcpu) ? 
> ID_AA64DFR0_PMUVER_8_4 : 0);
>   break;
>   case SYS_ID_DFR0_EL1:
> - /* Limit guests to PMUv3 for ARMv8.1 */
> + /* Limit guests to PMUv3 for ARMv8.4 */
>   val = cpuid_feature_cap_perfmon_field(val,
> ID_DFR0_PERFMON_SHIFT,
> -   kvm_vcpu_has_pmu(vcpu) ? 
> ID_DFR0_PERFMON_8_1 : 0);
> +   kvm_vcpu_has_pmu(vcpu) ? 
> ID_DFR0_PERFMON_8_4 : 0);
>   break;
>   }
>  
> @@ -1496,6 +1496,7 @@ static const struct sys_reg_desc sys_reg_descs[] = {
>  
>   { SYS_DESC(SYS_PMINTENSET_EL1), access_pminten, reset_unknown, 
> PMINTENSET_EL1 },
>   { SYS_DESC(SYS_PMINTENCLR_EL1), access_pminten, reset_unknown, 
> PMINTENSET_EL1 },
> + { SYS_DESC(SYS_PMMIR_EL1), trap_raz_wi },
>  
>   { SYS_DESC(SYS_MAIR_EL1), access_vm_reg, reset_unknown, MAIR_EL1 },
>   { SYS_DESC(SYS_AMAIR_EL1), access_vm_reg, reset_amair_el1, AMAIR_EL1 },
> @@ -1918,6 +1919,8 @@ static const struct sys_reg_desc cp15_regs[] = {
>   { Op1( 0), CRn( 9), CRm(14), Op2( 3), access_pmovs },
>   { AA32(HI), Op1( 0), CRn( 9), CRm(14), Op2( 4), access_pmceid },
>   { AA32(HI), Op1( 0), CRn( 9), CRm(14), Op2( 5), access_pmceid },
> + /* PMMIR */
> + { Op1( 0), CRn( 9), CRm(14), Op2( 6), trap_raz_wi },
>  
>   /* PRRR/MAIR0 */
>   { AA32(LO), Op1( 0), CRn(10), CRm( 2), Op2( 0), access_vm_reg, NULL, 
> MAIR_EL1 },


Re: [PATCH] KVM: x86/mmu: consider the hva in mmu_notifer retry

2021-01-27 Thread Sean Christopherson
On Tue, Jan 26, 2021, David Stevens wrote:
> > This needs a comment to explicitly state that 'count > 1' cannot be done at
> > this time.  My initial thought is that it would be more intuitive to check 
> > for
> > 'count > 1' here, but that would potentially check the wrong range when 
> > count
> > goes from 2->1.  The comment about persistence in invalidate_range_start() 
> > is a
> > good hint, but I think it's worth being explicit to avoid bad "cleanup" in 
> > the
> > future.
> >
> > > + if (unlikely(kvm->mmu_notifier_count)) {
> > > + if (kvm->mmu_notifier_range_start <= hva &&
> > > + hva < kvm->mmu_notifier_range_end)
> 
> I'm not sure I understand what you're suggesting here. How exactly
> would 'count > 1' be used incorrectly here? I'm fine with adding a
> comment, but I'm not sure what the comment needs to clarify.

There's no guarantee that the remaining in-progress invalidation when the count
goes from 2->1 is the same invalidation call that set range_start/range_end.

E.g. given two invalidations, A and B, the order of calls could be:

  kvm_mmu_notifier_invalidate_range_start(A)
  kvm_mmu_notifier_invalidate_range_start(B)
  kvm_mmu_notifier_invalidate_range_end(A)
  kvm_mmu_notifier_invalidate_range_end(B) <-- ???

or

  kvm_mmu_notifier_invalidate_range_start(A)
  kvm_mmu_notifier_invalidate_range_start(B)
  kvm_mmu_notifier_invalidate_range_end(B)
  kvm_mmu_notifier_invalidate_range_end(A) <-- ???

In the first case, "A" is in-progress when the count goes 2->1, in the second
case "B" is still in-progress.  Checking for "count > 1" in the consumer instead
of handling it in the producer (as you did) would lead to the consumer checking
against the wrong range.  I don't see a way to solve that without adding some
amount of history, which I agree is unnecessary.
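For context, a sketch of the producer-side handling being described here,
reusing the mmu_notifier_count and mmu_notifier_range_start/end field names
from the quoted v2 hunks: the combined range is only ever grown while any
invalidation is in flight, because when the count later drops from 2 to 1
there is no way to tell which invalidation is still live.

	/* In kvm_mmu_notifier_invalidate_range_start(), under mmu_lock: */
	kvm->mmu_notifier_count++;
	if (kvm->mmu_notifier_count == 1) {
		kvm->mmu_notifier_range_start = range->start;
		kvm->mmu_notifier_range_end = range->end;
	} else {
		/* Union with all in-progress invalidations; never shrink. */
		kvm->mmu_notifier_range_start =
			min(kvm->mmu_notifier_range_start, range->start);
		kvm->mmu_notifier_range_end =
			max(kvm->mmu_notifier_range_end, range->end);
	}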


Re: [PATCH v2 5/7] KVM: arm64: Limit the debug architecture to ARMv8.0

2021-01-27 Thread Alexandru Elisei
Hi Marc,

This looks correct, and it also matches what we report for aarch32 in 
trap_dbgidr():

Reviewed-by: Alexandru Elisei 

On 1/25/21 12:26 PM, Marc Zyngier wrote:
> Let's not pretend we support anything but ARMv8.0 as far as the
> debug architecture is concerned.
>
> Reviewed-by: Eric Auger 
> Signed-off-by: Marc Zyngier 
> ---
>  arch/arm64/kvm/sys_regs.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
> index dda16d60197b..8f79ec1fffa7 100644
> --- a/arch/arm64/kvm/sys_regs.c
> +++ b/arch/arm64/kvm/sys_regs.c
> @@ -1048,6 +1048,9 @@ static u64 read_id_reg(const struct kvm_vcpu *vcpu,
>FEATURE(ID_AA64ISAR1_GPI));
>   break;
>   case SYS_ID_AA64DFR0_EL1:
> + /* Limit debug to ARMv8.0 */
> + val &= ~FEATURE(ID_AA64DFR0_DEBUGVER);
> + val |= FIELD_PREP(FEATURE(ID_AA64DFR0_DEBUGVER), 6);
>   /* Limit guests to PMUv3 for ARMv8.1 */
>   val = cpuid_feature_cap_perfmon_field(val,
> ID_AA64DFR0_PMUVER_SHIFT,


[PATCH v3 4/4] KVM: arm64: GICv4.1: Give a chance to save VLPI's pending state

2021-01-27 Thread Shenming Lu
Before GICv4.1, we don't have direct access to the VLPI's pending
state. So we simply let it fail early when encountering any VLPI.

But now we don't have to return -EACCES directly if on GICv4.1. So
let’s change the hardcoded behaviour and give a chance to save the VLPI's pending
state (and preserve the UAPI).

Signed-off-by: Shenming Lu 
---
 Documentation/virt/kvm/devices/arm-vgic-its.rst | 2 +-
 arch/arm64/kvm/vgic/vgic-its.c  | 6 +++---
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/Documentation/virt/kvm/devices/arm-vgic-its.rst 
b/Documentation/virt/kvm/devices/arm-vgic-its.rst
index 6c304fd2b1b4..d257eddbae29 100644
--- a/Documentation/virt/kvm/devices/arm-vgic-its.rst
+++ b/Documentation/virt/kvm/devices/arm-vgic-its.rst
@@ -80,7 +80,7 @@ KVM_DEV_ARM_VGIC_GRP_CTRL
 -EFAULT  Invalid guest ram access
 -EBUSY   One or more VCPUS are running
 -EACCES  The virtual ITS is backed by a physical GICv4 ITS, and the
-state is not available
+state is not available without GICv4.1
 ===  ==
 
 KVM_DEV_ARM_VGIC_GRP_ITS_REGS
diff --git a/arch/arm64/kvm/vgic/vgic-its.c b/arch/arm64/kvm/vgic/vgic-its.c
index 40cbaca81333..ec7543a9617c 100644
--- a/arch/arm64/kvm/vgic/vgic-its.c
+++ b/arch/arm64/kvm/vgic/vgic-its.c
@@ -2218,10 +2218,10 @@ static int vgic_its_save_itt(struct vgic_its *its, 
struct its_device *device)
/*
 * If an LPI carries the HW bit, this means that this
 * interrupt is controlled by GICv4, and we do not
-* have direct access to that state. Let's simply fail
-* the save operation...
+* have direct access to that state without GICv4.1.
+* Let's simply fail the save operation...
 */
-   if (ite->irq->hw)
+   if (ite->irq->hw && !kvm_vgic_global_state.has_gicv4_1)
return -EACCES;
 
ret = vgic_its_save_ite(its, device, ite, gpa, ite_esz);
-- 
2.19.1



[PATCH v3 3/4] KVM: arm64: GICv4.1: Restore VLPI's pending state to physical side

2021-01-27 Thread Shenming Lu
From: Zenghui Yu 

When setting the forwarding path of a VLPI (switch to the HW mode),
we could also transfer the pending state from irq->pending_latch to
VPT (especially in migration, the pending states of VLPIs are restored
into kvm’s vgic first). And we currently send "INT+VSYNC" to trigger
a VLPI to pending.

Signed-off-by: Zenghui Yu 
Signed-off-by: Shenming Lu 
---
 arch/arm64/kvm/vgic/vgic-v4.c | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/arch/arm64/kvm/vgic/vgic-v4.c b/arch/arm64/kvm/vgic/vgic-v4.c
index ac029ba3d337..a3542af6f04a 100644
--- a/arch/arm64/kvm/vgic/vgic-v4.c
+++ b/arch/arm64/kvm/vgic/vgic-v4.c
@@ -449,6 +449,20 @@ int kvm_vgic_v4_set_forwarding(struct kvm *kvm, int virq,
irq->host_irq   = virq;
atomic_inc(&map.vpe->vlpi_count);
 
+   /* Transfer pending state */
+   if (irq->pending_latch) {
+   ret = irq_set_irqchip_state(irq->host_irq,
+   IRQCHIP_STATE_PENDING,
+   irq->pending_latch);
+   WARN_RATELIMIT(ret, "IRQ %d", irq->host_irq);
+
+   /*
+* Let it be pruned from ap_list later and don't bother
+* the List Register.
+*/
+   irq->pending_latch = false;
+   }
+
 out:
mutex_unlock(&its->its_lock);
return ret;
-- 
2.19.1



[PATCH v3 2/4] KVM: arm64: GICv4.1: Try to save hw pending state in save_pending_tables

2021-01-27 Thread Shenming Lu
After pausing all vCPUs and devices capable of interrupting, in order
to save the information of all interrupts, besides flushing the pending
states in kvm’s vgic, we also try to flush the states of VLPIs in the
virtual pending tables into guest RAM, but we need to have GICv4.1 and
safely unmap the vPEs first.

The saving of VSGIs needs the vPEs to be mapped and might conflict with
the saving of VLPIs, but since we will map the vPEs back at the end of
save_pending_tables and both savings require the kvm->lock to be held
(so they can only happen serially), it will work fine.

Signed-off-by: Shenming Lu 
---
 arch/arm64/kvm/vgic/vgic-v3.c | 61 +++
 1 file changed, 55 insertions(+), 6 deletions(-)

diff --git a/arch/arm64/kvm/vgic/vgic-v3.c b/arch/arm64/kvm/vgic/vgic-v3.c
index 52915b342351..06b1162b7a0a 100644
--- a/arch/arm64/kvm/vgic/vgic-v3.c
+++ b/arch/arm64/kvm/vgic/vgic-v3.c
@@ -1,6 +1,8 @@
 // SPDX-License-Identifier: GPL-2.0-only
 
 #include 
+#include 
+#include 
 #include 
 #include 
 #include 
@@ -356,6 +358,32 @@ int vgic_v3_lpi_sync_pending_status(struct kvm *kvm, 
struct vgic_irq *irq)
return 0;
 }
 
+/*
+ * The deactivation of the doorbell interrupt will trigger the
+ * unmapping of the associated vPE.
+ */
+static void unmap_all_vpes(struct vgic_dist *dist)
+{
+   struct irq_desc *desc;
+   int i;
+
+   for (i = 0; i < dist->its_vm.nr_vpes; i++) {
+   desc = irq_to_desc(dist->its_vm.vpes[i]->irq);
+   irq_domain_deactivate_irq(irq_desc_get_irq_data(desc));
+   }
+}
+
+static void map_all_vpes(struct vgic_dist *dist)
+{
+   struct irq_desc *desc;
+   int i;
+
+   for (i = 0; i < dist->its_vm.nr_vpes; i++) {
+   desc = irq_to_desc(dist->its_vm.vpes[i]->irq);
+   irq_domain_activate_irq(irq_desc_get_irq_data(desc), false);
+   }
+}
+
 /**
  * vgic_v3_save_pending_tables - Save the pending tables into guest RAM
  * kvm lock and all vcpu lock must be held
@@ -365,14 +393,26 @@ int vgic_v3_save_pending_tables(struct kvm *kvm)
struct vgic_dist *dist = &kvm->arch.vgic;
struct vgic_irq *irq;
gpa_t last_ptr = ~(gpa_t)0;
-   int ret;
+   bool vlpi_avail = false;
+   int ret = 0;
u8 val;
 
+   /*
+* As a preparation for getting any VLPI states.
+* The vgic initialized check ensures that the allocation and
+* enabling of the doorbells have already been done.
+*/
+   if (kvm_vgic_global_state.has_gicv4_1 && !WARN_ON(!vgic_initialized(kvm))) {
+   unmap_all_vpes(dist);
+   vlpi_avail = true;
+   }
+
list_for_each_entry(irq, &dist->lpi_list_head, lpi_list) {
int byte_offset, bit_nr;
struct kvm_vcpu *vcpu;
gpa_t pendbase, ptr;
bool stored;
+   bool is_pending = irq->pending_latch;
 
vcpu = irq->target_vcpu;
if (!vcpu)
@@ -387,24 +427,33 @@ int vgic_v3_save_pending_tables(struct kvm *kvm)
if (ptr != last_ptr) {
ret = kvm_read_guest_lock(kvm, ptr, &val, 1);
if (ret)
-   return ret;
+   goto out;
last_ptr = ptr;
}
 
stored = val & (1U << bit_nr);
-   if (stored == irq->pending_latch)
+
+   if (irq->hw && vlpi_avail)
vgic_v4_get_vlpi_state(irq, &is_pending);
+
+   if (stored == is_pending)
continue;
 
-   if (irq->pending_latch)
+   if (is_pending)
val |= 1 << bit_nr;
else
val &= ~(1 << bit_nr);
 
ret = kvm_write_guest_lock(kvm, ptr, &val, 1);
if (ret)
-   return ret;
+   goto out;
}
-   return 0;
+
+out:
+   if (vlpi_avail)
+   map_all_vpes(dist);
+
+   return ret;
 }
 
 /**
-- 
2.19.1



[PATCH v3 0/4] KVM: arm64: Add VLPI migration support on GICv4.1

2021-01-27 Thread Shenming Lu
Hi Marc, sorry for the late commit.

In GICv4.1, migration has been supported except for (directly-injected)
VLPI. And GICv4.1 Spec explicitly gives a way to get the VLPI's pending
state (which was crucially missing in GICv4.0). So we make VLPI migration
capable on GICv4.1 in this patch set.

In order to support VLPI migration, we need to save and restore all
required configuration information and pending states of VLPIs. But
in fact, the configuration information of VLPIs has already been saved
(or will be reallocated on the dst host...) in vgic(kvm) migration.
So we only have to migrate the pending states of VLPIs specially.

Below is the related workflow in migration.

On the save path:
In migration completion:
pause all vCPUs
|
call each VM state change handler:
pause other devices (just keep from sending interrupts, and
such as VFIO migration protocol has already realized it [1])
|
flush ITS tables into guest RAM
|
flush RDIST pending tables (also flush VLPI state here)
|
...
On the resume path:
load each device's state:
restore ITS tables (include pending tables) from guest RAM
|
for other (PCI) devices (paused), if configured to have VLPIs,
establish the forwarding paths of their VLPIs (and transfer
the pending states from kvm's vgic to VPT here)

We have tested this series in VFIO migration, and found some related
issues in QEMU [2].

Links:
[1] vfio: UAPI for migration interface for device state:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=a8a24f3f6e38103b77cf399c38eb54e1219d00d6
[2] vfio: Some fixes and optimizations for VFIO migration:
https://patchwork.ozlabs.org/cover/1413263/

History:

v2 -> v3
 - Add the vgic initialized check to ensure that the allocation and enabling
   of the doorbells have already been done before unmapping the vPEs.
 - Check all get_vlpi_state related conditions in save_pending_tables in one 
place.
 - Nit fixes.

v1 -> v2:
 - Get the VLPI state from the KVM side.
 - Nit fixes.

Thanks,
Shenming


Shenming Lu (3):
  KVM: arm64: GICv4.1: Add function to get VLPI state
  KVM: arm64: GICv4.1: Try to save hw pending state in
save_pending_tables
  KVM: arm64: GICv4.1: Give a chance to save VLPI's pending state

Zenghui Yu (1):
  KVM: arm64: GICv4.1: Restore VLPI's pending state to physical side

 .../virt/kvm/devices/arm-vgic-its.rst |  2 +-
 arch/arm64/kvm/vgic/vgic-its.c|  6 +-
 arch/arm64/kvm/vgic/vgic-v3.c | 61 +--
 arch/arm64/kvm/vgic/vgic-v4.c | 33 ++
 arch/arm64/kvm/vgic/vgic.h|  1 +
 5 files changed, 93 insertions(+), 10 deletions(-)

-- 
2.19.1



[PATCH v3 1/4] KVM: arm64: GICv4.1: Add function to get VLPI state

2021-01-27 Thread Shenming Lu
With GICv4.1 and the vPE unmapped, which indicates the invalidation
of any VPT caches associated with the vPE, we can get the VLPI state
by peeking at the VPT. So we add a function for this.

Signed-off-by: Shenming Lu 
---
 arch/arm64/kvm/vgic/vgic-v4.c | 19 +++
 arch/arm64/kvm/vgic/vgic.h|  1 +
 2 files changed, 20 insertions(+)

diff --git a/arch/arm64/kvm/vgic/vgic-v4.c b/arch/arm64/kvm/vgic/vgic-v4.c
index 66508b03094f..ac029ba3d337 100644
--- a/arch/arm64/kvm/vgic/vgic-v4.c
+++ b/arch/arm64/kvm/vgic/vgic-v4.c
@@ -203,6 +203,25 @@ void vgic_v4_configure_vsgis(struct kvm *kvm)
kvm_arm_resume_guest(kvm);
 }
 
+/*
+ * Must be called with GICv4.1 and the vPE unmapped, which
+ * indicates the invalidation of any VPT caches associated
+ * with the vPE, thus we can get the VLPI state by peeking
+ * at the VPT.
+ */
+void vgic_v4_get_vlpi_state(struct vgic_irq *irq, bool *val)
+{
struct its_vpe *vpe = &irq->target_vcpu->arch.vgic_cpu.vgic_v3.its_vpe;
+   int mask = BIT(irq->intid % BITS_PER_BYTE);
+   void *va;
+   u8 *ptr;
+
+   va = page_address(vpe->vpt_page);
+   ptr = va + irq->intid / BITS_PER_BYTE;
+
+   *val = !!(*ptr & mask);
+}
+
 /**
  * vgic_v4_init - Initialize the GICv4 data structures
  * @kvm:   Pointer to the VM being initialized
diff --git a/arch/arm64/kvm/vgic/vgic.h b/arch/arm64/kvm/vgic/vgic.h
index 64fcd750..d8cfd360838c 100644
--- a/arch/arm64/kvm/vgic/vgic.h
+++ b/arch/arm64/kvm/vgic/vgic.h
@@ -317,5 +317,6 @@ bool vgic_supports_direct_msis(struct kvm *kvm);
 int vgic_v4_init(struct kvm *kvm);
 void vgic_v4_teardown(struct kvm *kvm);
 void vgic_v4_configure_vsgis(struct kvm *kvm);
+void vgic_v4_get_vlpi_state(struct vgic_irq *irq, bool *val);
 
 #endif
-- 
2.19.1



Re: [PATCH v2 4/7] KVM: arm64: Refactor filtering of ID registers

2021-01-27 Thread Alexandru Elisei
Hi Marc,

On 1/25/21 12:26 PM, Marc Zyngier wrote:
> Our current ID register filtering is starting to be a mess of if()
> statements, and isn't going to get any saner.
>
> Let's turn it into a switch(), which has a chance of being more
> readable, and introduce a FEATURE() macro that allows easy generation
> of feature masks.
>
> No functional change intended.
>
> Reviewed-by: Eric Auger 
> Signed-off-by: Marc Zyngier 
> ---
>  arch/arm64/kvm/sys_regs.c | 51 +--
>  1 file changed, 28 insertions(+), 23 deletions(-)
>
> diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
> index 2bea0494b81d..dda16d60197b 100644
> --- a/arch/arm64/kvm/sys_regs.c
> +++ b/arch/arm64/kvm/sys_regs.c
> @@ -9,6 +9,7 @@
>   *  Christoffer Dall 
>   */
>  
> +#include <linux/bitfield.h>
>  #include <linux/bsearch.h>
>  #include <linux/kvm_host.h>
>  #include <linux/mm.h>
> @@ -1016,6 +1017,8 @@ static bool access_arch_timer(struct kvm_vcpu *vcpu,
>   return true;
>  }
>  
> +#define FEATURE(x)   (GENMASK_ULL(x##_SHIFT + 3, x##_SHIFT))
> +
>  /* Read a sanitised cpufeature ID register by sys_reg_desc */
>  static u64 read_id_reg(const struct kvm_vcpu *vcpu,
>   struct sys_reg_desc const *r, bool raz)
> @@ -1024,36 +1027,38 @@ static u64 read_id_reg(const struct kvm_vcpu *vcpu,
>(u32)r->CRn, (u32)r->CRm, (u32)r->Op2);
>   u64 val = raz ? 0 : read_sanitised_ftr_reg(id);
>  
> - if (id == SYS_ID_AA64PFR0_EL1) {
> + switch (id) {
> + case SYS_ID_AA64PFR0_EL1:
>   if (!vcpu_has_sve(vcpu))
> - val &= ~(0xfUL << ID_AA64PFR0_SVE_SHIFT);
> - val &= ~(0xfUL << ID_AA64PFR0_AMU_SHIFT);
> - val &= ~(0xfUL << ID_AA64PFR0_CSV2_SHIFT);
> - val |= ((u64)vcpu->kvm->arch.pfr0_csv2 << 
> ID_AA64PFR0_CSV2_SHIFT);
> - val &= ~(0xfUL << ID_AA64PFR0_CSV3_SHIFT);
> - val |= ((u64)vcpu->kvm->arch.pfr0_csv3 << 
> ID_AA64PFR0_CSV3_SHIFT);
> - } else if (id == SYS_ID_AA64PFR1_EL1) {
> - val &= ~(0xfUL << ID_AA64PFR1_MTE_SHIFT);
> - } else if (id == SYS_ID_AA64ISAR1_EL1 && !vcpu_has_ptrauth(vcpu)) {
> - val &= ~((0xfUL << ID_AA64ISAR1_APA_SHIFT) |
> -  (0xfUL << ID_AA64ISAR1_API_SHIFT) |
> -  (0xfUL << ID_AA64ISAR1_GPA_SHIFT) |
> -  (0xfUL << ID_AA64ISAR1_GPI_SHIFT));
> - } else if (id == SYS_ID_AA64DFR0_EL1) {
> - u64 cap = 0;
> -
> + val &= ~FEATURE(ID_AA64PFR0_SVE);
> + val &= ~FEATURE(ID_AA64PFR0_AMU);
> + val &= ~FEATURE(ID_AA64PFR0_CSV2);
> + val |= FIELD_PREP(FEATURE(ID_AA64PFR0_CSV2), 
> (u64)vcpu->kvm->arch.pfr0_csv2);
> + val &= ~FEATURE(ID_AA64PFR0_CSV3);
> + val |= FIELD_PREP(FEATURE(ID_AA64PFR0_CSV3), 
> (u64)vcpu->kvm->arch.pfr0_csv3);

The patch ends up looking really nice (I checked that the previous
behaviour is preserved). It also makes the code more robust, because it
ensures we only OR in the low 4 bits of pfr0_csv2 and pfr0_csv3 (a
small illustration follows the quoted diff below):

Reviewed-by: Alexandru Elisei 

> + break;
> + case SYS_ID_AA64PFR1_EL1:
> + val &= ~FEATURE(ID_AA64PFR1_MTE);
> + break;
> + case SYS_ID_AA64ISAR1_EL1:
> + if (!vcpu_has_ptrauth(vcpu))
> + val &= ~(FEATURE(ID_AA64ISAR1_APA) |
> +  FEATURE(ID_AA64ISAR1_API) |
> +  FEATURE(ID_AA64ISAR1_GPA) |
> +  FEATURE(ID_AA64ISAR1_GPI));
> + break;
> + case SYS_ID_AA64DFR0_EL1:
>   /* Limit guests to PMUv3 for ARMv8.1 */
> - if (kvm_vcpu_has_pmu(vcpu))
> - cap = ID_AA64DFR0_PMUVER_8_1;
> -
>   val = cpuid_feature_cap_perfmon_field(val,
> - ID_AA64DFR0_PMUVER_SHIFT,
> - cap);
> - } else if (id == SYS_ID_DFR0_EL1) {
> +   ID_AA64DFR0_PMUVER_SHIFT,
> +   kvm_vcpu_has_pmu(vcpu) ? 
> ID_AA64DFR0_PMUVER_8_1 : 0);
> + break;
> + case SYS_ID_DFR0_EL1:
>   /* Limit guests to PMUv3 for ARMv8.1 */
>   val = cpuid_feature_cap_perfmon_field(val,
> ID_DFR0_PERFMON_SHIFT,
> kvm_vcpu_has_pmu(vcpu) ? 
> ID_DFR0_PERFMON_8_1 : 0);
> + break;
>   }
>  
>   return val;
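
To make the masking point above concrete, here is a small illustration
of my own (not part of the patch); get_csv2() is a hypothetical
accessor that returns an over-wide value such as 0x12:

	u64 csv2 = get_csv2();			/* hypothetical; say it returns 0x12 */
	u64 bad  = csv2 << ID_AA64PFR0_CSV2_SHIFT;	/* bit 4 of 0x12 spills into the CSV3 field */
	u64 good = FIELD_PREP(FEATURE(ID_AA64PFR0_CSV2), csv2);	/* clamped to CSV2[3:0], i.e. 0x2 */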


[PATCH v2] KVM: x86/mmu: consider the hva in mmu_notifier retry

2021-01-27 Thread David Stevens
From: David Stevens 

Track the range being invalidated by mmu_notifier and skip page fault
retries if the fault address is not affected by the in-progress
invalidation. Handle concurrent invalidations by finding the minimal
range which includes all ranges being invalidated. Although the combined
range may include unrelated addresses and cannot be shrunk as individual
invalidation operations complete, it is unlikely the marginal gains of
proper range tracking are worth the additional complexity.

The primary benefit of this change is the reduction in the likelihood of
extreme latency when handling a page fault due to another thread having
been preempted while modifying host virtual addresses.

Signed-off-by: David Stevens 
---
v1 -> v2:
 - improve handling of concurrent invalidation requests by unioning
   ranges, instead of just giving up and using [0, ULONG_MAX).
 - add lockdep check
 - code comments and formatting
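
For illustration, a simplified sketch of the unioning described above
(not the patch code; the kvm->mmu_notifier_range_* fields are
illustrative): each invalidation grows a single tracked [start, end)
range, and a page fault only needs to retry if its hva falls inside
that range while an invalidation is in progress:

	/* mmu_notifier invalidate_range_start side, under the lock. */
	if (kvm->mmu_notifier_count == 1) {
		/* First active invalidation: start a fresh range. */
		kvm->mmu_notifier_range_start = start;
		kvm->mmu_notifier_range_end = end;
	} else {
		/* Concurrent invalidations: keep the union of all ranges. */
		kvm->mmu_notifier_range_start =
			min(kvm->mmu_notifier_range_start, start);
		kvm->mmu_notifier_range_end =
			max(kvm->mmu_notifier_range_end, end);
	}

	/* Page fault side: only retry if the faulting hva is covered. */
	if (unlikely(kvm->mmu_notifier_count) &&
	    hva >= kvm->mmu_notifier_range_start &&
	    hva < kvm->mmu_notifier_range_end)
		return RET_PF_RETRY;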

 arch/powerpc/kvm/book3s_64_mmu_hv.c|  2 +-
 arch/powerpc/kvm/book3s_64_mmu_radix.c |  2 +-
 arch/x86/kvm/mmu/mmu.c | 16 --
 arch/x86/kvm/mmu/paging_tmpl.h |  7 ---
 include/linux/kvm_host.h   | 27 +++-
 virt/kvm/kvm_main.c| 29 ++
 6 files changed, 67 insertions(+), 16 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c 
b/arch/powerpc/kvm/book3s_64_mmu_hv.c
index 38ea396a23d6..8e06cd3f759c 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_hv.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c
@@ -590,7 +590,7 @@ int kvmppc_book3s_hv_page_fault(struct kvm_vcpu *vcpu,
} else {
/* Call KVM generic code to do the slow-path check */
pfn = __gfn_to_pfn_memslot(memslot, gfn, false, NULL,
-  writing, &write_ok);
+  writing, &write_ok, NULL);
if (is_error_noslot_pfn(pfn))
return -EFAULT;
page = NULL;
diff --git a/arch/powerpc/kvm/book3s_64_mmu_radix.c 
b/arch/powerpc/kvm/book3s_64_mmu_radix.c
index bb35490400e9..e603de7ade52 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_radix.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_radix.c
@@ -822,7 +822,7 @@ int kvmppc_book3s_instantiate_page(struct kvm_vcpu *vcpu,
 
/* Call KVM generic code to do the slow-path check */
pfn = __gfn_to_pfn_memslot(memslot, gfn, false, NULL,
-  writing, upgrade_p);
+  writing, upgrade_p, NULL);
if (is_error_noslot_pfn(pfn))
return -EFAULT;
page = NULL;
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 6d16481aa29d..79166288ed03 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -3658,8 +3658,8 @@ static bool kvm_arch_setup_async_pf(struct kvm_vcpu 
*vcpu, gpa_t cr2_or_gpa,
 }
 
 static bool try_async_pf(struct kvm_vcpu *vcpu, bool prefault, gfn_t gfn,
-gpa_t cr2_or_gpa, kvm_pfn_t *pfn, bool write,
-bool *writable)
+gpa_t cr2_or_gpa, kvm_pfn_t *pfn, hva_t *hva,
+bool write, bool *writable)
 {
struct kvm_memory_slot *slot = kvm_vcpu_gfn_to_memslot(vcpu, gfn);
bool async;
@@ -3672,7 +3672,8 @@ static bool try_async_pf(struct kvm_vcpu *vcpu, bool 
prefault, gfn_t gfn,
}
 
async = false;
-   *pfn = __gfn_to_pfn_memslot(slot, gfn, false, &async, write, writable);
+   *pfn = __gfn_to_pfn_memslot(slot, gfn, false, &async,
+   write, writable, hva);
if (!async)
return false; /* *pfn has correct page already */
 
@@ -3686,7 +3687,8 @@ static bool try_async_pf(struct kvm_vcpu *vcpu, bool 
prefault, gfn_t gfn,
return true;
}
 
-   *pfn = __gfn_to_pfn_memslot(slot, gfn, false, NULL, write, writable);
+   *pfn = __gfn_to_pfn_memslot(slot, gfn, false, NULL,
+   write, writable, hva);
return false;
 }
 
@@ -3699,6 +3701,7 @@ static int direct_page_fault(struct kvm_vcpu *vcpu, gpa_t 
gpa, u32 error_code,
gfn_t gfn = gpa >> PAGE_SHIFT;
unsigned long mmu_seq;
kvm_pfn_t pfn;
+   hva_t hva;
int r;
 
if (page_fault_handle_page_track(vcpu, error_code, gfn))
@@ -3717,7 +3720,8 @@ static int direct_page_fault(struct kvm_vcpu *vcpu, gpa_t 
gpa, u32 error_code,
mmu_seq = vcpu->kvm->mmu_notifier_seq;
smp_rmb();
 
-   if (try_async_pf(vcpu, prefault, gfn, gpa, &pfn, write, &map_writable))
+   if (try_async_pf(vcpu, prefault, gfn, gpa, &pfn, &hva,
+write, &map_writable))
return RET_PF_RETRY;
 
if (handle_abnormal_pfn(vcpu, is_tdp ? 0 : gpa, gfn, pfn, ACC_ALL, &r))
@@ -3725,7 +3729,7 @@ static int direct_page_fault(struct kvm_vcpu *vcpu, gpa_t 

Re: [PATCH v4 21/21] arm64: cpufeatures: Allow disabling of Pointer Auth from the command-line

2021-01-27 Thread Srinivas Ramana

Hi Marc,

On 1/23/2021 6:28 AM, Catalin Marinas wrote:

On Mon, Jan 18, 2021 at 09:45:33AM +, Marc Zyngier wrote:

In order to be able to disable Pointer Authentication  at runtime,
whether it is for testing purposes, or to work around HW issues,
let's add support for overriding the ID_AA64ISAR1_EL1.{GPI,GPA,API,APA}
fields.

This is further mapped on the arm64.nopauth command-line alias.

Signed-off-by: Marc Zyngier 

Reviewed-by: Catalin Marinas 


Verified this series for the PAC control feature from the command line.
With arm64.nopauth, we could see that pointer authentication is disabled
on both primary and secondary cores, as expected.

HWCAPs show no PAC support and kernel pointer authentication
instructions are treated as NOPs.
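
For anyone wanting to reproduce the HWCAP part of this check from
userspace, a minimal sketch (my example, not part of the series),
assuming an arm64 system with the kernel uapi headers installed:

#include <stdio.h>
#include <sys/auxv.h>
#include <asm/hwcap.h>

int main(void)
{
	unsigned long hwcap = getauxval(AT_HWCAP);

	/* With arm64.nopauth, neither form of authentication should be advertised. */
	printf("PACA: %s, PACG: %s\n",
	       (hwcap & HWCAP_PACA) ? "yes" : "no",
	       (hwcap & HWCAP_PACG) ? "yes" : "no");
	return 0;
}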


Tested-by: Srinivas Ramana 



Re: [PATCH v4 19/21] arm64: cpufeatures: Allow disabling of BTI from the command-line

2021-01-27 Thread Srinivas Ramana

Hi Marc,

On 1/23/2021 6:24 AM, Catalin Marinas wrote:

On Mon, Jan 18, 2021 at 09:45:31AM +, Marc Zyngier wrote:

In order to be able to disable BTI at runtime, whether it is
for testing purposes, or to work around HW issues, let's add
support for overriding the ID_AA64PFR1_EL1.BTI field.

This is further mapped on the arm64.nobti command-line alias.

Signed-off-by: Marc Zyngier 

Reviewed-by: Catalin Marinas 

Verified this series for command-line control of BTI as well.
With arm64.nobti, BTI is disabled on both primary and secondary cores as
expected (verified via the page table entries).
HWCAPs show no BTI support and kernel BTI instructions are treated as
NOPs.
We don't plan to repeat the test on v5 as there are no major changes
there.
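
The BTI advertisement can be checked from userspace in the same way; a
minimal sketch under the same assumptions as the PAC example earlier in
this digest:

#include <stdio.h>
#include <sys/auxv.h>
#include <asm/hwcap.h>

int main(void)
{
	/* With arm64.nobti, HWCAP2_BTI should not be advertised. */
	printf("BTI: %s\n",
	       (getauxval(AT_HWCAP2) & HWCAP2_BTI) ? "yes" : "no");
	return 0;
}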


Tested-by: Srinivas Ramana 
