arm64: Properly handle faulting of device mappings

Alexandru Elisei Wed, 11 Dec 2019 09:54:14 -0800

Hi,

On 12/11/19 4:56 PM, Marc Zyngier wrote:
> A device mapping is normally always mapped at Stage-2, since there
> is very little gain in having it faulted in.
>
> Nonetheless, it is possible to end-up in a situation where the device
> mapping has been removed from Stage-2 (userspace munmaped the VFIO
> region, and the MMU notifier did its job), but present in a userspace
> mapping (userpace has mapped it back at the same address). In such
> a situation, the device mapping will be demand-paged as the guest
> performs memory accesses.
>
> This requires to be careful when dealing with mapping size, cache
> management, and to handle potential execution of a device mapping.
>
> Cc: sta...@vger.kernel.org
> Reported-by: Alexandru Elisei <alexandru.eli...@arm.com>
> Signed-off-by: Marc Zyngier <m...@kernel.org>
> ---
>  virt/kvm/arm/mmu.c | 21 +++++++++++++++++----
>  1 file changed, 17 insertions(+), 4 deletions(-)
>
> diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
> index a48994af70b8..0b32a904a1bb 100644
> --- a/virt/kvm/arm/mmu.c
> +++ b/virt/kvm/arm/mmu.c
> @@ -38,6 +38,11 @@ static unsigned long io_map_base;
>  #define KVM_S2PTE_FLAG_IS_IOMAP              (1UL << 0)
>  #define KVM_S2_FLAG_LOGGING_ACTIVE   (1UL << 1)
>  
> +static bool is_iomap(unsigned long flags)
> +{
> +     return flags & KVM_S2PTE_FLAG_IS_IOMAP;
> +}
> +
>  static bool memslot_is_logging(struct kvm_memory_slot *memslot)
>  {
>       return memslot->dirty_bitmap && !(memslot->flags & KVM_MEM_READONLY);
> @@ -1698,6 +1703,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, 
> phys_addr_t fault_ipa,
>  
>       vma_pagesize = vma_kernel_pagesize(vma);
>       if (logging_active ||
> +         (vma->vm_flags & VM_PFNMAP) ||
>           !fault_supports_stage2_huge_mapping(memslot, hva, vma_pagesize)) {
>               force_pte = true;
>               vma_pagesize = PAGE_SIZE;
> @@ -1760,6 +1766,9 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, 
> phys_addr_t fault_ipa,
>                       writable = false;
>       }
>  
> +     if (exec_fault && is_iomap(flags))
> +             return -ENOEXEC;
> +
>       spin_lock(&kvm->mmu_lock);
>       if (mmu_notifier_retry(kvm, mmu_seq))
>               goto out_unlock;
> @@ -1781,7 +1790,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, 
> phys_addr_t fault_ipa,
>       if (writable)
>               kvm_set_pfn_dirty(pfn);
>  
> -     if (fault_status != FSC_PERM)
> +     if (fault_status != FSC_PERM && !is_iomap(flags))
>               clean_dcache_guest_page(pfn, vma_pagesize);
>  
>       if (exec_fault)
> @@ -1948,9 +1957,8 @@ int kvm_handle_guest_abort(struct kvm_vcpu *vcpu, 
> struct kvm_run *run)
>       if (kvm_is_error_hva(hva) || (write_fault && !writable)) {
>               if (is_iabt) {
>                       /* Prefetch Abort on I/O address */
> -                     kvm_inject_pabt(vcpu, kvm_vcpu_get_hfar(vcpu));
> -                     ret = 1;
> -                     goto out_unlock;
> +                     ret = -ENOEXEC;
> +                     goto out;
>               }
>  
>               /*
> @@ -1992,6 +2000,11 @@ int kvm_handle_guest_abort(struct kvm_vcpu *vcpu, 
> struct kvm_run *run)
>       ret = user_mem_abort(vcpu, fault_ipa, memslot, hva, fault_status);
>       if (ret == 0)
>               ret = 1;
> +out:
> +     if (ret == -ENOEXEC) {
> +             kvm_inject_pabt(vcpu, kvm_vcpu_get_hfar(vcpu));
> +             ret = 1;
> +     }
>  out_unlock:
>       srcu_read_unlock(&vcpu->kvm->srcu, idx);
>       return ret;


I've tested this patch using the scenario that you described in the commit
message. I've also tested it with the following scenarios:

- The region is mmap'ed as backed by a VFIO device fd and the memslot is 
created,
it is munmap'ed, then mmap'ed as anonymous.
- The region is mmap'ed as anonymous and the memslot is created, it is 
munmap'ed,
then mmap'ed as backed by a VFIO device fd.

Everything worked as expected, so:

Tested-by: Alexandru Elisei <alexandru.eli...@arm.com>

Just a nitpick, but stage2_set_pte has a local variable iomap which can be
replaced with a call to is_iomap.

Thanks,
Alex
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

Re: [PATCH 1/3] KVM: arm/arm64: Properly handle faulting of device mappings

Reply via email to