Re: [v2 0/6] KVM: arm64: implement vcpu_is_preempted check

2022-11-10 Thread Punit Agrawal
Hi Usama,

Usama Arif  writes:

> This patchset adds support for vcpu_is_preempted in arm64, which allows the
> guest to check if a vCPU was scheduled out. This is useful to know in case
> the vCPU was holding a lock. vcpu_is_preempted can be used to improve
> performance in locking (see owner_on_cpu usage in mutex_spin_on_owner,
> mutex_can_spin_on_owner, rtmutex_spin_on_owner and osq_lock) and scheduling
> (see available_idle_cpu, which is used in several places in
> kernel/sched/fair.c, e.g. in wake_affine, to determine which CPU can run
> soonest).
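
For reference, the owner-spin paths named above all gate their busy-waiting on
a check of roughly the following shape (a simplified sketch of the kernel's
owner_on_cpu() helper; the exact in-tree form may differ slightly):

static inline bool owner_on_cpu(struct task_struct *owner)
{
	/*
	 * on_cpu says the owner task is currently running on a CPU;
	 * vcpu_is_preempted() additionally reports whether the vCPU
	 * backing that CPU has been scheduled out by the host, in which
	 * case spinning on the owner is wasted work.
	 */
	return READ_ONCE(owner->on_cpu) && !vcpu_is_preempted(task_cpu(owner));
}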
>
> This patchset shows improvement on overcommitted hosts (vCPUs > pCPUs), as
> waiting for preempted vCPUs reduces performance.
>
> This patchset is inspired by the para_steal_clock implementation and by the
> work originally done by Zengruan Ye:
> https://lore.kernel.org/linux-arm-kernel/20191226135833.1052-1-yezengr...@huawei.com/.
>
> All the results in the experiments below were obtained on an AWS r6g.metal
> instance, which has 64 pCPUs.
>
> The following table shows the index results of UnixBench running on a
> 128 vCPU VM with (6.0.0+vcpu_is_preempted) and without (6.0.0 base) the
> patchset.
>
> TestName                               6.0.0 base  6.0.0+vcpu_is_preempted  % improvement for vcpu_is_preempted
> Dhrystone 2 using register variables   187761      191274.7                    1.871368389
> Double-Precision Whetstone             96743.6     98414.4                     1.727039308
> Execl Throughput                       689.3       10426                    1412.548963
> File Copy 1024 bufsize 2000 maxblocks  549.5       3165                      475.978162
> File Copy 256 bufsize 500 maxblocks    400.7       2084.7                    420.2645371
> File Copy 4096 bufsize 8000 maxblocks  894.3       5003.2                    459.4543218
> Pipe Throughput                        76819.5     78601.5                     2.319723508
> Pipe-based Context Switching           3444.8      13414.5                   289.4130283
> Process Creation                       301.1       293.4                      -2.557289937
> Shell Scripts (1 concurrent)           1248.1      28300.6                  2167.494592
> Shell Scripts (8 concurrent)           781.2       26222.3                  3256.669227
> System Call Overhead                   3426        3729.4                      8.855808523
>
> System Benchmarks Index Score          3053        11534                     277.7923354
>
> This shows a 277% overall improvement using these patches.
>
> The biggest improvement is in the shell scripts benchmark, which forks a lot
> of processes. This acquires the rwsem lock, where a large chunk of time is
> spent in the base 6.0.0 kernel. This can be seen in one of the call stacks
> from the perf output of the shell scripts benchmark on 6.0.0 base (pseudo NMI
> enabled for the perf numbers below):
> - 33.79% el0_svc
>- 33.43% do_el0_svc
>   - 33.43% el0_svc_common.constprop.3
>  - 33.30% invoke_syscall
> - 17.27% __arm64_sys_clone
>- 17.27% __do_sys_clone
>   - 17.26% kernel_clone
>  - 16.73% copy_process
> - 11.91% dup_mm
>- 11.82% dup_mmap
>   - 9.15% down_write
>  - 8.87% rwsem_down_write_slowpath
> - 8.48% osq_lock
>
> Just under 50% of the total time in the shell script benchmarks ends up being
> spent in osq_lock in the base 6.0.0 kernel:
>   Children  Self    Command  Shared Object      Symbol
>   17.19%    10.71%  sh       [kernel.kallsyms]  [k] osq_lock
>    6.17%     4.04%  sort     [kernel.kallsyms]  [k] osq_lock
>    4.20%     2.60%  multi.   [kernel.kallsyms]  [k] osq_lock
>    3.77%     2.47%  grep     [kernel.kallsyms]  [k] osq_lock
>    3.50%     2.24%  expr     [kernel.kallsyms]  [k] osq_lock
>    3.41%     2.23%  od       [kernel.kallsyms]  [k] osq_lock
>    3.36%     2.15%  rm       [kernel.kallsyms]  [k] osq_lock
>    3.28%     2.12%  tee      [kernel.kallsyms]  [k] osq_lock
>    3.16%     2.02%  wc       [kernel.kallsyms]  [k] osq_lock
>    0.21%     0.13%  looper   [kernel.kallsyms]  [k] osq_lock
>    0.01%     0.00%  Run      [kernel.kallsyms]  [k] osq_lock
>
> and this comes down to less than 1% in total with the 6.0.0+vcpu_is_preempted kernel:
>   Children  Self    Command  Shared Object      Symbol
>    0.26%    0.21%   sh       [kernel.kallsyms]  [k] osq_lock
>    0.10%    0.08%   multi.   [kernel.kallsyms]  [k] osq_lock
>    0.04%    0.04%   sort     [kernel.kallsyms]  [k] osq_lock
>    0.02%    0.01%   grep     [kernel.kallsyms]  [k] osq_lock
>    0.02%    0.02%   od       [kernel.kallsyms]  [k] osq_lock
>    0.01%    0.01%   tee      [kernel.kallsyms]  [k] osq_lock
>    0.01%    0.00%   expr     [kernel.kallsyms]  [k] osq_lock
>  0.01% 

Re: [v2 3/6] KVM: arm64: Support pvlock preempted via shared structure

2022-11-10 Thread Punit Agrawal
Usama Arif  writes:

> Implement the service call for configuring a shared structure between a
> VCPU and the hypervisor in which the hypervisor can tell whether the
> VCPU is running or not.
>
> The preempted field is zero if the VCPU is not preempted.
> Any other value means the VCPU has been preempted.
>
> Signed-off-by: Zengruan Ye 
> Signed-off-by: Usama Arif 
> ---
>  Documentation/virt/kvm/arm/hypercalls.rst |  3 ++
>  arch/arm64/include/asm/kvm_host.h | 18 ++
>  arch/arm64/include/uapi/asm/kvm.h |  1 +
>  arch/arm64/kvm/Makefile   |  2 +-
>  arch/arm64/kvm/arm.c  |  8 +
>  arch/arm64/kvm/hypercalls.c   |  8 +
>  arch/arm64/kvm/pvlock.c   | 43 +++
>  tools/arch/arm64/include/uapi/asm/kvm.h   |  1 +
>  8 files changed, 83 insertions(+), 1 deletion(-)
>  create mode 100644 arch/arm64/kvm/pvlock.c
>
> diff --git a/Documentation/virt/kvm/arm/hypercalls.rst 
> b/Documentation/virt/kvm/arm/hypercalls.rst
> index 3e23084644ba..872a16226ace 100644
> --- a/Documentation/virt/kvm/arm/hypercalls.rst
> +++ b/Documentation/virt/kvm/arm/hypercalls.rst
> @@ -127,6 +127,9 @@ The pseudo-firmware bitmap register are as follows:
>  Bit-1: KVM_REG_ARM_VENDOR_HYP_BIT_PTP:
>The bit represents the Precision Time Protocol KVM service.
>  
> +Bit-2: KVM_REG_ARM_VENDOR_HYP_BIT_PV_LOCK:
> +  The bit represents the Paravirtualized lock service.
> +
>  Errors:
>  
>  ===  =
> diff --git a/arch/arm64/include/asm/kvm_host.h 
> b/arch/arm64/include/asm/kvm_host.h
> index 45e2136322ba..18303b30b7e9 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -417,6 +417,11 @@ struct kvm_vcpu_arch {
>   u64 last_steal;
>   gpa_t base;
>   } steal;
> +
> + /* Guest PV lock state */
> + struct {
> + gpa_t base;
> + } pv;

Using "pv" for the structure isn't quite describing the usage well. It'd
be better to call it "pv_lock" or "pvlock" at the least.

[...]

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [v2 2/6] KVM: arm64: Add SMCCC paravirtualised lock calls

2022-11-10 Thread Punit Agrawal
Usama Arif  writes:

> Add a new SMCCC compatible hypercall for PV lock features:
>   ARM_SMCCC_KVM_FUNC_PV_LOCK:   0xC602
>
> Also add the header file which defines the ABI for the paravirtualized
> lock features we're about to add.
>
> Signed-off-by: Zengruan Ye 
> Signed-off-by: Usama Arif 
> ---
>  arch/arm64/include/asm/pvlock-abi.h | 17 +
>  include/linux/arm-smccc.h   |  8 
>  tools/include/linux/arm-smccc.h |  8 
>  3 files changed, 33 insertions(+)
>  create mode 100644 arch/arm64/include/asm/pvlock-abi.h
>
> diff --git a/arch/arm64/include/asm/pvlock-abi.h 
> b/arch/arm64/include/asm/pvlock-abi.h
> new file mode 100644
> index ..3f4574071679
> --- /dev/null
> +++ b/arch/arm64/include/asm/pvlock-abi.h
> @@ -0,0 +1,17 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Copyright(c) 2019 Huawei Technologies Co., Ltd
> + * Author: Zengruan Ye 
> + * Usama Arif 
> + */
> +
> +#ifndef __ASM_PVLOCK_ABI_H
> +#define __ASM_PVLOCK_ABI_H
> +
> +struct pvlock_vcpu_state {
> + __le64 preempted;
> + /* Structure must be 64 byte aligned, pad to that size */
> + u8 padding[56];
> +} __packed;

For structure alignment, I'd have expected to see the "aligned" attribute
used. Is there any benefit to using padding to achieve the alignment?
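
For comparison, the attribute variant would look roughly like the sketch
below (illustrative only; the structure name is taken from the patch, and the
__packed/padding pair is replaced by the attribute):

/*
 * Alternative to manual padding: let the compiler enforce the alignment,
 * which also rounds sizeof(struct pvlock_vcpu_state) up to 64 bytes.
 */
struct pvlock_vcpu_state {
	__le64 preempted;
} __aligned(64);

Manual padding spells the 64-byte footprint out in the layout itself, whereas
the attribute expresses it as a constraint; either way the guest and
hypervisor have to agree on the 64-byte stride.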

[...]
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [v2 1/6] KVM: arm64: Document PV-lock interface

2022-11-10 Thread Punit Agrawal
Hi Usama,

Usama Arif  writes:

> Introduce a paravirtualization interface for KVM/arm64 that lets the guest
> find out whether a VCPU is currently running or not.
>
> The PV lock structure of the guest is allocated by user space.
>
> A hypercall interface is provided for the guest to interrogate the
> location of the shared memory structures.
>
> Signed-off-by: Zengruan Ye 
> Signed-off-by: Usama Arif 
> ---
>  Documentation/virt/kvm/arm/index.rst|  1 +
>  Documentation/virt/kvm/arm/pvlock.rst   | 52 +
>  Documentation/virt/kvm/devices/vcpu.rst | 25 
>  3 files changed, 78 insertions(+)
>  create mode 100644 Documentation/virt/kvm/arm/pvlock.rst
>
> diff --git a/Documentation/virt/kvm/arm/index.rst 
> b/Documentation/virt/kvm/arm/index.rst
> index e84848432158..b8499dc00a6a 100644
> --- a/Documentation/virt/kvm/arm/index.rst
> +++ b/Documentation/virt/kvm/arm/index.rst
> @@ -10,4 +10,5 @@ ARM
> hyp-abi
> hypercalls
> pvtime
> +   pvlock
> ptp_kvm
> diff --git a/Documentation/virt/kvm/arm/pvlock.rst 
> b/Documentation/virt/kvm/arm/pvlock.rst
> new file mode 100644
> index ..d3c391b16d36
> --- /dev/null
> +++ b/Documentation/virt/kvm/arm/pvlock.rst
> @@ -0,0 +1,52 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +Paravirtualized lock support for arm64
> +==
> +
> +KVM/arm64 provides a hypervisor service call for paravirtualized guests to
> +determine whether a VCPU is currently running or not.
> +
> +A new SMCCC compatible hypercall is defined:
> +
> +* ARM_SMCCC_VENDOR_HYP_KVM_PV_LOCK_FUNC_ID:   0xC602
> +
> +ARM_SMCCC_VENDOR_HYP_KVM_PV_LOCK_FUNC_ID
> +
> += ==
> +Function ID:  (uint32)0xC602
> +Return value: (int64) IPA of the pv lock data structure for this
> +  VCPU. On failure:
> +  NOT_SUPPORTED (-1)
> += ==
> +
> +The IPA returned by PV_LOCK_PREEMPTED should be mapped by the guest as normal
> +memory with inner and outer write back caching attributes, in the inner
> +shareable domain.
> +
> +PV_LOCK_PREEMPTED returns the structure for the calling VCPU.
> +
> +PV lock state
> +-
> +
> +The structure pointed to by the PV_LOCK_PREEMPTED hypercall is as follows:
> +
> ++---+-+-+-+
> +| Field | Byte Length | Byte Offset | Description |
> ++===+=+=+=+
> +| preempted |  8  |  0  | Indicate if the VCPU that owns  |
> +|   | | | this struct is running or not.  |
> +|   | | | Non-zero values mean the VCPU   |
> +|   | | | has been preempted. Zero means  |
> +|   | | | the VCPU is not preempted.  |
> ++---+-+-+-+
> +
> +The preempted field will be updated to 1 by the hypervisor prior to 
> scheduling
> +a VCPU. When the VCPU is scheduled out, the preempted field will be updated
> +to 0 by the hypervisor.

The text above doesn't match the description in the table. Please update the
text so that both agree with the behaviour implemented in the code.
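
As an aside, once the structure advertised by the hypercall is mapped, the
guest-side check it enables reduces to something like the following sketch
(the per-CPU variable and helper names are illustrative, not the series'
exact ones):

/*
 * Illustrative guest-side sketch. Each CPU's pointer is assumed to have
 * been initialised from the IPA returned by
 * ARM_SMCCC_VENDOR_HYP_KVM_PV_LOCK_FUNC_ID and mapped as described above.
 */
static DEFINE_PER_CPU(struct pvlock_vcpu_state *, pvlock_vcpu_state_ptr);

static bool pv_vcpu_is_preempted(unsigned int cpu)
{
	struct pvlock_vcpu_state *st = per_cpu(pvlock_vcpu_state_ptr, cpu);

	/* Non-zero means the hypervisor has scheduled this vCPU out. */
	return st && !!le64_to_cpu(READ_ONCE(st->preempted));
}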

[...]

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [PATCH v2] KVM: arm64: Try PMD block mappings if PUD mappings are not supported

2020-09-15 Thread Punit Agrawal
Hi Alex,

Alexandru Elisei  writes:

> Hi Punit,
>
> Thank you for having a look!
>
> On 9/11/20 9:34 AM, Punit Agrawal wrote:
>> Hi Alexandru,
>>
>> Alexandru Elisei  writes:
>>
>>> When userspace uses hugetlbfs for the VM memory, user_mem_abort() tries to
>>> use the same block size to map the faulting IPA in stage 2. If stage 2
>>> cannot use the same block mapping because the block size doesn't fit in the
>>> memslot or the memslot is not properly aligned, user_mem_abort() will fall
>>> back to a page mapping, regardless of the block size. We can do better for
>>> PUD backed hugetlbfs by checking if a PMD block mapping is supported before
>>> deciding to use a page.
>> I think this was discussed in the past.
>>
>> I have a vague recollection of there being a problem if the user and
>> stage 2 mappings go out of sync - can't recall the exact details.
>
> I'm not sure what you mean by the two tables going out of sync. I'm looking at
> Documentation/vm/unevictable-lru.rst and this is what it says regarding 
> hugetlbfs:
>
> "VMAs mapping hugetlbfs page are already effectively pinned into memory.  We
> neither need nor want to mlock() these pages.  However, to preserve the prior
> behavior of mlock() - before the unevictable/mlock changes - mlock_fixup() 
> will
> call make_pages_present() in the hugetlbfs VMA range to allocate the huge 
> pages
> and populate the ptes."
>
> Please correct me if I'm wrong, but my interpretation is that once a hugetlbfs
> page has been mapped in a process' address space, the only way to unmap it is 
> via
> munmap. If that's the case, the KVM mmu notifier should take care of unmapping
> from stage 2 the entire memory range addressed by the hugetlbfs pages,
> right?

You're right - I managed to confuse myself. Thinking about it with a bit
more context, I don't see a problem with what the patch is doing.

Apologies for the noise.

>>
>> Putting it out there in case anybody else on the thread can recall the
>> details of the previous discussion (offlist).
>>
>> Though things may have changed and if it passes testing - then maybe I
>> am mis-remembering. I'll take a closer look at the patch and shout out
>> if I notice anything.
>
> The test I ran was to boot a VM and run ltp (with printk's sprinkled in the 
> host
> kernel to see what page size and where it gets mapped/unmapped at stage 2). 
> Do you
> mind recommending other tests that I might run?

You may want to put the changes through VM save / restore and / or live
migration. It should help catch any issues with transitioning from
hugepages to regular pages.

Hope that helps.

Thanks,
Punit

[...]

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [PATCH v2] KVM: arm64: Try PMD block mappings if PUD mappings are not supported

2020-09-11 Thread Punit Agrawal
Hi Alexandru,

Alexandru Elisei  writes:

> When userspace uses hugetlbfs for the VM memory, user_mem_abort() tries to
> use the same block size to map the faulting IPA in stage 2. If stage 2
> cannot use the same block mapping because the block size doesn't fit in the
> memslot or the memslot is not properly aligned, user_mem_abort() will fall
> back to a page mapping, regardless of the block size. We can do better for
> PUD backed hugetlbfs by checking if a PMD block mapping is supported before
> deciding to use a page.

I think this was discussed in the past.

I have a vague recollection of there being a problem if the user and
stage 2 mappings go out of sync - can't recall the exact details.

Putting it out there in case anybody else on the thread can recall the
details of the previous discussion (offlist).

Though things may have changed and if it passes testing - then maybe I
am mis-remembering. I'll take a closer look at the patch and shout out
if I notice anything.

Thanks,
Punit

>
> vma_pagesize is an unsigned long, use 1UL instead of 1ULL when assigning
> its value.
>
> Signed-off-by: Alexandru Elisei 
> ---
> Tested on a rockpro64 with 4K pages and hugetlbfs hugepagesz=1G (PUD sized
> block mappings).  First test, guest RAM starts at 0x8100 
> (memslot->base_gfn not aligned to 1GB); second test, guest RAM starts at
> 0x8000 , but is only 512 MB.  In both cases using PUD mappings is not
> possible because either the memslot base address is not aligned, or the
> mapping would extend beyond the memslot.
>
> Without the changes, user_mem_abort() uses 4K pages to map the guest IPA.
> With the patches, user_mem_abort() uses PMD block mappings (2MB) to map the
> guest RAM, which means less TLB pressure and fewer stage 2 aborts.
>
> Changes since v1 [1]:
> - Rebased on top of Will's stage 2 page table handling rewrite, version 4
>   of the series [2]. His series is missing the patch "KVM: arm64: Update
>   page shift if stage 2 block mapping not supported" and there might be a
>   conflict (it's straightforward to fix).
>
> [1] https://www.spinics.net/lists/arm-kernel/msg834015.html
> [2] https://www.spinics.net/lists/arm-kernel/msg835806.html
>
>  arch/arm64/kvm/mmu.c | 19 ++-
>  1 file changed, 14 insertions(+), 5 deletions(-)
>
> diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
> index 1041be1fafe4..39c539d4d4cb 100644
> --- a/arch/arm64/kvm/mmu.c
> +++ b/arch/arm64/kvm/mmu.c
> @@ -776,16 +776,25 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, 
> phys_addr_t fault_ipa,
>   else
>   vma_shift = PAGE_SHIFT;
>  
> - vma_pagesize = 1ULL << vma_shift;
>   if (logging_active ||
> - (vma->vm_flags & VM_PFNMAP) ||
> - !fault_supports_stage2_huge_mapping(memslot, hva, vma_pagesize)) {
> + (vma->vm_flags & VM_PFNMAP)) {
>   force_pte = true;
> - vma_pagesize = PAGE_SIZE;
> + vma_shift = PAGE_SHIFT;
> + }
> +
> + if (vma_shift == PUD_SHIFT &&
> + !fault_supports_stage2_huge_mapping(memslot, hva, PUD_SIZE))
> + vma_shift = PMD_SHIFT;
> +
> + if (vma_shift == PMD_SHIFT &&
> + !fault_supports_stage2_huge_mapping(memslot, hva, PMD_SIZE)) {
> + force_pte = true;
> + vma_shift = PAGE_SHIFT;
>   }
>  
> + vma_pagesize = 1UL << vma_shift;
>   if (vma_pagesize == PMD_SIZE || vma_pagesize == PUD_SIZE)
> - fault_ipa &= huge_page_mask(hstate_vma(vma));
> + fault_ipa &= ~(vma_pagesize - 1);
>  
>   gfn = fault_ipa >> PAGE_SHIFT;
>   mmap_read_unlock(current->mm);
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [PATCH 7/9] KVM: arm64: Do not try to map PUDs when they are folded into PMD

2020-09-09 Thread Punit Agrawal
Hi Marc,

Noticed this patch while catching up with the lists.

Marc Zyngier  writes:

> For the obscure cases where PMD and PUD are the same size
> (64kB pages with 42bit VA, for example, which results in only
> two levels of page tables), we can't map anything as a PUD,
> because there is... erm... no PUD to speak of. Everything is
> either a PMD or a PTE.
>
> So let's only try and map a PUD when its size is different from
> that of a PMD.
>
> Cc: sta...@vger.kernel.org
> Fixes: b8e0ba7c8bea ("KVM: arm64: Add support for creating PUD hugepages at 
> stage 2")
> Reported-by: Gavin Shan 
> Reported-by: Eric Auger 
> Reviewed-by: Alexandru Elisei 
> Reviewed-by: Gavin Shan 
> Tested-by: Gavin Shan 
> Tested-by: Eric Auger 
> Tested-by: Alexandru Elisei 
> Signed-off-by: Marc Zyngier 
> ---
>  arch/arm64/kvm/mmu.c | 7 ++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
> index 0121ef2c7c8d..16b8660ddbcc 100644
> --- a/arch/arm64/kvm/mmu.c
> +++ b/arch/arm64/kvm/mmu.c
> @@ -1964,7 +1964,12 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, 
> phys_addr_t fault_ipa,
>   (fault_status == FSC_PERM &&
>stage2_is_exec(mmu, fault_ipa, vma_pagesize));
>  
> - if (vma_pagesize == PUD_SIZE) {
> + /*
> +  * If PUD_SIZE == PMD_SIZE, there is no real PUD level, and
> +  * all we have is a 2-level page table. Trying to map a PUD in
> +  * this case would be fatally wrong.
> +  */
> + if (PUD_SIZE != PMD_SIZE && vma_pagesize == PUD_SIZE) {
>   pud_t new_pud = kvm_pfn_pud(pfn, mem_type);
>  
>   new_pud = kvm_pud_mkhuge(new_pud);

Good catch!
Missed the 64kb / 42b VA case while adding the initial support.

Thanks for fixing it.

Punit
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [PATCH v10 0/8] kvm: arm64: Support PUD hugepage at stage 2

2019-01-03 Thread Punit Agrawal
Christoffer Dall  writes:

> On Tue, Dec 11, 2018 at 05:10:33PM +, Suzuki K Poulose wrote:
>> This series is an update to the PUD hugepage support previously posted
>> at [0]. This patchset adds support for PUD hugepages at stage 2, a
>> feature that is useful on cores that have support for large sized TLB
>> mappings (e.g., 1GB for 4K granule).
>> 
>> The patches are based on v4.20-rc4
>> 
>> The patches have been tested on AMD Seattle system with the following
>> hugepage sizes - 2M and 1G.
>> 
>> Right now the PUD hugepage for stage2 is only supported if the stage2
>> has 4 levels. i.e, with an IPA size of minimum 44bits with 4K pages.
>> This could be relaxed to stage2 with 3 levels, with the stage1 PUD huge
>> page mapped in the entry level of the stage2 (i.e, pgd). I have not
>> added the change here to keep this version stable w.r.t the previous
>> version. I could post a patch later after further discussions in the
>> list.
>> 
>
> For the series:
>
> Reviewed-by: Christoffer Dall 

Thanks a lot for reviewing the patches and the tag. And to Suzuki for
picking up the patchset.

(I was happy to see this while catching up with the lists after an
extended break!)
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[PATCH v9 8/8] KVM: arm64: Add support for creating PUD hugepages at stage 2

2018-10-31 Thread Punit Agrawal
KVM only supports PMD hugepages at stage 2. Now that the various page
handling routines are updated, extend the stage 2 fault handling to
map in PUD hugepages.

Addition of PUD hugepage support enables additional page sizes (e.g.,
1G with 4K granule) which can be useful on cores that support mapping
larger block sizes in the TLB entries.

Signed-off-by: Punit Agrawal 
Reviewed-by: Suzuki Poulose 
Cc: Christoffer Dall 
Cc: Marc Zyngier 
Cc: Russell King 
Cc: Catalin Marinas 
Cc: Will Deacon 
---
 arch/arm/include/asm/kvm_mmu.h |  20 +
 arch/arm/include/asm/stage2_pgtable.h  |   5 ++
 arch/arm64/include/asm/kvm_mmu.h   |  16 
 arch/arm64/include/asm/pgtable-hwdef.h |   2 +
 arch/arm64/include/asm/pgtable.h   |   2 +
 virt/kvm/arm/mmu.c | 104 +++--
 6 files changed, 143 insertions(+), 6 deletions(-)

diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
index e62f0913ce7d..6336319a0d5b 100644
--- a/arch/arm/include/asm/kvm_mmu.h
+++ b/arch/arm/include/asm/kvm_mmu.h
@@ -84,11 +84,14 @@ void kvm_clear_hyp_idmap(void);
 
 #define kvm_pfn_pte(pfn, prot) pfn_pte(pfn, prot)
 #define kvm_pfn_pmd(pfn, prot) pfn_pmd(pfn, prot)
+#define kvm_pfn_pud(pfn, prot) (__pud(0))
 
 #define kvm_pud_pfn(pud)   ({ BUG(); 0; })
 
 
 #define kvm_pmd_mkhuge(pmd)pmd_mkhuge(pmd)
+/* No support for pud hugepages */
+#define kvm_pud_mkhuge(pud)( {BUG(); pud; })
 
 /*
  * The following kvm_*pud*() functions are provided strictly to allow
@@ -105,6 +108,23 @@ static inline bool kvm_s2pud_readonly(pud_t *pud)
return false;
 }
 
+static inline void kvm_set_pud(pud_t *pud, pud_t new_pud)
+{
+   BUG();
+}
+
+static inline pud_t kvm_s2pud_mkwrite(pud_t pud)
+{
+   BUG();
+   return pud;
+}
+
+static inline pud_t kvm_s2pud_mkexec(pud_t pud)
+{
+   BUG();
+   return pud;
+}
+
 static inline bool kvm_s2pud_exec(pud_t *pud)
 {
BUG();
diff --git a/arch/arm/include/asm/stage2_pgtable.h 
b/arch/arm/include/asm/stage2_pgtable.h
index f6a7ea805232..f9017167a8d1 100644
--- a/arch/arm/include/asm/stage2_pgtable.h
+++ b/arch/arm/include/asm/stage2_pgtable.h
@@ -68,4 +68,9 @@ stage2_pmd_addr_end(struct kvm *kvm, phys_addr_t addr, 
phys_addr_t end)
 #define stage2_pmd_table_empty(kvm, pmdp)  kvm_page_empty(pmdp)
 #define stage2_pud_table_empty(kvm, pudp)  false
 
+static inline bool kvm_stage2_has_pud(struct kvm *kvm)
+{
+   return false;
+}
+
 #endif /* __ARM_S2_PGTABLE_H_ */
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index 9f941f70775c..8af4b1befa42 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -184,12 +184,16 @@ void kvm_clear_hyp_idmap(void);
 #define kvm_mk_pgd(pudp)   \
__pgd(__phys_to_pgd_val(__pa(pudp)) | PUD_TYPE_TABLE)
 
+#define kvm_set_pud(pudp, pud) set_pud(pudp, pud)
+
 #define kvm_pfn_pte(pfn, prot) pfn_pte(pfn, prot)
 #define kvm_pfn_pmd(pfn, prot) pfn_pmd(pfn, prot)
+#define kvm_pfn_pud(pfn, prot) pfn_pud(pfn, prot)
 
 #define kvm_pud_pfn(pud)   pud_pfn(pud)
 
 #define kvm_pmd_mkhuge(pmd)pmd_mkhuge(pmd)
+#define kvm_pud_mkhuge(pud)pud_mkhuge(pud)
 
 static inline pte_t kvm_s2pte_mkwrite(pte_t pte)
 {
@@ -203,6 +207,12 @@ static inline pmd_t kvm_s2pmd_mkwrite(pmd_t pmd)
return pmd;
 }
 
+static inline pud_t kvm_s2pud_mkwrite(pud_t pud)
+{
+   pud_val(pud) |= PUD_S2_RDWR;
+   return pud;
+}
+
 static inline pte_t kvm_s2pte_mkexec(pte_t pte)
 {
pte_val(pte) &= ~PTE_S2_XN;
@@ -215,6 +225,12 @@ static inline pmd_t kvm_s2pmd_mkexec(pmd_t pmd)
return pmd;
 }
 
+static inline pud_t kvm_s2pud_mkexec(pud_t pud)
+{
+   pud_val(pud) &= ~PUD_S2_XN;
+   return pud;
+}
+
 static inline void kvm_set_s2pte_readonly(pte_t *ptep)
 {
pteval_t old_pteval, pteval;
diff --git a/arch/arm64/include/asm/pgtable-hwdef.h 
b/arch/arm64/include/asm/pgtable-hwdef.h
index 336e24cddc87..6f1c187f1c86 100644
--- a/arch/arm64/include/asm/pgtable-hwdef.h
+++ b/arch/arm64/include/asm/pgtable-hwdef.h
@@ -193,6 +193,8 @@
 #define PMD_S2_RDWR(_AT(pmdval_t, 3) << 6)   /* HAP[2:1] */
 #define PMD_S2_XN  (_AT(pmdval_t, 2) << 53)  /* XN[1:0] */
 
+#define PUD_S2_RDONLY  (_AT(pudval_t, 1) << 6)   /* HAP[2:1] */
+#define PUD_S2_RDWR(_AT(pudval_t, 3) << 6)   /* HAP[2:1] */
 #define PUD_S2_XN  (_AT(pudval_t, 2) << 53)  /* XN[1:0] */
 
 /*
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index bb0f3f17a7a9..576128635f3c 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -390,6 +390,8 @@ static inline int pmd_protnone(pmd_t pmd)
 #define pud_mkyoung(pud)   pte_pud(pte_mkyoung(pud_pte(pud)))
 #define pud_write(pud) pte_write(p

[PATCH v9 5/8] KVM: arm64: Support PUD hugepage in stage2_is_exec()

2018-10-31 Thread Punit Agrawal
In preparation for creating PUD hugepages at stage 2, add support for
detecting execute permissions on PUD page table entries. Faults due to
lack of execute permissions on page table entries is used to perform
i-cache invalidation on first execute.

Provide trivial implementations of arm32 helpers to allow sharing of
code.

Signed-off-by: Punit Agrawal 
Reviewed-by: Suzuki K Poulose 
Cc: Christoffer Dall 
Cc: Marc Zyngier 
Cc: Russell King 
Cc: Catalin Marinas 
Cc: Will Deacon 
---
 arch/arm/include/asm/kvm_mmu.h |  6 +++
 arch/arm64/include/asm/kvm_mmu.h   |  5 +++
 arch/arm64/include/asm/pgtable-hwdef.h |  2 +
 virt/kvm/arm/mmu.c | 53 +++---
 4 files changed, 61 insertions(+), 5 deletions(-)

diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
index 37bf85d39607..839a619873d3 100644
--- a/arch/arm/include/asm/kvm_mmu.h
+++ b/arch/arm/include/asm/kvm_mmu.h
@@ -102,6 +102,12 @@ static inline bool kvm_s2pud_readonly(pud_t *pud)
return false;
 }
 
+static inline bool kvm_s2pud_exec(pud_t *pud)
+{
+   BUG();
+   return false;
+}
+
 static inline pte_t kvm_s2pte_mkwrite(pte_t pte)
 {
pte_val(pte) |= L_PTE_S2_RDWR;
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index 8da6d1b2a196..c755b37b3f92 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -261,6 +261,11 @@ static inline bool kvm_s2pud_readonly(pud_t *pudp)
return kvm_s2pte_readonly((pte_t *)pudp);
 }
 
+static inline bool kvm_s2pud_exec(pud_t *pudp)
+{
+   return !(READ_ONCE(pud_val(*pudp)) & PUD_S2_XN);
+}
+
 #define hyp_pte_table_empty(ptep) kvm_page_empty(ptep)
 
 #ifdef __PAGETABLE_PMD_FOLDED
diff --git a/arch/arm64/include/asm/pgtable-hwdef.h 
b/arch/arm64/include/asm/pgtable-hwdef.h
index 1d7d8da2ef9b..336e24cddc87 100644
--- a/arch/arm64/include/asm/pgtable-hwdef.h
+++ b/arch/arm64/include/asm/pgtable-hwdef.h
@@ -193,6 +193,8 @@
 #define PMD_S2_RDWR(_AT(pmdval_t, 3) << 6)   /* HAP[2:1] */
 #define PMD_S2_XN  (_AT(pmdval_t, 2) << 53)  /* XN[1:0] */
 
+#define PUD_S2_XN  (_AT(pudval_t, 2) << 53)  /* XN[1:0] */
+
 /*
  * Memory Attribute override for Stage-2 (MemAttr[3:0])
  */
diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index 1c669c3c1208..8e44dccd1b47 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -1083,23 +1083,66 @@ static int stage2_set_pmd_huge(struct kvm *kvm, struct 
kvm_mmu_memory_cache
return 0;
 }
 
-static bool stage2_is_exec(struct kvm *kvm, phys_addr_t addr)
+/*
+ * stage2_get_leaf_entry - walk the stage2 VM page tables and return
+ * true if a valid and present leaf-entry is found. A pointer to the
+ * leaf-entry is returned in the appropriate level variable - pudpp,
+ * pmdpp, ptepp.
+ */
+static bool stage2_get_leaf_entry(struct kvm *kvm, phys_addr_t addr,
+ pud_t **pudpp, pmd_t **pmdpp, pte_t **ptepp)
 {
+   pud_t *pudp;
pmd_t *pmdp;
pte_t *ptep;
 
-   pmdp = stage2_get_pmd(kvm, NULL, addr);
+   *pudpp = NULL;
+   *pmdpp = NULL;
+   *ptepp = NULL;
+
+   pudp = stage2_get_pud(kvm, NULL, addr);
+   if (!pudp || stage2_pud_none(kvm, *pudp) || !stage2_pud_present(kvm, 
*pudp))
+   return false;
+
+   if (stage2_pud_huge(kvm, *pudp)) {
+   *pudpp = pudp;
+   return true;
+   }
+
+   pmdp = stage2_pmd_offset(kvm, pudp, addr);
if (!pmdp || pmd_none(*pmdp) || !pmd_present(*pmdp))
return false;
 
-   if (pmd_thp_or_huge(*pmdp))
-   return kvm_s2pmd_exec(pmdp);
+   if (pmd_thp_or_huge(*pmdp)) {
+   *pmdpp = pmdp;
+   return true;
+   }
 
ptep = pte_offset_kernel(pmdp, addr);
if (!ptep || pte_none(*ptep) || !pte_present(*ptep))
return false;
 
-   return kvm_s2pte_exec(ptep);
+   *ptepp = ptep;
+   return true;
+}
+
+static bool stage2_is_exec(struct kvm *kvm, phys_addr_t addr)
+{
+   pud_t *pudp;
+   pmd_t *pmdp;
+   pte_t *ptep;
+   bool found;
+
+   found = stage2_get_leaf_entry(kvm, addr, &pudp, &pmdp, &ptep);
+   if (!found)
+   return false;
+
+   if (pudp)
+   return kvm_s2pud_exec(pudp);
+   else if (pmdp)
+   return kvm_s2pmd_exec(pmdp);
+   else
+   return kvm_s2pte_exec(ptep);
 }
 
 static int stage2_set_pte(struct kvm *kvm, struct kvm_mmu_memory_cache *cache,
-- 
2.19.1

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[PATCH v9 4/8] KVM: arm64: Support dirty page tracking for PUD hugepages

2018-10-31 Thread Punit Agrawal
In preparation for creating PUD hugepages at stage 2, add support for
write protecting PUD hugepages when they are encountered. Write
protecting guest tables is used to track dirty pages when migrating
VMs.

Also, provide trivial implementations of required kvm_s2pud_* helpers
to allow sharing of code with arm32.

Signed-off-by: Punit Agrawal 
Reviewed-by: Christoffer Dall 
Reviewed-by: Suzuki K Poulose 
Cc: Marc Zyngier 
Cc: Russell King 
Cc: Catalin Marinas 
Cc: Will Deacon 
---
 arch/arm/include/asm/kvm_mmu.h   | 15 +++
 arch/arm64/include/asm/kvm_mmu.h | 10 ++
 virt/kvm/arm/mmu.c   | 11 +++
 3 files changed, 32 insertions(+), 4 deletions(-)

diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
index e6eff8bf5d7f..37bf85d39607 100644
--- a/arch/arm/include/asm/kvm_mmu.h
+++ b/arch/arm/include/asm/kvm_mmu.h
@@ -87,6 +87,21 @@ void kvm_clear_hyp_idmap(void);
 
 #define kvm_pmd_mkhuge(pmd)pmd_mkhuge(pmd)
 
+/*
+ * The following kvm_*pud*() functions are provided strictly to allow
+ * sharing code with arm64. They should never be called in practice.
+ */
+static inline void kvm_set_s2pud_readonly(pud_t *pud)
+{
+   BUG();
+}
+
+static inline bool kvm_s2pud_readonly(pud_t *pud)
+{
+   BUG();
+   return false;
+}
+
 static inline pte_t kvm_s2pte_mkwrite(pte_t pte)
 {
pte_val(pte) |= L_PTE_S2_RDWR;
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index 13d482710292..8da6d1b2a196 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -251,6 +251,16 @@ static inline bool kvm_s2pmd_exec(pmd_t *pmdp)
return !(READ_ONCE(pmd_val(*pmdp)) & PMD_S2_XN);
 }
 
+static inline void kvm_set_s2pud_readonly(pud_t *pudp)
+{
+   kvm_set_s2pte_readonly((pte_t *)pudp);
+}
+
+static inline bool kvm_s2pud_readonly(pud_t *pudp)
+{
+   return kvm_s2pte_readonly((pte_t *)pudp);
+}
+
 #define hyp_pte_table_empty(ptep) kvm_page_empty(ptep)
 
 #ifdef __PAGETABLE_PMD_FOLDED
diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index fb5325f7a1ac..1c669c3c1208 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -1347,9 +1347,12 @@ static void  stage2_wp_puds(struct kvm *kvm, pgd_t *pgd,
do {
next = stage2_pud_addr_end(kvm, addr, end);
if (!stage2_pud_none(kvm, *pud)) {
-   /* TODO:PUD not supported, revisit later if supported */
-   BUG_ON(stage2_pud_huge(kvm, *pud));
-   stage2_wp_pmds(kvm, pud, addr, next);
+   if (stage2_pud_huge(kvm, *pud)) {
+   if (!kvm_s2pud_readonly(pud))
+   kvm_set_s2pud_readonly(pud);
+   } else {
+   stage2_wp_pmds(kvm, pud, addr, next);
+   }
}
} while (pud++, addr = next, addr != end);
 }
@@ -1392,7 +1395,7 @@ static void stage2_wp_range(struct kvm *kvm, phys_addr_t 
addr, phys_addr_t end)
  *
  * Called to start logging dirty pages after memory region
  * KVM_MEM_LOG_DIRTY_PAGES operation is called. After this function returns
- * all present PMD and PTEs are write protected in the memory region.
+ * all present PUD, PMD and PTEs are write protected in the memory region.
  * Afterwards read of dirty page log can be called.
  *
  * Acquires kvm_mmu_lock. Called with kvm->slots_lock mutex acquired,
-- 
2.19.1

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[PATCH v9 7/8] KVM: arm64: Update age handlers to support PUD hugepages

2018-10-31 Thread Punit Agrawal
In preparation for creating larger hugepages at Stage 2, add support
to the age handling notifiers for PUD hugepages when encountered.

Provide trivial helpers for arm32 to allow sharing code.

Signed-off-by: Punit Agrawal 
Reviewed-by: Suzuki K Poulose 
Cc: Christoffer Dall 
Cc: Marc Zyngier 
Cc: Russell King 
Cc: Catalin Marinas 
Cc: Will Deacon 
---
 arch/arm/include/asm/kvm_mmu.h   |  6 +
 arch/arm64/include/asm/kvm_mmu.h |  5 
 arch/arm64/include/asm/pgtable.h |  1 +
 virt/kvm/arm/mmu.c   | 39 
 4 files changed, 32 insertions(+), 19 deletions(-)

diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
index fea5e723e3ac..e62f0913ce7d 100644
--- a/arch/arm/include/asm/kvm_mmu.h
+++ b/arch/arm/include/asm/kvm_mmu.h
@@ -117,6 +117,12 @@ static inline pud_t kvm_s2pud_mkyoung(pud_t pud)
return pud;
 }
 
+static inline bool kvm_s2pud_young(pud_t pud)
+{
+   BUG();
+   return false;
+}
+
 static inline pte_t kvm_s2pte_mkwrite(pte_t pte)
 {
pte_val(pte) |= L_PTE_S2_RDWR;
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index 612032bbb428..9f941f70775c 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -273,6 +273,11 @@ static inline pud_t kvm_s2pud_mkyoung(pud_t pud)
return pud_mkyoung(pud);
 }
 
+static inline bool kvm_s2pud_young(pud_t pud)
+{
+   return pud_young(pud);
+}
+
 #define hyp_pte_table_empty(ptep) kvm_page_empty(ptep)
 
 #ifdef __PAGETABLE_PMD_FOLDED
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index f51e2271e6a3..bb0f3f17a7a9 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -386,6 +386,7 @@ static inline int pmd_protnone(pmd_t pmd)
 #define pfn_pmd(pfn,prot)  __pmd(__phys_to_pmd_val((phys_addr_t)(pfn) << 
PAGE_SHIFT) | pgprot_val(prot))
 #define mk_pmd(page,prot)  pfn_pmd(page_to_pfn(page),prot)
 
+#define pud_young(pud) pte_young(pud_pte(pud))
 #define pud_mkyoung(pud)   pte_pud(pte_mkyoung(pud_pte(pud)))
 #define pud_write(pud) pte_write(pud_pte(pud))
 
diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index bd749601195f..3893ea6a50bf 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -1225,6 +1225,11 @@ static int stage2_pmdp_test_and_clear_young(pmd_t *pmd)
return stage2_ptep_test_and_clear_young((pte_t *)pmd);
 }
 
+static int stage2_pudp_test_and_clear_young(pud_t *pud)
+{
+   return stage2_ptep_test_and_clear_young((pte_t *)pud);
+}
+
 /**
  * kvm_phys_addr_ioremap - map a device range to guest IPA
  *
@@ -1932,42 +1937,38 @@ void kvm_set_spte_hva(struct kvm *kvm, unsigned long 
hva, pte_t pte)
 
 static int kvm_age_hva_handler(struct kvm *kvm, gpa_t gpa, u64 size, void 
*data)
 {
+   pud_t *pud;
pmd_t *pmd;
pte_t *pte;
 
-   WARN_ON(size != PAGE_SIZE && size != PMD_SIZE);
-   pmd = stage2_get_pmd(kvm, NULL, gpa);
-   if (!pmd || pmd_none(*pmd)) /* Nothing there */
+   WARN_ON(size != PAGE_SIZE && size != PMD_SIZE && size != PUD_SIZE);
+   if (!stage2_get_leaf_entry(kvm, gpa, &pud, &pmd, &pte))
return 0;
 
-   if (pmd_thp_or_huge(*pmd))  /* THP, HugeTLB */
+   if (pud)
+   return stage2_pudp_test_and_clear_young(pud);
+   else if (pmd)
return stage2_pmdp_test_and_clear_young(pmd);
-
-   pte = pte_offset_kernel(pmd, gpa);
-   if (pte_none(*pte))
-   return 0;
-
-   return stage2_ptep_test_and_clear_young(pte);
+   else
+   return stage2_ptep_test_and_clear_young(pte);
 }
 
 static int kvm_test_age_hva_handler(struct kvm *kvm, gpa_t gpa, u64 size, void 
*data)
 {
+   pud_t *pud;
pmd_t *pmd;
pte_t *pte;
 
-   WARN_ON(size != PAGE_SIZE && size != PMD_SIZE);
-   pmd = stage2_get_pmd(kvm, NULL, gpa);
-   if (!pmd || pmd_none(*pmd)) /* Nothing there */
+   WARN_ON(size != PAGE_SIZE && size != PMD_SIZE && size != PUD_SIZE);
+   if (!stage2_get_leaf_entry(kvm, gpa, &pud, &pmd, &pte))
return 0;
 
-   if (pmd_thp_or_huge(*pmd))  /* THP, HugeTLB */
+   if (pud)
+   return kvm_s2pud_young(*pud);
+   else if (pmd)
return pmd_young(*pmd);
-
-   pte = pte_offset_kernel(pmd, gpa);
-   if (!pte_none(*pte))/* Just a page... */
+   else
return pte_young(*pte);
-
-   return 0;
 }
 
 int kvm_age_hva(struct kvm *kvm, unsigned long start, unsigned long end)
-- 
2.19.1

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[PATCH v9 6/8] KVM: arm64: Support handling access faults for PUD hugepages

2018-10-31 Thread Punit Agrawal
In preparation for creating larger hugepages at Stage 2, extend the
access fault handling at Stage 2 to support PUD hugepages when
encountered.

Provide trivial helpers for arm32 to allow sharing of code.

Signed-off-by: Punit Agrawal 
Reviewed-by: Suzuki K Poulose 
Cc: Christoffer Dall 
Cc: Marc Zyngier 
Cc: Russell King 
Cc: Catalin Marinas 
Cc: Will Deacon 
---
 arch/arm/include/asm/kvm_mmu.h   |  9 +
 arch/arm64/include/asm/kvm_mmu.h |  7 +++
 arch/arm64/include/asm/pgtable.h |  6 ++
 virt/kvm/arm/mmu.c   | 22 +++---
 4 files changed, 33 insertions(+), 11 deletions(-)

diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
index 839a619873d3..fea5e723e3ac 100644
--- a/arch/arm/include/asm/kvm_mmu.h
+++ b/arch/arm/include/asm/kvm_mmu.h
@@ -85,6 +85,9 @@ void kvm_clear_hyp_idmap(void);
 #define kvm_pfn_pte(pfn, prot) pfn_pte(pfn, prot)
 #define kvm_pfn_pmd(pfn, prot) pfn_pmd(pfn, prot)
 
+#define kvm_pud_pfn(pud)   ({ BUG(); 0; })
+
+
 #define kvm_pmd_mkhuge(pmd)pmd_mkhuge(pmd)
 
 /*
@@ -108,6 +111,12 @@ static inline bool kvm_s2pud_exec(pud_t *pud)
return false;
 }
 
+static inline pud_t kvm_s2pud_mkyoung(pud_t pud)
+{
+   BUG();
+   return pud;
+}
+
 static inline pte_t kvm_s2pte_mkwrite(pte_t pte)
 {
pte_val(pte) |= L_PTE_S2_RDWR;
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index c755b37b3f92..612032bbb428 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -187,6 +187,8 @@ void kvm_clear_hyp_idmap(void);
 #define kvm_pfn_pte(pfn, prot) pfn_pte(pfn, prot)
 #define kvm_pfn_pmd(pfn, prot) pfn_pmd(pfn, prot)
 
+#define kvm_pud_pfn(pud)   pud_pfn(pud)
+
 #define kvm_pmd_mkhuge(pmd)pmd_mkhuge(pmd)
 
 static inline pte_t kvm_s2pte_mkwrite(pte_t pte)
@@ -266,6 +268,11 @@ static inline bool kvm_s2pud_exec(pud_t *pudp)
return !(READ_ONCE(pud_val(*pudp)) & PUD_S2_XN);
 }
 
+static inline pud_t kvm_s2pud_mkyoung(pud_t pud)
+{
+   return pud_mkyoung(pud);
+}
+
 #define hyp_pte_table_empty(ptep) kvm_page_empty(ptep)
 
 #ifdef __PAGETABLE_PMD_FOLDED
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 50b1ef8584c0..f51e2271e6a3 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -314,6 +314,11 @@ static inline pte_t pud_pte(pud_t pud)
return __pte(pud_val(pud));
 }
 
+static inline pud_t pte_pud(pte_t pte)
+{
+   return __pud(pte_val(pte));
+}
+
 static inline pmd_t pud_pmd(pud_t pud)
 {
return __pmd(pud_val(pud));
@@ -381,6 +386,7 @@ static inline int pmd_protnone(pmd_t pmd)
 #define pfn_pmd(pfn,prot)  __pmd(__phys_to_pmd_val((phys_addr_t)(pfn) << 
PAGE_SHIFT) | pgprot_val(prot))
 #define mk_pmd(page,prot)  pfn_pmd(page_to_pfn(page),prot)
 
+#define pud_mkyoung(pud)   pte_pud(pte_mkyoung(pud_pte(pud)))
 #define pud_write(pud) pte_write(pud_pte(pud))
 
 #define __pud_to_phys(pud) __pte_to_phys(pud_pte(pud))
diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index 8e44dccd1b47..bd749601195f 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -1698,6 +1698,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, 
phys_addr_t fault_ipa,
  */
 static void handle_access_fault(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa)
 {
+   pud_t *pud;
pmd_t *pmd;
pte_t *pte;
kvm_pfn_t pfn;
@@ -1707,24 +1708,23 @@ static void handle_access_fault(struct kvm_vcpu *vcpu, 
phys_addr_t fault_ipa)
 
spin_lock(&vcpu->kvm->mmu_lock);
 
-   pmd = stage2_get_pmd(vcpu->kvm, NULL, fault_ipa);
-   if (!pmd || pmd_none(*pmd)) /* Nothing there */
+   if (!stage2_get_leaf_entry(vcpu->kvm, fault_ipa, &pud, &pmd, &pte))
goto out;
 
-   if (pmd_thp_or_huge(*pmd)) {/* THP, HugeTLB */
+   if (pud) {  /* HugeTLB */
+   *pud = kvm_s2pud_mkyoung(*pud);
+   pfn = kvm_pud_pfn(*pud);
+   pfn_valid = true;
+   } else  if (pmd) {  /* THP, HugeTLB */
*pmd = pmd_mkyoung(*pmd);
pfn = pmd_pfn(*pmd);
pfn_valid = true;
-   goto out;
+   } else {
+   *pte = pte_mkyoung(*pte);   /* Just a page... */
+   pfn = pte_pfn(*pte);
+   pfn_valid = true;
}
 
-   pte = pte_offset_kernel(pmd, fault_ipa);
-   if (pte_none(*pte)) /* Nothing there either */
-   goto out;
-
-   *pte = pte_mkyoung(*pte);   /* Just a page... */
-   pfn = pte_pfn(*pte);
-   pfn_valid = true;
 out:
spin_unlock(&vcpu->kvm->mmu_lock);
if (pfn_valid)
-- 
2.19.1

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[PATCH v9 2/8] KVM: arm/arm64: Re-factor setting the Stage 2 entry to exec on fault

2018-10-31 Thread Punit Agrawal
The stage 2 fault handler marks a page as executable if it is handling an
execution fault, or if it was a permission fault, in which case the
executable bit needs to be preserved.

The logic to decide whether the page should be marked executable is
duplicated for PMD and PTE entries. To avoid creating another copy when
support for PUD hugepages is introduced, refactor the code to share the
checks needed to mark a page table entry as executable.

Signed-off-by: Punit Agrawal 
Reviewed-by: Suzuki K Poulose 
Cc: Christoffer Dall 
Cc: Marc Zyngier 
---
 virt/kvm/arm/mmu.c | 28 +++-
 1 file changed, 15 insertions(+), 13 deletions(-)

diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index 59595207c5e1..6912529946fb 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -1475,7 +1475,8 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, 
phys_addr_t fault_ipa,
  unsigned long fault_status)
 {
int ret;
-   bool write_fault, exec_fault, writable, force_pte = false;
+   bool write_fault, writable, force_pte = false;
+   bool exec_fault, needs_exec;
unsigned long mmu_seq;
gfn_t gfn = fault_ipa >> PAGE_SHIFT;
struct kvm *kvm = vcpu->kvm;
@@ -1598,19 +1599,25 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, 
phys_addr_t fault_ipa,
if (exec_fault)
invalidate_icache_guest_page(pfn, vma_pagesize);
 
+   /*
+* If we took an execution fault we have made the
+* icache/dcache coherent above and should now let the s2
+* mapping be executable.
+*
+* Write faults (!exec_fault && FSC_PERM) are orthogonal to
+* execute permissions, and we preserve whatever we have.
+*/
+   needs_exec = exec_fault ||
+   (fault_status == FSC_PERM && stage2_is_exec(kvm, fault_ipa));
+
if (vma_pagesize == PMD_SIZE) {
pmd_t new_pmd = pfn_pmd(pfn, mem_type);
new_pmd = pmd_mkhuge(new_pmd);
if (writable)
new_pmd = kvm_s2pmd_mkwrite(new_pmd);
 
-   if (exec_fault) {
+   if (needs_exec)
new_pmd = kvm_s2pmd_mkexec(new_pmd);
-   } else if (fault_status == FSC_PERM) {
-   /* Preserve execute if XN was already cleared */
-   if (stage2_is_exec(kvm, fault_ipa))
-   new_pmd = kvm_s2pmd_mkexec(new_pmd);
-   }
 
ret = stage2_set_pmd_huge(kvm, memcache, fault_ipa, &new_pmd);
} else {
@@ -1621,13 +1628,8 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, 
phys_addr_t fault_ipa,
mark_page_dirty(kvm, gfn);
}
 
-   if (exec_fault) {
+   if (needs_exec)
new_pte = kvm_s2pte_mkexec(new_pte);
-   } else if (fault_status == FSC_PERM) {
-   /* Preserve execute if XN was already cleared */
-   if (stage2_is_exec(kvm, fault_ipa))
-   new_pte = kvm_s2pte_mkexec(new_pte);
-   }
 
ret = stage2_set_pte(kvm, memcache, fault_ipa, &new_pte, flags);
}
-- 
2.19.1

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[PATCH v9 0/8] KVM: Support PUD hugepage at stage 2

2018-10-31 Thread Punit Agrawal
This series is an update to the PUD hugepage support previously posted
at [0]. This patchset adds support for PUD hugepages at stage 2, a
feature that is useful on cores that have support for large sized TLB
mappings (e.g., 1GB for 4K granule).

The patches are based on the latest upstream kernel.

The patches have been tested on AMD Seattle system with the following
hugepage sizes - 2M and 1G.

Thanks,
Punit

[0] https://patchwork.kernel.org/cover/10622379/

v8 -> v9

* Dropped bugfix patch 1 which has been merged

v7 -> v8

* Add kvm_stage2_has_pud() helper on arm32
* Rebased to v6 of 52bit dynamic IPA support

v6 -> v7

* Restrict thp check to exclude hugetlbfs pages - Patch 1
* Don't update PUD entry if there's no change - Patch 9
* Add check for PUD level in stage 2 - Patch 9

v5 -> v6

* Split Patch 1 to move out the refactoring of exec permissions on
  page table entries.
* Patch 4 - Initialise p*dpp pointers in stage2_get_leaf_entry()
* Patch 5 - Trigger a BUG() in kvm_pud_pfn() on arm

v4 -> v5:
* Patch 1 - Drop helper stage2_should_exec() and refactor the
  condition to decide if a page table entry should be marked
  executable
* Patch 4-6 - Introduce stage2_get_leaf_entry() and use it in this and
  latter patches
* Patch 7 - Use stage 2 accessors instead of using the page table
  helpers directly
* Patch 7 - Add a note to update the PUD hugepage support when number
  of levels of stage 2 tables differs from stage 1

v3 -> v4:
* Patch 1 and 7 - Don't put down hugepages pte if logging is enabled
* Patch 4-5 - Add PUD hugepage support for exec and access faults
* Patch 6 - PUD hugepage support for aging page table entries

v2 -> v3:
* Update vma_pagesize directly if THP [1/4]. Previously this was done
  indirectly via hugetlb
* Added review tag [4/4]

v1 -> v2:
* Create helper to check if the page should have exec permission [1/4]
* Fix broken condition to detect THP hugepage [1/4]
* Fix in-correct hunk resulting from a rebase [4/4]

Punit Agrawal (8):
  KVM: arm/arm64: Share common code in user_mem_abort()
  KVM: arm/arm64: Re-factor setting the Stage 2 entry to exec on fault
  KVM: arm/arm64: Introduce helpers to manipulate page table entries
  KVM: arm64: Support dirty page tracking for PUD hugepages
  KVM: arm64: Support PUD hugepage in stage2_is_exec()
  KVM: arm64: Support handling access faults for PUD hugepages
  KVM: arm64: Update age handlers to support PUD hugepages
  KVM: arm64: Add support for creating PUD hugepages at stage 2

 arch/arm/include/asm/kvm_mmu.h |  61 +
 arch/arm/include/asm/stage2_pgtable.h  |   5 +
 arch/arm64/include/asm/kvm_mmu.h   |  48 
 arch/arm64/include/asm/pgtable-hwdef.h |   4 +
 arch/arm64/include/asm/pgtable.h   |   9 +
 virt/kvm/arm/mmu.c | 312 ++---
 6 files changed, 360 insertions(+), 79 deletions(-)

-- 
2.19.1

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[PATCH v9 1/8] KVM: arm/arm64: Share common code in user_mem_abort()

2018-10-31 Thread Punit Agrawal
The code for operations such as marking the pfn as dirty, and
dcache/icache maintenance during stage 2 fault handling is duplicated
between normal pages and PMD hugepages.

Instead of creating another copy of the operations when we introduce
PUD hugepages, let's share them across the different pagesizes.

Signed-off-by: Punit Agrawal 
Reviewed-by: Suzuki K Poulose 
Cc: Christoffer Dall 
Cc: Marc Zyngier 
---
 virt/kvm/arm/mmu.c | 49 --
 1 file changed, 30 insertions(+), 19 deletions(-)

diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index 5eca48bdb1a6..59595207c5e1 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -1475,7 +1475,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, 
phys_addr_t fault_ipa,
  unsigned long fault_status)
 {
int ret;
-   bool write_fault, exec_fault, writable, hugetlb = false, force_pte = 
false;
+   bool write_fault, exec_fault, writable, force_pte = false;
unsigned long mmu_seq;
gfn_t gfn = fault_ipa >> PAGE_SHIFT;
struct kvm *kvm = vcpu->kvm;
@@ -1484,7 +1484,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, 
phys_addr_t fault_ipa,
kvm_pfn_t pfn;
pgprot_t mem_type = PAGE_S2;
bool logging_active = memslot_is_logging(memslot);
-   unsigned long flags = 0;
+   unsigned long vma_pagesize, flags = 0;
 
write_fault = kvm_is_write_fault(vcpu);
exec_fault = kvm_vcpu_trap_is_iabt(vcpu);
@@ -1504,10 +1504,16 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, 
phys_addr_t fault_ipa,
return -EFAULT;
}
 
-   if (vma_kernel_pagesize(vma) == PMD_SIZE && !logging_active) {
-   hugetlb = true;
+   vma_pagesize = vma_kernel_pagesize(vma);
+   if (vma_pagesize == PMD_SIZE && !logging_active) {
gfn = (fault_ipa & PMD_MASK) >> PAGE_SHIFT;
} else {
+   /*
+* Fallback to PTE if it's not one of the Stage 2
+* supported hugepage sizes
+*/
+   vma_pagesize = PAGE_SIZE;
+
/*
 * Pages belonging to memslots that don't have the same
 * alignment for userspace and IPA cannot be mapped using
@@ -1573,23 +1579,33 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, 
phys_addr_t fault_ipa,
if (mmu_notifier_retry(kvm, mmu_seq))
goto out_unlock;
 
-   if (!hugetlb && !force_pte)
-   hugetlb = transparent_hugepage_adjust(&pfn, &fault_ipa);
+   if (vma_pagesize == PAGE_SIZE && !force_pte) {
+   /*
+* Only PMD_SIZE transparent hugepages(THP) are
+* currently supported. This code will need to be
+* updated to support other THP sizes.
+*/
+   if (transparent_hugepage_adjust(&pfn, &fault_ipa))
+   vma_pagesize = PMD_SIZE;
+   }
+
+   if (writable)
+   kvm_set_pfn_dirty(pfn);
 
-   if (hugetlb) {
+   if (fault_status != FSC_PERM)
+   clean_dcache_guest_page(pfn, vma_pagesize);
+
+   if (exec_fault)
+   invalidate_icache_guest_page(pfn, vma_pagesize);
+
+   if (vma_pagesize == PMD_SIZE) {
pmd_t new_pmd = pfn_pmd(pfn, mem_type);
new_pmd = pmd_mkhuge(new_pmd);
-   if (writable) {
+   if (writable)
new_pmd = kvm_s2pmd_mkwrite(new_pmd);
-   kvm_set_pfn_dirty(pfn);
-   }
-
-   if (fault_status != FSC_PERM)
-   clean_dcache_guest_page(pfn, PMD_SIZE);
 
if (exec_fault) {
new_pmd = kvm_s2pmd_mkexec(new_pmd);
-   invalidate_icache_guest_page(pfn, PMD_SIZE);
} else if (fault_status == FSC_PERM) {
/* Preserve execute if XN was already cleared */
if (stage2_is_exec(kvm, fault_ipa))
@@ -1602,16 +1618,11 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, 
phys_addr_t fault_ipa,
 
if (writable) {
new_pte = kvm_s2pte_mkwrite(new_pte);
-   kvm_set_pfn_dirty(pfn);
mark_page_dirty(kvm, gfn);
}
 
-   if (fault_status != FSC_PERM)
-   clean_dcache_guest_page(pfn, PAGE_SIZE);
-
if (exec_fault) {
new_pte = kvm_s2pte_mkexec(new_pte);
-   invalidate_icache_guest_page(pfn, PAGE_SIZE);
} else if (fault_status == FSC_PERM) {
/* Preserve execute if XN was already cleared */
if (stage2_is_exec(kvm, fault_ipa))
-- 
2.19.1

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[PATCH v9 3/8] KVM: arm/arm64: Introduce helpers to manipulate page table entries

2018-10-31 Thread Punit Agrawal
Introduce helpers to abstract architectural handling of the conversion
of pfn to page table entries and marking a PMD page table entry as a
block entry.

The helpers are introduced in preparation for supporting PUD hugepages
at stage 2 - which are supported on arm64 but do not exist on arm.

Signed-off-by: Punit Agrawal 
Reviewed-by: Suzuki K Poulose 
Acked-by: Christoffer Dall 
Cc: Marc Zyngier 
Cc: Russell King 
Cc: Catalin Marinas 
Cc: Will Deacon 
---
 arch/arm/include/asm/kvm_mmu.h   |  5 +
 arch/arm64/include/asm/kvm_mmu.h |  5 +
 virt/kvm/arm/mmu.c   | 14 --
 3 files changed, 18 insertions(+), 6 deletions(-)

diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
index 1098ffc3d54b..e6eff8bf5d7f 100644
--- a/arch/arm/include/asm/kvm_mmu.h
+++ b/arch/arm/include/asm/kvm_mmu.h
@@ -82,6 +82,11 @@ void kvm_clear_hyp_idmap(void);
 #define kvm_mk_pud(pmdp)   __pud(__pa(pmdp) | PMD_TYPE_TABLE)
 #define kvm_mk_pgd(pudp)   ({ BUILD_BUG(); 0; })
 
+#define kvm_pfn_pte(pfn, prot) pfn_pte(pfn, prot)
+#define kvm_pfn_pmd(pfn, prot) pfn_pmd(pfn, prot)
+
+#define kvm_pmd_mkhuge(pmd)pmd_mkhuge(pmd)
+
 static inline pte_t kvm_s2pte_mkwrite(pte_t pte)
 {
pte_val(pte) |= L_PTE_S2_RDWR;
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index 658657367f2f..13d482710292 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -184,6 +184,11 @@ void kvm_clear_hyp_idmap(void);
 #define kvm_mk_pgd(pudp)   \
__pgd(__phys_to_pgd_val(__pa(pudp)) | PUD_TYPE_TABLE)
 
+#define kvm_pfn_pte(pfn, prot) pfn_pte(pfn, prot)
+#define kvm_pfn_pmd(pfn, prot) pfn_pmd(pfn, prot)
+
+#define kvm_pmd_mkhuge(pmd)pmd_mkhuge(pmd)
+
 static inline pte_t kvm_s2pte_mkwrite(pte_t pte)
 {
pte_val(pte) |= PTE_S2_RDWR;
diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index 6912529946fb..fb5325f7a1ac 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -607,7 +607,7 @@ static void create_hyp_pte_mappings(pmd_t *pmd, unsigned 
long start,
addr = start;
do {
pte = pte_offset_kernel(pmd, addr);
-   kvm_set_pte(pte, pfn_pte(pfn, prot));
+   kvm_set_pte(pte, kvm_pfn_pte(pfn, prot));
get_page(virt_to_page(pte));
pfn++;
} while (addr += PAGE_SIZE, addr != end);
@@ -1202,7 +1202,7 @@ int kvm_phys_addr_ioremap(struct kvm *kvm, phys_addr_t 
guest_ipa,
pfn = __phys_to_pfn(pa);
 
for (addr = guest_ipa; addr < end; addr += PAGE_SIZE) {
-   pte_t pte = pfn_pte(pfn, PAGE_S2_DEVICE);
+   pte_t pte = kvm_pfn_pte(pfn, PAGE_S2_DEVICE);
 
if (writable)
pte = kvm_s2pte_mkwrite(pte);
@@ -1611,8 +1611,10 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, 
phys_addr_t fault_ipa,
(fault_status == FSC_PERM && stage2_is_exec(kvm, fault_ipa));
 
if (vma_pagesize == PMD_SIZE) {
-   pmd_t new_pmd = pfn_pmd(pfn, mem_type);
-   new_pmd = pmd_mkhuge(new_pmd);
+   pmd_t new_pmd = kvm_pfn_pmd(pfn, mem_type);
+
+   new_pmd = kvm_pmd_mkhuge(new_pmd);
+
if (writable)
new_pmd = kvm_s2pmd_mkwrite(new_pmd);
 
@@ -1621,7 +1623,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, 
phys_addr_t fault_ipa,
 
ret = stage2_set_pmd_huge(kvm, memcache, fault_ipa, &new_pmd);
} else {
-   pte_t new_pte = pfn_pte(pfn, mem_type);
+   pte_t new_pte = kvm_pfn_pte(pfn, mem_type);
 
if (writable) {
new_pte = kvm_s2pte_mkwrite(new_pte);
@@ -1878,7 +1880,7 @@ void kvm_set_spte_hva(struct kvm *kvm, unsigned long hva, 
pte_t pte)
 * just like a translation fault and clean the cache to the PoC.
 */
clean_dcache_guest_page(pfn, PAGE_SIZE);
-   stage2_pte = pfn_pte(pfn, PAGE_S2);
+   stage2_pte = kvm_pfn_pte(pfn, PAGE_S2);
handle_hva_to_gpa(kvm, hva, end, &kvm_set_spte_handler, &stage2_pte);
 }
 
-- 
2.19.1

___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


Re: [PATCH v8 1/9] KVM: arm/arm64: Ensure only THP is candidate for adjustment

2018-10-31 Thread Punit Agrawal
Punit Agrawal  writes:

> Christoffer Dall  writes:
>
>> On Mon, Oct 01, 2018 at 04:54:35PM +0100, Punit Agrawal wrote:
>>> PageTransCompoundMap() returns true for hugetlbfs and THP
>>> hugepages. This behaviour incorrectly leads to stage 2 faults for
>>> unsupported hugepage sizes (e.g., 64K hugepage with 4K pages) to be
>>> treated as THP faults.
>>> 
>>> Tighten the check to filter out hugetlbfs pages. This also leads to
>>> consistently mapping all unsupported hugepage sizes as PTE level
>>> entries at stage 2.
>>> 
>>> Signed-off-by: Punit Agrawal 
>>> Reviewed-by: Suzuki Poulose 
>>> Cc: Christoffer Dall 
>>> Cc: Marc Zyngier 
>>> Cc: sta...@vger.kernel.org # v4.13+
>>
>>
>> Hmm, this function is only actually called from user_mem_abort() if we
>> have (!hugetlb), so I'm not sure the cc stable here was actually
>> warranted, nor that this patch is strictly necessary.
>>
>> It doesn't hurt, and makes the code potentially more robust for the
>> future though.
>>
>> Am I missing something?
>
> !hugetlb is only true for hugepage sizes supported at stage 2. 

Of course I meant "hugetlb" above (Note the lack of "!").

> The function also got called for unsupported hugepage size at stage 2,
> e.g., 64k hugepage with 4k page size, which then ended up doing the
> wrong thing.
>
> Hope that adds some context. I should've added this to the commit log.
>
>>
>> Thanks,
>>
>> Christoffer
>>
>>> ---
>>>  virt/kvm/arm/mmu.c | 8 +++-
>>>  1 file changed, 7 insertions(+), 1 deletion(-)
>>> 
>>> diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
>>> index 7e477b3cae5b..c23a1b323aad 100644
>>> --- a/virt/kvm/arm/mmu.c
>>> +++ b/virt/kvm/arm/mmu.c
>>> @@ -1231,8 +1231,14 @@ static bool transparent_hugepage_adjust(kvm_pfn_t 
>>> *pfnp, phys_addr_t *ipap)
>>>  {
>>> kvm_pfn_t pfn = *pfnp;
>>> gfn_t gfn = *ipap >> PAGE_SHIFT;
>>> +   struct page *page = pfn_to_page(pfn);
>>>  
>>> -   if (PageTransCompoundMap(pfn_to_page(pfn))) {
>>> +   /*
>>> +* PageTransCompoungMap() returns true for THP and
>>> +* hugetlbfs. Make sure the adjustment is done only for THP
>>> +* pages.
>>> +*/
>>> +   if (!PageHuge(page) && PageTransCompoundMap(page)) {
>>> unsigned long mask;
>>> /*
>>>  * The address we faulted on is backed by a transparent huge
>>> -- 
>>> 2.18.0
>>> 


Re: [PATCH v8 1/9] KVM: arm/arm64: Ensure only THP is candidate for adjustment

2018-10-31 Thread Punit Agrawal
Christoffer Dall  writes:

> On Mon, Oct 01, 2018 at 04:54:35PM +0100, Punit Agrawal wrote:
>> PageTransCompoundMap() returns true for hugetlbfs and THP
>> hugepages. This behaviour incorrectly leads to stage 2 faults for
>> unsupported hugepage sizes (e.g., 64K hugepage with 4K pages) to be
>> treated as THP faults.
>> 
>> Tighten the check to filter out hugetlbfs pages. This also leads to
>> consistently mapping all unsupported hugepage sizes as PTE level
>> entries at stage 2.
>> 
>> Signed-off-by: Punit Agrawal 
>> Reviewed-by: Suzuki Poulose 
>> Cc: Christoffer Dall 
>> Cc: Marc Zyngier 
>> Cc: sta...@vger.kernel.org # v4.13+
>
>
> Hmm, this function is only actually called from user_mem_abort() if we
> have (!hugetlb), so I'm not sure the cc stable here was actually
> warranted, nor that this patch is strictly necessary.
>
> It doesn't hurt, and makes the code potentially more robust for the
> future though.
>
> Am I missing something?

!hugetlb is only true for hugepage sizes supported at stage 2. The
function also got called for unsupported hugepage size at stage 2, e.g.,
64k hugepage with 4k page size, which then ended up doing the wrong
thing.

Hope that adds some context. I should've added this to the commit log.

>
> Thanks,
>
> Christoffer
>
>> ---
>>  virt/kvm/arm/mmu.c | 8 +++-
>>  1 file changed, 7 insertions(+), 1 deletion(-)
>> 
>> diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
>> index 7e477b3cae5b..c23a1b323aad 100644
>> --- a/virt/kvm/arm/mmu.c
>> +++ b/virt/kvm/arm/mmu.c
>> @@ -1231,8 +1231,14 @@ static bool transparent_hugepage_adjust(kvm_pfn_t 
>> *pfnp, phys_addr_t *ipap)
>>  {
>>  kvm_pfn_t pfn = *pfnp;
>>  gfn_t gfn = *ipap >> PAGE_SHIFT;
>> +struct page *page = pfn_to_page(pfn);
>>  
>> -if (PageTransCompoundMap(pfn_to_page(pfn))) {
>> +/*
>> + * PageTransCompoungMap() returns true for THP and
>> + * hugetlbfs. Make sure the adjustment is done only for THP
>> + * pages.
>> + */
>> +if (!PageHuge(page) && PageTransCompoundMap(page)) {
>>  unsigned long mask;
>>  /*
>>   * The address we faulted on is backed by a transparent huge
>> -- 
>> 2.18.0
>> 


Re: [PATCH v2] KVM: arm/arm64: Check memslot bounds before mapping hugepages

2018-10-04 Thread Punit Agrawal
Hi Lukas,

Lukas Braun  writes:

> Userspace can create a memslot with memory backed by (transparent)
> hugepages, but with bounds that do not align with hugepages.
> In that case, we cannot map the entire region in the guest as hugepages
> without exposing additional host memory to the guest and potentially
> interfering with other memslots.
> Consequently, this patch adds a bounds check when populating guest page
> tables and forces the creation of regular PTEs if mapping an entire
> hugepage would violate the memslots bounds.
>
> Signed-off-by: Lukas Braun 
> ---
>
> Hi everyone,
>
> for v2, in addition to writing the condition the way Marc suggested, I
> moved the whole check so it also catches the problem when the hugepage
> was allocated explicitly, not only for THPs.

Ok, that makes sense. Memslot bounds could be exceeded for hugetlbfs
pages as well.

> The second line is quite long, but splitting it up would make things
> rather ugly IMO, so I left it as it is.

Let's try to do better - user_mem_abort() is quite hard to follow as it
is.

>
>
> Regards,
> Lukas
>
>
>  virt/kvm/arm/mmu.c | 6 +-
>  1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
> index ed162a6c57c5..ba77339e23ec 100644
> --- a/virt/kvm/arm/mmu.c
> +++ b/virt/kvm/arm/mmu.c
> @@ -1500,7 +1500,11 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, 
> phys_addr_t fault_ipa,
>   return -EFAULT;
>   }
>  
> - if (vma_kernel_pagesize(vma) == PMD_SIZE && !logging_active) {
> + if ((fault_ipa & S2_PMD_MASK) < (memslot->base_gfn << PAGE_SHIFT) ||
> + ALIGN(fault_ipa, S2_PMD_SIZE) >= ((memslot->base_gfn + 
> memslot->npages) << PAGE_SHIFT)) {
> + /* PMD entry would map something outside of the memslot */
> + force_pte = true;
> + } else if (vma_kernel_pagesize(vma) == PMD_SIZE && !logging_active) {
>   hugetlb = true;
>   gfn = (fault_ipa & PMD_MASK) >> PAGE_SHIFT;
>   } else {

For the purpose of this fix, using a helper to check whether the mapping
fits in the memslot makes things clearer (imo) (untested patch below) -

diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index ed162a6c57c5..8bca141eb45e 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -1466,6 +1466,18 @@ static void kvm_send_hwpoison_signal(unsigned long 
address,
send_sig_info(SIGBUS, &info, current);
 }
 
+static bool mapping_in_memslot(struct kvm_memory_slot *memslot,
+phys_addr_t fault_ipa, unsigned long mapping_size)
+{
+ gfn_t start_gfn = (fault_ipa & ~(mapping_size - 1)) >> PAGE_SHIFT;
+ gfn_t end_gfn = ALIGN(fault_ipa, mapping_size) >> PAGE_SHIFT;
+
+ WARN_ON(!is_power_of_2(mapping_size));
+
+ return memslot->base_gfn <= start_gfn &&
+ end_gfn < memslot->base_gfn + memslot->npages;
+}
+
 static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
  struct kvm_memory_slot *memslot, unsigned long hva,
  unsigned long fault_status)
@@ -1480,7 +1492,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, 
phys_addr_t fault_ipa,
kvm_pfn_t pfn;
pgprot_t mem_type = PAGE_S2;
bool logging_active = memslot_is_logging(memslot);
-   unsigned long flags = 0;
+ unsigned long vma_pagesize, flags = 0;
 
write_fault = kvm_is_write_fault(vcpu);
exec_fault = kvm_vcpu_trap_is_iabt(vcpu);
@@ -1500,7 +1512,15 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, 
phys_addr_t fault_ipa,
return -EFAULT;
}
 
-   if (vma_kernel_pagesize(vma) == PMD_SIZE && !logging_active) {
+ vma_pagesize = vma_kernel_pagesize(vma);
+ /* Is the mapping contained in the memslot? */
+ if (!mapping_in_memslot(memslot, fault_ipa, vma_pagesize)) {
+ /* memslot should be aligned to page size */
+ vma_pagesize = PAGE_SIZE;
+ force_pte = true;
+ }
+
+ if (vma_pagesize == PMD_SIZE && !logging_active) {
hugetlb = true;
gfn = (fault_ipa & PMD_MASK) >> PAGE_SHIFT;
} else {

Thoughts?


Re: [PATCH v8 1/9] KVM: arm/arm64: Ensure only THP is candidate for adjustment

2018-10-03 Thread Punit Agrawal
Marc Zyngier  writes:

> On 01/10/18 16:54, Punit Agrawal wrote:
>> PageTransCompoundMap() returns true for hugetlbfs and THP
>> hugepages. This behaviour incorrectly leads to stage 2 faults for
>> unsupported hugepage sizes (e.g., 64K hugepage with 4K pages) to be
>> treated as THP faults.
>>
>> Tighten the check to filter out hugetlbfs pages. This also leads to
>> consistently mapping all unsupported hugepage sizes as PTE level
>> entries at stage 2.
>>
>> Signed-off-by: Punit Agrawal 
>> Reviewed-by: Suzuki Poulose 
>> Cc: Christoffer Dall 
>> Cc: Marc Zyngier 
>> Cc: sta...@vger.kernel.org # v4.13+
>
> FWIW, I've cherry-picked that single patch from the series and queued
> it for 4.20.

Thanks for picking up the fix.


Re: [PATCH v8 9/9] KVM: arm64: Add support for creating PUD hugepages at stage 2

2018-10-02 Thread Punit Agrawal
Suzuki K Poulose  writes:

> On 10/01/2018 04:54 PM, Punit Agrawal wrote:
>> KVM only supports PMD hugepages at stage 2. Now that the various page
>> handling routines are updated, extend the stage 2 fault handling to
>> map in PUD hugepages.
>>
>> Addition of PUD hugepage support enables additional page sizes (e.g.,
>> 1G with 4K granule) which can be useful on cores that support mapping
>> larger block sizes in the TLB entries.
>>
>> Signed-off-by: Punit Agrawal 
>> Cc: Christoffer Dall 
>> Cc: Marc Zyngier 
>> Cc: Russell King 
>> Cc: Catalin Marinas 
>> Cc: Will Deacon 
>> ---
>>   arch/arm/include/asm/kvm_mmu.h |  20 +
>>   arch/arm/include/asm/stage2_pgtable.h  |   9 +++
>>   arch/arm64/include/asm/kvm_mmu.h   |  16 
>>   arch/arm64/include/asm/pgtable-hwdef.h |   2 +
>>   arch/arm64/include/asm/pgtable.h   |   2 +
>>   virt/kvm/arm/mmu.c | 106 +++--
>>   6 files changed, 149 insertions(+), 6 deletions(-)
>>
>
> ...
>
>> diff --git a/arch/arm/include/asm/stage2_pgtable.h 
>> b/arch/arm/include/asm/stage2_pgtable.h
>> index f6a7ea805232..a4ec25360e50 100644
>> --- a/arch/arm/include/asm/stage2_pgtable.h
>> +++ b/arch/arm/include/asm/stage2_pgtable.h
>> @@ -68,4 +68,13 @@ stage2_pmd_addr_end(struct kvm *kvm, phys_addr_t addr, 
>> phys_addr_t end)
>>   #define stage2_pmd_table_empty(kvm, pmdp)  kvm_page_empty(pmdp)
>>   #define stage2_pud_table_empty(kvm, pudp)  false
>>   +static inline bool kvm_stage2_has_pud(struct kvm *kvm)
>> +{
>> +#if CONFIG_PGTABLE_LEVELS > 3
>> +return true;
>> +#else
>> +return false;
>> +#endif
>
> nit: We can only have PGTABLE_LEVELS=3 on ARM with LPAE.
> AFAIT, this can be set to false always for ARM.

I debated this and veered towards being generic, but wasn't committed
either way.

I've updated this locally but will wait for further comments before
re-posting.
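
For reference, the simplification being suggested would make the arm32
helper a constant, since a 32-bit stage 2 with LPAE has at most 3 levels
and therefore never a separate PUD level. A minimal sketch of that idea
(an illustration of the suggestion, not the hunk that was actually
posted):

static inline bool kvm_stage2_has_pud(struct kvm *kvm)
{
	/* 32-bit ARM stage 2 never uses a separate PUD level */
	return false;
}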

>
>> +}
>> +
>
> ...
>
>> @@ -1669,7 +1752,18 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, 
>> phys_addr_t fault_ipa,
>>  needs_exec = exec_fault ||
>>  (fault_status == FSC_PERM && stage2_is_exec(kvm, fault_ipa));
>>   -  if (hugetlb && vma_pagesize == PMD_SIZE) {
>> +if (hugetlb && vma_pagesize == PUD_SIZE) {
>> +pud_t new_pud = kvm_pfn_pud(pfn, mem_type);
>> +
>> +new_pud = kvm_pud_mkhuge(new_pud);
>> +if (writable)
>> +new_pud = kvm_s2pud_mkwrite(new_pud);
>> +
>> +if (needs_exec)
>> +new_pud = kvm_s2pud_mkexec(new_pud);
>> +
>> +ret = stage2_set_pud_huge(kvm, memcache, fault_ipa, &new_pud);
>> +} else if (hugetlb && vma_pagesize == PMD_SIZE) {
>>  pmd_t new_pmd = kvm_pfn_pmd(pfn, mem_type);
>>  new_pmd = kvm_pmd_mkhuge(new_pmd);
>>
>
>
> Reviewed-by: Suzuki K Poulose 

Thanks a lot for going through the series.


[PATCH v8 7/9] KVM: arm64: Support handling access faults for PUD hugepages

2018-10-01 Thread Punit Agrawal
In preparation for creating larger hugepages at Stage 2, extend the
access fault handling at Stage 2 to support PUD hugepages when
encountered.

Provide trivial helpers for arm32 to allow sharing of code.

Signed-off-by: Punit Agrawal 
Reviewed-by: Suzuki K Poulose 
Cc: Christoffer Dall 
Cc: Marc Zyngier 
Cc: Russell King 
Cc: Catalin Marinas 
Cc: Will Deacon 
---
 arch/arm/include/asm/kvm_mmu.h   |  9 +
 arch/arm64/include/asm/kvm_mmu.h |  7 +++
 arch/arm64/include/asm/pgtable.h |  6 ++
 virt/kvm/arm/mmu.c   | 22 +++---
 4 files changed, 33 insertions(+), 11 deletions(-)

diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
index 26a2ab05b3f6..95b34aad0dc8 100644
--- a/arch/arm/include/asm/kvm_mmu.h
+++ b/arch/arm/include/asm/kvm_mmu.h
@@ -85,6 +85,9 @@ void kvm_clear_hyp_idmap(void);
 #define kvm_pfn_pte(pfn, prot) pfn_pte(pfn, prot)
 #define kvm_pfn_pmd(pfn, prot) pfn_pmd(pfn, prot)
 
+#define kvm_pud_pfn(pud)   ({ BUG(); 0; })
+
+
 #define kvm_pmd_mkhuge(pmd)pmd_mkhuge(pmd)
 
 /*
@@ -108,6 +111,12 @@ static inline bool kvm_s2pud_exec(pud_t *pud)
return false;
 }
 
+static inline pud_t kvm_s2pud_mkyoung(pud_t pud)
+{
+   BUG();
+   return pud;
+}
+
 static inline pte_t kvm_s2pte_mkwrite(pte_t pte)
 {
pte_val(pte) |= L_PTE_S2_RDWR;
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index c06ef3be8ca9..b93e5167728f 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -187,6 +187,8 @@ void kvm_clear_hyp_idmap(void);
 #define kvm_pfn_pte(pfn, prot) pfn_pte(pfn, prot)
 #define kvm_pfn_pmd(pfn, prot) pfn_pmd(pfn, prot)
 
+#define kvm_pud_pfn(pud)   pud_pfn(pud)
+
 #define kvm_pmd_mkhuge(pmd)pmd_mkhuge(pmd)
 
 static inline pte_t kvm_s2pte_mkwrite(pte_t pte)
@@ -266,6 +268,11 @@ static inline bool kvm_s2pud_exec(pud_t *pudp)
return !(READ_ONCE(pud_val(*pudp)) & PUD_S2_XN);
 }
 
+static inline pud_t kvm_s2pud_mkyoung(pud_t pud)
+{
+   return pud_mkyoung(pud);
+}
+
 #define hyp_pte_table_empty(ptep) kvm_page_empty(ptep)
 
 #ifdef __PAGETABLE_PMD_FOLDED
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 1bdeca8918a6..a64a5c35beb1 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -314,6 +314,11 @@ static inline pte_t pud_pte(pud_t pud)
return __pte(pud_val(pud));
 }
 
+static inline pud_t pte_pud(pte_t pte)
+{
+   return __pud(pte_val(pte));
+}
+
 static inline pmd_t pud_pmd(pud_t pud)
 {
return __pmd(pud_val(pud));
@@ -380,6 +385,7 @@ static inline int pmd_protnone(pmd_t pmd)
 #define pfn_pmd(pfn,prot)  __pmd(__phys_to_pmd_val((phys_addr_t)(pfn) << 
PAGE_SHIFT) | pgprot_val(prot))
 #define mk_pmd(page,prot)  pfn_pmd(page_to_pfn(page),prot)
 
+#define pud_mkyoung(pud)   pte_pud(pte_mkyoung(pud_pte(pud)))
 #define pud_write(pud) pte_write(pud_pte(pud))
 
 #define __pud_to_phys(pud) __pte_to_phys(pud_pte(pud))
diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index 5fd1eae7d964..1401dc015a22 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -1706,6 +1706,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, 
phys_addr_t fault_ipa,
  */
 static void handle_access_fault(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa)
 {
+   pud_t *pud;
pmd_t *pmd;
pte_t *pte;
kvm_pfn_t pfn;
@@ -1715,24 +1716,23 @@ static void handle_access_fault(struct kvm_vcpu *vcpu, 
phys_addr_t fault_ipa)
 
spin_lock(&vcpu->kvm->mmu_lock);
 
-   pmd = stage2_get_pmd(vcpu->kvm, NULL, fault_ipa);
-   if (!pmd || pmd_none(*pmd)) /* Nothing there */
if (!stage2_get_leaf_entry(vcpu->kvm, fault_ipa, &pud, &pmd, &pte))
goto out;
 
-   if (pmd_thp_or_huge(*pmd)) {/* THP, HugeTLB */
+   if (pud) {  /* HugeTLB */
+   *pud = kvm_s2pud_mkyoung(*pud);
+   pfn = kvm_pud_pfn(*pud);
+   pfn_valid = true;
+   } else  if (pmd) {  /* THP, HugeTLB */
*pmd = pmd_mkyoung(*pmd);
pfn = pmd_pfn(*pmd);
pfn_valid = true;
-   goto out;
+   } else {
+   *pte = pte_mkyoung(*pte);   /* Just a page... */
+   pfn = pte_pfn(*pte);
+   pfn_valid = true;
}
 
-   pte = pte_offset_kernel(pmd, fault_ipa);
-   if (pte_none(*pte)) /* Nothing there either */
-   goto out;
-
-   *pte = pte_mkyoung(*pte);   /* Just a page... */
-   pfn = pte_pfn(*pte);
-   pfn_valid = true;
 out:
spin_unlock(&vcpu->kvm->mmu_lock);
if (pfn_valid)
-- 
2.18.0



[PATCH v8 3/9] KVM: arm/arm64: Re-factor setting the Stage 2 entry to exec on fault

2018-10-01 Thread Punit Agrawal
The stage 2 fault handler marks a page as executable if it is handling
an execution fault, or if it was a permission fault, in which case the
executable bit needs to be preserved.

The logic to decide if the page should be marked executable is
duplicated for PMD and PTE entries. To avoid creating another copy
when support for PUD hugepages is introduced, refactor the code to
share the checks needed to mark a page table entry as executable.

Signed-off-by: Punit Agrawal 
Reviewed-by: Suzuki K Poulose 
Cc: Christoffer Dall 
Cc: Marc Zyngier 
---
 virt/kvm/arm/mmu.c | 28 +++-
 1 file changed, 15 insertions(+), 13 deletions(-)

diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index 5b76ee204000..ec64d21c6571 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -1481,7 +1481,8 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, 
phys_addr_t fault_ipa,
  unsigned long fault_status)
 {
int ret;
-   bool write_fault, exec_fault, writable, hugetlb = false, force_pte = 
false;
+   bool write_fault, writable, hugetlb = false, force_pte = false;
+   bool exec_fault, needs_exec;
unsigned long mmu_seq;
gfn_t gfn = fault_ipa >> PAGE_SHIFT;
struct kvm *kvm = vcpu->kvm;
@@ -1606,19 +1607,25 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, 
phys_addr_t fault_ipa,
if (exec_fault)
invalidate_icache_guest_page(pfn, vma_pagesize);
 
+   /*
+* If we took an execution fault we have made the
+* icache/dcache coherent above and should now let the s2
+* mapping be executable.
+*
+* Write faults (!exec_fault && FSC_PERM) are orthogonal to
+* execute permissions, and we preserve whatever we have.
+*/
+   needs_exec = exec_fault ||
+   (fault_status == FSC_PERM && stage2_is_exec(kvm, fault_ipa));
+
if (hugetlb && vma_pagesize == PMD_SIZE) {
pmd_t new_pmd = pfn_pmd(pfn, mem_type);
new_pmd = pmd_mkhuge(new_pmd);
if (writable)
new_pmd = kvm_s2pmd_mkwrite(new_pmd);
 
-   if (exec_fault) {
+   if (needs_exec)
new_pmd = kvm_s2pmd_mkexec(new_pmd);
-   } else if (fault_status == FSC_PERM) {
-   /* Preserve execute if XN was already cleared */
-   if (stage2_is_exec(kvm, fault_ipa))
-   new_pmd = kvm_s2pmd_mkexec(new_pmd);
-   }
 
ret = stage2_set_pmd_huge(kvm, memcache, fault_ipa, &new_pmd);
} else {
@@ -1629,13 +1636,8 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, 
phys_addr_t fault_ipa,
mark_page_dirty(kvm, gfn);
}
 
-   if (exec_fault) {
+   if (needs_exec)
new_pte = kvm_s2pte_mkexec(new_pte);
-   } else if (fault_status == FSC_PERM) {
-   /* Preserve execute if XN was already cleared */
-   if (stage2_is_exec(kvm, fault_ipa))
-   new_pte = kvm_s2pte_mkexec(new_pte);
-   }
 
ret = stage2_set_pte(kvm, memcache, fault_ipa, &new_pte, flags);
}
-- 
2.18.0



[PATCH v8 5/9] KVM: arm64: Support dirty page tracking for PUD hugepages

2018-10-01 Thread Punit Agrawal
In preparation for creating PUD hugepages at stage 2, add support for
write protecting PUD hugepages when they are encountered. Write
protecting guest tables is used to track dirty pages when migrating
VMs.

Also, provide trivial implementations of required kvm_s2pud_* helpers
to allow sharing of code with arm32.

Signed-off-by: Punit Agrawal 
Reviewed-by: Christoffer Dall 
Reviewed-by: Suzuki K Poulose 
Cc: Marc Zyngier 
Cc: Russell King 
Cc: Catalin Marinas 
Cc: Will Deacon 
---
 arch/arm/include/asm/kvm_mmu.h   | 15 +++
 arch/arm64/include/asm/kvm_mmu.h | 10 ++
 virt/kvm/arm/mmu.c   | 11 +++
 3 files changed, 32 insertions(+), 4 deletions(-)

diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
index e77212e53e77..9ec09f4cc284 100644
--- a/arch/arm/include/asm/kvm_mmu.h
+++ b/arch/arm/include/asm/kvm_mmu.h
@@ -87,6 +87,21 @@ void kvm_clear_hyp_idmap(void);
 
 #define kvm_pmd_mkhuge(pmd)pmd_mkhuge(pmd)
 
+/*
+ * The following kvm_*pud*() functions are provided strictly to allow
+ * sharing code with arm64. They should never be called in practice.
+ */
+static inline void kvm_set_s2pud_readonly(pud_t *pud)
+{
+   BUG();
+}
+
+static inline bool kvm_s2pud_readonly(pud_t *pud)
+{
+   BUG();
+   return false;
+}
+
 static inline pte_t kvm_s2pte_mkwrite(pte_t pte)
 {
pte_val(pte) |= L_PTE_S2_RDWR;
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index baabea0cbb66..3cc342177474 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -251,6 +251,16 @@ static inline bool kvm_s2pmd_exec(pmd_t *pmdp)
return !(READ_ONCE(pmd_val(*pmdp)) & PMD_S2_XN);
 }
 
+static inline void kvm_set_s2pud_readonly(pud_t *pudp)
+{
+   kvm_set_s2pte_readonly((pte_t *)pudp);
+}
+
+static inline bool kvm_s2pud_readonly(pud_t *pudp)
+{
+   return kvm_s2pte_readonly((pte_t *)pudp);
+}
+
 #define hyp_pte_table_empty(ptep) kvm_page_empty(ptep)
 
 #ifdef __PAGETABLE_PMD_FOLDED
diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index 21079eb5bc15..9c48f2ca6583 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -1347,9 +1347,12 @@ static void  stage2_wp_puds(struct kvm *kvm, pgd_t *pgd,
do {
next = stage2_pud_addr_end(kvm, addr, end);
if (!stage2_pud_none(kvm, *pud)) {
-   /* TODO:PUD not supported, revisit later if supported */
-   BUG_ON(stage2_pud_huge(kvm, *pud));
-   stage2_wp_pmds(kvm, pud, addr, next);
+   if (stage2_pud_huge(kvm, *pud)) {
+   if (!kvm_s2pud_readonly(pud))
+   kvm_set_s2pud_readonly(pud);
+   } else {
+   stage2_wp_pmds(kvm, pud, addr, next);
+   }
}
} while (pud++, addr = next, addr != end);
 }
@@ -1392,7 +1395,7 @@ static void stage2_wp_range(struct kvm *kvm, phys_addr_t 
addr, phys_addr_t end)
  *
  * Called to start logging dirty pages after memory region
  * KVM_MEM_LOG_DIRTY_PAGES operation is called. After this function returns
- * all present PMD and PTEs are write protected in the memory region.
+ * all present PUD, PMD and PTEs are write protected in the memory region.
  * Afterwards read of dirty page log can be called.
  *
  * Acquires kvm_mmu_lock. Called with kvm->slots_lock mutex acquired,
-- 
2.18.0



[PATCH v8 4/9] KVM: arm/arm64: Introduce helpers to manipulate page table entries

2018-10-01 Thread Punit Agrawal
Introduce helpers to abstract architectural handling of the conversion
of pfn to page table entries and marking a PMD page table entry as a
block entry.

The helpers are introduced in preparation for supporting PUD hugepages
at stage 2 - which are supported on arm64 but do not exist on arm.

Signed-off-by: Punit Agrawal 
Reviewed-by: Suzuki K Poulose 
Acked-by: Christoffer Dall 
Cc: Marc Zyngier 
Cc: Russell King 
Cc: Catalin Marinas 
Cc: Will Deacon 
---
 arch/arm/include/asm/kvm_mmu.h   |  5 +
 arch/arm64/include/asm/kvm_mmu.h |  5 +
 virt/kvm/arm/mmu.c   | 14 --
 3 files changed, 18 insertions(+), 6 deletions(-)

diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
index 5ad1a54f98dc..e77212e53e77 100644
--- a/arch/arm/include/asm/kvm_mmu.h
+++ b/arch/arm/include/asm/kvm_mmu.h
@@ -82,6 +82,11 @@ void kvm_clear_hyp_idmap(void);
 #define kvm_mk_pud(pmdp)   __pud(__pa(pmdp) | PMD_TYPE_TABLE)
 #define kvm_mk_pgd(pudp)   ({ BUILD_BUG(); 0; })
 
+#define kvm_pfn_pte(pfn, prot) pfn_pte(pfn, prot)
+#define kvm_pfn_pmd(pfn, prot) pfn_pmd(pfn, prot)
+
+#define kvm_pmd_mkhuge(pmd)pmd_mkhuge(pmd)
+
 static inline pte_t kvm_s2pte_mkwrite(pte_t pte)
 {
pte_val(pte) |= L_PTE_S2_RDWR;
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index 77b1af9e64db..baabea0cbb66 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -184,6 +184,11 @@ void kvm_clear_hyp_idmap(void);
 #define kvm_mk_pgd(pudp)   \
__pgd(__phys_to_pgd_val(__pa(pudp)) | PUD_TYPE_TABLE)
 
+#define kvm_pfn_pte(pfn, prot) pfn_pte(pfn, prot)
+#define kvm_pfn_pmd(pfn, prot) pfn_pmd(pfn, prot)
+
+#define kvm_pmd_mkhuge(pmd)pmd_mkhuge(pmd)
+
 static inline pte_t kvm_s2pte_mkwrite(pte_t pte)
 {
pte_val(pte) |= PTE_S2_RDWR;
diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index ec64d21c6571..21079eb5bc15 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -607,7 +607,7 @@ static void create_hyp_pte_mappings(pmd_t *pmd, unsigned 
long start,
addr = start;
do {
pte = pte_offset_kernel(pmd, addr);
-   kvm_set_pte(pte, pfn_pte(pfn, prot));
+   kvm_set_pte(pte, kvm_pfn_pte(pfn, prot));
get_page(virt_to_page(pte));
pfn++;
} while (addr += PAGE_SIZE, addr != end);
@@ -1202,7 +1202,7 @@ int kvm_phys_addr_ioremap(struct kvm *kvm, phys_addr_t 
guest_ipa,
pfn = __phys_to_pfn(pa);
 
for (addr = guest_ipa; addr < end; addr += PAGE_SIZE) {
-   pte_t pte = pfn_pte(pfn, PAGE_S2_DEVICE);
+   pte_t pte = kvm_pfn_pte(pfn, PAGE_S2_DEVICE);
 
if (writable)
pte = kvm_s2pte_mkwrite(pte);
@@ -1619,8 +1619,10 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, 
phys_addr_t fault_ipa,
(fault_status == FSC_PERM && stage2_is_exec(kvm, fault_ipa));
 
if (hugetlb && vma_pagesize == PMD_SIZE) {
-   pmd_t new_pmd = pfn_pmd(pfn, mem_type);
-   new_pmd = pmd_mkhuge(new_pmd);
+   pmd_t new_pmd = kvm_pfn_pmd(pfn, mem_type);
+
+   new_pmd = kvm_pmd_mkhuge(new_pmd);
+
if (writable)
new_pmd = kvm_s2pmd_mkwrite(new_pmd);
 
@@ -1629,7 +1631,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, 
phys_addr_t fault_ipa,
 
ret = stage2_set_pmd_huge(kvm, memcache, fault_ipa, &new_pmd);
} else {
-   pte_t new_pte = pfn_pte(pfn, mem_type);
+   pte_t new_pte = kvm_pfn_pte(pfn, mem_type);
 
if (writable) {
new_pte = kvm_s2pte_mkwrite(new_pte);
@@ -1886,7 +1888,7 @@ void kvm_set_spte_hva(struct kvm *kvm, unsigned long hva, 
pte_t pte)
 * just like a translation fault and clean the cache to the PoC.
 */
clean_dcache_guest_page(pfn, PAGE_SIZE);
-   stage2_pte = pfn_pte(pfn, PAGE_S2);
+   stage2_pte = kvm_pfn_pte(pfn, PAGE_S2);
handle_hva_to_gpa(kvm, hva, end, &kvm_set_spte_handler, &stage2_pte);
 }
 
-- 
2.18.0



[PATCH v8 9/9] KVM: arm64: Add support for creating PUD hugepages at stage 2

2018-10-01 Thread Punit Agrawal
KVM only supports PMD hugepages at stage 2. Now that the various page
handling routines are updated, extend the stage 2 fault handling to
map in PUD hugepages.

Addition of PUD hugepage support enables additional page sizes (e.g.,
1G with 4K granule) which can be useful on cores that support mapping
larger block sizes in the TLB entries.

Signed-off-by: Punit Agrawal 
Cc: Christoffer Dall 
Cc: Marc Zyngier 
Cc: Russell King 
Cc: Catalin Marinas 
Cc: Will Deacon 
---
 arch/arm/include/asm/kvm_mmu.h |  20 +
 arch/arm/include/asm/stage2_pgtable.h  |   9 +++
 arch/arm64/include/asm/kvm_mmu.h   |  16 
 arch/arm64/include/asm/pgtable-hwdef.h |   2 +
 arch/arm64/include/asm/pgtable.h   |   2 +
 virt/kvm/arm/mmu.c | 106 +++--
 6 files changed, 149 insertions(+), 6 deletions(-)

diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
index a42b9505c9a7..da5f078ae68c 100644
--- a/arch/arm/include/asm/kvm_mmu.h
+++ b/arch/arm/include/asm/kvm_mmu.h
@@ -84,11 +84,14 @@ void kvm_clear_hyp_idmap(void);
 
 #define kvm_pfn_pte(pfn, prot) pfn_pte(pfn, prot)
 #define kvm_pfn_pmd(pfn, prot) pfn_pmd(pfn, prot)
+#define kvm_pfn_pud(pfn, prot) (__pud(0))
 
 #define kvm_pud_pfn(pud)   ({ BUG(); 0; })
 
 
 #define kvm_pmd_mkhuge(pmd)pmd_mkhuge(pmd)
+/* No support for pud hugepages */
+#define kvm_pud_mkhuge(pud)( {BUG(); pud; })
 
 /*
  * The following kvm_*pud*() functions are provided strictly to allow
@@ -105,6 +108,23 @@ static inline bool kvm_s2pud_readonly(pud_t *pud)
return false;
 }
 
+static inline void kvm_set_pud(pud_t *pud, pud_t new_pud)
+{
+   BUG();
+}
+
+static inline pud_t kvm_s2pud_mkwrite(pud_t pud)
+{
+   BUG();
+   return pud;
+}
+
+static inline pud_t kvm_s2pud_mkexec(pud_t pud)
+{
+   BUG();
+   return pud;
+}
+
 static inline bool kvm_s2pud_exec(pud_t *pud)
 {
BUG();
diff --git a/arch/arm/include/asm/stage2_pgtable.h 
b/arch/arm/include/asm/stage2_pgtable.h
index f6a7ea805232..a4ec25360e50 100644
--- a/arch/arm/include/asm/stage2_pgtable.h
+++ b/arch/arm/include/asm/stage2_pgtable.h
@@ -68,4 +68,13 @@ stage2_pmd_addr_end(struct kvm *kvm, phys_addr_t addr, 
phys_addr_t end)
 #define stage2_pmd_table_empty(kvm, pmdp)  kvm_page_empty(pmdp)
 #define stage2_pud_table_empty(kvm, pudp)  false
 
+static inline bool kvm_stage2_has_pud(struct kvm *kvm)
+{
+#if CONFIG_PGTABLE_LEVELS > 3
+   return true;
+#else
+   return false;
+#endif
+}
+
 #endif /* __ARM_S2_PGTABLE_H_ */
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index 3baf72705dcc..b4e9c2cceecb 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -184,12 +184,16 @@ void kvm_clear_hyp_idmap(void);
 #define kvm_mk_pgd(pudp)   \
__pgd(__phys_to_pgd_val(__pa(pudp)) | PUD_TYPE_TABLE)
 
+#define kvm_set_pud(pudp, pud) set_pud(pudp, pud)
+
 #define kvm_pfn_pte(pfn, prot) pfn_pte(pfn, prot)
 #define kvm_pfn_pmd(pfn, prot) pfn_pmd(pfn, prot)
+#define kvm_pfn_pud(pfn, prot) pfn_pud(pfn, prot)
 
 #define kvm_pud_pfn(pud)   pud_pfn(pud)
 
 #define kvm_pmd_mkhuge(pmd)pmd_mkhuge(pmd)
+#define kvm_pud_mkhuge(pud)pud_mkhuge(pud)
 
 static inline pte_t kvm_s2pte_mkwrite(pte_t pte)
 {
@@ -203,6 +207,12 @@ static inline pmd_t kvm_s2pmd_mkwrite(pmd_t pmd)
return pmd;
 }
 
+static inline pud_t kvm_s2pud_mkwrite(pud_t pud)
+{
+   pud_val(pud) |= PUD_S2_RDWR;
+   return pud;
+}
+
 static inline pte_t kvm_s2pte_mkexec(pte_t pte)
 {
pte_val(pte) &= ~PTE_S2_XN;
@@ -215,6 +225,12 @@ static inline pmd_t kvm_s2pmd_mkexec(pmd_t pmd)
return pmd;
 }
 
+static inline pud_t kvm_s2pud_mkexec(pud_t pud)
+{
+   pud_val(pud) &= ~PUD_S2_XN;
+   return pud;
+}
+
 static inline void kvm_set_s2pte_readonly(pte_t *ptep)
 {
pteval_t old_pteval, pteval;
diff --git a/arch/arm64/include/asm/pgtable-hwdef.h 
b/arch/arm64/include/asm/pgtable-hwdef.h
index 10ae592b78b8..e327665e94d1 100644
--- a/arch/arm64/include/asm/pgtable-hwdef.h
+++ b/arch/arm64/include/asm/pgtable-hwdef.h
@@ -193,6 +193,8 @@
 #define PMD_S2_RDWR(_AT(pmdval_t, 3) << 6)   /* HAP[2:1] */
 #define PMD_S2_XN  (_AT(pmdval_t, 2) << 53)  /* XN[1:0] */
 
+#define PUD_S2_RDONLY  (_AT(pudval_t, 1) << 6)   /* HAP[2:1] */
+#define PUD_S2_RDWR(_AT(pudval_t, 3) << 6)   /* HAP[2:1] */
 #define PUD_S2_XN  (_AT(pudval_t, 2) << 53)  /* XN[1:0] */
 
 /*
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 4d9476e420d9..0afc34f94ff5 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -389,6 +389,8 @@ static inline int pmd_protnone(pmd_t pmd)
 #define pud_mkyoung(pud)   pte_pud(pte_mkyoung(pud_pte(pud)))
 #define pud

[PATCH v8 6/9] KVM: arm64: Support PUD hugepage in stage2_is_exec()

2018-10-01 Thread Punit Agrawal
In preparation for creating PUD hugepages at stage 2, add support for
detecting execute permissions on PUD page table entries. Faults due to
lack of execute permissions on page table entries are used to perform
i-cache invalidation on first execute.

Provide trivial implementations of arm32 helpers to allow sharing of
code.

Signed-off-by: Punit Agrawal 
Reviewed-by: Suzuki K Poulose 
Cc: Christoffer Dall 
Cc: Marc Zyngier 
Cc: Russell King 
Cc: Catalin Marinas 
Cc: Will Deacon 
---
 arch/arm/include/asm/kvm_mmu.h |  6 +++
 arch/arm64/include/asm/kvm_mmu.h   |  5 +++
 arch/arm64/include/asm/pgtable-hwdef.h |  2 +
 virt/kvm/arm/mmu.c | 53 +++---
 4 files changed, 61 insertions(+), 5 deletions(-)

diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
index 9ec09f4cc284..26a2ab05b3f6 100644
--- a/arch/arm/include/asm/kvm_mmu.h
+++ b/arch/arm/include/asm/kvm_mmu.h
@@ -102,6 +102,12 @@ static inline bool kvm_s2pud_readonly(pud_t *pud)
return false;
 }
 
+static inline bool kvm_s2pud_exec(pud_t *pud)
+{
+   BUG();
+   return false;
+}
+
 static inline pte_t kvm_s2pte_mkwrite(pte_t pte)
 {
pte_val(pte) |= L_PTE_S2_RDWR;
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index 3cc342177474..c06ef3be8ca9 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -261,6 +261,11 @@ static inline bool kvm_s2pud_readonly(pud_t *pudp)
return kvm_s2pte_readonly((pte_t *)pudp);
 }
 
+static inline bool kvm_s2pud_exec(pud_t *pudp)
+{
+   return !(READ_ONCE(pud_val(*pudp)) & PUD_S2_XN);
+}
+
 #define hyp_pte_table_empty(ptep) kvm_page_empty(ptep)
 
 #ifdef __PAGETABLE_PMD_FOLDED
diff --git a/arch/arm64/include/asm/pgtable-hwdef.h 
b/arch/arm64/include/asm/pgtable-hwdef.h
index fd208eac9f2a..10ae592b78b8 100644
--- a/arch/arm64/include/asm/pgtable-hwdef.h
+++ b/arch/arm64/include/asm/pgtable-hwdef.h
@@ -193,6 +193,8 @@
 #define PMD_S2_RDWR(_AT(pmdval_t, 3) << 6)   /* HAP[2:1] */
 #define PMD_S2_XN  (_AT(pmdval_t, 2) << 53)  /* XN[1:0] */
 
+#define PUD_S2_XN  (_AT(pudval_t, 2) << 53)  /* XN[1:0] */
+
 /*
  * Memory Attribute override for Stage-2 (MemAttr[3:0])
  */
diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index 9c48f2ca6583..5fd1eae7d964 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -1083,23 +1083,66 @@ static int stage2_set_pmd_huge(struct kvm *kvm, struct 
kvm_mmu_memory_cache
return 0;
 }
 
-static bool stage2_is_exec(struct kvm *kvm, phys_addr_t addr)
+/*
+ * stage2_get_leaf_entry - walk the stage2 VM page tables and return
+ * true if a valid and present leaf-entry is found. A pointer to the
+ * leaf-entry is returned in the appropriate level variable - pudpp,
+ * pmdpp, ptepp.
+ */
+static bool stage2_get_leaf_entry(struct kvm *kvm, phys_addr_t addr,
+ pud_t **pudpp, pmd_t **pmdpp, pte_t **ptepp)
 {
+   pud_t *pudp;
pmd_t *pmdp;
pte_t *ptep;
 
-   pmdp = stage2_get_pmd(kvm, NULL, addr);
+   *pudpp = NULL;
+   *pmdpp = NULL;
+   *ptepp = NULL;
+
+   pudp = stage2_get_pud(kvm, NULL, addr);
+   if (!pudp || stage2_pud_none(kvm, *pudp) || !stage2_pud_present(kvm, 
*pudp))
+   return false;
+
+   if (stage2_pud_huge(kvm, *pudp)) {
+   *pudpp = pudp;
+   return true;
+   }
+
+   pmdp = stage2_pmd_offset(kvm, pudp, addr);
if (!pmdp || pmd_none(*pmdp) || !pmd_present(*pmdp))
return false;
 
-   if (pmd_thp_or_huge(*pmdp))
-   return kvm_s2pmd_exec(pmdp);
+   if (pmd_thp_or_huge(*pmdp)) {
+   *pmdpp = pmdp;
+   return true;
+   }
 
ptep = pte_offset_kernel(pmdp, addr);
if (!ptep || pte_none(*ptep) || !pte_present(*ptep))
return false;
 
-   return kvm_s2pte_exec(ptep);
+   *ptepp = ptep;
+   return true;
+}
+
+static bool stage2_is_exec(struct kvm *kvm, phys_addr_t addr)
+{
+   pud_t *pudp;
+   pmd_t *pmdp;
+   pte_t *ptep;
+   bool found;
+
found = stage2_get_leaf_entry(kvm, addr, &pudp, &pmdp, &ptep);
+   if (!found)
+   return false;
+
+   if (pudp)
+   return kvm_s2pud_exec(pudp);
+   else if (pmdp)
+   return kvm_s2pmd_exec(pmdp);
+   else
+   return kvm_s2pte_exec(ptep);
 }
 
 static int stage2_set_pte(struct kvm *kvm, struct kvm_mmu_memory_cache *cache,
-- 
2.18.0



[PATCH v8 8/9] KVM: arm64: Update age handlers to support PUD hugepages

2018-10-01 Thread Punit Agrawal
In preparation for creating larger hugepages at Stage 2, add support
to the age handling notifiers for PUD hugepages when encountered.

Provide trivial helpers for arm32 to allow sharing code.

Signed-off-by: Punit Agrawal 
Reviewed-by: Suzuki K Poulose 
Cc: Christoffer Dall 
Cc: Marc Zyngier 
Cc: Russell King 
Cc: Catalin Marinas 
Cc: Will Deacon 
---
 arch/arm/include/asm/kvm_mmu.h   |  6 +
 arch/arm64/include/asm/kvm_mmu.h |  5 
 arch/arm64/include/asm/pgtable.h |  1 +
 virt/kvm/arm/mmu.c   | 39 
 4 files changed, 32 insertions(+), 19 deletions(-)

diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
index 95b34aad0dc8..a42b9505c9a7 100644
--- a/arch/arm/include/asm/kvm_mmu.h
+++ b/arch/arm/include/asm/kvm_mmu.h
@@ -117,6 +117,12 @@ static inline pud_t kvm_s2pud_mkyoung(pud_t pud)
return pud;
 }
 
+static inline bool kvm_s2pud_young(pud_t pud)
+{
+   BUG();
+   return false;
+}
+
 static inline pte_t kvm_s2pte_mkwrite(pte_t pte)
 {
pte_val(pte) |= L_PTE_S2_RDWR;
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index b93e5167728f..3baf72705dcc 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -273,6 +273,11 @@ static inline pud_t kvm_s2pud_mkyoung(pud_t pud)
return pud_mkyoung(pud);
 }
 
+static inline bool kvm_s2pud_young(pud_t pud)
+{
+   return pud_young(pud);
+}
+
 #define hyp_pte_table_empty(ptep) kvm_page_empty(ptep)
 
 #ifdef __PAGETABLE_PMD_FOLDED
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index a64a5c35beb1..4d9476e420d9 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -385,6 +385,7 @@ static inline int pmd_protnone(pmd_t pmd)
 #define pfn_pmd(pfn,prot)  __pmd(__phys_to_pmd_val((phys_addr_t)(pfn) << 
PAGE_SHIFT) | pgprot_val(prot))
 #define mk_pmd(page,prot)  pfn_pmd(page_to_pfn(page),prot)
 
+#define pud_young(pud) pte_young(pud_pte(pud))
 #define pud_mkyoung(pud)   pte_pud(pte_mkyoung(pud_pte(pud)))
 #define pud_write(pud) pte_write(pud_pte(pud))
 
diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index 1401dc015a22..1cf84507bbd6 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -1225,6 +1225,11 @@ static int stage2_pmdp_test_and_clear_young(pmd_t *pmd)
return stage2_ptep_test_and_clear_young((pte_t *)pmd);
 }
 
+static int stage2_pudp_test_and_clear_young(pud_t *pud)
+{
+   return stage2_ptep_test_and_clear_young((pte_t *)pud);
+}
+
 /**
  * kvm_phys_addr_ioremap - map a device range to guest IPA
  *
@@ -1940,42 +1945,38 @@ void kvm_set_spte_hva(struct kvm *kvm, unsigned long 
hva, pte_t pte)
 
 static int kvm_age_hva_handler(struct kvm *kvm, gpa_t gpa, u64 size, void 
*data)
 {
+   pud_t *pud;
pmd_t *pmd;
pte_t *pte;
 
-   WARN_ON(size != PAGE_SIZE && size != PMD_SIZE);
-   pmd = stage2_get_pmd(kvm, NULL, gpa);
-   if (!pmd || pmd_none(*pmd)) /* Nothing there */
+   WARN_ON(size != PAGE_SIZE && size != PMD_SIZE && size != PUD_SIZE);
if (!stage2_get_leaf_entry(kvm, gpa, &pud, &pmd, &pte))
return 0;
 
-   if (pmd_thp_or_huge(*pmd))  /* THP, HugeTLB */
+   if (pud)
+   return stage2_pudp_test_and_clear_young(pud);
+   else if (pmd)
return stage2_pmdp_test_and_clear_young(pmd);
-
-   pte = pte_offset_kernel(pmd, gpa);
-   if (pte_none(*pte))
-   return 0;
-
-   return stage2_ptep_test_and_clear_young(pte);
+   else
+   return stage2_ptep_test_and_clear_young(pte);
 }
 
 static int kvm_test_age_hva_handler(struct kvm *kvm, gpa_t gpa, u64 size, void 
*data)
 {
+   pud_t *pud;
pmd_t *pmd;
pte_t *pte;
 
-   WARN_ON(size != PAGE_SIZE && size != PMD_SIZE);
-   pmd = stage2_get_pmd(kvm, NULL, gpa);
-   if (!pmd || pmd_none(*pmd)) /* Nothing there */
+   WARN_ON(size != PAGE_SIZE && size != PMD_SIZE && size != PUD_SIZE);
if (!stage2_get_leaf_entry(kvm, gpa, &pud, &pmd, &pte))
return 0;
 
-   if (pmd_thp_or_huge(*pmd))  /* THP, HugeTLB */
+   if (pud)
+   return kvm_s2pud_young(*pud);
+   else if (pmd)
return pmd_young(*pmd);
-
-   pte = pte_offset_kernel(pmd, gpa);
-   if (!pte_none(*pte))/* Just a page... */
+   else
return pte_young(*pte);
-
-   return 0;
 }
 
 int kvm_age_hva(struct kvm *kvm, unsigned long start, unsigned long end)
-- 
2.18.0



[PATCH v8 2/9] KVM: arm/arm64: Share common code in user_mem_abort()

2018-10-01 Thread Punit Agrawal
The code for operations such as marking the pfn as dirty, and
dcache/icache maintenance during stage 2 fault handling is duplicated
between normal pages and PMD hugepages.

Instead of creating another copy of the operations when we introduce
PUD hugepages, let's share them across the different pagesizes.

Signed-off-by: Punit Agrawal 
Cc: Suzuki K Poulose 
Cc: Christoffer Dall 
Cc: Marc Zyngier 
---
 virt/kvm/arm/mmu.c | 45 +
 1 file changed, 29 insertions(+), 16 deletions(-)

diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index c23a1b323aad..5b76ee204000 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -1490,7 +1490,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, 
phys_addr_t fault_ipa,
kvm_pfn_t pfn;
pgprot_t mem_type = PAGE_S2;
bool logging_active = memslot_is_logging(memslot);
-   unsigned long flags = 0;
+   unsigned long vma_pagesize, flags = 0;
 
write_fault = kvm_is_write_fault(vcpu);
exec_fault = kvm_vcpu_trap_is_iabt(vcpu);
@@ -1510,10 +1510,17 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, 
phys_addr_t fault_ipa,
return -EFAULT;
}
 
-   if (vma_kernel_pagesize(vma) == PMD_SIZE && !logging_active) {
+   vma_pagesize = vma_kernel_pagesize(vma);
+   if (vma_pagesize == PMD_SIZE && !logging_active) {
hugetlb = true;
gfn = (fault_ipa & PMD_MASK) >> PAGE_SHIFT;
} else {
+   /*
+* Fallback to PTE if it's not one of the Stage 2
+* supported hugepage sizes
+*/
+   vma_pagesize = PAGE_SIZE;
+
/*
 * Pages belonging to memslots that don't have the same
 * alignment for userspace and IPA cannot be mapped using
@@ -1579,23 +1586,34 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, 
phys_addr_t fault_ipa,
if (mmu_notifier_retry(kvm, mmu_seq))
goto out_unlock;
 
-   if (!hugetlb && !force_pte)
+   if (!hugetlb && !force_pte) {
+   /*
+* Only PMD_SIZE transparent hugepages(THP) are
+* currently supported. This code will need to be
+* updated to support other THP sizes.
+*/
hugetlb = transparent_hugepage_adjust(&pfn, &fault_ipa);
+   if (hugetlb)
+   vma_pagesize = PMD_SIZE;
+   }
+
+   if (writable)
+   kvm_set_pfn_dirty(pfn);
 
-   if (hugetlb) {
+   if (fault_status != FSC_PERM)
+   clean_dcache_guest_page(pfn, vma_pagesize);
+
+   if (exec_fault)
+   invalidate_icache_guest_page(pfn, vma_pagesize);
+
+   if (hugetlb && vma_pagesize == PMD_SIZE) {
pmd_t new_pmd = pfn_pmd(pfn, mem_type);
new_pmd = pmd_mkhuge(new_pmd);
-   if (writable) {
+   if (writable)
new_pmd = kvm_s2pmd_mkwrite(new_pmd);
-   kvm_set_pfn_dirty(pfn);
-   }
-
-   if (fault_status != FSC_PERM)
-   clean_dcache_guest_page(pfn, PMD_SIZE);
 
if (exec_fault) {
new_pmd = kvm_s2pmd_mkexec(new_pmd);
-   invalidate_icache_guest_page(pfn, PMD_SIZE);
} else if (fault_status == FSC_PERM) {
/* Preserve execute if XN was already cleared */
if (stage2_is_exec(kvm, fault_ipa))
@@ -1608,16 +1626,11 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, 
phys_addr_t fault_ipa,
 
if (writable) {
new_pte = kvm_s2pte_mkwrite(new_pte);
-   kvm_set_pfn_dirty(pfn);
mark_page_dirty(kvm, gfn);
}
 
-   if (fault_status != FSC_PERM)
-   clean_dcache_guest_page(pfn, PAGE_SIZE);
-
if (exec_fault) {
new_pte = kvm_s2pte_mkexec(new_pte);
-   invalidate_icache_guest_page(pfn, PAGE_SIZE);
} else if (fault_status == FSC_PERM) {
/* Preserve execute if XN was already cleared */
if (stage2_is_exec(kvm, fault_ipa))
-- 
2.18.0



[PATCH v8 1/9] KVM: arm/arm64: Ensure only THP is candidate for adjustment

2018-10-01 Thread Punit Agrawal
PageTransCompoundMap() returns true for hugetlbfs and THP
hugepages. This behaviour incorrectly causes stage 2 faults for
unsupported hugepage sizes (e.g., a 64K hugepage with 4K pages) to be
treated as THP faults.

Tighten the check to filter out hugetlbfs pages. This also leads to
consistently mapping all unsupported hugepage sizes as PTE level
entries at stage 2.

Signed-off-by: Punit Agrawal 
Reviewed-by: Suzuki Poulose 
Cc: Christoffer Dall 
Cc: Marc Zyngier 
Cc: sta...@vger.kernel.org # v4.13+
---
 virt/kvm/arm/mmu.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index 7e477b3cae5b..c23a1b323aad 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -1231,8 +1231,14 @@ static bool transparent_hugepage_adjust(kvm_pfn_t *pfnp, 
phys_addr_t *ipap)
 {
kvm_pfn_t pfn = *pfnp;
gfn_t gfn = *ipap >> PAGE_SHIFT;
+   struct page *page = pfn_to_page(pfn);
 
-   if (PageTransCompoundMap(pfn_to_page(pfn))) {
+   /*
+* PageTransCompoundMap() returns true for THP and
+* hugetlbfs. Make sure the adjustment is done only for THP
+* pages.
+*/
+   if (!PageHuge(page) && PageTransCompoundMap(page)) {
unsigned long mask;
/*
 * The address we faulted on is backed by a transparent huge
-- 
2.18.0



[PATCH v8 0/9] KVM: Support PUD hugepage at stage 2

2018-10-01 Thread Punit Agrawal
This series is an update to the PUD hugepage support previously posted
at [0]. This patchset adds support for PUD hugepages at stage 2, a
feature that is useful on cores that support large-sized TLB
mappings (e.g., 1GB for 4K granule).

The only change in this version is to update the kvm_stage2_has_pud()
helper for arm to use CONFIG_PGTABLE_LEVELS.

The patches are based on v6 of the dynamic IPA support.

The patches have been tested on an AMD Seattle system with the following
hugepage sizes - 64K, 32M, 1G.

Thanks,
Punit

v7 -> v8

* Add kvm_stage2_has_pud() helper on arm32
* Rebased to v6 of 52bit dynamic IPA support

v6 -> v7

* Restrict thp check to exclude hugetlbfs pages - Patch 1
* Don't update PUD entry if there's no change - Patch 9
* Add check for PUD level in stage 2 - Patch 9

v5 -> v6

* Split Patch 1 to move out the refactoring of exec permissions on
  page table entries.
* Patch 4 - Initialise p*dpp pointers in stage2_get_leaf_entry()
* Patch 5 - Trigger a BUG() in kvm_pud_pfn() on arm

v4 -> v5:
* Patch 1 - Drop helper stage2_should_exec() and refactor the
  condition to decide if a page table entry should be marked
  executable
* Patch 4-6 - Introduce stage2_get_leaf_entry() and use it in this and
  latter patches
* Patch 7 - Use stage 2 accessors instead of using the page table
  helpers directly
* Patch 7 - Add a note to update the PUD hugepage support when number
  of levels of stage 2 tables differs from stage 1

v3 -> v4:
* Patch 1 and 7 - Don't put down hugepages pte if logging is enabled
* Patch 4-5 - Add PUD hugepage support for exec and access faults
* Patch 6 - PUD hugepage support for aging page table entries

v2 -> v3:
* Update vma_pagesize directly if THP [1/4]. Previously this was done
  indirectly via hugetlb
* Added review tag [4/4]

v1 -> v2:
* Create helper to check if the page should have exec permission [1/4]
* Fix broken condition to detect THP hugepage [1/4]
* Fix incorrect hunk resulting from a rebase [4/4]

[0] https://www.spinics.net/lists/kvm-arm/msg32753.html
[1] https://lkml.org/lkml/2018/9/26/936

Punit Agrawal (9):
  KVM: arm/arm64: Ensure only THP is candidate for adjustment
  KVM: arm/arm64: Share common code in user_mem_abort()
  KVM: arm/arm64: Re-factor setting the Stage 2 entry to exec on fault
  KVM: arm/arm64: Introduce helpers to manipulate page table entries
  KVM: arm64: Support dirty page tracking for PUD hugepages
  KVM: arm64: Support PUD hugepage in stage2_is_exec()
  KVM: arm64: Support handling access faults for PUD hugepages
  KVM: arm64: Update age handlers to support PUD hugepages
  KVM: arm64: Add support for creating PUD hugepages at stage 2

 arch/arm/include/asm/kvm_mmu.h |  61 +
 arch/arm/include/asm/stage2_pgtable.h  |   9 +
 arch/arm64/include/asm/kvm_mmu.h   |  48 
 arch/arm64/include/asm/pgtable-hwdef.h |   4 +
 arch/arm64/include/asm/pgtable.h   |   9 +
 virt/kvm/arm/mmu.c | 320 +++--
 6 files changed, 373 insertions(+), 78 deletions(-)

-- 
2.18.0



Re: [PATCH v7.1 9/9] KVM: arm64: Add support for creating PUD hugepages at stage 2

2018-10-01 Thread Punit Agrawal
Punit Agrawal  writes:

> KVM only supports PMD hugepages at stage 2. Now that the various page
> handling routines are updated, extend the stage 2 fault handling to
> map in PUD hugepages.
>
> Addition of PUD hugepage support enables additional page sizes (e.g.,
> 1G with 4K granule) which can be useful on cores that support mapping
> larger block sizes in the TLB entries.
>
> Signed-off-by: Punit Agrawal 
> Cc: Christoffer Dall 
> Cc: Marc Zyngier 
> Cc: Russell King 
> Cc: Catalin Marinas 
> Cc: Will Deacon 
> ---
>
> v7 -> v7.1
>
> * Added arm helper kvm_stage2_has_pud()
> * Added check for PUD level present at stage 2
> * Dropped redundant comment
> * Fixed up kvm_pud_mkhuge() to complain on arm
>
>  arch/arm/include/asm/kvm_mmu.h |  20 +
>  arch/arm/include/asm/stage2_pgtable.h  |   5 ++
>  arch/arm64/include/asm/kvm_mmu.h   |  16 
>  arch/arm64/include/asm/pgtable-hwdef.h |   2 +
>  arch/arm64/include/asm/pgtable.h   |   2 +
>  virt/kvm/arm/mmu.c | 106 +++--
>  6 files changed, 145 insertions(+), 6 deletions(-)
>

[...]

> diff --git a/arch/arm/include/asm/stage2_pgtable.h 
> b/arch/arm/include/asm/stage2_pgtable.h
> index f6a7ea805232..ec1567d9eb4b 100644
> --- a/arch/arm/include/asm/stage2_pgtable.h
> +++ b/arch/arm/include/asm/stage2_pgtable.h
> @@ -68,4 +68,9 @@ stage2_pmd_addr_end(struct kvm *kvm, phys_addr_t addr, 
> phys_addr_t end)
>  #define stage2_pmd_table_empty(kvm, pmdp)kvm_page_empty(pmdp)
>  #define stage2_pud_table_empty(kvm, pudp)false
>  
> +static inline bool kvm_stage2_has_pud(struct kvm *kvm)
> +{
> + return KVM_VTCR_SL0 == VTCR_SL_L1;
> +}
> +

Turns out this isn't quite the right check. On arm32, the maximum number
of supported levels is 3 with LPAE - effectively the helper should
always return false.

I've updated the check locally to key off of CONFIG_PGTABLE_LEVELS. I'll
post these patches later today.
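
Concretely, the reworked helper keys off the page table configuration
rather than VTCR_SL0, as can be seen in the v8 posting of patch 9/9
earlier in this archive:

static inline bool kvm_stage2_has_pud(struct kvm *kvm)
{
#if CONFIG_PGTABLE_LEVELS > 3
	return true;
#else
	return false;
#endif
}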

Thanks,
Punit


[PATCH v7.1 9/9] KVM: arm64: Add support for creating PUD hugepages at stage 2

2018-09-25 Thread Punit Agrawal
KVM only supports PMD hugepages at stage 2. Now that the various page
handling routines are updated, extend the stage 2 fault handling to
map in PUD hugepages.

Addition of PUD hugepage support enables additional page sizes (e.g.,
1G with 4K granule) which can be useful on cores that support mapping
larger block sizes in the TLB entries.

Signed-off-by: Punit Agrawal 
Cc: Christoffer Dall 
Cc: Marc Zyngier 
Cc: Russell King 
Cc: Catalin Marinas 
Cc: Will Deacon 
---

v7 -> v7.1

* Added arm helper kvm_stage2_has_pud()
* Added check for PUD level present at stage 2
* Dropped redundant comment
* Fixed up kvm_pud_mkhuge() to complain on arm

 arch/arm/include/asm/kvm_mmu.h |  20 +
 arch/arm/include/asm/stage2_pgtable.h  |   5 ++
 arch/arm64/include/asm/kvm_mmu.h   |  16 
 arch/arm64/include/asm/pgtable-hwdef.h |   2 +
 arch/arm64/include/asm/pgtable.h   |   2 +
 virt/kvm/arm/mmu.c | 106 +++--
 6 files changed, 145 insertions(+), 6 deletions(-)

diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
index a42b9505c9a7..da5f078ae68c 100644
--- a/arch/arm/include/asm/kvm_mmu.h
+++ b/arch/arm/include/asm/kvm_mmu.h
@@ -84,11 +84,14 @@ void kvm_clear_hyp_idmap(void);
 
 #define kvm_pfn_pte(pfn, prot) pfn_pte(pfn, prot)
 #define kvm_pfn_pmd(pfn, prot) pfn_pmd(pfn, prot)
+#define kvm_pfn_pud(pfn, prot) (__pud(0))
 
 #define kvm_pud_pfn(pud)   ({ BUG(); 0; })
 
 
 #define kvm_pmd_mkhuge(pmd)pmd_mkhuge(pmd)
+/* No support for pud hugepages */
+#define kvm_pud_mkhuge(pud)( {BUG(); pud; })
 
 /*
  * The following kvm_*pud*() functions are provided strictly to allow
@@ -105,6 +108,23 @@ static inline bool kvm_s2pud_readonly(pud_t *pud)
return false;
 }
 
+static inline void kvm_set_pud(pud_t *pud, pud_t new_pud)
+{
+   BUG();
+}
+
+static inline pud_t kvm_s2pud_mkwrite(pud_t pud)
+{
+   BUG();
+   return pud;
+}
+
+static inline pud_t kvm_s2pud_mkexec(pud_t pud)
+{
+   BUG();
+   return pud;
+}
+
 static inline bool kvm_s2pud_exec(pud_t *pud)
 {
BUG();
diff --git a/arch/arm/include/asm/stage2_pgtable.h 
b/arch/arm/include/asm/stage2_pgtable.h
index f6a7ea805232..ec1567d9eb4b 100644
--- a/arch/arm/include/asm/stage2_pgtable.h
+++ b/arch/arm/include/asm/stage2_pgtable.h
@@ -68,4 +68,9 @@ stage2_pmd_addr_end(struct kvm *kvm, phys_addr_t addr, 
phys_addr_t end)
 #define stage2_pmd_table_empty(kvm, pmdp)  kvm_page_empty(pmdp)
 #define stage2_pud_table_empty(kvm, pudp)  false
 
+static inline bool kvm_stage2_has_pud(struct kvm *kvm)
+{
+   return KVM_VTCR_SL0 == VTCR_SL_L1;
+}
+
 #endif /* __ARM_S2_PGTABLE_H_ */
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index 3baf72705dcc..b4e9c2cceecb 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -184,12 +184,16 @@ void kvm_clear_hyp_idmap(void);
 #define kvm_mk_pgd(pudp)   \
__pgd(__phys_to_pgd_val(__pa(pudp)) | PUD_TYPE_TABLE)
 
+#define kvm_set_pud(pudp, pud) set_pud(pudp, pud)
+
 #define kvm_pfn_pte(pfn, prot) pfn_pte(pfn, prot)
 #define kvm_pfn_pmd(pfn, prot) pfn_pmd(pfn, prot)
+#define kvm_pfn_pud(pfn, prot) pfn_pud(pfn, prot)
 
 #define kvm_pud_pfn(pud)   pud_pfn(pud)
 
 #define kvm_pmd_mkhuge(pmd)pmd_mkhuge(pmd)
+#define kvm_pud_mkhuge(pud)pud_mkhuge(pud)
 
 static inline pte_t kvm_s2pte_mkwrite(pte_t pte)
 {
@@ -203,6 +207,12 @@ static inline pmd_t kvm_s2pmd_mkwrite(pmd_t pmd)
return pmd;
 }
 
+static inline pud_t kvm_s2pud_mkwrite(pud_t pud)
+{
+   pud_val(pud) |= PUD_S2_RDWR;
+   return pud;
+}
+
 static inline pte_t kvm_s2pte_mkexec(pte_t pte)
 {
pte_val(pte) &= ~PTE_S2_XN;
@@ -215,6 +225,12 @@ static inline pmd_t kvm_s2pmd_mkexec(pmd_t pmd)
return pmd;
 }
 
+static inline pud_t kvm_s2pud_mkexec(pud_t pud)
+{
+   pud_val(pud) &= ~PUD_S2_XN;
+   return pud;
+}
+
 static inline void kvm_set_s2pte_readonly(pte_t *ptep)
 {
pteval_t old_pteval, pteval;
diff --git a/arch/arm64/include/asm/pgtable-hwdef.h 
b/arch/arm64/include/asm/pgtable-hwdef.h
index 10ae592b78b8..e327665e94d1 100644
--- a/arch/arm64/include/asm/pgtable-hwdef.h
+++ b/arch/arm64/include/asm/pgtable-hwdef.h
@@ -193,6 +193,8 @@
 #define PMD_S2_RDWR(_AT(pmdval_t, 3) << 6)   /* HAP[2:1] */
 #define PMD_S2_XN  (_AT(pmdval_t, 2) << 53)  /* XN[1:0] */
 
+#define PUD_S2_RDONLY  (_AT(pudval_t, 1) << 6)   /* HAP[2:1] */
+#define PUD_S2_RDWR(_AT(pudval_t, 3) << 6)   /* HAP[2:1] */
 #define PUD_S2_XN  (_AT(pudval_t, 2) << 53)  /* XN[1:0] */
 
 /*
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 4d9476e420d9..0afc34f94ff5 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -389,6 +389,

Re: [PATCH v7 9/9] KVM: arm64: Add support for creating PUD hugepages at stage 2

2018-09-25 Thread Punit Agrawal
Suzuki K Poulose  writes:

> Hi Punit,
>
>
> On 09/24/2018 06:45 PM, Punit Agrawal wrote:
>> KVM only supports PMD hugepages at stage 2. Now that the various page
>> handling routines are updated, extend the stage 2 fault handling to
>> map in PUD hugepages.
>>
>> Addition of PUD hugepage support enables additional page sizes (e.g.,
>> 1G with 4K granule) which can be useful on cores that support mapping
>> larger block sizes in the TLB entries.
>>
>> Signed-off-by: Punit Agrawal 
>> Cc: Christoffer Dall 
>> Cc: Marc Zyngier 
>> Cc: Russell King 
>> Cc: Catalin Marinas 
>> Cc: Will Deacon 
>
>
>>
>> diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
>> index a42b9505c9a7..a8e86b926ee0 100644
>> --- a/arch/arm/include/asm/kvm_mmu.h
>> +++ b/arch/arm/include/asm/kvm_mmu.h
>> @@ -84,11 +84,14 @@ void kvm_clear_hyp_idmap(void);
>> #define kvm_pfn_pte(pfn, prot)   pfn_pte(pfn, prot)
>>   #define kvm_pfn_pmd(pfn, prot) pfn_pmd(pfn, prot)
>> +#define kvm_pfn_pud(pfn, prot)  (__pud(0))
>> #define kvm_pud_pfn(pud) ({ BUG(); 0; })
>>   #define kvm_pmd_mkhuge(pmd)pmd_mkhuge(pmd)
>> +/* No support for pud hugepages */
>> +#define kvm_pud_mkhuge(pud) (pud)
>>   
>
> shouldn't this be BUG() like other PUD huge helpers for arm32 ?
>
>>   /*
>>* The following kvm_*pud*() functions are provided strictly to allow
>> @@ -105,6 +108,23 @@ static inline bool kvm_s2pud_readonly(pud_t *pud)
>>  return false;
>>   }
>>   +static inline void kvm_set_pud(pud_t *pud, pud_t new_pud)
>> +{
>> +BUG();
>> +}
>> +
>> +static inline pud_t kvm_s2pud_mkwrite(pud_t pud)
>> +{
>> +BUG();
>> +return pud;
>> +}
>> +
>> +static inline pud_t kvm_s2pud_mkexec(pud_t pud)
>> +{
>> +BUG();
>> +return pud;
>> +}
>> +
>
>
>> diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
>> index 3ff7ebb262d2..5b8163537bc2 100644
>> --- a/virt/kvm/arm/mmu.c
>> +++ b/virt/kvm/arm/mmu.c
>
> ...
>
>
>> @@ -1669,7 +1746,28 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, 
>> phys_addr_t fault_ipa,
>>  needs_exec = exec_fault ||
>>  (fault_status == FSC_PERM && stage2_is_exec(kvm, fault_ipa));
>>   -  if (hugetlb && vma_pagesize == PMD_SIZE) {
>> +if (hugetlb && vma_pagesize == PUD_SIZE) {
>> +/*
>> + * Assuming that PUD level always exists at Stage 2 -
>> + * this is true for 4k pages with 40 bits IPA
>> + * currently supported.
>> + *
>> + * When using 64k pages, 40bits of IPA results in
>> + * using only 2-levels at Stage 2. Overlooking this
>> + * problem for now as a PUD hugepage with 64k pages is
>> + * too big (4TB) to be practical.
>> + */
>> +pud_t new_pud = kvm_pfn_pud(pfn, mem_type);
>
> Is this based on the Dynamic IPA series ? The cover letter seems
> to suggest that it is. But I don't see the check to make sure we have
> stage2 PUD level here before we go ahead and try PUD huge page at
> stage2. Also the comment above seems outdated in that case.

It is indeed based on the Dynamic IPA series but I seem to have lost the
actual changes introducing the checks for PUD level. Let me fix that up
and post an update.

Sorry for the noise.

Punit

>
>> +
>> +new_pud = kvm_pud_mkhuge(new_pud);
>> +if (writable)
>> +new_pud = kvm_s2pud_mkwrite(new_pud);
>> +
>> +if (needs_exec)
>> +new_pud = kvm_s2pud_mkexec(new_pud);
>> +
>> +ret = stage2_set_pud_huge(kvm, memcache, fault_ipa, &new_pud);
>> +} else if (hugetlb && vma_pagesize == PMD_SIZE) {
>>  pmd_t new_pmd = kvm_pfn_pmd(pfn, mem_type);
>>  new_pmd = kvm_pmd_mkhuge(new_pmd);
>>
>
>
> Suzuki


[PATCH v7 9/9] KVM: arm64: Add support for creating PUD hugepages at stage 2

2018-09-24 Thread Punit Agrawal
KVM only supports PMD hugepages at stage 2. Now that the various page
handling routines are updated, extend the stage 2 fault handling to
map in PUD hugepages.

Addition of PUD hugepage support enables additional page sizes (e.g.,
1G with 4K granule) which can be useful on cores that support mapping
larger block sizes in the TLB entries.

Signed-off-by: Punit Agrawal 
Cc: Christoffer Dall 
Cc: Marc Zyngier 
Cc: Russell King 
Cc: Catalin Marinas 
Cc: Will Deacon 
---
 arch/arm/include/asm/kvm_mmu.h |  20 +
 arch/arm64/include/asm/kvm_mmu.h   |  16 
 arch/arm64/include/asm/pgtable-hwdef.h |   2 +
 arch/arm64/include/asm/pgtable.h   |   2 +
 virt/kvm/arm/mmu.c | 108 +++--
 5 files changed, 143 insertions(+), 5 deletions(-)

diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
index a42b9505c9a7..a8e86b926ee0 100644
--- a/arch/arm/include/asm/kvm_mmu.h
+++ b/arch/arm/include/asm/kvm_mmu.h
@@ -84,11 +84,14 @@ void kvm_clear_hyp_idmap(void);
 
 #define kvm_pfn_pte(pfn, prot) pfn_pte(pfn, prot)
 #define kvm_pfn_pmd(pfn, prot) pfn_pmd(pfn, prot)
+#define kvm_pfn_pud(pfn, prot) (__pud(0))
 
 #define kvm_pud_pfn(pud)   ({ BUG(); 0; })
 
 
 #define kvm_pmd_mkhuge(pmd)pmd_mkhuge(pmd)
+/* No support for pud hugepages */
+#define kvm_pud_mkhuge(pud)(pud)
 
 /*
  * The following kvm_*pud*() functions are provided strictly to allow
@@ -105,6 +108,23 @@ static inline bool kvm_s2pud_readonly(pud_t *pud)
return false;
 }
 
+static inline void kvm_set_pud(pud_t *pud, pud_t new_pud)
+{
+   BUG();
+}
+
+static inline pud_t kvm_s2pud_mkwrite(pud_t pud)
+{
+   BUG();
+   return pud;
+}
+
+static inline pud_t kvm_s2pud_mkexec(pud_t pud)
+{
+   BUG();
+   return pud;
+}
+
 static inline bool kvm_s2pud_exec(pud_t *pud)
 {
BUG();
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index 3baf72705dcc..b4e9c2cceecb 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -184,12 +184,16 @@ void kvm_clear_hyp_idmap(void);
 #define kvm_mk_pgd(pudp)   \
__pgd(__phys_to_pgd_val(__pa(pudp)) | PUD_TYPE_TABLE)
 
+#define kvm_set_pud(pudp, pud) set_pud(pudp, pud)
+
 #define kvm_pfn_pte(pfn, prot) pfn_pte(pfn, prot)
 #define kvm_pfn_pmd(pfn, prot) pfn_pmd(pfn, prot)
+#define kvm_pfn_pud(pfn, prot) pfn_pud(pfn, prot)
 
 #define kvm_pud_pfn(pud)   pud_pfn(pud)
 
 #define kvm_pmd_mkhuge(pmd)pmd_mkhuge(pmd)
+#define kvm_pud_mkhuge(pud)pud_mkhuge(pud)
 
 static inline pte_t kvm_s2pte_mkwrite(pte_t pte)
 {
@@ -203,6 +207,12 @@ static inline pmd_t kvm_s2pmd_mkwrite(pmd_t pmd)
return pmd;
 }
 
+static inline pud_t kvm_s2pud_mkwrite(pud_t pud)
+{
+   pud_val(pud) |= PUD_S2_RDWR;
+   return pud;
+}
+
 static inline pte_t kvm_s2pte_mkexec(pte_t pte)
 {
pte_val(pte) &= ~PTE_S2_XN;
@@ -215,6 +225,12 @@ static inline pmd_t kvm_s2pmd_mkexec(pmd_t pmd)
return pmd;
 }
 
+static inline pud_t kvm_s2pud_mkexec(pud_t pud)
+{
+   pud_val(pud) &= ~PUD_S2_XN;
+   return pud;
+}
+
 static inline void kvm_set_s2pte_readonly(pte_t *ptep)
 {
pteval_t old_pteval, pteval;
diff --git a/arch/arm64/include/asm/pgtable-hwdef.h b/arch/arm64/include/asm/pgtable-hwdef.h
index 10ae592b78b8..e327665e94d1 100644
--- a/arch/arm64/include/asm/pgtable-hwdef.h
+++ b/arch/arm64/include/asm/pgtable-hwdef.h
@@ -193,6 +193,8 @@
 #define PMD_S2_RDWR(_AT(pmdval_t, 3) << 6)   /* HAP[2:1] */
 #define PMD_S2_XN  (_AT(pmdval_t, 2) << 53)  /* XN[1:0] */
 
+#define PUD_S2_RDONLY  (_AT(pudval_t, 1) << 6)   /* HAP[2:1] */
+#define PUD_S2_RDWR(_AT(pudval_t, 3) << 6)   /* HAP[2:1] */
 #define PUD_S2_XN  (_AT(pudval_t, 2) << 53)  /* XN[1:0] */
 
 /*
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 4d9476e420d9..0afc34f94ff5 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -389,6 +389,8 @@ static inline int pmd_protnone(pmd_t pmd)
 #define pud_mkyoung(pud)   pte_pud(pte_mkyoung(pud_pte(pud)))
 #define pud_write(pud) pte_write(pud_pte(pud))
 
+#define pud_mkhuge(pud)(__pud(pud_val(pud) & ~PUD_TABLE_BIT))
+
 #define __pud_to_phys(pud) __pte_to_phys(pud_pte(pud))
 #define __phys_to_pud_val(phys)__phys_to_pte_val(phys)
 #define pud_pfn(pud)   ((__pud_to_phys(pud) & PUD_MASK) >> PAGE_SHIFT)
diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index 3ff7ebb262d2..5b8163537bc2 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -115,6 +115,25 @@ static void stage2_dissolve_pmd(struct kvm *kvm, 
phys_addr_t addr, pmd_t *pmd)
put_page(virt_to_page(pmd));
 }
 
+/**
+ * stage2_dissolve_pud() - clear and flush hug

[PATCH v7 8/9] KVM: arm64: Update age handlers to support PUD hugepages

2018-09-24 Thread Punit Agrawal
In preparation for creating larger hugepages at Stage 2, add support
to the age handling notifiers for PUD hugepages when encountered.

Provide trivial helpers for arm32 to allow sharing code.

Signed-off-by: Punit Agrawal 
Reviewed-by: Suzuki K Poulose 
Cc: Christoffer Dall 
Cc: Marc Zyngier 
Cc: Russell King 
Cc: Catalin Marinas 
Cc: Will Deacon 
---
 arch/arm/include/asm/kvm_mmu.h   |  6 +
 arch/arm64/include/asm/kvm_mmu.h |  5 
 arch/arm64/include/asm/pgtable.h |  1 +
 virt/kvm/arm/mmu.c   | 39 
 4 files changed, 32 insertions(+), 19 deletions(-)

diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
index 95b34aad0dc8..a42b9505c9a7 100644
--- a/arch/arm/include/asm/kvm_mmu.h
+++ b/arch/arm/include/asm/kvm_mmu.h
@@ -117,6 +117,12 @@ static inline pud_t kvm_s2pud_mkyoung(pud_t pud)
return pud;
 }
 
+static inline bool kvm_s2pud_young(pud_t pud)
+{
+   BUG();
+   return false;
+}
+
 static inline pte_t kvm_s2pte_mkwrite(pte_t pte)
 {
pte_val(pte) |= L_PTE_S2_RDWR;
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index b93e5167728f..3baf72705dcc 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -273,6 +273,11 @@ static inline pud_t kvm_s2pud_mkyoung(pud_t pud)
return pud_mkyoung(pud);
 }
 
+static inline bool kvm_s2pud_young(pud_t pud)
+{
+   return pud_young(pud);
+}
+
 #define hyp_pte_table_empty(ptep) kvm_page_empty(ptep)
 
 #ifdef __PAGETABLE_PMD_FOLDED
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index a64a5c35beb1..4d9476e420d9 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -385,6 +385,7 @@ static inline int pmd_protnone(pmd_t pmd)
 #define pfn_pmd(pfn,prot)  __pmd(__phys_to_pmd_val((phys_addr_t)(pfn) << 
PAGE_SHIFT) | pgprot_val(prot))
 #define mk_pmd(page,prot)  pfn_pmd(page_to_pfn(page),prot)
 
+#define pud_young(pud) pte_young(pud_pte(pud))
 #define pud_mkyoung(pud)   pte_pud(pte_mkyoung(pud_pte(pud)))
 #define pud_write(pud) pte_write(pud_pte(pud))
 
diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index e2487e5fff37..3ff7ebb262d2 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -1225,6 +1225,11 @@ static int stage2_pmdp_test_and_clear_young(pmd_t *pmd)
return stage2_ptep_test_and_clear_young((pte_t *)pmd);
 }
 
+static int stage2_pudp_test_and_clear_young(pud_t *pud)
+{
+   return stage2_ptep_test_and_clear_young((pte_t *)pud);
+}
+
 /**
  * kvm_phys_addr_ioremap - map a device range to guest IPA
  *
@@ -1940,42 +1945,38 @@ void kvm_set_spte_hva(struct kvm *kvm, unsigned long 
hva, pte_t pte)
 
 static int kvm_age_hva_handler(struct kvm *kvm, gpa_t gpa, u64 size, void 
*data)
 {
+   pud_t *pud;
pmd_t *pmd;
pte_t *pte;
 
-   WARN_ON(size != PAGE_SIZE && size != PMD_SIZE);
-   pmd = stage2_get_pmd(kvm, NULL, gpa);
-   if (!pmd || pmd_none(*pmd)) /* Nothing there */
+   WARN_ON(size != PAGE_SIZE && size != PMD_SIZE && size != PUD_SIZE);
+   if (!stage2_get_leaf_entry(kvm, gpa, &pud, &pmd, &pte))
return 0;
 
-   if (pmd_thp_or_huge(*pmd))  /* THP, HugeTLB */
+   if (pud)
+   return stage2_pudp_test_and_clear_young(pud);
+   else if (pmd)
return stage2_pmdp_test_and_clear_young(pmd);
-
-   pte = pte_offset_kernel(pmd, gpa);
-   if (pte_none(*pte))
-   return 0;
-
-   return stage2_ptep_test_and_clear_young(pte);
+   else
+   return stage2_ptep_test_and_clear_young(pte);
 }
 
 static int kvm_test_age_hva_handler(struct kvm *kvm, gpa_t gpa, u64 size, void 
*data)
 {
+   pud_t *pud;
pmd_t *pmd;
pte_t *pte;
 
-   WARN_ON(size != PAGE_SIZE && size != PMD_SIZE);
-   pmd = stage2_get_pmd(kvm, NULL, gpa);
-   if (!pmd || pmd_none(*pmd)) /* Nothing there */
+   WARN_ON(size != PAGE_SIZE && size != PMD_SIZE && size != PUD_SIZE);
+   if (!stage2_get_leaf_entry(kvm, gpa, &pud, &pmd, &pte))
return 0;
 
-   if (pmd_thp_or_huge(*pmd))  /* THP, HugeTLB */
+   if (pud)
+   return kvm_s2pud_young(*pud);
+   else if (pmd)
return pmd_young(*pmd);
-
-   pte = pte_offset_kernel(pmd, gpa);
-   if (!pte_none(*pte))/* Just a page... */
+   else
return pte_young(*pte);
-
-   return 0;
 }
 
 int kvm_age_hva(struct kvm *kvm, unsigned long start, unsigned long end)
-- 
2.18.0



[PATCH v7 0/9] KVM: Support PUD hugepages at stage 2

2018-09-24 Thread Punit Agrawal
This series is an update to the PUD hugepage support previously posted
at [0]. This patchset adds support for PUD hugepages at stage 2, a
feature that is useful on cores that support large block sizes in
their TLB entries (e.g., 1GB for the 4K granule).

This version fixes two bugs -

* Corrects stage 2 fault handling for unsupported hugepage sizes (new
  patch 1/9). This is a long-standing bug and needs backporting to
  earlier kernels

* Ensures that multiple vcpus faulting on the same hugepage don't
  hamper forward progress (patch 9)

The patches are based on dynamic IPA support which could lead to a
situation where the guest doesn't have PUD level. In this case, the
patches have been updated to fallback to the PTE level mappings at
stage 2.

The patches have been tested on AMD Seattle system with the following
hugepage sizes - 64K, 32M, 1G.

Thanks,
Punit

v6 -> v7

* Restrict thp check to exclude hugetlbfs pages - Patch 1
* Don't update PUD entry if there's no change - Patch 9
* Add check for PUD level in stage 2 - Patch 9

v5 -> v6

* Split Patch 1 to move out the refactoring of exec permissions on
  page table entries.
* Patch 4 - Initialise p*dpp pointers in stage2_get_leaf_entry()
* Patch 5 - Trigger a BUG() in kvm_pud_pfn() on arm

v4 -> v5:
* Patch 1 - Drop helper stage2_should_exec() and refactor the
  condition to decide if a page table entry should be marked
  executable
* Patch 4-6 - Introduce stage2_get_leaf_entry() and use it in this and
  latter patches
* Patch 7 - Use stage 2 accessors instead of using the page table
  helpers directly
* Patch 7 - Add a note to update the PUD hugepage support when number
  of levels of stage 2 tables differs from stage 1

v3 -> v4:
* Patch 1 and 7 - Don't put down hugepages pte if logging is enabled
* Patch 4-5 - Add PUD hugepage support for exec and access faults
* Patch 6 - PUD hugepage support for aging page table entries

v2 -> v3:
* Update vma_pagesize directly if THP [1/4]. Previously this was done
  indirectly via hugetlb
* Added review tag [4/4]

v1 -> v2:
* Create helper to check if the page should have exec permission [1/4]
* Fix broken condition to detect THP hugepage [1/4]
* Fix in-correct hunk resulting from a rebase [4/4]

[0] https://www.spinics.net/lists/kvm-arm/msg32241.html
[1] https://www.spinics.net/lists/kvm-arm/msg32641.html


Punit Agrawal (9):
  KVM: arm/arm64: Ensure only THP is candidate for adjustment
  KVM: arm/arm64: Share common code in user_mem_abort()
  KVM: arm/arm64: Re-factor setting the Stage 2 entry to exec on fault
  KVM: arm/arm64: Introduce helpers to manipulate page table entries
  KVM: arm64: Support dirty page tracking for PUD hugepages
  KVM: arm64: Support PUD hugepage in stage2_is_exec()
  KVM: arm64: Support handling access faults for PUD hugepages
  KVM: arm64: Update age handlers to support PUD hugepages
  KVM: arm64: Add support for creating PUD hugepages at stage 2

 arch/arm/include/asm/kvm_mmu.h |  61 +
 arch/arm64/include/asm/kvm_mmu.h   |  48 
 arch/arm64/include/asm/pgtable-hwdef.h |   4 +
 arch/arm64/include/asm/pgtable.h   |   9 +
 virt/kvm/arm/mmu.c | 324 +++--
 5 files changed, 368 insertions(+), 78 deletions(-)

-- 
2.18.0



[PATCH v7 1/9] KVM: arm/arm64: Ensure only THP is candidate for adjustment

2018-09-24 Thread Punit Agrawal
PageTransCompoundMap() returns true for hugetlbfs and THP
hugepages. This behaviour incorrectly causes stage 2 faults for
unsupported hugepage sizes (e.g., a 64K hugepage with 4K pages) to be
treated as THP faults.

Tighten the check to filter out hugetlbfs pages. This also leads to
consistently mapping all unsupported hugepage sizes as PTE level
entries at stage 2.

Signed-off-by: Punit Agrawal 
Cc: Christoffer Dall 
Cc: Marc Zyngier 
Cc: Suzuki Poulose 
Cc: sta...@vger.kernel.org # v4.13+
---
 virt/kvm/arm/mmu.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index 7e477b3cae5b..c23a1b323aad 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -1231,8 +1231,14 @@ static bool transparent_hugepage_adjust(kvm_pfn_t *pfnp, 
phys_addr_t *ipap)
 {
kvm_pfn_t pfn = *pfnp;
gfn_t gfn = *ipap >> PAGE_SHIFT;
+   struct page *page = pfn_to_page(pfn);
 
-   if (PageTransCompoundMap(pfn_to_page(pfn))) {
+   /*
+* PageTransCompoundMap() returns true for THP and
+* hugetlbfs. Make sure the adjustment is done only for THP
+* pages.
+*/
+   if (!PageHuge(page) && PageTransCompoundMap(page)) {
unsigned long mask;
/*
 * The address we faulted on is backed by a transparent huge
-- 
2.18.0



[PATCH v7 4/9] KVM: arm/arm64: Introduce helpers to manipulate page table entries

2018-09-24 Thread Punit Agrawal
Introduce helpers to abstract architectural handling of the conversion
of pfn to page table entries and marking a PMD page table entry as a
block entry.

The helpers are introduced in preparation for supporting PUD hugepages
at stage 2 - which are supported on arm64 but do not exist on arm.

Signed-off-by: Punit Agrawal 
Reviewed-by: Suzuki K Poulose 
Acked-by: Christoffer Dall 
Cc: Marc Zyngier 
Cc: Russell King 
Cc: Catalin Marinas 
Cc: Will Deacon 
---
 arch/arm/include/asm/kvm_mmu.h   |  5 +
 arch/arm64/include/asm/kvm_mmu.h |  5 +
 virt/kvm/arm/mmu.c   | 14 --
 3 files changed, 18 insertions(+), 6 deletions(-)

diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
index 5ad1a54f98dc..e77212e53e77 100644
--- a/arch/arm/include/asm/kvm_mmu.h
+++ b/arch/arm/include/asm/kvm_mmu.h
@@ -82,6 +82,11 @@ void kvm_clear_hyp_idmap(void);
 #define kvm_mk_pud(pmdp)   __pud(__pa(pmdp) | PMD_TYPE_TABLE)
 #define kvm_mk_pgd(pudp)   ({ BUILD_BUG(); 0; })
 
+#define kvm_pfn_pte(pfn, prot) pfn_pte(pfn, prot)
+#define kvm_pfn_pmd(pfn, prot) pfn_pmd(pfn, prot)
+
+#define kvm_pmd_mkhuge(pmd)pmd_mkhuge(pmd)
+
 static inline pte_t kvm_s2pte_mkwrite(pte_t pte)
 {
pte_val(pte) |= L_PTE_S2_RDWR;
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index 77b1af9e64db..baabea0cbb66 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -184,6 +184,11 @@ void kvm_clear_hyp_idmap(void);
 #define kvm_mk_pgd(pudp)   \
__pgd(__phys_to_pgd_val(__pa(pudp)) | PUD_TYPE_TABLE)
 
+#define kvm_pfn_pte(pfn, prot) pfn_pte(pfn, prot)
+#define kvm_pfn_pmd(pfn, prot) pfn_pmd(pfn, prot)
+
+#define kvm_pmd_mkhuge(pmd)pmd_mkhuge(pmd)
+
 static inline pte_t kvm_s2pte_mkwrite(pte_t pte)
 {
pte_val(pte) |= PTE_S2_RDWR;
diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index ec64d21c6571..21079eb5bc15 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -607,7 +607,7 @@ static void create_hyp_pte_mappings(pmd_t *pmd, unsigned 
long start,
addr = start;
do {
pte = pte_offset_kernel(pmd, addr);
-   kvm_set_pte(pte, pfn_pte(pfn, prot));
+   kvm_set_pte(pte, kvm_pfn_pte(pfn, prot));
get_page(virt_to_page(pte));
pfn++;
} while (addr += PAGE_SIZE, addr != end);
@@ -1202,7 +1202,7 @@ int kvm_phys_addr_ioremap(struct kvm *kvm, phys_addr_t 
guest_ipa,
pfn = __phys_to_pfn(pa);
 
for (addr = guest_ipa; addr < end; addr += PAGE_SIZE) {
-   pte_t pte = pfn_pte(pfn, PAGE_S2_DEVICE);
+   pte_t pte = kvm_pfn_pte(pfn, PAGE_S2_DEVICE);
 
if (writable)
pte = kvm_s2pte_mkwrite(pte);
@@ -1619,8 +1619,10 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, 
phys_addr_t fault_ipa,
(fault_status == FSC_PERM && stage2_is_exec(kvm, fault_ipa));
 
if (hugetlb && vma_pagesize == PMD_SIZE) {
-   pmd_t new_pmd = pfn_pmd(pfn, mem_type);
-   new_pmd = pmd_mkhuge(new_pmd);
+   pmd_t new_pmd = kvm_pfn_pmd(pfn, mem_type);
+
+   new_pmd = kvm_pmd_mkhuge(new_pmd);
+
if (writable)
new_pmd = kvm_s2pmd_mkwrite(new_pmd);
 
@@ -1629,7 +1631,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, 
phys_addr_t fault_ipa,
 
ret = stage2_set_pmd_huge(kvm, memcache, fault_ipa, &new_pmd);
} else {
-   pte_t new_pte = pfn_pte(pfn, mem_type);
+   pte_t new_pte = kvm_pfn_pte(pfn, mem_type);
 
if (writable) {
new_pte = kvm_s2pte_mkwrite(new_pte);
@@ -1886,7 +1888,7 @@ void kvm_set_spte_hva(struct kvm *kvm, unsigned long hva, 
pte_t pte)
 * just like a translation fault and clean the cache to the PoC.
 */
clean_dcache_guest_page(pfn, PAGE_SIZE);
-   stage2_pte = pfn_pte(pfn, PAGE_S2);
+   stage2_pte = kvm_pfn_pte(pfn, PAGE_S2);
handle_hva_to_gpa(kvm, hva, end, &kvm_set_spte_handler, &stage2_pte);
 }
 
-- 
2.18.0



[PATCH v7 3/9] KVM: arm/arm64: Re-factor setting the Stage 2 entry to exec on fault

2018-09-24 Thread Punit Agrawal
The stage 2 fault handler marks a page as executable if it is handling
an execution fault, or if it was a permission fault, in which case the
executable bit needs to be preserved.

The logic to decide if the page should be marked executable is
duplicated for PMD and PTE entries. To avoid creating another copy
when support for PUD hugepages is introduced, refactor the code to
share the checks needed to mark a page table entry as executable.

Signed-off-by: Punit Agrawal 
Reviewed-by: Suzuki K Poulose 
Cc: Christoffer Dall 
Cc: Marc Zyngier 
---
 virt/kvm/arm/mmu.c | 28 +++-
 1 file changed, 15 insertions(+), 13 deletions(-)

diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index 5b76ee204000..ec64d21c6571 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -1481,7 +1481,8 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, 
phys_addr_t fault_ipa,
  unsigned long fault_status)
 {
int ret;
-   bool write_fault, exec_fault, writable, hugetlb = false, force_pte = 
false;
+   bool write_fault, writable, hugetlb = false, force_pte = false;
+   bool exec_fault, needs_exec;
unsigned long mmu_seq;
gfn_t gfn = fault_ipa >> PAGE_SHIFT;
struct kvm *kvm = vcpu->kvm;
@@ -1606,19 +1607,25 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, 
phys_addr_t fault_ipa,
if (exec_fault)
invalidate_icache_guest_page(pfn, vma_pagesize);
 
+   /*
+* If we took an execution fault we have made the
+* icache/dcache coherent above and should now let the s2
+* mapping be executable.
+*
+* Write faults (!exec_fault && FSC_PERM) are orthogonal to
+* execute permissions, and we preserve whatever we have.
+*/
+   needs_exec = exec_fault ||
+   (fault_status == FSC_PERM && stage2_is_exec(kvm, fault_ipa));
+
if (hugetlb && vma_pagesize == PMD_SIZE) {
pmd_t new_pmd = pfn_pmd(pfn, mem_type);
new_pmd = pmd_mkhuge(new_pmd);
if (writable)
new_pmd = kvm_s2pmd_mkwrite(new_pmd);
 
-   if (exec_fault) {
+   if (needs_exec)
new_pmd = kvm_s2pmd_mkexec(new_pmd);
-   } else if (fault_status == FSC_PERM) {
-   /* Preserve execute if XN was already cleared */
-   if (stage2_is_exec(kvm, fault_ipa))
-   new_pmd = kvm_s2pmd_mkexec(new_pmd);
-   }
 
ret = stage2_set_pmd_huge(kvm, memcache, fault_ipa, &new_pmd);
} else {
@@ -1629,13 +1636,8 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, 
phys_addr_t fault_ipa,
mark_page_dirty(kvm, gfn);
}
 
-   if (exec_fault) {
+   if (needs_exec)
new_pte = kvm_s2pte_mkexec(new_pte);
-   } else if (fault_status == FSC_PERM) {
-   /* Preserve execute if XN was already cleared */
-   if (stage2_is_exec(kvm, fault_ipa))
-   new_pte = kvm_s2pte_mkexec(new_pte);
-   }
 
ret = stage2_set_pte(kvm, memcache, fault_ipa, &new_pte, flags);
}
-- 
2.18.0



[PATCH v7 6/9] KVM: arm64: Support PUD hugepage in stage2_is_exec()

2018-09-24 Thread Punit Agrawal
In preparation for creating PUD hugepages at stage 2, add support for
detecting execute permissions on PUD page table entries. Faults due to
lack of execute permissions on page table entries is used to perform
i-cache invalidation on first execute.

Provide trivial implementations of arm32 helpers to allow sharing of
code.

Signed-off-by: Punit Agrawal 
Reviewed-by: Suzuki K Poulose 
Cc: Christoffer Dall 
Cc: Marc Zyngier 
Cc: Russell King 
Cc: Catalin Marinas 
Cc: Will Deacon 
---
 arch/arm/include/asm/kvm_mmu.h |  6 +++
 arch/arm64/include/asm/kvm_mmu.h   |  5 +++
 arch/arm64/include/asm/pgtable-hwdef.h |  2 +
 virt/kvm/arm/mmu.c | 53 +++---
 4 files changed, 61 insertions(+), 5 deletions(-)

diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
index 9ec09f4cc284..26a2ab05b3f6 100644
--- a/arch/arm/include/asm/kvm_mmu.h
+++ b/arch/arm/include/asm/kvm_mmu.h
@@ -102,6 +102,12 @@ static inline bool kvm_s2pud_readonly(pud_t *pud)
return false;
 }
 
+static inline bool kvm_s2pud_exec(pud_t *pud)
+{
+   BUG();
+   return false;
+}
+
 static inline pte_t kvm_s2pte_mkwrite(pte_t pte)
 {
pte_val(pte) |= L_PTE_S2_RDWR;
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index 3cc342177474..c06ef3be8ca9 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -261,6 +261,11 @@ static inline bool kvm_s2pud_readonly(pud_t *pudp)
return kvm_s2pte_readonly((pte_t *)pudp);
 }
 
+static inline bool kvm_s2pud_exec(pud_t *pudp)
+{
+   return !(READ_ONCE(pud_val(*pudp)) & PUD_S2_XN);
+}
+
 #define hyp_pte_table_empty(ptep) kvm_page_empty(ptep)
 
 #ifdef __PAGETABLE_PMD_FOLDED
diff --git a/arch/arm64/include/asm/pgtable-hwdef.h b/arch/arm64/include/asm/pgtable-hwdef.h
index fd208eac9f2a..10ae592b78b8 100644
--- a/arch/arm64/include/asm/pgtable-hwdef.h
+++ b/arch/arm64/include/asm/pgtable-hwdef.h
@@ -193,6 +193,8 @@
 #define PMD_S2_RDWR(_AT(pmdval_t, 3) << 6)   /* HAP[2:1] */
 #define PMD_S2_XN  (_AT(pmdval_t, 2) << 53)  /* XN[1:0] */
 
+#define PUD_S2_XN  (_AT(pudval_t, 2) << 53)  /* XN[1:0] */
+
 /*
  * Memory Attribute override for Stage-2 (MemAttr[3:0])
  */
diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index 9c48f2ca6583..5fd1eae7d964 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -1083,23 +1083,66 @@ static int stage2_set_pmd_huge(struct kvm *kvm, struct 
kvm_mmu_memory_cache
return 0;
 }
 
-static bool stage2_is_exec(struct kvm *kvm, phys_addr_t addr)
+/*
+ * stage2_get_leaf_entry - walk the stage2 VM page tables and return
+ * true if a valid and present leaf-entry is found. A pointer to the
+ * leaf-entry is returned in the appropriate level variable - pudpp,
+ * pmdpp, ptepp.
+ */
+static bool stage2_get_leaf_entry(struct kvm *kvm, phys_addr_t addr,
+ pud_t **pudpp, pmd_t **pmdpp, pte_t **ptepp)
 {
+   pud_t *pudp;
pmd_t *pmdp;
pte_t *ptep;
 
-   pmdp = stage2_get_pmd(kvm, NULL, addr);
+   *pudpp = NULL;
+   *pmdpp = NULL;
+   *ptepp = NULL;
+
+   pudp = stage2_get_pud(kvm, NULL, addr);
+   if (!pudp || stage2_pud_none(kvm, *pudp) || !stage2_pud_present(kvm, 
*pudp))
+   return false;
+
+   if (stage2_pud_huge(kvm, *pudp)) {
+   *pudpp = pudp;
+   return true;
+   }
+
+   pmdp = stage2_pmd_offset(kvm, pudp, addr);
if (!pmdp || pmd_none(*pmdp) || !pmd_present(*pmdp))
return false;
 
-   if (pmd_thp_or_huge(*pmdp))
-   return kvm_s2pmd_exec(pmdp);
+   if (pmd_thp_or_huge(*pmdp)) {
+   *pmdpp = pmdp;
+   return true;
+   }
 
ptep = pte_offset_kernel(pmdp, addr);
if (!ptep || pte_none(*ptep) || !pte_present(*ptep))
return false;
 
-   return kvm_s2pte_exec(ptep);
+   *ptepp = ptep;
+   return true;
+}
+
+static bool stage2_is_exec(struct kvm *kvm, phys_addr_t addr)
+{
+   pud_t *pudp;
+   pmd_t *pmdp;
+   pte_t *ptep;
+   bool found;
+
+   found = stage2_get_leaf_entry(kvm, addr, &pudp, &pmdp, &ptep);
+   if (!found)
+   return false;
+
+   if (pudp)
+   return kvm_s2pud_exec(pudp);
+   else if (pmdp)
+   return kvm_s2pmd_exec(pmdp);
+   else
+   return kvm_s2pte_exec(ptep);
 }
 
 static int stage2_set_pte(struct kvm *kvm, struct kvm_mmu_memory_cache *cache,
-- 
2.18.0



[PATCH v7 5/9] KVM: arm64: Support dirty page tracking for PUD hugepages

2018-09-24 Thread Punit Agrawal
In preparation for creating PUD hugepages at stage 2, add support for
write protecting PUD hugepages when they are encountered. Write
protecting guest tables is used to track dirty pages when migrating
VMs.

Also, provide trivial implementations of required kvm_s2pud_* helpers
to allow sharing of code with arm32.

Signed-off-by: Punit Agrawal 
Reviewed-by: Christoffer Dall 
Reviewed-by: Suzuki K Poulose 
Cc: Marc Zyngier 
Cc: Russell King 
Cc: Catalin Marinas 
Cc: Will Deacon 
---
 arch/arm/include/asm/kvm_mmu.h   | 15 +++
 arch/arm64/include/asm/kvm_mmu.h | 10 ++
 virt/kvm/arm/mmu.c   | 11 +++
 3 files changed, 32 insertions(+), 4 deletions(-)

diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
index e77212e53e77..9ec09f4cc284 100644
--- a/arch/arm/include/asm/kvm_mmu.h
+++ b/arch/arm/include/asm/kvm_mmu.h
@@ -87,6 +87,21 @@ void kvm_clear_hyp_idmap(void);
 
 #define kvm_pmd_mkhuge(pmd)pmd_mkhuge(pmd)
 
+/*
+ * The following kvm_*pud*() functions are provided strictly to allow
+ * sharing code with arm64. They should never be called in practice.
+ */
+static inline void kvm_set_s2pud_readonly(pud_t *pud)
+{
+   BUG();
+}
+
+static inline bool kvm_s2pud_readonly(pud_t *pud)
+{
+   BUG();
+   return false;
+}
+
 static inline pte_t kvm_s2pte_mkwrite(pte_t pte)
 {
pte_val(pte) |= L_PTE_S2_RDWR;
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index baabea0cbb66..3cc342177474 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -251,6 +251,16 @@ static inline bool kvm_s2pmd_exec(pmd_t *pmdp)
return !(READ_ONCE(pmd_val(*pmdp)) & PMD_S2_XN);
 }
 
+static inline void kvm_set_s2pud_readonly(pud_t *pudp)
+{
+   kvm_set_s2pte_readonly((pte_t *)pudp);
+}
+
+static inline bool kvm_s2pud_readonly(pud_t *pudp)
+{
+   return kvm_s2pte_readonly((pte_t *)pudp);
+}
+
 #define hyp_pte_table_empty(ptep) kvm_page_empty(ptep)
 
 #ifdef __PAGETABLE_PMD_FOLDED
diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index 21079eb5bc15..9c48f2ca6583 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -1347,9 +1347,12 @@ static void  stage2_wp_puds(struct kvm *kvm, pgd_t *pgd,
do {
next = stage2_pud_addr_end(kvm, addr, end);
if (!stage2_pud_none(kvm, *pud)) {
-   /* TODO:PUD not supported, revisit later if supported */
-   BUG_ON(stage2_pud_huge(kvm, *pud));
-   stage2_wp_pmds(kvm, pud, addr, next);
+   if (stage2_pud_huge(kvm, *pud)) {
+   if (!kvm_s2pud_readonly(pud))
+   kvm_set_s2pud_readonly(pud);
+   } else {
+   stage2_wp_pmds(kvm, pud, addr, next);
+   }
}
} while (pud++, addr = next, addr != end);
 }
@@ -1392,7 +1395,7 @@ static void stage2_wp_range(struct kvm *kvm, phys_addr_t 
addr, phys_addr_t end)
  *
  * Called to start logging dirty pages after memory region
  * KVM_MEM_LOG_DIRTY_PAGES operation is called. After this function returns
- * all present PMD and PTEs are write protected in the memory region.
+ * all present PUD, PMD and PTEs are write protected in the memory region.
  * Afterwards read of dirty page log can be called.
  *
  * Acquires kvm_mmu_lock. Called with kvm->slots_lock mutex acquired,
-- 
2.18.0



[PATCH v7 7/9] KVM: arm64: Support handling access faults for PUD hugepages

2018-09-24 Thread Punit Agrawal
In preparation for creating larger hugepages at Stage 2, extend the
access fault handling at Stage 2 to support PUD hugepages when
encountered.

Provide trivial helpers for arm32 to allow sharing of code.

Signed-off-by: Punit Agrawal 
Reviewed-by: Suzuki K Poulose 
Cc: Christoffer Dall 
Cc: Marc Zyngier 
Cc: Russell King 
Cc: Catalin Marinas 
Cc: Will Deacon 
---
 arch/arm/include/asm/kvm_mmu.h   |  9 +
 arch/arm64/include/asm/kvm_mmu.h |  7 +++
 arch/arm64/include/asm/pgtable.h |  6 ++
 virt/kvm/arm/mmu.c   | 22 +++---
 4 files changed, 33 insertions(+), 11 deletions(-)

diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
index 26a2ab05b3f6..95b34aad0dc8 100644
--- a/arch/arm/include/asm/kvm_mmu.h
+++ b/arch/arm/include/asm/kvm_mmu.h
@@ -85,6 +85,9 @@ void kvm_clear_hyp_idmap(void);
 #define kvm_pfn_pte(pfn, prot) pfn_pte(pfn, prot)
 #define kvm_pfn_pmd(pfn, prot) pfn_pmd(pfn, prot)
 
+#define kvm_pud_pfn(pud)   ({ BUG(); 0; })
+
+
 #define kvm_pmd_mkhuge(pmd)pmd_mkhuge(pmd)
 
 /*
@@ -108,6 +111,12 @@ static inline bool kvm_s2pud_exec(pud_t *pud)
return false;
 }
 
+static inline pud_t kvm_s2pud_mkyoung(pud_t pud)
+{
+   BUG();
+   return pud;
+}
+
 static inline pte_t kvm_s2pte_mkwrite(pte_t pte)
 {
pte_val(pte) |= L_PTE_S2_RDWR;
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index c06ef3be8ca9..b93e5167728f 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -187,6 +187,8 @@ void kvm_clear_hyp_idmap(void);
 #define kvm_pfn_pte(pfn, prot) pfn_pte(pfn, prot)
 #define kvm_pfn_pmd(pfn, prot) pfn_pmd(pfn, prot)
 
+#define kvm_pud_pfn(pud)   pud_pfn(pud)
+
 #define kvm_pmd_mkhuge(pmd)pmd_mkhuge(pmd)
 
 static inline pte_t kvm_s2pte_mkwrite(pte_t pte)
@@ -266,6 +268,11 @@ static inline bool kvm_s2pud_exec(pud_t *pudp)
return !(READ_ONCE(pud_val(*pudp)) & PUD_S2_XN);
 }
 
+static inline pud_t kvm_s2pud_mkyoung(pud_t pud)
+{
+   return pud_mkyoung(pud);
+}
+
 #define hyp_pte_table_empty(ptep) kvm_page_empty(ptep)
 
 #ifdef __PAGETABLE_PMD_FOLDED
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 1bdeca8918a6..a64a5c35beb1 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -314,6 +314,11 @@ static inline pte_t pud_pte(pud_t pud)
return __pte(pud_val(pud));
 }
 
+static inline pud_t pte_pud(pte_t pte)
+{
+   return __pud(pte_val(pte));
+}
+
 static inline pmd_t pud_pmd(pud_t pud)
 {
return __pmd(pud_val(pud));
@@ -380,6 +385,7 @@ static inline int pmd_protnone(pmd_t pmd)
 #define pfn_pmd(pfn,prot)  __pmd(__phys_to_pmd_val((phys_addr_t)(pfn) << 
PAGE_SHIFT) | pgprot_val(prot))
 #define mk_pmd(page,prot)  pfn_pmd(page_to_pfn(page),prot)
 
+#define pud_mkyoung(pud)   pte_pud(pte_mkyoung(pud_pte(pud)))
 #define pud_write(pud) pte_write(pud_pte(pud))
 
 #define __pud_to_phys(pud) __pte_to_phys(pud_pte(pud))
diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index 5fd1eae7d964..e2487e5fff37 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -1706,6 +1706,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, 
phys_addr_t fault_ipa,
  */
 static void handle_access_fault(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa)
 {
+   pud_t *pud;
pmd_t *pmd;
pte_t *pte;
kvm_pfn_t pfn;
@@ -1715,24 +1716,23 @@ static void handle_access_fault(struct kvm_vcpu *vcpu, 
phys_addr_t fault_ipa)
 
spin_lock(&vcpu->kvm->mmu_lock);
 
-   pmd = stage2_get_pmd(vcpu->kvm, NULL, fault_ipa);
-   if (!pmd || pmd_none(*pmd)) /* Nothing there */
+   if (!stage2_get_leaf_entry(vcpu->kvm, fault_ipa, &pud, &pmd, &pte))
goto out;
 
-   if (pmd_thp_or_huge(*pmd)) {/* THP, HugeTLB */
+   if (pud) {  /* HugeTLB */
+   *pud = kvm_s2pud_mkyoung(*pud);
+   pfn = kvm_pud_pfn(*pud);
+   pfn_valid = true;
+   } else  if (pmd) {  /* THP, HugeTLB */
*pmd = pmd_mkyoung(*pmd);
pfn = pmd_pfn(*pmd);
pfn_valid = true;
-   goto out;
+   } else {
+   *pte = pte_mkyoung(*pte);   /* Just a page... */
+   pfn = pte_pfn(*pte);
+   pfn_valid = true;
}
 
-   pte = pte_offset_kernel(pmd, fault_ipa);
-   if (pte_none(*pte)) /* Nothing there either */
-   goto out;
-
-   *pte = pte_mkyoung(*pte);   /* Just a page... */
-   pfn = pte_pfn(*pte);
-   pfn_valid = true;
 out:
spin_unlock(&vcpu->kvm->mmu_lock);
if (pfn_valid)
-- 
2.18.0



[PATCH v7 2/9] KVM: arm/arm64: Share common code in user_mem_abort()

2018-09-24 Thread Punit Agrawal
The code for operations such as marking the pfn as dirty, and
dcache/icache maintenance during stage 2 fault handling is duplicated
between normal pages and PMD hugepages.

Instead of creating another copy of the operations when we introduce
PUD hugepages, let's share them across the different pagesizes.

Signed-off-by: Punit Agrawal 
Cc: Suzuki K Poulose 
Cc: Christoffer Dall 
Cc: Marc Zyngier 
---
 virt/kvm/arm/mmu.c | 45 +
 1 file changed, 29 insertions(+), 16 deletions(-)

diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index c23a1b323aad..5b76ee204000 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -1490,7 +1490,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, 
phys_addr_t fault_ipa,
kvm_pfn_t pfn;
pgprot_t mem_type = PAGE_S2;
bool logging_active = memslot_is_logging(memslot);
-   unsigned long flags = 0;
+   unsigned long vma_pagesize, flags = 0;
 
write_fault = kvm_is_write_fault(vcpu);
exec_fault = kvm_vcpu_trap_is_iabt(vcpu);
@@ -1510,10 +1510,17 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, 
phys_addr_t fault_ipa,
return -EFAULT;
}
 
-   if (vma_kernel_pagesize(vma) == PMD_SIZE && !logging_active) {
+   vma_pagesize = vma_kernel_pagesize(vma);
+   if (vma_pagesize == PMD_SIZE && !logging_active) {
hugetlb = true;
gfn = (fault_ipa & PMD_MASK) >> PAGE_SHIFT;
} else {
+   /*
+* Fallback to PTE if it's not one of the Stage 2
+* supported hugepage sizes
+*/
+   vma_pagesize = PAGE_SIZE;
+
/*
 * Pages belonging to memslots that don't have the same
 * alignment for userspace and IPA cannot be mapped using
@@ -1579,23 +1586,34 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, 
phys_addr_t fault_ipa,
if (mmu_notifier_retry(kvm, mmu_seq))
goto out_unlock;
 
-   if (!hugetlb && !force_pte)
+   if (!hugetlb && !force_pte) {
+   /*
+* Only PMD_SIZE transparent hugepages(THP) are
+* currently supported. This code will need to be
+* updated to support other THP sizes.
+*/
hugetlb = transparent_hugepage_adjust(&pfn, &fault_ipa);
+   if (hugetlb)
+   vma_pagesize = PMD_SIZE;
+   }
+
+   if (writable)
+   kvm_set_pfn_dirty(pfn);
 
-   if (hugetlb) {
+   if (fault_status != FSC_PERM)
+   clean_dcache_guest_page(pfn, vma_pagesize);
+
+   if (exec_fault)
+   invalidate_icache_guest_page(pfn, vma_pagesize);
+
+   if (hugetlb && vma_pagesize == PMD_SIZE) {
pmd_t new_pmd = pfn_pmd(pfn, mem_type);
new_pmd = pmd_mkhuge(new_pmd);
-   if (writable) {
+   if (writable)
new_pmd = kvm_s2pmd_mkwrite(new_pmd);
-   kvm_set_pfn_dirty(pfn);
-   }
-
-   if (fault_status != FSC_PERM)
-   clean_dcache_guest_page(pfn, PMD_SIZE);
 
if (exec_fault) {
new_pmd = kvm_s2pmd_mkexec(new_pmd);
-   invalidate_icache_guest_page(pfn, PMD_SIZE);
} else if (fault_status == FSC_PERM) {
/* Preserve execute if XN was already cleared */
if (stage2_is_exec(kvm, fault_ipa))
@@ -1608,16 +1626,11 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, 
phys_addr_t fault_ipa,
 
if (writable) {
new_pte = kvm_s2pte_mkwrite(new_pte);
-   kvm_set_pfn_dirty(pfn);
mark_page_dirty(kvm, gfn);
}
 
-   if (fault_status != FSC_PERM)
-   clean_dcache_guest_page(pfn, PAGE_SIZE);
-
if (exec_fault) {
new_pte = kvm_s2pte_mkexec(new_pte);
-   invalidate_icache_guest_page(pfn, PAGE_SIZE);
} else if (fault_status == FSC_PERM) {
/* Preserve execute if XN was already cleared */
if (stage2_is_exec(kvm, fault_ipa))
-- 
2.18.0



Re: [PATCH] KVM: arm/arm64: Clean dcache to PoC when changing PTE due to CoW

2018-09-04 Thread Punit Agrawal
Christoffer Dall  writes:

> On Mon, Sep 03, 2018 at 06:29:30PM +0100, Punit Agrawal wrote:
>> Christoffer Dall  writes:
>> 
>> > [Adding Andrea and Steve in CC]
>> >
>> > On Thu, Aug 23, 2018 at 04:33:42PM +0100, Marc Zyngier wrote:
>> >> When triggering a CoW, we unmap the RO page via an MMU notifier
>> >> (invalidate_range_start), and then populate the new PTE using another
>> >> one (change_pte). In the meantime, we'll have copied the old page
>> >> into the new one.
>> >> 
>> >> The problem is that the data for the new page is sitting in the
>> >> cache, and should the guest have an uncached mapping to that page
>> >> (or its MMU off), following accesses will bypass the cache.
>> >> 
>> >> In a way, this is similar to what happens on a translation fault:
>> >> We need to clean the page to the PoC before mapping it. So let's just
>> >> do that.
>> >> 
>> >> This fixes a KVM unit test regression observed on a HiSilicon platform,
>> >> and subsequently reproduced on Seattle.
>> >> 
>> >> Fixes: a9c0e12ebee5 ("KVM: arm/arm64: Only clean the dcache on 
>> >> translation fault")
>> >> Reported-by: Mike Galbraith 
>> >> Signed-off-by: Marc Zyngier 
>> >> ---
>> >>  virt/kvm/arm/mmu.c | 9 -
>> >>  1 file changed, 8 insertions(+), 1 deletion(-)
>> >> 
>> >> diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
>> >> index 1d90d79706bd..287c8e274655 100644
>> >> --- a/virt/kvm/arm/mmu.c
>> >> +++ b/virt/kvm/arm/mmu.c
>> >> @@ -1811,13 +1811,20 @@ static int kvm_set_spte_handler(struct kvm *kvm, 
>> >> gpa_t gpa, u64 size, void *data
>> >>  void kvm_set_spte_hva(struct kvm *kvm, unsigned long hva, pte_t pte)
>> >>  {
>> >>   unsigned long end = hva + PAGE_SIZE;
>> >> + kvm_pfn_t pfn = pte_pfn(pte);
>> >>   pte_t stage2_pte;
>> >>  
>> >>   if (!kvm->arch.pgd)
>> >>   return;
>> >>  
>> >>   trace_kvm_set_spte_hva(hva);
>> >> - stage2_pte = pfn_pte(pte_pfn(pte), PAGE_S2);
>> >> +
>> >> + /*
>> >> +  * We've moved a page around, probably through CoW, so let's treat
>> >> +  * just like a translation fault and clean the cache to the PoC.
>> >> +  */
>> >> + clean_dcache_guest_page(pfn, PAGE_SIZE);
>> >> + stage2_pte = pfn_pte(pfn, PAGE_S2);
>> >>   handle_hva_to_gpa(kvm, hva, end, &kvm_set_spte_handler, &stage2_pte);
>> >>  }
>> >
>> > How does this work for pmd mappings?
>> 
>> kvm_set_spte_hva() isn't called for PMD mappings. But...
>> 
>> >
>> > Are we guaranteed that a pmd mapping (hugetlbfs or THP) is split before
>> > a CoW happens?
>> >
>> > Steve tells me that we share THP mappings on fork and that we back THPs
>> > by a zero page, so CoW with THP should be possible.
>> >
>> 
>> ...the problem seems to affect handling write permission faults for CoW
>> or zero pages.
>> 
>> The memory gets unmapped at stage 2 due to the invalidate notifier (in
>> hugetlb_cow() for hugetlbfs and do_huge_pmd_wp_page() for THP) 
>
> So just to make sure I get this right.  For a pte CoW, Linux calls the
> set_spte function to simply change the pte mapping, without doing any
> unmapping at stage 2, 

No.

I hadn't checked into the PTE CoW for zero pages when replying but was
relying on Marc's commit log. I've outlined the flow below.

> but with pmd, we unmap and wait to take another fault as an
> alternative method?

Having looked at the CoW handling for the different page sizes,
here's my understanding of the steps for CoW faults - note the slight
variance when dealing with PTE entries.

* Guest takes a stage 2 permission fault (user_mem_abort())

* The host mapping is updated to point to another page (either zeroed or
  contents copied). This happens via the get_user_pages_unlocked()
  invoked via gfn_to_pfn_prot().

* For all page sizes, mmu_invalidate_range_start() notifiers are called
  which will unmap the memory at stage 2.

* For PTE (wp_page_copy), set_pte_at_notify() is called, which eventually
  calls the kvm_set_spte_hva() modified in $SUBJECT.

  For hugepages (hugetlb_cow) and anonymous THP (do_huge_pmd_wp_page)
  there are no notifying versions of the page table entry updaters, so
  stage 2 entries remain unmapped.

* mmu_notifier_invalidate_range_end() is called. This updates
  mmu_notifier_seq which will abort 

Re: [PATCH] KVM: arm/arm64: Clean dcache to PoC when changing PTE due to CoW

2018-09-03 Thread Punit Agrawal
Christoffer Dall  writes:

> [Adding Andrea and Steve in CC]
>
> On Thu, Aug 23, 2018 at 04:33:42PM +0100, Marc Zyngier wrote:
>> When triggering a CoW, we unmap the RO page via an MMU notifier
>> (invalidate_range_start), and then populate the new PTE using another
>> one (change_pte). In the meantime, we'll have copied the old page
>> into the new one.
>> 
>> The problem is that the data for the new page is sitting in the
>> cache, and should the guest have an uncached mapping to that page
>> (or its MMU off), following accesses will bypass the cache.
>> 
>> In a way, this is similar to what happens on a translation fault:
>> We need to clean the page to the PoC before mapping it. So let's just
>> do that.
>> 
>> This fixes a KVM unit test regression observed on a HiSilicon platform,
>> and subsequently reproduced on Seattle.
>> 
>> Fixes: a9c0e12ebee5 ("KVM: arm/arm64: Only clean the dcache on translation 
>> fault")
>> Reported-by: Mike Galbraith 
>> Signed-off-by: Marc Zyngier 
>> ---
>>  virt/kvm/arm/mmu.c | 9 -
>>  1 file changed, 8 insertions(+), 1 deletion(-)
>> 
>> diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
>> index 1d90d79706bd..287c8e274655 100644
>> --- a/virt/kvm/arm/mmu.c
>> +++ b/virt/kvm/arm/mmu.c
>> @@ -1811,13 +1811,20 @@ static int kvm_set_spte_handler(struct kvm *kvm, 
>> gpa_t gpa, u64 size, void *data
>>  void kvm_set_spte_hva(struct kvm *kvm, unsigned long hva, pte_t pte)
>>  {
>>  unsigned long end = hva + PAGE_SIZE;
>> +kvm_pfn_t pfn = pte_pfn(pte);
>>  pte_t stage2_pte;
>>  
>>  if (!kvm->arch.pgd)
>>  return;
>>  
>>  trace_kvm_set_spte_hva(hva);
>> -stage2_pte = pfn_pte(pte_pfn(pte), PAGE_S2);
>> +
>> +/*
>> + * We've moved a page around, probably through CoW, so let's treat
>> + * just like a translation fault and clean the cache to the PoC.
>> + */
>> +clean_dcache_guest_page(pfn, PAGE_SIZE);
>> +stage2_pte = pfn_pte(pfn, PAGE_S2);
>>  handle_hva_to_gpa(kvm, hva, end, &kvm_set_spte_handler, &stage2_pte);
>>  }
>
> How does this work for pmd mappings?

kvm_set_spte_hva() isn't called for PMD mappings. But...

>
> Are we guaranteed that a pmd mapping (hugetlbfs or THP) is split before
> a CoW happens?
>
> Steve tells me that we share THP mappings on fork and that we back THPs
> by a zero page, so CoW with THP should be possible.
>

...the problem seems to affect handling write permission faults for CoW
or zero pages.

The memory gets unmapped at stage 2 due to the invalidate notifier (in
hugetlb_cow() for hugetlbfs and do_huge_pmd_wp_page() for THP) while the
cache maintenance for the newly allocated page is skipped because the
fault being handled is a permission fault.
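
For reference, the dcache maintenance in user_mem_abort() is gated on
the fault type, roughly like this (simplified sketch; the PMD and PTE
paths each carry an equivalent check, with the mapping size as the
second argument):

	if (fault_status != FSC_PERM)
		clean_dcache_guest_page(pfn, PAGE_SIZE);	/* skipped for permission faults */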

Hmm... smells like there might be a problem here. I'll try and put
together a fix.

> Thanks,
> Christoffer
> ___
> kvmarm mailing list
> kvmarm@lists.cs.columbia.edu
> https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[PATCH v3 2/2] KVM: arm/arm64: Skip updating PTE entry if no change

2018-08-13 Thread Punit Agrawal
When there is contention on faulting in a particular page table entry
at stage 2, the break-before-make requirement of the architecture can
lead to additional refaulting due to TLB invalidation.

Avoid this by skipping a page table update if the new value of the PTE
matches the previous value.

Fixes: d5d8184d35c9 ("KVM: ARM: Memory virtualization setup")
Signed-off-by: Punit Agrawal 
Reviewed-by: Suzuki Poulose 
Cc: Marc Zyngier 
Cc: Christoffer Dall 
Cc: sta...@vger.kernel.org
---
 virt/kvm/arm/mmu.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index 2bb0b5dba412..c2b95a22959b 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -1118,6 +1118,10 @@ static int stage2_set_pte(struct kvm *kvm, struct 
kvm_mmu_memory_cache *cache,
/* Create 2nd stage page table mapping - Level 3 */
old_pte = *pte;
if (pte_present(old_pte)) {
+   /* Skip page table update if there is no change */
+   if (pte_val(old_pte) == pte_val(*new_pte))
+   return 0;
+
kvm_set_pte(pte, __pte(0));
kvm_tlb_flush_vmid_ipa(kvm, addr);
} else {
-- 
2.18.0



[PATCH v3 1/2] KVM: arm/arm64: Skip updating PMD entry if no change

2018-08-13 Thread Punit Agrawal
Contention on updating a PMD entry by a large number of vcpus can lead
to duplicate work when handling stage 2 page faults. As the page table
update follows the break-before-make requirement of the architecture,
it can lead to repeated refaults due to clearing the entry and
flushing the tlbs.

This problem is more likely when -

* there are large number of vcpus
* the mapping is large block mapping

such as when using PMD hugepages (512MB) with 64k pages.

Fix this by skipping the page table update if there is no change in
the entry being updated.

Fixes: ad361f093c1e ("KVM: ARM: Support hugetlbfs backed huge pages")
Signed-off-by: Punit Agrawal 
Reviewed-by: Suzuki Poulose 
Cc: Marc Zyngier 
Cc: Christoffer Dall 
Cc: sta...@vger.kernel.org
---
 virt/kvm/arm/mmu.c | 38 +++---
 1 file changed, 27 insertions(+), 11 deletions(-)

diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index 1d90d79706bd..2bb0b5dba412 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -1015,19 +1015,35 @@ static int stage2_set_pmd_huge(struct kvm *kvm, struct 
kvm_mmu_memory_cache
pmd = stage2_get_pmd(kvm, cache, addr);
VM_BUG_ON(!pmd);
 
-   /*
-* Mapping in huge pages should only happen through a fault.  If a
-* page is merged into a transparent huge page, the individual
-* subpages of that huge page should be unmapped through MMU
-* notifiers before we get here.
-*
-* Merging of CompoundPages is not supported; they should become
-* splitting first, unmapped, merged, and mapped back in on-demand.
-*/
-   VM_BUG_ON(pmd_present(*pmd) && pmd_pfn(*pmd) != pmd_pfn(*new_pmd));
-
old_pmd = *pmd;
if (pmd_present(old_pmd)) {
+   /*
+* Multiple vcpus faulting on the same PMD entry, can
+* lead to them sequentially updating the PMD with the
+* same value. Following the break-before-make
+* (pmd_clear() followed by tlb_flush()) process can
+* hinder forward progress due to refaults generated
+* on missing translations.
+*
+* Skip updating the page table if the entry is
+* unchanged.
+*/
+   if (pmd_val(old_pmd) == pmd_val(*new_pmd))
+   return 0;
+
+   /*
+* Mapping in huge pages should only happen through a
+* fault.  If a page is merged into a transparent huge
+* page, the individual subpages of that huge page
+* should be unmapped through MMU notifiers before we
+* get here.
+*
+* Merging of CompoundPages is not supported; they
+* should become splitting first, unmapped, merged,
+* and mapped back in on-demand.
+*/
+   VM_BUG_ON(pmd_pfn(old_pmd) != pmd_pfn(*new_pmd));
+
pmd_clear(pmd);
kvm_tlb_flush_vmid_ipa(kvm, addr);
} else {
-- 
2.18.0



[PATCH v3 0/2] KVM: Fix refaulting due to page table update

2018-08-13 Thread Punit Agrawal
Contention when updating a page table entry can lead to unnecessary
refaults. The issue was reported by a user when testing PUD hugepage
support [1], but also exists for PMD and PTE updates, though with a
lower probability.

This version -

* fixes a nit reported by Suzuki
* Re-orders the checks when setting PMD hugepage
* drops mistakenly introduced Change-id in the commit message

Thanks,
Punit

[1] https://lkml.org/lkml/2018/7/16/482

Punit Agrawal (2):
  KVM: arm/arm64: Skip updating PMD entry if no change
  KVM: arm/arm64: Skip updating PTE entry if no change

 virt/kvm/arm/mmu.c | 43 ---
 1 file changed, 32 insertions(+), 11 deletions(-)

-- 
2.18.0



Re: [PATCH v2 1/2] KVM: arm/arm64: Skip updating PMD entry if no change

2018-08-13 Thread Punit Agrawal
Marc Zyngier  writes:

> Hi Punit,
>
> On 13/08/18 10:40, Punit Agrawal wrote:

[...]

>> diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
>> index 1d90d79706bd..2ab977edc63c 100644
>> --- a/virt/kvm/arm/mmu.c
>> +++ b/virt/kvm/arm/mmu.c
>> @@ -1015,19 +1015,36 @@ static int stage2_set_pmd_huge(struct kvm *kvm, 
>> struct kvm_mmu_memory_cache
>>  pmd = stage2_get_pmd(kvm, cache, addr);
>>  VM_BUG_ON(!pmd);
>>  
>> -/*
>> - * Mapping in huge pages should only happen through a fault.  If a
>> - * page is merged into a transparent huge page, the individual
>> - * subpages of that huge page should be unmapped through MMU
>> - * notifiers before we get here.
>> - *
>> - * Merging of CompoundPages is not supported; they should become
>> - * splitting first, unmapped, merged, and mapped back in on-demand.
>> - */
>> -VM_BUG_ON(pmd_present(*pmd) && pmd_pfn(*pmd) != pmd_pfn(*new_pmd));
>> -
>>  old_pmd = *pmd;
>> +
>>  if (pmd_present(old_pmd)) {
>> +/*
>> + * Mapping in huge pages should only happen through a
>> + * fault.  If a page is merged into a transparent huge
>> + * page, the individual subpages of that huge page
>> + * should be unmapped through MMU notifiers before we
>> + * get here.
>> + *
>> + * Merging of CompoundPages is not supported; they
>> + * should become splitting first, unmapped, merged,
>> + * and mapped back in on-demand.
>> + */
>> +VM_BUG_ON(pmd_pfn(old_pmd) != pmd_pfn(*new_pmd));
>> +
>> +/*
>> + * Multiple vcpus faulting on the same PMD entry, can
>> + * lead to them sequentially updating the PMD with the
>> + * same value. Following the break-before-make
>> + * (pmd_clear() followed by tlb_flush()) process can
>> + * hinder forward progress due to refaults generated
>> + * on missing translations.
>> + *
>> + * Skip updating the page table if the entry is
>> + * unchanged.
>> + */
>> +if (pmd_val(old_pmd) == pmd_val(*new_pmd))
>> +goto out;
>
> I think the order of these two checks should be reversed: the first one
> is clearly a subset of the second one, so it'd make sense to have the
> global comparison before having the more specific one. Not that it
> matter much in practice, but I just find it easier to reason about.

Makes sense. I've reordered the checks for the next version.

Thanks,
Punit

>
>> +
>>  pmd_clear(pmd);
>>  kvm_tlb_flush_vmid_ipa(kvm, addr);
>>  } else {
>> @@ -1035,6 +1052,7 @@ static int stage2_set_pmd_huge(struct kvm *kvm, struct 
>> kvm_mmu_memory_cache
>>  }
>>  
>>  kvm_set_pmd(pmd, *new_pmd);
>> +out:
>>  return 0;
>>  }
>>  
>> 
>
> Thanks,
>
>   M.


Re: [PATCH v2 1/2] KVM: arm/arm64: Skip updating PMD entry if no change

2018-08-13 Thread Punit Agrawal
Suzuki K Poulose  writes:

> On 08/13/2018 10:40 AM, Punit Agrawal wrote:
>> Contention on updating a PMD entry by a large number of vcpus can lead
>> to duplicate work when handling stage 2 page faults. As the page table
>> update follows the break-before-make requirement of the architecture,
>> it can lead to repeated refaults due to clearing the entry and
>> flushing the tlbs.
>>
>> This problem is more likely when -
>>
>> * there are large number of vcpus
>> * the mapping is large block mapping
>>
>> such as when using PMD hugepages (512MB) with 64k pages.
>>
>> Fix this by skipping the page table update if there is no change in
>> the entry being updated.
>>
>> Fixes: ad361f093c1e ("KVM: ARM: Support hugetlbfs backed huge pages")
>> Change-Id: Ib417957c842ef67a6f4b786f68df62048d202c24
>> Signed-off-by: Punit Agrawal 
>> Cc: Marc Zyngier 
>> Cc: Christoffer Dall 
>> Cc: Suzuki Poulose 
>> Cc: sta...@vger.kernel.org
>> ---
>>   virt/kvm/arm/mmu.c | 40 +---
>>   1 file changed, 29 insertions(+), 11 deletions(-)
>>
>> diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
>> index 1d90d79706bd..2ab977edc63c 100644
>> --- a/virt/kvm/arm/mmu.c
>> +++ b/virt/kvm/arm/mmu.c
>> @@ -1015,19 +1015,36 @@ static int stage2_set_pmd_huge(struct kvm *kvm, 
>> struct kvm_mmu_memory_cache
>>  pmd = stage2_get_pmd(kvm, cache, addr);
>>  VM_BUG_ON(!pmd);
>>   -  /*
>> - * Mapping in huge pages should only happen through a fault.  If a
>> - * page is merged into a transparent huge page, the individual
>> - * subpages of that huge page should be unmapped through MMU
>> - * notifiers before we get here.
>> - *
>> - * Merging of CompoundPages is not supported; they should become
>> - * splitting first, unmapped, merged, and mapped back in on-demand.
>> - */
>> -VM_BUG_ON(pmd_present(*pmd) && pmd_pfn(*pmd) != pmd_pfn(*new_pmd));
>> -
>>  old_pmd = *pmd;
>> +
>>  if (pmd_present(old_pmd)) {
>> +/*
>> + * Mapping in huge pages should only happen through a
>> + * fault.  If a page is merged into a transparent huge
>> + * page, the individual subpages of that huge page
>> + * should be unmapped through MMU notifiers before we
>> + * get here.
>> + *
>> + * Merging of CompoundPages is not supported; they
>> + * should become splitting first, unmapped, merged,
>> + * and mapped back in on-demand.
>> + */
>> +VM_BUG_ON(pmd_pfn(old_pmd) != pmd_pfn(*new_pmd));
>> +
>> +/*
>> + * Multiple vcpus faulting on the same PMD entry, can
>> + * lead to them sequentially updating the PMD with the
>> + * same value. Following the break-before-make
>> + * (pmd_clear() followed by tlb_flush()) process can
>> + * hinder forward progress due to refaults generated
>> + * on missing translations.
>> + *
>> + * Skip updating the page table if the entry is
>> + * unchanged.
>> + */
>> +if (pmd_val(old_pmd) == pmd_val(*new_pmd))
>> +goto out;
>
> minor nit: You could as well return here, as there are no other users
> for the label and there are no clean up actions.

Ok - I'll do a quick respin for the maintainers to pick up if they are
happy with the other aspects of the patch.

>
> Either way,
>
> Reviewed-by: Suzuki K Poulose 

Thanks Suzuki.

>
>
>> +
>>  pmd_clear(pmd);
>>  kvm_tlb_flush_vmid_ipa(kvm, addr);
>>  } else {
>> @@ -1035,6 +1052,7 @@ static int stage2_set_pmd_huge(struct kvm *kvm, struct 
>> kvm_mmu_memory_cache
>>  }
>>  kvm_set_pmd(pmd, *new_pmd);
>> +out:
>>  return 0;
>>   }
>>   
>>
>
> ___
> kvmarm mailing list
> kvmarm@lists.cs.columbia.edu
> https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


[PATCH v2 1/2] KVM: arm/arm64: Skip updating PMD entry if no change

2018-08-13 Thread Punit Agrawal
Contention on updating a PMD entry by a large number of vcpus can lead
to duplicate work when handling stage 2 page faults. As the page table
update follows the break-before-make requirement of the architecture,
it can lead to repeated refaults due to clearing the entry and
flushing the tlbs.

This problem is more likely when -

* there are a large number of vcpus
* the mapping is a large block mapping

such as when using PMD hugepages (512MB) with 64k pages.

Fix this by skipping the page table update if there is no change in
the entry being updated.

Fixes: ad361f093c1e ("KVM: ARM: Support hugetlbfs backed huge pages")
Change-Id: Ib417957c842ef67a6f4b786f68df62048d202c24
Signed-off-by: Punit Agrawal 
Cc: Marc Zyngier 
Cc: Christoffer Dall 
Cc: Suzuki Poulose 
Cc: sta...@vger.kernel.org
---
 virt/kvm/arm/mmu.c | 40 +---
 1 file changed, 29 insertions(+), 11 deletions(-)

diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index 1d90d79706bd..2ab977edc63c 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -1015,19 +1015,36 @@ static int stage2_set_pmd_huge(struct kvm *kvm, struct 
kvm_mmu_memory_cache
pmd = stage2_get_pmd(kvm, cache, addr);
VM_BUG_ON(!pmd);
 
-   /*
-* Mapping in huge pages should only happen through a fault.  If a
-* page is merged into a transparent huge page, the individual
-* subpages of that huge page should be unmapped through MMU
-* notifiers before we get here.
-*
-* Merging of CompoundPages is not supported; they should become
-* splitting first, unmapped, merged, and mapped back in on-demand.
-*/
-   VM_BUG_ON(pmd_present(*pmd) && pmd_pfn(*pmd) != pmd_pfn(*new_pmd));
-
old_pmd = *pmd;
+
if (pmd_present(old_pmd)) {
+   /*
+* Mapping in huge pages should only happen through a
+* fault.  If a page is merged into a transparent huge
+* page, the individual subpages of that huge page
+* should be unmapped through MMU notifiers before we
+* get here.
+*
+* Merging of CompoundPages is not supported; they
+* should become splitting first, unmapped, merged,
+* and mapped back in on-demand.
+*/
+   VM_BUG_ON(pmd_pfn(old_pmd) != pmd_pfn(*new_pmd));
+
+   /*
+* Multiple vcpus faulting on the same PMD entry, can
+* lead to them sequentially updating the PMD with the
+* same value. Following the break-before-make
+* (pmd_clear() followed by tlb_flush()) process can
+* hinder forward progress due to refaults generated
+* on missing translations.
+*
+* Skip updating the page table if the entry is
+* unchanged.
+*/
+   if (pmd_val(old_pmd) == pmd_val(*new_pmd))
+   goto out;
+
pmd_clear(pmd);
kvm_tlb_flush_vmid_ipa(kvm, addr);
} else {
@@ -1035,6 +1052,7 @@ static int stage2_set_pmd_huge(struct kvm *kvm, struct 
kvm_mmu_memory_cache
}
 
kvm_set_pmd(pmd, *new_pmd);
+out:
return 0;
 }
 
-- 
2.18.0



[PATCH v2 2/2] KVM: arm/arm64: Skip updating PTE entry if no change

2018-08-13 Thread Punit Agrawal
When there is contention on faulting in a particular page table entry
at stage 2, the break-before-make requirement of the architecture can
lead to additional refaulting due to TLB invalidation.

Avoid this by skipping a page table update if the new value of the PTE
matches the previous value.

Fixes: d5d8184d35c9 ("KVM: ARM: Memory virtualization setup")
Change-Id: I28e17daf394a4821b13c2cf8726bf72bf30434f9
Signed-off-by: Punit Agrawal 
Cc: Marc Zyngier 
Cc: Christoffer Dall 
Cc: Suzuki Poulose 
Cc: sta...@vger.kernel.org
---
 virt/kvm/arm/mmu.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index 2ab977edc63c..d0a9dccc3793 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -1120,6 +1120,10 @@ static int stage2_set_pte(struct kvm *kvm, struct 
kvm_mmu_memory_cache *cache,
/* Create 2nd stage page table mapping - Level 3 */
old_pte = *pte;
if (pte_present(old_pte)) {
+   /* Skip page table update if there is no change */
+   if (pte_val(old_pte) == pte_val(*new_pte))
+   goto out;
+
kvm_set_pte(pte, __pte(0));
kvm_tlb_flush_vmid_ipa(kvm, addr);
} else {
@@ -1127,6 +1131,7 @@ static int stage2_set_pte(struct kvm *kvm, struct 
kvm_mmu_memory_cache *cache,
}
 
kvm_set_pte(pte, *new_pte);
+out:
return 0;
 }
 
-- 
2.18.0



[PATCH v2 0/2] KVM: Fix refaulting due to page table update

2018-08-13 Thread Punit Agrawal
Hi,

Here's a couple of patches to fix an issue when multiple vcpus fault
on a page table entry[0]. The issue was reported by a user when testing
PUD hugepage support[1] but also exists for PMD and PTE updates though
with a lower probability.

In this version -

* the fix has been split for PMD hugepage and PTE update
* refactored the PMD fix
* applied Fixes tags and cc'd stable

Thanks,
Punit

[0] https://lkml.org/lkml/2018/8/10/256
[1] https://lkml.org/lkml/2018/7/16/482

Punit Agrawal (2):
  KVM: arm/arm64: Skip updating PMD entry if no change
  KVM: arm/arm64: Skip updating PTE entry if no change

 virt/kvm/arm/mmu.c | 45 ++---
 1 file changed, 34 insertions(+), 11 deletions(-)

-- 
2.18.0



Re: [PATCH] KVM: arm/arm64: Skip updating page table entry if no change

2018-08-10 Thread Punit Agrawal
Marc Zyngier  writes:

> Hi Punit,
>
> On Fri, 10 Aug 2018 12:13:00 +0100,
> Punit Agrawal  wrote:
>> 
>> Contention on updating a page table entry by a large number of vcpus
>> can lead to duplicate work when handling stage 2 page faults. As the
>> page table update follows the break-before-make requirement of the
>> architecture, it can lead to repeated refaults due to clearing the
>> entry and flushing the tlbs.
>> 
>> This problem is more likely when -
>> 
>> * there are a large number of vcpus
>> * the mapping is a large block mapping
>>  
>> such as when using PMD hugepages (512MB) with 64k pages.
>> 
>> Fix this by skipping the page table update if there is no change in
>> the entry being updated.
>> 
>> Signed-off-by: Punit Agrawal 
>> Cc: Marc Zyngier 
>> Cc: Christoffer Dall 
>> Cc: Suzuki Poulose 
>
> This definitely deserves a Cc to stable, and a Fixes: tag.

Agreed.

For PMD the issue exists since commit ad361f093c1e ("KVM: ARM: Support
hugetlbfs backed huge pages") when the file lived in arch/arm/kvm. (v3.12+)

For PTE the issue exists since commit d5d8184d35c9 ("KVM: ARM: Memory
virtualization setup"). (v3.8+)

I'll split the fix into two patches and add a cc to stable.

>> --
>> Hi,
>> 
>> This problem was reported by a user when testing PUD hugepages. During
>> VM restore when all threads are running a cpu-intensive workload, the
>> refaulting was causing the VM to not make any forward progress.
>> 
>> This patch fixes the problem for PMD and PTE page fault handling.
>> 
>> Thanks,
>> Punit
>> 
>> Change-Id: I04c9aa8b9fbada47deb1a171c9959f400a0d2a21

Just noticed this. Looks like the commit hook has escaped its worktree.

>> ---
>>  virt/kvm/arm/mmu.c | 16 
>>  1 file changed, 16 insertions(+)
>> 
>> diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
>> index 1d90d79706bd..a66a5441ca2f 100644
>> --- a/virt/kvm/arm/mmu.c
>> +++ b/virt/kvm/arm/mmu.c
>> @@ -1027,6 +1027,18 @@ static int stage2_set_pmd_huge(struct kvm *kvm, 
>> struct kvm_mmu_memory_cache
>>  VM_BUG_ON(pmd_present(*pmd) && pmd_pfn(*pmd) != pmd_pfn(*new_pmd));
>>  
>>  old_pmd = *pmd;
>> +/*
>> + * Multiple vcpus faulting on the same PMD entry, can lead to
>> + * them sequentially updating the PMD with the same
>> + * value. Following the break-before-make (pmd_clear()
>> + * followed by tlb_flush()) process can hinder forward
>> + * progress due to refaults generated on missing translations.
>> + *
>> + * Skip updating the page table if the entry is unchanged.
>> + */
>> +if (pmd_val(old_pmd) == pmd_val(*new_pmd))
>> +return 0;
>
> Shouldn't you take this opportunity to also refactor it with the above
> VM_BUG_ON and the below pmd_present? At the moment, we end-up testing
> pmd_present twice, and your patch is awfully similar to the VM_BUG_ON
> one.

I went for the minimal change keeping a backport in mind.

The VM_BUG_ON() is enabled only when CONFIG_DEBUG_VM is selected and
checks only the pfn, ignoring the attributes, while this fix checks the
entire entry. Maybe I'm missing something, but I can't see an obvious
way to combine the checks.
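
To spell out the difference, with both lines lifted from the patch being
discussed (illustration only):

	/* debug-only sanity check: compares just the output pfn */
	VM_BUG_ON(pmd_present(*pmd) && pmd_pfn(*pmd) != pmd_pfn(*new_pmd));

	/* the fix: compares the whole descriptor, attributes included */
	if (pmd_val(old_pmd) == pmd_val(*new_pmd))
		return 0;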

I've eliminated the duplicate pmd_present() check and will post the
updated patches.

Thanks,
Punit

>> +
>>  if (pmd_present(old_pmd)) {
>>  pmd_clear(pmd);
>>  kvm_tlb_flush_vmid_ipa(kvm, addr);
>> @@ -1101,6 +1113,10 @@ static int stage2_set_pte(struct kvm *kvm, struct 
>> kvm_mmu_memory_cache *cache,
>>  
>>  /* Create 2nd stage page table mapping - Level 3 */
>>  old_pte = *pte;
>> +/* Skip page table update if there is no change */
>> +if (pte_val(old_pte) == pte_val(*new_pte))
>> +return 0;
>> +
>>  if (pte_present(old_pte)) {
>>  kvm_set_pte(pte, __pte(0));
>>  kvm_tlb_flush_vmid_ipa(kvm, addr);
>
> Thanks,
>
>   M.


[PATCH] KVM: arm/arm64: Skip updating page table entry if no change

2018-08-10 Thread Punit Agrawal
Contention on updating a page table entry by a large number of vcpus
can lead to duplicate work when handling stage 2 page faults. As the
page table update follows the break-before-make requirement of the
architecture, it can lead to repeated refaults due to clearing the
entry and flushing the tlbs.

This problem is more likely when -

* there are a large number of vcpus
* the mapping is a large block mapping

such as when using PMD hugepages (512MB) with 64k pages.

Fix this by skipping the page table update if there is no change in
the entry being updated.

Signed-off-by: Punit Agrawal 
Cc: Marc Zyngier 
Cc: Christoffer Dall 
Cc: Suzuki Poulose 
--
Hi,

This problem was reported by a user when testing PUD hugepages. During
VM restore when all threads are running a cpu-intensive workload, the
refaulting was causing the VM to not make any forward progress.

This patch fixes the problem for PMD and PTE page fault handling.
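
For anyone not familiar with this code, the window that generates the
refaults is in stage2_set_pmd_huge() (simplified sketch of the existing
code, see the diff below for the real thing):

	old_pmd = *pmd;
	if (pmd_present(old_pmd)) {
		/* break-before-make: the translation vanishes here... */
		pmd_clear(pmd);
		kvm_tlb_flush_vmid_ipa(kvm, addr);
	}
	/* ...and vcpus touching the block refault until this runs */
	kvm_set_pmd(pmd, *new_pmd);

Skipping the sequence when the old and new values match avoids tearing
down a translation only to reinstall the identical entry.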

Thanks,
Punit

Change-Id: I04c9aa8b9fbada47deb1a171c9959f400a0d2a21
---
 virt/kvm/arm/mmu.c | 16 
 1 file changed, 16 insertions(+)

diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index 1d90d79706bd..a66a5441ca2f 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -1027,6 +1027,18 @@ static int stage2_set_pmd_huge(struct kvm *kvm, struct 
kvm_mmu_memory_cache
VM_BUG_ON(pmd_present(*pmd) && pmd_pfn(*pmd) != pmd_pfn(*new_pmd));
 
old_pmd = *pmd;
+   /*
+* Multiple vcpus faulting on the same PMD entry, can lead to
+* them sequentially updating the PMD with the same
+* value. Following the break-before-make (pmd_clear()
+* followed by tlb_flush()) process can hinder forward
+* progress due to refaults generated on missing translations.
+*
+* Skip updating the page table if the entry is unchanged.
+*/
+   if (pmd_val(old_pmd) == pmd_val(*new_pmd))
+   return 0;
+
if (pmd_present(old_pmd)) {
pmd_clear(pmd);
kvm_tlb_flush_vmid_ipa(kvm, addr);
@@ -1101,6 +1113,10 @@ static int stage2_set_pte(struct kvm *kvm, struct 
kvm_mmu_memory_cache *cache,
 
/* Create 2nd stage page table mapping - Level 3 */
old_pte = *pte;
+   /* Skip page table update if there is no change */
+   if (pte_val(old_pte) == pte_val(*new_pte))
+   return 0;
+
if (pte_present(old_pte)) {
kvm_set_pte(pte, __pte(0));
kvm_tlb_flush_vmid_ipa(kvm, addr);
-- 
2.18.0



Re: [PATCH v6 1/8] KVM: arm/arm64: Share common code in user_mem_abort()

2018-07-16 Thread Punit Agrawal
Hi Suzuki,

Suzuki K Poulose  writes:

> On 16/07/18 12:08, Punit Agrawal wrote:
>> The code for operations such as marking the pfn as dirty, and
>> dcache/icache maintenance during stage 2 fault handling is duplicated
>> between normal pages and PMD hugepages.
>>
>> Instead of creating another copy of the operations when we introduce
>> PUD hugepages, let's share them across the different pagesizes.
>>
>> Signed-off-by: Punit Agrawal 
>> Cc: Christoffer Dall 
>> Cc: Marc Zyngier 
>
> Thanks for splitting the patch. It looks much simpler this way.
>
> Reviewed-by: Suzuki K Poulose 

Thanks for reviewing the patches.

Punit



[PATCH v6 7/8] KVM: arm64: Update age handlers to support PUD hugepages

2018-07-16 Thread Punit Agrawal
In preparation for creating larger hugepages at Stage 2, add support
to the age handling notifiers for PUD hugepages when encountered.

Provide trivial helpers for arm32 to allow sharing code.

Signed-off-by: Punit Agrawal 
Reviewed-by: Suzuki K Poulose 
Cc: Christoffer Dall 
Cc: Marc Zyngier 
Cc: Russell King 
Cc: Catalin Marinas 
Cc: Will Deacon 
---
 arch/arm/include/asm/kvm_mmu.h   |  6 +
 arch/arm64/include/asm/kvm_mmu.h |  5 
 arch/arm64/include/asm/pgtable.h |  1 +
 virt/kvm/arm/mmu.c   | 39 
 4 files changed, 32 insertions(+), 19 deletions(-)

diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
index 13c0ee73756e..8225ec15cae7 100644
--- a/arch/arm/include/asm/kvm_mmu.h
+++ b/arch/arm/include/asm/kvm_mmu.h
@@ -111,6 +111,12 @@ static inline pud_t kvm_s2pud_mkyoung(pud_t pud)
return pud;
 }
 
+static inline bool kvm_s2pud_young(pud_t pud)
+{
+   BUG();
+   return false;
+}
+
 static inline void kvm_set_pmd(pmd_t *pmd, pmd_t new_pmd)
 {
*pmd = new_pmd;
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index 4d2780c588b0..c542052fb199 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -261,6 +261,11 @@ static inline pud_t kvm_s2pud_mkyoung(pud_t pud)
return pud_mkyoung(pud);
 }
 
+static inline bool kvm_s2pud_young(pud_t pud)
+{
+   return pud_young(pud);
+}
+
 static inline bool kvm_page_empty(void *ptr)
 {
struct page *ptr_page = virt_to_page(ptr);
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index a64a5c35beb1..4d9476e420d9 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -385,6 +385,7 @@ static inline int pmd_protnone(pmd_t pmd)
 #define pfn_pmd(pfn,prot)  __pmd(__phys_to_pmd_val((phys_addr_t)(pfn) << 
PAGE_SHIFT) | pgprot_val(prot))
 #define mk_pmd(page,prot)  pfn_pmd(page_to_pfn(page),prot)
 
+#define pud_young(pud) pte_young(pud_pte(pud))
 #define pud_mkyoung(pud)   pte_pud(pte_mkyoung(pud_pte(pud)))
 #define pud_write(pud) pte_write(pud_pte(pud))
 
diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index 6dc40b710d0d..c00155fe05c3 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -1176,6 +1176,11 @@ static int stage2_pmdp_test_and_clear_young(pmd_t *pmd)
return stage2_ptep_test_and_clear_young((pte_t *)pmd);
 }
 
+static int stage2_pudp_test_and_clear_young(pud_t *pud)
+{
+   return stage2_ptep_test_and_clear_young((pte_t *)pud);
+}
+
 /**
  * kvm_phys_addr_ioremap - map a device range to guest IPA
  *
@@ -1880,42 +1885,38 @@ void kvm_set_spte_hva(struct kvm *kvm, unsigned long 
hva, pte_t pte)
 
 static int kvm_age_hva_handler(struct kvm *kvm, gpa_t gpa, u64 size, void 
*data)
 {
+   pud_t *pud;
pmd_t *pmd;
pte_t *pte;
 
-   WARN_ON(size != PAGE_SIZE && size != PMD_SIZE);
-   pmd = stage2_get_pmd(kvm, NULL, gpa);
-   if (!pmd || pmd_none(*pmd)) /* Nothing there */
+   WARN_ON(size != PAGE_SIZE && size != PMD_SIZE && size != PUD_SIZE);
+   if (!stage2_get_leaf_entry(kvm, gpa, &pud, &pmd, &pte))
return 0;
 
-   if (pmd_thp_or_huge(*pmd))  /* THP, HugeTLB */
+   if (pud)
+   return stage2_pudp_test_and_clear_young(pud);
+   else if (pmd)
return stage2_pmdp_test_and_clear_young(pmd);
-
-   pte = pte_offset_kernel(pmd, gpa);
-   if (pte_none(*pte))
-   return 0;
-
-   return stage2_ptep_test_and_clear_young(pte);
+   else
+   return stage2_ptep_test_and_clear_young(pte);
 }
 
 static int kvm_test_age_hva_handler(struct kvm *kvm, gpa_t gpa, u64 size, void 
*data)
 {
+   pud_t *pud;
pmd_t *pmd;
pte_t *pte;
 
-   WARN_ON(size != PAGE_SIZE && size != PMD_SIZE);
-   pmd = stage2_get_pmd(kvm, NULL, gpa);
-   if (!pmd || pmd_none(*pmd)) /* Nothing there */
+   WARN_ON(size != PAGE_SIZE && size != PMD_SIZE && size != PUD_SIZE);
+   if (!stage2_get_leaf_entry(kvm, gpa, &pud, &pmd, &pte))
return 0;
 
-   if (pmd_thp_or_huge(*pmd))  /* THP, HugeTLB */
+   if (pud)
+   return kvm_s2pud_young(*pud);
+   else if (pmd)
return pmd_young(*pmd);
-
-   pte = pte_offset_kernel(pmd, gpa);
-   if (!pte_none(*pte))/* Just a page... */
+   else
return pte_young(*pte);
-
-   return 0;
 }
 
 int kvm_age_hva(struct kvm *kvm, unsigned long start, unsigned long end)
-- 
2.17.1



[PATCH v6 5/8] KVM: arm64: Support PUD hugepage in stage2_is_exec()

2018-07-16 Thread Punit Agrawal
In preparation for creating PUD hugepages at stage 2, add support for
detecting execute permissions on PUD page table entries. Faults due to
lack of execute permissions on page table entries is used to perform
i-cache invalidation on first execute.

Provide trivial implementations of arm32 helpers to allow sharing of
code.

Signed-off-by: Punit Agrawal 
Reviewed-by: Suzuki K Poulose 
Cc: Christoffer Dall 
Cc: Marc Zyngier 
Cc: Russell King 
Cc: Catalin Marinas 
Cc: Will Deacon 
---
 arch/arm/include/asm/kvm_mmu.h |  6 +++
 arch/arm64/include/asm/kvm_mmu.h   |  5 +++
 arch/arm64/include/asm/pgtable-hwdef.h |  2 +
 virt/kvm/arm/mmu.c | 53 +++---
 4 files changed, 61 insertions(+), 5 deletions(-)

diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
index c3ac7a76fb69..ec0c58e139da 100644
--- a/arch/arm/include/asm/kvm_mmu.h
+++ b/arch/arm/include/asm/kvm_mmu.h
@@ -96,6 +96,12 @@ static inline bool kvm_s2pud_readonly(pud_t *pud)
 }
 
 
+static inline bool kvm_s2pud_exec(pud_t *pud)
+{
+   BUG();
+   return false;
+}
+
 static inline void kvm_set_pmd(pmd_t *pmd, pmd_t new_pmd)
 {
*pmd = new_pmd;
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index 84051930ddfe..15bc1be8f82f 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -249,6 +249,11 @@ static inline bool kvm_s2pud_readonly(pud_t *pudp)
return kvm_s2pte_readonly((pte_t *)pudp);
 }
 
+static inline bool kvm_s2pud_exec(pud_t *pudp)
+{
+   return !(READ_ONCE(pud_val(*pudp)) & PUD_S2_XN);
+}
+
 static inline bool kvm_page_empty(void *ptr)
 {
struct page *ptr_page = virt_to_page(ptr);
diff --git a/arch/arm64/include/asm/pgtable-hwdef.h 
b/arch/arm64/include/asm/pgtable-hwdef.h
index fd208eac9f2a..10ae592b78b8 100644
--- a/arch/arm64/include/asm/pgtable-hwdef.h
+++ b/arch/arm64/include/asm/pgtable-hwdef.h
@@ -193,6 +193,8 @@
 #define PMD_S2_RDWR(_AT(pmdval_t, 3) << 6)   /* HAP[2:1] */
 #define PMD_S2_XN  (_AT(pmdval_t, 2) << 53)  /* XN[1:0] */
 
+#define PUD_S2_XN  (_AT(pudval_t, 2) << 53)  /* XN[1:0] */
+
 /*
  * Memory Attribute override for Stage-2 (MemAttr[3:0])
  */
diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index ed8f8271c389..3839d0e3766d 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -1038,23 +1038,66 @@ static int stage2_set_pmd_huge(struct kvm *kvm, struct 
kvm_mmu_memory_cache
return 0;
 }
 
-static bool stage2_is_exec(struct kvm *kvm, phys_addr_t addr)
+/*
+ * stage2_get_leaf_entry - walk the stage2 VM page tables and return
+ * true if a valid and present leaf-entry is found. A pointer to the
+ * leaf-entry is returned in the appropriate level variable - pudpp,
+ * pmdpp, ptepp.
+ */
+static bool stage2_get_leaf_entry(struct kvm *kvm, phys_addr_t addr,
+ pud_t **pudpp, pmd_t **pmdpp, pte_t **ptepp)
 {
+   pud_t *pudp;
pmd_t *pmdp;
pte_t *ptep;
 
-   pmdp = stage2_get_pmd(kvm, NULL, addr);
+   *pudpp = NULL;
+   *pmdpp = NULL;
+   *ptepp = NULL;
+
+   pudp = stage2_get_pud(kvm, NULL, addr);
+   if (!pudp || pud_none(*pudp) || !pud_present(*pudp))
+   return false;
+
+   if (pud_huge(*pudp)) {
+   *pudpp = pudp;
+   return true;
+   }
+
+   pmdp = stage2_pmd_offset(pudp, addr);
if (!pmdp || pmd_none(*pmdp) || !pmd_present(*pmdp))
return false;
 
-   if (pmd_thp_or_huge(*pmdp))
-   return kvm_s2pmd_exec(pmdp);
+   if (pmd_thp_or_huge(*pmdp)) {
+   *pmdpp = pmdp;
+   return true;
+   }
 
ptep = pte_offset_kernel(pmdp, addr);
if (!ptep || pte_none(*ptep) || !pte_present(*ptep))
return false;
 
-   return kvm_s2pte_exec(ptep);
+   *ptepp = ptep;
+   return true;
+}
+
+static bool stage2_is_exec(struct kvm *kvm, phys_addr_t addr)
+{
+   pud_t *pudp;
+   pmd_t *pmdp;
+   pte_t *ptep;
+   bool found;
+
+   found = stage2_get_leaf_entry(kvm, addr, &pudp, &pmdp, &ptep);
+   if (!found)
+   return false;
+
+   if (pudp)
+   return kvm_s2pud_exec(pudp);
+   else if (pmdp)
+   return kvm_s2pmd_exec(pmdp);
+   else
+   return kvm_s2pte_exec(ptep);
 }
 
 static int stage2_set_pte(struct kvm *kvm, struct kvm_mmu_memory_cache *cache,
-- 
2.17.1



[PATCH v6 8/8] KVM: arm64: Add support for creating PUD hugepages at stage 2

2018-07-16 Thread Punit Agrawal
KVM only supports PMD hugepages at stage 2. Now that the various page
handling routines are updated, extend the stage 2 fault handling to
map in PUD hugepages.

Addition of PUD hugepage support enables additional page sizes (e.g.,
1G with 4K granule) which can be useful on cores that support mapping
larger block sizes in the TLB entries.
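
(As a rough rule of thumb, not spelled out in the patch: a table at each
level holds granule/8 entries, so a block at a given level covers
granule/8 times the size of the level below. With a 4K granule that is
512 * 4KB = 2MB for PMD blocks and 512 * 2MB = 1GB for PUD blocks; with
a 64K granule, 8192 * 64KB = 512MB for PMD blocks, matching the sizes
quoted elsewhere in this series.)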

Signed-off-by: Punit Agrawal 
Cc: Christoffer Dall 
Cc: Marc Zyngier 
Cc: Russell King 
Cc: Catalin Marinas 
Cc: Will Deacon 
---
 arch/arm/include/asm/kvm_mmu.h | 19 +
 arch/arm64/include/asm/kvm_mmu.h   | 15 
 arch/arm64/include/asm/pgtable-hwdef.h |  2 +
 arch/arm64/include/asm/pgtable.h   |  2 +
 virt/kvm/arm/mmu.c | 98 --
 5 files changed, 131 insertions(+), 5 deletions(-)

diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
index 8225ec15cae7..665c746c46ce 100644
--- a/arch/arm/include/asm/kvm_mmu.h
+++ b/arch/arm/include/asm/kvm_mmu.h
@@ -77,11 +77,14 @@ void kvm_clear_hyp_idmap(void);
 
 #define kvm_pfn_pte(pfn, prot) pfn_pte(pfn, prot)
 #define kvm_pfn_pmd(pfn, prot) pfn_pmd(pfn, prot)
+#define kvm_pfn_pud(pfn, prot) (__pud(0))
 
 #define kvm_pud_pfn(pud)   ({ BUG(); 0; })
 
 
 #define kvm_pmd_mkhuge(pmd)pmd_mkhuge(pmd)
+/* No support for pud hugepages */
+#define kvm_pud_mkhuge(pud)(pud)
 
 /*
  * The following kvm_*pud*() functions are provided strictly to allow
@@ -98,6 +101,22 @@ static inline bool kvm_s2pud_readonly(pud_t *pud)
return false;
 }
 
+static inline void kvm_set_pud(pud_t *pud, pud_t new_pud)
+{
+   BUG();
+}
+
+static inline pud_t kvm_s2pud_mkwrite(pud_t pud)
+{
+   BUG();
+   return pud;
+}
+
+static inline pud_t kvm_s2pud_mkexec(pud_t pud)
+{
+   BUG();
+   return pud;
+}
 
 static inline bool kvm_s2pud_exec(pud_t *pud)
 {
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index c542052fb199..dd8a23159463 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -171,13 +171,16 @@ void kvm_clear_hyp_idmap(void);
 
 #definekvm_set_pte(ptep, pte)  set_pte(ptep, pte)
 #definekvm_set_pmd(pmdp, pmd)  set_pmd(pmdp, pmd)
+#define kvm_set_pud(pudp, pud) set_pud(pudp, pud)
 
 #define kvm_pfn_pte(pfn, prot) pfn_pte(pfn, prot)
 #define kvm_pfn_pmd(pfn, prot) pfn_pmd(pfn, prot)
+#define kvm_pfn_pud(pfn, prot) pfn_pud(pfn, prot)
 
 #define kvm_pud_pfn(pud)   pud_pfn(pud)
 
 #define kvm_pmd_mkhuge(pmd)pmd_mkhuge(pmd)
+#define kvm_pud_mkhuge(pud)pud_mkhuge(pud)
 
 static inline pte_t kvm_s2pte_mkwrite(pte_t pte)
 {
@@ -191,6 +194,12 @@ static inline pmd_t kvm_s2pmd_mkwrite(pmd_t pmd)
return pmd;
 }
 
+static inline pud_t kvm_s2pud_mkwrite(pud_t pud)
+{
+   pud_val(pud) |= PUD_S2_RDWR;
+   return pud;
+}
+
 static inline pte_t kvm_s2pte_mkexec(pte_t pte)
 {
pte_val(pte) &= ~PTE_S2_XN;
@@ -203,6 +212,12 @@ static inline pmd_t kvm_s2pmd_mkexec(pmd_t pmd)
return pmd;
 }
 
+static inline pud_t kvm_s2pud_mkexec(pud_t pud)
+{
+   pud_val(pud) &= ~PUD_S2_XN;
+   return pud;
+}
+
 static inline void kvm_set_s2pte_readonly(pte_t *ptep)
 {
pteval_t old_pteval, pteval;
diff --git a/arch/arm64/include/asm/pgtable-hwdef.h 
b/arch/arm64/include/asm/pgtable-hwdef.h
index 10ae592b78b8..e327665e94d1 100644
--- a/arch/arm64/include/asm/pgtable-hwdef.h
+++ b/arch/arm64/include/asm/pgtable-hwdef.h
@@ -193,6 +193,8 @@
 #define PMD_S2_RDWR(_AT(pmdval_t, 3) << 6)   /* HAP[2:1] */
 #define PMD_S2_XN  (_AT(pmdval_t, 2) << 53)  /* XN[1:0] */
 
+#define PUD_S2_RDONLY  (_AT(pudval_t, 1) << 6)   /* HAP[2:1] */
+#define PUD_S2_RDWR(_AT(pudval_t, 3) << 6)   /* HAP[2:1] */
 #define PUD_S2_XN  (_AT(pudval_t, 2) << 53)  /* XN[1:0] */
 
 /*
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 4d9476e420d9..0afc34f94ff5 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -389,6 +389,8 @@ static inline int pmd_protnone(pmd_t pmd)
 #define pud_mkyoung(pud)   pte_pud(pte_mkyoung(pud_pte(pud)))
 #define pud_write(pud) pte_write(pud_pte(pud))
 
+#define pud_mkhuge(pud)(__pud(pud_val(pud) & ~PUD_TABLE_BIT))
+
 #define __pud_to_phys(pud) __pte_to_phys(pud_pte(pud))
 #define __phys_to_pud_val(phys)__phys_to_pte_val(phys)
 #define pud_pfn(pud)   ((__pud_to_phys(pud) & PUD_MASK) >> PAGE_SHIFT)
diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index c00155fe05c3..552fceb0521b 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -116,6 +116,25 @@ static void stage2_dissolve_pmd(struct kvm *kvm, 
phys_addr_t addr, pmd_t *pmd)
put_page(virt_to_page(pmd));
 }
 
+/**
+ * stage2_dissolve_pud() - clear and flush huge PUD entry
+ * @kvm:

[PATCH v6 6/8] KVM: arm64: Support handling access faults for PUD hugepages

2018-07-16 Thread Punit Agrawal
In preparation for creating larger hugepages at Stage 2, extend the
access fault handling at Stage 2 to support PUD hugepages when
encountered.

Provide trivial helpers for arm32 to allow sharing of code.

Signed-off-by: Punit Agrawal 
Cc: Christoffer Dall 
Cc: Marc Zyngier 
Cc: Russell King 
Cc: Catalin Marinas 
Cc: Will Deacon 
---
 arch/arm/include/asm/kvm_mmu.h   |  9 +
 arch/arm64/include/asm/kvm_mmu.h |  7 +++
 arch/arm64/include/asm/pgtable.h |  6 ++
 virt/kvm/arm/mmu.c   | 22 +++---
 4 files changed, 33 insertions(+), 11 deletions(-)

diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
index ec0c58e139da..13c0ee73756e 100644
--- a/arch/arm/include/asm/kvm_mmu.h
+++ b/arch/arm/include/asm/kvm_mmu.h
@@ -78,6 +78,9 @@ void kvm_clear_hyp_idmap(void);
 #define kvm_pfn_pte(pfn, prot) pfn_pte(pfn, prot)
 #define kvm_pfn_pmd(pfn, prot) pfn_pmd(pfn, prot)
 
+#define kvm_pud_pfn(pud)   ({ BUG(); 0; })
+
+
 #define kvm_pmd_mkhuge(pmd)pmd_mkhuge(pmd)
 
 /*
@@ -102,6 +105,12 @@ static inline bool kvm_s2pud_exec(pud_t *pud)
return false;
 }
 
+static inline pud_t kvm_s2pud_mkyoung(pud_t pud)
+{
+   BUG();
+   return pud;
+}
+
 static inline void kvm_set_pmd(pmd_t *pmd, pmd_t new_pmd)
 {
*pmd = new_pmd;
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index 15bc1be8f82f..4d2780c588b0 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -175,6 +175,8 @@ void kvm_clear_hyp_idmap(void);
 #define kvm_pfn_pte(pfn, prot) pfn_pte(pfn, prot)
 #define kvm_pfn_pmd(pfn, prot) pfn_pmd(pfn, prot)
 
+#define kvm_pud_pfn(pud)   pud_pfn(pud)
+
 #define kvm_pmd_mkhuge(pmd)pmd_mkhuge(pmd)
 
 static inline pte_t kvm_s2pte_mkwrite(pte_t pte)
@@ -254,6 +256,11 @@ static inline bool kvm_s2pud_exec(pud_t *pudp)
return !(READ_ONCE(pud_val(*pudp)) & PUD_S2_XN);
 }
 
+static inline pud_t kvm_s2pud_mkyoung(pud_t pud)
+{
+   return pud_mkyoung(pud);
+}
+
 static inline bool kvm_page_empty(void *ptr)
 {
struct page *ptr_page = virt_to_page(ptr);
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 1bdeca8918a6..a64a5c35beb1 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -314,6 +314,11 @@ static inline pte_t pud_pte(pud_t pud)
return __pte(pud_val(pud));
 }
 
+static inline pud_t pte_pud(pte_t pte)
+{
+   return __pud(pte_val(pte));
+}
+
 static inline pmd_t pud_pmd(pud_t pud)
 {
return __pmd(pud_val(pud));
@@ -380,6 +385,7 @@ static inline int pmd_protnone(pmd_t pmd)
 #define pfn_pmd(pfn,prot)  __pmd(__phys_to_pmd_val((phys_addr_t)(pfn) << 
PAGE_SHIFT) | pgprot_val(prot))
 #define mk_pmd(page,prot)  pfn_pmd(page_to_pfn(page),prot)
 
+#define pud_mkyoung(pud)   pte_pud(pte_mkyoung(pud_pte(pud)))
 #define pud_write(pud) pte_write(pud_pte(pud))
 
 #define __pud_to_phys(pud) __pte_to_phys(pud_pte(pud))
diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index 3839d0e3766d..6dc40b710d0d 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -1641,6 +1641,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, 
phys_addr_t fault_ipa,
  */
 static void handle_access_fault(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa)
 {
+   pud_t *pud;
pmd_t *pmd;
pte_t *pte;
kvm_pfn_t pfn;
@@ -1650,24 +1651,23 @@ static void handle_access_fault(struct kvm_vcpu *vcpu, 
phys_addr_t fault_ipa)
 
spin_lock(&vcpu->kvm->mmu_lock);
 
-   pmd = stage2_get_pmd(vcpu->kvm, NULL, fault_ipa);
-   if (!pmd || pmd_none(*pmd)) /* Nothing there */
+   if (!stage2_get_leaf_entry(vcpu->kvm, fault_ipa, &pud, &pmd, &pte))
goto out;
 
-   if (pmd_thp_or_huge(*pmd)) {/* THP, HugeTLB */
+   if (pud) {  /* HugeTLB */
+   *pud = kvm_s2pud_mkyoung(*pud);
+   pfn = kvm_pud_pfn(*pud);
+   pfn_valid = true;
+   } else  if (pmd) {  /* THP, HugeTLB */
*pmd = pmd_mkyoung(*pmd);
pfn = pmd_pfn(*pmd);
pfn_valid = true;
-   goto out;
+   } else {
+   *pte = pte_mkyoung(*pte);   /* Just a page... */
+   pfn = pte_pfn(*pte);
+   pfn_valid = true;
}
 
-   pte = pte_offset_kernel(pmd, fault_ipa);
-   if (pte_none(*pte)) /* Nothing there either */
-   goto out;
-
-   *pte = pte_mkyoung(*pte);   /* Just a page... */
-   pfn = pte_pfn(*pte);
-   pfn_valid = true;
 out:
spin_unlock(&vcpu->kvm->mmu_lock);
if (pfn_valid)
-- 
2.17.1



[PATCH v6 4/8] KVM: arm64: Support dirty page tracking for PUD hugepages

2018-07-16 Thread Punit Agrawal
In preparation for creating PUD hugepages at stage 2, add support for
write protecting PUD hugepages when they are encountered. Write
protecting guest tables is used to track dirty pages when migrating
VMs.

Also, provide trivial implementations of required kvm_s2pud_* helpers
to allow sharing of code with arm32.

Signed-off-by: Punit Agrawal 
Reviewed-by: Christoffer Dall 
Reviewed-by: Suzuki K Poulose 
Cc: Marc Zyngier 
Cc: Russell King 
Cc: Catalin Marinas 
Cc: Will Deacon 
---
 arch/arm/include/asm/kvm_mmu.h   | 16 
 arch/arm64/include/asm/kvm_mmu.h | 10 ++
 virt/kvm/arm/mmu.c   | 11 +++
 3 files changed, 33 insertions(+), 4 deletions(-)

diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
index d095c2d0b284..c3ac7a76fb69 100644
--- a/arch/arm/include/asm/kvm_mmu.h
+++ b/arch/arm/include/asm/kvm_mmu.h
@@ -80,6 +80,22 @@ void kvm_clear_hyp_idmap(void);
 
 #define kvm_pmd_mkhuge(pmd)pmd_mkhuge(pmd)
 
+/*
+ * The following kvm_*pud*() functions are provided strictly to allow
+ * sharing code with arm64. They should never be called in practice.
+ */
+static inline void kvm_set_s2pud_readonly(pud_t *pud)
+{
+   BUG();
+}
+
+static inline bool kvm_s2pud_readonly(pud_t *pud)
+{
+   BUG();
+   return false;
+}
+
+
 static inline void kvm_set_pmd(pmd_t *pmd, pmd_t new_pmd)
 {
*pmd = new_pmd;
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index 689def9bb9d5..84051930ddfe 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -239,6 +239,16 @@ static inline bool kvm_s2pmd_exec(pmd_t *pmdp)
return !(READ_ONCE(pmd_val(*pmdp)) & PMD_S2_XN);
 }
 
+static inline void kvm_set_s2pud_readonly(pud_t *pudp)
+{
+   kvm_set_s2pte_readonly((pte_t *)pudp);
+}
+
+static inline bool kvm_s2pud_readonly(pud_t *pudp)
+{
+   return kvm_s2pte_readonly((pte_t *)pudp);
+}
+
 static inline bool kvm_page_empty(void *ptr)
 {
struct page *ptr_page = virt_to_page(ptr);
diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index e131b7f9b7d7..ed8f8271c389 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -1288,9 +1288,12 @@ static void  stage2_wp_puds(pgd_t *pgd, phys_addr_t 
addr, phys_addr_t end)
do {
next = stage2_pud_addr_end(addr, end);
if (!stage2_pud_none(*pud)) {
-   /* TODO:PUD not supported, revisit later if supported */
-   BUG_ON(stage2_pud_huge(*pud));
-   stage2_wp_pmds(pud, addr, next);
+   if (stage2_pud_huge(*pud)) {
+   if (!kvm_s2pud_readonly(pud))
+   kvm_set_s2pud_readonly(pud);
+   } else {
+   stage2_wp_pmds(pud, addr, next);
+   }
}
} while (pud++, addr = next, addr != end);
 }
@@ -1333,7 +1336,7 @@ static void stage2_wp_range(struct kvm *kvm, phys_addr_t 
addr, phys_addr_t end)
  *
  * Called to start logging dirty pages after memory region
  * KVM_MEM_LOG_DIRTY_PAGES operation is called. After this function returns
- * all present PMD and PTEs are write protected in the memory region.
+ * all present PUD, PMD and PTEs are write protected in the memory region.
  * Afterwards read of dirty page log can be called.
  *
  * Acquires kvm_mmu_lock. Called with kvm->slots_lock mutex acquired,
-- 
2.17.1



[PATCH v6 0/8] KVM: Support PUD hugepages at stage 2

2018-07-16 Thread Punit Agrawal
This series is an update to the PUD hugepage support previously posted
at [0]. This patchset adds support for PUD hugepages at stage 2, a
feature that is useful on cores that have support for large sized TLB
mappings (e.g., 1GB for 4K granule).

This version adds tags and addresses feedback received on
v5. Additionally, Patch 1 (1 & 2 in this version) has been split to
help make it easy to review.

Support is added to code that is shared between arm and arm64. Dummy
helpers for arm are provided as the port does not support PUD hugepage
sizes.

The patches have been tested on an A57 based system. The patchset is
based on v4.18-rc5. There are a few conflicts with the support for 52
bit IPA[1] - I'll work with Suzuki to resolve this once the code is in
the right shape.

Thanks,
Punit

v5 -> v6

* Split Patch 1 to move out the refactoring of exec permissions on
  page table entries.
* Patch 4 - Initialise p*dpp pointers in stage2_get_leaf_entry()
* Patch 5 - Trigger a BUG() in kvm_pud_pfn() on arm

v4 -> v5:
* Patch 1 - Drop helper stage2_should_exec() and refactor the
  condition to decide if a page table entry should be marked
  executable
* Patch 4-6 - Introduce stage2_get_leaf_entry() and use it in this and
  latter patches
* Patch 7 - Use stage 2 accessors instead of using the page table
  helpers directly
* Patch 7 - Add a note to update the PUD hugepage support when number
  of levels of stage 2 tables differs from stage 1

v3 -> v4:
* Patch 1 and 7 - Don't put down hugepages pte if logging is enabled
* Patch 4-5 - Add PUD hugepage support for exec and access faults
* Patch 6 - PUD hugepage support for aging page table entries

v2 -> v3:
* Update vma_pagesize directly if THP [1/4]. Previously this was done
  indirectly via hugetlb
* Added review tag [4/4]

v1 -> v2:
* Create helper to check if the page should have exec permission [1/4]
* Fix broken condition to detect THP hugepage [1/4]
* Fix in-correct hunk resulting from a rebase [4/4]

[0] https://www.spinics.net/lists/arm-kernel/msg664276.html
[1] https://www.spinics.net/lists/kvm/msg171065.html

Punit Agrawal (8):
  KVM: arm/arm64: Share common code in user_mem_abort()
  KVM: arm/arm64: Re-factor setting the Stage 2 entry to exec on fault
  KVM: arm/arm64: Introduce helpers to manipulate page table entries
  KVM: arm64: Support dirty page tracking for PUD hugepages
  KVM: arm64: Support PUD hugepage in stage2_is_exec()
  KVM: arm64: Support handling access faults for PUD hugepages
  KVM: arm64: Update age handlers to support PUD hugepages
  KVM: arm64: Add support for creating PUD hugepages at stage 2

 arch/arm/include/asm/kvm_mmu.h |  61 +
 arch/arm64/include/asm/kvm_mmu.h   |  47 
 arch/arm64/include/asm/pgtable-hwdef.h |   4 +
 arch/arm64/include/asm/pgtable.h   |   9 +
 virt/kvm/arm/mmu.c | 294 ++---
 5 files changed, 341 insertions(+), 74 deletions(-)

-- 
2.17.1



[PATCH v6 3/8] KVM: arm/arm64: Introduce helpers to manipulate page table entries

2018-07-16 Thread Punit Agrawal
Introduce helpers to abstract architectural handling of the conversion
of pfn to page table entries and marking a PMD page table entry as a
block entry.

The helpers are introduced in preparation for supporting PUD hugepages
at stage 2 - which are supported on arm64 but do not exist on arm.

Signed-off-by: Punit Agrawal 
Reviewed-by: Suzuki K Poulose 
Acked-by: Christoffer Dall 
Cc: Marc Zyngier 
Cc: Russell King 
Cc: Catalin Marinas 
Cc: Will Deacon 
---
 arch/arm/include/asm/kvm_mmu.h   | 5 +
 arch/arm64/include/asm/kvm_mmu.h | 5 +
 virt/kvm/arm/mmu.c   | 8 +---
 3 files changed, 15 insertions(+), 3 deletions(-)

diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
index 8553d68b7c8a..d095c2d0b284 100644
--- a/arch/arm/include/asm/kvm_mmu.h
+++ b/arch/arm/include/asm/kvm_mmu.h
@@ -75,6 +75,11 @@ phys_addr_t kvm_get_idmap_vector(void);
 int kvm_mmu_init(void);
 void kvm_clear_hyp_idmap(void);
 
+#define kvm_pfn_pte(pfn, prot) pfn_pte(pfn, prot)
+#define kvm_pfn_pmd(pfn, prot) pfn_pmd(pfn, prot)
+
+#define kvm_pmd_mkhuge(pmd)pmd_mkhuge(pmd)
+
 static inline void kvm_set_pmd(pmd_t *pmd, pmd_t new_pmd)
 {
*pmd = new_pmd;
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index fb9a7127bb75..689def9bb9d5 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -172,6 +172,11 @@ void kvm_clear_hyp_idmap(void);
 #definekvm_set_pte(ptep, pte)  set_pte(ptep, pte)
 #definekvm_set_pmd(pmdp, pmd)  set_pmd(pmdp, pmd)
 
+#define kvm_pfn_pte(pfn, prot) pfn_pte(pfn, prot)
+#define kvm_pfn_pmd(pfn, prot) pfn_pmd(pfn, prot)
+
+#define kvm_pmd_mkhuge(pmd)pmd_mkhuge(pmd)
+
 static inline pte_t kvm_s2pte_mkwrite(pte_t pte)
 {
pte_val(pte) |= PTE_S2_RDWR;
diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index ea3d992e4fb7..e131b7f9b7d7 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -1554,8 +1554,10 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, 
phys_addr_t fault_ipa,
(fault_status == FSC_PERM && stage2_is_exec(kvm, fault_ipa));
 
if (hugetlb && vma_pagesize == PMD_SIZE) {
-   pmd_t new_pmd = pfn_pmd(pfn, mem_type);
-   new_pmd = pmd_mkhuge(new_pmd);
+   pmd_t new_pmd = kvm_pfn_pmd(pfn, mem_type);
+
+   new_pmd = kvm_pmd_mkhuge(new_pmd);
+
if (writable)
new_pmd = kvm_s2pmd_mkwrite(new_pmd);
 
@@ -1564,7 +1566,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, 
phys_addr_t fault_ipa,
 
ret = stage2_set_pmd_huge(kvm, memcache, fault_ipa, &new_pmd);
} else {
-   pte_t new_pte = pfn_pte(pfn, mem_type);
+   pte_t new_pte = kvm_pfn_pte(pfn, mem_type);
 
if (writable) {
new_pte = kvm_s2pte_mkwrite(new_pte);
-- 
2.17.1



[PATCH v6 2/8] KVM: arm/arm64: Re-factor setting the Stage 2 entry to exec on fault

2018-07-16 Thread Punit Agrawal
Stage 2 fault handler marks a page as executable if it is handling an
execution fault or if it was a permission fault in which case the
executable bit needs to be preserved.

The logic to decide if the page should be marked executable is
duplicated for PMD and PTE entries. To avoid creating another copy
when support for PUD hugepages is introduced refactor the code to
share the checks needed to mark a page table entry as executable.

Signed-off-by: Punit Agrawal 
Cc: Christoffer Dall 
Cc: Marc Zyngier 
---
 virt/kvm/arm/mmu.c | 28 +++-
 1 file changed, 15 insertions(+), 13 deletions(-)

diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index 1c8d407a92ce..ea3d992e4fb7 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -1422,7 +1422,8 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, 
phys_addr_t fault_ipa,
  unsigned long fault_status)
 {
int ret;
-   bool write_fault, exec_fault, writable, hugetlb = false, force_pte = 
false;
+   bool write_fault, writable, hugetlb = false, force_pte = false;
+   bool exec_fault, needs_exec;
unsigned long mmu_seq;
gfn_t gfn = fault_ipa >> PAGE_SHIFT;
struct kvm *kvm = vcpu->kvm;
@@ -1541,19 +1542,25 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, 
phys_addr_t fault_ipa,
if (exec_fault)
invalidate_icache_guest_page(pfn, vma_pagesize);
 
+   /*
+* If we took an execution fault we have made the
+* icache/dcache coherent above and should now let the s2
+* mapping be executable.
+*
+* Write faults (!exec_fault && FSC_PERM) are orthogonal to
+* execute permissions, and we preserve whatever we have.
+*/
+   needs_exec = exec_fault ||
+   (fault_status == FSC_PERM && stage2_is_exec(kvm, fault_ipa));
+
if (hugetlb && vma_pagesize == PMD_SIZE) {
pmd_t new_pmd = pfn_pmd(pfn, mem_type);
new_pmd = pmd_mkhuge(new_pmd);
if (writable)
new_pmd = kvm_s2pmd_mkwrite(new_pmd);
 
-   if (exec_fault) {
+   if (needs_exec)
new_pmd = kvm_s2pmd_mkexec(new_pmd);
-   } else if (fault_status == FSC_PERM) {
-   /* Preserve execute if XN was already cleared */
-   if (stage2_is_exec(kvm, fault_ipa))
-   new_pmd = kvm_s2pmd_mkexec(new_pmd);
-   }
 
ret = stage2_set_pmd_huge(kvm, memcache, fault_ipa, &new_pmd);
} else {
@@ -1564,13 +1571,8 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, 
phys_addr_t fault_ipa,
mark_page_dirty(kvm, gfn);
}
 
-   if (exec_fault) {
+   if (needs_exec)
new_pte = kvm_s2pte_mkexec(new_pte);
-   } else if (fault_status == FSC_PERM) {
-   /* Preserve execute if XN was already cleared */
-   if (stage2_is_exec(kvm, fault_ipa))
-   new_pte = kvm_s2pte_mkexec(new_pte);
-   }
 
ret = stage2_set_pte(kvm, memcache, fault_ipa, &new_pte, flags);
}
-- 
2.17.1



[PATCH v6 1/8] KVM: arm/arm64: Share common code in user_mem_abort()

2018-07-16 Thread Punit Agrawal
The code for operations such as marking the pfn as dirty, and
dcache/icache maintenance during stage 2 fault handling is duplicated
between normal pages and PMD hugepages.

Instead of creating another copy of the operations when we introduce
PUD hugepages, let's share them across the different pagesizes.

Signed-off-by: Punit Agrawal 
Cc: Christoffer Dall 
Cc: Marc Zyngier 
---
 virt/kvm/arm/mmu.c | 39 +++
 1 file changed, 23 insertions(+), 16 deletions(-)

diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index 1d90d79706bd..1c8d407a92ce 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -1431,7 +1431,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, 
phys_addr_t fault_ipa,
kvm_pfn_t pfn;
pgprot_t mem_type = PAGE_S2;
bool logging_active = memslot_is_logging(memslot);
-   unsigned long flags = 0;
+   unsigned long vma_pagesize, flags = 0;
 
write_fault = kvm_is_write_fault(vcpu);
exec_fault = kvm_vcpu_trap_is_iabt(vcpu);
@@ -1451,7 +1451,8 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, 
phys_addr_t fault_ipa,
return -EFAULT;
}
 
-   if (vma_kernel_pagesize(vma) == PMD_SIZE && !logging_active) {
+   vma_pagesize = vma_kernel_pagesize(vma);
+   if (vma_pagesize == PMD_SIZE && !logging_active) {
hugetlb = true;
gfn = (fault_ipa & PMD_MASK) >> PAGE_SHIFT;
} else {
@@ -1520,23 +1521,34 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, 
phys_addr_t fault_ipa,
if (mmu_notifier_retry(kvm, mmu_seq))
goto out_unlock;
 
-   if (!hugetlb && !force_pte)
+   if (!hugetlb && !force_pte) {
+   /*
+* Only PMD_SIZE transparent hugepages(THP) are
+* currently supported. This code will need to be
+* updated to support other THP sizes.
+*/
hugetlb = transparent_hugepage_adjust(&pfn, &fault_ipa);
+   if (hugetlb)
+   vma_pagesize = PMD_SIZE;
+   }
+
+   if (writable)
+   kvm_set_pfn_dirty(pfn);
+
+   if (fault_status != FSC_PERM)
+   clean_dcache_guest_page(pfn, vma_pagesize);
+
+   if (exec_fault)
+   invalidate_icache_guest_page(pfn, vma_pagesize);
 
-   if (hugetlb) {
+   if (hugetlb && vma_pagesize == PMD_SIZE) {
pmd_t new_pmd = pfn_pmd(pfn, mem_type);
new_pmd = pmd_mkhuge(new_pmd);
-   if (writable) {
+   if (writable)
new_pmd = kvm_s2pmd_mkwrite(new_pmd);
-   kvm_set_pfn_dirty(pfn);
-   }
-
-   if (fault_status != FSC_PERM)
-   clean_dcache_guest_page(pfn, PMD_SIZE);
 
if (exec_fault) {
new_pmd = kvm_s2pmd_mkexec(new_pmd);
-   invalidate_icache_guest_page(pfn, PMD_SIZE);
} else if (fault_status == FSC_PERM) {
/* Preserve execute if XN was already cleared */
if (stage2_is_exec(kvm, fault_ipa))
@@ -1549,16 +1561,11 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, 
phys_addr_t fault_ipa,
 
if (writable) {
new_pte = kvm_s2pte_mkwrite(new_pte);
-   kvm_set_pfn_dirty(pfn);
mark_page_dirty(kvm, gfn);
}
 
-   if (fault_status != FSC_PERM)
-   clean_dcache_guest_page(pfn, PAGE_SIZE);
-
if (exec_fault) {
new_pte = kvm_s2pte_mkexec(new_pte);
-   invalidate_icache_guest_page(pfn, PAGE_SIZE);
} else if (fault_status == FSC_PERM) {
/* Preserve execute if XN was already cleared */
if (stage2_is_exec(kvm, fault_ipa))
-- 
2.17.1



Re: [PATCH v5 7/7] KVM: arm64: Add support for creating PUD hugepages at stage 2

2018-07-11 Thread Punit Agrawal
Suzuki K Poulose  writes:

> On 11/07/18 17:05, Punit Agrawal wrote:
>> Suzuki K Poulose  writes:
>>
>>> On 09/07/18 15:41, Punit Agrawal wrote:
>>>> KVM only supports PMD hugepages at stage 2. Now that the various page
>>>> handling routines are updated, extend the stage 2 fault handling to
>>>> map in PUD hugepages.
>>>>
>>>> Addition of PUD hugepage support enables additional page sizes (e.g.,
>>>> 1G with 4K granule) which can be useful on cores that support mapping
>>>> larger block sizes in the TLB entries.
>>>>
>>>> Signed-off-by: Punit Agrawal 
>>>> Cc: Christoffer Dall 
>>>> Cc: Marc Zyngier 
>>>> Cc: Russell King 
>>>> Cc: Catalin Marinas 
>>>> Cc: Will Deacon 
>>>> ---
>>>>arch/arm/include/asm/kvm_mmu.h | 19 +++
>>>>arch/arm64/include/asm/kvm_mmu.h   | 15 +
>>>>arch/arm64/include/asm/pgtable-hwdef.h |  2 +
>>>>arch/arm64/include/asm/pgtable.h   |  2 +
>>>>virt/kvm/arm/mmu.c | 78 --
>>>>5 files changed, 112 insertions(+), 4 deletions(-)
>>>>
>>
>> [...]
>>
>>>> diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
>>>> index a6d3ac9d7c7a..d8e2497e5353 100644
>>>> --- a/virt/kvm/arm/mmu.c
>>>> +++ b/virt/kvm/arm/mmu.c
>>
>> [...]
>>
>>>> @@ -1100,6 +1139,7 @@ static int stage2_set_pte(struct kvm *kvm, struct 
>>>> kvm_mmu_memory_cache *cache,
>>>>  phys_addr_t addr, const pte_t *new_pte,
>>>>  unsigned long flags)
>>>>{
>>>> +  pud_t *pud;
>>>>pmd_t *pmd;
>>>>pte_t *pte, old_pte;
>>>>bool iomap = flags & KVM_S2PTE_FLAG_IS_IOMAP;
>>>> @@ -1108,6 +1148,22 @@ static int stage2_set_pte(struct kvm *kvm, struct 
>>>> kvm_mmu_memory_cache *cache,
>>>>VM_BUG_ON(logging_active && !cache);
>>>>/* Create stage-2 page table mapping - Levels 0 and 1 */
>>>> +  pud = stage2_get_pud(kvm, cache, addr);
>>>> +  if (!pud) {
>>>> +  /*
>>>> +   * Ignore calls from kvm_set_spte_hva for unallocated
>>>> +   * address ranges.
>>>> +   */
>>>> +  return 0;
>>>> +  }
>>>> +
>>>> +  /*
>>>> +   * While dirty page logging - dissolve huge PUD, then continue
>>>> +   * on to allocate page.
>>>
>>> Punit,
>>>
>>> We don't seem to allocate a page here for the PUD entry, in case if it is 
>>> dissolved
>>> or empty (i.e, stage2_pud_none(*pud) is true.).
>>
>> I was trying to avoid duplicating the PUD allocation by reusing the
>> functionality in stage2_get_pmd().
>>
>> Does the below updated comment help?
>>
>>  /*
>>   * While dirty page logging - dissolve huge PUD, it'll be
>>   * allocated in stage2_get_pmd().
>>   */
>>
>> The other option is to duplicate the stage2_pud_none() case from
>> stage2_get_pmd() here.
>
> I think the explicit check for stage2_pud_none() suits better here.
> That would make it explicit that we are tearing down the entries
> from top to bottom. Also, we may be able to short cut for case
> where we know we just allocated a PUD page and hence we need another
> PMD level page.

Ok, I'll add the PUD allocation code here.
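
Something along the lines of the sketch below, mirroring what
stage2_get_pmd() does today. Treat it as a sketch only; the helper names
stage2_pud_populate() and mmu_memory_cache_alloc() are borrowed from the
existing PMD path and the final patch may end up looking different:

	pud = stage2_get_pud(kvm, cache, addr);
	if (!pud)
		return 0;

	if (logging_active)
		stage2_dissolve_pud(kvm, addr, pud);

	if (stage2_pud_none(*pud)) {
		if (!cache)
			return 0;
		pmd = mmu_memory_cache_alloc(cache);
		stage2_pud_populate(pud, pmd);
		get_page(virt_to_page(pud));
	}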

>
> Also, you are missing the comment about the assumption that stage2 PUD
> level always exist with 4k fixed IPA.

Hmm... I'm quite sure I wrote a comment to that effect but can't find it
now. I'll include it in the next version.

Thanks,
Punit

>
> Cheers
> Suzuki


Re: [PATCH v5 2/7] KVM: arm/arm64: Introduce helpers to manupulate page table entries

2018-07-11 Thread Punit Agrawal
Suzuki K Poulose  writes:

> On 09/07/18 15:41, Punit Agrawal wrote:
>> Introduce helpers to abstract architectural handling of the conversion
>> of pfn to page table entries and marking a PMD page table entry as a
>> block entry.
>>
>> The helpers are introduced in preparation for supporting PUD hugepages
>> at stage 2 - which are supported on arm64 but do not exist on arm.
>>
>> Signed-off-by: Punit Agrawal 
>> Acked-by: Christoffer Dall 
>> Cc: Marc Zyngier 
>> Cc: Russell King 
>> Cc: Catalin Marinas 
>> Cc: Will Deacon 
>> ---
>
> Reviewed-by: Suzuki K Poulose 

Other than the query on Patch 7 I've incorporated all your suggestions
locally.

Thanks a lot for reviewing the patches.

Punit

ps: Just noticed the typo (manupulate) in the subject. I've fixed it up
locally.


Re: [PATCH v5 7/7] KVM: arm64: Add support for creating PUD hugepages at stage 2

2018-07-11 Thread Punit Agrawal
Suzuki K Poulose  writes:

> On 09/07/18 15:41, Punit Agrawal wrote:
>> KVM only supports PMD hugepages at stage 2. Now that the various page
>> handling routines are updated, extend the stage 2 fault handling to
>> map in PUD hugepages.
>>
>> Addition of PUD hugepage support enables additional page sizes (e.g.,
>> 1G with 4K granule) which can be useful on cores that support mapping
>> larger block sizes in the TLB entries.
>>
>> Signed-off-by: Punit Agrawal 
>> Cc: Christoffer Dall 
>> Cc: Marc Zyngier 
>> Cc: Russell King 
>> Cc: Catalin Marinas 
>> Cc: Will Deacon 
>> ---
>>   arch/arm/include/asm/kvm_mmu.h | 19 +++
>>   arch/arm64/include/asm/kvm_mmu.h   | 15 +
>>   arch/arm64/include/asm/pgtable-hwdef.h |  2 +
>>   arch/arm64/include/asm/pgtable.h   |  2 +
>>   virt/kvm/arm/mmu.c | 78 --
>>   5 files changed, 112 insertions(+), 4 deletions(-)
>>

[...]

>> diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
>> index a6d3ac9d7c7a..d8e2497e5353 100644
>> --- a/virt/kvm/arm/mmu.c
>> +++ b/virt/kvm/arm/mmu.c

[...]

>> @@ -1100,6 +1139,7 @@ static int stage2_set_pte(struct kvm *kvm, struct 
>> kvm_mmu_memory_cache *cache,
>>phys_addr_t addr, const pte_t *new_pte,
>>unsigned long flags)
>>   {
>> +pud_t *pud;
>>  pmd_t *pmd;
>>  pte_t *pte, old_pte;
>>  bool iomap = flags & KVM_S2PTE_FLAG_IS_IOMAP;
>> @@ -1108,6 +1148,22 @@ static int stage2_set_pte(struct kvm *kvm, struct 
>> kvm_mmu_memory_cache *cache,
>>  VM_BUG_ON(logging_active && !cache);
>>  /* Create stage-2 page table mapping - Levels 0 and 1 */
>> +pud = stage2_get_pud(kvm, cache, addr);
>> +if (!pud) {
>> +/*
>> + * Ignore calls from kvm_set_spte_hva for unallocated
>> + * address ranges.
>> + */
>> +return 0;
>> +}
>> +
>> +/*
>> + * While dirty page logging - dissolve huge PUD, then continue
>> + * on to allocate page.
>
> Punit,
>
> We don't seem to allocate a page here for the PUD entry, in case if it is 
> dissolved
> or empty (i.e, stage2_pud_none(*pud) is true.).

I was trying to avoid duplicating the PUD allocation by reusing the
functionality in stage2_get_pmd().

Does the below updated comment help?

/*
 * While dirty page logging - dissolve huge PUD, it'll be
 * allocated in stage2_get_pmd().
 */

The other option is to duplicate the stage2_pud_none() case from
stage2_get_pmd() here.

What do you think?

Thanks,
Punit

>> + */
>> +if (logging_active)
>> +stage2_dissolve_pud(kvm, addr, pud);
>> +
>>  pmd = stage2_get_pmd(kvm, cache, addr);
>>  if (!pmd) {
>
> And once you add an entry, pmd is just the matter of getting 
> stage2_pmd_offset() from your pud.
> No need to start again from the top-level with stage2_get_pmd().
>
> Cheers
> Suzuki
>


Re: [PATCH v5 4/7] KVM: arm64: Support PUD hugepage in stage2_is_exec()

2018-07-11 Thread Punit Agrawal
Suzuki K Poulose  writes:

> On 09/07/18 15:41, Punit Agrawal wrote:
>> In preparation for creating PUD hugepages at stage 2, add support for
>> detecting execute permissions on PUD page table entries. Faults due to
>> lack of execute permissions on page table entries are used to perform
>> i-cache invalidation on first execute.
>>
>> Provide trivial implementations of arm32 helpers to allow sharing of
>> code.
>>
>> Signed-off-by: Punit Agrawal 
>> Cc: Christoffer Dall 
>> Cc: Marc Zyngier 
>> Cc: Russell King 
>> Cc: Catalin Marinas 
>> Cc: Will Deacon 
>> ---
>>   arch/arm/include/asm/kvm_mmu.h |  6 
>>   arch/arm64/include/asm/kvm_mmu.h   |  5 +++
>>   arch/arm64/include/asm/pgtable-hwdef.h |  2 ++
>>   virt/kvm/arm/mmu.c | 49 +++---
>>   4 files changed, 57 insertions(+), 5 deletions(-)
>>
>> diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
>> index c23722f75d5c..d05c8986e495 100644
>> --- a/arch/arm/include/asm/kvm_mmu.h
>> +++ b/arch/arm/include/asm/kvm_mmu.h
>> @@ -96,6 +96,12 @@ static inline bool kvm_s2pud_readonly(pud_t *pud)
>>   }
>> +static inline bool kvm_s2pud_exec(pud_t *pud)
>> +{
>> +BUG();
>> +return false;
>> +}
>> +
>>   static inline void kvm_set_pmd(pmd_t *pmd, pmd_t new_pmd)
>>   {
>>  *pmd = new_pmd;
>> diff --git a/arch/arm64/include/asm/kvm_mmu.h 
>> b/arch/arm64/include/asm/kvm_mmu.h
>> index 84051930ddfe..15bc1be8f82f 100644
>> --- a/arch/arm64/include/asm/kvm_mmu.h
>> +++ b/arch/arm64/include/asm/kvm_mmu.h
>> @@ -249,6 +249,11 @@ static inline bool kvm_s2pud_readonly(pud_t *pudp)
>>  return kvm_s2pte_readonly((pte_t *)pudp);
>>   }
>>   +static inline bool kvm_s2pud_exec(pud_t *pudp)
>> +{
>> +return !(READ_ONCE(pud_val(*pudp)) & PUD_S2_XN);
>> +}
>> +
>>   static inline bool kvm_page_empty(void *ptr)
>>   {
>>  struct page *ptr_page = virt_to_page(ptr);
>> diff --git a/arch/arm64/include/asm/pgtable-hwdef.h 
>> b/arch/arm64/include/asm/pgtable-hwdef.h
>> index fd208eac9f2a..10ae592b78b8 100644
>> --- a/arch/arm64/include/asm/pgtable-hwdef.h
>> +++ b/arch/arm64/include/asm/pgtable-hwdef.h
>> @@ -193,6 +193,8 @@
>>   #define PMD_S2_RDWR(_AT(pmdval_t, 3) << 6)   /* HAP[2:1] */
>>   #define PMD_S2_XN  (_AT(pmdval_t, 2) << 53)  /* XN[1:0] */
>>   +#define PUD_S2_XN (_AT(pudval_t, 2) << 53)  /* XN[1:0]
>> */
>> +
>>   /*
>>* Memory Attribute override for Stage-2 (MemAttr[3:0])
>>*/
>> diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
>> index ed8f8271c389..e73909a31e02 100644
>> --- a/virt/kvm/arm/mmu.c
>> +++ b/virt/kvm/arm/mmu.c
>> @@ -1038,23 +1038,62 @@ static int stage2_set_pmd_huge(struct kvm *kvm, 
>> struct kvm_mmu_memory_cache
>>  return 0;
>>   }
>>   -static bool stage2_is_exec(struct kvm *kvm, phys_addr_t addr)
>> +/*
>> + * stage2_get_leaf_entry - walk the stage2 VM page tables and return
>> + * true if a valid and present leaf-entry is found. A pointer to the
>> + * leaf-entry is returned in the appropriate level variable - pudpp,
>> + * pmdpp, ptepp.
>> + */
>> +static bool stage2_get_leaf_entry(struct kvm *kvm, phys_addr_t addr,
>> +  pud_t **pudpp, pmd_t **pmdpp, pte_t **ptepp)
>>   {
>> +pud_t *pudp;
>>  pmd_t *pmdp;
>>  pte_t *ptep;
>
> nit: As mentioned in the other thread, you may initialize the reference
> pointers to NULL to make sure we start clean and avoid the initialization
> everywhere this is called.

I took the approach to not touch the pointers unless they are being
assigned a valid pointer. I'll initialise the incoming pointers (p*dpp)
before proceeding with the table walk.

Thanks,
Punit

>
>>   -  pmdp = stage2_get_pmd(kvm, NULL, addr);
>> +pudp = stage2_get_pud(kvm, NULL, addr);
>> +if (!pudp || pud_none(*pudp) || !pud_present(*pudp))
>> +return false;
>> +
>> +if (pud_huge(*pudp)) {
>> +*pudpp = pudp;
>> +return true;
>> +}
>> +
>> +pmdp = stage2_pmd_offset(pudp, addr);
>>  if (!pmdp || pmd_none(*pmdp) || !pmd_present(*pmdp))
>>  return false;
>>   -  if (pmd_thp_or_huge(*pmdp))
>> -return kvm_s2pmd_exec(pmdp);
>> +if (pmd_thp_or_huge(*pmdp)) {
>> +*pmdpp = pmdp;
>> +re

Re: [PATCH v5 0/7] KVM: Support PUD hugepages at stage 2

2018-07-09 Thread Punit Agrawal
Please ignore this cover letter.

Apologies for the duplicate cover-letter and a somewhat funky threading
(I blame emacs unsaved buffer).

The patches appear to be intact so don't let the threading get in the
way of review.

Punit Agrawal  writes:

> This series is an update to the PUD hugepage support previously posted
> at [0]. This patchset adds support for PUD hugepages at stage
> 2. This feature is useful on cores that have support for large sized
> TLB mappings (e.g., 1GB for 4K granule).
>
> The biggest change in this version is to replace repeated instances of
> walking the page tables to get to a leaf-entry with a function to
> return the appropriate entry. This was suggested by Suzuki and should
> help reduce the amount of churn resulting from future changes. It also
> addresses other feedback on the previous version.
>
> Support is added to code that is shared between arm and arm64. Dummy
> helpers for arm are provided as the port does not support PUD hugepage
> sizes.
>
> The patches have been tested on an A57 based system. The patchset is
> based on v4.18-rc4. There are a few conflicts with the support for 52
> bit IPA[1] due to change in number of parameters for
> stage2_pmd_offset().
>
> Thanks,
> Punit
>
> v4 -> v5:
>
> * Patch 1 - Drop helper stage2_should_exec() and refactor the
>   condition to decide if a page table entry should be marked
>   executable
> * Patch 4-6 - Introduce stage2_get_leaf_entry() and use it in this and
>   later patches
> * Patch 7 - Use stage 2 accessors instead of using the page table
>   helpers directly
> * Patch 7 - Add a note to update the PUD hugepage support when number
>   of levels of stage 2 tables differs from stage 1
>
>
> v3 -> v4:
> * Patch 1 and 7 - Don't put down hugepages pte if logging is enabled
> * Patch 4-5 - Add PUD hugepage support for exec and access faults
> * Patch 6 - PUD hugepage support for aging page table entries
>
> v2 -> v3:
> * Update vma_pagesize directly if THP [1/4]. Previously this was done
>   indirectly via hugetlb
> * Added review tag [4/4]
>
> v1 -> v2:
> * Create helper to check if the page should have exec permission [1/4]
> * Fix broken condition to detect THP hugepage [1/4]
> * Fix incorrect hunk resulting from a rebase [4/4]
>
> [0] https://www.spinics.net/lists/arm-kernel/msg663562.html
> [1] https://www.spinics.net/lists/kvm/msg171065.html
>
>
> Punit Agrawal (7):
>   KVM: arm/arm64: Share common code in user_mem_abort()
>   KVM: arm/arm64: Introduce helpers to manipulate page table entries
>   KVM: arm64: Support dirty page tracking for PUD hugepages
>   KVM: arm64: Support PUD hugepage in stage2_is_exec()
>   KVM: arm64: Support handling access faults for PUD hugepages
>   KVM: arm64: Update age handlers to support PUD hugepages
>   KVM: arm64: Add support for creating PUD hugepages at stage 2
>
>  arch/arm/include/asm/kvm_mmu.h |  60 +
>  arch/arm64/include/asm/kvm_mmu.h   |  47 
>  arch/arm64/include/asm/pgtable-hwdef.h |   4 +
>  arch/arm64/include/asm/pgtable.h   |   9 +
>  virt/kvm/arm/mmu.c | 289 ++---
>  5 files changed, 330 insertions(+), 79 deletions(-)


[PATCH v5 5/7] KVM: arm64: Support handling access faults for PUD hugepages

2018-07-09 Thread Punit Agrawal
In preparation for creating larger hugepages at Stage 2, extend the
access fault handling at Stage 2 to support PUD hugepages when
encountered.

Provide trivial helpers for arm32 to allow sharing of code.

Signed-off-by: Punit Agrawal 
Cc: Christoffer Dall 
Cc: Marc Zyngier 
Cc: Russell King 
Cc: Catalin Marinas 
Cc: Will Deacon 
---
 arch/arm/include/asm/kvm_mmu.h   |  8 
 arch/arm64/include/asm/kvm_mmu.h |  7 +++
 arch/arm64/include/asm/pgtable.h |  6 ++
 virt/kvm/arm/mmu.c   | 29 -
 4 files changed, 37 insertions(+), 13 deletions(-)

diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
index d05c8986e495..a4298d429efc 100644
--- a/arch/arm/include/asm/kvm_mmu.h
+++ b/arch/arm/include/asm/kvm_mmu.h
@@ -78,6 +78,8 @@ void kvm_clear_hyp_idmap(void);
 #define kvm_pfn_pte(pfn, prot) pfn_pte(pfn, prot)
 #define kvm_pfn_pmd(pfn, prot) pfn_pmd(pfn, prot)
 
+#define kvm_pud_pfn(pud)   (((pud_val(pud) & PUD_MASK) & PHYS_MASK) >> 
PAGE_SHIFT)
+
 #define kvm_pmd_mkhuge(pmd)pmd_mkhuge(pmd)
 
 /*
@@ -102,6 +104,12 @@ static inline bool kvm_s2pud_exec(pud_t *pud)
return false;
 }
 
+static inline pud_t kvm_s2pud_mkyoung(pud_t pud)
+{
+   BUG();
+   return pud;
+}
+
 static inline void kvm_set_pmd(pmd_t *pmd, pmd_t new_pmd)
 {
*pmd = new_pmd;
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index 15bc1be8f82f..4d2780c588b0 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -175,6 +175,8 @@ void kvm_clear_hyp_idmap(void);
 #define kvm_pfn_pte(pfn, prot) pfn_pte(pfn, prot)
 #define kvm_pfn_pmd(pfn, prot) pfn_pmd(pfn, prot)
 
+#define kvm_pud_pfn(pud)   pud_pfn(pud)
+
 #define kvm_pmd_mkhuge(pmd)pmd_mkhuge(pmd)
 
 static inline pte_t kvm_s2pte_mkwrite(pte_t pte)
@@ -254,6 +256,11 @@ static inline bool kvm_s2pud_exec(pud_t *pudp)
return !(READ_ONCE(pud_val(*pudp)) & PUD_S2_XN);
 }
 
+static inline pud_t kvm_s2pud_mkyoung(pud_t pud)
+{
+   return pud_mkyoung(pud);
+}
+
 static inline bool kvm_page_empty(void *ptr)
 {
struct page *ptr_page = virt_to_page(ptr);
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 1bdeca8918a6..a64a5c35beb1 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -314,6 +314,11 @@ static inline pte_t pud_pte(pud_t pud)
return __pte(pud_val(pud));
 }
 
+static inline pud_t pte_pud(pte_t pte)
+{
+   return __pud(pte_val(pte));
+}
+
 static inline pmd_t pud_pmd(pud_t pud)
 {
return __pmd(pud_val(pud));
@@ -380,6 +385,7 @@ static inline int pmd_protnone(pmd_t pmd)
 #define pfn_pmd(pfn,prot)  __pmd(__phys_to_pmd_val((phys_addr_t)(pfn) << 
PAGE_SHIFT) | pgprot_val(prot))
 #define mk_pmd(page,prot)  pfn_pmd(page_to_pfn(page),prot)
 
+#define pud_mkyoung(pud)   pte_pud(pte_mkyoung(pud_pte(pud)))
 #define pud_write(pud) pte_write(pud_pte(pud))
 
 #define __pud_to_phys(pud) __pte_to_phys(pud_pte(pud))
diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index e73909a31e02..d2c705e31584 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -1637,33 +1637,36 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, 
phys_addr_t fault_ipa,
  */
 static void handle_access_fault(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa)
 {
-   pmd_t *pmd;
-   pte_t *pte;
+   pud_t *pud = NULL;
+   pmd_t *pmd = NULL;
+   pte_t *pte = NULL;
kvm_pfn_t pfn;
-   bool pfn_valid = false;
+   bool found, pfn_valid = false;
 
trace_kvm_access_fault(fault_ipa);
 
	spin_lock(&vcpu->kvm->mmu_lock);
 
-   pmd = stage2_get_pmd(vcpu->kvm, NULL, fault_ipa);
-   if (!pmd || pmd_none(*pmd)) /* Nothing there */
+   found = stage2_get_leaf_entry(vcpu->kvm, fault_ipa, &pud, &pmd, &pte);
+   if (!found)
goto out;
 
-   if (pmd_thp_or_huge(*pmd)) {/* THP, HugeTLB */
+   if (pud) {  /* HugeTLB */
+   *pud = kvm_s2pud_mkyoung(*pud);
+   pfn = kvm_pud_pfn(*pud);
+   pfn_valid = true;
+   goto out;
+   } else  if (pmd) {  /* THP, HugeTLB */
*pmd = pmd_mkyoung(*pmd);
pfn = pmd_pfn(*pmd);
pfn_valid = true;
goto out;
+   } else {
+   *pte = pte_mkyoung(*pte);   /* Just a page... */
+   pfn = pte_pfn(*pte);
+   pfn_valid = true;
}
 
-   pte = pte_offset_kernel(pmd, fault_ipa);
-   if (pte_none(*pte)) /* Nothing there either */
-   goto out;
-
-   *pte = pte_mkyoung(*pte);   /* Just a page... */
-   pfn = pte_pfn(*pte);
-   pfn_valid = true;
 out:
	spin_unlock(&vcpu->kvm->mmu_lock);
if (pfn_valid)
-- 
2.17.1


[PATCH v5 6/7] KVM: arm64: Update age handlers to support PUD hugepages

2018-07-09 Thread Punit Agrawal
In preparation for creating larger hugepages at Stage 2, add support
to the age handling notifiers for PUD hugepages when encountered.

Provide trivial helpers for arm32 to allow sharing code.

Signed-off-by: Punit Agrawal 
Cc: Christoffer Dall 
Cc: Marc Zyngier 
Cc: Russell King 
Cc: Catalin Marinas 
Cc: Will Deacon 
---
 arch/arm/include/asm/kvm_mmu.h   |  6 
 arch/arm64/include/asm/kvm_mmu.h |  5 
 arch/arm64/include/asm/pgtable.h |  1 +
 virt/kvm/arm/mmu.c   | 51 ++--
 4 files changed, 40 insertions(+), 23 deletions(-)

diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
index a4298d429efc..8e1e8aee229e 100644
--- a/arch/arm/include/asm/kvm_mmu.h
+++ b/arch/arm/include/asm/kvm_mmu.h
@@ -110,6 +110,12 @@ static inline pud_t kvm_s2pud_mkyoung(pud_t pud)
return pud;
 }
 
+static inline bool kvm_s2pud_young(pud_t pud)
+{
+   BUG();
+   return false;
+}
+
 static inline void kvm_set_pmd(pmd_t *pmd, pmd_t new_pmd)
 {
*pmd = new_pmd;
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index 4d2780c588b0..c542052fb199 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -261,6 +261,11 @@ static inline pud_t kvm_s2pud_mkyoung(pud_t pud)
return pud_mkyoung(pud);
 }
 
+static inline bool kvm_s2pud_young(pud_t pud)
+{
+   return pud_young(pud);
+}
+
 static inline bool kvm_page_empty(void *ptr)
 {
struct page *ptr_page = virt_to_page(ptr);
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index a64a5c35beb1..4d9476e420d9 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -385,6 +385,7 @@ static inline int pmd_protnone(pmd_t pmd)
 #define pfn_pmd(pfn,prot)  __pmd(__phys_to_pmd_val((phys_addr_t)(pfn) << 
PAGE_SHIFT) | pgprot_val(prot))
 #define mk_pmd(page,prot)  pfn_pmd(page_to_pfn(page),prot)
 
+#define pud_young(pud) pte_young(pud_pte(pud))
 #define pud_mkyoung(pud)   pte_pud(pte_mkyoung(pud_pte(pud)))
 #define pud_write(pud) pte_write(pud_pte(pud))
 
diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index d2c705e31584..a6d3ac9d7c7a 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -1172,6 +1172,11 @@ static int stage2_pmdp_test_and_clear_young(pmd_t *pmd)
return stage2_ptep_test_and_clear_young((pte_t *)pmd);
 }
 
+static int stage2_pudp_test_and_clear_young(pud_t *pud)
+{
+   return stage2_ptep_test_and_clear_young((pte_t *)pud);
+}
+
 /**
  * kvm_phys_addr_ioremap - map a device range to guest IPA
  *
@@ -1879,42 +1884,42 @@ void kvm_set_spte_hva(struct kvm *kvm, unsigned long 
hva, pte_t pte)
 
 static int kvm_age_hva_handler(struct kvm *kvm, gpa_t gpa, u64 size, void 
*data)
 {
-   pmd_t *pmd;
-   pte_t *pte;
+   pud_t *pud = NULL;
+   pmd_t *pmd = NULL;
+   pte_t *pte = NULL;
+   bool found;
 
-   WARN_ON(size != PAGE_SIZE && size != PMD_SIZE);
-   pmd = stage2_get_pmd(kvm, NULL, gpa);
-   if (!pmd || pmd_none(*pmd)) /* Nothing there */
+   WARN_ON(size != PAGE_SIZE && size != PMD_SIZE && size != PUD_SIZE);
+   found = stage2_get_leaf_entry(kvm, gpa, &pud, &pmd, &pte);
+   if (!found)
return 0;
 
-   if (pmd_thp_or_huge(*pmd))  /* THP, HugeTLB */
+   if (pud)
+   return stage2_pudp_test_and_clear_young(pud);
+   else if (pmd)
return stage2_pmdp_test_and_clear_young(pmd);
-
-   pte = pte_offset_kernel(pmd, gpa);
-   if (pte_none(*pte))
-   return 0;
-
-   return stage2_ptep_test_and_clear_young(pte);
+   else
+   return stage2_ptep_test_and_clear_young(pte);
 }
 
 static int kvm_test_age_hva_handler(struct kvm *kvm, gpa_t gpa, u64 size, void 
*data)
 {
-   pmd_t *pmd;
-   pte_t *pte;
+   pud_t *pud = NULL;
+   pmd_t *pmd = NULL;
+   pte_t *pte = NULL;
+   bool found;
 
-   WARN_ON(size != PAGE_SIZE && size != PMD_SIZE);
-   pmd = stage2_get_pmd(kvm, NULL, gpa);
-   if (!pmd || pmd_none(*pmd)) /* Nothing there */
+   WARN_ON(size != PAGE_SIZE && size != PMD_SIZE && size != PUD_SIZE);
+   found = stage2_get_leaf_entry(kvm, gpa, &pud, &pmd, &pte);
+   if (!found)
return 0;
 
-   if (pmd_thp_or_huge(*pmd))  /* THP, HugeTLB */
+   if (pud)
+   return kvm_s2pud_young(*pud);
+   else if (pmd)
return pmd_young(*pmd);
-
-   pte = pte_offset_kernel(pmd, gpa);
-   if (!pte_none(*pte))/* Just a page... */
+   else
return pte_young(*pte);
-
-   return 0;
 }
 
 int kvm_age_hva(struct kvm *kvm, unsigned long start, unsigned long end)
-- 
2.17.1


[PATCH v5 7/7] KVM: arm64: Add support for creating PUD hugepages at stage 2

2018-07-09 Thread Punit Agrawal
KVM only supports PMD hugepages at stage 2. Now that the various page
handling routines are updated, extend the stage 2 fault handling to
map in PUD hugepages.

Addition of PUD hugepage support enables additional page sizes (e.g.,
1G with 4K granule) which can be useful on cores that support mapping
larger block sizes in the TLB entries.

Signed-off-by: Punit Agrawal 
Cc: Christoffer Dall 
Cc: Marc Zyngier 
Cc: Russell King 
Cc: Catalin Marinas 
Cc: Will Deacon 
---
 arch/arm/include/asm/kvm_mmu.h | 19 +++
 arch/arm64/include/asm/kvm_mmu.h   | 15 +
 arch/arm64/include/asm/pgtable-hwdef.h |  2 +
 arch/arm64/include/asm/pgtable.h   |  2 +
 virt/kvm/arm/mmu.c | 78 --
 5 files changed, 112 insertions(+), 4 deletions(-)

diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
index 8e1e8aee229e..787baf9ec994 100644
--- a/arch/arm/include/asm/kvm_mmu.h
+++ b/arch/arm/include/asm/kvm_mmu.h
@@ -77,10 +77,13 @@ void kvm_clear_hyp_idmap(void);
 
 #define kvm_pfn_pte(pfn, prot) pfn_pte(pfn, prot)
 #define kvm_pfn_pmd(pfn, prot) pfn_pmd(pfn, prot)
+#define kvm_pfn_pud(pfn, prot) (__pud(0))
 
 #define kvm_pud_pfn(pud)   (((pud_val(pud) & PUD_MASK) & PHYS_MASK) >> 
PAGE_SHIFT)
 
 #define kvm_pmd_mkhuge(pmd)pmd_mkhuge(pmd)
+/* No support for pud hugepages */
+#define kvm_pud_mkhuge(pud)(pud)
 
 /*
  * The following kvm_*pud*() functions are provided strictly to allow
@@ -97,6 +100,22 @@ static inline bool kvm_s2pud_readonly(pud_t *pud)
return false;
 }
 
+static inline void kvm_set_pud(pud_t *pud, pud_t new_pud)
+{
+   BUG();
+}
+
+static inline pud_t kvm_s2pud_mkwrite(pud_t pud)
+{
+   BUG();
+   return pud;
+}
+
+static inline pud_t kvm_s2pud_mkexec(pud_t pud)
+{
+   BUG();
+   return pud;
+}
 
 static inline bool kvm_s2pud_exec(pud_t *pud)
 {
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index c542052fb199..dd8a23159463 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -171,13 +171,16 @@ void kvm_clear_hyp_idmap(void);
 
 #definekvm_set_pte(ptep, pte)  set_pte(ptep, pte)
 #definekvm_set_pmd(pmdp, pmd)  set_pmd(pmdp, pmd)
+#define kvm_set_pud(pudp, pud) set_pud(pudp, pud)
 
 #define kvm_pfn_pte(pfn, prot) pfn_pte(pfn, prot)
 #define kvm_pfn_pmd(pfn, prot) pfn_pmd(pfn, prot)
+#define kvm_pfn_pud(pfn, prot) pfn_pud(pfn, prot)
 
 #define kvm_pud_pfn(pud)   pud_pfn(pud)
 
 #define kvm_pmd_mkhuge(pmd)pmd_mkhuge(pmd)
+#define kvm_pud_mkhuge(pud)pud_mkhuge(pud)
 
 static inline pte_t kvm_s2pte_mkwrite(pte_t pte)
 {
@@ -191,6 +194,12 @@ static inline pmd_t kvm_s2pmd_mkwrite(pmd_t pmd)
return pmd;
 }
 
+static inline pud_t kvm_s2pud_mkwrite(pud_t pud)
+{
+   pud_val(pud) |= PUD_S2_RDWR;
+   return pud;
+}
+
 static inline pte_t kvm_s2pte_mkexec(pte_t pte)
 {
pte_val(pte) &= ~PTE_S2_XN;
@@ -203,6 +212,12 @@ static inline pmd_t kvm_s2pmd_mkexec(pmd_t pmd)
return pmd;
 }
 
+static inline pud_t kvm_s2pud_mkexec(pud_t pud)
+{
+   pud_val(pud) &= ~PUD_S2_XN;
+   return pud;
+}
+
 static inline void kvm_set_s2pte_readonly(pte_t *ptep)
 {
pteval_t old_pteval, pteval;
diff --git a/arch/arm64/include/asm/pgtable-hwdef.h 
b/arch/arm64/include/asm/pgtable-hwdef.h
index 10ae592b78b8..e327665e94d1 100644
--- a/arch/arm64/include/asm/pgtable-hwdef.h
+++ b/arch/arm64/include/asm/pgtable-hwdef.h
@@ -193,6 +193,8 @@
 #define PMD_S2_RDWR(_AT(pmdval_t, 3) << 6)   /* HAP[2:1] */
 #define PMD_S2_XN  (_AT(pmdval_t, 2) << 53)  /* XN[1:0] */
 
+#define PUD_S2_RDONLY  (_AT(pudval_t, 1) << 6)   /* HAP[2:1] */
+#define PUD_S2_RDWR(_AT(pudval_t, 3) << 6)   /* HAP[2:1] */
 #define PUD_S2_XN  (_AT(pudval_t, 2) << 53)  /* XN[1:0] */
 
 /*
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 4d9476e420d9..0afc34f94ff5 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -389,6 +389,8 @@ static inline int pmd_protnone(pmd_t pmd)
 #define pud_mkyoung(pud)   pte_pud(pte_mkyoung(pud_pte(pud)))
 #define pud_write(pud) pte_write(pud_pte(pud))
 
+#define pud_mkhuge(pud)(__pud(pud_val(pud) & ~PUD_TABLE_BIT))
+
 #define __pud_to_phys(pud) __pte_to_phys(pud_pte(pud))
 #define __phys_to_pud_val(phys)__phys_to_pte_val(phys)
 #define pud_pfn(pud)   ((__pud_to_phys(pud) & PUD_MASK) >> PAGE_SHIFT)
diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index a6d3ac9d7c7a..d8e2497e5353 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -116,6 +116,25 @@ static void stage2_dissolve_pmd(struct kvm *kvm, 
phys_addr_t addr, pmd_t *pmd)
put_page(virt_to_page(pmd));
 }
 
+/**
+

[PATCH v5 4/7] KVM: arm64: Support PUD hugepage in stage2_is_exec()

2018-07-09 Thread Punit Agrawal
In preparation for creating PUD hugepages at stage 2, add support for
detecting execute permissions on PUD page table entries. Faults due to
lack of execute permissions on page table entries are used to perform
i-cache invalidation on first execute.

Provide trivial implementations of arm32 helpers to allow sharing of
code.

Signed-off-by: Punit Agrawal 
Cc: Christoffer Dall 
Cc: Marc Zyngier 
Cc: Russell King 
Cc: Catalin Marinas 
Cc: Will Deacon 
---
 arch/arm/include/asm/kvm_mmu.h |  6 
 arch/arm64/include/asm/kvm_mmu.h   |  5 +++
 arch/arm64/include/asm/pgtable-hwdef.h |  2 ++
 virt/kvm/arm/mmu.c | 49 +++---
 4 files changed, 57 insertions(+), 5 deletions(-)

diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
index c23722f75d5c..d05c8986e495 100644
--- a/arch/arm/include/asm/kvm_mmu.h
+++ b/arch/arm/include/asm/kvm_mmu.h
@@ -96,6 +96,12 @@ static inline bool kvm_s2pud_readonly(pud_t *pud)
 }
 
 
+static inline bool kvm_s2pud_exec(pud_t *pud)
+{
+   BUG();
+   return false;
+}
+
 static inline void kvm_set_pmd(pmd_t *pmd, pmd_t new_pmd)
 {
*pmd = new_pmd;
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index 84051930ddfe..15bc1be8f82f 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -249,6 +249,11 @@ static inline bool kvm_s2pud_readonly(pud_t *pudp)
return kvm_s2pte_readonly((pte_t *)pudp);
 }
 
+static inline bool kvm_s2pud_exec(pud_t *pudp)
+{
+   return !(READ_ONCE(pud_val(*pudp)) & PUD_S2_XN);
+}
+
 static inline bool kvm_page_empty(void *ptr)
 {
struct page *ptr_page = virt_to_page(ptr);
diff --git a/arch/arm64/include/asm/pgtable-hwdef.h 
b/arch/arm64/include/asm/pgtable-hwdef.h
index fd208eac9f2a..10ae592b78b8 100644
--- a/arch/arm64/include/asm/pgtable-hwdef.h
+++ b/arch/arm64/include/asm/pgtable-hwdef.h
@@ -193,6 +193,8 @@
 #define PMD_S2_RDWR(_AT(pmdval_t, 3) << 6)   /* HAP[2:1] */
 #define PMD_S2_XN  (_AT(pmdval_t, 2) << 53)  /* XN[1:0] */
 
+#define PUD_S2_XN  (_AT(pudval_t, 2) << 53)  /* XN[1:0] */
+
 /*
  * Memory Attribute override for Stage-2 (MemAttr[3:0])
  */
diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index ed8f8271c389..e73909a31e02 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -1038,23 +1038,62 @@ static int stage2_set_pmd_huge(struct kvm *kvm, struct 
kvm_mmu_memory_cache
return 0;
 }
 
-static bool stage2_is_exec(struct kvm *kvm, phys_addr_t addr)
+/*
+ * stage2_get_leaf_entry - walk the stage2 VM page tables and return
+ * true if a valid and present leaf-entry is found. A pointer to the
+ * leaf-entry is returned in the appropriate level variable - pudpp,
+ * pmdpp, ptepp.
+ */
+static bool stage2_get_leaf_entry(struct kvm *kvm, phys_addr_t addr,
+ pud_t **pudpp, pmd_t **pmdpp, pte_t **ptepp)
 {
+   pud_t *pudp;
pmd_t *pmdp;
pte_t *ptep;
 
-   pmdp = stage2_get_pmd(kvm, NULL, addr);
+   pudp = stage2_get_pud(kvm, NULL, addr);
+   if (!pudp || pud_none(*pudp) || !pud_present(*pudp))
+   return false;
+
+   if (pud_huge(*pudp)) {
+   *pudpp = pudp;
+   return true;
+   }
+
+   pmdp = stage2_pmd_offset(pudp, addr);
if (!pmdp || pmd_none(*pmdp) || !pmd_present(*pmdp))
return false;
 
-   if (pmd_thp_or_huge(*pmdp))
-   return kvm_s2pmd_exec(pmdp);
+   if (pmd_thp_or_huge(*pmdp)) {
+   *pmdpp = pmdp;
+   return true;
+   }
 
ptep = pte_offset_kernel(pmdp, addr);
if (!ptep || pte_none(*ptep) || !pte_present(*ptep))
return false;
 
-   return kvm_s2pte_exec(ptep);
+   *ptepp = ptep;
+   return true;
+}
+
+static bool stage2_is_exec(struct kvm *kvm, phys_addr_t addr)
+{
+   pud_t *pudp = NULL;
+   pmd_t *pmdp = NULL;
+   pte_t *ptep = NULL;
+   bool found;
+
+   found = stage2_get_leaf_entry(kvm, addr, &pudp, &pmdp, &ptep);
+   if (!found)
+   return false;
+
+   if (pudp)
+   return kvm_s2pud_exec(pudp);
+   else if (pmdp)
+   return kvm_s2pmd_exec(pmdp);
+   else
+   return kvm_s2pte_exec(ptep);
 }
 
 static int stage2_set_pte(struct kvm *kvm, struct kvm_mmu_memory_cache *cache,
-- 
2.17.1


[PATCH v5 2/7] KVM: arm/arm64: Introduce helpers to manipulate page table entries

2018-07-09 Thread Punit Agrawal
Introduce helpers to abstract architectural handling of the conversion
of pfn to page table entries and marking a PMD page table entry as a
block entry.

The helpers are introduced in preparation for supporting PUD hugepages
at stage 2 - which are supported on arm64 but do not exist on arm.

Signed-off-by: Punit Agrawal 
Acked-by: Christoffer Dall 
Cc: Marc Zyngier 
Cc: Russell King 
Cc: Catalin Marinas 
Cc: Will Deacon 
---
 arch/arm/include/asm/kvm_mmu.h   | 5 +
 arch/arm64/include/asm/kvm_mmu.h | 5 +
 virt/kvm/arm/mmu.c   | 8 +---
 3 files changed, 15 insertions(+), 3 deletions(-)

diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
index 8553d68b7c8a..d095c2d0b284 100644
--- a/arch/arm/include/asm/kvm_mmu.h
+++ b/arch/arm/include/asm/kvm_mmu.h
@@ -75,6 +75,11 @@ phys_addr_t kvm_get_idmap_vector(void);
 int kvm_mmu_init(void);
 void kvm_clear_hyp_idmap(void);
 
+#define kvm_pfn_pte(pfn, prot) pfn_pte(pfn, prot)
+#define kvm_pfn_pmd(pfn, prot) pfn_pmd(pfn, prot)
+
+#define kvm_pmd_mkhuge(pmd)pmd_mkhuge(pmd)
+
 static inline void kvm_set_pmd(pmd_t *pmd, pmd_t new_pmd)
 {
*pmd = new_pmd;
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index fb9a7127bb75..689def9bb9d5 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -172,6 +172,11 @@ void kvm_clear_hyp_idmap(void);
 #definekvm_set_pte(ptep, pte)  set_pte(ptep, pte)
 #definekvm_set_pmd(pmdp, pmd)  set_pmd(pmdp, pmd)
 
+#define kvm_pfn_pte(pfn, prot) pfn_pte(pfn, prot)
+#define kvm_pfn_pmd(pfn, prot) pfn_pmd(pfn, prot)
+
+#define kvm_pmd_mkhuge(pmd)pmd_mkhuge(pmd)
+
 static inline pte_t kvm_s2pte_mkwrite(pte_t pte)
 {
pte_val(pte) |= PTE_S2_RDWR;
diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index ea3d992e4fb7..e131b7f9b7d7 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -1554,8 +1554,10 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, 
phys_addr_t fault_ipa,
(fault_status == FSC_PERM && stage2_is_exec(kvm, fault_ipa));
 
if (hugetlb && vma_pagesize == PMD_SIZE) {
-   pmd_t new_pmd = pfn_pmd(pfn, mem_type);
-   new_pmd = pmd_mkhuge(new_pmd);
+   pmd_t new_pmd = kvm_pfn_pmd(pfn, mem_type);
+
+   new_pmd = kvm_pmd_mkhuge(new_pmd);
+
if (writable)
new_pmd = kvm_s2pmd_mkwrite(new_pmd);
 
@@ -1564,7 +1566,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, 
phys_addr_t fault_ipa,
 
	ret = stage2_set_pmd_huge(kvm, memcache, fault_ipa, &new_pmd);
} else {
-   pte_t new_pte = pfn_pte(pfn, mem_type);
+   pte_t new_pte = kvm_pfn_pte(pfn, mem_type);
 
if (writable) {
new_pte = kvm_s2pte_mkwrite(new_pte);
-- 
2.17.1


[PATCH v5 3/7] KVM: arm64: Support dirty page tracking for PUD hugepages

2018-07-09 Thread Punit Agrawal
In preparation for creating PUD hugepages at stage 2, add support for
write protecting PUD hugepages when they are encountered. Write
protecting guest tables is used to track dirty pages when migrating
VMs.

Also, provide trivial implementations of required kvm_s2pud_* helpers
to allow sharing of code with arm32.

Signed-off-by: Punit Agrawal 
Reviewed-by: Christoffer Dall 
Cc: Marc Zyngier 
Cc: Russell King 
Cc: Catalin Marinas 
Cc: Will Deacon 
---
 arch/arm/include/asm/kvm_mmu.h   | 16 
 arch/arm64/include/asm/kvm_mmu.h | 10 ++
 virt/kvm/arm/mmu.c   | 11 +++
 3 files changed, 33 insertions(+), 4 deletions(-)

diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
index d095c2d0b284..c23722f75d5c 100644
--- a/arch/arm/include/asm/kvm_mmu.h
+++ b/arch/arm/include/asm/kvm_mmu.h
@@ -80,6 +80,22 @@ void kvm_clear_hyp_idmap(void);
 
 #define kvm_pmd_mkhuge(pmd)pmd_mkhuge(pmd)
 
+/*
+ * The following kvm_*pud*() functionas are provided strictly to allow
+ * sharing code with arm64. They should never be called in practice.
+ */
+static inline void kvm_set_s2pud_readonly(pud_t *pud)
+{
+   BUG();
+}
+
+static inline bool kvm_s2pud_readonly(pud_t *pud)
+{
+   BUG();
+   return false;
+}
+
+
 static inline void kvm_set_pmd(pmd_t *pmd, pmd_t new_pmd)
 {
*pmd = new_pmd;
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index 689def9bb9d5..84051930ddfe 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -239,6 +239,16 @@ static inline bool kvm_s2pmd_exec(pmd_t *pmdp)
return !(READ_ONCE(pmd_val(*pmdp)) & PMD_S2_XN);
 }
 
+static inline void kvm_set_s2pud_readonly(pud_t *pudp)
+{
+   kvm_set_s2pte_readonly((pte_t *)pudp);
+}
+
+static inline bool kvm_s2pud_readonly(pud_t *pudp)
+{
+   return kvm_s2pte_readonly((pte_t *)pudp);
+}
+
 static inline bool kvm_page_empty(void *ptr)
 {
struct page *ptr_page = virt_to_page(ptr);
diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index e131b7f9b7d7..ed8f8271c389 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -1288,9 +1288,12 @@ static void  stage2_wp_puds(pgd_t *pgd, phys_addr_t 
addr, phys_addr_t end)
do {
next = stage2_pud_addr_end(addr, end);
if (!stage2_pud_none(*pud)) {
-   /* TODO:PUD not supported, revisit later if supported */
-   BUG_ON(stage2_pud_huge(*pud));
-   stage2_wp_pmds(pud, addr, next);
+   if (stage2_pud_huge(*pud)) {
+   if (!kvm_s2pud_readonly(pud))
+   kvm_set_s2pud_readonly(pud);
+   } else {
+   stage2_wp_pmds(pud, addr, next);
+   }
}
} while (pud++, addr = next, addr != end);
 }
@@ -1333,7 +1336,7 @@ static void stage2_wp_range(struct kvm *kvm, phys_addr_t 
addr, phys_addr_t end)
  *
  * Called to start logging dirty pages after memory region
  * KVM_MEM_LOG_DIRTY_PAGES operation is called. After this function returns
- * all present PMD and PTEs are write protected in the memory region.
+ * all present PUD, PMD and PTEs are write protected in the memory region.
  * Afterwards read of dirty page log can be called.
  *
  * Acquires kvm_mmu_lock. Called with kvm->slots_lock mutex acquired,
-- 
2.17.1


[PATCH v5 1/7] KVM: arm/arm64: Share common code in user_mem_abort()

2018-07-09 Thread Punit Agrawal
The code for operations such as marking the pfn as dirty, and
dcache/icache maintenance during stage 2 fault handling is duplicated
between normal pages and PMD hugepages.

Instead of creating another copy of the operations when we introduce
PUD hugepages, let's share them across the different pagesizes.

Signed-off-by: Punit Agrawal 
Cc: Christoffer Dall 
Cc: Marc Zyngier 
---
 virt/kvm/arm/mmu.c | 67 ++
 1 file changed, 38 insertions(+), 29 deletions(-)

diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index 1d90d79706bd..ea3d992e4fb7 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -1422,7 +1422,8 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, 
phys_addr_t fault_ipa,
  unsigned long fault_status)
 {
int ret;
-   bool write_fault, exec_fault, writable, hugetlb = false, force_pte = 
false;
+   bool write_fault, writable, hugetlb = false, force_pte = false;
+   bool exec_fault, needs_exec;
unsigned long mmu_seq;
gfn_t gfn = fault_ipa >> PAGE_SHIFT;
struct kvm *kvm = vcpu->kvm;
@@ -1431,7 +1432,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, 
phys_addr_t fault_ipa,
kvm_pfn_t pfn;
pgprot_t mem_type = PAGE_S2;
bool logging_active = memslot_is_logging(memslot);
-   unsigned long flags = 0;
+   unsigned long vma_pagesize, flags = 0;
 
write_fault = kvm_is_write_fault(vcpu);
exec_fault = kvm_vcpu_trap_is_iabt(vcpu);
@@ -1451,7 +1452,8 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, 
phys_addr_t fault_ipa,
return -EFAULT;
}
 
-   if (vma_kernel_pagesize(vma) == PMD_SIZE && !logging_active) {
+   vma_pagesize = vma_kernel_pagesize(vma);
+   if (vma_pagesize == PMD_SIZE && !logging_active) {
hugetlb = true;
gfn = (fault_ipa & PMD_MASK) >> PAGE_SHIFT;
} else {
@@ -1520,28 +1522,45 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, 
phys_addr_t fault_ipa,
if (mmu_notifier_retry(kvm, mmu_seq))
goto out_unlock;
 
-   if (!hugetlb && !force_pte)
+   if (!hugetlb && !force_pte) {
+   /*
+* Only PMD_SIZE transparent hugepages(THP) are
+* currently supported. This code will need to be
+* updated to support other THP sizes.
+*/
 	hugetlb = transparent_hugepage_adjust(&pfn, &fault_ipa);
+   if (hugetlb)
+   vma_pagesize = PMD_SIZE;
+   }
+
+   if (writable)
+   kvm_set_pfn_dirty(pfn);
+
+   if (fault_status != FSC_PERM)
+   clean_dcache_guest_page(pfn, vma_pagesize);
 
-   if (hugetlb) {
+   if (exec_fault)
+   invalidate_icache_guest_page(pfn, vma_pagesize);
+
+   /*
+* If we took an execution fault we have made the
+* icache/dcache coherent above and should now let the s2
+* mapping be executable.
+*
+* Write faults (!exec_fault && FSC_PERM) are orthogonal to
+* execute permissions, and we preserve whatever we have.
+*/
+   needs_exec = exec_fault ||
+   (fault_status == FSC_PERM && stage2_is_exec(kvm, fault_ipa));
+
+   if (hugetlb && vma_pagesize == PMD_SIZE) {
pmd_t new_pmd = pfn_pmd(pfn, mem_type);
new_pmd = pmd_mkhuge(new_pmd);
-   if (writable) {
+   if (writable)
new_pmd = kvm_s2pmd_mkwrite(new_pmd);
-   kvm_set_pfn_dirty(pfn);
-   }
 
-   if (fault_status != FSC_PERM)
-   clean_dcache_guest_page(pfn, PMD_SIZE);
-
-   if (exec_fault) {
+   if (needs_exec)
new_pmd = kvm_s2pmd_mkexec(new_pmd);
-   invalidate_icache_guest_page(pfn, PMD_SIZE);
-   } else if (fault_status == FSC_PERM) {
-   /* Preserve execute if XN was already cleared */
-   if (stage2_is_exec(kvm, fault_ipa))
-   new_pmd = kvm_s2pmd_mkexec(new_pmd);
-   }
 
	ret = stage2_set_pmd_huge(kvm, memcache, fault_ipa, &new_pmd);
} else {
@@ -1549,21 +1568,11 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, 
phys_addr_t fault_ipa,
 
if (writable) {
new_pte = kvm_s2pte_mkwrite(new_pte);
-   kvm_set_pfn_dirty(pfn);
mark_page_dirty(kvm, gfn);
}
 
-   if (fault_status != FSC_PERM)
-   clean_dcache_guest_page(pfn, PAGE_SIZE);
-
-   if (exec_fault) {
+   if (needs_exec)
new_pte = kvm_s2pte_mkexec(new_pte);
-   invalidate_icache_gues

[PATCH v5 0/7] KVM: Support PUD hugepages at stage 2

2018-07-09 Thread Punit Agrawal
This series is an update to the PUD hugepage support previously posted
at [0]. This patchset adds support for PUD hugepages at stage
2. This feature is useful on cores that have support for large sized
TLB mappings (e.g., 1GB for 4K granule).

The biggest change in this version is to replace repeated instances of
walking the page tables to get to a leaf-entry with a function to
return the appropriate entry. This was suggested by Suzuki and should
help reduce the amount of churn resulting from future changes. It also
addresses other feedback on the previous version.

Support is added to code that is shared between arm and arm64. Dummy
helpers for arm are provided as the port does not support PUD hugepage
sizes.

The patches have been tested on an A57 based system. The patchset is
based on v4.18-rc4. There are a few conflicts with the support for 52
bit IPA[1] due to change in number of parameters for
stage2_pmd_offset().

Thanks,
Punit

v4 -> v5:

* Patch 1 - Drop helper stage2_should_exec() and refactor the
  condition to decide if a page table entry should be marked
  executable
* Patch 4-6 - Introduce stage2_get_leaf_entry() and use it in this and
  later patches
* Patch 7 - Use stage 2 accessors instead of using the page table
  helpers directly
* Patch 7 - Add a note to update the PUD hugepage support when number
  of levels of stage 2 tables differs from stage 1


v3 -> v4:
* Patch 1 and 7 - Don't put down hugepages pte if logging is enabled
* Patch 4-5 - Add PUD hugepage support for exec and access faults
* Patch 6 - PUD hugepage support for aging page table entries

v2 -> v3:
* Update vma_pagesize directly if THP [1/4]. Previously this was done
  indirectly via hugetlb
* Added review tag [4/4]

v1 -> v2:
* Create helper to check if the page should have exec permission [1/4]
* Fix broken condition to detect THP hugepage [1/4]
* Fix incorrect hunk resulting from a rebase [4/4]

[0] https://www.spinics.net/lists/arm-kernel/msg663562.html
[1] https://www.spinics.net/lists/kvm/msg171065.html


Punit Agrawal (7):
  KVM: arm/arm64: Share common code in user_mem_abort()
  KVM: arm/arm64: Introduce helpers to manipulate page table entries
  KVM: arm64: Support dirty page tracking for PUD hugepages
  KVM: arm64: Support PUD hugepage in stage2_is_exec()
  KVM: arm64: Support handling access faults for PUD hugepages
  KVM: arm64: Update age handlers to support PUD hugepages
  KVM: arm64: Add support for creating PUD hugepages at stage 2

 arch/arm/include/asm/kvm_mmu.h |  60 +
 arch/arm64/include/asm/kvm_mmu.h   |  47 
 arch/arm64/include/asm/pgtable-hwdef.h |   4 +
 arch/arm64/include/asm/pgtable.h   |   9 +
 virt/kvm/arm/mmu.c | 289 ++---
 5 files changed, 330 insertions(+), 79 deletions(-)

-- 
2.17.1


Re: [PATCH v4 4/7] KVM: arm64: Support PUD hugepage in stage2_is_exec()

2018-07-06 Thread Punit Agrawal
Suzuki K Poulose  writes:

> Hi Punit,
>
> On 05/07/18 15:08, Punit Agrawal wrote:
>> In preparation for creating PUD hugepages at stage 2, add support for
>> detecting execute permissions on PUD page table entries. Faults due to
>> lack of execute permissions on page table entries are used to perform
>> i-cache invalidation on first execute.
>>
>> Provide trivial implementations of arm32 helpers to allow sharing of
>> code.
>>
>> Signed-off-by: Punit Agrawal 
>> Cc: Christoffer Dall 
>> Cc: Marc Zyngier 
>> Cc: Russell King 
>> Cc: Catalin Marinas 
>> Cc: Will Deacon 
>> ---
>>   arch/arm/include/asm/kvm_mmu.h |  6 ++
>>   arch/arm64/include/asm/kvm_mmu.h   |  5 +
>>   arch/arm64/include/asm/pgtable-hwdef.h |  2 ++
>>   virt/kvm/arm/mmu.c | 10 +-
>>   4 files changed, 22 insertions(+), 1 deletion(-)
>>

[...]

>> diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
>> index db04b18218c1..ccdea0edabb3 100644
>> --- a/virt/kvm/arm/mmu.c
>> +++ b/virt/kvm/arm/mmu.c
>> @@ -1040,10 +1040,18 @@ static int stage2_set_pmd_huge(struct kvm *kvm, 
>> struct kvm_mmu_memory_cache
>> static bool stage2_is_exec(struct kvm *kvm, phys_addr_t addr)
>>   {
>> +pud_t *pudp;
>>  pmd_t *pmdp;
>>  pte_t *ptep;
>>   -  pmdp = stage2_get_pmd(kvm, NULL, addr);
>> +pudp = stage2_get_pud(kvm, NULL, addr);
>> +if (!pudp || pud_none(*pudp) || !pud_present(*pudp))
>> +return false;
>> +
>> +if (pud_huge(*pudp))
>> +return kvm_s2pud_exec(pudp);
>> +
>> +pmdp = stage2_pmd_offset(pudp, addr);
>>  if (!pmdp || pmd_none(*pmdp) || !pmd_present(*pmdp))
>>  return false;
>
> I am wondering if we need a slightly better way to deal with this
> kind of operation. We seem to duplicate the above operation (here and
> in the following patches), i.e, finding the "leaf entry" for a given
> address and follow the checks one level up at a time.

We definitely need a better way to walk the page tables - for stage 2
but also stage 1 and hugetlbfs. As things stand, there are a lot of
repetitive patterns with small differences at some levels (hugepage
and/or THP, p*d_none(), p*d_present(), ...)

> So instead of doing stage2_get_pud() and walking down everywhere this
> is needed, how about adding:
>
> /* Returns true if the leaf entry is found and updates the relevant pointer */
> found = stage2_get_leaf_entry(kvm, NULL, addr, &pud, &pmd, &pte)
>
> which could set the appropriate entry and we could check the result
> here.

I prototyped with the above approach but found that it could not be used
in all places due to the specific semantics of the walk. Also, then we
end up with the following pattern.

if (pudp) {
...
} else if (pmdp) {
...
} else {
...
}

At the end of the conversion, the resulting code is the same size as
well (see diff below for changes).

Another idea might be to build a page table walker passing in callbacks
- but this makes more sense if we have unified modifiers for the
levels. I think this is something we should explore but would like to do
outside the context of this series.
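
For illustration, such an interface could look roughly like the below
(hypothetical sketch only - nothing like this exists in the series and
the names are made up):

/* Hypothetical stage 2 walker interface, for discussion only */
struct stage2_walk_ops {
	void (*pud_entry)(struct kvm *kvm, pud_t *pudp, phys_addr_t addr);
	void (*pmd_entry)(struct kvm *kvm, pmd_t *pmdp, phys_addr_t addr);
	void (*pte_entry)(struct kvm *kvm, pte_t *ptep, phys_addr_t addr);
};

/* Walk [addr, end) and invoke the callback matching each leaf level */
void stage2_walk_range(struct kvm *kvm, phys_addr_t addr, phys_addr_t end,
		       const struct stage2_walk_ops *ops);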

Hope that's ok.

Thanks for having a look,
Punit

-- >8 --
diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index eddb74a7fac3..ea5c99f6dfab 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -1077,31 +1077,56 @@ static int stage2_set_pud_huge(struct kvm *kvm, struct 
kvm_mmu_memory_cache *cac
return 0;
 }
 
-static bool stage2_is_exec(struct kvm *kvm, phys_addr_t addr)
+static bool stage2_get_leaf_entry(struct kvm *kvm, phys_addr_t addr, pud_t 
**pudp,
+ pmd_t **pmdp, pte_t **ptep)
 {
-   pud_t *pudp;
-   pmd_t *pmdp;
-   pte_t *ptep;
+   pud_t *lpudp;
+   pmd_t *lpmdp;
+   pte_t *lptep;
 
-   pudp = stage2_get_pud(kvm, NULL, addr);
-   if (!pudp || pud_none(*pudp) || !pud_present(*pudp))
+   lpudp = stage2_get_pud(kvm, NULL, addr);
+   if (!lpudp || pud_none(*lpudp) || !pud_present(*lpudp))
return false;
 
-   if (pud_huge(*pudp))
-   return kvm_s2pud_exec(pudp);
+   if (pud_huge(*lpudp)) {
+   *pudp = lpudp;
+   return true;
+   }
 
-   pmdp = stage2_pmd_offset(pudp, addr);
-   if (!pmdp || pmd_none(*pmdp) || !pmd_present(*pmdp))
+   lpmdp = stage2_pmd_offset(lpudp, addr);
+   if (!lpmdp || pmd_none(*lpmdp) || !pmd_present(*lpmdp))
return false;
 
-   if (pmd_thp_or_huge(*pmdp))
-   return kvm_s2pmd_exec(pmdp);
+   if (pmd_thp_or_huge(*lpmdp)) {
+   *pmdp = lpmdp;
+  

Re: [PATCH v4 7/7] KVM: arm64: Add support for creating PUD hugepages at stage 2

2018-07-06 Thread Punit Agrawal
Suzuki K Poulose  writes:

> Hi Punit,
>
> On 05/07/18 15:08, Punit Agrawal wrote:
>> KVM only supports PMD hugepages at stage 2. Now that the various page
>> handling routines are updated, extend the stage 2 fault handling to
>> map in PUD hugepages.
>>
>> Addition of PUD hugepage support enables additional page sizes (e.g.,
>> 1G with 4K granule) which can be useful on cores that support mapping
>> larger block sizes in the TLB entries.
>>
>> Signed-off-by: Punit Agrawal 
>> Cc: Christoffer Dall 
>> Cc: Marc Zyngier 
>> Cc: Russell King 
>> Cc: Catalin Marinas 
>> Cc: Will Deacon 
>
>> diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
>> index 0c04c64e858c..5912210e94d9 100644
>> --- a/virt/kvm/arm/mmu.c
>> +++ b/virt/kvm/arm/mmu.c
>> @@ -116,6 +116,25 @@ static void stage2_dissolve_pmd(struct kvm *kvm, 
>> phys_addr_t addr, pmd_t *pmd)
>>  put_page(virt_to_page(pmd));
>>   }
>>   +/**
>> + * stage2_dissolve_pud() - clear and flush huge PUD entry
>> + * @kvm:pointer to kvm structure.
>> + * @addr:   IPA
>> + * @pud:pud pointer for IPA
>> + *
>> + * Function clears a PUD entry, flushes addr 1st and 2nd stage TLBs. Marks 
>> all
>> + * pages in the range dirty.
>> + */
>> +static void stage2_dissolve_pud(struct kvm *kvm, phys_addr_t addr, pud_t 
>> *pud)
>> +{
>> +if (!pud_huge(*pud))
>> +return;
>> +
>> +pud_clear(pud);
>
> You need to use the stage2_ accessors here. The stage2_dissolve_pmd() uses
> "pmd_" helpers as the PTE entries (level 3) are always guaranteed to exist.

I've fixed this and other uses of the PUD helpers to go via the stage2_
accessors.
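
For example, the dissolve helper ends up looking roughly like this
(sketch only - it assumes stage2_pud_huge()/stage2_pud_clear() style
accessors; the exact names may differ in the next posting):

static void stage2_dissolve_pud(struct kvm *kvm, phys_addr_t addr, pud_t *pudp)
{
	if (!stage2_pud_huge(*pudp))
		return;

	/* Clear the entry, then flush the stage 2 TLBs for this IPA */
	stage2_pud_clear(pudp);
	kvm_tlb_flush_vmid_ipa(kvm, addr);
	put_page(virt_to_page(pudp));
}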

I've still not quite come to terms with the lack of certain levels at
stage 2 vis-a-vis stage 1. I'll be more careful about this going
forward.

>
>> +kvm_tlb_flush_vmid_ipa(kvm, addr);
>> +put_page(virt_to_page(pud));
>> +}
>> +
>>   static int mmu_topup_memory_cache(struct kvm_mmu_memory_cache *cache,
>>int min, int max)
>>   {
>> @@ -993,7 +1012,7 @@ static pmd_t *stage2_get_pmd(struct kvm *kvm, struct 
>> kvm_mmu_memory_cache *cache
>>  pmd_t *pmd;
>>  pud = stage2_get_pud(kvm, cache, addr);
>> -if (!pud)
>> +if (!pud || pud_huge(*pud))
>>  return NULL;
>
> Same here.
>
>>  if (stage2_pud_none(*pud)) {
>
> Like this ^
>
>> @@ -1038,6 +1057,26 @@ static int stage2_set_pmd_huge(struct kvm *kvm, 
>> struct kvm_mmu_memory_cache
>>  return 0;
>>   }
>>   +static int stage2_set_pud_huge(struct kvm *kvm, struct
>> kvm_mmu_memory_cache *cache,
>> +   phys_addr_t addr, const pud_t *new_pud)
>> +{
>> +pud_t *pud, old_pud;
>> +
>> +pud = stage2_get_pud(kvm, cache, addr);
>> +VM_BUG_ON(!pud);
>> +
>> +old_pud = *pud;
>> +if (pud_present(old_pud)) {
>> +pud_clear(pud);
>> +kvm_tlb_flush_vmid_ipa(kvm, addr);
>
> Same here.
>
>> +} else {
>> +get_page(virt_to_page(pud));
>> +}
>> +
>> +kvm_set_pud(pud, *new_pud);
>> +return 0;
>> +}
>> +
>>   static bool stage2_is_exec(struct kvm *kvm, phys_addr_t addr)
>>   {
>>  pud_t *pudp;

[...]

>> @@ -1572,7 +1631,18 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, 
>> phys_addr_t fault_ipa,
>>  if (exec_fault)
>>  invalidate_icache_guest_page(pfn, vma_pagesize);
>>   -  if (hugetlb && vma_pagesize == PMD_SIZE) {
>> +if (hugetlb && vma_pagesize == PUD_SIZE) {
>
> I think we may need to check if the stage2 indeed has 3 levels of
> tables to use stage2 PUD. Otherwise, fall back to PTE level mapping
> or even PMD huge pages. Also, this cannot be triggered right now,
> as we only get PUD hugepages with 4K and we are guaranteed to have
> at least 3 levels with 40bit IPA. May be I can take care of it in
> the Dynamic IPA series, when we run a guest with say 32bit IPA.
> So for now, it is worth adding a comment here.

Good point. I've added the following comment.

/*
 * PUD level may not exist if the guest boots with two
 * levels at Stage 2. This configuration is currently
 * not supported due to IPA size supported by KVM.
 *
 * Revisit the assumptions about PUD levels when
 * additional IPA sizes are supported by KVM.
 */

Let me know if it looks OK to you.

Thanks a lot for reviewing the patches.

Punit

>
>> +pud_t new_pud

Re: [PATCH v4 1/7] KVM: arm/arm64: Share common code in user_mem_abort()

2018-07-05 Thread Punit Agrawal
Marc Zyngier  writes:

> Hi Punit,
>
> On 05/07/18 15:08, Punit Agrawal wrote:
>> The code for operations such as marking the pfn as dirty, and
>> dcache/icache maintenance during stage 2 fault handling is duplicated
>> between normal pages and PMD hugepages.
>> 
>> Instead of creating another copy of the operations when we introduce
>> PUD hugepages, let's share them across the different pagesizes.
>> 
>> Signed-off-by: Punit Agrawal 
>> Cc: Christoffer Dall 
>> Cc: Marc Zyngier 
>> ---
>>  virt/kvm/arm/mmu.c | 68 +++---
>>  1 file changed, 40 insertions(+), 28 deletions(-)
>> 
>> diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
>> index 1d90d79706bd..dd14cc36c51c 100644
>> --- a/virt/kvm/arm/mmu.c
>> +++ b/virt/kvm/arm/mmu.c
>> @@ -1398,6 +1398,21 @@ static void invalidate_icache_guest_page(kvm_pfn_t 
>> pfn, unsigned long size)
>>  __invalidate_icache_guest_page(pfn, size);
>>  }
>>  
>> +static bool stage2_should_exec(struct kvm *kvm, phys_addr_t addr,
>> +   bool exec_fault, unsigned long fault_status)
>
> I find this "should exec" very confusing.
>
>> +{
>> +/*
>> + * If we took an execution fault we will have made the
>> + * icache/dcache coherent and should now let the s2 mapping be
>> + * executable.
>> + *
>> + * Write faults (!exec_fault && FSC_PERM) are orthogonal to
>> + * execute permissions, and we preserve whatever we have.
>> + */
>> +return exec_fault ||
>> +(fault_status == FSC_PERM && stage2_is_exec(kvm, addr));
>> +}
>> +
>>  static void kvm_send_hwpoison_signal(unsigned long address,
>>   struct vm_area_struct *vma)
>>  {
>> @@ -1431,7 +1446,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, 
>> phys_addr_t fault_ipa,
>>  kvm_pfn_t pfn;
>>  pgprot_t mem_type = PAGE_S2;
>>  bool logging_active = memslot_is_logging(memslot);
>> -unsigned long flags = 0;
>> +unsigned long vma_pagesize, flags = 0;
>>  
>>  write_fault = kvm_is_write_fault(vcpu);
>>  exec_fault = kvm_vcpu_trap_is_iabt(vcpu);
>> @@ -1451,7 +1466,8 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, 
>> phys_addr_t fault_ipa,
>>  return -EFAULT;
>>  }
>>  
>> -if (vma_kernel_pagesize(vma) == PMD_SIZE && !logging_active) {
>> +vma_pagesize = vma_kernel_pagesize(vma);
>> +if (vma_pagesize == PMD_SIZE && !logging_active) {
>>  hugetlb = true;
>>  gfn = (fault_ipa & PMD_MASK) >> PAGE_SHIFT;
>>  } else {
>> @@ -1520,28 +1536,34 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, 
>> phys_addr_t fault_ipa,
>>  if (mmu_notifier_retry(kvm, mmu_seq))
>>  goto out_unlock;
>>  
>> -if (!hugetlb && !force_pte)
>> +if (!hugetlb && !force_pte) {
>> +/*
>> + * Only PMD_SIZE transparent hugepages(THP) are
>> + * currently supported. This code will need to be
>> + * updated to support other THP sizes.
>> + */
>>  hugetlb = transparent_hugepage_adjust(&pfn, &fault_ipa);
>> +if (hugetlb)
>> +vma_pagesize = PMD_SIZE;
>> +}
>> +
>> +if (writable)
>> +kvm_set_pfn_dirty(pfn);
>>  
>> -if (hugetlb) {
>> +if (fault_status != FSC_PERM)
>> +clean_dcache_guest_page(pfn, vma_pagesize);
>> +
>> +if (exec_fault)
>> +invalidate_icache_guest_page(pfn, vma_pagesize);
>> +
>> +if (hugetlb && vma_pagesize == PMD_SIZE) {
>>  pmd_t new_pmd = pfn_pmd(pfn, mem_type);
>>  new_pmd = pmd_mkhuge(new_pmd);
>> -if (writable) {
>> +if (writable)
>>  new_pmd = kvm_s2pmd_mkwrite(new_pmd);
>> -kvm_set_pfn_dirty(pfn);
>> -}
>>  
>> -if (fault_status != FSC_PERM)
>> -clean_dcache_guest_page(pfn, PMD_SIZE);
>> -
>> -if (exec_fault) {
>> +if (stage2_should_exec(kvm, fault_ipa, exec_fault, 
>> fault_status))
>>  new_pmd = kvm_s2pmd_mkexec(new_pmd);
>
> OK, I find this absolutely horrid... ;-)
>
> The rest of the function deals with discrete flags, and all of a sudden
> we have a function call with a bunch of seemingly unrelated parameters.
> And you are repeating it for each vma_pagesize...
>
> How about something like:
>
>   bool needs_exec;
>
>   [...]
>
>   needs_exec = exec_fault || (fault_status == FSC_PERM &&
>   stage2_is_exec(kvm, fault_ipa));
>
> And then you just check needs_exec to update the pte/pmd. And you drop
> this helper.

That does look a lot better. I'll roll the change into the next version.

Thanks,
Punit

[...]


[PATCH v4 6/7] KVM: arm64: Update age handlers to support PUD hugepages

2018-07-05 Thread Punit Agrawal
In preparation for creating larger hugepages at Stage 2, add support
to the age handling notifiers for PUD hugepages when encountered.

Provide trivial helpers for arm32 to allow sharing code.

Signed-off-by: Punit Agrawal 
Cc: Christoffer Dall 
Cc: Marc Zyngier 
Cc: Russell King 
Cc: Catalin Marinas 
Cc: Will Deacon 
---
 arch/arm/include/asm/kvm_mmu.h   |  6 ++
 arch/arm64/include/asm/kvm_mmu.h |  5 +
 arch/arm64/include/asm/pgtable.h |  1 +
 virt/kvm/arm/mmu.c   | 29 +
 4 files changed, 37 insertions(+), 4 deletions(-)

diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
index a4298d429efc..8e1e8aee229e 100644
--- a/arch/arm/include/asm/kvm_mmu.h
+++ b/arch/arm/include/asm/kvm_mmu.h
@@ -110,6 +110,12 @@ static inline pud_t kvm_s2pud_mkyoung(pud_t pud)
return pud;
 }
 
+static inline bool kvm_s2pud_young(pud_t pud)
+{
+   BUG();
+   return false;
+}
+
 static inline void kvm_set_pmd(pmd_t *pmd, pmd_t new_pmd)
 {
*pmd = new_pmd;
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index 4d2780c588b0..c542052fb199 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -261,6 +261,11 @@ static inline pud_t kvm_s2pud_mkyoung(pud_t pud)
return pud_mkyoung(pud);
 }
 
+static inline bool kvm_s2pud_young(pud_t pud)
+{
+   return pud_young(pud);
+}
+
 static inline bool kvm_page_empty(void *ptr)
 {
struct page *ptr_page = virt_to_page(ptr);
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index a64a5c35beb1..4d9476e420d9 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -385,6 +385,7 @@ static inline int pmd_protnone(pmd_t pmd)
 #define pfn_pmd(pfn,prot)  __pmd(__phys_to_pmd_val((phys_addr_t)(pfn) << 
PAGE_SHIFT) | pgprot_val(prot))
 #define mk_pmd(page,prot)  pfn_pmd(page_to_pfn(page),prot)
 
+#define pud_young(pud) pte_young(pud_pte(pud))
 #define pud_mkyoung(pud)   pte_pud(pte_mkyoung(pud_pte(pud)))
 #define pud_write(pud) pte_write(pud_pte(pud))
 
diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index 94a91bcdd152..0c04c64e858c 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -1141,6 +1141,11 @@ static int stage2_pmdp_test_and_clear_young(pmd_t *pmd)
return stage2_ptep_test_and_clear_young((pte_t *)pmd);
 }
 
+static int stage2_pudp_test_and_clear_young(pud_t *pud)
+{
+   return stage2_ptep_test_and_clear_young((pte_t *)pud);
+}
+
 /**
  * kvm_phys_addr_ioremap - map a device range to guest IPA
  *
@@ -1860,11 +1865,19 @@ void kvm_set_spte_hva(struct kvm *kvm, unsigned long 
hva, pte_t pte)
 
 static int kvm_age_hva_handler(struct kvm *kvm, gpa_t gpa, u64 size, void 
*data)
 {
+   pud_t *pud;
pmd_t *pmd;
pte_t *pte;
 
-   WARN_ON(size != PAGE_SIZE && size != PMD_SIZE);
-   pmd = stage2_get_pmd(kvm, NULL, gpa);
+   WARN_ON(size != PAGE_SIZE && size != PMD_SIZE && size != PUD_SIZE);
+   pud = stage2_get_pud(kvm, NULL, gpa);
+   if (!pud || pud_none(*pud)) /* Nothing there */
+   return 0;
+
+   if (pud_huge(*pud)) /* HugeTLB */
+   return stage2_pudp_test_and_clear_young(pud);
+
+   pmd = stage2_pmd_offset(pud, gpa);
if (!pmd || pmd_none(*pmd)) /* Nothing there */
return 0;
 
@@ -1880,11 +1893,19 @@ static int kvm_age_hva_handler(struct kvm *kvm, gpa_t 
gpa, u64 size, void *data)
 
 static int kvm_test_age_hva_handler(struct kvm *kvm, gpa_t gpa, u64 size, void 
*data)
 {
+   pud_t *pud;
pmd_t *pmd;
pte_t *pte;
 
-   WARN_ON(size != PAGE_SIZE && size != PMD_SIZE);
-   pmd = stage2_get_pmd(kvm, NULL, gpa);
+   WARN_ON(size != PAGE_SIZE && size != PMD_SIZE && size != PUD_SIZE);
+   pud = stage2_get_pud(kvm, NULL, gpa);
+   if (!pud || pud_none(*pud)) /* Nothing there */
+   return 0;
+
+   if (pud_huge(*pud)) /* HugeTLB */
+   return kvm_s2pud_young(*pud);
+
+   pmd = stage2_pmd_offset(pud, gpa);
if (!pmd || pmd_none(*pmd)) /* Nothing there */
return 0;
 
-- 
2.17.1


[PATCH v4 5/7] KVM: arm64: Support handling access faults for PUD hugepages

2018-07-05 Thread Punit Agrawal
In preparation for creating larger hugepages at Stage 2, extend the
access fault handling at Stage 2 to support PUD hugepages when
encountered.

Provide trivial helpers for arm32 to allow sharing of code.

Signed-off-by: Punit Agrawal 
Cc: Christoffer Dall 
Cc: Marc Zyngier 
Cc: Russell King 
Cc: Catalin Marinas 
Cc: Will Deacon 
---
 arch/arm/include/asm/kvm_mmu.h   |  8 
 arch/arm64/include/asm/kvm_mmu.h |  7 +++
 arch/arm64/include/asm/pgtable.h |  6 ++
 virt/kvm/arm/mmu.c   | 14 +-
 4 files changed, 34 insertions(+), 1 deletion(-)

diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
index d05c8986e495..a4298d429efc 100644
--- a/arch/arm/include/asm/kvm_mmu.h
+++ b/arch/arm/include/asm/kvm_mmu.h
@@ -78,6 +78,8 @@ void kvm_clear_hyp_idmap(void);
 #define kvm_pfn_pte(pfn, prot) pfn_pte(pfn, prot)
 #define kvm_pfn_pmd(pfn, prot) pfn_pmd(pfn, prot)
 
+#define kvm_pud_pfn(pud)   (((pud_val(pud) & PUD_MASK) & PHYS_MASK) >> 
PAGE_SHIFT)
+
 #define kvm_pmd_mkhuge(pmd)pmd_mkhuge(pmd)
 
 /*
@@ -102,6 +104,12 @@ static inline bool kvm_s2pud_exec(pud_t *pud)
return false;
 }
 
+static inline pud_t kvm_s2pud_mkyoung(pud_t pud)
+{
+   BUG();
+   return pud;
+}
+
 static inline void kvm_set_pmd(pmd_t *pmd, pmd_t new_pmd)
 {
*pmd = new_pmd;
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index 15bc1be8f82f..4d2780c588b0 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -175,6 +175,8 @@ void kvm_clear_hyp_idmap(void);
 #define kvm_pfn_pte(pfn, prot) pfn_pte(pfn, prot)
 #define kvm_pfn_pmd(pfn, prot) pfn_pmd(pfn, prot)
 
+#define kvm_pud_pfn(pud)   pud_pfn(pud)
+
 #define kvm_pmd_mkhuge(pmd)pmd_mkhuge(pmd)
 
 static inline pte_t kvm_s2pte_mkwrite(pte_t pte)
@@ -254,6 +256,11 @@ static inline bool kvm_s2pud_exec(pud_t *pudp)
return !(READ_ONCE(pud_val(*pudp)) & PUD_S2_XN);
 }
 
+static inline pud_t kvm_s2pud_mkyoung(pud_t pud)
+{
+   return pud_mkyoung(pud);
+}
+
 static inline bool kvm_page_empty(void *ptr)
 {
struct page *ptr_page = virt_to_page(ptr);
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 1bdeca8918a6..a64a5c35beb1 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -314,6 +314,11 @@ static inline pte_t pud_pte(pud_t pud)
return __pte(pud_val(pud));
 }
 
+static inline pud_t pte_pud(pte_t pte)
+{
+   return __pud(pte_val(pte));
+}
+
 static inline pmd_t pud_pmd(pud_t pud)
 {
return __pmd(pud_val(pud));
@@ -380,6 +385,7 @@ static inline int pmd_protnone(pmd_t pmd)
 #define pfn_pmd(pfn,prot)  __pmd(__phys_to_pmd_val((phys_addr_t)(pfn) << 
PAGE_SHIFT) | pgprot_val(prot))
 #define mk_pmd(page,prot)  pfn_pmd(page_to_pfn(page),prot)
 
+#define pud_mkyoung(pud)   pte_pud(pte_mkyoung(pud_pte(pud)))
 #define pud_write(pud) pte_write(pud_pte(pud))
 
 #define __pud_to_phys(pud) __pte_to_phys(pud_pte(pud))
diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index ccdea0edabb3..94a91bcdd152 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -1609,6 +1609,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, 
phys_addr_t fault_ipa,
  */
 static void handle_access_fault(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa)
 {
+   pud_t *pud;
pmd_t *pmd;
pte_t *pte;
kvm_pfn_t pfn;
@@ -1618,7 +1619,18 @@ static void handle_access_fault(struct kvm_vcpu *vcpu, 
phys_addr_t fault_ipa)
 
spin_lock(&vcpu->kvm->mmu_lock);
 
-   pmd = stage2_get_pmd(vcpu->kvm, NULL, fault_ipa);
+   pud = stage2_get_pud(vcpu->kvm, NULL, fault_ipa);
+   if (!pud || pud_none(*pud))
+   goto out;   /* Nothing there */
+
+   if (pud_huge(*pud)) {   /* HugeTLB */
+   *pud = kvm_s2pud_mkyoung(*pud);
+   pfn = kvm_pud_pfn(*pud);
+   pfn_valid = true;
+   goto out;
+   }
+
+   pmd = stage2_pmd_offset(pud, fault_ipa);
if (!pmd || pmd_none(*pmd)) /* Nothing there */
goto out;
 
-- 
2.17.1



[PATCH v4 7/7] KVM: arm64: Add support for creating PUD hugepages at stage 2

2018-07-05 Thread Punit Agrawal
KVM only supports PMD hugepages at stage 2. Now that the various page
handling routines are updated, extend the stage 2 fault handling to
map in PUD hugepages.

Addition of PUD hugepage support enables additional page sizes (e.g.,
1G with 4K granule) which can be useful on cores that support mapping
larger block sizes in the TLB entries.
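
As a rough back-of-the-envelope check of the sizes involved (illustrative stand-in
names only, not kernel definitions): with a 4K granule each translation level
resolves 9 bits of address, so a PMD block covers 2MB and a PUD block covers 1GB.

#include <stdio.h>

#define SKETCH_PAGE_SHIFT       12                              /* 4K granule */
#define SKETCH_PMD_SHIFT        (SKETCH_PAGE_SHIFT + 9)         /* 21 -> 2MB  */
#define SKETCH_PUD_SHIFT        (SKETCH_PAGE_SHIFT + 18)        /* 30 -> 1GB  */

int main(void)
{
        /* Prints: PMD block: 2 MB / PUD block: 1 GB */
        printf("PMD block: %lu MB\n", (1UL << SKETCH_PMD_SHIFT) >> 20);
        printf("PUD block: %lu GB\n", (1UL << SKETCH_PUD_SHIFT) >> 30);
        return 0;
}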

Signed-off-by: Punit Agrawal 
Cc: Christoffer Dall 
Cc: Marc Zyngier 
Cc: Russell King 
Cc: Catalin Marinas 
Cc: Will Deacon 
---
 arch/arm/include/asm/kvm_mmu.h | 19 +++
 arch/arm64/include/asm/kvm_mmu.h   | 15 +
 arch/arm64/include/asm/pgtable-hwdef.h |  2 +
 arch/arm64/include/asm/pgtable.h   |  2 +
 virt/kvm/arm/mmu.c | 78 --
 5 files changed, 112 insertions(+), 4 deletions(-)

diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
index 8e1e8aee229e..787baf9ec994 100644
--- a/arch/arm/include/asm/kvm_mmu.h
+++ b/arch/arm/include/asm/kvm_mmu.h
@@ -77,10 +77,13 @@ void kvm_clear_hyp_idmap(void);
 
 #define kvm_pfn_pte(pfn, prot) pfn_pte(pfn, prot)
 #define kvm_pfn_pmd(pfn, prot) pfn_pmd(pfn, prot)
+#define kvm_pfn_pud(pfn, prot) (__pud(0))
 
 #define kvm_pud_pfn(pud)   (((pud_val(pud) & PUD_MASK) & PHYS_MASK) >> 
PAGE_SHIFT)
 
 #define kvm_pmd_mkhuge(pmd)pmd_mkhuge(pmd)
+/* No support for pud hugepages */
+#define kvm_pud_mkhuge(pud)(pud)
 
 /*
  * The following kvm_*pud*() functions are provided strictly to allow
@@ -97,6 +100,22 @@ static inline bool kvm_s2pud_readonly(pud_t *pud)
return false;
 }
 
+static inline void kvm_set_pud(pud_t *pud, pud_t new_pud)
+{
+   BUG();
+}
+
+static inline pud_t kvm_s2pud_mkwrite(pud_t pud)
+{
+   BUG();
+   return pud;
+}
+
+static inline pud_t kvm_s2pud_mkexec(pud_t pud)
+{
+   BUG();
+   return pud;
+}
 
 static inline bool kvm_s2pud_exec(pud_t *pud)
 {
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index c542052fb199..dd8a23159463 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -171,13 +171,16 @@ void kvm_clear_hyp_idmap(void);
 
 #definekvm_set_pte(ptep, pte)  set_pte(ptep, pte)
 #definekvm_set_pmd(pmdp, pmd)  set_pmd(pmdp, pmd)
+#define kvm_set_pud(pudp, pud) set_pud(pudp, pud)
 
 #define kvm_pfn_pte(pfn, prot) pfn_pte(pfn, prot)
 #define kvm_pfn_pmd(pfn, prot) pfn_pmd(pfn, prot)
+#define kvm_pfn_pud(pfn, prot) pfn_pud(pfn, prot)
 
 #define kvm_pud_pfn(pud)   pud_pfn(pud)
 
 #define kvm_pmd_mkhuge(pmd)pmd_mkhuge(pmd)
+#define kvm_pud_mkhuge(pud)pud_mkhuge(pud)
 
 static inline pte_t kvm_s2pte_mkwrite(pte_t pte)
 {
@@ -191,6 +194,12 @@ static inline pmd_t kvm_s2pmd_mkwrite(pmd_t pmd)
return pmd;
 }
 
+static inline pud_t kvm_s2pud_mkwrite(pud_t pud)
+{
+   pud_val(pud) |= PUD_S2_RDWR;
+   return pud;
+}
+
 static inline pte_t kvm_s2pte_mkexec(pte_t pte)
 {
pte_val(pte) &= ~PTE_S2_XN;
@@ -203,6 +212,12 @@ static inline pmd_t kvm_s2pmd_mkexec(pmd_t pmd)
return pmd;
 }
 
+static inline pud_t kvm_s2pud_mkexec(pud_t pud)
+{
+   pud_val(pud) &= ~PUD_S2_XN;
+   return pud;
+}
+
 static inline void kvm_set_s2pte_readonly(pte_t *ptep)
 {
pteval_t old_pteval, pteval;
diff --git a/arch/arm64/include/asm/pgtable-hwdef.h 
b/arch/arm64/include/asm/pgtable-hwdef.h
index 10ae592b78b8..e327665e94d1 100644
--- a/arch/arm64/include/asm/pgtable-hwdef.h
+++ b/arch/arm64/include/asm/pgtable-hwdef.h
@@ -193,6 +193,8 @@
 #define PMD_S2_RDWR(_AT(pmdval_t, 3) << 6)   /* HAP[2:1] */
 #define PMD_S2_XN  (_AT(pmdval_t, 2) << 53)  /* XN[1:0] */
 
+#define PUD_S2_RDONLY  (_AT(pudval_t, 1) << 6)   /* HAP[2:1] */
+#define PUD_S2_RDWR(_AT(pudval_t, 3) << 6)   /* HAP[2:1] */
 #define PUD_S2_XN  (_AT(pudval_t, 2) << 53)  /* XN[1:0] */
 
 /*
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 4d9476e420d9..0afc34f94ff5 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -389,6 +389,8 @@ static inline int pmd_protnone(pmd_t pmd)
 #define pud_mkyoung(pud)   pte_pud(pte_mkyoung(pud_pte(pud)))
 #define pud_write(pud) pte_write(pud_pte(pud))
 
+#define pud_mkhuge(pud)(__pud(pud_val(pud) & ~PUD_TABLE_BIT))
+
 #define __pud_to_phys(pud) __pte_to_phys(pud_pte(pud))
 #define __phys_to_pud_val(phys)__phys_to_pte_val(phys)
 #define pud_pfn(pud)   ((__pud_to_phys(pud) & PUD_MASK) >> PAGE_SHIFT)
diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index 0c04c64e858c..5912210e94d9 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -116,6 +116,25 @@ static void stage2_dissolve_pmd(struct kvm *kvm, 
phys_addr_t addr, pmd_t *pmd)
put_page(virt_to_page(pmd));
 }
 
+/**
+

[PATCH v4 4/7] KVM: arm64: Support PUD hugepage in stage2_is_exec()

2018-07-05 Thread Punit Agrawal
In preparation for creating PUD hugepages at stage 2, add support for
detecting execute permissions on PUD page table entries. Faults due to
lack of execute permissions on page table entries are used to perform
i-cache invalidation on first execute.
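
For illustration, a minimal standalone sketch of the check this patch adds for
PUDs. The SKETCH_* names are stand-ins; the bit layout (a two-bit XN field at
bits 54:53, with the value 2 used by this series' PUD_S2_XN definition marking
the entry non-executable) mirrors the definitions added in pgtable-hwdef.h below:

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define SKETCH_S2_XN    (UINT64_C(2) << 53)     /* XN[1:0] */

/* Same shape as kvm_s2pud_exec(): executable iff the XN encoding is clear. */
static bool sketch_s2_is_exec(uint64_t desc)
{
        return !(desc & SKETCH_S2_XN);
}

int main(void)
{
        uint64_t desc = SKETCH_S2_XN;   /* initial mapping: not executable */

        printf("exec before fault: %d\n", sketch_s2_is_exec(desc));
        desc &= ~SKETCH_S2_XN;          /* after icache invalidation, grant exec */
        printf("exec after fault:  %d\n", sketch_s2_is_exec(desc));
        return 0;
}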

Provide trivial implementations of arm32 helpers to allow sharing of
code.

Signed-off-by: Punit Agrawal 
Cc: Christoffer Dall 
Cc: Marc Zyngier 
Cc: Russell King 
Cc: Catalin Marinas 
Cc: Will Deacon 
---
 arch/arm/include/asm/kvm_mmu.h |  6 ++
 arch/arm64/include/asm/kvm_mmu.h   |  5 +
 arch/arm64/include/asm/pgtable-hwdef.h |  2 ++
 virt/kvm/arm/mmu.c | 10 +-
 4 files changed, 22 insertions(+), 1 deletion(-)

diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
index c23722f75d5c..d05c8986e495 100644
--- a/arch/arm/include/asm/kvm_mmu.h
+++ b/arch/arm/include/asm/kvm_mmu.h
@@ -96,6 +96,12 @@ static inline bool kvm_s2pud_readonly(pud_t *pud)
 }
 
 
+static inline bool kvm_s2pud_exec(pud_t *pud)
+{
+   BUG();
+   return false;
+}
+
 static inline void kvm_set_pmd(pmd_t *pmd, pmd_t new_pmd)
 {
*pmd = new_pmd;
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index 84051930ddfe..15bc1be8f82f 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -249,6 +249,11 @@ static inline bool kvm_s2pud_readonly(pud_t *pudp)
return kvm_s2pte_readonly((pte_t *)pudp);
 }
 
+static inline bool kvm_s2pud_exec(pud_t *pudp)
+{
+   return !(READ_ONCE(pud_val(*pudp)) & PUD_S2_XN);
+}
+
 static inline bool kvm_page_empty(void *ptr)
 {
struct page *ptr_page = virt_to_page(ptr);
diff --git a/arch/arm64/include/asm/pgtable-hwdef.h 
b/arch/arm64/include/asm/pgtable-hwdef.h
index fd208eac9f2a..10ae592b78b8 100644
--- a/arch/arm64/include/asm/pgtable-hwdef.h
+++ b/arch/arm64/include/asm/pgtable-hwdef.h
@@ -193,6 +193,8 @@
 #define PMD_S2_RDWR(_AT(pmdval_t, 3) << 6)   /* HAP[2:1] */
 #define PMD_S2_XN  (_AT(pmdval_t, 2) << 53)  /* XN[1:0] */
 
+#define PUD_S2_XN  (_AT(pudval_t, 2) << 53)  /* XN[1:0] */
+
 /*
  * Memory Attribute override for Stage-2 (MemAttr[3:0])
  */
diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index db04b18218c1..ccdea0edabb3 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -1040,10 +1040,18 @@ static int stage2_set_pmd_huge(struct kvm *kvm, struct 
kvm_mmu_memory_cache
 
 static bool stage2_is_exec(struct kvm *kvm, phys_addr_t addr)
 {
+   pud_t *pudp;
pmd_t *pmdp;
pte_t *ptep;
 
-   pmdp = stage2_get_pmd(kvm, NULL, addr);
+   pudp = stage2_get_pud(kvm, NULL, addr);
+   if (!pudp || pud_none(*pudp) || !pud_present(*pudp))
+   return false;
+
+   if (pud_huge(*pudp))
+   return kvm_s2pud_exec(pudp);
+
+   pmdp = stage2_pmd_offset(pudp, addr);
if (!pmdp || pmd_none(*pmdp) || !pmd_present(*pmdp))
return false;
 
-- 
2.17.1



[PATCH v4 3/7] KVM: arm64: Support dirty page tracking for PUD hugepages

2018-07-05 Thread Punit Agrawal
In preparation for creating PUD hugepages at stage 2, add support for
write protecting PUD hugepages when they are encountered. Write
protecting guest tables is used to track dirty pages when migrating
VMs.
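
For illustration, a minimal standalone sketch of the write-protection step used
for dirty logging. The SKETCH_* names are stand-ins; the encoding (HAP bits at
7:6, 1 for read-only and 3 for read/write) follows the PUD_S2_RDONLY/PUD_S2_RDWR
definitions used by this series:

#include <stdint.h>
#include <stdio.h>

#define SKETCH_HAP_SHIFT        6
#define SKETCH_S2_RDONLY        (UINT64_C(1) << SKETCH_HAP_SHIFT)
#define SKETCH_S2_RDWR          (UINT64_C(3) << SKETCH_HAP_SHIFT)

/* Same shape as kvm_set_s2pud_readonly(): drop the write permission bit. */
static uint64_t sketch_s2_set_readonly(uint64_t desc)
{
        return (desc & ~SKETCH_S2_RDWR) | SKETCH_S2_RDONLY;
}

int main(void)
{
        uint64_t desc = SKETCH_S2_RDWR;

        desc = sketch_s2_set_readonly(desc);
        /* Writes now fault, letting KVM log the dirty page before restoring RW. */
        printf("write bit set: %d\n",
               !!(desc & (UINT64_C(2) << SKETCH_HAP_SHIFT)));
        return 0;
}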

Also, provide trivial implementations of required kvm_s2pud_* helpers
to allow sharing of code with arm32.

Signed-off-by: Punit Agrawal 
Reviewed-by: Christoffer Dall 
Cc: Marc Zyngier 
Cc: Russell King 
Cc: Catalin Marinas 
Cc: Will Deacon 
---
 arch/arm/include/asm/kvm_mmu.h   | 16 
 arch/arm64/include/asm/kvm_mmu.h | 10 ++
 virt/kvm/arm/mmu.c   | 11 +++
 3 files changed, 33 insertions(+), 4 deletions(-)

diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
index d095c2d0b284..c23722f75d5c 100644
--- a/arch/arm/include/asm/kvm_mmu.h
+++ b/arch/arm/include/asm/kvm_mmu.h
@@ -80,6 +80,22 @@ void kvm_clear_hyp_idmap(void);
 
 #define kvm_pmd_mkhuge(pmd)pmd_mkhuge(pmd)
 
+/*
+ * The following kvm_*pud*() functions are provided strictly to allow
+ * sharing code with arm64. They should never be called in practice.
+ */
+static inline void kvm_set_s2pud_readonly(pud_t *pud)
+{
+   BUG();
+}
+
+static inline bool kvm_s2pud_readonly(pud_t *pud)
+{
+   BUG();
+   return false;
+}
+
+
 static inline void kvm_set_pmd(pmd_t *pmd, pmd_t new_pmd)
 {
*pmd = new_pmd;
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index 689def9bb9d5..84051930ddfe 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -239,6 +239,16 @@ static inline bool kvm_s2pmd_exec(pmd_t *pmdp)
return !(READ_ONCE(pmd_val(*pmdp)) & PMD_S2_XN);
 }
 
+static inline void kvm_set_s2pud_readonly(pud_t *pudp)
+{
+   kvm_set_s2pte_readonly((pte_t *)pudp);
+}
+
+static inline bool kvm_s2pud_readonly(pud_t *pudp)
+{
+   return kvm_s2pte_readonly((pte_t *)pudp);
+}
+
 static inline bool kvm_page_empty(void *ptr)
 {
struct page *ptr_page = virt_to_page(ptr);
diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index 040cd0bce5e1..db04b18218c1 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -1288,9 +1288,12 @@ static void  stage2_wp_puds(pgd_t *pgd, phys_addr_t 
addr, phys_addr_t end)
do {
next = stage2_pud_addr_end(addr, end);
if (!stage2_pud_none(*pud)) {
-   /* TODO:PUD not supported, revisit later if supported */
-   BUG_ON(stage2_pud_huge(*pud));
-   stage2_wp_pmds(pud, addr, next);
+   if (stage2_pud_huge(*pud)) {
+   if (!kvm_s2pud_readonly(pud))
+   kvm_set_s2pud_readonly(pud);
+   } else {
+   stage2_wp_pmds(pud, addr, next);
+   }
}
} while (pud++, addr = next, addr != end);
 }
@@ -1333,7 +1336,7 @@ static void stage2_wp_range(struct kvm *kvm, phys_addr_t 
addr, phys_addr_t end)
  *
  * Called to start logging dirty pages after memory region
  * KVM_MEM_LOG_DIRTY_PAGES operation is called. After this function returns
- * all present PMD and PTEs are write protected in the memory region.
+ * all present PUD, PMD and PTEs are write protected in the memory region.
  * Afterwards read of dirty page log can be called.
  *
  * Acquires kvm_mmu_lock. Called with kvm->slots_lock mutex acquired,
-- 
2.17.1



[PATCH v4 2/7] KVM: arm/arm64: Introduce helpers to manipulate page table entries

2018-07-05 Thread Punit Agrawal
Introduce helpers to abstract architectural handling of the conversion
of pfn to page table entries and marking a PMD page table entry as a
block entry.

The helpers are introduced in preparation for supporting PUD hugepages
at stage 2 - which are supported on arm64 but do not exist on arm.
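
A minimal standalone sketch of the aliasing pattern, with sketch_* stand-in names
(not the kernel's): shared code is written against the kvm_-prefixed wrapper and
each architecture maps it onto its own helper, or onto a stub when the page size
does not exist on that architecture.

#include <stdint.h>
#include <stdio.h>

typedef uint64_t sketch_pmd_t;

/* Simplified host helpers. */
static sketch_pmd_t sketch_pfn_pmd(uint64_t pfn)
{
        return pfn << 12;
}

static sketch_pmd_t sketch_pmd_mkhuge(sketch_pmd_t pmd)
{
        return pmd | 1;         /* pretend bit 0 marks a block entry */
}

/* Per-arch aliases, in the spirit of kvm_pfn_pmd()/kvm_pmd_mkhuge(). */
#define sketch_kvm_pfn_pmd(pfn)         sketch_pfn_pmd(pfn)
#define sketch_kvm_pmd_mkhuge(pmd)      sketch_pmd_mkhuge(pmd)

int main(void)
{
        sketch_pmd_t pmd = sketch_kvm_pmd_mkhuge(sketch_kvm_pfn_pmd(0x1234));

        printf("block entry = 0x%llx\n", (unsigned long long)pmd);
        return 0;
}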

Signed-off-by: Punit Agrawal 
Acked-by: Christoffer Dall 
Cc: Marc Zyngier 
Cc: Russell King 
Cc: Catalin Marinas 
Cc: Will Deacon 
---
 arch/arm/include/asm/kvm_mmu.h   | 5 +
 arch/arm64/include/asm/kvm_mmu.h | 5 +
 virt/kvm/arm/mmu.c   | 8 +---
 3 files changed, 15 insertions(+), 3 deletions(-)

diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
index 8553d68b7c8a..d095c2d0b284 100644
--- a/arch/arm/include/asm/kvm_mmu.h
+++ b/arch/arm/include/asm/kvm_mmu.h
@@ -75,6 +75,11 @@ phys_addr_t kvm_get_idmap_vector(void);
 int kvm_mmu_init(void);
 void kvm_clear_hyp_idmap(void);
 
+#define kvm_pfn_pte(pfn, prot) pfn_pte(pfn, prot)
+#define kvm_pfn_pmd(pfn, prot) pfn_pmd(pfn, prot)
+
+#define kvm_pmd_mkhuge(pmd)pmd_mkhuge(pmd)
+
 static inline void kvm_set_pmd(pmd_t *pmd, pmd_t new_pmd)
 {
*pmd = new_pmd;
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index fb9a7127bb75..689def9bb9d5 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -172,6 +172,11 @@ void kvm_clear_hyp_idmap(void);
 #definekvm_set_pte(ptep, pte)  set_pte(ptep, pte)
 #definekvm_set_pmd(pmdp, pmd)  set_pmd(pmdp, pmd)
 
+#define kvm_pfn_pte(pfn, prot) pfn_pte(pfn, prot)
+#define kvm_pfn_pmd(pfn, prot) pfn_pmd(pfn, prot)
+
+#define kvm_pmd_mkhuge(pmd)pmd_mkhuge(pmd)
+
 static inline pte_t kvm_s2pte_mkwrite(pte_t pte)
 {
pte_val(pte) |= PTE_S2_RDWR;
diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index dd14cc36c51c..040cd0bce5e1 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -1557,8 +1557,10 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, 
phys_addr_t fault_ipa,
invalidate_icache_guest_page(pfn, vma_pagesize);
 
if (hugetlb && vma_pagesize == PMD_SIZE) {
-   pmd_t new_pmd = pfn_pmd(pfn, mem_type);
-   new_pmd = pmd_mkhuge(new_pmd);
+   pmd_t new_pmd = kvm_pfn_pmd(pfn, mem_type);
+
+   new_pmd = kvm_pmd_mkhuge(new_pmd);
+
if (writable)
new_pmd = kvm_s2pmd_mkwrite(new_pmd);
 
@@ -1567,7 +1569,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, 
phys_addr_t fault_ipa,
 
ret = stage2_set_pmd_huge(kvm, memcache, fault_ipa, &new_pmd);
} else {
-   pte_t new_pte = pfn_pte(pfn, mem_type);
+   pte_t new_pte = kvm_pfn_pte(pfn, mem_type);
 
if (writable) {
new_pte = kvm_s2pte_mkwrite(new_pte);
-- 
2.17.1



[PATCH v4 1/7] KVM: arm/arm64: Share common code in user_mem_abort()

2018-07-05 Thread Punit Agrawal
The code for operations such as marking the pfn as dirty, and
dcache/icache maintenance during stage 2 fault handling is duplicated
between normal pages and PMD hugepages.

Instead of creating another copy of the operations when we introduce
PUD hugepages, let's share them across the different pagesizes.

Signed-off-by: Punit Agrawal 
Cc: Christoffer Dall 
Cc: Marc Zyngier 
---
 virt/kvm/arm/mmu.c | 68 +++---
 1 file changed, 40 insertions(+), 28 deletions(-)

diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index 1d90d79706bd..dd14cc36c51c 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -1398,6 +1398,21 @@ static void invalidate_icache_guest_page(kvm_pfn_t pfn, 
unsigned long size)
__invalidate_icache_guest_page(pfn, size);
 }
 
+static bool stage2_should_exec(struct kvm *kvm, phys_addr_t addr,
+  bool exec_fault, unsigned long fault_status)
+{
+   /*
+* If we took an execution fault we will have made the
+* icache/dcache coherent and should now let the s2 mapping be
+* executable.
+*
+* Write faults (!exec_fault && FSC_PERM) are orthogonal to
+* execute permissions, and we preserve whatever we have.
+*/
+   return exec_fault ||
+   (fault_status == FSC_PERM && stage2_is_exec(kvm, addr));
+}
+
 static void kvm_send_hwpoison_signal(unsigned long address,
 struct vm_area_struct *vma)
 {
@@ -1431,7 +1446,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, 
phys_addr_t fault_ipa,
kvm_pfn_t pfn;
pgprot_t mem_type = PAGE_S2;
bool logging_active = memslot_is_logging(memslot);
-   unsigned long flags = 0;
+   unsigned long vma_pagesize, flags = 0;
 
write_fault = kvm_is_write_fault(vcpu);
exec_fault = kvm_vcpu_trap_is_iabt(vcpu);
@@ -1451,7 +1466,8 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, 
phys_addr_t fault_ipa,
return -EFAULT;
}
 
-   if (vma_kernel_pagesize(vma) == PMD_SIZE && !logging_active) {
+   vma_pagesize = vma_kernel_pagesize(vma);
+   if (vma_pagesize == PMD_SIZE && !logging_active) {
hugetlb = true;
gfn = (fault_ipa & PMD_MASK) >> PAGE_SHIFT;
} else {
@@ -1520,28 +1536,34 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, 
phys_addr_t fault_ipa,
if (mmu_notifier_retry(kvm, mmu_seq))
goto out_unlock;
 
-   if (!hugetlb && !force_pte)
+   if (!hugetlb && !force_pte) {
+   /*
+* Only PMD_SIZE transparent hugepages(THP) are
+* currently supported. This code will need to be
+* updated to support other THP sizes.
+*/
hugetlb = transparent_hugepage_adjust(&pfn, &fault_ipa);
+   if (hugetlb)
+   vma_pagesize = PMD_SIZE;
+   }
+
+   if (writable)
+   kvm_set_pfn_dirty(pfn);
 
-   if (hugetlb) {
+   if (fault_status != FSC_PERM)
+   clean_dcache_guest_page(pfn, vma_pagesize);
+
+   if (exec_fault)
+   invalidate_icache_guest_page(pfn, vma_pagesize);
+
+   if (hugetlb && vma_pagesize == PMD_SIZE) {
pmd_t new_pmd = pfn_pmd(pfn, mem_type);
new_pmd = pmd_mkhuge(new_pmd);
-   if (writable) {
+   if (writable)
new_pmd = kvm_s2pmd_mkwrite(new_pmd);
-   kvm_set_pfn_dirty(pfn);
-   }
 
-   if (fault_status != FSC_PERM)
-   clean_dcache_guest_page(pfn, PMD_SIZE);
-
-   if (exec_fault) {
+   if (stage2_should_exec(kvm, fault_ipa, exec_fault, 
fault_status))
new_pmd = kvm_s2pmd_mkexec(new_pmd);
-   invalidate_icache_guest_page(pfn, PMD_SIZE);
-   } else if (fault_status == FSC_PERM) {
-   /* Preserve execute if XN was already cleared */
-   if (stage2_is_exec(kvm, fault_ipa))
-   new_pmd = kvm_s2pmd_mkexec(new_pmd);
-   }
 
ret = stage2_set_pmd_huge(kvm, memcache, fault_ipa, &new_pmd);
} else {
@@ -1549,21 +1571,11 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, 
phys_addr_t fault_ipa,
 
if (writable) {
new_pte = kvm_s2pte_mkwrite(new_pte);
-   kvm_set_pfn_dirty(pfn);
mark_page_dirty(kvm, gfn);
}
 
-   if (fault_status != FSC_PERM)
-   clean_dcache_guest_page(pfn, PAGE_SIZE);
-
-   if (exec_fault) {
+   if (stage2_should_exec(kvm, fault_ipa, exec_fault, 
fault_status))
new_pte = kvm_s2pte_mkexec(new_pte);
-

[PATCH v4 0/7] KVM: Support PUD hugepages at stage 2

2018-07-05 Thread Punit Agrawal
This series is an update to the PUD hugepage support previously posted
at [0][1][2][3]. This patchset adds support for PUD hugepages at stage
2. This feature is useful on cores that have support for large sized
TLB mappings (e.g., 1GB for 4K granule).

There are three new patches to support PUD hugepages for execute
permission faults, access faults and handling aging of PUD page table
entries (patches 4-6). This addresses Suzuki's feedback on the
previous version.

Also, live migration didn't work with earlier versions. This has now
been addressed by updating patch 1 & 7 to ensure that hugepages are
dissolved correctly when dirty logging is enabled.

Support is added to code that is shared between arm and arm64. Dummy
helpers for arm are provided as the port does not support PUD hugepage
sizes.

The patches have been tested on an A57 based system. The patchset is
based on v4.18-rc3. There are a few conflicts with the support for 52
bit IPA[4] due to a change in the number of parameters for
stage2_pmd_offset().

Thanks,
Punit

v3 -> v4:
* Patch 1 and 7 - Don't put down hugepages pte if logging is enabled
* Patch 4-5 - Add PUD hugepage support for exec and access faults
* Patch 6 - PUD hugepage support for aging page table entries

v2 -> v3:
* Update vma_pagesize directly if THP [1/4]. Previously this was done
  indirectly via hugetlb
* Added review tag [4/4]

v1 -> v2:
* Create helper to check if the page should have exec permission [1/4]
* Fix broken condition to detect THP hugepage [1/4]
* Fix incorrect hunk resulting from a rebase [4/4]

[0] https://lkml.org/lkml/2018/5/14/907
[1] https://www.spinics.net/lists/arm-kernel/msg628053.html
[2] https://lkml.org/lkml/2018/4/20/566
[3] https://lkml.org/lkml/2018/5/1/133
[4] https://www.spinics.net/lists/kvm/msg171065.html

Punit Agrawal (7):
  KVM: arm/arm64: Share common code in user_mem_abort()
  KVM: arm/arm64: Introduce helpers to manipulate page table entries
  KVM: arm64: Support dirty page tracking for PUD hugepages
  KVM: arm64: Support PUD hugepage in stage2_is_exec()
  KVM: arm64: Support handling access faults for PUD hugepages
  KVM: arm64: Update age handlers to support PUD hugepages
  KVM: arm64: Add support for creating PUD hugepages at stage 2

 arch/arm/include/asm/kvm_mmu.h |  60 +++
 arch/arm64/include/asm/kvm_mmu.h   |  47 ++
 arch/arm64/include/asm/pgtable-hwdef.h |   4 +
 arch/arm64/include/asm/pgtable.h   |   9 ++
 virt/kvm/arm/mmu.c | 214 -
 5 files changed, 291 insertions(+), 43 deletions(-)

-- 
2.17.1



Re: [PATCH v3 4/4] KVM: arm64: Add support for PUD hugepages at stage 2

2018-05-16 Thread Punit Agrawal
Suzuki K Poulose <suzuki.poul...@arm.com> writes:

> On 05/14/2018 03:43 PM, Punit Agrawal wrote:
>> KVM only supports PMD hugepages at stage 2. Extend the stage 2 fault
>> handling to add support for PUD hugepages.
>>
>> Addition of pud hugepage support enables additional hugepage
>> sizes (e.g., 1G with 4K granule) which can be useful on cores that
>> support mapping larger block sizes in the TLB entries.
>>
>> Signed-off-by: Punit Agrawal <punit.agra...@arm.com>
>> Reviewed-by: Christoffer Dall <christoffer.d...@arm.com>
>> Cc: Marc Zyngier <marc.zyng...@arm.com>
>> Cc: Russell King <li...@armlinux.org.uk>
>> Cc: Catalin Marinas <catalin.mari...@arm.com>
>> Cc: Will Deacon <will.dea...@arm.com>
>> ---
>>   arch/arm/include/asm/kvm_mmu.h | 19 
>>   arch/arm64/include/asm/kvm_mmu.h   | 15 ++
>>   arch/arm64/include/asm/pgtable-hwdef.h |  4 +++
>>   arch/arm64/include/asm/pgtable.h   |  2 ++
>>   virt/kvm/arm/mmu.c | 40 --
>>   5 files changed, 77 insertions(+), 3 deletions(-)
>>
>> diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
>> index 224c22c0a69c..155916dbdd7e 100644
>> --- a/arch/arm/include/asm/kvm_mmu.h
>> +++ b/arch/arm/include/asm/kvm_mmu.h
>> @@ -77,8 +77,11 @@ void kvm_clear_hyp_idmap(void);
>> #define kvm_pfn_pte(pfn, prot)   pfn_pte(pfn, prot)
>>   #define kvm_pfn_pmd(pfn, prot) pfn_pmd(pfn, prot)
>> +#define kvm_pfn_pud(pfn, prot)  (__pud(0))
>> #define kvm_pmd_mkhuge(pmd)  pmd_mkhuge(pmd)
>> +/* No support for pud hugepages */
>> +#define kvm_pud_mkhuge(pud) (pud)
>> /*
>>* The following kvm_*pud*() functions are provided strictly to allow
>> @@ -95,6 +98,22 @@ static inline bool kvm_s2pud_readonly(pud_t *pud)
>>  return false;
>>   }
>>   +static inline void kvm_set_pud(pud_t *pud, pud_t new_pud)
>> +{
>> +BUG();
>> +}
>> +
>> +static inline pud_t kvm_s2pud_mkwrite(pud_t pud)
>> +{
>> +BUG();
>> +return pud;
>> +}
>> +
>> +static inline pud_t kvm_s2pud_mkexec(pud_t pud)
>> +{
>> +BUG();
>> +return pud;
>> +}
>> static inline void kvm_set_pmd(pmd_t *pmd, pmd_t new_pmd)
>>   {
>> diff --git a/arch/arm64/include/asm/kvm_mmu.h 
>> b/arch/arm64/include/asm/kvm_mmu.h
>> index f440cf216a23..f49a68fcbf26 100644
>> --- a/arch/arm64/include/asm/kvm_mmu.h
>> +++ b/arch/arm64/include/asm/kvm_mmu.h
>> @@ -172,11 +172,14 @@ void kvm_clear_hyp_idmap(void);
>> #define  kvm_set_pte(ptep, pte)  set_pte(ptep, pte)
>>   #definekvm_set_pmd(pmdp, pmd)  set_pmd(pmdp, pmd)
>> +#define kvm_set_pud(pudp, pud)  set_pud(pudp, pud)
>> #define kvm_pfn_pte(pfn, prot)   pfn_pte(pfn, prot)
>>   #define kvm_pfn_pmd(pfn, prot) pfn_pmd(pfn, prot)
>> +#define kvm_pfn_pud(pfn, prot)  pfn_pud(pfn, prot)
>> #define kvm_pmd_mkhuge(pmd)  pmd_mkhuge(pmd)
>> +#define kvm_pud_mkhuge(pud) pud_mkhuge(pud)
>> static inline pte_t kvm_s2pte_mkwrite(pte_t pte)
>>   {
>> @@ -190,6 +193,12 @@ static inline pmd_t kvm_s2pmd_mkwrite(pmd_t pmd)
>>  return pmd;
>>   }
>>   +static inline pud_t kvm_s2pud_mkwrite(pud_t pud)
>> +{
>> +pud_val(pud) |= PUD_S2_RDWR;
>> +return pud;
>> +}
>> +
>>   static inline pte_t kvm_s2pte_mkexec(pte_t pte)
>>   {
>>  pte_val(pte) &= ~PTE_S2_XN;
>> @@ -202,6 +211,12 @@ static inline pmd_t kvm_s2pmd_mkexec(pmd_t pmd)
>>  return pmd;
>>   }
>>   +static inline pud_t kvm_s2pud_mkexec(pud_t pud)
>> +{
>> +pud_val(pud) &= ~PUD_S2_XN;
>> +return pud;
>> +}
>> +
>>   static inline void kvm_set_s2pte_readonly(pte_t *ptep)
>>   {
>>  pteval_t old_pteval, pteval;
>> diff --git a/arch/arm64/include/asm/pgtable-hwdef.h 
>> b/arch/arm64/include/asm/pgtable-hwdef.h
>> index fd208eac9f2a..e327665e94d1 100644
>> --- a/arch/arm64/include/asm/pgtable-hwdef.h
>> +++ b/arch/arm64/include/asm/pgtable-hwdef.h
>> @@ -193,6 +193,10 @@
>>   #define PMD_S2_RDWR(_AT(pmdval_t, 3) << 6)   /* HAP[2:1] */
>>   #define PMD_S2_XN  (_AT(pmdval_t, 2) << 53)  /* XN[1:0] */
>>   +#define PUD_S2_RDONLY (_AT(pudval_t, 1) << 6)   /*
>> HAP[2:1] */
>> +#define PUD_S2_RDWR (_AT(pudval_t, 3) << 6)   /* HAP[2:1

Re: [PATCH v2 4/4] KVM: arm64: Add support for PUD hugepages at stage 2

2018-05-15 Thread Punit Agrawal
Catalin Marinas <catalin.mari...@arm.com> writes:

> On Tue, May 01, 2018 at 11:26:59AM +0100, Punit Agrawal wrote:
>> KVM currently supports PMD hugepages at stage 2. Extend the stage 2
>> fault handling to add support for PUD hugepages.
>> 
>> Addition of pud hugepage support enables additional hugepage
>> sizes (e.g., 1G with 4K granule) which can be useful on cores that
>> support mapping larger block sizes in the TLB entries.
>> 
>> Signed-off-by: Punit Agrawal <punit.agra...@arm.com>
>> Cc: Christoffer Dall <christoffer.d...@arm.com>
>> Cc: Marc Zyngier <marc.zyng...@arm.com>
>> Cc: Russell King <li...@armlinux.org.uk>
>> Cc: Catalin Marinas <catalin.mari...@arm.com>
>> Cc: Will Deacon <will.dea...@arm.com>
>> ---
>>  arch/arm/include/asm/kvm_mmu.h | 19 
>>  arch/arm64/include/asm/kvm_mmu.h   | 15 ++
>>  arch/arm64/include/asm/pgtable-hwdef.h |  4 +++
>>  arch/arm64/include/asm/pgtable.h   |  2 ++
>>  virt/kvm/arm/mmu.c | 40 --
>>  5 files changed, 77 insertions(+), 3 deletions(-)
>
> Since this patch touches a couple of core arm64 files:
>
> Acked-by: Catalin Marinas <catalin.mari...@arm.com>

Thanks Catalin.

I've posted a v3 with minor changes yesterday[0]. Can you comment there?
Or maybe Marc can apply the tag while merging the patches.

[0] https://lkml.org/lkml/2018/5/14/912


[PATCH v3 4/4] KVM: arm64: Add support for PUD hugepages at stage 2

2018-05-14 Thread Punit Agrawal
KVM only supports PMD hugepages at stage 2. Extend the stage 2 fault
handling to add support for PUD hugepages.

Addition of pud hugepage support enables additional hugepage
sizes (e.g., 1G with 4K granule) which can be useful on cores that
support mapping larger block sizes in the TLB entries.

Signed-off-by: Punit Agrawal <punit.agra...@arm.com>
Reviewed-by: Christoffer Dall <christoffer.d...@arm.com>
Cc: Marc Zyngier <marc.zyng...@arm.com>
Cc: Russell King <li...@armlinux.org.uk>
Cc: Catalin Marinas <catalin.mari...@arm.com>
Cc: Will Deacon <will.dea...@arm.com>
---
 arch/arm/include/asm/kvm_mmu.h | 19 
 arch/arm64/include/asm/kvm_mmu.h   | 15 ++
 arch/arm64/include/asm/pgtable-hwdef.h |  4 +++
 arch/arm64/include/asm/pgtable.h   |  2 ++
 virt/kvm/arm/mmu.c | 40 --
 5 files changed, 77 insertions(+), 3 deletions(-)

diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
index 224c22c0a69c..155916dbdd7e 100644
--- a/arch/arm/include/asm/kvm_mmu.h
+++ b/arch/arm/include/asm/kvm_mmu.h
@@ -77,8 +77,11 @@ void kvm_clear_hyp_idmap(void);
 
 #define kvm_pfn_pte(pfn, prot) pfn_pte(pfn, prot)
 #define kvm_pfn_pmd(pfn, prot) pfn_pmd(pfn, prot)
+#define kvm_pfn_pud(pfn, prot) (__pud(0))
 
 #define kvm_pmd_mkhuge(pmd)pmd_mkhuge(pmd)
+/* No support for pud hugepages */
+#define kvm_pud_mkhuge(pud)(pud)
 
 /*
 * The following kvm_*pud*() functions are provided strictly to allow
@@ -95,6 +98,22 @@ static inline bool kvm_s2pud_readonly(pud_t *pud)
return false;
 }
 
+static inline void kvm_set_pud(pud_t *pud, pud_t new_pud)
+{
+   BUG();
+}
+
+static inline pud_t kvm_s2pud_mkwrite(pud_t pud)
+{
+   BUG();
+   return pud;
+}
+
+static inline pud_t kvm_s2pud_mkexec(pud_t pud)
+{
+   BUG();
+   return pud;
+}
 
 static inline void kvm_set_pmd(pmd_t *pmd, pmd_t new_pmd)
 {
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index f440cf216a23..f49a68fcbf26 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -172,11 +172,14 @@ void kvm_clear_hyp_idmap(void);
 
 #definekvm_set_pte(ptep, pte)  set_pte(ptep, pte)
 #definekvm_set_pmd(pmdp, pmd)  set_pmd(pmdp, pmd)
+#define kvm_set_pud(pudp, pud) set_pud(pudp, pud)
 
 #define kvm_pfn_pte(pfn, prot) pfn_pte(pfn, prot)
 #define kvm_pfn_pmd(pfn, prot) pfn_pmd(pfn, prot)
+#define kvm_pfn_pud(pfn, prot) pfn_pud(pfn, prot)
 
 #define kvm_pmd_mkhuge(pmd)pmd_mkhuge(pmd)
+#define kvm_pud_mkhuge(pud)pud_mkhuge(pud)
 
 static inline pte_t kvm_s2pte_mkwrite(pte_t pte)
 {
@@ -190,6 +193,12 @@ static inline pmd_t kvm_s2pmd_mkwrite(pmd_t pmd)
return pmd;
 }
 
+static inline pud_t kvm_s2pud_mkwrite(pud_t pud)
+{
+   pud_val(pud) |= PUD_S2_RDWR;
+   return pud;
+}
+
 static inline pte_t kvm_s2pte_mkexec(pte_t pte)
 {
pte_val(pte) &= ~PTE_S2_XN;
@@ -202,6 +211,12 @@ static inline pmd_t kvm_s2pmd_mkexec(pmd_t pmd)
return pmd;
 }
 
+static inline pud_t kvm_s2pud_mkexec(pud_t pud)
+{
+   pud_val(pud) &= ~PUD_S2_XN;
+   return pud;
+}
+
 static inline void kvm_set_s2pte_readonly(pte_t *ptep)
 {
pteval_t old_pteval, pteval;
diff --git a/arch/arm64/include/asm/pgtable-hwdef.h 
b/arch/arm64/include/asm/pgtable-hwdef.h
index fd208eac9f2a..e327665e94d1 100644
--- a/arch/arm64/include/asm/pgtable-hwdef.h
+++ b/arch/arm64/include/asm/pgtable-hwdef.h
@@ -193,6 +193,10 @@
 #define PMD_S2_RDWR(_AT(pmdval_t, 3) << 6)   /* HAP[2:1] */
 #define PMD_S2_XN  (_AT(pmdval_t, 2) << 53)  /* XN[1:0] */
 
+#define PUD_S2_RDONLY  (_AT(pudval_t, 1) << 6)   /* HAP[2:1] */
+#define PUD_S2_RDWR(_AT(pudval_t, 3) << 6)   /* HAP[2:1] */
+#define PUD_S2_XN  (_AT(pudval_t, 2) << 53)  /* XN[1:0] */
+
 /*
  * Memory Attribute override for Stage-2 (MemAttr[3:0])
  */
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 7c4c8f318ba9..31ea9fda07e3 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -386,6 +386,8 @@ static inline int pmd_protnone(pmd_t pmd)
 
 #define pud_write(pud) pte_write(pud_pte(pud))
 
+#define pud_mkhuge(pud)(__pud(pud_val(pud) & ~PUD_TABLE_BIT))
+
 #define __pud_to_phys(pud) __pte_to_phys(pud_pte(pud))
 #define __phys_to_pud_val(phys)__phys_to_pte_val(phys)
 #define pud_pfn(pud)   ((__pud_to_phys(pud) & PUD_MASK) >> PAGE_SHIFT)
diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index 671d3c0825f2..b0931fa2d64e 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -1036,6 +1036,26 @@ static int stage2_set_pmd_huge(struct kvm *kvm, struct 
kvm_mmu_memory_cache
return 0;
 

[PATCH v3 1/4] KVM: arm/arm64: Share common code in user_mem_abort()

2018-05-14 Thread Punit Agrawal
The code for operations such as marking the pfn as dirty, and
dcache/icache maintenance during stage 2 fault handling is duplicated
between normal pages and PMD hugepages.

Instead of creating another copy of the operations when we introduce
PUD hugepages, let's share them across the different pagesizes.

Signed-off-by: Punit Agrawal <punit.agra...@arm.com>
Reviewed-by: Christoffer Dall <christoffer.d...@arm.com>
Cc: Marc Zyngier <marc.zyng...@arm.com>
---
 virt/kvm/arm/mmu.c | 69 +++---
 1 file changed, 40 insertions(+), 29 deletions(-)

diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index 7f6a944db23d..07ae1e003762 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -1396,6 +1396,21 @@ static void invalidate_icache_guest_page(kvm_pfn_t pfn, 
unsigned long size)
__invalidate_icache_guest_page(pfn, size);
 }
 
+static bool stage2_should_exec(struct kvm *kvm, phys_addr_t addr,
+  bool exec_fault, unsigned long fault_status)
+{
+   /*
+* If we took an execution fault we will have made the
+* icache/dcache coherent and should now let the s2 mapping be
+* executable.
+*
+* Write faults (!exec_fault && FSC_PERM) are orthogonal to
+* execute permissions, and we preserve whatever we have.
+*/
+   return exec_fault ||
+   (fault_status == FSC_PERM && stage2_is_exec(kvm, addr));
+}
+
 static void kvm_send_hwpoison_signal(unsigned long address,
 struct vm_area_struct *vma)
 {
@@ -1428,7 +1443,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, 
phys_addr_t fault_ipa,
kvm_pfn_t pfn;
pgprot_t mem_type = PAGE_S2;
bool logging_active = memslot_is_logging(memslot);
-   unsigned long flags = 0;
+   unsigned long vma_pagesize, flags = 0;
 
write_fault = kvm_is_write_fault(vcpu);
exec_fault = kvm_vcpu_trap_is_iabt(vcpu);
@@ -1448,7 +1463,8 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, 
phys_addr_t fault_ipa,
return -EFAULT;
}
 
-   if (vma_kernel_pagesize(vma) == PMD_SIZE && !logging_active) {
+   vma_pagesize = vma_kernel_pagesize(vma);
+   if (vma_pagesize == PMD_SIZE && !logging_active) {
hugetlb = true;
gfn = (fault_ipa & PMD_MASK) >> PAGE_SHIFT;
} else {
@@ -1517,28 +1533,33 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, 
phys_addr_t fault_ipa,
if (mmu_notifier_retry(kvm, mmu_seq))
goto out_unlock;
 
-   if (!hugetlb && !force_pte)
-   hugetlb = transparent_hugepage_adjust(&pfn, &fault_ipa);
+   if (!hugetlb && !force_pte) {
+   /*
+* Only PMD_SIZE transparent hugepages(THP) are
+* currently supported. This code will need to be
+* updated to support other THP sizes.
+*/
+   if (transparent_hugepage_adjust(&pfn, &fault_ipa))
+   vma_pagesize = PMD_SIZE;
+   }
+
+   if (writable)
+   kvm_set_pfn_dirty(pfn);
 
-   if (hugetlb) {
+   if (fault_status != FSC_PERM)
+   clean_dcache_guest_page(pfn, vma_pagesize);
+
+   if (exec_fault)
+   invalidate_icache_guest_page(pfn, vma_pagesize);
+
+   if (vma_pagesize == PMD_SIZE) {
pmd_t new_pmd = pfn_pmd(pfn, mem_type);
new_pmd = pmd_mkhuge(new_pmd);
-   if (writable) {
+   if (writable)
new_pmd = kvm_s2pmd_mkwrite(new_pmd);
-   kvm_set_pfn_dirty(pfn);
-   }
 
-   if (fault_status != FSC_PERM)
-   clean_dcache_guest_page(pfn, PMD_SIZE);
-
-   if (exec_fault) {
+   if (stage2_should_exec(kvm, fault_ipa, exec_fault, 
fault_status))
new_pmd = kvm_s2pmd_mkexec(new_pmd);
-   invalidate_icache_guest_page(pfn, PMD_SIZE);
-   } else if (fault_status == FSC_PERM) {
-   /* Preserve execute if XN was already cleared */
-   if (stage2_is_exec(kvm, fault_ipa))
-   new_pmd = kvm_s2pmd_mkexec(new_pmd);
-   }
 
ret = stage2_set_pmd_huge(kvm, memcache, fault_ipa, &new_pmd);
} else {
@@ -1546,21 +1567,11 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, 
phys_addr_t fault_ipa,
 
if (writable) {
new_pte = kvm_s2pte_mkwrite(new_pte);
-   kvm_set_pfn_dirty(pfn);
mark_page_dirty(kvm, gfn);
}
 
-   if (fault_status != FSC_PERM)
-   clean_dcache_guest_page(pfn, PAGE_SIZE);
-
-   if (exec_fault) {
+   if (stage2_should_exec

[PATCH v3 2/4] KVM: arm/arm64: Introduce helpers to manipulate page table entries

2018-05-14 Thread Punit Agrawal
Introduce helpers to abstract architectural handling of the conversion
of pfn to page table entries and marking a PMD page table entry as a
block entry.

The helpers are introduced in preparation for supporting PUD hugepages
at stage 2 - which are supported on arm64 but do not exist on arm.

Signed-off-by: Punit Agrawal <punit.agra...@arm.com>
Acked-by: Christoffer Dall <christoffer.d...@arm.com>
Cc: Marc Zyngier <marc.zyng...@arm.com>
Cc: Russell King <li...@armlinux.org.uk>
Cc: Catalin Marinas <catalin.mari...@arm.com>
Cc: Will Deacon <will.dea...@arm.com>
---
 arch/arm/include/asm/kvm_mmu.h   | 5 +
 arch/arm64/include/asm/kvm_mmu.h | 5 +
 virt/kvm/arm/mmu.c   | 7 ---
 3 files changed, 14 insertions(+), 3 deletions(-)

diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
index 707a1f06dc5d..5907a81ad5c1 100644
--- a/arch/arm/include/asm/kvm_mmu.h
+++ b/arch/arm/include/asm/kvm_mmu.h
@@ -75,6 +75,11 @@ phys_addr_t kvm_get_idmap_vector(void);
 int kvm_mmu_init(void);
 void kvm_clear_hyp_idmap(void);
 
+#define kvm_pfn_pte(pfn, prot) pfn_pte(pfn, prot)
+#define kvm_pfn_pmd(pfn, prot) pfn_pmd(pfn, prot)
+
+#define kvm_pmd_mkhuge(pmd)pmd_mkhuge(pmd)
+
 static inline void kvm_set_pmd(pmd_t *pmd, pmd_t new_pmd)
 {
*pmd = new_pmd;
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index 082110993647..d962508ce4b3 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -173,6 +173,11 @@ void kvm_clear_hyp_idmap(void);
 #definekvm_set_pte(ptep, pte)  set_pte(ptep, pte)
 #definekvm_set_pmd(pmdp, pmd)  set_pmd(pmdp, pmd)
 
+#define kvm_pfn_pte(pfn, prot) pfn_pte(pfn, prot)
+#define kvm_pfn_pmd(pfn, prot) pfn_pmd(pfn, prot)
+
+#define kvm_pmd_mkhuge(pmd)pmd_mkhuge(pmd)
+
 static inline pte_t kvm_s2pte_mkwrite(pte_t pte)
 {
pte_val(pte) |= PTE_S2_RDWR;
diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index 07ae1e003762..0beefcc5e090 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -1553,8 +1553,9 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, 
phys_addr_t fault_ipa,
invalidate_icache_guest_page(pfn, vma_pagesize);
 
if (vma_pagesize == PMD_SIZE) {
-   pmd_t new_pmd = pfn_pmd(pfn, mem_type);
-   new_pmd = pmd_mkhuge(new_pmd);
+   pmd_t new_pmd = kvm_pfn_pmd(pfn, mem_type);
+
+   new_pmd = kvm_pmd_mkhuge(new_pmd);
if (writable)
new_pmd = kvm_s2pmd_mkwrite(new_pmd);
 
@@ -1563,7 +1564,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, 
phys_addr_t fault_ipa,
 
ret = stage2_set_pmd_huge(kvm, memcache, fault_ipa, &new_pmd);
} else {
-   pte_t new_pte = pfn_pte(pfn, mem_type);
+   pte_t new_pte = kvm_pfn_pte(pfn, mem_type);
 
if (writable) {
new_pte = kvm_s2pte_mkwrite(new_pte);
-- 
2.17.0



[PATCH v3 0/4] KVM: Support PUD hugepages at stage 2

2018-05-14 Thread Punit Agrawal
Hi,

This patchset adds support for PUD hugepages at stage 2. This feature
is useful on cores that have support for large sized TLB mappings
(e.g., 1GB for 4K granule). Previous postings can be found at
[0][1][2].

Support is added to code that is shared between arm and arm64. Dummy
helpers for arm are provided as the port does not support PUD hugepage
sizes.

There is a small conflict with the series to add support for 52 bit
IPA[3]. The patches have been functionally tested on an A57 based
system. The patchset is based on v4.17-rc5 and incorporates feedback
received on the previous version.

Thanks,
Punit

v2 -> v3:
* Update vma_pagesize directly if THP [1/4]. Previously this was done
  indirectly via hugetlb
* Added review tag [4/4]

v1 -> v2:
* Create helper to check if the page should have exec permission [1/4]
* Fix broken condition to detect THP hugepage [1/4]
* Fix incorrect hunk resulting from a rebase [4/4]

[0] https://www.spinics.net/lists/arm-kernel/msg628053.html
[1] https://lkml.org/lkml/2018/4/20/566
[2] https://lkml.org/lkml/2018/5/1/133
[3] https://lwn.net/Articles/750176/


Punit Agrawal (4):
  KVM: arm/arm64: Share common code in user_mem_abort()
  KVM: arm/arm64: Introduce helpers to manipulate page table entries
  KVM: arm64: Support dirty page tracking for PUD hugepages
  KVM: arm64: Add support for PUD hugepages at stage 2

 arch/arm/include/asm/kvm_mmu.h |  40 
 arch/arm64/include/asm/kvm_mmu.h   |  30 ++
 arch/arm64/include/asm/pgtable-hwdef.h |   4 +
 arch/arm64/include/asm/pgtable.h   |   2 +
 virt/kvm/arm/mmu.c | 121 +
 5 files changed, 161 insertions(+), 36 deletions(-)

-- 
2.17.0



Re: [PATCH v2 1/4] KVM: arm/arm64: Share common code in user_mem_abort()

2018-05-04 Thread Punit Agrawal
Christoffer Dall <christoffer.d...@arm.com> writes:

> On Tue, May 01, 2018 at 11:26:56AM +0100, Punit Agrawal wrote:
>> The code for operations such as marking the pfn as dirty, and
>> dcache/icache maintenance during stage 2 fault handling is duplicated
>> between normal pages and PMD hugepages.
>> 
>> Instead of creating another copy of the operations when we introduce
>> PUD hugepages, let's share them across the different pagesizes.
>> 
>> Signed-off-by: Punit Agrawal <punit.agra...@arm.com>
>> Reviewed-by: Christoffer Dall <christoffer.d...@arm.com>
>> Cc: Marc Zyngier <marc.zyng...@arm.com>
>> ---
>>  virt/kvm/arm/mmu.c | 66 +++---
>>  1 file changed, 39 insertions(+), 27 deletions(-)
>> 
>> diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
>> index 7f6a944db23d..686fc6a4b866 100644
>> --- a/virt/kvm/arm/mmu.c
>> +++ b/virt/kvm/arm/mmu.c

[...]

>> @@ -1517,28 +1533,34 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, 
>> phys_addr_t fault_ipa,
>>  if (mmu_notifier_retry(kvm, mmu_seq))
>>  goto out_unlock;
>>  
>> -if (!hugetlb && !force_pte)
>> +if (!hugetlb && !force_pte) {
>>  hugetlb = transparent_hugepage_adjust(&pfn, &fault_ipa);
>> +/*
>> + * Only PMD_SIZE transparent hugepages(THP) are
>> + * currently supported. This code will need to be
>> + * updated to support other THP sizes.
>> + */
>> +if (hugetlb)
>> +vma_pagesize = PMD_SIZE;
>
> nit: this is a bit of a trap waiting to happen, as the suggested
> semantics of hugetlb is now hugetlbfs and not THP.
>
> It may be slightly nicer to do do:
>
>   if (transparent_hugepage_adjust(&pfn, &fault_ipa))
>   vma_pagesize = PMD_SIZE;

I should've noticed this.

I'll incorporate your suggestion and update the condition below using
hugetlb to rely on vma_pagesize instead.

Thanks,
Punit

>
>> +}
>> +
>> +if (writable)
>> +kvm_set_pfn_dirty(pfn);
>> +
>> +if (fault_status != FSC_PERM)
>> +clean_dcache_guest_page(pfn, vma_pagesize);
>> +
>> +if (exec_fault)
>> +invalidate_icache_guest_page(pfn, vma_pagesize);
>>  
>>  if (hugetlb) {
>>  pmd_t new_pmd = pfn_pmd(pfn, mem_type);
>>  new_pmd = pmd_mkhuge(new_pmd);
>> -if (writable) {
>> +if (writable)
>>  new_pmd = kvm_s2pmd_mkwrite(new_pmd);
>> -kvm_set_pfn_dirty(pfn);
>> -}
>>  
>> -if (fault_status != FSC_PERM)
>> -clean_dcache_guest_page(pfn, PMD_SIZE);
>> -
>> -if (exec_fault) {
>> +if (stage2_should_exec(kvm, fault_ipa, exec_fault, 
>> fault_status))
>>  new_pmd = kvm_s2pmd_mkexec(new_pmd);
>> -invalidate_icache_guest_page(pfn, PMD_SIZE);
>> -} else if (fault_status == FSC_PERM) {
>> -/* Preserve execute if XN was already cleared */
>> -if (stage2_is_exec(kvm, fault_ipa))
>> -new_pmd = kvm_s2pmd_mkexec(new_pmd);
>> -}
>>  
>>  ret = stage2_set_pmd_huge(kvm, memcache, fault_ipa, &new_pmd);
>>  } else {
>> @@ -1546,21 +1568,11 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, 
>> phys_addr_t fault_ipa,
>>  
>>  if (writable) {
>>  new_pte = kvm_s2pte_mkwrite(new_pte);
>> -kvm_set_pfn_dirty(pfn);
>>  mark_page_dirty(kvm, gfn);
>>  }
>>  
>> -if (fault_status != FSC_PERM)
>> -clean_dcache_guest_page(pfn, PAGE_SIZE);
>> -
>> -if (exec_fault) {
>> +if (stage2_should_exec(kvm, fault_ipa, exec_fault, 
>> fault_status))
>>  new_pte = kvm_s2pte_mkexec(new_pte);
>> -invalidate_icache_guest_page(pfn, PAGE_SIZE);
>> -} else if (fault_status == FSC_PERM) {
>> -/* Preserve execute if XN was already cleared */
>> -if (stage2_is_exec(kvm, fault_ipa))
>> -new_pte = kvm_s2pte_mkexec(new_pte);
>> -}
>>  
>>  ret = stage2_set_pte(kvm, memcache, fault_ipa, _pte, flags);
>>  }
>> -- 
>> 2.17.0
>> 
>
> Otherwise looks good.
>
> Thanks,
> -Christoffer


Re: [PATCH v2 2/4] KVM: arm/arm64: Introduce helpers to manipulate page table entries

2018-05-01 Thread Punit Agrawal
Hi Suzuki,

Thanks for having a look.

Suzuki K Poulose <suzuki.poul...@arm.com> writes:

> On 01/05/18 11:26, Punit Agrawal wrote:
>> Introduce helpers to abstract architectural handling of the conversion
>> of pfn to page table entries and marking a PMD page table entry as a
>> block entry.
>>
>> The helpers are introduced in preparation for supporting PUD hugepages
>> at stage 2 - which are supported on arm64 but do not exist on arm.
>
> Punit,
>
> The change are fine by me. However, we usually do not define kvm_*
> accessors for something which we know matches with the host variant.
> i.e, PMD and PTE helpers, which are always present and we make use
> of them directly. (see unmap_stage2_pmds for e.g)

In general, I agree - it makes sense to avoid duplication.

Having said that, the helpers here allow following a common pattern for
handling the various page sizes - pte, pmd and pud - during stage 2
fault handling (see patch 4).

As you've said you're OK with this change, I'd prefer to keep this patch
but will drop it if any others reviewers are concerned about the
duplication as well.

Thanks,
Punit

>
> Cheers
> Suzuki
>
>>
>> Signed-off-by: Punit Agrawal <punit.agra...@arm.com>
>> Acked-by: Christoffer Dall <christoffer.d...@arm.com>
>> Cc: Marc Zyngier <marc.zyng...@arm.com>
>> Cc: Russell King <li...@armlinux.org.uk>
>> Cc: Catalin Marinas <catalin.mari...@arm.com>
>> Cc: Will Deacon <will.dea...@arm.com>
>> ---
>>   arch/arm/include/asm/kvm_mmu.h   | 5 +
>>   arch/arm64/include/asm/kvm_mmu.h | 5 +
>>   virt/kvm/arm/mmu.c   | 7 ---
>>   3 files changed, 14 insertions(+), 3 deletions(-)
>>
>> diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
>> index 707a1f06dc5d..5907a81ad5c1 100644
>> --- a/arch/arm/include/asm/kvm_mmu.h
>> +++ b/arch/arm/include/asm/kvm_mmu.h
>> @@ -75,6 +75,11 @@ phys_addr_t kvm_get_idmap_vector(void);
>>   int kvm_mmu_init(void);
>>   void kvm_clear_hyp_idmap(void);
>>   +#define kvm_pfn_pte(pfn, prot)pfn_pte(pfn, prot)
>> +#define kvm_pfn_pmd(pfn, prot)  pfn_pmd(pfn, prot)
>> +
>> +#define kvm_pmd_mkhuge(pmd) pmd_mkhuge(pmd)
>> +
>>   static inline void kvm_set_pmd(pmd_t *pmd, pmd_t new_pmd)
>>   {
>>  *pmd = new_pmd;
>> diff --git a/arch/arm64/include/asm/kvm_mmu.h 
>> b/arch/arm64/include/asm/kvm_mmu.h
>> index 082110993647..d962508ce4b3 100644
>> --- a/arch/arm64/include/asm/kvm_mmu.h
>> +++ b/arch/arm64/include/asm/kvm_mmu.h
>> @@ -173,6 +173,11 @@ void kvm_clear_hyp_idmap(void);
>>   #definekvm_set_pte(ptep, pte)  set_pte(ptep, pte)
>>   #definekvm_set_pmd(pmdp, pmd)  set_pmd(pmdp, pmd)
>>   +#define kvm_pfn_pte(pfn, prot)pfn_pte(pfn, prot)
>> +#define kvm_pfn_pmd(pfn, prot)  pfn_pmd(pfn, prot)
>> +
>> +#define kvm_pmd_mkhuge(pmd) pmd_mkhuge(pmd)
>> +
>>   static inline pte_t kvm_s2pte_mkwrite(pte_t pte)
>>   {
>>  pte_val(pte) |= PTE_S2_RDWR;
>> diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
>> index 686fc6a4b866..74750236f445 100644
>> --- a/virt/kvm/arm/mmu.c
>> +++ b/virt/kvm/arm/mmu.c
>> @@ -1554,8 +1554,9 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, 
>> phys_addr_t fault_ipa,
>>  invalidate_icache_guest_page(pfn, vma_pagesize);
>>  if (hugetlb) {
>> -pmd_t new_pmd = pfn_pmd(pfn, mem_type);
>> -new_pmd = pmd_mkhuge(new_pmd);
>> +pmd_t new_pmd = kvm_pfn_pmd(pfn, mem_type);
>> +
>> +new_pmd = kvm_pmd_mkhuge(new_pmd);
>>  if (writable)
>>  new_pmd = kvm_s2pmd_mkwrite(new_pmd);
>>   @@ -1564,7 +1565,7 @@ static int user_mem_abort(struct kvm_vcpu
>> *vcpu, phys_addr_t fault_ipa,
>>  ret = stage2_set_pmd_huge(kvm, memcache, fault_ipa,
>> &new_pmd);
>>  } else {
>> -pte_t new_pte = pfn_pte(pfn, mem_type);
>> +pte_t new_pte = kvm_pfn_pte(pfn, mem_type);
>>  if (writable) {
>>  new_pte = kvm_s2pte_mkwrite(new_pte);
>>

