[PATCH v4 2/2] memblock: make memblock_find_in_range method private
From: Mike Rapoport There are a lot of uses of memblock_find_in_range() along with memblock_reserve() from the times memblock allocation APIs did not exist. memblock_find_in_range() is the very core of memblock allocations, so any future changes to its internal behaviour would mandate updates of all the users outside memblock. Replace the calls to memblock_find_in_range() with an equivalent calls to memblock_phys_alloc() and memblock_phys_alloc_range() and make memblock_find_in_range() private method of memblock. This simplifies the callers, ensures that (unlikely) errors in memblock_reserve() are handled and improves maintainability of memblock_find_in_range(). Signed-off-by: Mike Rapoport Acked-by: Kirill A. Shutemov Acked-by: Rafael J. Wysocki Acked-by: Russell King (Oracle) Acked-by: Nick Kossifidis Reviewed-by: Catalin Marinas --- arch/arm/kernel/setup.c | 20 +- arch/arm64/kvm/hyp/reserved_mem.c | 9 +++ arch/arm64/mm/init.c | 36 - arch/mips/kernel/setup.c | 14 +- arch/riscv/mm/init.c | 44 ++- arch/s390/kernel/setup.c | 10 --- arch/x86/kernel/aperture_64.c | 5 ++-- arch/x86/mm/init.c| 11 arch/x86/mm/numa.c| 5 ++-- arch/x86/mm/numa_emulation.c | 5 ++-- arch/x86/realmode/init.c | 2 +- drivers/acpi/tables.c | 5 ++-- drivers/base/arch_numa.c | 5 +--- drivers/of/of_reserved_mem.c | 12 ++--- include/linux/memblock.h | 2 -- mm/memblock.c | 2 +- 16 files changed, 71 insertions(+), 116 deletions(-) diff --git a/arch/arm/kernel/setup.c b/arch/arm/kernel/setup.c index f97eb2371672..284a80c0b6e1 100644 --- a/arch/arm/kernel/setup.c +++ b/arch/arm/kernel/setup.c @@ -1012,31 +1012,25 @@ static void __init reserve_crashkernel(void) unsigned long long lowmem_max = __pa(high_memory - 1) + 1; if (crash_max > lowmem_max) crash_max = lowmem_max; - crash_base = memblock_find_in_range(CRASH_ALIGN, crash_max, - crash_size, CRASH_ALIGN); + + crash_base = memblock_phys_alloc_range(crash_size, CRASH_ALIGN, + CRASH_ALIGN, crash_max); if (!crash_base) { pr_err("crashkernel reservation 
failed - No suitable area found.\n"); return; } } else { + unsigned long long crash_max = crash_base + crash_size; unsigned long long start; - start = memblock_find_in_range(crash_base, - crash_base + crash_size, - crash_size, SECTION_SIZE); - if (start != crash_base) { + start = memblock_phys_alloc_range(crash_size, SECTION_SIZE, + crash_base, crash_max); + if (!start) { pr_err("crashkernel reservation failed - memory is in use.\n"); return; } } - ret = memblock_reserve(crash_base, crash_size); - if (ret < 0) { - pr_warn("crashkernel reservation failed - memory is in use (0x%lx)\n", - (unsigned long)crash_base); - return; - } - pr_info("Reserving %ldMB of memory at %ldMB for crashkernel (System RAM: %ldMB)\n", (unsigned long)(crash_size >> 20), (unsigned long)(crash_base >> 20), diff --git a/arch/arm64/kvm/hyp/reserved_mem.c b/arch/arm64/kvm/hyp/reserved_mem.c index d654921dd09b..578670e3f608 100644 --- a/arch/arm64/kvm/hyp/reserved_mem.c +++ b/arch/arm64/kvm/hyp/reserved_mem.c @@ -92,12 +92,10 @@ void __init kvm_hyp_reserve(void) * this is unmapped from the host stage-2, and fallback to PAGE_SIZE. */ hyp_mem_size = hyp_mem_pages << PAGE_SHIFT; - hyp_mem_base = memblock_find_in_range(0, memblock_end_of_DRAM(), - ALIGN(hyp_mem_size, PMD_SIZE), - PMD_SIZE); + hyp_mem_base = memblock_phys_alloc(ALIGN(hyp_mem_size, PMD_SIZE), + PMD_SIZE); if (!hyp_mem_base) - hyp_mem_base = memblock_find_in_range(0, memblock_end_of_DRAM(), - hyp_mem_size, PAGE_SIZE); + hyp_mem_base = memblock_phys_alloc(hyp_mem_size, PAGE_SIZE); else hyp_mem_size = ALIGN(hyp_mem_size, PMD_SIZE); @@ -105,7 +103,6 @@ void __init kvm_hyp_reserve(void) kvm_err("Failed to reserve hyp memory\n"); return; } - me
[PATCH v4 1/2] x86/mm: memory_map_top_down: remove spurious reservation of upper 2M
From: Mike Rapoport The memory_map_top_down() function skips the upper 2M at the beginning and maps them at the end because "xen has big range in reserved near end of ram, skip it at first" It appears, though, that the root cause was that there was not enough memory in the range [min_pfn_mapped, max_pfn_mapped] to allocate page tables from that range in alloc_low_pages(), because min_pfn_mapped didn't reflect the actual minimal pfn that was already mapped but remained close to the end of the range being mapped by memory_map_top_down(). This happened because min_pfn_mapped is updated at every iteration of the loop in memory_map_top_down(), but there is another loop in init_range_memory_mapping() that maps several regions below the current min_pfn_mapped without updating this variable. Move the update of min_pfn_mapped to add_pfn_range_mapped() next to the update of max_pfn_mapped, so that every time a new range is mapped both limits are updated accordingly, and remove the spurious "reservation" of the upper 2M. 
Signed-off-by: Mike Rapoport --- arch/x86/mm/init.c | 16 +--- 1 file changed, 5 insertions(+), 11 deletions(-) diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c index 75ef19aa8903..87150961fdca 100644 --- a/arch/x86/mm/init.c +++ b/arch/x86/mm/init.c @@ -486,6 +486,7 @@ static void add_pfn_range_mapped(unsigned long start_pfn, unsigned long end_pfn) nr_pfn_mapped = clean_sort_range(pfn_mapped, E820_MAX_ENTRIES); max_pfn_mapped = max(max_pfn_mapped, end_pfn); + min_pfn_mapped = min(min_pfn_mapped, start_pfn); if (start_pfn < (1UL<<(32-PAGE_SHIFT))) max_low_pfn_mapped = max(max_low_pfn_mapped, @@ -605,20 +606,14 @@ static unsigned long __init get_new_step_size(unsigned long step_size) static void __init memory_map_top_down(unsigned long map_start, unsigned long map_end) { - unsigned long real_end, last_start; - unsigned long step_size; - unsigned long addr; + unsigned long real_end = ALIGN_DOWN(map_end, PMD_SIZE); + unsigned long last_start = real_end; + /* step_size need to be small so pgt_buf from BRK could cover it */ + unsigned long step_size = PMD_SIZE; unsigned long mapped_ram_size = 0; - /* xen has big range in reserved near end of ram, skip it at first.*/ - addr = memblock_find_in_range(map_start, map_end, PMD_SIZE, PMD_SIZE); - real_end = addr + PMD_SIZE; - - /* step_size need to be small so pgt_buf from BRK could cover it */ - step_size = PMD_SIZE; max_pfn_mapped = 0; /* will get exact value next */ min_pfn_mapped = real_end >> PAGE_SHIFT; - last_start = real_end; /* * We start from the top (end of memory) and go to the bottom. @@ -638,7 +633,6 @@ static void __init memory_map_top_down(unsigned long map_start, mapped_ram_size += init_range_memory_mapping(start, last_start); last_start = start; - min_pfn_mapped = last_start >> PAGE_SHIFT; if (mapped_ram_size >= step_size) step_size = get_new_step_size(step_size); } -- 2.28.0 ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
[PATCH v4 0/2] memblock: make memblock_find_in_range method private
From: Mike Rapoport Hi, This is v4 of "memblock: make memblock_find_in_range method private" patch that essentially replaces memblock_find_in_range() + memblock_reserve() calls with equivalent calls to memblock_phys_alloc() and prevents usage of memblock_find_in_range() outside memblock itself. The patch uncovered an issue with top down memory mapping on x86 and this version has a preparation patch that addresses this issue. Guenter, I didn't add your Tested-by because the patch that addresses the crashes differs from the one you've tested. v4: * Add patch that prevents the crashes reported by Guenter Roeck on x86/i386 on QEMU with 256M or 512M of memory and EFI boot enabled. * Add Acked-by and Reviewed-by, thanks everybody! v3: https://lore.kernel.org/lkml/20210803064218.6611-1-r...@kernel.org * simplify check for exact crash kernel allocation on arm, per Rob * make crash_max unsigned long long on arm64, per Rob v2: https://lore.kernel.org/lkml/20210802063737.22733-1-r...@kernel.org * don't change error message in arm::reserve_crashkernel(), per Russell v1: https://lore.kernel.org/lkml/20210730104039.7047-1-r...@kernel.org Mike Rapoport (2): x86/mm: memory_map_top_down: remove spurious reservation of upper 2M memblock: make memblock_find_in_range method private arch/arm/kernel/setup.c | 20 +- arch/arm64/kvm/hyp/reserved_mem.c | 9 +++ arch/arm64/mm/init.c | 36 - arch/mips/kernel/setup.c | 14 +- arch/riscv/mm/init.c | 44 ++- arch/s390/kernel/setup.c | 10 --- arch/x86/kernel/aperture_64.c | 5 ++-- arch/x86/mm/init.c| 27 +++ arch/x86/mm/numa.c| 5 ++-- arch/x86/mm/numa_emulation.c | 5 ++-- arch/x86/realmode/init.c | 2 +- drivers/acpi/tables.c | 5 ++-- drivers/base/arch_numa.c | 5 +--- drivers/of/of_reserved_mem.c | 12 ++--- include/linux/memblock.h | 2 -- mm/memblock.c | 2 +- 16 files changed, 76 insertions(+), 127 deletions(-) base-commit: ff1176468d368232b684f75e82563369208bc371 -- 2.28.0
[PATCH V2 5/5] KVM: arm64: Drop unused REQUIRES_VIRT
This seems to be a residue from the past: REQUIRES_VIRT is no longer available, so it can simply be dropped along with the related code section. Cc: Marc Zyngier Cc: James Morse Cc: Alexandru Elisei Cc: Suzuki K Poulose Cc: Catalin Marinas Cc: Will Deacon Cc: linux-arm-ker...@lists.infradead.org Cc: kvmarm@lists.cs.columbia.edu Cc: linux-ker...@vger.kernel.org Signed-off-by: Anshuman Khandual --- arch/arm64/kvm/arm.c | 4 1 file changed, 4 deletions(-) diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c index ee9b1166f330..d779a29c5607 100644 --- a/arch/arm64/kvm/arm.c +++ b/arch/arm64/kvm/arm.c @@ -42,10 +42,6 @@ #include #include -#ifdef REQUIRES_VIRT -__asm__(".arch_extension virt"); -#endif - static enum kvm_mode kvm_mode = KVM_MODE_DEFAULT; DEFINE_STATIC_KEY_FALSE(kvm_protected_mode_initialized); -- 2.20.1
[PATCH V2 4/5] KVM: arm64: Drop check_kvm_target_cpu() based percpu probe
kvm_target_cpu() never returns a negative error code, so check_kvm_target() would never have 'ret' filled with a negative error code. Hence the percpu probe via check_kvm_target_cpu() does not make sense as its never going to find an unsupported CPU, forcing kvm_arch_init() to exit early. Hence lets just drop this percpu probe (and also check_kvm_target_cpu()) altogether. While here, this also changes kvm_target_cpu() return type to a u32, making it explicit that an error code will not be returned from this function. Cc: Marc Zyngier Cc: James Morse Cc: Alexandru Elisei Cc: Suzuki K Poulose Cc: Catalin Marinas Cc: Will Deacon Cc: linux-arm-ker...@lists.infradead.org Cc: kvmarm@lists.cs.columbia.edu Cc: linux-ker...@vger.kernel.org Acked-by: Will Deacon Signed-off-by: Anshuman Khandual --- arch/arm64/include/asm/kvm_host.h | 2 +- arch/arm64/kvm/arm.c | 16 +--- arch/arm64/kvm/guest.c| 4 ++-- 3 files changed, 4 insertions(+), 18 deletions(-) diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h index 41911585ae0c..26bbd84a02de 100644 --- a/arch/arm64/include/asm/kvm_host.h +++ b/arch/arm64/include/asm/kvm_host.h @@ -66,7 +66,7 @@ DECLARE_STATIC_KEY_FALSE(userspace_irqchip_in_use); extern unsigned int kvm_sve_max_vl; int kvm_arm_init_sve(void); -int __attribute_const__ kvm_target_cpu(void); +u32 __attribute_const__ kvm_target_cpu(void); int kvm_reset_vcpu(struct kvm_vcpu *vcpu); void kvm_arm_vcpu_destroy(struct kvm_vcpu *vcpu); diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c index 19560e457c11..ee9b1166f330 100644 --- a/arch/arm64/kvm/arm.c +++ b/arch/arm64/kvm/arm.c @@ -1035,7 +1035,7 @@ static int kvm_vcpu_set_target(struct kvm_vcpu *vcpu, const struct kvm_vcpu_init *init) { unsigned int i, ret; - int phys_target = kvm_target_cpu(); + u32 phys_target = kvm_target_cpu(); if (init->target != phys_target) return -EINVAL; @@ -2010,11 +2010,6 @@ static int finalize_hyp_mode(void) return 0; } -static void check_kvm_target_cpu(void 
*ret) -{ - *(int *)ret = kvm_target_cpu(); -} - struct kvm_vcpu *kvm_mpidr_to_vcpu(struct kvm *kvm, unsigned long mpidr) { struct kvm_vcpu *vcpu; @@ -2074,7 +2069,6 @@ void kvm_arch_irq_bypass_start(struct irq_bypass_consumer *cons) int kvm_arch_init(void *opaque) { int err; - int ret, cpu; bool in_hyp_mode; if (!is_hyp_mode_available()) { @@ -2089,14 +2083,6 @@ int kvm_arch_init(void *opaque) kvm_info("Guests without required CPU erratum workarounds can deadlock system!\n" \ "Only trusted guests should be used on this system.\n"); - for_each_online_cpu(cpu) { - smp_call_function_single(cpu, check_kvm_target_cpu, &ret, 1); - if (ret < 0) { - kvm_err("Error, CPU %d not supported!\n", cpu); - return -ENODEV; - } - } - err = kvm_set_ipa_limit(); if (err) return err; diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c index 1dfb83578277..8ce850fa7b49 100644 --- a/arch/arm64/kvm/guest.c +++ b/arch/arm64/kvm/guest.c @@ -842,7 +842,7 @@ int __kvm_arm_vcpu_set_events(struct kvm_vcpu *vcpu, return 0; } -int __attribute_const__ kvm_target_cpu(void) +u32 __attribute_const__ kvm_target_cpu(void) { unsigned long implementor = read_cpuid_implementor(); unsigned long part_number = read_cpuid_part_number(); @@ -874,7 +874,7 @@ int __attribute_const__ kvm_target_cpu(void) int kvm_vcpu_preferred_target(struct kvm_vcpu_init *init) { - int target = kvm_target_cpu(); + u32 target = kvm_target_cpu(); if (target < 0) return -ENODEV; -- 2.20.1 ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
[PATCH V2 3/5] KVM: arm64: Drop init_common_resources()
Drop the additional indirection via init_common_resources() by just calling kvm_set_ipa_limit() directly instead. Cc: Marc Zyngier Cc: James Morse Cc: Alexandru Elisei Cc: Suzuki K Poulose Cc: Catalin Marinas Cc: Will Deacon Cc: linux-arm-ker...@lists.infradead.org Cc: kvmarm@lists.cs.columbia.edu Cc: linux-ker...@vger.kernel.org Signed-off-by: Anshuman Khandual --- arch/arm64/kvm/arm.c | 7 +-- 1 file changed, 1 insertion(+), 6 deletions(-) diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c index e9a2b8f27792..19560e457c11 100644 --- a/arch/arm64/kvm/arm.c +++ b/arch/arm64/kvm/arm.c @@ -1696,11 +1696,6 @@ static bool init_psci_relay(void) return true; } -static int init_common_resources(void) -{ - return kvm_set_ipa_limit(); -} - static int init_subsystems(void) { int err = 0; @@ -2102,7 +2097,7 @@ int kvm_arch_init(void *opaque) } } - err = init_common_resources(); + err = kvm_set_ipa_limit(); if (err) return err; -- 2.20.1
[PATCH V2 2/5] KVM: arm64: Use ARM64_MIN_PARANGE_BITS as the minimum supported IPA
Drop the hard-coded value for the minimum supported IPA range bits (i.e. 32). Instead, use ARM64_MIN_PARANGE_BITS, which improves code readability. Cc: Marc Zyngier Cc: James Morse Cc: Alexandru Elisei Cc: Suzuki K Poulose Cc: Catalin Marinas Cc: Will Deacon Cc: linux-arm-ker...@lists.infradead.org Cc: kvmarm@lists.cs.columbia.edu Cc: linux-ker...@vger.kernel.org Signed-off-by: Anshuman Khandual --- arch/arm64/kvm/reset.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c index cba7872d69a8..acf309698f33 100644 --- a/arch/arm64/kvm/reset.c +++ b/arch/arm64/kvm/reset.c @@ -369,7 +369,7 @@ int kvm_arm_setup_stage2(struct kvm *kvm, unsigned long type) phys_shift = KVM_VM_TYPE_ARM_IPA_SIZE(type); if (phys_shift) { if (phys_shift > kvm_ipa_limit || - phys_shift < 32) + phys_shift < ARM64_MIN_PARANGE_BITS) return -EINVAL; } else { phys_shift = KVM_PHYS_SHIFT; -- 2.20.1
[PATCH V2 1/5] arm64/mm: Add remaining ID_AA64MMFR0_PARANGE_ macros
Currently there are macros only for 48 and 52 bits parange value extracted from the ID_AA64MMFR0.PARANGE field. This change completes the enumeration and updates the helper id_aa64mmfr0_parange_to_phys_shift(). While here it also defines ARM64_MIN_PARANGE_BITS as the absolute minimum shift value PA range which could be supported on a given platform. Cc: Marc Zyngier Cc: Catalin Marinas Cc: Will Deacon Cc: linux-arm-ker...@lists.infradead.org Cc: kvmarm@lists.cs.columbia.edu Cc: linux-ker...@vger.kernel.org Signed-off-by: Anshuman Khandual --- arch/arm64/include/asm/cpufeature.h | 14 +++--- arch/arm64/include/asm/sysreg.h | 7 +++ 2 files changed, 14 insertions(+), 7 deletions(-) diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h index 9bb9d11750d7..8633bdb21f33 100644 --- a/arch/arm64/include/asm/cpufeature.h +++ b/arch/arm64/include/asm/cpufeature.h @@ -781,13 +781,13 @@ extern int do_emulate_mrs(struct pt_regs *regs, u32 sys_reg, u32 rt); static inline u32 id_aa64mmfr0_parange_to_phys_shift(int parange) { switch (parange) { - case 0: return 32; - case 1: return 36; - case 2: return 40; - case 3: return 42; - case 4: return 44; - case 5: return 48; - case 6: return 52; + case ID_AA64MMFR0_PARANGE_32: return 32; + case ID_AA64MMFR0_PARANGE_36: return 36; + case ID_AA64MMFR0_PARANGE_40: return 40; + case ID_AA64MMFR0_PARANGE_42: return 42; + case ID_AA64MMFR0_PARANGE_44: return 44; + case ID_AA64MMFR0_PARANGE_48: return 48; + case ID_AA64MMFR0_PARANGE_52: return 52; /* * A future PE could use a value unknown to the kernel. 
* However, by the "D10.1.4 Principles of the ID scheme diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h index 7b9c3acba684..504e909129ea 100644 --- a/arch/arm64/include/asm/sysreg.h +++ b/arch/arm64/include/asm/sysreg.h @@ -853,9 +853,16 @@ #define ID_AA64MMFR0_TGRAN64_SUPPORTED 0x0 #define ID_AA64MMFR0_TGRAN16_NI0x0 #define ID_AA64MMFR0_TGRAN16_SUPPORTED 0x1 +#define ID_AA64MMFR0_PARANGE_320x0 +#define ID_AA64MMFR0_PARANGE_360x1 +#define ID_AA64MMFR0_PARANGE_400x2 +#define ID_AA64MMFR0_PARANGE_420x3 +#define ID_AA64MMFR0_PARANGE_440x4 #define ID_AA64MMFR0_PARANGE_480x5 #define ID_AA64MMFR0_PARANGE_520x6 +#define ARM64_MIN_PARANGE_BITS 32 + #define ID_AA64MMFR0_TGRAN_2_SUPPORTED_DEFAULT 0x0 #define ID_AA64MMFR0_TGRAN_2_SUPPORTED_NONE0x1 #define ID_AA64MMFR0_TGRAN_2_SUPPORTED_MIN 0x2 -- 2.20.1 ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
[PATCH V2 0/5] KVM: arm64: General cleanups
This series contains mostly unrelated general cleanups. This series applies on v5.14-rc5 and has been boot tested with different page sized guests. Changes in V2: - Dropped the first patch regarding PAGE_[SHIFT|SIZE] per Marc - Added remaining ID_AA64MMFR0_PARANGE_ macros and ARM64_MIN_PARANGE_BITS per Marc - Dropped memory and cycle reference from commit message in [PATCH 3/5] - Changed return value as u32 in kvm_target_cpu() per Will Changes in V1: https://lore.kernel.org/lkml/1628578961-29097-1-git-send-email-anshuman.khand...@arm.com/ Cc: Marc Zyngier Cc: James Morse Cc: Alexandru Elisei Cc: Suzuki K Poulose Cc: Catalin Marinas Cc: Will Deacon Cc: linux-arm-ker...@lists.infradead.org Cc: kvmarm@lists.cs.columbia.edu Cc: linux-ker...@vger.kernel.org Anshuman Khandual (5): arm64/mm: Add remaining ID_AA64MMFR0_PARANGE_ macros KVM: arm64: Use ARM64_MIN_PARANGE_BITS as the minimum supported IPA KVM: arm64: Drop init_common_resources() KVM: arm64: Drop check_kvm_target_cpu() based percpu probe KVM: arm64: Drop unused REQUIRES_VIRT arch/arm64/include/asm/cpufeature.h | 14 +++--- arch/arm64/include/asm/kvm_host.h | 2 +- arch/arm64/include/asm/sysreg.h | 7 +++ arch/arm64/kvm/arm.c| 27 ++- arch/arm64/kvm/guest.c | 4 ++-- arch/arm64/kvm/reset.c | 2 +- 6 files changed, 20 insertions(+), 36 deletions(-) -- 2.20.1 ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
Re: [PATCH v6 00/21] KVM: Add idempotent controls for migrating system counter state
On Wed, Aug 11, 2021 at 6:05 AM Paolo Bonzini wrote: > > On 04/08/21 10:57, Oliver Upton wrote: > > KVM's current means of saving/restoring system counters is plagued with > > temporal issues. At least on ARM64 and x86, we migrate the guest's > > system counter by-value through the respective guest system register > > values (cntvct_el0, ia32_tsc). Restoring system counters by-value is > > brittle as the state is not idempotent: the host system counter is still > > oscillating between the attempted save and restore. Furthermore, VMMs > > may wish to transparently live migrate guest VMs, meaning that they > > include the elapsed time due to live migration blackout in the guest > > system counter view. The VMM thread could be preempted for any number of > > reasons (scheduler, L0 hypervisor under nested) between the time that > > it calculates the desired guest counter value and when KVM actually sets > > this counter state. > > > > Despite the value-based interface that we present to userspace, KVM > > actually has idempotent guest controls by way of system counter offsets. > > We can avoid all of the issues associated with a value-based interface > > by abstracting these offset controls in new ioctls. This series > > introduces new vCPU device attributes to provide userspace access to the > > vCPU's system counter offset. > > > > Patch 1 addresses a possible race in KVM_GET_CLOCK where > > use_master_clock is read outside of the pvclock_gtod_sync_lock. > > > > Patch 2 adopts Paolo's suggestion, augmenting the KVM_{GET,SET}_CLOCK > > ioctls to provide userspace with a (host_tsc, realtime) instant. This is > > essential for a VMM to perform precise migration of the guest's system > > counters. > > > > Patches 3-4 are some preparatory changes for exposing the TSC offset to > > userspace. Patch 5 provides a vCPU attribute to provide userspace access > > to the TSC offset. > > > > Patches 6-7 implement a test for the new additions to > > KVM_{GET,SET}_CLOCK. 
> > > > Patch 8 fixes some assertions in the kvm device attribute helpers. > > > > Patches 9-10 implement at test for the tsc offset attribute introduced in > > patch 5. > > The x86 parts look good, except that patch 3 is a bit redundant with my > idea of altogether getting rid of the pvclock_gtod_sync_lock. That said > I agree that patches 1 and 2 (and extracting kvm_vm_ioctl_get_clock and > kvm_vm_ioctl_set_clock) should be done before whatever locking changes > have to be done. Following up on patch 3. > Time is ticking for 5.15 due to my vacation, I'll see if I have some > time to look at it further next week. > > I agree that arm64 can be done separately from x86. Marc, just a disclaimer: I'm going to separate these two series, although there will still exist dependencies in the selftests changes. Otherwise, kernel changes are disjoint. -- Thanks, Oliver ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
[PATCH] KVM: arm64: Return -EPERM from __pkvm_host_share_hyp()
Fix the error code returned by __pkvm_host_share_hyp() when the host attempts to share with EL2 a page that has already been shared with another entity. Reported-by: Will Deacon Signed-off-by: Quentin Perret --- This patch fixes a bug introduced in the stage-2 ownership series which is already queued in kvmarm/next: https://lore.kernel.org/lkml/20210809152448.1810400-1-qper...@google.com/ I've stared at this code for hours, but that clearly was not enough. Oh well ... --- arch/arm64/kvm/hyp/nvhe/mem_protect.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/arm64/kvm/hyp/nvhe/mem_protect.c b/arch/arm64/kvm/hyp/nvhe/mem_protect.c index 8165390d3ec9..6ec695311498 100644 --- a/arch/arm64/kvm/hyp/nvhe/mem_protect.c +++ b/arch/arm64/kvm/hyp/nvhe/mem_protect.c @@ -404,7 +404,7 @@ int __pkvm_host_share_hyp(u64 pfn) cur = kvm_pgtable_hyp_pte_prot(pte); prot = pkvm_mkstate(PAGE_HYP, PKVM_PAGE_SHARED_BORROWED); if (!check_prot(cur, prot, ~prot)) - ret = EPERM; + ret = -EPERM; goto unlock; map_shared: -- 2.32.0.605.g8dce9f2422-goog
Re: [PATCH v6 00/21] KVM: Add idempotent controls for migrating system counter state
On Wed, 11 Aug 2021 19:56:22 +0100, Oliver Upton wrote: [...] > > Time is ticking for 5.15 due to my vacation, I'll see if I have some > > time to look at it further next week. > > > > I agree that arm64 can be done separately from x86. > > Marc, just a disclaimer: > > I'm going to separate these two series, although there will still > exist dependencies in the selftests changes. Otherwise, kernel changes > are disjoint. No problem. The selftests can even sit in a third series if that makes it easier. Thanks, M. -- Without deviation from the norm, progress is not possible.
Re: [PATCH] KVM: arm64: Return -EPERM from __pkvm_host_share_hyp()
On Wed, 11 Aug 2021 18:36:25 +0100, Quentin Perret wrote: > Fix the error code returned by __pkvm_host_share_hyp() when the > host attempts to share with EL2 a page that has already been shared with > another entity. Applied to next, thanks! [1/1] KVM: arm64: Return -EPERM from __pkvm_host_share_hyp() commit: 12593568d7319c34c72038ea799ab1bd0f0eb01c Cheers, M. -- Without deviation from the norm, progress is not possible.
Re: [PATCH v6 13/21] KVM: arm64: Allow userspace to configure a vCPU's virtual offset
On Tue, 10 Aug 2021 10:44:01 +0100, Oliver Upton wrote: > > On Tue, Aug 10, 2021 at 2:35 AM Marc Zyngier wrote: > > > > On Wed, 04 Aug 2021 09:58:11 +0100, > > Oliver Upton wrote: > > > > > > Allow userspace to access the guest's virtual counter-timer offset > > > through the ONE_REG interface. The value read or written is defined to > > > be an offset from the guest's physical counter-timer. Add some > > > documentation to clarify how a VMM should use this and the existing > > > CNTVCT_EL0. > > > > > > Signed-off-by: Oliver Upton > > > --- > > > Documentation/virt/kvm/api.rst| 10 ++ > > > arch/arm64/include/uapi/asm/kvm.h | 1 + > > > arch/arm64/kvm/arch_timer.c | 11 +++ > > > arch/arm64/kvm/guest.c| 6 +- > > > include/kvm/arm_arch_timer.h | 1 + > > > 5 files changed, 28 insertions(+), 1 deletion(-) > > > > > > diff --git a/Documentation/virt/kvm/api.rst > > > b/Documentation/virt/kvm/api.rst > > > index 8d4a3471ad9e..28a65dc89985 100644 > > > --- a/Documentation/virt/kvm/api.rst > > > +++ b/Documentation/virt/kvm/api.rst > > > @@ -2487,6 +2487,16 @@ arm64 system registers have the following id bit > > > patterns:: > > > derived from the register encoding for CNTV_CVAL_EL0. As this is > > > API, it must remain this way. > > > > > > +.. warning:: > > > + > > > + The value of KVM_REG_ARM_TIMER_OFFSET is defined as an offset from > > > + the guest's view of the physical counter-timer. > > > + > > > + Userspace should use either KVM_REG_ARM_TIMER_OFFSET or > > > + KVM_REG_ARM_TIMER_CVAL to pause and resume a guest's virtual > > > > You probably mean KVM_REG_ARM_TIMER_CNT here, despite the broken > > encoding. > > Indeed I do! > > > > > > + counter-timer. Mixed use of these registers could result in an > > > + unpredictable guest counter value. 
> > > + > > > arm64 firmware pseudo-registers have the following bit pattern:: > > > > > >0x6030 0014 > > > diff --git a/arch/arm64/include/uapi/asm/kvm.h > > > b/arch/arm64/include/uapi/asm/kvm.h > > > index b3edde68bc3e..949a31bc10f0 100644 > > > --- a/arch/arm64/include/uapi/asm/kvm.h > > > +++ b/arch/arm64/include/uapi/asm/kvm.h > > > @@ -255,6 +255,7 @@ struct kvm_arm_copy_mte_tags { > > > #define KVM_REG_ARM_TIMER_CTLARM64_SYS_REG(3, 3, 14, 3, > > > 1) > > > #define KVM_REG_ARM_TIMER_CVAL ARM64_SYS_REG(3, 3, 14, 0, > > > 2) > > > #define KVM_REG_ARM_TIMER_CNTARM64_SYS_REG(3, 3, 14, 3, > > > 2) > > > +#define KVM_REG_ARM_TIMER_OFFSET ARM64_SYS_REG(3, 4, 14, 0, 3) > > > > I don't think we can use the encoding for CNTPOFF_EL2 here, as it will > > eventually clash with a NV guest using the same feature for its own > > purpose. We don't want this offset to overlap with any of the existing > > features. > > > > I actually liked your previous proposal of controlling the physical > > offset via a device property, as it clearly indicated that you were > > dealing with non-architectural state. > > That's actually exactly what I did here :) That said, the macro name > is horribly obfuscated from CNTVOFF_EL2. I did this for the sake of > symmetry with other virtual counter-timer registers above, though this > may warrant special casing given the fact that we have a similarly > named device attribute to handle the physical offset. Gah, you are of course right. Ignore my rambling. The name is fine (or at least in keeping with existing quality level of the making). For the physical offset, something along the lines of KVM_ARM_VCPU_TIMER_PHYS_OFFSET is probably right (but feel free to be creative, I'm terrible at this stuff [1]). Thanks, M. [1] https://twitter.com/codinghorror/status/506010907021828096?lang=en -- Without deviation from the norm, progress is not possible. ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
Re: [PATCH V2] KVM: arm64: Restrict IPA size to maximum 48 bits on 4K and 16K page size
On Wed, 11 Aug 2021 16:41:15 +0530, Anshuman Khandual wrote: > Even though ID_AA64MMFR0.PARANGE reports 52 bit PA size support, it cannot > be enabled as guest IPA size on 4K or 16K page size configurations. Hence > kvm_ipa_limit must be restricted to 48 bits. This change achieves required > IPA capping. > > Before the commit c9b69a0cf0b4 ("KVM: arm64: Don't constrain maximum IPA > size based on host configuration"), the problem here would have been just > latent via PHYS_MASK_SHIFT (which earlier in turn capped kvm_ipa_limit), > which remains capped at 48 bits on 4K and 16K configs. Applied to next, thanks! [1/1] KVM: arm64: Restrict IPA size to maximum 48 bits on 4K and 16K page size commit: 5e5df9571c319fb107d7a523cc96fcc99961ee70 Cheers, M. -- Without deviation from the norm, progress is not possible.
Re: [PATCH v6 00/21] KVM: Add idempotent controls for migrating system counter state
On 04/08/21 10:57, Oliver Upton wrote: KVM's current means of saving/restoring system counters is plagued with temporal issues. At least on ARM64 and x86, we migrate the guest's system counter by-value through the respective guest system register values (cntvct_el0, ia32_tsc). Restoring system counters by-value is brittle as the state is not idempotent: the host system counter is still oscillating between the attempted save and restore. Furthermore, VMMs may wish to transparently live migrate guest VMs, meaning that they include the elapsed time due to live migration blackout in the guest system counter view. The VMM thread could be preempted for any number of reasons (scheduler, L0 hypervisor under nested) between the time that it calculates the desired guest counter value and when KVM actually sets this counter state. Despite the value-based interface that we present to userspace, KVM actually has idempotent guest controls by way of system counter offsets. We can avoid all of the issues associated with a value-based interface by abstracting these offset controls in new ioctls. This series introduces new vCPU device attributes to provide userspace access to the vCPU's system counter offset. Patch 1 addresses a possible race in KVM_GET_CLOCK where use_master_clock is read outside of the pvclock_gtod_sync_lock. Patch 2 adopts Paolo's suggestion, augmenting the KVM_{GET,SET}_CLOCK ioctls to provide userspace with a (host_tsc, realtime) instant. This is essential for a VMM to perform precise migration of the guest's system counters. Patches 3-4 are some preparatory changes for exposing the TSC offset to userspace. Patch 5 provides a vCPU attribute to provide userspace access to the TSC offset. Patches 6-7 implement a test for the new additions to KVM_{GET,SET}_CLOCK. Patch 8 fixes some assertions in the kvm device attribute helpers. Patches 9-10 implement at test for the tsc offset attribute introduced in patch 5. 
The x86 parts look good, except that patch 3 is a bit redundant with my idea of altogether getting rid of the pvclock_gtod_sync_lock. That said I agree that patches 1 and 2 (and extracting kvm_vm_ioctl_get_clock and kvm_vm_ioctl_set_clock) should be done before whatever locking changes have to be done. Time is ticking for 5.15 due to my vacation, I'll see if I have some time to look at it further next week. I agree that arm64 can be done separately from x86. Paolo ___ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
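The brittleness of value-based restore that the cover letter describes can be illustrated with a standalone toy model (all names below are illustrative, not KVM's actual interface; the real series exposes the offset through a vCPU device attribute):

```c
#include <assert.h>
#include <stdint.h>

/* Toy model: the guest counter always reads as host counter + per-vCPU offset. */
struct vcpu {
	int64_t counter_offset;
};

static int64_t guest_counter(const struct vcpu *v, int64_t host_counter)
{
	return host_counter + v->counter_offset;
}

/* Value-based restore: derive a fresh offset from a sampled guest value.
 * The host counter keeps running between save and restore, so the elapsed
 * interval is silently dropped from the guest's view of time. */
static void restore_by_value(struct vcpu *v, int64_t saved_guest_value,
			     int64_t host_counter_now)
{
	v->counter_offset = saved_guest_value - host_counter_now;
}

/* Offset-based restore: idempotent, independent of when it runs. */
static void restore_by_offset(struct vcpu *v, int64_t saved_offset)
{
	v->counter_offset = saved_offset;
}
```

Restoring by value always reproduces the sampled guest value at restore time, which is exactly the problem: any host-counter movement between save and restore is lost, while restoring the offset is exact no matter when it happens.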
Re: [PATCH v6 01/21] KVM: x86: Fix potential race in KVM_GET_CLOCK
On 04/08/21 10:57, Oliver Upton wrote: Sean noticed that KVM_GET_CLOCK was checking kvm_arch.use_master_clock outside of the pvclock sync lock. This is problematic, as the clock value written to the user may or may not actually correspond to a stable TSC. Fix the race by populating the entire kvm_clock_data structure behind the pvclock_gtod_sync_lock. Suggested-by: Sean Christopherson Signed-off-by: Oliver Upton --- arch/x86/kvm/x86.c | 39 --- 1 file changed, 28 insertions(+), 11 deletions(-) I had a completely independent patch that fixed the same race. It unifies the read sides of tsc_write_lock and pvclock_gtod_sync_lock into a seqcount (and replaces pvclock_gtod_sync_lock with kvm->lock on the write side). I attach it now (based on https://lore.kernel.org/kvm/20210811102356.3406687-1-pbonz...@redhat.com/T/#t), but the testing was extremely light so I'm not sure I will be able to include it in 5.15. Paolo -- 8< - From: Paolo Bonzini Date: Thu, 8 Apr 2021 05:03:44 -0400 Subject: [PATCH] kvm: x86: protect masterclock with a seqcount Protect the reference point for kvmclock with a seqcount, so that kvmclock updates for all vCPUs can proceed in parallel. Xen runstate updates will also run in parallel and not bounce the kvmclock cacheline. This also makes it possible to use KVM_REQ_CLOCK_UPDATE (which will block on the seqcount) to prevent entering in the guests until pvclock_update_vm_gtod_copy is complete, and thus to get rid of KVM_REQ_MCLOCK_INPROGRESS. nr_vcpus_matched_tsc is updated outside pvclock_update_vm_gtod_copy though, so a spinlock must be kept for that one. Signed-off-by: Paolo Bonzini diff --git a/Documentation/virt/kvm/locking.rst b/Documentation/virt/kvm/locking.rst index 8138201efb09..ed41da172e02 100644 --- a/Documentation/virt/kvm/locking.rst +++ b/Documentation/virt/kvm/locking.rst @@ -29,6 +29,8 @@ The acquisition orders for mutexes are as follows: On x86: +- the seqcount kvm->arch.pvclock_sc is written under kvm->lock. 
+ - vcpu->mutex is taken outside kvm->arch.hyperv.hv_lock - kvm->arch.mmu_lock is an rwlock. kvm->arch.tdp_mmu_pages_lock is diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index 20daaf67a5bf..6889aab92362 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -68,8 +68,7 @@ #define KVM_REQ_PMIKVM_ARCH_REQ(11) #define KVM_REQ_SMIKVM_ARCH_REQ(12) #define KVM_REQ_MASTERCLOCK_UPDATE KVM_ARCH_REQ(13) -#define KVM_REQ_MCLOCK_INPROGRESS \ - KVM_ARCH_REQ_FLAGS(14, KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP) +/* 14 unused */ #define KVM_REQ_SCAN_IOAPIC \ KVM_ARCH_REQ_FLAGS(15, KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP) #define KVM_REQ_GLOBAL_CLOCK_UPDATEKVM_ARCH_REQ(16) @@ -1067,6 +1066,11 @@ struct kvm_arch { unsigned long irq_sources_bitmap; s64 kvmclock_offset; + + /* +* This also protects nr_vcpus_matched_tsc which is read from a +* preemption-disabled region, so it must be a raw spinlock. +*/ raw_spinlock_t tsc_write_lock; u64 last_tsc_nsec; u64 last_tsc_write; @@ -1077,7 +1081,7 @@ struct kvm_arch { u64 cur_tsc_generation; int nr_vcpus_matched_tsc; - spinlock_t pvclock_gtod_sync_lock; + seqcount_mutex_t pvclock_sc; bool use_master_clock; u64 master_kernel_ns; u64 master_cycle_now; diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 74145a3fd4f2..7d73c5560260 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -2533,9 +2533,7 @@ static void kvm_synchronize_tsc(struct kvm_vcpu *vcpu, u64 data) vcpu->arch.this_tsc_write = kvm->arch.cur_tsc_write; kvm_vcpu_write_tsc_offset(vcpu, offset); - raw_spin_unlock_irqrestore(&kvm->arch.tsc_write_lock, flags); - spin_lock_irqsave(&kvm->arch.pvclock_gtod_sync_lock, flags); if (!matched) { kvm->arch.nr_vcpus_matched_tsc = 0; } else if (!already_matched) { @@ -2543,7 +2541,7 @@ static void kvm_synchronize_tsc(struct kvm_vcpu *vcpu, u64 data) } kvm_track_tsc_matching(vcpu); - spin_unlock_irqrestore(&kvm->arch.pvclock_gtod_sync_lock, flags); + 
raw_spin_unlock_irqrestore(&kvm->arch.tsc_write_lock, flags); } static inline void adjust_tsc_offset_guest(struct kvm_vcpu *vcpu, @@ -2730,9 +2728,7 @@ static void pvclock_update_vm_gtod_copy(struct kvm *kvm) struct kvm_arch *ka = &kvm->arch; int vclock_mode; bool host_tsc_clocksource, vcpus_matched; - - vcpus_matched = (ka->nr_vcpus_matched_tsc + 1 == - atomic_read(&kvm->online_vcpus)); + unsigned long flags; /* * If the host uses TSC clock, then passthrough TSC as stable @@ -2742,9 +2738,14 @@ static void pvclock_update_vm_gtod_copy(struct kvm *kvm) &ka->ma
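The read/retry protocol that Paolo's seqcount conversion relies on can be sketched in a simplified, single-threaded form (this is only the protocol shape; the kernel's seqcount_mutex_t additionally serializes writers and issues memory barriers, which this model omits):

```c
#include <assert.h>
#include <stdint.h>

/* Simplified seqcount: an odd value means a writer is mid-update. */
struct seqcount {
	uint32_t seq;
};

static uint32_t read_begin(const struct seqcount *s)
{
	uint32_t seq;

	do {
		seq = s->seq;
	} while (seq & 1);	/* wait until no write is in progress */
	return seq;
}

static int read_retry(const struct seqcount *s, uint32_t start)
{
	return s->seq != start;	/* a write happened meanwhile: reread the data */
}

static void write_begin(struct seqcount *s) { s->seq++; }	/* even -> odd */
static void write_end(struct seqcount *s)   { s->seq++; }	/* odd -> even */
```

Readers never block each other, which is why converting pvclock_gtod_sync_lock to a seqcount lets kvmclock updates for all vCPUs proceed in parallel: a reader only loops back if a write to the masterclock reference point raced with its snapshot.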
Re: [PATCH v4 00/21] Track shared pages at EL2 in protected mode
On Mon, 9 Aug 2021 16:24:27 +0100, Quentin Perret wrote:
> This is v4 of the patch series previously posted here:
>
> https://lore.kernel.org/lkml/20210729132818.4091769-1-qper...@google.com/
>
> This series aims to improve how the nVHE hypervisor tracks ownership of
> memory pages when running in protected mode ("kvm-arm.mode=protected" on
> the kernel command line).
>
> [...]

Applied to next, thanks!

[01/21] KVM: arm64: Add hyp_spin_is_locked() for basic locking assertions at EL2
        commit: d21292f13f1f0721d60e8122e2db46bea8cf6950
[02/21] KVM: arm64: Introduce hyp_assert_lock_held()
        commit: 8e049e0daf23aa380c264e5e15e4c64ea5497ed7
[03/21] KVM: arm64: Provide the host_stage2_try() helper macro
        commit: 1bac49d490cbc813f407a5c9806e464bf4a300c9
[05/21] KVM: arm64: Expose page-table helpers
        commit: 51add457733bbc4a442fc280d73d14bfe262e4a0
[06/21] KVM: arm64: Optimize host memory aborts
        commit: c4f0935e4d957bfcea25ad76860445660a60f3fd
[07/21] KVM: arm64: Rename KVM_PTE_LEAF_ATTR_S2_IGNORED
        commit: 178cac08d588e7406a09351a992f57892d8d9cc9
[08/21] KVM: arm64: Don't overwrite software bits with owner id
        commit: 8a0282c68121e53ab17413283cfed408a47e1a2a
[09/21] KVM: arm64: Tolerate re-creating hyp mappings to set software bits
        commit: b53846c5f279cb5329b82f19a7d313f02cb9d21c
[10/21] KVM: arm64: Enable forcing page-level stage-2 mappings
        commit: 5651311941105ca077d3ab74dd4a92e646ecf7fb
[11/21] KVM: arm64: Allow populating software bits
        commit: 4505e9b624cefafa4b75d8a28e72f32076c33375
[12/21] KVM: arm64: Add helpers to tag shared pages in SW bits
        commit: ec250a67ea8db6209918a389554cf3aec0395b1f
[13/21] KVM: arm64: Expose host stage-2 manipulation helpers
        commit: 39257da0e04e5cdb1e4a3ca715dc3d949fe8b059
[14/21] KVM: arm64: Expose pkvm_hyp_id
        commit: 2d77e238badb022adb364332b7d6a1d627f77145
[15/21] KVM: arm64: Introduce addr_is_memory()
        commit: e009dce1292c37cf8ee7c33e0887ad3c642f980f
[16/21] KVM: arm64: Enable retrieving protections attributes of PTEs
        commit: 9024b3d0069ab4b8ef70cf55f0ee09e61f3a0747
[17/21] KVM: arm64: Mark host bss and rodata section as shared
        commit: 2c50166c62ba7f3c23c1bbdbb9324db462ddc97b
[18/21] KVM: arm64: Remove __pkvm_mark_hyp
        commit: ad0e0139a8e163245d8f44ab4f6ec3bc9b08034d
[19/21] KVM: arm64: Refactor protected nVHE stage-1 locking
        commit: f9370010e92638f66473baf342e19de940403362
[20/21] KVM: arm64: Restrict EL2 stage-1 changes in protected mode
        commit: 66c57edd3bc79e3527daaae8123f72ecd1e3fa25
[21/21] KVM: arm64: Make __pkvm_create_mappings static
        commit: 64a80fb766f9a91e26930bfc56d8e7c12425df12

Note that patch #4 has been used as the base for this series, and is already part of the mapping level rework.

Cheers,

M.
--
Without deviation from the norm, progress is not possible.
[PATCH V2] KVM: arm64: Restrict IPA size to maximum 48 bits on 4K and 16K page size
Even though ID_AA64MMFR0.PARANGE reports 52 bit PA size support, it cannot
be enabled as guest IPA size on 4K or 16K page size configurations. Hence
kvm_ipa_limit must be restricted to 48 bits. This change achieves required
IPA capping.

Before the commit c9b69a0cf0b4 ("KVM: arm64: Don't constrain maximum IPA
size based on host configuration"), the problem here would have been just
latent via PHYS_MASK_SHIFT (which earlier in turn capped kvm_ipa_limit),
which remains capped at 48 bits on 4K and 16K configs.

Cc: Marc Zyngier
Cc: James Morse
Cc: Alexandru Elisei
Cc: Suzuki K Poulose
Cc: Catalin Marinas
Cc: Will Deacon
Cc: linux-arm-ker...@lists.infradead.org
Cc: kvmarm@lists.cs.columbia.edu
Cc: linux-ker...@vger.kernel.org
Fixes: c9b69a0cf0b4 ("KVM: arm64: Don't constrain maximum IPA size based on host configuration")
Signed-off-by: Anshuman Khandual
---
This applies on v5.14-rc5

Changes in V2:

- Replaced IS_ENABLED() based check with PAGE_SIZE based one per Marc
- Moved the conditional code block near parange assignment per Marc

Changes in V1:

https://lore.kernel.org/lkml/1628657549-27584-1-git-send-email-anshuman.khand...@arm.com/

 arch/arm64/kvm/reset.c | 8
 1 file changed, 8 insertions(+)

diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
index cba7872..78d4bd8 100644
--- a/arch/arm64/kvm/reset.c
+++ b/arch/arm64/kvm/reset.c
@@ -317,6 +317,14 @@ int kvm_set_ipa_limit(void)
 	mmfr0 = read_sanitised_ftr_reg(SYS_ID_AA64MMFR0_EL1);
 	parange = cpuid_feature_extract_unsigned_field(mmfr0,
 				ID_AA64MMFR0_PARANGE_SHIFT);
+	/*
+	 * IPA size beyond 48 bits could not be supported
+	 * on either 4K or 16K page size. Hence let's cap
+	 * it to 48 bits, in case it's reported as larger
+	 * on the system.
+	 */
+	if (PAGE_SIZE != SZ_64K)
+		parange = min(parange, (unsigned int)ID_AA64MMFR0_PARANGE_48);
 
 	/*
 	 * Check with ARMv8.5-GTG that our PAGE_SIZE is supported at
-- 
2.7.4
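The effect of the cap can be modelled standalone (PARANGE encodings and the 48-bit encoding value 0x5 are from the architecture; the helper names below are illustrative, not the kernel's):

```c
#include <assert.h>

/* ID_AA64MMFR0_EL1.PARANGE encodings per the ARM ARM. */
static unsigned int parange_to_phys_shift(unsigned int parange)
{
	switch (parange) {
	case 0:  return 32;
	case 1:  return 36;
	case 2:  return 40;
	case 3:  return 42;
	case 4:  return 44;
	case 5:  return 48;	/* ID_AA64MMFR0_PARANGE_48 */
	case 6:  return 52;
	default: return 0;	/* reserved */
	}
}

/* The patch's cap, modelled standalone: only 64K pages may use 52-bit PAs. */
static unsigned int cap_parange(unsigned int parange, int is_64k_pages)
{
	if (!is_64k_pages && parange > 5)
		parange = 5;
	return parange;
}
```

So a system reporting the 52-bit encoding still ends up with a 48-bit kvm_ipa_limit on 4K and 16K configurations, while smaller reported values pass through untouched.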
Re: [PATCH] KVM: arm64: Restrict IPA size to maximum 48 bits on 4K and 16K page size
On 8/11/21 3:06 PM, Marc Zyngier wrote: > On Wed, 11 Aug 2021 05:52:29 +0100, > Anshuman Khandual wrote: >> >> Even though ID_AA64MMFR0.PARANGE reports 52 bit PA size support, it cannot >> be enabled as guest IPA size on 4K or 16K page size configurations. Hence >> kvm_ipa_limit must be restricted to 48 bits. This change achieves required >> IPA capping. >> >> Before the commit c9b69a0cf0b4 ("KVM: arm64: Don't constrain maximum IPA >> size based on host configuration"), the problem here would have been just >> latent via PHYS_MASK_SHIFT (which earlier in turn capped kvm_ipa_limit), >> which remains capped at 48 bits on 4K and 16K configs. >> >> Cc: Marc Zyngier >> Cc: James Morse >> Cc: Alexandru Elisei >> Cc: Suzuki K Poulose >> Cc: Catalin Marinas >> Cc: Will Deacon >> Cc: linux-arm-ker...@lists.infradead.org >> Cc: kvmarm@lists.cs.columbia.edu >> Cc: linux-ker...@vger.kernel.org >> Fixes: c9b69a0cf0b4 ("KVM: arm64: Don't constrain maximum IPA size based on >> host configuration") >> Signed-off-by: Anshuman Khandual >> --- >> This applies on v5.14-rc5 >> >> arch/arm64/kvm/reset.c | 9 + >> 1 file changed, 9 insertions(+) >> >> diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c >> index 20588220fe66..e66974c4b9d3 100644 >> --- a/arch/arm64/kvm/reset.c >> +++ b/arch/arm64/kvm/reset.c >> @@ -337,6 +337,15 @@ int kvm_set_ipa_limit(void) >> return -EINVAL; >> } >> >> +/* >> + * IPA size beyond 48 bits could not be supported >> + * on either 4K or 16K page size. Hence let's cap >> + * it to 48 bits, in case it's reported as larger >> + * on the system. >> + */ >> +if (!IS_ENABLED(CONFIG_ARM64_64K_PAGES)) > > As per our earlier discussion, please use (PAGE_SIZE != SZ_64K) > instead. This is in keeping with the rest of the file. Sure, will change. > >> +parange = min(parange, (unsigned int)ID_AA64MMFR0_PARANGE_48); >> + > > Also, please move it next to the point where we assign parange. Sure, will move. 
> >> 	kvm_ipa_limit = id_aa64mmfr0_parange_to_phys_shift(parange);
> >> 	kvm_info("IPA Size Limit: %d bits%s\n", kvm_ipa_limit,
> >> 		 ((kvm_ipa_limit < KVM_PHYS_SHIFT) ?
>
> Thanks,
>
> M.
Re: [PATCH] arm64/mm: Define ID_AA64MMFR0_TGRAN_2_SHIFT
On Tue, 10 Aug 2021 09:59:42 +0530, Anshuman Khandual wrote:
> Streamline the Stage-2 TGRAN value extraction from ID_AA64MMFR0 register by
> adding a page size agnostic ID_AA64MMFR0_TGRAN_2_SHIFT. This is similar to
> the existing Stage-1 TGRAN shift i.e ID_AA64MMFR0_TGRAN_SHIFT.

Applied to kvm-arm64/misc-5.15, thanks!

[1/1] arm64/mm: Define ID_AA64MMFR0_TGRAN_2_SHIFT
      commit: 9efb41493ddfb19c7b3d0a21d68be6279520144f

Cheers,

M.
--
Without deviation from the norm, progress is not possible.
Re: [PATCH 1/5] KVM: arm64: Drop direct PAGE_[SHIFT|SIZE] usage as page size
On Wed, 11 Aug 2021 10:37:36 +0100, Anshuman Khandual wrote: > > > > On 8/11/21 1:41 PM, Marc Zyngier wrote: > > On Wed, 11 Aug 2021 06:34:46 +0100, > > Anshuman Khandual wrote: > >> > >> > >> > >> On 8/10/21 7:03 PM, Marc Zyngier wrote: > >>> On 2021-08-10 08:02, Anshuman Khandual wrote: > All instances here could just directly test against > CONFIG_ARM64_XXK_PAGES > instead of evaluating via PAGE_SHIFT or PAGE_SIZE. With this change, > there > will be no such usage left. > > Cc: Marc Zyngier > Cc: James Morse > Cc: Alexandru Elisei > Cc: Suzuki K Poulose > Cc: Catalin Marinas > Cc: Will Deacon > Cc: linux-arm-ker...@lists.infradead.org > Cc: kvmarm@lists.cs.columbia.edu > Cc: linux-ker...@vger.kernel.org > Signed-off-by: Anshuman Khandual > --- > arch/arm64/kvm/hyp/pgtable.c | 6 +++--- > arch/arm64/mm/mmu.c | 2 +- > 2 files changed, 4 insertions(+), 4 deletions(-) > > diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c > index 05321f4165e3..a6112b6d6ef6 100644 > --- a/arch/arm64/kvm/hyp/pgtable.c > +++ b/arch/arm64/kvm/hyp/pgtable.c > @@ -85,7 +85,7 @@ static bool kvm_level_supports_block_mapping(u32 level) > * Reject invalid block mappings and don't bother with 4TB mappings > for > * 52-bit PAs. 
> */ > - return !(level == 0 || (PAGE_SIZE != SZ_4K && level == 1)); > + return !(level == 0 || (!IS_ENABLED(CONFIG_ARM64_4K_PAGES) && level > == 1)); > } > > static bool kvm_block_mapping_supported(u64 addr, u64 end, u64 phys, > u32 level) > @@ -155,7 +155,7 @@ static u64 kvm_pte_to_phys(kvm_pte_t pte) > { > u64 pa = pte & KVM_PTE_ADDR_MASK; > > - if (PAGE_SHIFT == 16) > + if (IS_ENABLED(CONFIG_ARM64_64K_PAGES)) > pa |= FIELD_GET(KVM_PTE_ADDR_51_48, pte) << 48; > > return pa; > @@ -165,7 +165,7 @@ static kvm_pte_t kvm_phys_to_pte(u64 pa) > { > kvm_pte_t pte = pa & KVM_PTE_ADDR_MASK; > > - if (PAGE_SHIFT == 16) > + if (IS_ENABLED(CONFIG_ARM64_64K_PAGES)) > pte |= FIELD_PREP(KVM_PTE_ADDR_51_48, pa >> 48); > > return pte; > diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c > index 9ff0de1b2b93..8fdfca179815 100644 > --- a/arch/arm64/mm/mmu.c > +++ b/arch/arm64/mm/mmu.c > @@ -296,7 +296,7 @@ static void alloc_init_cont_pmd(pud_t *pudp, > unsigned long addr, > static inline bool use_1G_block(unsigned long addr, unsigned long next, > unsigned long phys) > { > - if (PAGE_SHIFT != 12) > + if (!IS_ENABLED(CONFIG_ARM64_4K_PAGES)) > return false; > > if (((addr | next | phys) & ~PUD_MASK) != 0) > >>> > >>> I personally find it a lot less readable. > >>> > >>> Also, there is no evaluation whatsoever. All the code guarded > >>> by a PAGE_SIZE/PAGE_SHIFT that doesn't match the configuration > >>> is dropped at compile time. > >> > >> The primary idea here is to unify around IS_ENABLED(CONFIG_ARM64_XXK_PAGES) > >> usage in arm64, rather than having multiple methods to test page size when > >> ever required. > > > > I'm sorry, but I find the idiom extremely painful to parse. If you are > > Okay, it was not explained very well. My bad. > > > annoyed with the 'PAGE_SHIFT == 12/14/16', consider replacing it with > > 'PAGE_SIZE == SZ_4/16/64K' instead. > > Sure, understood. 
But the problem here is not with PAGE_SHIFT/PAGE_SIZE > based tests but rather having multiple ways of doing the same thing in > arm64 tree. Please find further explanation below. > > > > > IS_ENABLED(CONFIG_ARM64_XXK_PAGES) also gives the wrong impression > > that *multiple* page sizes can be selected at any given time. That's > > obviously not the case, which actually makes PAGE_SIZE a much better > > choice. > > PAGE_SHIFT and PAGE_SIZE are derived from CONFIG_ARM64_XXK_PAGES. Hence > why not just directly use the original user selected config option that > eventually decides PAGE_SHIFT and PAGE_SIZE. > > config ARM64_PAGE_SHIFT > int > default 16 if ARM64_64K_PAGES > default 14 if ARM64_16K_PAGES > default 12 > > arch/arm64/include/asm/page-def.h:#define PAGE_SHIFT CONFIG_ARM64_PAGE_SHIFT > arch/arm64/include/asm/page-def.h:#define PAGE_SIZE (_AC(1, UL) << > PAGE_SHIFT) I'm sorry, but that's only a build system artefact. PAGE_SIZE/SHIFT is what we use in the kernel at large, not IS_ENABLED(BLAH). It is short, to the point, and it is guaranteed to be what it says on the tin. If by some miracle you were going to enable multiple *simultaneous* page sizes support in the arm64 kernel, I'd certainly look at things differently. Thankfully, this isn't the case. > Also there are already similar IS_ENABLED() instance
Re: [PATCH] arm64/mm: Define ID_AA64MMFR0_TGRAN_2_SHIFT
On Tue, Aug 10, 2021 at 09:59:42AM +0530, Anshuman Khandual wrote:
> Streamline the Stage-2 TGRAN value extraction from ID_AA64MMFR0 register by
> adding a page size agnostic ID_AA64MMFR0_TGRAN_2_SHIFT. This is similar to
> the existing Stage-1 TGRAN shift i.e ID_AA64MMFR0_TGRAN_SHIFT.
>
> Cc: Catalin Marinas
> Cc: Will Deacon
> Cc: Marc Zyngier
> Cc: linux-arm-ker...@lists.infradead.org
> Cc: kvmarm@lists.cs.columbia.edu
> Cc: linux-ker...@vger.kernel.org
> Signed-off-by: Anshuman Khandual
> ---
> This applies on v5.14-rc5.
>
> arch/arm64/include/asm/sysreg.h | 3 +++
> arch/arm64/kvm/reset.c | 17 ++---
> 2 files changed, 5 insertions(+), 15 deletions(-)

Acked-by: Will Deacon

Will
Re: [PATCH V2] KVM: arm64: perf: Replace '0xf' instances with ID_AA64DFR0_PMUVER_IMP_DEF
On Wed, 11 Aug 2021 04:27:06 +0100, Anshuman Khandual wrote:
>
> ID_AA64DFR0_PMUVER_IMP_DEF which indicate implementation defined PMU, never
> actually gets used although there are '0xf' instances scattered all around.
> Just do the macro replacement to improve readability.
>
> Cc: Catalin Marinas
> Cc: Will Deacon
> Cc: Mark Rutland
> Cc: Peter Zijlstra
> Cc: Marc Zyngier
> Cc: linux-perf-us...@vger.kernel.org
> Cc: linux-arm-ker...@lists.infradead.org
> Cc: kvmarm@lists.cs.columbia.edu
> Cc: linux-ker...@vger.kernel.org
> Signed-off-by: Anshuman Khandual

For some reason I don't understand, b4 doesn't want to cherry-pick this
patch from the list. I've applied it manually to kvm-arm64/misc-5.15.

Thanks,

M.
--
Without deviation from the norm, progress is not possible.
Re: [PATCH 1/5] KVM: arm64: Drop direct PAGE_[SHIFT|SIZE] usage as page size
On 8/11/21 1:41 PM, Marc Zyngier wrote: > On Wed, 11 Aug 2021 06:34:46 +0100, > Anshuman Khandual wrote: >> >> >> >> On 8/10/21 7:03 PM, Marc Zyngier wrote: >>> On 2021-08-10 08:02, Anshuman Khandual wrote: All instances here could just directly test against CONFIG_ARM64_XXK_PAGES instead of evaluating via PAGE_SHIFT or PAGE_SIZE. With this change, there will be no such usage left. Cc: Marc Zyngier Cc: James Morse Cc: Alexandru Elisei Cc: Suzuki K Poulose Cc: Catalin Marinas Cc: Will Deacon Cc: linux-arm-ker...@lists.infradead.org Cc: kvmarm@lists.cs.columbia.edu Cc: linux-ker...@vger.kernel.org Signed-off-by: Anshuman Khandual --- arch/arm64/kvm/hyp/pgtable.c | 6 +++--- arch/arm64/mm/mmu.c | 2 +- 2 files changed, 4 insertions(+), 4 deletions(-) diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c index 05321f4165e3..a6112b6d6ef6 100644 --- a/arch/arm64/kvm/hyp/pgtable.c +++ b/arch/arm64/kvm/hyp/pgtable.c @@ -85,7 +85,7 @@ static bool kvm_level_supports_block_mapping(u32 level) * Reject invalid block mappings and don't bother with 4TB mappings for * 52-bit PAs. 
*/ - return !(level == 0 || (PAGE_SIZE != SZ_4K && level == 1)); + return !(level == 0 || (!IS_ENABLED(CONFIG_ARM64_4K_PAGES) && level == 1)); } static bool kvm_block_mapping_supported(u64 addr, u64 end, u64 phys, u32 level) @@ -155,7 +155,7 @@ static u64 kvm_pte_to_phys(kvm_pte_t pte) { u64 pa = pte & KVM_PTE_ADDR_MASK; - if (PAGE_SHIFT == 16) + if (IS_ENABLED(CONFIG_ARM64_64K_PAGES)) pa |= FIELD_GET(KVM_PTE_ADDR_51_48, pte) << 48; return pa; @@ -165,7 +165,7 @@ static kvm_pte_t kvm_phys_to_pte(u64 pa) { kvm_pte_t pte = pa & KVM_PTE_ADDR_MASK; - if (PAGE_SHIFT == 16) + if (IS_ENABLED(CONFIG_ARM64_64K_PAGES)) pte |= FIELD_PREP(KVM_PTE_ADDR_51_48, pa >> 48); return pte; diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c index 9ff0de1b2b93..8fdfca179815 100644 --- a/arch/arm64/mm/mmu.c +++ b/arch/arm64/mm/mmu.c @@ -296,7 +296,7 @@ static void alloc_init_cont_pmd(pud_t *pudp, unsigned long addr, static inline bool use_1G_block(unsigned long addr, unsigned long next, unsigned long phys) { - if (PAGE_SHIFT != 12) + if (!IS_ENABLED(CONFIG_ARM64_4K_PAGES)) return false; if (((addr | next | phys) & ~PUD_MASK) != 0) >>> >>> I personally find it a lot less readable. >>> >>> Also, there is no evaluation whatsoever. All the code guarded >>> by a PAGE_SIZE/PAGE_SHIFT that doesn't match the configuration >>> is dropped at compile time. >> >> The primary idea here is to unify around IS_ENABLED(CONFIG_ARM64_XXK_PAGES) >> usage in arm64, rather than having multiple methods to test page size when >> ever required. > > I'm sorry, but I find the idiom extremely painful to parse. If you are Okay, it was not explained very well. My bad. > annoyed with the 'PAGE_SHIFT == 12/14/16', consider replacing it with > 'PAGE_SIZE == SZ_4/16/64K' instead. Sure, understood. But the problem here is not with PAGE_SHIFT/PAGE_SIZE based tests but rather having multiple ways of doing the same thing in arm64 tree. Please find further explanation below. 
> > IS_ENABLED(CONFIG_ARM64_XXK_PAGES) also gives the wrong impression > that *multiple* page sizes can be selected at any given time. That's > obviously not the case, which actually makes PAGE_SIZE a much better > choice. PAGE_SHIFT and PAGE_SIZE are derived from CONFIG_ARM64_XXK_PAGES. Hence why not just directly use the original user selected config option that eventually decides PAGE_SHIFT and PAGE_SIZE. config ARM64_PAGE_SHIFT int default 16 if ARM64_64K_PAGES default 14 if ARM64_16K_PAGES default 12 arch/arm64/include/asm/page-def.h:#define PAGE_SHIFTCONFIG_ARM64_PAGE_SHIFT arch/arm64/include/asm/page-def.h:#define PAGE_SIZE (_AC(1, UL) << PAGE_SHIFT) Also there are already similar IS_ENABLED() instances which do not create much confusion. The point here being, to have just a single method that checks compiled page size support, instead of three different ways of doing the same thing. - IS_ENABLED(CONFIG_ARM64_XXK_PAGES) - if (PAGE_SHIFT == XX) - if (PAGE_SIZE == XX) $git grep IS_ENABLED arch/arm64/ | grep PAGES arch/arm64/include/asm/vmalloc.h: return IS_ENABLED(CONFIG_ARM64_4K_PAGES) && arch/arm64/mm/mmu.c:BUG_ON(!IS_ENABLED(CONFIG_ARM64_16K_PAGES)); arch/arm64/mm/mmu.c:BUG_ON(!IS_ENABLED(CONFIG_ARM64_16K_PAGES)); > > As things stand, I don't plan to take such a patch. Sure, will drop it from the series if the above explanation and the rationale for the patch
Re: [PATCH] KVM: arm64: Restrict IPA size to maximum 48 bits on 4K and 16K page size
On Wed, 11 Aug 2021 05:52:29 +0100, Anshuman Khandual wrote: > > Even though ID_AA64MMFR0.PARANGE reports 52 bit PA size support, it cannot > be enabled as guest IPA size on 4K or 16K page size configurations. Hence > kvm_ipa_limit must be restricted to 48 bits. This change achieves required > IPA capping. > > Before the commit c9b69a0cf0b4 ("KVM: arm64: Don't constrain maximum IPA > size based on host configuration"), the problem here would have been just > latent via PHYS_MASK_SHIFT (which earlier in turn capped kvm_ipa_limit), > which remains capped at 48 bits on 4K and 16K configs. > > Cc: Marc Zyngier > Cc: James Morse > Cc: Alexandru Elisei > Cc: Suzuki K Poulose > Cc: Catalin Marinas > Cc: Will Deacon > Cc: linux-arm-ker...@lists.infradead.org > Cc: kvmarm@lists.cs.columbia.edu > Cc: linux-ker...@vger.kernel.org > Fixes: c9b69a0cf0b4 ("KVM: arm64: Don't constrain maximum IPA size based on > host configuration") > Signed-off-by: Anshuman Khandual > --- > This applies on v5.14-rc5 > > arch/arm64/kvm/reset.c | 9 + > 1 file changed, 9 insertions(+) > > diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c > index 20588220fe66..e66974c4b9d3 100644 > --- a/arch/arm64/kvm/reset.c > +++ b/arch/arm64/kvm/reset.c > @@ -337,6 +337,15 @@ int kvm_set_ipa_limit(void) > return -EINVAL; > } > > + /* > + * IPA size beyond 48 bits could not be supported > + * on either 4K or 16K page size. Hence let's cap > + * it to 48 bits, in case it's reported as larger > + * on the system. > + */ > + if (!IS_ENABLED(CONFIG_ARM64_64K_PAGES)) As per our earlier discussion, please use (PAGE_SIZE != SZ_64K) instead. This is in keeping with the rest of the file. > + parange = min(parange, (unsigned int)ID_AA64MMFR0_PARANGE_48); > + Also, please move it next to the point where we assign parange. > kvm_ipa_limit = id_aa64mmfr0_parange_to_phys_shift(parange); > kvm_info("IPA Size Limit: %d bits%s\n", kvm_ipa_limit, >((kvm_ipa_limit < KVM_PHYS_SHIFT) ? Thanks, M. 
--
Without deviation from the norm, progress is not possible.
Re: [PATCH v6 17/21] KVM: arm64: Allow userspace to configure a guest's counter-timer offset
On Tue, 10 Aug 2021 18:55:12 +0100, Oliver Upton wrote:
>
> On Tue, Aug 10, 2021 at 3:56 AM Marc Zyngier wrote:
> >
> > On Wed, 04 Aug 2021 09:58:15 +0100,
> > Oliver Upton wrote:

[...]

> > > diff --git a/include/clocksource/arm_arch_timer.h b/include/clocksource/arm_arch_timer.h
> > > index 73c7139c866f..7252ffa3d675 100644
> > > --- a/include/clocksource/arm_arch_timer.h
> > > +++ b/include/clocksource/arm_arch_timer.h
> > > @@ -21,6 +21,7 @@
> > >  #define CNTHCTL_EVNTEN (1 << 2)
> > >  #define CNTHCTL_EVNTDIR (1 << 3)
> > >  #define CNTHCTL_EVNTI (0xF << 4)
> > > +#define CNTHCTL_ECV (1 << 12)
> > >
> > >  enum arch_timer_reg {
> > >  ARCH_TIMER_REG_CTRL,
>
> > You also want to document that SCR_EL3.ECVEn has to be set to 1 for
> > this to work (see Documentation/arm64/booting.txt). And if it isn't,
> > the firmware better handle the CNTPOFF_EL2 traps correctly...
>
> I'll grab the popcorn now ;-) Adding docs for this, good idea.
>
> > What firmware did you use for this? I think we need to update the boot
> > wrapper, but that's something that can be done in parallel.
>
> I had actually just done a direct boot from ARM-TF -> Linux, nothing
> else in between.

Ah, right. I tend to use the boot-wrapper[1] to build a single binary
that contains the 'boot loader', DT and kernel. Using ATF is probably
more representative of the final thing, but the boot-wrapper is dead
easy to hack on...

Thanks,

M.

[1] git://git.kernel.org/pub/scm/linux/kernel/git/mark/boot-wrapper-aarch64.git
--
Without deviation from the norm, progress is not possible.
RE: [PATCH v3 4/4] KVM: arm64: Clear active_vmids on vCPU schedule out
Hi Will,

> -----Original Message-----
> From: Will Deacon [mailto:w...@kernel.org]
> Sent: 03 August 2021 16:31
> To: Shameerali Kolothum Thodi
> Cc: linux-arm-ker...@lists.infradead.org; kvmarm@lists.cs.columbia.edu;
> linux-ker...@vger.kernel.org; m...@kernel.org; catalin.mari...@arm.com;
> james.mo...@arm.com; julien.thierry.k...@gmail.com;
> suzuki.poul...@arm.com; jean-phili...@linaro.org;
> alexandru.eli...@arm.com; qper...@google.com; Linuxarm
> Subject: Re: [PATCH v3 4/4] KVM: arm64: Clear active_vmids on vCPU
> schedule out

[...]

> I think we have to be really careful not to run into the "suspended
> animation" problem described in ae120d9edfe9 ("ARM: 7767/1: let the ASID
> allocator handle suspended animation") if we go down this road.
>
> Maybe something along the lines of:
>
> ROLLOVER
>
>   * Take lock
>   * Inc generation
>     => This will force everybody down the slow path
>   * Record active VMIDs
>   * Broadcast TLBI
>     => Only active VMIDs can be dirty
>     => Reserve active VMIDs and mark as allocated
>
> VCPU SCHED IN
>
>   * Set active VMID
>   * Check generation
>   * If mismatch then:
>     * Take lock
>     * Try to match a reserved VMID
>     * If no reserved VMID, allocate new
>
> VCPU SCHED OUT
>
>   * Clear active VMID
>
> but I'm not daft enough to think I got it right first time. I think it
> needs both implementing *and* modelling in TLA+ before we merge it!

I attempted to implement the above algo as below. It seems to be working
in both 16-bit vmid and 4-bit vmid test setups. Though I am not quite sure
this is exactly what you had in mind above and covers all corner cases.
Please take a look and let me know.
(The diff below is against this v3 series)

Thanks,
Shameer

--->8<

--- a/arch/arm64/kvm/vmid.c
+++ b/arch/arm64/kvm/vmid.c
@@ -43,7 +43,7 @@ static void flush_context(void)
 	bitmap_clear(vmid_map, 0, NUM_USER_VMIDS);
 
 	for_each_possible_cpu(cpu) {
-		vmid = atomic64_xchg_relaxed(&per_cpu(active_vmids, cpu), 0);
+		vmid = atomic64_read(&per_cpu(active_vmids, cpu));
 
 		/* Preserve reserved VMID */
 		if (vmid == 0)
@@ -125,32 +125,17 @@ void kvm_arm_vmid_clear_active(void)
 void kvm_arm_vmid_update(struct kvm_vmid *kvm_vmid)
 {
 	unsigned long flags;
-	u64 vmid, old_active_vmid;
+	u64 vmid;
 
 	vmid = atomic64_read(&kvm_vmid->id);
-
-	/*
-	 * Please refer comments in check_and_switch_context() in
-	 * arch/arm64/mm/context.c.
-	 */
-	old_active_vmid = atomic64_read(this_cpu_ptr(&active_vmids));
-	if (old_active_vmid && vmid_gen_match(vmid) &&
-	    atomic64_cmpxchg_relaxed(this_cpu_ptr(&active_vmids),
-				     old_active_vmid, vmid))
+	if (vmid_gen_match(vmid)) {
+		atomic64_set(this_cpu_ptr(&active_vmids), vmid);
 		return;
-
-	raw_spin_lock_irqsave(&cpu_vmid_lock, flags);
-
-	/* Check that our VMID belongs to the current generation. */
-	vmid = atomic64_read(&kvm_vmid->id);
-	if (!vmid_gen_match(vmid)) {
-		vmid = new_vmid(kvm_vmid);
-		atomic64_set(&kvm_vmid->id, vmid);
 	}
-
+	raw_spin_lock_irqsave(&cpu_vmid_lock, flags);
+	vmid = new_vmid(kvm_vmid);
+	atomic64_set(&kvm_vmid->id, vmid);
 	atomic64_set(this_cpu_ptr(&active_vmids), vmid);
 	raw_spin_unlock_irqrestore(&cpu_vmid_lock, flags);
 }

--->8<
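The generation check at the heart of the fast path above can be modelled in a simplified, single-threaded form (a toy sketch only: it ignores the per-CPU atomics, locking, and TLB invalidation that the real allocator needs, and all names here are illustrative):

```c
#include <assert.h>
#include <stdint.h>

#define VMID_BITS	8
#define VMID_MASK	((1ULL << VMID_BITS) - 1)

/* The id packs a generation in the upper bits and the VMID in the lower. */
static uint64_t vmid_generation = 1ULL << VMID_BITS;
static uint64_t next_vmid = 1;	/* VMID 0 stays reserved */

static int vmid_gen_match(uint64_t id)
{
	return (id & ~VMID_MASK) == vmid_generation;
}

static uint64_t new_vmid(void)
{
	if (next_vmid > VMID_MASK) {	/* rollover: bump the generation,  */
		vmid_generation += 1ULL << VMID_BITS;	/* invalidating every id */
		next_vmid = 1;
	}
	return vmid_generation | next_vmid++;
}

/* Sched-in: keep the id on a generation match, otherwise reallocate. */
static uint64_t vmid_update(uint64_t id)
{
	return vmid_gen_match(id) ? id : new_vmid();
}
```

Bumping the generation on rollover is what "forces everybody down the slow path" in Will's description: every previously handed-out id stops matching at once, so each vCPU must reallocate (or rematch a reserved VMID) under the lock on its next sched-in.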
Re: [PATCH 1/5] KVM: arm64: Drop direct PAGE_[SHIFT|SIZE] usage as page size
On Wed, 11 Aug 2021 06:34:46 +0100, Anshuman Khandual wrote:
>
> On 8/10/21 7:03 PM, Marc Zyngier wrote:
> > On 2021-08-10 08:02, Anshuman Khandual wrote:
> >> All instances here could just directly test against CONFIG_ARM64_XXK_PAGES
> >> instead of evaluating via PAGE_SHIFT or PAGE_SIZE. With this change, there
> >> will be no such usage left.
> >>
> >> Cc: Marc Zyngier
> >> Cc: James Morse
> >> Cc: Alexandru Elisei
> >> Cc: Suzuki K Poulose
> >> Cc: Catalin Marinas
> >> Cc: Will Deacon
> >> Cc: linux-arm-ker...@lists.infradead.org
> >> Cc: kvmarm@lists.cs.columbia.edu
> >> Cc: linux-ker...@vger.kernel.org
> >> Signed-off-by: Anshuman Khandual
> >> ---
> >>  arch/arm64/kvm/hyp/pgtable.c | 6 +++---
> >>  arch/arm64/mm/mmu.c          | 2 +-
> >>  2 files changed, 4 insertions(+), 4 deletions(-)
> >>
> >> diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
> >> index 05321f4165e3..a6112b6d6ef6 100644
> >> --- a/arch/arm64/kvm/hyp/pgtable.c
> >> +++ b/arch/arm64/kvm/hyp/pgtable.c
> >> @@ -85,7 +85,7 @@ static bool kvm_level_supports_block_mapping(u32 level)
> >>  	 * Reject invalid block mappings and don't bother with 4TB mappings for
> >>  	 * 52-bit PAs.
> >>  	 */
> >> -	return !(level == 0 || (PAGE_SIZE != SZ_4K && level == 1));
> >> +	return !(level == 0 || (!IS_ENABLED(CONFIG_ARM64_4K_PAGES) && level == 1));
> >>  }
> >>
> >>  static bool kvm_block_mapping_supported(u64 addr, u64 end, u64 phys, u32 level)
> >> @@ -155,7 +155,7 @@ static u64 kvm_pte_to_phys(kvm_pte_t pte)
> >>  {
> >>  	u64 pa = pte & KVM_PTE_ADDR_MASK;
> >>
> >> -	if (PAGE_SHIFT == 16)
> >> +	if (IS_ENABLED(CONFIG_ARM64_64K_PAGES))
> >>  		pa |= FIELD_GET(KVM_PTE_ADDR_51_48, pte) << 48;
> >>
> >>  	return pa;
> >> @@ -165,7 +165,7 @@ static kvm_pte_t kvm_phys_to_pte(u64 pa)
> >>  {
> >>  	kvm_pte_t pte = pa & KVM_PTE_ADDR_MASK;
> >>
> >> -	if (PAGE_SHIFT == 16)
> >> +	if (IS_ENABLED(CONFIG_ARM64_64K_PAGES))
> >>  		pte |= FIELD_PREP(KVM_PTE_ADDR_51_48, pa >> 48);
> >>
> >>  	return pte;
> >> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
> >> index 9ff0de1b2b93..8fdfca179815 100644
> >> --- a/arch/arm64/mm/mmu.c
> >> +++ b/arch/arm64/mm/mmu.c
> >> @@ -296,7 +296,7 @@ static void alloc_init_cont_pmd(pud_t *pudp, unsigned long addr,
> >>  static inline bool use_1G_block(unsigned long addr, unsigned long next,
> >>  				unsigned long phys)
> >>  {
> >> -	if (PAGE_SHIFT != 12)
> >> +	if (!IS_ENABLED(CONFIG_ARM64_4K_PAGES))
> >>  		return false;
> >>
> >>  	if (((addr | next | phys) & ~PUD_MASK) != 0)
> >
> > I personally find it a lot less readable.
> >
> > Also, there is no evaluation whatsoever. All the code guarded
> > by a PAGE_SIZE/PAGE_SHIFT that doesn't match the configuration
> > is dropped at compile time.
>
> The primary idea here is to unify around IS_ENABLED(CONFIG_ARM64_XXK_PAGES)
> usage in arm64, rather than having multiple methods to test page size
> whenever required.

I'm sorry, but I find the idiom extremely painful to parse. If you are
annoyed with the 'PAGE_SHIFT == 12/14/16', consider replacing it with
'PAGE_SIZE == SZ_4/16/64K' instead.

IS_ENABLED(CONFIG_ARM64_XXK_PAGES) also gives the wrong impression that
*multiple* page sizes can be selected at any given time. That's obviously
not the case, which actually makes PAGE_SIZE a much better choice.

As things stand, I don't plan to take such a patch.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.
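Marc's observation that "there is no evaluation whatsoever" holds for both idioms: IS_ENABLED() and a PAGE_SIZE comparison each fold to an integer constant, so the compiler discards the dead branch either way; the disagreement is purely about readability. The sketch below is a simplified userspace model of the preprocessor trick behind the kernel's IS_ENABLED() (loosely following include/linux/kconfig.h, without the module handling); the CONFIG_ and SZ_ definitions here are supplied for the demo:

```c
#include <assert.h>

/* IS_ENABLED(option) expands to 1 only when `option` is defined to 1,
 * and to 0 otherwise. The trick: if the option is defined to 1, the
 * token paste produces __ARG_PLACEHOLDER_1, which expands to "0," and
 * shifts the literal 1 into the "val" slot of __take_second_arg(). */
#define __ARG_PLACEHOLDER_1 0,
#define __take_second_arg(__ignored, val, ...) val
#define ____is_defined(arg1_or_junk) __take_second_arg(arg1_or_junk 1, 0)
#define ___is_defined(val) ____is_defined(__ARG_PLACEHOLDER_##val)
#define __is_defined(x) ___is_defined(x)
#define IS_ENABLED(option) __is_defined(option)

#define CONFIG_ARM64_4K_PAGES 1   /* pretend 4K pages are selected */
/* CONFIG_ARM64_64K_PAGES deliberately left undefined */

#define SZ_4K 4096
#define PAGE_SIZE 4096

/* Both guards fold to constants; neither costs anything at run time. */
static int use_1G_block_config(void)
{
	return IS_ENABLED(CONFIG_ARM64_4K_PAGES);
}

static int use_1G_block_pagesz(void)
{
	return PAGE_SIZE == SZ_4K;
}
```

Since both functions compile down to `return 1;`, the choice between the two styles is, as the thread argues, one of taste and of how clearly the source communicates that exactly one page size is in effect.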
Re: [PATCH v3] memblock: make memblock_find_in_range method private
On Tue, Aug 10, 2021 at 12:21:46PM -0700, Guenter Roeck wrote:
> On 8/10/21 11:55 AM, Mike Rapoport wrote:
> > On Mon, Aug 09, 2021 at 12:06:41PM -0700, Guenter Roeck wrote:
> > > On Tue, Aug 03, 2021 at 09:42:18AM +0300, Mike Rapoport wrote:
> > > > From: Mike Rapoport
> > > >
> > > > There are a lot of uses of memblock_find_in_range() along with
> > > > memblock_reserve() from the times memblock allocation APIs did not
> > > > exist.
> > > >
> > > > memblock_find_in_range() is the very core of memblock allocations, so any
> > > > future changes to its internal behaviour would mandate updates of all the
> > > > users outside memblock.
> > > >
> > > > Replace the calls to memblock_find_in_range() with equivalent calls to
> > > > memblock_phys_alloc() and memblock_phys_alloc_range() and make
> > > > memblock_find_in_range() a private method of memblock.
> > > >
> > > > This simplifies the callers, ensures that (unlikely) errors in
> > > > memblock_reserve() are handled and improves maintainability of
> > > > memblock_find_in_range().
> > > >
> > > > Signed-off-by: Mike Rapoport
> > >
> > > I see a number of crashes in next-20210806 when booting x86 images from
> > > efi.
> > >
> > > [    0.000000] efi: EFI v2.70 by EDK II
> > > [    0.000000] efi: SMBIOS=0x1fbcc000 ACPI=0x1fbfa000 ACPI 2.0=0x1fbfa014 MEMATTR=0x1f25f018
> > > [    0.000000] SMBIOS 2.8 present.
> > > [    0.000000] DMI: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
> > > [    0.000000] last_pfn = 0x1ff50 max_arch_pfn = 0x4
> > > [    0.000000] x86/PAT: Configuration [0-7]: WB WC UC- UC WB WP UC- WT
> > > [    0.000000] Kernel panic - not syncing: alloc_low_pages: can not alloc memory
> > > [    0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 5.14.0-rc4-next-20210806 #1
> > > [    0.000000] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
> > > [    0.000000] Call Trace:
> > > [    0.000000]  ? dump_stack_lvl+0x57/0x7d
> > > [    0.000000]  ? panic+0xfc/0x2c6
> > > [    0.000000]  ? alloc_low_pages+0x117/0x156
> > > [    0.000000]  ? phys_pmd_init+0x234/0x342
> > > [    0.000000]  ? phys_pud_init+0x171/0x337
> > > [    0.000000]  ? __kernel_physical_mapping_init+0xec/0x276
> > > [    0.000000]  ? init_memory_mapping+0x1ea/0x2aa
> > > [    0.000000]  ? init_range_memory_mapping+0xdf/0x12e
> > > [    0.000000]  ? init_mem_mapping+0x1e9/0x26f
> > > [    0.000000]  ? setup_arch+0x5ff/0xb6d
> > > [    0.000000]  ? start_kernel+0x71/0x6b4
> > > [    0.000000]  ? secondary_startup_64_no_verify+0xc2/0xcb
> > >
> > > Bisect points to this patch. Reverting it fixes the problem. Key seems to
> > > be the amount of memory configured in qemu; the problem is not seen if
> > > there is 1G or more of memory, but it is seen with all test boots with
> > > 512M or 256M of memory. It is also seen with almost all 32-bit efi boots.
> > >
> > > The problem is not seen when booting without efi.
> >
> > It looks like this change uncovered a problem in
> > x86::memory_map_top_down().
> >
> > The allocation in alloc_low_pages() is limited by min_pfn_mapped and
> > max_pfn_mapped. The min_pfn_mapped is updated at every iteration of the
> > loop in memory_map_top_down(), but there is another loop in
> > init_range_memory_mapping() that maps several regions below the current
> > min_pfn_mapped without updating this variable.
> >
> > The memory layout in qemu with 256M of RAM and EFI enabled causes
> > exhaustion of the memory limited by min_pfn_mapped and max_pfn_mapped
> > before min_pfn_mapped is updated.
> >
> > Before this commit there was an unconditional "reservation" of 2M at the
> > end of the memory that moved the initial min_pfn_mapped below the memory
> > reserved by EFI. The addition of the check for xen_domain() removed this
> > reservation for !XEN and made alloc_low_pages() use the range already busy
> > with EFI data.
> >
> > The patch below moves the update of min_pfn_mapped near the update of
> > max_pfn_mapped so that every time a new range is mapped both limits will be
> > updated accordingly.
> >
> > diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
> > index 1152a29ce109..be279f6e5a0a 100644
> > --- a/arch/x86/mm/init.c
> > +++ b/arch/x86/mm/init.c
> > @@ -1,3 +1,4 @@
> > +#define DEBUG
> >  #include
> >  #include
> >  #include
> > @@ -485,6 +486,7 @@ static void add_pfn_range_mapped(unsigned long start_pfn, unsigned long end_pfn)
> >  	nr_pfn_mapped = clean_sort_range(pfn_mapped, E820_MAX_ENTRIES);
> >
> >  	max_pfn_mapped = max(max_pfn_mapped, end_pfn);
> > +	min_pfn_mapped = min(min_pfn_mapped, start_pfn);
> >
> >  	if (start_pfn < (1UL<<(32-PAGE_SHIFT)))
> >  		max_low_pfn_mapped = max(max_low_pfn_mapped,
> > @@ -643,7 +645,6 @@ static void __init memory_map_top_down(unsigned long map_start,
> >  		mapped_ram_size += init_range_memory_mapping(start,
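The invariant Mike's fix restores — the lower bound of the already-mapped window must shrink together with every newly mapped range, so alloc_low_pages() never hands out memory outside [min_pfn_mapped, max_pfn_mapped) — can be modelled in a few lines of userspace C. Variable and function names mirror arch/x86/mm/init.c, but the bookkeeping here is only a toy:

```c
#include <assert.h>

/* Toy model of the pfn-mapped window maintained by add_pfn_range_mapped().
 * alloc_low_pages() may only allocate from pfns inside this window, so
 * both bounds must be kept in sync with what has actually been mapped. */
static unsigned long min_pfn_mapped = ~0UL;  /* no pfns mapped yet */
static unsigned long max_pfn_mapped;

static void add_pfn_range_mapped(unsigned long start_pfn, unsigned long end_pfn)
{
	if (end_pfn > max_pfn_mapped)
		max_pfn_mapped = end_pfn;
	/* The proposed fix: update the lower bound here too, so a range
	 * mapped below the current minimum (as init_range_memory_mapping()
	 * can do) immediately widens the allocation window. */
	if (start_pfn < min_pfn_mapped)
		min_pfn_mapped = start_pfn;
}

/* Models the check alloc_low_pages() relies on. */
static int pfn_is_allocatable(unsigned long pfn)
{
	return pfn >= min_pfn_mapped && pfn < max_pfn_mapped;
}
```

Before the fix, only the top-down loop in memory_map_top_down() lowered min_pfn_mapped, so ranges mapped by init_range_memory_mapping() below the current minimum stayed invisible to the allocator; with small EFI layouts that exhausted the visible window and triggered the "can not alloc memory" panic in the report above.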