Re: [PATCH v3 01/22] arm64: Add macros to read/write system registers

2015-12-07 Thread Catalin Marinas
On Mon, Dec 07, 2015 at 10:53:17AM +, Marc Zyngier wrote:
> From: Mark Rutland 
> 
> Rather than crafting custom macros for reading/writing each system
> register provide generics accessors, read_sysreg and write_sysreg, for
> this purpose.
> 
> Unlike read_cpuid, calls to read_exception_reg are never expected
> to be optimized away or replaced with synthetic values.

What's read_exception_reg? Is it a macro somewhere?

> diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
> index d48ab5b..c9c283a 100644
> --- a/arch/arm64/include/asm/sysreg.h
> +++ b/arch/arm64/include/asm/sysreg.h
> @@ -20,6 +20,8 @@
>  #ifndef __ASM_SYSREG_H
>  #define __ASM_SYSREG_H
>  
> +#include 
> +
>  #include 
>  
>  /*
> @@ -208,6 +210,8 @@
>  
>  #else
>  
> +#include 
> +
>  asm(
>  ".irp
> num,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30\n"
>  ".equ__reg_num_x\\num, \\num\n"
> @@ -232,6 +236,19 @@ static inline void config_sctlr_el1(u32 clear, u32 set)
>   val |= set;
>   asm volatile("msr sctlr_el1, %0" : : "r" (val));
>  }
> +
> +#define read_sysreg(r) ({\
> + u64 __val;  \
> + asm volatile("mrs %0, " __stringify(r) : "=r" (__val)); \
> + __val;  \
> +})

And maybe a comment here on why this is always volatile.

Otherwise:

Acked-by: Catalin Marinas 

-- 
Catalin
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 03/15] arm64: Introduce helpers for page table levels

2015-10-08 Thread Catalin Marinas
On Thu, Oct 08, 2015 at 06:22:34PM +0100, Suzuki K. Poulose wrote:
> On 08/10/15 15:45, Christoffer Dall wrote:
> >On Wed, Oct 07, 2015 at 10:26:14AM +0100, Marc Zyngier wrote:
> >>I just had a chat with Catalin, who did shed some light on this.
> >>It all has to do with rounding up. What you would like to have here is:
> >>
> >>#define ARM64_HW_PGTABLE_LEVELS(va_bits) DIV_ROUND_UP(va_bits - PAGE_SHIFT, 
> >>PAGE_SHIFT - 3)
> >>
> >>where (va_bits - PAGE_SHIFT) is the total number of bits we deal
> >>with during a page table walk, and (PAGE_SHIFT - 3) is the number
> >>of bits we deal with per level.
> >>
> >>The clue is in how DIV_ROUND_UP is written:
> >>
> >>#define DIV_ROUND_UP(n,d) (((n) + (d) - 1) / (d))
> >>
> >>which gives you Suzuki's magic formula.
> >>
> >>I'd vote for the DIV_ROUND_UP(), which will make things a lot more readable.
> >>
> >Thanks for the explanation, I vote for DIV_ROUND_UP too.
> 
> Btw, DIV_ROUND_UP is defined in linux/kernel.h, including which in the 
> required
> headers breaks the build. I could add the definition of the same locally.

Or just keep the original magic formula and add the DIV_ROUND_UP one in
a comment.

-- 
Catalin
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 09/15] arm64: Add page size to the kernel image header

2015-10-02 Thread Catalin Marinas
On Fri, Oct 02, 2015 at 04:49:01PM +0100, Catalin Marinas wrote:
> On Tue, Sep 15, 2015 at 04:41:18PM +0100, Suzuki K. Poulose wrote:
> > From: Ard Biesheuvel 
> > 
> > This patch adds the page size to the arm64 kernel image header
> > so that one can infer the PAGESIZE used by the kernel. This will
> > be helpful to diagnose failures to boot the kernel with page size
> > not supported by the CPU.
> > 
> > Signed-off-by: Ard Biesheuvel 
> 
> This patch needs you signed-off-by as well since you are posting it. And
> IIRC I acked it as well, I'll check.
> 
> If you are fine with adding your signed-of-by, I can add it to the patch
> when applying (unless I see other issues with the series).

Actually, I just realised that the kvm patches don't have any acks, so
I'm not taking this series yet. You may want to repost once you have all
the acks in place.

-- 
Catalin
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 09/15] arm64: Add page size to the kernel image header

2015-10-02 Thread Catalin Marinas
On Tue, Sep 15, 2015 at 04:41:18PM +0100, Suzuki K. Poulose wrote:
> From: Ard Biesheuvel 
> 
> This patch adds the page size to the arm64 kernel image header
> so that one can infer the PAGESIZE used by the kernel. This will
> be helpful to diagnose failures to boot the kernel with page size
> not supported by the CPU.
> 
> Signed-off-by: Ard Biesheuvel 

This patch needs you signed-off-by as well since you are posting it. And
IIRC I acked it as well, I'll check.

If you are fine with adding your signed-of-by, I can add it to the patch
when applying (unless I see other issues with the series).

-- 
Catalin
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 12/14] arm64: Check for selected granule support

2015-09-04 Thread Catalin Marinas
On Wed, Sep 02, 2015 at 12:19:07PM +0200, Ard Biesheuvel wrote:
> On 2 September 2015 at 11:48, Ard Biesheuvel  
> wrote:
> > Couldn't we allocate some flag bits in the Image header to communicate
> > the page size to the bootloader?
> 
> Something like this perhaps?
> 
> 8<---
> diff --git a/Documentation/arm64/booting.txt b/Documentation/arm64/booting.txt
> index 7d9d3c2286b2..13a8aaa9a6e9 100644
> --- a/Documentation/arm64/booting.txt
> +++ b/Documentation/arm64/booting.txt
> @@ -104,7 +104,8 @@ Header notes:
>  - The flags field (introduced in v3.17) is a little-endian 64-bit field
>composed as follows:
>Bit 0:   Kernel endianness.  1 if BE, 0 if LE.
> -  Bits 1-63:   Reserved.
> +  Bits 1-2:Kernel page size.   0=unspecified, 1=4K, 2=16K, 3=64K
> +  Bits 3-63:   Reserved.
> 
>  - When image_size is zero, a bootloader should attempt to keep as much
>memory as possible free for use by the kernel immediately after the
> diff --git a/arch/arm64/kernel/image.h b/arch/arm64/kernel/image.h
> index 8fae0756e175..5def289bda84 100644
> --- a/arch/arm64/kernel/image.h
> +++ b/arch/arm64/kernel/image.h
> @@ -47,7 +47,9 @@
>  #define __HEAD_FLAG_BE 0
>  #endif
> 
> -#define __HEAD_FLAGS   (__HEAD_FLAG_BE << 0)
> +#define __HEAD_FLAG_PAGE_SIZE ((PAGE_SHIFT - 10) / 2)
> +
> +#define __HEAD_FLAGS   (__HEAD_FLAG_BE << 0) | (__HEAD_FLAG_PAGE_SIZE << 1)

I'm fine with this.

-- 
Catalin
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 12/14] arm64: Check for selected granule support

2015-08-13 Thread Catalin Marinas
On Thu, Aug 13, 2015 at 03:45:07PM +0100, Suzuki K. Poulose wrote:
> On 13/08/15 13:28, Steve Capper wrote:
> >On 13 August 2015 at 12:34, Suzuki K. Poulose  wrote:
> >>  __enable_mmu:
> >>+   mrs x1, ID_AA64MMFR0_EL1
> >>+   ubfxx2, x1, #ID_AA64MMFR0_TGran_SHIFT, 4
> >>+   cmp x2, #ID_AA64MMFR0_TGran_ENABLED
> >>+   b.ne__no_granule_support
> >> ldr x5, =vectors
> >> msr vbar_el1, x5
> >> msr ttbr0_el1, x25  // load TTBR0
> >>@@ -626,3 +643,8 @@ __enable_mmu:
> >> isb
> >> br  x27
> >>  ENDPROC(__enable_mmu)
> >>+
> >>+__no_granule_support:
> >>+   wfe
> >>+   b __no_granule_support
> >>+ENDPROC(__no_granule_support)
> >>--
> >>1.7.9.5
> >>
> >
> >Is is possible to tell the user that the kernel has failed to boot due
> >to the kernel granule being unsupported?
> 
> We don't have anything up at this time. The "looping address" is actually a 
> clue
> to the (expert) user. Not sure we can do something, until we get something 
> like DEBUG_LL(?)

No.

> Or we should let it continue and end in a panic(?). The current situation can 
> boot a
> multi-cluster system with boot cluster having the Tgran support(which doesn't 
> make a
> strong use case though). I will try out some options and get back to you.

If the boot CPU does not support 16KB pages, in general there isn't much
we can do since the console printing is done after we enabled the MMU.
Even mapping the UART address requires fixmap support and the PAGE_SIZE
is hard-coded in the kernel image. The DT is also mapped at run-time.

While in theory it's possible to fall back to a 4KB page size just
enough to load the DT and figure out the early console, I suggest we
just live with the "looping address" clue.

-- 
Catalin
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/6] irqchip: GIC: Convert to EOImode == 1

2015-08-12 Thread Catalin Marinas
On Wed, Aug 12, 2015 at 02:31:47PM +0100, Marc Zyngier wrote:
> On 11/08/15 10:15, Eric Auger wrote:
> > On 07/09/2015 03:19 PM, Marc Zyngier wrote:
> >>  static int gic_irq_set_irqchip_state(struct irq_data *d,
> >> @@ -272,11 +278,15 @@ static void __exception_irq_entry 
> >> gic_handle_irq(struct pt_regs *regs)
> >>irqnr = irqstat & GICC_IAR_INT_ID_MASK;
> >>  
> >>if (likely(irqnr > 15 && irqnr < 1021)) {
> > shouldn't we have < 1020?
> 
> Looks like you have unearthed a very long standing (though not fatal)
> bug - I can trace it back to 2005 and the inclusion of the Realview
> support (see include/asm-arm/arch-realview/entry-macro.S in 8ad68bbf for
> the details).
> 
> It may be that the original GIC didn't make number 1020 a special one,
> though the earliest spec I have access to (GICv1) is already making 1020
> a reserved interrupt number. And looking at the pre-existing code
> (arch/arm/common/gic.c), 1020 seems to already be considered an invalid
> number.
> 
> CC-ing Catalin, as he was the one who introduced it... ;-) Unless he
> says otherwise, I'll cook a patch for that.

I really have no idea where it came from. The code probably pre-dates
the existence of a GIC architecture spec (the GIC spec used to be part
of the board or CPU TRM).

I don't see any problem with using 1020 here.

-- 
Catalin
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 00/13] arm64: Virtualization Host Extension support

2015-08-06 Thread Catalin Marinas
On Wed, Jul 08, 2015 at 05:19:03PM +0100, Marc Zyngier wrote:
> Marc Zyngier (13):
>   arm/arm64: Add new is_kernel_in_hyp_mode predicate
>   arm64: Allow the arch timer to use the HYP timer
>   arm64: Add ARM64_HAS_VIRT_HOST_EXTN feature
>   arm64: KVM: skip HYP setup when already running in HYP
>   arm64: KVM: VHE: macroize VTCR_EL2 setup
>   arm64: KVM: VHE: Patch out kern_hyp_va
>   arm64: KVM: VHE: Patch out use of HVC
>   arm64: KVM: VHE: Preserve VHE config in world switch
>   arm64: KVM: VHE: Add alternatives for VHE-enabled world-switch
>   arm64: Add support for running Linux in EL2 mode
>   arm64: Panic when VHE and non VHE CPUs coexist
>   arm64: KVM: Split sysreg save/restore
>   arm64: KVM: VHE: Early interrupt handling

Do we need to do anything with the debug code? Do we have any
hardware breakpoints/watchpoints targeting kernel space (kgdb doesn't
seem to support this)?

If a breakpoint target is EL1, I don't think we trigger it when running
in the EL2/VHE mode, in which case we need a different
DBGBCR.{HMC,SSC,PMC} combination - {1,11,00}.

Another random untested patch below but we need to get Will to remember
the code he wrote (and the VHE implications):

diff --git a/arch/arm64/include/asm/hw_breakpoint.h 
b/arch/arm64/include/asm/hw_breakpoint.h
index 52b484b6aa1a..197af39a5ffb 100644
--- a/arch/arm64/include/asm/hw_breakpoint.h
+++ b/arch/arm64/include/asm/hw_breakpoint.h
@@ -34,8 +34,12 @@ struct arch_hw_breakpoint {
 
 static inline u32 encode_ctrl_reg(struct arch_hw_breakpoint_ctrl ctrl)
 {
-   return (ctrl.len << 5) | (ctrl.type << 3) | (ctrl.privilege << 1) |
+   u32 reg = (ctrl.len << 5) | (ctrl.type << 3) | (ctrl.privilege << 1) |
ctrl.enabled;
+   /* set HMC and SSC when debug target is EL2 */
+   if (ctrl.privilege == AARCH64_BREAKPOINT_EL2)
+   reg |= (3 << 14) | (1 << 13);
+   return reg
 }
 
 static inline void decode_ctrl_reg(u32 reg,
@@ -59,6 +63,7 @@ static inline void decode_ctrl_reg(u32 reg,
 #define AARCH64_ESR_ACCESS_MASK(1 << 6)
 
 /* Privilege Levels */
+#define AARCH64_BREAKPOINT_EL2 0
 #define AARCH64_BREAKPOINT_EL1 1
 #define AARCH64_BREAKPOINT_EL0 2
 
diff --git a/arch/arm64/kernel/hw_breakpoint.c 
b/arch/arm64/kernel/hw_breakpoint.c
index 7a1a5da6c8c1..77866839d1e8 100644
--- a/arch/arm64/kernel/hw_breakpoint.c
+++ b/arch/arm64/kernel/hw_breakpoint.c
@@ -162,6 +162,7 @@ static enum debug_el debug_exception_level(int privilege)
case AARCH64_BREAKPOINT_EL0:
return DBG_ACTIVE_EL0;
case AARCH64_BREAKPOINT_EL1:
+   case AARCH64_BREAKPOINT_EL2:
return DBG_ACTIVE_EL1;
default:
pr_warning("invalid breakpoint privilege level %d\n", 
privilege);
@@ -456,7 +457,8 @@ static int arch_build_bp_info(struct perf_event *bp)
 * that would complicate the stepping code.
 */
if (arch_check_bp_in_kernelspace(bp))
-   info->ctrl.privilege = AARCH64_BREAKPOINT_EL1;
+   info->ctrl.privilege = is_kernel_in_hyp_mode() ?
+   AARCH64_BREAKPOINT_EL2 : AARCH64_BREAKPOINT_EL1;
else
info->ctrl.privilege = AARCH64_BREAKPOINT_EL0;
 
@@ -526,7 +528,7 @@ int arch_validate_hwbkpt_settings(struct perf_event *bp)
 * Disallow per-task kernel breakpoints since these would
 * complicate the stepping code.
 */
-   if (info->ctrl.privilege == AARCH64_BREAKPOINT_EL1 && bp->hw.target)
+   if (info->ctrl.privilege != AARCH64_BREAKPOINT_EL0 && bp->hw.target)
return -EINVAL;
 
return 0;

-- 
Catalin
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 11/13] arm64: Panic when VHE and non VHE CPUs coexist

2015-08-06 Thread Catalin Marinas
On Wed, Jul 08, 2015 at 05:19:14PM +0100, Marc Zyngier wrote:
> Having both VHE and non-VHE capable CPUs in the same system
> is likely to be a recipe for disaster.
> 
> If the boot CPU has VHE, but a secondary is not, we won't be
> able to downgrade and run the kernel at EL1. Add CPU hotplug
> to the mix, and this produces a terrifying mess.
> 
> Let's solve the problem once and for all. If you mix VHE and
> non-VHE CPUs in the same system, you deserve to loose, and this
> patch makes sure you don't get a chance.
> 
> This is implemented by storing the kernel execution level in
> a global variable. Secondaries will park themselves in a
> WFI loop if they observe a mismatch. Also, the primary CPU
> will detect that the secondary CPU has died on a mismatched
> execution level. Panic will follow.

We don't have such system yet but we could try it on the model. Do you
know how far the secondary CPU goes? If it gets to the cpuid sanity
checking, we could do the check together with the rest of the features
(we could add a new CHECK_FATAL macro to make this even more obvious):

diff --git a/arch/arm64/include/asm/cpu.h b/arch/arm64/include/asm/cpu.h
index 8e797b2fcc01..36cc81c9b9df 100644
--- a/arch/arm64/include/asm/cpu.h
+++ b/arch/arm64/include/asm/cpu.h
@@ -29,6 +29,7 @@ struct cpuinfo_arm64 {
u32 reg_cntfrq;
u32 reg_dczid;
u32 reg_midr;
+   u32 reg_currentel;
 
u64 reg_id_aa64dfr0;
u64 reg_id_aa64dfr1;
diff --git a/arch/arm64/include/asm/cputype.h b/arch/arm64/include/asm/cputype.h
index ee6403df9fe4..de3e8903b5a8 100644
--- a/arch/arm64/include/asm/cputype.h
+++ b/arch/arm64/include/asm/cputype.h
@@ -118,6 +118,14 @@ static inline bool id_aa64mmfr0_mixed_endian_el0(u64 mmfr0)
return (ID_AA64MMFR0_BIGEND(mmfr0) == 0x1) ||
(ID_AA64MMFR0_BIGENDEL0(mmfr0) == 0x1);
 }
+
+static inline unsigned int read_currentel(void)
+{
+   u64 el;
+   asm("mrs%0, CurrentEL" : "=r" (el));
+   return el;
+}
+
 #endif /* __ASSEMBLY__ */
 
 #endif
diff --git a/arch/arm64/kernel/cpuinfo.c b/arch/arm64/kernel/cpuinfo.c
index 75d5a867e7fb..c0269210fd54 100644
--- a/arch/arm64/kernel/cpuinfo.c
+++ b/arch/arm64/kernel/cpuinfo.c
@@ -132,6 +132,9 @@ static void cpuinfo_sanity_check(struct cpuinfo_arm64 *cur)
/* If different, timekeeping will be broken (especially with KVM) */
diff |= CHECK(cntfrq, boot, cur, cpu);
 
+   /* Same EL for all CPUs */
+   diff |= CHECK(currentel, boot, cur, cpu);
+
/*
 * The kernel uses self-hosted debug features and expects CPUs to
 * support identical debug features. We presently need CTX_CMPs, WRPs,
@@ -205,6 +208,7 @@ static void __cpuinfo_store_cpu(struct cpuinfo_arm64 *info)
info->reg_ctr = read_cpuid_cachetype();
info->reg_dczid = read_cpuid(DCZID_EL0);
info->reg_midr = read_cpuid_id();
+   info->reg_currentel = read_currentel();
 
info->reg_id_aa64dfr0 = read_cpuid(ID_AA64DFR0_EL1);
info->reg_id_aa64dfr1 = read_cpuid(ID_AA64DFR1_EL1);
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 07/13] arm64: KVM: VHE: Patch out use of HVC

2015-08-05 Thread Catalin Marinas
On Wed, Jul 08, 2015 at 05:19:10PM +0100, Marc Zyngier wrote:
> --- /dev/null
> +++ b/arch/arm64/kvm/vhe-macros.h
> @@ -0,0 +1,36 @@
> +/*
> + * Copyright (C) 2015 - ARM Ltd
> + * Author: Marc Zyngier 
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program.  If not, see .
> + */
> +
> +#ifndef __ARM64_VHE_MACROS_H__
> +#define __ARM64_VHE_MACROS_H__
> +
> +#include 
> +#include 
> +
> +#ifdef __ASSEMBLY__
> +
> +/* Hack to allow stringification of macros... */
> +#define __S__(a,args...) __stringify(a, ##args)
> +#define _S_(a,args...)   __S__(a, args)
> +
> +.macro ifnvhe nonvhe vhe
> + alternative_insn"\nonvhe", "\vhe", ARM64_HAS_VIRT_HOST_EXTN
> +.endm

I always found the alternative_insn tricks hard to read but with
Daniel's patch queued in -next it gets slightly better. If you are not
targeting 4.3, could you do something like (untested):

.macro  vhe_if_not
alternative_if_not ARM64_HAS_VIRT_HOST_EXTN
.endm

.macro  vhe_else
alternative_else CONFIG_ARM64_VHE
.endm

.macro  vhe_endif
alternative_endif CONFIG_ARM64_VHE
.endm


The kvm_call_hyp, for example, would become:

ENTRY(kvm_call_hyp)
vhe_if_not
hvc #0
ret
vhe_else
nop
pushlr, xzr
vhe_endif


Or you can even ignore defining new vhe_* macros altogether and just use
"alternative_if_not ARM64_HAS_VIRT_HOST_EXTN" directly (more to type)
but you may be able to avoid a vhe-macros.h file.

-- 
Catalin
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/2] arm64: qcom: Add define for ARMv8 implementer (MIDR)

2015-06-22 Thread Catalin Marinas
On Fri, Jun 19, 2015 at 05:28:53PM -0500, Timur Tabi wrote:
> On 06/15/2015 05:59 AM, Catalin Marinas wrote:
> >I think this patch together with the second one could go through the kvm
> >tree. For the core arm64 part:
> >
> >Acked-by: Catalin Marinas
> 
> Suzuki Poulose posted a patch that adds generic support for ARMv8 KVM
> targets.  I want to drop my second patch, so can we pick up this first patch
> through the ARM tree?

So if the second patch is no longer needed, what's using this patch? I
would defer merging it until actually required in some part of the
kernel.

-- 
Catalin
--
To unsubscribe from this list: send the line "unsubscribe kvm" in


Re: [PATCH 1/2] arm64: qcom: Add define for ARMv8 implementer (MIDR)

2015-06-15 Thread Catalin Marinas
On Fri, Jun 12, 2015 at 04:57:42PM -0500, Timur Tabi wrote:
> From: Shanker Donthineni 
> 
> This patch adds define for Qualcomm Technologies ARMv8 CPU
> implementer ID 0x51 and part number for Kryo in asm/cputype.h.
> 
> Signed-off-by: Shanker Donthineni 
> Signed-off-by: Timur Tabi 

I think this patch together with the second one could go through the kvm
tree. For the core arm64 part:

Acked-by: Catalin Marinas 
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC/RFT PATCH 0/3] arm64: KVM: work around incoherency with uncached guest mappings

2015-03-05 Thread Catalin Marinas
On Thu, Mar 05, 2015 at 01:26:39PM +0100, Paolo Bonzini wrote:
> On 05/03/2015 13:03, Catalin Marinas wrote:
> >> > I'd hate to have to do that. PCI should be entirely probeable
> >> > given that we tell the guest where the host bridge is, that's
> >> > one of its advantages.
> > I didn't say a DT node per device, the DT doesn't know what PCI devices
> > are available (otherwise it defeats the idea of probing). But we need to
> > tell the OS where the host bridge is via DT.
> > 
> > So the guest would be told about two host bridges: one for real devices
> > and another for virtual devices. These can have different coherency
> > properties.
> 
> Yeah, and it would suck that the user needs to know the difference
> between the coherency proprties of the host bridges.

The host needs to know about this, unless we assume full coherency on
all the platforms. Arguably, Qemu needs to know as well if it is the one
generating the DT for guest (or at least passing some snippets from the
host DT).

> It would especially suck if the user has a cluster with different
> machines, some of them coherent and others non-coherent, and then has to
> debug why the same configuration works on some machines and not on others.

That's a problem indeed, especially with guest migration. But I don't
think we have any sane solution here for the bus master DMA.

> To avoid replying in two different places, which of the solutions look
> to me like something that half-works?  Pretty much all of them, because
> in the end it is just a processor misfeature.  For example, Intel
> virtualization extensions let the hypervisor override stage1 translation
> _if necessary_.  AMD doesn't, but has some other quirky things that let
> you achieve the same effect..

ARM can override them as well but only making them stricter. Otherwise,
on a weakly ordered architecture, it's not always safe (let's say the
guest thinks it accesses Strongly Ordered memory and avoids barriers for
flag updates but the host "upgrades" it to Cacheable which breaks the
memory order).

If we want the host to enforce guest memory mapping attributes via stage
2, we could do it the other way around: get the guests to always assume
full cache coherency, generating Normal Cacheable mappings, but use the
stage 2 attributes restriction in the host to make such mappings
non-cacheable when needed (it works this way on ARM but not in the other
direction to relax the attributes).

> In particular, I am not even sure that this is about bus coherency,
> because this problem does not happen when the device is doing bus master
> DMA.  Working around coherency for bus master DMA would be easy.

My previous emails on the "dma-coherent" property were only about bus
master DMA (which would cause the correct selection of the DMA API ops
in the guest).

But even for bus master DMA, guest OS still needs to be aware of the
(virtual) device DMA capabilities (cache coherent or not). You may be
able to work around it in the host (stage 2, explicit cache flushing or
SMMU attributes) if the guests assumes non-coherency but it's not really
efficient (nor nice to implement).

> The problem arises with MMIO areas that the guest can reasonably expect
> to be uncacheable, but that are optimized by the host so that they end
> up backed by cacheable RAM.  It's perfectly reasonable that the same
> device needs cacheable mapping with one userspace, and works with
> uncacheable mapping with another userspace that doesn't optimize the
> MMIO area to RAM.

Unless the guest allocates the framebuffer itself (e.g.
dma_alloc_coherent), we can't control the cacheability via
"dma-coherent" properties as it refers to bus master DMA.

So for MMIO with the buffer allocated by the host (Qemu), the only
solution I see on ARM is for the host to ensure coherency, either via
explicit cache maintenance (new KVM API) or by changing the memory
attributes used by Qemu to access such virtual MMIO.

Basically Qemu is acting as a bus master when reading the framebuffer it
allocated but the guest considers it a slave access and we don't have a
way to tell the guest that such accesses should be cacheable, nor can we
upgrade them via architecture features.

> Currently the VGA framebuffer is the main case where this happen, and I
> don't expect many more.  Because this is not bus master DMA, it's hard
> to find a QEMU API that can be hooked to invalidate the cache.  QEMU is
> just reading from an array of chars.

I now understand the problem better. I was under the impression that the
guest allocates the framebuffer itself and tells Qemu where it is (like
in amba-clcd.c for example).

> In practice, the VGA framebuffer has an optimization that uses dirty
> page tr

Re: [RFC/RFT PATCH 0/3] arm64: KVM: work around incoherency with uncached guest mappings

2015-03-05 Thread Catalin Marinas
On Thu, Mar 05, 2015 at 08:52:49PM +0900, Peter Maydell wrote:
> On 5 March 2015 at 20:04, Catalin Marinas  wrote:
> > On Thu, Mar 05, 2015 at 11:12:22AM +0100, Paolo Bonzini wrote:
> >> On 04/03/2015 18:28, Catalin Marinas wrote:
> >> >> > Can you add that property to the device tree for PCI devices too?
> >> >
> >> > Yes but not with mainline yet:
> >> >
> >> > http://thread.gmane.org/gmane.linux.kernel.iommu/8935
> >> >
> >> > We can add the property at the PCI host bridge level (with the drawback
> >> > that it covers all the PCI devices), like here:
> >>
> >> Even covering all PCI devices is not enough if we want to support device
> >> assignment of PCI host devices.
> >
> > Can we not have another PCI bridge node in the DT for the host device
> > assignments?
> 
> I'd hate to have to do that. PCI should be entirely probeable
> given that we tell the guest where the host bridge is, that's
> one of its advantages.

I didn't say a DT node per device, the DT doesn't know what PCI devices
are available (otherwise it defeats the idea of probing). But we need to
tell the OS where the host bridge is via DT.

So the guest would be told about two host bridges: one for real devices
and another for virtual devices. These can have different coherency
properties.

-- 
Catalin
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC/RFT PATCH 0/3] arm64: KVM: work around incoherency with uncached guest mappings

2015-03-05 Thread Catalin Marinas
On Thu, Mar 05, 2015 at 11:12:22AM +0100, Paolo Bonzini wrote:
> On 04/03/2015 18:28, Catalin Marinas wrote:
> >> > Can you add that property to the device tree for PCI devices too?
> >
> > Yes but not with mainline yet:
> > 
> > http://thread.gmane.org/gmane.linux.kernel.iommu/8935
> > 
> > We can add the property at the PCI host bridge level (with the drawback
> > that it covers all the PCI devices), like here:
> 
> Even covering all PCI devices is not enough if we want to support device
> assignment of PCI host devices. 

Can we not have another PCI bridge node in the DT for the host device
assignments?

> I'd rather not spend effort on a solution that we know will only
> half-work a few months down the road.

Which of the solutions are you referring to? On the Applied Micro
boards, for example, the PCIe is coherent. If you do device assignment,
the guest must be aware of the coherency property of the physical device
and behave accordingly, there isn't much the host can do about it.

-- 
Catalin
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC/RFT PATCH 0/3] arm64: KVM: work around incoherency with uncached guest mappings

2015-03-04 Thread Catalin Marinas
On Wed, Mar 04, 2015 at 06:03:11PM +0100, Paolo Bonzini wrote:
> 
> 
> On 04/03/2015 15:29, Catalin Marinas wrote:
> > I disagree it is 100% a host-side issue. It is a host-side issue _if_
> > the host tells the guest that the (virtual) device is non-coherent (or,
> > more precisely, it does not explicitly tell the guest that the device is
> > coherent). If the guest thinks the (virtual) device is non-coherent
> > because of information passed by the host, I fully agree that the host
> > needs to manage the cache coherency.
> > 
> > However, the host could also pass a "dma-coherent" property in the DT
> > given to the guest and avoid any form of cache maintenance. If the guest
> > does not honour such coherency property, it's a guest problem and it
> > needs fixing in the guest. This isn't any different from a real physical
> > device behaviour.
> 
> Can you add that property to the device tree for PCI devices too?

Yes but not with mainline yet:

http://thread.gmane.org/gmane.linux.kernel.iommu/8935

We can add the property at the PCI host bridge level (with the drawback
that it covers all the PCI devices), like here:

Documentation/devicetree/bindings/pci/xgene-pci.txt

-- 
Catalin
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC/RFT PATCH 0/3] arm64: KVM: work around incoherency with uncached guest mappings

2015-03-04 Thread Catalin Marinas
On Wed, Mar 04, 2015 at 03:12:12PM +0100, Andrew Jones wrote:
> On Wed, Mar 04, 2015 at 01:43:02PM +0100, Ard Biesheuvel wrote:
> > On 4 March 2015 at 13:29, Catalin Marinas  wrote:
> > > On Wed, Mar 04, 2015 at 12:50:57PM +0100, Ard Biesheuvel wrote:
> > >> I think we have established that the performance hit is not the
> > >> problem but the correctness is.
> > >
> > > I haven't looked at the performance figures but has anyone assessed the
> > > hit caused by doing cache maintenance in Qemu vs cacheable guest
> > > accesses (and no maintenance)?
> 
> I'm working on a PoC of a QEMU/KVM cache maintenance approach now.
> Hopefully I'll send it out this evening. Tomorrow at the latest.
> Getting numbers of that approach vs. a guest's use of cached memory
> for devices would take a decent amount of additional work, so won't
> be part of that post.

OK.

> I'm actually not sure we should care about
> the numbers for a guest using normal mem attributes for device
> memory - other than out of curiosity. For correctness this issue
> really needs to be solved 100% host-side. We can't rely on a
> guest to do different/weird things, just because it's a guest.
> Ideally guests don't even know that they're guests. (Even if we
> describe the memory as cache-able to the guest, I don't think we
> can rely on the guest believing us.)

I disagree it is 100% a host-side issue. It is a host-side issue _if_
the host tells the guest that the (virtual) device is non-coherent (or,
more precisely, it does not explicitly tell the guest that the device is
coherent). If the guest thinks the (virtual) device is non-coherent
because of information passed by the host, I fully agree that the host
needs to manage the cache coherency.

However, the host could also pass a "dma-coherent" property in the DT
given to the guest and avoid any form of cache maintenance. If the guest
does not honour such coherency property, it's a guest problem and it
needs fixing in the guest. This isn't any different from a real physical
device behaviour.

(there are counter arguments for the latter as well like emulating old
platforms that never had coherency but from a performance/production
perspective, I strongly recommend that guests are passed the
"dma-coherent" property for such virtual devices)

-- 
Catalin
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC/RFT PATCH 0/3] arm64: KVM: work around incoherency with uncached guest mappings

2015-03-04 Thread Catalin Marinas
On Wed, Mar 04, 2015 at 12:50:57PM +0100, Ard Biesheuvel wrote:
> On 4 March 2015 at 12:35, Catalin Marinas  wrote:
> > On Mon, Mar 02, 2015 at 06:20:19PM -0800, Mario Smarduch wrote:
> >> On 03/02/2015 08:31 AM, Christoffer Dall wrote:
> >> > However, my concern with these patches are on two points:
> >> >
> >> > 1. It's not a fix-all.  We still have the case where the guest expects
> >> > the behavior of device memory (for strong ordering for example) on a RAM
> >> > region, which we now break.  Similiarly this doesn't support the
> >> > non-coherent DMA to RAM region case.
> >> >
> >> > 2. While the code is probably as nice as this kind of stuff gets, it
> >> > is non-trivial and extremely difficult to debug.  The counter-point here
> >> > is that we may end up handling other stuff at EL2 for performanc reasons
> >> > in the future.
> >> >
> >> > Mainly because of point 1 above, I am leaning to thinking userspace
> >> > should do the invalidation when it knows it needs to, either through KVM
> >> > via a memslot flag or through some other syscall mechanism.
> >
> > I expressed my concerns as well, I'm definitely against merging this
> > series.
> 
> Don't worry, that was never the intention, at least not as-is :-)

I wasn't worried, just wanted to make my position clearer ;).

> I think we have established that the performance hit is not the
> problem but the correctness is.

I haven't looked at the performance figures but has anyone assessed the
hit caused by doing cache maintenance in Qemu vs cacheable guest
accesses (and no maintenance)?

> I do have a remaining question, though: my original [non-working]
> approach was to replace uncached mappings with write-through
> read-allocate write-allocate,

Does it make sense to have write-through and write-allocate at the same
time? The write-allocate hint would probably be ignored as write-through
writes do not generate linefills.

> which I expected would keep the caches
> in sync with main memory, but apparently I am misunderstanding
> something here. (This is the reason for s/0xbb/0xff/ in patch #2 to
> get it to work: it replaces WT/RA/WA with WB/RA/WA)
> 
> Is there no way to use write-through caching here?

Write-through is considered non-cacheable from a write perspective when
it does not hit in the cache. AFAIK, it should still be able to hit
existing cache lines and evict. The ARM ARM states that cache cleaning
to _PoU_ is not required for coherency when the writes are to
write-through memory but I have to dig further into the PoC because
that's what we care about here.

What platform did you test it on? I can't tell what the behaviour of
system caches is. I know they intercept explicit cache maintenance by VA
but not sure what happens to write-through writes when they hit in the
system cache (are they evicted to RAM or not?). If such write-through
writes are only evicted to the point-of-unification, they won't work
since non-cacheable accesses go all the way to PoC.

I need to do more reading through the ARM ARM, it should be hidden
somewhere ;).

-- 
Catalin
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC/RFT PATCH 0/3] arm64: KVM: work around incoherency with uncached guest mappings

2015-03-04 Thread Catalin Marinas
(please try to avoid top-posting)

On Mon, Mar 02, 2015 at 06:20:19PM -0800, Mario Smarduch wrote:
> On 03/02/2015 08:31 AM, Christoffer Dall wrote:
> > However, my concern with these patches are on two points:
> > 
> > 1. It's not a fix-all.  We still have the case where the guest expects
> > the behavior of device memory (for strong ordering for example) on a RAM
> > region, which we now break.  Similiarly this doesn't support the
> > non-coherent DMA to RAM region case.
> > 
> > 2. While the code is probably as nice as this kind of stuff gets, it
> > is non-trivial and extremely difficult to debug.  The counter-point here
> > is that we may end up handling other stuff at EL2 for performanc reasons
> > in the future.
> > 
> > Mainly because of point 1 above, I am leaning to thinking userspace
> > should do the invalidation when it knows it needs to, either through KVM
> > via a memslot flag or through some other syscall mechanism.

I expressed my concerns as well, I'm definitely against merging this
series.

> I don't understand how can the CPU handle different cache attributes
> used by QEMU and Guest won't you run into B2.9 checklist? Wouldn't
> cache evictions or cleans wipe out guest updates to same cache
> line(s)?

"Clean+invalidate" is a safe operation even if the guest accesses the
memory in a cacheable way. But if the guest can update the cache lines,
Qemu should avoid cache maintenance from a performance perspective.

The guest is either told that the DMA is coherent (via DT properties) or
Qemu deals with (non-)coherency itself. The latter is fully in line with
the B2.9 chapter in the ARM ARM, more precisely point 5:

  If the mismatched attributes for a memory location all assign the same
  shareability attribute to the location, any loss of uniprocessor
  semantics or coherency within a shareability domain can be avoided by
  use of software cache management.

... it continues with what kind of cache maintenance is required,
together with:

  A clean and invalidate instruction can be used instead of a clean
  instruction, or instead of an invalidate instruction.

-- 
Catalin
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC/RFT PATCH 0/3] arm64: KVM: work around incoherency with uncached guest mappings

2015-03-03 Thread Catalin Marinas
On Thu, Feb 19, 2015 at 10:54:43AM +, Ard Biesheuvel wrote:
> This is a 0th order approximation of how we could potentially force the guest
> to avoid uncached mappings, at least from the moment the MMU is on. (Before
> that, all of memory is implicitly classified as Device-nGnRnE)

That's just for data accesses. IIRC instructions are cacheable on ARMv8
(though I think without allocation in the unified caches).

> The idea (patch #2) is to trap writes to MAIR_EL1, and replace uncached 
> mappings
> with cached ones. This way, there is no need to mangle any guest page tables.

There is another big downside to this - breaking the guest assumptions
about the (non-)cacheability of its mappings. It also only works for
guests that use MAIR_EL1 (LPAE).

We have two main cases where the guest and host cacheability do not
match:

1. During boot, as you said, when the MMU is off. What we have done in
   the guest kernel is to invalidate the data ranges that it writes with
   the MMU off in case there were any speculatively loaded cache lines
   via the cacheable mappings (in the host). We don't have any nice
   solution in the host here and MAIR_EL1 tweaking does not work

2. Guest explicitly creating a non-cacheable mapping (MMU enabled). Here
   we have two sub-cases:
   a) guest-only accesses to such mapping. The guest would need to
  perform cache maintenance as required if it ever accesses such
  memory via cacheable mappings (we do this already, see the
  streaming DMA API)
   b) memory shared with the host: e.g Qemu emulating DMA (frame buffer
  etc.)

This 2.b case is not any different than the OS dealing with a
(non-)coherent DMA-capable device. If the device is coherent, the
DMA buffer in the guest must be coherent as well, otherwise
non-coherent. Imagine a real VGA device that always snoops CPU caches.
You would not create a non-cacheable frame buffer mapping since the
device cannot see the updates and only read stale cache entries.

We don't (can't) have a safe set of DMA ops that would work in both
cases. So if Qemu cannot use a non-cacheable mapping or cannot perform
cache maintenance, the only solution is to tell the guest that such
virtual device is cache _coherent_. This also gives you better
performance overall anyway.

-- 
Catalin
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 3/3] arm64: Allow 48-bits VA space without ARM_SMMU

2014-10-10 Thread Catalin Marinas
On Fri, Oct 10, 2014 at 11:14:30AM +0100, Christoffer Dall wrote:
> Now when KVM has been reworked to support 48-bits host VA space, we can
> allow systems to be configured with this option.  However, the ARM SMMU
> driver also needs to be tweaked for 48-bit support so only allow the
> config option to be set when not including support for theSMMU.
> 
> Signed-off-by: Christoffer Dall 
> ---
>  arch/arm64/Kconfig | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index fd4e81a..a76c6c3b 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -200,7 +200,7 @@ config ARM64_VA_BITS_42
>  
>  config ARM64_VA_BITS_48
>   bool "48-bit"
> - depends on BROKEN
> + depends on !ARM_SMMU
>  
>  endchoice

I think we should rather merge this separately via the arm64 tree once
we test 48-bit VA some more (and as you noticed, there is a bug
already).

-- 
Catalin
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 1/3] arm64: KVM: Implement 48 VA support for KVM EL2 and Stage-2

2014-10-10 Thread Catalin Marinas
On Fri, Oct 10, 2014 at 11:14:28AM +0100, Christoffer Dall wrote:
> This patch adds the necessary support for all host kernel PGSIZE and
> VA_SPACE configuration options for both EL2 and the Stage-2 page tables.
> 
> However, for 40bit and 42bit PARange systems, the architecture mandates
> that VTCR_EL2.SL0 is maximum 1, resulting in fewer levels of stage-2
> pagge tables than levels of host kernel page tables.  At the same time,
> systems with a PARange > 42bit, we limit the IPA range by always setting
> VTCR_EL2.T0SZ to 24.
> 
> To solve the situation with different levels of page tables for Stage-2
> translation than the host kernel page tables, we allocate a dummy PGD
> with pointers to our actual inital level Stage-2 page table, in order
> for us to reuse the kernel pgtable manipulation primitives.  Reproducing
> all these in KVM does not look pretty and unnecessarily complicates the
> 32-bit side.
> 
> Systems with a PARange < 40bits are not yet supported.
> 
>  [ I have reworked this patch from its original form submitted by
>Jungseok to take the architecture constraints into consideration.
>There were too many changes from the original patch for me to
>preserve the authorship.  Thanks to Catalin Marinas for his help in
>figuring out a good solution to this challenge.  I have also fixed
>various bugs and missing error code handling from the original
>patch. - Christoffer ]
> 
> Cc: Marc Zyngier 
> Cc: Catalin Marinas 
> Signed-off-by: Jungseok Lee 
> Signed-off-by: Christoffer Dall 

Reviewed-by: Catalin Marinas 
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 2/3] arm/arm64: KVM: Ensure memslots are within KVM_PHYS_SIZE

2014-10-10 Thread Catalin Marinas
On Fri, Oct 10, 2014 at 11:14:29AM +0100, Christoffer Dall wrote:
> When creating or moving a memslot, make sure the IPA space is within the
> addressable range of the guest.  Otherwise, user space can create too
> large a memslot and KVM would try to access potentially unallocated page
> table entries when inserting entries in the Stage-2 page tables.
> 
> Signed-off-by: Christoffer Dall 

Acked-by: Catalin Marinas 
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 1/3] arm64: KVM: Implement 48 VA support for KVM EL2 and Stage-2

2014-10-09 Thread Catalin Marinas
On Thu, Oct 09, 2014 at 12:01:37PM +0100, Christoffer Dall wrote:
> On Wed, Oct 08, 2014 at 10:47:04AM +0100, Catalin Marinas wrote:
> > On Tue, Oct 07, 2014 at 08:39:54PM +0100, Christoffer Dall wrote:
> > > +static inline int kvm_prealloc_hwpgd(struct kvm *kvm, pgd_t *pgd)
> > > +{
> > > +   pud_t *pud;
> > > +   pmd_t *pmd;
> > > +   unsigned int order, i;
> > > +   unsigned long hwpgd;
> > > +
> > > +   if (KVM_PREALLOC_LEVEL == 0)
> > > +   return 0;
> > > +
> > > +   order = get_order(PTRS_PER_S2_PGD);
> > 
> > Isn't order always 0 here? Based on our IRC discussion, PTRS_PER_S2_PGD
> > is 16 or less and the order should not be used.
> 
> no, if the kernel has 4K pages and 4 levels, then PGDIR_SHIFT is 39, and
> KVM_PHYS_SHIFT stays 40, so that means PTRS_PER_S2_PGD becomes 2, which
> means we concatenate two first level stage-2 page tables, which means we
> need to allocate two consecutive pages, giving us an order of 1, not 0.

So if PTRS_PER_S2_PGD is 2, how come get_order(PTRS_PER_S2_PGD) == 1? My
reading of the get_order() macro is that get_order(2) == 0.

Did you mean get_order(PTRS_PER_S2_PGD * PAGE_SIZE)?

Or you could define a PTRS_PER_S2_PGD_SHIFT as (KVM_PHYS_SHIFT -
PGDIR_SHIFT) and use this as the order directly.

> > > +   hwpgd = __get_free_pages(GFP_KERNEL | __GFP_ZERO, order);
> > 
> > I assume you need __get_free_pages() for alignment.
> 
> yes, would you prefer a comment to that fact?

No, that's fine.

-- 
Catalin
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 1/3] arm64: KVM: Implement 48 VA support for KVM EL2 and Stage-2

2014-10-08 Thread Catalin Marinas
On Tue, Oct 07, 2014 at 08:39:54PM +0100, Christoffer Dall wrote:
> I came up with the following based on your feedback, but I personally
> don't find it a lot easier to read than what I had already.  Suggestions
> are welcome:

At least PTRS_PER_S2_PGD and KVM_PREALLOC_LEVEL are clearer to me as
formulas than the magic numbers.

> diff --git a/arch/arm64/include/asm/kvm_mmu.h 
> b/arch/arm64/include/asm/kvm_mmu.h
> index a030d16..7941a51 100644
> --- a/arch/arm64/include/asm/kvm_mmu.h
> +++ b/arch/arm64/include/asm/kvm_mmu.h
[...]
> +/*
> + * In the case where PGDIR_SHIFT is larger than KVM_PHYS_SHIFT, we can 
> address
> + * the entire IPA input range with a single pgd entry, and we would only need
> + * one pgd entry.
> + */

It may be worth here stating that this pgd is actually fake (covered
below as well). Maybe something like "single (fake) pgd entry".

> +#if PGDIR_SHIFT > KVM_PHYS_SHIFT
> +#define PTRS_PER_S2_PGD(1)
> +#else
> +#define PTRS_PER_S2_PGD(1 << (KVM_PHYS_SHIFT - PGDIR_SHIFT))
> +#endif
> +#define S2_PGD_ORDER   get_order(PTRS_PER_S2_PGD * sizeof(pgd_t))
> 
> +/*
> + * If we are concatenating first level stage-2 page tables, we would have 
> less
> + * than or equal to 16 pointers in the fake PGD, because that's what the
> + * architecture allows.  In this case, (4 - CONFIG_ARM64_PGTABLE_LEVELS)
> + * represents the first level for the host, and we add 1 to go to the next
> + * level (which uses contatenation) for the stage-2 tables.
> + */
> +#if PTRS_PER_S2_PGD <= 16
> +#define KVM_PREALLOC_LEVEL (4 - CONFIG_ARM64_PGTABLE_LEVELS + 1)
> +#else
> +#define KVM_PREALLOC_LEVEL (0)
> +#endif
> +
> +/**
> + * kvm_prealloc_hwpgd - allocate inital table for VTTBR
> + * @kvm:   The KVM struct pointer for the VM.
> + * @pgd:   The kernel pseudo pgd
> + *
> + * When the kernel uses more levels of page tables than the guest, we 
> allocate
> + * a fake PGD and pre-populate it to point to the next-level page table, 
> which
> + * will be the real initial page table pointed to by the VTTBR.
> + *
> + * When KVM_PREALLOC_LEVEL==2, we allocate a single page for the PMD and
> + * the kernel will use folded pud.  When KVM_PREALLOC_LEVEL==1, we
> + * allocate 2 consecutive PUD pages.
> + */

I don't have a strong preference here, if you find the code easier to
read as separate kvm_prealloc_hwpgd() functions, use those, as per your
original patch. My point was to no longer define the functions based on
#if 64K && 3-levels etc. but only on KVM_PREALLOC_LEVEL.

Anyway, I think the code below looks ok, with some fixes.

> +static inline int kvm_prealloc_hwpgd(struct kvm *kvm, pgd_t *pgd)
> +{
> +   pud_t *pud;
> +   pmd_t *pmd;
> +   unsigned int order, i;
> +   unsigned long hwpgd;
> +
> +   if (KVM_PREALLOC_LEVEL == 0)
> +   return 0;
> +
> +   order = get_order(PTRS_PER_S2_PGD);

Isn't order always 0 here? Based on our IRC discussion, PTRS_PER_S2_PGD
is 16 or less and the order should not be used.

> +   hwpgd = __get_free_pages(GFP_KERNEL | __GFP_ZERO, order);

I assume you need __get_free_pages() for alignment.

> +   if (!hwpgd)
> +   return -ENOMEM;
> +
> +   if (KVM_PREALLOC_LEVEL == 1) {
> +   pud = (pud_t *)hwpgd;
> +   for (i = 0; i < PTRS_PER_S2_PGD; i++)
> +   pgd_populate(NULL, pgd + i, pud + i * PTRS_PER_PUD);
> +   } else if (KVM_PREALLOC_LEVEL == 2) {
> +   pud = pud_offset(pgd, 0);
> +   pmd = (pmd_t *)hwpgd;
> +   for (i = 0; i < PTRS_PER_S2_PGD; i++)
> +   pud_populate(NULL, pud + i, pmd + i * PTRS_PER_PMD);
> +   }

It could be slightly shorter as (I can't guarantee clearer ;)):

for (i = 0; i < PTRS_PER_S2_PGD; i++) {
if (KVM_PREALLOC_LEVEL == 1)
pgd_populate(NULL, pgd + i,
 (pud_t *)hwpgd + i * PTRS_PER_PUD);
else if (KVM_PREALLOC_LEVEL == 2)
pud_populate(NULL, pud_offset(pgd, 0) + i,
 (pmd_t *)hwpgd + i * PTRS_PER_PMD)
}

Or you could write a kvm_populate_swpgd() to handle the ifs and casting.

> +
> +   return 0;
> +}
> +
> +static inline void *kvm_get_hwpgd(struct kvm *kvm)
> +{
> +   pgd_t *pgd = kvm->arch.pgd;
> +   pud_t *pud;
> +   pmd_t *pmd;
> +
> +   switch (KVM_PREALLOC_LEVEL) {
> +   case 0:
> +   return pgd;
> +   case 1:
> +   pud = pud_offset(pgd, 0);
> +   return pud;
> +   case 2:
> +   pud = pud_offset(pgd, 0);
> +   pmd = pmd_offset(pud, 0);
> +   return pmd;
> +   default:
> +   BUG();
> +   return NULL;
> +   }

/* not needed? Use BUG_ON or BUILD_BUG_ON */
if (KVM_PREALLOC_LEVEL == 0)
return pgd;

pud = pud_

Re: [PATCH v2 1/3] arm64: KVM: Implement 48 VA support for KVM EL2 and Stage-2

2014-10-07 Thread Catalin Marinas
On Mon, Oct 06, 2014 at 09:30:25PM +0100, Christoffer Dall wrote:
> +/**
> + * kvm_prealloc_hwpgd - allocate inital table for VTTBR
> + * @kvm:   The KVM struct pointer for the VM.
> + * @pgd:   The kernel pseudo pgd
> + *
> + * When the kernel uses more levels of page tables than the guest, we 
> allocate
> + * a fake PGD and pre-populate it to point to the next-level page table, 
> which
> + * will be the real initial page table pointed to by the VTTBR.
> + *
> + * When KVM_PREALLOC_LEVEL==2, we allocate a single page for the PMD and
> + * the kernel will use folded pud.  When KVM_PREALLOC_LEVEL==1, we
> + * allocate 2 consecutive PUD pages.
> + */
> +#if defined(CONFIG_ARM64_64K_PAGES) && CONFIG_ARM64_PGTABLE_LEVELS == 3
> +#define KVM_PREALLOC_LEVEL 2
> +#define PTRS_PER_S2_PGD1
> +#define S2_PGD_ORDER   get_order(PTRS_PER_S2_PGD * sizeof(pgd_t))

I agree that my magic equation wasn't readable ;) (I had troubles
re-understanding it as well), but you also have some constants here that
are not immediately obvious where you got to them from. IIUC,
KVM_PREALLOC_LEVEL == 2 here means that the hardware only understands
stage 2 pmd and pte. I guess you could look into the ARM ARM tables but
it's still not clear.

Let's look at PTRS_PER_S2_PGD as I think it's simpler. My proposal was:

#if PGDIR_SHIFT > KVM_PHYS_SHIFT
#define PTRS_PER_S2_PGD (1)
#else
#define PTRS_PER_S2_PGD (1 << (KVM_PHYS_SHIFT - PGDIR_SHIFT))
#endif

In this case PGDIR_SHIFT is 42, so we get PTRS_PER_S2_PGD == 1. The 4K
and 4 levels case below is also correct.

The KVM start level calculation, we could assume that KVM needs either
host levels or host levels - 1 (unless we go for some weirdly small
KVM_PHYS_SHIFT). So we could define them KVM_PREALLOC_LEVEL as:

#if PTRS_PER_S2_PGD <= 16
#define KVM_PREALLOC_LEVEL  (4 - CONFIG_ARM64_PGTABLE_LEVELS + 1)
#else
#define KVM_PREALLOC_LEVEL  (0)
#endif

Basically if you can concatenate 16 or less pages at the level below the
top, the architecture does not allow a small top level. In this case,
(4 - CONFIG_ARM64_PGTABLE_LEVELS) represents the first level for the
host and we add 1 to go to the next level for KVM stage 2 when
PTRS_PER_S2_PGD is 16 or less. We use 0 when we don't need to
preallocate.

> +static inline int kvm_prealloc_hwpgd(struct kvm *kvm, pgd_t *pgd)
> +{
> +   pud_t *pud;
> +   pmd_t *pmd;
> +
> +   pud = pud_offset(pgd, 0);
> +   pmd = (pmd_t *)__get_free_pages(GFP_KERNEL | __GFP_ZERO, 0);
> +
> +   if (!pmd)
> +   return -ENOMEM;
> +   pud_populate(NULL, pud, pmd);
> +
> +   return 0;
> +}
> +
> +static inline void kvm_free_hwpgd(struct kvm *kvm)
> +{
> +   pgd_t *pgd = kvm->arch.pgd;
> +   pud_t *pud = pud_offset(pgd, 0);
> +   pmd_t *pmd = pmd_offset(pud, 0);
> +   free_pages((unsigned long)pmd, 0);
> +}
> +
> +static inline phys_addr_t kvm_get_hwpgd(struct kvm *kvm)
> +{
> +   pgd_t *pgd = kvm->arch.pgd;
> +   pud_t *pud = pud_offset(pgd, 0);
> +   pmd_t *pmd = pmd_offset(pud, 0);
> +   return virt_to_phys(pmd);
> +
> +}
> +#elif defined(CONFIG_ARM64_4K_PAGES) && CONFIG_ARM64_PGTABLE_LEVELS == 4
> +#define KVM_PREALLOC_LEVEL 1
> +#define PTRS_PER_S2_PGD2
> +#define S2_PGD_ORDER   get_order(PTRS_PER_S2_PGD * sizeof(pgd_t))

Here PGDIR_SHIFT is 39, so we get PTRS_PER_S2_PGD == (1 << (40 - 39))
which is 2 and KVM_PREALLOC_LEVEL == 1.

> +static inline int kvm_prealloc_hwpgd(struct kvm *kvm, pgd_t *pgd)
> +{
> +   pud_t *pud;
> +
> +   pud = (pud_t *)__get_free_pages(GFP_KERNEL | __GFP_ZERO, 1);
> +   if (!pud)
> +   return -ENOMEM;
> +   pgd_populate(NULL, pgd, pud);
> +   pgd_populate(NULL, pgd + 1, pud + PTRS_PER_PUD);
> +
> +   return 0;
> +}

You still need to define these functions but you can make their
implementation dependent solely on the KVM_PREALLOC_LEVEL rather than
64K/4K and levels combinations. If it is KVM_PREALLOC_LEVEL is 1, you
allocate pud and populate the pgds (in a loop based on the
PTRS_PER_S2_PGD). If it is 2, you allocate the pmd and populate the pud
(still in a loop though it would probably be 1 iteration). We know based
on the assumption above that you can't get KVM_PREALLOC_LEVEL == 2 and
CONFIG_ARM64_PGTABLE_LEVELS == 4.

-- 
Catalin
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 0/3] arm/arm64: KVM: Host 48-bit VA support and IPA limits

2014-10-07 Thread Catalin Marinas
On Mon, Oct 06, 2014 at 09:30:24PM +0100, Christoffer Dall wrote:
> The following host configurations have been tested with KVM on APM
> Mustang:
[...]
>  3) 64KB  + 39 bits VA space

That would be 42-bit VA space.

-- 
Catalin
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/2] arm/arm64: KVM: Ensure memslots are within KVM_PHYS_SIZE

2014-10-06 Thread Catalin Marinas
On Mon, Oct 06, 2014 at 02:47:01PM +0100, Christoffer Dall wrote:
> On Tue, Sep 30, 2014 at 01:46:51PM +0100, Catalin Marinas wrote:
> > On Thu, Sep 25, 2014 at 08:42:54PM +0100, Christoffer Dall wrote:
> > > When creating or moving a memslot, make sure the IPA space is within the
> > > addressable range of the guest.  Otherwise, user space can create too
> > > large a memslot and KVM would try to access potentially unallocated page
> > > table entries when inserting entries in the Stage-2 page tables.
> > > 
> > > Signed-off-by: Christoffer Dall 
> > > ---
> > >  arch/arm/kvm/mmu.c | 8 
> > >  1 file changed, 8 insertions(+)
> > > 
> > > diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
> > > index 4532f5f..52a311a 100644
> > > --- a/arch/arm/kvm/mmu.c
> > > +++ b/arch/arm/kvm/mmu.c
> > > @@ -975,6 +975,9 @@ int kvm_handle_guest_abort(struct kvm_vcpu *vcpu, 
> > > struct kvm_run *run)
> > >   goto out_unlock;
> > >   }
> > >  
> > > + /* Userspace should not be able to register out-of-bounds IPAs */
> > 
> > I think "userspace" is a bit misleading (should be "guests").
> 
> see below
> 
> > > + VM_BUG_ON(fault_ipa >= KVM_PHYS_SIZE);
> > 
> > Can guests not generate IPA addresses higher than KVM_PHYS_SIZE? I don't
> > see why this wouldn't be possible when PARange > 40.
> > 
> Guests can, but higher in this function we resolve the the gfn (IPA) to
> a memslot (hva).  That only succeeds if userspace actually managed to
> register a memslot with such an IPA, which we prevent in the other part
> of this patch.  So if we get here, it's a bug, because we should have
> entered the section above and taken the goto out_unlock.

OK. I got it now.

-- 
Catalin
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/2] arm64: KVM: Implement 48 VA support for KVM EL2 and Stage-2

2014-10-06 Thread Catalin Marinas
On Mon, Oct 06, 2014 at 02:41:18PM +0100, Christoffer Dall wrote:
> On Tue, Sep 30, 2014 at 01:39:47PM +0100, Catalin Marinas wrote:
> > On Thu, Sep 25, 2014 at 08:42:53PM +0100, Christoffer Dall wrote:
> > > @@ -572,19 +614,36 @@ void kvm_free_stage2_pgd(struct kvm *kvm)
> > > return;
> > >
> > > unmap_stage2_range(kvm, 0, KVM_PHYS_SIZE);
> > > +   kvm_free_hwpgd(kvm);
> > > free_pages((unsigned long)kvm->arch.pgd, S2_PGD_ORDER);
> > > kvm->arch.pgd = NULL;
> > >  }
> > >
> > > -static pmd_t *stage2_get_pmd(struct kvm *kvm, struct 
> > > kvm_mmu_memory_cache *cache,
> > > +static pud_t *stage2_get_pud(struct kvm *kvm, struct 
> > > kvm_mmu_memory_cache *cache,
> > >  phys_addr_t addr)
> > >  {
> > > pgd_t *pgd;
> > > pud_t *pud;
> > > -   pmd_t *pmd;
> > >
> > > pgd = kvm->arch.pgd + pgd_index(addr);
> > > -   pud = pud_offset(pgd, addr);
> > > +   if (pgd_none(*pgd)) {
> > > +   if (!cache)
> > > +   return NULL;
> > > +   pud = mmu_memory_cache_alloc(cache);
> > > +   pgd_populate(NULL, pgd, pud);
> > > +   get_page(virt_to_page(pgd));
> > > +   }
> > > +
> > > +   return pud_offset(pgd, addr);
> > > +}
> >
> > This function looks correct, though we would not expect pgd_none() to
> > ever be true as we pre-populate it.
> > You could but a WARN_ON or something
> > until we actually have hardware 4-levels stage 2 (though I don't see a
> > need yet).
>
> pgd_none will never be true, because on both aarch64 and arm we fold the
> pud into the pgd (and include pgtable-nopud.h) if we have less than 4
> levels, and if we do have 4 levels, then we have preallocated and
> prepopulated the pgd.  If I got this right, I agree, and we should add
> the WARN_ON or VM_BUG_ON or something.

Yes.

> > > diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> > > index fd4e81a..0f3e0a9 100644
> > > --- a/arch/arm64/Kconfig
> > > +++ b/arch/arm64/Kconfig
> > > @@ -200,7 +200,6 @@ config ARM64_VA_BITS_42
> > >
> > >  config ARM64_VA_BITS_48
> > > bool "48-bit"
> > > -   depends on BROKEN
> >
> > I'm ok with removing BROKEN here but I think at the same time we need to
> > disable the SMMU driver which has similar issues with (not) supporting
> > 4-levels page tables for stage 2 (stage 1 should be fine).
>
> Should that be in separate patches, or is it fine to just include it
> here?

I think a separate patch replacing BROKEN with !ARM_SMMU. But we'll
need to test the kernel a bit more to make sure there isn't anything
else that's broken before merging it.

> > > +
> > > +/**
> > > + * kvm_prealloc_hwpgd - allocate inital table for VTTBR
> > > + * @kvm:   The KVM struct pointer for the VM.
> > > + * @pgd:   The kernel pseudo pgd
> >
> > "hwpgd" doesn't sound like the right name since it's actually a fake pgd
> > that it is allocating. Maybe "swpgd"?
>
> The fake pgd was already allocated in kvm_alloc_stage2_pgd(), now we're
> allocating the first-level page table that is going to be walked by the
> MMU for Stage-2 translations (the address of which will go into the
> VTTBR), so that's the reasoning for calling it the hwpgd.

OK, it makes sense now.

> > > +#define KVM_PREALLOC_LEVELS0
> > > +#define PTRS_PER_S2_PGD(1 << (KVM_PHYS_SHIFT - 
> > > PGDIR_SHIFT))
> > > +#define S2_PGD_ORDER   get_order(PTRS_PER_S2_PGD * sizeof(pgd_t))
> > > +
> > > +static inline int kvm_prealloc_hwpgd(struct kvm *kvm, pgd_t *pgd)
> > > +{
> > > +   return 0;
> > > +}
> > > +
> > > +static inline void kvm_free_hwpgd(struct kvm *kvm) { }
> > > +
> > > +static inline phys_addr_t kvm_get_hwpgd(struct kvm *kvm)
> > > +{
> > > +   return virt_to_phys(kvm->arch.pgd);
> > > +}
> > > +#endif
> >
> > I think all the above could be squashed into a single set of function
> > implementations with the right macros defined before.
> >
> > For the KVM levels, you could use something like (we exchanged emails in
> > private on this topic and how I got to these macros but they need better
> > checking):
> >
> >

Re: [PATCH 2/2] arm/arm64: KVM: Ensure memslots are within KVM_PHYS_SIZE

2014-09-30 Thread Catalin Marinas
On Thu, Sep 25, 2014 at 08:42:54PM +0100, Christoffer Dall wrote:
> When creating or moving a memslot, make sure the IPA space is within the
> addressable range of the guest.  Otherwise, user space can create too
> large a memslot and KVM would try to access potentially unallocated page
> table entries when inserting entries in the Stage-2 page tables.
> 
> Signed-off-by: Christoffer Dall 
> ---
>  arch/arm/kvm/mmu.c | 8 
>  1 file changed, 8 insertions(+)
> 
> diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
> index 4532f5f..52a311a 100644
> --- a/arch/arm/kvm/mmu.c
> +++ b/arch/arm/kvm/mmu.c
> @@ -975,6 +975,9 @@ int kvm_handle_guest_abort(struct kvm_vcpu *vcpu, struct 
> kvm_run *run)
>   goto out_unlock;
>   }
>  
> + /* Userspace should not be able to register out-of-bounds IPAs */

I think "userspace" is a bit misleading (should be "guests").

> + VM_BUG_ON(fault_ipa >= KVM_PHYS_SIZE);

Can guests not generate IPA addresses higher than KVM_PHYS_SIZE? I don't
see why this wouldn't be possible when PARange > 40.

-- 
Catalin
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/2] arm64: KVM: Implement 48 VA support for KVM EL2 and Stage-2

2014-09-30 Thread Catalin Marinas
Hi Christoffer,

On Thu, Sep 25, 2014 at 08:42:53PM +0100, Christoffer Dall wrote:
> diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
> index 7796051..048f37f 100644
> --- a/arch/arm/kvm/arm.c
> +++ b/arch/arm/kvm/arm.c
> @@ -409,7 +409,7 @@ static void update_vttbr(struct kvm *kvm)
> kvm_next_vmid++;
>
> /* update vttbr to be used with the new vmid */
> -   pgd_phys = virt_to_phys(kvm->arch.pgd);
> +   pgd_phys = kvm_get_hwpgd(kvm);
> BUG_ON(pgd_phys & ~VTTBR_BADDR_MASK);
> vmid = ((u64)(kvm->arch.vmid) << VTTBR_VMID_SHIFT) & VTTBR_VMID_MASK;
> kvm->arch.vttbr = pgd_phys | vmid;
> diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
> index bb06f76..4532f5f 100644
> --- a/arch/arm/kvm/mmu.c
> +++ b/arch/arm/kvm/mmu.c
> @@ -33,6 +33,7 @@
>
>  extern char  __hyp_idmap_text_start[], __hyp_idmap_text_end[];
>
> +static int kvm_prealloc_levels;

Do you have plans to determine this dynamically? In this patch, I can
only see it being assigned to KVM_PREALLOC_LEVELS. I guess you could
simply use the macro.

> @@ -521,6 +554,7 @@ int create_hyp_io_mappings(void *from, void *to, 
> phys_addr_t phys_addr)
>   */
>  int kvm_alloc_stage2_pgd(struct kvm *kvm)
>  {
> +   int ret;
> pgd_t *pgd;
>
> if (kvm->arch.pgd != NULL) {
> @@ -533,9 +567,17 @@ int kvm_alloc_stage2_pgd(struct kvm *kvm)
> return -ENOMEM;
>
> memset(pgd, 0, PTRS_PER_S2_PGD * sizeof(pgd_t));

I don't think you need to allocate a full page here (not shown here in
the diff context) but it may not be trivial since you use get_page() to
keep track of when page tables could be freed (do you do this for the
pgd as well?).

> +
> +   ret = kvm_prealloc_hwpgd(kvm, pgd);
> +   if (ret)
> +   goto out;
> +
> kvm_clean_pgd(pgd);
> kvm->arch.pgd = pgd;
> -
> +   ret = 0;
> +out:
> +   if (ret)
> +   free_pages((unsigned long)pgd, S2_PGD_ORDER);
> return 0;
>  }

Does this always return 0, even in case of error? Does it look better
as:

ret = kvm_prealloc_hwpgd(kvm, pgd);
if (ret)
goto out;

kvm_clean_pgd(pgd);
kvm->arch.pgd = pgd;

return 0;

out:
free_pages((unsigned long)pgd, S2_PGD_ORDER);
return ret;

> @@ -572,19 +614,36 @@ void kvm_free_stage2_pgd(struct kvm *kvm)
> return;
>
> unmap_stage2_range(kvm, 0, KVM_PHYS_SIZE);
> +   kvm_free_hwpgd(kvm);
> free_pages((unsigned long)kvm->arch.pgd, S2_PGD_ORDER);
> kvm->arch.pgd = NULL;
>  }
>
> -static pmd_t *stage2_get_pmd(struct kvm *kvm, struct kvm_mmu_memory_cache 
> *cache,
> +static pud_t *stage2_get_pud(struct kvm *kvm, struct kvm_mmu_memory_cache 
> *cache,
>  phys_addr_t addr)
>  {
> pgd_t *pgd;
> pud_t *pud;
> -   pmd_t *pmd;
>
> pgd = kvm->arch.pgd + pgd_index(addr);
> -   pud = pud_offset(pgd, addr);
> +   if (pgd_none(*pgd)) {
> +   if (!cache)
> +   return NULL;
> +   pud = mmu_memory_cache_alloc(cache);
> +   pgd_populate(NULL, pgd, pud);
> +   get_page(virt_to_page(pgd));
> +   }
> +
> +   return pud_offset(pgd, addr);
> +}

This function looks correct, though we would not expect pgd_none() to
ever be true as we pre-populate it. You could but a WARN_ON or something
until we actually have hardware 4-levels stage 2 (though I don't see a
need yet).

> +static pmd_t *stage2_get_pmd(struct kvm *kvm, struct kvm_mmu_memory_cache 
> *cache,
> +phys_addr_t addr)
> +{
> +   pud_t *pud;
> +   pmd_t *pmd;
> +
> +   pud = stage2_get_pud(kvm, cache, addr);
> if (pud_none(*pud)) {
> if (!cache)
> return NULL;
> @@ -630,7 +689,7 @@ static int stage2_set_pte(struct kvm *kvm, struct 
> kvm_mmu_memory_cache *cache,
> pmd_t *pmd;
> pte_t *pte, old_pte;
>
> -   /* Create stage-2 page table mapping - Level 1 */
> +   /* Create stage-2 page table mapping - Levels 0 and 1 */
> pmd = stage2_get_pmd(kvm, cache, addr);
> if (!pmd) {
> /*
> @@ -688,7 +747,8 @@ int kvm_phys_addr_ioremap(struct kvm *kvm, phys_addr_t 
> guest_ipa,
> for (addr = guest_ipa; addr < end; addr += PAGE_SIZE) {
> pte_t pte = pfn_pte(pfn, PAGE_S2_DEVICE);
>
> -   ret = mmu_topup_memory_cache(&cache, 2, 2);
> +   ret = mmu_topup_memory_cache(&cache, KVM_MMU_CACHE_MIN_PAGES,
> +   KVM_MMU_CACHE_MIN_PAGES);

BTW, why do you need to preallocate the pages in KVM's own cache? Would
GFP_ATOMIC not work within the kvm->mmu_lock region?

> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index fd4e81a..0f3e0a9 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -200,7 +200,6 @@ config ARM64_VA_BITS_42
>
> 

Re: [PATCH] arm/arm64: KVM: Fix VTTBR_BADDR_MASK and pgd alloc

2014-09-26 Thread Catalin Marinas
On Thu, Sep 25, 2014 at 07:32:19PM +0100, Christoffer Dall wrote:
> From: Joel Schopp 
> 
> The current aarch64 calculation for VTTBR_BADDR_MASK masks only 39 bits
> and not all the bits in the PA range. This is clearly a bug that
> manifests itself on systems that allocate memory in the higher address
> space range.
> 
>  [ Modified from Joel's original patch to be based on PHYS_MASK_SHIFT
>instead of a hard-coded value and to move the alignment check of the
>allocation to mmu.c.  Also added a comment explaining why we hardcode
>the IPA range and changed the stage-2 pgd allocation to be based on
>the 40 bit IPA range instead of the maximum possible 48 bit PA range.
>- Christoffer ]
> 
> Signed-off-by: Joel Schopp 
> Signed-off-by: Christoffer Dall 

Reviewed-by: Catalin Marinas 
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v6] arm64: fix VTTBR_BADDR_MASK

2014-09-22 Thread Catalin Marinas
On Mon, Sep 22, 2014 at 04:56:58PM +0100, Joel Schopp wrote:
> > The TCR_EL2.PS setting should be done based on the ID_A64MMFR0_EL1
> > but you can do this in __do_hyp_init (it looks like this function
> > handles VTCR_EL2.PS already, not sure why it does do it for TCR_EL2 as
> > well).
> >
> > So IMO you only need about a few lines patch.
>
> My original patch to fix this problem was one line, so I'm perfectly
> happy with simplification.  But it would be nice if the other reviewers
> could agree with this approach.  With six versions that each addressed
> all the comments from reviewers I'd like it if the v7 that throws away
> most of that feedback didn't result in a v8 that puts it all back again.

I'm having some discussion with Christoffer around this. He will come up
with some patches on top of yours but I don't think the problem is that
simple. Basically the IPA size is restricted by the PARange but also
affected by the number of page table levels used by the host (both
having implications on VTCR_EL2.SL0).

-- 
Catalin
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v6] arm64: fix VTTBR_BADDR_MASK

2014-09-19 Thread Catalin Marinas
On Fri, Sep 19, 2014 at 04:28:54PM +0100, Catalin Marinas wrote:
> On Tue, Sep 09, 2014 at 12:08:52AM +0100, Joel Schopp wrote:
> > The current VTTBR_BADDR_MASK only masks 39 bits, which is broken on current
> > systems.  Rather than just add a bit it seems like a good time to also set
> > things at run-time instead of compile time to accomodate more hardware.
> > 
> > This patch sets TCR_EL2.PS, VTCR_EL2.T0SZ and vttbr_baddr_mask in runtime,
> > not compile time.
> > 
> > In ARMv8, EL2 physical address size (TCR_EL2.PS) and stage2 input address
> > size (VTCR_EL2.T0SZE) cannot be determined in compile time since they
> > depend on hardware capability.
> > 
> > According to Table D4-23 and Table D4-25 in ARM DDI 0487A.b document,
> > vttbr_x is calculated using different fixed values with consideration
> > of T0SZ, granule size and the level of translation tables. Therefore,
> > vttbr_baddr_mask should be determined dynamically.
> 
> So I agree with vttbr_baddr_mask being determined dynamically. I also
> agree with setting TCR_EL2.PS at run-time but the VTCR_EL2.T0SZ
> determines the input of the stage 2 translation. That's a KVM
> configuration about what IPA size it provides to the guests (and
> platform model it intends to support) and it doesn't need to be the same
> as the physical address range.
[...]
> > -#define VTTBR_BADDR_SHIFT (VTTBR_X - 1)
> > -#define VTTBR_BADDR_MASK  (((1LLU << (40 - VTTBR_X)) - 1) << 
> > VTTBR_BADDR_SHIFT)

Actually, after some more thinking, why don't we just make the upper
limit of this mask 48-bit always or even 64-bit. That's a physical mask
for checking whether the pgd pointer in vttbr is aligned as per the
architecture requirements. Given that the pointer is allocated from the
platform memory, it's clear that it is within the PA range. So basically
you just need a mask to check the bottom alignment based on
VTCR_EL2.T0SZ (which should be independent from the PA range). I guess
it should be enough as:

#define VTTBR_BADDR_MASK  (~0ULL << VTTBR_BADDR_SHIFT)

without any other changes to T0SZ.

The TCR_EL2.PS setting should be done based on the ID_A64MMFR0_EL1
but you can do this in __do_hyp_init (it looks like this function
handles VTCR_EL2.PS already, not sure why it does do it for TCR_EL2 as
well).

So IMO you only need about a few lines patch.

-- 
Catalin
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v6] arm64: fix VTTBR_BADDR_MASK

2014-09-19 Thread Catalin Marinas
On Tue, Sep 09, 2014 at 12:08:52AM +0100, Joel Schopp wrote:
> The current VTTBR_BADDR_MASK only masks 39 bits, which is broken on current
> systems.  Rather than just add a bit it seems like a good time to also set
> things at run-time instead of compile time to accomodate more hardware.
> 
> This patch sets TCR_EL2.PS, VTCR_EL2.T0SZ and vttbr_baddr_mask in runtime,
> not compile time.
> 
> In ARMv8, EL2 physical address size (TCR_EL2.PS) and stage2 input address
> size (VTCR_EL2.T0SZE) cannot be determined in compile time since they
> depend on hardware capability.
> 
> According to Table D4-23 and Table D4-25 in ARM DDI 0487A.b document,
> vttbr_x is calculated using different fixed values with consideration
> of T0SZ, granule size and the level of translation tables. Therefore,
> vttbr_baddr_mask should be determined dynamically.

So I agree with vttbr_baddr_mask being determined dynamically. I also
agree with setting TCR_EL2.PS at run-time but the VTCR_EL2.T0SZ
determines the input of the stage 2 translation. That's a KVM
configuration about what IPA size it provides to the guests (and
platform model it intends to support) and it doesn't need to be the same
as the physical address range.

> diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
> index 5cc0b0f..03a08bb 100644
> --- a/arch/arm/include/asm/kvm_mmu.h
> +++ b/arch/arm/include/asm/kvm_mmu.h
> @@ -21,6 +21,7 @@
> 
>  #include 
>  #include 
> +#include 
> 
>  /*
>   * We directly use the kernel VA for the HYP, as we can directly share
> @@ -178,6 +179,18 @@ static inline void coherent_cache_guest_page(struct 
> kvm_vcpu *vcpu, hva_t hva,
> 
>  void stage2_flush_vm(struct kvm *kvm);
> 
> +static inline int kvm_get_phys_addr_shift(void)
> +{
> +   return KVM_PHYS_SHIFT;
> +}
> +
> +
> +static inline u32 get_vttbr_baddr_mask(void)
> +{
> +   return VTTBR_BADDR_MASK;
> +}

It should be u64 as it truncates the top 32 bits of the mask.

> diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
> index a99e0cd..d0fca8f 100644
> --- a/arch/arm/kvm/arm.c
> +++ b/arch/arm/kvm/arm.c
> @@ -37,6 +37,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -61,6 +62,12 @@ static atomic64_t kvm_vmid_gen = ATOMIC64_INIT(1);
>  static u8 kvm_next_vmid;
>  static DEFINE_SPINLOCK(kvm_vmid_lock);
> 
> +#ifdef CONFIG_ARM64
> +static u64 vttbr_baddr_mask;
> +#else
> +static u32 vttbr_baddr_mask;
> +#endif

This mask should always be 64-bit as it relates to the physical address
which is 64-bit even on arm32 with LPAE enabled (same with the
get_vttbr_baddr_mask() return type above).

> diff --git a/arch/arm64/include/asm/kvm_arm.h 
> b/arch/arm64/include/asm/kvm_arm.h
> index cc83520..ff4a4fa 100644
> --- a/arch/arm64/include/asm/kvm_arm.h
> +++ b/arch/arm64/include/asm/kvm_arm.h
> @@ -95,7 +95,6 @@
>  /* TCR_EL2 Registers bits */
>  #define TCR_EL2_TBI(1 << 20)
>  #define TCR_EL2_PS (7 << 16)
> -#define TCR_EL2_PS_40B (2 << 16)
>  #define TCR_EL2_TG0(1 << 14)
>  #define TCR_EL2_SH0(3 << 12)
>  #define TCR_EL2_ORGN0  (3 << 10)
> @@ -104,8 +103,6 @@
>  #define TCR_EL2_MASK   (TCR_EL2_TG0 | TCR_EL2_SH0 | \
>  TCR_EL2_ORGN0 | TCR_EL2_IRGN0 | TCR_EL2_T0SZ)
> 
> -#define TCR_EL2_FLAGS  (TCR_EL2_PS_40B)
> -
>  /* VTCR_EL2 Registers bits */
>  #define VTCR_EL2_PS_MASK   (7 << 16)
>  #define VTCR_EL2_TG0_MASK  (1 << 14)
> @@ -120,36 +117,28 @@
>  #define VTCR_EL2_SL0_MASK  (3 << 6)
>  #define VTCR_EL2_SL0_LVL1  (1 << 6)
>  #define VTCR_EL2_T0SZ_MASK 0x3f
> -#define VTCR_EL2_T0SZ_40B  24
> +#define VTCR_EL2_T0SZ(bits)(64 - (bits))
> 
>  #ifdef CONFIG_ARM64_64K_PAGES
>  /*
>   * Stage2 translation configuration:
> - * 40bits output (PS = 2)
> - * 40bits input  (T0SZ = 24)
>   * 64kB pages (TG0 = 1)
>   * 2 level page tables (SL = 1)
>   */
>  #define VTCR_EL2_FLAGS (VTCR_EL2_TG0_64K | VTCR_EL2_SH0_INNER | \
>  VTCR_EL2_ORGN0_WBWA | VTCR_EL2_IRGN0_WBWA | \
> -VTCR_EL2_SL0_LVL1 | VTCR_EL2_T0SZ_40B)
> -#define VTTBR_X(38 - VTCR_EL2_T0SZ_40B)
> +VTCR_EL2_SL0_LVL1)
>  #else
>  /*
>   * Stage2 translation configuration:
> - * 40bits output (PS = 2)
> - * 40bits input  (T0SZ = 24)
>   * 4kB pages (TG0 = 0)
>   * 3 level page tables (SL = 1)
>   */
>  #define VTCR_EL2_FLAGS (VTCR_EL2_TG0_4K | VTCR_EL2_SH0_INNER | \
>  VTCR_EL2_ORGN0_WBWA | VTCR_EL2_IRGN0_WBWA | \
> -VTCR_EL2_SL0_LVL1 | VTCR_EL2_T0SZ_40B)
> -#define VTTBR_X(37 - VTCR_EL2_T0SZ_40B)
> +VTCR_EL2_SL0_LVL1)
>  #endif

The PS = 2 comment was misleading as it doesn't seem to be set here. But
why dropping T0SZ? That gives the input address for stage 2 (IPA), so
whether KVM exposes a 40-bit IPA to guests is independent of whether the
kernel would use 48-

Re: [PATCH] MAINTAINERS: co-maintainance of KVM/{arm,arm64}

2014-04-27 Thread Catalin Marinas
On 26 Apr 2014, at 11:43, Christoffer Dall  wrote:
> From: Marc Zyngier 
> 
> The KVM/{arm,arm64} ports are sharing a lot of code, and are
> effectively co-maintained (and have been for quite a while).
> 
> Make the situation official and list the two maintainers
> for both ports.
> 
> Cc: Catalin Marinas 
> Cc: Russell King 
> Cc: Paolo Bonzini 
> Cc: Gleb Natapov 
> Signed-off-by: Marc Zyngier 
> Signed-off-by: Christoffer Dall 

Acked-by: Catalin Marinas 

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC] ARM VM System Sepcification

2014-02-27 Thread Catalin Marinas
On 26 February 2014 20:05, Christoffer Dall  wrote:
> On Wed, Feb 26, 2014 at 08:55:58PM +0100, Arnd Bergmann wrote:
>> On Wednesday 26 February 2014 10:34:54 Christoffer Dall wrote:
>> > For more information about UEFI and ACPI booting, see [4] and [5].
>>
>> What's the point of having ACPI in a virtual machine? You wouldn't
>> need to abstract any of the hardware in AML since you already know
>> what the virtual hardware is, so I can't see how this would help
>> anyone.
>
> The most common response I've been getting so far is that people
> generally want their VMs to look close to the real thing, but not sure
> how valid an argument that is.
>
> Some people feel strongly about this and seem to think that ARMv8
> kernels will only work with ACPI in the future...

My strong feeling is that AArch64 kernels *may* support ACPI in the future ;).

On a more serious note, both FDT and ACPI will be first-class citizens
on AArch64 and I have no intention whatsoever of dropping FDT.

-- 
Catalin
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 00/12] arm/arm64: KVM: host cache maintenance when guest caches are off

2014-02-19 Thread Catalin Marinas
On Wed, Feb 19, 2014 at 09:02:34AM +, Marc Zyngier wrote:
> On 2014-02-18 20:57, Eric Northup wrote:
> > On Tue, Feb 18, 2014 at 7:27 AM, Marc Zyngier  
> > wrote:
> >>
> >> When we run a guest with cache disabled, we don't flush the cache to
> >> the Point of Coherency, hence possibly missing bits of data that 
> >> have
> >> been written in the cache, but have not yet reached memory.
> >>
> >> We also have the opposite issue: when a guest enables its cache,
> >> whatever sits in the cache is suddenly going to become visible,
> >> shadowing whatever the guest has written into RAM.
> >>
> >> There are several approaches to these issues:
> >> - Using the DC bit when caches are off: this breaks guests assuming
> >>   caches off while doing DMA operations. Bootloaders, for example.
> >>   It also breaks the I-D coherency.
> >> - Fetch the memory attributes on translation fault, and flush the
> >>   cache while handling the fault. This relies on using the PAR_EL1
> >>   register to obtain the Stage-1 memory attributes, and tends to be
> >>   slow.
> >> - Detecting the translation faults occuring with MMU off (and
> >>   performing a cache clean), and trapping SCTLR_EL1 to detect the
> >>   moment when the guest is turning its caches on (and performing a
> >>   cache invalidation). Trapping of SCTLR_EL1 is then disabled to
> >>   ensure the best performance.
> >
> > This will preclude noticing the 2nd .. Nth cache off -> on cycles,
> > right?  Will any guests care - doesn't kexec go through a caches-off
> > state?
> 
> kexec, bootloaders, and whatever firmware requires to switch caches on 
> and then off. Guest does care, but we don't have an (efficient) 
> architectural solution to that.
> 
> The best I can think of so far is a "switch-the-damned-thing-off" 
> hypercall that would do the above before returning to the guest.

We could push for a PSCI extension to cover such cases as well, though
even for a host, we may not need to involve the secure world for kexec.

An alternative is to trap the set/way cache flushing and re-activate the
MMU trapping in the guest. If the MMU is still on, disable trapping
until the next set/way (since that's a normal function on power-down
code sequences). But it doesn't look nice ;).

-- 
Catalin
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 04/12] ARM: KVM: introduce kvm_p*d_addr_end

2014-02-18 Thread Catalin Marinas
On Tue, Feb 18, 2014 at 03:27:25PM +, Marc Zyngier wrote:
> The use of p*d_addr_end with stage-2 translation is slightly dodgy,
> as the IPA is 40bits, while all the p*d_addr_end helpers are
> taking an unsigned long (arm64 is fine with that as unligned long
> is 64bit).
> 
> The fix is to introduce 64bit clean versions of the same helpers,
> and use them in the stage-2 page table code.
> 
> Signed-off-by: Marc Zyngier 

Acked-by: Catalin Marinas 
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 05/11] ARM: LPAE: provide an IPA capable pmd_addr_end

2014-02-07 Thread Catalin Marinas
On Fri, Feb 07, 2014 at 04:04:56AM +, Christoffer Dall wrote:
> On Thu, Feb 06, 2014 at 10:43:28AM +0000, Catalin Marinas wrote:
> > On Wed, Feb 05, 2014 at 07:55:45PM +, Marc Zyngier wrote:
> > > The default pmd_addr_end macro uses an unsigned long to represent
> > > the VA. When used with KVM and stage-2 translation, the VA is
> > > actually an IPA, which is up to 40 bits. This also affect the
> > > SMMU driver, which also deals with stage-2 translation.
> > > 
> > > Instead, provide an implementation that can cope with larger VAs
> > > by using a u64 instead. This version will overload the default
> > > one provided in include/asm-generic/pgtable.h.
> > > 
> > > Signed-off-by: Marc Zyngier 
> > > ---
> > >  arch/arm/include/asm/pgtable-3level.h | 5 +
> > >  1 file changed, 5 insertions(+)
> > > 
> > > diff --git a/arch/arm/include/asm/pgtable-3level.h 
> > > b/arch/arm/include/asm/pgtable-3level.h
> > > index 03243f7..594867b 100644
> > > --- a/arch/arm/include/asm/pgtable-3level.h
> > > +++ b/arch/arm/include/asm/pgtable-3level.h
> > > @@ -262,6 +262,11 @@ static inline int has_transparent_hugepage(void)
> > >   return 1;
> > >  }
> > >  
> > > +#define pmd_addr_end(addr, end)  
> > > \
> > > +({   u64 __boundary = ((addr) + PMD_SIZE) & PMD_MASK;
> > > \
> > > + (__boundary - 1 < (end) - 1)? __boundary: (end);\
> > > +})
> > 
> > I see why you need this but it affects all the other uses of
> > pmd_addr_end() with 32-bit VA. It would be slight performance, I don't
> > think it's noticeable.
> > 
> > A different approach could be something like (untested):
> > 
> > #define pmd_addr_end(addr, end) \
> > ({  __typeof__(addr) __boundary = ...
> > ...
> > })
> > 
> > What about the pgd_addr_end(), do we need this or it's not used by KVM?
> > 
> 
> What about pud_addr_end(), is that defined as a noop on LPAE, or?
> 
> I would be in favor of introducing them all using your approach to avoid
> somebody being inspired by the KVM code when dealing with IPAs and
> breaking things unknowingly.

I had a brief chat with Marc yesterday around this and it may be safer
to simply introduce kvm_p*d_addr_end() macros. You already do this for
pgd_addr_end() since it cannot be overridden in the generic headers.

-- 
Catalin
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 10/11] ARM: KVM: add world-switch for AMAIR{0,1}

2014-02-06 Thread Catalin Marinas
On Wed, Feb 05, 2014 at 07:55:50PM +, Marc Zyngier wrote:
> HCR.TVM traps (among other things) accesses to AMAIR0 and AMAIR1.
> In order to minimise the amount of surprise a guest could generate by
> trying to access these registers with caches off, add them to the
> list of registers we switch/handle.
> 
> Signed-off-by: Marc Zyngier 
> Reviewed-by: Christoffer Dall 

Acked-by: Catalin Marinas 
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 11/11] ARM: KVM: trap VM system registers until MMU and caches are ON

2014-02-06 Thread Catalin Marinas
On Wed, Feb 05, 2014 at 07:55:51PM +, Marc Zyngier wrote:
> In order to be able to detect the point where the guest enables
> its MMU and caches, trap all the VM related system registers.
> 
> Once we see the guest enabling both the MMU and the caches, we
> can go back to a saner mode of operation, which is to leave these
> registers in complete control of the guest.
> 
> Signed-off-by: Marc Zyngier 

Acked-by: Catalin Marinas 
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 09/11] ARM: KVM: introduce per-vcpu HYP Configuration Register

2014-02-06 Thread Catalin Marinas
On Wed, Feb 05, 2014 at 07:55:49PM +, Marc Zyngier wrote:
> So far, KVM/ARM used a fixed HCR configuration per guest, except for
> the VI/VF/VA bits to control the interrupt in absence of VGIC.
> 
> With the upcoming need to dynamically reconfigure trapping, it becomes
> necessary to allow the HCR to be changed on a per-vcpu basis.
> 
> The fix here is to mimic what KVM/arm64 already does: a per vcpu HCR
> field, initialized at setup time.
> 
> Signed-off-by: Marc Zyngier 
> Reviewed-by: Christoffer Dall 

Acked-by: Catalin Marinas 
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 07/11] ARM: KVM: fix handling of trapped 64bit coprocessor accesses

2014-02-06 Thread Catalin Marinas
On Wed, Feb 05, 2014 at 07:55:47PM +, Marc Zyngier wrote:
> Commit 240e99cbd00a (ARM: KVM: Fix 64-bit coprocessor handling)
> changed the way we match the 64bit coprocessor access from
> user space, but didn't update the trap handler for the same
> set of registers.
> 
> The effect is that a trapped 64bit access is never matched, leading
> to a fault being injected into the guest. This went unnoticed as we
> didn't really trap any 64bit register so far.
> 
> Placing the CRm field of the access into the CRn field of the matching
> structure fixes the problem. Also update the debug feature to emit the
> expected string in case of failing match.
> 
> Signed-off-by: Marc Zyngier 
> Reviewed-by: Christoffer Dall 

Acked-by: Catalin Marinas 
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 08/11] ARM: KVM: fix ordering of 64bit coprocessor accesses

2014-02-06 Thread Catalin Marinas
On Wed, Feb 05, 2014 at 07:55:48PM +, Marc Zyngier wrote:
> Commit 240e99cbd00a (ARM: KVM: Fix 64-bit coprocessor handling)
> added an ordering dependency for the 64bit registers.
> 
> The order described is: CRn, CRm, Op1, Op2, 64bit-first.
> 
> Unfortunately, the implementation is: CRn, 64bit-first, CRm...
> 
> Move the 64bit test to be last in order to match the documentation.
> 
> Signed-off-by: Marc Zyngier 
> Reviewed-by: Christoffer Dall 

Acked-by: Catalin Marinas 
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 06/11] ARM: KVM: force cache clean on page fault when caches are off

2014-02-06 Thread Catalin Marinas
On Wed, Feb 05, 2014 at 07:55:46PM +, Marc Zyngier wrote:
> In order for a guest with caches disabled to observe data written
> contained in a given page, we need to make sure that page is
> committed to memory, and not just hanging in the cache (as guest
> accesses are completely bypassing the cache until it decides to
> enable it).
> 
> For this purpose, hook into the coherent_cache_guest_page
> function and flush the region if the guest SCTLR
> register doesn't show the MMU and caches as being enabled.
> 
> Signed-off-by: Marc Zyngier 
> Reviewed-by: Christoffer Dall 

Reviewed-by: Catalin Marinas 
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 05/11] ARM: LPAE: provide an IPA capable pmd_addr_end

2014-02-06 Thread Catalin Marinas
On Wed, Feb 05, 2014 at 07:55:45PM +, Marc Zyngier wrote:
> The default pmd_addr_end macro uses an unsigned long to represent
> the VA. When used with KVM and stage-2 translation, the VA is
> actually an IPA, which is up to 40 bits. This also affect the
> SMMU driver, which also deals with stage-2 translation.
> 
> Instead, provide an implementation that can cope with larger VAs
> by using a u64 instead. This version will overload the default
> one provided in include/asm-generic/pgtable.h.
> 
> Signed-off-by: Marc Zyngier 
> ---
>  arch/arm/include/asm/pgtable-3level.h | 5 +
>  1 file changed, 5 insertions(+)
> 
> diff --git a/arch/arm/include/asm/pgtable-3level.h 
> b/arch/arm/include/asm/pgtable-3level.h
> index 03243f7..594867b 100644
> --- a/arch/arm/include/asm/pgtable-3level.h
> +++ b/arch/arm/include/asm/pgtable-3level.h
> @@ -262,6 +262,11 @@ static inline int has_transparent_hugepage(void)
>   return 1;
>  }
>  
> +#define pmd_addr_end(addr, end)  
> \
> +({   u64 __boundary = ((addr) + PMD_SIZE) & PMD_MASK;\
> + (__boundary - 1 < (end) - 1)? __boundary: (end);\
> +})

I see why you need this but it affects all the other uses of
pmd_addr_end() with 32-bit VA. It would be slight performance, I don't
think it's noticeable.

A different approach could be something like (untested):

#define pmd_addr_end(addr, end) \
({  __typeof__(addr) __boundary = ...
...
})

What about the pgd_addr_end(), do we need this or it's not used by KVM?

-- 
Catalin
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2] ARM: mm: Fix stage-2 device memory attributes

2014-01-27 Thread Catalin Marinas
On Mon, Jan 27, 2014 at 05:02:25PM +, Marc Zyngier wrote:
> On 27/01/14 16:57, Catalin Marinas wrote:
> > On Mon, Jan 27, 2014 at 11:16:57AM +, Marc Zyngier wrote:
> >> On 24/01/14 23:37, Christoffer Dall wrote:
> >>> On Sat, Jan 04, 2014 at 08:27:23AM -0800, Christoffer Dall wrote:
> >>>> --- a/arch/arm/include/asm/pgtable-3level.h
> >>>> +++ b/arch/arm/include/asm/pgtable-3level.h
> >>>> @@ -120,13 +120,19 @@
> >>>>  /*
> >>>>   * 2nd stage PTE definitions for LPAE.
> >>>>   */
> >>>> -#define L_PTE_S2_MT_UNCACHED (_AT(pteval_t, 0x5) << 2) /* 
> >>>> MemAttr[3:0] */
> >>>> -#define L_PTE_S2_MT_WRITETHROUGH (_AT(pteval_t, 0xa) << 2) /* 
> >>>> MemAttr[3:0] */
> >>>> -#define L_PTE_S2_MT_WRITEBACK(_AT(pteval_t, 0xf) << 2) /* 
> >>>> MemAttr[3:0] */
> >>>> -#define L_PTE_S2_RDONLY  (_AT(pteval_t, 1) << 6)   /* HAP[1]   
> >>>> */
> >>>> -#define L_PTE_S2_RDWR(_AT(pteval_t, 3) << 6)   /* HAP[2:1] 
> >>>> */
> >>>> -
> >>>> -#define L_PMD_S2_RDWR(_AT(pmdval_t, 3) << 6)   /* HAP[2:1] 
> >>>> */
> >>>> +#define L_PTE_S2_MT_UNCACHED(_AT(pteval_t, 0x0) << 2) /* 
> >>>> strongly ordered */
> >>>> +#define L_PTE_S2_MT_WRITETHROUGH(_AT(pteval_t, 0xa) << 2) /* 
> >>>> normal inner write-through */
> >>>> +#define L_PTE_S2_MT_WRITEBACK   (_AT(pteval_t, 0xf) << 2) /* 
> >>>> normal inner write-back */
> >>>> +#define L_PTE_S2_MT_DEV_SHARED  (_AT(pteval_t, 0x1) << 2) /* 
> >>>> device */
> >>>> +#define L_PTE_S2_MT_DEV_NONSHARED   (_AT(pteval_t, 0x1) << 2) /* 
> >>>> device */
> >>>> +#define L_PTE_S2_MT_DEV_WC  (_AT(pteval_t, 0x5) << 2) /* 
> >>>> normal non-cacheable */
> >>>> +#define L_PTE_S2_MT_DEV_CACHED  (_AT(pteval_t, 0xf) << 2) /* 
> >>>> normal inner write-back */
> >>>> +#define L_PTE_S2_MT_MASK(_AT(pteval_t, 0xf) << 2)
> >>>> +
> >>>> +#define L_PTE_S2_RDONLY (_AT(pteval_t, 1) << 6)   /* 
> >>>> HAP[1]   */
> >>>> +#define L_PTE_S2_RDWR   (_AT(pteval_t, 3) << 6)   /* 
> >>>> HAP[2:1] */
> >>>> +
> >>>> +#define L_PMD_S2_RDWR   (_AT(pmdval_t, 3) << 6)   /* 
> >>>> HAP[2:1] */
> >>>>  
> >>>>  /*
> >>>>   * Hyp-mode PL2 PTE definitions for LPAE.
> >>
> >> The change makes sense to me. arm64 uses a slightly different approach,
> >> by using a PTE_S2_MEMATTR macro, but I'm not sure that would work for ARM.
> >>
> >> Russell, Catalin: could you please have a look at this?
> > 
> > Do we actually need more than Normal Cacheable and Device for stage 2?
> 
> Not so far. As long as these two memory types are enforced as a minimum,
> we're quite happy to let the guest use whatever it decides.
> 
> I suppose Christoffer introduces them all here as a matter of
> completeness, but I don't see them as being useful anytime soon.

That would be useful on arm if you want cachepolicy= argument to force
the cacheability of guest Normal memory type.

On arm64, the stage 1 memory type is decided via MAIR and that's how we
handle cachepolicy for Normal memory. But for stage 2 this won't work,
the type is explicitly set in the MemAttr encoding. But I don't think we
need host cachepolicy enforced onto guest.

-- 
Catalin
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2] ARM: mm: Fix stage-2 device memory attributes

2014-01-27 Thread Catalin Marinas
On Mon, Jan 27, 2014 at 11:16:57AM +, Marc Zyngier wrote:
> On 24/01/14 23:37, Christoffer Dall wrote:
> > On Sat, Jan 04, 2014 at 08:27:23AM -0800, Christoffer Dall wrote:
> >> --- a/arch/arm/include/asm/pgtable-3level.h
> >> +++ b/arch/arm/include/asm/pgtable-3level.h
> >> @@ -120,13 +120,19 @@
> >>  /*
> >>   * 2nd stage PTE definitions for LPAE.
> >>   */
> >> -#define L_PTE_S2_MT_UNCACHED   (_AT(pteval_t, 0x5) << 2) /* 
> >> MemAttr[3:0] */
> >> -#define L_PTE_S2_MT_WRITETHROUGH (_AT(pteval_t, 0xa) << 2) /* 
> >> MemAttr[3:0] */
> >> -#define L_PTE_S2_MT_WRITEBACK  (_AT(pteval_t, 0xf) << 2) /* 
> >> MemAttr[3:0] */
> >> -#define L_PTE_S2_RDONLY(_AT(pteval_t, 1) << 6)   /* HAP[1]   
> >> */
> >> -#define L_PTE_S2_RDWR  (_AT(pteval_t, 3) << 6)   /* HAP[2:1] 
> >> */
> >> -
> >> -#define L_PMD_S2_RDWR  (_AT(pmdval_t, 3) << 6)   /* HAP[2:1] 
> >> */
> >> +#define L_PTE_S2_MT_UNCACHED  (_AT(pteval_t, 0x0) << 2) /* 
> >> strongly ordered */
> >> +#define L_PTE_S2_MT_WRITETHROUGH  (_AT(pteval_t, 0xa) << 2) /* normal 
> >> inner write-through */
> >> +#define L_PTE_S2_MT_WRITEBACK (_AT(pteval_t, 0xf) << 2) /* 
> >> normal inner write-back */
> >> +#define L_PTE_S2_MT_DEV_SHARED(_AT(pteval_t, 0x1) << 2) /* 
> >> device */
> >> +#define L_PTE_S2_MT_DEV_NONSHARED (_AT(pteval_t, 0x1) << 2) /* device */
> >> +#define L_PTE_S2_MT_DEV_WC(_AT(pteval_t, 0x5) << 2) /* 
> >> normal non-cacheable */
> >> +#define L_PTE_S2_MT_DEV_CACHED(_AT(pteval_t, 0xf) << 2) /* 
> >> normal inner write-back */
> >> +#define L_PTE_S2_MT_MASK  (_AT(pteval_t, 0xf) << 2)
> >> +
> >> +#define L_PTE_S2_RDONLY   (_AT(pteval_t, 1) << 6)   /* 
> >> HAP[1]   */
> >> +#define L_PTE_S2_RDWR (_AT(pteval_t, 3) << 6)   /* 
> >> HAP[2:1] */
> >> +
> >> +#define L_PMD_S2_RDWR (_AT(pmdval_t, 3) << 6)   /* 
> >> HAP[2:1] */
> >>  
> >>  /*
> >>   * Hyp-mode PL2 PTE definitions for LPAE.
> 
> The change makes sense to me. arm64 uses a slightly different approach,
> by using a PTE_S2_MEMATTR macro, but I'm not sure that would work for ARM.
> 
> Russell, Catalin: could you please have a look at this?

Do we actually need more than Normal Cacheable and Device for stage 2?

-- 
Catalin
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 1/2] KVM: ARM: Support hugetlbfs backed huge pages

2013-10-07 Thread Catalin Marinas
On Fri, Oct 04, 2013 at 05:13:20PM +0100, Christoffer Dall wrote:
> --- a/arch/arm64/include/asm/pgtable-hwdef.h
> +++ b/arch/arm64/include/asm/pgtable-hwdef.h
> @@ -85,6 +85,8 @@
>  #define PTE_S2_RDONLY  (_AT(pteval_t, 1) << 6)   /* HAP[2:1] */
>  #define PTE_S2_RDWR(_AT(pteval_t, 3) << 6)   /* HAP[2:1] */
> 
> +#define PMD_S2_RDWR(_AT(pmdval_t, 3) << 6)   /* HAP[2:1] */
> +
>  /*
>   * Memory Attribute override for Stage-2 (MemAttr[3:0])
>   */

For the arm64 part:

Acked-by: Catalin Marinas 
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 3/4] arm64: KVM: let other tasks run when hitting WFE

2013-07-23 Thread Catalin Marinas
On Mon, Jul 22, 2013 at 01:51:52PM +0100, Christoffer Dall wrote:
> On 22 July 2013 10:53, Raghavendra KT  wrote:
> > On Fri, Jul 19, 2013 at 7:23 PM, Marc Zyngier  wrote:
> >> So far, when a guest executes WFE (like when waiting for a spinlock
> >> to become unlocked), we don't do a thing and let it run uninterrupted.
> >>
> >> Another option is to trap a blocking WFE and offer the opportunity
> >> to the scheduler to switch to another task, potentially giving the
> >> vcpu holding the spinlock a chance to run sooner.
> >>
> >
> > Idea looks to be correct from my experiments on x86. It does bring some
> > percentage of benefits in overcommitted guests. Infact,
> >
> > https://lkml.org/lkml/2013/7/22/41 tries to do the same thing for x86.
> > (this results in using ple handler heuristics in vcpu_block pach).
> 
> What about the adverse effect in the non-overcommitted case?

Could we not detect overcommitment and only set the TWE bit in this case
(per VCPU scheduled to run)?

The advantage of a properly para-virtualised lock is that you can target
which VCPU to wake up. I don't think we can do this currently (without
assumptions about the underlying OS and how the compiler allocates
registers in the ticket spinlocks).

There have been suggestions to use WFE in user-space for a more power
efficient busy loop (usually in user-only locking, I think some
PostreSQL does that) together with an automatic even stream for waking
up the WFE (Sudeep posted some patches recently for enabling 100us event
stream). If such feature gets used, we have a new meaning for WFE and we
may not want to trap it all the time.

Question for Will - do we have a PMU counter for WFE? (don't ask why ;))

-- 
Catalin
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Planning the merge of KVM/arm64

2013-06-05 Thread Catalin Marinas
On Wed, Jun 05, 2013 at 07:01:05AM +0100, Gleb Natapov wrote:
> On Tue, Jun 04, 2013 at 10:57:32PM -0700, Christoffer Dall wrote:
> > On 4 June 2013 09:37, Gleb Natapov  wrote:
> > > On Tue, Jun 04, 2013 at 05:51:41PM +0200, Paolo Bonzini wrote:
> > >> Il 04/06/2013 17:43, Christoffer Dall ha scritto:
> > >> > Hi Paolo,
> > >> >
> > >> > I don't think this is an issue. Gleb and Marcelo for example pulled
> > >> > RMK's stable tree for my KVM/ARM updates for the 3.10 merge window and
> > >> > that wasn't an issue.  If Linus pulls the kvm/next tree first the
> > >> > diffstat should be similar and everything clean enough, no?
> > >> >
> > >> > Catalin has previously expressed his wish to upstream the kvm/arm64
> > >> > patches directly through him given the churn in a completely new
> > >> > architecture and he wants to make sure that everything looks right.
> > >> >
> > >> > It's a pretty clean implementation with quite few dependencies and
> > >> > merging as a working series should be a priority instead of the
> > >> > Kconfig hack, imho.
> > >>
> > >> Ok, let's see what Gleb says.
> > >>
> > > I have no objection to merge arm64 kvm trough Catalin if it mean less
> > > churn for everyone. That's what we did with arm and mips. Arm64 kvm
> > > has a dependency on kvm.git next though, so how Catalin make sure that
> > > everything looks right? Will he merge kvm.git/next to arm64 tree?
> > >
> > Yes, that was the idea. Everything in kvm/next is considered stable, right?
> > 
> Right. Catalin should wait for kvm.git to be pulled by Linus next merge
> windows before sending his pull request then.

I think it's better if I push the bulk of the arm64 KVM branch but
without Kconfig patch enabling it. This branch would be based on
mainline rather than kvm/next. Once your code goes in mainline, I'll
just push the Kconfig entry (for bisection reasons, it could be after
-rc1). This would keep the pull-request diffstat cleaner.

As we discussed some time ago, after the core arm64 KVM is merged you
will use the same workflow as for arm (merge via the kvm tree).

Thanks.

-- 
Catalin
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Planning the merge of KVM/arm64

2013-06-04 Thread Catalin Marinas
On Tue, Jun 04, 2013 at 02:13:52PM +0100, Anup Patel wrote:
> Hi Marc,
> 
> On Tue, Jun 4, 2013 at 5:59 PM, Marc Zyngier  wrote:
> > Guys,
> >
> > The KVM/arm64 code is now, as it seems, in good enough shape to be
> > merged. I've so far addressed all the comments, and it doesn't seem any
> > worse then what is queued for its 32bit counterpart.
> >
> > For reference, it is sitting there:
> > git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms.git
> > kvm-arm64/kvm
> >
> > What is not defined yet is the merge path:
> > - It is touching some of the arm64 core code, so it would be better if
> > it was merged through the arm64 tree
> > - It is depending on some of the patches in the core KVM queue (the
> > vgic/timer move to virt/kvm/arm/)
> > - It is also depending on some of the patches that are in the KVM/ARM
> > queue (parametrized timer interrupt, some MMU/MMIO fixes)
> >
> > So I can see two possibilities:
> > - Either I can rely on a stable branch from both KVM and KVM/ARM trees
> > on which I can base my tree for Catalin/Will to pull,
> > - Or I ask Catalin to only pull the arm64 part *minus the Kconfig*, and
> > only merge this last bit when the dependencies are satisfied in Linus' tree.
> >
> > What do you guys think?
> 
> I had quick look at your kvm-arm64/kvm branch. I agree with the approach
> of going through arm64 tree.
> 
> FYI, latest tested branch on APM ARMv8 board is kvm-arm64/kvm-3.10-rc3
> branch.
> 
> From my side, +1 for the second option that is "pull the arm64 part *minus
> the Kconfig*, and ..."

+1 as well for the second option.

-- 
Catalin
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 30/32] arm64: KVM: userspace API documentation

2013-05-23 Thread Catalin Marinas
On Tue, May 14, 2013 at 03:13:58PM +0100, Marc Zyngier wrote:
> Unsurprisingly, the arm64 userspace API is extremely similar to
> the 32bit one, the only significant difference being the ONE_REG
> register mapping.
> 
> Signed-off-by: Marc Zyngier 

Reviewed-by: Catalin Marinas 
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 32/32] arm64: KVM: document kernel object mappings in HYP

2013-05-23 Thread Catalin Marinas
On Tue, May 14, 2013 at 03:14:00PM +0100, Marc Zyngier wrote:
> HYP mode has access to some of the kernel pages. Document the
> memory mapping and the offset between kernel VA and HYP VA.
> 
> Signed-off-by: Marc Zyngier 

Reviewed-by: Catalin Marinas 
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 29/32] arm64: KVM: enable initialization of a 32bit vcpu

2013-05-23 Thread Catalin Marinas
On Tue, May 14, 2013 at 03:13:57PM +0100, Marc Zyngier wrote:
> Wire the init of a 32bit vcpu by allowing 32bit modes in pstate,
> and providing sensible defaults out of reset state.
> 
> This feature is of course conditioned by the presence of 32bit
> capability on the physical CPU, and is checked by the KVM_CAP_ARM_EL1_32BIT
> capability.
> 
> Signed-off-by: Marc Zyngier 

Reviewed-by: Catalin Marinas 
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 28/32] arm64: KVM: 32bit guest fault injection

2013-05-23 Thread Catalin Marinas
On Tue, May 14, 2013 at 03:13:56PM +0100, Marc Zyngier wrote:
> Add fault injection capability for 32bit guests.
> 
> Reviewed-by: Christopher Covington 
> Signed-off-by: Marc Zyngier 

Reviewed-by: Catalin Marinas 
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 27/32] arm64: KVM: 32bit specific register world switch

2013-05-23 Thread Catalin Marinas
On Tue, May 14, 2013 at 03:13:55PM +0100, Marc Zyngier wrote:
> Allow registers specific to 32bit guests to be saved/restored
> during the world switch.
> 
> Reviewed-by: Christopher Covington 
> Signed-off-by: Marc Zyngier 

Reviewed-by: Catalin Marinas 
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 26/32] arm64: KVM: CPU specific 32bit coprocessor access

2013-05-23 Thread Catalin Marinas
On Tue, May 14, 2013 at 03:13:54PM +0100, Marc Zyngier wrote:
> Enable handling of CPU specific 32bit coprocessor access. Not much
> here either.
> 
> Reviewed-by: Christopher Covington 
> Signed-off-by: Marc Zyngier 

Reviewed-by: Catalin Marinas 
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 25/32] arm64: KVM: 32bit handling of coprocessor traps

2013-05-23 Thread Catalin Marinas
On Tue, May 14, 2013 at 03:13:53PM +0100, Marc Zyngier wrote:
> Provide the necessary infrastructure to trap coprocessor accesses that
> occur when running 32bit guests.
> 
> Also wire SMC and HVC trapped in 32bit mode while were at it.
> 
> Reviewed-by: Christopher Covington 
> Signed-off-by: Marc Zyngier 

Reviewed-by: Catalin Marinas 
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 24/32] arm64: KVM: 32bit conditional execution emulation

2013-05-23 Thread Catalin Marinas
On Tue, May 14, 2013 at 03:13:52PM +0100, Marc Zyngier wrote:
> As conditional instructions can trap on AArch32, add the thinest
> possible emulation layer to keep 32bit guests happy.
> 
> Reviewed-by: Christopher Covington 
> Signed-off-by: Marc Zyngier 

Reviewed-by: Catalin Marinas 
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 23/32] arm64: KVM: 32bit GP register access

2013-05-23 Thread Catalin Marinas
On Tue, May 14, 2013 at 03:13:51PM +0100, Marc Zyngier wrote:
> Allow access to the 32bit register file through the usual API.
> 
> Reviewed-by: Christopher Covington 
> Signed-off-by: Marc Zyngier 

Reviewed-by: Catalin Marinas 
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 22/32] arm64: KVM: define 32bit specific registers

2013-05-23 Thread Catalin Marinas
On Tue, May 14, 2013 at 03:13:50PM +0100, Marc Zyngier wrote:
> Define the 32bit specific registers (SPSRs, cp15...).
> 
> Most CPU registers are directly mapped to a 64bit register
> (r0->x0...). Only the SPSRs have separate registers.
> 
> cp15 registers are also mapped into their 64bit counterpart in most
> cases.
> 
> Reviewed-by: Christopher Covington 
> Signed-off-by: Marc Zyngier 

Reviewed-by: Catalin Marinas 
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 21/32] arm64: KVM: Build system integration

2013-05-22 Thread Catalin Marinas
On Tue, May 21, 2013 at 05:09:47PM +0100, Paolo Bonzini wrote:
> 
> 
> - Messaggio originale -
> > Da: "Catalin Marinas" 
> > A: "Marc Zyngier" 
> > Cc: linux-arm-ker...@lists.infradead.org, kvm...@lists.cs.columbia.edu, 
> > kvm@vger.kernel.org, "Will Deacon"
> > , pbonz...@redhat.com, g...@redhat.com, "Christopher 
> > Covington" 
> > Inviato: Martedì, 21 maggio 2013 17:42:27
> > Oggetto: Re: [PATCH v4 21/32] arm64: KVM: Build system integration
> > 
> > On Tue, May 14, 2013 at 03:13:49PM +0100, Marc Zyngier wrote:
> > > --- /dev/null
> > > +++ b/arch/arm64/kvm/Kconfig
> > ...
> > > +config KVM_ARM_VGIC
> > > +bool
> > > + depends on KVM_ARM_HOST && OF
> > > + select HAVE_KVM_IRQCHIP
> > > + ---help---
> > > +   Adds support for a hardware assisted, in-kernel GIC emulation.
> > > +
> > > +config KVM_ARM_TIMER
> > > +bool
> > > + depends on KVM_ARM_VGIC
> > > + select HAVE_KVM_IRQCHIP
> > > + ---help---
> > > +   Adds support for the Architected Timers in virtual machines
> > > +
> > > +source drivers/virtio/Kconfig
> > 
> > Shouldn't the above configs go to drivers/virtio/Kconfig as well (or
> > drivers/virtio/arm/Kconfig)?
> 
> virtio doesn't mean "virtual versions of devices that also exist in
> hardware"; those are indeed in arch/ARCH/kvm.  virtio is a specific bus
> for paravirtualized devices.  It is not KVM-specific and the code runs
> in the guest (whereas arch/arm64/kvm/Kconfig is host code).

You are right. What I meant was kvm/virt/arm/Kconfig (the place where
the vgic and timer driver goes).

-- 
Catalin
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 0/7] ARM: KVM: various mmu related fixes for 3.10

2013-05-21 Thread Catalin Marinas
On Tue, May 14, 2013 at 12:11:33PM +0100, Marc Zyngier wrote:
> This patch series fixes a number of of KVM/ARM issues that have either
> been spotted during the review of the arm64 code, or while reworking
> related code.
> 
> Only the first patch fixes a potential (if unlikely) problem, the
> others are either cosmetic or performance related.
> 
> Tested on TC-2, based on 3.10-rc1.
> 
> * From v2:
>   - [1/7] Drop the unnecessary "TLB invalidate all", as we already do
>   it on a page-per-page level.
>   - [3/7] Add a cache cleaning primitive, move the cleaning outside of
>   kvm_set_pte(), and clean a range as large as possible when inserting
>   PTEs.
>   - [4,5,7/7] New patches
> 
> Marc Zyngier (7):
>   ARM: KVM: be more thorough when invalidating TLBs
>   ARM: KVM: remove dead prototype for __kvm_tlb_flush_vmid
>   ARM: KVM: relax cache maintainance when building page tables
>   ARM: KVM: use phys_addr_t instead of unsigned long long for HYP PGDs
>   ARM: KVM: don't special case PC when doing an MMIO
>   ARM: KVM: get rid of S2_PGD_SIZE
>   ARM: KVM: drop use of PAGE_S2_DEVICE

This series looks good to me:

Acked-by: Catalin Marinas 
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 31/32] arm64: KVM: MAINTAINERS update

2013-05-21 Thread Catalin Marinas
On Tue, May 14, 2013 at 03:13:59PM +0100, Marc Zyngier wrote:
> Elect myself as the KVM/arm64 maintainer.
> 
> Signed-off-by: Marc Zyngier 

Acked-by: Catalin Marinas 
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 21/32] arm64: KVM: Build system integration

2013-05-21 Thread Catalin Marinas
On Tue, May 14, 2013 at 03:13:49PM +0100, Marc Zyngier wrote:
> --- /dev/null
> +++ b/arch/arm64/kvm/Kconfig
...
> +config KVM_ARM_VGIC
> +bool
> + depends on KVM_ARM_HOST && OF
> + select HAVE_KVM_IRQCHIP
> + ---help---
> +   Adds support for a hardware assisted, in-kernel GIC emulation.
> +
> +config KVM_ARM_TIMER
> +bool
> + depends on KVM_ARM_VGIC
> + select HAVE_KVM_IRQCHIP
> + ---help---
> +   Adds support for the Architected Timers in virtual machines
> +
> +source drivers/virtio/Kconfig

Shouldn't the above configs go to drivers/virtio/Kconfig as well (or
drivers/virtio/arm/Kconfig)?

-- 
Catalin
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 20/32] arm64: KVM: PSCI implementation

2013-05-21 Thread Catalin Marinas
On Tue, May 14, 2013 at 03:13:48PM +0100, Marc Zyngier wrote:
> Wire the PSCI backend into the exit handling code.
> 
> Reviewed-by: Christopher Covington 
> Signed-off-by: Marc Zyngier 

Reviewed-by: Catalin Marinas 
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 19/32] arm64: KVM: Plug the arch timer

2013-05-21 Thread Catalin Marinas
On Tue, May 14, 2013 at 03:13:47PM +0100, Marc Zyngier wrote:
> Add support for the in-kernel timer emulation.
> 
> Reviewed-by: Christopher Covington 
> Signed-off-by: Marc Zyngier 

I think same comment as one of the previous patches, you could get rid
of some ISBs if you get an ERET anyway. Otherwise:

Reviewed-by: Catalin Marinas 
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 18/32] arm64: KVM: Plug the VGIC

2013-05-21 Thread Catalin Marinas
On Tue, May 14, 2013 at 03:13:46PM +0100, Marc Zyngier wrote:
> Add support for the in-kernel GIC emulation.
> 
> Reviewed-by: Christopher Covington 
> Signed-off-by: Marc Zyngier 

Reviewed-by: Catalin Marinas 
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 17/32] arm64: KVM: Exit handling

2013-05-21 Thread Catalin Marinas
On Tue, May 14, 2013 at 03:13:45PM +0100, Marc Zyngier wrote:
> Handle the exit of a VM, decoding the exit reason from HYP mode
> and calling the corresponding handler.
> 
> Reviewed-by: Christopher Covington 
> Signed-off-by: Marc Zyngier 

Reviewed-by: Catalin Marinas 
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 16/32] arm64: KVM: HYP mode world switch implementation

2013-05-21 Thread Catalin Marinas
On Tue, May 14, 2013 at 03:13:44PM +0100, Marc Zyngier wrote:
> +// void __kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa);
> +ENTRY(__kvm_tlb_flush_vmid_ipa)
> +   kern_hyp_va x0
> +   ldr x2, [x0, #KVM_VTTBR]
> +   msr vttbr_el2, x2
> +   isb
> +
> +   /*
> +* We could do so much better if we had the VA as well.
> +* Instead, we invalidate Stage-2 for this IPA, and the
> +* whole of Stage-1. Weep...
> +*/
> +   tlbiipas2e1is, x1
> +   dsb sy
> +   tlbivmalle1is
> +   dsb sy
> +   isb
> +
> +   msr vttbr_el2, xzr
> +   isb
> +   ret
> +ENDPROC(__kvm_tlb_flush_vmid_ipa)

There are some isbs here which could be removed if you need an eret
anyway.

> +ENTRY(__kvm_flush_vm_context)
> +   tlbialle1is
> +   ic  ialluis
> +   dsb sy
> +   isb
> +   ret
> +ENDPROC(__kvm_flush_vm_context)

I didn't fully understand - why do we need I-cache maintenance here? Is
it for ASID-tagged VIVT I-cache?

BTW, the arch/arm equivalent has some better comments on this code ;).

-- 
Catalin
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 15/32] arm64: KVM: hypervisor initialization code

2013-05-21 Thread Catalin Marinas
On Tue, May 14, 2013 at 03:13:43PM +0100, Marc Zyngier wrote:
> --- /dev/null
> +++ b/arch/arm64/kvm/hyp-init.S
...
> + .text
> + .pushsection.hyp.idmap.text, "ax"
> +
> + .align  11
> +
> +__kvm_hyp_init:
> + .global __kvm_hyp_init
> +
> +ENTRY(__kvm_hyp_init_vector)

Why do you need both __kvm_hyp_init and __kvm_hyp_init_vector? You could
drop the former.

> +__do_hyp_init:
> +
> + msr ttbr0_el2, x0
> +
> + mrs x4, tcr_el1
> + ldr x5, =TCR_EL2_MASK
> + and x4, x4, x5
> + ldr x5, =TCR_EL2_FLAGS
> + orr x4, x4, x5
> + msr tcr_el2, x4
> +
> + ldr x4, =VTCR_EL2_FLAGS
> + msr vtcr_el2, x4
> +
> + mrs x4, mair_el1
> + msr mair_el2, x4
> + isb
> +
> + mov x4, #SCTLR_EL2_FLAGS
> + msr sctlr_el2, x4
> + isb
> +
> + /* MMU is now enabled. Get ready for the trampoline dance */
> + ldr x4, =TRAMPOLINE_VA
> + adr x5, target
> + bfi x4, x5, #0, #PAGE_SHIFT
> + br  x4
> +
> + nop

What is this nop for?

-- 
Catalin
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 14/32] arm64: KVM: guest one-reg interface

2013-05-21 Thread Catalin Marinas
On Tue, May 14, 2013 at 03:13:42PM +0100, Marc Zyngier wrote:
> Let userspace play with the guest registers.
> 
> Reviewed-by: Christopher Covington 
> Signed-off-by: Marc Zyngier 
> --- /dev/null
> +++ b/arch/arm64/kvm/guest.c
...
> +static int get_core_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
> +{
> + __u32 __user *uaddr = (__u32 __user *)(unsigned long)reg->addr;
> + struct kvm_regs *regs = vcpu_gp_regs(vcpu);
> + int nr_regs = sizeof(*regs) / sizeof(__u32);

Was there any conclusion on using __u32 rather than __u64 here? At least
you should add a comment since it's slightly confusing.

-- 
Catalin
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 13/32] arm64: KVM: MMIO access backend

2013-05-21 Thread Catalin Marinas
On Tue, May 14, 2013 at 03:13:41PM +0100, Marc Zyngier wrote:
> Define the necessary structures to perform an MMIO access.
> 
> Reviewed-by: Christopher Covington 
> Signed-off-by: Marc Zyngier 

Reviewed-by: Catalin Marinas 
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 12/32] arm64: KVM: kvm_arch and kvm_vcpu_arch definitions

2013-05-20 Thread Catalin Marinas
On Tue, May 14, 2013 at 03:13:40PM +0100, Marc Zyngier wrote:
> Provide the architecture dependent structures for VM and
> vcpu abstractions.
> 
> Reviewed-by: Christopher Covington 
> Signed-off-by: Marc Zyngier 

Reviewed-by: Catalin Marinas 
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 11/32] arm64: KVM: virtual CPU reset

2013-05-20 Thread Catalin Marinas
On Tue, May 14, 2013 at 03:13:39PM +0100, Marc Zyngier wrote:
> Provide the reset code for a virtual CPU booted in 64bit mode.
> 
> Reviewed-by: Christopher Covington 
> Signed-off-by: Marc Zyngier 

Reviewed-by: Catalin Marinas 
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 10/32] arm64: KVM: CPU specific system registers handling

2013-05-20 Thread Catalin Marinas
On Tue, May 14, 2013 at 03:13:38PM +0100, Marc Zyngier wrote:
> Add the support code for CPU specific system registers. Not much
> here yet.
> 
> Reviewed-by: Christopher Covington 
> Signed-off-by: Marc Zyngier 

Reviewed-by: Catalin Marinas 
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 07/32] arm64: KVM: architecture specific MMU backend

2013-05-20 Thread Catalin Marinas
On Mon, May 20, 2013 at 05:17:31PM +0100, Marc Zyngier wrote:
> On 20/05/13 16:57, Catalin Marinas wrote:
> > On Tue, May 14, 2013 at 03:13:35PM +0100, Marc Zyngier wrote:
> >> +static inline bool kvm_is_write_fault(unsigned long esr)
> >> +{
> >> +  unsigned long esr_ec = esr >> ESR_EL2_EC_SHIFT;
> > 
> > Not that it would make much difference but for consistency - we use esr
> > as an 'unsigned int' in the arm64 code (only 32-bit needed).
> 
> Indeed. Changing it would require a 32bit patch though, as I'd like to
> keep the two code bases in line.
> 
> If you don't mind, I'll queue a patch changing this to "unsigned int" on
> both architectures once this is merged.

I'm OK if you push the patch afterwards, you can add my reviewed-by on
this patch.

Reviewed-by: Catalin Marinas 
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 09/32] arm64: KVM: system register handling

2013-05-20 Thread Catalin Marinas
On Tue, May 14, 2013 at 03:13:37PM +0100, Marc Zyngier wrote:
> Provide 64bit system register handling, modeled after the cp15
> handling for ARM.
> 
> Reviewed-by: Christopher Covington 
> Signed-off-by: Marc Zyngier 

Reviewed-by: Catalin Marinas 
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 08/32] arm64: KVM: user space interface

2013-05-20 Thread Catalin Marinas
On Tue, May 14, 2013 at 03:13:36PM +0100, Marc Zyngier wrote:
> Provide the kvm.h file that defines the user space visible
> interface.
> 
> Reviewed-by: Christopher Covington 
> Signed-off-by: Marc Zyngier 

Reviewed-by: Catalin Marinas 
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 07/32] arm64: KVM: architecture specific MMU backend

2013-05-20 Thread Catalin Marinas
On Tue, May 14, 2013 at 03:13:35PM +0100, Marc Zyngier wrote:
> +static inline bool kvm_is_write_fault(unsigned long esr)
> +{
> + unsigned long esr_ec = esr >> ESR_EL2_EC_SHIFT;

Not that it would make much difference but for consistency - we use esr
as an 'unsigned int' in the arm64 code (only 32-bit needed).

-- 
Catalin
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 06/32] arm64: KVM: fault injection into a guest

2013-05-20 Thread Catalin Marinas
On Tue, May 14, 2013 at 03:13:34PM +0100, Marc Zyngier wrote:
> Implement the injection of a fault (undefined, data abort or
> prefetch abort) into a 64bit guest.
> 
> Reviewed-by: Christopher Covington 
> Signed-off-by: Marc Zyngier 

Reviewed-by: Catalin Marinas 
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 05/32] arm64: KVM: Basic ESR_EL2 helpers and vcpu register access

2013-05-20 Thread Catalin Marinas
On Tue, May 14, 2013 at 03:13:33PM +0100, Marc Zyngier wrote:
> Implements helpers for dealing with the EL2 syndrome register as
> well as accessing the vcpu registers.
> 
> Reviewed-by: Christopher Covington 
> Signed-off-by: Marc Zyngier 
...
> +static inline bool kvm_vcpu_dabt_isvalid(const struct kvm_vcpu *vcpu)
> +{
> + return !!(kvm_vcpu_get_hsr(vcpu) & ESR_EL2_ISV);
> +}

You kept these '!!' ;). BTW, would the compiler handle the conversion
between the integer and bool here?

Either way is fine by me:

Reviewed-by: Catalin Marinas 
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 04/32] arm64: KVM: system register definitions for 64bit guests

2013-05-20 Thread Catalin Marinas
On Tue, May 14, 2013 at 03:13:32PM +0100, Marc Zyngier wrote:
> Define the saved/restored registers for 64bit guests.
> 
> Reviewed-by: Christopher Covington 
> Signed-off-by: Marc Zyngier 

Reviewed-by: Catalin Marinas 
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 03/32] arm64: KVM: EL2 register definitions

2013-05-20 Thread Catalin Marinas
On Tue, May 14, 2013 at 03:13:31PM +0100, Marc Zyngier wrote:
> Define all the useful bitfields for EL2 registers.
> 
> Reviewed-by: Christopher Covington 
> Signed-off-by: Marc Zyngier 

Reviewed-by: Catalin Marinas 
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 02/32] arm64: KVM: HYP mode idmap support

2013-05-20 Thread Catalin Marinas
On Tue, May 14, 2013 at 03:13:30PM +0100, Marc Zyngier wrote:
> Add the necessary infrastructure for identity-mapped HYP page
> tables. Idmap-ed code must be in the ".hyp.idmap.text" linker
> section.
> 
> The rest of the HYP ends up in ".hyp.text".
> 
> Signed-off-by: Marc Zyngier 
...
> +/*
> + * The HYP init code can't be more than a page long.
> + */
> +ASSERT(((__hyp_idmap_text_start + PAGE_SIZE) >= __hyp_idmap_text_end),
> +   "HYP init code too big")

Is __hyp_idmap_text_end inclusive or exclusive? I think the latter and
you can use greater than (without equal).

Otherwise:

Reviewed-by: Catalin Marinas 
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 01/32] arm64: KVM: define HYP and Stage-2 translation page flags

2013-05-20 Thread Catalin Marinas
On Tue, May 14, 2013 at 03:13:29PM +0100, Marc Zyngier wrote:
> Add HYP and S2 page flags, for both normal and device memory.
> 
> Reviewed-by: Christopher Covington 
> Signed-off-by: Marc Zyngier 

Reviewed-by: Catalin Marinas 
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 02/10] arm64: uaccess s/might_sleep/might_fault/

2013-05-16 Thread Catalin Marinas
On 16 May 2013 12:10, Michael S. Tsirkin  wrote:
> The only reason uaccess routines might sleep
> is if they fault. Make this explicit.
>
> Signed-off-by: Michael S. Tsirkin 
> ---
>  arch/arm64/include/asm/uaccess.h | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)

For arm64:

Acked-by: Catalin Marinas 
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCH 0/2] ARM: KVM: Moving GIC/timer out of arch/arm

2013-05-12 Thread Catalin Marinas
Hi Gleb,

On Sun, May 12, 2013 at 10:03:59AM +0100, Gleb Natapov wrote:
> On Fri, May 03, 2013 at 04:55:01PM +0100, Marc Zyngier wrote:
> > On 03/05/13 16:31, Anup Patel wrote:
> > > On Fri, May 3, 2013 at 7:32 PM, Marc Zyngier  wrote:
> > >> As KVM/arm64 is looming on the horizon, it makes sense to move some
> > >> of the common code to a single location in order to reduce duplication.
> > >>
> > >> The code could live anywhere. Actually, most of KVM is already built
> > >> with a bunch of ugly ../../.. hacks in the various Makefiles, so we're
> > >> not exactly talking about style here. But maybe it is time to start
> > >> moving into a less ugly direction.
> > >>
> > >> The include files must be in a "public" location, as they are accessed
> > >> from non-KVM files (arch/arm/kernel/asm-offsets.c).
> > >>
> > >> For this purpose, introduce two new locations:
> > >> - virt/kvm/arm/ : x86 and ia64 already share the ioapic code in
> > >>   virt/kvm, so this could be seen as a (very ugly) precedent.
> > >> - include/kvm/  : there is already an include/xen, and while the
> > >>   intent is slightly different, this seems as good a location as
> > >>   any
> > >>
> > >> Once the code has been moved, it becomes easy to build it in a
> > >> less hackish way, which makes the code easily reusable by KVM/arm64.
> > >>
> > >> Marc Zyngier (2):
> > >>   ARM: KVM: move GIC/timer code to a common location
> > >>   ARM: KVM: standalone Makefile for vgic and timers
> > >>
> > >>  Makefile   | 2 +-
> > >>  arch/arm/include/asm/kvm_host.h| 4 ++--
> > >>  arch/arm/kvm/Makefile  | 5 ++---
> > >>  {arch/arm/include/asm => include/kvm}/kvm_arch_timer.h | 0
> > >>  {arch/arm/include/asm => include/kvm}/kvm_vgic.h   | 0
> > >>  virt/Makefile  | 1 +
> > >>  virt/kvm/Makefile  | 1 +
> > >>  virt/kvm/arm/Makefile  | 2 ++
> > >>  {arch/arm/kvm => virt/kvm/arm}/arch_timer.c| 4 ++--
> > >>  {arch/arm/kvm => virt/kvm/arm}/vgic.c  | 0
> > >>  10 files changed, 11 insertions(+), 8 deletions(-)
> > >>  rename {arch/arm/include/asm => include/kvm}/kvm_arch_timer.h (100%)
> > >>  rename {arch/arm/include/asm => include/kvm}/kvm_vgic.h (100%)
> > >>  create mode 100644 virt/Makefile
> > >>  create mode 100644 virt/kvm/Makefile
> > >>  create mode 100644 virt/kvm/arm/Makefile
> > >>  rename {arch/arm/kvm => virt/kvm/arm}/arch_timer.c (99%)
> > >>  rename {arch/arm/kvm => virt/kvm/arm}/vgic.c (100%)
> > > 
> > > The source files arch/arm/kvm/arm.c and arch/arm/kvm/mmu.c are also
> > > shared between KVM ARM and KVM ARM64.
> > > 
> > > Can we move these files in virt/arm ?
> > 
> > I suggest we start by finding out if there is an agreement on the
> > location, method and overall usefulness of this particular patch.
> > 
> > Moving core ARM code around is quite different from sharing what is
> > basically device emulation stuff.
> > 
> Yes, so the question in this regard: are there any plans to eventually
> merge arch/arm and arch/arm64 like it happened with arch/i386
> and arch/x86_64?  Is it even feasible?  Looking at the dark days of
> i386/x86_64 split there were a lot of ../../i386/ and -Iarch/i386/kernel
> in arch/x86_64. Isn't there some code, outside of kvm, that can be shared
> between arm/arm64? How will it be shared?

There are similarities between arm and arm64 (especially since the arm64
port started as a fork of arm) and few other bits that could be shared
but the benefits of a clean port outweigh a bit of code duplication.
Most of the SoC support is now going into drivers, so it's pretty much
architecture code left under arch/arm64.

KVM is the first to make references to ../arm/ from arm64 and I don't
see an easy solution (and I wouldn't like to see common arm/arm64 code
under the top kvm directory either, apart form device emulation). Of
course, a lot of the code like page table maintenance, mapping/unmapping
ranges is pretty generic and could be shared with other architectures
(e.g. x86) but it's not a trivial task.

-- 
Catalin
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 24/32] arm64: KVM: 32bit GP register access

2013-05-11 Thread Catalin Marinas
On Sat, May 11, 2013 at 01:36:30AM +0100, Christoffer Dall wrote:
> On Tue, May 07, 2013 at 05:33:03PM +0100, Catalin Marinas wrote:
> > On Tue, May 07, 2013 at 05:28:00PM +0100, Marc Zyngier wrote:
> > > On 02/05/13 17:09, Catalin Marinas wrote:
> > > > BTW, on arch/arm it looks like this is used when you get a data abort
> > > > with PC as the destination register and you inject a prefetch abort in
> > > > this case. Why isn't this a normal data abort? Once you get the
> > > > information, you load it into PC but first you need to sort out the data
> > > > abort (unless I don't understand how the kvm_inject_pabt works).
> > > 
> > > Indeed, it should be a data abort, as we correctly fetched the
> > > instruction. Now, I wonder why we even bother trying to catch this case.
> > > Fetching PC from MMIO looks quite silly, but I don't think anything
> > > really forbids it in the architecture.
> > 
> > It's not forbidden and you should just treat it as any other data abort,
> > no need to check whether the register is PC. If you do the PC adjustment
> > further down in that function it will be overridden by the instruction
> > emulation anyway. There is no optimisation in checking for PC since such
> > fault is very unlikely in sane code anyway.
> > 
> The reason is simply that it is most likely because of a bug that this
> happens, and we would rather catch such an error in a meaningful way
> than let this go crazy and have users track it down for a long time.

It is probably a bug but it is a valid code sequence (and Peter even
gave an example).

> Especially, this was relevant when we did io instruction decoding and we
> wanted to catch potential bugs in that when it was development code.
> 
> That all being said, we can remove the check.  I don't think, however,
> that it being an unlikely thing is a good argument: if we remove the
> check we need to make sure that the VM does whatever the architecture
> dictates, which I assume is to not skip the MMIO instruction and support
> setting the PC to the value returned from an I/O emulation device in
> QEMU.
> 
> I think the scenario sounds crazy and is more than anything a sign of a
> bug somewhere, and whether it should be a PABT or a DABT is a judgement
> call, but I chose a PABT because the thing that's weird is jumping to an
> I/O address, it's not that the memory address for the load is
> problematic.

I think that's where things get confused. You are reading from an I/O
location with the destination being the PC and it is trapped by KVM.
This has nothing to do with PABT, it's purely a DABT on the I/O access.
You should emulate it and store the result in the PC. If the value
loaded in PC is wrong, you will get a subsequent PABT in the guest but
you must not translate the DABT on I/O memory (which would be generated
even if the destination is not PC) into a PABT which confuses the guest
further. By doing this you try to convince the guest that it really
branched to the I/O address (since you raise the PABT with the DFAR
value) when it simply tried to load an address from I/O and branch to
this new address.

IOW, these two are equivalent regarding PC:

ldr pc, [r0]@ r0 is an I/O address

and 

ldr r1, [r0]@ r0 is an I/O address
mov pc, r1

In the second scenario, you don't raise a PABT on the first instruction.
You may raise one on the second if PC is invalid but that's different
and most likely it will be a guest-only thing.

So for DABT emulation, checking whether the destination register is PC
is simply a minimal optimisation (if at all) of that case to avoid
storing the PC twice. Injecting PABT when you get a DABT is a bug. You
already catch PABT on I/O address in kvm_handle_guest_abort() and
correctly inject PABT into guest.

-- 
Catalin
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 24/32] arm64: KVM: 32bit GP register access

2013-05-07 Thread Catalin Marinas
On Tue, May 07, 2013 at 05:28:00PM +0100, Marc Zyngier wrote:
> On 02/05/13 17:09, Catalin Marinas wrote:
> > BTW, on arch/arm it looks like this is used when you get a data abort
> > with PC as the destination register and you inject a prefetch abort in
> > this case. Why isn't this a normal data abort? Once you get the
> > information, you load it into PC but first you need to sort out the data
> > abort (unless I don't understand how the kvm_inject_pabt works).
> 
> Indeed, it should be a data abort, as we correctly fetched the
> instruction. Now, I wonder why we even bother trying to catch this case.
> Fetching PC from MMIO looks quite silly, but I don't think anything
> really forbids it in the architecture.

It's not forbidden and you should just treat it as any other data abort,
no need to check whether the register is PC. If you do the PC adjustment
further down in that function it will be overridden by the instruction
emulation anyway. There is no optimisation in checking for PC since such
fault is very unlikely in sane code anyway.

-- 
Catalin
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 00/32] Port of KVM to arm64

2013-05-03 Thread Catalin Marinas
On Mon, Apr 08, 2013 at 05:17:02PM +0100, Marc Zyngier wrote:
> This series contains the third version of KVM for arm64.
> 
> It depends on the following branches/series:
> - git://git.kernel.org/pub/scm/linux/kernel/git/cmarinas/linux-aarch64.git 
> soc-armv8-model
>   Catalin's platform support branch for v8 models

After 3.10-rc1 you should be able to base it on the mainline kernel (or
-rc2 if something breaks).

> The code is unsurprisingly extremely similar to the KVM/arm code, and
> a lot of it is actually shared with the 32bit version. Some of the
> include files are duplicated though (I'm definitely willing to fix
> that).

As it was already commented, some of the commonality could be shared via
virt/kvm/ (timers, gic).

Apart from the other comments I (and the others) had, the series looks
fine and in shape for 3.11. Looking forward to see v4...

Thanks.

-- 
Catalin
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 24/32] arm64: KVM: 32bit GP register access

2013-05-02 Thread Catalin Marinas
On Mon, Apr 08, 2013 at 05:17:26PM +0100, Marc Zyngier wrote:
>  static inline bool kvm_vcpu_reg_is_pc(const struct kvm_vcpu *vcpu, int reg)
>  {
> - return false;
> + return (vcpu_mode_is_32bit(vcpu)) && reg == 15;
>  }

On AArch64, would ESR_EL2 have SRT == 15 when the source/destination
register is PC? The mapping between AArch32 and AArch64 registers
suggests R13_hyp. Maybe 15 is correct but it's not clear to me from the
spec.

BTW, on arch/arm it looks like this is used when you get a data abort
with PC as the destination register and you inject a prefetch abort in
this case. Why isn't this a normal data abort? Once you get the
information, you load it into PC but first you need to sort out the data
abort (unless I don't understand how the kvm_inject_pabt works).

-- 
Catalin
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 20/32] arm64: KVM: Plug the arch timer

2013-05-02 Thread Catalin Marinas
On Wed, Apr 24, 2013 at 12:43:51PM +0100, Marc Zyngier wrote:
> On 24/04/13 00:00, Christoffer Dall wrote:
> > On Mon, Apr 08, 2013 at 05:17:22PM +0100, Marc Zyngier wrote:
> >> Add support for the in-kernel timer emulation. The include file
> >> is a complete duplicate of the 32bit one - something to fix
> >> at one point.
> > 
> > again, I'd really like to see this fixed before we merge the code...
> 
> Feel free to suggest a solution. None of the one I tried are nice.

Both arch timers and GIC are now drivers outside the arch/ directories.
To me it makes sense for emulated device drivers to also live outside
the arch/ directories, something like virt/kvm/drivers? As you said,
ioapic is already in virt/kvm.

-- 
Catalin
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 1/5] ARM: KVM: be more thorough when invalidating TLBs

2013-05-02 Thread Catalin Marinas
On Thu, May 02, 2013 at 03:38:58PM +0100, Marc Zyngier wrote:
> diff --git a/arch/arm/kvm/interrupts.S b/arch/arm/kvm/interrupts.S
> index f7793df..9e2d906c 100644
> --- a/arch/arm/kvm/interrupts.S
> +++ b/arch/arm/kvm/interrupts.S
...
> -static void clear_pte_entry(pte_t *pte)
> +static void clear_pte_entry(struct kvm *kvm, pte_t *pte, phys_addr_t addr)
>  {
>   if (pte_present(*pte)) {
>   kvm_set_pte(pte, __pte(0));
>   put_page(virt_to_page(pte));
> + kvm_tlb_flush_vmid_ipa(kvm, addr);
>   }
>  }
...
>  static void unmap_stage2_range(struct kvm *kvm, phys_addr_t start, u64 size)
>  {
> - unmap_range(kvm->arch.pgd, start, size);
> + unmap_range(kvm, kvm->arch.pgd, start, size);
>  }
>  
>  /**
> @@ -413,6 +425,7 @@ void kvm_free_stage2_pgd(struct kvm *kvm)
>   return;
>  
>   unmap_stage2_range(kvm, 0, KVM_PHYS_SIZE);
> + kvm_tlb_flush_vmid_ipa(kvm, 0); /* Invalidate TLB ALL */

Do you still need this here if you invalidated each individual pte in
clear_pte_entry()? I think you can remove it from clear_pte_entry() and
just leave it here (more efficient probably) since you wouldn't free the
actual pages pointed at by the pte before unmapping.

>   free_pages((unsigned long)kvm->arch.pgd, S2_PGD_ORDER);
>   kvm->arch.pgd = NULL;
>  }
> @@ -675,7 +688,6 @@ static void handle_hva_to_gpa(struct kvm *kvm,
>  static void kvm_unmap_hva_handler(struct kvm *kvm, gpa_t gpa, void *data)
>  {
>   unmap_stage2_range(kvm, gpa, PAGE_SIZE);
> - kvm_tlb_flush_vmid_ipa(kvm, gpa);

Here you removed it relying on clear_pte_entry(), I think you could keep
it (see above).

-- 
Catalin
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 3/5] ARM: KVM: relax cache maintainance when building page tables

2013-05-02 Thread Catalin Marinas
On Thu, May 02, 2013 at 03:39:00PM +0100, Marc Zyngier wrote:
> Patch 5a677ce044f1 (ARM: KVM: switch to a dual-step HYP init code)
> introduced code that flushes page tables to the point of coherency.
> This is overkill (point of unification is enough and already done),
> and actually not required if running on a SMP capable platform
> (the HW PTW can snoop other cpus' L1).
> 
> Remove this code and let ae8a8b9553bd (ARM: 7691/1: mm: kill unused
> TLB_CAN_READ_FROM_L1_CACHE and use ALT_SMP instead) turn it into
> a no-op for SMP ARMv7.

I think the comment is a bit misleading. You actually don't need
kvm_flush_dcache_to_poc() since you already do this in kvm_set_pte()
etc.

-- 
Catalin
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 19/32] arm64: KVM: Plug the VGIC

2013-05-02 Thread Catalin Marinas
On Wed, Apr 24, 2013 at 12:43:21PM +0100, Marc Zyngier wrote:
> On 24/04/13 00:00, Christoffer Dall wrote:
> > On Mon, Apr 08, 2013 at 05:17:21PM +0100, Marc Zyngier wrote:
> >> Add support for the in-kernel GIC emulation. The include file
> >> is a complete duplicate of the 32bit one - something to fix
> >> at one point.
> > 
> > seems like something that should be fixed sooner as opposed to later...
> > Is it hard?
> 
> Hard? No. Disgusting? Yes.
> 
> You need to find a common location for the files. Something like
> virt/kvm/arm/vgic.h? The x86/ia64 guys already have virt/kvm/ioapic.h as
> a (bad) precedent.

Probably not that bad, we already have include/linux/irqchip/arm-gic.h,
so you could add virt/kvm/arm-vgic.[ch] (or create an arm subdirectory
so there will be a tendency to move a lot more in there).

-- 
Catalin
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 16/32] arm64: KVM: hypervisor initialization code

2013-05-02 Thread Catalin Marinas
On Mon, Apr 08, 2013 at 05:17:18PM +0100, Marc Zyngier wrote:
> Provide EL2 with page tables and stack, and set the vectors
> to point to the full blown world-switch code.
> 
> Signed-off-by: Marc Zyngier 
> ---
>  arch/arm64/include/asm/kvm_host.h |  13 +
>  arch/arm64/kvm/hyp-init.S | 112 
> ++
>  2 files changed, 125 insertions(+)
>  create mode 100644 arch/arm64/kvm/hyp-init.S
> 
> diff --git a/arch/arm64/include/asm/kvm_host.h 
> b/arch/arm64/include/asm/kvm_host.h
> index a0279ff..8655de4 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -183,4 +183,17 @@ int handle_exit(struct kvm_vcpu *vcpu, struct kvm_run 
> *run,
>  int kvm_perf_init(void);
>  int kvm_perf_teardown(void);
>  
> +static inline void __cpu_init_hyp_mode(unsigned long long boot_pgd_ptr,
> +unsigned long long pgd_ptr,
> +unsigned long hyp_stack_ptr,
> +unsigned long vector_ptr)

Why some types are long long other others just long? Is it called from
arch/arm/ code? I would rather use phys_addr_t where relevant and
unsigned long where it means a virtual address.

-- 
Catalin
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 08/32] arm64: KVM: architecture specific MMU backend

2013-04-29 Thread Catalin Marinas
On Mon, Apr 08, 2013 at 05:17:10PM +0100, Marc Zyngier wrote:
> diff --git a/arch/arm64/include/asm/kvm_mmu.h 
> b/arch/arm64/include/asm/kvm_mmu.h
> new file mode 100644
> index 000..2eb2230
> --- /dev/null
> +++ b/arch/arm64/include/asm/kvm_mmu.h
...
> +/*
> + * Align KVM with the kernel's view of physical memory. Should be
> + * 40bit IPA, with PGD being 8kB aligned.
> + */
> +#define KVM_PHYS_SHIFT   PHYS_MASK_SHIFT

Just wondering - do we allow the IPA to be bigger than what qemu/kvmtool
can map?

> +#define KVM_PHYS_SIZE(1UL << KVM_PHYS_SHIFT)
> +#define KVM_PHYS_MASK(KVM_PHYS_SIZE - 1UL)
> +
> +#ifdef CONFIG_ARM64_64K_PAGES
> +#define PAGE_LEVELS  2
> +#define BITS_PER_LEVEL   13
> +#else  /* 4kB pages */
> +#define PAGE_LEVELS  3
> +#define BITS_PER_LEVEL   9

You could use (PAGE_SHIFT - 3) for BITS_PER_LEVEL if that's any clearer
but you can get rid of these entirely (see below).

> +#endif
> +
> +/* Make sure we get the right size, and thus the right alignment */
> +#define BITS_PER_S2_PGD (KVM_PHYS_SHIFT - (PAGE_LEVELS - 1) * BITS_PER_LEVEL 
> - PAGE_SHIFT)
> +#define PTRS_PER_S2_PGD (1 << max(BITS_PER_LEVEL, BITS_PER_S2_PGD))
> +#define S2_PGD_ORDER get_order(PTRS_PER_S2_PGD * sizeof(pgd_t))
> +#define S2_PGD_SIZE  (1 << S2_PGD_ORDER)

It looks like lots of calculations which I can't fully follow ;). So we
need to map KVM_PHYS_SHIFT (40-bit) with a number of stage 2 pgds. We
know that a pgd entry can map PGDIR_SHIFT bits. So the pointers you need
in S2 pgd:

#define PTRS_PER_S2_PGD (1 << (KVM_PHYS_SHIFT - PGDIR_SHIFT))

(no need of PAGE_LEVELS)

With some more juggling you can probably get rid of get_order as well,
though it should be a compile-time constant. Basically KVM_PHYS_SHIFT -
PGDIR_SHIFT - PAGE_SHIFT + 3 (and cope with the negative value for the
64K pages).

> +static inline void coherent_icache_guest_page(struct kvm *kvm, gfn_t gfn)
> +{
> + unsigned long hva = gfn_to_hva(kvm, gfn);
> + flush_icache_range(hva, hva + PAGE_SIZE);

Even ARMv8 can have aliasing VIPT I-cache. Do we need to flush some more
here?

> +#define kvm_flush_dcache_to_poc(a,l) __flush_dcache_area((a), (l))

I had a brief look at the other hyp reworking patches and this function
is used in two situations: (1) flush any data before the Hyp is
initialised and (2) page table creation. The first case looks fine but
the second is not needed on AArch64 (and SMP AArch32) nor entirely
optimal as it should flush to PoU for page table walks. Can we have
different functions, at least we can make one a no-op?

-- 
Catalin
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


  1   2   >