From: Christoffer Dall
When a guest hypervisor running virtual EL2 in EL1 executes an ERET
instruction, we will have set HCR_EL2.NV which traps ERET to EL2, so
that we can emulate the exception return in software.
Signed-off-by: Christoffer Dall
Signed-off-by: Marc Zyngier
---
arch/arm64/incl
For the time being, pretend that NV and SVE are incompatible.
Things will shortly change... Or not.
Signed-off-by: Marc Zyngier
---
arch/arm64/kvm/sys_regs.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index e711dde4
Extract the direct HW accessors for later reuse.
Signed-off-by: Marc Zyngier
---
arch/arm64/kvm/sys_regs.c | 247 +-
1 file changed, 139 insertions(+), 108 deletions(-)
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index 2b8734f75a09..e18
From: Jintack Lim
Forward exceptions due to WFI or WFE instructions to the virtual EL2 if
they are not coming from the virtual EL2 and virtual HCR_EL2.TWX is set.
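
The forwarding rule described above can be sketched as a small predicate. This is only an illustration, not the code from the patch: struct vcpu_sketch and should_forward_wfx() are hypothetical names, and the only architectural facts assumed are the HCR_EL2.TWI (bit 13) and HCR_EL2.TWE (bit 14) positions.

```c
#include <stdbool.h>
#include <stdint.h>

/* HCR_EL2 trap-enable bits for WFI and WFE. */
#define HCR_TWI (UINT64_C(1) << 13)
#define HCR_TWE (UINT64_C(1) << 14)

/* Hypothetical, heavily simplified view of the vcpu state used here. */
struct vcpu_sketch {
	bool in_virtual_el2;    /* guest currently runs at virtual EL2 */
	uint64_t virt_hcr_el2;  /* guest hypervisor's view of HCR_EL2 */
};

/*
 * Return true if a trapped WFI (is_wfe == false) or WFE (is_wfe == true)
 * should be forwarded to the virtual EL2 instead of being handled by
 * the host hypervisor.
 */
static bool should_forward_wfx(const struct vcpu_sketch *vcpu, bool is_wfe)
{
	uint64_t trap_bit = is_wfe ? HCR_TWE : HCR_TWI;

	/* Traps taken from virtual EL2 itself are never forwarded. */
	if (vcpu->in_virtual_el2)
		return false;

	return (vcpu->virt_hcr_el2 & trap_bit) != 0;
}
```

With virt_hcr_el2 set to HCR_TWI only, a WFI from the nested guest is forwarded while a WFE is not.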
Signed-off-by: Jintack Lim
Signed-off-by: Marc Zyngier
---
arch/arm64/include/asm/kvm_nested.h | 2 ++
arch/arm64/kvm/Makefile
From: Dave Martin
Currently, the {read,write}_sysreg_el*() accessors for accessing
particular ELs' sysregs in the presence of VHE rely on some local
hacks and define their system register encodings in a way that is
inconsistent with the core definitions in .
As a result, it is necessary to add d
From: Jintack Lim
VMs used to execute hvc #0 for the psci call if EL3 is not implemented.
However, when we come to provide the virtual EL2 mode to the VM, the
host OS inside the VM calls kvm_call_hyp() which is also hvc #0. So,
it's hard to differentiate between them from the host hypervisor's po
From: Jintack Lim
Forward traps due to FP/ASIMD register accesses to the virtual EL2 if
virtual CPTR_EL2.TFP is set. Note that if TFP bit is set, then even
accesses to FP/ASIMD register from EL2 as well as NS EL0/1 will trap to
EL2. So, we don't check the VM's exception level.
Signed-off-by: Jin
From: Christoffer Dall
When running a nested hypervisor we commonly have to figure out if
the VCPU mode is running in the context of a guest hypervisor or guest
guest, or just a normal guest.
Add convenient primitives for this.
Signed-off-by: Christoffer Dall
Signed-off-by: Marc Zyngier
---
From: Christoffer Dall
Reset the VCPU with PSTATE.M = EL2h when the nested virtualization
feature is enabled on the VCPU.
Signed-off-by: Christoffer Dall
Signed-off-by: Marc Zyngier
---
arch/arm64/kvm/reset.c | 7 +++
1 file changed, 7 insertions(+)
diff --git a/arch/arm64/kvm/reset.c b/
From: Jintack Lim
Add a new ARM64_HAS_NESTED_VIRT feature to indicate that the
CPU has the ARMv8.3 nested virtualization capability.
This will be used to support nested virtualization in KVM.
Signed-off-by: Jintack Lim
Signed-off-by: Andre Przywara
Signed-off-by: Christoffer Dall
Signed-off-
From: Christoffer Dall
When running in virtual EL2 mode, we actually run the hardware in EL1
and therefore have to use the EL1 registers to ensure correct operation.
By setting the HCR.TVM and HCR.TVRM we ensure that the virtual EL2 mode
doesn't shoot itself in the foot when setting up what it b
SPSR_EL2 needs special attention when running nested on ARMv8.3:
If taking an exception while running at vEL2 (actually EL1), the
HW will update the SPSR_EL1 register with the EL1 mode. We need
to track this in order to make sure that accesses to the virtual
view of SPSR_EL2 are correct.
To do so,
From: Andre Przywara
KVM internally uses accessor functions when reading or writing the
guest's system registers. This takes care of accessing either the stored
copy or using the "live" EL1 system registers when the host uses VHE.
With the introduction of virtual EL2 we add a bunch of EL2 system
We don't want to expose complicated features to guests until we have
a good grasp on the basic CPU emulation. So let's pretend that RAS,
just like SVE, doesn't exist in a nested guest.
Signed-off-by: Marc Zyngier
---
arch/arm64/kvm/sys_regs.c | 32 +---
1 file changed
From: Christoffer Dall
Introduce the feature bit and a primitive that checks if the feature is
set behind a static key check based on the cpus_have_const_cap check.
Checking nested_virt_in_use() on systems without nested virt enabled
should have negligible overhead.
We don't yet allow userspace
From: Andre Przywara
Whenever we need to restore the guest's system registers to the CPU, we
now need to take care of the EL2 system registers as well. Most of them
are accessed via traps only, but some have an immediate effect and also
a guest running in VHE mode would expect them to be accessib
From: Jintack Lim
Support injecting exceptions and performing exception returns to and
from virtual EL2. This must be done entirely in software except when
taking an exception from vEL0 to vEL2 when the virtual HCR_EL2.{E2H,TGE}
== {1,1} (a VHE guest hypervisor).
Signed-off-by: Jintack Lim
Si
From: Christoffer Dall
We were not allowing userspace to set a more privileged mode for the VCPU
than EL1, but we should allow this when nested virtualization is enabled
for the VCPU.
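
As a rough illustration of the check described above (not the actual guest.c change), a mode validator could look like the sketch below. The function name userspace_mode_allowed() is hypothetical, and only the AArch64 modes are considered; the mode encodings are the architectural PSTATE.M values.

```c
#include <stdbool.h>

/* AArch64 PSTATE.M encodings. */
#define PSR_MODE_EL0t 0x0u
#define PSR_MODE_EL1t 0x4u
#define PSR_MODE_EL1h 0x5u
#define PSR_MODE_EL2t 0x8u
#define PSR_MODE_EL2h 0x9u

/*
 * Decide whether userspace may set the given VCPU mode.
 * EL2 modes are accepted only when nested virtualization is
 * enabled for the VCPU; anything above EL2 is always rejected.
 */
static bool userspace_mode_allowed(unsigned int mode, bool nested_virt_enabled)
{
	switch (mode) {
	case PSR_MODE_EL0t:
	case PSR_MODE_EL1t:
	case PSR_MODE_EL1h:
		return true;
	case PSR_MODE_EL2t:
	case PSR_MODE_EL2h:
		return nested_virt_enabled;
	default:
		return false;
	}
}
```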
Signed-off-by: Christoffer Dall
Signed-off-by: Marc Zyngier
---
arch/arm64/kvm/guest.c | 6 ++
1 file cha
From: Jintack Lim
For the same reason we trap virtual memory register accesses at virtual
EL2, we need to trap SPSR_EL1, ELR_EL1 and VBAR_EL1 accesses. ARM v8.3
introduces the HCR_EL2.NV1 bit to be able to trap on those register
accesses in EL1. Do not set this bit until the whole nesting support
From: Christoffer Dall
We can no longer blindly copy the VCPU's PSTATE into SPSR_EL2 and return
to the guest and vice versa when taking an exception to the hypervisor,
because we emulate virtual EL2 in EL1 and therefore have to translate
the mode field from EL2 to EL1 and vice versa.
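
A minimal sketch of the mode translation, assuming AArch64 only and using hypothetical helper names (vpstate_to_hw_spsr() and hw_spsr_to_vpstate() are not taken from the patch). The reverse direction needs to know whether the VCPU was in virtual EL2, since the hardware only ever reports an EL1 mode.

```c
#include <stdbool.h>
#include <stdint.h>

/* AArch64 PSTATE.M encodings and mask. */
#define PSR_MODE_EL1t 0x4u
#define PSR_MODE_EL1h 0x5u
#define PSR_MODE_EL2t 0x8u
#define PSR_MODE_EL2h 0x9u
#define PSR_MODE_MASK 0xfu

/* Virtual PSTATE -> value to place in the hardware SPSR_EL2 on entry. */
static uint64_t vpstate_to_hw_spsr(uint64_t pstate)
{
	uint64_t mode = pstate & PSR_MODE_MASK;

	if (mode == PSR_MODE_EL2h)
		mode = PSR_MODE_EL1h;
	else if (mode == PSR_MODE_EL2t)
		mode = PSR_MODE_EL1t;

	return (pstate & ~(uint64_t)PSR_MODE_MASK) | mode;
}

/* Hardware SPSR_EL2 on exit -> virtual PSTATE seen by the guest. */
static uint64_t hw_spsr_to_vpstate(uint64_t spsr, bool was_in_virtual_el2)
{
	uint64_t mode = spsr & PSR_MODE_MASK;

	if (was_in_virtual_el2) {
		if (mode == PSR_MODE_EL1h)
			mode = PSR_MODE_EL2h;
		else if (mode == PSR_MODE_EL1t)
			mode = PSR_MODE_EL2t;
	}

	return (spsr & ~(uint64_t)PSR_MODE_MASK) | mode;
}
```

All bits outside the mode field (DAIF, condition flags and so on) pass through unchanged in both directions.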
Signed-off-
From: Jintack Lim
Forward the EL1 virtual memory register traps to the virtual EL2 if they
are not coming from the virtual EL2 and the virtual HCR_EL2.TVM or TRVM
bit is set.
This is for recursive nested virtualization.
Signed-off-by: Jintack Lim
Signed-off-by: Marc Zyngier
---
arch/arm64/kv
From: Jintack Lim
We enable nested virtualization by setting the HCR NV and NV1 bits.
When the virtual E2H bit is set, we can support EL2 register accesses
via EL1 registers from the virtual EL2 by doing trap-and-emulate. A
better alternative, however, is to allow the virtual EL2 to access EL2
re
From: Jintack Lim
With HCR_EL2.NV bit set, accesses to EL12 registers in the virtual EL2
trap to EL2. Handle those traps just like we do for EL1 registers.
One exception is CNTKCTL_EL12. We don't trap on CNTKCTL_EL1 for non-VHE
virtual EL2 because we don't have to. However, accessing CNTKCTL_EL1
From: Jintack Lim
Forward ELR_EL1, SPSR_EL1 and VBAR_EL1 traps to the virtual EL2 if the
virtual HCR_EL2.NV bit is set.
This is for recursive nested virtualization.
Signed-off-by: Jintack Lim
Signed-off-by: Marc Zyngier
---
arch/arm64/include/asm/kvm_arm.h | 1 +
arch/arm64/kvm/sys_regs.c
Having __load_guest_stage2 in kvm_hyp.h is quickly going to trigger
a circular include problem. In order to avoid this, let's move
it to kvm_mmu.h, where it will be a better fit anyway.
In the process, drop the __hyp_text annotation, which doesn't help
as the function is marked as __always_inline.
From: Jintack Lim
When the HCR.NV bit is set, execution of the EL2 translation regime address
translation instructions and TLB maintenance instructions is trapped to
EL2. In addition, execution of the EL1 translation regime address
translation instructions and TLB maintenance instructions that are o
From: Jintack Lim
For the same reason we trap virtual memory register accesses in virtual
EL2, we trap CPACR_EL1 access too; We allow the virtual EL2 mode to
access EL1 system register state instead of the virtual EL2 one.
Signed-off-by: Jintack Lim
Signed-off-by: Marc Zyngier
---
arch/arm64/
From: Jintack Lim
Now that the psci call is done by the smc instruction when nested
virtualization is enabled, it is clear that all hvc instructions from the
VM (including from the virtual EL2) are supposed to be handled in the
virtual EL2.
Signed-off-by: Jintack Lim
Signed-off-by: Marc Zyngier
--
The VMPIDR_EL2 and VPIDR_EL2 are architecturally UNKNOWN at reset, but
let's be nice to a guest hypervisor behaving foolishly and reset these
to something reasonable anyway.
Signed-off-by: Christoffer Dall
Signed-off-by: Marc Zyngier
---
arch/arm64/kvm/sys_regs.c | 25 +
From: Christoffer Dall
So far we were flushing almost the entire universe whenever a VM would
load/unload the SCTLR_EL1 and the two versions of that register had
different MMU enabled settings. This turned out to be so slow that it
prevented forward progress for a nested VM, because a scheduler
From: Jintack Lim
ARM v8.3 introduces a new bit in the HCR_EL2, which is the NV bit. When
this bit is set, accessing EL2 registers in EL1 traps to EL2. In
addition, executing the following instructions in EL1 will trap to EL2:
tlbi, at, eret, and msr/mrs instructions to access SP_EL1. Most of the
I've taken over the maintenance of this series originally written by
Jintack and Christoffer. Since then, the series has been substantially
reworked, new features (and most probably bugs) have been added, and
the whole thing rebased multiple times. If anything breaks, please
blame me, and nobody el
From: Andre Przywara
The VGIC maintenance IRQ signals various conditions about the LRs, when
the GIC's virtualization extension is used.
So far we didn't need it, but nested virtualization needs to know about
this interrupt, so add a userland interface to setup the IRQ number.
The architecture ma
When mapping a page in a shadow stage-2, special care must be
taken not to be more permissive than the guest is (writable or
readable page when the guest hasn't set that permission).
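
One way to picture the rule above is as an intersection of permissions. The sketch below is an assumption-laden illustration, not the patch's code: struct s2_perms and shadow_s2_perms() are hypothetical, and combining the guest's permissions with the host's is an assumption beyond what the commit message states (it only requires not exceeding the guest's permissions).

```c
#include <stdbool.h>

/* Hypothetical read/write permission pair for a stage-2 mapping. */
struct s2_perms {
	bool readable;
	bool writable;
};

/*
 * Compute the permissions for a shadow stage-2 entry: a permission
 * is granted only if both the guest hypervisor's stage-2 entry and
 * the host's own mapping grant it, so the result is never more
 * permissive than what the guest asked for.
 */
static struct s2_perms shadow_s2_perms(struct s2_perms guest,
				       struct s2_perms host)
{
	struct s2_perms out = {
		.readable = guest.readable && host.readable,
		.writable = guest.writable && host.writable,
	};

	return out;
}
```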
Signed-off-by: Marc Zyngier
---
arch/arm/include/asm/kvm_mmu.h | 18 ++
arch/arm64/include/a
From: Jintack Lim
Rework the system instruction emulation framework to handle potentially
all system instruction traps other than MSR/MRS instructions. Those
system instructions would be AT and TLBI instructions controlled by
HCR_EL2.NV, AT, and TTLB bits.
Signed-off-by: Jintack Lim
[Changed to
From: Christoffer Dall
Add stage 2 mmu data structures for virtual EL2 and for nested guests.
We don't yet populate shadow stage 2 page tables, but we now have a
framework for getting to a shadow stage 2 pgd.
We allocate twice the number of vcpus as stage 2 mmu structures because
that's sufficie
From: Christoffer Dall
Based on the pseudo-code in the ARM ARM, implement a stage 2 software
page table walker.
Signed-off-by: Christoffer Dall
Signed-off-by: Jintack Lim
Signed-off-by: Marc Zyngier
---
arch/arm64/include/asm/esr.h| 1 +
arch/arm64/include/asm/kvm_arm.h| 2 +
Since we're (almost) feature complete, let's allow userspace to
request KVM_ARM_VCPU_NESTED_VIRT by bumping the KVM_VCPU_MAX_FEATURES
up.
It's going to be great...
Signed-off-by: Marc Zyngier
---
arch/arm64/include/asm/kvm_host.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git
From: Jintack Lim
When supporting nested virtualization a guest hypervisor executing AT
instructions must be trapped and emulated by the host hypervisor,
because untrapped AT instructions operating on S1E1 will use the wrong
translation regime (the one used to emulate virtual EL2 in EL1 instead
From: Jintack Lim
This introduces a function prototype to determine if we need to forward
system instruction traps to the virtual EL2. The implementation of
forward_trap functions for each system instruction will be added in
later patches.
Signed-off-by: Jintack Lim
Signed-off-by: Marc Zyngier
Depending on the HCR_EL2.{E2H,TGE} values, SCTLR_EL2 has different
RES0/RES1 constraints.
Let's handle that.
Signed-off-by: Marc Zyngier
---
arch/arm64/kvm/sys_regs.c | 28 +++-
1 file changed, 27 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/kvm/sys_regs.c b/arc
From: Jintack Lim
When entering a nested VM, we set up the hypervisor control interface
based on what the guest hypervisor has set. In particular, we check, for
each list register written by the guest hypervisor, whether the HW bit is
set. If so, we translate hw irq number from the guest's point of vie
Add the required handling for EL2 and EL02 registers, as
well as EL1 registers used in the E2H context.
Signed-off-by: Marc Zyngier
---
arch/arm64/kvm/sys_regs.c | 72 +++
1 file changed, 72 insertions(+)
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kv
From: Jintack Lim
Exposing memory management support to the virtual EL2 as is exposed to
the host hypervisor would make the implementation too complex and
inefficient. Therefore expose limited memory management support for the
following two cases.
We expose same or larger page granules than the
From: Jintack Lim
Forward traps due to HCR_EL2.NV bit to the virtual EL2 if they are not
coming from the virtual EL2 and the virtual HCR_EL2.NV bit is set.
In addition to EL2 register accesses, setting NV bit will also make EL12
register accesses trap to EL2. To emulate this for the virtual EL2,
From: Christoffer Dall
Adding tracepoints to be able to peek into the shadow LRs used when
running a guest guest.
Signed-off-by: Christoffer Dall
Signed-off-by: Marc Zyngier
---
virt/kvm/arm/vgic/vgic-nested-trace.h | 137 ++
virt/kvm/arm/vgic/vgic-v3-nested.c| 12
In order for vgic_v3_load_nested to be able to observe
which timer interrupts have the HW bit set for the current
context, the timers must have been loaded in the new mode
and the right timer mapped to their corresponding HW IRQs.
At the moment, we load the GIC first, meaning that timer
inte
As we are about to reuse our stage 2 page table manipulation code for
shadow stage 2 page tables in the context of nested virtualization, we
are going to manage multiple stage 2 page tables for a single VM.
This requires some pretty invasive changes to our data structures,
which moves the vmid and
We need to allow a guest hypervisor to virtualize the virtual timer.
For that, let's propagate CNTVOFF_EL2 to the guest's view of that
timer.
Signed-off-by: Marc Zyngier
---
arch/arm64/include/asm/kvm_host.h | 1 -
arch/arm64/kvm/sys_regs.c | 8 ++--
include/kvm/arm_arch_timer.h
On entering vEL2, we must honor the SCTLR_EL2.SPAN bit so that
PSTATE.PAN reflects the expected setting.
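
A minimal sketch of the SPAN rule, assuming the architectural behaviour that PSTATE.PAN is set on exception entry when SCTLR_ELx.SPAN is 0 and left unchanged when it is 1. The helper name apply_span_on_entry() is hypothetical, and the sketch deliberately ignores any additional conditions (such as the HCR_EL2.{E2H,TGE} configuration) that the real code may need to check.

```c
#include <stdint.h>

#define SCTLR_ELx_SPAN (UINT64_C(1) << 23) /* SCTLR_ELx.SPAN */
#define PSR_PAN_BIT    (UINT64_C(1) << 22) /* PSTATE.PAN as held in SPSR */

/*
 * Given the PSTATE being built for entry into vEL2 and the guest's
 * SCTLR_EL2 value, return the PSTATE with PAN adjusted:
 * SPAN == 0 sets PAN, SPAN == 1 leaves it untouched.
 */
static uint64_t apply_span_on_entry(uint64_t pstate, uint64_t sctlr_el2)
{
	if (!(sctlr_el2 & SCTLR_ELx_SPAN))
		pstate |= PSR_PAN_BIT;

	return pstate;
}
```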
Signed-off-by: Marc Zyngier
---
arch/arm64/kvm/emulate-nested.c | 13 -
1 file changed, 12 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/kvm/emulate-nested.c b/arch/arm64/k
From: Christoffer Dall
Unmap/flush shadow stage 2 page tables for the nested VMs as well as the
stage 2 page table for the guest hypervisor.
Note: A bunch of the code in mmu.c relating to MMU notifiers is
currently dealt with in an extremely abrupt way, for example by clearing
out an entire shad
When we take a maintenance interrupt, we need to decide whether
it is generated on an action from the guest, or if it is something
that needs to be forwarded to the guest hypervisor.
Signed-off-by: Marc Zyngier
---
arch/arm64/kvm/nested.c| 2 +-
virt/kvm/arm/vgic/vgic-init.c |
From: Jintack Lim
When supporting nested virtualization a guest hypervisor executing TLBI
instructions must be trapped and emulated by the host hypervisor,
because the guest hypervisor can only affect physical TLB entries
relating to its own execution environment (virtual EL2 in EL1) but not
to t
From: Christoffer Dall
Emulating EL2 also means emulating the EL2 timers. To do so, we expand
our timer framework to deal with at most 4 timers. At any given time,
two timers are using the HW timers, and the two others are purely
emulated.
The role of deciding which is which at any given time is
From: Christoffer Dall
If we are faulting on a shadow stage 2 translation, we first walk the
guest hypervisor's stage 2 page table to see if it has a mapping. If
not, we inject a stage 2 page fault to the virtual EL2. Otherwise, we
create a mapping in the shadow stage 2 page table.
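
The flow described above can be sketched with a toy model in which each stage 2 is a flat array indexed by page number. Everything here is hypothetical (struct nested_s2_sketch, host_translate(), handle_shadow_s2_fault()); the real code walks multi-level page tables and handles permissions, which this sketch leaves out.

```c
#include <stddef.h>
#include <stdint.h>

#define NR_PAGES   8
#define NO_MAPPING UINT64_MAX

enum fault_result {
	FAULT_INJECTED_TO_VEL2, /* guest hypervisor has no mapping */
	SHADOW_MAPPING_CREATED, /* shadow stage 2 was populated */
};

/* Toy model: page-number arrays instead of real page tables. */
struct nested_s2_sketch {
	uint64_t guest_s2[NR_PAGES];  /* L2 IPA page -> L1 IPA page */
	uint64_t shadow_s2[NR_PAGES]; /* L2 IPA page -> host PA page */
};

/* Hypothetical L1 IPA -> host PA translation: a fixed offset. */
static uint64_t host_translate(uint64_t l1_page)
{
	return l1_page + 0x1000;
}

/* Handle a fault on l2_page; the caller ensures l2_page < NR_PAGES. */
static enum fault_result handle_shadow_s2_fault(struct nested_s2_sketch *s,
						size_t l2_page)
{
	/* First, consult the guest hypervisor's own stage 2. */
	uint64_t l1_page = s->guest_s2[l2_page];

	if (l1_page == NO_MAPPING)
		return FAULT_INJECTED_TO_VEL2;

	/* Otherwise combine both stages into the shadow stage 2. */
	s->shadow_s2[l2_page] = host_translate(l1_page);
	return SHADOW_MAPPING_CREATED;
}
```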
Note that we
Starting a S2 MMU search from the beginning all the time means that
we're potentially nuking a useful context (like we'd potentially
have on a !VHE KVM guest).
Instead, let's always start the search from the point *after* the
last allocated context. This should ensure that alternating between
two
last_vcpu_ran has to be per s2 mmu now that we can have multiple S2
per VM. Let's take this opportunity to perform some cleanup.
Signed-off-by: Marc Zyngier
---
arch/arm/include/asm/kvm_host.h | 6 +++---
arch/arm/include/asm/kvm_mmu.h| 2 +-
arch/arm64/include/asm/kvm_host.h | 6 +++---
From: Andre Przywara
Add trap handlers for the timer system registers accessed from a guest
hypervisor using either _EL02 or _EL2 system register access
instructions.
Signed-off-by: Andre Przywara
Signed-off-by: Jintack Lim
Signed-off-by: Marc Zyngier
---
arch/arm64/include/asm/sysreg.h |
From: Christoffer Dall
If we move the used_lrs field to the version-specific cpu interface
structure, the following functions only operate on the struct
vgic_v3_cpu_if and not the full vcpu:
__vgic_v3_save_state
__vgic_v3_restore_state
__vgic_v3_activate_traps
__vgic_v3_deactivate_traps
From: Christoffer Dall
Should the guest hypervisor use the HW bit in the LRs, we need to
emulate the deactivation from the L2 guest into the L1 distributor
emulation, which is handled by L0.
It's all good fun.
Signed-off-by: Christoffer Dall
Signed-off-by: Marc Zyngier
---
arch/arm64/include
On 21/06/2019 10:57, Itaru Kitayama wrote:
> Marc,
> Only possible way to test this series is to get
> on Fast Model?
Unless you have some ARMv8.3-capable hardware lying around, yes.
Note that the Foundation model is also a good option.
Thanks,
M.
--
Jazz is not dead. It just smells fun
Marc,
Only possible way to test this series is to get
on Fast Model?
On Fri, Jun 21, 2019 at 18:55 Marc Zyngier wrote:
> I've taken over the maintenance of this series originally written by
> Jintack and Christoffer. Since then, the series has been substantially
> reworked, new features (and mos
On 21/06/2019 10:37, Marc Zyngier wrote:
> From: Jintack Lim
>
> Add a new ARM64_HAS_NESTED_VIRT feature to indicate that the
> CPU has the ARMv8.3 nested virtualization capability.
>
> This will be used to support nested virtualization in KVM.
>
> Signed-off-by: Jintack Lim
> Signed-off-by
On 21/06/2019 10:37, Marc Zyngier wrote:
> From: Christoffer Dall
>
> Introduce the feature bit and a primitive that checks if the feature is
> set behind a static key check based on the cpus_have_const_cap check.
>
> Checking nested_virt_in_use() on systems without nested virt enabled
> shou
On 21/06/2019 14:08, Julien Thierry wrote:
>
>
> On 21/06/2019 10:37, Marc Zyngier wrote:
>> From: Jintack Lim
>>
>> Add a new ARM64_HAS_NESTED_VIRT feature to indicate that the
>> CPU has the ARMv8.3 nested virtualization capability.
>>
>> This will be used to support nested virtualization in K
On 21/06/2019 10:37, Marc Zyngier wrote:
> From: Christoffer Dall
>
> We were not allowing userspace to set a more privileged mode for the VCPU
> than EL1, but we should allow this when nested virtualization is enabled
> for the VCPU.
>
> Signed-off-by: Christoffer Dall
> Signed-off-by: Marc
Hi James,
sorry for the late reply.
On 2019/6/17 19:19, James Morse wrote:
Hi Zenghui,
On 13/06/2019 12:28, Zenghui Yu wrote:
On 2019/6/12 20:49, James Morse wrote:
On 12/06/2019 10:08, Zenghui Yu wrote:
Currently, we use trace_kvm_exit() to report exception type (e.g.,
"IRQ", "TRAP") and e
On 06/21/2019 10:37 AM, Marc Zyngier wrote:
From: Jintack Lim
Add a new ARM64_HAS_NESTED_VIRT feature to indicate that the
CPU has the ARMv8.3 nested virtualization capability.
This will be used to support nested virtualization in KVM.
Signed-off-by: Jintack Lim
Signed-off-by: Andre Przywara
On 21/06/2019 14:24, Julien Thierry wrote:
>
>
> On 21/06/2019 10:37, Marc Zyngier wrote:
>> From: Christoffer Dall
>>
>> We were not allowing userspace to set a more privileged mode for the VCPU
>> than EL1, but we should allow this when nested virtualization is enabled
>> for the VCPU.
>>
>> S
On Wed, Jun 19, 2019 at 07:51:03PM +0800, Guo Ren wrote:
> On Wed, Jun 19, 2019 at 4:54 PM Julien Grall wrote:
> > On 6/19/19 9:07 AM, Guo Ren wrote:
> > > Move arm asid allocator code in a generic one is a good idea, I've
> > > made a patchset for C-SKY and test is on processing, See:
> > > http