Re: [RFC 00/55] Nested Virtualization on KVM/ARM
[My previous reply had an HTML subpart, which made the e-mail look terrible and
got it rejected by the mailing lists, so I'm sending it again. Sorry for the
inconvenience.]

Hi Christoffer,

On Wed, Feb 22, 2017 at 1:23 PM, Christoffer Dall wrote:
> Hi Jintack,
>
>
> On Mon, Jan 09, 2017 at 01:23:56AM -0500, Jintack Lim wrote:
>> Nested virtualization is the ability to run a virtual machine inside another
>> virtual machine. In other words, it’s about running a hypervisor (the guest
>> hypervisor) on top of another hypervisor (the host hypervisor).
>>
>> This series supports nested virtualization on arm64. ARM recently announced an
>> extension (ARMv8.3) which has support for nested virtualization[1]. This series
>> is based on the ARMv8.3 specification.
>>
>> Supporting nested virtualization means that the hypervisor provides not only
>> EL0/EL1 execution environment with VMs as it usually does, but also the
>> virtualization extensions including EL2 execution environment with the VMs.
>> Once the host hypervisor provides those execution environment with the VMs,
>> then the guest hypervisor can run its own VMs (nested VMs) naturally.
>>
>> To support nested virtualization on ARM the hypervisor must emulate a virtual
>> execution environment consisting of EL2, EL1, and EL0, as the guest hypervisor
>> will run in a virtual EL2 mode. Normally KVM/ARM only emulated a VM supporting
>> EL1/0 running in their respective native CPU modes, but with nested
>> virtualization we deprivilege the guest hypervisor and emulate a virtual EL2
>> execution mode in EL1 using the hardware features provided by ARMv8.3 to trap
>> EL2 operations to EL1. To do that the host hypervisor needs to manage EL2
>> register state for the guest hypervisor, and shadow EL1 register state that
>> reflects the EL2 register state to run the guest hypervisor in EL1. See patch 6
>> through 10 for this.
>>
>> For memory virtualization, the biggest issue is that we now have more than two
>> stages of translation when running nested VMs. We choose to merge two stage-2
>> page tables (one from the guest hypervisor and the other from the host
>> hypervisor) and create shadow stage-2 page tables, which have mappings from the
>> nested VM’s physical addresses to the machine physical addresses. Stage-1
>> translation is done by the hardware as is done for the normal VMs.
>>
>> To provide VGIC support to the guest hypervisor, we emulate the GIC
>> virtualization extensions using trap-and-emulate to a virtual GIC Hypervisor
>> Control Interface. Furthermore, we can still use the GIC VE hardware features
>> to deliver virtual interrupts to the nested VM, by directly mapping the GIC
>> VCPU interface to the nested VM and switching the content of the GIC Hypervisor
>> Control interface when alternating between a nested VM and a normal VM. See
>> patches 25 through 32, and 50 through 52 for more information.
>>
>> For timer virtualization, the guest hypervisor expects to have access to the
>> EL2 physical timer, the EL1 physical timer and the virtual timer. So, the host
>> hypervisor needs to provide all of them. The virtual timer is always available
>> to VMs. The physical timer is available to VMs via my previous patch series[3].
>> The EL2 physical timer is not supported yet in this RFC. We plan to support
>> this as it is required to run other guest hypervisors such as Xen.
>>
>> Even though this work is not complete (see limitations below), I'd appreciate
>> early feedback on this RFC. Specifically, I'm interested in:
>> - Is it better to have a kernel config or to make it configurable at runtime?
>> - I wonder if the data structure for memory management makes sense.
>> - What architecture version do we support for the guest hypervisor, and how?
>>   For example, do we always support all architecture versions or the same
>>   architecture as the underlying hardware platform? Or is it better
>>   to make it configurable from the userspace?
>> - Initial comments on the overall design?
>>
>> This patch series is based on kvm-arm-for-4.9-rc7 with the patch series to
>> provide VMs with the EL1 physical timer[2].
>>
>> Git: https://github.com/columbia/nesting-pub/tree/rfc-v1
>>
>> Testing:
>> We have tested this on ARMv8.0 (Applied Micro X-Gene)[3] since ARMv8.3 hardware
>> is not available yet. We have paravirtualized the guest hypervisor to trap to
>> EL2 as specified in ARMv8.3 specification using hvc instruction. We plan to
>> test this on ARMv8.3 model, and will post the result and v2 if necessary.
>>
>> Limitations:
>> - This patch series only supports arm64, not arm. All the patches compile on
>>   arm, but I haven't try to boot normal VMs on it.
>> - The guest hypervisor with VHE (ARMv8.1) is not supported in this RFC. I have
>>   patches for that, but they need to be cleaned up.
>> - Recursive nesting (i.e. emulating ARMv8.3 in the VM) is not tested yet.
>> - Other hypervisors (such as Xen) o
Re: [RFC 00/55] Nested Virtualization on KVM/ARM
Hi Jintack,

On Mon, Jan 09, 2017 at 01:23:56AM -0500, Jintack Lim wrote:
> Nested virtualization is the ability to run a virtual machine inside another
> virtual machine. In other words, it’s about running a hypervisor (the guest
> hypervisor) on top of another hypervisor (the host hypervisor).
>
> This series supports nested virtualization on arm64. ARM recently announced an
> extension (ARMv8.3) which has support for nested virtualization[1]. This series
> is based on the ARMv8.3 specification.
>
> Supporting nested virtualization means that the hypervisor provides not only
> EL0/EL1 execution environment with VMs as it usually does, but also the
> virtualization extensions including EL2 execution environment with the VMs.
> Once the host hypervisor provides those execution environment with the VMs,
> then the guest hypervisor can run its own VMs (nested VMs) naturally.
>
> To support nested virtualization on ARM the hypervisor must emulate a virtual
> execution environment consisting of EL2, EL1, and EL0, as the guest hypervisor
> will run in a virtual EL2 mode. Normally KVM/ARM only emulated a VM supporting
> EL1/0 running in their respective native CPU modes, but with nested
> virtualization we deprivilege the guest hypervisor and emulate a virtual EL2
> execution mode in EL1 using the hardware features provided by ARMv8.3 to trap
> EL2 operations to EL1. To do that the host hypervisor needs to manage EL2
> register state for the guest hypervisor, and shadow EL1 register state that
> reflects the EL2 register state to run the guest hypervisor in EL1. See patch 6
> through 10 for this.
>
> For memory virtualization, the biggest issue is that we now have more than two
> stages of translation when running nested VMs. We choose to merge two stage-2
> page tables (one from the guest hypervisor and the other from the host
> hypervisor) and create shadow stage-2 page tables, which have mappings from the
> nested VM’s physical addresses to the machine physical addresses. Stage-1
> translation is done by the hardware as is done for the normal VMs.
>
> To provide VGIC support to the guest hypervisor, we emulate the GIC
> virtualization extensions using trap-and-emulate to a virtual GIC Hypervisor
> Control Interface. Furthermore, we can still use the GIC VE hardware features
> to deliver virtual interrupts to the nested VM, by directly mapping the GIC
> VCPU interface to the nested VM and switching the content of the GIC Hypervisor
> Control interface when alternating between a nested VM and a normal VM. See
> patches 25 through 32, and 50 through 52 for more information.
>
> For timer virtualization, the guest hypervisor expects to have access to the
> EL2 physical timer, the EL1 physical timer and the virtual timer. So, the host
> hypervisor needs to provide all of them. The virtual timer is always available
> to VMs. The physical timer is available to VMs via my previous patch series[3].
> The EL2 physical timer is not supported yet in this RFC. We plan to support
> this as it is required to run other guest hypervisors such as Xen.
>
> Even though this work is not complete (see limitations below), I'd appreciate
> early feedback on this RFC. Specifically, I'm interested in:
> - Is it better to have a kernel config or to make it configurable at runtime?
> - I wonder if the data structure for memory management makes sense.
> - What architecture version do we support for the guest hypervisor, and how?
>   For example, do we always support all architecture versions or the same
>   architecture as the underlying hardware platform? Or is it better
>   to make it configurable from the userspace?
> - Initial comments on the overall design?
>
> This patch series is based on kvm-arm-for-4.9-rc7 with the patch series to
> provide VMs with the EL1 physical timer[2].
>
> Git: https://github.com/columbia/nesting-pub/tree/rfc-v1
>
> Testing:
> We have tested this on ARMv8.0 (Applied Micro X-Gene)[3] since ARMv8.3 hardware
> is not available yet. We have paravirtualized the guest hypervisor to trap to
> EL2 as specified in ARMv8.3 specification using hvc instruction. We plan to
> test this on ARMv8.3 model, and will post the result and v2 if necessary.
>
> Limitations:
> - This patch series only supports arm64, not arm. All the patches compile on
>   arm, but I haven't try to boot normal VMs on it.
> - The guest hypervisor with VHE (ARMv8.1) is not supported in this RFC. I have
>   patches for that, but they need to be cleaned up.
> - Recursive nesting (i.e. emulating ARMv8.3 in the VM) is not tested yet.
> - Other hypervisors (such as Xen) on KVM are not tested.
>
> TODO:
> - Test to boot normal VMs on arm architecture
> - Test this on ARMv8.3 model
> - Support the guest hypervisor with VHE
> - Provide the guest hypervisor with the EL2 physical timer
> - Run other hypervisors such as Xen on KVM
>

I have a couple of overall questions and comments on this series:

First, I think we
Re: [RFC 00/55] Nested Virtualization on KVM/ARM
On Mon, Jan 9, 2017 at 10:05 AM, David Hildenbrand wrote:
>
>> Even though this work is not complete (see limitations below), I'd appreciate
>> early feedback on this RFC. Specifically, I'm interested in:
>> - Is it better to have a kernel config or to make it configurable at runtime?
>
> x86 and s390x have a kernel module parameter (nested) that can only be
> changed when loading the module and should default to false. So the
> admin explicitly has to enable it. Maybe going the same path makes
> sense.

I think that makes sense. Thanks!

> --
>
> David
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
Re: [RFC 00/55] Nested Virtualization on KVM/ARM
> Even though this work is not complete (see limitations below), I'd appreciate
> early feedback on this RFC. Specifically, I'm interested in:
> - Is it better to have a kernel config or to make it configurable at runtime?

x86 and s390x have a kernel module parameter (nested) that can only be
changed when loading the module and should default to false. So the
admin explicitly has to enable it. Maybe going the same path makes
sense.

--

David
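
For readers who have not used the x86 knob mentioned above, here is a minimal
sketch of how an admin-side script might check it. It assumes the parameter is
exposed under /sys/module/<module>/parameters/nested (as it is for kvm_intel
and kvm_amd on x86) and that the value reads as Y/N or 1/0 depending on kernel
version. The helper name nested_enabled and the sysfs_root argument are
hypothetical, introduced only for illustration; nothing here describes what
the arm64 series itself does.

```python
from pathlib import Path


def nested_enabled(module, sysfs_root="/sys/module"):
    """Return True/False for the module's 'nested' parameter, or None if absent.

    'module' is a KVM vendor module name such as "kvm_intel" (an x86 example;
    an arm64 equivalent is not defined by the RFC and is not assumed here).
    """
    param = Path(sysfs_root) / module / "parameters" / "nested"
    try:
        value = param.read_text().strip()
    except OSError:
        # Module not loaded, or it has no 'nested' parameter.
        return None
    # Older kernels report a boolean as Y/N, newer ones may report 1/0.
    return value in ("Y", "y", "1")


if __name__ == "__main__":
    for name in ("kvm_intel", "kvm_amd"):
        print(name, nested_enabled(name))
```

Because the parameter can only be set at module load time, a check like this
only reports the current state; changing it means reloading the module with
the parameter set.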
[RFC 00/55] Nested Virtualization on KVM/ARM
Nested virtualization is the ability to run a virtual machine inside another
virtual machine. In other words, it's about running a hypervisor (the guest
hypervisor) on top of another hypervisor (the host hypervisor).

This series supports nested virtualization on arm64. ARM recently announced an
extension (ARMv8.3) which has support for nested virtualization[1]. This series
is based on the ARMv8.3 specification.

Supporting nested virtualization means that the hypervisor provides VMs not
only with the EL0/EL1 execution environment, as it usually does, but also with
the virtualization extensions, including the EL2 execution environment. Once
the host hypervisor provides this execution environment to the VMs, the guest
hypervisor can run its own VMs (nested VMs) naturally.

To support nested virtualization on ARM, the hypervisor must emulate a virtual
execution environment consisting of EL2, EL1, and EL0, as the guest hypervisor
will run in a virtual EL2 mode. Normally KVM/ARM only emulates a VM supporting
EL1/0 running in their respective native CPU modes, but with nested
virtualization we deprivilege the guest hypervisor and emulate a virtual EL2
execution mode in EL1, using the hardware features provided by ARMv8.3 to trap
EL2 operations to EL1. To do that, the host hypervisor needs to manage EL2
register state for the guest hypervisor, and shadow EL1 register state that
reflects the EL2 register state to run the guest hypervisor in EL1. See patches
6 through 10 for this.

For memory virtualization, the biggest issue is that we now have more than two
stages of translation when running nested VMs. We choose to merge two stage-2
page tables (one from the guest hypervisor and the other from the host
hypervisor) and create shadow stage-2 page tables, which have mappings from the
nested VM's physical addresses to the machine physical addresses. Stage-1
translation is done by the hardware, as is done for normal VMs.
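
To make the shadow stage-2 idea concrete, here is a toy model of composing the
two stage-2 mappings into one. It is only a sketch under simplifying
assumptions: page tables are flat dictionaries keyed by page number, pages are
4 KiB, permissions are ignored, and the shadow table is built eagerly in one
pass. The names translate and build_shadow_stage2 are hypothetical and are not
taken from the patches; a real hypervisor would more likely fill shadow
entries on demand.

```python
PAGE_SHIFT = 12                 # assume 4 KiB pages for this toy model
PAGE_SIZE = 1 << PAGE_SHIFT


def translate(table, addr):
    """Translate addr through a flat {input page -> output page} table.

    Returns None when the page is unmapped (a translation fault).
    """
    page, offset = addr >> PAGE_SHIFT, addr & (PAGE_SIZE - 1)
    target = table.get(page)
    if target is None:
        return None
    return (target << PAGE_SHIFT) | offset


def build_shadow_stage2(guest_s2, host_s2):
    """Merge two stage-2 tables into a shadow stage-2 table.

    guest_s2: nested VM physical page -> guest hypervisor's VM physical page
    host_s2:  VM physical page        -> machine physical page
    result:   nested VM physical page -> machine physical page
    """
    shadow = {}
    for nested_page, vm_page in guest_s2.items():
        machine_page = host_s2.get(vm_page)
        if machine_page is not None:
            # Only pages mapped at both levels appear in the shadow table.
            shadow[nested_page] = machine_page
    return shadow


# Hypothetical example mappings (page numbers, not addresses).
guest_s2 = {0x0: 0x40, 0x1: 0x41, 0x2: 0x99}
host_s2 = {0x40: 0x8000, 0x41: 0x8003}
shadow = build_shadow_stage2(guest_s2, host_s2)
```

With the shadow table in place, a nested VM physical address needs a single
stage-2 lookup rather than two, which is what lets the hardware keep doing the
stage-1 walk unchanged.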

To provide VGIC support to the guest hypervisor, we emulate the GIC
virtualization extensions using trap-and-emulate to a virtual GIC Hypervisor
Control Interface. Furthermore, we can still use the GIC VE hardware features
to deliver virtual interrupts to the nested VM, by directly mapping the GIC
VCPU interface to the nested VM and switching the content of the GIC Hypervisor
Control Interface when alternating between a nested VM and a normal VM. See
patches 25 through 32, and 50 through 52 for more information.

For timer virtualization, the guest hypervisor expects to have access to the
EL2 physical timer, the EL1 physical timer and the virtual timer, so the host
hypervisor needs to provide all of them. The virtual timer is always available
to VMs. The EL1 physical timer is available to VMs via my previous patch
series[2]. The EL2 physical timer is not supported yet in this RFC. We plan to
support it, as it is required to run other guest hypervisors such as Xen.

Even though this work is not complete (see limitations below), I'd appreciate
early feedback on this RFC. Specifically, I'm interested in:
- Is it better to have a kernel config or to make it configurable at runtime?
- Does the data structure for memory management make sense?
- What architecture version do we support for the guest hypervisor, and how?
  For example, do we always support all architecture versions or the same
  architecture as the underlying hardware platform? Or is it better
  to make it configurable from userspace?
- Initial comments on the overall design?

This patch series is based on kvm-arm-for-4.9-rc7 with the patch series to
provide VMs with the EL1 physical timer[2].

Git: https://github.com/columbia/nesting-pub/tree/rfc-v1

Testing:
We have tested this on ARMv8.0 (Applied Micro X-Gene)[3] since ARMv8.3 hardware
is not available yet. We have paravirtualized the guest hypervisor to trap to
EL2, as specified in the ARMv8.3 specification, using the hvc instruction. We
plan to test this on an ARMv8.3 model, and will post the result and a v2 if
necessary.

Limitations:
- This patch series only supports arm64, not arm. All the patches compile on
  arm, but I haven't tried to boot normal VMs on it.
- The guest hypervisor with VHE (ARMv8.1) is not supported in this RFC. I have
  patches for that, but they need to be cleaned up.
- Recursive nesting (i.e. emulating ARMv8.3 in the VM) is not tested yet.
- Other hypervisors (such as Xen) on KVM are not tested.

TODO:
- Test booting normal VMs on the arm architecture
- Test this on an ARMv8.3 model
- Support the guest hypervisor with VHE
- Provide the guest hypervisor with the EL2 physical timer
- Run other hypervisors such as Xen on KVM

[1] https://www.community.arm.com/processors/b/blog/posts/armv8-a-architecture-2016-additions
[2] https://lists.cs.columbia.edu/pipermail/kvmarm/2016-December/022825.html
[3] https://www.cloudlab.us/hardware.php#utah

Christoffer Dall (27):
  arm64: Add missing TCR hw defines
  KVM: arm64: Add nesting config option
  KVM: arm64: Add KVM nestin