Re: [RFC PATCH 00/19] QEMU gmem implemention
On Thu, Aug 10, 2023 at 10:58:09AM -0500, Michael Roth wrote:
> On Tue, Aug 01, 2023 at 09:45:41AM +0800, Xiaoyao Li wrote:
> > On 8/1/2023 12:51 AM, Daniel P. Berrangé wrote:
> > > On Mon, Jul 31, 2023 at 12:21:42PM -0400, Xiaoyao Li wrote:
> > > > This is the first RFC version of enabling KVM gmem[1] as the backend for
> > > > private memory of KVM_X86_PROTECTED_VM.
> > > >
> > > > It adds the support to create a specific KVM_X86_PROTECTED_VM type VM,
> > > > and introduces 'private' property for memory backend. When the vm type
> > > > is KVM_X86_PROTECTED_VM and memory backend has private enabled as below,
> > > > it will call KVM gmem ioctl to allocate private memory for the backend.
> > > >
> > > >     $qemu -object memory-backend-ram,id=mem0,size=1G,private=on \
> > > >           -machine q35,kvm-type=sw-protected-vm,memory-backend=mem0 \
> > > >           ...
> > > >
> > > > Unfortunately this patch series fails the boot of OVMF at very early
> > > > stage due to triple fault because KVM doesn't support emulating string I/O
> > > > to private memory. We leave it as an open to be discussed.
> > > >
> > > > There are following design opens that need to be discussed:
> > > >
> > > > 1. how to determine the vm type?
> > > >
> > > >    a. like this series, specify the vm type via machine property
> > > >       'kvm-type'
> > > >    b. check the memory backend, if any backend has 'private' property
> > > >       set, the vm-type is set to KVM_X86_PROTECTED_VM.
> > > >
> > > > 2. whether 'private' property is needed if we choose 1.b as design
> > > >
> > > >    with 1.b, QEMU can decide whether the memory region needs to be
> > > >    private (allocates gmem fd for it) or not, on its own.
> > > >
> > > > 3. What is KVM_X86_SW_PROTECTED_VM going to look like? What's the
> > > >    purpose of it and what's the requirement on it. I think these are
> > > >    questions for KVM folks rather than QEMU folks.
> > > >
> > > > Any other idea/open/question is welcomed.
> > > >
> > > > Besides, TDX QEMU implementation is based on this series to provide
> > > > private gmem for TD private memory, which can be found at [2].
> > > > And it can work with the corresponding KVM [3] to boot TDX guest.
> > >
> > > We already have a general purpose configuration mechanism for
> > > confidential guests. The -machine argument has a property
> > > confidential-guest-support=$OBJECT-ID, for pointing to an
> > > object that implements the TYPE_CONFIDENTIAL_GUEST_SUPPORT
> > > interface in QEMU. This is implemented with SEV, PPC PEF
> > > mode, and s390 protvirt.
> > >
> > > I would expect TDX to follow this same design ie
> > >
> > >     qemu-system-x86_64 \
> > >       -object tdx-guest,id=tdx0,... \
> > >       -machine q35,confidential-guest-support=tdx0 \
> > >       ...
> > >
> > > and not require inventing the new 'kvm-type' attribute at least.
> >
> > yes.
> >
> > TDX is initialized exactly as the above.
> >
> > This RFC series introduces the 'kvm-type' for KVM_X86_SW_PROTECTED_VM. It's
> > my fault that I forgot to list the option of introducing sw_protected_vm
> > object with CONFIDENTIAL_GUEST_SUPPORT interface.
> > Thanks to Isaku for raising it
> > https://lore.kernel.org/qemu-devel/20230731171041.gb1807...@ls.amr.corp.intel.com/
> >
> > we can specify KVM_X86_SW_PROTECTED_VM this way:
> >
> >     qemu \
> >       -object sw-protected,id=swp0,... \
> >       -machine confidential-guest-support=swp0 \
> >       ...
> >
> > > For the memory backend though, I'm not so sure - possibly that
> > > might be something that still wants an extra property to identify
> > > the type of memory to allocate, since we use memory-backend-ram
> > > for a variety of use cases. Or it could be an entirely new object
> > > type such as "memory-backend-gmem"
> >
> > What I want to discuss is whether to provide an interface that lets users
> > configure which memory is/can be private. For example, QEMU can do it
> > internally. If users want a confidential guest, QEMU allocates private gmem
> > for normal RAM automatically.
>
> I think handling it automatically simplifies things a good deal on the
> QEMU side. I think it's still worthwhile to allow:
>
>     -object memory-backend-memfd-private,...
>
> because it provides a nice mechanism to set up a pair of shared/private
> memfd's to enable hole-punching via fallocate() to avoid doubling memory
> allocations for shared/private. It's also a nice place to control
> potentially-configurable things like:
>
>  - whether or not to enable discard/hole-punching
>  - if discard is enabled, whether or not to register the range via
>    RamDiscardManager interface so that VFIO/IOMMU mappings get updated
>    when doing PCI passthrough. SNP relies on this for PCI passthrough
>    when discard is enabled, otherwise DMA occurs to stale mappings of
>    discarded bounce-buffer pages:
>
>    https://github.com/AMDESE/qemu/blob/snp-latest/backends/hostmem-memfd-private.c
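
The shared/private hole-punching idea quoted above can be sketched in C
roughly as follows. This is only an illustration under stated assumptions,
not code from the series or from the SNP tree: both backings are plain
memfds here (a real private backing would come from the KVM gmem ioctl),
and backing_create(), backing_discard() and convert_range() are
hypothetical helper names. The only real interfaces used are
memfd_create(), ftruncate() and fallocate() with FALLOC_FL_PUNCH_HOLE.

```c
#define _GNU_SOURCE
#include <fcntl.h>      /* fallocate(), FALLOC_FL_PUNCH_HOLE, FALLOC_FL_KEEP_SIZE */
#include <stdbool.h>
#include <sys/mman.h>   /* memfd_create() */
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>     /* ftruncate(), close() */

/* Hypothetical helper: create a sparse, anonymous backing file of `size` bytes. */
int backing_create(const char *name, off_t size)
{
    int fd = memfd_create(name, MFD_CLOEXEC);
    if (fd < 0) {
        return -1;
    }
    if (ftruncate(fd, size) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Free the pages backing [offset, offset + len) while keeping the file size. */
int backing_discard(int fd, off_t offset, off_t len)
{
    return fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
                     offset, len);
}

/*
 * Hypothetical conversion step: when a range becomes private, its pages in
 * the shared backing are no longer needed (and vice versa), so punch them
 * out of the side that is giving the range up. This is what avoids keeping
 * two full copies of guest memory allocated.
 */
int convert_range(int shared_fd, int private_fd,
                  off_t offset, off_t len, bool to_private)
{
    return backing_discard(to_private ? shared_fd : private_fd, offset, len);
}
```

After a range is punched out, reads from it return zeroes and the pages are
only reallocated on the next write, so a range that flips back and forth
never holds memory in both backings at once.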
Re: [RFC PATCH 00/19] QEMU gmem implemention
On Tue, Aug 01, 2023 at 09:45:41AM +0800, Xiaoyao Li wrote:
> On 8/1/2023 12:51 AM, Daniel P. Berrangé wrote:
> > On Mon, Jul 31, 2023 at 12:21:42PM -0400, Xiaoyao Li wrote:
> > > This is the first RFC version of enabling KVM gmem[1] as the backend for
> > > private memory of KVM_X86_PROTECTED_VM.
> > >
> > > It adds the support to create a specific KVM_X86_PROTECTED_VM type VM,
> > > and introduces 'private' property for memory backend. When the vm type
> > > is KVM_X86_PROTECTED_VM and memory backend has private enabled as below,
> > > it will call KVM gmem ioctl to allocate private memory for the backend.
> > >
> > >     $qemu -object memory-backend-ram,id=mem0,size=1G,private=on \
> > >           -machine q35,kvm-type=sw-protected-vm,memory-backend=mem0 \
> > >           ...
> > >
> > > Unfortunately this patch series fails the boot of OVMF at very early
> > > stage due to triple fault because KVM doesn't support emulating string I/O
> > > to private memory. We leave it as an open to be discussed.
> > >
> > > There are following design opens that need to be discussed:
> > >
> > > 1. how to determine the vm type?
> > >
> > >    a. like this series, specify the vm type via machine property
> > >       'kvm-type'
> > >    b. check the memory backend, if any backend has 'private' property
> > >       set, the vm-type is set to KVM_X86_PROTECTED_VM.
> > >
> > > 2. whether 'private' property is needed if we choose 1.b as design
> > >
> > >    with 1.b, QEMU can decide whether the memory region needs to be
> > >    private (allocates gmem fd for it) or not, on its own.
> > >
> > > 3. What is KVM_X86_SW_PROTECTED_VM going to look like? What's the
> > >    purpose of it and what's the requirement on it. I think these are
> > >    questions for KVM folks rather than QEMU folks.
> > >
> > > Any other idea/open/question is welcomed.
> > >
> > > Besides, TDX QEMU implementation is based on this series to provide
> > > private gmem for TD private memory, which can be found at [2].
> > > And it can work with the corresponding KVM [3] to boot TDX guest.
> >
> > We already have a general purpose configuration mechanism for
> > confidential guests. The -machine argument has a property
> > confidential-guest-support=$OBJECT-ID, for pointing to an
> > object that implements the TYPE_CONFIDENTIAL_GUEST_SUPPORT
> > interface in QEMU. This is implemented with SEV, PPC PEF
> > mode, and s390 protvirt.
> >
> > I would expect TDX to follow this same design ie
> >
> >     qemu-system-x86_64 \
> >       -object tdx-guest,id=tdx0,... \
> >       -machine q35,confidential-guest-support=tdx0 \
> >       ...
> >
> > and not require inventing the new 'kvm-type' attribute at least.
>
> yes.
>
> TDX is initialized exactly as the above.
>
> This RFC series introduces the 'kvm-type' for KVM_X86_SW_PROTECTED_VM. It's
> my fault that I forgot to list the option of introducing sw_protected_vm
> object with CONFIDENTIAL_GUEST_SUPPORT interface.
> Thanks to Isaku for raising it
> https://lore.kernel.org/qemu-devel/20230731171041.gb1807...@ls.amr.corp.intel.com/
>
> we can specify KVM_X86_SW_PROTECTED_VM this way:
>
>     qemu \
>       -object sw-protected,id=swp0,... \
>       -machine confidential-guest-support=swp0 \
>       ...
>
> > For the memory backend though, I'm not so sure - possibly that
> > might be something that still wants an extra property to identify
> > the type of memory to allocate, since we use memory-backend-ram
> > for a variety of use cases. Or it could be an entirely new object
> > type such as "memory-backend-gmem"
>
> What I want to discuss is whether to provide an interface that lets users
> configure which memory is/can be private. For example, QEMU can do it
> internally. If users want a confidential guest, QEMU allocates private gmem
> for normal RAM automatically.

I think handling it automatically simplifies things a good deal on the
QEMU side. I think it's still worthwhile to allow:

    -object memory-backend-memfd-private,...

because it provides a nice mechanism to set up a pair of shared/private
memfd's to enable hole-punching via fallocate() to avoid doubling memory
allocations for shared/private. It's also a nice place to control
potentially-configurable things like:

 - whether or not to enable discard/hole-punching
 - if discard is enabled, whether or not to register the range via
   RamDiscardManager interface so that VFIO/IOMMU mappings get updated
   when doing PCI passthrough. SNP relies on this for PCI passthrough
   when discard is enabled, otherwise DMA occurs to stale mappings of
   discarded bounce-buffer pages:

   https://github.com/AMDESE/qemu/blob/snp-latest/backends/hostmem-memfd-private.c#L449

But for other memory ranges, it doesn't do a lot of good to rely on
users to control those via -object memory-backend-memfd-private, since
QEMU will set up some regions internally, like the UEFI ROM.

It also isn't ideal for QEMU itself to internally control what
should/shouldn't
Re: [RFC PATCH 00/19] QEMU gmem implemention
On 8/1/2023 1:10 AM, Isaku Yamahata wrote:
> On Mon, Jul 31, 2023 at 12:21:42PM -0400, Xiaoyao Li wrote:
> > This is the first RFC version of enabling KVM gmem[1] as the backend for
> > private memory of KVM_X86_PROTECTED_VM.
> >
> > It adds the support to create a specific KVM_X86_PROTECTED_VM type VM,
> > and introduces 'private' property for memory backend. When the vm type
> > is KVM_X86_PROTECTED_VM and memory backend has private enabled as below,
> > it will call KVM gmem ioctl to allocate private memory for the backend.
> >
> >     $qemu -object memory-backend-ram,id=mem0,size=1G,private=on \
> >           -machine q35,kvm-type=sw-protected-vm,memory-backend=mem0 \
> >           ...
> >
> > Unfortunately this patch series fails the boot of OVMF at very early
> > stage due to triple fault because KVM doesn't support emulating string I/O
> > to private memory. We leave it as an open to be discussed.
> >
> > There are following design opens that need to be discussed:
> >
> > 1. how to determine the vm type?
> >
> >    a. like this series, specify the vm type via machine property
> >       'kvm-type'
> >    b. check the memory backend, if any backend has 'private' property
> >       set, the vm-type is set to KVM_X86_PROTECTED_VM.
>
> Hi Xiaoyao.
> Because QEMU already has confidential guest support, we should utilize it.
> Say,
>
>     qemu \
>       -object sw-protected,id=swp0 \
>       -machine confidential-guest-support=swp0

Thanks for pointing out this option. I thought of it and forgot to list
it as an option.

It seems better and I'll go in this direction if no one has a different
opinion.

> > 2. whether 'private' property is needed if we choose 1.b as design
> >
> >    with 1.b, QEMU can decide whether the memory region needs to be
> >    private (allocates gmem fd for it) or not, on its own.
>
> Memory region property (how to create KVM memory slot) should be independent
> from underlying VM type. Some (e.g. TDX) may require KVM private memory slot,
> some may not. Leave the decision to its vm type backend. They can use qemu
> memory listener.

As I replied to Daniel, the topic is whether 'private' property is
needed. Is it essential to let users decide which memory can be private?
It seems OK that QEMU can make the decision based on VM type.
Re: [RFC PATCH 00/19] QEMU gmem implemention
On 8/1/2023 12:51 AM, Daniel P. Berrangé wrote:
> On Mon, Jul 31, 2023 at 12:21:42PM -0400, Xiaoyao Li wrote:
> > This is the first RFC version of enabling KVM gmem[1] as the backend for
> > private memory of KVM_X86_PROTECTED_VM.
> >
> > It adds the support to create a specific KVM_X86_PROTECTED_VM type VM,
> > and introduces 'private' property for memory backend. When the vm type
> > is KVM_X86_PROTECTED_VM and memory backend has private enabled as below,
> > it will call KVM gmem ioctl to allocate private memory for the backend.
> >
> >     $qemu -object memory-backend-ram,id=mem0,size=1G,private=on \
> >           -machine q35,kvm-type=sw-protected-vm,memory-backend=mem0 \
> >           ...
> >
> > Unfortunately this patch series fails the boot of OVMF at very early
> > stage due to triple fault because KVM doesn't support emulating string I/O
> > to private memory. We leave it as an open to be discussed.
> >
> > There are following design opens that need to be discussed:
> >
> > 1. how to determine the vm type?
> >
> >    a. like this series, specify the vm type via machine property
> >       'kvm-type'
> >    b. check the memory backend, if any backend has 'private' property
> >       set, the vm-type is set to KVM_X86_PROTECTED_VM.
> >
> > 2. whether 'private' property is needed if we choose 1.b as design
> >
> >    with 1.b, QEMU can decide whether the memory region needs to be
> >    private (allocates gmem fd for it) or not, on its own.
> >
> > 3. What is KVM_X86_SW_PROTECTED_VM going to look like? What's the
> >    purpose of it and what's the requirement on it. I think these are
> >    questions for KVM folks rather than QEMU folks.
> >
> > Any other idea/open/question is welcomed.
> >
> > Besides, TDX QEMU implementation is based on this series to provide
> > private gmem for TD private memory, which can be found at [2].
> > And it can work with the corresponding KVM [3] to boot TDX guest.
>
> We already have a general purpose configuration mechanism for
> confidential guests. The -machine argument has a property
> confidential-guest-support=$OBJECT-ID, for pointing to an
> object that implements the TYPE_CONFIDENTIAL_GUEST_SUPPORT
> interface in QEMU. This is implemented with SEV, PPC PEF
> mode, and s390 protvirt.
>
> I would expect TDX to follow this same design ie
>
>     qemu-system-x86_64 \
>       -object tdx-guest,id=tdx0,... \
>       -machine q35,confidential-guest-support=tdx0 \
>       ...
>
> and not require inventing the new 'kvm-type' attribute at least.

yes.

TDX is initialized exactly as the above.

This RFC series introduces the 'kvm-type' for KVM_X86_SW_PROTECTED_VM. It's
my fault that I forgot to list the option of introducing sw_protected_vm
object with CONFIDENTIAL_GUEST_SUPPORT interface.
Thanks to Isaku for raising it
https://lore.kernel.org/qemu-devel/20230731171041.gb1807...@ls.amr.corp.intel.com/

we can specify KVM_X86_SW_PROTECTED_VM this way:

    qemu \
      -object sw-protected,id=swp0,... \
      -machine confidential-guest-support=swp0 \
      ...

> For the memory backend though, I'm not so sure - possibly that
> might be something that still wants an extra property to identify
> the type of memory to allocate, since we use memory-backend-ram
> for a variety of use cases. Or it could be an entirely new object
> type such as "memory-backend-gmem"

What I want to discuss is whether to provide an interface that lets users
configure which memory is/can be private. For example, QEMU can do it
internally. If users want a confidential guest, QEMU allocates private gmem
for normal RAM automatically.
Re: [RFC PATCH 00/19] QEMU gmem implemention
On Mon, Jul 31, 2023 at 12:21:42PM -0400, Xiaoyao Li wrote:
> This is the first RFC version of enabling KVM gmem[1] as the backend for
> private memory of KVM_X86_PROTECTED_VM.
>
> It adds the support to create a specific KVM_X86_PROTECTED_VM type VM,
> and introduces 'private' property for memory backend. When the vm type
> is KVM_X86_PROTECTED_VM and memory backend has private enabled as below,
> it will call KVM gmem ioctl to allocate private memory for the backend.
>
>     $qemu -object memory-backend-ram,id=mem0,size=1G,private=on \
>           -machine q35,kvm-type=sw-protected-vm,memory-backend=mem0 \
>           ...
>
> Unfortunately this patch series fails the boot of OVMF at very early
> stage due to triple fault because KVM doesn't support emulating string I/O
> to private memory. We leave it as an open to be discussed.
>
> There are following design opens that need to be discussed:
>
> 1. how to determine the vm type?
>
>    a. like this series, specify the vm type via machine property
>       'kvm-type'
>    b. check the memory backend, if any backend has 'private' property
>       set, the vm-type is set to KVM_X86_PROTECTED_VM.

Hi Xiaoyao.
Because QEMU already has confidential guest support, we should utilize it.
Say,

    qemu \
      -object sw-protected,id=swp0 \
      -machine confidential-guest-support=swp0

> 2. whether 'private' property is needed if we choose 1.b as design
>
>    with 1.b, QEMU can decide whether the memory region needs to be
>    private (allocates gmem fd for it) or not, on its own.

Memory region property (how to create KVM memory slot) should be independent
from underlying VM type. Some (e.g. TDX) may require KVM private memory slot,
some may not. Leave the decision to its vm type backend. They can use qemu
memory listener.

-- 
Isaku Yamahata
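
One way to picture "leave the decision to its vm type backend" is a
per-VM-type policy hook that generic memslot code consults. The sketch
below is purely illustrative: Region, VmTypeBackend, the two example
backends and memslot_needs_private() are hypothetical names and do not
correspond to QEMU's MemoryRegion, MemoryListener or
ConfidentialGuestSupport types, and the "RAM is private" policy is an
assumption chosen only to show the shape of the idea.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-in for a guest memory region. */
typedef enum { REGION_RAM, REGION_MMIO } RegionKind;

typedef struct {
    const char *name;
    RegionKind kind;
} Region;

/* Hypothetical per-VM-type backend: it owns the private/shared policy. */
typedef struct {
    const char *name;
    bool (*region_is_private)(const Region *region);
} VmTypeBackend;

/* Policy for a VM type that never uses private memory slots. */
bool never_private(const Region *region)
{
    (void)region;
    return false;
}

/* Assumed policy for a protected VM type: guest RAM is backed privately. */
bool ram_is_private(const Region *region)
{
    return region->kind == REGION_RAM;
}

const VmTypeBackend legacy_vm = { "legacy", never_private };
const VmTypeBackend protected_vm = { "protected", ram_is_private };

/*
 * Generic code asks the backend instead of looking at a user-set property,
 * so the region description stays independent of the VM type.
 */
bool memslot_needs_private(const VmTypeBackend *vm, const Region *region)
{
    return vm != NULL && vm->region_is_private != NULL &&
           vm->region_is_private(region);
}
```

With this shape, adding a new VM type means supplying another policy
function rather than teaching every memory backend a new property.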
Re: [RFC PATCH 00/19] QEMU gmem implemention
On Mon, Jul 31, 2023 at 12:21:42PM -0400, Xiaoyao Li wrote:
> This is the first RFC version of enabling KVM gmem[1] as the backend for
> private memory of KVM_X86_PROTECTED_VM.
>
> It adds the support to create a specific KVM_X86_PROTECTED_VM type VM,
> and introduces 'private' property for memory backend. When the vm type
> is KVM_X86_PROTECTED_VM and memory backend has private enabled as below,
> it will call KVM gmem ioctl to allocate private memory for the backend.
>
>     $qemu -object memory-backend-ram,id=mem0,size=1G,private=on \
>           -machine q35,kvm-type=sw-protected-vm,memory-backend=mem0 \
>           ...
>
> Unfortunately this patch series fails the boot of OVMF at very early
> stage due to triple fault because KVM doesn't support emulating string I/O
> to private memory. We leave it as an open to be discussed.
>
> There are following design opens that need to be discussed:
>
> 1. how to determine the vm type?
>
>    a. like this series, specify the vm type via machine property
>       'kvm-type'
>    b. check the memory backend, if any backend has 'private' property
>       set, the vm-type is set to KVM_X86_PROTECTED_VM.
>
> 2. whether 'private' property is needed if we choose 1.b as design
>
>    with 1.b, QEMU can decide whether the memory region needs to be
>    private (allocates gmem fd for it) or not, on its own.
>
> 3. What is KVM_X86_SW_PROTECTED_VM going to look like? What's the
>    purpose of it and what's the requirement on it. I think these are
>    questions for KVM folks rather than QEMU folks.
>
> Any other idea/open/question is welcomed.
>
> Besides, TDX QEMU implementation is based on this series to provide
> private gmem for TD private memory, which can be found at [2].
> And it can work with the corresponding KVM [3] to boot TDX guest.

We already have a general purpose configuration mechanism for
confidential guests. The -machine argument has a property
confidential-guest-support=$OBJECT-ID, for pointing to an
object that implements the TYPE_CONFIDENTIAL_GUEST_SUPPORT
interface in QEMU. This is implemented with SEV, PPC PEF
mode, and s390 protvirt.

I would expect TDX to follow this same design ie

    qemu-system-x86_64 \
      -object tdx-guest,id=tdx0,... \
      -machine q35,confidential-guest-support=tdx0 \
      ...

and not require inventing the new 'kvm-type' attribute at least.

For the memory backend though, I'm not so sure - possibly that
might be something that still wants an extra property to identify
the type of memory to allocate, since we use memory-backend-ram
for a variety of use cases. Or it could be an entirely new object
type such as "memory-backend-gmem"

With regards,
Daniel
-- 
|: https://berrange.com       -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org        -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org -o-    https://www.instagram.com/dberrange :|
[RFC PATCH 00/19] QEMU gmem implemention
This is the first RFC version of enabling KVM gmem[1] as the backend for
private memory of KVM_X86_PROTECTED_VM.

It adds the support to create a specific KVM_X86_PROTECTED_VM type VM,
and introduces 'private' property for memory backend. When the vm type
is KVM_X86_PROTECTED_VM and memory backend has private enabled as below,
it will call KVM gmem ioctl to allocate private memory for the backend.

    $qemu -object memory-backend-ram,id=mem0,size=1G,private=on \
          -machine q35,kvm-type=sw-protected-vm,memory-backend=mem0 \
          ...

Unfortunately this patch series fails the boot of OVMF at very early
stage due to triple fault because KVM doesn't support emulating string I/O
to private memory. We leave it as an open to be discussed.

There are following design opens that need to be discussed:

1. how to determine the vm type?

   a. like this series, specify the vm type via machine property
      'kvm-type'
   b. check the memory backend, if any backend has 'private' property
      set, the vm-type is set to KVM_X86_PROTECTED_VM.

2. whether 'private' property is needed if we choose 1.b as design

   with 1.b, QEMU can decide whether the memory region needs to be
   private (allocates gmem fd for it) or not, on its own.

3. What is KVM_X86_SW_PROTECTED_VM going to look like? What's the
   purpose of it and what's the requirement on it. I think these are
   questions for KVM folks rather than QEMU folks.

Any other idea/open/question is welcomed.

Besides, TDX QEMU implementation is based on this series to provide
private gmem for TD private memory, which can be found at [2].
And it can work with the corresponding KVM [3] to boot TDX guest.

[1] https://lore.kernel.org/all/20230718234512.1690985-1-sea...@google.com/
[2] https://github.com/intel/qemu-tdx/tree/tdx-upstream-wip
[3] https://github.com/intel/tdx/tree/kvm-upstream-2023.07.27-v6.5-rc2-workaround

Chao Peng (4):
  RAMBlock: Support KVM gmemory
  kvm: Enable KVM_SET_USER_MEMORY_REGION2 for memslot
  physmem: Add ram_block_convert_range
  kvm: handle KVM_EXIT_MEMORY_FAULT

Isaku Yamahata (4):
  HostMem: Add private property to indicate to use kvm gmem
  trace/kvm: Add trace for page convertion between shared and private
  pci-host/q35: Move PAM initialization above SMRAM initialization
  q35: Introduce smm_ranges property for q35-pci-host

Xiaoyao Li (11):
  trace/kvm: Split address space and slot id in
    trace_kvm_set_user_memory()
  *** HACK *** linux-headers: Update headers to pull in gmem APIs
  memory: Introduce memory_region_can_be_private()
  i386/pc: Drop pc_machine_kvm_type()
  target/i386: Implement mc->kvm_type() to get VM type
  i386/kvm: Create gmem fd for KVM_X86_SW_PROTECTED_VM
  kvm: Introduce support for memory_attributes
  kvm/memory: Introduce the infrastructure to set the default
    shared/private value
  i386/kvm: Set memory to default private for KVM_X86_SW_PROTECTED_VM
  physmem: replace function name with __func__ in
    ram_block_discard_range()
  i386: Disable SMM mode for X86_SW_PROTECTED_VM

 accel/kvm/kvm-all.c         | 166 +---
 accel/kvm/trace-events      |   4 +-
 backends/hostmem.c          |  18
 hw/i386/pc.c                |   5 --
 hw/i386/pc_q35.c            |   3 +-
 hw/i386/x86.c               |  27 ++
 hw/pci-host/q35.c           |  61 -
 include/exec/cpu-common.h   |   2 +
 include/exec/memory.h       |  24 ++
 include/exec/ramblock.h     |   1 +
 include/hw/i386/pc.h        |   4 +-
 include/hw/i386/x86.h       |   4 +
 include/hw/pci-host/q35.h   |   1 +
 include/sysemu/hostmem.h    |   2 +-
 include/sysemu/kvm.h        |   3 +
 include/sysemu/kvm_int.h    |   2 +
 linux-headers/asm-x86/kvm.h |   3 +
 linux-headers/linux/kvm.h   |  50 +++
 qapi/qom.json               |   4 +
 softmmu/memory.c            |  27 ++
 softmmu/physmem.c           |  97 ++---
 target/i386/kvm/kvm.c       |  84 ++
 target/i386/kvm/kvm_i386.h  |   1 +
 23 files changed, 517 insertions(+), 76 deletions(-)

-- 
2.34.1
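
For design open 1.b in the cover letter, deriving the VM type from the
memory backends could look roughly like the C sketch below. It is a
minimal illustration under assumptions, not code from the series:
MemBackend, VmType and vm_type_from_backends() are hypothetical names,
the private_mem field merely models the proposed 'private' property, and
the enum values are stand-ins rather than the kernel's KVM_X86_* numbers.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the VM types discussed in the cover letter. */
typedef enum { VM_TYPE_DEFAULT, VM_TYPE_SW_PROTECTED } VmType;

/* Hypothetical view of a memory backend; only the fields the sketch needs. */
typedef struct {
    const char *id;
    bool private_mem;   /* models the proposed 'private' backend property */
} MemBackend;

/*
 * Option 1.b: if any configured backend asks for private memory, create
 * the protected VM type; otherwise fall back to the default type.
 */
VmType vm_type_from_backends(const MemBackend *backends, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        if (backends[i].private_mem) {
            return VM_TYPE_SW_PROTECTED;
        }
    }
    return VM_TYPE_DEFAULT;
}
```

The trade-off against option 1.a is visible here: the VM type becomes a
side effect of backend configuration, so a single private=on backend
silently changes the kind of VM that gets created.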