Hi Tobin,

On 03/02/21 21:48, Tobin Feldman-Fitzthum wrote:
This is a demonstration of fast migration for encrypted virtual machines
using a Migration Handler that lives in OVMF. This demo uses AMD SEV,
but the ideas may generalize to other confidential computing platforms.
With AMD SEV, guest memory is encrypted and the hypervisor cannot access
or move it. This makes migration tricky. In this demo, we show how the
HV can ask a Migration Handler (MH) in the firmware for an encrypted
page. The MH encrypts the page with a transport key prior to releasing
it to the HV. The target machine also runs an MH that decrypts the page
once it is passed in by the target HV. These patches are not ready for
production, but the are a full end-to-end solution that facilitates a
fast live migration between two SEV VMs.

Corresponding patches for QEMU have been posted my colleague Dov Murik
on qemu-devel. Our approach needs little kernel support, requiring only
one hypercall that the guest can use to mark a page as encrypted or
shared. This series includes updated patches from Ashish Kalra and
Brijesh Singh that allow OVMF to use this hypercall.

The MH runs continuously in the guest, waiting for communication from
the HV. The HV starts an additional vCPU for the MH but does not expose
it to the guest OS via ACPI. We use the MpService to start the MH. The
MpService is only available at runtime and processes that are started by
it are usually cleaned up on ExitBootServices. Since we need the MH to
run continuously, we had to make some modifications. Ideally a feature
could be added to the MpService to allow for the starting of
long-running processes. Besides migration, this could support other
background processes that need to operate within the encryption
boundary. For now, we have included a handful of patches that modify the
MpService to allow the MH to keep running after ExitBootServices. These
are temporary.
I plan to do a lightweight review for this series. (My understanding is
that it's an RFC and not actually being proposed for merging.)

Regarding the MH's availability at runtime -- does that necessarily
require the isolation of an AP? Because in the current approach,
allowing the MP Services to survive into OS runtime (in some form or
another) seems critical, and I don't think it's going to fly.

I agree that the UefiCpuPkg patches have been well separated from the
rest of the series, but I'm somewhat doubtful the "firmware-initiated
background process" idea will be accepted. Have you investigated
exposing a new "runtime service" (a function pointer) via the UEFI
Configuration table, and calling that (perhaps periodically?) from the
guest kernel? It would be a form of polling I guess. Or maybe, poll the
mailbox directly in the kernel, and call the new firmware runtime
service when there's an actual command to process.
Continuous runtime availability for the MH is almost certainly the most controversial part of this proposal, which is why I put it in the cover letter and why it's good to discuss.
(You do spell out "little kernel support", and I'm not sure if that's a
technical benefit, or a political / community benefit.)

As you allude to, minimal kernel support is really one of the main things that shapes our approach. This is partly a political and practical benefit, but there are also technical benefits. Having the MH in firmware likely leads to higher availability. It can be accessed when the OS is unreachable, perhaps during boot or when the OS is hung. There are also potential portability advantages although we do currently require support for one hypercall. The cost of implementing this hypercall is low.

Generally speaking, our task is to find a home for functionality that was traditionally provided by the hypervisor, but that needs to be inside the trust domain, but that isn't really part of a guest. A meta-goal of this project is to figure out the best way to do this.


I'm quite uncomfortable with an attempt to hide a CPU from the OS via
ACPI. The OS has other ways to learn (for example, a boot loader could
use the MP services itself, stash the information, and hand it to the OS
kernel -- this would minimally allow for detecting an inconsistency in
the OS). What about "all-but-self" IPIs too -- the kernel might think
all the processors it's poking like that were under its control.

This might be the second most controversial piece. Here's a question: if we could successfully hide the MH vCPU from the OS, would it still make you uncomfortable? In other words, is the worry that there might be some inconsistency or more generally that there is something hidden from the OS? One thing to think about is that the guest owner should generally be aware that there is a migration handler running. The way I see it, a guest owner of an SEV VM would need to opt-in to migration and should then expect that there is an MH running even if they aren't able to see it. Of course we need to be certain that the MH isn't going to break the OS.

Also, as far as I can tell from patch #7, the AP seems to be
busy-looping (with a CpuPause() added in), for the entire lifetime of
the OS. Do I understand right? If so -- is it a temporary trait as well?

In our approach the MH continuously checks for commands from the hypervisor. There are potentially ways to optimize this, such as having the hypervisor de-schedule the MH vCPU while not migrating. You could potentially shut down down the MH on the target after receiving the MH_RESET command (when the migration finishes), but what if you want to migrate that VM somewhere else?


Sorry if my questions are "premature", in the sense that I could get my
own answers as well if I actually read the patches in detail -- however,
I wouldn't like to do that at once, because then I'll be distracted by
many style issues and other "trivial" stuff. Examples for the latter:

Not premature at all. I think you hit the nail on the head with everything you raised.

-Tobin


- patch#1 calls SetMemoryEncDecHypercall3(), but there is no such
function in edk2, so minimally it's a patch ordering bug in the series,

- in patch#1, there's minimally one whitespace error (no whitespace
right after "EFI_SIZE_TO_PAGES")

- in patch#1, the alphabetical ordering in the [LibraryClasses] section,
and in the matching #include directives, gets broken,

- I'd prefer if the "SevLiveMigrationEnabled" UEFI variable were set in
ConfidentialMigrationDxe, rather than PlatformDxe (patch #3), or at
least another AMD SEV related DXE driver (OvmfPkg/AmdSevDxe etc).

- any particular reasonf or making the UEFI variable non-volatile? I
don't think it should survive any particular boot of the guest.

- Why do we need a variable in the first place?

etc etc

Thanks!
Laszlo




Ashish Kalra (2):
   OvmfPkg/PlatformPei: Mark SEC GHCB page in the page encrpytion bitmap.
   OvmfPkg/PlatformDxe: Add support for SEV live migration.

Brijesh Singh (1):
   OvmfPkg/BaseMemEncryptLib: Support to issue unencrypted hypercall

Dov Murik (1):
   OvmfPkg/AmdSev: Build page table for migration handler

Tobin Feldman-Fitzthum (10):
   OvmfPkg/AmdSev: Base for Confidential Migration Handler
   OvmfPkg/PlatfomPei: Set Confidential Migration PCD
   OvmfPkg/AmdSev: Setup Migration Handler Mailbox
   OvmfPkg/AmdSev: MH support for mailbox protocol
   UefiCpuPkg/MpInitLib: temp removal of MpLib cleanup
   UefiCpuPkg/MpInitLib: Allocate MP buffer as runtime memory
   UefiCpuPkg/CpuExceptionHandlerLib: Exception handling as runtime
     memory
   OvmfPkg/AmdSev: Don't overwrite mailbox or pagetables
   OvmfPkg/AmdSev: Don't overwrite MH stack
   OvmfPkg/AmdSev: MH page encryption POC

  OvmfPkg/OvmfPkg.dec                           |  11 +
  OvmfPkg/AmdSev/AmdSevX64.dsc                  |   2 +
  OvmfPkg/AmdSev/AmdSevX64.fdf                  |  13 +-
  .../ConfidentialMigrationDxe.inf              |  45 +++
  .../ConfidentialMigrationPei.inf              |  35 ++
  .../DxeMemEncryptSevLib.inf                   |   1 +
  .../PeiMemEncryptSevLib.inf                   |   1 +
  OvmfPkg/PlatformDxe/Platform.inf              |   2 +
  OvmfPkg/PlatformPei/PlatformPei.inf           |   2 +
  UefiCpuPkg/Library/MpInitLib/DxeMpInitLib.inf |   2 +
  UefiCpuPkg/Library/MpInitLib/PeiMpInitLib.inf |   2 +
  OvmfPkg/AmdSev/ConfidentialMigration/MpLib.h  | 235 +++++++++++++
  .../ConfidentialMigration/VirtualMemory.h     | 177 ++++++++++
  OvmfPkg/Include/Guid/MemEncryptLib.h          |  16 +
  OvmfPkg/PlatformDxe/PlatformConfig.h          |   5 +
  .../ConfidentialMigrationDxe.c                | 325 ++++++++++++++++++
  .../ConfidentialMigrationPei.c                |  25 ++
  .../X64/PeiDxeVirtualMemory.c                 |  18 +
  OvmfPkg/PlatformDxe/AmdSev.c                  |  99 ++++++
  OvmfPkg/PlatformDxe/Platform.c                |   6 +
  OvmfPkg/PlatformPei/AmdSev.c                  |  10 +
  OvmfPkg/PlatformPei/Platform.c                |  10 +
  .../CpuExceptionHandlerLib/DxeException.c     |   8 +-
  UefiCpuPkg/Library/MpInitLib/DxeMpLib.c       |  21 +-
  UefiCpuPkg/Library/MpInitLib/MpLib.c          |   7 +-
  25 files changed, 1061 insertions(+), 17 deletions(-)
  create mode 100644 
OvmfPkg/AmdSev/ConfidentialMigration/ConfidentialMigrationDxe.inf
  create mode 100644 
OvmfPkg/AmdSev/ConfidentialMigration/ConfidentialMigrationPei.inf
  create mode 100644 OvmfPkg/AmdSev/ConfidentialMigration/MpLib.h
  create mode 100644 OvmfPkg/AmdSev/ConfidentialMigration/VirtualMemory.h
  create mode 100644 OvmfPkg/Include/Guid/MemEncryptLib.h
  create mode 100644 
OvmfPkg/AmdSev/ConfidentialMigration/ConfidentialMigrationDxe.c
  create mode 100644 
OvmfPkg/AmdSev/ConfidentialMigration/ConfidentialMigrationPei.c
  create mode 100644 OvmfPkg/PlatformDxe/AmdSev.c



-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#72399): https://edk2.groups.io/g/devel/message/72399
Mute This Topic: https://groups.io/mt/81036365/21656
Group Owner: devel+ow...@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-


Reply via email to