On Fri, Mar 05, 2021 at 10:44:23AM +0000, Ashish Kalra wrote:
On Wed, Mar 03, 2021 at 01:25:40PM -0500, Tobin Feldman-Fitzthum wrote:
Hi Tobin,

On 03/02/21 21:48, Tobin Feldman-Fitzthum wrote:
This is a demonstration of fast migration for encrypted virtual machines
using a Migration Handler that lives in OVMF. This demo uses AMD SEV,
but the ideas may generalize to other confidential computing platforms.
With AMD SEV, guest memory is encrypted and the hypervisor cannot access
or move it. This makes migration tricky. In this demo, we show how the
HV can ask a Migration Handler (MH) in the firmware for an encrypted
page. The MH encrypts the page with a transport key prior to releasing
it to the HV. The target machine also runs an MH that decrypts the page
once it is passed in by the target HV. These patches are not ready for
production, but they are a full end-to-end solution that facilitates a
fast live migration between two SEV VMs.
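
To make this concrete, here is a rough sketch of the source-side flow for
a single page (MhExportPage and MhTransportEncrypt are illustrative names,
not identifiers from the patches):

  EFI_STATUS
  MhExportPage (
    IN  UINT64  Gfn,        // guest frame number the HV asked for
    OUT UINT8   *SharedBuf  // a page already marked shared with the HV
    )
  {
    UINT8  Plaintext[EFI_PAGE_SIZE];

    //
    // Only the guest can read the page through its encrypted mapping.
    //
    CopyMem (Plaintext, (VOID *)(UINTN)(Gfn << EFI_PAGE_SHIFT), EFI_PAGE_SIZE);

    //
    // Encrypt with the transport key shared by the source and target MH,
    // writing the ciphertext where the HV can read it.
    //
    return MhTransportEncrypt (Plaintext, EFI_PAGE_SIZE, SharedBuf);
  }

The target-side MH does the inverse: decrypt with the transport key and
copy the plaintext into the guest's encrypted mapping.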

Corresponding patches for QEMU have been posted by my colleague Dov Murik
on qemu-devel. Our approach needs little kernel support, requiring only
one hypercall that the guest can use to mark a page as encrypted or
shared. This series includes updated patches from Ashish Kalra and
Brijesh Singh that allow OVMF to use this hypercall.
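
For reference, the guest side of that hypercall looks roughly like the
sketch below; the hypercall number and register layout follow the kernel
patches as posted, so treat the exact ABI as illustrative:

  #define KVM_HC_PAGE_ENC_STATUS  12   /* number used in the posted patches */

  /*
   * Tell the HV that npages pages starting at gpa are now encrypted
   * (enc = 1) or shared/plaintext (enc = 0).
   */
  static long
  kvm_page_enc_status_hc (unsigned long gpa, unsigned long npages,
                          unsigned long enc)
  {
    long ret;

    asm volatile ("vmmcall"
                  : "=a" (ret)
                  : "a" (KVM_HC_PAGE_ENC_STATUS),
                    "b" (gpa), "c" (npages), "d" (enc)
                  : "memory");
    return ret;
  }

The HV uses this bookkeeping to decide which pages must go through the MH
and which it can copy directly.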

The MH runs continuously in the guest, waiting for communication from
the HV. The HV starts an additional vCPU for the MH but does not expose
it to the guest OS via ACPI. We use the MpService to start the MH. The
MpService is only available at runtime and processes that are started by
it are usually cleaned up on ExitBootServices. Since we need the MH to
run continuously, we had to make some modifications. Ideally a feature
could be added to the MpService to allow for the starting of
long-running processes. Besides migration, this could support other
background processes that need to operate within the encryption
boundary. For now, we have included a handful of patches that modify the
MpService to allow the MH to keep running after ExitBootServices. These
are temporary.
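
In rough outline (MigrationHandlerMain, MigrationHandlerLoop, and
mMhDoneEvent are our illustrative names), the launch amounts to a
non-blocking StartupThisAP whose AP procedure never returns:

  VOID
  EFIAPI
  MigrationHandlerMain (
    IN OUT VOID  *Arg
    )
  {
    //
    // Never returns: the AP stays parked here servicing HV commands.
    // (The loop itself is sketched further down-thread.)
    //
    MigrationHandlerLoop ();
  }

  EFI_STATUS
  StartMigrationHandler (
    IN EFI_MP_SERVICES_PROTOCOL  *Mp,
    IN UINTN                     ProcessorNumber
    )
  {
    //
    // A non-NULL WaitEvent makes the call non-blocking, and a timeout
    // of 0 means "no timeout"; we simply never wait on the event.
    //
    return Mp->StartupThisAP (Mp, MigrationHandlerMain, ProcessorNumber,
                              mMhDoneEvent, 0, NULL, NULL);
  }
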
I plan to do a lightweight review for this series. (My understanding is
that it's an RFC and not actually being proposed for merging.)

Regarding the MH's availability at runtime -- does that necessarily
require the isolation of an AP? Because in the current approach,
allowing the MP Services to survive into OS runtime (in some form or
another) seems critical, and I don't think it's going to fly.

I agree that the UefiCpuPkg patches have been well separated from the
rest of the series, but I'm somewhat doubtful the "firmware-initiated
background process" idea will be accepted. Have you investigated
exposing a new "runtime service" (a function pointer) via the UEFI
Configuration table, and calling that (perhaps periodically?) from the
guest kernel? It would be a form of polling I guess. Or maybe, poll the
mailbox directly in the kernel, and call the new firmware runtime
service when there's an actual command to process.
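
Concretely, that could look something like the sketch below (the structure,
names, and GUID are made up for illustration):

  typedef struct {
    UINT64      Revision;
    EFI_STATUS  (EFIAPI *ProcessCommand) (VOID);  // poll/handle one command
  } MH_RUNTIME_INTERFACE;

  STATIC MH_RUNTIME_INTERFACE  mMhInterface = {
    1,
    MhProcessPendingCommand   // hypothetical handler inside the firmware
  };

  EFI_STATUS
  PublishMhInterface (
    VOID
    )
  {
    //
    // gMhInterfaceGuid would be a new, project-defined GUID. The guest
    // kernel locates the table by GUID and calls ProcessCommand when the
    // mailbox indicates work (or periodically, as a form of polling).
    // The table and the code behind it would have to live in runtime
    // memory.
    //
    return gBS->InstallConfigurationTable (&gMhInterfaceGuid, &mMhInterface);
  }
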
Continuous runtime availability for the MH is almost certainly the most
controversial part of this proposal, which is why I put it in the cover
letter and why it's good to discuss.
(You do spell out "little kernel support", and I'm not sure if that's a
technical benefit, or a political / community benefit.)
As you allude to, minimal kernel support is really one of the main things
that shape our approach. This is partly a political and practical benefit,
but there are also technical benefits. Having the MH in firmware likely
leads to higher availability: it can be accessed when the OS is unreachable,
perhaps during boot or when the OS is hung. There are also potential
portability advantages, although we do currently require support for one
hypercall. The cost of implementing this hypercall is low.

Generally speaking, our task is to find a home for functionality that was
traditionally provided by the hypervisor, but that now needs to live inside
the trust domain without really being part of the guest. A meta-goal of this
project is to figure out the best way to do this.

I'm quite uncomfortable with an attempt to hide a CPU from the OS via
ACPI. The OS has other ways to learn (for example, a boot loader could
use the MP services itself, stash the information, and hand it to the OS
kernel -- this would minimally allow for detecting an inconsistency in
the OS). What about "all-but-self" IPIs too -- the kernel might think
all the processors it's poking like that were under its control.
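
A sketch of that detection (illustrative, not from any existing boot
loader): record the firmware-visible CPU count before ExitBootServices so
the OS can compare it against ACPI:

  EFI_STATUS
  StashCpuCount (
    OUT UINTN  *NumCpus,
    OUT UINTN  *NumEnabledCpus
    )
  {
    EFI_STATUS                Status;
    EFI_MP_SERVICES_PROTOCOL  *Mp;

    Status = gBS->LocateProtocol (&gEfiMpServiceProtocolGuid, NULL,
                                  (VOID **)&Mp);
    if (EFI_ERROR (Status)) {
      return Status;
    }

    //
    // A count that disagrees with the MADT would expose the hidden AP.
    //
    return Mp->GetNumberOfProcessors (Mp, NumCpus, NumEnabledCpus);
  }
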
This might be the second most controversial piece. Here's a question: if we
could successfully hide the MH vCPU from the OS, would it still make you
uncomfortable? In other words, is the worry that there might be some
inconsistency or more generally that there is something hidden from the OS?
One thing to think about is that the guest owner should generally be aware
that there is a migration handler running. The way I see it, a guest owner
of an SEV VM would need to opt-in to migration and should then expect that
there is an MH running even if they aren't able to see it. Of course we need
to be certain that the MH isn't going to break the OS.

Also, as far as I can tell from patch #7, the AP seems to be
busy-looping (with a CpuPause() added in), for the entire lifetime of
the OS. Do I understand right? If so -- is it a temporary trait as well?
In our approach the MH continuously checks for commands from the hypervisor.
There are potentially ways to optimize this, such as having the hypervisor
de-schedule the MH vCPU while not migrating. You could potentially shut
down the MH on the target after receiving the MH_RESET command (when the
migration finishes), but what if you want to migrate that VM somewhere else?
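
The loop itself is essentially the following shape (the MH_CMD_* values
and mailbox accessors are stand-ins for whatever the patches define):

  VOID
  MigrationHandlerLoop (
    VOID
    )
  {
    for (;;) {
      UINTN  Cmd;

      Cmd = MhReadMailboxCommand ();
      switch (Cmd) {
        case MH_CMD_NONE:
          CpuPause ();         // nothing pending; yield the core briefly
          break;
        case MH_CMD_RESET:
          MhAckReset ();       // migration finished, but keep looping in
          break;               // case this VM is migrated again later
        default:
          MhHandleCommand (Cmd);
          break;
      }
    }
  }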

I think another approach can be considered here: why not implement the MH
vCPU(s) as hot-plugged vCPU(s)? Basically, hot-plug a new vCPU when migration
starts and hot-unplug it when migration completes; then we won't need a vCPU
running (and potentially consuming cycles) forever, busy-looping with
CpuPause().

After internal discussions, we realized that this approach will not work,
as vCPU hotplug will not work for SEV-ES and SEV-SNP: since the VMSA has to
be encrypted as part of the LAUNCH command, we can't create/add a new vCPU
after LAUNCH has completed.

Thanks,
Ashish

Hm, yeah, we talked about hotplug a bit. It was never clear how it would square with OVMF.

-Tobin


