On Wed, Feb 24, 2021, Ashish Kalra wrote:
> # Samples: 19K of event 'kvm:kvm_hypercall'
> # Event count (approx.): 19573
> #
> # Overhead  Command          Shared Object     Symbol
> # ........  ...............  ................  .........................
> #
>    100.00%  qemu-system-x86  [kernel.vmlinux]  [k] kvm_emulate_hypercall
> 
> Out of these 19573 hypercalls, 19479 are page encryption status hypercalls,
> so almost all hypercalls here are page encryption status hypercalls.

Oof.

> The above data indicates that there will be ~2% more heavyweight VMEXITs
> during SEV guest boot if we pass page encryption status hypercalls
> through to host userspace.
> 
> But then Brijesh pointed out that OVMF currently does a lot of VMEXITs
> because it doesn't use the DMA pool to minimize the C-bit toggles. In other
> words, the OVMF bounce buffer does a page state change on every DMA allocate
> and free.
> 
> So here is the performance analysis after the kernel and initrd have been
> loaded into memory using grub, with perf attached just before booting the
> kernel:
> 
> # Samples: 1M of event 'kvm:kvm_userspace_exit'
> # Event count (approx.): 1081235
> #
> # Overhead  Trace output
> # ........  ........................
> #
>     99.77%  reason KVM_EXIT_IO (2)
>      0.23%  reason KVM_EXIT_MMIO (6)
> 
> # Samples: 1K of event 'kvm:kvm_hypercall'
> # Event count (approx.): 1279
> #
> 
> So as the above data indicates, Linux is only making ~1K hypercalls,
> compared to ~18K hypercalls made by OVMF in the above use case.
> 
> Does the above add a prerequisite that OVMF needs to be optimized
> before hypercall pass-through can be done?

Disclaimer: my math could be totally wrong.

I doubt it's a hard requirement.  Assuming a conservative roundtrip time of 50k
cycles, those 18K hypercalls will add well under 1/2 a second of boot time.
If userspace can push the roundtrip time down to 10k cycles, the overhead is
more like 50 milliseconds.
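
A rough sketch of the arithmetic behind those estimates.  The thread does not
state a clock rate, so the 3 GHz figure below is purely an assumption, and the
helper name is illustrative; a different clock shifts the results
proportionally.

```python
# Back-of-the-envelope estimate of boot-time overhead from hypercall exits
# to userspace.
# ASSUMPTION: the CPU clock rate is not given in the thread; 3 GHz is a guess.

def overhead_seconds(hypercalls: int, cycles_per_roundtrip: int,
                     clock_hz: float) -> float:
    """Total time spent in roundtrips, in seconds."""
    return hypercalls * cycles_per_roundtrip / clock_hz

HYPERCALLS = 18_000   # ~18K hypercalls made by OVMF, from the numbers above
CLOCK_HZ = 3.0e9      # assumed clock rate, not from the thread

for cycles in (50_000, 10_000):
    ms = overhead_seconds(HYPERCALLS, cycles, CLOCK_HZ) * 1e3
    print(f"{cycles:>6} cycles/roundtrip -> {ms:.0f} ms")
```

With the assumed clock, 50k cycles per roundtrip lands around 300 ms and 10k
cycles around 60 ms, i.e. the same ballpark as the figures above.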

That being said, this does seem like a good OVMF cleanup, irrespective of this
new hypercall.  I assume it's not cheap to convert a page between encrypted and
decrypted.

Thanks much for getting the numbers!
