On Thu, Feb 18, 2021 at 12:32:47PM -0600, Kalra, Ashish wrote:
> [AMD Public Use]
> 
> 
> -----Original Message-----
> From: Sean Christopherson <sea...@google.com> 
> Sent: Tuesday, February 16, 2021 7:03 PM
> To: Kalra, Ashish <ashish.ka...@amd.com>
> Cc: pbonz...@redhat.com; t...@linutronix.de; mi...@redhat.com; 
> h...@zytor.com; rkrc...@redhat.com; j...@8bytes.org; b...@suse.de; Lendacky, 
> Thomas <thomas.lenda...@amd.com>; x...@kernel.org; k...@vger.kernel.org; 
> linux-kernel@vger.kernel.org; srutherf...@google.com; 
> venu.busire...@oracle.com; Singh, Brijesh <brijesh.si...@amd.com>
> Subject: Re: [PATCH v10 10/16] KVM: x86: Introduce KVM_GET_SHARED_PAGES_LIST 
> ioctl
> 
> On Thu, Feb 04, 2021, Ashish Kalra wrote:
> > From: Brijesh Singh <brijesh.si...@amd.com>
> > 
> > The ioctl is used to retrieve a guest's shared pages list.
> 
> >What's the performance hit to boot time if KVM_HC_PAGE_ENC_STATUS is passed 
> >through to userspace?  That way, userspace could manage the set of pages >in 
> >whatever data structure they want, and these get/set ioctls go away.
> 
I would be more concerned about the performance hit during guest DMA I/O if the 
page encryption status hypercalls are passed through to user-space. 
A lot of guest DMA I/O dynamically sets up pages for encryption and then 
flips them back at DMA completion, so guest I/O will surely take a performance 
hit with this pass-through approach.
> 

Here are some rough performance numbers comparing the number of heavy-weight 
VMEXITs to the number of hypercalls during a SEV guest boot (launch of an 
Ubuntu 18.04 guest):

# ./perf record -e kvm:kvm_userspace_exit -e kvm:kvm_hypercall -a 
./qemu-system-x86_64 -enable-kvm -cpu host -machine q35 -smp 16,maxcpus=64 -m 
512M -drive 
if=pflash,format=raw,unit=0,file=/home/ashish/sev-migration/qemu-5.1.50/OVMF_CODE.fd,readonly
 -drive if=pflash,format=raw,unit=1,file=OVMF_VARS.fd -drive 
file=../ubuntu-18.04.qcow2,if=none,id=disk0,format=qcow2 -device 
virtio-scsi-pci,id=scsi,disable-legacy=on,iommu_platform=true -device 
scsi-hd,drive=disk0 -object 
sev-guest,id=sev0,cbitpos=47,reduced-phys-bits=1,policy=0x0 -machine 
memory-encryption=sev0 -trace events=/tmp/events -nographic -monitor pty 
-monitor unix:monitor-source,server,nowait -qmp 
unix:/tmp/qmp-sock,server,nowait -device 
virtio-rng-pci,disable-legacy=on,iommu_platform=true

...
...

root@diesel2540:/home/ashish/sev-migration/qemu-5.1.50# ./perf report
# To display the perf.data header info, please use --header/--header-only 
options.
#
#
# Total Lost Samples: 0
#
# Samples: 981K of event 'kvm:kvm_userspace_exit'
# Event count (approx.): 981021
#
# Overhead  Command          Shared Object     Symbol
# ........  ...............  ................  ..................
#
   100.00%  qemu-system-x86  [kernel.vmlinux]  [k] kvm_vcpu_ioctl


# Samples: 19K of event 'kvm:kvm_hypercall'
# Event count (approx.): 19573
#
# Overhead  Command          Shared Object     Symbol
# ........  ...............  ................  .........................
#
   100.00%  qemu-system-x86  [kernel.vmlinux]  [k] kvm_emulate_hypercall

Out of these 19573 hypercalls, the number of page encryption status hypercalls 
is 19479, so almost all hypercalls here are page encryption status hypercalls.

The above data indicates that there will be ~2% more heavyweight VMEXITs
during SEV guest boot if we pass the page encryption status hypercalls
through to host userspace.
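For reference, the ~2% figure follows directly from the event counts in the 
perf report above; a quick back-of-the-envelope check (using only those 
counts, not any new measurements):

```python
# Event counts taken from the perf report above.
userspace_exits = 981021   # kvm:kvm_userspace_exit event count
page_enc_hcalls = 19479    # page encryption status hypercalls

# If every page encryption status hypercall were passed through to host
# userspace, each one would become an additional heavyweight VMEXIT,
# growing the userspace-exit count by this fraction:
extra = page_enc_hcalls / userspace_exits
print(f"{extra:.2%}")  # -> 1.99%, i.e. roughly 2% more heavyweight VMEXITs
```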

But then Brijesh pointed out to me that OVMF currently does a lot of 
VMEXITs because it does not use a DMA pool to minimize C-bit toggles; 
in other words, the OVMF bounce buffer does a page state change on every 
DMA allocate and free.

So here is the performance analysis after the kernel and initrd have been
loaded into memory via grub, with perf attached just before the kernel
boots:

# Samples: 1M of event 'kvm:kvm_userspace_exit'
# Event count (approx.): 1081235
#
# Overhead  Trace output
# ........  ........................
#
    99.77%  reason KVM_EXIT_IO (2)
     0.23%  reason KVM_EXIT_MMIO (6)

# Samples: 1K of event 'kvm:kvm_hypercall'
# Event count (approx.): 1279
#

So, as the above data indicates, Linux itself makes only ~1K hypercalls,
compared to the ~18K hypercalls made by OVMF in the earlier trace.
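Assuming the second trace's ~1.3K hypercalls are representative of the Linux 
kernel's own contribution to the full-boot trace (a rough estimate derived 
from the two traces above, not a separate measurement, and the 1279 count 
includes a few non-page-encryption hypercalls):

```python
# Counts from the two perf traces above.
total_boot_hcalls = 19479  # full boot: OVMF + grub + Linux
linux_only_hcalls = 1279   # trace started just before kernel boot

# Rough estimate of the firmware-attributable (OVMF) share.
ovmf_hcalls = total_boot_hcalls - linux_only_hcalls
print(ovmf_hcalls)                               # -> 18200, i.e. ~18K
print(f"{ovmf_hcalls / total_boot_hcalls:.0%}")  # -> 93% of boot hypercalls
```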

Does the above add a prerequisite that OVMF needs to be optimized before
hypercall pass-through can be done?

Thanks,
Ashish
