On 22.10.2018 at 13:26, Harry Schmalzbauer wrote:
…
Test-Runs:
Each hypervisor had only the single bench guest running; no other
tasks/guests were active besides the system's standard processes.
Since the time between powering up the guest and finishing logon
differed notably (~5 s vs. ~20 s) from one host to the other, I did a
quick synthetic IO test beforehand.
I'm using IOmeter, since heise.de published a great test pattern called
IOmix – about 18 years ago, I guess. This access pattern has always
reflected system performance very well for human computer usage with
non-calculation-centric applications, and it is still my favourite,
even though throughput and latency have changed by some orders of
magnitude over the last decade. (I had also defined something for "fio"
which mimics IOmix and shows reasonable relative results, but I still
prefer IOmeter for homogeneous IO benchmarking.)
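For illustration only, a fio job along those lines could look like the
following sketch – a mixed random read/write pattern spread over several
block sizes. Every parameter here is a placeholder of mine, not my
actual IOmix port:

```ini
; iomix-like.fio — purely illustrative mixed-IO job, NOT the real IOmix
[global]
ioengine=posixaio
direct=1
time_based=1
runtime=60

[iomix-like]
filename=/dev/da0          ; target device is a placeholder
rw=randrw
rwmixread=70               ; assumed read/write mix
bssplit=512/10:4k/60:64k/20:1m/10   ; assumed block-size distribution
iodepth=4
```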
The results differ by about a factor of 7 :-(
~3800 IOPS & 69 MB/s (CPU guest usage: 42% IOmeter + 12% IRQ) [bhyve]
vs.
~29000 IOPS & 530 MB/s (CPU guest usage: 11% IOmeter + 19% IRQ) [ESXi]
[With the debug kernel and debug malloc, the bhyve numbers are
3000 IOPS & 56 MB/s; virtio-blk instead of ahci,hd: results in
5660 IOPS & 104 MB/s with the non-debug kernel
– much better, but with even higher CPU load, and still a factor of 4
slower.]
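For reference, the two disk backends compared above differ only in the
bhyve device slot. The sketch below is hypothetical – VM name, slot
numbers, memory size and paths are placeholders, not my actual setup:

```sh
# AHCI emulation (the slower case):
bhyve -c 2 -m 4G -H \
  -s 0,hostbridge -s 1,lpc \
  -s 2,ahci,hd:/dev/da0 \
  -s 3,virtio-net,tap0 \
  -l com1,stdio testbench

# virtio-blk instead (the faster, but still limited case):
bhyve -c 2 -m 4G -H \
  -s 0,hostbridge -s 1,lpc \
  -s 2,virtio-blk,/dev/da0 \
  -s 3,virtio-net,tap0 \
  -l com1,stdio testbench
```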
What I don't understand is why the IOmeter process differs that much
in CPU utilization!?! It's the same binary on the same (guest) OS,
with the same OS driver and the same underlying hardware – "just" the
AHCI emulation and the VMM differ...
Unfortunately, the picture for virtio-net vs. vmxnet3 is similarly sad.
Copying a single 5 GB file from a CIFS share to the DB SSD results in
100% guest CPU usage, of which 40% are IRQs, and throughput maxes out
at ~40 MB/s.
When copying the same file from the same source with the same guest on
the same host, but with the host booted into ESXi, there's 20% guest
CPU usage while transferring 111 MB/s – the GbE uplink limit.
These synthetic benchmarks explain very well the perceptible
difference when using a guest under the two hypervisors, but
…
To add an additional and, at least for me, rather surprising result:
Virtualbox provides
'VBoxManage internalcommands createrawvmdk -filename
"testbench_da0.vmdk" -rawdisk /dev/da0'
So I could use exactly the same test setup as for ESXi and bhyve.
FreeBSD VirtualBox (running on the same host installation as bhyve)
performed quite well, although it doesn't survive an IOmix benchmark
run when "testbench_da0.vmdk" (the "raw" SSD R0 array) is hooked up to
the SATA controller.
But connected to the emulated SAS controller (LSI1068), it runs
without problems and yields 9600 IOPS @ 185 MB/s with 1% IOmeter + 7%
IRQ CPU utilization (yes, 1% vs. 42% for the IOmeter load).
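The raw-disk plus SAS-controller setup described above can be sketched
roughly as follows; the VM name "testbench", controller name, and port
numbers are placeholders of mine, not necessarily my exact invocation:

```sh
# Wrap the raw SSD array in a vmdk descriptor:
VBoxManage internalcommands createrawvmdk \
    -filename "testbench_da0.vmdk" -rawdisk /dev/da0

# Add an emulated LSI1068 SAS controller (attaching the raw disk to
# the SATA controller made the IOmix run crash) and attach the disk:
VBoxManage storagectl testbench --name "SAS" --add sas \
    --controller LSILogicSAS
VBoxManage storageattach testbench --storagectl "SAS" \
    --port 0 --device 0 --type hdd --medium testbench_da0.vmdk
```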
Still far from what ESXi provides, but almost double the performance
of virtio-blk with bhyve, and, most importantly, with much less load
(host and guest show exactly the same low values, as opposed to the
very high loads shown on host and guest with bhyve:virtio-blk).
The HDtune random-access benchmark also shows the factor of 2,
consistently across all block sizes.
VirtualBox's virtio-net setup gives ~100 MB/s, with peaks at 111 MB/s,
and ~40% CPU load.
The guest uses the same driver as with bhyve:virtio-net, while the
backend of virtualbox:virtio-net is vboxnetflt, utilizing netgraph and
vboxnetadp.ko, vs. tap(4).
So not only is the IO efficiency remarkably better (lower throughput,
but also much lower CPU utilization), but so is the network
performance.
Even low-bandwidth RDP sessions via GbE LAN suffer from micro-hangs
under bhyve with virtio-net. And 40 MB/s transfers cause 100% CPU load
on bhyve – both runs had exactly the same Windows virtio-net driver in
use (RedHat 141).
Conclusion: VirtualBox vs. ESXi shows an overall efficiency factor of
roughly 0.5, while bhyve vs. ESXi shows roughly 0.25.
I tried to provide a test environment with the shortest possible
hardware paths. At least the benchmarks were 100% reproducible with
the same binaries.
So I'm really interested if
…
Are these performance issues (emulation-related only, I guess?) well
known? I mean, does somebody know what needs to be done in which area
in order to catch up with the other results, so that it's just a
matter of time/resources?
Or are these results surprising, and must extensive analysis be done
before anybody can tell how to fix the IO limitations?
Is the root cause of the problematically low virtio-net throughput
perhaps the same as for the disk IO limits? Both really hurt in my
use case, and the host is not idling in proportion, but even shows
higher load with lower results. So even if the lower
user-experience performance were considered tolerable, the guest/host
density would only be half.
Thanks,
-harry
_______________________________________________
freebsd-virtualization@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-virtualization
To unsubscribe, send any mail to
"freebsd-virtualization-unsubscr...@freebsd.org"