Re: Qemu storage performance drops when smp > 1 (NetBSD 9.3 + Qemu/nvmm + ZVOL)
Hello Brian,

On 18.08.22 07:10, Brian Buhrow wrote:
> hello. That's interesting. Do the cores used for the VMs also get used for the host OS?
> Can you arrange things so that the host OS gets dedicated cores that the VMs can't use?
> If you do that, do you still see a performance drop when you add cores to the VMs?

The host OS has no other purpose than managing the VMs at this time, so apart from the minimal footprint of the host's own processes (postfix, syslog, ...) all resources are available to the Qemu processes.

In the meantime, prompted by the other mail in this thread, I believe my fundamental misunderstanding was that I could overcommit a physical 2-core CPU with a multiple of virtual cores without penalty. I am starting to see that now; I wasn't aware that the context switches are so expensive that they can paralyze the whole I/O system.

Kind regards
Matthias
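A rough sketch of what Brian's suggestion of host-only cores could look like on NetBSD with processor sets; this assumes psrset(8) with its usual Solaris-like syntax and that the Qemu PID files follow the /tmp/<vm>.pid pattern from the command line quoted later in the thread, so treat it as an untested illustration rather than a recipe:

```
# Create a processor set containing CPU 1; CPU 0 stays with the host.
# The set id printed by -c is assumed to be 1 below.
psrset -c 1

# Bind the running Qemu processes to that set so their vCPU threads
# no longer compete with host processes for CPU 0.
psrset -b 1 $(cat /tmp/vm1.pid)
psrset -b 1 $(cat /tmp/vm2.pid)
psrset -b 1 $(cat /tmp/vm3.pid)
```

Even with such a binding, two vCPUs per VM on a single dedicated core would still overcommit that core, so this only separates host and guest load; it does not remove the context-switch cost discussed above.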
Re: Qemu storage performance drops when smp > 1 (NetBSD 9.3 + Qemu/nvmm + ZVOL)
hello. That's interesting. Do the cores used for the VMs also get used for the host OS? Can you arrange things so that the host OS gets dedicated cores that the VMs can't use? If you do that, do you still see a performance drop when you add cores to the VMs?
-thanks
-Brian
Re: Qemu storage performance drops when smp > 1 (NetBSD 9.3 + Qemu/nvmm + ZVOL)
Hello Brian,

On 17.08.22 20:51, Brian Buhrow wrote:
> hello. If you want to use zfs for your storage, which I strongly recommend, lose the
> zvols and use flat files inside zfs itself. I think you'll find your storage performance
> goes up by orders of magnitude. I struggled with this on FreeBSD for over a year before
> I found the myriad of tickets on google regarding the terrible performance of zvols.
> It's a real shame, because zvols are such a tidy way to manage virtual servers. However,
> the performance penalty is just too big to ignore.
> -thanks
> -Brian

Thank you for your suggestion. I have researched the ZVOL vs. QCOW2 discussion. Unfortunately, nothing can be found in connection with NetBSD, only some material on Linux and KVM. What I found attests at least a slight performance advantage to ZVOLs. That people ultimately decide on QCOW2 seems to be mainly because the VM can be paused when the underlying storage fills up, instead of crashing as with a ZVOL. However, that situation can also be prevented with monitoring and regular snapshots.

Nevertheless, I made a practical attempt and rebuilt my described test scenario with QCOW2 files located in one and the same ZFS dataset. The result is almost the same. If I give the Qemu processes only one core via the -smp parameter, I can measure a very good I/O bandwidth on the host. Depending on the number of running VMs it even increases significantly, so the limiting factor here seems to be only the single-thread performance of one CPU core:

- VM 1 with 1 SMP core: ~200 MByte/s
- + VM 2 with 1 SMP core: ~300 MByte/s
- + VM 3 with 1 SMP core: ~500 MByte/s

As with my first test, performance is dramatically worse when I give each VM 2 cores instead of 1:

- VM 1 with 2 SMP cores: ~30...40 MByte/s
- + VM 2 with 2 SMP cores: < 1 MByte/s
- + VM 3 with 2 SMP cores: < 1 MByte/s

Is there any logical explanation for this drastic drop in performance?

Kind regards
Matthias
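For context, the QCOW2 variant of this test can be reproduced roughly as follows; the dataset name, mount point and image size are illustrative assumptions, not details from the original setup:

```
# One ZFS dataset holds all the disk images (name and mountpoint are assumed).
zfs create tank/vmimages

# One QCOW2 image per VM; 20G is an arbitrary example size.
qemu-img create -f qcow2 /tank/vmimages/vm1.qcow2 20G
qemu-img create -f qcow2 /tank/vmimages/vm2.qcow2 20G
qemu-img create -f qcow2 /tank/vmimages/vm3.qcow2 20G

# The Qemu -drive option then becomes, per VM:
#   -drive file=/tank/vmimages/vm1.qcow2,if=none,id=hd0,format=qcow2
```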
Automated report: NetBSD-current/i386 test failure
This is an automatically generated notice of a new failure of the NetBSD test suite.

The newly failing test case is:

    usr.bin/make/t_make:opt_query

The above test failed in each of the last 4 test runs, and passed in at least 26 consecutive runs before that.

The following commits were made between the last successful test and the failed test:

    2022.08.17.19.56.28 rillig src/sys/net/if.c,v 1.512
    2022.08.17.20.03.05 riastradh src/sys/dev/usb/uhci.c,v 1.316
    2022.08.17.20.05.41 rillig src/usr.bin/make/unit-tests/opt-query.exp,v 1.3
    2022.08.17.20.05.41 rillig src/usr.bin/make/unit-tests/opt-query.mk,v 1.5

Logs can be found at:

    http://releng.NetBSD.org/b5reports/i386/commits-2022.08.html#2022.08.17.20.05.41
daily CVS update output
Updating src tree:
P src/bin/sh/histedit.c
P src/distrib/sets/lists/base/mi
P src/doc/3RDPARTY
P src/doc/CHANGES
P src/external/public-domain/tz/tzdata2netbsd
P src/external/public-domain/tz/dist/Makefile
P src/external/public-domain/tz/dist/NEWS
U src/external/public-domain/tz/dist/TZDATA_VERSION
P src/external/public-domain/tz/dist/africa
P src/external/public-domain/tz/dist/antarctica
P src/external/public-domain/tz/dist/asia
P src/external/public-domain/tz/dist/australasia
P src/external/public-domain/tz/dist/backward
P src/external/public-domain/tz/dist/backzone
P src/external/public-domain/tz/dist/calendars
P src/external/public-domain/tz/dist/etcetera
P src/external/public-domain/tz/dist/europe
P src/external/public-domain/tz/dist/leap-seconds.list
P src/external/public-domain/tz/dist/leapseconds
P src/external/public-domain/tz/dist/northamerica
P src/external/public-domain/tz/dist/southamerica
P src/external/public-domain/tz/dist/theory.html
U src/external/public-domain/tz/dist/version
P src/external/public-domain/tz/dist/ziguard.awk
P src/external/public-domain/tz/dist/zishrink.awk
P src/external/public-domain/tz/dist/zone.tab
P src/external/public-domain/tz/dist/zone1970.tab
P src/lib/libc/stdlib/strfmon.c
P src/sbin/ifconfig/af_inetany.c
P src/sys/arch/sun3/conf/GENERIC3X
P src/sys/dev/usb/uhci.c
P src/sys/net/if.c
P src/usr.bin/make/compat.c
P src/usr.bin/make/make.c
P src/usr.bin/make/unit-tests/opt-query.exp
P src/usr.bin/make/unit-tests/opt-query.mk

Updating xsrc tree:

Killing core files:

Updating file list:
-rw-rw-r--  1 srcmastr  netbsd  38500134 Aug 18 03:03 ls-lRA.gz
Automated report: NetBSD-current/i386 test failure
This is an automatically generated notice of a new failure of the NetBSD test suite.

The newly failing test case is:

    lib/libc/locale/t_strfmon:strfmon

The above test failed in each of the last 4 test runs, and passed in at least 26 consecutive runs before that.

The following commits were made between the last successful test and the failed test:

    2022.08.17.09.32.56 christos src/lib/libc/stdlib/strfmon.c,v 1.17

Logs can be found at:

    http://releng.NetBSD.org/b5reports/i386/commits-2022.08.html#2022.08.17.09.32.56
Re: Qemu storage performance drops when smp > 1 (NetBSD 9.3 + Qemu/nvmm + ZVOL)
hello. If you want to use zfs for your storage, which I strongly recommend, lose the zvols and use flat files inside zfs itself. I think you'll find your storage performance goes up by orders of magnitude. I struggled with this on FreeBSD for over a year before I found the myriad of tickets on google regarding the terrible performance of zvols. It's a real shame, because zvols are such a tidy way to manage virtual servers. However, the performance penalty is just too big to ignore.
-thanks
-Brian

On Aug 17, 5:43pm, Matthias Petermann wrote:
} Subject: Qemu storage performance drops when smp > 1 (NetBSD 9.3 + Qemu/nv
}
} Hello,
}
} I'm trying to find the cause of a performance problem and don't really
} know how to proceed.
}
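If one wanted to follow this advice for an existing guest, a minimal sketch of moving a ZVOL-backed disk to a flat file might look like the following; the destination dataset and file names are made up for illustration:

```
# Copy the raw guest disk out of the ZVOL into a flat file on a ZFS
# dataset (source and destination paths are examples only).
dd if=/dev/zvol/rdsk/tank/vol/test1 of=/tank/vmimages/vm1.img bs=4m

# Then point Qemu at the file instead of the ZVOL:
#   -drive file=/tank/vmimages/vm1.img,if=none,id=hd0,format=raw
```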
Qemu storage performance drops when smp > 1 (NetBSD 9.3 + Qemu/nvmm + ZVOL)
Hello,

I'm trying to find the cause of a performance problem and don't really know how to proceed.

## Test Setup

Given a host (Intel NUC7CJYHN, 2 physical cores, 8 GB RAM, 500 GB SSD) with a fresh NetBSD/amd64 9.3_STABLE. The SSD contains the ESP, an FFS root partition, swap, and a large ZPOOL. The host is to be used as a virtualization host for VMs. For this, VMs are run with Qemu 7.0.0 (from pkgsrc 2022Q2) and nvmm. The VMs also run NetBSD 9.3. Storage is provided by ZVOLs through virtio.

Before explaining the issue I face, here are some rounded numbers showing the performance of the host OS (sampled with iostat):

1) Starting one writer

```
# dd if=/dev/zero of=/dev/zvol/rdsk/tank/vol/test1 bs=4m &
```

---> ~ 200 MByte/s

2) Adding another writer

```
# dd if=/dev/zero of=/dev/zvol/rdsk/tank/vol/test2 bs=4m &
```

---> ~ 300 MByte/s

3) Adding another writer

```
# dd if=/dev/zero of=/dev/zvol/rdsk/tank/vol/test3 bs=4m &
```

---> ~ 500 MByte/s

From my understanding, this represents the write performance I can expect with my hardware when I write raw data in parallel to discrete ZVOLs located on the same physical storage (SSD).

This picture changes completely when Qemu comes into play. I installed a basic NetBSD 9.3 on each of the ZVOLs (standard layout with FFSv2 + WAPBL) and operate them with this Qemu command:

```
qemu-system-x86_64 -machine pc-q35-7.0 -smp $VM_CORES -m $VM_RAM -accel nvmm \
    -k de -boot cd \
    -machine graphics=off -display none -vga none \
    -object rng-random,filename=/dev/urandom,id=viornd0 \
    -device virtio-rng-pci,rng=viornd0 \
    -object iothread,id=t0 \
    -device virtio-blk-pci,drive=hd0,iothread=t0 \
    -device virtio-net-pci,netdev=vioif0,mac=$VM_MAC \
    -chardev socket,id=monitor,path=$MONITOR_SOCKET,server=on,wait=off \
    -monitor chardev:monitor \
    -chardev socket,id=serial0,path=$CONSOLE_SOCKET,server=on,wait=off \
    -serial chardev:serial0 \
    -pidfile /tmp/$VM_ID.pid \
    -cdrom $VM_CDROM_IMAGE \
    -drive file=$VM_HDD_VOLUME,if=none,id=hd0,format=raw \
    -netdev tap,id=vioif0,ifname=$VM_NETIF,script=no,downscript=no \
    -device virtio-balloon-pci,id=balloon0
```

The command already includes the following optimizations:

- use the virtio driver instead of an emulated SCSI device
- use a separate I/O thread for block device access

## Test Case 1

The environment is set for this test:

- VM_CORES: 1
- VM_RAM: 256
- VM_HDD_VOLUME: each VM has its dedicated ZVOL (e.g. /dev/zvol/rdsk/tank/vol/test3)

My test case is the following:

0) Launch iostat -c on the host and monitor continuously
1) Launch 3 instances of the VM configuration (vm1, vm2, vm3)
2) SSH into vm1
3) Issue dd if=/dev/zero of=/root/test.img bs=4m
   Observation: iostat on the host shows ~140 MByte/s
4) SSH into vm2
5) Issue dd if=/dev/zero of=/root/test.img bs=4m
   Observation: iostat on the host shows ~180 MByte/s
6) SSH into vm3
7) Issue dd if=/dev/zero of=/root/test.img bs=4m
   Observation: iostat on the host shows ~220 MByte/s

Intermediate summary:

- pretty good results :-)
- with each additional writer, the bandwidth utilization rises

## Test Case 2

The environment is modified for this test:

- VM_CORES: 2

The same test case yields completely different results:

0) Launch iostat -c on the host and monitor continuously
1) Launch 3 instances of the VM configuration (vm1, vm2, vm3)
2) SSH into vm1
3) Issue dd if=/dev/zero of=/root/test.img bs=4m
   Observation: iostat on the host shows ~30 MByte/s
4) SSH into vm2
5) Issue dd if=/dev/zero of=/root/test.img bs=4m
   Observation: iostat on the host shows ~3 MByte/s
6) SSH into vm3
7) Issue dd if=/dev/zero of=/root/test.img bs=4m
   Observation: iostat on the host shows < 1 MByte/s

Intermediate summary:

- unexpectedly bad performance
- even with one writer, performance is far below the values seen with only one core per VM
- bandwidth drops dramatically with each additional writer

## Summary and Questions

- adding more cores to Qemu seems to considerably impact disk I/O performance
- Is this expected / known behavior?
- What could I do to mitigate this or help find the root cause?

By the way - except for this hopefully solvable problem, I am surprised how well the combination of NetBSD, ZVOL, Qemu and NVMM works.

Kind regards
Matthias
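For completeness, the per-VM ZVOLs used above could have been created along these lines; the 20 GB size is an assumption, the mail does not state the actual size:

```
# One ZVOL per VM in the existing pool "tank" (size is an example only).
zfs create -V 20g tank/vol/test1
zfs create -V 20g tank/vol/test2
zfs create -V 20g tank/vol/test3

# Each VM is then started with its raw device node, e.g.:
#   VM_HDD_VOLUME=/dev/zvol/rdsk/tank/vol/test1
```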
Automated report: NetBSD-current/i386 build success
The NetBSD-current/i386 build is working again.

The following commits were made between the last failed build and the successful build:

    2022.08.17.14.03.05 kre src/distrib/sets/lists/base/mi,v 1.1311

Logs can be found at:

    http://releng.NetBSD.org/b5reports/i386/commits-2022.08.html#2022.08.17.14.03.05
Automated report: NetBSD-current/i386 build failure
This is an automatically generated notice of a NetBSD-current/i386 build failure.

The failure occurred on babylon5.netbsd.org, a NetBSD/amd64 host, using sources from CVS date 2022.08.17.12.35.10.

An extract from the build.sh output follows:

    Files in DESTDIR but missing from flist.
    File is obsolete or flist is out of date ?
    --
    ./usr/share/zoneinfo/Europe/Kyiv
    =  end of 1 extra files  ===

    *** Failed target: checkflist
    *** Failed commands:
        ${SETSCMD} ${.CURDIR}/checkflist ${MAKEFLIST_FLAGS} ${CHECKFLIST_FLAGS} ${METALOG.unpriv}
        => cd /tmp/build/2022.08.17.12.35.10-i386/src/distrib/sets && DESTDIR=/tmp/build/2022.08.17.12.35.10-i386/destdir MACHINE=i386 MACHINE_ARCH=i386 AWK=/tmp/build/2022.08.17.12.35.10-i386/tools/bin/nbawk CKSUM=/tmp/build/2022.08.17.12.35.10-i386/tools/bin/nbcksum DB=/tmp/build/2022.08.17.12.35.10-i386/tools/bin/nbdb EGREP=/tmp/build/2022.08.17.12.35.10-i386/tools/bin/nbgrep\ -E HOST_SH=/bin/sh MAKE=/tmp/build/2022.08.17.12.35.10-i386/tools/bin/nbmake MKTEMP=/tmp/build/2022.08.17.12.35.10-i386/tools/bin/nbmktemp MTREE=/tmp/build/2022.08.17.12.35.10-i386/tools/bin/nbmtree PAX=/tmp/build/2022.08.17.12.35.10-i386/tools/bin/nbpax COMPRESS_PROGRAM=gzip GZIP=-n XZ_OPT=-9 TAR_SUFF=tgz PKG_CREATE=/tmp/build/2022.08.17.12.35.10-i386/tools/bin/nbpkg_create SED=/tmp/build/2022.08.17.12.35.10-i386/tools/bin/nbsed TSORT=/tmp/build/2022.08.17.12.35.10-i386/tools/bin/nbtsort\ -q /bin/sh /tmp/build/2022.08.17.12.35.10-i386/src/distrib/sets/checkflist -L base -M /tmp/build/2022.08.17.12.35.10-i386/destdir/METALOG.sanitised

    *** [checkflist] Error code 1
    nbmake[2]: stopped in /tmp/build/2022.08.17.12.35.10-i386/src/distrib/sets
    1 error
    nbmake[2]: stopped in /tmp/build/2022.08.17.12.35.10-i386/src/distrib/sets
    nbmake[1]: stopped in /tmp/build/2022.08.17.12.35.10-i386/src
    nbmake: stopped in /tmp/build/2022.08.17.12.35.10-i386/src
    ERROR: Failed to make release

The following commits were made between the last successful build and the failed build:

    2022.08.17.12.17.43 kre src/external/public-domain/tz/dist/Makefile,v 1.1.1.32
    2022.08.17.12.17.43 kre src/external/public-domain/tz/dist/NEWS,v 1.1.1.36
    2022.08.17.12.17.43 kre src/external/public-domain/tz/dist/calendars,v 1.1.1.2
    2022.08.17.12.17.44 kre src/external/public-domain/tz/dist/africa,v 1.1.1.27
    2022.08.17.12.17.44 kre src/external/public-domain/tz/dist/theory.html,v 1.1.1.14
    2022.08.17.12.17.45 kre src/external/public-domain/tz/dist/antarctica,v 1.1.1.15
    2022.08.17.12.17.52 kre src/external/public-domain/tz/dist/europe,v 1.1.1.32
    2022.08.17.12.17.54 kre src/external/public-domain/tz/dist/northamerica,v 1.1.1.29
    2022.08.17.12.17.54 kre src/external/public-domain/tz/dist/southamerica,v 1.1.1.19
    2022.08.17.12.17.55 kre src/external/public-domain/tz/dist/backzone,v 1.1.1.22
    2022.08.17.12.17.55 kre src/external/public-domain/tz/dist/etcetera,v 1.1.1.6
    2022.08.17.12.17.55 kre src/external/public-domain/tz/dist/zone1970.tab,v 1.1.1.22
    2022.08.17.12.17.56 kre src/external/public-domain/tz/dist/ziguard.awk,v 1.1.1.8
    2022.08.17.12.17.56 kre src/external/public-domain/tz/dist/zishrink.awk,v 1.1.1.8
    2022.08.17.12.17.56 kre src/external/public-domain/tz/dist/zone.tab,v 1.1.1.21
    2022.08.17.12.19.41 kre src/external/public-domain/tz/dist/TZDATA_VERSION,v 1.28
    2022.08.17.12.19.41 kre src/external/public-domain/tz/dist/asia,v 1.5
    2022.08.17.12.19.41 kre src/external/public-domain/tz/dist/australasia,v 1.5
    2022.08.17.12.19.41 kre src/external/public-domain/tz/dist/backward,v 1.4
    2022.08.17.12.19.41 kre src/external/public-domain/tz/dist/leap-seconds.list,v 1.4
    2022.08.17.12.19.41 kre src/external/public-domain/tz/dist/leapseconds,v 1.4
    2022.08.17.12.19.41 kre src/external/public-domain/tz/dist/version,v 1.5
    2022.08.17.12.24.42 kre src/doc/3RDPARTY,v 1.1869
    2022.08.17.12.24.42 kre src/doc/CHANGES,v 1.2899
    2022.08.17.12.25.47 kre src/doc/CHANGES,v 1.2900
    2022.08.17.12.35.10 nat src/sbin/ifconfig/af_inetany.c,v 1.22

Logs can be found at:

    http://releng.NetBSD.org/b5reports/i386/commits-2022.08.html#2022.08.17.12.35.10
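The extract shows checkflist rejecting the new ./usr/share/zoneinfo/Europe/Kyiv file because it is present in DESTDIR but not listed in the set lists. Such failures are usually resolved by adding the path to the relevant list; a sketch of the kind of entry involved, with the package tag being an assumption rather than a quote from the actual commit:

```
# src/distrib/sets/lists/base/mi (illustrative entry, not the actual diff)
./usr/share/zoneinfo/Europe/Kyiv			base-sys-share
```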