Public bug reported: We're running Kata Containers using QEMU and with v5.1.0rc0 and rc1 have noticed a problem at startup where QEMu appears to hang. We are not seeing this problem on our bare metal nodes and only on a VSI that supports nested virtualization.
We unfortunately see nothing at all in the QEMU logs to help understand the problem and a hung process is just a guess at this point. Using git bisect we first see the problem with... --- f19bcdfedd53ee93412d535a842a89fa27cae7f2 is the first bad commit commit f19bcdfedd53ee93412d535a842a89fa27cae7f2 Author: Jason Wang <jasow...@redhat.com> Date: Wed Jul 1 22:55:28 2020 +0800 virtio-pci: implement queue_enabled method With version 1, we can detect whether a queue is enabled via queue_enabled. Signed-off-by: Jason Wang <jasow...@redhat.com> Signed-off-by: Cindy Lu <l...@redhat.com> Message-Id: <20200701145538.22333-5-l...@redhat.com> Reviewed-by: Michael S. Tsirkin <m...@redhat.com> Signed-off-by: Michael S. Tsirkin <m...@redhat.com> Acked-by: Jason Wang <jasow...@redhat.com> hw/virtio/virtio-pci.c | 13 +++++++++++++ 1 file changed, 13 insertions(+) --- Reverting this commit seems to work and prevent the hanging. --- Here's how kata ends up launching qemu in our environment -- /opt/kata/bin/qemu-system-x86_64 -name sandbox-849df14c6065931adedb9d18bc9260a6d896f1814a8c5cfa239865772f1b7a5f -uuid 6bec458e-1da7-4847-a5d7-5ab31d4d2465 -machine pc,accel=kvm,kernel_irqchip -cpu host,pmu=off -qmp unix:/run/vc/vm/849df14c6065931adedb9d18bc9260a6d896f1814a8c5cfa239865772f1b7a5f/qmp.sock,server,nowait -m 4096M,slots=10,maxmem=30978M -device pci-bridge,bus=pci.0,id=pci-bridge-0,chassis_nr=1,shpc=on,addr=2,romfile= -device virtio-serial-pci,disable-modern=true,id=serial0,romfile= -device virtconsole,chardev=charconsole0,id=console0 -chardev socket,id=charconsole0,path=/run/vc/vm/849df14c6065931adedb9d18bc9260a6d896f1814a8c5cfa239865772f1b7a5f/console.sock,server,nowait -device virtio-scsi-pci,id=scsi0,disable-modern=true,romfile= -object rng-random,id=rng0,filename=/dev/urandom -device virtio-rng-pci,rng=rng0,romfile= -device virtserialport,chardev=charch0,id=channel0,name=agent.channel.0 -chardev socket,id=charch0,path=/run/vc/vm/849df14c6065931adedb9d18bc9260a6d896f1814a8c5cfa239865772f1b7a5f/kata.sock,server,nowait -chardev socket,id=char-396c5c3e19e29353,path=/run/vc/vm/849df14c6065931adedb9d18bc9260a6d896f1814a8c5cfa239865772f1b7a5f/vhost-fs.sock -device vhost-user-fs-pci,chardev=char-396c5c3e19e29353,tag=kataShared,romfile= -netdev tap,id=network-0,vhost=on,vhostfds=3:4,fds=5:6 -device driver=virtio-net-pci,netdev=network-0,mac=52:ac:2d:02:1f:6f,disable-modern=true,mq=on,vectors=6,romfile= -global kvm-pit.lost_tick_policy=discard -vga none -no-user-config -nodefaults -nographic -daemonize -object memory-backend-file,id=dimm1,size=4096M,mem-path=/dev/shm,share=on -numa node,memdev=dimm1 -kernel /opt/kata/share/kata-containers/vmlinuz-5.7.9-74 -initrd /opt/kata/share/kata-containers/kata-containers-initrd_alpine_1.11.2-6_agent.initrd -append tsc=reliable no_timer_check rcupdate.rcu_expedited=1 i8042.direct=1 i8042.dumbkbd=1 i8042.nopnp=1 i8042.noaux=1 noreplace-smp reboot=k console=hvc0 console=hvc1 iommu=off cryptomgr.notests net.ifnames=0 pci=lastbus=0 debug panic=1 nr_cpus=4 agent.use_vsock=false scsi_mod.scan=none init=/usr/bin/kata-agent -pidfile /run/vc/vm/849df14c6065931adedb9d18bc9260a6d896f1814a8c5cfa239865772f1b7a5f/pid -D /run/vc/vm/849df14c6065931adedb9d18bc9260a6d896f1814a8c5cfa239865772f1b7a5f/qemu.log -smp 2,cores=1,threads=1,sockets=4,maxcpus=4 --- ** Affects: qemu Importance: Undecided Status: New -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1888601 Title: QEMU v5.1.0-rc0/rc1 hang with nested virtualization Status in QEMU: New Bug description: We're running Kata Containers using QEMU and with v5.1.0rc0 and rc1 have noticed a problem at startup where QEMu appears to hang. We are not seeing this problem on our bare metal nodes and only on a VSI that supports nested virtualization. We unfortunately see nothing at all in the QEMU logs to help understand the problem and a hung process is just a guess at this point. Using git bisect we first see the problem with... --- f19bcdfedd53ee93412d535a842a89fa27cae7f2 is the first bad commit commit f19bcdfedd53ee93412d535a842a89fa27cae7f2 Author: Jason Wang <jasow...@redhat.com> Date: Wed Jul 1 22:55:28 2020 +0800 virtio-pci: implement queue_enabled method With version 1, we can detect whether a queue is enabled via queue_enabled. Signed-off-by: Jason Wang <jasow...@redhat.com> Signed-off-by: Cindy Lu <l...@redhat.com> Message-Id: <20200701145538.22333-5-l...@redhat.com> Reviewed-by: Michael S. Tsirkin <m...@redhat.com> Signed-off-by: Michael S. Tsirkin <m...@redhat.com> Acked-by: Jason Wang <jasow...@redhat.com> hw/virtio/virtio-pci.c | 13 +++++++++++++ 1 file changed, 13 insertions(+) --- Reverting this commit seems to work and prevent the hanging. --- Here's how kata ends up launching qemu in our environment -- /opt/kata/bin/qemu-system-x86_64 -name sandbox-849df14c6065931adedb9d18bc9260a6d896f1814a8c5cfa239865772f1b7a5f -uuid 6bec458e-1da7-4847-a5d7-5ab31d4d2465 -machine pc,accel=kvm,kernel_irqchip -cpu host,pmu=off -qmp unix:/run/vc/vm/849df14c6065931adedb9d18bc9260a6d896f1814a8c5cfa239865772f1b7a5f/qmp.sock,server,nowait -m 4096M,slots=10,maxmem=30978M -device pci-bridge,bus=pci.0,id=pci-bridge-0,chassis_nr=1,shpc=on,addr=2,romfile= -device virtio-serial-pci,disable-modern=true,id=serial0,romfile= -device virtconsole,chardev=charconsole0,id=console0 -chardev socket,id=charconsole0,path=/run/vc/vm/849df14c6065931adedb9d18bc9260a6d896f1814a8c5cfa239865772f1b7a5f/console.sock,server,nowait -device virtio-scsi-pci,id=scsi0,disable-modern=true,romfile= -object rng-random,id=rng0,filename=/dev/urandom -device virtio-rng-pci,rng=rng0,romfile= -device virtserialport,chardev=charch0,id=channel0,name=agent.channel.0 -chardev socket,id=charch0,path=/run/vc/vm/849df14c6065931adedb9d18bc9260a6d896f1814a8c5cfa239865772f1b7a5f/kata.sock,server,nowait -chardev socket,id=char-396c5c3e19e29353,path=/run/vc/vm/849df14c6065931adedb9d18bc9260a6d896f1814a8c5cfa239865772f1b7a5f/vhost-fs.sock -device vhost-user-fs-pci,chardev=char-396c5c3e19e29353,tag=kataShared,romfile= -netdev tap,id=network-0,vhost=on,vhostfds=3:4,fds=5:6 -device driver=virtio-net-pci,netdev=network-0,mac=52:ac:2d:02:1f:6f,disable-modern=true,mq=on,vectors=6,romfile= -global kvm-pit.lost_tick_policy=discard -vga none -no-user-config -nodefaults -nographic -daemonize -object memory-backend-file,id=dimm1,size=4096M,mem-path=/dev/shm,share=on -numa node,memdev=dimm1 -kernel /opt/kata/share/kata-containers/vmlinuz-5.7.9-74 -initrd /opt/kata/share/kata-containers/kata-containers-initrd_alpine_1.11.2-6_agent.initrd -append tsc=reliable no_timer_check rcupdate.rcu_expedited=1 i8042.direct=1 i8042.dumbkbd=1 i8042.nopnp=1 i8042.noaux=1 noreplace-smp reboot=k console=hvc0 console=hvc1 iommu=off cryptomgr.notests net.ifnames=0 pci=lastbus=0 debug panic=1 nr_cpus=4 agent.use_vsock=false scsi_mod.scan=none init=/usr/bin/kata-agent -pidfile /run/vc/vm/849df14c6065931adedb9d18bc9260a6d896f1814a8c5cfa239865772f1b7a5f/pid -D /run/vc/vm/849df14c6065931adedb9d18bc9260a6d896f1814a8c5cfa239865772f1b7a5f/qemu.log -smp 2,cores=1,threads=1,sockets=4,maxcpus=4 --- To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/1888601/+subscriptions