** Description changed:

+ [Impact]
+
+  * Host -> guest notifications can be lost, which can stall I/O;
+    see the original bug report below for more details.
+
+  * Backport the fix that forces the generated code to re-load
+    variables properly, avoiding the issue.
+
+ [Test Case]
+
+  * Set up iperf in the host and run the server "iperf -s"
+  * Get a guest using driver=qemu like:
+      <interface type='network'>
+        <source network='default'/>
+        <model type='virtio'/>
+        <driver name='qemu'/>
+      </interface>
+  * In the guest run a loop of iperf runs connecting to the
+    server on the host.
+    #!/bin/bash
+    for i in $(seq 1 1000);
+    do
+      echo Try $i
+      iperf -c 192.168.122.1 || break
+    done
+  * Depending on the HW model, the machine saturation and such, the
+    above test is either rather reproducible or not reproducible at all.
+    That is bad, but we haven't found a much better repro. Fortunately
+    IBM, who reported this issue (and created the fix), can recreate it
+    on their end and are willing to do so again for the SRU verification.
+
+ [Regression Potential]
+
+  * The changed code path is s390x-only and limited to the virtio-ccw
+    handling. Regressions, if any, would therefore be isolated to s390x
+    and would manifest in virtio-ccw based I/O.
+
+ [Other Info]
+
+  * n/a
+
+ ----
+
  Problem Description:

  When irqfds are not used, setting the adapter interruption host-->guest
  notifier bit is accomplished by the QEMU function virtio_set_ind_atomic().

- The atomic_cmpxchg() loop in virtio_set_ind_atomic() is broken because we occasionally end up with old and _old having different values (a legit compiler can generate code that accesses *ind_addr again to pick up a value for _old instead of using the value of old that was already fetched according to the rules of the abstract machine). This means the underlying CS instruction may use a different old (_old) than the one we intended to use if atomic_cmpxchg() performed the xchg part.
-
- The direct consequence of the problem is that host --> guest notifications can get lost. The indirect consequence is that queues may get stuck and the devices may cease to operate normally. We stumbled on this while debugging a choked virtio-net interface (one that used the qemu driver and not vhost). But it can affect other virtio-ccw devices as well.
+ The atomic_cmpxchg() loop in virtio_set_ind_atomic() is broken because
+ we occasionally end up with old and _old having different values (a
+ legit compiler can generate code that accesses *ind_addr again to pick
+ up a value for _old instead of using the value of old that was already
+ fetched according to the rules of the abstract machine). This means the
+ underlying CS instruction may use a different old (_old) than the one we
+ intended to use if atomic_cmpxchg() performed the xchg part.
+
+ The direct consequence of the problem is that host --> guest
+ notifications can get lost. The indirect consequence is that queues may
+ get stuck and the devices may cease to operate normally. We stumbled on
+ this while debugging a choked virtio-net interface (one that used the
+ qemu driver and not vhost). But it can affect other virtio-ccw devices
+ as well.

  If irqfds are used for host->guest notifications, then we are safe
  because notifier bit manipulation is done in the kernel (and it's done
  correctly).
-
  The problem described above is fixed upstream by commit
  1a8242f7c3 ("virtio-ccw: fix virtio_set_ind_atomic").
  All upstream versions since v2.0.0 are (potentially) affected.

  The same mistake was made in QEMU in another place, and is fixed by:
  45175361f1 ("s390x/pci: fix set_ind_atomic")
  We can file a separate BZ for it if necessary.

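
The failure mode described above comes from pairing a plain read of *ind_addr with the compare-and-swap, which leaves the compiler free to read the location twice. The following is a minimal sketch of a loop shape that avoids this. It is written with C11 <stdatomic.h> rather than QEMU's own atomic_cmpxchg() helper, the name set_ind_atomic and its signature are hypothetical, and it is not the code from commit 1a8242f7c3:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for virtio_set_ind_atomic(): atomically OR
 * to_be_set into *ind_addr and return the value seen just before the
 * update. Illustration only, not QEMU code.
 */
static uint8_t set_ind_atomic(_Atomic uint8_t *ind_addr, uint8_t to_be_set)
{
    /* Read the indicator byte once, through an atomic access. */
    uint8_t expected = atomic_load(ind_addr);

    /*
     * On failure, atomic_compare_exchange_strong() writes the value it
     * actually found into 'expected', so each retry compares against
     * that value instead of a second plain read of *ind_addr.
     */
    while (!atomic_compare_exchange_strong(ind_addr, &expected,
                                           (uint8_t)(expected | to_be_set))) {
        /* retry with the refreshed 'expected' */
    }

    return expected;
}
```

The point of this shape is that the value compared against and the value used to build the new byte are always the same local variable, so a bit set concurrently by another writer cannot be silently dropped.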
-- 
https://bugs.launchpad.net/bugs/1894942

Title:
  [UBUNTU 20.04] Lost virtio host --> guest notifications cause devices to cease normal operation