The issue was reported by Yihuang Yu on NVidia's grace-hopper (ARM64) platform. The wrong head (available ring entry) is seen by the guest when running 'netperf' on the guest and 'netserver' on another NVidia grace-grace machine.

  /home/gavin/sandbox/qemu.main/build/qemu-system-aarch64      \
  -accel kvm -machine virt,gic-version=host -cpu host          \
  -smp maxcpus=1,cpus=1,sockets=1,clusters=1,cores=1,threads=1 \
  -m 4096M,slots=16,maxmem=64G                                 \
  -object memory-backend-ram,id=mem0,size=4096M                \
   :                                                           \
  -netdev tap,id=tap0,vhost=true                               \
  -device virtio-net-pci,bus=pcie.8,netdev=tap0,mac=52:54:00:f1:26:b0
   :
  guest# ifconfig eth0 | grep 'inet addr'
  inet addr:10.26.1.220
  guest# netperf -H 10.26.1.81 -l 60 -C -c -t UDP_STREAM
  virtio_net virtio0: output.0:id 100 is not a head!

An smp_rmb() is missing in vhost_{vq_avail_empty, enable_notify}(). Without it, vhost can observe the advanced vq->avail_idx before the corresponding available ring entries have become visible on the vhost side, so a stale available ring entry can be fetched in vhost_get_vq_desc().

Fix it by adding smp_rmb() in those two functions. Note that the fix is split into two patches so that they can be easily picked up by the stable kernel. With the changes, I'm unable to hit the issue again. Besides, the function vhost_get_avail_idx() is improved to handle the memory barrier so that the callers needn't worry about it.

v2: https://lore.kernel.org/virtualization/46c6a9aa-821c-4013-afe7-61ec05fc9...@redhat.com
v1: https://lore.kernel.org/virtualization/66e12633-b2d6-4b9a-9103-bb79770fc...@redhat.com

Changelog
=========
v3:
  Improved change log                                       (Jason)
  Improved comments and added PATCH[v3 3/3] to execute
  smp_rmb() in vhost_get_avail_idx()                        (Michael)

Gavin Shan (3):
  vhost: Add smp_rmb() in vhost_vq_avail_empty()
  vhost: Add smp_rmb() in vhost_enable_notify()
  vhost: Improve vhost_get_avail_idx() with smp_rmb()

 drivers/vhost/vhost.c | 51 ++++++++++++++++++++-----------------------
 1 file changed, 24 insertions(+), 27 deletions(-)

-- 
2.44.0
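
For readers who want to see the ordering problem described above in isolation, here is a minimal userspace sketch. It is not the vhost code: every name in it (demo_avail, demo_vq, demo_publish(), demo_vq_avail_empty(), demo_get_head(), DEMO_RING_SIZE) is hypothetical, and a C11 acquire fence stands in for the kernel's smp_rmb() under the assumption that the producer pairs it with a release fence before publishing the index.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_RING_SIZE 256u	/* hypothetical ring size */

/* Hypothetical stand-in for the shared available ring. */
struct demo_avail {
	_Atomic uint16_t idx;		/* published by the producer */
	uint16_t ring[DEMO_RING_SIZE];	/* head entries */
};

/* Hypothetical stand-in for the consumer's private state. */
struct demo_vq {
	struct demo_avail *avail;
	uint16_t avail_idx;		/* cached copy of avail->idx */
	uint16_t last_avail_idx;	/* next entry to consume */
};

/* Producer role: write the ring entry first, then publish the index. */
static void demo_publish(struct demo_avail *a, uint16_t head)
{
	uint16_t idx = atomic_load_explicit(&a->idx, memory_order_relaxed);

	a->ring[idx % DEMO_RING_SIZE] = head;
	/* Order the entry write before the index write. */
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&a->idx, (uint16_t)(idx + 1),
			      memory_order_relaxed);
}

/* Consumer role: refresh the cached index and report emptiness. */
static bool demo_vq_avail_empty(struct demo_vq *vq)
{
	if (vq->avail_idx != vq->last_avail_idx)
		return false;

	vq->avail_idx = atomic_load_explicit(&vq->avail->idx,
					     memory_order_relaxed);
	if (vq->avail_idx == vq->last_avail_idx)
		return true;

	/*
	 * The index advanced. Without this fence (the analogue of
	 * smp_rmb() in this sketch), the later ring read could be
	 * satisfied with a stale entry on a weakly ordered CPU.
	 */
	atomic_thread_fence(memory_order_acquire);
	return false;
}

/* Consumer role: fetch the next head, if any entry is available. */
static bool demo_get_head(struct demo_vq *vq, uint16_t *out)
{
	assert(out);
	if (demo_vq_avail_empty(vq))
		return false;

	*out = vq->avail->ring[vq->last_avail_idx % DEMO_RING_SIZE];
	vq->last_avail_idx++;
	return true;
}
```

In this sketch the fence sits inside the helper that refreshes the cached index, so demo_get_head() does not need to issue a barrier of its own before reading the ring. Dropping the acquire fence would not change behaviour in a single-threaded run; the reordering only shows up with a concurrent producer on a weakly ordered machine.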