This is my initial code analysis: between QEMU 2.3 and 2.5 there are about 80 vhost changes (excluding merges and tests), roughly 30 of which are for vhost-user.
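For reference, a count like that can be reproduced with git against the QEMU tree. The release tags are real; the exact paths and grep pattern filtered on here are my assumption about what was counted:

    # vhost-related commits between the releases, excluding merges
    # (the path filter also naturally excludes tests/)
    git log --oneline --no-merges v2.3.0..v2.5.0 \
        -- hw/virtio/vhost* include/hw/virtio/vhost* | wc -l

    # the vhost-user subset
    git log --oneline --no-merges v2.3.0..v2.5.0 --grep='vhost-user' | wc -l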
The most important vhost-user ones are these (the changes marked with * are the ones I'm most suspicious of):

      48854f57 vhost-user: fix log size
      dc3db6ad vhost-user: start/stop all rings
      5421f318 vhost-user: print original request on error
      2b8819c6 vhost-user: modify SET_LOG_BASE to pass mmap size and offset
      f6f56291 vhost user: add support of live migration
      9a78a5dd vhost-user: send log shm fd along with log_base
      1be0ac21 vhost-user: add vhost_user_requires_shm_log()
      7263a0ad vhost-user: add a new message to disable/enable a specific virt queue.
    * b931bfbf vhost-user: add multiple queue support
      fc57fd99 vhost: introduce vhost_backend_get_vq_index method
      e2051e9e vhost-user: add VHOST_USER_GET_QUEUE_NUM message
      dcb10c00 vhost-user: add protocol feature negotiation
      7305483a vhost-user: use VHOST_USER_XXX macro for switch statement
      d345ed2d Revert "vhost-user: add multi queue support"
      830d70db vhost-user: add multi queue support
      294ce717 vhost-user: Send VHOST_RESET_OWNER on vhost stop

And these for vhost:

      12b8cbac3c8 vhost: don't send RESET_OWNER at stop
      25a2a920ddd vhost: set the correct queue index in case of migration with multiqueue
    * 15324404f68 vhost: alloc shareable log
      2ce68e4cf5b vhost: add vhost_has_free_slot() interface
      0cf33fb6b49 virtio-net: correctly drop truncated packets
      fc57fd9900d vhost: introduce vhost_backend_get_vq_index method
      06c4670ff6d Revert "virtio-net: enable virtio 1.0"
      dfb8e184db7 virtio-pci: initial virtio 1.0 support
      b1506132001 vhost_net: add version_1 feature
      df91055db5c virtio-net: enable virtio 1.0
    * 309750fad51 vhost: logs sharing
      9718e4ae362 arm_gicv2m: set kvm_gsi_direct_mapping and kvm_msi_via_irqfd_allowed

The starred vhost-user change (b931bfbf) refactored the multiple queue support for vhost-user. I'm not entirely sure it is relevant to this problem, since they are not using queues=XX on the "-netdev" command line. They changed the number of virtio device queues (http://pastebin.ubuntu.com/24087865/) but not the number of queues for the virtio-net-pci device, which is what vhost-user multiqueue would control in this example (see the example command line after this analysis).

Possible causes of such behavior (based on the QEMU changes):

- vhost-user multiple queue support was refactored. They are not using "queues=XX" on the "-netdev" cmdline, but the refactoring could still have changed some logic along this path (to check).

- tx queue callback scheduling (either a timer or a QEMU AIO bottom half; see the sketch after this list). This could be the problem if there isn't enough context switching between the QEMU and vhost-user threads, e.g. due to lock contention or system overload (caused by some other change unrelated to virtio).
  * By raising the tx queue size we stretch the flushes out in time, and that possibly yields higher throughput (by stopping the queue overrun). This tells us that either the buffer is small OR the flush is being called fewer times than it should be.
  * That is why I'm focusing on this part: something either reduced the buffer size or is bottlenecking the buffer flush; that is typical of "burst" behavior, by the way.

- There was also a change in the vhost logging system:
  * vhost-user, commit 309750fad51.
  * For live migration they started logging vhost dirty memory (309750fad51) into anonymous pages from malloc(), into anonymous pages from memfd_create(), OR into pages backed by a file (in specific cases). A minimal memfd_create() sketch follows at the end of this analysis.
  * I'm not sure whether the log backend is used when no live migration is occurring (which could cause lock contention, for example).
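For reference, this is roughly what a vhost-user multiqueue command line looks like; the socket path, id names and queue count here are made up for illustration. Note that queues=N on -netdev has to be paired with mq=on and vectors=2*N+2 on the virtio-net-pci device, which is exactly the part missing from their command line:

    -chardev socket,id=char0,path=/var/run/openvswitch/vhost-user0 \
    -netdev type=vhost-user,id=net0,chardev=char0,queues=2 \
    -device virtio-net-pci,netdev=net0,mq=on,vectors=6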
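On the tx callback point, here is a minimal, self-contained sketch of the deferred-callback ("bottom half") pattern. This is NOT QEMU's code (the real tx path uses qemu_bh_schedule() or a timer, selectable via the device's tx=bh|timer property); it only illustrates how multiple queue kicks coalesce into a single flush, which is why fewer flushes than expected would show up as bursty overruns:

    /* Sketch of the deferred-callback ("bottom half") idea: the fast
     * path only marks work as pending; a later point in the event
     * loop runs the actual flush. */
    #include <stdbool.h>
    #include <stdio.h>

    struct bh {
        bool scheduled;
        void (*cb)(void *opaque);
        void *opaque;
    };

    static void bh_schedule(struct bh *bh) { bh->scheduled = true; }

    /* Called once per event-loop iteration. */
    static void bh_poll(struct bh *bh)
    {
        if (bh->scheduled) {
            bh->scheduled = false;
            bh->cb(bh->opaque);
        }
    }

    /* The deferred work: flush the tx queue. */
    static void tx_flush(void *opaque)
    {
        printf("flushing tx queue %s\n", (const char *)opaque);
    }

    int main(void)
    {
        struct bh tx_bh = { .cb = tx_flush, .opaque = "vq0" };

        bh_schedule(&tx_bh);   /* guest kicked the tx queue */
        bh_schedule(&tx_bh);   /* second kick coalesces into same run */
        bh_poll(&tx_bh);       /* one flush serves both kicks */
        return 0;
    }

If the event loop iterates too rarely (lock contention, overload), each flush has to drain more descriptors at once, which matches the burst behavior described above.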
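On the logging point, here is a minimal standalone sketch, assuming the shareable-log idea from 309750fad51: the log lives in fd-backed anonymous memory from memfd_create() instead of plain malloc() pages, so the fd can be handed to the vhost-user backend. This is demo code, not the QEMU implementation; it needs Linux >= 3.17 and glibc >= 2.27 for memfd_create():

    #define _GNU_SOURCE
    #include <stdio.h>
    #include <stdint.h>
    #include <string.h>
    #include <unistd.h>
    #include <sys/mman.h>

    int main(void)
    {
        size_t log_size = 4096;        /* arbitrary size for the demo */

        /* Anonymous but fd-backed memory: unlike malloc() pages, this
         * fd can be passed to the backend over the vhost-user control
         * socket (SCM_RIGHTS), which is what makes the log shareable. */
        int fd = memfd_create("vhost-log-demo", 0);
        if (fd < 0) { perror("memfd_create"); return 1; }

        if (ftruncate(fd, log_size) < 0) { perror("ftruncate"); return 1; }

        uint64_t *log = mmap(NULL, log_size, PROT_READ | PROT_WRITE,
                             MAP_SHARED, fd, 0);
        if (log == MAP_FAILED) { perror("mmap"); return 1; }

        memset(log, 0, log_size);      /* clear the dirty bitmap */
        printf("log mapped at %p, fd %d ready to send to the backend\n",
               (void *)log, fd);

        munmap(log, log_size);
        close(fd);
        return 0;
    }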
Bug: https://bugs.launchpad.net/bugs/1668829
Title: Performance regression from qemu 2.3 to 2.5 for vhost-user with ovs + dpdk