Async enqueue offloads large copies to DMA devices, while small copies are still performed by the CPU. However, users must retrieve all enqueue-completed packets via rte_vhost_poll_enqueue_completed(), even those whose copies the CPU has already finished by the time rte_vhost_submit_enqueue_burst() returns. This design incurs extra overhead from tracking completed pktmbufs and from the additional function calls, which degrades performance for small packets.
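To illustrate the idea of handing CPU-completed packets back at submit time, here is a minimal toy model in C. It is only a sketch under stated assumptions: sketch_pkt, sketch_submit_enqueue_burst(), the per-packet SKETCH_DMA_THRESHOLD and the dma_inflight counter are all hypothetical names made up for this example, and none of this is the actual vhost implementation or its API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; not rte_mbuf or the vhost API. */
#define SKETCH_DMA_THRESHOLD 256u /* assumed cut-off between CPU and DMA copies */

struct sketch_pkt {
	uint32_t len; /* packet length in bytes */
};

/*
 * Toy model of a submit call that returns CPU-completed packets directly.
 * Packets shorter than the threshold are treated as already copied by the
 * CPU and are stored in comp_pkts[]; all others are only counted as in
 * flight on the (imaginary) DMA device and would need a later poll.
 * Returns the number of packets accepted, which is always count here.
 */
static uint16_t
sketch_submit_enqueue_burst(struct sketch_pkt **pkts, uint16_t count,
		struct sketch_pkt **comp_pkts, uint32_t *comp_count,
		uint32_t *dma_inflight)
{
	uint16_t i;

	assert(pkts != NULL && comp_pkts != NULL);
	assert(comp_count != NULL && dma_inflight != NULL);

	*comp_count = 0;
	for (i = 0; i < count; i++) {
		if (pkts[i]->len < SKETCH_DMA_THRESHOLD)
			comp_pkts[(*comp_count)++] = pkts[i]; /* done by CPU */
		else
			(*dma_inflight)++; /* completion reported later */
	}

	return count;
}
```

In this model the caller can release everything in comp_pkts[] as soon as the submit call returns, and only needs to poll for the packets counted in dma_inflight, which is the saving the series aims for on small packets.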
The first patch cleans up async enqueue code, and the second patch enables rte_vhost_submit_enqueue_burst() to return completed packets.

Change log
==========
v3:
- fix incorrect ret value when DMA ring is full
- enhance description of API declaration and programmer guide

v2:
- fix typo
- rename API variables
- update programmer guide

Jiayu Hu (2):
  vhost: cleanup async enqueue
  vhost: enhance async enqueue for small packets

 doc/guides/prog_guide/vhost_lib.rst |   8 +-
 lib/librte_vhost/rte_vhost_async.h  |  32 +++--
 lib/librte_vhost/vhost.c            |  14 +-
 lib/librte_vhost/vhost.h            |   7 +-
 lib/librte_vhost/vhost_user.c       |   7 +-
 lib/librte_vhost/virtio_net.c       | 258 ++++++++++++++++++++----------------
 6 files changed, 185 insertions(+), 141 deletions(-)

--
2.7.4