On Thu, Jan 29, 2026 at 5:20 AM Vishwanath Seshagiri <[email protected]> wrote:
>
> Introduce page_pool support in virtio_net driver to enable page recycling
> in RX buffer allocation and avoid repeated page allocator calls. This
> applies to mergeable and small buffer modes.
>
> Beyond performance improvements, this patch is a prerequisite for enabling
> memory provider-based zero-copy features in virtio_net, specifically devmem
> TCP and io_uring ZCRX, which require drivers to use page_pool for buffer
> management.
>
> The implementation preserves the DMA premapping optimization introduced in
> commit 31f3cd4e5756 ("virtio-net: rq submits premapped per-buffer") by
> conditionally using PP_FLAG_DMA_MAP when the virtio backend supports
> standard DMA API (vhost, virtio-pci), and falling back to allocation-only
> mode for backends with custom DMA mechanisms (VDUSE).
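>
> A rough sketch of that conditional setup (the helper name and exact
> page_pool_params fields here are illustrative, not the literal patch
> code):
>
>   static int virtnet_create_page_pool(struct receive_queue *rq)
>   {
>           struct device *dma_dev = virtqueue_dma_dev(rq->vq);
>           struct page_pool_params pp_params = {
>                   .order     = 0,
>                   .pool_size = virtqueue_get_vring_size(rq->vq),
>                   .nid       = NUMA_NO_NODE,
>           };
>
>           if (dma_dev) {
>                   /* Backend uses the standard DMA API: let the pool
>                    * premap each page once and recycle the mapping.
>                    */
>                   pp_params.flags   = PP_FLAG_DMA_MAP;
>                   pp_params.dev     = dma_dev;
>                   pp_params.dma_dir = DMA_FROM_DEVICE;
>           }
>           /* else (VDUSE): the pool only allocates/recycles pages;
>            * the transport maps them through its own mechanism.
>            */
>
>           rq->page_pool = page_pool_create(&pp_params);
>           return PTR_ERR_OR_ZERO(rq->page_pool);
>   }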
>
> Changes in v2
> =============
>
> Addressing reviewer feedback from v1:
>
> - Add "select PAGE_POOL" to Kconfig (Jason Wang)
> - Move page pool creation from ndo_open to probe for device lifetime
>   management (Xuan Zhuo, Jason Wang)
> - Implement conditional DMA strategy using virtqueue_dma_dev():
>   - When non-NULL: use PP_FLAG_DMA_MAP for page_pool-managed DMA premapping
>   - When NULL (VDUSE): page_pool handles allocation only
> - Use page_pool_get_dma_addr() + virtqueue_add_inbuf_premapped() to
>   preserve the DMA premapping optimization from commit 31f3cd4e5756
>   ("virtio-net: rq submits premapped per-buffer") (Jason Wang); see
>   the sketch after this list
> - Remove the dual allocation code paths; page_pool is now always used
>   for small/mergeable modes (Jason Wang)
> - Remove unused virtnet_rq_alloc/virtnet_rq_init_one_sg functions
> - Add comprehensive performance data (Michael S. Tsirkin)
> - v1 link:
>   https://lore.kernel.org/virtualization/[email protected]/
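>
> A rough sketch of that premapped submission path (again illustrative,
> not the literal patch code; rq->sg is the per-queue scatterlist
> already present in virtio_net):
>
>   static int virtnet_add_recvbuf_pp(struct receive_queue *rq, gfp_t gfp)
>   {
>           struct page *page = page_pool_alloc_pages(rq->page_pool, gfp);
>
>           if (!page)
>                   return -ENOMEM;
>
>           /* The DMA address is valid because the pool was created
>            * with PP_FLAG_DMA_MAP; no per-buffer dma_map_page() here.
>            */
>           sg_init_table(rq->sg, 1);
>           rq->sg[0].dma_address = page_pool_get_dma_addr(page);
>           rq->sg[0].length      = PAGE_SIZE;
>
>           return virtqueue_add_inbuf_premapped(rq->vq, rq->sg, 1,
>                                                page, NULL, gfp);
>   }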
>
> Performance Results
> ===================
>
> Tested using iperf3 TCP_STREAM with virtio-net on a vhost backend.
> 300-second runs; results show throughput and TCP retransmissions.
> The base kernel is the net tree at commit 709bbb015538.
>
> Mergeable Buffer Mode (mrg_rxbuf=on, GSO enabled, MTU 1500):
> +--------+---------+---------+------------+------------+--------+--------+
> | Queues | Streams |  Patch  | Throughput |   Retries  | Delta  | Retry% |
> +--------+---------+---------+------------+------------+--------+--------+
> |   1    |    1    |  base   |  25.7 Gbps |      0     |   -    |   -    |
> |   1    |    1    |   pp    |  26.2 Gbps |      0     | +1.9%  |   0%   |
> +--------+---------+---------+------------+------------+--------+--------+
> |   8    |    8    |  base   |  95.6 Gbps |  236,432   |   -    |   -    |
> |   8    |    8    |   pp    |  97.9 Gbps |  188,249   | +2.4%  | -20.4% |
> +--------+---------+---------+------------+------------+--------+--------+
>
> Small Buffer Mode (mrg_rxbuf=off, GSO disabled, MTU 1500):
> +--------+---------+---------+------------+------------+--------+--------+
> | Queues | Streams |  Patch  | Throughput |   Retries  | Delta  | Retry% |
> +--------+---------+---------+------------+------------+--------+--------+
> |   1    |    1    |  base   |  9.17 Gbps |    15,152  |   -    |   -    |
> |   1    |    1    |   pp    |  9.19 Gbps |    12,203  | +0.2%  | -19.5% |
> +--------+---------+---------+------------+------------+--------+--------+
> |   8    |    8    |  base   | 43.0 Gbps  |   974,500  |   -    |   -    |
> |   8    |    8    |   pp    | 44.7 Gbps  |   717,411  | +4.0%  | -26.4% |
> +--------+---------+---------+------------+------------+--------+--------+

It would be better to have more benchmarks, like:

PPS (using pktgen on the host and XDP_DROP in the guest)

That way we can see PPS as well as XDP performance.
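
For reference, a minimal XDP_DROP program for that test could look
like this (program and section names are just an example):

  // SPDX-License-Identifier: GPL-2.0
  /* Drop every frame at the driver so the measurement isolates the
   * RX buffer allocation/recycling cost.
   */
  #include <linux/bpf.h>
  #include <bpf/bpf_helpers.h>

  SEC("xdp")
  int xdp_drop_all(struct xdp_md *ctx)
  {
          return XDP_DROP;
  }

  char _license[] SEC("license") = "GPL";

Build with clang -O2 -g -target bpf, attach in the guest with
"ip link set dev <iface> xdpdrv obj xdp_drop.o sec xdp", and run
pktgen against the guest from the host while counting PPS.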

Thanks

>
> Testing
> =======
>
> The patches have been tested with:
> - iperf3 bulk transfer workloads (multiple queue/stream configurations)
> - Included selftests for buffer circulation verification
> - Edge case testing: device unbind/bind cycles, rapid interface open/close,
>   traffic during close, ethtool feature toggling, close with pending refill
>   work, and data integrity verification
>
> Vishwanath Seshagiri (2):
>   virtio_net: add page_pool support for buffer allocation
>   selftests: virtio_net: add buffer circulation test
>
>  drivers/net/Kconfig                           |   1 +
>  drivers/net/virtio_net.c                      | 353 ++++++++++--------
>  .../drivers/net/virtio_net/basic_features.sh  |  70 ++++
>  3 files changed, 273 insertions(+), 151 deletions(-)
>
> --
> 2.47.3
>

