On 3/11/26 6:32 PM, David Marchand wrote:
> By default, DPDK based dp-packets points to data buffers that can't be
> expanded dynamically.
> Their layout is as follows:
> - a minimum 128 bytes headroom chosen at DPDK build time
> (RTE_PKTMBUF_HEADROOM),
> - a maximum size chosen at mempool creation,
>
> In some usecases though (like encapsulating with multiple tunnels),
> a 128 bytes headroom is too short.
>
> Keep on using mono segment packets but dynamically allocate buffers
> in DPDK memory and make use of DPDK external buffers API
> (previously used for userspace TSO).
>
> Signed-off-by: David Marchand <[email protected]>
> ---
> Changes since v4:
> - fixed tailroom,
> - added a check on configured DPDK headroom,
> - added more description and renamed ifaces in the unit test,
>
> Changes since v3:
> - split buffer length calculation in a helper,
> - handled running test without qdisc (net/tap does not require
> those qdiscs, but spews ERR level logs if absent),
> - added check on firewall,
>
> Changes since v2:
> - moved check on uint16_t overflow in netdev_dpdk_extbuf_allocate(),
>
> Changes since v1:
> - fixed new segment length (reset by extbuf attach helper),
> - added a system-dpdk unit test,
>
> ---
> acinclude.m4 | 7 +++
> lib/dp-packet.c | 21 ++++++++-
> lib/netdev-dpdk.c | 47 +++++++++++++++++---
> lib/netdev-dpdk.h | 4 ++
> tests/atlocal.in | 1 +
> tests/system-dpdk.at | 100 +++++++++++++++++++++++++++++++++++++++++++
> 6 files changed, 174 insertions(+), 6 deletions(-)
>
> diff --git a/acinclude.m4 b/acinclude.m4
> index e4e48cb531..060c416f8a 100644
> --- a/acinclude.m4
> +++ b/acinclude.m4
> @@ -431,6 +431,13 @@ AC_DEFUN([OVS_CHECK_DPDK], [
> AC_MSG_ERROR([unable to find rte_config.h in $with_dpdk])
> ], [AC_INCLUDES_DEFAULT])
>
> + AC_COMPUTE_INT([dpdk_mbuf_headroom], [RTE_PKTMBUF_HEADROOM],
> + [AC_INCLUDES_DEFAULT],
> + [AC_MSG_ERROR([unable to determine RTE_PKTMBUF_HEADROOM])])
> + AC_DEFINE_UNQUOTED([DPDK_MBUF_HEADROOM], [$dpdk_mbuf_headroom],
> + [Value of RTE_PKTMBUF_HEADROOM from DPDK])
> + AC_SUBST([DPDK_MBUF_HEADROOM], [$dpdk_mbuf_headroom])
> +
> AC_CHECK_DECLS([RTE_LIBRTE_VHOST_NUMA, RTE_EAL_NUMA_AWARE_HUGEPAGES], [
> OVS_FIND_DEPENDENCY([get_mempolicy], [numa], [libnuma])
> ], [], [[#include <rte_config.h>]])
> diff --git a/lib/dp-packet.c b/lib/dp-packet.c
> index c04d608be6..30fd013c29 100644
> --- a/lib/dp-packet.c
> +++ b/lib/dp-packet.c
> @@ -255,8 +255,27 @@ dp_packet_resize(struct dp_packet *b, size_t new_headroom, size_t new_tailroom)
> new_allocated = new_headroom + dp_packet_size(b) + new_tailroom;
>
> switch (b->source) {
> - case DPBUF_DPDK:
> + case DPBUF_DPDK: {
> +#ifdef DPDK_NETDEV
> + uint32_t extbuf_len;
> +
> + extbuf_len = netdev_dpdk_extbuf_size(new_allocated);
> + ovs_assert(extbuf_len <= UINT16_MAX);
> + new_base = netdev_dpdk_extbuf_allocate(extbuf_len);
> + if (!new_base) {
> + out_of_memory();
> + }
> + dp_packet_copy__(b, new_base, new_headroom, new_tailroom);
> + netdev_dpdk_extbuf_replace(b, new_base, extbuf_len);
> + /* Because of alignment, we may have gained a bit more tailroom than
> + * expected. Rely on this mbuf buf_len which got adjusted by
> + * rte_pktmbuf_attach_extbuf(). */
> + new_allocated = b->mbuf.buf_len;
nit: We're not accessing mbuf directly anywhere else in this file, so I wonder
if we should use the accessor function here instead, e.g.:
/* Because of alignment, we may have gained a bit more tailroom than
* expected. Update from the currently allocated length which got
* adjusted by rte_pktmbuf_attach_extbuf(). */
new_allocated = dp_packet_get_allocated(b);
WDYT?
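
To illustrate the idea, here is a minimal standalone sketch. It uses mock types rather than the real struct rte_mbuf / struct dp_packet, and the 64-byte alignment and every mock_* name are assumptions made up for the example; the real netdev_dpdk_extbuf_size() may also account for DPDK bookkeeping that is left out here. It only shows that the rounded-up length ends up in buf_len and that the caller can read it back through an accessor instead of reaching into the mbuf:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for struct rte_mbuf and struct dp_packet. */
struct mock_mbuf {
    uint16_t buf_len;           /* Length of the attached data buffer. */
};

struct mock_packet {
    struct mock_mbuf mbuf;
};

/* Assumed alignment, chosen only for this example. */
#define MOCK_ALIGN 64u

/* Rounds 'len' up to a multiple of MOCK_ALIGN. */
static inline uint32_t
mock_extbuf_size(size_t len)
{
    return (uint32_t) ((len + MOCK_ALIGN - 1) / MOCK_ALIGN * MOCK_ALIGN);
}

/* Accessor in the spirit of dp_packet_get_allocated(). */
static inline uint16_t
mock_packet_get_allocated(const struct mock_packet *p)
{
    return p->mbuf.buf_len;
}

/* Mimics the resize path: records the rounded-up buffer length in the
 * mock mbuf, then reports the allocated length through the accessor. */
static inline size_t
mock_packet_resize(struct mock_packet *p, size_t headroom, size_t size,
                   size_t tailroom)
{
    uint32_t extbuf_len = mock_extbuf_size(headroom + size + tailroom);

    assert(extbuf_len <= UINT16_MAX);
    p->mbuf.buf_len = (uint16_t) extbuf_len;

    return mock_packet_get_allocated(p);
}
```

With 256 bytes of headroom, 100 bytes of data and 10 bytes of tailroom, the requested 366 bytes are rounded up to 384, so the packet gains 18 bytes of tailroom over what was asked for, and the accessor reports the larger value.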
Also, the comment should use double spaces between sentences.
All of these could probably be adjusted on commit; the rest of the code and
the updated test look good to me:
Acked-by: Ilya Maximets <[email protected]>