On 11/01/2026 17:54, Aaron Conole wrote:
External email: Use caution opening links or attachments
Eelco Chaudron via dev <[email protected]> writes:
This RFC patch series introduces a major architectural
refactoring of Open vSwitch's hardware offload
infrastructure. It replaces the tightly coupled
`netdev-offload` implementation with a new, modular
`dpif-offload-provider` framework.
MOTIVATION
-------------------------------------------------------------
The existing `netdev-offload` API tightly couples datapath
implementations (like `dpif-netdev`) with specific offload
technologies (DPDK's rte_flow). This design has several
limitations:
- Rigid Architecture: It creates complex dependencies,
making the code difficult to maintain and extend.
- Limited Flexibility: Supporting multiple offload backends
simultaneously or adding new ones is cumbersome.
- Inconsistent APIs: The logic for handling different
offload types is scattered, leading to an inconsistent
and hard-to-follow API surface.
This refactoring aims to resolve these issues by creating a
clean separation of concerns, improving modularity, and
establishing a clear path for future hardware offload
integrations.
Thanks for all the work you've done on this over the last 8 months.
I'm joining the thanks.
PROPOSED SOLUTION: THE `DPIF-OFFLOAD-PROVIDER` FRAMEWORK
-------------------------------------------------------------
This series introduces the `dpif-offload-provider`
framework, which functions similarly to the existing
`dpif-provider` pattern. It treats hardware offload as a
distinct layer with multiple, dynamically selectable
backends.
Key features of the new framework include:
1. Modular Architecture: A clean separation between the
generic datapath interface and specific offload
provider implementations (e.g., `dpif-offload-tc`,
`dpif-offload-dpdk`). `dpif` layers are now generic
clients of the offload API.
2. Provider-based System: Allows multiple offload backends
to coexist.
3. Unified and Asynchronous API: Establishes a consistent
API across all offload providers. For userspace
datapaths, the API is extended to support asynchronous
flow operations with callbacks, making `dpif-netdev` a
more efficient client.
4. Enhanced Configuration: Provides granular control over
offload provider selection through a global and per-port
priority system (`hw-offload-priority`), allowing
fine-tuned policies for different hardware.
5. Improved Testing: Includes a new test framework
specifically for validating DPDKs rte_flow offloads,
enhancing long-term maintainability.
PATCH SERIES ORGANIZATION
-------------------------------------------------------------
This large series is organized logically to facilitate
review:
1. Framework Foundation: The initial patches establish the
core `dpif-offload-provider` framework, including the
necessary APIs for port management, flow mark
allocation, configuration, and a dummy provider for
testing.
2. Provider Implementation: These patches introduce the new
`dpif-offload-tc` and `dpif-offload-dpdk` providers,
building out their specific implementations on top of
the new framework.
3. API Migration and Decoupling: The bulk of the series
systematically migrates functionality from the legacy
`netdev-offload` layer to the new providers. Key
commits here decouple `dpif-netdev` and, crucially,
`dpif-netlink` from their hardware offload
entanglements.
4. Cleanup: The final patches remove the now-redundant
global APIs and structures from `netdev-offload`,
completing the transition.
BACKWARD COMPATIBILITY
-------------------------------------------------------------
This refactoring maintains full API compatibility from a
user's perspective. All existing `ovs-vsctl` and
`ovs-appctl` commands continue to function as before. The
changes are primarily internal architectural improvements
designed to make OVS more robust and extensible.
REQUEST FOR COMMENTS
-------------------------------------------------------------
This is a significant architectural change that affects
core OVS infrastructure. We welcome feedback on:
- The overall architectural approach and the
`dpif-offload-provider` concept.
- The API design, particularly the new asynchronous model
for `dpif-netdev`.
- The migration strategy and any potential backward
compatibility concerns.
- Performance implications of the new framework.
Just a few quick thoughts while I'm finishing up (I'm on patch 29):
1. There are a few groups that might make more sense to squash together
while applying. For example, 23-25 is particularly agregious example
where it introduces some temp change, and then deletes it with a
different change. While seeing the steps is nice, it is *really*
confusing - even on the review side, it's a bit jarring to see.
Typically we don't allow patches that clean up earlier patches in the
same series. It may be worth spending the time to squash some of
these. Regardless of how messy it can be.
2. The dpdk offload stuff - my opinion is it should be named with
rte_flow rather than dpdk. It could make things difficult in the
future if a second offload type becomes available for DPDK ports.
WDYT (I realize that's a rename that takes a bit of time to do)?
Actually, giving it another thought, there is already a second offload
type in dpdk - rte_flow_async_XXX.
If someone wants to implement in OVS dpif-offload-X based on
rte_flow_async API, it will indeed be confusing.
3. During my read-ahead, I got the heebee jeebees with
[34/40] dpif_netdev: Fix nullable memcpy in queue_netdev_flow_put().
I haven't completed my review to that portion, but I suspect that if
it's a real issue, it could be first in the series or even separate
and applied separately. WDYT?
-------------------------------------------------------------
Changes from v1:
- Fixed issues reported by Aaron and my AI experiments.
- See individual patches for specific changes.
Changes from v2:
- See individual patches for specific changes.
Changes from v3:
- In patch 7, removed leftover netdev_close(port->netdev)
causing netdev reference count issues.
- Changed naming for dpif_offload_impl_type enum entries.
- Merged previous patch 36, 'dpif-netdev: Add full name
to the dp_netdev structure', with the next patch.
Eelco Chaudron (40):
dpif-offload-provider: Add dpif-offload-provider implementation.
dpif-offload: Add provider for tc offload.
dpif-offload: Add provider for dpdk (rte_flow).
dpif-offload: Allow configuration of offload provider priority.
dpif-offload: Move hw-offload configuration to dpif-offload.
dpif-offload: Add offload provider set_config API.
dpif-offload: Add port registration and management APIs.
dpif-offload-tc: Add port management framework.
dpif-offload-dpdk: Add port management framework.
dpif-offload: Validate mandatory port class callbacks on registration.
dpif-offload: Allow per-port offload provider priority config.
dpif-offload: Introduce provider debug information API.
dpif-offload: Call flow-flush netdev-offload APIs via dpif-offload.
dpif-offload: Call meter netdev-offload APIs via dpif-offload.
dpif-offload: Move the flow_get_n_flows() netdev-offload API to dpif.
dpif-offload: Move hw_post_process netdev API to dpif.
dpif-offload: Add flow dump APIs to dpif-offload.
dpif-offload: Move the tc flow dump netdev APIs to dpif-offload.
dpif-netlink: Remove netlink-offload integration.
dpif-netlink: Add API to get offloaded netdev from port_id.
dpif-offload: Add API to find offload implementation type.
dpif-offload: Add operate implementation to dpif-offload.
netdev-offload: Temporarily move thread-related APIs to dpif-netdev.
dpif-offload: Add port dump APIs to dpif-offload.
dpif-netdev: Remove indirect DPDK netdev offload API calls.
dpif: Add dpif_get_features() API.
dpif-offload: Add flow operations to dpif-offload-tc.
dpif-netlink: Remove entangled hardware offload.
dpif-offload-tc: Remove netdev-offload dependency.
netdev_dummy: Remove hardware offload override.
dpif-offload: Move the netdev_any_oor() API to dpif-offload.
netdev-offload: Remove the global netdev-offload API.
dpif-offload: Add inline flow APIs for userspace datapaths.
dpif_netdev: Fix nullable memcpy in queue_netdev_flow_put().
dpif-offload: Move offload_stats_get() API to dpif-offload.
dpif-offload-dpdk: Abstract rte_flow implementation from dpif-netdev.
dpif-offload-dummy: Add flow add/del/get APIs.
netdev-offload: Fold netdev-offload APIs and files into dpif-offload.
tests: Fix NSH decap header test for real Ethernet devices.
tests: Add a simple DPDK rte_flow test framework.
Documentation/topics/testing.rst | 19 +
include/openvswitch/json.h | 1 +
include/openvswitch/netdev.h | 1 +
lib/automake.mk | 17 +-
lib/dp-packet.h | 1 +
lib/dpctl.c | 50 +-
lib/dpdk.c | 2 -
lib/dpif-netdev-avx512.c | 4 +-
lib/dpif-netdev-private-flow.h | 9 +-
lib/dpif-netdev.c | 1244 +++---------
lib/dpif-netlink.c | 557 +----
...load-dpdk.c => dpif-offload-dpdk-netdev.c} | 592 ++++--
lib/dpif-offload-dpdk-private.h | 73 +
lib/dpif-offload-dpdk.c | 1186 +++++++++++
lib/dpif-offload-dummy.c | 920 +++++++++
lib/dpif-offload-provider.h | 421 ++++
...-offload-tc.c => dpif-offload-tc-netdev.c} | 238 ++-
lib/dpif-offload-tc-private.h | 76 +
lib/dpif-offload-tc.c | 877 ++++++++
lib/dpif-offload.c | 1790 +++++++++++++++++
lib/dpif-offload.h | 221 ++
lib/dpif-provider.h | 65 +-
lib/dpif.c | 166 +-
lib/dpif.h | 14 +-
lib/dummy.h | 9 +
lib/json.c | 7 +
lib/netdev-dpdk.c | 9 +-
lib/netdev-dpdk.h | 2 +-
lib/netdev-dummy.c | 199 +-
lib/netdev-linux.c | 3 +-
lib/netdev-offload-provider.h | 148 --
lib/netdev-offload.c | 910 ---------
lib/netdev-offload.h | 169 --
lib/netdev-provider.h | 10 +-
lib/netdev.c | 71 +-
lib/netdev.h | 22 +
lib/tc.c | 2 +-
ofproto/ofproto-dpif-upcall.c | 50 +-
ofproto/ofproto-dpif.c | 90 +-
tests/.gitignore | 3 +
tests/automake.mk | 24 +
tests/dpif-netdev.at | 40 +-
tests/ofproto-dpif.at | 170 ++
tests/ofproto-macros.at | 17 +-
tests/sendpkt.py | 12 +-
tests/system-dpdk-offloads-macros.at | 236 +++
tests/system-dpdk-offloads-testsuite.at | 28 +
tests/system-dpdk-offloads.at | 223 ++
tests/system-dpdk.at | 35 +
tests/system-kmod-macros.at | 5 +
tests/system-offloads-testsuite-macros.at | 5 +
tests/system-offloads-traffic.at | 48 +
tests/system-traffic.at | 9 +-
tests/system-userspace-macros.at | 5 +
vswitchd/bridge.c | 7 +-
vswitchd/vswitch.xml | 43 +
56 files changed, 7808 insertions(+), 3347 deletions(-)
rename lib/{netdev-offload-dpdk.c => dpif-offload-dpdk-netdev.c} (83%)
create mode 100644 lib/dpif-offload-dpdk-private.h
create mode 100644 lib/dpif-offload-dpdk.c
create mode 100644 lib/dpif-offload-dummy.c
create mode 100644 lib/dpif-offload-provider.h
rename lib/{netdev-offload-tc.c => dpif-offload-tc-netdev.c} (95%)
create mode 100644 lib/dpif-offload-tc-private.h
create mode 100644 lib/dpif-offload-tc.c
create mode 100644 lib/dpif-offload.c
create mode 100644 lib/dpif-offload.h
delete mode 100644 lib/netdev-offload-provider.h
delete mode 100644 lib/netdev-offload.c
delete mode 100644 lib/netdev-offload.h
create mode 100644 tests/system-dpdk-offloads-macros.at
create mode 100644 tests/system-dpdk-offloads-testsuite.at
create mode 100644 tests/system-dpdk-offloads.at
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev