On 1/15/26 7:10 PM, Aaron Conole wrote:
> Ilya Maximets <[email protected]> writes:
> 
>> On 1/15/26 10:19 AM, Eelco Chaudron wrote:
>>> This patch series introduces a major architectural
>>> refactoring of Open vSwitch's hardware offload
>>> infrastructure. It replaces the tightly coupled
>>> `netdev-offload` implementation with a new, modular
>>> `dpif-offload-provider` framework.
>>>
>>> MOTIVATION
>>> -------------------------------------------------------------
>>> The existing `netdev-offload` API tightly couples datapath
>>> implementations (like `dpif-netdev`) with specific offload
>>> technologies (DPDK's rte_flow). This design has several
>>> limitations:
>>>
>>> - Rigid Architecture: It creates complex dependencies,
>>>   making the code difficult to maintain and extend.
>>>
>>> - Limited Flexibility: Supporting multiple offload backends
>>>   simultaneously or adding new ones is cumbersome.
>>>
>>> - Inconsistent APIs: The logic for handling different
>>>   offload types is scattered, leading to an inconsistent
>>>   and hard-to-follow API surface.
>>>
>>> This refactoring aims to resolve these issues by creating a
>>> clean separation of concerns, improving modularity, and
>>> establishing a clear path for future hardware offload
>>> integrations.
>>>
>>> PROPOSED SOLUTION: THE `DPIF-OFFLOAD-PROVIDER` FRAMEWORK
>>> -------------------------------------------------------------
>>> This series introduces the `dpif-offload-provider`
>>> framework, which functions similarly to the existing
>>> `dpif-provider` pattern. It treats hardware offload as a
>>> distinct layer with multiple, dynamically selectable
>>> backends.
>>>
>>> Key features of the new framework include:
>>>
>>> 1. Modular Architecture: A clean separation between the
>>>    generic datapath interface and specific offload
>>>    provider implementations (e.g., `dpif-offload-tc`,
>>>    `dpif-offload-dpdk`). `dpif` layers are now generic
>>>    clients of the offload API.
>>>
>>> 2. Provider-based System: Allows multiple offload backends
>>>    to coexist.
>>>
>>> 3. Unified and Asynchronous API: Establishes a consistent
>>>    API across all offload providers. For userspace
>>>    datapaths, the API is extended to support asynchronous
>>>    flow operations with callbacks, making `dpif-netdev` a
>>>    more efficient client.
>>>
>>> 4. Enhanced Configuration: Provides granular control over
>>>    offload provider selection through a global and per-port
>>>    priority system (`hw-offload-priority`), allowing
>>>    fine-tuned policies for different hardware.
>>>
>>> 5. Improved Testing: Includes a new test framework
>>>    specifically for validating DPDKs rte_flow offloads,
>>>    enhancing long-term maintainability.
>>>
>>> PATCH SERIES ORGANIZATION
>>> -------------------------------------------------------------
>>> This large series is organized logically to facilitate
>>> review:
>>>
>>> 1. Framework Foundation: The initial patches establish the
>>>    core `dpif-offload-provider` framework, including the
>>>    necessary APIs for port management, flow mark
>>>    allocation, configuration, and a dummy provider for
>>>    testing.
>>>
>>> 2. Provider Implementation: These patches introduce the new
>>>    `dpif-offload-tc` and `dpif-offload-dpdk` providers,
>>>    building out their specific implementations on top of
>>>    the new framework.
>>>
>>> 3. API Migration and Decoupling: The bulk of the series
>>>    systematically migrates functionality from the legacy
>>>    `netdev-offload` layer to the new providers. Key
>>>    commits here decouple `dpif-netdev` and, crucially,
>>>    `dpif-netlink` from their hardware offload
>>>    entanglements.
>>>
>>> 4. Cleanup: The final patches remove the now-redundant
>>>    global APIs and structures from `netdev-offload`,
>>>    completing the transition.
>>>
>>> BACKWARD COMPATIBILITY
>>> -------------------------------------------------------------
>>> This refactoring maintains full API compatibility from a
>>> user's perspective. All existing `ovs-vsctl` and
>>> `ovs-appctl` commands continue to function as before. The
>>> changes are primarily internal architectural improvements
>>> designed to make OVS more robust and extensible.
>>>
>>> REQUEST FOR COMMENTS
>>> -------------------------------------------------------------
>>> This is a significant architectural change that affects
>>> core OVS infrastructure. We welcome feedback on:
>>>
>>> - The overall architectural approach and the
>>>   `dpif-offload-provider` concept.
>>> - The API design, particularly the new asynchronous model
>>>   for `dpif-netdev`.
>>> - The migration strategy and any potential backward
>>>   compatibility concerns.
>>> - Performance implications of the new framework.
>>>
>>> -------------------------------------------------------------
>>> Changes from v1:
>>>   - Fixed issues reported by Aaron and my AI experiments.
>>>   - See individual patches for specific changes.
>>>
>>> Changes from v2:
>>>   - See individual patches for specific changes.
>>>
>>> Changes from v3:
>>>   - In patch 7, removed leftover netdev_close(port->netdev)
>>>     causing netdev reference count issues.
>>>   - Changed naming for dpif_offload_impl_type enum entries.
>>>   - Merged previous patch 36, 'dpif-netdev: Add full name
>>>     to the dp_netdev structure', with the next patch.
>>>
>>> Changes from v4:
>>>   - Applied review comments from Aaron, see individual patches.
>>>   - Added Eli's ACK to the individual patches.
>>>
>>> Changes from v5:
>>>   - Addressed Ilya's comments, see upstream discussion for
>>>     details.
>>>
>>> Eelco Chaudron (40):
>>>   dpif-offload-provider: Add dpif-offload-provider implementation.
>>>   dpif-offload: Add provider for tc offload.
>>>   dpif-offload: Add provider for dpdk (rte_flow).
>>>   dpif-offload: Allow configuration of offload provider priority.
>>>   dpif-offload: Move hw-offload configuration to dpif-offload.
>>>   dpif-offload: Add offload provider set_config API.
>>>   dpif-offload: Add port registration and management APIs.
>>>   dpif-offload-tc: Add port management framework.
>>>   dpif-offload-dpdk: Add port management framework.
>>>   dpif-offload: Validate mandatory port class callbacks on registration.
>>>   dpif-offload: Allow per-port offload provider priority config.
>>>   dpif-offload: Introduce provider debug information API.
>>>   dpif-offload: Call flow-flush netdev-offload APIs via dpif-offload.
>>>   dpif-offload: Call meter netdev-offload APIs via dpif-offload.
>>>   dpif-offload: Move the flow_get_n_flows() netdev-offload API to dpif.
>>>   dpif-offload: Move hw_post_process netdev API to dpif.
>>>   dpif-offload: Add flow dump APIs to dpif-offload.
>>>   dpif-offload: Move the tc flow dump netdev APIs to dpif-offload.
>>>   dpif-netlink: Remove netlink-offload integration.
>>>   dpif-netlink: Add API to get offloaded netdev from port_id.
>>>   dpif-offload: Add API to find offload implementation type.
>>>   dpif-offload: Add operate implementation to dpif-offload.
>>>   netdev-offload: Temporarily move thread-related APIs to dpif-netdev.
>>>   dpif-offload: Add port dump APIs to dpif-offload.
>>>   dpif-netdev: Remove indirect DPDK netdev offload API calls.
>>>   dpif: Add dpif_get_features() API.
>>>   dpif-offload: Add flow operations to dpif-offload-tc.
>>>   dpif-netlink: Remove entangled hardware offload.
>>>   dpif-offload-tc: Remove netdev-offload dependency.
>>>   netdev_dummy: Remove hardware offload override.
>>>   dpif-offload: Move the netdev_any_oor() API to dpif-offload.
>>>   netdev-offload: Remove the global netdev-offload API.
>>>   dpif-offload: Add inline flow APIs for userspace datapaths.
>>>   dpif_netdev: Fix nullable memcpy in queue_netdev_flow_put().
>>>   dpif-offload: Move offload_stats_get() API to dpif-offload.
>>>   dpif-offload-dpdk: Abstract rte_flow implementation from dpif-netdev.
>>>   dpif-offload-dummy: Add flow add/del/get APIs.
>>>   netdev-offload: Fold netdev-offload APIs and files into dpif-offload.
>>>   tests: Fix NSH decap header test for real Ethernet devices.
>>>   tests: Add a simple DPDK rte_flow test framework.
>>>
>>>  Documentation/topics/testing.rst              |   19 +
>>>  include/openvswitch/json.h                    |    1 +
>>>  include/openvswitch/netdev.h                  |    1 +
>>>  include/openvswitch/thread.h                  |    6 +
>>>  lib/automake.mk                               |   17 +-
>>>  lib/dp-packet.h                               |    1 +
>>>  lib/dpctl.c                                   |   50 +-
>>>  lib/dpdk.c                                    |    2 -
>>>  lib/dpif-netdev-avx512.c                      |    4 +-
>>>  lib/dpif-netdev-private-flow.h                |    9 +-
>>>  lib/dpif-netdev.c                             | 1243 +++---------
>>>  lib/dpif-netlink.c                            |  557 +-----
>>>  ...load-dpdk.c => dpif-offload-dpdk-netdev.c} |  592 ++++--
>>>  lib/dpif-offload-dpdk-private.h               |   73 +
>>>  lib/dpif-offload-dpdk.c                       | 1114 +++++++++++
>>>  lib/dpif-offload-dummy.c                      |  890 +++++++++
>>>  lib/dpif-offload-provider.h                   |  412 ++++
>>>  ...-offload-tc.c => dpif-offload-tc-netdev.c} |  229 ++-
>>>  lib/dpif-offload-tc-private.h                 |   73 +
>>>  lib/dpif-offload-tc.c                         |  810 ++++++++
>>>  lib/dpif-offload.c                            | 1773 +++++++++++++++++
>>>  lib/dpif-offload.h                            |  227 +++
>>>  lib/dpif-provider.h                           |   66 +-
>>>  lib/dpif.c                                    |  164 +-
>>>  lib/dpif.h                                    |   14 +-
>>>  lib/dummy.h                                   |    9 +
>>>  lib/json.c                                    |    7 +
>>>  lib/netdev-dpdk.c                             |    9 +-
>>>  lib/netdev-dpdk.h                             |    2 +-
>>>  lib/netdev-dummy.c                            |  199 +-
>>>  lib/netdev-linux.c                            |    3 +-
>>>  lib/netdev-offload-provider.h                 |  148 --
>>>  lib/netdev-offload.c                          |  910 ---------
>>>  lib/netdev-offload.h                          |  169 --
>>>  lib/netdev-provider.h                         |   10 +-
>>>  lib/netdev.c                                  |   71 +-
>>>  lib/netdev.h                                  |   22 +
>>>  lib/tc.c                                      |    2 +-
>>>  ofproto/ofproto-dpif-upcall.c                 |   50 +-
>>>  ofproto/ofproto-dpif.c                        |   82 +-
>>>  tests/.gitignore                              |    3 +
>>>  tests/automake.mk                             |   24 +
>>>  tests/dpif-netdev.at                          |   37 +-
>>>  tests/ofproto-dpif.at                         |  170 ++
>>>  tests/ofproto-macros.at                       |   17 +-
>>>  tests/sendpkt.py                              |   12 +-
>>>  tests/system-dpdk-offloads-macros.at          |  236 +++
>>>  tests/system-dpdk-offloads-testsuite.at       |   28 +
>>>  tests/system-dpdk-offloads.at                 |  223 +++
>>>  tests/system-dpdk.at                          |   35 +
>>>  tests/system-kmod-macros.at                   |    5 +
>>>  tests/system-offloads-testsuite-macros.at     |    5 +
>>>  tests/system-offloads-traffic.at              |   46 +
>>>  tests/system-traffic.at                       |    9 +-
>>>  tests/system-userspace-macros.at              |    5 +
>>>  vswitchd/bridge.c                             |    7 +-
>>>  vswitchd/vswitch.xml                          |   44 +
>>>  57 files changed, 7601 insertions(+), 3345 deletions(-)
>>>  rename lib/{netdev-offload-dpdk.c => dpif-offload-dpdk-netdev.c} (83%)
>>>  create mode 100644 lib/dpif-offload-dpdk-private.h
>>>  create mode 100644 lib/dpif-offload-dpdk.c
>>>  create mode 100644 lib/dpif-offload-dummy.c
>>>  create mode 100644 lib/dpif-offload-provider.h
>>>  rename lib/{netdev-offload-tc.c => dpif-offload-tc-netdev.c} (95%)
>>>  create mode 100644 lib/dpif-offload-tc-private.h
>>>  create mode 100644 lib/dpif-offload-tc.c
>>>  create mode 100644 lib/dpif-offload.c
>>>  create mode 100644 lib/dpif-offload.h
>>>  delete mode 100644 lib/netdev-offload-provider.h
>>>  delete mode 100644 lib/netdev-offload.c
>>>  delete mode 100644 lib/netdev-offload.h
>>>  create mode 100644 tests/system-dpdk-offloads-macros.at
>>>  create mode 100644 tests/system-dpdk-offloads-testsuite.at
>>>  create mode 100644 tests/system-dpdk-offloads.at
>>>
>>
>> Thanks, Eelco for the update!  I read through the set again and it
>> reads much easier after the renames.  There are still a few small
>> style issues throughout the set, but they are all minor like an
>> extra empty line at the EOF of dpif-offload.c in patch 7, a few
>> more places with a missing double space or underscore or the thread
>> safety annotations not on a separate line.  There is also a missed
>> rename of dpif_offload_flow_get_n_offloaded() into something like
>> dpif_ofload_flow_count(), since the underlying callback was renamed.
>> And the provider_collection_add() should return a positive errno
>> after all, since it's passed directly into ovs_strerror().
>>
>> As discussed off-list, instead of me writing all of these nits down
>> in the emails and then you fixing them, it's simpler if I just
>> handle the merge and fix the nits on commit.
> 
> Thanks for the work on that.
> 
>> Also, found a couple more issues that we should address:
>>
>> 1. In the first patch, the dpif_offload_dump_next() function is
>>    describing a case that should be impossible, but still tries
>>    to handle it.  We shouldn't do that.  And instead we need to
>>    just take the collection reference in dpif_offload_dump_start()
>>    to make sure the collection doesn't go away during the dump.
>>    And then we can fully rely on the LIST_FOR_EACH_CONTINUE.
>>
>> 2. The per-port priority configuration should be per-interface
>>    instead as datapath doesn't deal with ports, it only works
>>    with interfaces.  And some ports, like bonds can have multiple
>>    interfaces.  It may not be a big deal as in a normal case
>>    I don't think we would need different offload for interfaces
>>    in the same port, but it's not really correct to configure
>>    ports, when it is applied on the interface.
>>
>> And there are two things that I flagged in v5, but we decided to
>> handle separately:
>>
>> 3. Mass-rename of offload provider structures and functions to
>>    make them coherent and shorter than they are.  Most of them
>>    are not exported and don't need extensive prefixes.  And these
>>    prefixes need unification after moving stuff between netdev*
>>    and dpif* modules.
>>
>> 4. Move of the port manager instances from provider-specific
>>    structures to the common struct dpif_offload.  This should
>>    eliminate some duplication and make the module boundaries
>>    more clear.  This should also eliminate the need for the
>>    port dump API or at least significantly simplify it, e.g.
>>    by just iterating over port cmap with a cmap_cursor.
>>
>>
>> So, the situation is the following: We have branching planned for
>> tomorrow.  Full CI run takes about 24 hours.  So, we don't really
>> have time for v7.  We could postpone branching, but it also seems
>> unreasonable as we only really need changes in about 4 patches
>> out of 40, and none of the issues listed above are critical or
>> breaking any functionality.  They are mostly internal code movements.
>> It would be much easier to get the set merged and then work on the
>> 4 items above next week and backport to the new branch.
> 
> +1 to that.
> 
>> As mentioned earlier, to save some time, I can handle the merging
>> and fix all the small nits on commit instead of wasting a lot of
>> time on writing them down on the mailing list and then Eelco
>> fixing them.
>>
>> Eelco, Aaron, does that sound like a good plan?  Or did I miss
>> anything in my summary?
> 
> <gimli-voice/> And my ACKs :)

Thanks!

> 
> (if it makes it easier, I can go through and send them or you can add
>  them during your merge)

I'll be doing an interactive rebase for all the nit fixing anyway, so I
can add the ACks during that process.

> 
>> Best regards, Ilya Maximets.
> 

_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to