On Mon, 16 Mar 2026 17:27:28 +0000 Anatoly Burakov <[email protected]> wrote:
> Most rte_flow parsers in DPDK suffer from huge implementation complexity
> because even though 99% of what people use rte_flow parsers for is parsing
> protocol graphs, no parser is written explicitly as a graph. This patchset
> attempts to suggest a viable model to build rte_flow parsers as graphs, by
> offering a lightweight header-only library to build rte_flow parsing graphs
> without too much boilerplate and complexity.
>
> Most of the patchset is about Intel drivers, but they are meant as
> reimplementations as well as examples for the rest of the community to
> assess how to build parsers using this new infrastructure. I expect the
> first two patches will be of most interest to non-Intel reviewers, as they
> deal with building two reusable parser architecture pieces.
>
> The first piece is a new flow graph helper in ethdev. Its purpose is
> deliberately narrow: it targets the protocol-graph part of rte_flow pattern
> parsing, where drivers walk packet headers and validate legal item
> sequences and parameters. That does not cover all possible rte_flow
> features, especially more exotic flow items, but it does cover a large and
> widely shared part of what existing drivers need to do. In other words, the
> only flow items this infrastructure *doesn't* cover are those that do not
> lend themselves well to being parsed as a graph of protocol headers (e.g.
> conntrack items). Everything else should be covered or cover-able.
>
> The second piece is a reusable flow engine framework for Intel Ethernet
> drivers. This is kept Intel-local for now because I do not feel it is
> generic enough to be presented as an ethdev-wide engine model. Even so,
> the intent is to establish a cleaner parser architecture with a defined
> interaction model, explicit memory ownership rules, and engine definitions
> that do not block secondary-process-safe usage.
> It is my hope that we could also promote something like this into ethdev
> proper and remove the necessity for drivers to build so much boilerplate
> around rte_flow parsing (and, more often than not, doing it in a way that
> is more complex than it needs to be).
>
> Most of the rest of the series is parser reimplementation, but that is
> mainly the vehicle for demonstrating and validating those two pieces.
> ixgbe and i40e are wired into the new common parsing path, and their
> existing parsers are migrated incrementally to the graph-based model.
> Besides reducing ad hoc parser code, this also makes validation more
> explicit and more consistent with the actual install path. In a few places
> that means invalid inputs that were previously ignored, deferred, or
> interpreted loosely are now rejected earlier and more strictly, without
> any increase in code complexity (in fact, with a marked *decrease* of it!).
>
> Series depends on previously submitted patchsets:
>
> - IAVF global buffer fix [1]
> - Common attr parsing stuff [2]
>
> [1] https://patches.dpdk.org/project/dpdk/list/?series=37585&state=*
> [2] https://patches.dpdk.org/project/dpdk/list/?series=37663&state=*
>
> Anatoly Burakov (21):
>   ethdev: add flow graph API
>   net/intel/common: add flow engines infrastructure
>   net/intel/common: add utility functions
>   net/ixgbe: add support for common flow parsing
>   net/ixgbe: reimplement ethertype parser
>   net/ixgbe: reimplement syn parser
>   net/ixgbe: reimplement L2 tunnel parser
>   net/ixgbe: reimplement ntuple parser
>   net/ixgbe: reimplement security parser
>   net/ixgbe: reimplement FDIR parser
>   net/ixgbe: reimplement hash parser
>   net/i40e: add support for common flow parsing
>   net/i40e: reimplement ethertype parser
>   net/i40e: reimplement FDIR parser
>   net/i40e: reimplement tunnel QinQ parser
>   net/i40e: reimplement VXLAN parser
>   net/i40e: reimplement NVGRE parser
>   net/i40e: reimplement MPLS parser
>   net/i40e: reimplement gtp parser
>   net/i40e: reimplement L4 cloud parser
>   net/i40e: reimplement hash parser
>
>  drivers/net/intel/common/flow_engine.h        | 1003 ++++
>  drivers/net/intel/common/flow_util.h          |  165 +
>  drivers/net/intel/i40e/i40e_ethdev.c          |   56 +-
>  drivers/net/intel/i40e/i40e_ethdev.h          |   49 +-
>  drivers/net/intel/i40e/i40e_fdir.c            |   47 -
>  drivers/net/intel/i40e/i40e_flow.c            | 4092 +----
>  drivers/net/intel/i40e/i40e_flow.h            |   44 +
>  drivers/net/intel/i40e/i40e_flow_ethertype.c  |  258 ++
>  drivers/net/intel/i40e/i40e_flow_fdir.c       | 1806 ++++
>  drivers/net/intel/i40e/i40e_flow_hash.c       | 1289 ++++++
>  drivers/net/intel/i40e/i40e_flow_tunnel.c     | 1510 ++++
>  drivers/net/intel/i40e/i40e_hash.c            |  980 +---
>  drivers/net/intel/i40e/i40e_hash.h            |    8 +-
>  drivers/net/intel/i40e/meson.build            |    4 +
>  drivers/net/intel/ixgbe/ixgbe_ethdev.c        |   40 +-
>  drivers/net/intel/ixgbe/ixgbe_ethdev.h        |   13 +-
>  drivers/net/intel/ixgbe/ixgbe_fdir.c          |   13 +-
>  drivers/net/intel/ixgbe/ixgbe_flow.c          | 3130 +------
>  drivers/net/intel/ixgbe/ixgbe_flow.h          |   38 +
>  .../net/intel/ixgbe/ixgbe_flow_ethertype.c    |  240 +
>  drivers/net/intel/ixgbe/ixgbe_flow_fdir.c     | 1510 ++++
>  drivers/net/intel/ixgbe/ixgbe_flow_hash.c     |  182 +
>  drivers/net/intel/ixgbe/ixgbe_flow_l2tun.c    |  228 +
>  drivers/net/intel/ixgbe/ixgbe_flow_ntuple.c   |  483 ++
>  drivers/net/intel/ixgbe/ixgbe_flow_security.c |  297 ++
>  drivers/net/intel/ixgbe/ixgbe_flow_syn.c      |  280 ++
>  drivers/net/intel/ixgbe/meson.build           |    7 +
>  lib/ethdev/meson.build                        |    1 +
>  lib/ethdev/rte_flow_graph.h                   |  414 ++
>  29 files changed, 9867 insertions(+), 8320 deletions(-)
>  create mode 100644 drivers/net/intel/common/flow_engine.h
>  create mode 100644 drivers/net/intel/common/flow_util.h
>  create mode 100644 drivers/net/intel/i40e/i40e_flow.h
>  create mode 100644 drivers/net/intel/i40e/i40e_flow_ethertype.c
>  create mode 100644 drivers/net/intel/i40e/i40e_flow_fdir.c
>  create mode 100644 drivers/net/intel/i40e/i40e_flow_hash.c
>  create mode 100644 drivers/net/intel/i40e/i40e_flow_tunnel.c
>  create mode 100644 drivers/net/intel/ixgbe/ixgbe_flow.h
>  create mode 100644 drivers/net/intel/ixgbe/ixgbe_flow_ethertype.c
>  create mode 100644 drivers/net/intel/ixgbe/ixgbe_flow_fdir.c
>  create mode 100644 drivers/net/intel/ixgbe/ixgbe_flow_hash.c
>  create mode 100644 drivers/net/intel/ixgbe/ixgbe_flow_l2tun.c
>  create mode 100644 drivers/net/intel/ixgbe/ixgbe_flow_ntuple.c
>  create mode 100644 drivers/net/intel/ixgbe/ixgbe_flow_security.c
>  create mode 100644 drivers/net/intel/ixgbe/ixgbe_flow_syn.c
>  create mode 100644 lib/ethdev/rte_flow_graph.h

I ran a patch review with Claude and the current AGENTS.md file. Found some
bugs right away.

---

I'll start by reading the AGENTS.md from the project and then examine the
uploaded patch.

This is a 21-patch series. Let me split it and read through each patch
systematically.

Let me start with the core API patch (01/21), which is the most important one
to review carefully.

Now let me read patches 02 and 03 (infrastructure):

Now let me quickly scan the driver implementation patches for correctness
issues:

This is a massive series. Let me focus on the key correctness issues rather
than reading every line of every patch. Let me check the most critical areas -
the core API and the flow engine infrastructure - for bugs.

I found a key issue. Let me verify it:

Good, I see the issue I suspected. Now let me check the
`__flow_graph_visit_node` call with `NULL` item for the start node:

Now let me check `__flow_graph_visit_node` to see what happens when `item` is
NULL (called for the START node):

Now let me also look at `ci_flow_engine_conf_reset` for the use-after-free I
noticed:

I see the use-after-free: `ci_flow_free(engine, flow)`, then
`TAILQ_REMOVE(&engine_conf->flows, flow, node)` - the flow is freed before
being removed from the list. Now let me check `ci_flow_create` for the same:

I see another issue: `ci_flow_create` never inserts the flow into
`engine_conf->flows` TAILQ on success.
The flow is allocated, parsed, and installed, but never added to the flow
list. This means `ci_flow_destroy` and `ci_flow_flush` will never find it.
Let me also check the i40e patches quickly for any additional issues:

Now let me check the `__rte_internal` on `static inline` functions - that's
unusual:

Now let me write up the review. I've identified several significant issues
across this 21-patch series.

---

**Review of [RFC PATCH v1 00/21] Building a better rte_flow parser**

**Patch 01/21: ethdev: add flow graph API**

**Error: NULL pointer dereference in `__flow_graph_node_is_expected` when
called from start node**

`rte_flow_graph_parse()` calls `__flow_graph_visit_node(cur_node, ctx, NULL,
error)` for the START node. `__flow_graph_visit_node` passes this NULL `item`
to `__flow_graph_node_is_expected()`, which unconditionally dereferences it
(`item->spec`, `item->mask`, `item->last`) when `constraints !=
RTE_FLOW_NODE_EXPECT_NONE`. If any start node has non-NONE constraints, this
is a NULL dereference. In practice the existing start nodes all use
`EXPECT_NONE`, so the early return saves it, but the API is fragile: a single
misconfigured graph will crash. Either guard with `if (item == NULL) return
true;` before the dereferences, or document the contract that start nodes
must always be `EXPECT_NONE`.

**Error: Use-after-free in `ci_flow_engine_conf_reset` (patch 02/21)**

In the flow cleanup loop:

```c
RTE_TAILQ_FOREACH_SAFE(flow, &engine_conf->flows, node, tmp) {
	engine = ci_flow_engine_find(engine_list, flow->engine_type);
	ci_flow_free(engine, flow);                    /* frees flow */
	TAILQ_REMOVE(&engine_conf->flows, flow, node); /* accesses freed flow */
}
```

`ci_flow_free()` frees `flow`, then `TAILQ_REMOVE` dereferences `flow->node`.
The remove must come before the free.

**Error: Flow never added to TAILQ in `ci_flow_create` (patch 02/21)**

`ci_flow_create()` allocates, parses, and installs a flow, but never calls
`TAILQ_INSERT_TAIL(&engine_conf->flows, flow, node)`.
This means `ci_flow_destroy` and `ci_flow_flush` will never find the flow
(the `ci_flow_is_valid` TAILQ walk won't match), so flows are leaked on
destroy/flush. The success path needs:

```c
if (ret == 0) {
	flow->dev = engine_conf->dev;
	flow->engine_type = engine->type;
	TAILQ_INSERT_TAIL(&engine_conf->flows, flow, node);
	goto unlock;
}
```

**Warning: `__rte_internal` on `static inline` functions is a no-op (patch
01/21)**

`__rte_internal` is a visibility/symbol-export annotation meaningful only for
non-static functions that appear in linker symbol maps. On `static inline`
functions in a header, it does nothing. Either drop the annotation (since
these are header-only and already `static inline`), or if the intent is to
mark them as internal API, use a comment instead.

**Warning: `enum rte_flow_graph_node_expect` brace on wrong line (patch
01/21)**

The opening brace is on its own line (`enum ... \n{`). DPDK style places it
on the same line as `enum`.

**Patch 02/21: net/intel/common: add flow engines infrastructure**

**Warning: `ci_flow_alloc` erases only the `struct ci_flow` base, not the
full allocation**

When the custom allocator succeeds, the code does `*flow = (struct
ci_flow){0}`, which only zeroes the base struct. The engine-specific fields
beyond `ci_flow` in the allocation are left with whatever the custom
allocator returned. If the custom allocator returns uninitialized (not
zeroed) memory, those engine-specific fields are garbage. The framework
comment says "Custom allocators only own their own engine-specific fields",
but the intent seems to be that the framework zeroes its own fields;
documenting this more clearly, or using `memset(flow, 0, sizeof(struct
ci_flow))` explicitly, would be safer.

**Patch 03/21: net/intel/common: add utility functions**

**Warning: `ci_be24_to_cpu` uses `int` arithmetic on `uint8_t` shift**

`val[0] << 16` promotes `val[0]` to `int` (signed). If the compiler's `int`
were 16 bits (hypothetically), this would be undefined.
In practice, `int` is 32 bits on all DPDK targets, so it's fine, but
`(uint32_t)val[0] << 16` would be more precise.

**Patch 12/21: net/i40e: add support for common flow parsing**

**Warning: i40e's `struct rte_flow` has both `struct ci_flow base` and
`TAILQ_ENTRY(rte_flow) node`**

The `ci_flow` base already contains a `TAILQ_ENTRY(ci_flow) node`. Having a
second TAILQ_ENTRY in the i40e `rte_flow` struct means there are two
independent list linkages. The old code uses `flow->node` via the legacy
path; the new engine uses `flow->base.node` via `ci_flow`. This works during
the transition but is confusing and wastes space. The legacy TAILQ_ENTRY
should be removed once migration is complete.

**General observation across patches 04-21:**

The series progressively migrates each legacy flow parser to the new
graph-based engine, which is a good incremental approach. The individual
engine implementations (ethertype, syn, L2 tunnel, ntuple, security, FDIR,
hash) appear to correctly translate the old hand-rolled parsing into graph
nodes with validate/process callbacks. The graph definitions look
structurally correct, with proper edge termination sentinels.

The two critical bugs (the use-after-free in reset and the missing TAILQ
insert in create) and the NULL dereference risk in the graph parser should be
fixed before this moves beyond RFC.

Reviewed-by: Stephen Hemminger <[email protected]>

