Replace the FIFO circular buffer in rte_graph with a bitmap and a
priority-sorted schedule table. Nodes with lower priority values are
visited first during the graph walk. Source nodes are forced to
INT16_MIN. Topological depth from BFS ensures upstream-before-downstream
ordering when all priorities are equal.

In fan-out-then-converge topologies, setting a higher priority (lower
value) on branch nodes avoids redundant visits to the converge node.
The diamond perf test shows a ~10% throughput improvement (converge
visited once at 256 objs/call instead of twice at 128).

Changes v1 -> v2:
 - added diamond perf test into separate preparatory patch
 - added topological depth (topo_order) as secondary sort key to preserve
   FIFO-like upstream-before-downstream ordering for default priorities,
   preventing a regression in the reverse tree test
 - restored idx == 0 guard on bitmap set in the enqueue path to avoid a
   ~15% throughput regression caused by touching the pending bitmap
   cache line on every enqueue call
 - use hiprio worker nodes in diamond test to demonstrate the actual
   priority-based scheduling benefit
 - added performance numbers measured before and after

Cc: Christophe Fontaine <[email protected]>
Cc: David Marchand <[email protected]>
Cc: Jerin Jacob <[email protected]>
Cc: Kiran Kumar Kokkilagadda <[email protected]>
Cc: Konstantin Ananyev <[email protected]>
Cc: Maxime Leroy <[email protected]>
Cc: Nithin Kumar Dabilpuram <[email protected]>
Cc: Vladimir Medvedkin <[email protected]>
Cc: Zhirun Yan <[email protected]>

Robin Jarry (2):
  graph: add diamond topology performance test
  graph: replace circular buffer with priority-based bitmap

 app/test/test_graph_perf.c                    |  140 +-
 doc/guides/prog_guide/graph_lib.rst           |   37 +-
 .../prog_guide/img/graph_mem_layout.svg       | 1823 +++++++----------
 lib/graph/graph.c                             |   27 +-
 lib/graph/graph_debug.c                       |   12 +-
 lib/graph/graph_ops.c                         |   46 +
 lib/graph/graph_populate.c                    |  119 +-
 lib/graph/graph_private.h                     |   42 +-
 lib/graph/node.c                              |    2 +
 lib/graph/rte_graph.h                         |    1 +
 lib/graph/rte_graph_model_mcore_dispatch.h    |   34 +-
 lib/graph/rte_graph_model_rtc.h               |   65 +-
 lib/graph/rte_graph_worker.h                  |    2 +-
 lib/graph/rte_graph_worker_common.h           |   79 +-
 14 files changed, 1194 insertions(+), 1235 deletions(-)

-- 
2.54.0

Reply via email to