This patch series aims to improve the performance of the management of hw-offloads in dpif-netdev. In the current version, some setup will experience high memory usage and poor latency between a flow decision and its execution regarding hardware offloading.
This series starts by measuring key metrics regarding both issues Those patches are introduced first to compare the current status with each improvements introduced. Offloads enqueued and inserted, as well as the latency from queue insertion to hardware insertion is measured. A new command 'ovs-appctl dpctl/offload-stats-show' is introduced to show the current measure. In my current performance test setup I am measuring an average latency hovering between 1~2 seconds. After the optimizations, it is reduced to 500~900 ms. Finally when using multiple threads and with proper driver support[1], it is measured in the order of 1 ms. A few modules are introduced: * An ID pool with reduced capabilities, simplifying its operations and allowing better performances in both single and multi-thread setup. * A lockless queue between PMDs / revalidators and offload thread(s). As the number of PMDs increases, contention can be high on the shared queue. This queue is designed to serve as message queue between threads. * A bounded lockless MPMC ring and some helpers for calculating moving averages. * A moving average module for Cumulative and Exponential moving averages. The netdev-offload-dpdk module is made thread-safe. Internal maps are made per-netdev instead, and locks are taken for shorter critical sections within the module. CI result: https://github.com/grivet/ovs/actions/runs/554918929 [1]: The rte_flow API was made thread-safe in the 20.11 DPDK release. Drivers that do not implement those operations concurrently are protected by a lock. Others will allow better concurrency, that improve the result of this series. v2: * Improved the MPSC queue API to simplify usage. * Moved flush operation from initiator thread to offload thread(s). This ensures offload metadata are shared only among the offload thread pool. * Flush operation needs additional thread synchronization. The ovs_barrier currently triggers a UAF. Add a unit-test to validate its operations and a fix for the UAF. CI result: https://github.com/grivet/ovs/actions/runs/741430135 The error comes from a failure to download 'automake' on osx, unrelated to any change in this series. v3: * Re-ordered commits so fixes are first. No conflict seen currently, but it might prevent them if some requested changes to the series were to move code in the same parts. * Modified the reduced quiescing the thread to use ovsrcu_quiesce(), and base next_rcu on the current time value (after quiescing happened, however long it takes). * Added Reviewed-by tags to the relevant commits. CI result: https://github.com/grivet/ovs/actions/runs/782655601 v4: * Modified the seq-pool to use batches of IDs with a spinlock instead of lockless rings. * The llring structure is removed. * Due to the length of the changes to the structure, some acked-by or reviewed-by were not ported to the id-fpool patch. CI result: https://github.com/grivet/ovs/actions/runs/921095015 v5: * Rebase on master. Conflicts were seen related to the vxlan-decap and pmd rebalance series. * Fix typo in xchg patch spotted by Maxime Coquelin. * Added Reviewed-by Maxime Coquelin on 4 patches. CI result: https://github.com/grivet/ovs/actions/runs/1212804378 Gaetan Rivet (27): ovs-thread: Fix barrier use-after-free dpif-netdev: Rename flow offload thread tests: Add ovs-barrier unit test netdev: Add flow API uninit function netdev-offload-dpdk: Use per-netdev offload metadata netdev-offload-dpdk: Implement hw-offload statistics read dpctl: Add function to read hardware offload statistics dpif-netdev: Rename offload thread structure mov-avg: Add a moving average helper structure dpif-netdev: Implement hardware offloads stats query ovs-atomic: Expose atomic exchange operation mpsc-queue: Module for lock-free message passing id-fpool: Module for fast ID generation netdev-offload: Add multi-thread API dpif-netdev: Quiesce offload thread periodically dpif-netdev: Postpone flow offload item freeing dpif-netdev: Use id-fpool for mark allocation dpif-netdev: Introduce tagged union of offload requests dpif-netdev: Execute flush from offload thread netdev-offload-dpdk: Use per-thread HW offload stats netdev-offload-dpdk: Lock rte_flow map access netdev-offload-dpdk: Protect concurrent offload destroy/query dpif-netdev: Use lockless queue to manage offloads dpif-netdev: Make megaflow and mark mappings thread objects dpif-netdev: Replace port mutex by rwlock dpif-netdev: Use one or more offload threads netdev-dpdk: Remove rte-flow API access locks lib/automake.mk | 5 + lib/dpctl.c | 36 ++ lib/dpif-netdev.c | 734 ++++++++++++++++++++++++-------- lib/dpif-netlink.c | 1 + lib/dpif-provider.h | 7 + lib/dpif.c | 8 + lib/dpif.h | 9 + lib/id-fpool.c | 279 ++++++++++++ lib/id-fpool.h | 66 +++ lib/mov-avg.h | 171 ++++++++ lib/mpsc-queue.c | 251 +++++++++++ lib/mpsc-queue.h | 190 +++++++++ lib/netdev-dpdk.c | 6 - lib/netdev-offload-dpdk.c | 274 ++++++++++-- lib/netdev-offload-provider.h | 4 + lib/netdev-offload.c | 92 +++- lib/netdev-offload.h | 21 + lib/ovs-atomic-c++.h | 3 + lib/ovs-atomic-clang.h | 5 + lib/ovs-atomic-gcc4+.h | 5 + lib/ovs-atomic-gcc4.7+.h | 5 + lib/ovs-atomic-i586.h | 5 + lib/ovs-atomic-locked.h | 9 + lib/ovs-atomic-msvc.h | 22 + lib/ovs-atomic-pthreads.h | 5 + lib/ovs-atomic-x86_64.h | 5 + lib/ovs-atomic.h | 8 +- lib/ovs-thread.c | 61 ++- lib/ovs-thread.h | 6 +- tests/automake.mk | 3 + tests/library.at | 14 + tests/test-barrier.c | 264 ++++++++++++ tests/test-id-fpool.c | 615 +++++++++++++++++++++++++++ tests/test-mpsc-queue.c | 772 ++++++++++++++++++++++++++++++++++ vswitchd/vswitch.xml | 16 + 35 files changed, 3749 insertions(+), 228 deletions(-) create mode 100644 lib/id-fpool.c create mode 100644 lib/id-fpool.h create mode 100644 lib/mov-avg.h create mode 100644 lib/mpsc-queue.c create mode 100644 lib/mpsc-queue.h create mode 100644 tests/test-barrier.c create mode 100644 tests/test-id-fpool.c create mode 100644 tests/test-mpsc-queue.c -- 2.31.1 _______________________________________________ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev