This patch series introduces rule offload functionality to dpif-netlink via
the netdev ports' new flow offloading API. The user can specify whether to
enable rule offloading via the OVS configuration. Netdev providers can
implement the netdev flow offload API in order to offload rules.
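
To illustrate the provider hook idea described above, here is a minimal,
self-contained C sketch. All names in it (sketch_netdev, sketch_flow_api,
sketch_try_offload and so on) are hypothetical stand-ins, not the identifiers
used in this series, and the hw_capable field merely simulates a device
accepting or rejecting a rule. It only shows the general shape: a provider
optionally supplies offload callbacks, and the caller falls back to the
software datapath when offloading is disabled, unimplemented, or refused.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; not the real OVS structures. */
struct sketch_flow {
    unsigned int id;     /* stand-in for a flow's unique id */
    bool hw_capable;     /* simulates "the device can match this rule" */
};

struct sketch_netdev;

/* Optional per-provider offload hooks; NULL means "not implemented". */
struct sketch_flow_api {
    int (*flow_put)(struct sketch_netdev *, const struct sketch_flow *);
    int (*flow_del)(struct sketch_netdev *, unsigned int id);
};

struct sketch_netdev {
    const struct sketch_flow_api *api;  /* NULL if no offload support */
    unsigned int n_offloaded;           /* rules accepted by "hardware" */
};

/* Mirrors the "disabled by default" behaviour of the hw-offload option. */
static bool sketch_hw_offload_enabled = false;

/* Example provider callback: accepts only rules the device can express. */
static int
sketch_tc_flow_put(struct sketch_netdev *dev, const struct sketch_flow *flow)
{
    if (!flow->hw_capable) {
        return EOPNOTSUPP;
    }
    dev->n_offloaded++;
    return 0;
}

/* Returns true if the rule was offloaded; false means the caller should
 * install the rule in the software datapath instead. */
static bool
sketch_try_offload(struct sketch_netdev *dev, const struct sketch_flow *flow)
{
    assert(dev != NULL && flow != NULL);

    if (!sketch_hw_offload_enabled || !dev->api || !dev->api->flow_put) {
        return false;
    }
    return dev->api->flow_put(dev, flow) == 0;
}
```

In this sketch a false return never reports an error to the user; it just
means the rule stays in software, which is one plausible way to keep
offloading transparent.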

This patch series also implements one offload scheme for netdev-linux, using
the TC flower classifier, which was chosen because it is fairly natural to
express OVS DP rules for this classifier. However, the code can be extended
to support other classifiers, such as U32 and eBPF, which support offload as
well.

The use case we are currently addressing is the new SR-IOV switchdev mode in
the Linux kernel, which was introduced in version 4.8. This series was tested
against SR-IOV VF vport representors of the Mellanox 100G ConnectX-4 series
exposed by the mlx5 kernel driver.

The feature is disabled by default and can be enabled by setting
other_config:hw-offload to True. Currently, offloading is done for rules that
match on L2 MAC, L3 IP (not including IP flags), L4 TCP/UDP (not including
TCP flags), VLAN, and VXLAN tunnel. HW offloading is supported only for drop
and for a single output action, optionally with VLAN push/pop and VXLAN
encap/decap.

V10->V11
    - fix parse/dump with vlan+ip
    - consolidate params in netdev_tc_flow_put
    - updates to flow api doc in netdev-provider.h

    Travis: https://travis-ci.org/roidayan/ovs/builds/242455077
    AppVeyor: https://ci.appveyor.com/project/roidayan/ovs/build/1.0.26

V9->V10
    - fix wrong tunnel info in flow dump
    - add rules to ingress instead of ffff: parent (match to tc userspace)
    - move some tc wrappers to be static inline in tc.h
    - remove redundant includes in tc and dpif-netlink.c
    - fix order in automake.mk
    - fix order of includes in tc
    - consolidate params in netdev_flow_dump_next
    - log the filter type we fail on
    - fix line lengths
    - add ovs_requires attr to netdev_ports_lookup
    - fix log typo
    - use a macro for converting jiffies to ms in tc.c
    - use put_32aligned_u64 to update stats
    - add missing netdev_hmap_mutex locks

    Travis: https://travis-ci.org/roidayan/ovs/builds/240750652
    AppVeyor: https://ci.appveyor.com/project/roidayan/ovs/build/1.0.25

V8->V9
    - Refactor rl_err/parse_err to error_rl
    - Refactor logical if statement in nl_parse_act_drop
    - Refactor tca_flower_policy and tunnel_key_policy
    - do not ignore dst mac mask
    - add a note that hw-offload is supported on Linux systems
    - split tc.c commit into multiple commits
    - fix last_prio type in get_prio_for_tc_flower
    - log error if we exhaust last_prio
    - fix netdev_tc_flow_put args order
    - fix netdev_flow_dump_.. style consistency
    - cleanup includes in netdev-tc-offloads.c
    - fix memory leak parsing ovs-dpctl type/filter
    - return error on invalid dump flows type
    - indicate if flows are offloaded in ovs-dpctl dump flows
    - add note about hw offloading to NEWS
    - move compat commit before flower to avoid breaking bisection

    Travis: https://travis-ci.org/roidayan/ovs/builds/236869671
    AppVeyor: https://ci.appveyor.com/project/roidayan/ovs/build/1.0.21

    Not done:
    - refactor netdev_ports_* functions to accept a typed obj instead of
      void*
    - Able to change hw-offload and tc-policy at runtime (commit ready but
      held for now; will send later)
    - Add support for SCTP.

V7->V8
    - Refactor dpif logging functions and use them in dpif-netlink
    - Ignore internal devices from netdev hashmap
    - Refactor netdev hmap naming to prefix netdev_ports_*
    - Use single hashmap with 2 nodes for ufid/tc mapping
    - Verify ifindex is valid in netdev_hmap_port_add
    - Close netdev in netdev_tc_flow_get error flow
    - Improve comments for flow offload api
    - Reorder flow api output args to be last args
    - Remove redundant netdev_flow_support
    - Fix using uninitialized var 's'

    Not done:
    refactor netdev_ports_* functions to accept a typed obj (e.g.
    netdev_ports struct) instead of void*. We can do it as a follow-up
    commit later.

V6->V7:
    - Fix L3 IPv4 matching that got broken
    - Refactor offloads test and testsuite to have same prefix
    - Better handling of unsupported match attributes

V5->V6:
    - Rebase over master branch, fix compilation issue
    - Add Nicira copyright to tc interface

V4->V5:
    - Fix compat
    - Fix VXLAN IPv6 tunnel matching
    - Fix order of actions in dump flows
    - Update ovs-dpctl man page about the addition of type to dump-flows

    Travis https://travis-ci.org/roidayan/ovs/builds/213735371
    AppVeyor https://ci.appveyor.com/project/roidayan/ovs/build/1.0.18

V3->V4:
    - Move declarations to the right commit with implementation
    - Fix tc_get_flower flow returning false success
    - Fix memory leaks - not releasing tc_transact replies
    - Fix travis failure for OSX compilation
    - Fix loop in dpif_netlink_flow_dump_next
    - Fix declared default value for tc-policy in vswitch.xml
    - Refactor loop in netdev_tc_flow_dump_next
    - Add missing error checks in parse_flow_put
    - Fix handling of a modify request where the old rule is in hw and the
      new rule is not supported and needs to be in sw.
    - Use 2 hashmaps instead of 1 for faster reverse lookup of ufid from tc
    - Init ports when enabling hw-offload after OVS is running

    TODO:
    - Fix breaking of datapath compilation
    - Fix testsuite failures

    Travis https://travis-ci.org/roidayan/ovs/builds/210549325
    AppVeyor https://ci.appveyor.com/project/roidayan/ovs/build/1.0.15

V2->V3:
    - Code styling fixes
    - Bug fixes
    - Using already available macros/functions to match current OVS code
    - Refactored code according to V2 review
    - Replaced bool option skip-hw with string option tc-policy
    - Added hw offload tests using policy skip_hw
    - Fixed most compatibility compiling issues
    - Travis https://travis-ci.org/roidayan/ovs/builds/199610124
    - AppVeyor https://ci.appveyor.com/project/roidayan/ovs/build/1.0.14
    - Fixed compiling with DPDK enabled

    TODO:
    - need to fix datapath compiling issues found in travis after adding tc
      compatibility headers
    - need to fix failing test cases because of get_ifindex

V1->V2:
    - Added generic netdev flow offloads API.
    - Implemented relevant flow API in netdev-linux (and netdev-vport).
    - Added an other_config hw-offload option to enable offloading (defaults
      to false).
    - Fixed coding style to conform with OVS.
    - Policy removed for now. (How best to implement it will be discussed
      later.)
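
As an illustration of the tc-policy string option that replaced the skip-hw
bool in V3, the sketch below maps a policy string to classifier flags. The
function name, the accepted strings other than skip_hw ("none" and
"skip_sw"), and treating an unset value as "none" are assumptions made for
this example and are not taken from the series. The two flag values mirror
TCA_CLS_FLAGS_SKIP_HW and TCA_CLS_FLAGS_SKIP_SW from the kernel's
linux/pkt_cls.h, redefined locally so the sketch builds anywhere.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Local copies of the kernel's TCA_CLS_FLAGS_SKIP_HW / _SKIP_SW values. */
#define SKETCH_CLS_FLAGS_SKIP_HW (1u << 0)
#define SKETCH_CLS_FLAGS_SKIP_SW (1u << 1)

/* Hypothetical helper: translates a tc-policy string into flower flags.
 * Returns 0 on success and -1 for an unrecognized policy, in which case
 * *flags is left untouched. */
static int
sketch_parse_tc_policy(const char *policy, uint32_t *flags)
{
    assert(flags != NULL);

    /* Assumption: an unset option behaves like "none", i.e. the rule may
     * live in both software and hardware. */
    if (!policy || !strcmp(policy, "none")) {
        *flags = 0;
        return 0;
    }
    if (!strcmp(policy, "skip_hw")) {
        *flags = SKETCH_CLS_FLAGS_SKIP_HW;  /* software only */
        return 0;
    }
    if (!strcmp(policy, "skip_sw")) {
        *flags = SKETCH_CLS_FLAGS_SKIP_SW;  /* hardware only */
        return 0;
    }
    return -1;
}
```

A string option leaves room for more than two states, which a plain bool
such as skip-hw could not express.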

Thanks,
Paul & Roi

Paul Blakey (24):
  tc: Introduce tc module
  netdev: Adding a new netdev API to be used for offloading flows
  other-config: Add hw-offload switch to control netdev flow offloading
  other-config: Add tc-policy switch to control tc flower flag
  dpif: Save added ports in a port map for netdev flow api use
  dpif-netlink: Flush added ports using netdev flow api
  netdev-tc-offloads: Implement netdev flow flush using tc interface
  dpif-netlink: Dump netdevs flows on flow dump
  netdev-tc-offloads: Add ufid to tc/netdev map
  netdev-tc-offloads: Implement netdev flow dump api using tc interface
  dpif-netlink: Use netdev flow put api to insert a flow
  netdev-tc-offloads: Add flower mask to priority map
  netdev-tc-offloads: Implement netdev flow put using tc interface
  dpif-netlink: Use netdev flow del api to delete a flow
  netdev-tc-offloads: Implement netdev flow del using tc interface
  dpif-netlink: Use netdev flow get api to query a flow
  netdev-tc-offloads: Implement flow get using tc interface
  netdev-linux: Disallow setting policing when configured with hw offload
  netdev-vport: Use common offloads interface
  netdev-tc-offloads: Add ingress on netdev flow api init
  dpctl: Add an option to dump only certain kinds of flows
  dpctl: Indicate if flow is offloaded when dumping flows of all types
  tests: Add system-offloads-testsuite
  netdev: Init flow api on already added ports on offload enable

Roi Dayan (9):
  netdev-linux: Refactor two tc functions
  tc: Refactor tcm handle assignment when creating filter qdisc
  tc: Move functions the create/parse handle to be static inline
  tc: Add tc flower functions
  match: Add helper function to set tunnel tp_dst
  dpctl: Add filter arg to dump-flows command info
  dpif: Refactor flow logging functions to be used by other modules
  dpif-netlink: Use dpif logging functions
  NEWS: add a note about hw offloading

 NEWS                               |    3 +
 include/openvswitch/match.h        |    2 +
 lib/automake.mk                    |    6 +-
 lib/dpctl.c                        |   55 +-
 lib/dpctl.man                      |    7 +-
 lib/dpif-netdev.c                  |    3 +-
 lib/dpif-netlink.c                 |  491 +++++++++++++++-
 lib/dpif-provider.h                |    6 +-
 lib/dpif.c                         |  123 ++--
 lib/dpif.h                         |   34 +-
 lib/match.c                        |   13 +
 lib/netdev-bsd.c                   |    2 +
 lib/netdev-dpdk.c                  |    1 +
 lib/netdev-dummy.c                 |    2 +
 lib/netdev-linux.c                 |  187 ++----
 lib/netdev-linux.h                 |   11 +
 lib/netdev-provider.h              |   72 +++
 lib/netdev-tc-offloads.c           |  948 ++++++++++++++++++++++++++++++
 lib/netdev-tc-offloads.h           |   42 ++
 lib/netdev-vport.c                 |   60 +-
 lib/netdev.c                       |  364 ++++++++++++
 lib/netdev.h                       |   41 ++
 lib/odp-util.c                     |   56 ++
 lib/odp-util.h                     |    3 +
 lib/tc.c                           | 1132 ++++++++++++++++++++++++++++++++++++
 lib/tc.h                           |  159 +++++
 ofproto/ofproto-dpif-upcall.c      |    3 +-
 ofproto/ofproto-dpif.c             |    2 +-
 tests/.gitignore                   |    1 +
 tests/automake.mk                  |   16 +
 tests/ofproto-macros.at            |    6 +-
 tests/system-offloads-testsuite.at |   25 +
 tests/system-offloads-traffic.at   |   67 +++
 vswitchd/bridge.c                  |    1 +
 vswitchd/vswitch.xml               |   32 +
 35 files changed, 3754 insertions(+), 222 deletions(-)
 create mode 100644 lib/netdev-tc-offloads.c
 create mode 100644 lib/netdev-tc-offloads.h
 create mode 100644 lib/tc.c
 create mode 100644 lib/tc.h
 create mode 100644 tests/system-offloads-testsuite.at
 create mode 100644 tests/system-offloads-traffic.at

-- 
2.7.4