Hi Toshiaki,

Thanks for the interesting patch series! I haven't finished reviewing,
but I went through a couple of the steps to make sure it works in my
environment. Comments below:
On Tue, Jun 30, 2020 at 12:30:29AM +0900, Toshiaki Makita wrote:
> This patch adds an XDP-based flow cache using the OVS netdev-offload
> flow API provider. When an OVS device has XDP offload enabled,
> packets are first processed in the XDP flow cache (with parsing and
> table lookup implemented in eBPF), and on a hit the action processing
> is also done in the context of XDP, which has minimal overhead.
>
> This provider is based on top of William's recently posted patch for
> custom XDP load. When a custom XDP program is loaded, the provider
> detects whether the program supports the classifier, and if so it
> starts offloading flows to the XDP program.
>
> The patches are derived from xdp_flow[1], which is a similar mechanism
> but implemented in the kernel.
>
>
> * Motivation
>
> While the userspace datapath using netdev-afxdp or netdev-dpdk shows
> good performance, there are use cases where packets are better
> processed in the kernel, for example TCP/IP connections or
> container-to-container connections. The current solution is to use a
> tap device or af_packet with extra kernel-to/from-userspace overhead.
> With XDP, a better solution is to steer packets earlier in the XDP
> program and decide whether to send them to the userspace datapath or
> keep them in the kernel.
>
> One problem with the current netdev-afxdp is that it forwards all
> packets to userspace. The first patch from William (netdev-afxdp:
> Enable loading XDP program.) only provides the interface to load an
> XDP program; however, users usually don't know how to write their own
> XDP program.
>
> XDP also supports HW offload, so it may be possible to offload flows
> to HW through this provider in the future, although not currently.
> The reason is that map-in-map is required for our program to support
> a classifier with subtables in XDP, but map-in-map is not offloadable.
> If map-in-map becomes offloadable, HW offload of our program will also
> be doable.
>
>
> * How to use
>
> 1. Install clang/llvm >= 9, libbpf >= 0.0.6 (included in kernel 5.5),
>    and kernel >= 5.3.

I encountered an error:

lib/netdev-offload-xdp.c: In function ‘probe_meta_info’:
lib/netdev-offload-xdp.c:386:19: error: implicit declaration of function ‘btf__find_by_name_kind’ [-Werror=implicit-function-declaration]
   meta_sec_id = btf__find_by_name_kind(btf, ".ovs_meta", BTF_KIND_DATASEC);

$~/bpf-next/tools/lib/bpf# nm libbpf.so.0.0.7 | grep "btf__.*kind" | grep T
000000000001d4b3 T btf__find_by_name_kind
$~/bpf-next/tools/lib/bpf# nm libbpf.so.0.0.6 | grep "btf__.*kind" | grep T

As the nm output shows, libbpf 0.0.6 does not provide
btf__find_by_name_kind, so actually we need libbpf 0.0.7.

> 2. make with --enable-afxdp --enable-bpf
>    --enable-bpf will generate the XDP program "bpf/flowtable_afxdp.o".
>    Note that the BPF object will not be installed anywhere by
>    "make install" at this point.

This works OK for me. I was using clang 8 at first and it doesn't work
due to missing BTF:

2020-07-15T14:59:15.224Z|00043|netdev_afxdp|WARN|libbpf: BTF is required, but is missing or corrupted.

Maybe we should check the clang version at ./configure time (a rough
sketch is further below, after the step 3 comments).

> 3. Load custom XDP program
>    E.g.
>    $ ovs-vsctl add-port ovsbr0 veth0 -- set int veth0 options:xdp-mode=native \
>      options:xdp-obj="path/to/ovs/bpf/flowtable_afxdp.o"
>    $ ovs-vsctl add-port ovsbr0 veth1 -- set int veth1 options:xdp-mode=native \
>      options:xdp-obj="path/to/ovs/bpf/flowtable_afxdp.o"

Once clang is fixed, loading the program works:

2020-07-15T15:07:02.374Z|00049|netdev_offload|INFO|afxdp-p0: Assigned flow API 'linux_xdp'.
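Following up on the ./configure idea above: since the two build
requirements that bit me are known (clang >= 9 for BTF, and a libbpf
that has btf__find_by_name_kind, i.e. >= 0.0.7), configure could try to
compile and link a tiny probe with the same clang and libbpf that will
build bpf/flowtable_afxdp.o. This is only my own sketch, not anything
from the series, and the autoconf glue is omitted:

    /* Hypothetical configure-time probe: build with clang and link with
     * -lbpf.  It fails at compile time on clang < 9 and fails to link
     * against libbpf < 0.0.7 (no btf__find_by_name_kind()).  It is
     * never meant to be executed. */
    #include <stddef.h>       /* NULL */
    #include <bpf/btf.h>      /* btf__find_by_name_kind() */
    #include <linux/btf.h>    /* BTF_KIND_DATASEC */

    #if !defined(__clang__) || __clang_major__ < 9
    #error "clang >= 9 is required to emit BTF for the XDP flow table"
    #endif

    int
    main(void)
    {
        /* Link test only; the program is never run, so the NULL btf
         * object is harmless. */
        return btf__find_by_name_kind(NULL, ".ovs_meta", BTF_KIND_DATASEC) < 0;
    }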
$~/ovs# bpftool map
42: devmap  name output_map  flags 0x80
        key 4B  value 4B  max_entries 65536  memlock 524288B
44: array_of_maps  name flow_table  flags 0x0
        key 4B  value 4B  max_entries 128  memlock 4096B
45: array  name subtbl_masks_hd  flags 0x0
        key 4B  value 4B  max_entries 1  memlock 4096B
46: (null)  name xsks_map  flags 0x0
...

But ping between the two namespaces showed packet loss. Then I realized
I need to load a dummy program on the veth peers.

> 4. Enable XDP_REDIRECT
>    If you use veth devices, make sure to load some (possibly dummy)
>    programs on the peers of the veth devices.
>    Some HW NIC drivers require as many queues as cores on the system.
>    Tweak queues using "ethtool -L".

Maybe it's better to put this dummy program under ovs/bpf/ (a minimal
sketch is at the end of this mail). I tried to compile
tools/testing/selftests/bpf/progs/xdp_dummy.c but encountered many
errors, and ended up using samples/bpf/xdp_redirect_kern.o with sec
xdp_redirect_dummy, which finally got ping working between the two
namespaces.

Will continue reviewing the patches...

Regards,
William
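P.S. In case it helps, this is the sort of minimal dummy program I had
in mind for ovs/bpf/. It is only a sketch of my own (the file, section
and program names are made up), built the same way as flowtable_afxdp.o,
e.g. "clang -O2 -target bpf -c xdp_dummy.c -o xdp_dummy.o":

    /* bpf/xdp_dummy.c (hypothetical): pass every packet up the stack.
     * Loading this on the veth peers is only needed so that
     * XDP_REDIRECT into those peers works. */
    #include <linux/bpf.h>
    #include <bpf/bpf_helpers.h>

    SEC("xdp")
    int xdp_dummy_prog(struct xdp_md *ctx)
    {
        return XDP_PASS;
    }

    char _license[] SEC("license") = "GPL";

It can then be attached to the peers with something like
"ip link set dev <peer> xdp obj xdp_dummy.o sec xdp" (the device name is
just an example).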