Hi Toshiaki,
Thanks for the interesting patch series!
I haven't finished reviewing, but I went through a couple of
steps to make sure it works in my environment.
Comments below:
On Tue, Jun 30, 2020 at 12:30:29AM +0900, Toshiaki Makita wrote:
> This patch adds an XDP-based flow cache using the OVS netdev-offload
> flow API provider. When an OVS device has XDP offload enabled,
> packets are first processed in the XDP flow cache (with parsing and
> table lookup implemented in eBPF) and, on a hit, the action processing
> is also done in the context of XDP, which has minimal overhead.
>
> This provider is based on top of William's recently posted patch for
> custom XDP load. When a custom XDP is loaded, the provider detects if
> the program supports classifier, and if supported it starts offloading
> flows to the XDP program.
>
> The patches are derived from xdp_flow[1], which is a mechanism similar to
> this but implemented in kernel.
>
>
> * Motivation
>
> While userspace datapath using netdev-afxdp or netdev-dpdk shows good
> performance, there are use cases where packets are better processed in
> the kernel, for example, TCP/IP connections, or container-to-container
> connections. The current solution is to use a tap device or af_packet with
> extra kernel-to/from-userspace overhead. But with XDP, a better solution
> is to steer packets earlier in the XDP program, and decide whether to send
> them to the userspace datapath or keep them in the kernel.
>
> One problem with current netdev-afxdp is that it forwards all packets to
> userspace. The first patch from William (netdev-afxdp: Enable loading XDP
> program.) only provides the interface to load an XDP program; however, users
> usually don't know how to write their own XDP program.
>
> XDP also supports HW-offload so it may be possible to offload flows to
> HW through this provider in the future, although not currently.
> The reason is that map-in-map is required for our program to support
> classifier with subtables in XDP, but map-in-map is not offloadable.
> If map-in-map becomes offloadable, HW-offload of our program will also
> be doable.
>
>
> * How to use
>
> 1. Install clang/llvm >= 9, libbpf >= 0.0.6 (included in kernel 5.5), and
> kernel >= 5.3.
I encountered a build error here:
lib/netdev-offload-xdp.c: In function ‘probe_meta_info’:
lib/netdev-offload-xdp.c:386:19: error: implicit declaration of function
‘btf__find_by_name_kind’
[-Werror=implicit-function-declaration]
meta_sec_id = btf__find_by_name_kind(btf, ".ovs_meta", BTF_KIND_DATASEC);
$~/bpf-next/tools/lib/bpf# nm libbpf.so.0.0.7 | grep "btf__.*kind" | grep T
000000000001d4b3 T btf__find_by_name_kind
$~/bpf-next/tools/lib/bpf# nm libbpf.so.0.0.6 | grep "btf__.*kind" | grep T
So we actually need libbpf >= 0.0.7; 0.0.6 does not export
btf__find_by_name_kind.
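A rough way to check a given libbpf build for this symbol, as a sketch
only: exports_symbol is a hypothetical helper name, not something in OVS
or libbpf, and it simply filters nm output read on stdin.

```shell
#!/bin/sh
# Hypothetical helper: read `nm` output on stdin and succeed only if the
# named symbol is listed as a global text (T) symbol.
exports_symbol() {
    awk -v sym="$1" '
        {
            name = $3
            sub(/@.*/, "", name)   # drop a symbol-version suffix, if any
            if ($2 == "T" && name == sym) found = 1
        }
        END { exit !found }
    '
}
```

For example: nm libbpf.so.0.0.7 | exports_symbol btf__find_by_name_kind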
>
> 2. make with --enable-afxdp --enable-bpf
> --enable-bpf will generate XDP program "bpf/flowtable_afxdp.o". Note that
> the BPF object will not be installed anywhere by "make install" at this
> point.
The build works for me. However, I was initially using clang 8, and the
resulting object fails to load because of BTF:
2020-07-15T14:59:15.224Z|00043|netdev_afxdp|WARN|libbpf: BTF is required, but
is missing or corrupted.
Maybe we should check the clang version at ./configure time.
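The logic of such a check could look roughly like the following, written
as plain shell rather than as an actual autoconf macro. The helper names
are hypothetical; the minimum version of 9 is the one stated in step 1.

```shell
#!/bin/sh
# Minimum clang major version, taken from step 1 of the cover letter.
MIN_CLANG_MAJOR=9

# Hypothetical helper: extract the major version from `clang --version`
# output read on stdin (also handles vendor prefixes such as "Ubuntu clang").
clang_major() {
    sed -n 's/.*clang version \([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Hypothetical helper: succeed if the given major version is new enough.
clang_is_new_enough() {
    [ -n "$1" ] && [ "$1" -ge "$MIN_CLANG_MAJOR" ]
}
```

For example: clang_is_new_enough "$(clang --version | clang_major)"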
>
> 3. Load custom XDP program
> E.g.
> $ ovs-vsctl add-port ovsbr0 veth0 -- set int veth0 options:xdp-mode=native \
> options:xdp-obj="path/to/ovs/bpf/flowtable_afxdp.o"
> $ ovs-vsctl add-port ovsbr0 veth1 -- set int veth1 options:xdp-mode=native \
> options:xdp-obj="path/to/ovs/bpf/flowtable_afxdp.o"
Once clang is fixed, loading the program works:
2020-07-15T15:07:02.374Z|00049|netdev_offload|INFO|afxdp-p0: Assigned flow API
'linux_xdp'.
$~/ovs# bpftool map
42: devmap name output_map flags 0x80
key 4B value 4B max_entries 65536 memlock 524288B
44: array_of_maps name flow_table flags 0x0
key 4B value 4B max_entries 128 memlock 4096B
45: array name subtbl_masks_hd flags 0x0
key 4B value 4B max_entries 1 memlock 4096B
46: (null) name xsks_map flags 0x0
...
But ping between the two namespaces shows packet loss. Then I realized I
needed to load a dummy program on the veth peers.
>
> 4. Enable XDP_REDIRECT
> If you use veth devices, make sure to load some (possibly dummy) programs
> on the peers of veth devices.
> Some HW NIC drivers require as many queues as cores on its system. Tweak
> queues using "ethtool -L".
>
Maybe it would be better to put this dummy program under ovs/bpf/.
I tried to compile
tools/testing/selftests/bpf/progs/xdp_dummy.c
but encountered many errors.
I ended up using
samples/bpf/xdp_redirect_kern.o sec xdp_redirect_dummy
and finally got ping working between the two namespaces.
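For anyone reproducing this, a sketch of how the dummy program could be
attached to the veth peers, assuming iproute2 is used to load it. The
namespace names (ns0, ns1) and peer device names are hypothetical
placeholders, and dummy_attach_cmd is a hypothetical helper that only
prints the command instead of running it.

```shell
#!/bin/sh
# Object file and section named above; the path is relative to a kernel
# tree in which samples/bpf has been built.
DUMMY_OBJ="samples/bpf/xdp_redirect_kern.o"
DUMMY_SEC="xdp_redirect_dummy"

# Hypothetical helper: print the iproute2 command that attaches the dummy
# XDP program to device $2 inside network namespace $1.
dummy_attach_cmd() {
    echo "ip netns exec $1 ip link set dev $2 xdp obj $DUMMY_OBJ sec $DUMMY_SEC"
}

# Placeholder namespace/device pairs; the commands are printed for review.
dummy_attach_cmd ns0 veth0-peer
dummy_attach_cmd ns1 veth1-peer
```

The printed commands need root to run, and the names must be adjusted to
the actual setup.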
Will continue reviewing patches...
Regards,
William