Existing OVS has three datapath types: Linux kernel (dpif-netlink), userspace (dpif-netdev), and Windows. This series adds another type of OVS datapath: the eBPF datapath (dpif-bpf).
eBPF stands for extended Berkeley Packet Filter. It enables userspace
applications to customize and extend the Linux kernel's functionality.
Implementing the OVS datapath in eBPF has three benefits:
- Flexibility: new features can be added as eBPF bytecode and
  dynamically loaded into the Linux kernel.
- Safety: the BPF verifier guarantees that the eBPF bytecode does not
  crash the kernel.
- Portability: the eBPF bytecode is platform-agnostic, so hopefully the
  same implementation can run on different platforms.

The implementation tries to re-implement everything under the Linux
kernel's net/openvswitch/* in eBPF code. However, this series is still
far from complete; a couple of eBPF limitations make it difficult.

OVS eBPF Architecture
=====================
OVS has the following architecture:

                   _
                   |   +-------------------+
                   |   |    ovs-vswitchd   |
                   |   +-------------------+
         userspace |   |      ofproto      |<-->OpenFlow controllers
                   |   +--------+-+--------+
                   |   | netdev | |ofproto-|
                   |   +--------+ |  dpif  |
                   |   | netdev | +--------+
   *eBPF hook -->  |   |provider| |  dpif  |
                   |   +---||---+ +--------+
                   |       ||     |  dpif  | <--- *eBPF provider
                   |       ||     |provider|
                   |_      ||     +---||---+
                           ||         ||
                    _  +---||-----+---||---+
                   |   |          |datapath| <--- *eBPF datapath
            kernel |   |          +--------+
                   |   |                   |
                   |_  +--------||---------+
                                ||
                             physical NIC

And the patch adds:
- eBPF hook for attaching an eBPF/XDP program to a netdev
  files: lib/netdev-linux.*
- eBPF dpif provider, an interface to communicate with the eBPF datapath
  files: lib/dpif-bpf.*, lib/dpif-bpf-odp.*
- eBPF datapath, the implementation of the OVS datapath in eBPF
  files: bpf/datapath.c, bpf/*.h

Most of the design and implementation are described in the OSR2018
paper[1], "Building an Extensible Open vSwitch Datapath", and the OVS
conference talk[2], "Offloading OVS Flow Processing using eBPF".

[1] https://dl.acm.org/citation.cfm?id=3139657
[2] http://openvswitch.org/support/ovscon2016/7/1120-tu.pdf

eBPF/XDP
========
A single bpf bytecode object, 'bpf/datapath.o', is generated and loaded
into all netdevs attached to OVS, at either the ingress or the egress of
the netdev. A packet traversing the OVS eBPF datapath typically goes
through three stages: parse, lookup, and action. Each stage consists of
multiple eBPF programs, and each stage tail calls the next.

'objdump -h bpf/datapath.o' shows the sections of the OVS eBPF datapath
object file:

  1 tail-32       000018b0  0000000000000000  0000000000000000  00000040  2**3
                  CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
  2 tail-33       00001b08  0000000000000000  0000000000000000  000018f0  2**3
                  CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
  3 tail-0        000000b8  0000000000000000  0000000000000000  000033f8  2**3
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
  4 tail-1        00000a48  0000000000000000  0000000000000000  000034b0  2**3
                  CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
 <skip>
 16 tail-13       00000aa0  0000000000000000  0000000000000000  000095d8  2**3
                  CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
 17 xdp           00000070  0000000000000000  0000000000000000  0000a078  2**3
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
 18 af_xdp        00000010  0000000000000000  0000000000000000  0000a0e8  2**3
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
 19 tail-35       000008b0  0000000000000000  0000000000000000  0000a0f8  2**3
                  CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
 20 ingress       00000178  0000000000000000  0000000000000000  0000a9a8  2**3
                  CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
 21 egress        00000188  0000000000000000  0000000000000000  0000ab20  2**3
                  CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
 22 downcall      000003b0  0000000000000000  0000000000000000  0000aca8  2**3
                  CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
 23 maps          000000fc  0000000000000000  0000000000000000  0000b058  2**2

Programs 'tail-{0-13}' implement the OVS actions; see bpf/action.h.
Programs ingress, egress, and downcall are the three possible entry
points at which a packet triggers the eBPF program; from there, each
tail calls the next stage. Programs xdp and af_xdp are still empty and
reserved for future integration.

Currently llvm/clang-4.0 doesn't work, so we have to use version 3.8.

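To make the parse -> lookup -> action chaining more concrete, here is a
minimal userspace model in plain C. It is only a sketch and not code
from this series: the slot names, struct pkt_ctx, tail_call(), and
process_packet() are hypothetical, and the toy action simply forwards to
in_port + 1. In the real datapath the jump would be made with the
kernel's bpf_tail_call() helper through a program-array map, and the
actual slot assignment of the 'tail-N' sections is not modeled here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical slot numbers; the real 'tail-N' assignment is not shown. */
enum stage_slot { SLOT_PARSE, SLOT_LOOKUP, SLOT_ACTION, SLOT_MAX };

/* Hypothetical per-packet state carried from stage to stage. */
struct pkt_ctx {
    int in_port;
    int out_port;    /* set by the action stage */
    int stages_run;  /* counts the stages the packet went through */
};

typedef int (*stage_fn)(struct pkt_ctx *ctx);

/* Userspace stand-in for a BPF program-array map. */
static stage_fn prog_array[SLOT_MAX];

/* Stand-in for bpf_tail_call(). In real eBPF a successful tail call
 * never returns to the caller; here we just propagate the result. */
static int tail_call(struct pkt_ctx *ctx, int slot)
{
    if (slot < 0 || slot >= SLOT_MAX || prog_array[slot] == NULL) {
        return -1;  /* empty slot: nothing to jump to */
    }
    return prog_array[slot](ctx);
}

static int stage_action(struct pkt_ctx *ctx)
{
    ctx->stages_run++;
    ctx->out_port = ctx->in_port + 1;  /* toy "output" action */
    return 0;
}

static int stage_lookup(struct pkt_ctx *ctx)
{
    ctx->stages_run++;
    return tail_call(ctx, SLOT_ACTION);
}

static int stage_parse(struct pkt_ctx *ctx)
{
    ctx->stages_run++;
    return tail_call(ctx, SLOT_LOOKUP);
}

/* Models an entry point such as ingress: fill the program array, reset
 * the context, then jump into the first stage. */
int process_packet(int in_port, struct pkt_ctx *ctx)
{
    assert(ctx != NULL);
    prog_array[SLOT_PARSE] = stage_parse;
    prog_array[SLOT_LOOKUP] = stage_lookup;
    prog_array[SLOT_ACTION] = stage_action;
    ctx->in_port = in_port;
    ctx->out_port = -1;
    ctx->stages_run = 0;
    return tail_call(ctx, SLOT_PARSE);
}
```

Calling process_packet(0, &ctx) on a caller-provided struct pkt_ctx runs
all three stages in order. Splitting the work this way keeps each
program small, which is one plausible reason to use tail calls given the
verifier limitations mentioned in this letter.
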
Testsuite
=========
We created a set of test cases in tests/system-bpf-traffic.at, which is
a subset of the kernel datapath testsuite (system-traffic.at).
'make check-bpf' kicks off the tests. So far this patch gives the
following results:

BPF datapath-sanity

  1: datapath - basic BPF commands                  ok
  2: datapath - ping between two ports              ok
  3: datapath - http between two ports              ok
  4: datapath - ping between two ports on vlan      ok
  5: datapath - ping between two ports on cvlan     ok
  6: datapath - ping6 between two ports             ok
  7: datapath - ping6 between two ports on vlan     ok
  8: datapath - ping6 between two ports on cvlan    ok
  9: datapath - ping over bond                      skipped (system-bpf-traffic.at:210)
 10: datapath - ping over vxlan tunnel              ok
 11: datapath - ping over vxlan6 tunnel             ok
 12: datapath - ping over gre tunnel                ok
 13: datapath - ping over geneve tunnel             ok
 14: datapath - ping over geneve6 tunnel            ok
 15: datapath - clone action                        FAILED (system-bpf-traffic.at:445)

and FAILED for the rest. The log of each test is saved under
tests/system-bpf-testsuite.dir/<id>/

Discussion
==========
We are still actively working on finishing the feature. Currently the
basic forwarding and tunnel features work, but they are still under
heavy debugging and development. The purpose of this RFC is to get early
feedback and direction for completing the features of the existing
kernel OVS datapath (net/openvswitch/*). We are worried about three
major issues:
  a. Megaflow support in BPF.
  b. Connection tracking support in BPF.
  c. Verifier limitations.

The patch is based on top of the OVS 2.9.1 commit f8b6477aa019
("Set release date for 2.9.1."), or is available at my github
  # git clone https://github.com/williamtu/ovs-ebpf
at branch rfc.

Joe Stringer (7):
  ovs-bpf: add documentation and configuration.
  netdev: add ebpf support for netdev provider.
  lib: implement perf event ringbuffer for upcall.
  lib/bpf: add support for managing bpf program/map.
  dpif: add 'dpif-bpf' provider.
  dpif-bpf-odp: Add bpf datapath interface and impl.
  utilities: Add ovs-bpfctl utility.

William Tu (4):
  bpf: implement OVS BPF datapath.
  vswitch/bridge.c: add bpf datapath initialization.
  tests: Add "make check-bpf" traffic target.
  vagrant: add ebpf support using ubuntu/bionic

 Documentation/automake.mk             |    1 +
 Documentation/index.rst               |    2 +-
 Documentation/intro/install/bpf.rst   |  142 +++
 Documentation/intro/install/index.rst |    1 +
 Makefile.am                           |   13 +-
 Vagrantfile-eBPF                      |   99 ++
 acinclude.m4                          |   39 +
 bpf/.gitignore                        |    4 +
 bpf/action.h                          |  628 +++++++++++
 bpf/api.h                             |  279 +++++
 bpf/automake.mk                       |   60 +
 bpf/datapath.c                        |  187 +++
 bpf/datapath.h                        |   71 ++
 bpf/generated_headers.h               |  185 +++
 bpf/helpers.h                         |  209 ++++
 bpf/lookup.h                          |  227 ++++
 bpf/maps.h                            |  170 +++
 bpf/odp-bpf.h                         |  254 +++++
 bpf/openvswitch.h                     |   49 +
 bpf/ovs-p4.h                          |  112 ++
 bpf/ovs-proto.p4                      |  329 ++++++
 bpf/parser.h                          |  412 +++++++
 bpf/xdp.h                             |   35 +
 configure.ac                          |    1 +
 include/linux/pkt_cls.h               |   21 +
 lib/automake.mk                       |   12 +
 lib/bpf.c                             |  524 +++++++++
 lib/bpf.h                             |   69 +
 lib/dpif-bpf-odp.c                    |  943 ++++++++++++++++
 lib/dpif-bpf-odp.h                    |   47 +
 lib/dpif-bpf.c                        | 1995 +++++++++++++++++++++++++++++++++
 lib/dpif-netdev.c                     |   29 +-
 lib/dpif-provider.h                   |    1 +
 lib/dpif.c                            |    3 +
 lib/netdev-bsd.c                      |    2 +
 lib/netdev-dpdk.c                     |    2 +
 lib/netdev-dummy.c                    |    2 +
 lib/netdev-linux.c                    |  436 ++++++-
 lib/netdev-linux.h                    |    2 +
 lib/netdev-provider.h                 |   11 +
 lib/netdev-vport.c                    |  145 ++-
 lib/netdev.c                          |   25 +
 lib/netdev.h                          |    4 +
 lib/packets.h                         |    6 +-
 lib/perf-event.c                      |  288 +++++
 lib/perf-event.h                      |   43 +
 ofproto/ofproto-dpif.c                |   69 +-
 tests/.gitignore                      |    1 +
 tests/automake.mk                     |   30 +-
 tests/ofproto-macros.at               |    8 +
 tests/system-bpf-macros.at            |  112 ++
 tests/system-bpf-testsuite.at         |   25 +
 tests/system-bpf-testsuite.patch      |   10 +
 tests/system-bpf-traffic.at           |  809 +++++++++++++
 utilities/automake.mk                 |    9 +
 utilities/ovs-bpfctl.8.xml            |   45 +
 utilities/ovs-bpfctl.c                |  248 ++++
 vswitchd/bridge.c                     |   21 +
 58 files changed, 9453 insertions(+), 53 deletions(-)
 create mode 100644 Documentation/intro/install/bpf.rst
 create mode 100644 Vagrantfile-eBPF
 create mode 100644 bpf/.gitignore
 create mode 100644 bpf/action.h
 create mode 100644 bpf/api.h
 create mode 100644 bpf/automake.mk
 create mode 100644 bpf/datapath.c
 create mode 100644 bpf/datapath.h
 create mode 100644 bpf/generated_headers.h
 create mode 100644 bpf/helpers.h
 create mode 100644 bpf/lookup.h
 create mode 100644 bpf/maps.h
 create mode 100644 bpf/odp-bpf.h
 create mode 100644 bpf/openvswitch.h
 create mode 100644 bpf/ovs-p4.h
 create mode 100644 bpf/ovs-proto.p4
 create mode 100644 bpf/parser.h
 create mode 100644 bpf/xdp.h
 create mode 100644 lib/bpf.c
 create mode 100644 lib/bpf.h
 create mode 100644 lib/dpif-bpf-odp.c
 create mode 100644 lib/dpif-bpf-odp.h
 create mode 100644 lib/dpif-bpf.c
 create mode 100644 lib/perf-event.c
 create mode 100644 lib/perf-event.h
 create mode 100644 tests/system-bpf-macros.at
 create mode 100644 tests/system-bpf-testsuite.at
 create mode 100644 tests/system-bpf-testsuite.patch
 create mode 100644 tests/system-bpf-traffic.at
 create mode 100644 utilities/ovs-bpfctl.8.xml
 create mode 100644 utilities/ovs-bpfctl.c

-- 
2.7.4