On Thu, Apr 25, 2019 at 8:09 AM Ilya Maximets <i.maxim...@samsung.com> wrote:
> Hi. > > This is not a full review. Just a bunch of thoughts. > > See inline. > > Best regards, Ilya Maximets. > > On 25.04.2019 2:47, William Tu wrote: > > The patch introduces experimental AF_XDP support for OVS netdev. > > AF_XDP is a new address family working together with eBPF/XDP. > > A socket with AF_XDP family can receive and send raw packets > > from an eBPF/XDP program attached to the netdev. > > For details introduction and configuration, see > > Documentation/intro/install/afxdp.rst > > > > Signed-off-by: William Tu <u9012...@gmail.com> > > Signed-off-by: Yi-Hung Wei <yihung....@gmail.com> > > Co-authored-by: Yi-Hung Wei <yihung....@gmail.com> > > --- > > v1->v2: > > - add a list to maintain unused umem elements > > - remove copy from rx umem to ovs internal buffer > > - use hugetlb to reduce misses (not much difference) > > - use pmd mode netdev in OVS (huge performance improve) > > - remove malloc dp_packet, instead put dp_packet in umem > > > > v2->v3: > > - rebase on the OVS master, 7ab4b0653784 > > ("configure: Check for more specific function to pull in pthread > library.") > > - remove the dependency on libbpf and dpif-bpf. > > instead, use the built-in XDP_ATTACH feature. 
> > - data structure optimizations for better performance, see[1] > > - more test cases support > > v3: > https://mail.openvswitch.org/pipermail/ovs-dev/2018-November/354179.html > > > > v3->v4: > > - Use AF_XDP API provided by libbpf > > - Remove the dependency on XDP_ATTACH kernel patch set > > - Add documentation, bpf.rst > > > > v4->v5: > > - rebase to master > > - remove rfc, squash all into a single patch > > - add --enable-afxdp, so by default, AF_XDP is not compiled > > - add options: xdpmode=drv,skb > > - add multiple queue and multiple PMD support, with options: n_rxq > > - improve documentation, rename bpf.rst to af_xdp.rst > > > > v5->v6 > > - rebase to master, commit 0cdd5b13de91b98 > > - address errors from sparse and clang > > - pass travis-ci test > > - address feedback from Ben > > - fix issues reported by 0-day robot > > - improved documentation > > --- > > Documentation/automake.mk | 1 + > > Documentation/index.rst | 1 + > > Documentation/intro/install/afxdp.rst | 366 +++++++++++++ > > Documentation/intro/install/index.rst | 1 + > > acinclude.m4 | 23 + > > configure.ac | 1 + > > lib/automake.mk | 7 +- > > lib/dp-packet.c | 16 + > > lib/dp-packet.h | 35 +- > > lib/dpif-netdev-perf.h | 13 + > > lib/netdev-afxdp.c | 589 ++++++++++++++++++++ > > lib/netdev-afxdp.h | 47 ++ > > lib/netdev-linux.c | 89 +++- > > lib/netdev-linux.h | 1 + > > lib/netdev-provider.h | 1 + > > lib/netdev.c | 1 + > > lib/xdpsock.c | 210 ++++++++ > > lib/xdpsock.h | 133 +++++ > > tests/automake.mk | 17 + > > tests/system-afxdp-macros.at | 153 ++++++ > > tests/system-afxdp-testsuite.at | 26 + > > tests/system-afxdp-traffic.at | 978 > ++++++++++++++++++++++++++++++++++ > > 22 files changed, 2703 insertions(+), 6 deletions(-) > > create mode 100644 Documentation/intro/install/afxdp.rst > > create mode 100644 lib/netdev-afxdp.c > > create mode 100644 lib/netdev-afxdp.h > > create mode 100644 lib/xdpsock.c > > create mode 100644 lib/xdpsock.h > > create mode 100644 
tests/system-afxdp-macros.at > > create mode 100644 tests/system-afxdp-testsuite.at > > create mode 100644 tests/system-afxdp-traffic.at > > > > diff --git a/Documentation/automake.mk b/Documentation/automake.mk > > index 082438e09a33..11cc59efc881 100644 > > --- a/Documentation/automake.mk > > +++ b/Documentation/automake.mk > > @@ -10,6 +10,7 @@ DOC_SOURCE = \ > > Documentation/intro/why-ovs.rst \ > > Documentation/intro/install/index.rst \ > > Documentation/intro/install/bash-completion.rst \ > > + Documentation/intro/install/afxdp.rst \ > > Documentation/intro/install/debian.rst \ > > Documentation/intro/install/documentation.rst \ > > Documentation/intro/install/distributions.rst \ > > diff --git a/Documentation/index.rst b/Documentation/index.rst > > index 46261235c732..aa9e7c49f179 100644 > > --- a/Documentation/index.rst > > +++ b/Documentation/index.rst > > @@ -59,6 +59,7 @@ vSwitch? Start here. > > :doc:`intro/install/windows` | > > :doc:`intro/install/xenserver` | > > :doc:`intro/install/dpdk` | > > + :doc:`intro/install/afxdp` | > > :doc:`Installation FAQs <faq/releases>` > > > > - **Tutorials:** :doc:`tutorials/faucet` | > > diff --git a/Documentation/intro/install/afxdp.rst > b/Documentation/intro/install/afxdp.rst > > new file mode 100644 > > index 000000000000..a1e3317bbdb5 > > --- /dev/null > > +++ b/Documentation/intro/install/afxdp.rst > > @@ -0,0 +1,366 @@ > > +.. > > + Licensed under the Apache License, Version 2.0 (the "License"); > you may > > + not use this file except in compliance with the License. You may > obtain > > + a copy of the License at > > + > > + http://www.apache.org/licenses/LICENSE-2.0 > > + > > + Unless required by applicable law or agreed to in writing, > software > > + distributed under the License is distributed on an "AS IS" BASIS, > WITHOUT > > + WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
> See the > > + License for the specific language governing permissions and > limitations > > + under the License. > > + > > + Convention for heading levels in Open vSwitch documentation: > > + > > + ======= Heading 0 (reserved for the title in a document) > > + ------- Heading 1 > > + ~~~~~~~ Heading 2 > > + +++++++ Heading 3 > > + ''''''' Heading 4 > > + > > + Avoid deeper levels because they do not render well. > > + > > + > > +======================== > > +Open vSwitch with AF_XDP > > +======================== > > + > > +This document describes how to build and install Open vSwitch using > > +AF_XDP netdev. > > + > > +.. warning:: > > + The AF_XDP support of Open vSwitch is considered 'experimental', > > + and it is not compiled in by default. > > + > > +Introduction > > +------------ > > +AF_XDP, Address Family of the eXpress Data Path, is a new Linux socket > type > > +built upon the eBPF and XDP technology. It is aims to have comparable > > +performance to DPDK but cooperate better with existing kernel's > networking > > +stack. An AF_XDP socket receives and sends packets from an eBPF/XDP > program > > +attached to the netdev, by-passing a couple of Linux kernel's > subsystems. > > +As a result, AF_XDP socket shows much better performance than AF_PACKET. > > +For more details about AF_XDP, please see linux kernel's > > +Documentation/networking/af_xdp.rst > > + > > + > > +AF_XDP Netdev > > +------------- > > +OVS has a couple of netdev types, i.e., system, tap, or > > +internal. The AF_XDP feature adds a new netdev types called > > +"afxdp", and implement its configuration, packet reception, > > +and transmit functions. Since the AF_XDP socket, xsk, > > +operates in userspace, once ovs-vswitchd receives packets > > +from xsk, the proposed architecture re-uses the existing > > +userspace dpif-netdev datapath. As a result, most of > > +the packet processing happens at the userspace instead of > > +linux kernel. 
> > + > > +:: > > + > > + | +-------------------+ > > + | | ovs-vswitchd |<-->ovsdb-server > > + | +-------------------+ > > + | | ofproto |<-->OpenFlow controllers > > + | +--------+-+--------+ > > + | | netdev | |ofproto-| > > + userspace | +--------+ | dpif | > > + | | afxdp | +--------+ > > + | | netdev | | dpif | > > + | +---||---+ +--------+ > > + | || | dpif- | > > + | || | netdev | > > + |_ || +--------+ > > + || > > + _ +---||-----+--------+ > > + | | AF_XDP prog + | > > + kernel | | xsk_map | > > + |_ +--------||---------+ > > + || > > + physical > > + NIC > > + > > + > > +Build requirements > > +------------------ > > + > > +In addition to the requirements described in :doc:`general`, building > Open > > +vSwitch with AF_XDP will require the following: > > + > > +- libbpf from kernel source tree (kernel 5.0.0 or later) > > + > > +- Linux kernel XDP support, with the following options (required) > > + ``_CONFIG_BPF=y`` > > + > > + ``_CONFIG_BPF_SYSCALL=y`` > > + > > + ``_CONFIG_XDP_SOCKETS=y`` > > + > > + > > +- The following optional Kconfig options are also recommended, but not > > + required: > > + > > + ``_CONFIG_BPF_JIT=y`` (Performance) > > + > > + ``_CONFIG_HAVE_BPF_JIT=y`` (Performance) > > + > > + ``_CONFIG_XDP_SOCKETS_DIAG=y`` (Debugging) > > + > > +- If possible, run **./xdpsock -r -N -z -i <your device>** under > > + linux/samples/bpf. This is the OVS indepedent benchmark tools for > AF_XDP. > > + It makes sure your basic kernel requirements are met for AF_XDP. > > + > > + > > +Installing > > +---------- > > +For OVS to use AF_XDP netdev, it has to be configured with LIBBPF > support. > > +Frist, clone a recent version of Linux bpf-next tree:: > > + > > + git clone git:// > git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git > > + > > +Second, go into the Linux source directory and build libbpf in the tools > > +directory:: > > + > > + cd bpf-next/ > > + cd tools/lib/bpf/ > > + make && make install > > + make install_headers > > + > > +.. 
note:: > > + Make sure xsk.h and bpf.h are installed in system's library path, > > + e.g. /usr/local/include/bpf/ or /usr/include/bpf/ > > + > > +Make sure the libbpf.so is installed correctly:: > > + > > + ldconfig > > + ldconfig -p | grep libbpf > > + > > + > > +Third, ensure the standard OVS requirements are installed and > > +bootstrap/configure the package:: > > + > > + ./boot.sh && ./configure --enable-afxdp > > + > > +Finally, build and install OVS:: > > + > > + make && make install > > + > > +To kick start end-to-end autotesting:: > > + > > + uname -a # make sure having 5.0+ kernel > > + make check-afxdp > > + > > +if a test case fails, check the log at:: > > + > > + cat > tests/system-afxdp-testsuite.dir/<number>/system-afxdp-testsuite.log > > + > > + > > +Setup AF_XDP netdev > > +------------------- > > +Before running OVS with AF_XDP, make sure the libbpf and libelf are > > +set-up right:: > > + > > + ldd vswitchd/ovs-vswitchd > > + > > +Open vSwitch should be started using userspace datapath as described > > +in :doc:`general`:: > > + > > + ovs-vswitchd --disable-system > > + ovs-vsctl -- add-br br0 -- set Bridge br0 datapath_type=netdev > > + > > +.. note:: > > + OVS AF_XDP netdev is using the userspace datapath, the same datapath > > + as used by OVS-DPDK. So it requires --disable-system for > ovs-vswitchd > > + and datapath_type=netdev when adding a new bridge. > > + > > +Make sure your device support AF_XDP, and to use 1 PMD (on core 4) > > +on 1 queue (queue 0) device, configure these options: **pmd-cpu-mask, > > +pmd-rxq-affinity, and n_rxq**. The **xdpmode** can be "drv" or "skb":: > > + > > + ethtool -L enp2s0 combined 1 > > + ovs-vsctl set Open_vSwitch . 
other_config:pmd-cpu-mask=0x10 > > + ovs-vsctl add-port br0 enp2s0 -- set interface enp2s0 type="afxdp" \ > > + options:n_rxq=1 options:xdpmode=drv \ > > + other_config:pmd-rxq-affinity="0:4" > > + > > +Or, use 4 pmds/cores and 4 queues by doing:: > > + > > + ethtool -L enp2s0 combined 4 > > + ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0x36 > > + ovs-vsctl add-port br0 enp2s0 -- set interface enp2s0 type="afxdp" \ > > + options:n_rxq=4 options:xdpmode=drv \ > > + other_config:pmd-rxq-affinity="0:1,1:2,2:3,3:4" > > + > > +To validate that the bridge has successfully instantiated, you can use > the:: > > + > > + ovs-vsctl show > > + > > +should show something like:: > > + > > + Port "ens802f0" > > + Interface "ens802f0" > > + type: afxdp > > + options: {n_rxq="1", xdpmode=drv} > > + > > +Otherwise, enable debug by:: > > + > > + ovs-appctl vlog/set netdev_afxdp::dbg > > + > > + > > +References > > +---------- > > +Most of the design details are described in the paper presented at > > +Linux Plumber 2018, "Bringing the Power of eBPF to Open vSwitch"[1], > > +section 4, and slides[2][4]. > > +"The Path to DPDK Speeds for AF XDP"[3] gives a very good introduction > > +about AF_XDP current and future work. > > + > > + > > +[1] http://vger.kernel.org/lpc_net2018_talks/ovs-ebpf-afxdp.pdf > > + > > +[2] > http://vger.kernel.org/lpc_net2018_talks/ovs-ebpf-lpc18-presentation.pdf > > + > > +[3] > http://vger.kernel.org/lpc_net2018_talks/lpc18_paper_af_xdp_perf-v2.pdf > > + > > +[4] > https://ovsfall2018.sched.com/event/IO7p/fast-userspace-ovs-with-afxdp > > + > > + > > +Performance Tuning > > +------------------ > > +The name of the game is to keep your CPU running in userspace, allowing > PMD > > +to keep polling the AF_XDP queues without any interferences from kernel. > > + > > +#. Make sure everything is in the same NUMA node (memory used by > AF_XDP, pmd > > + running cores, device plug-in slot) > > + > > +#. Isolate your CPU by doing isolcpu at grub configure. 
> > + > > +#. IRQ should not set to pmd running core. > > + > > +#. The Spectre and Meltdown fixes increase the overhead of system calls. > > + > > +Debugging performance issue > > +~~~~~~~~~~~~~~~~~~~~~~~~~~~ > > +While running the traffic, use linux perf tool to see where your cpu > > +spends its cycle:: > > + > > + cd bpf-next/tools/perf > > + make > > + ./perf record -p `pidof ovs-vswitchd` sleep 10 > > + ./perf report > > + > > +Measure your system call rate by doing:: > > + > > + pstree -p `pidof ovs-vswitchd` > > + strace -c -p <your pmd's PID> > > + > > +Or, use OVS pmd tool:: > > + > > + ovs-appctl dpif-netdev/pmd-stats-show > > + > > + > > +Example Script > > +-------------- > > + > > +Below is a script using namespaces and veth peer:: > > + > > + #!/bin/bash > > + ovs-vswitchd --no-chdir --pidfile -vvconn -vofproto_dpif -vunixctl \ > > + --disable-system --detach \ > > + ovs-vsctl -- add-br br0 -- set Bridge br0 \ > > + protocols=OpenFlow10,OpenFlow11,OpenFlow12,OpenFlow13,OpenFlow14 \ > > + fail-mode=secure datapath_type=netdev > > + ovs-vsctl -- add-br br0 -- set Bridge br0 datapath_type=netdev > > + > > + ip netns add at_ns0 > > + ovs-appctl vlog/set netdev_afxdp::dbg > > + > > + ip link add p0 type veth peer name afxdp-p0 > > + ip link set p0 netns at_ns0 > > + ip link set dev afxdp-p0 up > > + ovs-vsctl add-port br0 afxdp-p0 -- \ > > + set interface afxdp-p0 external-ids:iface-id="p0" type="afxdp" > > + > > + ip netns exec at_ns0 sh << NS_EXEC_HEREDOC > > + ip addr add "10.1.1.1/24" dev p0 > > + ip link set dev p0 up > > + NS_EXEC_HEREDOC > > + > > + ip netns add at_ns1 > > + ip link add p1 type veth peer name afxdp-p1 > > + ip link set p1 netns at_ns1 > > + ip link set dev afxdp-p1 up > > + > > + ovs-vsctl add-port br0 afxdp-p1 -- \ > > + set interface afxdp-p1 external-ids:iface-id="p1" type="afxdp" > > + ip netns exec at_ns1 sh << NS_EXEC_HEREDOC > > + ip addr add "10.1.1.2/24" dev p1 > > + ip link set dev p1 up > > + NS_EXEC_HEREDOC > > + > > + 
ip netns exec at_ns0 ping -i .2 10.1.1.2 > > + > > + > > +Limitations/Known Issues > > +------------------------ > > +#. Device's numa ID is always 0, need a way to find numa id from a > netdev. > > +#. No QoS support because AF_XDP netdev by-pass the Linux TC layer. A > possible > > + work-around is to use OpenFlow meter action. > > +#. AF_XDP device added to bridge, remove, and added again will fail. > > +#. Most of the tests are done using i40e single port. Multiple ports and > > + also ixgbe driver also needs to be tested. > > +#. No latency test result (TODO items) > > + > > + > > +make check-afxdp > > +---------------- > > +When executing 'make check-afxdp', OVS creates namespaces, sets up > AF_XDP on > > +veth devices and kicks start the testing. So far we have the following > test > > +cases:: > > + > > + AF_XDP netdev datapath-sanity > > + > > + 1: datapath - ping between two ports ok > > + 2: datapath - ping between two ports on vlan ok > > + 3: datapath - ping6 between two ports ok > > + 4: datapath - ping6 between two ports on vlan ok > > + 5: datapath - ping over vxlan tunnel ok > > + 6: datapath - ping over vxlan6 tunnel ok > > + 7: datapath - ping over gre tunnel ok > > + 8: datapath - ping over erspan v1 tunnel ok > > + 9: datapath - ping over erspan v2 tunnel ok > > + 10: datapath - ping over ip6erspan v1 tunnel ok > > + 11: datapath - ping over ip6erspan v2 tunnel ok > > + 12: datapath - ping over geneve tunnel ok > > + 13: datapath - ping over geneve6 tunnel ok > > + 14: datapath - clone action ok > > + 15: datapath - basic truncate action ok > > + > > + conntrack > > + > > + 16: conntrack - controller ok > > + 17: conntrack - force commit ok > > + 18: conntrack - ct flush by 5-tuple ok > > + 19: conntrack - IPv4 ping ok > > + 20: conntrack - get_nconns and get/set_maxconns ok > > + 21: conntrack - IPv6 ping ok > > + > > + system-ovn > > + > > + 22: ovn -- 2 LRs connected via LS, gateway router, SNAT and DNAT ok > > + 23: ovn -- 2 LRs connected 
via LS, gateway router, easy SNAT ok > > + 24: ovn -- multiple gateway routers, SNAT and DNAT ok > > + 25: ovn -- load-balancing ok > > + 26: ovn -- load-balancing - same subnet. ok > > + 27: ovn -- load balancing in gateway router ok > > + 28: ovn -- multiple gateway routers, load-balancing ok > > + 29: ovn -- load balancing in router with gateway router port ok > > + 30: ovn -- DNAT and SNAT on distributed router - N/S ok > > + 31: ovn -- DNAT and SNAT on distributed router - E/W ok > > + > > + > > +Bug Reporting > > +------------- > > + > > +Please report problems to d...@openvswitch.org. > > diff --git a/Documentation/intro/install/index.rst > b/Documentation/intro/install/index.rst > > index 3193c736cf17..c27a9c9d16ff 100644 > > --- a/Documentation/intro/install/index.rst > > +++ b/Documentation/intro/install/index.rst > > @@ -45,6 +45,7 @@ Installation from Source > > xenserver > > userspace > > dpdk > > + afxdp > > > > Installation from Packages > > -------------------------- > > diff --git a/acinclude.m4 b/acinclude.m4 > > index 301aeb70d82a..d80f2494d514 100644 > > --- a/acinclude.m4 > > +++ b/acinclude.m4 > > @@ -221,6 +221,29 @@ AC_DEFUN([OVS_FIND_DEPENDENCY], [ > > ]) > > ]) > > > > +dnl OVS_CHECK_LINUX_AF_XDP > > +dnl > > +dnl Check both Linux kernel AF_XDP and libbpf support > > +AC_DEFUN([OVS_CHECK_LINUX_AF_XDP], [ > > + AC_MSG_CHECKING([whether AF_XDP is supported]) > > + AC_ARG_ENABLE([afxdp], > > + [AC_HELP_STRING([--enable-afxdp], [Enable AF-XDP > support])], > > + [], [enable_afxdp=no]) > > + AC_CHECK_HEADER([bpf/libbpf.h], > > + [HAVE_LIBBPF=yes], > > + [HAVE_LIBBPF=no]) > > + AC_CHECK_HEADER([linux/if_xdp.h], > > + [HAVE_IF_XDP=yes], > > + [HAVE_IF_XDP=no]) > > + AM_CONDITIONAL([SUPPORT_AF_XDP], > > + [test "$enable_afxdp" = yes && test "$HAVE_LIBBPF" = > yes && test "$HAVE_IF_XDP" = yes]) > > + AM_COND_IF([SUPPORT_AF_XDP], [ > > + AC_DEFINE([HAVE_AF_XDP], [1], [Define to 1 if AF-XDP support is > available and enabled.]) > > + LIBBPF_LDADD=" 
-lbpf -lelf" > > + AC_SUBST([LIBBPF_LDADD]) > > + ]) > > +]) > > + > > I think that configure should fail in case we have no required headers. > It's confusing that I explicitly enabled afxdp, but OVS was built without > its support. > One more thing is that AC_MSG_CHECKING requires subsequent AC_MSG_RESULT, > otherwise it will look not good. > > Suggesting following incremental: > > diff --git a/acinclude.m4 b/acinclude.m4 > index d80f2494d..c919af570 100644 > --- a/acinclude.m4 > +++ b/acinclude.m4 > @@ -225,23 +225,26 @@ dnl OVS_CHECK_LINUX_AF_XDP > dnl > dnl Check both Linux kernel AF_XDP and libbpf support > AC_DEFUN([OVS_CHECK_LINUX_AF_XDP], [ > - AC_MSG_CHECKING([whether AF_XDP is supported]) > AC_ARG_ENABLE([afxdp], > [AC_HELP_STRING([--enable-afxdp], [Enable AF-XDP > support])], > [], [enable_afxdp=no]) > - AC_CHECK_HEADER([bpf/libbpf.h], > - [HAVE_LIBBPF=yes], > - [HAVE_LIBBPF=no]) > - AC_CHECK_HEADER([linux/if_xdp.h], > - [HAVE_IF_XDP=yes], > - [HAVE_IF_XDP=no]) > - AM_CONDITIONAL([SUPPORT_AF_XDP], > - [test "$enable_afxdp" = yes && test "$HAVE_LIBBPF" = yes > && test "$HAVE_IF_XDP" = yes]) > - AM_COND_IF([SUPPORT_AF_XDP], [ > - AC_DEFINE([HAVE_AF_XDP], [1], [Define to 1 if AF-XDP support is > available and enabled.]) > + AC_MSG_CHECKING([whether AF_XDP is enabled]) > + if test "$enable_afxdp" != yes; then > + AC_MSG_RESULT([no]) > + else > + AC_MSG_RESULT([yes]) > + > + AC_CHECK_HEADER([bpf/libbpf.h], [], > + [AC_MSG_ERROR([unable to find bpf/libbpf.h for AF_XDP support])]) > + > + AC_CHECK_HEADER([linux/if_xdp.h], [], > + [AC_MSG_ERROR([unable to find linux/if_xdp.h for AF_XDP support])]) > + > + AC_DEFINE([HAVE_AF_XDP], [1], > + [Define to 1 if AF-XDP support is available and enabled.]) > LIBBPF_LDADD=" -lbpf -lelf" > AC_SUBST([LIBBPF_LDADD]) > - ]) > + fi > ]) > > dnl OVS_CHECK_DPDK > --- > > > > dnl OVS_CHECK_DPDK > > dnl > > dnl Configure DPDK source tree > > diff --git a/configure.ac b/configure.ac > > index 505e3d041e93..29c90b73f836 100644 > > 
> > --- a/configure.ac
> > +++ b/configure.ac
> > @@ -99,6 +99,7 @@ OVS_CHECK_SPHINX
> >  OVS_CHECK_DOT
> >  OVS_CHECK_IF_DL
> >  OVS_CHECK_STRTOK_R
> > +OVS_CHECK_LINUX_AF_XDP
> >  AC_CHECK_DECLS([sys_siglist], [], [], [[#include <signal.h>]])
> >  AC_CHECK_MEMBERS([struct stat.st_mtim.tv_nsec, struct stat.st_mtimensec],
> >    [], [], [[#include <sys/stat.h>]])
> > diff --git a/lib/automake.mk b/lib/automake.mk
> > index cc5dccf39d6b..8b9df5635bbe 100644
> > --- a/lib/automake.mk
> > +++ b/lib/automake.mk
> > @@ -9,6 +9,7 @@ lib_LTLIBRARIES += lib/libopenvswitch.la
> >
> >  lib_libopenvswitch_la_LIBADD = $(SSL_LIBS)
> >  lib_libopenvswitch_la_LIBADD += $(CAPNG_LDADD)
> > +lib_libopenvswitch_la_LIBADD += $(LIBBPF_LDADD)
> >
> >  if WIN32
> >  lib_libopenvswitch_la_LIBADD += ${PTHREAD_LIBS}
> > @@ -327,7 +328,11 @@ lib_libopenvswitch_la_SOURCES = \
> >          lib/lldp/lldpd.c \
> >          lib/lldp/lldpd.h \
> >          lib/lldp/lldpd-structs.c \
> > -        lib/lldp/lldpd-structs.h
> > +        lib/lldp/lldpd-structs.h \
> > +        lib/xdpsock.c \
> > +        lib/xdpsock.h \
> > +        lib/netdev-afxdp.c \
> > +        lib/netdev-afxdp.h
>
> Maybe it's better to move all these files under #ifdef HAVE_AF_XDP?
>
> >
> >  if WIN32
> >  lib_libopenvswitch_la_SOURCES += \
> > diff --git a/lib/dp-packet.c b/lib/dp-packet.c
> > index 0976a35e758b..a61552f72988 100644
> > --- a/lib/dp-packet.c
> > +++ b/lib/dp-packet.c
> > @@ -22,6 +22,9 @@
> >  #include "netdev-dpdk.h"
> >  #include "openvswitch/dynamic-string.h"
> >  #include "util.h"
> > +#ifdef HAVE_AF_XDP
> > +#include "xdpsock.h"
> > +#endif
> >
> >  static void
> >  dp_packet_init__(struct dp_packet *b, size_t allocated, enum dp_packet_source source)
> > @@ -122,6 +125,16 @@ dp_packet_uninit(struct dp_packet *b)
> >               * created as a dp_packet */
> >              free_dpdk_buf((struct dp_packet*) b);
> >  #endif
> > +        } else if (b->source == DPBUF_AFXDP) {
> > +#ifdef HAVE_AF_XDP
> > +            struct dp_packet_afxdp *xpacket;
> > +
> > +            xpacket = dp_packet_cast_afxdp(b);
> > +            if (xpacket->mpool) {
> > +                umem_elem_push(xpacket->mpool, dp_packet_base(b));
> > +            }
> > +#endif
>
> Why not use the same trick as we have for DPDK a few lines above,
> i.e. wrap this part in a function like 'free_afxdp_buf' and move it
> to netdev-afxdp.c? You will not need to expose so many internals
> to generic code. dp_packet_cast_afxdp() will also be moved there along
> with 'struct dp_packet_afxdp'.
>
> BTW, I hope, someday, I'll finally implement 'dp-packet-memory-provider'
> abstraction for OVS.
>

Hi Ilya,
Can you share more detail about this idea, dp-packet-memory-provider?
Why do we need it?

Thanks
William

>
> > +            return;
> >          }
> >      }
> >  }
> > @@ -248,6 +261,8 @@ dp_packet_resize__(struct dp_packet *b, size_t new_headroom, size_t new_tailroom
> >      case DPBUF_STACK:
> >          OVS_NOT_REACHED();
> >
> > +    case DPBUF_AFXDP:
> > +        OVS_NOT_REACHED();
>
> Some space is required between cases.
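
The 'free_afxdp_buf' suggestion earlier in this hunk could look roughly like the sketch below. It is standalone C so that it compiles without OVS, and every name ending in `_sketch` is hypothetical: the generic side only sees the source tag and one function, while the AF_XDP side keeps the wrapper struct, the cast, and the pool to itself. The fixed-size pool is a stand-in for the patch's umem pool, not its real layout.

```c
#include <assert.h>
#include <stddef.h>

/* Generic side (think dp-packet.h): only the source tag is visible. */
enum packet_source_sketch { SRC_MALLOC_SKETCH, SRC_AFXDP_SKETCH };

struct packet_sketch {
    enum packet_source_sketch source;
    void *base;                     /* Start of the frame backing the packet. */
};

/* AF_XDP side (think netdev-afxdp.c): private pool and wrapper. */
#define POOL_SLOTS_SKETCH 8

struct umem_pool_sketch {
    void *slots[POOL_SLOTS_SKETCH];
    size_t n;
};

struct packet_afxdp_sketch {
    struct umem_pool_sketch *mpool;
    struct packet_sketch packet;
};

/* Recovers the wrapper from the embedded generic packet. */
static struct packet_afxdp_sketch *
packet_cast_afxdp_sketch(struct packet_sketch *p)
{
    assert(p->source == SRC_AFXDP_SKETCH);
    return (struct packet_afxdp_sketch *)
        ((char *) p - offsetof(struct packet_afxdp_sketch, packet));
}

/* The only AF_XDP entry point the generic code needs to call. */
static void
free_afxdp_buf_sketch(struct packet_sketch *p)
{
    struct packet_afxdp_sketch *xp = packet_cast_afxdp_sketch(p);

    if (xp->mpool && xp->mpool->n < POOL_SLOTS_SKETCH) {
        xp->mpool->slots[xp->mpool->n++] = p->base;
    }
}

/* Generic side again: no knowledge of pools or wrapper structs. */
static void
packet_uninit_sketch(struct packet_sketch *p)
{
    if (p->source == SRC_AFXDP_SKETCH) {
        free_afxdp_buf_sketch(p);
    }
}
```

With this split, the wrapper struct and its cast helper no longer need to appear in the generic header at all.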
>
> >      case DPBUF_STUB:
> >          b->source = DPBUF_MALLOC;
> >          new_base = xmalloc(new_allocated);
> > @@ -433,6 +448,7 @@ dp_packet_steal_data(struct dp_packet *b)
> >  {
> >      void *p;
> >      ovs_assert(b->source != DPBUF_DPDK);
> > +    ovs_assert(b->source != DPBUF_AFXDP);
> >
> >      if (b->source == DPBUF_MALLOC && dp_packet_data(b) == dp_packet_base(b)) {
> >          p = dp_packet_data(b);
> > diff --git a/lib/dp-packet.h b/lib/dp-packet.h
> > index a5e9ade1244a..774728eef330 100644
> > --- a/lib/dp-packet.h
> > +++ b/lib/dp-packet.h
> > @@ -25,6 +25,10 @@
> >  #include <rte_mbuf.h>
> >  #endif
> >
> > +#ifdef HAVE_AF_XDP
> > +#include "lib/xdpsock.h"
> > +#endif
> > +
> >  #include "netdev-dpdk.h"
> >  #include "openvswitch/list.h"
> >  #include "packets.h"
> > @@ -42,6 +46,7 @@ enum OVS_PACKED_ENUM dp_packet_source {
> >      DPBUF_DPDK,                /* buffer data is from DPDK allocated memory.
> >                                  * ref to dp_packet_init_dpdk() in dp-packet.c.
> >                                  */
> > +    DPBUF_AFXDP,                /* buffer data from XDP frame */
>
> Please, move the comment one space left.
>
> >  };
> >
> >  #define DP_PACKET_CONTEXT_SIZE 64
> > @@ -89,6 +94,20 @@ struct dp_packet {
> >      };
> >  };
> >
> > +struct dp_packet_afxdp {
> > +    struct umem_pool *mpool;
> > +    struct dp_packet packet;
> > +};
> > +
> > +#if HAVE_AF_XDP
> > +static struct dp_packet_afxdp *
> > +dp_packet_cast_afxdp(const struct dp_packet *d OVS_UNUSED)
> > +{
> > +    ovs_assert(d->source == DPBUF_AFXDP);
> > +    return CONTAINER_OF(d, struct dp_packet_afxdp, packet);
> > +}
> > +#endif
> > +
> >  static inline void *dp_packet_data(const struct dp_packet *);
> >  static inline void dp_packet_set_data(struct dp_packet *, void *);
> >  static inline void *dp_packet_base(const struct dp_packet *);
> > @@ -183,7 +202,21 @@ dp_packet_delete(struct dp_packet *b)
> >              free_dpdk_buf((struct dp_packet*) b);
> >              return;
> >          }
> > -
> > +        if (b->source == DPBUF_AFXDP) {
> > +#ifdef HAVE_AF_XDP
> > +            struct dp_packet_afxdp *xpacket;
> > +
> > +            /* if a packet is received from afxdp port,
> > +             * and tx to a system port. Then we need to
> > +             * push the rx umem back here
> > +             */
> > +            xpacket = dp_packet_cast_afxdp(b);
> > +            if (xpacket->mpool) {
> > +                umem_elem_push(xpacket->mpool, dp_packet_base(b));
> > +            }
> > +#endif
> > +            return;
> > +        }
> >          dp_packet_uninit(b);
> >          free(b);
> >      }
> > diff --git a/lib/dpif-netdev-perf.h b/lib/dpif-netdev-perf.h
> > index 859c05613ddf..e47cf73bf3c9 100644
> > --- a/lib/dpif-netdev-perf.h
> > +++ b/lib/dpif-netdev-perf.h
> > @@ -198,6 +198,19 @@ cycles_counter_update(struct pmd_perf_stats *s)
> >  {
> >  #ifdef DPDK_NETDEV
> >      return s->last_tsc = rte_get_tsc_cycles();
> > +#elif HAVE_AF_XDP
> > +    union {
> > +        uint64_t tsc_64;
> > +        struct {
> > +            uint32_t lo_32;
> > +            uint32_t hi_32;
> > +        };
> > +    } tsc;
> > +    asm volatile("rdtsc" :
> > +                 "=a" (tsc.lo_32),
> > +                 "=d" (tsc.hi_32));
>
> We need to check that we're on an x86 machine.
> Build should fail, I think. For now, you may add the following code
> to the head of netdev-afxdp.c:
>
> #if !defined(__i386__) && !defined(__x86_64__)
> #error AF_XDP supported only for Linux on x86 or x86_64
> #endif
>
> > +
> > +    return s->last_tsc = tsc.tsc_64;
> >  #else
> >      return s->last_tsc = 0;
> >  #endif
> > diff --git a/lib/netdev-afxdp.c b/lib/netdev-afxdp.c
> > new file mode 100644
> > index 000000000000..4c71061fc102
> > --- /dev/null
> > +++ b/lib/netdev-afxdp.c
> > @@ -0,0 +1,589 @@
> > +/*
> > + * Copyright (c) 2018, 2019 Nicira, Inc.
> > + *
> > + * Licensed under the Apache License, Version 2.0 (the "License");
> > + * you may not use this file except in compliance with the License.
> > + * You may obtain a copy of the License at:
> > + *
> > + *     http://www.apache.org/licenses/LICENSE-2.0
> > + *
> > + * Unless required by applicable law or agreed to in writing, software
> > + * distributed under the License is distributed on an "AS IS" BASIS,
> > + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
> > + * See the License for the specific language governing permissions and
> > + * limitations under the License.
> > + */ > > + > > +#include <config.h> > > +#ifdef HAVE_AF_XDP > > +#include "netdev-linux.h" > > +#include <errno.h> > > +#include <fcntl.h> > > +#include <sys/types.h> > > +#include <netinet/in.h> > > +#include <arpa/inet.h> > > +#include <inttypes.h> > > +#include <sys/ioctl.h> > > +#include <sys/socket.h> > > +#include <sys/utsname.h> > > +#include <netpacket/packet.h> > > +#include <net/if.h> > > +#include <net/if_arp.h> > > +#include <net/route.h> > > +#include <poll.h> > > +#include <stdlib.h> > > +#include <string.h> > > +#include <unistd.h> > > + > > +#include "coverage.h" > > +#include "dp-packet.h" > > +#include "dpif-netlink.h" > > +#include "dpif-netdev.h" > > +#include "openvswitch/dynamic-string.h" > > +#include "fatal-signal.h" > > +#include "hash.h" > > +#include "openvswitch/hmap.h" > > +#include "netdev-provider.h" > > +#include "netdev-tc-offloads.h" > > +#include "netdev-vport.h" > > +#include "netlink-notifier.h" > > +#include "netlink-socket.h" > > +#include "netlink.h" > > +#include "netnsid.h" > > +#include "openvswitch/ofpbuf.h" > > +#include "openflow/openflow.h" > > +#include "ovs-atomic.h" > > +#include "packets.h" > > +#include "openvswitch/poll-loop.h" > > +#include "rtnetlink.h" > > +#include "openvswitch/shash.h" > > +#include "socket-util.h" > > +#include "sset.h" > > +#include "tc.h" > > +#include "timer.h" > > +#include "unaligned.h" > > +#include "openvswitch/vlog.h" > > +#include "util.h" > > +#include "netdev-afxdp.h" > > + > > +#include <linux/if_ether.h> > > +#include <linux/if_tun.h> > > +#include <linux/types.h> > > +#include <linux/ethtool.h> > > +#include <linux/mii.h> > > +#include <linux/rtnetlink.h> > > +#include <linux/sockios.h> > > +#include <linux/if_xdp.h> > > +#include "xdpsock.h" > > + > > +#ifndef SOL_XDP > > +#define SOL_XDP 283 > > +#endif > > +#ifndef AF_XDP > > +#define AF_XDP 44 > > +#endif > > +#ifndef PF_XDP > > +#define PF_XDP AF_XDP > > +#endif > > + > > +VLOG_DEFINE_THIS_MODULE(netdev_afxdp); > > 
> > +static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 20);
> > +
> > +#define UMEM2DESC(elem, base) ((uint64_t)((char *)elem - (char *)base))
> > +#define UMEM2XPKT(base, i) \
> > +    ALIGNED_CAST(struct dp_packet_afxdp *, (char *)base + \
> > +                 i * sizeof(struct dp_packet_afxdp))
> > +
> > +static uint32_t opt_xdp_bind_flags = XDP_COPY;
> > +static uint32_t opt_xdp_flags =
> > +    XDP_FLAGS_UPDATE_IF_NOEXIST | XDP_FLAGS_SKB_MODE;
> > +#ifdef USE_DRVMODE_DEFAULT
>
> If I define this, the build will fail.
> Should there be ifdef-else-end?
>
> > +static uint32_t opt_xdp_bind_flags = XDP_ZEROCOPY;
> > +static uint32_t opt_xdp_flags =
> > +    XDP_FLAGS_UPDATE_IF_NOEXIST | XDP_FLAGS_DRV_MODE;
> > +#endif
> > +static uint32_t prog_id;
> > +
> > +static struct xsk_umem_info *xsk_configure_umem(void *buffer, uint64_t size)
> > +{
> > +    struct xsk_umem_info *umem;
> > +    int ret;
> > +    int i;
> > +
> > +    umem = xcalloc(1, sizeof(*umem));
> > +    if (!umem) {
> > +        VLOG_FATAL("xsk config umem failed (%s)", ovs_strerror(errno));
>
> xcalloc can't fail.
>
> > +    }
> > +
> > +    ret = xsk_umem__create(&umem->umem, buffer, size, &umem->fq, &umem->cq,
> > +                           NULL);
> > +
> > +    if (ret) {
> > +        VLOG_FATAL("xsk umem create failed (%s) mode: %s",
> > +                   ovs_strerror(errno),
> > +                   opt_xdp_bind_flags == XDP_COPY ? "SKB": "DRV");
>
> Why so FATAL? Can we just return NULL and fail the
> netdev_linux_rxq_construct?
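
A minimal sketch of the non-fatal alternative asked about above, written as standalone C so that it compiles without OVS or libbpf. Every name ending in `_sketch` is hypothetical: `umem_create_sketch()` stands in for `xsk_umem__create()`, and `rxq_construct_sketch()` stands in for whichever rxq construct callback ends up calling the umem setup. Plain `calloc()` appears only because `xcalloc()` is not available outside OVS.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct xsk_umem_info. */
struct umem_info_sketch {
    void *buffer;
    uint64_t size;
};

/* Hypothetical stand-in for xsk_umem__create(): returns 0 on success and
 * a negative errno value when the arguments are unusable. */
static int
umem_create_sketch(void *buffer, uint64_t size)
{
    return buffer && size ? 0 : -EINVAL;
}

/* Returns NULL instead of aborting, so the caller decides what to do. */
static struct umem_info_sketch *
configure_umem_sketch(void *buffer, uint64_t size)
{
    struct umem_info_sketch *umem = calloc(1, sizeof *umem);

    if (!umem) {
        return NULL;
    }
    if (umem_create_sketch(buffer, size)) {
        free(umem);
        return NULL;
    }
    umem->buffer = buffer;
    umem->size = size;
    return umem;
}

/* Turns the NULL into an errno-style return value, so only this one
 * port fails to come up while the rest of the process keeps running. */
static int
rxq_construct_sketch(void *buffer, uint64_t size,
                     struct umem_info_sketch **umemp)
{
    assert(umemp != NULL);
    *umemp = configure_umem_sketch(buffer, size);
    return *umemp ? 0 : ENODEV;
}
```

The choice of `ENODEV` here is arbitrary; the point is only that the failure travels up as a return value rather than through `VLOG_FATAL`.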
>
> > +    }
> > +
> > +    umem->buffer = buffer;
> > +
> > +    /* set-up umem pool */
> > +    umem_pool_init(&umem->mpool, NUM_FRAMES);
> > +
> > +    for (i = NUM_FRAMES - 1; i >= 0; i--) {
> > +        struct umem_elem *elem;
> > +
> > +        elem = ALIGNED_CAST(struct umem_elem *,
> > +                            (char *)umem->buffer + i * FRAME_SIZE);
> > +        umem_elem_push(&umem->mpool, elem);
> > +    }
> > +
> > +    /* set-up metadata */
> > +    xpacket_pool_init(&umem->xpool, NUM_FRAMES);
> > +
> > +    VLOG_DBG("%s xpacket pool from %p to %p", __func__,
> > +             umem->xpool.array,
> > +             (char *)umem->xpool.array +
> > +             NUM_FRAMES * sizeof(struct dp_packet_afxdp));
> > +
> > +    for (i = NUM_FRAMES - 1; i >= 0; i--) {
> > +        struct dp_packet_afxdp *xpacket;
> > +        struct dp_packet *packet;
> > +
> > +        xpacket = UMEM2XPKT(umem->xpool.array, i);
> > +        xpacket->mpool = &umem->mpool;
> > +
> > +        packet = &xpacket->packet;
> > +        packet->source = DPBUF_AFXDP;
> > +    }
> > +
> > +    return umem;
> > +}
> > +
> > +static struct xsk_socket_info *
> > +xsk_configure_socket(struct xsk_umem_info *umem, uint32_t ifindex,
> > +                     uint32_t queue_id)
> > +{
> > +    struct xsk_socket_config cfg;
> > +    struct xsk_socket_info *xsk;
> > +    char devname[IF_NAMESIZE];
> > +    uint32_t idx;
> > +    int ret;
> > +    int i;
> > +
> > +    xsk = xcalloc(1, sizeof(*xsk));
> > +    if (!xsk) {
> > +        VLOG_FATAL("xsk calloc failed (%s)", ovs_strerror(errno));
>
> xcalloc can't fail.
>
> > +    }
> > +
> > +    xsk->umem = umem;
> > +    cfg.rx_size = CONS_NUM_DESCS;
> > +    cfg.tx_size = PROD_NUM_DESCS;
> > +    cfg.libbpf_flags = 0;
> > +    cfg.xdp_flags = opt_xdp_flags;
> > +    cfg.bind_flags = opt_xdp_bind_flags;
> > +
> > +    if (if_indextoname(ifindex, devname) == NULL) {
> > +        VLOG_FATAL("ifindex %d devname failed (%s)",
> > +                   ifindex, ovs_strerror(errno));
>
> Every little misconfiguration will lead to aborting. It's probably OK
> for an experimental feature, but I don't like this.
> > > + } > > + > > + ret = xsk_socket__create(&xsk->xsk, devname, queue_id, umem->umem, > > + &xsk->rx, &xsk->tx, &cfg); > > + if (ret) { > > + VLOG_FATAL("xsk_socket_create failed (%s) mode: %s qid: %d", > > + ovs_strerror(errno), > > + opt_xdp_bind_flags == XDP_COPY ? "SKB": "DRV", > > + queue_id); > > + } > > + > > + /* make sure the XDP program is there */ > > + ret = bpf_get_link_xdp_id(ifindex, &prog_id, opt_xdp_flags); > > + if (ret) { > > + VLOG_FATAL("get XDP prog ID failed (%s)", ovs_strerror(errno)); > > + } > > + > > + ret = xsk_ring_prod__reserve(&xsk->umem->fq, > > + PROD_NUM_DESCS, > > + &idx); > > + if (ret != PROD_NUM_DESCS) { > > + VLOG_FATAL("fq set-up failed (%s)", ovs_strerror(errno)); > > + } > > + > > + for (i = 0; > > + i < PROD_NUM_DESCS * FRAME_SIZE; > > + i += FRAME_SIZE) { > > + struct umem_elem *elem; > > + uint64_t addr; > > + > > + elem = umem_elem_pop(&xsk->umem->mpool); > > + addr = UMEM2DESC(elem, xsk->umem->buffer); > > + > > + *xsk_ring_prod__fill_addr(&xsk->umem->fq, idx++) = addr; > > + } > > + > > + xsk_ring_prod__submit(&xsk->umem->fq, > > + PROD_NUM_DESCS); > > + return xsk; > > +} > > + > > +struct xsk_socket_info * > > +xsk_configure(int ifindex, int xdp_queue_id) > > +{ > > + struct xsk_socket_info *xsk; > > + struct xsk_umem_info *umem; > > + void *bufs; > > + int ret; > > + > > + ret = posix_memalign(&bufs, getpagesize(), > > + NUM_FRAMES * FRAME_SIZE); > > In the future we'll need to use HAVE_POSIX_MEMALIGN, probably. > > Do we need to clear the allocated memory? > > > + ovs_assert(!ret); > > + > > + /* Create sockets... 
*/ > > + umem = xsk_configure_umem(bufs, > > + NUM_FRAMES * FRAME_SIZE); > > + xsk = xsk_configure_socket(umem, ifindex, xdp_queue_id); > > + return xsk; > > +} > > + > > +static void OVS_UNUSED vlog_hex_dump(const void *buf, size_t count) > > +{ > > + struct ds ds = DS_EMPTY_INITIALIZER; > > + ds_put_hex_dump(&ds, buf, count, 0, false); > > + VLOG_DBG_RL(&rl, "%s", ds_cstr(&ds)); > > + ds_destroy(&ds); > > +} > > + > > +void > > +xsk_destroy(struct xsk_socket_info *xsk) > > +{ > > + struct xsk_umem *umem; > > + > > + if (!xsk) { > > + return; > > + } > > + > > + umem = xsk->umem->umem; > > + xsk_socket__delete(xsk->xsk); > > + (void)xsk_umem__delete(umem); > > + > > + /* cleanup umem pool */ > > + umem_pool_cleanup(&xsk->umem->mpool); > > + > > + /* cleanup metadata pool */ > > + xpacket_pool_cleanup(&xsk->umem->xpool); > > +} > > + > > +static inline void OVS_UNUSED > > +print_xsk_stat(struct xsk_socket_info *xsk OVS_UNUSED) { > > + struct xdp_statistics stat; > > + socklen_t optlen; > > + > > + optlen = sizeof(stat); > > Please don't parenthesize the argument of sizeof if it's the name of a variable. > > > + ovs_assert(getsockopt(xsk_socket__fd(xsk->xsk), SOL_XDP, > XDP_STATISTICS, > > + &stat, &optlen) == 0); > > + > > + VLOG_DBG_RL(&rl, "rx dropped %llu, rx_invalid %llu, tx_invalid > %llu", > > + stat.rx_dropped, > > + stat.rx_invalid_descs, > > + stat.tx_invalid_descs); > > +} > > + > > +int > > +netdev_afxdp_set_config(struct netdev *netdev, const struct smap *args, > > + char **errp OVS_UNUSED) > > +{ > > + const char *xdpmode; > > + int new_n_rxq; > > + > > + /* TODO: add mutex lock */ > > + new_n_rxq = MAX(smap_get_int(args, "n_rxq", NR_QUEUE), 1); > > + > > + if (netdev->n_rxq != new_n_rxq) { > > + > > + if (new_n_rxq > MAX_XSKQ) { > > + VLOG_WARN("set n_rxq %d too large", new_n_rxq); > > + goto out; > > Just return EINVAL. > > > + } > > + > > + netdev->n_rxq = new_n_rxq; > > This is wrong. You must not update netdev->n_rxq here.
This should > be done on reconfiguration. > > > + VLOG_INFO("set AF_XDP device %s to %d n_rxq", netdev->name, > new_n_rxq); > > + netdev_request_reconfigure(netdev); > > + } > > + > > + xdpmode = smap_get(args, "xdpmode"); > > + if (xdpmode && strncmp(xdpmode, "drv", 3) == 0) { > > + if (opt_xdp_bind_flags != XDP_ZEROCOPY) { > > + opt_xdp_bind_flags = XDP_ZEROCOPY; > > + opt_xdp_flags = XDP_FLAGS_UPDATE_IF_NOEXIST | > XDP_FLAGS_DRV_MODE; > > + } > > + VLOG_INFO("AF_XDP device %s in ZC driver mode", netdev->name); > > + } else { > > + opt_xdp_bind_flags = XDP_COPY; > > + opt_xdp_flags = XDP_FLAGS_UPDATE_IF_NOEXIST | > XDP_FLAGS_SKB_MODE; > > + VLOG_INFO("AF_XDP device %s in SKB mode", netdev->name); > > + } > > It looks like changing "xdpmode" after the port has already been added will > lead to incorrect behavior. You probably need to forbid this or > prepare a proper reconfiguration process. > > > + > > +out: > > + return 0; > > +} > > + > > +int > > +netdev_afxdp_get_config(const struct netdev *netdev, struct smap *args) > > +{ > > + /* TODO: add mutex lock */ > > + smap_add_format(args, "n_rxq", "%d", netdev->n_rxq); > > + smap_add_format(args, "xdpmode", "%s", > > + opt_xdp_bind_flags == XDP_ZEROCOPY ? "drv" : "skb"); > > + > > + return 0; > > +} > > + > > +int > > +netdev_afxdp_get_numa_id(const struct netdev *netdev) > > +{ > > + /* FIXME: Get netdev's PCIe device ID, then find > > + * its NUMA node id.
> > + */ > > + VLOG_INFO("FIXME: Device %s always use numa id 0", netdev->name); > > + return 0; > > +} > > + > > +void > > +xsk_remove_xdp_program(uint32_t ifindex) > > +{ > > + uint32_t curr_prog_id = 0; > > + > > + /* remove_xdp_program() */ > > + if (bpf_get_link_xdp_id(ifindex, &curr_prog_id, opt_xdp_flags)) { > > + bpf_set_link_xdp_fd(ifindex, -1, opt_xdp_flags); > > + } > > + if (prog_id == curr_prog_id) { > > + bpf_set_link_xdp_fd(ifindex, -1, opt_xdp_flags); > > + } else if (!curr_prog_id) { > > + VLOG_WARN("couldn't find a prog id on a given interface"); > > + } else { > > + VLOG_WARN("program on interface changed, not removing"); > > + } > > +} > > + > > +/* Receive packet from AF_XDP socket */ > > +int > > +netdev_linux_rxq_xsk(struct xsk_socket_info *xsk, > > + struct dp_packet_batch *batch) > > +{ > > + unsigned int rcvd, i; > > + uint32_t idx_rx = 0, idx_fq = 0; > > + int ret = 0; > > + > > + /* See if there is any packet on RX queue, > > + * if yes, idx_rx is the index having the packet. > > + */ > > + rcvd = xsk_ring_cons__peek(&xsk->rx, BATCH_SIZE, &idx_rx); > > + if (!rcvd) { > > + return 0; > > + } > > + > > + /* Form a dp_packet batch from descriptor in RX queue */ > > + for (i = 0; i < rcvd; i++) { > > + uint64_t addr = xsk_ring_cons__rx_desc(&xsk->rx, idx_rx)->addr; > > + uint32_t len = xsk_ring_cons__rx_desc(&xsk->rx, idx_rx)->len; > > + char *pkt = xsk_umem__get_data(xsk->umem->buffer, addr); > > + uint64_t index; > > + > > + struct dp_packet_afxdp *xpacket; > > + struct dp_packet *packet; > > + > > + index = addr >> FRAME_SHIFT; > > + xpacket = UMEM2XPKT(xsk->umem->xpool.array, index); > > + > > + packet = &xpacket->packet; > > + xpacket->mpool = &xsk->umem->mpool; > > + > > + if (packet->source != DPBUF_AFXDP) { > > + /* FIXME: might be a bug */ > > Need to log something here. Rate-limited. 
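To illustrate what "rate-limited" means for the log message requested above, here is a small token-bucket sketch. In the patch itself this would presumably just reuse the existing `rl` with one of the OVS rate-limited logging macros such as VLOG_WARN_RL; the `rl_sketch` structure and `rl_should_log()` below are hypothetical names and are not how OVS implements it:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical token bucket: 'rate' tokens are added per second, up to
 * 'burst'.  Each log message consumes one token. */
struct rl_sketch {
    unsigned int rate;      /* Tokens added per second. */
    unsigned int burst;     /* Bucket capacity. */
    unsigned int tokens;    /* Tokens currently available. */
    long long last;         /* Time of the last refill, in seconds. */
};

/* Returns true if a message may be logged at time 'now' (in seconds),
 * false if it should be suppressed. */
static bool
rl_should_log(struct rl_sketch *rl, long long now)
{
    assert(rl->tokens <= rl->burst);

    if (now > rl->last) {
        unsigned long long refill;
        unsigned int room = rl->burst - rl->tokens;

        refill = (unsigned long long) (now - rl->last) * rl->rate;
        rl->tokens = refill >= room ? rl->burst
                                    : rl->tokens + (unsigned int) refill;
        rl->last = now;
    }

    if (!rl->tokens) {
        return false;
    }
    rl->tokens--;
    return true;
}
```

A per-packet error path guarded this way can emit at most 'burst' messages back to back, which keeps a stream of bad descriptors from flooding the log.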
> > > + continue; > > + } > > + > > + /* Initialize the struct dp_packet */ > > + if (opt_xdp_bind_flags == XDP_ZEROCOPY) { > > + dp_packet_set_base(packet, pkt - FRAME_HEADROOM); > > + } else { > > + /* SKB mode */ > > + dp_packet_set_base(packet, pkt); > > + } > > + dp_packet_set_data(packet, pkt); > > + dp_packet_set_size(packet, len); > > + > > + /* Add packet into batch, increase batch->count */ > > + dp_packet_batch_add(batch, packet); > > + > > + idx_rx++; > > + } > > + > > + /* We've consume rcvd packets in RX, now re-fill the > > + * same number back to FILL queue. > > + */ > > + for (i = 0; i < rcvd; i++) { > > + uint64_t index; > > + struct umem_elem *elem; > > + > > + ret = xsk_ring_prod__reserve(&xsk->umem->fq, 1, &idx_fq); > > + while (ret == 0) { > > + /* The FILL queue is full, so retry. (or skip)? */ > > + ret = xsk_ring_prod__reserve(&xsk->umem->fq, 1, &idx_fq); > > + } > > + > > + /* Get one free umem, program it into FILL queue */ > > + elem = umem_elem_pop(&xsk->umem->mpool); > > + index = (uint64_t)((char *)elem - (char *)xsk->umem->buffer); > > + ovs_assert((index & FRAME_SHIFT_MASK) == 0); > > + *xsk_ring_prod__fill_addr(&xsk->umem->fq, idx_fq) = index; > > + > > + idx_fq++; > > + } > > + xsk_ring_prod__submit(&xsk->umem->fq, rcvd); > > + > > + /* Release the RX queue */ > > + xsk_ring_cons__release(&xsk->rx, rcvd); > > + xsk->rx_npkts += rcvd; > > + > > +#ifdef AFXDP_DEBUG > > + print_xsk_stat(xsk); > > +#endif > > + return 0; > > +} > > + > > +static void kick_tx(struct xsk_socket_info *xsk) > > +{ > > + int ret; > > + > > + ret = sendto(xsk_socket__fd(xsk->xsk), NULL, 0, MSG_DONTWAIT, NULL, > 0); > > + if (ret >= 0 || errno == ENOBUFS || errno == EAGAIN || errno == > EBUSY) { > > + return; > > + } > > +} > > + > > +int > > +netdev_linux_afxdp_batch_send(struct xsk_socket_info *xsk, > > + struct dp_packet_batch *batch) > > +{ > > + uint32_t tx_done, idx_cq = 0; > > + struct dp_packet *packet; > > + uint32_t idx; > > + int j; > > + > > + /* 
Make sure we have enough TX descs */ > > + if (xsk_ring_prod__reserve(&xsk->tx, batch->count, &idx) == 0) { > > + return -EAGAIN; > > + } > > + > > + DP_PACKET_BATCH_FOR_EACH (i, packet, batch) { > > + struct dp_packet_afxdp *xpacket; > > + struct umem_elem *elem; > > + uint64_t index; > > + > > + elem = umem_elem_pop(&xsk->umem->mpool); > > + if (!elem) { > > + return -EAGAIN; > > + } > > + > > + memcpy(elem, dp_packet_data(packet), dp_packet_size(packet)); > > + > > + index = (uint64_t)((char *)elem - (char *)xsk->umem->buffer); > > + xsk_ring_prod__tx_desc(&xsk->tx, idx + i)->addr = index; > > + xsk_ring_prod__tx_desc(&xsk->tx, idx + i)->len > > + = dp_packet_size(packet); > > + > > + if (packet->source == DPBUF_AFXDP) { > > + xpacket = dp_packet_cast_afxdp(packet); > > + umem_elem_push(xpacket->mpool, dp_packet_base(packet)); > > + /* Avoid freeing it twice at dp_packet_uninit */ > > + xpacket->mpool = NULL; > > Why are you freeing packets here? 'netdev_linux_send' will do that for you. > > > + } > > + } > > + xsk_ring_prod__submit(&xsk->tx, batch->count); > > + xsk->outstanding_tx += batch->count; > > + > > +retry: > > + kick_tx(xsk); > > + > > + /* Process CQ */ > > Maybe it's better to process the CQ on rx? > It's unknown when we'll be here next time, but we'll definitely > call the rx function soon.
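A rough sketch of what processing the CQ on the rx side could look like, under heavy simplification. The ring and pool types below (`cq_sketch`, `pool_sketch`) and the functions `reclaim_tx_frames()` and `rx_poll_sketch()` are hypothetical stand-ins for the libbpf completion ring and the umem pool, not the real structures; the sketch only shows the reclaim step being shared and called at the start of the receive path:

```c
#include <assert.h>
#include <stdint.h>

#define RING_SZ 8u      /* Hypothetical ring size, a power of two. */

/* Hypothetical completion ring: the producer side advances 'prod' as
 * transmissions complete, the consumer reads entries at 'cons'. */
struct cq_sketch {
    uint64_t addr[RING_SZ];
    uint32_t prod;
    uint32_t cons;
};

/* Hypothetical LIFO pool of free frame addresses. */
struct pool_sketch {
    uint64_t frame[RING_SZ];
    uint32_t n;
};

/* Moves every completed frame address back into the free pool and
 * returns how many were reclaimed. */
static uint32_t
reclaim_tx_frames(struct cq_sketch *cq, struct pool_sketch *pool,
                  uint32_t *outstanding_tx)
{
    uint32_t done = 0;

    while (cq->cons != cq->prod && pool->n < RING_SZ) {
        pool->frame[pool->n++] = cq->addr[cq->cons++ % RING_SZ];
        done++;
    }

    assert(done <= *outstanding_tx);
    *outstanding_tx -= done;
    return done;
}

/* Hypothetical receive path: reclaim completed TX frames first, since
 * this function is polled regularly, then receive as before. */
static uint32_t
rx_poll_sketch(struct cq_sketch *cq, struct pool_sketch *pool,
               uint32_t *outstanding_tx)
{
    uint32_t reclaimed = reclaim_tx_frames(cq, pool, outstanding_tx);

    /* ... the RX ring would be peeked and the batch filled here ... */
    return reclaimed;
}
```

The send path could call the same helper, so frames are returned to the pool regardless of which path runs next.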
> > > + tx_done = xsk_ring_cons__peek(&xsk->umem->cq, batch->count, > &idx_cq); > > + if (tx_done > 0) { > > + xsk->outstanding_tx -= tx_done; > > + xsk->tx_npkts += tx_done; > > + } > > + > > + /* Recycle back to umem pool */ > > + for (j = 0; j < tx_done; j++) { > > + struct umem_elem *elem; > > + uint64_t addr; > > + > > + addr = *xsk_ring_cons__comp_addr(&xsk->umem->cq, idx_cq++); > > + > > + elem = ALIGNED_CAST(struct umem_elem *, > > + (char *)xsk->umem->buffer + addr); > > + umem_elem_push(&xsk->umem->mpool, elem); > > + } > > + xsk_ring_cons__release(&xsk->umem->cq, tx_done); > > + > > + if (xsk->outstanding_tx > PROD_NUM_DESCS - (PROD_NUM_DESCS >> 2)) { > > + /* If there are still a lot not transmitted, > > + * try harder. > > + */ > > + goto retry; > > + } > > + > > + return 0; > > +} > > + > > +#else > > +#include "openvswitch/compiler.h" > > +#include "netdev-afxdp.h" > > + > > +struct xsk_socket_info * > > +xsk_configure(int ifindex OVS_UNUSED, int xdp_queue_id OVS_UNUSED) > > +{ > > + return NULL; > > +} > > + > > +void > > +xsk_destroy(struct xsk_socket_info *xsk OVS_UNUSED) > > +{ > > +} > > + > > +int > > +netdev_linux_rxq_xsk(struct xsk_socket_info *xsk OVS_UNUSED, > > + struct dp_packet_batch *batch OVS_UNUSED) > > +{ > > + return 0; > > +} > > + > > +int > > +netdev_linux_afxdp_batch_send(struct xsk_socket_info *xsk OVS_UNUSED, > > + struct dp_packet_batch *batch OVS_UNUSED) > > +{ > > + return 0; > > +} > > + > > +int > > +netdev_afxdp_set_config(struct netdev *netdev OVS_UNUSED, > > + const struct smap *args OVS_UNUSED, > > + char **errp OVS_UNUSED) > > +{ > > + return 0; > > +} > > + > > +int > > +netdev_afxdp_get_config(const struct netdev *netdev OVS_UNUSED, > > + struct smap *args OVS_UNUSED) > > +{ > > + return 0; > > +} > > + > > +int > > +netdev_afxdp_get_numa_id(const struct netdev *netdev OVS_UNUSED) > > +{ > > + return 0; > > +} > > +#endif > > diff --git a/lib/netdev-afxdp.h b/lib/netdev-afxdp.h > > new file mode 100644 > > index 
000000000000..ea05612a7c0f > > --- /dev/null > > +++ b/lib/netdev-afxdp.h > > @@ -0,0 +1,47 @@ > > +/* > > + * Copyright (c) 2018 Nicira, Inc. > > + * > > + * Licensed under the Apache License, Version 2.0 (the "License"); > > + * you may not use this file except in compliance with the License. > > + * You may obtain a copy of the License at: > > + * > > + * http://www.apache.org/licenses/LICENSE-2.0 > > + * > > + * Unless required by applicable law or agreed to in writing, software > > + * distributed under the License is distributed on an "AS IS" BASIS, > > + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or > implied. > > + * See the License for the specific language governing permissions and > > + * limitations under the License. > > + */ > > + > > +#ifndef NETDEV_AFXDP_H > > +#define NETDEV_AFXDP_H 1 > > + > > +#include <stdint.h> > > +#include <stdbool.h> > > + > > +/* These functions are Linux AF_XDP specific, so they should be used > directly > > + * only by Linux-specific code. 
*/ > > +#define MAX_XSKQ 16 > > +struct netdev; > > +struct xsk_socket_info; > > +struct xdp_umem; > > +struct dp_packet_batch; > > +struct smap; > > + > > +struct xsk_socket_info *xsk_configure(int ifindex, int xdp_queue_id); > > +void xsk_destroy(struct xsk_socket_info *xsk); > > + > > +int netdev_linux_rxq_xsk(struct xsk_socket_info *xsk, > > + struct dp_packet_batch *batch); > > + > > +int netdev_linux_afxdp_batch_send(struct xsk_socket_info *xsk, > > + struct dp_packet_batch *batch); > > + > > +void xsk_remove_xdp_program(uint32_t ifindex); > > +int netdev_afxdp_set_config(struct netdev *netdev, const struct smap > *args, > > + char **errp); > > +int netdev_afxdp_get_config(const struct netdev *netdev, struct smap > *args); > > +int netdev_afxdp_get_numa_id(const struct netdev *netdev); > > + > > +#endif /* netdev-afxdp.h */ > > diff --git a/lib/netdev-linux.c b/lib/netdev-linux.c > > index f75d73fd39f8..337760ca3333 100644 > > --- a/lib/netdev-linux.c > > +++ b/lib/netdev-linux.c > > @@ -75,6 +75,7 @@ > > #include "unaligned.h" > > #include "openvswitch/vlog.h" > > #include "util.h" > > +#include "netdev-afxdp.h" > > > > VLOG_DEFINE_THIS_MODULE(netdev_linux); > > > > @@ -531,6 +532,7 @@ struct netdev_linux { > > > > /* LAG information. */ > > bool is_lag_master; /* True if the netdev is a LAG master. 
> */ > > + struct xsk_socket_info *xsk[MAX_XSKQ]; /* af_xdp socket */ > > }; > > > > struct netdev_rxq_linux { > > @@ -580,12 +582,18 @@ is_netdev_linux_class(const struct netdev_class > *netdev_class) > > } > > > > static bool > > +is_afxdp_netdev(const struct netdev *netdev) > > +{ > > + return netdev_get_class(netdev) == &netdev_afxdp_class; > > +} > > + > > +static bool > > is_tap_netdev(const struct netdev *netdev) > > { > > return netdev_get_class(netdev) == &netdev_tap_class; > > } > > > > -static struct netdev_linux * > > +struct netdev_linux * > > netdev_linux_cast(const struct netdev *netdev) > > { > > ovs_assert(is_netdev_linux_class(netdev_get_class(netdev))); > > @@ -1084,6 +1092,25 @@ netdev_linux_destruct(struct netdev *netdev_) > > atomic_count_dec(&miimon_cnt); > > } > > > > +#if HAVE_AF_XDP > > + if (is_afxdp_netdev(netdev_)) { > > + int ifindex; > > + int ret, i; > > + > > + ret = get_ifindex(netdev_, &ifindex); > > + if (ret) { > > + VLOG_ERR("get ifindex error"); > > + } else { > > + for (i = 0; i < MAX_XSKQ; i++) { > > + if (netdev->xsk[i]) { > > + VLOG_INFO("destroy xsk[%d]", i); > > + xsk_destroy(netdev->xsk[i]); > > + } > > + } > > + xsk_remove_xdp_program(ifindex); > > + } > > + } > > +#endif > > ovs_mutex_destroy(&netdev->mutex); > > } > > > > @@ -1113,6 +1140,32 @@ netdev_linux_rxq_construct(struct netdev_rxq > *rxq_) > > rx->is_tap = is_tap_netdev(netdev_); > > if (rx->is_tap) { > > rx->fd = netdev->tap_fd; > > + } else if (is_afxdp_netdev(netdev_)) { > > +#if HAVE_AF_XDP > > + struct rlimit r = {RLIM_INFINITY, RLIM_INFINITY}; > > + int ifindex; > > + int xdp_queue_id = rxq_->queue_id; > > + struct xsk_socket_info *xsk; > > + > > + if (setrlimit(RLIMIT_MEMLOCK, &r)) { > > + VLOG_ERR("ERROR: setrlimit(RLIMIT_MEMLOCK) \"%s\"\n", > > + ovs_strerror(errno)); > > + ovs_assert(0); > > + } > > + > > + VLOG_DBG("%s: %s: queue=%d configuring xdp sock", > > + __func__, netdev_->name, xdp_queue_id); > > + > > + /* Get ethernet device index. 
*/ > > + error = get_ifindex(&netdev->up, &ifindex); > > + if (error) { > > + goto error; > > + } > > + > > + xsk = xsk_configure(ifindex, xdp_queue_id); > > + netdev->xsk[xdp_queue_id] = xsk; > > + rx->fd = xsk_socket__fd(xsk->xsk); /* for netdev layer to poll > */ > > +#endif > > } else { > > struct sockaddr_ll sll; > > int ifindex, val; > > @@ -1318,9 +1371,16 @@ netdev_linux_rxq_recv(struct netdev_rxq *rxq_, > struct dp_packet_batch *batch, > > { > > struct netdev_rxq_linux *rx = netdev_rxq_linux_cast(rxq_); > > struct netdev *netdev = rx->up.netdev; > > - struct dp_packet *buffer; > > + struct dp_packet *buffer = NULL; > > ssize_t retval; > > int mtu; > > + struct netdev_linux *netdev_ = netdev_linux_cast(netdev); > > + > > + if (is_afxdp_netdev(netdev)) { > > + int qid = rxq_->queue_id; > > + > > + return netdev_linux_rxq_xsk(netdev_->xsk[qid], batch); > > + } > > > > if (netdev_linux_get_mtu__(netdev_linux_cast(netdev), &mtu)) { > > mtu = ETH_PAYLOAD_MAX; > > @@ -1329,6 +1389,7 @@ netdev_linux_rxq_recv(struct netdev_rxq *rxq_, > struct dp_packet_batch *batch, > > /* Assume Ethernet port. No need to set packet_type. */ > > buffer = dp_packet_new_with_headroom(VLAN_ETH_HEADER_LEN + mtu, > > DP_NETDEV_HEADROOM); > > + > > retval = (rx->is_tap > > ? netdev_linux_rxq_recv_tap(rx->fd, buffer) > > : netdev_linux_rxq_recv_sock(rx->fd, buffer)); > > @@ -1473,14 +1534,15 @@ netdev_linux_tap_batch_send(struct netdev > *netdev_, > > * The kernel maintains a packet transmission queue, so the caller is > not > > * expected to do additional queuing of packets. 
*/ > > static int > > -netdev_linux_send(struct netdev *netdev_, int qid OVS_UNUSED, > > +netdev_linux_send(struct netdev *netdev_, int qid, > > struct dp_packet_batch *batch, > > bool concurrent_txq OVS_UNUSED) > > { > > int error = 0; > > int sock = 0; > > > > - if (!is_tap_netdev(netdev_)) { > > + if (!is_tap_netdev(netdev_) && > > + !is_afxdp_netdev(netdev_)) { > > if (netdev_linux_netnsid_is_remote(netdev_linux_cast(netdev_))) > { > > error = EOPNOTSUPP; > > goto free_batch; > > @@ -1499,6 +1561,10 @@ netdev_linux_send(struct netdev *netdev_, int qid > OVS_UNUSED, > > } > > > > error = netdev_linux_sock_batch_send(sock, ifindex, batch); > > + } else if (is_afxdp_netdev(netdev_)) { > > + struct netdev_linux *netdev = netdev_linux_cast(netdev_); > > + > > + error = netdev_linux_afxdp_batch_send(netdev->xsk[qid], batch); > > } else { > > error = netdev_linux_tap_batch_send(netdev_, batch); > > } > > @@ -3323,6 +3389,7 @@ const struct netdev_class netdev_linux_class = { > > NETDEV_LINUX_CLASS_COMMON, > > LINUX_FLOW_OFFLOAD_API, > > .type = "system", > > + .is_pmd = false, > > .construct = netdev_linux_construct, > > .get_stats = netdev_linux_get_stats, > > .get_features = netdev_linux_get_features, > > @@ -3333,6 +3400,7 @@ const struct netdev_class netdev_linux_class = { > > const struct netdev_class netdev_tap_class = { > > NETDEV_LINUX_CLASS_COMMON, > > .type = "tap", > > + .is_pmd = false, > > .construct = netdev_linux_construct_tap, > > .get_stats = netdev_tap_get_stats, > > .get_features = netdev_linux_get_features, > > @@ -3343,10 +3411,23 @@ const struct netdev_class netdev_internal_class > = { > > NETDEV_LINUX_CLASS_COMMON, > > LINUX_FLOW_OFFLOAD_API, > > .type = "internal", > > + .is_pmd = false, > > .construct = netdev_linux_construct, > > .get_stats = netdev_internal_get_stats, > > .get_status = netdev_internal_get_status, > > }; > > + > > +const struct netdev_class netdev_afxdp_class = { > > + NETDEV_LINUX_CLASS_COMMON, > > + .type = "afxdp", > > + 
.is_pmd = true, > > + .construct = netdev_linux_construct, > > + .get_stats = netdev_linux_get_stats, > > + .get_status = netdev_linux_get_status, > > + .set_config = netdev_afxdp_set_config, > > + .get_config = netdev_afxdp_get_config, > > + .get_numa_id = netdev_afxdp_get_numa_id, > > +}; > > > > > > #define CODEL_N_QUEUES 0x0000 > > diff --git a/lib/netdev-linux.h b/lib/netdev-linux.h > > index 17ca9120168a..afcb20ee8d0a 100644 > > --- a/lib/netdev-linux.h > > +++ b/lib/netdev-linux.h > > @@ -28,6 +28,7 @@ struct netdev; > > int netdev_linux_ethtool_set_flag(struct netdev *netdev, uint32_t flag, > > const char *flag_name, bool enable); > > int linux_get_ifindex(const char *netdev_name); > > +struct netdev_linux *netdev_linux_cast(const struct netdev *netdev); > > > > #define LINUX_FLOW_OFFLOAD_API \ > > .flow_flush = netdev_tc_flow_flush, \ > > diff --git a/lib/netdev-provider.h b/lib/netdev-provider.h > > index fb0c27e6e8e8..5bf041316503 100644 > > --- a/lib/netdev-provider.h > > +++ b/lib/netdev-provider.h > > @@ -902,6 +902,7 @@ extern const struct netdev_class netdev_linux_class; > > #endif > > extern const struct netdev_class netdev_internal_class; > > extern const struct netdev_class netdev_tap_class; > > +extern const struct netdev_class netdev_afxdp_class; > > > > #ifdef __cplusplus > > } > > diff --git a/lib/netdev.c b/lib/netdev.c > > index 7d7ecf6f0946..c30016b34033 100644 > > --- a/lib/netdev.c > > +++ b/lib/netdev.c > > @@ -145,6 +145,7 @@ netdev_initialize(void) > > netdev_register_provider(&netdev_linux_class); > > netdev_register_provider(&netdev_internal_class); > > netdev_register_provider(&netdev_tap_class); > > + netdev_register_provider(&netdev_afxdp_class); > > netdev_vport_tunnel_register(); > > #endif > > #if defined(__FreeBSD__) || defined(__NetBSD__) > > diff --git a/lib/xdpsock.c b/lib/xdpsock.c > > new file mode 100644 > > index 000000000000..f9fe94b9e36a > > --- /dev/null > > +++ b/lib/xdpsock.c > > @@ -0,0 +1,210 @@ > > +/* > > + * 
Copyright (c) 2018, 2019 Nicira, Inc. > > + * > > + * Licensed under the Apache License, Version 2.0 (the "License"); > > + * you may not use this file except in compliance with the License. > > + * You may obtain a copy of the License at: > > + * > > + * http://www.apache.org/licenses/LICENSE-2.0 > > + * > > + * Unless required by applicable law or agreed to in writing, software > > + * distributed under the License is distributed on an "AS IS" BASIS, > > + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or > implied. > > + * See the License for the specific language governing permissions and > > + * limitations under the License. > > + */ > > +#include <config.h> > > +#include <ctype.h> > > +#include <errno.h> > > +#include <fcntl.h> > > +#include <stdarg.h> > > +#include <stdlib.h> > > +#include <string.h> > > +#include <sys/stat.h> > > +#include <sys/types.h> > > +#include <syslog.h> > > +#include <time.h> > > +#include <unistd.h> > > +#include "openvswitch/vlog.h" > > +#include "async-append.h" > > +#include "coverage.h" > > +#include "dirs.h" > > +#include "ovs-thread.h" > > +#include "sat-math.h" > > +#include "socket-util.h" > > +#include "svec.h" > > +#include "syslog-direct.h" > > +#include "syslog-libc.h" > > +#include "syslog-provider.h" > > +#include "timeval.h" > > +#include "unixctl.h" > > +#include "util.h" > > +#include "ovs-atomic.h" > > +#include "openvswitch/compiler.h" > > +#include "dp-packet.h" > > + > > +#ifdef HAVE_AF_XDP > > +#include "xdpsock.h" > > + > > +static inline void ovs_spinlock_init(ovs_spinlock_t *sl) > > +{ > > + sl->locked = 0; > > +} > > + > > +static inline void ovs_spin_lock(ovs_spinlock_t *sl) > > +{ > > + int exp = 0; > > + > > + while (!__atomic_compare_exchange_n(&sl->locked, &exp, 1, 0, > > + __ATOMIC_ACQUIRE, __ATOMIC_RELAXED)) { > > + while (__atomic_load_n(&sl->locked, __ATOMIC_RELAXED)) { > > > These atomics are compiler specific. 
Please use: > > while (!atomic_compare_exchange_strong_explicit(&sl->locked, &exp, 1, > memory_order_acquire, > memory_order_relaxed)) > { > locked = 1; > while (locked) { > atomic_read_relaxed(&sl->locked, &locked); > } > exp = 0; > } > > > + ; > > + } > > + exp = 0; > > + } > > +} > > + > > +static inline void ovs_spin_unlock(ovs_spinlock_t *sl) > > +{ > > + __atomic_store_n(&sl->locked, 0, __ATOMIC_RELEASE); > > atomic_store_explicit(&sl->locked, 0, memory_order_release); > > > +} > > + > > +static inline int OVS_UNUSED ovs_spin_trylock(ovs_spinlock_t *sl) > > +{ > > + int exp = 0; > > + return __atomic_compare_exchange_n(&sl->locked, &exp, 1, > > + 0, /* disallow spurious failure */ > > + __ATOMIC_ACQUIRE, __ATOMIC_RELAXED); > > > return atomic_compare_exchange_strong_explicit(&sl->locked, &exp, 1, > memory_order_acquire, > memory_order_relaxed); > > > > +} > > + > > +void > > +__umem_elem_push_n(struct umem_pool *umemp OVS_UNUSED, void **addrs, > int n) > > +{ > > + void *ptr; > > + > > + if (OVS_UNLIKELY(umemp->index + n > umemp->size)) { > > + OVS_NOT_REACHED(); > > + } > > + > > + ptr = &umemp->array[umemp->index]; > > + memcpy(ptr, addrs, n * sizeof(void *)); > > + umemp->index += n; > > +} > > + > > +inline void > > +__umem_elem_push(struct umem_pool *umemp OVS_UNUSED, void *addr) > > +{ > > + umemp->array[umemp->index++] = addr; > > +} > > + > > +void > > +umem_elem_push(struct umem_pool *umemp OVS_UNUSED, void *addr) > > +{ > > + > > + if (OVS_UNLIKELY(umemp->index >= umemp->size)) { > > + /* stack is full */ > > + /* it's possible that one umem gets pushed twice, > > + * because actions=1,2,3... multiple ports? 
> > + */ > > + OVS_NOT_REACHED(); > > + } > > + > > + ovs_assert(((uint64_t)addr & FRAME_SHIFT_MASK) == 0); > > + > > + ovs_spin_lock(&umemp->mutex); > > + __umem_elem_push(umemp, addr); > > + ovs_spin_unlock(&umemp->mutex); > > +} > > + > > +void > > +__umem_elem_pop_n(struct umem_pool *umemp OVS_UNUSED, void **addrs, int > n) > > +{ > > + void *ptr; > > + > > + umemp->index -= n; > > + > > + if (OVS_UNLIKELY(umemp->index < 0)) { > > + OVS_NOT_REACHED(); > > + } > > + > > + ptr = &umemp->array[umemp->index]; > > + memcpy(addrs, ptr, n * sizeof(void *)); > > +} > > + > > +inline void * > > +__umem_elem_pop(struct umem_pool *umemp OVS_UNUSED) > > +{ > > + return umemp->array[--umemp->index]; > > +} > > + > > +void * > > +umem_elem_pop(struct umem_pool *umemp OVS_UNUSED) > > +{ > > + void *ptr; > > + > > + ovs_spin_lock(&umemp->mutex); > > + ptr = __umem_elem_pop(umemp); > > + ovs_spin_unlock(&umemp->mutex); > > + > > + return ptr; > > +} > > + > > +void ** > > +__umem_pool_alloc(unsigned int size) > > +{ > > + void *bufs; > > + > > + ovs_assert(posix_memalign(&bufs, getpagesize(), > > + size * sizeof(void *)) == 0); > > + memset(bufs, 0, size * sizeof(void *)); > > + return (void **)bufs; > > +} > > + > > +unsigned int > > +umem_elem_count(struct umem_pool *mpool) > > +{ > > + return mpool->index; > > +} > > + > > +int > > +umem_pool_init(struct umem_pool *umemp OVS_UNUSED, unsigned int size) > > +{ > > + umemp->array = __umem_pool_alloc(size); > > + if (!umemp->array) { > > + OVS_NOT_REACHED(); > > + } > > + > > + umemp->size = size; > > + umemp->index = 0; > > + ovs_spinlock_init(&umemp->mutex); > > + return 0; > > +} > > + > > +void > > +umem_pool_cleanup(struct umem_pool *umemp OVS_UNUSED) > > +{ > > + free(umemp->array); > > +} > > + > > +/* AF_XDP metadata init/destroy */ > > +int > > +xpacket_pool_init(struct xpacket_pool *xp, unsigned int size) > > +{ > > + void *bufs; > > + > > + ovs_assert(posix_memalign(&bufs, getpagesize(), > > + size * sizeof(struct 
dp_packet_afxdp)) == > 0); > > + memset(bufs, 0, size * sizeof(struct dp_packet_afxdp)); > > + > > + xp->array = bufs; > > + xp->size = size; > > + return 0; > > +} > > + > > +void > > +xpacket_pool_cleanup(struct xpacket_pool *xp) > > +{ > > + free(xp->array); > > +} > > +#else /* !HAVE_AF_XDP below */ > > +#endif > > diff --git a/lib/xdpsock.h b/lib/xdpsock.h > > new file mode 100644 > > index 000000000000..cb64befe7dba > > --- /dev/null > > +++ b/lib/xdpsock.h > > @@ -0,0 +1,133 @@ > > +/* > > + * Copyright (c) 2018, 2019 Nicira, Inc. > > + * > > + * Licensed under the Apache License, Version 2.0 (the "License"); > > + * you may not use this file except in compliance with the License. > > + * You may obtain a copy of the License at: > > + * > > + * http://www.apache.org/licenses/LICENSE-2.0 > > + * > > + * Unless required by applicable law or agreed to in writing, software > > + * distributed under the License is distributed on an "AS IS" BASIS, > > + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or > implied. > > + * See the License for the specific language governing permissions and > > + * limitations under the License. 
> > + */ > > +#ifndef XDPSOCK_H > > +#define XDPSOCK_H 1 > > +#include <errno.h> > > +#include <getopt.h> > > +#include <libgen.h> > > +#include <linux/bpf.h> > > +#include <linux/if_link.h> > > +#include <linux/if_xdp.h> > > +#include <linux/if_ether.h> > > +#include <net/if.h> > > +#include <signal.h> > > +#include <stdbool.h> > > +#include <stdio.h> > > +#include <stdlib.h> > > +#include <string.h> > > +#include <net/ethernet.h> > > +#include <sys/resource.h> > > +#include <sys/socket.h> > > +#include <sys/mman.h> > > +#include <time.h> > > +#include <unistd.h> > > +#include <pthread.h> > > +#include <locale.h> > > +#include <sys/types.h> > > +#include <poll.h> > > +#include <bpf/libbpf.h> > > + > > +#include "ovs-atomic.h" > > +#include "openvswitch/thread.h" > > + > > +/* bpf/xsk.h uses the following macros not defined in OVS, > > + * so re-define them before include. > > + */ > > +#define unlikely OVS_UNLIKELY > > +#define likely OVS_LIKELY > > +#define barrier() __asm__ __volatile__("": : :"memory") > > +#define smp_rmb() barrier() > > +#define smp_wmb() barrier() > > These barriers are also x86 specific. We'll need to fix that in > the future before removing build constraints. > > > +#include <bpf/xsk.h> > > + > > +#define FRAME_HEADROOM XDP_PACKET_HEADROOM > > +#define FRAME_SIZE XSK_UMEM__DEFAULT_FRAME_SIZE > > +#define BATCH_SIZE NETDEV_MAX_BURST > > +#define FRAME_SHIFT XSK_UMEM__DEFAULT_FRAME_SHIFT > > +#define FRAME_SHIFT_MASK ((1<<FRAME_SHIFT)-1) > > + > > +#define NUM_FRAMES 1024 > > +#define PROD_NUM_DESCS 128 > > +#define CONS_NUM_DESCS 128 > > + > > +#ifdef USE_XSK_DEFAULT > > +#define PROD_NUM_DESCS XSK_RING_PROD__DEFAULT_NUM_DESCS > > +#define CONS_NUM_DESCS XSK_RING_CONS__DEFAULT_NUM_DESCS > > +#endif > > + > > +typedef struct { > > + volatile int locked; > > atomic_int locked; > > or atomic_bool.
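For reference, a compiler-independent version of the spinlock along the lines suggested above can be sketched with plain C11 <stdatomic.h>. This is an assumption-laden illustration: it uses the standard C11 calls rather than the OVS wrappers from ovs-atomic.h, and the `*_sketch` names are hypothetical, not the ones in the patch:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical spinlock type using an atomic flag word. */
typedef struct {
    atomic_int locked;
} spinlock_sketch_t;

static inline void
spin_init_sketch(spinlock_sketch_t *sl)
{
    atomic_init(&sl->locked, 0);
}

static inline void
spin_lock_sketch(spinlock_sketch_t *sl)
{
    int exp = 0;

    while (!atomic_compare_exchange_strong_explicit(&sl->locked, &exp, 1,
                                                    memory_order_acquire,
                                                    memory_order_relaxed)) {
        /* Spin on a relaxed load until the lock looks free, then retry
         * the compare-and-swap. */
        while (atomic_load_explicit(&sl->locked, memory_order_relaxed)) {
            continue;
        }
        exp = 0;
    }
}

/* Returns true if the lock was acquired, false if it was already held. */
static inline bool
spin_trylock_sketch(spinlock_sketch_t *sl)
{
    int exp = 0;

    return atomic_compare_exchange_strong_explicit(&sl->locked, &exp, 1,
                                                   memory_order_acquire,
                                                   memory_order_relaxed);
}

static inline void
spin_unlock_sketch(spinlock_sketch_t *sl)
{
    /* Unlocking a lock that is not held would be a caller bug. */
    assert(atomic_load_explicit(&sl->locked, memory_order_relaxed) == 1);
    atomic_store_explicit(&sl->locked, 0, memory_order_release);
}
```

The acquire ordering on a successful lock pairs with the release store in unlock, which is what makes writes done inside the critical section visible to the next lock holder.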
> > > +} ovs_spinlock_t; > > + > > +/* LIFO ptr_array */ > > +struct umem_pool { > > + int index; /* point to top */ > > + unsigned int size; > > + ovs_spinlock_t mutex; > > + void **array; /* a pointer array */ > > +}; > > + > > +/* array-based dp_packet_afxdp */ > > +struct xpacket_pool { > > + unsigned int size; > > + struct dp_packet_afxdp **array; > > +}; > > + > > +struct xsk_umem_info { > > + struct umem_pool mpool; > > + struct xpacket_pool xpool; > > + struct xsk_ring_prod fq; > > + struct xsk_ring_cons cq; > > + struct xsk_umem *umem; > > + void *buffer; > > +}; > > + > > +struct xsk_socket_info { > > + struct xsk_ring_cons rx; > > + struct xsk_ring_prod tx; > > + struct xsk_umem_info *umem; > > + struct xsk_socket *xsk; > > + unsigned long rx_npkts; > > + unsigned long tx_npkts; > > + unsigned long prev_rx_npkts; > > + unsigned long prev_tx_npkts; > > + uint32_t outstanding_tx; > > +}; > > + > > +struct umem_elem_head { > > + unsigned int index; > > + struct ovs_mutex mutex; > > + uint32_t n; > > +}; > > + > > +struct umem_elem { > > + struct umem_elem *next; > > +}; > > + > > +void __umem_elem_push(struct umem_pool *umemp, void *addr); > > +void umem_elem_push(struct umem_pool *umemp, void *addr); > > +void *__umem_elem_pop(struct umem_pool *umemp); > > +void *umem_elem_pop(struct umem_pool *umemp); > > +void **__umem_pool_alloc(unsigned int size); > > +int umem_pool_init(struct umem_pool *umemp, unsigned int size); > > +void umem_pool_cleanup(struct umem_pool *umemp); > > +unsigned int umem_elem_count(struct umem_pool *mpool); > > +void __umem_elem_pop_n(struct umem_pool *umemp, void **addrs, int n); > > +void __umem_elem_push_n(struct umem_pool *umemp, void **addrs, int n); > > +int xpacket_pool_init(struct xpacket_pool *xp, unsigned int size); > > +void xpacket_pool_cleanup(struct xpacket_pool *xp); > > + > > +#endif > > diff --git a/tests/automake.mk b/tests/automake.mk > > index ea16532dd2a0..715cef9a6b3b 100644 > > --- a/tests/automake.mk > > +++ 
b/tests/automake.mk > > @@ -4,12 +4,14 @@ EXTRA_DIST += \ > > $(SYSTEM_TESTSUITE_AT) \ > > $(SYSTEM_KMOD_TESTSUITE_AT) \ > > $(SYSTEM_USERSPACE_TESTSUITE_AT) \ > > + $(SYSTEM_AFXDP_TESTSUITE_AT) \ > > $(SYSTEM_OFFLOADS_TESTSUITE_AT) \ > > $(SYSTEM_DPDK_TESTSUITE_AT) \ > > $(OVSDB_CLUSTER_TESTSUITE_AT) \ > > $(TESTSUITE) \ > > $(SYSTEM_KMOD_TESTSUITE) \ > > $(SYSTEM_USERSPACE_TESTSUITE) \ > > + $(SYSTEM_AFXDP_TESTSUITE) \ > > $(SYSTEM_OFFLOADS_TESTSUITE) \ > > $(SYSTEM_DPDK_TESTSUITE) \ > > $(OVSDB_CLUSTER_TESTSUITE) \ > > @@ -158,6 +160,11 @@ SYSTEM_USERSPACE_TESTSUITE_AT = \ > > tests/system-userspace-macros.at \ > > tests/system-userspace-packet-type-aware.at > > > > +SYSTEM_AFXDP_TESTSUITE_AT = \ > > + tests/system-afxdp-testsuite.at \ > > + tests/system-afxdp-traffic.at \ > > + tests/system-afxdp-macros.at > > + > > SYSTEM_TESTSUITE_AT = \ > > tests/system-common-macros.at \ > > tests/system-ovn.at \ > > @@ -182,6 +189,7 @@ TESTSUITE = $(srcdir)/tests/testsuite > > TESTSUITE_PATCH = $(srcdir)/tests/testsuite.patch > > SYSTEM_KMOD_TESTSUITE = $(srcdir)/tests/system-kmod-testsuite > > SYSTEM_USERSPACE_TESTSUITE = $(srcdir)/tests/system-userspace-testsuite > > +SYSTEM_AFXDP_TESTSUITE = $(srcdir)/tests/system-afxdp-testsuite > > SYSTEM_OFFLOADS_TESTSUITE = $(srcdir)/tests/system-offloads-testsuite > > SYSTEM_DPDK_TESTSUITE = $(srcdir)/tests/system-dpdk-testsuite > > OVSDB_CLUSTER_TESTSUITE = $(srcdir)/tests/ovsdb-cluster-testsuite > > @@ -315,6 +323,11 @@ check-system-userspace: all > > set $(SHELL) '$(SYSTEM_USERSPACE_TESTSUITE)' -C tests > AUTOTEST_PATH='$(AUTOTEST_PATH)'; \ > > "$$@" $(TESTSUITEFLAGS) -j1 || (test X'$(RECHECK)' = Xyes && "$$@" > --recheck) > > > > +check-afxdp: all > > + $(MAKE) install > > + set $(SHELL) '$(SYSTEM_AFXDP_TESTSUITE)' -C tests > AUTOTEST_PATH='$(AUTOTEST_PATH)' $(TESTSUITEFLAGS) -j1; \ > > + "$$@" || (test X'$(RECHECK)' = Xyes && "$$@" --recheck) > > + > > check-offloads: all > > set $(SHELL) '$(SYSTEM_OFFLOADS_TESTSUITE)' -C 
tests > AUTOTEST_PATH='$(AUTOTEST_PATH)'; \ > > "$$@" $(TESTSUITEFLAGS) -j1 || (test X'$(RECHECK)' = Xyes && "$$@" > --recheck) > > @@ -352,6 +365,10 @@ $(SYSTEM_USERSPACE_TESTSUITE): package.m4 > $(SYSTEM_TESTSUITE_AT) $(SYSTEM_USERSP > > $(AM_V_GEN)$(AUTOTEST) -I '$(srcdir)' -o $@.tmp $@.at > > $(AM_V_at)mv $@.tmp $@ > > > > +$(SYSTEM_AFXDP_TESTSUITE): package.m4 $(SYSTEM_TESTSUITE_AT) > $(SYSTEM_AFXDP_TESTSUITE_AT) $(COMMON_MACROS_AT) > > + $(AM_V_GEN)$(AUTOTEST) -I '$(srcdir)' -o $@.tmp $@.at > > + $(AM_V_at)mv $@.tmp $@ > > + > > $(SYSTEM_OFFLOADS_TESTSUITE): package.m4 $(SYSTEM_TESTSUITE_AT) > $(SYSTEM_OFFLOADS_TESTSUITE_AT) $(COMMON_MACROS_AT) > > $(AM_V_GEN)$(AUTOTEST) -I '$(srcdir)' -o $@.tmp $@.at > > $(AM_V_at)mv $@.tmp $@ > > diff --git a/tests/system-afxdp-macros.at b/tests/system-afxdp-macros.at > > new file mode 100644 > > index 000000000000..2c58c2d6554b > > --- /dev/null > > +++ b/tests/system-afxdp-macros.at > > @@ -0,0 +1,153 @@ > > +# _ADD_BR([name]) > > +# > > +# Expands into the proper ovs-vsctl commands to create a bridge with the > > +# appropriate type and properties > > +m4_define([_ADD_BR], [[add-br $1 -- set Bridge $1 datapath_type=netdev > protocols=OpenFlow10,OpenFlow11,OpenFlow12,OpenFlow13,OpenFlow14,OpenFlow15 > fail-mode=secure ]]) > > + > > +# OVS_TRAFFIC_VSWITCHD_START([vsctl-args], [vsctl-output], [=override]) > > +# > > +# Creates a database and starts ovsdb-server, starts ovs-vswitchd > > +# connected to that database, calls ovs-vsctl to create a bridge named > > +# br0 with predictable settings, passing 'vsctl-args' as additional > > +# commands to ovs-vsctl. If 'vsctl-args' causes ovs-vsctl to provide > > +# output (e.g. because it includes "create" commands) then > 'vsctl-output' > > +# specifies the expected output after filtering through uuidfilt. 
> > +m4_define([OVS_TRAFFIC_VSWITCHD_START], > > + [ > > + export OVS_PKGDATADIR=$(`pwd`) > > + _OVS_VSWITCHD_START([--disable-system]) > > + AT_CHECK([ovs-vsctl -- _ADD_BR([br0]) -- $1 m4_if([$2], [], [], [| > uuidfilt])], [0], [$2]) > > +]) > > + > > +# OVS_TRAFFIC_VSWITCHD_STOP([WHITELIST], [extra_cmds]) > > +# > > +# Gracefully stops ovs-vswitchd and ovsdb-server, checking their log > files > > +# for messages with severity WARN or higher and signaling an error if > any > > +# is present. The optional WHITELIST may contain shell-quoted "sed" > > +# commands to delete any warnings that are actually expected, e.g.: > > +# > > +# OVS_TRAFFIC_VSWITCHD_STOP(["/expected error/d"]) > > +# > > +# 'extra_cmds' are shell commands to be executed after > OVS_VSWITCHD_STOP() is > > +# invoked. They can be used to perform additional cleanups such as name > space > > +# removal. > > +m4_define([OVS_TRAFFIC_VSWITCHD_STOP], > > + [OVS_VSWITCHD_STOP([dnl > > +$1";/netdev_linux.*obtaining netdev stats via vport failed/d > > +/dpif_netlink.*Generic Netlink family 'ovs_datapath' does not exist. 
> The Open vSwitch kernel module is probably not loaded./d > > +/dpif_netdev(revalidator.*)|ERR|internal error parsing flow key/d > > +/dpif(revalidator.*)|WARN|netdev@ovs-netdev: failed to put/d > > +"]) > > + AT_CHECK([:; $2]) > > + ]) > > + > > +m4_define([ADD_VETH_AFXDP], > > + [ AT_CHECK([ip link add $1 type veth peer name ovs-$1 || return 77]) > > + CONFIGURE_AFXDP_VETH_OFFLOADS([$1]) > > + AT_CHECK([ip link set $1 netns $2]) > > + AT_CHECK([ip link set dev ovs-$1 up]) > > + AT_CHECK([ovs-vsctl add-port $3 ovs-$1 -- \ > > + set interface ovs-$1 external-ids:iface-id="$1" > type="afxdp"]) > > + NS_CHECK_EXEC([$2], [ip addr add $4 dev $1 $7]) > > + NS_CHECK_EXEC([$2], [ip link set dev $1 up]) > > + if test -n "$5"; then > > + NS_CHECK_EXEC([$2], [ip link set dev $1 address $5]) > > + fi > > + if test -n "$6"; then > > + NS_CHECK_EXEC([$2], [ip route add default via $6]) > > + fi > > + on_exit 'ip link del ovs-$1' > > + ] > > +) > > + > > +# CONFIGURE_AFXDP_VETH_OFFLOADS([VETH]) > > +# > > +# Disable TX offloads and VLAN offloads for veths used in AF_XDP. > > +m4_define([CONFIGURE_AFXDP_VETH_OFFLOADS], > > + [AT_CHECK([ethtool -K $1 tx off], [0], [ignore], [ignore]) > > + AT_CHECK([ethtool -K $1 rxvlan off], [0], [ignore], [ignore]) > > + AT_CHECK([ethtool -K $1 txvlan off], [0], [ignore], [ignore]) > > + ] > > +) > > + > > +# CONFIGURE_VETH_OFFLOADS([VETH]) > > +# > > +# Disable TX offloads for veths. The userspace datapath uses the > AF_PACKET > > +# socket to receive packets for veths. Unfortunately, the AF_PACKET > socket > > +# doesn't play well with offloads: > > +# 1. GSO packets are received without segmentation and therefore > discarded. > > +# 2. Packets with offloaded partial checksum are received with the wrong > > +# checksum, therefore discarded by the receiver. > > +# > > +# By disabling tx offloads in the non-OVS side of the veth peer we make > sure > > +# that the AF_PACKET socket will not receive bad packets. 
> > +# > > +# This is a workaround, and should be removed when offloads are properly > > +# supported in netdev-linux. > > +m4_define([CONFIGURE_VETH_OFFLOADS], > > + [AT_CHECK([ethtool -K $1 tx off], [0], [ignore], [ignore])] > > +) > > + > > +# CHECK_CONNTRACK() > > +# > > +# Perform requirements checks for running conntrack tests. > > +# > > +m4_define([CHECK_CONNTRACK], > > + [AT_SKIP_IF([test $HAVE_PYTHON = no])] > > +) > > + > > +# CHECK_CONNTRACK_ALG() > > +# > > +# Perform requirements checks for running conntrack ALG tests. The > userspace > > +# supports FTP and TFTP. > > +# > > +m4_define([CHECK_CONNTRACK_ALG]) > > + > > +# CHECK_CONNTRACK_FRAG() > > +# > > +# Perform requirements checks for running conntrack fragmentation > tests. > > +# The userspace doesn't support fragmentation yet, so skip the tests. > > +m4_define([CHECK_CONNTRACK_FRAG], > > +[ > > + AT_SKIP_IF([:]) > > +]) > > + > > +# CHECK_CONNTRACK_LOCAL_STACK() > > +# > > +# Perform requirements checks for running conntrack tests with local > stack. > > +# While the kernel connection tracker automatically passes all the > connection > > +# tracking state from an internal port to the Open vSwitch kernel > module, there > > +# is simply no way of doing that with the userspace, so skip the tests. > > +m4_define([CHECK_CONNTRACK_LOCAL_STACK], > > +[ > > + AT_SKIP_IF([:]) > > +]) > > + > > +# CHECK_CONNTRACK_NAT() > > +# > > +# Perform requirements checks for running conntrack NAT tests. The > userspace > > +# datapath supports NAT. > > +# > > +m4_define([CHECK_CONNTRACK_NAT]) > > + > > +# CHECK_CT_DPIF_FLUSH_BY_CT_TUPLE() > > +# > > +# Perform requirements checks for running ovs-dpctl flush-conntrack by > > +# conntrack 5-tuple test. The userspace datapath does not support > > +# this feature yet. 
> > +m4_define([CHECK_CT_DPIF_FLUSH_BY_CT_TUPLE], > > +[ > > + AT_SKIP_IF([:]) > > +]) > > + > > +# CHECK_CT_DPIF_SET_GET_MAXCONNS() > > +# > > +# Perform requirements checks for running ovs-dpctl ct-set-maxconns or > > +# ovs-dpctl ct-get-maxconns. The userspace datapath does support this > feature. > > +m4_define([CHECK_CT_DPIF_SET_GET_MAXCONNS]) > > + > > +# CHECK_CT_DPIF_GET_NCONNS() > > +# > > +# Perform requirements checks for running ovs-dpctl ct-get-nconns. The > > +# userspace datapath does support this feature. > > +m4_define([CHECK_CT_DPIF_GET_NCONNS]) > > diff --git a/tests/system-afxdp-testsuite.at b/tests/ > system-afxdp-testsuite.at > > new file mode 100644 > > index 000000000000..538c0d15d556 > > --- /dev/null > > +++ b/tests/system-afxdp-testsuite.at > > @@ -0,0 +1,26 @@ > > +AT_INIT > > + > > +AT_COPYRIGHT([Copyright (c) 2018 Nicira, Inc. > > + > > +Licensed under the Apache License, Version 2.0 (the "License"); > > +you may not use this file except in compliance with the License. > > +You may obtain a copy of the License at: > > + > > + http://www.apache.org/licenses/LICENSE-2.0 > > + > > +Unless required by applicable law or agreed to in writing, software > > +distributed under the License is distributed on an "AS IS" BASIS, > > +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
> > +See the License for the specific language governing permissions and > > +limitations under the License.]) > > + > > +m4_ifdef([AT_COLOR_TESTS], [AT_COLOR_TESTS]) > > + > > +m4_include([tests/ovs-macros.at]) > > +m4_include([tests/ovsdb-macros.at]) > > +m4_include([tests/ofproto-macros.at]) > > +m4_include([tests/system-afxdp-macros.at]) > > +m4_include([tests/system-common-macros.at]) > > + > > +m4_include([tests/system-afxdp-traffic.at]) > > +m4_include([tests/system-ovn.at]) > > diff --git a/tests/system-afxdp-traffic.at b/tests/ > system-afxdp-traffic.at > > new file mode 100644 > > index 000000000000..26f72acf48ef > > --- /dev/null > > +++ b/tests/system-afxdp-traffic.at > > @@ -0,0 +1,978 @@ > > +AT_BANNER([AF_XDP netdev datapath-sanity]) > > + > > +AT_SETUP([datapath - ping between two ports]) > > +OVS_TRAFFIC_VSWITCHD_START() > > + > > +ulimit -l unlimited > > + > > +ADD_NAMESPACES(at_ns0, at_ns1) > > +AT_CHECK([ovs-ofctl add-flow br0 "actions=normal"]) > > + > > +ADD_VETH_AFXDP(p0, at_ns0, br0, "10.1.1.1/24") > > +ADD_VETH_AFXDP(p1, at_ns1, br0, "10.1.1.2/24") > > + > > +NS_CHECK_EXEC([at_ns0], [ping -q -c 3 -i 0.3 -w 2 10.1.1.2 | > FORMAT_PING], [0], [dnl > > +3 packets transmitted, 3 received, 0% packet loss, time 0ms > > +]) > > +OVS_TRAFFIC_VSWITCHD_STOP > > +AT_CLEANUP > > + > > +AT_SETUP([datapath - ping between two ports on vlan]) > > +OVS_TRAFFIC_VSWITCHD_START() > > + > > +AT_CHECK([ovs-ofctl add-flow br0 "actions=normal"]) > > + > > +ADD_NAMESPACES(at_ns0, at_ns1) > > + > > +ADD_VETH_AFXDP(p0, at_ns0, br0, "10.1.1.1/24") > > +ADD_VETH_AFXDP(p1, at_ns1, br0, "10.1.1.2/24") > > + > > +ADD_VLAN(p0, at_ns0, 100, "10.2.2.1/24") > > +ADD_VLAN(p1, at_ns1, 100, "10.2.2.2/24") > > + > > +NS_CHECK_EXEC([at_ns0], [ping -q -c 3 -i 0.3 -w 2 10.2.2.2 | > FORMAT_PING], [0], [dnl > > +3 packets transmitted, 3 received, 0% packet loss, time 0ms > > +]) > > + > > +OVS_TRAFFIC_VSWITCHD_STOP > > +AT_CLEANUP > > + > > +AT_SETUP([datapath - ping6 between two 
ports]) > > +OVS_TRAFFIC_VSWITCHD_START() > > + > > +AT_CHECK([ovs-ofctl add-flow br0 "actions=normal"]) > > + > > +ADD_NAMESPACES(at_ns0, at_ns1) > > + > > +ADD_VETH_AFXDP(p0, at_ns0, br0, "fc00::1/96") > > +ADD_VETH_AFXDP(p1, at_ns1, br0, "fc00::2/96") > > + > > +dnl Linux seems to take a little time to get its IPv6 stack in order. > Without > > +dnl waiting, we get occasional failures due to the following error: > > +dnl "connect: Cannot assign requested address" > > +OVS_WAIT_UNTIL([ip netns exec at_ns0 ping6 -c 1 fc00::2]) > > + > > +NS_CHECK_EXEC([at_ns0], [ping6 -q -c 3 -i 0.3 -w 6 fc00::2 | > FORMAT_PING], [0], [dnl > > +3 packets transmitted, 3 received, 0% packet loss, time 0ms > > +]) > > + > > +OVS_TRAFFIC_VSWITCHD_STOP > > +AT_CLEANUP > > + > > +AT_SETUP([datapath - ping6 between two ports on vlan]) > > +OVS_TRAFFIC_VSWITCHD_START() > > + > > +AT_CHECK([ovs-ofctl add-flow br0 "actions=normal"]) > > + > > +ADD_NAMESPACES(at_ns0, at_ns1) > > + > > +ADD_VETH_AFXDP(p0, at_ns0, br0, "fc00::1/96") > > +ADD_VETH_AFXDP(p1, at_ns1, br0, "fc00::2/96") > > + > > +ADD_VLAN(p0, at_ns0, 100, "fc00:1::1/96") > > +ADD_VLAN(p1, at_ns1, 100, "fc00:1::2/96") > > + > > +dnl Linux seems to take a little time to get its IPv6 stack in order. 
> Without > > +dnl waiting, we get occasional failures due to the following error: > > +dnl "connect: Cannot assign requested address" > > +OVS_WAIT_UNTIL([ip netns exec at_ns0 ping6 -c 1 fc00:1::2]) > > + > > +NS_CHECK_EXEC([at_ns0], [ping6 -q -c 3 -i 0.3 -w 2 fc00:1::2 | > FORMAT_PING], [0], [dnl > > +3 packets transmitted, 3 received, 0% packet loss, time 0ms > > +]) > > +NS_CHECK_EXEC([at_ns0], [ping6 -s 1600 -q -c 3 -i 0.3 -w 2 fc00:1::2 | > FORMAT_PING], [0], [dnl > > +3 packets transmitted, 3 received, 0% packet loss, time 0ms > > +]) > > +NS_CHECK_EXEC([at_ns0], [ping6 -s 3200 -q -c 3 -i 0.3 -w 2 fc00:1::2 | > FORMAT_PING], [0], [dnl > > +3 packets transmitted, 3 received, 0% packet loss, time 0ms > > +]) > > + > > +OVS_TRAFFIC_VSWITCHD_STOP > > +AT_CLEANUP > > + > > +AT_SETUP([datapath - ping over vxlan tunnel]) > > +OVS_CHECK_VXLAN() > > + > > +OVS_TRAFFIC_VSWITCHD_START() > > +ADD_BR([br-underlay]) > > + > > +AT_CHECK([ovs-ofctl add-flow br0 "actions=normal"]) > > +AT_CHECK([ovs-ofctl add-flow br-underlay "actions=normal"]) > > + > > +ADD_NAMESPACES(at_ns0) > > + > > +dnl Set up underlay link from host into the namespace using veth pair. > > +ADD_VETH_AFXDP(p0, at_ns0, br-underlay, "172.31.1.1/24") > > +AT_CHECK([ip addr add dev br-underlay "172.31.1.100/24"]) > > +AT_CHECK([ip link set dev br-underlay up]) > > + > > + > > +dnl Set up tunnel endpoints on OVS outside the namespace and with a > native > > +dnl linux device inside the namespace. 
> > +ADD_OVS_TUNNEL([vxlan], [br0], [at_vxlan0], [172.31.1.1], [ > 10.1.1.100/24]) > > +ADD_NATIVE_TUNNEL([vxlan], [at_vxlan1], [at_ns0], [172.31.1.100], [ > 10.1.1.1/24], > > + [id 0 dstport 4789]) > > + > > +AT_CHECK([ovs-appctl ovs/route/add 10.1.1.100/24 br0], [0], [OK > > +]) > > +AT_CHECK([ovs-appctl ovs/route/add 172.31.1.92/24 br-underlay], [0], > [OK > > +]) > > + > > +dnl First, check the underlay > > +NS_CHECK_EXEC([at_ns0], [ping -q -c 3 -i 0.3 -w 2 172.31.1.100 | > FORMAT_PING], [0], [dnl > > +3 packets transmitted, 3 received, 0% packet loss, time 0ms > > +]) > > + > > +dnl Okay, now check the overlay with different packet sizes > > +NS_CHECK_EXEC([at_ns0], [ping -q -c 3 -i 0.3 -w 2 10.1.1.100 | > FORMAT_PING], [0], [dnl > > +3 packets transmitted, 3 received, 0% packet loss, time 0ms > > +]) > > +NS_CHECK_EXEC([at_ns0], [ping -s 1600 -q -c 3 -i 0.3 -w 2 10.1.1.100 | > FORMAT_PING], [0], [dnl > > +3 packets transmitted, 3 received, 0% packet loss, time 0ms > > +]) > > +NS_CHECK_EXEC([at_ns0], [ping -s 3200 -q -c 3 -i 0.3 -w 2 10.1.1.100 | > FORMAT_PING], [0], [dnl > > +3 packets transmitted, 3 received, 0% packet loss, time 0ms > > +]) > > + > > +OVS_TRAFFIC_VSWITCHD_STOP > > +AT_CLEANUP > > + > > +AT_SETUP([datapath - ping over vxlan6 tunnel]) > > +OVS_CHECK_VXLAN_UDP6ZEROCSUM() > > + > > +OVS_TRAFFIC_VSWITCHD_START() > > +ADD_BR([br-underlay]) > > + > > +AT_CHECK([ovs-ofctl add-flow br0 "actions=normal"]) > > +AT_CHECK([ovs-ofctl add-flow br-underlay "actions=normal"]) > > + > > +ADD_NAMESPACES(at_ns0) > > + > > +dnl Set up underlay link from host into the namespace using veth pair. > > +ADD_VETH_AFXDP(p0, at_ns0, br-underlay, "fc00::1/64", [], [], "nodad") > > +AT_CHECK([ip addr add dev br-underlay "fc00::100/64" nodad]) > > +AT_CHECK([ip link set dev br-underlay up]) > > + > > +dnl Set up tunnel endpoints on OVS outside the namespace and with a > native > > +dnl linux device inside the namespace. 
> > +ADD_OVS_TUNNEL6([vxlan], [br0], [at_vxlan0], [fc00::1], [10.1.1.100/24 > ]) > > +ADD_NATIVE_TUNNEL6([vxlan], [at_vxlan1], [at_ns0], [fc00::100], [ > 10.1.1.1/24], > > + [id 0 dstport 4789 udp6zerocsumtx udp6zerocsumrx]) > > + > > +AT_CHECK([ovs-appctl ovs/route/add 10.1.1.100/24 br0], [0], [OK > > +]) > > +AT_CHECK([ovs-appctl ovs/route/add fc00::100/64 br-underlay], [0], [OK > > +]) > > + > > +OVS_WAIT_UNTIL([ip netns exec at_ns0 ping6 -c 1 fc00::100]) > > + > > +dnl First, check the underlay > > +NS_CHECK_EXEC([at_ns0], [ping6 -q -c 3 -i 0.3 -w 2 fc00::100 | > FORMAT_PING], [0], [dnl > > +3 packets transmitted, 3 received, 0% packet loss, time 0ms > > +]) > > + > > +dnl Okay, now check the overlay with different packet sizes > > +NS_CHECK_EXEC([at_ns0], [ping -q -c 3 -i 0.3 -w 2 10.1.1.100 | > FORMAT_PING], [0], [dnl > > +3 packets transmitted, 3 received, 0% packet loss, time 0ms > > +]) > > +NS_CHECK_EXEC([at_ns0], [ping -s 1600 -q -c 3 -i 0.3 -w 2 10.1.1.100 | > FORMAT_PING], [0], [dnl > > +3 packets transmitted, 3 received, 0% packet loss, time 0ms > > +]) > > +NS_CHECK_EXEC([at_ns0], [ping -s 3200 -q -c 3 -i 0.3 -w 2 10.1.1.100 | > FORMAT_PING], [0], [dnl > > +3 packets transmitted, 3 received, 0% packet loss, time 0ms > > +]) > > + > > +OVS_TRAFFIC_VSWITCHD_STOP > > +AT_CLEANUP > > + > > +AT_SETUP([datapath - ping over gre tunnel]) > > +OVS_CHECK_GRE() > > + > > +OVS_TRAFFIC_VSWITCHD_START() > > +ADD_BR([br-underlay]) > > + > > +AT_CHECK([ovs-ofctl add-flow br0 "actions=normal"]) > > +AT_CHECK([ovs-ofctl add-flow br-underlay "actions=normal"]) > > + > > +ADD_NAMESPACES(at_ns0) > > + > > +dnl Set up underlay link from host into the namespace using veth pair. 
> > +ADD_VETH_AFXDP(p0, at_ns0, br-underlay, "172.31.1.1/24") > > +AT_CHECK([ip addr add dev br-underlay "172.31.1.100/24"]) > > +AT_CHECK([ip link set dev br-underlay up]) > > + > > +dnl Set up tunnel endpoints on OVS outside the namespace and with a > native > > +dnl linux device inside the namespace. > > +ADD_OVS_TUNNEL([gre], [br0], [at_gre0], [172.31.1.1], [10.1.1.100/24]) > > +ADD_NATIVE_TUNNEL([gretap], [ns_gre0], [at_ns0], [172.31.1.100], [ > 10.1.1.1/24]) > > + > > +AT_CHECK([ovs-appctl ovs/route/add 10.1.1.100/24 br0], [0], [OK > > +]) > > +AT_CHECK([ovs-appctl ovs/route/add 172.31.1.92/24 br-underlay], [0], > [OK > > +]) > > + > > +dnl First, check the underlay > > +NS_CHECK_EXEC([at_ns0], [ping -q -c 3 -i 0.3 -w 2 172.31.1.100 | > FORMAT_PING], [0], [dnl > > +3 packets transmitted, 3 received, 0% packet loss, time 0ms > > +]) > > + > > +dnl Okay, now check the overlay with different packet sizes > > +NS_CHECK_EXEC([at_ns0], [ping -q -c 3 -i 0.3 -w 2 10.1.1.100 | > FORMAT_PING], [0], [dnl > > +3 packets transmitted, 3 received, 0% packet loss, time 0ms > > +]) > > +NS_CHECK_EXEC([at_ns0], [ping -s 1600 -q -c 3 -i 0.3 -w 2 10.1.1.100 | > FORMAT_PING], [0], [dnl > > +3 packets transmitted, 3 received, 0% packet loss, time 0ms > > +]) > > +NS_CHECK_EXEC([at_ns0], [ping -s 3200 -q -c 3 -i 0.3 -w 2 10.1.1.100 | > FORMAT_PING], [0], [dnl > > +3 packets transmitted, 3 received, 0% packet loss, time 0ms > > +]) > > + > > +OVS_TRAFFIC_VSWITCHD_STOP > > +AT_CLEANUP > > + > > +AT_SETUP([datapath - ping over erspan v1 tunnel]) > > +OVS_CHECK_GRE() > > +OVS_CHECK_ERSPAN() > > + > > +OVS_TRAFFIC_VSWITCHD_START() > > +ADD_BR([br-underlay]) > > + > > +AT_CHECK([ovs-ofctl add-flow br0 "actions=normal"]) > > +AT_CHECK([ovs-ofctl add-flow br-underlay "actions=normal"]) > > + > > +ADD_NAMESPACES(at_ns0) > > + > > +dnl Set up underlay link from host into the namespace using veth pair. 
> > +ADD_VETH_AFXDP(p0, at_ns0, br-underlay, "172.31.1.1/24") > > +AT_CHECK([ip addr add dev br-underlay "172.31.1.100/24"]) > > +AT_CHECK([ip link set dev br-underlay up]) > > + > > +dnl Set up tunnel endpoints on OVS outside the namespace and with a > native > > +dnl linux device inside the namespace. > > +ADD_OVS_TUNNEL([erspan], [br0], [at_erspan0], [172.31.1.1], [ > 10.1.1.100/24], [options:key=1 options:erspan_ver=1 options:erspan_idx=7]) > > +ADD_NATIVE_TUNNEL([erspan], [ns_erspan0], [at_ns0], [172.31.1.100], [ > 10.1.1.1/24], [seq key 1 erspan_ver 1 erspan 7]) > > + > > +AT_CHECK([ovs-appctl ovs/route/add 10.1.1.100/24 br0], [0], [OK > > +]) > > +AT_CHECK([ovs-appctl ovs/route/add 172.31.1.92/24 br-underlay], [0], > [OK > > +]) > > + > > +dnl First, check the underlay > > +NS_CHECK_EXEC([at_ns0], [ping -q -c 3 -i 0.3 -w 2 172.31.1.100 | > FORMAT_PING], [0], [dnl > > +3 packets transmitted, 3 received, 0% packet loss, time 0ms > > +]) > > + > > +dnl Okay, now check the overlay with different packet sizes > > +dnl NS_CHECK_EXEC([at_ns0], [ping -q -c 3 -i 0.3 -w 2 10.1.1.100 | > FORMAT_PING], [0], [dnl > > +NS_CHECK_EXEC([at_ns0], [ping -s 1200 -i 0.3 -c 3 10.1.1.100 | > FORMAT_PING], [0], [dnl > > +3 packets transmitted, 3 received, 0% packet loss, time 0ms > > +]) > > +OVS_TRAFFIC_VSWITCHD_STOP > > +AT_CLEANUP > > + > > +AT_SETUP([datapath - ping over erspan v2 tunnel]) > > +OVS_CHECK_GRE() > > +OVS_CHECK_ERSPAN() > > + > > +OVS_TRAFFIC_VSWITCHD_START() > > +ADD_BR([br-underlay]) > > + > > +AT_CHECK([ovs-ofctl add-flow br0 "actions=normal"]) > > +AT_CHECK([ovs-ofctl add-flow br-underlay "actions=normal"]) > > + > > +ADD_NAMESPACES(at_ns0) > > + > > +dnl Set up underlay link from host into the namespace using veth pair. 
> > +ADD_VETH_AFXDP(p0, at_ns0, br-underlay, "172.31.1.1/24") > > +AT_CHECK([ip addr add dev br-underlay "172.31.1.100/24"]) > > +AT_CHECK([ip link set dev br-underlay up]) > > + > > +dnl Set up tunnel endpoints on OVS outside the namespace and with a > native > > +dnl linux device inside the namespace. > > +ADD_OVS_TUNNEL([erspan], [br0], [at_erspan0], [172.31.1.1], [ > 10.1.1.100/24], [options:key=1 options:erspan_ver=2 options:erspan_dir=1 > options:erspan_hwid=0x7]) > > +ADD_NATIVE_TUNNEL([erspan], [ns_erspan0], [at_ns0], [172.31.1.100], [ > 10.1.1.1/24], [seq key 1 erspan_ver 2 erspan_dir egress erspan_hwid 7]) > > + > > +AT_CHECK([ovs-appctl ovs/route/add 10.1.1.100/24 br0], [0], [OK > > +]) > > +AT_CHECK([ovs-appctl ovs/route/add 172.31.1.92/24 br-underlay], [0], > [OK > > +]) > > + > > +dnl First, check the underlay > > +NS_CHECK_EXEC([at_ns0], [ping -q -c 3 -i 0.3 -w 2 172.31.1.100 | > FORMAT_PING], [0], [dnl > > +3 packets transmitted, 3 received, 0% packet loss, time 0ms > > +]) > > + > > +dnl Okay, now check the overlay with different packet sizes > > +dnl NS_CHECK_EXEC([at_ns0], [ping -q -c 3 -i 0.3 -w 2 10.1.1.100 | > FORMAT_PING], [0], [dnl > > +NS_CHECK_EXEC([at_ns0], [ping -s 1200 -i 0.3 -c 3 10.1.1.100 | > FORMAT_PING], [0], [dnl > > +3 packets transmitted, 3 received, 0% packet loss, time 0ms > > +]) > > +OVS_TRAFFIC_VSWITCHD_STOP > > +AT_CLEANUP > > + > > +AT_SETUP([datapath - ping over ip6erspan v1 tunnel]) > > +OVS_CHECK_GRE() > > +OVS_CHECK_ERSPAN() > > + > > +OVS_TRAFFIC_VSWITCHD_START() > > +ADD_BR([br-underlay]) > > + > > +AT_CHECK([ovs-ofctl add-flow br0 "actions=normal"]) > > +AT_CHECK([ovs-ofctl add-flow br-underlay "actions=normal"]) > > + > > +ADD_NAMESPACES(at_ns0) > > + > > +dnl Set up underlay link from host into the namespace using veth pair. 
> > +ADD_VETH_AFXDP(p0, at_ns0, br-underlay, "fc00:100::1/96", [], [], nodad) > > +AT_CHECK([ip addr add dev br-underlay "fc00:100::100/96" nodad]) > > +AT_CHECK([ip link set dev br-underlay up]) > > + > > +dnl Set up tunnel endpoints on OVS outside the namespace and with a > native > > +dnl linux device inside the namespace. > > +ADD_OVS_TUNNEL6([ip6erspan], [br0], [at_erspan0], [fc00:100::1], [ > 10.1.1.100/24], > > + [options:key=123 options:erspan_ver=1 > options:erspan_idx=0x7]) > > +ADD_NATIVE_TUNNEL6([ip6erspan], [ns_erspan0], [at_ns0], [fc00:100::100], > > + [10.1.1.1/24], [local fc00:100::1 seq key 123 > erspan_ver 1 erspan 7]) > > + > > +AT_CHECK([ovs-appctl ovs/route/add 10.1.1.100/24 br0], [0], [OK > > +]) > > +AT_CHECK([ovs-appctl ovs/route/add fc00:100::1/96 br-underlay], [0], [OK > > +]) > > + > > +OVS_WAIT_UNTIL([ip netns exec at_ns0 ping6 -c 2 fc00:100::100]) > > + > > +dnl First, check the underlay > > +NS_CHECK_EXEC([at_ns0], [ping6 -q -c 3 -i 0.3 -w 2 fc00:100::100 | > FORMAT_PING], [0], [dnl > > +3 packets transmitted, 3 received, 0% packet loss, time 0ms > > +]) > > + > > +dnl Okay, now check the overlay with different packet sizes > > +NS_CHECK_EXEC([at_ns0], [ping -q -c 3 -i 0.3 -w 2 10.1.1.100 | > FORMAT_PING], [0], [dnl > > +3 packets transmitted, 3 received, 0% packet loss, time 0ms > > +]) > > +OVS_TRAFFIC_VSWITCHD_STOP > > +AT_CLEANUP > > + > > +AT_SETUP([datapath - ping over ip6erspan v2 tunnel]) > > +OVS_CHECK_GRE() > > +OVS_CHECK_ERSPAN() > > + > > +OVS_TRAFFIC_VSWITCHD_START() > > +ADD_BR([br-underlay]) > > + > > +AT_CHECK([ovs-ofctl add-flow br0 "actions=normal"]) > > +AT_CHECK([ovs-ofctl add-flow br-underlay "actions=normal"]) > > + > > +ADD_NAMESPACES(at_ns0) > > + > > +dnl Set up underlay link from host into the namespace using veth pair. 
> > +ADD_VETH_AFXDP(p0, at_ns0, br-underlay, "fc00:100::1/96", [], [], nodad) > > +AT_CHECK([ip addr add dev br-underlay "fc00:100::100/96" nodad]) > > +AT_CHECK([ip link set dev br-underlay up]) > > + > > +dnl Set up tunnel endpoints on OVS outside the namespace and with a > native > > +dnl linux device inside the namespace. > > +ADD_OVS_TUNNEL6([ip6erspan], [br0], [at_erspan0], [fc00:100::1], [ > 10.1.1.100/24], > > + [options:key=121 options:erspan_ver=2 > options:erspan_dir=0 options:erspan_hwid=0x7]) > > +ADD_NATIVE_TUNNEL6([ip6erspan], [ns_erspan0], [at_ns0], [fc00:100::100], > > + [10.1.1.1/24], > > + [local fc00:100::1 seq key 121 erspan_ver 2 > erspan_dir ingress erspan_hwid 0x7]) > > + > > +AT_CHECK([ovs-appctl ovs/route/add 10.1.1.100/24 br0], [0], [OK > > +]) > > +AT_CHECK([ovs-appctl ovs/route/add fc00:100::1/96 br-underlay], [0], [OK > > +]) > > + > > +OVS_WAIT_UNTIL([ip netns exec at_ns0 ping6 -c 2 fc00:100::100]) > > + > > +dnl First, check the underlay > > +NS_CHECK_EXEC([at_ns0], [ping6 -q -c 3 -i 0.3 -w 2 fc00:100::100 | > FORMAT_PING], [0], [dnl > > +3 packets transmitted, 3 received, 0% packet loss, time 0ms > > +]) > > + > > +dnl Okay, now check the overlay with different packet sizes > > +NS_CHECK_EXEC([at_ns0], [ping -q -c 3 -i 0.3 -w 2 10.1.1.100 | > FORMAT_PING], [0], [dnl > > +3 packets transmitted, 3 received, 0% packet loss, time 0ms > > +]) > > +OVS_TRAFFIC_VSWITCHD_STOP > > +AT_CLEANUP > > + > > +AT_SETUP([datapath - ping over geneve tunnel]) > > +OVS_CHECK_GENEVE() > > + > > +OVS_TRAFFIC_VSWITCHD_START() > > +ADD_BR([br-underlay]) > > + > > +AT_CHECK([ovs-ofctl add-flow br0 "actions=normal"]) > > +AT_CHECK([ovs-ofctl add-flow br-underlay "actions=normal"]) > > + > > +ADD_NAMESPACES(at_ns0) > > + > > +dnl Set up underlay link from host into the namespace using veth pair. 
> > +ADD_VETH_AFXDP(p0, at_ns0, br-underlay, "172.31.1.1/24") > > +AT_CHECK([ip addr add dev br-underlay "172.31.1.100/24"]) > > +AT_CHECK([ip link set dev br-underlay up]) > > + > > +dnl Set up tunnel endpoints on OVS outside the namespace and with a > native > > +dnl linux device inside the namespace. > > +ADD_OVS_TUNNEL([geneve], [br0], [at_gnv0], [172.31.1.1], [10.1.1.100/24 > ]) > > +ADD_NATIVE_TUNNEL([geneve], [ns_gnv0], [at_ns0], [172.31.1.100], [ > 10.1.1.1/24], > > + [vni 0]) > > + > > +AT_CHECK([ovs-appctl ovs/route/add 10.1.1.100/24 br0], [0], [OK > > +]) > > +AT_CHECK([ovs-appctl ovs/route/add 172.31.1.100/24 br-underlay], [0], > [OK > > +]) > > + > > +dnl First, check the underlay > > +NS_CHECK_EXEC([at_ns0], [ping -q -c 3 -i 0.3 -w 2 172.31.1.100 | > FORMAT_PING], [0], [dnl > > +3 packets transmitted, 3 received, 0% packet loss, time 0ms > > +]) > > + > > +dnl Okay, now check the overlay with different packet sizes > > +NS_CHECK_EXEC([at_ns0], [ping -q -c 3 -i 0.3 -w 2 10.1.1.100 | > FORMAT_PING], [0], [dnl > > +3 packets transmitted, 3 received, 0% packet loss, time 0ms > > +]) > > +NS_CHECK_EXEC([at_ns0], [ping -s 1600 -q -c 3 -i 0.3 -w 2 10.1.1.100 | > FORMAT_PING], [0], [dnl > > +3 packets transmitted, 3 received, 0% packet loss, time 0ms > > +]) > > +NS_CHECK_EXEC([at_ns0], [ping -s 3200 -q -c 3 -i 0.3 -w 2 10.1.1.100 | > FORMAT_PING], [0], [dnl > > +3 packets transmitted, 3 received, 0% packet loss, time 0ms > > +]) > > + > > +OVS_TRAFFIC_VSWITCHD_STOP > > +AT_CLEANUP > > + > > +AT_SETUP([datapath - ping over geneve6 tunnel]) > > +OVS_CHECK_GENEVE_UDP6ZEROCSUM() > > + > > +OVS_TRAFFIC_VSWITCHD_START() > > +ADD_BR([br-underlay]) > > + > > +AT_CHECK([ovs-ofctl add-flow br0 "actions=normal"]) > > +AT_CHECK([ovs-ofctl add-flow br-underlay "actions=normal"]) > > + > > +ADD_NAMESPACES(at_ns0) > > + > > +dnl Set up underlay link from host into the namespace using veth pair. 
> > +ADD_VETH_AFXDP(p0, at_ns0, br-underlay, "fc00::1/64", [], [], "nodad") > > +AT_CHECK([ip addr add dev br-underlay "fc00::100/64" nodad]) > > +AT_CHECK([ip link set dev br-underlay up]) > > + > > +dnl Set up tunnel endpoints on OVS outside the namespace and with a > native > > +dnl linux device inside the namespace. > > +ADD_OVS_TUNNEL6([geneve], [br0], [at_gnv0], [fc00::1], [10.1.1.100/24]) > > +ADD_NATIVE_TUNNEL6([geneve], [ns_gnv0], [at_ns0], [fc00::100], [ > 10.1.1.1/24], > > + [vni 0 udp6zerocsumtx udp6zerocsumrx]) > > + > > +AT_CHECK([ovs-appctl ovs/route/add 10.1.1.100/24 br0], [0], [OK > > +]) > > +AT_CHECK([ovs-appctl ovs/route/add fc00::100/64 br-underlay], [0], [OK > > +]) > > + > > +OVS_WAIT_UNTIL([ip netns exec at_ns0 ping6 -c 1 fc00::100]) > > + > > +dnl First, check the underlay > > +NS_CHECK_EXEC([at_ns0], [ping6 -q -c 3 -i 0.3 -w 2 fc00::100 | > FORMAT_PING], [0], [dnl > > +3 packets transmitted, 3 received, 0% packet loss, time 0ms > > +]) > > + > > +dnl Okay, now check the overlay with different packet sizes > > +NS_CHECK_EXEC([at_ns0], [ping -q -c 3 -i 0.3 -w 2 10.1.1.100 | > FORMAT_PING], [0], [dnl > > +3 packets transmitted, 3 received, 0% packet loss, time 0ms > > +]) > > +NS_CHECK_EXEC([at_ns0], [ping -s 1600 -q -c 3 -i 0.3 -w 2 10.1.1.100 | > FORMAT_PING], [0], [dnl > > +3 packets transmitted, 3 received, 0% packet loss, time 0ms > > +]) > > +NS_CHECK_EXEC([at_ns0], [ping -s 3200 -q -c 3 -i 0.3 -w 2 10.1.1.100 | > FORMAT_PING], [0], [dnl > > +3 packets transmitted, 3 received, 0% packet loss, time 0ms > > +]) > > + > > +OVS_TRAFFIC_VSWITCHD_STOP > > +AT_CLEANUP > > + > > +AT_SETUP([datapath - clone action]) > > +OVS_TRAFFIC_VSWITCHD_START() > > + > > +ADD_NAMESPACES(at_ns0, at_ns1, at_ns2) > > + > > +ADD_VETH_AFXDP(p0, at_ns0, br0, "10.1.1.1/24") > > +ADD_VETH_AFXDP(p1, at_ns1, br0, "10.1.1.2/24") > > + > > +AT_CHECK([ovs-vsctl -- set interface ovs-p0 ofport_request=1 \ > > + -- set interface ovs-p1 ofport_request=2]) > > + > > 
+AT_DATA([flows.txt], [dnl > > +priority=1 actions=NORMAL > > +priority=10 > in_port=1,ip,actions=clone(mod_dl_dst(50:54:00:00:00:0a),set_field:192.168.3.3->ip_dst), > output:2 > > +priority=10 > in_port=2,ip,actions=clone(mod_dl_src(ae:c6:7e:54:8d:4d),mod_dl_dst(50:54:00:00:00:0b),set_field:192.168.4.4->ip_dst, > controller), output:1 > > +]) > > +AT_CHECK([ovs-ofctl add-flows br0 flows.txt]) > > + > > +AT_CHECK([ovs-ofctl monitor br0 65534 invalid_ttl --detach --no-chdir > --pidfile 2> ofctl_monitor.log]) > > +NS_CHECK_EXEC([at_ns0], [ping -q -c 3 -i 0.3 -w 2 10.1.1.2 | > FORMAT_PING], [0], [dnl > > +3 packets transmitted, 3 received, 0% packet loss, time 0ms > > +]) > > + > > +AT_CHECK([cat ofctl_monitor.log | STRIP_MONITOR_CSUM], [0], [dnl > > > +icmp,vlan_tci=0x0000,dl_src=ae:c6:7e:54:8d:4d,dl_dst=50:54:00:00:00:0b,nw_src=10.1.1.2,nw_dst=192.168.4.4,nw_tos=0,nw_ecn=0,nw_ttl=64,icmp_type=0,icmp_code=0 > icmp_csum: <skip> > > > +icmp,vlan_tci=0x0000,dl_src=ae:c6:7e:54:8d:4d,dl_dst=50:54:00:00:00:0b,nw_src=10.1.1.2,nw_dst=192.168.4.4,nw_tos=0,nw_ecn=0,nw_ttl=64,icmp_type=0,icmp_code=0 > icmp_csum: <skip> > > > +icmp,vlan_tci=0x0000,dl_src=ae:c6:7e:54:8d:4d,dl_dst=50:54:00:00:00:0b,nw_src=10.1.1.2,nw_dst=192.168.4.4,nw_tos=0,nw_ecn=0,nw_ttl=64,icmp_type=0,icmp_code=0 > icmp_csum: <skip> > > +]) > > + > > +OVS_TRAFFIC_VSWITCHD_STOP > > +AT_CLEANUP > > + > > +AT_SETUP([datapath - basic truncate action]) > > +AT_SKIP_IF([test $HAVE_NC = no]) > > +OVS_TRAFFIC_VSWITCHD_START() > > +AT_CHECK([ovs-ofctl del-flows br0]) > > + > > +dnl Create p0 and ovs-p0(1) > > +ADD_NAMESPACES(at_ns0) > > +ADD_VETH_AFXDP(p0, at_ns0, br0, "10.1.1.1/24") > > +NS_CHECK_EXEC([at_ns0], [ip link set dev p0 address e6:66:c1:11:11:11]) > > +NS_CHECK_EXEC([at_ns0], [arp -s 10.1.1.2 e6:66:c1:22:22:22]) > > + > > +dnl Create p1(3) and ovs-p1(2), packets received from ovs-p1 will > appear in p1 > > +AT_CHECK([ip link add p1 type veth peer name ovs-p1]) > > +on_exit 'ip link del ovs-p1' > > 
> > +AT_CHECK([ip link set dev ovs-p1 up])
> > +AT_CHECK([ip link set dev p1 up])
> > +AT_CHECK([ovs-vsctl add-port br0 ovs-p1 -- set interface ovs-p1 ofport_request=2])
> > +dnl Use p1 to check the truncated packet
> > +AT_CHECK([ovs-vsctl add-port br0 p1 -- set interface p1 ofport_request=3])
> > +
> > +dnl Create p2(5) and ovs-p2(4)
> > +AT_CHECK([ip link add p2 type veth peer name ovs-p2])
> > +on_exit 'ip link del ovs-p2'
> > +AT_CHECK([ip link set dev ovs-p2 up])
> > +AT_CHECK([ip link set dev p2 up])
> > +AT_CHECK([ovs-vsctl add-port br0 ovs-p2 -- set interface ovs-p2 ofport_request=4])
> > +dnl Use p2 to check the truncated packet
> > +AT_CHECK([ovs-vsctl add-port br0 p2 -- set interface p2 ofport_request=5])
> > +
> > +dnl basic test
> > +AT_CHECK([ovs-ofctl del-flows br0])
> > +AT_DATA([flows.txt], [dnl
> > +in_port=3 dl_dst=e6:66:c1:22:22:22 actions=drop
> > +in_port=5 dl_dst=e6:66:c1:22:22:22 actions=drop
> > +in_port=1 dl_dst=e6:66:c1:22:22:22 actions=output(port=2,max_len=100),output:4
> > +])
> > +AT_CHECK([ovs-ofctl add-flows br0 flows.txt])
> > +
> > +dnl use this file as payload file for ncat
> > +AT_CHECK([dd if=/dev/urandom of=payload200.bin bs=200 count=1 2> /dev/null])
> > +on_exit 'rm -f payload200.bin'
> > +NS_CHECK_EXEC([at_ns0], [nc $NC_EOF_OPT -u 10.1.1.2 1234 < payload200.bin])
> > +
> > +dnl packet with truncated size
> > +AT_CHECK([ovs-appctl revalidator/purge], [0])
> > +AT_CHECK([ovs-ofctl dump-flows br0 table=0 | grep "in_port=3" | sed -n 's/.*\(n\_bytes=[[0-9]]*\).*/\1/p'], [0], [dnl
> > +n_bytes=100
> > +])
> > +dnl packet with original size
> > +AT_CHECK([ovs-appctl revalidator/purge], [0])
> > +AT_CHECK([ovs-ofctl dump-flows br0 table=0 | grep "in_port=5" | sed -n 's/.*\(n\_bytes=[[0-9]]*\).*/\1/p'], [0], [dnl
> > +n_bytes=242
> > +])
> > +
> > +dnl more complicated output actions
> > +AT_CHECK([ovs-ofctl del-flows br0])
> > +AT_DATA([flows.txt], [dnl
> > +in_port=3 dl_dst=e6:66:c1:22:22:22 actions=drop
> > +in_port=5 dl_dst=e6:66:c1:22:22:22 actions=drop
> > +in_port=1 dl_dst=e6:66:c1:22:22:22 actions=output(port=2,max_len=100),output:4,output(port=2,max_len=100),output(port=4,max_len=100),output:2,output(port=4,max_len=200),output(port=2,max_len=65535)
> > +])
> > +AT_CHECK([ovs-ofctl add-flows br0 flows.txt])
> > +
> > +NS_CHECK_EXEC([at_ns0], [nc $NC_EOF_OPT -u 10.1.1.2 1234 < payload200.bin])
> > +
> > +dnl 100 + 100 + 242 + min(65535,242) = 684
> > +AT_CHECK([ovs-appctl revalidator/purge], [0])
> > +AT_CHECK([ovs-ofctl dump-flows br0 table=0 | grep "in_port=3" | sed -n 's/.*\(n\_bytes=[[0-9]]*\).*/\1/p'], [0], [dnl
> > +n_bytes=684
> > +])
> > +dnl 242 + 100 + min(242,200) = 542
> > +AT_CHECK([ovs-ofctl dump-flows br0 table=0 | grep "in_port=5" | sed -n 's/.*\(n\_bytes=[[0-9]]*\).*/\1/p'], [0], [dnl
> > +n_bytes=542
> > +])
> > +
> > +dnl SLOW_ACTION: disable kernel datapath truncate support
> > +dnl Repeat the test above, but exercise the SLOW_ACTION code path
> > +AT_CHECK([ovs-appctl dpif/set-dp-features br0 trunc false], [0])
> > +
> > +dnl SLOW_ACTION test1: check datapatch actions
> > +AT_CHECK([ovs-ofctl del-flows br0])
> > +AT_CHECK([ovs-ofctl add-flows br0 flows.txt])
> > +
> > +AT_CHECK([ovs-appctl ofproto/trace br0 "in_port=1,dl_type=0x800,dl_src=e6:66:c1:11:11:11,dl_dst=e6:66:c1:22:22:22,nw_src=192.168.0.1,nw_dst=192.168.0.2,nw_proto=6,tp_src=8,tp_dst=9"], [0], [stdout])
> > +AT_CHECK([tail -3 stdout], [0],
> > +[Datapath actions: trunc(100),3,5,trunc(100),3,trunc(100),5,3,trunc(200),5,trunc(65535),3
> > +This flow is handled by the userspace slow path because it:
> > +  - Uses action(s) not supported by datapath.
> > +])
> > +
> > +dnl SLOW_ACTION test2: check actual packet truncate
> > +AT_CHECK([ovs-ofctl del-flows br0])
> > +AT_CHECK([ovs-ofctl add-flows br0 flows.txt])
> > +NS_CHECK_EXEC([at_ns0], [nc $NC_EOF_OPT -u 10.1.1.2 1234 < payload200.bin])
> > +
> > +dnl 100 + 100 + 242 + min(65535,242) = 684
> > +AT_CHECK([ovs-appctl revalidator/purge], [0])
> > +AT_CHECK([ovs-ofctl dump-flows br0 table=0 | grep "in_port=3" | sed -n 's/.*\(n\_bytes=[[0-9]]*\).*/\1/p'], [0], [dnl
> > +n_bytes=684
> > +])
> > +
> > +dnl 242 + 100 + min(242,200) = 542
> > +AT_CHECK([ovs-ofctl dump-flows br0 table=0 | grep "in_port=5" | sed -n 's/.*\(n\_bytes=[[0-9]]*\).*/\1/p'], [0], [dnl
> > +n_bytes=542
> > +])
> > +
> > +OVS_TRAFFIC_VSWITCHD_STOP
> > +AT_CLEANUP
> > +
> > +
> > +AT_BANNER([conntrack])
> > +
> > +AT_SETUP([conntrack - controller])
> > +CHECK_CONNTRACK()
> > +OVS_TRAFFIC_VSWITCHD_START()
> > +AT_CHECK([ovs-appctl vlog/set dpif:dbg dpif_netdev:dbg ofproto_dpif_upcall:dbg])
> > +
> > +ADD_NAMESPACES(at_ns0, at_ns1)
> > +
> > +ADD_VETH_AFXDP(p0, at_ns0, br0, "10.1.1.1/24")
> > +ADD_VETH_AFXDP(p1, at_ns1, br0, "10.1.1.2/24")
> > +
> > +dnl Allow any traffic from ns0->ns1. Only allow nd, return traffic from ns1->ns0.
> > +AT_DATA([flows.txt], [dnl
> > +priority=1,action=drop
> > +priority=10,arp,action=normal
> > +priority=100,in_port=1,udp,action=ct(commit),controller
> > +priority=100,in_port=2,ct_state=-trk,udp,action=ct(table=0)
> > +priority=100,in_port=2,ct_state=+trk+est,udp,action=controller
> > +])
> > +
> > +AT_CHECK([ovs-ofctl --bundle add-flows br0 flows.txt])
> > +
> > +AT_CAPTURE_FILE([ofctl_monitor.log])
> > +AT_CHECK([ovs-ofctl monitor br0 65534 invalid_ttl --detach --no-chdir --pidfile 2> ofctl_monitor.log])
> > +
> > +dnl Send an unsolicited reply from port 2. This should be dropped.
> > +AT_CHECK([ovs-ofctl -O OpenFlow13 packet-out br0 2 ct\(table=0\) '50540000000a50540000000908004500001c000000000011a4cd0a0101020a0101010002000100080000'])
> > +
> > +dnl OK, now start a new connection from port 1.
> > +AT_CHECK([ovs-ofctl -O OpenFlow13 packet-out br0 1 ct\(commit\),controller '50540000000a50540000000908004500001c000000000011a4cd0a0101010a0101020001000200080000'])
> > +
> > +dnl Now try a reply from port 2.
> > +AT_CHECK([ovs-ofctl -O OpenFlow13 packet-out br0 2 ct\(table=0\) '50540000000a50540000000908004500001c000000000011a4cd0a0101020a0101010002000100080000'])
> > +
> > +dnl Check this output. We only see the latter two packets, not the first.
> > +AT_CHECK([cat ofctl_monitor.log], [0], [dnl
> > +NXT_PACKET_IN2 (xid=0x0): total_len=42 in_port=1 (via action) data_len=42 (unbuffered)
> > +udp,vlan_tci=0x0000,dl_src=50:54:00:00:00:09,dl_dst=50:54:00:00:00:0a,nw_src=10.1.1.1,nw_dst=10.1.1.2,nw_tos=0,nw_ecn=0,nw_ttl=0,tp_src=1,tp_dst=2 udp_csum:0
> > +NXT_PACKET_IN2 (xid=0x0): cookie=0x0 total_len=42 ct_state=est|rpl|trk,ct_nw_src=10.1.1.1,ct_nw_dst=10.1.1.2,ct_nw_proto=17,ct_tp_src=1,ct_tp_dst=2,ip,in_port=2 (via action) data_len=42 (unbuffered)
> > +udp,vlan_tci=0x0000,dl_src=50:54:00:00:00:09,dl_dst=50:54:00:00:00:0a,nw_src=10.1.1.2,nw_dst=10.1.1.1,nw_tos=0,nw_ecn=0,nw_ttl=0,tp_src=2,tp_dst=1 udp_csum:0
> > +])
> > +
> > +OVS_TRAFFIC_VSWITCHD_STOP
> > +AT_CLEANUP
> > +
> > +AT_SETUP([conntrack - force commit])
> > +CHECK_CONNTRACK()
> > +OVS_TRAFFIC_VSWITCHD_START()
> > +AT_CHECK([ovs-appctl vlog/set dpif:dbg dpif_netdev:dbg ofproto_dpif_upcall:dbg])
> > +
> > +ADD_NAMESPACES(at_ns0, at_ns1)
> > +
> > +ADD_VETH_AFXDP(p0, at_ns0, br0, "10.1.1.1/24")
> > +ADD_VETH_AFXDP(p1, at_ns1, br0, "10.1.1.2/24")
> > +
> > +AT_DATA([flows.txt], [dnl
> > +priority=1,action=drop
> > +priority=10,arp,action=normal
> > +priority=100,in_port=1,udp,action=ct(force,commit),controller
> > +priority=100,in_port=2,ct_state=-trk,udp,action=ct(table=0)
> > +priority=100,in_port=2,ct_state=+trk+est,udp,action=ct(force,commit,table=1)
> > +table=1,in_port=2,ct_state=+trk,udp,action=controller
> > +])
> > +
> > +AT_CHECK([ovs-ofctl --bundle add-flows br0 flows.txt])
> > +
> > +AT_CAPTURE_FILE([ofctl_monitor.log])
> > +AT_CHECK([ovs-ofctl monitor br0 65534 invalid_ttl --detach --no-chdir --pidfile 2> ofctl_monitor.log])
> > +
> > +dnl Send an unsolicited reply from port 2. This should be dropped.
> > +AT_CHECK([ovs-ofctl -O OpenFlow13 packet-out br0 "in_port=2 packet=50540000000a50540000000908004500001c000000000011a4cd0a0101020a0101010002000100080000 actions=resubmit(,0)"])
> > +
> > +dnl OK, now start a new connection from port 1.
> > +AT_CHECK([ovs-ofctl -O OpenFlow13 packet-out br0 "in_port=1 packet=50540000000a50540000000908004500001c000000000011a4cd0a0101010a0101020001000200080000 actions=resubmit(,0)"])
> > +
> > +dnl Now try a reply from port 2.
> > +AT_CHECK([ovs-ofctl -O OpenFlow13 packet-out br0 "in_port=2 packet=50540000000a50540000000908004500001c000000000011a4cd0a0101020a0101010002000100080000 actions=resubmit(,0)"])
> > +
> > +AT_CHECK([ovs-appctl revalidator/purge], [0])
> > +
> > +dnl Check this output. We only see the latter two packets, not the first.
> > +AT_CHECK([cat ofctl_monitor.log], [0], [dnl
> > +NXT_PACKET_IN2 (xid=0x0): cookie=0x0 total_len=42 in_port=1 (via action) data_len=42 (unbuffered)
> > +udp,vlan_tci=0x0000,dl_src=50:54:00:00:00:09,dl_dst=50:54:00:00:00:0a,nw_src=10.1.1.1,nw_dst=10.1.1.2,nw_tos=0,nw_ecn=0,nw_ttl=0,tp_src=1,tp_dst=2 udp_csum:0
> > +NXT_PACKET_IN2 (xid=0x0): table_id=1 cookie=0x0 total_len=42 ct_state=new|trk,ct_nw_src=10.1.1.2,ct_nw_dst=10.1.1.1,ct_nw_proto=17,ct_tp_src=2,ct_tp_dst=1,ip,in_port=2 (via action) data_len=42 (unbuffered)
> > +udp,vlan_tci=0x0000,dl_src=50:54:00:00:00:09,dl_dst=50:54:00:00:00:0a,nw_src=10.1.1.2,nw_dst=10.1.1.1,nw_tos=0,nw_ecn=0,nw_ttl=0,tp_src=2,tp_dst=1 udp_csum:0
> > +])
> > +
> > +dnl
> > +dnl Check that the directionality has been changed by force commit.
> > +dnl
> > +AT_CHECK([ovs-appctl dpctl/dump-conntrack | grep "orig=.src=10\.1\.1\.2,"], [], [dnl
> > +udp,orig=(src=10.1.1.2,dst=10.1.1.1,sport=2,dport=1),reply=(src=10.1.1.1,dst=10.1.1.2,sport=1,dport=2)
> > +])
> > +
> > +dnl OK, now send another packet from port 1 and see that it switches again
> > +AT_CHECK([ovs-ofctl -O OpenFlow13 packet-out br0 "in_port=1 packet=50540000000a50540000000908004500001c000000000011a4cd0a0101010a0101020001000200080000 actions=resubmit(,0)"])
> > +AT_CHECK([ovs-appctl revalidator/purge], [0])
> > +
> > +AT_CHECK([ovs-appctl dpctl/dump-conntrack | grep "orig=.src=10\.1\.1\.1,"], [], [dnl
> > +udp,orig=(src=10.1.1.1,dst=10.1.1.2,sport=1,dport=2),reply=(src=10.1.1.2,dst=10.1.1.1,sport=2,dport=1)
> > +])
> > +
> > +OVS_TRAFFIC_VSWITCHD_STOP
> > +AT_CLEANUP
> > +
> > +AT_SETUP([conntrack - ct flush by 5-tuple])
> > +CHECK_CONNTRACK()
> > +OVS_TRAFFIC_VSWITCHD_START()
> > +
> > +ADD_NAMESPACES(at_ns0, at_ns1)
> > +
> > +ADD_VETH_AFXDP(p0, at_ns0, br0, "10.1.1.1/24")
> > +ADD_VETH_AFXDP(p1, at_ns1, br0, "10.1.1.2/24")
> > +
> > +AT_DATA([flows.txt], [dnl
> > +priority=1,action=drop
> > +priority=10,arp,action=normal
> > +priority=100,in_port=1,udp,action=ct(commit),2
> > +priority=100,in_port=2,udp,action=ct(zone=5,commit),1
> > +priority=100,in_port=1,icmp,action=ct(commit),2
> > +priority=100,in_port=2,icmp,action=ct(zone=5,commit),1
> > +])
> > +
> > +AT_CHECK([ovs-ofctl --bundle add-flows br0 flows.txt])
> > +
> > +dnl Test UDP from port 1
> > +AT_CHECK([ovs-ofctl -O OpenFlow13 packet-out br0 "in_port=1 packet=50540000000a50540000000908004500001c000000000011a4cd0a0101010a0101020001000200080000 actions=resubmit(,0)"])
> > +
> > +AT_CHECK([ovs-appctl dpctl/dump-conntrack | grep "orig=.src=10\.1\.1\.1,"], [], [dnl
> > +udp,orig=(src=10.1.1.1,dst=10.1.1.2,sport=1,dport=2),reply=(src=10.1.1.2,dst=10.1.1.1,sport=2,dport=1)
> > +])
> > +
> > +AT_CHECK([ovs-appctl dpctl/flush-conntrack 'ct_nw_src=10.1.1.2,ct_nw_dst=10.1.1.1,ct_nw_proto=17,ct_tp_src=2,ct_tp_dst=1'])
> > +
> > +AT_CHECK([ovs-appctl dpctl/dump-conntrack | grep "orig=.src=10\.1\.1\.1,"], [1], [dnl
> > +])
> > +
> > +dnl Test UDP from port 2
> > +AT_CHECK([ovs-ofctl -O OpenFlow13 packet-out br0 "in_port=2 packet=50540000000a50540000000908004500001c000000000011a4cd0a0101020a0101010002000100080000 actions=resubmit(,0)"])
> > +
> > +AT_CHECK([ovs-appctl dpctl/dump-conntrack | grep "orig=.src=10\.1\.1\.2,"], [0], [dnl
> > +udp,orig=(src=10.1.1.2,dst=10.1.1.1,sport=2,dport=1),reply=(src=10.1.1.1,dst=10.1.1.2,sport=1,dport=2),zone=5
> > +])
> > +
> > +AT_CHECK([ovs-appctl dpctl/flush-conntrack zone=5 'ct_nw_src=10.1.1.1,ct_nw_dst=10.1.1.2,ct_nw_proto=17,ct_tp_src=1,ct_tp_dst=2'])
> > +
> > +AT_CHECK([ovs-appctl dpctl/dump-conntrack | FORMAT_CT(10.1.1.2)], [0], [dnl
> > +])
> > +
> > +dnl Test ICMP traffic
> > +NS_CHECK_EXEC([at_ns1], [ping -q -c 3 -i 0.3 -w 2 10.1.1.1 | FORMAT_PING], [0], [dnl
> > +3 packets transmitted, 3 received, 0% packet loss, time 0ms
> > +])
> > +
> > +AT_CHECK([ovs-appctl dpctl/dump-conntrack | grep "orig=.src=10\.1\.1\.2,"], [0], [stdout])
> > +AT_CHECK([cat stdout | FORMAT_CT(10.1.1.1)], [0],[dnl
> > +icmp,orig=(src=10.1.1.2,dst=10.1.1.1,id=<cleared>,type=8,code=0),reply=(src=10.1.1.1,dst=10.1.1.2,id=<cleared>,type=0,code=0),zone=5
> > +])
> > +
> > +ICMP_ID=`cat stdout | cut -d ',' -f4 | cut -d '=' -f2`
> > +ICMP_TUPLE=ct_nw_src=10.1.1.2,ct_nw_dst=10.1.1.1,ct_nw_proto=1,icmp_id=$ICMP_ID,icmp_type=8,icmp_code=0
> > +AT_CHECK([ovs-appctl dpctl/flush-conntrack zone=5 $ICMP_TUPLE])
> > +
> > +AT_CHECK([ovs-appctl dpctl/dump-conntrack | grep "orig=.src=10\.1\.1\.2,"], [1], [dnl
> > +])
> > +
> > +OVS_TRAFFIC_VSWITCHD_STOP
> > +AT_CLEANUP
> > +
> > +AT_SETUP([conntrack - IPv4 ping])
> > +CHECK_CONNTRACK()
> > +OVS_TRAFFIC_VSWITCHD_START()
> > +
> > +ADD_NAMESPACES(at_ns0, at_ns1)
> > +
> > +ADD_VETH_AFXDP(p0, at_ns0, br0, "10.1.1.1/24")
> > +ADD_VETH_AFXDP(p1, at_ns1, br0, "10.1.1.2/24")
> > +
> > +dnl Allow any traffic from ns0->ns1. Only allow nd, return traffic from ns1->ns0.
> > +AT_DATA([flows.txt], [dnl
> > +priority=1,action=drop
> > +priority=10,arp,action=normal
> > +priority=100,in_port=1,icmp,action=ct(commit),2
> > +priority=100,in_port=2,icmp,ct_state=-trk,action=ct(table=0)
> > +priority=100,in_port=2,icmp,ct_state=+trk+est,action=1
> > +])
> > +
> > +AT_CHECK([ovs-ofctl --bundle add-flows br0 flows.txt])
> > +
> > +dnl Pings from ns0->ns1 should work fine.
> > +NS_CHECK_EXEC([at_ns0], [ping -q -c 3 -i 0.3 -w 2 10.1.1.2 | FORMAT_PING], [0], [dnl
> > +3 packets transmitted, 3 received, 0% packet loss, time 0ms
> > +])
> > +
> > +AT_CHECK([ovs-appctl dpctl/dump-conntrack | FORMAT_CT(10.1.1.2)], [0], [dnl
> > +icmp,orig=(src=10.1.1.1,dst=10.1.1.2,id=<cleared>,type=8,code=0),reply=(src=10.1.1.2,dst=10.1.1.1,id=<cleared>,type=0,code=0)
> > +])
> > +
> > +AT_CHECK([ovs-appctl dpctl/flush-conntrack])
> > +
> > +dnl Pings from ns1->ns0 should fail.
> > +NS_CHECK_EXEC([at_ns1], [ping -q -c 3 -i 0.3 -w 2 10.1.1.1 | FORMAT_PING], [0], [dnl
> > +7 packets transmitted, 0 received, 100% packet loss, time 0ms
> > +])
> > +
> > +OVS_TRAFFIC_VSWITCHD_STOP
> > +AT_CLEANUP
> > +
> > +AT_SETUP([conntrack - get_nconns and get/set_maxconns])
> > +CHECK_CONNTRACK()
> > +CHECK_CT_DPIF_SET_GET_MAXCONNS()
> > +CHECK_CT_DPIF_GET_NCONNS()
> > +OVS_TRAFFIC_VSWITCHD_START()
> > +
> > +ADD_NAMESPACES(at_ns0, at_ns1)
> > +
> > +ADD_VETH_AFXDP(p0, at_ns0, br0, "10.1.1.1/24")
> > +ADD_VETH_AFXDP(p1, at_ns1, br0, "10.1.1.2/24")
> > +
> > +dnl Allow any traffic from ns0->ns1. Only allow nd, return traffic from ns1->ns0.
> > +AT_DATA([flows.txt], [dnl
> > +priority=1,action=drop
> > +priority=10,arp,action=normal
> > +priority=100,in_port=1,icmp,action=ct(commit),2
> > +priority=100,in_port=2,icmp,ct_state=-trk,action=ct(table=0)
> > +priority=100,in_port=2,icmp,ct_state=+trk+est,action=1
> > +])
> > +
> > +AT_CHECK([ovs-ofctl --bundle add-flows br0 flows.txt])
> > +
> > +dnl Pings from ns0->ns1 should work fine.
> > +NS_CHECK_EXEC([at_ns0], [ping -q -c 3 -i 0.3 -w 2 10.1.1.2 | FORMAT_PING], [0], [dnl
> > +3 packets transmitted, 3 received, 0% packet loss, time 0ms
> > +])
> > +
> > +AT_CHECK([ovs-appctl dpctl/dump-conntrack | FORMAT_CT(10.1.1.2)], [0], [dnl
> > +icmp,orig=(src=10.1.1.1,dst=10.1.1.2,id=<cleared>,type=8,code=0),reply=(src=10.1.1.2,dst=10.1.1.1,id=<cleared>,type=0,code=0)
> > +])
> > +
> > +AT_CHECK([ovs-appctl dpctl/ct-set-maxconns one-bad-dp], [2], [], [dnl
> > +ovs-vswitchd: maxconns missing or malformed (Invalid argument)
> > +ovs-appctl: ovs-vswitchd: server returned an error
> > +])
> > +
> > +AT_CHECK([ovs-appctl dpctl/ct-set-maxconns a], [2], [], [dnl
> > +ovs-vswitchd: maxconns missing or malformed (Invalid argument)
> > +ovs-appctl: ovs-vswitchd: server returned an error
> > +])
> > +
> > +AT_CHECK([ovs-appctl dpctl/ct-set-maxconns one-bad-dp 10], [2], [], [dnl
> > +ovs-vswitchd: datapath not found (Invalid argument)
> > +ovs-appctl: ovs-vswitchd: server returned an error
> > +])
> > +
> > +AT_CHECK([ovs-appctl dpctl/ct-get-maxconns one-bad-dp], [2], [], [dnl
> > +ovs-vswitchd: datapath not found (Invalid argument)
> > +ovs-appctl: ovs-vswitchd: server returned an error
> > +])
> > +
> > +AT_CHECK([ovs-appctl dpctl/ct-get-nconns one-bad-dp], [2], [], [dnl
> > +ovs-vswitchd: datapath not found (Invalid argument)
> > +ovs-appctl: ovs-vswitchd: server returned an error
> > +])
> > +
> > +AT_CHECK([ovs-appctl dpctl/ct-get-nconns], [], [dnl
> > +1
> > +])
> > +
> > +AT_CHECK([ovs-appctl dpctl/ct-get-maxconns], [], [dnl
> > +3000000
> > +])
> > +
> > +AT_CHECK([ovs-appctl dpctl/ct-set-maxconns 10], [], [dnl
> > +setting maxconns successful
> > +])
> > +
> > +AT_CHECK([ovs-appctl dpctl/ct-get-maxconns], [], [dnl
> > +10
> > +])
> > +
> > +AT_CHECK([ovs-appctl dpctl/flush-conntrack])
> > +
> > +AT_CHECK([ovs-appctl dpctl/ct-get-nconns], [], [dnl
> > +0
> > +])
> > +
> > +AT_CHECK([ovs-appctl dpctl/ct-get-maxconns], [], [dnl
> > +10
> > +])
> > +
> > +OVS_TRAFFIC_VSWITCHD_STOP
> > +AT_CLEANUP
> > +
> > +AT_SETUP([conntrack - IPv6 ping])
> > +CHECK_CONNTRACK()
> > +OVS_TRAFFIC_VSWITCHD_START()
> > +
> > +ADD_NAMESPACES(at_ns0, at_ns1)
> > +
> > +ADD_VETH_AFXDP(p0, at_ns0, br0, "fc00::1/96")
> > +ADD_VETH_AFXDP(p1, at_ns1, br0, "fc00::2/96")
> > +
> > +AT_DATA([flows.txt], [dnl
> > +
> > +dnl ICMPv6 echo request and reply go to table 1. The rest of the traffic goes
> > +dnl through normal action.
> > +table=0,priority=10,icmp6,icmp_type=128,action=goto_table:1
> > +table=0,priority=10,icmp6,icmp_type=129,action=goto_table:1
> > +table=0,priority=1,action=normal
> > +
> > +dnl Allow everything from ns0->ns1. Only allow return traffic from ns1->ns0.
> > +table=1,priority=100,in_port=1,icmp6,action=ct(commit),2
> > +table=1,priority=100,in_port=2,icmp6,ct_state=-trk,action=ct(table=0)
> > +table=1,priority=100,in_port=2,icmp6,ct_state=+trk+est,action=1
> > +table=1,priority=1,action=drop
> > +])
> > +
> > +AT_CHECK([ovs-ofctl --bundle add-flows br0 flows.txt])
> > +
> > +OVS_WAIT_UNTIL([ip netns exec at_ns0 ping6 -c 1 fc00::2])
> > +
> > +dnl The above ping creates state in the connection tracker. We're not
> > +dnl interested in that state.
> > +AT_CHECK([ovs-appctl dpctl/flush-conntrack])
> > +
> > +dnl Pings from ns1->ns0 should fail.
> > +NS_CHECK_EXEC([at_ns1], [ping6 -q -c 3 -i 0.3 -w 2 fc00::1 | FORMAT_PING], [0], [dnl
> > +7 packets transmitted, 0 received, 100% packet loss, time 0ms
> > +])
> > +
> > +dnl Pings from ns0->ns1 should work fine.
> > +NS_CHECK_EXEC([at_ns0], [ping6 -q -c 3 -i 0.3 -w 2 fc00::2 | FORMAT_PING], [0], [dnl
> > +3 packets transmitted, 3 received, 0% packet loss, time 0ms
> > +])
> > +
> > +AT_CHECK([ovs-appctl dpctl/dump-conntrack | FORMAT_CT(fc00::2)], [0], [dnl
> > +icmpv6,orig=(src=fc00::1,dst=fc00::2,id=<cleared>,type=128,code=0),reply=(src=fc00::2,dst=fc00::1,id=<cleared>,type=129,code=0)
> > +])
> > +
> > +OVS_TRAFFIC_VSWITCHD_STOP
> > +AT_CLEANUP
> >
>
_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev