This functionality was deprecated in 3.7 due to lack of use, testing and maintenance. It's time to remove it.
We still have the compile-time optimized versions of lookup functions for different subtable bit configurations, so some parts of the infrastructure for them stays. But dpif-netdev-lookup-generic is now the only dpif-netdev-lookup implementation. So, removing the dpif-netdev-lookup infra, hooking up dpif-netdev-lookup-generic directly into the callers and renaming it similarly to the rest of the sub-modules of dpif-netdev. The 'private' part in the names doesn't really make sense anymore. Will be renamed in the next commit to avoid unnecessary complexity in the diff. Signed-off-by: Ilya Maximets <[email protected]> --- Documentation/intro/install/dpdk.rst | 32 +- Documentation/topics/dpdk/bridge.rst | 92 ---- Documentation/topics/testing.rst | 43 -- NEWS | 1 + acinclude.m4 | 52 +- configure.ac | 2 - lib/automake.mk | 41 +- lib/dpif-netdev-lookup-autovalidator.c | 109 ----- lib/dpif-netdev-lookup-avx512-gather.c | 445 ------------------ lib/dpif-netdev-lookup.c | 193 -------- lib/dpif-netdev-lookup.h | 92 ---- ...-generic.c => dpif-netdev-private-dpcls.c} | 37 +- lib/dpif-netdev-private-dpcls.h | 13 +- lib/dpif-netdev-private-flow.h | 3 - lib/dpif-netdev-private-thread.h | 4 - lib/dpif-netdev-unixctl.man | 12 - lib/dpif-netdev.c | 156 +----- m4/openvswitch.m4 | 70 --- tests/pmd.at | 68 --- 19 files changed, 34 insertions(+), 1431 deletions(-) delete mode 100644 lib/dpif-netdev-lookup-autovalidator.c delete mode 100644 lib/dpif-netdev-lookup-avx512-gather.c delete mode 100644 lib/dpif-netdev-lookup.c delete mode 100644 lib/dpif-netdev-lookup.h rename lib/{dpif-netdev-lookup-generic.c => dpif-netdev-private-dpcls.c} (91%) diff --git a/Documentation/intro/install/dpdk.rst b/Documentation/intro/install/dpdk.rst index 8bc15529b..0928e0d51 100644 --- a/Documentation/intro/install/dpdk.rst +++ b/Documentation/intro/install/dpdk.rst @@ -155,19 +155,8 @@ has to be configured to build against the DPDK library (``--with-dpdk``). While ``--with-dpdk`` is required, you can pass any other configuration option described in :ref:`general-configuring`. - .. note:: - The AVX512 Datapath Classifier Performance feature is deprecated and will - be removed in a future release. - It is strongly recommended to build OVS with at least ``-msse4.2`` and - ``-mpopcnt`` optimization flags. If these flags are not enabled, the AVX512 - optimized DPCLS implementation is not available in the resulting binary. - For technical details see the subtable registration code in the - ``lib/dpif-netdev-lookup.c`` file. - - An example that enables the AVX512 optimizations is:: - - $ ./configure --with-dpdk=static CFLAGS="-Ofast -msse4.2 -mpopcnt" + ``-mpopcnt`` optimization flags. #. Build and install OVS, as described in :ref:`general-building` @@ -181,25 +170,6 @@ Additional information can be found in :doc:`general`. __ https://github.com/openvswitch/ovs/blob/main/rhel/README.RHEL.rst -Possible issues when enabling AVX512 -++++++++++++++++++++++++++++++++++++ - -The enabling of ISA optimized builds requires build-system support. -Certain versions of the assembler provided by binutils is known to have -AVX512 assembling issues. The binutils versions affected are 2.30 and 2.31. -As many distros backport fixes to previous versions of a package, checking -the version output of ``as -v`` can err on the side of disabling AVX512. To -remedy this, the OVS build system uses a build-time check to see if ``as`` -will correctly assemble the AVX512 code. The output of a good version when -running the ``./configure`` step of the build process is as follows:: - - $ checking binutils avx512 assembler checks passing... yes - -If a bug is detected in the binutils assembler, it would indicate ``no``. -Build an updated binutils, or request a backport of this binutils -fix commit ``2069ccaf8dc28ea699bd901fdd35d90613e4402a`` to fix the issue. - - Setup ----- diff --git a/Documentation/topics/dpdk/bridge.rst b/Documentation/topics/dpdk/bridge.rst index ab09f89f1..163bcc2e2 100644 --- a/Documentation/topics/dpdk/bridge.rst +++ b/Documentation/topics/dpdk/bridge.rst @@ -161,95 +161,3 @@ currently turned off by default. To turn on SMC:: $ ovs-vsctl --no-wait set Open_vSwitch . other_config:smc-enable=true - -Datapath Classifier Performance -------------------------------- - -.. note:: - - The AVX512 Datapath Classifier Performance feature is deprecated and will be - removed in a future release. - -The datapath classifier (dpcls) performs wildcard rule matching, a compute -intensive process of matching a packet ``miniflow`` to a rule ``miniflow``. The -code that does this compute work impacts datapath performance, and optimizing -it can provide higher switching performance. - -Modern CPUs provide extensive SIMD instructions which can be used to get higher -performance. The CPU OVS is being deployed on must be capable of running these -SIMD instructions in order to take advantage of the performance benefits. -In OVS v2.14 runtime CPU detection was introduced to enable identifying if -these CPU ISA additions are available, and to allow the user to enable them. - -OVS provides multiple implementations of dpcls. The following command enables -the user to check what implementations are available in a running instance:: - - $ ovs-appctl dpif-netdev/subtable-lookup-info-get - Available dpcls implementations: - autovalidator (Use count: 1, Priority: 5) - generic (Use count: 0, Priority: 1) - avx512_gather (Use count: 0, Priority: 3) - -To set the priority of a lookup function, run the ``prio-set`` command:: - - $ ovs-appctl dpif-netdev/subtable-lookup-prio-set avx512_gather 5 - Lookup priority change affected 1 dpcls ports and 1 subtables. - -The highest priority lookup function is used for classification, and the output -above indicates that one subtable of one DPCLS port is has changed its lookup -function due to the command being run. To verify the prioritization, re-run the -get command, note the updated priority of the ``avx512_gather`` function:: - - $ ovs-appctl dpif-netdev/subtable-lookup-info-get - Available dpcls implementations: - autovalidator (Use count: 1, Priority: 5) - generic (Use count: 0, Priority: 1) - avx512_gather (Use count: 0, Priority: 3) - -If two lookup functions have the same priority, the first one in the list is -chosen, and the 2nd occurrence of that priority is not used. Put in logical -terms, a subtable is chosen if its priority is greater than the previous -best candidate. - -Note that the ``avx512_gather`` implementation uses instructions which may be -affected by the Gather Data Sampling (GDS) vulnerability, aka Downfall, -mitigation (see documentation for CVE-2022-40982 for details). This could -result in lower performance when these mitigations are enabled. - -Optimizing Specific Subtable Search -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -.. note:: - - The AVX512 Optimizing Specific Subtable Search feature is deprecated and - will be removed in a future release. - -During the packet classification, the datapath can use specialized lookup -tables to optimize the search. However, not all situations are optimized. If -you see a message like the following one in the OVS logs, it means that there -is no specialized implementation available for the current network traffic:: - - Using non-specialized AVX512 lookup for subtable (X,Y) and possibly others. - -In this case, OVS will continue to process the traffic normally using a more -generic lookup table. - -Additional specialized lookups can be added to OVS if the user provides that -log message along with the command output as show below to the OVS mailing -list. Note that the numbers in the log message (``subtable (X,Y)``) need to -match with the numbers in the provided command output -(``dp-extra-info:miniflow_bits(X,Y)``). - -``ovs-appctl dpctl/dump-flows -m``, which results in output like this:: - - ufid:82770b5d-ca38-44ff-8283-74ba36bd1ca5, skb_priority(0/0),skb_mark(0/0) - ,ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0), - dp_hash(0/0),in_port(pcap0),packet_type(ns=0,id=0),eth(src=00:00:00:00:00: - 00/00:00:00:00:00:00,dst=ff:ff:ff:ff:ff:ff/00:00:00:00:00:00),eth_type( - 0x8100),vlan(vid=1,pcp=0),encap(eth_type(0x0800),ipv4(src=127.0.0.1/0.0.0.0 - ,dst=127.0.0.1/0.0.0.0,proto=17/0,tos=0/0,ttl=64/0,frag=no),udp(src=53/0, - dst=53/0)), packets:77072681, bytes:3545343326, used:0.000s, dp:ovs, - actions:vhostuserclient0, dp-extra-info:miniflow_bits(4,1) - -Please send an email to the OVS mailing list [email protected] with -the output of the ``dp-extra-info:miniflow_bits(4,1)`` values. diff --git a/Documentation/topics/testing.rst b/Documentation/topics/testing.rst index e3b06321a..278b5c1d0 100644 --- a/Documentation/topics/testing.rst +++ b/Documentation/topics/testing.rst @@ -326,49 +326,6 @@ To invoke the DPDK offloads testsuite with the userspace datapath, run:: This has only been tested on NVIDIA blades due to the limited availability of other blades that support rte_flow. -Userspace datapath: Testing and Validation of CPU-specific Optimizations -++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ - -.. note:: - The AVX512 CPU-specific optimization features are deprecated and will be - removed in a future release. - -As multiple versions of the datapath classifier each with different CPU ISA -optimizations, it is important to validate that they all give the exact same -results. To easily test all the implementations, an ``autovalidator`` -implementation of them exists. This implementation runs all other available -implementations, and verifies that the results are identical. - -Running the OVS unit tests with the autovalidator enabled ensures all -implementations provide the same results. Note that the performance of the -autovalidator is lower than all other implementations, as it tests the scalar -implementation against itself, and against all other enabled implementations. - -To adjust the autovalidator priority for a datapath classifier, use this -command:: - - $ ovs-appctl dpif-netdev/subtable-lookup-prio-set autovalidator 7 - -To run the OVS unit test suite with the autovalidator as the default -implementation, it is required to recompile OVS. During the recompilation, -the default priority of the `autovalidator` implementation is set to the -maximum priority, ensuring every test will be run with every implementation:: - - $ ./configure --enable-autovalidator - -The following line should be seen in the configuration log when the above -options are used:: - - checking whether DPCLS Autovalidator is default implementation... yes - -Compile OVS in debug mode to have `ovs_assert` statements error out if -there is a mismatch in the datapath classifier lookup. - -.. note:: - Run all the available testsuites including `make check`, - `make check-system-userspace` and `make check-dpdk` to ensure the optimal - test coverage. - Kernel datapath +++++++++++++++ diff --git a/NEWS b/NEWS index c828ae301..288ab8cc4 100644 --- a/NEWS +++ b/NEWS @@ -10,6 +10,7 @@ Post-v3.7.0 'dpdk-probe-at-init' config option, see ovs-vswitchd.conf.db(5)). - The following deprecated AVX512-specific features of the userspace datapath are now removed: + * AVX512-optimized DPCLS lookups. * AVX512-optimized action handling. * AVX512-optimized packet parsing (miniflow extraction). * AVX512-optimized DPIF input processing. diff --git a/acinclude.m4 b/acinclude.m4 index 58d5b9df8..bc26a284b 100644 --- a/acinclude.m4 +++ b/acinclude.m4 @@ -14,39 +14,6 @@ # See the License for the specific language governing permissions and # limitations under the License. -dnl Set OVS DPCLS Autovalidator as default subtable search at compile time? -dnl This enables automatically running all unit tests with all DPCLS -dnl implementations. -AC_DEFUN([OVS_CHECK_DPCLS_AUTOVALIDATOR], [ - AC_ARG_ENABLE([autovalidator], - [AS_HELP_STRING([--enable-autovalidator], - [Enable DPCLS autovalidator as default subtable - search implementation.])], - [autovalidator=yes],[autovalidator=no]) - AC_MSG_CHECKING([whether DPCLS Autovalidator is default implementation]) - if test "$autovalidator" != yes; then - AC_MSG_RESULT([no]) - else - AC_DEFINE([DPCLS_AUTOVALIDATOR_DEFAULT], [1], - [Autovalidator for the userspace datapath classifier is a - default implementation.]) - AC_MSG_RESULT([yes]) - AC_MSG_WARN( - [Explicit AVX512 feature support will be deprecated in the next release.]) - fi -]) - -dnl OVS_CHECK_AVX512 -dnl -dnl Checks if compiler and binutils supports various AVX512 ISA. -AC_DEFUN([OVS_CHECK_AVX512], [ - OVS_CHECK_BINUTILS_AVX512 - OVS_CONDITIONAL_CC_OPTION_DEFINE([-mavx512f], [HAVE_AVX512F]) - OVS_CONDITIONAL_CC_OPTION_DEFINE([-mavx512bw], [HAVE_AVX512BW]) - OVS_CONDITIONAL_CC_OPTION_DEFINE([-mavx512vl], [HAVE_AVX512VL]) - OVS_CHECK_AVX512VPOPCNTDQ -]) - dnl OVS_ENABLE_WERROR AC_DEFUN([OVS_ENABLE_WERROR], [AC_ARG_ENABLE( @@ -435,11 +402,7 @@ AC_DEFUN([OVS_CHECK_DPDK], [ # forces in pkg-config since this could override user-specified options. # It's enough to have -mssse3 to build with DPDK headers. DPDK_INCLUDE=$(echo "$DPDK_INCLUDE" | sed 's/-march=[[^ ]]*//g') - # Also stripping out '-mno-avx512f'. Support for AVX512 will be disabled - # if OVS will detect that it's broken. OVS could be built with a - # completely different toolchain that correctly supports AVX512, flags - # forced by DPDK only breaks our feature detection mechanism and leads to - # build failures: https://github.com/openvswitch/ovs-issues/issues/201 + # Also stripping out '-mno-avx512f' for the same reasons. DPDK_INCLUDE=$(echo "$DPDK_INCLUDE" | sed 's/-mno-avx512f//g') OVS_CFLAGS="$OVS_CFLAGS $DPDK_INCLUDE" OVS_ENABLE_OPTION([-mssse3]) @@ -613,19 +576,6 @@ AC_DEFUN([OVS_CONDITIONAL_CC_OPTION], AM_CONDITIONAL([$2], [test $ovs_have_cc_option = yes])]) dnl ---------------------------------------------------------------------- -dnl OVS_CONDITIONAL_CC_OPTION_DEFINE([OPTION], [CONDITIONAL]) -dnl Check whether the given C compiler OPTION is accepted. -dnl If so, enable the given Automake CONDITIONAL and define it. -dnl Example: OVS_CONDITIONAL_CC_OPTION_DEFINE([-mavx512f], [HAVE_AVX512F]) -AC_DEFUN([OVS_CONDITIONAL_CC_OPTION_DEFINE], - [OVS_CHECK_CC_OPTION( - [$1], [ovs_have_cc_option=yes], [ovs_have_cc_option=no]) - AM_CONDITIONAL([$2], [test $ovs_have_cc_option = yes]) - if test "$ovs_have_cc_option" = yes; then - AC_DEFINE([$2], [1], - [Define to 1 if compiler supports the '$1' option.]) - fi]) - dnl OVS_CHECK_SPARSE_TARGET dnl dnl The "cgcc" script from "sparse" isn't very good at detecting the diff --git a/configure.ac b/configure.ac index 1a790adb8..f02c6f313 100644 --- a/configure.ac +++ b/configure.ac @@ -186,8 +186,6 @@ OVS_CONDITIONAL_CC_OPTION([-Wno-unused-parameter], [HAVE_WNO_UNUSED_PARAMETER]) OVS_ENABLE_WERROR_TOP OVS_ENABLE_SPARSE OVS_CTAGS_IDENTIFIERS -OVS_CHECK_DPCLS_AUTOVALIDATOR -OVS_CHECK_AVX512 AC_ARG_VAR(KARCH, [Kernel Architecture String]) AC_SUBST(KARCH) diff --git a/lib/automake.mk b/lib/automake.mk index 954a62778..9f9a5d574 100644 --- a/lib/automake.mk +++ b/lib/automake.mk @@ -15,34 +15,6 @@ lib_libopenvswitch_la_LDFLAGS = \ -Wl,--version-script=$(top_builddir)/lib/libopenvswitch.sym \ $(AM_LDFLAGS) -if HAVE_AVX512F -if HAVE_LD_AVX512_GOOD -# Build library of avx512 code with CPU ISA CFLAGS enabled. This allows the -# compiler to use the ISA features required for the ISA optimized code-paths. -# Use LDFLAGS to compile only static library of this code, as it should be -# statically linked into vswitchd even if vswitchd is a shared build. -noinst_LTLIBRARIES += lib/libopenvswitchavx512.la -lib_libopenvswitch_la_LIBADD += lib/libopenvswitchavx512.la -lib_libopenvswitchavx512_la_CFLAGS = \ - -mavx512f \ - -mbmi \ - -mbmi2 \ - -fPIC \ - $(AM_CFLAGS) -if HAVE_AVX512BW -if HAVE_AVX512VL -lib_libopenvswitchavx512_la_CFLAGS += \ - -mavx512bw \ - -mavx512vl -lib_libopenvswitchavx512_la_SOURCES = \ - lib/dpif-netdev-lookup-avx512-gather.c -endif # HAVE_AVX512VL -endif # HAVE_AVX512BW -lib_libopenvswitchavx512_la_LDFLAGS = \ - -static -endif # HAVE_LD_AVX512_GOOD -endif # HAVE_AVX512F - # Build core vswitch libraries as before lib_libopenvswitch_la_SOURCES = \ lib/aes128.c \ @@ -112,19 +84,16 @@ lib_libopenvswitch_la_SOURCES = \ lib/dp-packet-gso.c \ lib/dp-packet-gso.h \ lib/dpdk.h \ - lib/dpif-netdev-lookup.h \ - lib/dpif-netdev-lookup.c \ - lib/dpif-netdev-lookup-autovalidator.c \ - lib/dpif-netdev-lookup-generic.c \ - lib/dpif-netdev.c \ - lib/dpif-netdev.h \ + lib/dpif-netdev-perf.c \ + lib/dpif-netdev-perf.h \ lib/dpif-netdev-private-dfc.c \ lib/dpif-netdev-private-dfc.h \ + lib/dpif-netdev-private-dpcls.c \ lib/dpif-netdev-private-dpcls.h \ lib/dpif-netdev-private-flow.h \ lib/dpif-netdev-private-thread.h \ - lib/dpif-netdev-perf.c \ - lib/dpif-netdev-perf.h \ + lib/dpif-netdev.c \ + lib/dpif-netdev.h \ lib/dpif-offload.c \ lib/dpif-offload.h \ lib/dpif-offload-dummy.c \ diff --git a/lib/dpif-netdev-lookup-autovalidator.c b/lib/dpif-netdev-lookup-autovalidator.c deleted file mode 100644 index 475e1ab1e..000000000 --- a/lib/dpif-netdev-lookup-autovalidator.c +++ /dev/null @@ -1,109 +0,0 @@ -/* - * Copyright (c) 2020 Intel Corporation. - * - * Licensed under the Apache License, Version 2.0 (the "License"); - * you may not use this file except in compliance with the License. - * You may obtain a copy of the License at: - * - * http://www.apache.org/licenses/LICENSE-2.0 - * - * Unless required by applicable law or agreed to in writing, software - * distributed under the License is distributed on an "AS IS" BASIS, - * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. - * See the License for the specific language governing permissions and - * limitations under the License. - */ - -#include <config.h> -#include "dpif-netdev.h" -#include "dpif-netdev-lookup.h" -#include "openvswitch/vlog.h" - -VLOG_DEFINE_THIS_MODULE(dpif_lookup_autovalidator); - -/* This file implements an automated validator for subtable search - * implementations. It compares the results of the generic scalar search result - * with ISA optimized implementations. - * - * Note the goal is *NOT* to test the *specialized* versions of subtables, as - * the compiler performs the specialization - and we rely on the correctness of - * the compiler to not break those specialized variants. - * - * The goal is to ensure identical results of the different implementations, - * despite that the implementations may have different methods to get those - * results. - * - * Example: AVX-512 ISA uses different instructions and algorithm to the scalar - * implementation, however the results (rules[] output) must be the same. - */ - -dpcls_subtable_lookup_func -dpcls_subtable_autovalidator_probe(uint32_t u0 OVS_UNUSED, - uint32_t u1 OVS_UNUSED); - -static uint32_t -dpcls_subtable_autovalidator(struct dpcls_subtable *subtable, - uint32_t keys_map, - const struct netdev_flow_key *keys[], - struct dpcls_rule **rules_good) -{ - const uint32_t u0_bit_count = subtable->mf_bits_set_unit0; - const uint32_t u1_bit_count = subtable->mf_bits_set_unit1; - - /* Scalar generic - the "known correct" version. */ - dpcls_subtable_lookup_func lookup_good; - lookup_good = dpcls_subtable_generic_probe(u0_bit_count, u1_bit_count); - - /* Run actual scalar implementation to get known good results. */ - uint32_t matches_good = lookup_good(subtable, keys_map, keys, rules_good); - - struct dpcls_subtable_lookup_info_t *lookup_funcs; - int32_t lookup_func_count = dpcls_subtable_lookup_info_get(&lookup_funcs); - if (lookup_func_count < 0) { - VLOG_ERR("failed to get lookup subtable function implementations\n"); - return 0; - } - - /* Ensure the autovalidator is the 0th item in the lookup_funcs array. */ - ovs_assert(lookup_funcs[0].probe(0, 0) == dpcls_subtable_autovalidator); - - /* Now compare all other implementations against known good results. - * Note we start iterating from array[1], as 0 is the autotester itself. - */ - for (int i = 1; i < lookup_func_count; i++) { - dpcls_subtable_lookup_func lookup_func; - lookup_func = lookup_funcs[i].probe(u0_bit_count, - u1_bit_count); - - /* If its probe returns a function, then test it. */ - if (lookup_func) { - struct dpcls_rule *rules_test[NETDEV_MAX_BURST]; - size_t rules_size = sizeof(struct dpcls_rule *) * NETDEV_MAX_BURST; - memset(rules_test, 0, rules_size); - uint32_t matches_test = lookup_func(subtable, keys_map, keys, - rules_test); - - /* Ensure same packets matched against subtable. */ - if (matches_good != matches_test) { - VLOG_ERR("matches_good 0x%x != matches_test 0x%x in func %s\n", - matches_good, matches_test, lookup_funcs[i].name); - } - - /* Ensure rules matched are the same for scalar / others. */ - int j; - ULLONG_FOR_EACH_1 (j, matches_test) { - ovs_assert(rules_good[j] == rules_test[j]); - } - } - } - - return matches_good; -} - -dpcls_subtable_lookup_func -dpcls_subtable_autovalidator_probe(uint32_t u0 OVS_UNUSED, - uint32_t u1 OVS_UNUSED) -{ - /* Always return the same validator tester, it works for all subtables. */ - return dpcls_subtable_autovalidator; -} diff --git a/lib/dpif-netdev-lookup-avx512-gather.c b/lib/dpif-netdev-lookup-avx512-gather.c deleted file mode 100644 index b916b2487..000000000 --- a/lib/dpif-netdev-lookup-avx512-gather.c +++ /dev/null @@ -1,445 +0,0 @@ -/* - * Copyright (c) 2020, Intel Corporation. - * - * Licensed under the Apache License, Version 2.0 (the "License"); - * you may not use this file except in compliance with the License. - * You may obtain a copy of the License at: - * - * http://www.apache.org/licenses/LICENSE-2.0 - * - * Unless required by applicable law or agreed to in writing, software - * distributed under the License is distributed on an "AS IS" BASIS, - * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. - * See the License for the specific language governing permissions and - * limitations under the License. - */ - -#ifdef __x86_64__ -#if !defined(__CHECKER__) - -#include <config.h> - -#include "dpif-netdev.h" -#include "dpif-netdev-lookup.h" - -#include "cmap.h" -#include "flow.h" -#include "pvector.h" -#include "openvswitch/vlog.h" - -#include "immintrin.h" - -/* Each AVX512 register (zmm register in assembly notation) can contain up to - * 512 bits, which is equivalent to 8 uint64_t variables. This is the maximum - * number of miniflow blocks that can be processed in a single pass of the - * AVX512 code at a time. - */ -#define NUM_U64_IN_ZMM_REG (8) - -/* This implementation of AVX512 gather allows up to 16 blocks of MF data to be - * present in the blocks_cache, hence the multiply by 2 in the blocks count. - */ -#define MF_BLOCKS_PER_PACKET (NUM_U64_IN_ZMM_REG * 2) - -/* Blocks cache size is the maximum number of miniflow blocks that this - * implementation of lookup can handle. - */ -#define BLOCKS_CACHE_SIZE (NETDEV_MAX_BURST * MF_BLOCKS_PER_PACKET) - -/* The gather instruction can handle a scale for the size of the items to - * gather. For uint64_t data, this scale is 8. - */ -#define GATHER_SCALE_8 (8) - - -VLOG_DEFINE_THIS_MODULE(dpif_lookup_avx512_gather); - -static inline __m512i -_mm512_popcnt_epi64_manual(__m512i v_in) -{ - static const uint8_t pop_lut[64] = { - 0, 1, 1, 2, 1, 2, 2, 3, 1, 2, 2, 3, 2, 3, 3, 4, - 0, 1, 1, 2, 1, 2, 2, 3, 1, 2, 2, 3, 2, 3, 3, 4, - 0, 1, 1, 2, 1, 2, 2, 3, 1, 2, 2, 3, 2, 3, 3, 4, - 0, 1, 1, 2, 1, 2, 2, 3, 1, 2, 2, 3, 2, 3, 3, 4, - }; - __m512i v_pop_lut = _mm512_loadu_si512(pop_lut); - - __m512i v_in_srl8 = _mm512_srli_epi64(v_in, 4); - __m512i v_nibble_mask = _mm512_set1_epi8(0xF); - __m512i v_in_lo = _mm512_and_si512(v_in, v_nibble_mask); - __m512i v_in_hi = _mm512_and_si512(v_in_srl8, v_nibble_mask); - - __m512i v_lo_pop = _mm512_shuffle_epi8(v_pop_lut, v_in_lo); - __m512i v_hi_pop = _mm512_shuffle_epi8(v_pop_lut, v_in_hi); - __m512i v_u8_pop = _mm512_add_epi8(v_lo_pop, v_hi_pop); - - return _mm512_sad_epu8(v_u8_pop, _mm512_setzero_si512()); -} - -/* Wrapper function required to enable ISA. First check if the compiler - * supports the ISA itself. If the ISA is supported, enable it via the - * attribute target. If the ISA is not supported by the compiler it indicates - * the compiler is too old or is not capable of compiling the requested ISA - * level, so fallback to the integer manual implementation. - */ -#if HAVE_AVX512VPOPCNTDQ -static inline __m512i -__attribute__((__target__("avx512vpopcntdq"))) -_mm512_popcnt_epi64_wrapper(__m512i v_in) -{ - return _mm512_popcnt_epi64(v_in); -} -#else -static inline __m512i -_mm512_popcnt_epi64_wrapper(__m512i v_in) -{ - return _mm512_popcnt_epi64_manual(v_in); -} -#endif - -static inline uint64_t -netdev_rule_matches_key(const struct dpcls_rule *rule, - const uint32_t mf_bits_total, - const uint64_t * block_cache) -{ - const uint64_t *keyp = miniflow_get_values(&rule->flow.mf); - const uint64_t *maskp = miniflow_get_values(&rule->mask->mf); - const uint32_t lane_mask = (1ULL << mf_bits_total) - 1; - - /* Always load a full cache line from blocks_cache. Other loads must be - * trimmed to the amount of data required for mf_bits_total blocks. - */ - uint32_t res_mask; - - /* To avoid a loop, we have two iterations of a block of code here. - * Note the scope brackets { } are used to avoid accidental variable usage - * in the second iteration. - */ - { - __m512i v_blocks = _mm512_loadu_si512(&block_cache[0]); - __m512i v_mask = _mm512_maskz_loadu_epi64(lane_mask, &maskp[0]); - __m512i v_key = _mm512_maskz_loadu_epi64(lane_mask, &keyp[0]); - __m512i v_data = _mm512_and_si512(v_blocks, v_mask); - res_mask = _mm512_mask_cmpeq_epi64_mask(lane_mask, v_data, v_key); - } - - if (mf_bits_total > 8) { - uint32_t lane_mask_gt8 = lane_mask >> 8; - __m512i v_blocks = _mm512_loadu_si512(&block_cache[8]); - __m512i v_mask = _mm512_maskz_loadu_epi64(lane_mask_gt8, &maskp[8]); - __m512i v_key = _mm512_maskz_loadu_epi64(lane_mask_gt8, &keyp[8]); - __m512i v_data = _mm512_and_si512(v_blocks, v_mask); - uint32_t c = _mm512_mask_cmpeq_epi64_mask(lane_mask_gt8, v_data, - v_key); - res_mask |= (c << 8); - } - - /* Returns 1 assuming result of SIMD compare is all blocks matching. */ - return res_mask == lane_mask; -} - -/* Takes u0 and u1 inputs, and gathers the next 8 blocks to be stored - * contiguously into the blocks cache. Note that the pointers and bitmasks - * passed into this function must be incremented for handling next 8 blocks. - * - * Register contents on entry: - * v_u0: register with all u64 lanes filled with u0 bits. - * v_u1: register with all u64 lanes filled with u1 bits. - * pkt_blocks: pointer to packet blocks. - * tbl_blocks: pointer to table blocks. - * tbl_mf_masks: pointer to miniflow bitmasks for this subtable. - * u1_bcast_msk: bitmask of lanes where u1 bits are used. - * pkt_mf_u0_pop: population count of bits in u0 of the packet. - * zero_mask: bitmask of lanes to zero as packet doesn't have mf bits set. - * u64_lanes_mask: bitmask of lanes to process. - * use_vpop: compile-time constant indicating if VPOPCNT instruction allowed. - */ -static inline ALWAYS_INLINE __m512i -avx512_blocks_gather(__m512i v_u0, - __m512i v_u1, - const void *pkt_blocks, - const void *tbl_blocks, - const void *tbl_mf_masks, - __mmask64 u1_bcast_msk, - const uint64_t pkt_mf_u0_pop, - __mmask64 zero_mask, - __mmask64 u64_lanes_mask, - const uint32_t use_vpop) -{ - /* Suggest to compiler to load tbl blocks ahead of gather(). */ - __m512i v_tbl_blocks = _mm512_maskz_loadu_epi64(u64_lanes_mask, - tbl_blocks); - - /* Blend u0 and u1 bits together for these 8 blocks. */ - __m512i v_pkt_bits = _mm512_mask_blend_epi64(u1_bcast_msk, v_u0, v_u1); - - /* Load pre-created tbl miniflow bitmasks, bitwise AND with them. */ - __m512i v_tbl_masks = _mm512_maskz_loadu_epi64(u64_lanes_mask, - tbl_mf_masks); - __m512i v_masks = _mm512_and_si512(v_pkt_bits, v_tbl_masks); - - /* Calculate AVX512 popcount for u64 lanes using the native instruction - * if available, or using emulation if not available. - */ - __m512i v_popcnts; - if (use_vpop) { - v_popcnts = _mm512_popcnt_epi64_wrapper(v_masks); - } else { - v_popcnts = _mm512_popcnt_epi64_manual(v_masks); - } - - /* Add popcounts and offset for u1 bits. */ - __m512i v_idx_u0_offset = _mm512_maskz_set1_epi64(u1_bcast_msk, - pkt_mf_u0_pop); - __m512i v_indexes = _mm512_add_epi64(v_popcnts, v_idx_u0_offset); - - /* Gather u64 blocks from packet miniflow. */ - __m512i v_zeros = _mm512_setzero_si512(); - __m512i v_blocks = _mm512_mask_i64gather_epi64(v_zeros, u64_lanes_mask, - v_indexes, pkt_blocks, - GATHER_SCALE_8); - - /* Mask pkt blocks with subtable blocks, k-mask to zero lanes. */ - __m512i v_masked_blocks = _mm512_maskz_and_epi64(zero_mask, v_blocks, - v_tbl_blocks); - return v_masked_blocks; -} - -static inline uint32_t ALWAYS_INLINE -avx512_lookup_impl(struct dpcls_subtable *subtable, - uint32_t keys_map, - const struct netdev_flow_key *keys[], - struct dpcls_rule **rules, - const uint32_t bit_count_u0, - const uint32_t bit_count_u1, - const uint32_t use_vpop) -{ - OVS_ALIGNED_VAR(CACHE_LINE_SIZE)uint64_t block_cache[BLOCKS_CACHE_SIZE]; - uint32_t hashes[NETDEV_MAX_BURST]; - - const uint32_t n_pkts = __builtin_popcountll(keys_map); - ovs_assert(NETDEV_MAX_BURST >= n_pkts); - - const uint32_t bit_count_total = bit_count_u0 + bit_count_u1; - const uint64_t bit_count_total_mask = (1ULL << bit_count_total) - 1; - - const uint64_t tbl_u0 = subtable->mask.mf.map.bits[0]; - const uint64_t tbl_u1 = subtable->mask.mf.map.bits[1]; - - const uint64_t *tbl_blocks = miniflow_get_values(&subtable->mask.mf); - const uint64_t *tbl_mf_masks = subtable->mf_masks; - - int i; - ULLONG_FOR_EACH_1 (i, keys_map) { - /* Create mask register with packet-specific u0 offset. - * Note that as 16 blocks can be handled in total, the width of the - * mask register must be >=16. - */ - const uint64_t pkt_mf_u0_bits = keys[i]->mf.map.bits[0]; - const uint64_t pkt_mf_u0_pop = __builtin_popcountll(pkt_mf_u0_bits); - const __mmask64 u1_bcast_mask = (UINT64_MAX << bit_count_u0); - - /* Broadcast u0, u1 bitmasks to 8x u64 lanes. */ - __m512i v_u0 = _mm512_set1_epi64(keys[i]->mf.map.bits[0]); - __m512i v_u1 = _mm512_set1_epi64(keys[i]->mf.map.bits[1]); - - /* Zero out bits that pkt doesn't have: - * - 2x pext() to extract bits from packet miniflow as needed by TBL - * - Shift u1 over by bit_count of u0, OR to create zero bitmask - */ - uint64_t u0_to_zero = _pext_u64(keys[i]->mf.map.bits[0], tbl_u0); - uint64_t u1_to_zero = _pext_u64(keys[i]->mf.map.bits[1], tbl_u1); - const uint64_t zero_mask_wip = (u1_to_zero << bit_count_u0) | - u0_to_zero; - const uint64_t zero_mask = zero_mask_wip & bit_count_total_mask; - - /* Get ptr to packet data blocks. */ - const uint64_t *pkt_blocks = miniflow_get_values(&keys[i]->mf); - - /* Store first 8 blocks cache, full cache line aligned. */ - __m512i v_blocks = avx512_blocks_gather(v_u0, v_u1, - &pkt_blocks[0], - &tbl_blocks[0], - &tbl_mf_masks[0], - u1_bcast_mask, - pkt_mf_u0_pop, - zero_mask, - bit_count_total_mask, - use_vpop); - _mm512_storeu_si512(&block_cache[i * MF_BLOCKS_PER_PACKET], v_blocks); - - if (bit_count_total > 8) { - /* Shift masks over by 8. - * Pkt blocks pointer remains 0, it is incremented by popcount. - * Move tbl and mf masks pointers forward. - * Increase offsets by 8. - * Re-run same gather code. - */ - uint64_t zero_mask_gt8 = (zero_mask >> 8); - uint64_t u1_bcast_mask_gt8 = (u1_bcast_mask >> 8); - uint64_t bit_count_gt8_mask = bit_count_total_mask >> 8; - - __m512i v_blocks_gt8 = avx512_blocks_gather(v_u0, v_u1, - &pkt_blocks[0], - &tbl_blocks[8], - &tbl_mf_masks[8], - u1_bcast_mask_gt8, - pkt_mf_u0_pop, - zero_mask_gt8, - bit_count_gt8_mask, - use_vpop); - _mm512_storeu_si512(&block_cache[(i * MF_BLOCKS_PER_PACKET) + 8], - v_blocks_gt8); - } - - } - - /* Hash the now linearized blocks of packet metadata. */ - ULLONG_FOR_EACH_1 (i, keys_map) { - uint64_t *block_ptr = &block_cache[i * MF_BLOCKS_PER_PACKET]; - uint32_t hash = hash_add_words64(0, block_ptr, bit_count_total); - hashes[i] = hash_finish(hash, bit_count_total * 8); - } - - /* Lookup: this returns a bitmask of packets where the hash table had - * an entry for the given hash key. Presence of a hash key does not - * guarantee matching the key, as there can be hash collisions. - */ - uint32_t found_map; - const struct cmap_node *nodes[NETDEV_MAX_BURST]; - found_map = cmap_find_batch(&subtable->rules, keys_map, hashes, nodes); - - /* Verify that packet actually matched rule. If not found, a hash - * collision has taken place, so continue searching with the next node. - */ - ULLONG_FOR_EACH_1 (i, found_map) { - struct dpcls_rule *rule; - - CMAP_NODE_FOR_EACH (rule, cmap_node, nodes[i]) { - const uint32_t cidx = i * MF_BLOCKS_PER_PACKET; - uint32_t match = netdev_rule_matches_key(rule, bit_count_total, - &block_cache[cidx]); - if (OVS_LIKELY(match)) { - rules[i] = rule; - subtable->hit_cnt++; - goto next; - } - } - - /* None of the found rules was a match. Clear the i-th bit to - * search for this key in the next subtable. */ - ULLONG_SET0(found_map, i); - next: - ; /* Keep Sparse happy. */ - } - - return found_map; -} - -/* Use a different pattern to conditionally use the VPOPCNTDQ target attribute - * here. - * The usual pattern using a '#if HAVE_AVX512VPOPCNTDQ' type check won't work - * inside a macro. - * Define VPOPCNTDQ_TARGET which will either be the "avx512vpopcntdq" target - * attribute or nothing depending on AVX512VPOPCNTDQ support in the compiler. - */ -#if HAVE_AVX512VPOPCNTDQ -#define VPOPCNTDQ_TARGET __attribute__((__target__("avx512vpopcntdq"))) -#else -#define VPOPCNTDQ_TARGET -#endif - -/* Expand out specialized functions with U0 and U1 bit attributes. As the - * AVX512 vpopcnt instruction is not supported on all AVX512 capable CPUs, - * create two functions for each miniflow signature. This allows the runtime - * CPU detection in probe() to select the ideal implementation. - */ -#define DECLARE_OPTIMIZED_LOOKUP_FUNCTION(U0, U1) \ - static uint32_t \ - dpcls_avx512_gather_mf_##U0##_##U1(struct dpcls_subtable *subtable, \ - uint32_t keys_map, \ - const struct netdev_flow_key *keys[], \ - struct dpcls_rule **rules) \ - { \ - const uint32_t use_vpop = 0; \ - return avx512_lookup_impl(subtable, keys_map, keys, rules, \ - U0, U1, use_vpop); \ - } \ - \ - static uint32_t VPOPCNTDQ_TARGET \ - dpcls_avx512_gather_mf_##U0##_##U1##_vpop(struct dpcls_subtable *subtable,\ - uint32_t keys_map, \ - const struct netdev_flow_key *keys[], \ - struct dpcls_rule **rules) \ - { \ - const uint32_t use_vpop = 1; \ - return avx512_lookup_impl(subtable, keys_map, keys, rules, \ - U0, U1, use_vpop); \ - } \ - -DECLARE_OPTIMIZED_LOOKUP_FUNCTION(9, 4) -DECLARE_OPTIMIZED_LOOKUP_FUNCTION(9, 1) -DECLARE_OPTIMIZED_LOOKUP_FUNCTION(8, 1) -DECLARE_OPTIMIZED_LOOKUP_FUNCTION(5, 3) -DECLARE_OPTIMIZED_LOOKUP_FUNCTION(5, 2) -DECLARE_OPTIMIZED_LOOKUP_FUNCTION(5, 1) -DECLARE_OPTIMIZED_LOOKUP_FUNCTION(4, 1) -DECLARE_OPTIMIZED_LOOKUP_FUNCTION(4, 0) - -/* Check if a specialized function is valid for the required subtable. - * The use_vpop variable is used to decide if the VPOPCNT instruction can be - * used or not. - */ -#define CHECK_LOOKUP_FUNCTION(U0, U1, use_vpop) \ - ovs_assert((U0 + U1) <= (NUM_U64_IN_ZMM_REG * 2)); \ - if (!f && u0_bits == U0 && u1_bits == U1) { \ - if (use_vpop) { \ - f = dpcls_avx512_gather_mf_##U0##_##U1##_vpop; \ - } else { \ - f = dpcls_avx512_gather_mf_##U0##_##U1; \ - } \ - } - -static uint32_t -dpcls_avx512_gather_mf_any(struct dpcls_subtable *subtable, uint32_t keys_map, - const struct netdev_flow_key *keys[], - struct dpcls_rule **rules) -{ - const uint32_t use_vpop = 0; - return avx512_lookup_impl(subtable, keys_map, keys, rules, - subtable->mf_bits_set_unit0, - subtable->mf_bits_set_unit1, - use_vpop); -} - -dpcls_subtable_lookup_func -dpcls_subtable_avx512_gather_probe__(uint32_t u0_bits, uint32_t u1_bits, - bool use_vpop) -{ - dpcls_subtable_lookup_func f = NULL; - - CHECK_LOOKUP_FUNCTION(9, 4, use_vpop); - CHECK_LOOKUP_FUNCTION(9, 1, use_vpop); - CHECK_LOOKUP_FUNCTION(8, 1, use_vpop); - CHECK_LOOKUP_FUNCTION(5, 3, use_vpop); - CHECK_LOOKUP_FUNCTION(5, 2, use_vpop); - CHECK_LOOKUP_FUNCTION(5, 1, use_vpop); - CHECK_LOOKUP_FUNCTION(4, 1, use_vpop); - CHECK_LOOKUP_FUNCTION(4, 0, use_vpop); - - /* Check if the _any looping version of the code can perform this miniflow - * lookup. Performance gain may be less pronounced due to non-specialized - * hashing, however there is usually a good performance win overall. - */ - if (!f && (u0_bits + u1_bits) < (NUM_U64_IN_ZMM_REG * 2)) { - f = dpcls_avx512_gather_mf_any; - VLOG_INFO_ONCE("Using non-specialized AVX512 lookup for subtable" - " (%d,%d) and possibly others.", u0_bits, u1_bits); - } - - return f; -} - -#endif /* CHECKER */ -#endif /* __x86_64__ */ diff --git a/lib/dpif-netdev-lookup.c b/lib/dpif-netdev-lookup.c deleted file mode 100644 index 4c1379aa5..000000000 --- a/lib/dpif-netdev-lookup.c +++ /dev/null @@ -1,193 +0,0 @@ -/* - * Copyright (c) 2020 Intel Corporation. - * - * Licensed under the Apache License, Version 2.0 (the "License"); - * you may not use this file except in compliance with the License. - * You may obtain a copy of the License at: - * - * http://www.apache.org/licenses/LICENSE-2.0 - * - * Unless required by applicable law or agreed to in writing, software - * distributed under the License is distributed on an "AS IS" BASIS, - * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. - * See the License for the specific language governing permissions and - * limitations under the License. - */ - -#include <config.h> -#include <errno.h> -#include "dpif-netdev-lookup.h" - -#include "cpu.h" -#include "openvswitch/vlog.h" - -VLOG_DEFINE_THIS_MODULE(dpif_netdev_lookup); -#define DPCLS_IMPL_AVX512_CHECK (__x86_64__ && HAVE_AVX512F \ - && HAVE_LD_AVX512_GOOD && HAVE_AVX512BW && __SSE4_2__) - -#if DPCLS_IMPL_AVX512_CHECK -static dpcls_subtable_lookup_func -dpcls_subtable_avx512_gather_probe(uint32_t u0_bits, uint32_t u1_bits) -{ - if (!cpu_has_isa(OVS_CPU_ISA_X86_AVX512F) - || !cpu_has_isa(OVS_CPU_ISA_X86_BMI2)) { - return NULL; - } - - return dpcls_subtable_avx512_gather_probe__(u0_bits, u1_bits, - cpu_has_isa(OVS_CPU_ISA_X86_VPOPCNTDQ)); -} -#endif - -/* Actual list of implementations goes here */ -static struct dpcls_subtable_lookup_info_t subtable_lookups[] = { - /* The autovalidator implementation will not be used by default, it must - * be enabled at compile time to be the default lookup implementation. The - * user may enable it at runtime using the normal "prio-set" command if - * desired. The compile time default switch is here to enable all unit - * tests to transparently run with the autovalidator. - */ -#ifdef DPCLS_AUTOVALIDATOR_DEFAULT - { .prio = 255, -#else - { .prio = 0, -#endif - .probe = dpcls_subtable_autovalidator_probe, - .name = "autovalidator", - .usage_cnt = ATOMIC_COUNT_INIT(0), }, - - /* The default scalar C code implementation. */ - { .prio = 1, - .probe = dpcls_subtable_generic_probe, - .name = "generic", - .usage_cnt = ATOMIC_COUNT_INIT(0), }, - -#if DPCLS_IMPL_AVX512_CHECK - /* Only available on x86_64 bit builds with SSE 4.2 used for OVS core. */ - { .prio = 0, - .probe = dpcls_subtable_avx512_gather_probe, - .name = "avx512_gather", - .usage_cnt = ATOMIC_COUNT_INIT(0), }, -#else - /* Disabling AVX512 at compile time, as compile time requirements not met. - * This could be due to a number of reasons: - * 1) core OVS is not compiled with SSE4.2 instruction set. - * The SSE42 instructions are required to use CRC32 ISA for high- - * performance hashing. Consider ./configure of OVS with -msse42 (or - * newer) to enable CRC32 hashing and higher performance. - * 2) The assembler in binutils versions 2.30 and 2.31 has bugs in AVX512 - * assembly. Compile time probes check for this assembler issue, and - * disable the HAVE_LD_AVX512_GOOD check if an issue is detected. - * Please upgrade binutils, or backport this binutils fix commit: - * 2069ccaf8dc28ea699bd901fdd35d90613e4402a - */ -#endif -}; - -int -dpcls_subtable_lookup_info_get(struct dpcls_subtable_lookup_info_t **out_ptr) -{ - if (out_ptr == NULL) { - return -1; - } - - *out_ptr = subtable_lookups; - return ARRAY_SIZE(subtable_lookups); -} - -/* sets the priority of the lookup function with "name". */ -int -dpcls_subtable_set_prio(const char *name, uint8_t priority) -{ - for (int i = 0; i < ARRAY_SIZE(subtable_lookups); i++) { - if (strcmp(name, subtable_lookups[i].name) == 0) { - subtable_lookups[i].prio = priority; - VLOG_INFO("Subtable function '%s' set priority to %d\n", - name, priority); - return 0; - } - } - VLOG_WARN("Subtable function '%s' not found, failed to set priority\n", - name); - return -EINVAL; -} - -dpcls_subtable_lookup_func -dpcls_subtable_get_best_impl(uint32_t u0_bit_count, uint32_t u1_bit_count, - struct dpcls_subtable_lookup_info_t **info) -{ - struct dpcls_subtable_lookup_info_t *best_info = NULL; - dpcls_subtable_lookup_func best_func = NULL; - int prio = -1; - - /* Iter over each subtable impl, and get highest priority one. */ - for (int i = 0; i < ARRAY_SIZE(subtable_lookups); i++) { - struct dpcls_subtable_lookup_info_t *impl_info = &subtable_lookups[i]; - dpcls_subtable_lookup_func probed_func; - - if (impl_info->prio <= prio) { - continue; - } - - probed_func = subtable_lookups[i].probe(u0_bit_count, - u1_bit_count); - if (!probed_func) { - continue; - } - - best_func = probed_func; - best_info = impl_info; - prio = impl_info->prio; - } - - /* Programming error - we must always return a valid func ptr. */ - ovs_assert(best_func != NULL && best_info != NULL); - - VLOG_DBG("Subtable lookup function '%s' with units (%d,%d), priority %d\n", - best_info->name, u0_bit_count, u1_bit_count, prio); - - if (info) { - *info = best_info; - } - return best_func; -} - -void -dpcls_info_inc_usage(struct dpcls_subtable_lookup_info_t *info) -{ - if (info) { - atomic_count_inc(&info->usage_cnt); - } -} - -void -dpcls_info_dec_usage(struct dpcls_subtable_lookup_info_t *info) -{ - if (info) { - atomic_count_dec(&info->usage_cnt); - } -} - -void -dpcls_impl_print_stats(struct ds *reply) -{ - struct dpcls_subtable_lookup_info_t *lookup_funcs = NULL; - int count = dpcls_subtable_lookup_info_get(&lookup_funcs); - - /* Add all DPCLS functions to reply string. */ - ds_put_cstr(reply, "Available dpcls implementations:\n"); - - for (int i = 0; i < count; i++) { - ds_put_format(reply, " %s (Use count: %d, Priority: %d", - lookup_funcs[i].name, - atomic_count_get(&lookup_funcs[i].usage_cnt), - lookup_funcs[i].prio); - - if (ds_last(reply) == ' ') { - ds_put_cstr(reply, "none"); - } - - ds_put_cstr(reply, ")\n"); - } - -} diff --git a/lib/dpif-netdev-lookup.h b/lib/dpif-netdev-lookup.h deleted file mode 100644 index ac6d97317..000000000 --- a/lib/dpif-netdev-lookup.h +++ /dev/null @@ -1,92 +0,0 @@ -/* - * Copyright (c) 2020 Intel Corporation. - * - * Licensed under the Apache License, Version 2.0 (the "License"); - * you may not use this file except in compliance with the License. - * You may obtain a copy of the License at: - * - * http://www.apache.org/licenses/LICENSE-2.0 - * - * Unless required by applicable law or agreed to in writing, software - * distributed under the License is distributed on an "AS IS" BASIS, - * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. - * See the License for the specific language governing permissions and - * limitations under the License. - */ - -#ifndef DPIF_NETDEV_LOOKUP_H -#define DPIF_NETDEV_LOOKUP_H 1 - -#include <config.h> -#include "dpif-netdev.h" -#include "dpif-netdev-private-dpcls.h" -#include "dpif-netdev-private-thread.h" - -/* Function to perform a probe for the subtable bit fingerprint. - * Returns NULL if not valid, or a valid function pointer to call for this - * subtable on success. - */ -typedef -dpcls_subtable_lookup_func (*dpcls_subtable_probe_func)(uint32_t u0_bit_count, - uint32_t u1_bit_count); - -/* Prototypes for subtable implementations */ -dpcls_subtable_lookup_func -dpcls_subtable_autovalidator_probe(uint32_t u0_bit_count, - uint32_t u1_bit_count); - -/* Probe function to select a specialized version of the generic lookup - * implementation. This provides performance benefit due to compile-time - * optimizations such as loop-unrolling. These are enabled by the compile-time - * constants in the specific function implementations. - */ -dpcls_subtable_lookup_func -dpcls_subtable_generic_probe(uint32_t u0_bit_count, uint32_t u1_bit_count); - -/* Probe function for AVX-512 gather implementation */ -dpcls_subtable_lookup_func -dpcls_subtable_avx512_gather_probe__(uint32_t u0_bit_cnt, uint32_t u1_bit_cnt, - bool use_vpop); - - -/* Subtable registration and iteration helpers */ -struct dpcls_subtable_lookup_info_t { - /* higher priority gets used over lower values. This allows deployments - * to select the best implementation for the use-case. - */ - uint8_t prio; - - /* Probe function: tests if the (u0,u1) combo is supported. If not - * supported, this function returns NULL. If supported, a function pointer - * is returned which when called will perform the lookup on the subtable. - */ - dpcls_subtable_probe_func probe; - - /* Human readable name, used in setting subtable priority commands */ - const char *name; - - /* Counter which holds the usage count of each implementations. */ - atomic_count usage_cnt; -}; - -int dpcls_subtable_set_prio(const char *name, uint8_t priority); -void dpcls_info_inc_usage(struct dpcls_subtable_lookup_info_t *info); -void dpcls_info_dec_usage(struct dpcls_subtable_lookup_info_t *info); - -/* Lookup the best subtable lookup implementation for the given u0,u1 count. */ -dpcls_subtable_lookup_func -dpcls_subtable_get_best_impl(uint32_t u0_bit_count, uint32_t u1_bit_count, - struct dpcls_subtable_lookup_info_t **info); - -/* Retrieve the array of lookup implementations for iteration. - * On error, returns a negative number. - * On success, returns the size of the arrays pointed to by the out parameter. - */ -int -dpcls_subtable_lookup_info_get(struct dpcls_subtable_lookup_info_t **out_ptr); - -/* Prints dpcls subtables in use for different implementations. */ -void -dpcls_impl_print_stats(struct ds *reply); - -#endif /* dpif-netdev-lookup.h */ diff --git a/lib/dpif-netdev-lookup-generic.c b/lib/dpif-netdev-private-dpcls.c similarity index 91% rename from lib/dpif-netdev-lookup-generic.c rename to lib/dpif-netdev-private-dpcls.c index 76f92dd5e..31e1a357e 100644 --- a/lib/dpif-netdev-lookup-generic.c +++ b/lib/dpif-netdev-private-dpcls.c @@ -17,7 +17,7 @@ #include <config.h> #include "dpif-netdev.h" -#include "dpif-netdev-lookup.h" +#include "dpif-netdev-private-dpcls.h" #include "bitmap.h" #include "cmap.h" @@ -31,7 +31,7 @@ #include "packets.h" #include "pvector.h" -VLOG_DEFINE_THIS_MODULE(dpif_lookup_generic); +VLOG_DEFINE_THIS_MODULE(dpif_netdev_dpcls); /* Lookup functions below depends on the internal structure of flowmap. */ BUILD_ASSERT_DECL(FLOWMAP_UNITS == 2); @@ -176,12 +176,12 @@ netdev_rule_matches_key(const struct dpcls_rule *rule, * compiler might decide to not inline, and performance will suffer. */ static inline uint32_t ALWAYS_INLINE -lookup_generic_impl(struct dpcls_subtable *subtable, - uint32_t keys_map, - const struct netdev_flow_key *keys[], - struct dpcls_rule **rules, - const uint32_t bit_count_u0, - const uint32_t bit_count_u1) +lookup_impl(struct dpcls_subtable *subtable, + uint32_t keys_map, + const struct netdev_flow_key *keys[], + struct dpcls_rule **rules, + const uint32_t bit_count_u0, + const uint32_t bit_count_u1) { const uint32_t n_pkts = count_1bits(keys_map); ovs_assert(NETDEV_MAX_BURST >= n_pkts); @@ -265,9 +265,9 @@ dpcls_subtable_lookup_generic(struct dpcls_subtable *subtable, * compilers available optimizations, this function has lower performance * than the below specialized functions. */ - return lookup_generic_impl(subtable, keys_map, keys, rules, - subtable->mf_bits_set_unit0, - subtable->mf_bits_set_unit1); + return lookup_impl(subtable, keys_map, keys, rules, + subtable->mf_bits_set_unit0, + subtable->mf_bits_set_unit1); } /* Expand out specialized functions with U0 and U1 bit attributes. */ @@ -279,7 +279,7 @@ dpcls_subtable_lookup_generic(struct dpcls_subtable *subtable, const struct netdev_flow_key *keys[],\ struct dpcls_rule **rules) \ { \ - return lookup_generic_impl(subtable, keys_map, keys, rules, U0, U1); \ + return lookup_impl(subtable, keys_map, keys, rules, U0, U1); \ } \ DECLARE_OPTIMIZED_LOOKUP_FUNCTION(9, 4) @@ -297,14 +297,9 @@ DECLARE_OPTIMIZED_LOOKUP_FUNCTION(4, 0) f = dpcls_subtable_lookup_mf_u0w##U0##_u1w##U1; \ } -/* Probe function to lookup an available specialized function. - * If capable to run the requested miniflow fingerprint, this function returns - * the most optimal implementation for that miniflow fingerprint. - * @retval Non-NULL A valid function to handle the miniflow bit pattern - * @retval NULL The requested miniflow is not supported by this implementation. - */ +/* Probe function to lookup an available specialized function. */ dpcls_subtable_lookup_func -dpcls_subtable_generic_probe(uint32_t u0_bits, uint32_t u1_bits) +dpcls_subtable_lookup_probe(uint32_t u0_bits, uint32_t u1_bits) { dpcls_subtable_lookup_func f = NULL; @@ -318,10 +313,10 @@ dpcls_subtable_generic_probe(uint32_t u0_bits, uint32_t u1_bits) CHECK_LOOKUP_FUNCTION(4, 0); if (f) { - VLOG_DBG("Subtable using Generic Optimized for u0 %d, u1 %d\n", + VLOG_DBG("Subtable using lookup function optimized for u0 %d, u1 %d\n", u0_bits, u1_bits); } else { - /* Always return the generic function. */ + /* Return generic function, if there is no specialized variant. */ f = dpcls_subtable_lookup_generic; } diff --git a/lib/dpif-netdev-private-dpcls.h b/lib/dpif-netdev-private-dpcls.h index bbf28bcdb..7949134bb 100644 --- a/lib/dpif-netdev-private-dpcls.h +++ b/lib/dpif-netdev-private-dpcls.h @@ -63,6 +63,12 @@ uint32_t (*dpcls_subtable_lookup_func)(struct dpcls_subtable *subtable, const struct netdev_flow_key *keys[], struct dpcls_rule **rules); +/* Probe function to lookup an available specialized lookup function. + * Returns the most optimal implementation for the miniflow fingerprint. + */ +dpcls_subtable_lookup_func dpcls_subtable_lookup_probe(uint32_t u0_bits, + uint32_t u1_bits); + /* A set of rules that all have the same fields wildcarded. */ struct dpcls_subtable { /* The fields are only used by writers. */ @@ -83,11 +89,8 @@ struct dpcls_subtable { /* The lookup function to use for this subtable. If there is a known * property of the subtable (eg: only 3 bits of miniflow metadata is * used for the lookup) then this can point at an optimized version of - * the lookup function for this particular subtable. The lookup function - * can be used at any time by a PMD thread, so it's declared as an atomic - * here to prevent garbage from being read. */ - ATOMIC(dpcls_subtable_lookup_func) lookup_func; - struct dpcls_subtable_lookup_info_t *lookup_func_info; + * the lookup function for this particular subtable. */ + dpcls_subtable_lookup_func lookup_func; /* Caches the masks to match a packet to, reducing runtime calculations. */ uint64_t *mf_masks; diff --git a/lib/dpif-netdev-private-flow.h b/lib/dpif-netdev-private-flow.h index 308c5113f..f05382626 100644 --- a/lib/dpif-netdev-private-flow.h +++ b/lib/dpif-netdev-private-flow.h @@ -18,9 +18,6 @@ #ifndef DPIF_NETDEV_PRIVATE_FLOW_H #define DPIF_NETDEV_PRIVATE_FLOW_H 1 -#include "dpif.h" -#include "dpif-netdev-private-dpcls.h" - #include <stdbool.h> #include <stdint.h> diff --git a/lib/dpif-netdev-private-thread.h b/lib/dpif-netdev-private-thread.h index bc76c86d2..2ee855ca4 100644 --- a/lib/dpif-netdev-private-thread.h +++ b/lib/dpif-netdev-private-thread.h @@ -18,10 +18,6 @@ #ifndef DPIF_NETDEV_PRIVATE_THREAD_H #define DPIF_NETDEV_PRIVATE_THREAD_H 1 -#include "dpif.h" -#include "dpif-netdev-perf.h" -#include "dpif-netdev-private-dfc.h" - #include <stdbool.h> #include <stdint.h> diff --git a/lib/dpif-netdev-unixctl.man b/lib/dpif-netdev-unixctl.man index 2b2450884..c78a87550 100644 --- a/lib/dpif-netdev-unixctl.man +++ b/lib/dpif-netdev-unixctl.man @@ -229,15 +229,3 @@ recirculation (only in balance-tcp mode). When this is the case, the above command prints the load-balancing information of the bonds configured in datapath \fIdp\fR showing the interface associated with each bucket (hash). -. -.IP "\fBdpif-netdev/subtable-lookup-prio-get\fR" -Lists the DPCLS implementations or lookup functions that are available as well -as their priorities. -. -.IP "\fBdpif-netdev/subtable-lookup-prio-set\fR \fIlookup_function\fR \ -\fIprio\fR" -Sets the priority of a lookup function by name, \fIlookup_function\fR, and -priority, \fIprio\fR, which should be a positive integer value. The highest -priority lookup function is used for classification. - -The number of affected dpcls ports and subtables is returned. diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index 55507f797..73565838f 100644 --- a/lib/dpif-netdev.c +++ b/lib/dpif-netdev.c @@ -42,7 +42,6 @@ #include "csum.h" #include "dp-packet.h" #include "dpif.h" -#include "dpif-netdev-lookup.h" #include "dpif-netdev-perf.h" #include "dpif-netdev-private-dfc.h" #include "dpif-netdev-private-dpcls.h" @@ -201,7 +200,6 @@ struct dp_packet_flow_map { static void dpcls_init(struct dpcls *); static void dpcls_destroy(struct dpcls *); static void dpcls_sort_subtable_vector(struct dpcls *); -static uint32_t dpcls_subtable_lookup_reprobe(struct dpcls *cls); static void dpcls_insert(struct dpcls *, struct dpcls_rule *, const struct netdev_flow_key *mask); static void dpcls_remove(struct dpcls *, struct dpcls_rule *); @@ -932,98 +930,6 @@ sorted_poll_thread_list(struct dp_netdev *dp, *n = k; } -static void -dpif_netdev_subtable_lookup_get(struct unixctl_conn *conn, int argc OVS_UNUSED, - const char *argv[] OVS_UNUSED, - void *aux OVS_UNUSED) -{ - struct ds reply = DS_EMPTY_INITIALIZER; - - dpcls_impl_print_stats(&reply); - unixctl_command_reply(conn, ds_cstr(&reply)); - ds_destroy(&reply); -} - -static void -dpif_netdev_subtable_lookup_set(struct unixctl_conn *conn, int argc OVS_UNUSED, - const char *argv[], void *aux OVS_UNUSED) -{ - /* This function requires 2 parameters (argv[1] and argv[2]) to execute. - * argv[1] is subtable name - * argv[2] is priority - */ - const char *func_name = argv[1]; - - errno = 0; - char *err_char; - uint32_t new_prio = strtoul(argv[2], &err_char, 10); - uint32_t lookup_dpcls_changed = 0; - uint32_t lookup_subtable_changed = 0; - struct shash_node *node; - if (errno != 0 || new_prio > UINT8_MAX) { - unixctl_command_reply_error(conn, - "error converting priority, use integer in range 0-255\n"); - return; - } - - int32_t err = dpcls_subtable_set_prio(func_name, new_prio); - if (err) { - unixctl_command_reply_error(conn, - "error, subtable lookup function not found\n"); - return; - } - - ovs_mutex_lock(&dp_netdev_mutex); - SHASH_FOR_EACH (node, &dp_netdevs) { - struct dp_netdev *dp = node->data; - - /* Get PMD threads list, required to get DPCLS instances. */ - size_t n; - struct dp_netdev_pmd_thread **pmd_list; - sorted_poll_thread_list(dp, &pmd_list, &n); - - /* take port mutex as HMAP iters over them. */ - ovs_rwlock_rdlock(&dp->port_rwlock); - - for (size_t i = 0; i < n; i++) { - struct dp_netdev_pmd_thread *pmd = pmd_list[i]; - if (pmd->core_id == NON_PMD_CORE_ID) { - continue; - } - - struct dp_netdev_port *port = NULL; - HMAP_FOR_EACH (port, node, &dp->ports) { - odp_port_t in_port = port->port_no; - struct dpcls *cls = dp_netdev_pmd_lookup_dpcls(pmd, in_port); - if (!cls) { - continue; - } - ovs_mutex_lock(&pmd->flow_mutex); - uint32_t subtbl_changes = dpcls_subtable_lookup_reprobe(cls); - ovs_mutex_unlock(&pmd->flow_mutex); - if (subtbl_changes) { - lookup_dpcls_changed++; - lookup_subtable_changed += subtbl_changes; - } - } - } - - /* release port mutex before netdev mutex. */ - ovs_rwlock_unlock(&dp->port_rwlock); - free(pmd_list); - } - ovs_mutex_unlock(&dp_netdev_mutex); - - struct ds reply = DS_EMPTY_INITIALIZER; - ds_put_format(&reply, - "Lookup priority change affected %d dpcls ports and %d subtables.\n", - lookup_dpcls_changed, lookup_subtable_changed); - const char *reply_str = ds_cstr(&reply); - unixctl_command_reply(conn, reply_str); - VLOG_INFO("%s", reply_str); - ds_destroy(&reply); -} - static void dpif_netdev_pmd_rebalance(struct unixctl_conn *conn, int argc, const char *argv[], void *aux OVS_UNUSED) @@ -1290,16 +1196,6 @@ dpif_netdev_init(void) unixctl_command_register("dpif-netdev/bond-show", "[dp]", 0, 1, dpif_netdev_bond_show, NULL); - unixctl_command_register("dpif-netdev/subtable-lookup-prio-set", - "[lookup_func] [prio]", - 2, 2, dpif_netdev_subtable_lookup_set, - NULL); - unixctl_command_register("dpif-netdev/subtable-lookup-info-get", "", - 0, 0, dpif_netdev_subtable_lookup_get, - NULL); - unixctl_command_register("dpif-netdev/subtable-lookup-prio-get", NULL, - 0, 0, dpif_netdev_subtable_lookup_get, - NULL); return 0; } @@ -9142,7 +9038,6 @@ dpcls_destroy_subtable(struct dpcls *cls, struct dpcls_subtable *subtable) pvector_remove(&cls->subtables, subtable); cmap_remove(&cls->subtables_map, &subtable->cmap_node, subtable->mask.hash); - dpcls_info_dec_usage(subtable->lookup_func_info); ovsrcu_postpone(dpcls_subtable_destroy_cb, subtable); } @@ -9188,14 +9083,8 @@ dpcls_create_subtable(struct dpcls *cls, const struct netdev_flow_key *mask) /* Get the preferred subtable search function for this (u0,u1) subtable. * The function is guaranteed to always return a valid implementation, and - * possibly an ISA optimized, and/or specialized implementation. Initialize - * the subtable search function atomically to avoid garbage data being read - * by the PMD thread. - */ - atomic_init(&subtable->lookup_func, - dpcls_subtable_get_best_impl(unit0, unit1, - &subtable->lookup_func_info)); - dpcls_info_inc_usage(subtable->lookup_func_info); + * possibly a specialized implementation. */ + subtable->lookup_func = dpcls_subtable_lookup_probe(unit0, unit1); cmap_insert(&cls->subtables_map, &subtable->cmap_node, mask->hash); /* Add the new subtable at the end of the pvector (with no hits yet) */ @@ -9221,47 +9110,6 @@ dpcls_find_subtable(struct dpcls *cls, const struct netdev_flow_key *mask) return dpcls_create_subtable(cls, mask); } -/* Checks for the best available implementation for each subtable lookup - * function, and assigns it as the lookup function pointer for each subtable. - * Returns the number of subtables that have changed lookup implementation. - * This function requires holding a flow_mutex when called. This is to make - * sure modifications done by this function are not overwritten. This could - * happen if dpcls_sort_subtable_vector() is called at the same time as this - * function. - */ -static uint32_t -dpcls_subtable_lookup_reprobe(struct dpcls *cls) -{ - struct pvector *pvec = &cls->subtables; - uint32_t subtables_changed = 0; - struct dpcls_subtable *subtable = NULL; - - PVECTOR_FOR_EACH (subtable, pvec) { - uint32_t u0_bits = subtable->mf_bits_set_unit0; - uint32_t u1_bits = subtable->mf_bits_set_unit1; - void *old_func = subtable->lookup_func; - struct dpcls_subtable_lookup_info_t *old_info; - old_info = subtable->lookup_func_info; - /* Set the subtable lookup function atomically to avoid garbage data - * being read by the PMD thread. */ - atomic_store_relaxed(&subtable->lookup_func, - dpcls_subtable_get_best_impl(u0_bits, u1_bits, - &subtable->lookup_func_info)); - if (old_func != subtable->lookup_func) { - subtables_changed += 1; - } - - if (old_info != subtable->lookup_func_info) { - /* In theory, functions can be shared between implementations, so - * do an explicit check on the function info structures. */ - dpcls_info_dec_usage(old_info); - dpcls_info_inc_usage(subtable->lookup_func_info); - } - } - - return subtables_changed; -} - /* Periodically sort the dpcls subtable vectors according to hit counts */ static void dpcls_sort_subtable_vector(struct dpcls *cls) diff --git a/m4/openvswitch.m4 b/m4/openvswitch.m4 index a83299f4b..f9b73945b 100644 --- a/m4/openvswitch.m4 +++ b/m4/openvswitch.m4 @@ -287,76 +287,6 @@ AC_DEFUN([OVS_CHECK_SPHINX], AC_ARG_VAR([SPHINXBUILD]) AM_CONDITIONAL([HAVE_SPHINX], [test "$SPHINXBUILD" != none])]) - -dnl Checks whether the build system implements the vpopcntdq instruction. The -dnl compiler and assembler each separately need to support vpopcntdq. In order -dnl to test the assembler with the below code snippet, set the optimization -dnl level of the function to "O0" so it won't be optimized away by the -dnl compiler. -AC_DEFUN([OVS_CHECK_AVX512VPOPCNTDQ], [ - AC_MSG_CHECKING([whether compiler correctly emits AVX512-VPOPCNTDQ]) - AC_COMPILE_IFELSE( - [AC_LANG_PROGRAM([#include <immintrin.h> - void - __attribute__((__target__("avx512vpopcntdq"))) - __attribute__((optimize("O0"))) - check_vpopcntdq(void) - { - __m512i v_test; - v_test = _mm512_popcnt_epi64(v_test); - }],[])], - [AC_MSG_RESULT([yes]) - ovs_cv_avx512vpopcntdq_good=yes], - [AC_MSG_RESULT([no]) - ovs_cv_avx512vpopcntdq_good=no]) - if test "$ovs_cv_avx512vpopcntdq_good" = yes; then - AC_DEFINE([HAVE_AVX512VPOPCNTDQ], [1], - [Define to 1 if the build system implements the vpopcntdq - instruction.]) - fi - AM_CONDITIONAL([HAVE_AVX512VPOPCNTDQ], - [test "$ovs_cv_avx512vpopcntdq_good" = yes])]) - -dnl Checks for binutils/assembler known issue with AVX512. -dnl Due to backports, we probe assembling a reproducer instead of checking -dnl binutils version string. More details, including ASM dumps and debug here: -dnl GCC: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90028 -dnl The checking of binutils funcationality instead of LD version is similar -dnl to as how DPDK proposes to solve this issue: -dnl http://patches.dpdk.org/patch/71723/ -AC_DEFUN([OVS_CHECK_BINUTILS_AVX512], - [OVS_CHECK_CC_OPTION( - [-mavx512f], - [AC_CACHE_CHECK( - [binutils avx512 assembler checks passing], - [ovs_cv_binutils_avx512_good], - [dnl Assemble a short snippet to test for issue in "build-aux" dir: - mkdir -p build-aux - OBJFILE=build-aux/binutils_avx512_check.o - GATHER_PARAMS='0x8(,%ymm1,1),%ymm0{%k2}' - if ($CC -dumpmachine | grep x86_64) >/dev/null 2>&1; then - echo "vpgatherqq $GATHER_PARAMS" | as --64 -o $OBJFILE - - if (objdump -d --no-show-raw-insn $OBJFILE | grep -q $GATHER_PARAMS) >/dev/null 2>&1; then - ovs_cv_binutils_avx512_good=yes - else - ovs_cv_binutils_avx512_good=no - dnl Explicitly disallow avx512f to stop compiler auto-vectorizing - dnl and causing zmm usage with buggy binutils versions. - CFLAGS="$CFLAGS -mno-avx512f" - fi - rm $OBJFILE - else - dnl non x86_64 architectures don't have avx512, so not affected - ovs_cv_binutils_avx512_good=no - fi])], - [ovs_cv_binutils_avx512_good=no]) - if test "$ovs_cv_binutils_avx512_good" = yes; then - AC_DEFINE([HAVE_LD_AVX512_GOOD], [1], - [Define to 1 if binutils correctly supports AVX512.]) - fi - AM_CONDITIONAL([HAVE_LD_AVX512_GOOD], - [test "$ovs_cv_binutils_avx512_good" = yes])]) - dnl Checks for dot. AC_DEFUN([OVS_CHECK_DOT], [AC_CACHE_CHECK( diff --git a/tests/pmd.at b/tests/pmd.at index 677d0feb1..4f1f7a4e8 100644 --- a/tests/pmd.at +++ b/tests/pmd.at @@ -1182,74 +1182,6 @@ AT_CHECK([ovs-appctl dpctl/del-dp dummy@dp0], [0], [dnl OVS_VSWITCHD_STOP AT_CLEANUP -AT_SETUP([PMD - dpcls configuration]) -OVS_VSWITCHD_START([], [], [], [--dummy-numa 0,0]) -AT_CHECK([ovs-vsctl add-port br0 p1 -- set Interface p1 type=dummy-pmd]) - -AT_CHECK([ovs-vsctl show], [], [stdout]) -AT_CHECK([ovs-appctl dpif-netdev/subtable-lookup-prio-set autovalidator 3], [0], [dnl -Lookup priority change affected 0 dpcls ports and 0 subtables. -]) - -AT_CHECK([ovs-appctl dpif-netdev/subtable-lookup-info-get | grep autovalidator], [], [dnl - autovalidator (Use count: 0, Priority: 3) -]) - -AT_CHECK([ovs-appctl dpif-netdev/subtable-lookup-prio-set generic 4], [0], [dnl -Lookup priority change affected 0 dpcls ports and 0 subtables. -]) - -AT_CHECK([ovs-appctl dpif-netdev/subtable-lookup-info-get | grep generic], [], [dnl - generic (Use count: 0, Priority: 4) -]) - -AT_CHECK([ovs-appctl dpif-netdev/subtable-lookup-prio-set generic 8], [0], [dnl -Lookup priority change affected 0 dpcls ports and 0 subtables. -]) - -AT_CHECK([ovs-appctl dpif-netdev/subtable-lookup-info-get | grep generic], [], [dnl - generic (Use count: 0, Priority: 8) -]) - -AT_CHECK([ovs-appctl dpif-netdev/subtable-lookup-prio-set autovalidator 8], [0], [dnl -Lookup priority change affected 0 dpcls ports and 0 subtables. -]) - -AT_CHECK([ovs-appctl dpif-netdev/subtable-lookup-info-get | grep autovalidator], [], [dnl - autovalidator (Use count: 0, Priority: 8) -]) - -AT_CHECK([ovs-appctl dpif-netdev/subtable-lookup-prio-set generic 0], [0], [dnl -Lookup priority change affected 0 dpcls ports and 0 subtables. -]) - -AT_CHECK([ovs-appctl dpif-netdev/subtable-lookup-info-get | grep generic], [], [dnl - generic (Use count: 0, Priority: 0) -]) - -AT_CHECK([ovs-appctl dpif-netdev/subtable-lookup-prio-set generic 255], [0], [dnl -Lookup priority change affected 0 dpcls ports and 0 subtables. -]) - -AT_CHECK([ovs-appctl dpif-netdev/subtable-lookup-info-get | grep generic], [], [dnl - generic (Use count: 0, Priority: 255) -]) - -AT_CHECK([ovs-appctl dpif-netdev/subtable-lookup-prio-set generic -1], [2], -[], [dnl -error converting priority, use integer in range 0-255 -ovs-appctl: ovs-vswitchd: server returned an error -]) - -AT_CHECK([ovs-appctl dpif-netdev/subtable-lookup-prio-set generic 300], [2], -[], [dnl -error converting priority, use integer in range 0-255 -ovs-appctl: ovs-vswitchd: server returned an error -]) - -OVS_VSWITCHD_STOP -AT_CLEANUP - AT_SETUP([PMD - pmd sleep]) OVS_VSWITCHD_START([add-port br0 p0 -- set Interface p0 type=dummy-pmd options:n_rxq=8 options:numa_id=1], [], [], [--dummy-numa 0,0,0,1,1,8,8]) -- 2.53.0 _______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
