On 18.12.2019 21:31, Yi-Hung Wei wrote:
> Currently, the AF_XDP socket (XSK) related memory is allocated by the main
> thread in the main thread's NUMA domain. With the patch that detects
> netdev-linux's NUMA node id, the PMD thread of AF_XDP port will be run on
> the AF_XDP netdev's NUMA domain. If the net device's NUMA domain
> is different from the main thread's NUMA domain, we will have two
> cross-NUMA memory accesses (netdev <-> memory, memory <-> CPU).
>
> This patch addresses the aforementioned issue by allocating
> the memory in the net device's NUMA domain.
>
> Signed-off-by: Yi-Hung Wei <[email protected]>
> ---
> v8:
>   - Address review comments from Eelco and Ilya in patch 2.
>     * Use OVS_FIND_DEPENDENCY().
>     * Avoid the locking issue when calling netdev_get_numa_id().
>     * Check NETDEV_NUMA_UNSPEC.
>     * Use return value from netdev_get_numa_id() directly, and
>       check NETDEV_NUMA_UNSPEC case.
>     * Use numa_set_preferred().
>
> ---
>  Documentation/intro/install/afxdp.rst |  2 +-
>  acinclude.m4                          |  2 ++
>  include/sparse/automake.mk            |  1 +
>  include/sparse/numa.h                 | 27 +++++++++++++++++++++++++++
>  lib/netdev-afxdp.c                    | 13 +++++++++++++
>  5 files changed, 44 insertions(+), 1 deletion(-)
>  create mode 100644 include/sparse/numa.h
>
> diff --git a/Documentation/intro/install/afxdp.rst
> b/Documentation/intro/install/afxdp.rst
> index 7b0736c96114..c4685fa7ebac 100644
> --- a/Documentation/intro/install/afxdp.rst
> +++ b/Documentation/intro/install/afxdp.rst
> @@ -164,7 +164,7 @@ If a test case fails, check the log at::
>
>  Setup AF_XDP netdev
>  -------------------
> -Before running OVS with AF_XDP, make sure the libbpf and libelf are
> +Before running OVS with AF_XDP, make sure the libbpf, libelf, and libnuma are
>  set-up right::
>
>      ldd vswitchd/ovs-vswitchd
> diff --git a/acinclude.m4 b/acinclude.m4
> index 542637ac8cb8..f73dc9bf7e3c 100644
> --- a/acinclude.m4
> +++ b/acinclude.m4
> @@ -286,6 +286,8 @@ AC_DEFUN([OVS_CHECK_LINUX_AF_XDP], [
>      AC_CHECK_FUNCS([pthread_spin_lock], [],
>        [AC_MSG_ERROR([unable to find pthread_spin_lock for AF_XDP support])])
>
> +    OVS_FIND_DEPENDENCY([numa_alloc_onnode], [numa], [libnuma])
> +
>      AC_DEFINE([HAVE_AF_XDP], [1],
>                [Define to 1 if AF_XDP support is available and enabled.])
>      LIBBPF_LDADD=" -lbpf -lelf"
> diff --git a/include/sparse/automake.mk b/include/sparse/automake.mk
> index 073631e8c082..974ad3fe55f7 100644
> --- a/include/sparse/automake.mk
> +++ b/include/sparse/automake.mk
> @@ -5,6 +5,7 @@ noinst_HEADERS += \
>          include/sparse/bits/floatn.h \
>          include/sparse/assert.h \
>          include/sparse/math.h \
> +        include/sparse/numa.h \
>          include/sparse/netinet/in.h \
>          include/sparse/netinet/ip6.h \
>          include/sparse/netpacket/packet.h \
> diff --git a/include/sparse/numa.h b/include/sparse/numa.h
> new file mode 100644
> index 000000000000..3691a0eaf729
> --- /dev/null
> +++ b/include/sparse/numa.h
> @@ -0,0 +1,27 @@
> +/*
> + * Copyright (c) 2019 Nicira, Inc.
> + *
> + * Licensed under the Apache License, Version 2.0 (the "License");
> + * you may not use this file except in compliance with the License.
> + * You may obtain a copy of the License at:
> + *
> + *     http://www.apache.org/licenses/LICENSE-2.0
> + *
> + * Unless required by applicable law or agreed to in writing, software
> + * distributed under the License is distributed on an "AS IS" BASIS,
> + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
> + * See the License for the specific language governing permissions and
> + * limitations under the License.
> + */
> +
> +#ifndef __CHECKER__
> +#error "Use this header only with sparse.  It is not a correct
> implementation."
> +#endif
> +
> +/* Avoid sparse warning: non-ANSI function declaration of function" */
> +#define numa_get_membind_compat() numa_get_membind_compat(void)
> +#define numa_get_interleave_mask_compat()
> numa_get_interleave_mask_compat(void)
> +#define numa_get_run_node_mask_compat() numa_get_run_node_mask_compat(void)
> +
> +/* Get actual <numa.h> definitions for us to annotate and build on. */
> +#include_next<numa.h>
> diff --git a/lib/netdev-afxdp.c b/lib/netdev-afxdp.c
> index 91b70b298e57..9a7dd8208f8f 100644
> --- a/lib/netdev-afxdp.c
> +++ b/lib/netdev-afxdp.c
> @@ -26,6 +26,7 @@
>  #include <linux/rtnetlink.h>
>  #include <linux/if_xdp.h>
>  #include <net/if.h>
> +#include <numa.h>
>  #include <poll.h>
>  #include <stdlib.h>
>  #include <sys/resource.h>
> @@ -661,6 +662,14 @@ netdev_afxdp_reconfigure(struct netdev *netdev)
>      struct rlimit r = {RLIM_INFINITY, RLIM_INFINITY};
>      int err = 0;
>
> +    /* Allocate all the xsk related memory in the netdev's NUMA domain. */
> +    struct bitmask *old_bm = NULL;
> +    int numa_id = netdev_get_numa_id(netdev);
> +    if (numa_id != NETDEV_NUMA_UNSPEC) {
> +        old_bm = numa_get_membind();
> +        numa_set_preferred(numa_id);
> +    }
> +
>      ovs_mutex_lock(&dev->mutex);
>
>      if (netdev->n_rxq == dev->requested_n_rxq
> @@ -692,6 +701,10 @@ netdev_afxdp_reconfigure(struct netdev *netdev)
>      netdev_change_seq_changed(netdev);
>  out:
>      ovs_mutex_unlock(&dev->mutex);
> +    if (old_bm) {
> +        numa_set_membind(old_bm);

This will not restore the previous NUMA policy. It sets the policy to
membind (MPOL_BIND), which the user might not expect. I don't see a
suitable libnuma wrapper for this, so it seems the only way is to call
get_mempolicy()/set_mempolicy() directly to save and restore the
original memory policy.

BTW, you're not allowed to use any libnuma functions if numa_available()
returns -1. You need to check that first somewhere.

> +        numa_bitmask_free(old_bm);
> +    }
>      return err;
>  }
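
For illustration only, saving and restoring the thread's memory policy with get_mempolicy()/set_mempolicy() could look roughly like the sketch below. To stay self-contained it issues the raw Linux syscalls instead of the <numaif.h> wrappers shipped with libnuma. The names mempolicy_snapshot, mempolicy_save(), mempolicy_prefer_node() and mempolicy_restore(), as well as the fixed 1024-node mask, are assumptions made for this example and are not part of the patch or of OVS.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <errno.h>
#include <linux/mempolicy.h>    /* MPOL_* policy constants. */
#include <string.h>
#include <sys/syscall.h>        /* SYS_get_mempolicy, SYS_set_mempolicy. */
#include <unistd.h>

/* Assumed upper bound on node ids; a real implementation should size the
 * mask from the system instead of hard-coding it. */
#define SNAP_MAX_NODES 1024
#define SNAP_BITS_PER_LONG (8 * sizeof(unsigned long))
#define SNAP_MASK_LONGS (SNAP_MAX_NODES / SNAP_BITS_PER_LONG)

/* Hypothetical container for the policy mode (including mode flags) and
 * the node mask as reported by the kernel. */
struct mempolicy_snapshot {
    int mode;
    unsigned long nodemask[SNAP_MASK_LONGS];
};

/* Saves the calling thread's current policy.  Returns 0 on success, or -1
 * with errno set (for example when a sandbox forbids the syscall). */
static int
mempolicy_save(struct mempolicy_snapshot *snap)
{
    memset(snap, 0, sizeof *snap);
    return syscall(SYS_get_mempolicy, &snap->mode, snap->nodemask,
                   (unsigned long) SNAP_MAX_NODES + 1, NULL, 0UL) ? -1 : 0;
}

/* Makes 'node' the preferred node for future allocations of this thread. */
static int
mempolicy_prefer_node(int node)
{
    unsigned long mask[SNAP_MASK_LONGS];

    if (node < 0 || node >= SNAP_MAX_NODES) {
        errno = EINVAL;
        return -1;
    }
    memset(mask, 0, sizeof mask);
    mask[node / SNAP_BITS_PER_LONG] |= 1UL << (node % SNAP_BITS_PER_LONG);
    return syscall(SYS_set_mempolicy, MPOL_PREFERRED, mask,
                   (unsigned long) SNAP_MAX_NODES + 1) ? -1 : 0;
}

/* Reinstates exactly the mode and node mask captured by mempolicy_save(),
 * rather than forcing MPOL_BIND as numa_set_membind() would. */
static int
mempolicy_restore(const struct mempolicy_snapshot *snap)
{
    return syscall(SYS_set_mempolicy, snap->mode, snap->nodemask,
                   (unsigned long) SNAP_MAX_NODES + 1) ? -1 : 0;
}
```

In the reconfigure path this would translate to calling mempolicy_save() and mempolicy_prefer_node(numa_id) before the XSK memory is allocated and mempolicy_restore() after the "out:" label, skipping all three when the NUMA id is NETDEV_NUMA_UNSPEC or when the save step fails. Code that keeps using libnuma calls such as numa_set_preferred() would still need the numa_available() check first.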
