I hit send too quickly, Alessandro; one clarification inline.

On Wed, Mar 28, 2018 at 9:13 AM, Darrell Ball <dlu...@gmail.com> wrote:
> Another aspect (besides what Ilya mentioned) you might want to check is to
> look at OVS patchwork for your patches,
> after you submit, and check that they are there, firstly.
> Also check that they look like other accepted patches overall and for
> chunks of similar code constructs.
>
> https://patchwork.ozlabs.org/project/openvswitch/list/
>
> Check that your patches can be applied on top of an updated master branch
> of OVS.
>
> I did a quick pass over the raw diff and noticed that in many cases you
> are already using lots of OVS apis which is good.
>
> A few pointers:
> 1/ Try to use inline functions as much as possible, instead of macros
> 2/ Think about portability - Don't use direct calls to pthread_ apis for
> example
>

I am specifically referring to the locking apis, like pthread_spin_

> 3/ Create wrappers for new locks that use generic OVS lock apis
> 4/ Clearly describe any build dependencies, if any, in the install guide
> documentation.
> 5/ Think about portability for parts of the code and look how that is
> handled in other cases.
> 6/ I think it would be helpful for you to describe one or more use cases
> for netmap, for the general user.
> 7/ Think about testing and see what we can do to automate - we have system
> tests that run with
> make check-kmod and make check-system-userspace
> Existing files are tests/system-traffic.at and tests/system-ovn.at,
> which is shared for Linux and userspace datapath
> 8/ You might want to describe some tests results, including performance
> numbers in the cover letter.
>
> Cheers Darrell
>
>
> On Wed, Mar 28, 2018 at 1:50 AM, Alessandro Rosetti <
> alessandro.rose...@gmail.com> wrote:
>
>> Hi Darrell, Ilya and everyone else,
>>
>> I'm contacting you since you were interested.
>> I've posted the patch that implements netmap in OVS attaching the file in
>> the mail, did I do it wrong?
>> https://mail.openvswitch.org/pipermail/ovs-dev/2018-March/345371.html
>>
>> I'm posting it inline now,
>> sorry for the mess!
>>
>> Alessandro.
>>
>> ----------------------------------------------------------------------
>>
>> diff --git a/acinclude.m4 b/acinclude.m4
>> index d61e37a5e..d9dd9fbd1 100644
>> --- a/acinclude.m4
>> +++ b/acinclude.m4
>> @@ -341,6 +341,36 @@ AC_DEFUN([OVS_CHECK_DPDK], [
>> AM_CONDITIONAL([DPDK_NETDEV], test "$DPDKLIB_FOUND" = true)
>> ])
>>
>> +dnl OVS_CHECK_NETMAP
>> +dnl
>> +dnl Check netmap
>> +AC_DEFUN([OVS_CHECK_NETMAP], [
>> + AC_ARG_WITH([netmap],
>> + [AC_HELP_STRING([--with-netmap], [Enable NETMAP])],
>> + [have_netmap=true])
>> + AC_MSG_CHECKING([whether netmap datapath is enabled])
>> +
>> + if test "$have_netmap" != true || test "$with_netmap" = no; then
>> + AC_MSG_RESULT([no])
>> + else
>> + AC_MSG_RESULT([yes])
>> + NETMAP_FOUND=false
>> + AC_LINK_IFELSE(
>> + [AC_LANG_PROGRAM([#include <net/if.h>
>> + #include<netinet/in.h>
>> + #include<net/netmap.h>
>> + #include<net/netmap_user.h>], [])],
>> + [NETMAP_FOUND=true])
>> + if $NETMAP_FOUND; then
>> + AC_DEFINE([NETMAP_NETDEV], [1], [NETMAP datapath is enabled.])
>> + else
>> + AC_MSG_ERROR([Could not find NETMAP headers])
>> + fi
>> + fi
>> +
>> + AM_CONDITIONAL([NETMAP_NETDEV], test "$NETMAP_FOUND" = true)
>> +])
>> +
>> dnl OVS_GREP_IFELSE(FILE, REGEX, [IF-MATCH], [IF-NO-MATCH])
>> dnl
>> dnl Greps FILE for REGEX. If it matches, runs IF-MATCH, otherwise
>> IF-NO-MATCH.
>> @@ -900,7 +930,7 @@ dnl with or without modifications, as long as this
>> notice is preserved.
>>
>> AC_DEFUN([_OVS_CHECK_CC_OPTION], [dnl
>> m4_define([ovs_cv_name], [ovs_cv_[]m4_translit([$1], [-= ], [__])])dnl
>> - AC_CACHE_CHECK([whether $CC accepts $1], [ovs_cv_name],
>> + AC_CACHE_CHECK([whether $CC accepts $1], [ovs_cv_name],
>> [ovs_save_CFLAGS="$CFLAGS"
>> dnl Include -Werror in the compiler options, because without -Werror
>> dnl clang's GCC-compatible compiler driver does not return a failure
>> @@ -951,7 +981,7 @@ dnl OVS_ENABLE_OPTION([OPTION])
>> dnl Check whether the given C compiler OPTION is accepted.
>> dnl If so, add it to WARNING_FLAGS.
>> dnl Example: OVS_ENABLE_OPTION([-Wdeclaration-after-statement])
>> -AC_DEFUN([OVS_ENABLE_OPTION],
>> +AC_DEFUN([OVS_ENABLE_OPTION],
>> [OVS_CHECK_CC_OPTION([$1], [WARNING_FLAGS="$WARNING_FLAGS $1"])
>> AC_SUBST([WARNING_FLAGS])])
>>
>> diff --git a/configure.ac b/configure.ac
>> index 9940a1a45..24cd4718c 100644
>> --- a/configure.ac
>> +++ b/configure.ac
>> @@ -180,6 +180,7 @@ AC_SUBST(KARCH)
>> OVS_CHECK_LINUX
>> OVS_CHECK_LINUX_TC
>> OVS_CHECK_DPDK
>> +OVS_CHECK_NETMAP
>> OVS_CHECK_PRAGMA_MESSAGE
>> AC_SUBST([OVS_CFLAGS])
>> AC_SUBST([OVS_LDFLAGS])
>> diff --git a/lib/automake.mk b/lib/automake.mk
>> index 5c26e0f33..4ccd9e22a 100644
>> --- a/lib/automake.mk
>> +++ b/lib/automake.mk
>> @@ -134,12 +134,14 @@ lib_libopenvswitch_la_SOURCES = \
>> lib/namemap.c \
>> lib/netdev-dpdk.h \
>> lib/netdev-dummy.c \
>> + lib/netdev-netmap.h \
>> lib/netdev-provider.h \
>> lib/netdev-vport.c \
>> lib/netdev-vport.h \
>> lib/netdev-vport-private.h \
>> lib/netdev.c \
>> lib/netdev.h \
>> + lib/netmap.h \
>> lib/netflow.h \
>> lib/netlink.c \
>> lib/netlink.h \
>> @@ -403,6 +405,15 @@ lib_libopenvswitch_la_SOURCES += \
>> lib/dpdk-stub.c
>> endif
>>
>> +if NETMAP_NETDEV
>> +lib_libopenvswitch_la_SOURCES += \
>> + lib/netmap.c \
>> + lib/netdev-netmap.c
>> +else
>> +lib_libopenvswitch_la_SOURCES += \
>> + lib/netmap-stub.c
>> +endif
>> +
>> if WIN32
>> lib_libopenvswitch_la_SOURCES += \
>> lib/dpif-netlink.c \
>> diff --git a/lib/dp-packet.c b/lib/dp-packet.c
>> index 443c22504..e917e6d6a 100644
>> --- a/lib/dp-packet.c
>> +++ b/lib/dp-packet.c
>> @@ -92,6 +92,7 @@ dp_packet_use_const(struct dp_packet *b, const void
>> *data, size_t size)
>> dp_packet_set_size(b, size);
>> }
>>
>> +
>> /* Initializes 'b' as an empty dp_packet that contains the 'allocated'
>> bytes.
>> * DPDK allocated dp_packet and *data is allocated from one continous
>> memory
>> * region as part of memory pool, so in memory data start right after
>> @@ -105,6 +106,19 @@ dp_packet_init_dpdk(struct dp_packet *b, size_t
>> allocated)
>> b->source = DPBUF_DPDK;
>> }
>>
>> +/* Initializes 'b' as a dp_packet whose data points to a netmap buffer
>> of size
>> + * 'size' bytes. */
>> +#ifdef NETMAP_NETDEV
>> +void
>> +dp_packet_init_netmap(struct dp_packet *b, void *data, size_t size)
>> +{
>> + b->source = DPBUF_NETMAP;
>> + dp_packet_set_base(b, data);
>> + dp_packet_set_data(b, data);
>> + dp_packet_set_size(b, size);
>> +}
>> +#endif
>> +
>> /* Initializes 'b' as an empty dp_packet with an initial capacity of
>> 'size'
>> * bytes. */
>> void
>> @@ -125,6 +139,11 @@ dp_packet_uninit(struct dp_packet *b)
>> /* If this dp_packet was allocated by DPDK it must have been
>> * created as a dp_packet */
>> free_dpdk_buf((struct dp_packet*) b);
>> +#endif
>> + } else if (b->source == DPBUF_NETMAP) {
>> +#ifdef NETMAP_NETDEV
>> + /* If this dp_packet was allocated by NETMAP, release it. */
>> + netmap_free_packet(b);
>> #endif
>> }
>> }
>> @@ -241,6 +260,9 @@ dp_packet_resize__(struct dp_packet *b, size_t
>> new_headroom, size_t new_tailroom
>> case DPBUF_DPDK:
>> OVS_NOT_REACHED();
>>
>> + case DPBUF_NETMAP:
>> + OVS_NOT_REACHED();
>> +
>> case DPBUF_MALLOC:
>> if (new_headroom == dp_packet_headroom(b)) {
>> new_base = xrealloc(dp_packet_base(b), new_allocated);
>> diff --git a/lib/dp-packet.h b/lib/dp-packet.h
>> index 21c8ca525..bd7832533 100644
>> --- a/lib/dp-packet.h
>> +++ b/lib/dp-packet.h
>> @@ -26,6 +26,7 @@
>> #endif
>>
>> #include "netdev-dpdk.h"
>> +#include "netdev-netmap.h"
>> #include "openvswitch/list.h"
>> #include "packets.h"
>> #include "util.h"
>> @@ -42,6 +43,7 @@ enum OVS_PACKED_ENUM dp_packet_source {
>> DPBUF_DPDK, /* buffer data is from DPDK allocated
>> memory.
>> * ref to dp_packet_init_dpdk() in
>> dp-packet.c.
>> */ >> + DPBUF_NETMAP, /* Buffers are from netmap allocated >> memory. */ >> }; >> >> #define DP_PACKET_CONTEXT_SIZE 64 >> @@ -60,6 +62,9 @@ struct dp_packet { >> uint32_t size_; /* Number of bytes in use. */ >> uint32_t rss_hash; /* Packet hash. */ >> bool rss_hash_valid; /* Is the 'rss_hash' valid? */ >> +#endif >> +#ifdef NETMAP_NETDEV >> + uint32_t buf_idx; /* Netmap slot index. */ >> #endif >> enum dp_packet_source source; /* Source of memory allocated as >> 'base'. */ >> >> @@ -115,6 +120,7 @@ void dp_packet_use_stub(struct dp_packet *, void *, >> size_t); >> void dp_packet_use_const(struct dp_packet *, const void *, size_t); >> >> void dp_packet_init_dpdk(struct dp_packet *, size_t allocated); >> +void dp_packet_init_netmap(struct dp_packet *, void *, size_t); >> >> void dp_packet_init(struct dp_packet *, size_t); >> void dp_packet_uninit(struct dp_packet *); >> @@ -173,6 +179,13 @@ dp_packet_delete(struct dp_packet *b) >> * created as a dp_packet */ >> free_dpdk_buf((struct dp_packet*) b); >> return; >> + } else if (b->source == DPBUF_NETMAP) { >> + /* It was allocated by a netdev_netmap, it will be marked >> + * for reuse. */ >> +#ifdef NETMAP_NETDEV >> + netmap_free_packet(b); >> +#endif >> + return; >> } >> >> dp_packet_uninit(b); >> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c >> index b07fc6b8b..af81c992b 100644 >> --- a/lib/dpif-netdev.c >> +++ b/lib/dpif-netdev.c >> @@ -4119,11 +4119,14 @@ reload: >> >> /* List port/core affinity */ >> for (i = 0; i < poll_cnt; i++) { >> - VLOG_DBG("Core %d processing port \'%s\' with queue-id %d\n", >> - pmd->core_id, netdev_rxq_get_name(poll_list[i].rxq->rx), >> - netdev_rxq_get_queue_id(poll_list[i].rxq->rx)); >> - /* Reset the rxq current cycles counter. 
*/ >> - dp_netdev_rxq_set_cycles(poll_list[i].rxq, RXQ_CYCLES_PROC_CURR, >> 0); >> + VLOG_DBG("Core %d processing port \'%s\' with queue-id %d\n", >> + pmd->core_id, netdev_rxq_get_name(poll_list[ >> i].rxq->rx), >> + netdev_rxq_get_queue_id(poll_list[i].rxq->rx)); >> + /* Reset the rxq current cycles counter. */ >> + dp_netdev_rxq_set_cycles(poll_list[i].rxq, >> RXQ_CYCLES_PROC_CURR, 0); >> +#ifdef NETMAP_NETDEV >> + netmap_init_port(poll_list[i].rxq->rx); >> +#endif >> } >> >> if (!poll_cnt) { >> diff --git a/lib/netdev-netmap.c b/lib/netdev-netmap.c >> new file mode 100644 >> index 000000000..87b292895 >> --- /dev/null >> +++ b/lib/netdev-netmap.c >> @@ -0,0 +1,1014 @@ >> +#include <config.h> >> + >> +#include <errno.h> >> +#include <math.h> >> +#include <net/if.h> >> +#include <netinet/in.h> >> +#include <net/netmap.h> >> +#define NETMAP_WITH_LIBS >> +#include <net/netmap_user.h> >> +#include <sys/ioctl.h> >> +#include <sys/syscall.h> >> + >> +#include "dpif.h" >> +#include "netdev.h" >> +#include "netdev-provider.h" >> +#include "netmap.h" >> +#include "netdev-netmap.h" >> +#include "openvswitch/list.h" >> +#include "openvswitch/poll-loop.h" >> +#include "openvswitch/vlog.h" >> +#include "ovs-thread.h" >> +#include "packets.h" >> +#include "smap.h" >> + >> +#define DP_BLOCK_SIZE NETDEV_MAX_BURST * 2 >> +#define DEFAULT_RSYNC_INTVAL 5 >> + >> +VLOG_DEFINE_THIS_MODULE(netdev_netmap); >> + >> +static struct vlog_rate_limit rl OVS_UNUSED = VLOG_RATE_LIMIT_INIT(5, >> 100); >> + >> +struct netdev_netmap { >> + struct netdev up; >> + struct nm_desc *nmd; >> + >> + uint64_t timestamp; >> + uint32_t rxsync_intval; >> + >> + struct ovs_list list_node; >> + long tid; >> + struct nm_alloc *nma; >> + >> + struct ovs_mutex mutex OVS_ACQ_AFTER(netmap_mutex); >> + pthread_spinlock_t tx_lock; >> + >> + struct netdev_stats stats; >> + struct eth_addr hwaddr; >> + enum netdev_flags flags; >> + >> + int mtu; >> + int requested_mtu; >> +}; >> + >> +struct netdev_rxq_netmap { >> + 
struct netdev_rxq up; >> +}; >> + >> +static void netdev_netmap_destruct(struct netdev *netdev); >> + >> +static bool >> +is_netmap_class(const struct netdev_class *class) >> +{ >> + return class->destruct == netdev_netmap_destruct; >> +} >> + >> +static struct netdev_netmap * >> +netdev_netmap_cast(const struct netdev *netdev) >> +{ >> + ovs_assert(is_netmap_class(netdev_get_class(netdev))); >> + return CONTAINER_OF(netdev, struct netdev_netmap, up); >> +} >> + >> +static struct netdev_rxq_netmap * >> +netdev_rxq_netmap_cast(const struct netdev_rxq *rx) >> +{ >> + ovs_assert(is_netmap_class(netdev_get_class(rx->netdev))); >> + return CONTAINER_OF(rx, struct netdev_rxq_netmap, up); >> +} >> + >> +static struct ovs_mutex netmap_mutex = OVS_MUTEX_INITIALIZER; >> + >> +/* Blocks are used to store DP_BLOCK_SIZE preallocated netmap dp_packets. >> + * During receive operation, dp_packets are allocated by moving them >> from a >> + * block to a dp_batch. A block is refilled when packets are freed. >> + * Each netmap dp_packet has source type set to DPBUF_NETMAP, with >> buf_idx >> + * identifying a netmap buffer. Packets in the blocks (or in flight >> within OVS) >> + * are not attached to any netmap ring, i.e. their buf_idx is not stored >> in >> + * any netmap slot. On receive or transmit, the netmap buffer owned by a >> + * dp_packet is swapped with one attached to a receive/transmit ring >> slot, >> + * by simply swapping the buf_idx values. */ >> +struct nm_block { >> + struct ovs_list node; /* Blocks can be chained >> + * in a list. */ >> + struct dp_packet* packets[DP_BLOCK_SIZE]; /* Array of dp_packets. */ >> + uint16_t idx; /* Array index of the >> current >> + * packet. */ >> +}; >> + >> +enum nm_block_type { >> + NM_BLOCK_TYPE_PUT = 0, >> + NM_BLOCK_TYPE_GET = 1, >> +}; >> + >> +/* Global data structures of the netmap dp_packet allocator. */ >> +static struct nm_runtime { >> + struct ovs_list port_list; /* List of all netmap netdevs. 
*/ >> + struct ovs_list block_list[2]; /* Lists for dp_packet blocks: one for >> + * empty and one for full ones. */ >> + void *mem; >> + uint16_t memid; >> + uint32_t memsize; >> + uint32_t nextrabufs; >> +} nmr = { 0 }; >> + >> +/* Each thread uses a pair of blocks for allocations and deallocations. >> */ >> +struct nm_alloc { >> + struct nm_block *block[2]; /* Blocks used by TX/RX to >> allocate/dealloacte >> + * dp_packets. */ >> +}; >> + >> +/* Thread local allocators for packet allocations/dellocations */ >> +DEFINE_STATIC_PER_THREAD_DATA(struct nm_alloc, nma, { 0 }); >> +#define NMA nma_get() >> +#define PUTB nma_get()->block[NM_BLOCK_TYPE_PUT] >> +#define GETB nma_get()->block[NM_BLOCK_TYPE_GET] >> + >> +/* Creates a new block. >> + * The block can be empty or initialized with new dp_packets associated >> to >> + * netmap buffers not attached to a netmap ring. */ >> +static struct nm_block* >> +nm_block_new(struct nm_desc *nmd) { >> + struct nm_block *block; >> + >> + block = xmalloc(sizeof(struct nm_block)); >> + block->idx = 0; >> + ovs_list_init(&block->node); >> + >> + if (nmd) { >> + struct dp_packet *packet; >> + struct netmap_ring *ring = NETMAP_RXRING(nmd->nifp, 0); >> + uint32_t idx = nmd->nifp->ni_bufs_head; >> + >> + for (int i = 0; idx && i < DP_BLOCK_SIZE; >> + i++, idx = *(uint32_t *)NETMAP_BUF(ring, idx)) { >> + packet = dp_packet_new(0); >> + packet->buf_idx = idx; >> + packet->source = DPBUF_NETMAP; >> + block->packets[block->idx++] = packet; >> + } >> + >> + nmd->nifp->ni_bufs_head = idx; >> + } >> + >> + return block; >> +} >> + >> +/* Swaps blocks from nm_runtime in order to replace the current block >> with >> + * an empty or full block. >> + * if we want GETB to be swapped with a block filled with dp_packets we >> will >> + * speciry NM_BLOCK_TYPE_GET. >> + * if we want PUTB to be swapped with a block filled with dp_packets we >> will >> + * speciry NM_BLOCK_TYPE_PUT. 
*/ >> +static void >> +nm_block_swap_global(enum nm_block_type type) { >> + struct nm_block **bselect = NULL; >> + struct nm_block *bswap = NULL, *btmp; >> + >> + ovs_mutex_lock(&netmap_mutex); >> + >> + bselect = &(NMA->block[type]); >> + >> + /* Try to pop a block form the correct list */ >> + if (!ovs_list_is_empty(&nmr.block_list[type])) { >> + bswap = CONTAINER_OF(ovs_list_pop_front(&nmr.block_list[type]), >> + struct nm_block, node); >> + } else { >> + bswap = nm_block_new(NULL); >> + } >> + >> + /* Swap blocks. */ >> + if (OVS_LIKELY(bswap)) { >> + btmp = *bselect; >> + *bselect = bswap; >> + /* If the current block is empty it will be pushed to the empty >> list >> + * and viceversa if it not empty. */ >> + type = btmp->idx ? NM_BLOCK_TYPE_GET : NM_BLOCK_TYPE_PUT; >> + ovs_list_push_back(&nmr.block_list[type], &btmp->node); >> + } >> + >> + ovs_mutex_unlock(&netmap_mutex); >> +} >> + >> +/* Swap the two blocks of the local allocator. */ >> +static void >> +nm_block_swap_local(void) { >> + struct nm_block* block = GETB; >> + GETB = PUTB; >> + PUTB = block; >> +} >> + >> +/* Frees a block from memory. >> + * If nmd is specified we will return extra buffers to this >> + * nm_desc if the block contains any dp_packet. */ >> +static void >> +nm_block_free(struct nm_block* b, struct nm_desc *nmd) { >> + if (b) { >> + if (nmd) { >> + struct netmap_ring *ring = NETMAP_RXRING(nmd->nifp, 0); >> + >> + for (int i = 0; i < b->idx; i++) { >> + struct dp_packet *packet = b->packets[i]; >> + if (packet) { >> + uint32_t *e = (uint32_t *) NETMAP_BUF(ring, >> packet->buf_idx); >> + *e = nmd->nifp->ni_bufs_head; >> + nmd->nifp->ni_bufs_head = packet->buf_idx; >> + free(packet); >> + } >> + } >> + } >> + >> + free(b); >> + } >> +} >> + >> +/* Set up the port by checking if any other port has already been opened. >> + * Prepare blocks of dp_packets. 
*/ >> +static int >> +netmap_setup_port(struct nm_desc *nmd) { >> + ovs_mutex_lock(&netmap_mutex); >> + >> + if (ovs_list_size(&nmr.port_list)) { >> + /* Netmap memory has already been set up, check if the new port >> uses >> + * the same memid */ >> + if (nmr.memid != nmd->req.nr_arg2) { >> + VLOG_WARN("unable to add this port, it has a new mem_id >> (%x->%x)", >> + nmr.memid, nmd->req.nr_arg2); >> + ovs_mutex_unlock(&netmap_mutex); >> + return 1; >> + } >> + } else { >> + /* We are initializing the first Netmap port: setup Netmap memory >> + * to this process. */ >> + nmr.memid = nmd->req.nr_arg2; >> + nmr.memsize = nmd->req.nr_memsize; >> + nmr.mem = mmap(0, nmr.memsize, PROT_WRITE | PROT_READ, >> + MAP_SHARED, nmd->fd, 0); >> + >> + if (nmr.mem == MAP_FAILED) { >> + VLOG_WARN("mmap has failed!"); >> + ovs_mutex_unlock(&netmap_mutex); >> + return 1; >> + } >> + } >> + >> + /* Now we can set up the following nmd fields */ >> + { >> + struct netmap_if *nifp; >> + >> + nmd->memsize = nmr.memsize; >> + nmd->mem = nmr.mem; >> + nifp = NETMAP_IF(nmd->mem, nmd->req.nr_offset); >> + *(struct netmap_if **)(uintptr_t)&(nmd->nifp) = nifp; >> + } >> + >> + /* Allocate a number of blocks containing dp_packets. The total >> number >> + * of extrabuffers to be used is multiple of the blocksize */ >> + uint32_t nextrabufs = nmd->req.nr_arg3 & ~(DP_BLOCK_SIZE-1); >> + struct nm_block *block; >> + for (int i = 0 ; i < (nextrabufs/DP_BLOCK_SIZE); i++) { >> + block = nm_block_new(nmd); >> + ovs_list_push_back(&nmr.block_list[NM_BLOCK_TYPE_GET], >> &block->node); >> + } >> + >> + ovs_mutex_unlock(&netmap_mutex); >> + >> + return 0; >> +} >> + >> +/* This function initializes some variables and has to be called in the >> pmd >> + * thread reload. >> + * Thanks to this we can initialize thread local blocks and recognize >> + * if there are other ports using our thread-id. 
*/ >> +void >> +netmap_init_port(struct netdev_rxq *rxq) { >> + >> + ovs_mutex_lock(&netmap_mutex); >> + >> + if(is_netmap_class(netdev_get_class(rxq->netdev))) { >> + struct netdev_netmap *dev = netdev_netmap_cast(rxq->netdev); >> + dev->tid = syscall(SYS_gettid); >> + dev->nma = NMA; >> + } >> + >> + /* We need to initialize new blocks in the local allocator */ >> + if (!GETB) { >> + GETB = nm_block_new(NULL); >> + } >> + >> + if (!PUTB) { >> + PUTB = nm_block_new(NULL); >> + } >> + >> + ovs_mutex_unlock(&netmap_mutex); >> +} >> + >> +/* This function is called upon dp_packet deallocation. The pointer is >> not >> + * dellocated but saved in a nm_block that has free space. */ >> +void >> +netmap_free_packet(struct dp_packet* packet) { >> + struct nm_block* block = PUTB; >> + >> + if (OVS_UNLIKELY(block->idx == (DP_BLOCK_SIZE - 1))) { >> + block = GETB; >> + if (OVS_UNLIKELY(block->idx == (DP_BLOCK_SIZE - 1))) { >> + nm_block_swap_global(NM_BLOCK_TYPE_PUT); >> + block = PUTB; >> + } >> + } >> + >> + block->packets[block->idx++] = packet; >> +} >> + >> +/* Allocate 'n' dp_packets to the batch. This operation might require >> + * multiple memcpy operations. If no thread local nm_block has data we >> need >> + * to ask for a new block to the nm_runtime. */ >> +static int >> +netmap_alloc_packets(struct dp_packet_batch* b, size_t n) { >> + struct nm_block* block; >> + size_t step, tot = 0, s; >> + >> + for (step = 0; step < 3; step++) { >> + block = GETB; >> + s = MIN(n, block->idx); >> + memcpy(&b->packets[tot], &block->packets[block->idx - s], >> + s * sizeof(struct dp_packet*)); >> + block->idx -= s; >> + tot += s; >> + n -= s; >> + >> + if (n == 0) { >> + break; >> + } else if (OVS_LIKELY(step == 0)) { >> + nm_block_swap_local(); >> + } else { >> + nm_block_swap_global(NM_BLOCK_TYPE_GET); >> + } >> + } >> + >> + return tot; >> +} >> + >> +/* Set up some values from the configuration. 
*/ >> +void >> +netmap_init_config(const struct smap *ovs_other_config) { >> + nmr.nextrabufs = (uint32_t) >> + smap_get_int(ovs_other_config, "netmap-nextrabufs", >> DP_BLOCK_SIZE); >> + >> + nmr.nextrabufs &= ~(DP_BLOCK_SIZE-1); >> + >> + VLOG_INFO("nextrabufs: %d", nmr.nextrabufs); >> +} >> + >> +static struct netdev_rxq * >> +netdev_netmap_rxq_alloc(void) >> +{ >> + struct netdev_rxq_netmap *rx = xzalloc(sizeof *rx); >> + return &rx->up; >> +} >> + >> +static int >> +netdev_netmap_rxq_construct(struct netdev_rxq *rxq OVS_UNUSED) >> +{ >> + /* Nothing to do here */ >> + return 0; >> +} >> + >> +static void >> +netdev_netmap_rxq_destruct(struct netdev_rxq *rxq OVS_UNUSED) >> +{ >> + /* Nothing to do here */ >> + return; >> +} >> + >> +static void >> +netdev_netmap_rxq_dealloc(struct netdev_rxq *rxq) >> +{ >> + struct netdev_rxq_netmap *rx = netdev_rxq_netmap_cast(rxq); >> + free(rx); >> +} >> + >> +static struct netdev * >> +netdev_netmap_alloc(void) >> +{ >> + struct netdev_netmap *dev; >> + >> + dev = (struct netdev_netmap *) xzalloc(sizeof *dev); >> + if (dev) { >> + return &dev->up; >> + } >> + >> + return NULL; >> +} >> + >> +static int >> +netdev_netmap_construct(struct netdev *netdev) >> +{ >> + struct netdev_netmap *dev = netdev_netmap_cast(netdev); >> + const char *ifname = netdev_get_name(netdev); >> + >> + struct nmreq req; >> + memset(&req, 0 , sizeof(req)); >> + req.nr_arg3 = nmr.nextrabufs; >> + >> + /* Open Netmap port requesting a number of extrabuffers. We also >> avoid to >> + * mmap netmap memory here. 
*/ >> + dev->nmd = nm_open(ifname, &req, NM_OPEN_NO_MMAP, NULL); >> + >> + if (!dev->nmd) { >> + if (!errno) { >> + VLOG_WARN("opening port \"%s\" failed: not a netmap port", >> ifname); >> + } else { >> + VLOG_WARN("opening port \"%s\" failed: %s", ifname, >> + ovs_strerror(errno)); >> + } >> + return EINVAL; >> + } else { >> + VLOG_INFO("opening port \"%s\"", ifname); >> + } >> + >> + /* Check if we have enough extra buffers to create a nm_block. */ >> + if (dev->nmd->req.nr_arg3 < DP_BLOCK_SIZE) { >> + VLOG_WARN("not enough extra buffers(%d/%d), closing port", >> + dev->nmd->req.nr_arg3, DP_BLOCK_SIZE); >> + nm_close(dev->nmd); >> + return EINVAL; >> + } >> + >> + /* Possibly mmap netmap memory, initialize the nm_desc, nm_runtime. >> + * Allocate some nm_blocks using the extrabuffers given to this >> port. */ >> + if (netmap_setup_port(dev->nmd)) { >> + VLOG_WARN("could not setup \"%s\" port", ifname); >> + nm_close(dev->nmd); >> + return EINVAL; >> + } >> + >> + ovs_list_init(&dev->list_node); >> + ovs_mutex_lock(&netmap_mutex); >> + ovs_list_push_front(&nmr.port_list, &dev->list_node); >> + ovs_mutex_unlock(&netmap_mutex); >> + >> + ovs_mutex_init(&dev->mutex); >> + pthread_spin_init(&dev->tx_lock, PTHREAD_PROCESS_SHARED); >> + eth_addr_random(&dev->hwaddr); >> + dev->flags = NETDEV_UP | NETDEV_PROMISC; >> + dev->timestamp = netmap_rdtsc(); >> + dev->rxsync_intval = DEFAULT_RSYNC_INTVAL; >> + dev->requested_mtu = NETMAP_RXRING(dev->nmd->nifp, 0)->nr_buf_size; >> + netdev_request_reconfigure(netdev); >> + >> + return 0; >> +} >> + >> +static void >> +netdev_netmap_destruct(struct netdev *netdev) >> +{ >> + struct netdev_netmap *dev = netdev_netmap_cast(netdev); >> + struct nm_block* b; >> + >> + ovs_mutex_lock(&netmap_mutex); >> + VLOG_INFO("closing port \"%s\"", (const char*) >> netdev_get_name(netdev)); >> + >> + ovs_list_remove(&dev->list_node); >> + >> + /* A netmap netdev is being removed. >> + * If this is the last netmap port we remove all blocks. 
*/ >> + if (!ovs_list_size(&nmr.port_list)) { >> + LIST_FOR_EACH_POP(b, node, &nmr.block_list[NM_BLOCK_TYPE_PUT]) { >> + nm_block_free(b, dev->nmd); >> + } >> + >> + LIST_FOR_EACH_POP(b, node, &nmr.block_list[NM_BLOCK_TYPE_GET]) { >> + nm_block_free(b, dev->nmd); >> + } >> + } else { >> + struct netdev_netmap *d; >> + enum nm_block_type type; >> + int last_thread_port = true; >> + >> + /* Check if there are other netmap ports using the same thread >> id. */ >> + LIST_FOR_EACH(d, list_node, &nmr.port_list) { >> + if (dev->tid == d->tid) { >> + last_thread_port = false; >> + break; >> + } >> + } >> + >> + /* If there are no ports using this thread id we return thread >> local >> + * blocks to the global allocator nm_runtime. */ >> + if (last_thread_port) { >> + b = dev->nma->block[NM_BLOCK_TYPE_PUT]; >> + type = b->idx ? NM_BLOCK_TYPE_GET : NM_BLOCK_TYPE_PUT; >> + ovs_list_push_front(&nmr.block_list[type], &b->node); >> + dev->nma->block[NM_BLOCK_TYPE_PUT] = NULL; >> + >> + b = dev->nma->block[NM_BLOCK_TYPE_GET]; >> + type = b->idx ? NM_BLOCK_TYPE_GET : NM_BLOCK_TYPE_PUT; >> + ovs_list_push_front(&nmr.block_list[type], &b->node); >> + dev->nma->block[NM_BLOCK_TYPE_GET] = NULL; >> + } >> + >> + /* We will now try to free a number of blocks equal to the blocks >> + * allocated when the port was created. >> + * Each block is then freed returning the extra bufs to the >> nm_desc. */ >> + int nblocks = nmr.nextrabufs / DP_BLOCK_SIZE; >> + LIST_FOR_EACH_POP(b, node, &nmr.block_list[NM_BLOCK_TYPE_GET]) { >> + nm_block_free(b, dev->nmd); >> + if (!--nblocks) { >> + break; >> + } >> + } >> + >> + if (!ovs_list_is_empty(&nmr.block_list[NM_BLOCK_TYPE_PUT])) { >> + struct ovs_list *list_node = ovs_list_pop_front( >> + &nmr.block_list[NM_BLOCK_TYPE_ >> PUT]); >> + b = CONTAINER_OF(list_node, struct nm_block, node); >> + nm_block_free(b, dev->nmd); >> + } >> + } >> + >> + ovs_mutex_unlock(&netmap_mutex); >> + >> + /* Now we can close the port. 
*/ >> + nm_close(dev->nmd); >> +} >> + >> +static void >> +netdev_netmap_dealloc(struct netdev *netdev) >> +{ >> + struct netdev_netmap *dev = netdev_netmap_cast(netdev); >> + >> + ovs_mutex_destroy(&dev->mutex); >> + pthread_spin_destroy(&dev->tx_lock); >> + >> + free(dev); >> +} >> + >> +static int >> +netdev_netmap_class_init(void) >> +{ >> + static struct ovsthread_once once = OVSTHREAD_ONCE_INITIALIZER; >> + >> + if (ovsthread_once_start(&once)) { >> + ovs_list_init(&nmr.block_list[NM_BLOCK_TYPE_PUT]); >> + ovs_list_init(&nmr.block_list[NM_BLOCK_TYPE_GET]); >> + ovs_list_init(&nmr.port_list); >> + ovsthread_once_done(&once); >> + } >> + >> + return 0; >> +} >> + >> +static int >> +netdev_netmap_reconfigure(struct netdev *netdev) >> +{ >> + struct netdev_netmap *dev = netdev_netmap_cast(netdev); >> + int err = 0; >> + >> + ovs_mutex_lock(&dev->mutex); >> + >> + if (dev->mtu == dev->requested_mtu) { >> + /* Reconfiguration is unnecessary */ >> + goto out; >> + } >> + >> + dev->mtu = dev->requested_mtu; >> + netdev_change_seq_changed(netdev); >> + >> +out: >> + ovs_mutex_unlock(&dev->mutex); >> + return err; >> +} >> + >> +static int >> +netdev_netmap_get_config(const struct netdev *netdev, struct smap *args) >> +{ >> + struct netdev_netmap *dev = netdev_netmap_cast(netdev); >> + >> + ovs_mutex_lock(&dev->mutex); >> + smap_add_format(args, "mtu", "%d", dev->mtu); >> + ovs_mutex_unlock(&dev->mutex); >> + >> + return 0; >> +} >> + >> +static int >> +netdev_netmap_set_config(struct netdev *netdev, const struct smap *args, >> + char **errp OVS_UNUSED) >> +{ >> + struct netdev_netmap *dev = netdev_netmap_cast(netdev); >> + >> + ovs_mutex_lock(&dev->mutex); >> + dev->rxsync_intval = smap_get_int(args, "rxsync-intval", >> + DEFAULT_RSYNC_INTVAL); >> + ovs_mutex_unlock(&dev->mutex); >> + >> + return 0; >> +} >> + >> +static inline void >> +netmap_rxsync(struct netdev_netmap *dev) >> +{ >> + uint64_t now = netmap_rdtsc(); >> + unsigned int diff = TSC2US(now - 
dev->timestamp); >> + >> + if (diff < dev->rxsync_intval) { >> + /* skipping rxsync */ >> + return; >> + } >> + >> + ioctl(dev->nmd->fd, NIOCRXSYNC, NULL); >> + >> + /* update current timestamp */ >> + dev->timestamp = now; >> +} >> + >> +static inline void >> +netmap_swap_slot(struct dp_packet *packet, struct netmap_slot *s) { >> + uint32_t idx; >> + >> + idx = s->buf_idx; >> + s->buf_idx = packet->buf_idx; >> + s->flags |= NS_BUF_CHANGED; >> + packet->buf_idx = idx; >> +} >> + >> +static int >> +netdev_netmap_send(struct netdev *netdev, int qid OVS_UNUSED, >> + struct dp_packet_batch *batch, bool concurrent_txq) >> +{ >> + struct netdev_netmap *dev = netdev_netmap_cast(netdev); >> + struct nm_desc *nmd = dev->nmd; >> + uint16_t r, nrings = dev->nmd->nifp->ni_tx_rings; >> + uint32_t budget = batch->count, count = 0; >> + bool again = false; >> + >> + if (OVS_UNLIKELY(!(dev->flags & NETDEV_UP))) { >> + dp_packet_delete_batch(batch, true); >> + return 0; >> + } >> + >> + if (OVS_UNLIKELY(concurrent_txq)) { >> + pthread_spin_lock(&dev->tx_lock); >> + } >> + >> +try_again: >> + for (r = 0; r < nrings; r++) { >> + struct netmap_ring *ring; >> + uint32_t head, space; >> + >> + ring = NETMAP_TXRING(nmd->nifp, nmd->cur_tx_ring); >> + space = nm_ring_space(ring); /* Available slots in this ring. */ >> + head = ring->head; >> + >> + if (space > budget) { >> + space = budget; >> + } >> + budget -= space; >> + >> + /* Transmit as much as possible in this ring. 
*/ >> + while (space--) { >> + struct netmap_slot *ts = &ring->slot[head]; >> + struct dp_packet *packet = batch->packets[count++]; >> + >> + ts->len = dp_packet_get_send_len(packet); >> + >> + if (OVS_UNLIKELY(packet->source != DPBUF_NETMAP)) { >> + /* send packet copying data to the netmap slot */ >> + memcpy(NETMAP_BUF(ring, ts->buf_idx), >> + dp_packet_data(packet), ts->len); >> + } else { >> + /* send packet using zerocopy */ >> + netmap_swap_slot(packet, ts); >> + } >> + >> + head = nm_ring_next(ring, head); >> + } >> + >> + ring->head = ring->cur = head; >> + >> + /* We may have exhausted the budget */ >> + if (OVS_LIKELY(!budget)) { >> + break; >> + } >> + >> + /* We still have packets to send, select next ring. */ >> + if (OVS_UNLIKELY(++dev->nmd->cur_tx_ring == nrings)) { >> + nmd->cur_tx_ring = 0; >> + } >> + } >> + >> + ioctl(dev->nmd->fd, NIOCTXSYNC, NULL); >> + >> + if (OVS_UNLIKELY(!count && !again)) { >> + again = true; >> + goto try_again; >> + } >> + >> + dp_packet_delete_batch(batch, true); >> + >> + if (OVS_UNLIKELY(concurrent_txq)) { >> + pthread_spin_unlock(&dev->tx_lock); >> + } >> + >> + return 0; >> +} >> + >> +static int >> +netdev_netmap_rxq_recv(struct netdev_rxq *rxq, struct dp_packet_batch >> *batch) >> +{ >> + struct netdev_netmap *dev = netdev_netmap_cast(rxq->netdev); >> + struct nm_desc *nmd = dev->nmd; >> + uint16_t r, nrings = nmd->nifp->ni_rx_rings; >> + uint32_t budget = 0; >> + >> + if (OVS_UNLIKELY(!(dev->flags & NETDEV_UP))) { >> + return EAGAIN; >> + } >> + >> + /* check how much we can receive */ >> + for (r = nmd->first_rx_ring; r < nrings; r++) { >> + budget += nm_ring_space(NETMAP_RXRING(nmd->nifp, r)); >> + } >> + >> + /* sync if there is no packet */ >> + if (budget == 0) { >> + netmap_rxsync(dev); >> + return EAGAIN; >> + } >> + >> + /* allocate the batch */ >> + budget = netmap_alloc_packets(batch, MIN(budget, NETDEV_MAX_BURST)); >> + >> + for (r = 0; r < nrings; r++) { >> + struct netmap_ring *ring; >> + uint32_t 
head, space; >> + >> + ring = NETMAP_RXRING(nmd->nifp, nmd->cur_rx_ring); >> + head = ring->head; >> + space = nm_ring_space(ring); >> + >> + if (space > budget) { >> + space = budget; >> + } >> + budget -= space; >> + >> + /* Receive as much as possible from this ring. */ >> + while (space--) { >> + struct netmap_slot *rs = &ring->slot[head]; >> + struct dp_packet *packet = batch->packets[batch->count++]; >> + dp_packet_init_netmap(packet, NETMAP_BUF(ring, rs->buf_idx), >> + rs->len); >> + /* receiving from a netmap port we can always zero copy >> here. */ >> + netmap_swap_slot(packet, rs); >> + head = nm_ring_next(ring, head); >> + } >> + >> + ring->cur = ring->head = head; >> + >> + /* check if the batch has been filled. */ >> + if (!budget) { >> + break; >> + } >> + >> + /* batch isn't full, try to receive on other rings. */ >> + if (OVS_UNLIKELY(++nmd->cur_rx_ring == nrings)) { >> + nmd->cur_rx_ring = 0; >> + } >> + } >> + >> + dp_packet_batch_init_packet_fields(batch); >> + >> + return 0; >> +} >> + >> +static int >> +netdev_netmap_get_ifindex(const struct netdev *netdev) >> +{ >> + struct netdev_netmap *dev = netdev_netmap_cast(netdev); >> + >> + ovs_mutex_lock(&dev->mutex); >> + /* Calculate hash from the netdev name. Ensure that ifindex is a >> 24-bit >> + * positive integer to meet RFC 2863 recommendations.
>> + */ >> + int ifindex = hash_string(netdev->name, 0) % 0xfffffe + 1; >> + ovs_mutex_unlock(&dev->mutex); >> + >> + return ifindex; >> +} >> + >> +static int >> +netdev_netmap_get_mtu(const struct netdev *netdev, int *mtu) >> +{ >> + struct netdev_netmap *dev = netdev_netmap_cast(netdev); >> + >> + ovs_mutex_lock(&dev->mutex); >> + *mtu = dev->mtu; >> + ovs_mutex_unlock(&dev->mutex); >> + >> + return 0; >> +} >> + >> +static int >> +netdev_netmap_set_mtu(struct netdev *netdev, int mtu) >> +{ >> + struct netdev_netmap *dev = netdev_netmap_cast(netdev); >> + >> + if (mtu > NETMAP_RXRING(dev->nmd->nifp, 0)->nr_buf_size >> + || mtu < ETH_HEADER_LEN) { >> + VLOG_WARN("%s: unsupported MTU %d", dev->up.name, mtu); >> + return EINVAL; >> + } >> + >> + ovs_mutex_lock(&dev->mutex); >> + if (dev->requested_mtu != mtu) { >> + dev->requested_mtu = mtu; >> + netdev_request_reconfigure(netdev); >> + } >> + ovs_mutex_unlock(&dev->mutex); >> + >> + return 0; >> +} >> + >> +static int >> +netdev_netmap_set_etheraddr(struct netdev *netdev, const struct >> eth_addr mac) >> +{ >> + struct netdev_netmap *dev = netdev_netmap_cast(netdev); >> + >> + ovs_mutex_lock(&dev->mutex); >> + dev->hwaddr = mac; >> + netdev_change_seq_changed(netdev); >> + ovs_mutex_unlock(&dev->mutex); >> + >> + return 0; >> +} >> + >> +static int >> +netdev_netmap_get_etheraddr(const struct netdev *netdev, struct >> eth_addr *mac) >> +{ >> + struct netdev_netmap *dev = netdev_netmap_cast(netdev); >> + >> + ovs_mutex_lock(&dev->mutex); >> + *mac = dev->hwaddr; >> + ovs_mutex_unlock(&dev->mutex); >> + >> + return 0; >> +} >> + >> +static int >> +netdev_netmap_update_flags(struct netdev *netdev, >> + enum netdev_flags off, enum netdev_flags on, >> + enum netdev_flags *old_flagsp) >> +{ >> + struct netdev_netmap *dev = netdev_netmap_cast(netdev); >> + >> + if ((off | on) & ~(NETDEV_UP | NETDEV_PROMISC)) { >> + return EINVAL; >> + } >> + >> + ovs_mutex_lock(&dev->mutex); >> + >> + *old_flagsp = dev->flags; >> +
dev->flags |= on; >> + dev->flags &= ~off; >> + >> + ovs_mutex_unlock(&dev->mutex); >> + >> + return 0; >> +} >> + >> +static int >> +netdev_netmap_get_carrier(const struct netdev *netdev, bool *carrier) >> +{ >> + struct netdev_netmap *dev = netdev_netmap_cast(netdev); >> + >> + ovs_mutex_lock(&dev->mutex); >> + *carrier = true; >> + ovs_mutex_unlock(&dev->mutex); >> + >> + return 0; >> +} >> + >> +static int >> +netdev_netmap_get_stats(const struct netdev *netdev, struct netdev_stats >> *stats) >> +{ >> + struct netdev_netmap *dev = netdev_netmap_cast(netdev); >> + >> + ovs_mutex_lock(&dev->mutex); >> + stats->tx_packets = dev->stats.tx_packets; >> + stats->tx_bytes = dev->stats.tx_bytes; >> + stats->rx_packets = dev->stats.rx_packets; >> + stats->rx_bytes = dev->stats.rx_bytes; >> + ovs_mutex_unlock(&dev->mutex); >> + >> + return 0; >> +} >> + >> +static int >> +netdev_netmap_get_status(const struct netdev *netdev, struct smap *args) >> +{ >> + struct netdev_netmap *dev = netdev_netmap_cast(netdev); >> + >> + ovs_mutex_lock(&dev->mutex); >> + smap_add_format(args, "mtu", "%d", dev->mtu); >> + ovs_mutex_unlock(&dev->mutex); >> + >> + return 0; >> +} >> + >> +#define NETDEV_NETMAP_CLASS(NAME, PMD, INIT, CONSTRUCT, DESTRUCT, >> SET_CONFIG, \ >> + SET_TX_MULTIQ, SEND, SEND_WAIT, GET_CARRIER, GET_STATS, >> GET_FEATURES, \ >> + GET_STATUS, RECONFIGURE, RXQ_RECV, RXQ_WAIT) \ >> +{ \ >> + NAME, \ >> + PMD, /* is_pmd */ \ >> + INIT, /* init */ \ >> + NULL, /* netdev_netmap_run */ \ >> + NULL, /* netdev_netmap_wait */ \ >> + netdev_netmap_alloc, \ >> + CONSTRUCT, \ >> + DESTRUCT, \ >> + netdev_netmap_dealloc, \ >> + netdev_netmap_get_config, \ >> + SET_CONFIG, \ >> + NULL, /* get_tunnel_config */ \ >> + NULL, /* build header */ \ >> + NULL, /* push header */ \ >> + NULL, /* pop header */ \ >> + NULL, /* get numa id */ \ >> + SET_TX_MULTIQ, /* tx multiq */ \ >> + SEND, /* send */ \ >> + SEND_WAIT, \ >> + netdev_netmap_set_etheraddr, \ >> + netdev_netmap_get_etheraddr, \ >> 
+ netdev_netmap_get_mtu, \ >> + netdev_netmap_set_mtu, \ >> + netdev_netmap_get_ifindex, \ >> + GET_CARRIER, \ >> + NULL, /* get_carrier_resets */ \ >> + NULL, /* get_miimon */ \ >> + GET_STATS, \ >> + NULL, /* get_custom_stats */ \ >> + \ >> + NULL, /* get_features */ \ >> + NULL, /* set_advertisements */ \ >> + NULL, /* get_pt_mode */ \ >> + \ >> + NULL, /* set_policing */ \ >> + NULL, /* get_qos_types */ \ >> + NULL, /* get_qos_capabilities */ \ >> + NULL, /* get_qos */ \ >> + NULL, /* set_qos */ \ >> + NULL, /* get_queue */ \ >> + NULL, /* set_queue */ \ >> + NULL, /* delete_queue */ \ >> + NULL, /* get_queue_stats */ \ >> + NULL, /* queue_dump_start */ \ >> + NULL, /* queue_dump_next */ \ >> + NULL, /* queue_dump_done */ \ >> + NULL, /* dump_queue_stats */ \ >> + \ >> + NULL, /* set_in4 */ \ >> + NULL, /* get_addr_list */ \ >> + NULL, /* add_router */ \ >> + NULL, /* get_next_hop */ \ >> + GET_STATUS, \ >> + NULL, /* arp_lookup */ \ >> + \ >> + netdev_netmap_update_flags, \ >> + RECONFIGURE, \ >> + \ >> + netdev_netmap_rxq_alloc, \ >> + netdev_netmap_rxq_construct, \ >> + netdev_netmap_rxq_destruct, \ >> + netdev_netmap_rxq_dealloc, \ >> + RXQ_RECV, \ >> + RXQ_WAIT, \ >> + NULL, /* rxq_drain */ \ >> + NO_OFFLOAD_API \ >> +} >> + >> +static const struct netdev_class netmap_class = >> + NETDEV_NETMAP_CLASS( >> + "netmap", >> + true, >> + netdev_netmap_class_init, >> + netdev_netmap_construct, >> + netdev_netmap_destruct, >> + netdev_netmap_set_config, >> + NULL, >> + netdev_netmap_send, >> + NULL, >> + netdev_netmap_get_carrier, >> + netdev_netmap_get_stats, >> + NULL, >> + netdev_netmap_get_status, >> + netdev_netmap_reconfigure, >> + netdev_netmap_rxq_recv, >> + NULL); >> + >> +void >> +netdev_netmap_register(void) >> +{ >> + netdev_register_provider(&netmap_class); >> +} >> diff --git a/lib/netdev-netmap.h b/lib/netdev-netmap.h >> new file mode 100644 >> index 000000000..49fe8c319 >> --- /dev/null >> +++ b/lib/netdev-netmap.h >> @@ -0,0 +1,13 @@ >> +#ifndef 
NETDEV_NETMAP_H >> +#define NETDEV_NETMAP_H >> + >> +struct netdev_rxq; >> +struct smap; >> +struct dp_packet; >> + >> +void netmap_init_port(struct netdev_rxq *); >> +void netmap_init_config(const struct smap *); >> +void netmap_free_packet(struct dp_packet *); >> +void netdev_netmap_register(void); >> + >> +#endif /* netdev-netmap.h */ >> diff --git a/lib/netmap-stub.c b/lib/netmap-stub.c >> new file mode 100644 >> index 000000000..62f7a06b8 >> --- /dev/null >> +++ b/lib/netmap-stub.c >> @@ -0,0 +1,21 @@ >> +#include <config.h> >> +#include "netmap.h" >> + >> +#include "smap.h" >> +#include "ovs-thread.h" >> +#include "openvswitch/vlog.h" >> + >> +VLOG_DEFINE_THIS_MODULE(netmap); >> + >> +void >> +netmap_init(const struct smap *ovs_other_config) >> +{ >> + if (smap_get_bool(ovs_other_config, "netmap-init", false)) { >> + static struct ovsthread_once once = OVSTHREAD_ONCE_INITIALIZER; >> + >> + if (ovsthread_once_start(&once)) { >> + VLOG_ERR("NETMAP not supported in this copy of Open >> vSwitch."); >> + ovsthread_once_done(&once); >> + } >> + } >> +} >> diff --git a/lib/netmap.c b/lib/netmap.c >> new file mode 100644 >> index 000000000..b4147e0ad >> --- /dev/null >> +++ b/lib/netmap.c >> @@ -0,0 +1,76 @@ >> +#include <config.h> >> + >> +#include <fcntl.h> >> +#include <pthread.h> >> +#include <stdio.h> >> +#include <sys/time.h> /* timersub */ >> +#include <stdlib.h> >> +#include <string.h> >> +#include <stdint.h> >> +#include <unistd.h> /* read() */ >> + >> +#include "dirs.h" >> +#include "netdev-netmap.h" >> +#include "netmap.h" >> +#include "openvswitch/vlog.h" >> +#include "smap.h" >> + >> +VLOG_DEFINE_THIS_MODULE(netmap); >> + >> +/* initialize to avoid a division by 0 */ >> +uint64_t netmap_ticks_per_second = 1000000000; /* set by calibrate_tsc */ >> + >> +/* >> + * do an idle loop to compute the clock speed. We expect >> + * a constant TSC rate and locked on all CPUs. 
>> + * Returns ticks per second >> + */ >> +static uint64_t >> +netmap_calibrate_tsc(void) >> +{ >> + struct timeval a, b; >> + uint64_t ta_0, ta_1, tb_0, tb_1, dmax = ~0; >> + uint64_t da, db, cy = 0; >> + int i; >> + for (i=0; i < 3; i++) { >> + ta_0 = netmap_rdtsc(); >> + gettimeofday(&a, NULL); >> + ta_1 = netmap_rdtsc(); >> + usleep(20000); >> + tb_0 = netmap_rdtsc(); >> + gettimeofday(&b, NULL); >> + tb_1 = netmap_rdtsc(); >> + da = ta_1 - ta_0; >> + db = tb_1 - tb_0; >> + if (da + db < dmax) { >> + cy = (b.tv_sec - a.tv_sec)*1000000 + b.tv_usec - a.tv_usec; >> + cy = (double)(tb_0 - ta_1)*1000000/(double)cy; >> + dmax = da + db; >> + } >> + } >> + netmap_ticks_per_second = cy; >> + return cy; >> +} >> + >> +void >> +netmap_init(const struct smap *ovs_other_config) >> +{ >> + static bool enabled = false; >> + >> + if (enabled || !ovs_other_config) { >> + return; >> + } >> + >> + if (smap_get_bool(ovs_other_config, "netmap-init", false)) { >> + static struct ovsthread_once once_enable = >> OVSTHREAD_ONCE_INITIALIZER; >> + if (ovsthread_once_start(&once_enable)) { >> + netmap_calibrate_tsc(); >> + netmap_init_config(ovs_other_config); >> + netdev_netmap_register(); >> + enabled = true; >> + ovsthread_once_done(&once_enable); >> + VLOG_INFO("NETMAP Enabled"); >> + } >> + } else >> + VLOG_INFO_ONCE("NETMAP Disabled - Use other_config:netmap-init >> to enable"); >> +} >> diff --git a/lib/netmap.h b/lib/netmap.h >> new file mode 100644 >> index 000000000..34ff7b7a2 >> --- /dev/null >> +++ b/lib/netmap.h >> @@ -0,0 +1,27 @@ >> +#ifndef NETMAP_H >> +#define NETMAP_H >> + >> +#include <stdint.h> >> + >> +extern uint64_t netmap_ticks_per_second; >> +#define US2TSC(x) ((x)*netmap_ticks_per_second/1000000UL) >> +#define TSC2US(x) ((x)*1000000UL/netmap_ticks_per_second) >> + >> +#if 0 /* gcc intrinsic */ >> +#include <x86intrin.h> >> +#define rdtsc __rdtsc >> +#else >> +static inline uint64_t >> +netmap_rdtsc(void) >> +{ >> + uint32_t hi, lo; >> + __asm__ __volatile__ 
("rdtsc" : "=a"(lo), "=d"(hi)); >> + return (uint64_t)lo | ((uint64_t)hi << 32); >> +} >> +#endif >> + >> +struct smap; >> + >> +void netmap_init(const struct smap *ovs_other_config); >> + >> +#endif /* netmap.h */ >> diff --git a/vswitchd/bridge.c b/vswitchd/bridge.c >> index d90997e3a..2dfcbb7f6 100644 >> --- a/vswitchd/bridge.c >> +++ b/vswitchd/bridge.c >> @@ -38,6 +38,7 @@ >> #include "mac-learning.h" >> #include "mcast-snooping.h" >> #include "netdev.h" >> +#include "netmap.h" >> #include "nx-match.h" >> #include "ofproto/bond.h" >> #include "ofproto/ofproto.h" >> @@ -2977,6 +2978,7 @@ bridge_run(void) >> if (cfg) { >> netdev_set_flow_api_enabled(&cfg->other_config); >> dpdk_init(&cfg->other_config); >> + netmap_init(&cfg->other_config); >> } >> >> /* Initialize the ofproto library. This only needs to run once, but >> diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml >> index f899a1976..f6dd6e7b6 100644 >> --- a/vswitchd/vswitch.xml >> +++ b/vswitchd/vswitch.xml >> @@ -217,6 +217,46 @@ >> </p> >> </column> >> >> + <column name="other_config" key="netmap-init" >> + type='{"type": "boolean"}'> >> + <p> >> + Set this value to <code>true</code> to enable runtime support >> for >> + NETMAP ports. The vswitch must have compile-time support for >> NETMAP as >> + well. >> + </p> >> + <p> >> + The default value is <code>false</code>. Changing this value >> requires >> + restarting the daemon >> + </p> >> + <p> >> + If this value is <code>false</code> at startup, any netmap >> ports which >> + are configured in the bridge will fail. >> + </p> >> + </column> >> + >> + <column name="other_config" key="netmap-nextrabufs" >> + type='{"type": "integer", "minInteger": 32}'> >> + <p> >> + Specifies the number of extra buffers to be requested to >> netmap >> + when opening each netmap port. >> + </p> >> + <p> >> + Each packet received or transmitted by OVS from/to a netmap >> port >> + needs an extra buffer. 
The OVS netmap runtime needs at least >> a >> + batch worth of extra buffers (32 packets) for each port to >> function >> + properly. More extra buffers may be necessary if OVS >> temporarily >> + stores netmap buffers within its internal queues. >> + </p> >> + </column> >> + >> + <column name="other_config" key="rxsync-intval" >> + type='{"type": "integer", "minInteger": 0}'> >> + <p> >> + Specifies the minimum time (in microseconds) between two >> + consecutive rxsync calls issued on a netmap port. >> + </p> >> + </column> >> + >> <column name="other_config" key="dpdk-init" >> type='{"type": "boolean"}'> >> <p> >> >> >> 2018-03-20 15:07 GMT+01:00 Alessandro Rosetti < >> alessandro.rose...@gmail.com>: >> >>> Hi Darrell, >>> >>> I'm developing netmap support for my thesis and I hope it will make it >>> for OVS 2.10. >>> In the next few days I'm going to post the first prototype patch, which is >>> almost ready. >>> >>> Thanks to you, >>> Alessandro >>> >>> On 19 Mar 2018 9:26 pm, "Darrell Ball" <dlu...@gmail.com> wrote: >>> >>>> Hi Alessandro >>>> >>>> I also think this would be interesting. >>>> Is netmap integration actively being worked on for OVS 2.10? >>>> >>>> Thanks Darrell >>>> >>>> On Wed, Feb 7, 2018 at 9:19 AM, Ilya Maximets <i.maxim...@samsung.com> >>>> wrote: >>>> >>>>> > Hi, >>>>> >>>>> Hi, Alessandro. >>>>> >>>>> > >>>>> > My name is Alessandro Rosetti, and I'm currently adding netmap >>>>> support to >>>>> > ovs, following an approach similar to DPDK. >>>>> >>>>> Good to know that someone started to work on this. IMHO, it's a good >>>>> idea. >>>>> I also wanted to try to implement this someday, but did not have much time. >>>>> >>>>> > >>>>> > I've created a new netdev: netdev_netmap that uses the pmd >>>>> infrastructure. >>>>> > The prototype I have seems to work fine (I still need to tune >>>>> performance, >>>>> > test optional features, and test more complex topologies.) >>>>> >>>>> Cool. Looking forward to your RFC patch-set.
>>>>> >>>>> > >>>>> > I have a question about the lifetime of dp_packets. >>>>> > Is there any guarantee that the dp_packets allocated in a receive >>>>> callback >>>>> > (e.g. netdev_netmap_rxq_recv) are consumed by OVS (e.g. dropped, >>>>> cloned, or >>>>> > sent to other ports) **before** a subsequent call to the receive >>>>> callback >>>>> > (on the same port)? >>>>> > Or is it possible for dp_packets to be stored somewhere (e.g. in an >>>>> OVS >>>>> > internal queue) and live across subsequent invocations of the receive >>>>> > callback that allocated them? >>>>> >>>>> I think that there was never such a guarantee, but recent changes in >>>>> the userspace >>>>> datapath completely ruined this assumption. I mean the output packet >>>>> batching support. >>>>> >>>>> Please refer to the following commits for details: >>>>> 009e003 2017-12-14 | dpif-netdev: Output packet batching. >>>>> c71ea3c 2018-01-15 | dpif-netdev: Time based output batching. >>>>> 00adb8d 2018-01-15 | docs: Describe output packet batching in DPDK >>>>> guide. >>>>> >>>>> > >>>>> > I need to know if this is the case to check that my current >>>>> prototype is >>>>> > safe. >>>>> > I use per-port pre-allocation of dp_packets, for maximum >>>>> performance. I've >>>>> > seen that DPDK uses its internal allocator to allocate and deallocate >>>>> > dp_packets, but netmap does not expose one. >>>>> > Each packet received with netmap is created as a new type of dp_packet: >>>>> > DPBUF_NETMAP. The data points to a netmap buffer (preallocated by the >>>>> > kernel). >>>>> > When I receive data (netdev_netmap_rxq_recv) I reuse the dp_packets, >>>>> > updating the internal pointer and a couple of additional pieces of information >>>>> > stored inside the dp_packet. >>>>> > When I have to send data I use zero copy if dp_packet is >>>>> DPBUF_NETMAP and >>>>> > copy if it's not. >>>>> > >>>>> > Thanks for the help! >>>>> > Alessandro.
>>>>> >>>>> >>>>> _______________________________________________ >>>>> dev mailing list >>>>> d...@openvswitch.org >>>>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev >>>>> >>>> >>>> >> >
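The zero-copy scheme described in the quoted message (and implemented by netmap_swap_slot() in the patch) can be sketched in isolation. This is only a minimal sketch under stated assumptions: toy_slot, toy_packet and toy_bufs are hypothetical stand-ins, not the real struct netmap_slot, struct dp_packet or the netmap buffer area, and TOY_BUF_CHANGED merely plays the role of NS_BUF_CHANGED. It illustrates why each packet held by OVS needs one spare buffer: receiving or sending without a copy amounts to exchanging buffer indices between the packet and the ring slot.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_BUF_SIZE    64   /* hypothetical buffer size */
#define TOY_NUM_BUFS    8    /* hypothetical: ring buffers plus extra buffers */
#define TOY_BUF_CHANGED 0x1  /* stand-in for NS_BUF_CHANGED */

/* Stand-in for the shared buffer area: buffers are addressed by index. */
static uint8_t toy_bufs[TOY_NUM_BUFS][TOY_BUF_SIZE];

enum toy_source { TOY_MALLOC, TOY_NETMAP };

struct toy_slot {            /* simplified stand-in for struct netmap_slot */
    uint32_t buf_idx;
    uint16_t len;
    uint16_t flags;
};

struct toy_packet {          /* simplified stand-in for struct dp_packet */
    enum toy_source source;
    uint32_t buf_idx;        /* buffer owned by the packet (TOY_NETMAP only) */
    const uint8_t *data;
    uint16_t len;
};

/* Exchange the buffer owned by the packet with the one held by the slot. */
static void
toy_swap_slot(struct toy_packet *p, struct toy_slot *s)
{
    uint32_t idx = s->buf_idx;

    s->buf_idx = p->buf_idx;
    s->flags |= TOY_BUF_CHANGED;   /* tell the ring its buffer changed */
    p->buf_idx = idx;
}

/* Receive: the packet takes the filled buffer and hands its spare buffer
 * to the ring, so no payload bytes are copied. */
static void
toy_recv(struct toy_packet *p, struct toy_slot *s)
{
    p->source = TOY_NETMAP;
    p->len = s->len;
    toy_swap_slot(p, s);
    p->data = toy_bufs[p->buf_idx];
}

/* Send: returns 1 for zero copy, 0 if the payload was copied, -1 if the
 * payload does not fit in one buffer. */
static int
toy_send(struct toy_packet *p, struct toy_slot *s)
{
    if (p->len > TOY_BUF_SIZE) {
        return -1;
    }
    s->len = p->len;
    if (p->source != TOY_NETMAP) {
        /* Packet memory is not a ring buffer: copy into the slot's buffer. */
        memcpy(toy_bufs[s->buf_idx], p->data, p->len);
        return 0;
    }
    toy_swap_slot(p, s);
    p->data = toy_bufs[p->buf_idx];   /* packet now owns the old TX buffer */
    return 1;
}
```

For example, a packet that owns spare buffer 4 and receives from a slot holding buffer 0 ends up owning buffer 0, while the slot gets buffer 4 back for the next frame. Since packets can sit in output batches across several receive calls, the number of spare buffers bounds how many packets can be in flight at once, which matches the note in the vswitch.xml text that more extra buffers may be needed when OVS stores netmap buffers in its internal queues.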