Re: [ovs-dev] [patch v1 1/2] conntrack: Fix nat_clean.
This will need a v2.

On Tue, Aug 28, 2018 at 7:37 PM, Darrell Ball wrote:

> nat_clean has a defunct optimization for calculating a hash outside the
> scope of a bucket lock; move the line into the scope of the lock. Needs
> backporting to 2.8.
>
> Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2018-August/351629.html
> Fixes: 286de2729955 ("dpdk: Userspace Datapath: Introduce NAT Support.")
> Signed-off-by: Darrell Ball
> ---
>  lib/conntrack.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/lib/conntrack.c b/lib/conntrack.c
> index be8debb..44cb91b 100644
> --- a/lib/conntrack.c
> +++ b/lib/conntrack.c
> @@ -778,10 +778,11 @@ nat_clean(struct conntrack *ct, struct conn *conn,
>  {
>      ct_rwlock_wrlock(&ct->resources_lock);
>      nat_conn_keys_remove(&ct->nat_conn_keys, &conn->rev_key,
>                           ct->hash_basis);
> -    ct_rwlock_unlock(&ct->resources_lock);
> -    ct_lock_unlock(&ctb->lock);
>      unsigned bucket_rev_conn =
>          hash_to_bucket(conn_key_hash(&conn->rev_key, ct->hash_basis));
> +    ct_rwlock_unlock(&ct->resources_lock);
> +    ct_lock_unlock(&ctb->lock);
> +
>      ct_lock_lock(&ct->buckets[bucket_rev_conn].lock);
>      ct_rwlock_wrlock(&ct->resources_lock);
>      long long now = time_msec();
> --
> 1.9.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [patch v2] datapath: Fix builds on older kernels.
On older kernels, for example 3.19, the function rt6_get_cookie() is used
when IPv6 is enabled in the kernel config but is not available; it was
introduced in 4.2. Put back the replacement function if it does not exist.
Add a 3.19 version to travis.

CC: Yifeng Sun
Fixes: bf61b8b1c1db ("datapath: Add support for kernel 4.16.x & 4.17.x.")
Signed-off-by: Darrell Ball
---

v1->v2: add 3.19 to travis per Yifeng's suggestion.

 .travis.yml                                 |  1 +
 acinclude.m4                                |  5
 datapath/linux/Modules.mk                   |  1 +
 datapath/linux/compat/include/net/ip6_fib.h | 43 +
 4 files changed, 50 insertions(+)
 create mode 100644 datapath/linux/compat/include/net/ip6_fib.h

diff --git a/.travis.yml b/.travis.yml
index 21447b5..a2ef8bd 100644
--- a/.travis.yml
+++ b/.travis.yml
@@ -41,6 +41,7 @@ env:
   - KERNEL=4.14.63
   - KERNEL=4.9.120
   - KERNEL=4.4.148
+  - KERNEL=3.19.8
   - KERNEL=3.16.57
   - TESTSUITE=1 LIBS=-ljemalloc

diff --git a/acinclude.m4 b/acinclude.m4
index ab141bd..0690bae 100644
--- a/acinclude.m4
+++ b/acinclude.m4
@@ -459,6 +459,9 @@ AC_DEFUN([OVS_CHECK_LINUX_COMPAT], [
   OVS_GREP_IFELSE([$KSRC/arch/x86/include/asm/checksum_32.h], [src_err,],
                   [OVS_DEFINE([HAVE_CSUM_COPY_DBG])])

+  OVS_GREP_IFELSE([$KSRC/include/net/ip6_fib.h], [rt6_get_cookie],
+                  [OVS_DEFINE([HAVE_RT6_GET_COOKIE])])
+
   OVS_GREP_IFELSE([$KSRC/include/net/addrconf.h], [ipv6_dst_lookup.*net],
                   [OVS_DEFINE([HAVE_IPV6_DST_LOOKUP_NET])])
   OVS_GREP_IFELSE([$KSRC/include/net/addrconf.h], [ipv6_stub])
@@ -803,6 +806,8 @@ AC_DEFUN([OVS_CHECK_LINUX_COMPAT], [
                   [OVS_DEFINE(HAVE_NF_CONNTRACK_HELPER_PUT)])
   OVS_GREP_IFELSE([$KSRC/include/linux/skbuff.h],
                   [[[[:space:]]]SKB_GSO_UDP[[[:space:]]]],
                   [OVS_DEFINE([HAVE_SKB_GSO_UDP])])
+  OVS_GREP_IFELSE([$KSRC/include/net/dst.h],[DST_NOCACHE],
+                  [OVS_DEFINE([HAVE_DST_NOCACHE])])
   OVS_FIND_FIELD_IFELSE([$KSRC/include/net/rtnetlink.h], [rtnl_link_ops], [extack],
                         [OVS_DEFINE([HAVE_EXT_ACK_IN_RTNL_LINKOPS])])

diff --git a/datapath/linux/Modules.mk b/datapath/linux/Modules.mk
index b06ca15..e31d784 100644
--- a/datapath/linux/Modules.mk
+++ b/datapath/linux/Modules.mk
@@ -82,6 +82,7 @@ openvswitch_headers += \
     linux/compat/include/net/inetpeer.h \
     linux/compat/include/net/ip.h \
     linux/compat/include/net/ip_tunnels.h \
+    linux/compat/include/net/ip6_fib.h \
     linux/compat/include/net/ip6_route.h \
     linux/compat/include/net/ip6_tunnel.h \
     linux/compat/include/net/ipv6.h \
diff --git a/datapath/linux/compat/include/net/ip6_fib.h b/datapath/linux/compat/include/net/ip6_fib.h
new file mode 100644
index 000..0cc4358
--- /dev/null
+++ b/datapath/linux/compat/include/net/ip6_fib.h
@@ -0,0 +1,43 @@
+/*
+ *      Linux INET6 implementation
+ *
+ *      Authors:
+ *      Pedro Roque
+ *
+ *      This program is free software; you can redistribute it and/or
+ *      modify it under the terms of the GNU General Public License
+ *      as published by the Free Software Foundation; either version
+ *      2 of the License, or (at your option) any later version.
+ */
+
+#ifndef _IP6_FIB_WRAPPER_H
+#define _IP6_FIB_WRAPPER_H
+
+#include_next <net/ip6_fib.h>
+
+#ifndef HAVE_RT6_GET_COOKIE
+
+#ifndef RTF_PCPU
+#define RTF_PCPU 0x40000000
+#endif
+
+#ifndef RTF_LOCAL
+#define RTF_LOCAL 0x80000000
+#endif
+
+#define rt6_get_cookie rpl_rt6_get_cookie
+static inline u32 rt6_get_cookie(const struct rt6_info *rt)
+{
+    if (rt->rt6i_flags & RTF_PCPU ||
+#ifdef HAVE_DST_NOCACHE
+        (unlikely(rt->dst.flags & DST_NOCACHE) && rt->dst.from))
+#else
+        (unlikely(!list_empty(&rt->rt6i_uncached)) && rt->dst.from))
+#endif
+        rt = (struct rt6_info *)(rt->dst.from);
+
+    return rt->rt6i_node ? rt->rt6i_node->fn_sernum : 0;
+}
+#endif
+
+#endif
--
1.9.1
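The compat pattern used by this patch (define a replacement only when
configure did not find the native function, and rename it with an rpl_
prefix) can be sketched outside the kernel as follows. This is only an
illustration under stated assumptions: struct route_sketch, route_cookie()
and HAVE_ROUTE_COOKIE are hypothetical stand-ins for struct rt6_info,
rt6_get_cookie() and HAVE_RT6_GET_COOKIE, and the body is reduced to the
final "node present, else 0" step.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's struct rt6_info. */
struct route_sketch {
    uint32_t sernum;   /* serial number of the owning FIB node */
    int has_node;      /* nonzero if the route is attached to a node */
};

/* In a real build, configure would define HAVE_ROUTE_COOKIE when the
 * included header already provides route_cookie().  It is left undefined
 * here, so the replacement below is compiled in. */
#ifndef HAVE_ROUTE_COOKIE
/* Rename our definition so it cannot clash with a native symbol. */
#define route_cookie rpl_route_cookie
static inline uint32_t route_cookie(const struct route_sketch *rt)
{
    assert(rt != NULL);
    return rt->has_node ? rt->sernum : 0;
}
#endif
```

Callers keep writing route_cookie(); the preprocessor redirects them to the
replacement only when the native function is missing.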
[ovs-dev] [patch v1 2/2] conntrack: Skip ephemeral ports fallback for DNAT.
Ephemeral port fallback is being done for DNAT; stop it. Needs
backporting to 2.8.

Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2018-August/351629.html
Fixes: 286de2729955 ("dpdk: Userspace Datapath: Introduce NAT Support.")
Signed-off-by: Darrell Ball
---
 lib/conntrack.c | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/lib/conntrack.c b/lib/conntrack.c
index 44cb91b..c434084 100644
--- a/lib/conntrack.c
+++ b/lib/conntrack.c
@@ -2182,7 +2182,9 @@ nat_select_range_tuple(struct conntrack *ct, const struct conn *conn,

     uint16_t port = first_port;
     bool all_ports_tried = false;
-    bool original_ports_tried = false;
+    /* For DNAT, we don't try ephemeral ports. */
+    bool ephemeral_ports_tried =
+        conn->nat_info->nat_action & NAT_ACTION_DST ? true : false;
     struct ct_addr first_addr = ct_addr;

     while (true) {
@@ -2228,8 +2230,8 @@ nat_select_range_tuple(struct conntrack *ct, const struct conn *conn,
             ct_addr = conn->nat_info->min_addr;
         }
         if (!memcmp(&ct_addr, &first_addr, sizeof ct_addr)) {
-            if (!original_ports_tried) {
-                original_ports_tried = true;
+            if (!ephemeral_ports_tried) {
+                ephemeral_ports_tried = true;
                 ct_addr = conn->nat_info->min_addr;
                 min_port = MIN_NAT_EPHEMERAL_PORT;
                 max_port = MAX_NAT_EPHEMERAL_PORT;
--
1.9.1
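The effect of initializing the "already tried" flag from the NAT action can
be shown with a much reduced sketch. It is not nat_select_range_tuple():
select_port(), the ACTION_* flags and the in_use table are hypothetical, the
address iteration is left out, and the ephemeral range 1024..65535 is an
assumed value. Starting the flag as true for DNAT means the fallback branch
is never taken for it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MIN_EPHEMERAL_SKETCH 1024    /* assumed ephemeral range */
#define MAX_EPHEMERAL_SKETCH 65535

enum { ACTION_SRC = 1 << 0, ACTION_DST = 1 << 1 };   /* hypothetical flags */

/* Returns a free port from [min_port, max_port], or 0 if none is free.
 * 'in_use' has 65536 entries, indexed by port number.  For SNAT the search
 * falls back once to the ephemeral range; for DNAT it does not. */
static uint16_t
select_port(unsigned action, uint16_t min_port, uint16_t max_port,
            const bool *in_use)
{
    /* For DNAT, pretend the ephemeral range was already tried. */
    bool ephemeral_tried = (action & ACTION_DST) != 0;

    assert(min_port > 0 && min_port <= max_port);
    for (;;) {
        /* 32-bit counter so the loop terminates when max_port is 65535. */
        for (uint32_t p = min_port; p <= max_port; p++) {
            if (!in_use[p]) {
                return (uint16_t) p;
            }
        }
        if (ephemeral_tried) {
            return 0;
        }
        ephemeral_tried = true;
        min_port = MIN_EPHEMERAL_SKETCH;
        max_port = MAX_EPHEMERAL_SKETCH;
    }
}
```

With a fully occupied requested range, an SNAT request ends up with an
ephemeral port while a DNAT request fails instead of silently changing the
destination port.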
[ovs-dev] [patch v1 1/2] conntrack: Fix nat_clean.
nat_clean has a defunct optimization for calculating a hash outside the
scope of a bucket lock; move the line into the scope of the lock. Needs
backporting to 2.8.

Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2018-August/351629.html
Fixes: 286de2729955 ("dpdk: Userspace Datapath: Introduce NAT Support.")
Signed-off-by: Darrell Ball
---
 lib/conntrack.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/lib/conntrack.c b/lib/conntrack.c
index be8debb..44cb91b 100644
--- a/lib/conntrack.c
+++ b/lib/conntrack.c
@@ -778,10 +778,11 @@ nat_clean(struct conntrack *ct, struct conn *conn,
 {
     ct_rwlock_wrlock(&ct->resources_lock);
     nat_conn_keys_remove(&ct->nat_conn_keys, &conn->rev_key, ct->hash_basis);
-    ct_rwlock_unlock(&ct->resources_lock);
-    ct_lock_unlock(&ctb->lock);
     unsigned bucket_rev_conn =
         hash_to_bucket(conn_key_hash(&conn->rev_key, ct->hash_basis));
+    ct_rwlock_unlock(&ct->resources_lock);
+    ct_lock_unlock(&ctb->lock);
+
     ct_lock_lock(&ct->buckets[bucket_rev_conn].lock);
     ct_rwlock_wrlock(&ct->resources_lock);
     long long now = time_msec();
--
1.9.1
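The ordering this fix establishes can be illustrated with a minimal,
self-contained sketch. It is not the OVS code: struct table, struct entry,
hash_to_bucket_sketch() and move_to_rev_bucket() are hypothetical names, and
plain pthread mutexes stand in for the ct_lock wrappers. The point shown is
that the reverse-key bucket index is derived from the entry while the
current bucket lock is still held, and only then is that lock dropped.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

#define N_BUCKETS 256                /* hypothetical bucket count */

struct bucket {
    pthread_mutex_t lock;
};

struct table {
    struct bucket buckets[N_BUCKETS];
    uint32_t hash_basis;
};

struct entry {
    uint32_t rev_key;                /* stand-in for conn->rev_key */
};

static void
table_init(struct table *t, uint32_t hash_basis)
{
    for (unsigned i = 0; i < N_BUCKETS; i++) {
        pthread_mutex_init(&t->buckets[i].lock, NULL);
    }
    t->hash_basis = hash_basis;
}

/* Toy hash; the real code uses conn_key_hash() and hash_to_bucket(). */
static unsigned
hash_to_bucket_sketch(uint32_t key, uint32_t basis)
{
    return (key ^ basis) % N_BUCKETS;
}

/* Must be called with bucket 'cur' locked.  Returns the index of the
 * reverse-key bucket, which is locked on return. */
static unsigned
move_to_rev_bucket(struct table *t, unsigned cur, const struct entry *e)
{
    assert(cur < N_BUCKETS);

    /* Read the entry while it is still protected by the current lock. */
    unsigned rev = hash_to_bucket_sketch(e->rev_key, t->hash_basis);

    pthread_mutex_unlock(&t->buckets[cur].lock);
    pthread_mutex_lock(&t->buckets[rev].lock);
    return rev;
}
```

Computing the index after the unlock would read the entry with no lock
held, which is presumably what the patch is guarding against.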
Re: [ovs-dev] [PATCH v6] dpif-netdev: Avoid reordering of packets in a batch with same megaflow
>-----Original Message-----
>From: ovs-dev-boun...@openvswitch.org [mailto:ovs-dev-boun...@openvswitch.org]
>On Behalf Of Ilya Maximets
>Sent: Wednesday, August 22, 2018 8:33 AM
>To: Stokes, Ian ; Vishal Deep Ajmera ; d...@openvswitch.org
>Subject: Re: [ovs-dev] [PATCH v6] dpif-netdev: Avoid reordering of packets in
>a batch with same megaflow
>
>master.
>
>Hi, Ian, everyone.
>It's ok to merge. I still don't like the change, but I'll get along with it.
>P.S. Current version can not be applied cleanly, minor rebase required.
>
>Below text is not directly related to this patch.
>---
>Meanwhile, in general, I think that the main processing part
>of dpif-netdev (dp_netdev_input__) is overcomplicated. It looks
>like we have 4 different levels of packet batching (some of them
>exist at the same time):
>1. rx batches
>2. Per-flow batches
>3. new flow maps
>4. output batches.
>And we're only constantly batching packets in different data
>structures. That's frustrating.
>
>I'd like to refactor it in a way of more iterative and sequential
>processing like using "keys_map" approach from "dpcls_lookup" and
>call different processing functions until "keys_map" is not empty.
>Each processing function may use "ULLONG_FOR_EACH_1" for this
>"key_map", set up rules for matched packets and disable them
>from "keys_map" using "ULLONG_SET0".
>Right now we have 5 options to classify packets:
>1. has_flow_mark()
>2. emc_lookup()
>3. smc_lookup_batch()
>4. dpcls_lookup()
>5. handle_packet_upcall()
>some of them work with packet batches, some with individual packets.
>IMHO, if we'll make a plain pipeline from them calling the next
>stage until there are some unhandled packets in "keys_map", it could
>be much simpler compared to the current call tree with a mess of
>processing functions, each of which is implemented in a different way.
>In the end we'll be able to push all the packets to per-flow batches
>and execute actions.
>
>In fact, this will effectively resolve current reordering issue too.
>
>I'm not sure if the above text is parsable. =) Sorry, if not.
>
>Any thoughts?
>Anyway, I'm going to try.

[Wang, Yipeng]
Hi, Ilya,

Your suggestion of refactoring the current code would be nice. Here are my
two cents:

1. Previously when I implemented the SMC cache, I tried to create a key
array, and pass it to EMC batch processing and SMC batch processing as two
stages (mostly for performance reasons, batching EMC enables me to do
software pipelining). The issue (at least what I observed) is that
miniflow-extracting into a full 32-element key array before EMC processing
occupies so many cache lines that it is costly. The current implementation
reuses the same key cache line to do miniflow_extract if the previous key
hits EMC. This helps cache locality in the mostly-EMC-hit case. So if we
assume EMC-hit is common, you may consider keeping EMC lookup and miniflow
extract separate from the others for performance reasons.

2. Using masks to pass batches around is a good improvement on code
readability. However, I don't know a convenient way to implement software
pipelining with masks. So sometimes a compact pointer array may still
benefit if software pipelining is needed.

Please include me once you have a draft.

Thanks
Yipeng
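The "keys_map" pipeline proposed above can be sketched roughly as below.
This is an illustration only, not dpif-netdev code: struct batch, the three
stage functions and classify() are hypothetical, integers stand in for
packet keys, and a plain loop over bit positions replaces ULLONG_FOR_EACH_1.
Each stage returns the mask of packets it handled, and the driver clears
those bits (the role ULLONG_SET0 plays in the proposal) before calling the
next stage.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BATCH_MAX 32

struct batch {
    int pkt[BATCH_MAX];    /* stand-in for extracted packet keys */
    int flow[BATCH_MAX];   /* stage that classified the packet, 0 = none */
    size_t n;
};

/* A stage looks only at packets whose bit is set in 'pending' and returns
 * the mask of packets it managed to classify. */
typedef uint32_t stage_fn(struct batch *b, uint32_t pending);

/* Toy "exact match cache": hits for keys below 10. */
static uint32_t
stage_cache(struct batch *b, uint32_t pending)
{
    uint32_t hit = 0;
    for (size_t i = 0; i < b->n; i++) {
        if (((pending >> i) & 1) && b->pkt[i] < 10) {
            b->flow[i] = 1;
            hit |= UINT32_C(1) << i;
        }
    }
    return hit;
}

/* Toy "wildcard classifier": hits for even keys. */
static uint32_t
stage_wildcard(struct batch *b, uint32_t pending)
{
    uint32_t hit = 0;
    for (size_t i = 0; i < b->n; i++) {
        if (((pending >> i) & 1) && b->pkt[i] % 2 == 0) {
            b->flow[i] = 2;
            hit |= UINT32_C(1) << i;
        }
    }
    return hit;
}

/* Toy "upcall": handles everything that is still pending. */
static uint32_t
stage_slow_path(struct batch *b, uint32_t pending)
{
    for (size_t i = 0; i < b->n; i++) {
        if ((pending >> i) & 1) {
            b->flow[i] = 3;
        }
    }
    return pending;
}

/* Runs the stages in order until no packet is pending.  Returns the mask
 * of packets that no stage handled (0 with the stages above). */
static uint32_t
classify(struct batch *b)
{
    static stage_fn *const stages[] = {
        stage_cache, stage_wildcard, stage_slow_path
    };
    uint32_t pending;

    assert(b->n <= BATCH_MAX);
    pending = b->n == BATCH_MAX ? UINT32_MAX : (UINT32_C(1) << b->n) - 1;
    for (size_t s = 0; s < sizeof stages / sizeof stages[0] && pending; s++) {
        pending &= ~stages[s](b, pending);
    }
    return pending;
}
```

Yipeng's first point still applies to a design like this: every stage here
assumes all keys are already extracted, which is the cost the current EMC
path avoids.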
[ovs-dev] [PATCH 2/2] dpif-netdev: Prevent unsafe access when retrieving meter stats.
dpif_netdev_meter_get() retrieved a pointer to a meter entry without
holding a lock. It's possible that another thread could have deleted
that entry between retrieving the pointer and dereferencing the pointer.
This makes the function hold the lock the entire time the meter entry is
needed.

Found by inspection.

Signed-off-by: Justin Pettit
---
 lib/dpif-netdev.c | 16 ++--
 1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index 8b0b3745860b..7c0300cc554a 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -5243,20 +5243,22 @@ dpif_netdev_meter_get(const struct dpif *dpif,
                       struct ofputil_meter_stats *stats, uint16_t n_bands)
 {
     const struct dp_netdev *dp = get_dp_netdev(dpif);
-    const struct dp_meter *meter;
     uint32_t meter_id = meter_id_.uint32;
+    int retval = 0;

     if (meter_id >= MAX_METERS) {
         return EFBIG;
     }
-    meter = dp->meters[meter_id];
+
+    meter_lock(dp, meter_id);
+    const struct dp_meter *meter = dp->meters[meter_id];
     if (!meter) {
-        return ENOENT;
+        retval = ENOENT;
+        goto done;
     }
     if (stats) {
         int i = 0;

-        meter_lock(dp, meter_id);
         stats->packet_in_count = meter->packet_count;
         stats->byte_in_count = meter->byte_count;

@@ -5264,11 +5266,13 @@ dpif_netdev_meter_get(const struct dpif *dpif,
             stats->bands[i].packet_count = meter->bands[i].packet_count;
             stats->bands[i].byte_count = meter->bands[i].byte_count;
         }
-        meter_unlock(dp, meter_id);

         stats->n_bands = i;
     }
-    return 0;
+
+done:
+    meter_unlock(dp, meter_id);
+    return retval;
 }

 static int
--
2.17.1
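The shape of the fixed function (take the lock before reading the pointer,
and leave through a single unlock path) can be reduced to the following
sketch. The names struct datapath_sketch, struct meter_sketch and
meter_get_stats() are hypothetical, and pthread mutexes stand in for
meter_lock()/meter_unlock(); only the range check and the error codes mirror
the patch.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_METERS_SKETCH 16         /* hypothetical table size */

struct meter_sketch {
    uint64_t packet_count;
    uint64_t byte_count;
};

struct datapath_sketch {
    pthread_mutex_t locks[MAX_METERS_SKETCH];
    struct meter_sketch *meters[MAX_METERS_SKETCH];
};

static void
datapath_init(struct datapath_sketch *dp)
{
    for (size_t i = 0; i < MAX_METERS_SKETCH; i++) {
        pthread_mutex_init(&dp->locks[i], NULL);
        dp->meters[i] = NULL;
    }
}

/* Copies the meter's counters into '*out' (if nonnull).  The table slot is
 * read only after the lock is taken, so a concurrent delete cannot free
 * the meter between the lookup and the copy. */
static int
meter_get_stats(struct datapath_sketch *dp, uint32_t id,
                struct meter_sketch *out)
{
    int retval = 0;

    assert(dp != NULL);
    if (id >= MAX_METERS_SKETCH) {
        return EFBIG;                /* nothing locked yet */
    }

    pthread_mutex_lock(&dp->locks[id]);
    const struct meter_sketch *meter = dp->meters[id];
    if (!meter) {
        retval = ENOENT;
        goto done;
    }
    if (out) {
        *out = *meter;
    }

done:
    pthread_mutex_unlock(&dp->locks[id]);
    return retval;
}
```

The early EFBIG return is safe because it happens before the lock is taken;
every later exit goes through the done label.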
[ovs-dev] [PATCH 1/2] dpif-netdev: Don't check if xcalloc() failed when creating meter.
xcalloc() can't return null.

Signed-off-by: Justin Pettit
---
 lib/dpif-netdev.c | 64 +++
 1 file changed, 31 insertions(+), 33 deletions(-)

diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index 807a462503ee..8b0b3745860b 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -5199,44 +5199,42 @@ dpif_netdev_meter_set(struct dpif *dpif, ofproto_meter_id meter_id,
     /* Allocate meter */
     meter = xzalloc(sizeof *meter
                     + config->n_bands * sizeof(struct dp_meter_band));
-    if (meter) {
-        meter->flags = config->flags;
-        meter->n_bands = config->n_bands;
-        meter->max_delta_t = 0;
-        meter->used = time_usec();
-
-        /* set up bands */
-        for (i = 0; i < config->n_bands; ++i) {
-            uint32_t band_max_delta_t;
-
-            /* Set burst size to a workable value if none specified. */
-            if (config->bands[i].burst_size == 0) {
-                config->bands[i].burst_size = config->bands[i].rate;
-            }
-            meter->bands[i].up = config->bands[i];
-            /* Convert burst size to the bucket units: */
-            /* pkts => 1/1000 packets, kilobits => bits. */
-            meter->bands[i].up.burst_size *= 1000;
-            /* Initialize bucket to empty. */
-            meter->bands[i].bucket = 0;
-
-            /* Figure out max delta_t that is enough to fill any bucket. */
-            band_max_delta_t
-                = meter->bands[i].up.burst_size / meter->bands[i].up.rate;
-            if (band_max_delta_t > meter->max_delta_t) {
-                meter->max_delta_t = band_max_delta_t;
-            }
+    meter->flags = config->flags;
+    meter->n_bands = config->n_bands;
+    meter->max_delta_t = 0;
+    meter->used = time_usec();
+
+    /* set up bands */
+    for (i = 0; i < config->n_bands; ++i) {
+        uint32_t band_max_delta_t;
+
+        /* Set burst size to a workable value if none specified. */
+        if (config->bands[i].burst_size == 0) {
+            config->bands[i].burst_size = config->bands[i].rate;
         }

-        meter_lock(dp, mid);
-        dp_delete_meter(dp, mid); /* Free existing meter, if any */
-        dp->meters[mid] = meter;
-        meter_unlock(dp, mid);
+        meter->bands[i].up = config->bands[i];
+        /* Convert burst size to the bucket units: */
+        /* pkts => 1/1000 packets, kilobits => bits. */
+        meter->bands[i].up.burst_size *= 1000;
+        /* Initialize bucket to empty. */
+        meter->bands[i].bucket = 0;

-        return 0;
+        /* Figure out max delta_t that is enough to fill any bucket. */
+        band_max_delta_t
+            = meter->bands[i].up.burst_size / meter->bands[i].up.rate;
+        if (band_max_delta_t > meter->max_delta_t) {
+            meter->max_delta_t = band_max_delta_t;
+        }
     }
-    return ENOMEM;
+
+    meter_lock(dp, mid);
+    dp_delete_meter(dp, mid); /* Free existing meter, if any */
+    dp->meters[mid] = meter;
+    meter_unlock(dp, mid);
+
+    return 0;
 }

 static int
--
2.17.1
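Why the null check is dead code can be seen from a sketch of an "x"
allocator. This is not the OVS implementation: xzalloc_sketch(), struct
band_sketch, struct meter_sketch and meter_alloc() are hypothetical, and the
sketch assumes only the usual convention that such wrappers abort on
allocation failure instead of returning NULL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Returns zeroed memory or does not return at all. */
static void *
xzalloc_sketch(size_t size)
{
    void *p = calloc(1, size ? size : 1);
    if (!p) {
        fputs("out of memory\n", stderr);
        abort();
    }
    return p;
}

struct band_sketch {
    unsigned rate;
    unsigned burst_size;
};

struct meter_sketch {
    unsigned n_bands;
    struct band_sketch bands[];      /* flexible array, one per band */
};

/* Allocates a meter with room for 'n_bands' bands.  No failure path is
 * needed because xzalloc_sketch() never returns NULL. */
static struct meter_sketch *
meter_alloc(unsigned n_bands)
{
    struct meter_sketch *m
        = xzalloc_sketch(sizeof *m + n_bands * sizeof m->bands[0]);

    assert(m->n_bands == 0);         /* memory arrives zeroed */
    m->n_bands = n_bands;
    return m;
}
```

With the failure handled inside the allocator, the caller's "if (meter)"
branch and the ENOMEM return can never be reached, which is what the patch
removes.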
Re: [ovs-dev] Unable to recieve tagged packets on OVS bridge
On Wed, Aug 29, 2018 at 12:10:03AM +0200, Alvaro Jimenez wrote:
> I'm trying to receive tagged packets and route them using ovs depending on
> the VLAN tag value. I created and added eth0.101, eth0.201 and eth0.301 to
> the ovs bridge. I'm able to receive packets in those interfaces but there's
> no VLAN tag when I tcpdump them. Moreover, when I tcpdump eth0 interface I
> am able to see the tagged frames. Are the vlan subinterfaces popping the
> VLAN tag?

Yes.

> Is there any other way to receive the tagged frames on the
> openvswitch?

Yes. The documentation and the FAQ talk about VLANs a lot. Did you read
ovs-vsctl(8)? Did you read the FAQ chapter about VLANs?
http://docs.openvswitch.org/en/latest/faq/vlan/
[ovs-dev] Unable to recieve tagged packets on OVS bridge
Hi everyone,

I'm trying to receive tagged packets and route them using ovs depending on
the VLAN tag value. I created and added eth0.101, eth0.201 and eth0.301 to
the ovs bridge. I'm able to receive packets in those interfaces but there's
no VLAN tag when I tcpdump them. Moreover, when I tcpdump the eth0 interface
I am able to see the tagged frames. Are the vlan subinterfaces popping the
VLAN tag? Is there any other way to receive the tagged frames on the
openvswitch?

Thanks in advance.

Álvaro
Re: [ovs-dev] [PATCH v2] debian: Move libovn out from package libopenvswitch.
Thanks Han and aginwala. I applied this to master.

On Mon, Aug 27, 2018 at 07:34:50PM -0700, aginwala wrote:
> Tested-by: aginwala
>
> On Mon, Aug 27, 2018 at 9:42 AM Ben Pfaff wrote:
>
> > On Fri, Aug 24, 2018 at 06:07:24PM -0700, Han Zhou wrote:
> > > From: Han Zhou
> > >
> > > Since we are packaging OVN and OVS components separately, libovn
> > > shouldn't belong to OVS, so move it to ovn-common. Also, remove
> > > it from libopenvswitch-dev.
> > >
> > > Signed-off-by: Han Zhou
> >
> > This is one where I'd appreciate a Tested-by: from someone. Anyone out
> > there? :-)
[ovs-dev] [PATCH v3 6/6] system-dpdk: Connect network namespaces via dpdkvhostuser ports
This adds a new test to the 'check-dpdk' subsystem that will exercise
allocations, PMDs, and the vhost-user code path.

Signed-off-by: Bala Sankaran
Co-authored-by: Aaron Conole
Signed-off-by: Aaron Conole
---
 tests/system-dpdk.at | 77
 1 file changed, 77 insertions(+)

diff --git a/tests/system-dpdk.at b/tests/system-dpdk.at
index 58dc8aaae..914a1b644 100644
--- a/tests/system-dpdk.at
+++ b/tests/system-dpdk.at
@@ -1,3 +1,6 @@
+m4_define([CONFIGURE_VETH_OFFLOADS],
+   [AT_CHECK([ethtool -K $1 tx off], [0], [ignore], [ignore])])
+
 AT_BANNER([OVS-DPDK unit tests])

 dnl --
@@ -74,3 +77,77 @@ OVS_VSWITCHD_STOP(["\@does not exist. The Open vSwitch kernel module is probably
 \@EAL: No free hugepages reported in hugepages-1048576kB@d"])
 AT_CLEANUP
 dnl --
+
+
+
+dnl --
+dnl Ping vhost-user-client port
+AT_SETUP([OVS-DPDK datapath - ping vhost-user-client ports])
+AT_KEYWORDS([dpdk])
+OVS_DPDK_PRE_CHECK()
+OVS_DPDK_START()
+
+dnl Add userspace bridge and attach it to OVS
+AT_CHECK([ovs-vsctl add-br br10 -- set bridge br10 datapath_type=netdev])
+AT_CHECK([ovs-vsctl add-port br10 vhu0 -- set Interface vhu0 \
+          type=dpdkvhostuser], [],
+         [stdout], [stderr])
+AT_CHECK([ovs-vsctl show], [], [stdout])
+
+dnl Parse log file
+AT_CHECK([grep "VHOST_CONFIG: vhost-user server: socket created" \
+          ovs-vswitchd.log], [], [stdout])
+AT_CHECK([grep "Socket $OVS_RUNDIR/vhu0 created for vhost-user port vhu0" \
+          ovs-vswitchd.log], [], [stdout])
+AT_CHECK([grep "VHOST_CONFIG: bind to $OVS_RUNDIR/vhu0" ovs-vswitchd.log], [],
+         [stdout])
+
+dnl Set up namespaces
+ADD_NAMESPACES(ns1, ns2)
+
+dnl execute testpmd in background
+on_exit "pkill -f -x -9 'tail -f /dev/null'"
+tail -f /dev/null | testpmd --socket-mem=512 \
+                            --vdev="net_virtio_user,path=$OVS_RUNDIR/vhu0" \
+                            --vdev="net_tap0,iface=tap0" --file-prefix page0 \
+                            --single-file-segments -- -a >$OVS_RUNDIR/testpmd-vhu0.log 2>&1 &
+
+dnl add veth device
+ADD_VETH(tap1, ns2, br10, "172.31.110.12/24")
+
+dnl give settling time to the testpmd processes - NOTE: this is bad form.
+sleep 10
+
+dnl move the tap devices to the namespaces
+AT_CHECK([ps aux | grep testpmd], [], [stdout], [stderr])
+AT_CHECK([ip link show], [], [stdout], [stderr])
+AT_CHECK([ip link set tap0 netns ns1], [], [stdout], [stderr])
+
+AT_CHECK([ip netns exec ns1 ip link show], [], [stdout], [stderr])
+AT_CHECK([ip netns exec ns1 ip link show | grep tap0], [], [stdout], [stderr])
+AT_CHECK([ip netns exec ns1 ip link set tap0 up], [], [stdout], [stderr])
+AT_CHECK([ip netns exec ns1 ip addr add 172.31.110.11/24 dev tap0], [],
+         [stdout], [stderr])
+
+AT_CHECK([ip netns exec ns1 ip link show], [], [stdout], [stderr])
+AT_CHECK([ip netns exec ns2 ip link show], [], [stdout], [stderr])
+AT_CHECK([ip netns exec ns1 arping -c 4 -I tap0 172.31.110.12], [], [stdout],
+         [stderr])
+
+dnl clean up the testpmd now
+pkill -f -x -9 'tail -f /dev/null'
+
+dnl Clean up
+AT_CHECK([ovs-vsctl del-port br10 vhu0], [], [stdout], [stderr])
+OVS_VSWITCHD_STOP(["\@does not exist. The Open vSwitch kernel module is probably not loaded.@d
+\@Failed to enable flow control@d
+\@VHOST_CONFIG: recvmsg failed@d
+\@VHOST_CONFIG: failed to connect to $OVS_RUNDIR/vhu0: No such file or directory@d
+\@Global register is changed during@d
+\@dpdkvhostuser ports are considered deprecated; please migrate to dpdkvhostuserclient ports.@d
+\@failed to enumerate system datapaths: No such file or directory@d
+\@EAL: Invalid NUMA socket, default to 0@d
+\@EAL: WARNING: cpu flags constant_tsc=yes nonstop_tsc=no -> using unreliable clock cycles !@d
+\@EAL: No free hugepages reported in hugepages-1048576kB@d"])
+AT_CLEANUP
+dnl --
--
2.17.1
[ovs-dev] [PATCH v3 5/6] system-dpdk: Convert /tmp to use OVS_RUNDIR
When multiple users run the DPDK testsuite their dependence on /tmp will
cause conflicts. Use the RUNDIR as a dynamic path to overcome this.

NOTE: This still doesn't solve the dependency on /var/run that DPDK
requires.

Signed-off-by: Bala Sankaran
Co-authored-by: Aaron Conole
Signed-off-by: Aaron Conole
---
 tests/system-dpdk.at | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/tests/system-dpdk.at b/tests/system-dpdk.at
index 834ba06fb..58dc8aaae 100644
--- a/tests/system-dpdk.at
+++ b/tests/system-dpdk.at
@@ -54,20 +54,20 @@ OVS_DPDK_START()

 dnl Add userspace bridge and attach it to OVS
 AT_CHECK([ovs-vsctl add-br br10 -- set bridge br10 datapath_type=netdev])
-AT_CHECK([ovs-vsctl add-port br10 dpdkvhostuserclient0 -- set Interface dpdkvhostuserclient0 type=dpdkvhostuserclient options:vhost-server-path=/tmp/dpdkvhostclient0], [], [stdout], [stderr])
+AT_CHECK([ovs-vsctl add-port br10 dpdkvhostuserclient0 -- set Interface dpdkvhostuserclient0 type=dpdkvhostuserclient options:vhost-server-path=$OVS_RUNDIR/dpdkvhostclient0], [], [stdout], [stderr])
 AT_CHECK([ovs-vsctl show], [], [stdout])
 sleep 2

 dnl Parse log file
 AT_CHECK([grep "VHOST_CONFIG: vhost-user client: socket created" ovs-vswitchd.log], [], [stdout])
 AT_CHECK([grep "vHost User device 'dpdkvhostuserclient0' created in 'client' mode, using client socket" ovs-vswitchd.log], [], [stdout])
-AT_CHECK([grep "VHOST_CONFIG: /tmp/dpdkvhostclient0: reconnecting..." ovs-vswitchd.log], [], [stdout])
+AT_CHECK([grep "VHOST_CONFIG: $OVS_RUNDIR/dpdkvhostclient0: reconnecting..." ovs-vswitchd.log], [], [stdout])

 dnl Clean up
 AT_CHECK([ovs-vsctl del-port br10 dpdkvhostuserclient0], [], [stdout], [stderr])
 OVS_VSWITCHD_STOP(["\@does not exist. The Open vSwitch kernel module is probably not loaded.@d
 \@Failed to enable flow control@d
-\@VHOST_CONFIG: failed to connect to /tmp/dpdkvhostclient0: No such file or directory@d
+\@VHOST_CONFIG: failed to connect to $OVS_RUNDIR/dpdkvhostclient0: No such file or directory@d
 \@Global register is changed during@d
 \@EAL: Invalid NUMA socket, default to 0@d
 \@EAL: WARNING: cpu flags constant_tsc=yes nonstop_tsc=no -> using unreliable clock cycles !@d
--
2.17.1
[ovs-dev] [PATCH v3 4/6] system-dpdk: Use a different character marker for sed commands
From: Aaron Conole

The default marker for sed commands according to the manual is /, but this
is inconvenient when working with paths. The solution is either to escape
all instances of / or use sed's \cREGEXc feature.

Signed-off-by: Aaron Conole
---
 tests/system-dpdk.at | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/tests/system-dpdk.at b/tests/system-dpdk.at
index 723ba794f..834ba06fb 100644
--- a/tests/system-dpdk.at
+++ b/tests/system-dpdk.at
@@ -65,12 +65,12 @@ AT_CHECK([grep "VHOST_CONFIG: /tmp/dpdkvhostclient0: reconnecting..." ovs-vswitc

 dnl Clean up
 AT_CHECK([ovs-vsctl del-port br10 dpdkvhostuserclient0], [], [stdout], [stderr])
-OVS_VSWITCHD_STOP(["/does not exist. The Open vSwitch kernel module is probably not loaded./d
-/Failed to enable flow control/d
-/failed to connect to \/tmp\/dpdkvhostclient0: No such file or directory/d
-/Global register is changed during/d
-/EAL: Invalid NUMA socket, default to 0/d
-/EAL: WARNING: cpu flags constant_tsc=yes nonstop_tsc=no -> using unreliable clock cycles !/d
-/EAL: No free hugepages reported in hugepages-1048576kB/d"])
+OVS_VSWITCHD_STOP(["\@does not exist. The Open vSwitch kernel module is probably not loaded.@d
+\@Failed to enable flow control@d
+\@VHOST_CONFIG: failed to connect to /tmp/dpdkvhostclient0: No such file or directory@d
+\@Global register is changed during@d
+\@EAL: Invalid NUMA socket, default to 0@d
+\@EAL: WARNING: cpu flags constant_tsc=yes nonstop_tsc=no -> using unreliable clock cycles !@d
+\@EAL: No free hugepages reported in hugepages-1048576kB@d"])
 AT_CLEANUP
 dnl --
--
2.17.1
[ovs-dev] [PATCH v3 3/6] system-dpdk: Allow running the dpdk tests from a VM
From: Aaron Conole

Some VM configurations result in CPU flags that cause warnings to be issued
by the DPDK libraries. When these warnings are issued, the tests will fail.

This commit adds the unreliable tsc warning to the list of ignored
warnings.

Signed-off-by: Aaron Conole
---
 tests/system-dpdk.at | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/tests/system-dpdk.at b/tests/system-dpdk.at
index c1c908411..723ba794f 100644
--- a/tests/system-dpdk.at
+++ b/tests/system-dpdk.at
@@ -11,6 +11,7 @@ AT_CHECK([grep "EAL" ovs-vswitchd.log], [], [stdout])
 AT_CHECK([grep "DPDK Enabled - initialized" ovs-vswitchd.log], [], [stdout])
 OVS_VSWITCHD_STOP(["/Global register is changed during/d
 /EAL: Invalid NUMA socket, default to 0/d
+/EAL: WARNING: cpu flags constant_tsc=yes nonstop_tsc=no -> using unreliable clock cycles !/d
 /EAL: No free hugepages reported in hugepages-1048576kB/d"])
 AT_CLEANUP
 dnl --
@@ -36,6 +37,7 @@ AT_CHECK([ovs-vsctl del-port br10 phy0], [], [stdout], [stderr])
 OVS_VSWITCHD_STOP("/does not exist. The Open vSwitch kernel module is probably not loaded./d
 /Failed to enable flow control/d
 /Global register is changed during/d
+/EAL: WARNING: cpu flags constant_tsc=yes nonstop_tsc=no -> using unreliable clock cycles !/d
 /EAL: No free hugepages reported in hugepages-1048576kB/d
 ")
 AT_CLEANUP
@@ -68,6 +70,7 @@ OVS_VSWITCHD_STOP(["/does not exist. The Open vSwitch kernel module is probably
 /failed to connect to \/tmp\/dpdkvhostclient0: No such file or directory/d
 /Global register is changed during/d
 /EAL: Invalid NUMA socket, default to 0/d
+/EAL: WARNING: cpu flags constant_tsc=yes nonstop_tsc=no -> using unreliable clock cycles !/d
 /EAL: No free hugepages reported in hugepages-1048576kB/d"])
 AT_CLEANUP
 dnl --
--
2.17.1
[ovs-dev] [PATCH v3 1/6] system-dpdk: update test suite for non-phy testing
From: Aaron Conole

This allows a system that doesn't have a dedicated DPDK nic to execute some
DPDK tests. In this fashion, tests that operate on virtual ports (such as
dpdkvhostuserclient) can be executed in a wider set of environments.

Signed-off-by: Aaron Conole
---
 tests/system-dpdk-macros.at | 18 +++---
 tests/system-dpdk.at        | 16
 2 files changed, 23 insertions(+), 11 deletions(-)

diff --git a/tests/system-dpdk-macros.at b/tests/system-dpdk-macros.at
index 0762ee055..2e5571fc4 100644
--- a/tests/system-dpdk-macros.at
+++ b/tests/system-dpdk-macros.at
@@ -2,7 +2,6 @@
 #
 # Check prerequisites for DPDK tests. Following settings are checked:
 #  - Hugepages
-#  - UIO driver
 #
 m4_define([OVS_DPDK_PRE_CHECK],
   [dnl Check Hugepages
@@ -11,13 +10,26 @@ m4_define([OVS_DPDK_PRE_CHECK],
    AT_CHECK([mount], [], [stdout])
    AT_CHECK([grep 'hugetlbfs' stdout], [], [stdout], [])

+])
+
+
+# OVS_DPDK_PRE_PHY_SKIP()
+#
+# Skip any phy related tests if the PHY variable is not set.
+# This is done by checking for a bound driver.
+#
+m4_define([OVS_DPDK_PRE_PHY_SKIP],
+  [dnl Perform the precheck
+   OVS_DPDK_PRE_CHECK()
+
    dnl Check if VFIO or UIO driver is loaded
-   AT_CHECK([lsmod | grep -E "igb_uio|vfio"], [], [stdout])
+   AT_SKIP_IF([ ! (lsmod | grep -E "igb_uio|vfio") ], [], [stdout])

    dnl Find PCI address candidate, skip if there is no DPDK-compatible NIC
    AT_CHECK([$DPDK_DIR/usertools/dpdk-devbind.py -s | head -n +4 | tail -1], [], [stdout])
    AT_CHECK([cat stdout | cut -d" " -s -f1 > PCI_ADDR])
-   AT_CHECK([test -s PCI_ADDR || exit 77])
+   AT_SKIP_IF([ ! test -s PCI_ADDR ])
+
 ])


diff --git a/tests/system-dpdk.at b/tests/system-dpdk.at
index 3d21b0136..6901d19e6 100644
--- a/tests/system-dpdk.at
+++ b/tests/system-dpdk.at
@@ -4,14 +4,14 @@ dnl --
 dnl Check if EAL init is successfull
 AT_SETUP([OVS-DPDK datapath - EAL init])
 AT_KEYWORDS([dpdk])
-dnl OVS_DPDK_PRE_CHECK()
+OVS_DPDK_PRE_CHECK()
 OVS_DPDK_START()
 AT_CHECK([grep "DPDK Enabled - initializing..." ovs-vswitchd.log], [], [stdout])
 AT_CHECK([grep "EAL" ovs-vswitchd.log], [], [stdout])
 AT_CHECK([grep "DPDK Enabled - initialized" ovs-vswitchd.log], [], [stdout])
-OVS_VSWITCHD_STOP("/Global register is changed during/d
-/EAL: No free hugepages reported in hugepages-1048576kB/d
-")
+OVS_VSWITCHD_STOP(["/Global register is changed during/d
+/EAL: Invalid NUMA socket, default to 0/d
+/EAL: No free hugepages reported in hugepages-1048576kB/d"])
 AT_CLEANUP
 dnl --

@@ -22,7 +22,7 @@ dnl Add standard DPDK PHY port
 AT_SETUP([OVS-DPDK datapath - add standard DPDK port])
 AT_KEYWORDS([dpdk])

-OVS_DPDK_PRE_CHECK()
+OVS_DPDK_PRE_PHY_SKIP()
 OVS_DPDK_START()

 dnl Add userspace bridge and attach it to OVS
@@ -63,11 +63,11 @@ AT_CHECK([grep "VHOST_CONFIG: /tmp/dpdkvhostclient0: reconnecting..." ovs-vswitc

 dnl Clean up
 AT_CHECK([ovs-vsctl del-port br10 dpdkvhostuserclient0], [], [stdout], [stderr])
-OVS_VSWITCHD_STOP("/does not exist. The Open vSwitch kernel module is probably not loaded./d
+OVS_VSWITCHD_STOP(["/does not exist. The Open vSwitch kernel module is probably not loaded./d
 /Failed to enable flow control/d
 /failed to connect to \/tmp\/dpdkvhostclient0: No such file or directory/d
 /Global register is changed during/d
-/EAL: No free hugepages reported in hugepages-1048576kB/d
-")
+/EAL: Invalid NUMA socket, default to 0/d
+/EAL: No free hugepages reported in hugepages-1048576kB/d"])
 AT_CLEANUP
 dnl --
--
2.17.1
[ovs-dev] [PATCH v3 2/6] system-dpdk: skip all tests if there are no hugepages
A failure is quite harsh in this scenario. It's better to simply skip all the tests and let the user look at the logs to understand the missing hugepages. Signed-off-by: Bala Sankaran Co-authored-by: Aaron Conole Signed-off-by: Aaron Conole --- tests/system-dpdk-macros.at | 2 +- tests/system-dpdk.at| 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/tests/system-dpdk-macros.at b/tests/system-dpdk-macros.at index 2e5571fc4..f772a1945 100644 --- a/tests/system-dpdk-macros.at +++ b/tests/system-dpdk-macros.at @@ -6,7 +6,7 @@ m4_define([OVS_DPDK_PRE_CHECK], [dnl Check Hugepages AT_CHECK([cat /proc/meminfo], [], [stdout]) - AT_CHECK([grep HugePages_ stdout], [], [stdout]) + AT_SKIP_IF([egrep 'HugePages_Free: *0' stdout], [], [stdout]) AT_CHECK([mount], [], [stdout]) AT_CHECK([grep 'hugetlbfs' stdout], [], [stdout], []) diff --git a/tests/system-dpdk.at b/tests/system-dpdk.at index 6901d19e6..c1c908411 100644 --- a/tests/system-dpdk.at +++ b/tests/system-dpdk.at @@ -47,7 +47,7 @@ dnl -- dnl Add vhost-user-client port AT_SETUP([OVS-DPDK datapath - add vhost-user-client port]) AT_KEYWORDS([dpdk]) - +OVS_DPDK_PRE_CHECK() OVS_DPDK_START() dnl Add userspace bridge and attach it to OVS -- 2.17.1 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
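The skip condition in this patch matches `HugePages_Free: *0` in the output of /proc/meminfo. As a rough illustration of the same decision outside of autotest, here is a minimal C sketch. The helper names `hugepages_free_from_meminfo` and `should_skip_dpdk_tests` are hypothetical and not part of OVS, and the sketch parses a meminfo-style buffer rather than reading the file itself.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not an OVS function): returns the HugePages_Free
 * count found in a /proc/meminfo-style buffer, or -1 if the field is
 * absent or malformed. */
static long
hugepages_free_from_meminfo(const char *meminfo)
{
    static const char key[] = "HugePages_Free:";
    const char *p = strstr(meminfo, key);
    long value;

    if (!p) {
        return -1;
    }
    /* "%ld" skips the padding spaces that follow the colon. */
    if (sscanf(p + strlen(key), "%ld", &value) != 1) {
        return -1;
    }
    return value;
}

/* Mirrors the intent of the patch: skip (rather than fail) only when the
 * field is present and reports zero free hugepages. */
static int
should_skip_dpdk_tests(const char *meminfo)
{
    return hugepages_free_from_meminfo(meminfo) == 0;
}
```

As with the grep in the patch, a buffer without a HugePages_Free field does not trigger a skip; the later hugetlbfs mount check is left to catch that case.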
[ovs-dev] [PATCH v3 0/6] system-dpdk: Add support to connect two namespaces
This allows the system-dpdk test suite to ping between two namespaces via a
veth and a dpdkvhostuser port, using testpmd as a forwarding agent. For the
initial test, the testpmd included with 18.11-rc0 is used, while OVS is linked
against DPDK 17.11 LTS.

Some additional enhancements are added to the DPDK test suite to make it
easier to use.

v3:
* Added documentation for patches 1 and 2
* Fixed a commit message typo for patch 5
* Replaced vhu0 with dpdkvhostuser0 and capitalized the first character
  after each dnl

Aaron Conole (3):
  system-dpdk: update test suite for non-phy testing
  system-dpdk: Allow running the dpdk tests from a VM
  system-dpdk: Use a different character marker for sed commands

Bala Sankaran (3):
  system-dpdk: skip all tests if there are no hugepages
  system-dpdk: Convert /tmp to use OVS_RUNDIR
  system-dpdk: Connect network namespaces via dpdkvhostuser ports

 tests/system-dpdk-macros.at |  20 +--
 tests/system-dpdk.at        | 108 +++-
 2 files changed, 110 insertions(+), 18 deletions(-)

--
2.17.1
Re: [ovs-dev] OVS DPDK Latest & HWOL Branches
On Tue, Aug 28, 2018 at 11:32:31AM +0100, Ian Stokes wrote:
> On 8/27/2018 5:16 PM, Ben Pfaff wrote:
> >On Mon, Aug 27, 2018 at 04:05:39PM +, Ophir Munk wrote:
> >>4. How can I inspect the new branches? Currently I am not seeing them.
> >
> >I do not think that Ian has created the new branches yet.
>
> I can create these today.
>
> There was some discussion as regards the branch names. Before creating them
> are people happy with 'branch-dpdk-latest' and 'branch-dpdk-hwol' ?
>
> If there are no objections then I'll go ahead with these today.

I'd leave out the "branch-" prefixes. The existing branches only have those
prefixes because e.g. "2.10" seemed too easy to confuse with a release name.
[ovs-dev] [PATCH v2] datapath-windows: Add support to configure ct zone limits
This patch implements limiting conntrack entries per zone using dpctl commands.

Example:
ovs-appctl dpctl/ct-set-limits default=5 zone=1,limit=2 zone=1,limit=3
ovs-appctl dpctl/ct-del-limits zone=4
ovs-appctl dpctl/ct-get-limits zone=1,2,3

- Also update the netlink-socket.c to support netlink family
  'OVS_WIN_NL_CTLIMIT_FAMILY_ID' for conntrack zone limit.

Signed-off-by: Anand Kumar

v1->v2:
- Use spinlock to guard against multiple access.
- Use Interlock api to update zone counters.
- Address review comments.
--- datapath-windows/include/OvsDpInterfaceExt.h | 1 + datapath-windows/ovsext/Conntrack.c | 167 ++- datapath-windows/ovsext/Conntrack.h | 12 ++ datapath-windows/ovsext/Datapath.c | 34 +- lib/netlink-socket.c | 5 + 5 files changed, 216 insertions(+), 3 deletions(-) diff --git a/datapath-windows/include/OvsDpInterfaceExt.h b/datapath-windows/include/OvsDpInterfaceExt.h index db91c3e..5fd8000 100644 --- a/datapath-windows/include/OvsDpInterfaceExt.h +++ b/datapath-windows/include/OvsDpInterfaceExt.h @@ -72,6 +72,7 @@ */ #define OVS_WIN_NL_CT_FAMILY_ID (NLMSG_MIN_TYPE + 7) +#define OVS_WIN_NL_CTLIMIT_FAMILY_ID (NLMSG_MIN_TYPE + 8) #define OVS_WIN_NL_INVALID_MCGRP_ID 0 #define OVS_WIN_NL_MCGRP_START_ID100 diff --git a/datapath-windows/ovsext/Conntrack.c b/datapath-windows/ovsext/Conntrack.c index dd16602..d0900bd 100644 --- a/datapath-windows/ovsext/Conntrack.c +++ b/datapath-windows/ovsext/Conntrack.c @@ -27,13 +27,17 @@ #define WINDOWS_TICK 1000 #define SEC_TO_UNIX_EPOCH 11644473600LL #define SEC_TO_NANOSEC 10LL +#define CT_MAX_ZONE UINT16_MAX + 1 KSTART_ROUTINE OvsConntrackEntryCleaner; static PLIST_ENTRY ovsConntrackTable; static OVS_CT_THREAD_CTX ctThreadCtx; static PNDIS_RW_LOCK_EX *ovsCtBucketLock = NULL; +static NDIS_SPIN_LOCK ovsCtZoneLock; +static POVS_CT_ZONE_INFO zoneInfo = NULL; extern POVS_SWITCH_CONTEXT gOvsSwitchContext; static ULONG ctTotalEntries; +static ULONG defaultCtLimit; static __inline OvsCtFlush(UINT16 zone, struct ovs_key_ct_tuple_ipv4
*tuple); static __inline NDIS_STATUS @@ -94,6 +98,20 @@ OvsInitConntrack(POVS_SWITCH_CONTEXT context) ZwClose(threadHandle); threadHandle = NULL; +zoneInfo = OvsAllocateMemoryWithTag(sizeof(OVS_CT_ZONE_INFO) * +CT_MAX_ZONE, OVS_CT_POOL_TAG); +if (zoneInfo == NULL) { +status = STATUS_INSUFFICIENT_RESOURCES; +goto freeBucketLock; +} + +NdisAllocateSpinLock(&ovsCtZoneLock); +defaultCtLimit = CT_MAX_ENTRIES; +for (UINT32 i = 0; i < CT_MAX_ZONE; i++) { +zoneInfo[i].entries = 0; +zoneInfo[i].limit = defaultCtLimit; +} + status = OvsNatInit(); if (status != STATUS_SUCCESS) { @@ -149,6 +167,25 @@ OvsCleanupConntrack(VOID) OvsFreeMemoryWithTag(ovsCtBucketLock, OVS_CT_POOL_TAG); ovsCtBucketLock = NULL; OvsNatCleanup(); +NdisFreeSpinLock(&ovsCtZoneLock); +if (zoneInfo) { +OvsFreeMemoryWithTag(zoneInfo, OVS_CT_POOL_TAG); +} +} + +VOID +OvsCtSetZoneLimit(int zone, ULONG value) { +NdisAcquireSpinLock(&ovsCtZoneLock); +if (zone == -1) { +/* Set default limit for all zones. */ +defaultCtLimit = value; +for (UINT32 i = 0; i < CT_MAX_ZONE; i++) { +zoneInfo[i].limit = value; +} +} else { +zoneInfo[(UINT16)zone].limit = value; +} +NdisReleaseSpinLock(&ovsCtZoneLock); } /* @@ -263,6 +300,7 @@ OvsCtAddEntry(POVS_CT_ENTRY entry, &entry->link); NdisInterlockedIncrement((PLONG)&ctTotalEntries); +NdisInterlockedIncrement((PLONG)&zoneInfo[ctx->key.zone].entries); NdisReleaseRWLock(ovsCtBucketLock[bucketIdx], &lockState); return TRUE; } @@ -437,6 +475,7 @@ OvsCtEntryDelete(POVS_CT_ENTRY entry, BOOLEAN forceDelete) if (entry->natInfo.natAction) { OvsNatDeleteKey(&entry->key); } +NdisInterlockedDecrement((PLONG)&zoneInfo[entry->key.zone].entries); OvsPostCtEventEntry(entry, OVS_EVENT_CT_DELETE); RemoveEntryList(&entry->link); OVS_RELEASE_SPIN_LOCK(&(entry->lock), irql); @@ -877,12 +916,16 @@ OvsCtExecute_(OvsForwardingContext *fwdCtx, &entryCreated); } else { -if (commit && ctTotalEntries >= CT_MAX_ENTRIES) { +if (commit && (ctTotalEntries >= CT_MAX_ENTRIES || +zoneInfo[ctx.key.zone].entries >= 
zoneInfo[ctx.key.zone].limit)) { /* Don't proceed with processing if the max limit has been hit. * This blocks only new entries from being created and doesn't * affect existing connections. */ -OVS_LOG_ERROR("Conntrack Limit hit: %lu", ctTotalEntries); +OVS_LOG_ERROR("Conntrack Limit hit: zone(%u), zoneLimit(%lu)," +
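To make the per-zone accounting in this patch easier to follow, here is a minimal portable C sketch of the same idea: a per-zone entry counter and limit, a "-1 means all zones" setter, and a commit-time check against both the global and the per-zone limit. It is only an illustration under stated assumptions. C11 atomics stand in for the NDIS interlocked calls, the spinlock around limit updates is omitted, `SKETCH_MAX_ENTRIES` is a placeholder rather than the real CT_MAX_ENTRIES value, and all function names are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Parenthesized so that the macro composes safely inside expressions such
 * as sizeof(x) * CT_MAX_ZONE. */
#define CT_MAX_ZONE (UINT16_MAX + 1)
#define SKETCH_MAX_ENTRIES 1000UL   /* Placeholder global limit. */

struct zone_info {
    atomic_ulong entries;   /* Current number of entries in the zone. */
    unsigned long limit;    /* Maximum entries allowed in the zone. */
};

static struct zone_info zones[CT_MAX_ZONE];
static atomic_ulong total_entries;

/* Resets all counters and applies one default limit to every zone. */
static void
zone_limits_init(unsigned long default_limit)
{
    atomic_store(&total_entries, 0);
    for (int i = 0; i < CT_MAX_ZONE; i++) {
        atomic_store(&zones[i].entries, 0);
        zones[i].limit = default_limit;
    }
}

/* A zone of -1 sets the limit for all zones, as in the patch. */
static void
zone_set_limit(int zone, unsigned long value)
{
    if (zone == -1) {
        for (int i = 0; i < CT_MAX_ZONE; i++) {
            zones[i].limit = value;
        }
    } else {
        zones[(uint16_t) zone].limit = value;
    }
}

/* Returns false if either limit is already reached; otherwise accounts for
 * one new entry.  Existing entries are never affected. */
static bool
zone_try_add(uint16_t zone)
{
    if (atomic_load(&total_entries) >= SKETCH_MAX_ENTRIES
        || atomic_load(&zones[zone].entries) >= zones[zone].limit) {
        return false;
    }
    atomic_fetch_add(&total_entries, 1);
    atomic_fetch_add(&zones[zone].entries, 1);
    return true;
}

/* Accounts for the deletion of one entry in the zone. */
static void
zone_remove(uint16_t zone)
{
    atomic_fetch_sub(&zones[zone].entries, 1);
    atomic_fetch_sub(&total_entries, 1);
}
```

As in the patch, the limit check and the increment are separate steps, so concurrent commits could briefly overshoot a limit; the sketch does not try to close that window.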
Re: [ovs-dev] OVS DPDK Latest & HWOL Branches
Ian Stokes writes:

> On 8/27/2018 5:16 PM, Ben Pfaff wrote:
>> I can help with some of these.
>>
>> On Mon, Aug 27, 2018 at 04:05:39PM +, Ophir Munk wrote:
>>> Ian, can you please specify the practical steps regarding the new branches?
>>> Specifically, what is the procedure for adding a new patch for
>>> either of the branches (OVS DPDK latest or HWOL)?
>>> 1. What should the patch title include?
>>
>> I guess this is up to Ian, although he should coordinate with Aaron to
>> make sure that the patch robot understands too.
>
> We discussed this at the community call last week but it's good to
> raise it on the ML again for wider input.
>
> The process we would follow is that patches would be submitted to
> d...@openvswitch.org.
>
> As the patches affect a particular branch and not master,
> contributors should submit the change with the target branch listed in
> the subject line of the patch.
>
> The git format-patch argument --subject-prefix may be used when
> posting the patch, for example:
>
> $ git format-patch HEAD --subject-prefix="PATCH branch-dpdk-latest"
>
> or
>
> $ git format-patch HEAD --subject-prefix="PATCH branch-dpdk-hwol"
>
> @Aaron, would it be possible to set up the 0-day robot to recognize
> patches with the above subject header and apply/build the corresponding
> branch?

Absolutely do-able. I'm getting ready for DPDK Userspace summit, so it
*might* get delayed a bit, but I'll have it implemented asap.

>>
>>> 2. Who is going to merge a new patch (as well as ongoing master
>>> branch updates) into the relevant branch?
>>
>> I expect that Ian will be the only one pushing to the new branches.
>
> Yes, I'll handle merging the patches as well as the typical validation
> via vsperf to help identify any issues a patch may cause to
> performance or functionality.
>
>>
>>> 3. Can I have write permissions in the new branches?
>>
>> I'd prefer to have just Ian doing this work for now.
>>
>>> 4. How can I inspect the new branches? Currently I am not seeing them.
>>
>> I do not think that Ian has created the new branches yet.
>
> I can create these today.
>
> There was some discussion as regards the branch names. Before creating
> them are people happy with 'branch-dpdk-latest' and 'branch-dpdk-hwol'
> ?
>
> If there are no objections then I'll go ahead with these today.
>
> Ian
[ovs-dev] [PATCH v4 2/2] NEWS: Add entry for pmd-rxq-assign.
Signed-off-by: Kevin Traynor Acked-by: Eelco Chaudron --- NEWS | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/NEWS b/NEWS index 33b4d8a..33b3638 100644 --- a/NEWS +++ b/NEWS @@ -9,5 +9,7 @@ Post-v2.10.0 - ovn: * ovn-ctl: allow passing user:group ids to the OVN daemons. - + - DPDK: + * Add option for simple round-robin based Rxq to PMD assignment. + It can be set with pmd-rxq-assign. v2.10.0 - xx xxx -- 1.8.3.1 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH v4 1/2] dpif-netdev: Add round-robin based rxq to pmd assignment.
Prior to OVS 2.9 automatic assignment of Rxqs to PMDs (i.e. CPUs) was done by round-robin. That was changed in OVS 2.9 to ordering the Rxqs based on their measured processing cycles. This was to assign the busiest Rxqs to different PMDs, improving aggregate throughput. For the most part the new scheme should be better, but there could be situations where a user prefers a simple round-robin scheme because Rxqs from a single port are more likely to be spread across multiple PMDs, and/or traffic is very bursty/unpredictable. Add 'pmd-rxq-assign' config to allow a user to select round-robin based assignment. Signed-off-by: Kevin Traynor Acked-by: Eelco Chaudron --- V4: - Modified warning log (Ilya) V3: - Rolled in some style and vswitch.xml changes (Ilya) - Set cycles mode by default on wrong config (Ilya) V2: - simplified nextpmd change (Eelco) - removed confusing doc sentence (Eelco) - fixed commit msg (Ilya) - made change in pmd-rxq-assign value also perform re-assignment (Ilya) - renamed to roundrobin mode (Ilya) - moved vswitch.xml changes to right config section (Ilya) - comment/log updates - moved NEWS update to separate patch as it's been changing on master Documentation/topics/dpdk/pmd.rst | 33 +--- lib/dpif-netdev.c | 83 +-- tests/pmd.at | 12 +- vswitchd/vswitch.xml | 24 +++ 4 files changed, 123 insertions(+), 29 deletions(-) diff --git a/Documentation/topics/dpdk/pmd.rst b/Documentation/topics/dpdk/pmd.rst index 5f0671e..dd9172d 100644 --- a/Documentation/topics/dpdk/pmd.rst +++ b/Documentation/topics/dpdk/pmd.rst @@ -113,10 +113,15 @@ means that this thread will only poll the *pinned* Rx queues. If ``pmd-rxq-affinity`` is not set for Rx queues, they will be assigned to PMDs -(cores) automatically. Where known, the processing cycles that have been stored -for each Rx queue will be used to assign Rx queue to PMDs based on a round -robin of the sorted Rx queues. 
For example, take the following example, where -there are five Rx queues and three cores - 3, 7, and 8 - available and the -measured usage of core cycles per Rx queue over the last interval is seen to -be: +(cores) automatically. + +The algorithm used to automatically assign Rxqs to PMDs can be set by:: + +$ ovs-vsctl set Open_vSwitch . other_config:pmd-rxq-assign= + +By default, ``cycles`` assignment is used where the Rxqs will be ordered by +their measured processing cycles, and then be evenly assigned in descending +order to PMDs based on an up/down walk of the PMDs. For example, where there +are five Rx queues and three cores - 3, 7, and 8 - available and the measured +usage of core cycles per Rx queue over the last interval is seen to be: - Queue #0: 30% @@ -132,4 +137,20 @@ The Rx queues will be assigned to the cores in the following order:: Core 8: Q3 (60%) | Q0 (30%) +Alternatively, ``roundrobin`` assignment can be used, where the Rxqs are +assigned to PMDs in a round-robined fashion. This algorithm was used by +default prior to OVS 2.9. For example, given the following ports and queues: + +- Port #0 Queue #0 (P0Q0) +- Port #0 Queue #1 (P0Q1) +- Port #1 Queue #0 (P1Q0) +- Port #1 Queue #1 (P1Q1) +- Port #1 Queue #2 (P1Q2) + +The Rx queues may be assigned to the cores in the following order:: + +Core 3: P0Q0 | P1Q1 +Core 7: P0Q1 | P1Q2 +Core 8: P1Q0 | + To see the current measured usage history of PMD core cycles for each Rx queue:: diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index 807a462..ae24e6a 100644 --- a/lib/dpif-netdev.c +++ b/lib/dpif-netdev.c @@ -349,4 +349,6 @@ struct dp_netdev { struct id_pool *tx_qid_pool; struct ovs_mutex tx_qid_pool_mutex; +/* Use measured cycles for rxq to pmd assignment. 
*/ +bool pmd_rxq_assign_cyc; /* Protects the access of the 'struct dp_netdev_pmd_thread' @@ -1500,4 +1502,5 @@ create_dp_netdev(const char *name, const struct dpif_class *class, cmap_init(&dp->poll_threads); +dp->pmd_rxq_assign_cyc = true; ovs_mutex_init(&dp->tx_qid_pool_mutex); @@ -3724,4 +3727,6 @@ dpif_netdev_set_config(struct dpif *dpif, const struct smap *other_config) struct dp_netdev *dp = get_dp_netdev(dpif); const char *cmask = smap_get(other_config, "pmd-cpu-mask"); +const char *pmd_rxq_assign = smap_get_def(other_config, "pmd-rxq-assign", + "cycles"); unsigned long long insert_prob = smap_get_ullong(other_config, "emc-insert-inv-prob", @@ -3786,4 +3791,18 @@ dpif_netdev_set_config(struct dpif *dpif, const struct smap *other_config) } } + +bool pmd_rxq_assign_cyc = !strcmp(pmd_rxq_assign, "cycles"); +if (!pmd_rxq_assign_cyc && strcmp(pmd_rxq_assign, "roundrobin")) { +VLOG_WARN("Unsupported Rxq to PMD assignment mode in pmd-rxq-assign. " + "Defaulting to 'cycles'."); +pmd_rxq_assign_cyc = tr
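The two assignment modes described in the documentation hunk above can be sketched in a few lines of standalone C. This is an illustration only, not the dpif-netdev code: the struct and function names are hypothetical, NUMA handling and pinned queues are ignored, and PMDs are represented by an index (0, 1, 2 standing for cores 3, 7 and 8 in the documentation example).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

struct rxq {
    int id;            /* Queue identifier. */
    unsigned cycles;   /* Measured processing cycles (a percentage here). */
    int pmd;           /* Output: index of the PMD the queue is assigned to. */
};

/* Mirrors the config handling in the patch: "roundrobin" selects
 * round-robin, anything else (including bad values) falls back to cycles. */
static bool
rxq_assign_is_cycles(const char *mode)
{
    return strcmp(mode, "roundrobin") != 0;
}

/* Sorts by cycles, busiest first; ties are broken by queue id. */
static int
compare_cycles_desc(const void *a_, const void *b_)
{
    const struct rxq *a = a_;
    const struct rxq *b = b_;

    if (a->cycles != b->cycles) {
        return a->cycles < b->cycles ? 1 : -1;
    }
    return a->id - b->id;
}

/* 'roundrobin' mode: queues are handed to PMDs in order, wrapping around. */
static void
assign_roundrobin(struct rxq *rxqs, size_t n_rxqs, size_t n_pmds)
{
    for (size_t i = 0; i < n_rxqs; i++) {
        rxqs[i].pmd = (int) (i % n_pmds);
    }
}

/* 'cycles' mode: sort busiest first, then walk the PMDs up and down
 * (0, 1, 2, 2, 1, 0, 0, 1, ...) so busy queues are spread out. */
static void
assign_cycles(struct rxq *rxqs, size_t n_rxqs, size_t n_pmds)
{
    size_t idx = 0;
    bool up = true;

    qsort(rxqs, n_rxqs, sizeof *rxqs, compare_cycles_desc);
    for (size_t i = 0; i < n_rxqs; i++) {
        rxqs[i].pmd = (int) idx;
        if (up) {
            if (idx + 1 == n_pmds) {
                up = false;     /* Turn around; the last PMD is reused. */
            } else {
                idx++;
            }
        } else {
            if (idx == 0) {
                up = true;      /* Turn around; the first PMD is reused. */
            } else {
                idx--;
            }
        }
    }
}
```

With five queues in port order and three PMDs, `assign_roundrobin` gives indices 0, 1, 2, 0, 1, which matches the P0Q0/P1Q1, P0Q1/P1Q2, P1Q0 layout in the documentation. For `assign_cycles`, assuming usages of 30, 80, 10, 60 and 70 percent for Q0 to Q4 (only the Q0 and Q3 figures are visible in the excerpt; the others are assumed), Q3 and Q0 end up together on the last PMD, consistent with the "Core 8: Q3 (60%) | Q0 (30%)" line.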
Re: [ovs-dev] [PATCH v3 1/2] dpif-netdev: Add round-robin based rxq to pmd assignment.
On 08/27/2018 04:04 PM, Ilya Maximets wrote: > On 27.08.2018 17:19, Kevin Traynor wrote: >> On 08/27/2018 02:30 PM, Ilya Maximets wrote: >>> On 25.08.2018 13:00, Kevin Traynor wrote: Prior to OVS 2.9 automatic assignment of Rxqs to PMDs (i.e. CPUs) was done by round-robin. That was changed in OVS 2.9 to ordering the Rxqs based on their measured processing cycles. This was to assign the busiest Rxqs to different PMDs, improving aggregate throughput. For the most part the new scheme should be better, but there could be situations where a user prefers a simple round-robin scheme because Rxqs from a single port are more likely to be spread across multiple PMDs, and/or traffic is very bursty/unpredictable. Add 'pmd-rxq-assign' config to allow a user to select round-robin based assignment. Signed-off-by: Kevin Traynor Acked-by: Eelco Chaudron --- V3: - Rolled in some style and vswitch.xml changes (Ilya) - Set cycles mode by default on wrong config (Ilya) V2: - simplified nextpmd change (Eelco) - removed confusing doc sentence (Eelco) - fixed commit msg (Ilya) - made change in pmd-rxq-assign value also perform re-assignment (Ilya) - renamed to roundrobin mode (Ilya) - moved vswitch.xml changes to right config section (Ilya) - comment/log updates - moved NEWS update to separate patch as it's been changing on master Documentation/topics/dpdk/pmd.rst | 33 +--- lib/dpif-netdev.c | 83 +-- tests/pmd.at | 12 +- vswitchd/vswitch.xml | 24 +++ 4 files changed, 123 insertions(+), 29 deletions(-) diff --git a/Documentation/topics/dpdk/pmd.rst b/Documentation/topics/dpdk/pmd.rst index 5f0671e..dd9172d 100644 --- a/Documentation/topics/dpdk/pmd.rst +++ b/Documentation/topics/dpdk/pmd.rst @@ -113,10 +113,15 @@ means that this thread will only poll the *pinned* Rx queues. If ``pmd-rxq-affinity`` is not set for Rx queues, they will be assigned to PMDs -(cores) automatically. 
Where known, the processing cycles that have been stored -for each Rx queue will be used to assign Rx queue to PMDs based on a round -robin of the sorted Rx queues. For example, take the following example, where -there are five Rx queues and three cores - 3, 7, and 8 - available and the -measured usage of core cycles per Rx queue over the last interval is seen to -be: +(cores) automatically. + +The algorithm used to automatically assign Rxqs to PMDs can be set by:: + +$ ovs-vsctl set Open_vSwitch . other_config:pmd-rxq-assign= + +By default, ``cycles`` assignment is used where the Rxqs will be ordered by +their measured processing cycles, and then be evenly assigned in descending +order to PMDs based on an up/down walk of the PMDs. For example, where there +are five Rx queues and three cores - 3, 7, and 8 - available and the measured +usage of core cycles per Rx queue over the last interval is seen to be: - Queue #0: 30% @@ -132,4 +137,20 @@ The Rx queues will be assigned to the cores in the following order:: Core 8: Q3 (60%) | Q0 (30%) +Alternatively, ``roundrobin`` assignment can be used, where the Rxqs are +assigned to PMDs in a round-robined fashion. This algorithm was used by +default prior to OVS 2.9. For example, given the following ports and queues: + +- Port #0 Queue #0 (P0Q0) +- Port #0 Queue #1 (P0Q1) +- Port #1 Queue #0 (P1Q0) +- Port #1 Queue #1 (P1Q1) +- Port #1 Queue #2 (P1Q2) + +The Rx queues may be assigned to the cores in the following order:: + +Core 3: P0Q0 | P1Q1 +Core 7: P0Q1 | P1Q2 +Core 8: P1Q0 | + To see the current measured usage history of PMD core cycles for each Rx queue:: diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index 7f836bb..8f004c5 100644 --- a/lib/dpif-netdev.c +++ b/lib/dpif-netdev.c @@ -342,4 +342,6 @@ struct dp_netdev { struct id_pool *tx_qid_pool; struct ovs_mutex tx_qid_pool_mutex; +/* Use measured cycles for rxq to pmd assignment. 
*/ +bool pmd_rxq_assign_cyc; /* Protects the access of the 'struct dp_netdev_pmd_thread' @@ -1493,4 +1495,5 @@ create_dp_netdev(const char *name, const struct dpif_class *class, cmap_init(&dp->poll_threads); +dp->pmd_rxq_assign_cyc = true; ovs_mutex_init(&dp->tx_qid_pool_mutex); @@ -3717,4 +3720,6 @@ dpif_netdev_set_config(struct dpif *dpif, const struct smap *other_config)
Re: [ovs-dev] [RFC patch v1] datapath: Fix builds on older kernels.
On 8/27/18, 8:34 PM, "ovs-dev-boun...@openvswitch.org on behalf of Yifeng Sun" wrote: Good catch. Just like Greg said, it is so complex when doing back porting. Do you mind putting 3.19 in the travis.yaml? So that later on we can catch this kind of bug, thanks. Thanks Yifeng I can add 3.19, for example, specifically. Later 3.16 minor versions seem to include the api. Yifeng On Mon, Aug 27, 2018 at 7:34 PM Gregory Rose wrote: > > On 8/27/2018 7:19 PM, Darrell Ball wrote: > > On older kernels, for example 3.19, the function rt6_get_cookie() is > > not available and used with ipv6 config enabled; it was introduced in > > 4.2. Put back the replacement function if it does not exist. > > Interesting that builds on 3.16.57 but not 3.19. You can never tell > what's getting backported and > to which kernel. Keeps us busy! > > Thanks Darrell, looks like a good catch. I'll let Yifeng provide the > review. > > - Greg > > > > > CC: Yifeng Sun > > Fixes: bf61b8b1c1db ("datapath: Add support for kernel 4.16.x & 4.17.x.") > > Signed-off-by: Darrell Ball > > --- > > acinclude.m4| 5 > > datapath/linux/Modules.mk | 1 + > > datapath/linux/compat/include/net/ip6_fib.h | 43 > + > > 3 files changed, 49 insertions(+) > > create mode 100644 datapath/linux/compat/include/net/ip6_fib.h > > > > diff --git a/acinclude.m4 b/acinclude.m4 > > index ab141bd..0690bae 100644 > > --- a/acinclude.m4 > > +++ b/acinclude.m4 > > @@ -459,6 +459,9 @@ AC_DEFUN([OVS_CHECK_LINUX_COMPAT], [ > > OVS_GREP_IFELSE([$KSRC/arch/x86/include/asm/checksum_32.h], > [src_err,], > > [OVS_DEFINE([HAVE_CSUM_COPY_DBG])]) > > > > + OVS_GREP_IFELSE([$KSRC/include/net/ip6_fib.h], [rt6_get_cookie], > > + [OVS_DEFINE([HAVE_RT6_GET_COOKIE])]) > > + > > OVS_GREP_IFELSE([$KSRC/include/net/addrconf.h], > [ipv6_dst_lookup.*net], > > [OVS_DEFINE([HAVE_IPV6_DST_LOOKUP_NET])]) > > OVS_GREP_IFELSE([$KSRC/include/net/addrconf.h], [ipv6_stub]) > > @@ -803,6 +806,8 @@ AC_DEFUN([OVS_CHECK_LINUX_COMPAT], [ > > 
[OVS_DEFINE(HAVE_NF_CONNTRACK_HELPER_PUT)]) > > > OVS_GREP_IFELSE([$KSRC/include/linux/skbuff.h],:space:]]]SKB_GSO_UDP[[[:space:, > > [OVS_DEFINE([HAVE_SKB_GSO_UDP])]) > > + OVS_GREP_IFELSE([$KSRC/include/net/dst.h],[DST_NOCACHE], > > + [OVS_DEFINE([HAVE_DST_NOCACHE])]) > > OVS_FIND_FIELD_IFELSE([$KSRC/include/net/rtnetlink.h], > [rtnl_link_ops], > > [extack], > > [OVS_DEFINE([HAVE_EXT_ACK_IN_RTNL_LINKOPS])]) > > diff --git a/datapath/linux/Modules.mk b/datapath/linux/Modules.mk > > index b06ca15..e31d784 100644 > > --- a/datapath/linux/Modules.mk > > +++ b/datapath/linux/Modules.mk > > @@ -82,6 +82,7 @@ openvswitch_headers += \ > > linux/compat/include/net/inetpeer.h \ > > linux/compat/include/net/ip.h \ > > linux/compat/include/net/ip_tunnels.h \ > > +linux/compat/include/net/ip6_fib.h \ > > linux/compat/include/net/ip6_route.h \ > > linux/compat/include/net/ip6_tunnel.h \ > > linux/compat/include/net/ipv6.h \ > > diff --git a/datapath/linux/compat/include/net/ip6_fib.h > b/datapath/linux/compat/include/net/ip6_fib.h > > new file mode 100644 > > index 000..0cc4358 > > --- /dev/null > > +++ b/datapath/linux/compat/include/net/ip6_fib.h > > @@ -0,0 +1,43 @@ > > +/* > > + * Linux INET6 implementation > > + * > > + * Authors: > > + * Pedro Roque > > + * > > + * This program is free software; you can redistribute it and/or > > + * modify it under the terms of the GNU General Public License > > + * as published by the Free Software Foundation; either version > > + * 2 of the License, or (at your option) any later version. 
> > + */ > > + > > +#ifndef _IP6_FIB_WRAPPER_H > > +#define _IP6_FIB_WRAPPER_H > > + > > +#include_next > > + > > +#ifndef HAVE_RT6_GET_COOKIE > > + > > +#ifndef RTF_PCPU > > +#define RTF_PCPU0x4000 > > +#endif > > + > > +#ifndef RTF_LOCAL > > +#define RTF_LOCAL 0x8000 > > +#endif > > + > > +#define rt6_get_cookie rpl_rt6_get_cookie > > +static inline u32 rt6_get_cookie(const struct rt6_info *rt) > > +{ > > + if (rt->rt6i_flags & RTF_PCPU || > > +#ifdef HAVE_DST_NOCACHE > > + (unlikely(rt->dst.flags & DST_NOCACHE) && rt->dst.from)) > > +#else > > + (unlikely(!list_emp
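For readers unfamiliar with the compat-header pattern used in this patch, here is a minimal userspace C sketch of it. A configure-time macro says whether the real function exists; when it does not, the name is redirected to an `rpl_`-prefixed replacement defined in the wrapper header. Everything here is hypothetical: `struct foo`, `foo_get_cookie`, `HAVE_FOO_GET_COOKIE` and `FOO_F_NOCOOKIE` merely stand in for the kernel's types and the HAVE_RT6_GET_COOKIE check, and the function body is not the kernel logic.

```c
#include <stdint.h>

/* Hypothetical stand-in for a kernel structure. */
struct foo {
    uint32_t flags;
    uint32_t cookie;
};

#define FOO_F_NOCOOKIE 0x1u   /* Hypothetical flag. */

/* In a real wrapper header, the original header would be pulled in first
 * with #include_next, and HAVE_FOO_GET_COOKIE would come from configure. */
#ifndef HAVE_FOO_GET_COOKIE

/* Redirect every use of the missing function to the replacement. */
#define foo_get_cookie rpl_foo_get_cookie
static inline uint32_t
foo_get_cookie(const struct foo *f)
{
    /* Placeholder logic: report no cookie when the flag is set. */
    return (f->flags & FOO_F_NOCOOKIE) ? 0 : f->cookie;
}

#endif /* !HAVE_FOO_GET_COOKIE */
```

Callers keep writing `foo_get_cookie(...)` either way; only the header decides whether that resolves to the native function or to the replacement.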
[ovs-dev] [PATCH per-port ingress scheduling 2/2] ingress scheduling: Provide per interface ingress priority
Allow configuration to specify an ingress priority for interfaces. Modify dpif-netdev datapath to act on this configuration so that packets on interfaces with a higher priority will tend be processed ahead of packets on lower priority interfaces. This protects traffic on higher priority interfaces from packet loss as PMDs get overloaded. Signed-off-by: Billy O'Mahony --- include/openvswitch/ofp-parse.h | 3 + lib/dpif-netdev.c | 188 +--- lib/netdev-dpdk.c | 10 +++ 3 files changed, 170 insertions(+), 31 deletions(-) diff --git a/include/openvswitch/ofp-parse.h b/include/openvswitch/ofp-parse.h index 3fdd468..d77ab8f 100644 --- a/include/openvswitch/ofp-parse.h +++ b/include/openvswitch/ofp-parse.h @@ -33,6 +33,9 @@ extern "C" { struct match; struct mf_field; struct ofputil_port_map; +struct tun_table; +struct flow_wildcards; +struct ofputil_port_map; struct ofp_protocol { const char *name; diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index 807a462..3ed8e09 100644 --- a/lib/dpif-netdev.c +++ b/lib/dpif-netdev.c @@ -22,6 +22,7 @@ #include #include #include +#include #include #include #include @@ -42,6 +43,7 @@ #include "dpif.h" #include "dpif-netdev-perf.h" #include "dpif-provider.h" +#include "netdev-provider.h" #include "dummy.h" #include "fat-rwlock.h" #include "flow.h" @@ -49,7 +51,6 @@ #include "id-pool.h" #include "latch.h" #include "netdev.h" -#include "netdev-provider.h" #include "netdev-vport.h" #include "netlink.h" #include "odp-execute.h" @@ -460,6 +461,7 @@ struct dp_netdev_port { struct ovs_mutex txq_used_mutex; char *type; /* Port type as requested by user. */ char *rxq_affinity_list;/* Requested affinity of rx queues. */ +int ingress_prio; /* 0 lowest to 3 highest. Default 0. */ }; /* Contained by struct dp_netdev_flow's 'stats' member. 
*/ @@ -572,6 +574,7 @@ static void dp_netdev_actions_free(struct dp_netdev_actions *); struct polled_queue { struct dp_netdev_rxq *rxq; odp_port_t port_no; +uint8_t max_reads; }; /* Contained by struct dp_netdev_pmd_thread's 'poll_list' member. */ @@ -711,6 +714,10 @@ struct dpif_netdev { uint64_t last_port_seq; }; +static int +dp_netdev_process_rxq_port(struct dp_netdev_pmd_thread *pmd, + struct dp_netdev_rxq *rxq, + odp_port_t port_no); static int get_port_by_number(struct dp_netdev *dp, odp_port_t port_no, struct dp_netdev_port **portp) OVS_REQUIRES(dp->port_mutex); @@ -3847,6 +3854,36 @@ exit: return error; } +static void +set_need_reload_on_all_pmds_for_port(struct dp_netdev *dp, odp_port_t port_no) +{ +/* Check each pmd to see if it is reading a queue belonging to + port_no and if so set need_reload of that pmd */ +struct dp_netdev_pmd_thread *pmd; +CMAP_FOR_EACH (pmd, node, &dp->poll_threads) { +struct rxq_poll *poll; +HMAP_FOR_EACH (poll, node, &pmd->poll_list) { +if (poll->rxq->port->port_no == port_no) { +pmd->need_reload = true; +} +} +} +} + +static void +reload_affected_pmds(struct dp_netdev *dp) +{ +struct dp_netdev_pmd_thread *pmd; + +CMAP_FOR_EACH (pmd, node, &dp->poll_threads) { +if (pmd->need_reload) { +flow_mark_flush(pmd); +dp_netdev_reload_pmd__(pmd); +pmd->need_reload = false; +} +} +} + /* Changes the affinity of port's rx queues. The changes are actually applied * in dpif_netdev_run(). 
*/ static int @@ -3859,20 +3896,41 @@ dpif_netdev_port_set_config(struct dpif *dpif, odp_port_t port_no, const char *affinity_list = smap_get(cfg, "pmd-rxq-affinity"); ovs_mutex_lock(&dp->port_mutex); + error = get_port_by_number(dp, port_no, &port); -if (error || !netdev_is_pmd(port->netdev) -|| nullable_string_is_equal(affinity_list, port->rxq_affinity_list)) { +if (error || !netdev_is_pmd(port->netdev)) { goto unlock; } -error = dpif_netdev_port_set_rxq_affinity(port, affinity_list); -if (error) { -goto unlock; +if (!nullable_string_is_equal(affinity_list, port->rxq_affinity_list)) { +error = dpif_netdev_port_set_rxq_affinity(port, affinity_list); +if (!error) { +free(port->rxq_affinity_list); +port->rxq_affinity_list = nullable_xstrdup(affinity_list); +dp_netdev_request_reconfigure(dp); +} +} + +const char *port_prio_str = smap_get(cfg, "port_prio"); +uint8_t port_prio; +char *mallocd_err_str; /* str_to_x mallocs a str we'll need to free */ +if (port_prio_str) { +mallocd_err_str = str_to_u8(port_prio_str, "port_prio", +&port_prio); +if (!mallocd_err_str) { +if (port->ingress_prio != port_prio) { +
[ovs-dev] [PATCH per-port ingress scheduling 1/2] ingress scheduling: documentation
Signed-off-by: Billy O'Mahony --- Documentation/howto/dpdk.rst | 15 +++ vswitchd/vswitch.xml | 15 +++ 2 files changed, 30 insertions(+) diff --git a/Documentation/howto/dpdk.rst b/Documentation/howto/dpdk.rst index ab3d576..83284e7 100644 --- a/Documentation/howto/dpdk.rst +++ b/Documentation/howto/dpdk.rst @@ -360,6 +360,21 @@ devices to bridge ``br0``. Once complete, follow the below steps: $ cat /proc/interrupts | grep virtio +Ingress Scheduling +-- + +The ingress scheduling feature is described in general in +``ovs-vswitchd.conf.db (5)``. + +Ingress scheduling currently supports setting a priority for incoming packets +for an entire interface. Priority levels 0 (lowest) to 3 (highest) are +supported. The default priority is 0. + +To prioritize packets on a particular port: + +$ ovs-vsctl set Interface dpdk0 \ +ingress_sched=port_prio=3 + .. _dpdk-flow-hardware-offload: Flow Hardware Offload (Experimental) diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml index 71bbe95..e88a69a 100644 --- a/vswitchd/vswitch.xml +++ b/vswitchd/vswitch.xml @@ -3196,6 +3196,21 @@ ovs-vsctl add-port br0 p0 -- set Interface p0 type=patch options:peer=p1 \ + + + Configuration to allow rxd traffic to be prioritized on a per Interface + basis. + + + + The ingress priority of the port: 0 (lowest) to 3 (highest). Higher + priority ports are read more frequently than lower priority ports. + This provides enhanced protection to packets ingressing high priority + ports against being dropped due to Rx queue overflow. + + + + BFD, defined in RFC 5880 and RFC 5881, allows point-to-point -- 2.7.4 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
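The statement that higher priority ports are read more frequently can be illustrated with a small polling simulation in C. This is a sketch under assumptions, not the dpif-netdev implementation: the mapping from priority to reads per iteration (`1 << prio`) is invented for the example, the batch size of 32 is a placeholder, and `struct polled_queue` here only borrows the name and the `max_reads` idea from the patch.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_BATCH 32   /* Placeholder: packets handled per read. */

struct polled_queue {
    uint8_t prio;         /* 0 (lowest) to 3 (highest). */
    uint8_t max_reads;    /* Reads allowed per polling iteration. */
    unsigned backlog;     /* Packets waiting in the queue. */
    unsigned processed;   /* Packets handled so far. */
};

/* Hypothetical mapping: each priority level doubles the reads allowed. */
static uint8_t
prio_to_max_reads(uint8_t prio)
{
    return (uint8_t) (1u << (prio > 3 ? 3 : prio));
}

/* One pass of the polling loop: each queue is read up to max_reads times,
 * stopping early once it is drained. */
static void
poll_iteration(struct polled_queue *queues, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        struct polled_queue *q = &queues[i];

        for (uint8_t r = 0; r < q->max_reads; r++) {
            unsigned got = q->backlog < SKETCH_BATCH ? q->backlog
                                                     : SKETCH_BATCH;
            if (!got) {
                break;    /* Queue drained; move on to the next one. */
            }
            q->backlog -= got;
            q->processed += got;
        }
    }
}
```

Under overload, a queue at priority 3 in this sketch drains eight batches for every one batch drained from a priority 0 queue, so its backlog (and hence its risk of Rx queue overflow) grows more slowly.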
[ovs-dev] [PATCH per-port ingress scheduling 0/2]
Hi All,

I've updated the patch to account for two sets of comments on the RFCv2; see
the history below.

This patch set implements the 'preferential read' part of the ingress
scheduling feature described at the OvS 2017 Fall Conference:
https://www.slideshare.net/LF_OpenvSwitch/lfovs17ingress-scheduling-82280320

It allows configuration to specify an ingress priority for an entire
interface. This protects traffic on higher priority interfaces from loss and
latency as PMDs get overloaded.

Results for physical interfaces are excellent - higher priority ports suffer
much less loss:

   Phy i/f:        | dpdk_0  dpdk_1  dpdk_2  dpdk_3
   % Total Load    |   25%     25%     25%     25%
   Priority (3=Hi) |     0       1       2       3
   ----------------+--------------------------------
   Total Offered   |
   Load (kpps)     |        Pkt Loss (kpps)
   ----------------+--------------------------------
   2100            |     0       0       0       0
   2300            |    23       0       0       0
   2500            |   308       0       0       0
   2900            |   628      24       0       0
   3400            |   811     370       8       0
   3500            |   821     391      52       0
   4000            |   964     565     238      20

This also holds true to a great extent when the 'priority' port is carrying
most of the traffic:

   Phy i/f:        | dpdk_0  dpdk_1  dpdk_2  dpdk_3
   % Total Load    |   10%     20%     30%     40%
   Priority (3=Hi) |     0       1       2       3
   ----------------+--------------------------------
   Total Offered   |
   Load (kpps)     |        Pkt Loss (kpps)
   ----------------+--------------------------------
   2300            |     8       0       0       0
   2500            |   181       0       0       0
   2550            |   213      13       0       0
   2620            |   223      63       0       9
   2700            |   230      82      10      52
   3000            |   262     143     101     172
   3500            |   310     242     249     370
   4000            |   361     341     398     569

For vhostuser ports, VMs running iperf3 (TCP) benefit to an appreciable extent
from being on a 'priority' port, without a drop in overall throughput.

   Scenario: 3 VM-pairs running iperf3 (baseline)
   ----------------------------------------------
   VM pair       |  1,2   3,4   5,6
   priority      |    0     0     0
   Tput (Gbit/s) |  3.3   3.3   3.3

   Scenario: 3 VM-pairs running iperf3 (one pair prioritized)
   ----------------------------------------------------------
   VM pair       |  1,2   3,4   5,6
   priority      |    0     0     0
   Tput (Gbit/s) |  2.7   2.7   4.6

History:

v1:
* The configuration is only in dpif-netdev and will work with any polled
  netdevs, not just dpdk netdevs.
* Re-configuration of the priorities at run-time is supported.
* Keep configuration in Interfaces other_config.
* Applies cleanly on 9b4f08c

RFCv2:
* Keep ingress prio config in netdev base rather than in each netdev type.
* Account for differing rxq lengths
* Applies cleanly to 4299145

RFCv1:
Initial version.

Billy O'Mahony (2):
  ingress scheduling: documentation
  ingress scheduling: Provide per interface ingress priority

 Documentation/howto/dpdk.rst    |  15
 include/openvswitch/ofp-parse.h |   3 +
 lib/dpif-netdev.c               | 188 +---
 lib/netdev-dpdk.c               |  10 +++
 vswitchd/vswitch.xml            |  15
 5 files changed, 200 insertions(+), 31 deletions(-)

--
2.7.4

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
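A minimal sketch of why reading higher-priority ports first shifts loss onto the lower-priority ports once a PMD is overloaded. This is not the patch's algorithm: it assumes strict-priority draining, a fixed per-round processing budget and equal rx queue lengths, and `simulate_loss`, `capacity` and `rxq_len` are hypothetical names used only for illustration.

```python
def simulate_loss(offered, priority, capacity, rxq_len, rounds):
    """Return packets dropped per port under strict-priority reads.

    offered[i]  -- packets arriving at port i each round
    priority[i] -- ingress priority of port i (higher is read first)
    capacity    -- packets the PMD can process per round (assumed fixed)
    rxq_len     -- rx queue length, assumed equal for every port
    """
    n = len(offered)
    queued = [0] * n
    dropped = [0] * n
    # Read order: highest priority first (a simplification, see lead-in).
    read_order = sorted(range(n), key=lambda i: priority[i], reverse=True)

    for _ in range(rounds):
        # Arrivals: whatever does not fit in the rx queue is lost.
        for i in range(n):
            accepted = min(offered[i], rxq_len - queued[i])
            dropped[i] += offered[i] - accepted
            queued[i] += accepted
        # Service: spend this round's budget on high-priority queues first.
        budget = capacity
        for i in read_order:
            taken = min(queued[i], budget)
            queued[i] -= taken
            budget -= taken
    return dropped
```

With four ports each offering 10 packets per round against a budget of 30, all of the loss lands on the priority-0 port, which mirrors the shape (not the numbers) of the physical-interface results in the cover letter.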
Re: [ovs-dev] [PATCH v2 6/6] system-dpdk: Execute testpmd on the background
- Original Message -
> From: "Ian Stokes"
> To: "Aaron Conole" , d...@openvswitch.org
> Cc: "Ciara Loftus" , "Bala Sankaran"
> Sent: Monday, 27 August, 2018 7:25:48 AM
> Subject: Re: [PATCH v2 6/6] system-dpdk: Execute testpmd on the background
>
> On 8/22/2018 2:37 PM, Aaron Conole wrote:
> > From: Bala Sankaran
> >
> > This adds a new test to the 'check-dpdk' subsystem that will exercise
> > allocations, PMDs, and the vhost-user code path.
> >
> > Signed-off-by: Bala Sankaran
> > Co-authored-by: Aaron Conole
> > Signed-off-by: Aaron Conole
> > ---
> > tests/system-dpdk.at | 77
> > 1 file changed, 77 insertions(+)
> >
> > diff --git a/tests/system-dpdk.at b/tests/system-dpdk.at
> > index 58dc8aaae..914a1b644 100644
> > --- a/tests/system-dpdk.at
> > +++ b/tests/system-dpdk.at
> > @@ -1,3 +1,6 @@
> > +m4_define([CONFIGURE_VETH_OFFLOADS],
> > +   [AT_CHECK([ethtool -K $1 tx off], [0], [ignore], [ignore])])
> > +
> > AT_BANNER([OVS-DPDK unit tests])
> >
> > dnl
> > --
> > @@ -74,3 +77,77 @@ OVS_VSWITCHD_STOP(["\@does not exist. The Open vSwitch
> > kernel module is probably
> > \@EAL: No free hugepages reported in hugepages-1048576kB@d"])
> > AT_CLEANUP
> > dnl
> > --
> > +
> > +
> > +
> > +dnl
> > --
> > +dnl Ping vhost-user-client port
> This test uses vhost user server so above should be changed to reflect this.

Agreed, changes made.

>
> > +AT_SETUP([OVS-DPDK datapath - ping vhost-user-client ports])
> > +AT_KEYWORDS([dpdk])
> > +OVS_DPDK_PRE_CHECK()
> > +OVS_DPDK_START()
> > +
> > +dnl Add userspace bridge and attach it to OVS
> > +AT_CHECK([ovs-vsctl add-br br10 -- set bridge br10 datapath_type=netdev])
> > +AT_CHECK([ovs-vsctl add-port br10 vhu0 -- set Interface vhu0 \
>
> I'd like to keep the name of the vhost user interfaces uniform across
> the tests. Can we change the interface name to dpdkvhostuser0 instead of
> vhu0? This is in keeping with the existing vhost tests and the OVS DPDK
> documentation.

This has been noted, I have made the necessary changes.

>
> > +    type=dpdkvhostuser], [],
> > +    [stdout], [stderr])
> > +AT_CHECK([ovs-vsctl show], [], [stdout])
> > +
> > +dnl Parse log file
> > +AT_CHECK([grep "VHOST_CONFIG: vhost-user server: socket created" \
> > +    ovs-vswitchd.log], [], [stdout])
> > +AT_CHECK([grep "Socket $OVS_RUNDIR/vhu0 created for vhost-user port vhu0"
> > \
> > +    ovs-vswitchd.log], [], [stdout])
> > +AT_CHECK([grep "VHOST_CONFIG: bind to $OVS_RUNDIR/vhu0" ovs-vswitchd.log],
> > [],
> > +    [stdout])
> > +
> > +dnl Set up namespaces
> > +ADD_NAMESPACES(ns1, ns2)
> > +
> > +dnl execute testpmd in background
>
> To be uniform the first word after each dnl should be capitalized
> (applies to a few of the other dnl added in this test also).

Noted, I've applied the changes all over.

>
> > +on_exit "pkill -f -x -9 'tail -f /dev/null'"
> > +tail -f /dev/null | testpmd --socket-mem=512 \
> > +    --vdev="net_virtio_user,path=$OVS_RUNDIR/vhu0" \
> > +    --vdev="net_tap0,iface=tap0" --file-prefix page0 \
> > +    --single-file-segments -- -a >$OVS_RUNDIR/testpmd-vhu0.log 2>&1
> > &
>
> I have a few queries as regards running testpmd.
>
> Is the assumption that testpmd is a recognized command?

Yes, I assumed that testpmd is a recognized command while doing the tests,
because testpmd is an ancillary tool that comes with DPDK installation.

>
> How exactly were you testing this? I believe DPDK 18.08 is required in
> the case of testpmd but 17.11 is still linked against for OVS? Ideally
> I'd like to remove the dependency on 18.08. What issues were seen when
> using testpmd in 17.11?

Issues I saw were that options/ arguments for testpmd like:
single-file-segments were unavailable in the DPDK-17.11 version.
I am not aware if we would be able to remove the dependency on 18.08.

>
> Is it expected that DPDK has been installed from a repo and in a default
> location?

Yes, I assumed it is expected to be so.

>
> If it is then I think we should provision for an environmental variable
> specifying the path to testpmd also. It could be the case someone is
> building DPDK from source and the testpmd executable is elsewhere.

The environment variable $PATH is already set and suffices this, I believe.

Thank you for your suggestions and letting me know your requirements.
I've made the necessary changes and I will be testing it at my end before
I submit a v3 of the set of patches.

Let me know if you've got any more questions.

Thanks,
Bala.

>
> Ian
> > +
> > +dnl add veth device
> > +ADD_VETH(tap1, ns2, br10, "172.31.110.12/24")
> > +
> > +dnl give settling time to the testpmd processes - NOTE: this is bad
[ovs-dev] [PATCH v2] utilities: Drop shebang from bash completion script
This fixes the following warning when building Open vSwitch on the openSUSE
Build Service:

    W: non-executable-script /usr/share/bash-completion/completions/ovs-appctl-bashcomp.bash
    This text file contains a shebang or is located in a path dedicated for
    executables, but lacks the executable bits and cannot thus be executed. If
    the file is meant to be an executable script, add the executable bits,
    otherwise remove the shebang or move the file elsewhere.

The file is meant to be sourced instead of executed, so we can simply drop
the shebang.

Signed-off-by: Markos Chandras
---
 utilities/ovs-appctl-bashcomp.bash | 1 -
 1 file changed, 1 deletion(-)

diff --git a/utilities/ovs-appctl-bashcomp.bash b/utilities/ovs-appctl-bashcomp.bash
index f7fb83047..4384be8ae 100755
--- a/utilities/ovs-appctl-bashcomp.bash
+++ b/utilities/ovs-appctl-bashcomp.bash
@@ -1,4 +1,3 @@
-#!/bin/bash
 #
 # A bash command completion script for ovs-appctl.
 #
--
2.18.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] utilities: Drop shebang from bash completion script
On 28/08/18 14:22, Aaron Conole wrote:
>
> In the future, it's best to indent the lines you intend to be quoting.

Good idea.

>
> Patchwork will process the patch slightly more liberally than
> git-mailinfo, but if I use git-am to apply your mbox file, the message
> gets truncated and the commit log only shows the same as the above 'msg'
> file from git mailinfo.
>
> I think it can be fixed when applying, though (by making the change I've
> outlined above).
>

Let me send a v2 to address that.

--
markos

SUSE LINUX GmbH | GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg) Maxfeldstr. 5, D-90409, Nürnberg

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] utilities: Drop shebang from bash completion script
Markos Chandras writes:

> On 28/08/18 12:57, 0-day Robot wrote:
>> Bleep bloop. Greetings Markos Chandras, I am a robot and I have
>> tried out your patch.
>> Thanks for your contribution.
>>
>> I encountered some error that I wasn't expecting. See the details below.
>>
>>
>> checkpatch:
>> ERROR: Author Markos Chandras needs to sign off.
>> Lines checked: 25, Warnings: 0, Errors: 1
>>
>>
>
> But it is signed off :)
>
> I guess the script may have been confused by the '---' I used to quote
> the OBS warning.

Yes - actually, even git mailinfo gets confused by it:

  09:16:48 aconole /tmp$ git mailinfo msg ptch < ovs-dev-utilities-Drop-shebang-from-bash-completion-script.patch
  Author: Markos Chandras
  Email: mchand...@suse.de
  Subject: utilities: Drop shebang from bash completion script
  Date: Tue, 28 Aug 2018 12:21:24 +0100

  09:17:03 aconole /tmp$ cat msg
  This fixes the following warning when building Open vSwitch on the
  openSUSE Build Service:

  09:17:07 aconole /tmp$

In the future, it's best to indent the lines you intend to be quoting.

Patchwork will process the patch slightly more liberally than
git-mailinfo, but if I use git-am to apply your mbox file, the message
gets truncated and the commit log only shows the same as the above 'msg'
file from git mailinfo.

I think it can be fixed when applying, though (by making the change I've
outlined above).

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
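The truncation can be illustrated with a simplified model of the patch-break rule: the commit message ends at the first line that consists only of '---'. This is a sketch, not git's implementation (git mailinfo recognizes other break markers as well), and `split_at_patch_break` and `indent_quote` are hypothetical helper names.

```python
def split_at_patch_break(body):
    """Split a mail body at the first line that is exactly '---'.

    Returns (commit_message, remainder). Simplified model only.
    """
    lines = body.splitlines()
    for idx, line in enumerate(lines):
        if line.rstrip() == "---":
            return "\n".join(lines[:idx]), "\n".join(lines[idx + 1:])
    return "\n".join(lines), ""


def indent_quote(text, prefix="    "):
    """Indent quoted text so that a '---' inside it is no longer a bare line."""
    return "\n".join(prefix + line if line else line
                     for line in text.splitlines())
```

Applied to a body that quotes a warning between bare '---' lines, the first function returns only the opening sentence as the commit message, so the Signed-off-by line ends up in the remainder. If the quoted block is passed through `indent_quote` first, the split happens at the real separator and the sign-off stays in the message.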
[ovs-dev] Good Day
Hello, you have a donation of $4,800,000.00. I won the $259 million America Lottery and I am giving part of it to five lucky people and charities for this year, 2018, and to commemorate my dead wife, who died of cancer. Contact me for more details.

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] utilities: Drop shebang from bash completion script
On 28/08/18 12:57, 0-day Robot wrote:
> Bleep bloop. Greetings Markos Chandras, I am a robot and I have tried out
> your patch.
> Thanks for your contribution.
>
> I encountered some error that I wasn't expecting. See the details below.
>
>
> checkpatch:
> ERROR: Author Markos Chandras needs to sign off.
> Lines checked: 25, Warnings: 0, Errors: 1
>
>

But it is signed off :)

I guess the script may have been confused by the '---' I used to quote
the OBS warning.

--
markos

SUSE LINUX GmbH | GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg) Maxfeldstr. 5, D-90409, Nürnberg

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] utilities: Drop shebang from bash completion script
Bleep bloop. Greetings Markos Chandras, I am a robot and I have tried out your patch.
Thanks for your contribution.

I encountered some error that I wasn't expecting. See the details below.


checkpatch:
ERROR: Author Markos Chandras needs to sign off.
Lines checked: 25, Warnings: 0, Errors: 1


Please check this out. If you feel there has been an error, please email acon...@bytheb.org

Thanks,
0-day Robot

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH] utilities: Drop shebang from bash completion script
This fixes the following warning when building Open vSwitch on the openSUSE
Build Service:

---
W: non-executable-script /usr/share/bash-completion/completions/ovs-appctl-bashcomp.bash
This text file contains a shebang or is located in a path dedicated for
executables, but lacks the executable bits and cannot thus be executed. If
the file is meant to be an executable script, add the executable bits,
otherwise remove the shebang or move the file elsewhere.
---

The file is meant to be sourced instead of executed, so we can simply drop
the shebang.

Signed-off-by: Markos Chandras
---
 utilities/ovs-appctl-bashcomp.bash | 1 -
 1 file changed, 1 deletion(-)

diff --git a/utilities/ovs-appctl-bashcomp.bash b/utilities/ovs-appctl-bashcomp.bash
index f7fb83047..4384be8ae 100755
--- a/utilities/ovs-appctl-bashcomp.bash
+++ b/utilities/ovs-appctl-bashcomp.bash
@@ -1,4 +1,3 @@
-#!/bin/bash
 #
 # A bash command completion script for ovs-appctl.
 #
--
2.18.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] OVS DPDK Latest & HWOL Branches
On 8/27/2018 5:16 PM, Ben Pfaff wrote:
> I can help with some of these.
>
> On Mon, Aug 27, 2018 at 04:05:39PM +, Ophir Munk wrote:
>> Ian, can you please specify the practical steps regarding the new
>> branches? Specifically, what is the procedure for adding a new patch for
>> either of the branches (OVS DPDK latest or HWOL)?
>>
>> 1. What should the patch title include?
>
> I guess this is up to Ian, although he should coordinate with Aaron to
> make sure that the patch robot understands too.

We discussed this at the community call last week but it's good to raise
it on the ML again for wider input.

The process we would follow is that patches would be submitted to
d...@openvswitch.org. As the patches affect a particular branch and not
master, contributors should submit the change with the target branch
listed in the subject line of the patch. The git format-patch argument
--subject-prefix may be used when posting the patch, for example:

$ git format-patch HEAD --subject-prefix="PATCH branch-dpdk-latest"

or

$ git format-patch HEAD --subject-prefix="PATCH branch-dpdk-hwol"

@Aaron, would it be possible to set up the 0-day robot to recognize patches
with the above subject header and apply/build the corresponding branch?

>> 2. Who is going to merge a new patch (as well as ongoing master branch
>> updates) into the relevant branch?
>
> I expect that Ian will be the only one pushing to the new branches.

Yes, I'll handle merging the patches as well as the typical validation
via vsperf to help identify any issues a patch may cause to performance
or functionality.

>> 3. Can I have write permissions in the new branches?
>
> I'd prefer to have just Ian doing this work for now.
>
>> 4. How can I inspect the new branches? Currently I am not seeing them.
>
> I do not think that Ian has created the new branches yet.

I can create these today. There was some discussion as regards the
branch names. Before creating them, are people happy with
'branch-dpdk-latest' and 'branch-dpdk-hwol'?

If there are no objections then I'll go ahead with these today.

Ian

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
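On the robot question, one way to pick the target branch is to scan the bracketed subject tags for a known branch name. A rough sketch, assuming the two branch names proposed in this thread and a fallback to master; `target_branch` and `KNOWN_BRANCHES` are hypothetical and not part of the 0-day robot.

```python
import re

# Branch names as proposed in this thread; nothing else is assumed to exist.
KNOWN_BRANCHES = {"branch-dpdk-latest", "branch-dpdk-hwol"}


def target_branch(subject, default="master"):
    """Return the branch named in any [...] tag of a patch subject line."""
    for tag in re.findall(r"\[([^\]]*)\]", subject):
        for token in tag.split():
            if token in KNOWN_BRANCHES:
                return token
    return default
```

A subject produced with --subject-prefix="PATCH branch-dpdk-hwol" carries the branch name inside the same bracket as PATCH, so a whitespace split of each tag is enough here; subjects without a known branch name fall through to the default.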
Re: [ovs-dev] Several conntrack problems, including some critical bugs.
Thank you for the testing/reports.

On 8/27/18, 10:47 PM, "ovs-dev-boun...@openvswitch.org on behalf of Zang MingJie" wrote:

> While developing an application using OVS userspace conntrack, we found
> some bugs worth mentioning here.
>
> 1. conntrack_clean may cause an OVS crash.
>
> The conntrack_clean function iterates through all buckets and frees
> entries in each bucket while holding the bucket lock. But when releasing
> a NAT connection, inside the nat_clean function, the bucket lock is
> temporarily released. If another PMD acquires the lock and modifies the
> bucket, the rest of the loop may cause an invalid memory access inside
> the sweep_bucket function.

There is a silly bug here; I hit it once myself while doing something else;
I'll send a patch.

> 2. Connections are occasionally DNATed to port 1024 incorrectly,
> whatever port was specified.
>
> We found 2 scenarios; both lead to this result.
>
> First, consider two virtual servers that share the same backend and are
> implemented by DNAT: both V1 and V2 are DNATed to R. While there is
> already a connection C->V1 which is DNATed as C->R, another incoming
> connection C->V2 will also be DNATed as C->R, causing a conntrack table
> conflict. Instead of dropping the packet, the connection is DNATed to
> port 1024, because the NAT function searches through ports 1024 - 65535
> when a conflict occurs.
>
> Second, if a conntrack entry is expired but not yet released, mostly in
> TIMEWAIT state, the client may reuse the same port to establish a new
> connection. When this condition is met, it also causes a conflict, and
> the connection will be DNATed to port 1024 if DNAT is used.

Yep; there is a testing gap here; I'll roll a patch.

> There are also some other problems under investigation, and I'll post
> them when we find the cause.

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
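The port-1024 symptom in item 2 is what a first-fit search over 1024 - 65535 produces when the requested port is already taken. A toy model of that reported behaviour, not the code in lib/conntrack.c; `pick_nat_port` and the `in_use` set are hypothetical.

```python
def pick_nat_port(requested, in_use, lo=1024, hi=65535):
    """Return the requested port if free, else the first free port in [lo, hi].

    Toy model of the behaviour described in the report; returns None when
    the whole range is exhausted.
    """
    if requested not in in_use:
        return requested
    for port in range(lo, hi + 1):
        if port not in in_use:
            return port
    return None
```

In this model the second connection to a backend port that is already mapped silently lands on 1024 instead of being rejected, which matches the symptom described in the report.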