Re: [ovs-dev] [PATCH v2 1/2] ovn-northd: Add hint in lflow to link back to acl

2017-04-14 Thread Ben Pfaff
On Thu, Mar 23, 2017 at 11:43:25PM -0700, Han Zhou wrote:
> It will be helpful for trouble-shooting if we can link a logical flow
> back to the ACL that generated it. This patch is to add a stage-hint as
> an external-id in lflow. The hint contains stage specific information.
> Now only lflows in ACL stages have hint, which is the ACL uuid, though
> the same mechanism can be used to add hint for other stages later.
> 
> Signed-off-by: Han Zhou 

Did you consider adding the hint to the match or the action as a
comment?  I guess that there are advantages and disadvantages to each
approach, and I'd like to hear that they've been considered.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 1/1] dpif-netdev: The pmd-*-show commands will show info in core order

2017-04-14 Thread Ben Pfaff
On Wed, Mar 22, 2017 at 11:10:23AM +0100, Eelco Chaudron wrote:
> The "ovs-appctl dpif-netdev/pmd-rxq-show" and "ovs-appctl
> dpif-netdev/pmd-stats-show" commands show their output per core_id,
> sorted on the hash location. My OCD was kicking in when using these
> commands, hence this change to display them in natural core_id order.
> 
> In addition I had to change a test case that would fail if the cores
> where not in order in the hash list. This is due to OVS assigning
> queues to cores based on the order in the hash list. The test case now
> checks if any core has the set of queues in the given order.
> 
> Manually tested this on my setup, and ran clang-analyze.
> 
> Signed-off-by: Eelco Chaudron 

This should be helpful, thanks.

sorted_poll_thread_list() worries me.  It calls cmap_count() and then
uses CMAP_FOR_EACH to iterate the list and assumes (asserts, even) that
the number of pmds does not change during iteration.  Is this safe?
If it is, then a clearer comment would be helpful, and it would be wise
to have an assertion for the exact number at the end.

Thanks,

Ben.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH] flow: Further refinements to flow_pop_vlan().

2017-04-14 Thread Ben Pfaff
This may help to suppress warnings from know-it-all compilers, and it helps
to make the code clearer too.

Reported-by: Darrell Ball 
Signed-off-by: Ben Pfaff 
---
Sorry, I didn't see v2 of the memset patch until I'd already applied v1.
Here's my version of this patch.

 lib/flow.c | 17 +
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/lib/flow.c b/lib/flow.c
index 2b1ec4fed7ef..9d2ff93e89e4 100644
--- a/lib/flow.c
+++ b/lib/flow.c
@@ -2205,16 +2205,17 @@ void
 flow_pop_vlan(struct flow *flow, struct flow_wildcards *wc)
 {
 int n = flow_count_vlan_headers(flow);
-if (n == 0) {
-return;
+if (n > 1) {
+if (wc) {
+memset(>masks.vlans[1], 0xff,
+   sizeof(union flow_vlan_hdr) * (n - 1));
+}
+memmove(>vlans[0], >vlans[1],
+sizeof(union flow_vlan_hdr) * (n - 1));
 }
-if (wc) {
-memset(>masks.vlans[1], 0xff,
-   sizeof(union flow_vlan_hdr) * (n - 1));
+if (n > 0) {
+memset(>vlans[n - 1], 0, sizeof(union flow_vlan_hdr));
 }
-memmove(>vlans[0], >vlans[1],
-sizeof(union flow_vlan_hdr) * (n - 1));
-memset(>vlans[n - 1], 0, sizeof(union flow_vlan_hdr));
 }
 
 void
-- 
2.10.2

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [patch_v1] ovs build: Fix memset with zero size warning.

2017-04-14 Thread Ben Pfaff
On Sun, Mar 19, 2017 at 10:11:09AM -0700, Darrell Ball wrote:
> In file included from /usr/include/string.h:640:0,
>  from ./lib/string.h:20,
>  from /usr/include/netinet/icmp6.h:22,
>  from ../lib/flow.h:21,
>  from ../lib/flow.c:18:
> In function 'memset',
> inlined from 'flow_push_vlan_uninit' at ../lib/flow.c:2188:19:
> /usr/include/x86_64-linux-gnu/bits/string3.h:81:30: error:
> call to '__warn_memset_zero_len' declared with attribute warning:
> memset used with constant zero length parameter; this could be
> due to transposed parameters [-Werror]
>__warn_memset_zero_len ();
>   ^
> cc1: all warnings being treated as errors
> make[2]: *** [lib/flow.lo] Error 1
> 
> Fixes: f0fb825a3785 ("Add support for 802.1ad (QinQ tunneling)")
> Signed-off-by: Darrell Ball 

Thanks, applied to master.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2] netdev-dpdk: fix ARM cross compilation failure

2017-04-14 Thread Ben Pfaff
On Tue, Mar 21, 2017 at 02:56:36PM +0800, Jianbo Liu wrote:
> The 03/21/2017 06:43, Hemant Agrawal wrote:
> >
> >
> > > -Original Message-
> > > From: Jianbo Liu [mailto:jianbo@arm.com]
> > > Sent: Tuesday, March 21, 2017 10:55 AM
> > > To: Hemant Agrawal 
> > > Cc: b...@ovn.org; d...@openvswitch.org
> > > Subject: Re: [ovs-dev] [PATCH v2] netdev-dpdk: fix ARM cross compilation
> > > failure
> > >
> > > The 03/20/2017 16:43, Hemant Agrawal wrote:
> > > > configure with_dpdk and cross compile fails with:
> > > > "configure: error: cannot check for file existence when cross
> > > > compiling"
> > > >
> > > > This is due to the usages of AC_CHECK_FILES.
> > > >
> > > > AC_CHECK_FILES only works when not cross compiling. It test a feature
> > > > of the host machine, and therefore, die when cross-compiling.
> > > >
> > > > The current patch put the check in condition, i.e. check only if not
> > > > cross-compiling.
> > >
> > > There are many defines in rte_config.h, which are important for compiling,
> > > whether it is cross compiling or not. So I disagree to remove the 
> > > checking.
> >
> >  [Hemant] can you suggest an alternate to make it work?
> 
> Copy, or execute "make config T=your_config_file" to create rte_config.h
> in your dpdk source directory.

I think that this thread reflects a misunderstanding.  I posted a patch
that matches my own understanding of the problem:
https://mail.openvswitch.org/pipermail/ovs-dev/2017-April/330941.html
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH] configure: Fix check for rte_config.h to handle cross-compilation.

2017-04-14 Thread Ben Pfaff
The check for rte_config.h in acinclude.m4 used AC_CHECK_FILE, but this
macro is intended to check for a file on the host system, not the build
system, which means that it fails unconditionally in a cross-compilation
environment.  However, the intended check here is for a header file,
which is part of the build system.  To check for part of the build system,
we can just use "test", so this commit makes that change.

Reported-by: Hemant Agrawal 
Reported-at: 
https://mail.openvswitch.org/pipermail/ovs-dev/2017-March/329994.html
Signed-off-by: Ben Pfaff 
---
 acinclude.m4 | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/acinclude.m4 b/acinclude.m4
index 744d8f89525c..842469455914 100644
--- a/acinclude.m4
+++ b/acinclude.m4
@@ -180,9 +180,10 @@ AC_DEFUN([OVS_CHECK_DPDK], [
 DPDK_INCLUDE="$with_dpdk/include"
 # If 'with_dpdk' is passed install directory, point to headers
 # installed in $DESTDIR/$prefix/include/dpdk
-AC_CHECK_FILE([$DPDK_INCLUDE/rte_config.h], [],
-  [AC_CHECK_FILE([$DPDK_INCLUDE/dpdk/rte_config.h],
- [DPDK_INCLUDE=$DPDK_INCLUDE/dpdk], [])])
+   if test ! -e "$DPDK_INCLUDE/rte_config.h" && \
+  test -e "$DPDK_INCLUDE/dpdk/rte_config.h"; then
+  DPDK_INCLUDE=$DPDK_INCLUDE/dpdk/rte_config.h
+   fi
 DPDK_LIB_DIR="$with_dpdk/lib"
 ;;
 esac
-- 
2.10.2

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [RFC] make logs readable only by owner

2017-04-14 Thread Ben Pfaff
On Thu, Mar 16, 2017 at 03:03:47PM +0100, Timothy Redaelli wrote:
> The Open vSwitch log directory and files are currently set world readable.
> 
> However, since only Open vSwitch users and processes need to access this
> directory and files there is no need to allow the world to access them,
> since it can result in the exposure of sensitive information.
> 
> Signed-off-by: Timothy Redaelli 
> ---
>  lib/vlog.c   | 2 +-
>  utilities/ovs-lib.in | 5 +++--
>  utilities/ovs-pki.in | 2 +-
>  3 files changed, 5 insertions(+), 4 deletions(-)
> 
> I didn't change the MKDIR_P in automake.mk since it will be removed in
> https://patchwork.ozlabs.org/patch/737029/

I agree with the intent of this patch.

On my Debian system, most logs are group- as well as owner-readable.
Are Red Hat systems different in this way?
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] ofp-parse: Fix match parsing with [x..y]=z format.

2017-04-14 Thread Ben Pfaff
On Thu, Apr 13, 2017 at 06:31:06PM -0700, Jarno Rajahalme wrote:
> Commit 21b2fa617126 ("ofp-parse: Allow match field names in actions
> and brackets in matches.") added support for matching a consecutive
> set of bits with the [x..y]=z format, but the copying of the parsed
> value ('z') to the match was done from a wrong offset, so that the
> actual value matched would be incorrect.
> 
> Fix this and add a test case preventing regression in future.
> 
> Fixes: 21b2fa617126 ("ofp-parse: Allow match field names in actions and 
> brackets in matches.")
> Signed-off-by: Jarno Rajahalme 

Oops, thanks for the fix!

Should the test try a multibit match too, for completeness?

Acked-by: Ben Pfaff 
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 2/2] dpif: Log packet metadata on execute.

2017-04-14 Thread Ben Pfaff
On Thu, Apr 13, 2017 at 04:47:36PM -0700, Jarno Rajahalme wrote:
> Debug log output for execute operations is missing the packet
> metadata, which can be instrumental in tracing what the datapath
> should be executing.  No reason to have the metadata on the debug
> output, so add it there.
> 
> Signed-off-by: Jarno Rajahalme 

This does seem like an important oversight.

Acked-by: Ben Pfaff 
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [branch-2.7 0/4] Backport of variable length metaflow field fixes.

2017-04-14 Thread Ben Pfaff
On Wed, Mar 15, 2017 at 04:01:37PM -0700, Joe Stringer wrote:
> Commit 04f48a68c428 ("ofp-actions: Fix variable length meta-flow OXMs."), on
> branch-2.7 as 9554b03d6ab7, attempted to address incorrect encode and decode 
> of
> variable length metaflow fields where the OXM/NXM encoding of the variable
> length fields would incorrectly serialize the length. The patch addresses this
> by introducing a new per-bridge structure that adds additional metaflow fields
> for the variable-length fields on demand when the TLVs are configured by a
> controller.
> 
> Unfortunately, in the original patch there was nothing ensuring that flows
> referring to variable length fields would retain valid field references when
> controllers reconfigure the TLVs. In practice, this could lead to a crash of
> ovs-vswitchd by configuring a TLV field, adding a flow which refers to it,
> removing the TLV field, then running some traffic that hit the configured 
> flow.
> 
> This series looks to remedy the situation by reference counting the variable
> length fields and preventing a controller from reconfiguring TLV fields when
> there are active flows whose match or actions refer to the field.
> 
> This series was applied to master, but given the size of the change and the
> minor changes necessary to apply to branch-2.7, I would feel more confident in
> backporting it if there was an extra round of review to ensure that nothing 
> was
> missed when this series was first applied to master.

Thanks a lot for backporting this.  Backporting is sometimes difficult
work and rarely rewarding, so I really appreciate seeing it done.

Who should review this?  Jarno, I see you made a comment; do you plan to
review it?

If you'd like me to review it, let me know.

Thanks,

Ben.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 5/5] dp-packet.h: Fix comments in dp_packet_source.

2017-04-14 Thread Ben Pfaff
On Sun, Mar 12, 2017 at 05:33:28PM +, Bhanuprakash Bodireddy wrote:
> Add the appropriate function and the source file.
> 
> Signed-off-by: Bhanuprakash Bodireddy 

This is obviously correct even to me, so I applied it to master.  Thank
you!

(I don't plan to review the rest of the series.)
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 2/2] ovn-util: Allow /32 IP addresses for router ports.

2017-04-14 Thread Ben Pfaff
On Thu, Mar 09, 2017 at 11:46:38PM -0800, Gurucharan Shetty wrote:
> On Google cloud, a VM gets a /32 IP address. When OVN
> is deployed on such VMs, the OVN gateway router's IP
> address becomes a /32 IP address. This commit allows
> such a configuration.
> 
> Signed-off-by: Gurucharan Shetty 

A patch that adds a feature, but only deletes code, and adds a test?
SIGN ME UP!

Acked-by: Ben Pfaff 

Thank you.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 08/10] windows: return NULL in xreadlink

2017-04-14 Thread Ben Pfaff
On Mon, Feb 06, 2017 at 04:41:41AM +, Alin Serdean wrote:
> readlink does not exist on Windows.
> 
> While we could skip the function all togheter on Windows, we may add
> support for it later on. For the moment return change errno to ENOENT
> and return NULL.
> 
> FYI:
> https://blogs.windows.com/buildingapps/2016/12/02/symlinks-windows-10/#kBeZetM7P1dorllZ.97
> https://msdn.microsoft.com/en-us/library/windows/desktop/aa365680(v=vs.85).aspx
> 
> Signed-off-by: Alin Gabriel Serdean 

Thanks, applied to master.

I was really hoping that we'd get reviews for this series from more
Windowsy folks than me, but it never happened.  I hope you'll post a
revision, and I'll do my best to review it myself.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 09/10] use portable getpagesize() in system-stats

2017-04-14 Thread Ben Pfaff
On Mon, Feb 06, 2017 at 04:41:41AM +, Alin Serdean wrote:
> Use the intended portable function defined above "get_page_size()" not
> "getpagesize()".
> 
> Signed-off-by: Alin Gabriel Serdean 

I applied this to master.  Thank you!
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 03/10] windows: add includes to daemon-windows

2017-04-14 Thread Ben Pfaff
On Mon, Feb 06, 2017 at 04:41:40AM +, Alin Serdean wrote:
> Add fatal-signal.h include since it uses: fatal_signal_atexit_handler
> and fatal_signal_add_hook
> 
> Use the defined getpid() function and also include  since
> it is defined in include/windows/unistd.h .
> 
> Signed-off-by: Alin Gabriel Serdean 
> ---
>  lib/daemon-windows.c | 6 --
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/lib/daemon-windows.c b/lib/daemon-windows.c
> index 7e2e9da..ccf4297 100644
> --- a/lib/daemon-windows.c
> +++ b/lib/daemon-windows.c
> @@ -1,5 +1,5 @@
>  /*
> - * Copyright (c) 2014 Nicira, Inc.
> + * Copyright (c) 2014, 2017 Nicira, Inc.
>   *
>   * Licensed under the Apache License, Version 2.0 (the "License");
>   * you may not use this file except in compliance with the License.
> @@ -20,7 +20,9 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include "dirs.h"
> +#include "fatal-signal.h"
>  #include "ovs-thread.h"
>  #include "poll-loop.h"
>  #include "openvswitch/vlog.h"
> @@ -476,7 +478,7 @@ make_pidfile(void)
>  
>  fatal_signal_add_hook(unlink_pidfile, NULL, NULL, true);
>  
> -fprintf(filep_pidfile, "%d\n", _getpid());
> +fprintf(filep_pidfile, "%d\n", getpid());

This seems reasonable to me, except that usual practice would be more
like this:
fprintf(filep_pidfile, "%ld\n", (long int) getpid());
because pid_t might be short or int or long.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 02/10] windows: add definition of getpid and getcwd

2017-04-14 Thread Ben Pfaff
If GetCurrentProcessId() is a reasonable substitute for getpid(), but
the return type is different, then I would suggest an inline function,
like this:

static inline pid_t
getpid(void)
{
return GetCurrentProcessId();
}

Thanks,

Ben.

On Tue, Mar 07, 2017 at 09:07:55AM +, Sairam Venugopal wrote:
> Shouldn’t we cast the DWORD to unsigned int for the GetCurrentProcessId?
> 
> 
> 
> 
> On 2/5/17, 8:41 PM, "ovs-dev-boun...@openvswitch.org on behalf of Alin 
> Serdean"  aserd...@cloudbasesolutions.com> wrote:
> 
> >getcwd - is used in lib/util.c. getcwd is deprecated on Windows but has
> >_getcwd which is defined in :
> >https://urldefense.proofpoint.com/v2/url?u=https-3A__msdn.microsoft.com_en-2Dus_library_sf98bd4y-28v-3Dvs.120-29.aspx=DwICAg=uilaK90D4TOVoH58JNXRgQ=Z6vowHUOjP5ysP_g372c49Nqc1vEKqHKNBkR5Q5Z7uo=og4savUMMSe8GoOfKq6AMAirivJFLgVTMx5lx7hx6gk=CVjSRN456APj3-mMAQuXYxdJ4oUgduwqZHzkod6cLvQ=
> > 
> >
> >getpid - is used in several files (i.e. lib/vlog.c). getpid
> >is also and deprecated and _getpid should be used:
> >https://urldefense.proofpoint.com/v2/url?u=https-3A__msdn.microsoft.com_en-2Dus_library_t2y34y40-28v-3Dvs.120-29.aspx=DwICAg=uilaK90D4TOVoH58JNXRgQ=Z6vowHUOjP5ysP_g372c49Nqc1vEKqHKNBkR5Q5Z7uo=og4savUMMSe8GoOfKq6AMAirivJFLgVTMx5lx7hx6gk=pDh2W8ECiQdxZdHgHBdWHIDhLPcTJ9A6rb2n1YcRZ94=
> > 
> >The problem using _getpid is that the definition is in .
> >A file called process.h also exists in the lib folder. This will mess up
> >includes.
> >An option would be to use a wrapper like we use for lib/string.h(.in) but
> >that would mean to also add it to the automake chain.
> >A simple solution would be to map it to GetCurrentProcessId
> >https://urldefense.proofpoint.com/v2/url?u=https-3A__msdn.microsoft.com_en-2Dus_library_windows_desktop_ms683180-28v-3Dvs.85-29.aspx=DwICAg=uilaK90D4TOVoH58JNXRgQ=Z6vowHUOjP5ysP_g372c49Nqc1vEKqHKNBkR5Q5Z7uo=og4savUMMSe8GoOfKq6AMAirivJFLgVTMx5lx7hx6gk=QWV0dTQAbL1Jt9ZeQeUAUs-WBb8w5YW0mn1cHxFeaZs=
> > 
> >The disadvantage is the type but Windows recycles pids so in theory
> >it should be ok.
> >
> >Signed-off-by: Alin Gabriel Serdean 
> >---
> > include/windows/unistd.h | 3 +++
> > 1 file changed, 3 insertions(+)
> >
> >diff --git a/include/windows/unistd.h b/include/windows/unistd.h
> >index 8629f7e..3f92616 100644
> >--- a/include/windows/unistd.h
> >+++ b/include/windows/unistd.h
> >@@ -18,8 +18,11 @@
> > 
> > #define WIN32_LEAN_AND_MEAN
> > #include 
> >+#include 
> > 
> > #define fsync _commit
> >+#define getpid GetCurrentProcessId
> >+#define getcwd _getcwd
> > 
> > /* Standard file descriptors.  */
> > #define STDIN_FILENO0   /* Standard input.  */
> >-- 
> >2.10.2.windows.1
> >___
> >dev mailing list
> >d...@openvswitch.org
> >https://urldefense.proofpoint.com/v2/url?u=https-3A__mail.openvswitch.org_mailman_listinfo_ovs-2Ddev=DwICAg=uilaK90D4TOVoH58JNXRgQ=Z6vowHUOjP5ysP_g372c49Nqc1vEKqHKNBkR5Q5Z7uo=og4savUMMSe8GoOfKq6AMAirivJFLgVTMx5lx7hx6gk=864IlShezC_8X9CmDoHDkTjOqBJ3IcRu1LeeoRJrhdM=
> > 
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] doc: Apply flake8 to conf.py also.

2017-04-14 Thread Ben Pfaff
On Sat, Apr 15, 2017 at 01:43:09AM +0100, Stephen Finucane wrote:
> On Fri, 2017-04-14 at 14:18 -0700, Ben Pfaff wrote:
> > Suggested-by: Stephen Finucane 
> > Signed-off-by: Ben Pfaff 
> 
> Acked-by: Stephen Finucane 

Thanks, applied to master.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v4 3/3] ovn-northd: Add logical flows to support native DNS

2017-04-14 Thread Ben Pfaff
On Mon, Apr 03, 2017 at 04:10:27PM +0530, nusid...@redhat.com wrote:
> From: Numan Siddique 
> 
> OVN implements native DNS resolution which can be used to resolve the
> internal DNS names belonging to a logical datapath.
> 
> To support this, a new table 'DNS' is added in the NB DB. A new column
> 'dns_records' is added in 'Logical_Switch' table which references to the
> 'DNS' table.
> 
> Following flows are added for each logical switch if configured with
> DNS records in the 'dns_records' column
>  - A logical flow in DNS_LOOKUP stage which uses the action 'dns_lookup'
>to transform the DNS query to DNS reply packet and advances
>to the next stage - DNS_RESPONSE.
> 
>  - A logical flow in DNS_RESPONSE stage which implements the DNS responder
>by sending the DNS reply from previous stage back to the inport.
> 
> Signed-off-by: Numan Siddique 

Thank you very much!

I see a few uses of  in ovn-northd.8.xml.  I think that you can
just write - instead.

ovn-northd.c uses smap_count() in a few places to determine whether an
smap is nonempty.  I would prefer !smap_is_empty(), instead.

It looks to me like at least two places have code that check for at
least one DNS record in an ovn_datapath.  I think that a helper function
could clarify and reduce the amount of code.

For iteration with HMAP_FOR_EACH and similar constructs, please write a
space after HMAP_FOR_EACH, so that it looks a little more like the
formatting we use for a "for" statement in OVS.

sync_dns_entries() has two different HMAP_FOR_EACH_WITH_HASH loops that
search for a dns_info struct with a particular uuid.  I think that it
would be better to write a helper function, to save code and make the
intent clearer.

sync_dns_entries() calls sbrec_dns_set_datapaths() and
sbrec_dns_set_records() in two places.  I think that this could be done
in a single place, in the final loop (although it might require an extra
'delete' member in struct dns_info) or it could be a helper function.

Thanks again!

Ben
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v4 2/3] ovn-controller: Add 'dns_lookup' action

2017-04-14 Thread Ben Pfaff
On Mon, Apr 03, 2017 at 04:10:15PM +0530, nusid...@redhat.com wrote:
> From: Numan Siddique 
> 
> This patch adds a new OVN action 'dns_lookup' to support native DNS.
> ovn-controller parses this action and adds a NXT_PACKET_IN2
> OF flow with 'pause' flag set.
> 
> A new table 'DNS' is added in the SB DB to look up and resolve
> the DNS queries. When a valid DNS packet is received by
> ovn-controller, it looks up the DNS name in the 'DNS' table
> and if successful, it frames a DNS reply, resumes the packet
> and stores 1 in the 1-bit subfield. If the packet is invalid
> or cannot be resolved, it resumes the packet without any
> modifications and stores 0 in the 1-bit subfield.
> 
> reg0[4] = dns_lookup(); next;
> 
> An upcoming patch will use this action and adds logical flows.
> 
> Signed-off-by: Numan Siddique 

Thanks for revising this!  DNS is important and it goes one step closer
to offering everything that OpenStack needs from OVN.

I think that Guru is planning to review this too, particularly regarding
the database schema (and its documentation, I hope), but I'll offer some
preliminary comments of my own.

I tend to think that the DNS protocol declarations should go in
lib/packets.h, where OVS generally keeps protocol declarations, instead
of ovn/lib/ovn-util.h.

I think that the ovn-trace implementation of dns_lookup should at least
set the destination bit to 0, to indicate that the lookup failed.

I suggest folding in the following incremental to make the code easier
to read.  (I haven't tested it.)

Thanks,

Ben.

--8<--cut here-->8--

diff --git a/ovn/controller/pinctrl.c b/ovn/controller/pinctrl.c
index 9c669c96884a..e349817e7f4c 100644
--- a/ovn/controller/pinctrl.c
+++ b/ovn/controller/pinctrl.c
@@ -662,6 +662,18 @@ exit:
 }
 
 static void
+put_be16(struct ofpbuf *buf, ovs_be16 x)
+{
+ofpbuf_put(buf, , sizeof x);
+}
+
+static void
+put_be32(struct ofpbuf *buf, ovs_be32 x)
+{
+ofpbuf_put(buf, , sizeof x);
+}
+
+static void
 pinctrl_handle_dns_lookup(
 struct dp_packet *pkt_in, struct ofputil_packet_in *pin,
 struct ofpbuf *userdata, struct ofpbuf *continuation,
@@ -722,27 +734,19 @@ pinctrl_handle_dns_lookup(
 /* Extract the hostname. If the hostname is - 'www.ovn.org' it would be
  * encoded as (in hex) - 03 77 77 77 03 6f 76 63 03 6f 72 67 00.
  */
-bool hostname_present = true;
 while ((in_dns_data + idx) < end && in_dns_data[idx]) {
 uint8_t label_len = in_dns_data[idx++];
 if (in_dns_data + idx + label_len > end) {
-hostname_present = false;
-break;
-}
-for (uint8_t i = 0; i < label_len; i++) {
-ds_put_char(, in_dns_data[idx++]);
+ds_destroy();
+goto exit;
 }
+ds_put_buffer(, (const char *) in_dns_data + idx, label_len);
+idx += label_len;
 ds_put_char(, '.');
 }
 
-if (!hostname_present) {
-ds_destroy();
-goto exit;
-}
-
 idx++;
 ds_chomp(, '.');
-ds_put_char(, 0);
 in_dns_data += idx;
 
 /* Query should have TYPE and CLASS fields */
@@ -809,35 +813,25 @@ pinctrl_handle_dns_lookup(
  *be IP address of the domain name.
  */
 ofpbuf_put(_answer, in_hostname, idx);
-ovs_be16 v = htons(DNS_QUERY_TYPE_A);
-ofpbuf_put(_answer, , sizeof(ovs_be16));
-v = htons(DNS_CLASS_IN);
-ofpbuf_put(_answer, , sizeof(ovs_be16));
-ovs_be32 ttl = htonl(DNS_DEFAULT_RR_TTL);
-ofpbuf_put(_answer, , sizeof(ovs_be32));
-v = htons(sizeof(ovs_be32)); /* Length of the RDATA field */
-ofpbuf_put(_answer, , sizeof(ovs_be16));
-ofpbuf_put(_answer, _addrs.ipv4_addrs[i].addr,
-   sizeof(ovs_be32));
+put_be16(_answer, htons(DNS_QUERY_TYPE_A));
+put_be16(_answer, htons(DNS_CLASS_IN));
+put_be32(_answer, htonl(DNS_DEFAULT_RR_TTL));
+put_be16(_answer, htons(sizeof(ovs_be32)));
+put_be32(_answer, ip_addrs.ipv4_addrs[i].addr);
 ancount++;
 }
 }
 
 if (query_type == DNS_QUERY_TYPE_ ||
-query_type == DNS_QUERY_TYPE_ANY) {
+query_type == DNS_QUERY_TYPE_ANY) {
 for (size_t i = 0; i < ip_addrs.n_ipv6_addrs; i++) {
 ofpbuf_put(_answer, in_hostname, idx);
-ovs_be16 v = htons(DNS_QUERY_TYPE_);
-ofpbuf_put(_answer, , sizeof(ovs_be16));
-v = htons(DNS_CLASS_IN);
-ofpbuf_put(_answer, , sizeof(ovs_be16));
-ovs_be32 ttl = htonl(DNS_DEFAULT_RR_TTL);
-ofpbuf_put(_answer, , sizeof(ovs_be32));
-/* Length of the RDATA field */
-v = htons(sizeof(ip_addrs.ipv6_addrs[i].addr.s6_addr));
-ofpbuf_put(_answer, , sizeof(ovs_be16));
-   

Re: [ovs-dev] [PATCH RFC] dpif-netdev: Add Cuckoo Distributor to Accelerate Megaflow Search

2017-04-14 Thread Wang, Yipeng1
Thank you Darrell for the comments. Please take a look at my reply inlined.

> -Original Message-
> From: Darrell Ball [mailto:db...@vmware.com]
> Sent: Thursday, April 13, 2017 10:36 PM
> To: Wang, Yipeng1 ; d...@openvswitch.org
> Subject: Re: [ovs-dev] [PATCH RFC] dpif-netdev: Add Cuckoo Distributor to
> Accelerate Megaflow Search
> 
> 
> 
> On 4/6/17, 2:48 PM, "ovs-dev-boun...@openvswitch.org on behalf of
> yipeng1.w...@intel.com"  yipeng1.w...@intel.com> wrote:
> 
> From: Yipeng Wang 
> 
> The Datapath Classifier uses tuple space search for flow classification.
> The rules are arranged into a set of tuples/subtables (each with a
> distinct mask).  Each subtable is implemented as a hash table and lookup
> is done with flow keys formed by selecting the bits from the packet header
> based on each subtable's mask. Tuple space search will sequentially search
> each subtable until a match is found. With a large number of subtables, a
> sequential search of the subtables could consume a lot of CPU cycles. In
> a testbench with a uniform traffic pattern equally distributed across 20
> subtables, we measured that up to 65% of total execution time is 
> attributed
> to the megaflow cache lookup.
> 
> This patch presents the idea of the two-layer hierarchical lookup, where a
> low overhead first level of indirection is accessed first, we call this
> level cuckoo distributor (CD). If a flow key has been inserted in the flow
> table the first level will indicate with high probability that which
> subtable to look into. A lookup is performed on the second level (the
> target subtable) to retrieve the result. If the key doesn’t have a match,
> then we revert back to the sequential search of subtables.
> 
> This patch can improve the already existing Subtable Ranking when traffic
> data has high entropy. Subtable Ranking helps minimize the number of
> traversed subtables when most of the traffic hit the same subtable.
> However, in the case of high entropy traffic such as traffic coming from
> a physical port, multiple subtables could be hit with a similar frequency.
> In this case the average subtable lookups per hit would be much greater
> than 1. In addition, CD can adaptively turn off when it finds the traffic
> mostly hit one subtable. Thus, CD will not be an overhead when Subtable
> Ranking works well.
> 
> Scheme:
> 
>  ---
> |  CD   |
>  ---
>\
> \
>  -  - -
> |sub  ||sub  |...|sub  |
> |table||table|   |table|
>  -  - -
> 
> Evaluation:
> 
> We create set of rules with various src IP. We feed traffic containing 1
> million flows with various src IP and dst IP. All the flows hit 10/20/30
> rules creating 10/20/30 subtables.
> 
> The table below shows the preliminary continuous testing results (full 
> line
> speed test) we collected with a uni-directional port-to-port setup. The
> machine we tested on is a Xeon E5 server running with 2.2GHz cores. OvS
> runs with 1 PMD. We use Spirent as the hardware traffic generator.
> 
> no.subtable: 10  20  30
> cd-ovs   3895961 3170530 2968555
> orig-ovs 2683455 1646227 1240501
> speedup  1.45x   1.92x   2.39x
> 
> 
> I have a few initial comments.
> 1) Can you present the numbers with and without __AVX2__ “enabled”.'
[Wang, Yipeng] We mainly test with AVX2 to find the upper-bound performance 
speedup of the design.  Throughput-wise, we have not optimized for the scalar 
version thus we did not present the results. If people are interested in this 
patch, we will update the implementation to consider the performance for both 
AVX and scalar in Version 2 and report the results. We may design different 
structure (mainly different entry count per bucket) for scalar and AVX to 
optimize the performance.

> 2) Can you present the numbers with say 2 and say 10 flows for some
> comparison.
[Wang, Yipeng] As long as flows cannot all fit in EMC, CD should benefit.  
Generally, CD benefit more when there are more flows fall out of EMC. We 
collect the new results and report them as following:

2 flows:
  no.subtable: 10  20  30
cd-ovs   4267332 3478251 3126763
orig-ovs 3260883 2174551 1689981
speedup  1.31x   1.60x1.85x

10 flows:
  no.subtable: 10  20  30
cd-ovs   40157833276100   2970645
orig-ovs 269288217119551302321
speedup  1.49x   1.91x   2.28x  

> 3) Is the below logic conservative for when CD would provide benefit in the 
> case
> of the 1 million flow test for example ?
> if (avg_table_cnt >= 1) {
> 

[ovs-dev] [PATCH 3/3] ofproto-dpif: Check support for resubmit with conntrack action.

2017-04-14 Thread Jarno Rajahalme
Use the existing probed support flag for the original direction tuple
to determine if resubmit(ct) can be executed or not.

Signed-off-by: Jarno Rajahalme 
---
 ofproto/ofproto-dpif.c | 86 ++
 1 file changed, 45 insertions(+), 41 deletions(-)

diff --git a/ofproto/ofproto-dpif.c b/ofproto/ofproto-dpif.c
index 25f8adf..23afc76 100644
--- a/ofproto/ofproto-dpif.c
+++ b/ofproto/ofproto-dpif.c
@@ -4155,13 +4155,12 @@ check_mask(struct ofproto_dpif *ofproto, const struct 
miniflow *flow)
 }
 
 static void
-report_unsupported_ct(const char *detail)
+report_unsupported_act(const char *action, const char *detail)
 {
 static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
-VLOG_WARN_RL(, "Rejecting ct action because datapath does not support "
- "ct action%s%s (your kernel module may be out of date)",
- detail ? " " : "",
- detail ? detail : "");
+VLOG_WARN_RL(, "Rejecting %s action because datapath does not support"
+ "%s%s (your kernel module may be out of date)",
+ action, detail ? " " : "", detail ? detail : "");
 }
 
 static enum ofperr
@@ -4169,51 +4168,56 @@ check_actions(const struct ofproto_dpif *ofproto,
   const struct rule_actions *const actions)
 {
 const struct ofpact *ofpact;
+const struct odp_support *support = >backer->support.odp;
 
 OFPACT_FOR_EACH (ofpact, actions->ofpacts, actions->ofpacts_len) {
-const struct odp_support *support;
-const struct ofpact_conntrack *ct;
-const struct ofpact *a;
+if (ofpact->type == OFPACT_CT) {
+const struct ofpact_conntrack *ct;
+const struct ofpact *a;
 
-if (ofpact->type != OFPACT_CT) {
-continue;
-}
+ct = CONTAINER_OF(ofpact, struct ofpact_conntrack, ofpact);
 
-ct = CONTAINER_OF(ofpact, struct ofpact_conntrack, ofpact);
-support = >backer->support.odp;
+if (!support->ct_state) {
+report_unsupported_act("ct", "ct action");
+return OFPERR_OFPBAC_BAD_TYPE;
+}
+if ((ct->zone_imm || ct->zone_src.field) && !support->ct_zone) {
+report_unsupported_act("ct", "ct zones");
+return OFPERR_OFPBAC_BAD_ARGUMENT;
+}
+/* So far the force commit feature is implemented together with the
+ * original direction tuple feature by all datapaths, so we use the
+ * support flag for the 'ct_orig_tuple' to indicate support for the
+ * force commit feature as well. */
+if ((ct->flags & NX_CT_F_FORCE) && !support->ct_orig_tuple) {
+report_unsupported_act("ct", "force commit");
+return OFPERR_OFPBAC_BAD_ARGUMENT;
+}
 
-if (!support->ct_state) {
-report_unsupported_ct(NULL);
-return OFPERR_OFPBAC_BAD_TYPE;
-}
-if ((ct->zone_imm || ct->zone_src.field) && !support->ct_zone) {
-report_unsupported_ct("zone");
-return OFPERR_OFPBAC_BAD_ARGUMENT;
-}
-/* So far the force commit feature is implemented together with the
- * original direction tuple feature by all datapaths, so we use the
- * support flag for the 'ct_orig_tuple' to indicate support for the
- * force commit feature as well. */
-if ((ct->flags & NX_CT_F_FORCE) && !support->ct_orig_tuple) {
-report_unsupported_ct("force commit");
-return OFPERR_OFPBAC_BAD_ARGUMENT;
-}
+OFPACT_FOR_EACH(a, ct->actions, ofpact_ct_get_action_len(ct)) {
+const struct mf_field *dst = ofpact_get_mf_dst(a);
 
-OFPACT_FOR_EACH(a, ct->actions, ofpact_ct_get_action_len(ct)) {
-const struct mf_field *dst = ofpact_get_mf_dst(a);
+if (a->type == OFPACT_NAT && !support->ct_state_nat) {
+/* The backer doesn't seem to support the NAT bits in
+ * 'ct_state': assume that it doesn't support the NAT
+ * action. */
+report_unsupported_act("ct", "nat");
+return OFPERR_OFPBAC_BAD_TYPE;
+}
+if (dst && ((dst->id == MFF_CT_MARK && !support->ct_mark) ||
+(dst->id == MFF_CT_LABEL && !support->ct_label))) {
+report_unsupported_act("ct", "setting mark and/or label");
+return OFPERR_OFPBAC_BAD_SET_ARGUMENT;
+}
+}
+} else if (ofpact->type == OFPACT_RESUBMIT) {
+struct ofpact_resubmit *resubmit = ofpact_get_RESUBMIT(ofpact);
 
-if (a->type == OFPACT_NAT && !support->ct_state_nat) {
-/* The backer doesn't seem to support the NAT bits in
- * 'ct_state': assume that it doesn't support 

[ovs-dev] [PATCH 2/3] ofproto-dpif: Check if original direction matches are supported.

2017-04-14 Thread Jarno Rajahalme
Use the existing probed support flag for the original direction tuple
to determine if matches on the original direction tuple can be supported.

Signed-off-by: Jarno Rajahalme 
---
 ofproto/ofproto-dpif.c | 14 +-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/ofproto/ofproto-dpif.c b/ofproto/ofproto-dpif.c
index c0212f2..25f8adf 100644
--- a/ofproto/ofproto-dpif.c
+++ b/ofproto/ofproto-dpif.c
@@ -4122,7 +4122,8 @@ check_mask(struct ofproto_dpif *ofproto, const struct 
miniflow *flow)
 support = >backer->support.odp;
 ct_state = MINIFLOW_GET_U8(flow, ct_state);
 if (support->ct_state && support->ct_zone && support->ct_mark
-&& support->ct_label && support->ct_state_nat) {
+&& support->ct_label && support->ct_state_nat
+&& support->ct_orig_tuple) {
 return ct_state & CS_UNSUPPORTED_MASK ? OFPERR_OFPBMC_BAD_MASK : 0;
 }
 
@@ -4139,6 +4140,17 @@ check_mask(struct ofproto_dpif *ofproto, const struct 
miniflow *flow)
 return OFPERR_OFPBMC_BAD_MASK;
 }
 
+if (!support->ct_orig_tuple &&
+(MINIFLOW_GET_U8(flow, ct_nw_proto) ||
+ MINIFLOW_GET_U16(flow, ct_tp_src) ||
+ MINIFLOW_GET_U16(flow, ct_tp_dst) ||
+ MINIFLOW_GET_U32(flow, ct_nw_src) ||
+ MINIFLOW_GET_U32(flow, ct_nw_dst) ||
+ !ovs_u128_is_zero(MINIFLOW_GET_U128(flow, ct_ipv6_src)) ||
+ !ovs_u128_is_zero(MINIFLOW_GET_U128(flow, ct_ipv6_dst {
+return OFPERR_OFPBMC_BAD_MASK;
+}
+
 return 0;
 }
 
-- 
2.1.4

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH 1/3] ofproto-dpif: Check support for CT action force commit flag.

2017-04-14 Thread Jarno Rajahalme
So far the force commit feature is implemented together with the
original direction tuple feature by all datapaths, so we can use the
support flag for the 'ct_orig_tuple' to indicate support for the force
commit feature as well.

Better fail the flow install than rely on ovs-vswitchd log being
filled by error messages from the datapath.

Signed-off-by: Jarno Rajahalme 
---
 ofproto/ofproto-dpif.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/ofproto/ofproto-dpif.c b/ofproto/ofproto-dpif.c
index 6a5ffb9..c0212f2 100644
--- a/ofproto/ofproto-dpif.c
+++ b/ofproto/ofproto-dpif.c
@@ -4178,6 +4178,14 @@ check_actions(const struct ofproto_dpif *ofproto,
 report_unsupported_ct("zone");
 return OFPERR_OFPBAC_BAD_ARGUMENT;
 }
+/* So far the force commit feature is implemented together with the
+ * original direction tuple feature by all datapaths, so we use the
+ * support flag for the 'ct_orig_tuple' to indicate support for the
+ * force commit feature as well. */
+if ((ct->flags & NX_CT_F_FORCE) && !support->ct_orig_tuple) {
+report_unsupported_ct("force commit");
+return OFPERR_OFPBAC_BAD_ARGUMENT;
+}
 
 OFPACT_FOR_EACH(a, ct->actions, ofpact_ct_get_action_len(ct)) {
 const struct mf_field *dst = ofpact_get_mf_dst(a);
-- 
2.1.4

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v4 1/3] ovn-util: Add a new util function extract_ip_addresses

2017-04-14 Thread Ben Pfaff
On Mon, Apr 03, 2017 at 04:10:00PM +0530, nusid...@redhat.com wrote:
> From: Numan Siddique 
> 
> An upcoming commit will use this function to extract the IP (v4 and
> v6) addresses from a string without a preceding eth address.
> 
> Signed-off-by: Numan Siddique 

Acked-by: Ben Pfaff 
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] Layer seven processing

2017-04-14 Thread Ben Pfaff
On Sat, Apr 08, 2017 at 06:49:38PM +0430, sougol gheissi wrote:
> I have seen a video on the OVS conference 2014 named OpenvSwitch L7
> matchers & conntrack metadata, which explains the relationship between OVS
> and the nfq(network filter queue). I want to know about how to connect the
> OVS and the nfq. Is there any approach to it?!

To the best of my recollection, that talk (by Franck Baudin) was about a
proposal that did not ultimately make it into OVS.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] doc: Add back target for datapath testing section.

2017-04-14 Thread Ben Pfaff
On Fri, Apr 14, 2017 at 04:04:08PM -0700, Andy Zhou wrote:
> On Fri, Apr 14, 2017 at 3:57 PM, Ben Pfaff  wrote:
> > This one really is used.  I'm honestly not sure how I missed this, because
> > I did test things.
> >
> > Fixes: 9ef589e4daf7 ("doc: Remove some link targets that aren't used 
> > anywhere.")
> > Reported-by: Andy Zhou 
> > Signed-off-by: Ben Pfaff 
> 
> Thanks for the quick turnaround.
> 
> Acked-by: Andy Zhou 

Thanks, applied to master.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [datapath backport v2 09/10] datapath: openvswitch: Optimize sample action for the clone use cases

2017-04-14 Thread Andy Zhou
Upstream commit:
openvswitch: Optimize sample action for the clone use cases

With the introduction of open flow 'clone' action, the OVS user space
can now translate the 'clone' action into kernel datapath 'sample'
action, with 100% probability, to ensure that the clone semantics,
which is that the packet seen by the clone action is the same as the
packet seen by the action after clone, is faithfully carried out
in the datapath.

While the sample action in the datpath has the matching semantics,
its implementation is only optimized for its original use.
Specifically, there are two limitation: First, there is a 3 level of
nesting restriction, enforced at the flow downloading time. This
limit turns out to be too restrictive for the 'clone' use case.
Second, the implementation avoid recursive call only if the sample
action list has a single userspace action.

The main optimization implemented in this series removes the static
nesting limit check, instead, implement the run time recursion limit
check, and recursion avoidance similar to that of the 'recirc' action.
This optimization solve both #1 and #2 issues above.

One related optimization attempts to avoid copying flow key as
long as the actions enclosed does not change the flow key. The
detection is performed only once at the flow downloading time.

Another related optimization is to rewrite the action list
at flow downloading time in order to save the fast path from parsing
the sample action list in its original form repeatedly.

Signed-off-by: Andy Zhou 
Acked-by: Pravin B Shelar 
Signed-off-by: David S. Miller 

Upstream: 798c166173ff ("openvswitch: Optimize sample action for the clone use 
cases")
Signed-off-by: Andy Zhou 
Acked-by: Joe Stringer 

---
v1->v2: Move U32_MAX definition to
   datapath/linux/compat/include/linux/kernel.h
---
 datapath/actions.c| 107 
 datapath/datapath.h   |   2 -
 datapath/flow_netlink.c   | 141 +++---
 datapath/linux/compat/include/linux/kernel.h  |   5 +
 datapath/linux/compat/include/linux/openvswitch.h |  15 +++
 5 files changed, 172 insertions(+), 98 deletions(-)

diff --git a/datapath/actions.c b/datapath/actions.c
index f300307b422b..99e9a88e13c5 100644
--- a/datapath/actions.c
+++ b/datapath/actions.c
@@ -912,73 +912,70 @@ static int output_userspace(struct datapath *dp, struct 
sk_buff *skb,
return err;
 }
 
+/* When 'last' is true, sample() should always consume the 'skb'.
+ * Otherwise, sample() should keep 'skb' intact regardless what
+ * actions are executed within sample().
+ */
 static int sample(struct datapath *dp, struct sk_buff *skb,
  struct sw_flow_key *key, const struct nlattr *attr,
- const struct nlattr *actions, int actions_len)
+ bool last)
 {
-   const struct nlattr *acts_list = NULL;
-   const struct nlattr *a;
-   int rem;
-   u32 cutlen = 0;
+   struct nlattr *actions;
+   struct nlattr *sample_arg;
+   struct sw_flow_key *orig_key = key;
+   int rem = nla_len(attr);
+   int err = 0;
+   const struct sample_arg *arg;
 
-   for (a = nla_data(attr), rem = nla_len(attr); rem > 0;
-a = nla_next(a, )) {
-   u32 probability;
+   /* The first action is always 'OVS_SAMPLE_ATTR_ARG'. */
+   sample_arg = nla_data(attr);
+   arg = nla_data(sample_arg);
+   actions = nla_next(sample_arg, );
 
-   switch (nla_type(a)) {
-   case OVS_SAMPLE_ATTR_PROBABILITY:
-   probability = nla_get_u32(a);
-   if (!probability || prandom_u32() > probability)
-   return 0;
-   break;
-
-   case OVS_SAMPLE_ATTR_ACTIONS:
-   acts_list = a;
-   break;
-   }
+   if ((arg->probability != U32_MAX) &&
+   (!arg->probability || prandom_u32() > arg->probability)) {
+   if (last)
+   consume_skb(skb);
+   return 0;
}
 
-   rem = nla_len(acts_list);
-   a = nla_data(acts_list);
-
-   /* Actions list is empty, do nothing */
-   if (unlikely(!rem))
+   /* Unless the last action, sample works on the clone of SKB.  */
+   skb = last ? skb : skb_clone(skb, GFP_ATOMIC);
+   if (!skb) {
+   /* Out of memory, skip this sample action.
+*/
return 0;
+   }
 
-   /* The only known usage of sample action is having a single user-space
-* action, or having a truncate action followed by a single user-space
-* action. Treat this usage as a special case.
-* The 

[ovs-dev] [datapath backport v2 03/10] compat: ipv6: orphan skbs in reassembly unit.

2017-04-14 Thread Andy Zhou
From: Eric Dumazet 

Upstream commit:
ipv6: orphan skbs in reassembly unit

Andrey reported a use-after-free in IPv6 stack.

Issue here is that we free the socket while it still has skb
in TX path and in some queues.

It happens here because IPv6 reassembly unit messes skb->truesize,
breaking skb_set_owner_w() badly.

We fixed a similar issue for IPV4 in commit 8282f27449bf ("inet: frag:
Always orphan skbs inside ip_defrag()")
Acked-by: Joe Stringer 

==
BUG: KASAN: use-after-free in sock_wfree+0x118/0x120
Read of size 8 at addr 880062da0060 by task a.out/4140

page:ea00018b6800 count:1 mapcount:0 mapping:  (null)
index:0x0 compound_mapcount: 0
flags: 0x1008100(slab|head)
raw: 01008100   000180130013
raw: dead0100 dead0200 88006741f140 
page dumped because: kasan: bad access detected

CPU: 0 PID: 4140 Comm: a.out Not tainted 4.10.0-rc3+ #59
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:15
 dump_stack+0x292/0x398 lib/dump_stack.c:51
 describe_address mm/kasan/report.c:262
 kasan_report_error+0x121/0x560 mm/kasan/report.c:370
 kasan_report mm/kasan/report.c:392
 __asan_report_load8_noabort+0x3e/0x40 mm/kasan/report.c:413
 sock_flag ./arch/x86/include/asm/bitops.h:324
 sock_wfree+0x118/0x120 net/core/sock.c:1631
 skb_release_head_state+0xfc/0x250 net/core/skbuff.c:655
 skb_release_all+0x15/0x60 net/core/skbuff.c:668
 __kfree_skb+0x15/0x20 net/core/skbuff.c:684
 kfree_skb+0x16e/0x4e0 net/core/skbuff.c:705
 inet_frag_destroy+0x121/0x290 net/ipv4/inet_fragment.c:304
 inet_frag_put ./include/net/inet_frag.h:133
 nf_ct_frag6_gather+0x1125/0x38b0 
net/ipv6/netfilter/nf_conntrack_reasm.c:617
 ipv6_defrag+0x21b/0x350 net/ipv6/netfilter/nf_defrag_ipv6_hooks.c:68
 nf_hook_entry_hookfn ./include/linux/netfilter.h:102
 nf_hook_slow+0xc3/0x290 net/netfilter/core.c:310
 nf_hook ./include/linux/netfilter.h:212
 __ip6_local_out+0x52c/0xaf0 net/ipv6/output_core.c:160
 ip6_local_out+0x2d/0x170 net/ipv6/output_core.c:170
 ip6_send_skb+0xa1/0x340 net/ipv6/ip6_output.c:1722
 ip6_push_pending_frames+0xb3/0xe0 net/ipv6/ip6_output.c:1742
 rawv6_push_pending_frames net/ipv6/raw.c:613
 rawv6_sendmsg+0x2cff/0x4130 net/ipv6/raw.c:927
 inet_sendmsg+0x164/0x5b0 net/ipv4/af_inet.c:744
 sock_sendmsg_nosec net/socket.c:635
 sock_sendmsg+0xca/0x110 net/socket.c:645
 sock_write_iter+0x326/0x620 net/socket.c:848
 new_sync_write fs/read_write.c:499
 __vfs_write+0x483/0x760 fs/read_write.c:512
 vfs_write+0x187/0x530 fs/read_write.c:560
 SYSC_write fs/read_write.c:607
 SyS_write+0xfb/0x230 fs/read_write.c:599
 entry_SYSCALL_64_fastpath+0x1f/0xc2 arch/x86/entry/entry_64.S:203
RIP: 0033:0x7ff26e6f5b79
RSP: 002b:7ff268e0ed98 EFLAGS: 0206 ORIG_RAX: 0001
RAX: ffda RBX: 7ff268e0f9c0 RCX: 7ff26e6f5b79
RDX: 0010 RSI: 20f50fe1 RDI: 0003
RBP: 7ff26ebc1220 R08:  R09: 
R10:  R11: 0206 R12: 
R13: 7ff268e0f9c0 R14: 7ff26efec040 R15: 0003

The buggy address belongs to the object at 880062da
 which belongs to the cache RAWv6 of size 1504
The buggy address 880062da0060 is located 96 bytes inside
 of 1504-byte region [880062da, 880062da05e0)

Freed by task 4113:
 save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57
 save_stack+0x43/0xd0 mm/kasan/kasan.c:502
 set_track mm/kasan/kasan.c:514
 kasan_slab_free+0x73/0xc0 mm/kasan/kasan.c:578
 slab_free_hook mm/slub.c:1352
 slab_free_freelist_hook mm/slub.c:1374
 slab_free mm/slub.c:2951
 kmem_cache_free+0xb2/0x2c0 mm/slub.c:2973
 sk_prot_free net/core/sock.c:1377
 __sk_destruct+0x49c/0x6e0 net/core/sock.c:1452
 sk_destruct+0x47/0x80 net/core/sock.c:1460
 __sk_free+0x57/0x230 net/core/sock.c:1468
 sk_free+0x23/0x30 net/core/sock.c:1479
 sock_put ./include/net/sock.h:1638
 sk_common_release+0x31e/0x4e0 net/core/sock.c:2782
 rawv6_close+0x54/0x80 net/ipv6/raw.c:1214
 inet_release+0xed/0x1c0 net/ipv4/af_inet.c:425
 inet6_release+0x50/0x70 net/ipv6/af_inet6.c:431
 sock_release+0x8d/0x1e0 net/socket.c:599
 sock_close+0x16/0x20 net/socket.c:1063
 __fput+0x332/0x7f0 fs/file_table.c:208
 fput+0x15/0x20 fs/file_table.c:244
 task_work_run+0x19b/0x270 kernel/task_work.c:116
 exit_task_work ./include/linux/task_work.h:21
 do_exit+0x186b/0x2800 kernel/exit.c:839
 do_group_exit+0x149/0x420 

[ovs-dev] [datapath backport v2 04/10] datapath: actions: fixed a brace coding style warning.

2017-04-14 Thread Andy Zhou
From: Peter Downs 

Upstream commit:
openvswitch: actions: fixed a brace coding style warning

Fixed a brace coding style warning reported by checkpatch.pl

Signed-off-by: Peter Downs 
Signed-off-by: David S. Miller 

Upstream: f1304f7ba398 ("openvswitch: actions: fixed a brace coding style 
warning")
Signed-off-by: Joe Stringer 
Signed-off-by: Andy Zhou 

---
v1->v2: no change
---
 AUTHORS.rst| 1 +
 datapath/actions.c | 3 +--
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/AUTHORS.rst b/AUTHORS.rst
index 39bbd2c92613..bbe5b363ed9b 100644
--- a/AUTHORS.rst
+++ b/AUTHORS.rst
@@ -235,6 +235,7 @@ Paul Fazzonepfazz...@nicira.com
 Paul Ingram p...@nicira.com
 Paul-Emmanuel Raoul sky...@skyplabs.net
 Pavithra Ramesh param...@vmware.com
+Peter Downs pado...@gmail.com
 Philippe Jung   phil.j...@free.fr
 Pim van den Bergp...@nethuis.nl
 pritesh pritesh.koth...@cisco.com
diff --git a/datapath/actions.c b/datapath/actions.c
index 06080b24ea5a..b2a0ae537e37 100644
--- a/datapath/actions.c
+++ b/datapath/actions.c
@@ -775,9 +775,8 @@ static void ovs_fragment(struct net *net, struct vport 
*vport,
unsigned long orig_dst;
struct rt6_info ovs_rt;
 
-   if (!v6ops) {
+   if (!v6ops)
goto err;
-   }
 
prepare_frag(vport, skb,
 ovs_key_mac_proto(key));
-- 
1.8.3.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [datapath backport v2 02/10] datapath: Pack struct sw_flow_key.

2017-04-14 Thread Andy Zhou
From: Jarno Rajahalme 

Upstream commit:
openvswitch: Pack struct sw_flow_key.

struct sw_flow_key has two 16-bit holes. Move the most matched
conntrack match fields there.  In some typical cases this reduces the
size of the key that needs to be hashed into half and into one cache
line.

Signed-off-by: Jarno Rajahalme 
Acked-by: Joe Stringer 
Acked-by: Pravin B Shelar 
Signed-off-by: David S. Miller 

Upstream: 316d4d78cf9b ("openvswitch: Pack struct sw_flow_key.")
Signed-off-by: Joe Stringer 
Signed-off-by: Andy Zhou 

---
v1->v2: no change.
---
 datapath/conntrack.c| 40 
 datapath/conntrack.h|  8 
 datapath/flow.h | 14 --
 datapath/flow_netlink.c | 11 +++
 4 files changed, 39 insertions(+), 34 deletions(-)

diff --git a/datapath/conntrack.c b/datapath/conntrack.c
index 47f1a9938114..99a0201cc189 100644
--- a/datapath/conntrack.c
+++ b/datapath/conntrack.c
@@ -159,7 +159,7 @@ static void __ovs_ct_update_key_orig_tp(struct sw_flow_key 
*key,
const struct nf_conntrack_tuple *orig,
u8 icmp_proto)
 {
-   key->ct.orig_proto = orig->dst.protonum;
+   key->ct_orig_proto = orig->dst.protonum;
if (orig->dst.protonum == icmp_proto) {
key->ct.orig_tp.src = htons(orig->dst.u.icmp.type);
key->ct.orig_tp.dst = htons(orig->dst.u.icmp.code);
@@ -173,8 +173,8 @@ static void __ovs_ct_update_key(struct sw_flow_key *key, u8 
state,
const struct nf_conntrack_zone *zone,
const struct nf_conn *ct)
 {
-   key->ct.state = state;
-   key->ct.zone = zone->id;
+   key->ct_state = state;
+   key->ct_zone = zone->id;
key->ct.mark = ovs_ct_get_mark(ct);
ovs_ct_get_labels(ct, >ct.labels);
 
@@ -202,10 +202,10 @@ static void __ovs_ct_update_key(struct sw_flow_key *key, 
u8 state,
return;
}
}
-   /* Clear 'ct.orig_proto' to mark the non-existence of conntrack
+   /* Clear 'ct_orig_proto' to mark the non-existence of conntrack
 * original direction key fields.
 */
-   key->ct.orig_proto = 0;
+   key->ct_orig_proto = 0;
 }
 
 /* Update 'key' based on skb->_nfct.  If 'post_ct' is true, then OVS has
@@ -235,7 +235,7 @@ static void ovs_ct_update_key(const struct sk_buff *skb,
if (ct->master)
state |= OVS_CS_F_RELATED;
if (keep_nat_flags) {
-   state |= key->ct.state & OVS_CS_F_NAT_MASK;
+   state |= key->ct_state & OVS_CS_F_NAT_MASK;
} else {
if (ct->status & IPS_SRC_NAT)
state |= OVS_CS_F_SRC_NAT;
@@ -266,11 +266,11 @@ void ovs_ct_fill_key(const struct sk_buff *skb, struct 
sw_flow_key *key)
 int ovs_ct_put_key(const struct sw_flow_key *swkey,
   const struct sw_flow_key *output, struct sk_buff *skb)
 {
-   if (nla_put_u32(skb, OVS_KEY_ATTR_CT_STATE, output->ct.state))
+   if (nla_put_u32(skb, OVS_KEY_ATTR_CT_STATE, output->ct_state))
return -EMSGSIZE;
 
if (IS_ENABLED(CONFIG_NF_CONNTRACK_ZONES) &&
-   nla_put_u16(skb, OVS_KEY_ATTR_CT_ZONE, output->ct.zone))
+   nla_put_u16(skb, OVS_KEY_ATTR_CT_ZONE, output->ct_zone))
return -EMSGSIZE;
 
if (IS_ENABLED(CONFIG_NF_CONNTRACK_MARK) &&
@@ -282,14 +282,14 @@ int ovs_ct_put_key(const struct sw_flow_key *swkey,
>ct.labels))
return -EMSGSIZE;
 
-   if (swkey->ct.orig_proto) {
+   if (swkey->ct_orig_proto) {
if (swkey->eth.type == htons(ETH_P_IP)) {
struct ovs_key_ct_tuple_ipv4 orig = {
output->ipv4.ct_orig.src,
output->ipv4.ct_orig.dst,
output->ct.orig_tp.src,
output->ct.orig_tp.dst,
-   output->ct.orig_proto,
+   output->ct_orig_proto,
};
if (nla_put(skb, OVS_KEY_ATTR_CT_ORIG_TUPLE_IPV4,
sizeof(orig), ))
@@ -300,7 +300,7 @@ int ovs_ct_put_key(const struct sw_flow_key *swkey,
IN6_ADDR_INITIALIZER(output->ipv6.ct_orig.dst),
output->ct.orig_tp.src,
output->ct.orig_tp.dst,
-   output->ct.orig_proto,
+   output->ct_orig_proto,
};
if (nla_put(skb, OVS_KEY_ATTR_CT_ORIG_TUPLE_IPV6,
 

[ovs-dev] [datapath backport v2 01/10] datapath: Always define NF_CT_LABELS_MAX_SIZE

2017-04-14 Thread Andy Zhou
When CONFIG_NF_CONNTRACK_LABLES is not set, upstream code still make
use of NF_CT_LABLES_MAX_SIZE. Always define it in the compat code
to keep back ports close to the upstream.

Signed-off-by: Andy Zhou 

---
v1->v2:  Drop the v1, change to always export NF_CT_LABELS_MAX_SIZE
---
 datapath/linux/compat/include/net/netfilter/nf_conntrack_labels.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/datapath/linux/compat/include/net/netfilter/nf_conntrack_labels.h 
b/datapath/linux/compat/include/net/netfilter/nf_conntrack_labels.h
index c138de7bab7a..5af5a9a8db7c 100644
--- a/datapath/linux/compat/include/net/netfilter/nf_conntrack_labels.h
+++ b/datapath/linux/compat/include/net/netfilter/nf_conntrack_labels.h
@@ -5,13 +5,13 @@
 #include 
 #include_next 
 
-#ifndef HAVE_NF_CONNLABELS_GET_TAKES_BIT
-#if IS_ENABLED(CONFIG_NF_CONNTRACK_LABELS)
-
 #ifndef NF_CT_LABELS_MAX_SIZE
 #define NF_CT_LABELS_MAX_SIZE ((XT_CONNLABEL_MAXBIT + 1) / BITS_PER_BYTE)
 #endif
 
+#ifndef HAVE_NF_CONNLABELS_GET_TAKES_BIT
+#if IS_ENABLED(CONFIG_NF_CONNTRACK_LABELS)
+
 /* XXX: This doesn't lock others out from doing the same configuration
  * simultaneously. */
 static inline int rpl_nf_connlabels_get(struct net *net, unsigned int bits)
-- 
1.8.3.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] doc: Add back target for datapath testing section.

2017-04-14 Thread Andy Zhou
On Fri, Apr 14, 2017 at 3:57 PM, Ben Pfaff  wrote:
> This one really is used.  I'm honestly not sure how I missed this, because
> I did test things.
>
> Fixes: 9ef589e4daf7 ("doc: Remove some link targets that aren't used 
> anywhere.")
> Reported-by: Andy Zhou 
> Signed-off-by: Ben Pfaff 

Thanks for the quick turnaround.

Acked-by: Andy Zhou 
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH] doc: Add back target for datapath testing section.

2017-04-14 Thread Ben Pfaff
This one really is used.  I'm honestly not sure how I missed this, because
I did test things.

Fixes: 9ef589e4daf7 ("doc: Remove some link targets that aren't used anywhere.")
Reported-by: Andy Zhou 
Signed-off-by: Ben Pfaff 
---
 Documentation/topics/testing.rst | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/Documentation/topics/testing.rst b/Documentation/topics/testing.rst
index 498aea4d0ae1..06f99aded3d8 100644
--- a/Documentation/topics/testing.rst
+++ b/Documentation/topics/testing.rst
@@ -205,6 +205,8 @@ omit ``RYUDIR``
   of Open vSwitch and Ryu in your bug report, plus any other information
   needed to reproduce the problem.
 
+.. _datapath-testing:
+
 Datapath testing
 
 
-- 
2.10.2

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v3 2/2] ovn-northd ipam: Support IPv6 dynamic assignment

2017-04-14 Thread Ben Pfaff
On Fri, Mar 10, 2017 at 07:47:20AM +0530, nusid...@redhat.com wrote:
> From: Numan Siddique 
> 
> OVN will generate the IPv6 address for a logical port if requested
> using the IPv6 prefix and the MAC address (as IEEE EUI64 identifier).
> To generate the IPv6 address, CMS should define the IPv6 prefix in the
> 'Logical_switch.other_config:ipv6_prefix' column.
> 
> Signed-off-by: Numan Siddique 
> Co-authored-by: Ben Pfaff 

Thanks a lot for implementing this!  It's nice to have feature parity
for IPv4 and IPv6 to the greatest extent we can.

I'm going to apply this soon, with the following changes folded in.

diff --git a/NEWS b/NEWS
index 8607b04b6ed5..eb825ac72161 100644
--- a/NEWS
+++ b/NEWS
@@ -14,6 +14,7 @@ Post-v2.7.0
  "dot1q-tunnel" port VLAN mode.
- OVN:
  * IPAM for IPv4 can now exclude user-defined addresses from assignment.
+ * IPAM can now assign IPv6 addresses.
  * Make the DHCPv4 router setting optional.
  * Gratuitous ARP for NAT addresses on a distributed logical router.
  * Allow ovn-controller SSL configuration to be obtained from vswitchd
diff --git a/ovn/ovn-nb.xml b/ovn/ovn-nb.xml
index cd1a02ab2e8b..2b416cede42e 100644
--- a/ovn/ovn-nb.xml
+++ b/ovn/ovn-nb.xml
@@ -137,12 +137,19 @@
 
   
 These options control automatic IP address management (IPAM) for ports
-attached to the logical switch.  To enable IPAM, set  and optionally .  Then, to request dynamic address
-assignment for a particular port, use the dynamic keyword
-in the  column of
-the port's  row.
+column="other_config:exclude_ips"/>.  To enable IPAM for IPv6, set
+.  IPv4 and IPv6 may
+be enabled together or separately.
+  
+
+  
+To request dynamic address assignment for a particular port, use the
+dynamic keyword in the  column of the port's  row.  This requests both an IPv4 and an
+IPv6 address, if IPAM for IPv4 and IPv6 are both enabled.
   
 
   
@@ -175,14 +182,11 @@
 
   
 Set this to an IPv6 prefix to enable ovn-northd to
-automatically assign IPv6 addresses using this prefix. Use the
-dynamic keyword in the 
-table's  column to
-request dynamic address assignment for a particular port. The assigned
-IPv6 address will be generated using the IPv6 prefix and the
-MAC address (as IEEE EUI64 identifier) of the port.
-The IPv6 prefix defined here should be a valid IPv6 address ending with
-"::".
+automatically assign IPv6 addresses using this prefix.  The assigned
+IPv6 address will be generated using the IPv6 prefix and the MAC
+address (converted to an IEEE EUI64 identifier) of the port.  The IPv6
+prefix defined here should be a valid IPv6 address ending with
+::.
 
   Examples:
 
-- 
2.10.2

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v3 1/2] ovn-northd ipam: Support 'exclude_ips' option

2017-04-14 Thread Ben Pfaff
On Fri, Mar 10, 2017 at 07:46:58AM +0530, nusid...@redhat.com wrote:
> From: Numan Siddique 
> 
> If the CMS wants to make use of ovn ipam it can now provide a
> list of IPv4 addresses and a range of IPv4 addresses which
> will be excluded from the dynamic address assignment.
> To support this, a new option 'exclude_ips' is added in the
> Logical_switch.other_config column.
> 
> Eg. ovn-nbctl set Logical_switch sw0
> other_config:exclude_ips="10.0.0.2 10.0.0.30..10.0.0.40"
> 
> The present code, uses hash maps to store the assigned IP addresses.
> In order to support this option, this patch has refactored the IPAM
> assignment. It now uses a bitmap to manage the IP assignment with
> each bit in the bitmap representing an IPv4 address.
> 
> This patch also clears the 'Logical_switch_port.dynamic_addresses'
> if the CMS has cleared 'dynamic' address assignment request.
> 
> Signed-off-by: Numan Siddique 

Thanks a lot for working on this!  I apologize that it took a long time
to review it.

I'm going to apply this to master in a few minutes.  I'm folding in the
following.

diff --git a/NEWS b/NEWS
index 05af97a1f030..8607b04b6ed5 100644
--- a/NEWS
+++ b/NEWS
@@ -13,6 +13,7 @@ Post-v2.7.0
- New support for multiple VLANs (802.1ad or "QinQ"), including a new
  "dot1q-tunnel" port VLAN mode.
- OVN:
+ * IPAM for IPv4 can now exclude user-defined addresses from assignment.
  * Make the DHCPv4 router setting optional.
  * Gratuitous ARP for NAT addresses on a distributed logical router.
  * Allow ovn-controller SSL configuration to be obtained from vswitchd
diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c
index 390a8bbad641..878fb7295e33 100644
--- a/ovn/northd/ovn-northd.c
+++ b/ovn/northd/ovn-northd.c
@@ -544,40 +544,40 @@ init_ipam_info_for_datapath(struct ovn_datapath *od)
 /* exclude_ip_list could be in the format -
 *  "10.0.0.4 10.0.0.10 10.0.0.20..10.0.0.50 10.0.0.100..10.0.0.110".
 */
-while(lexer_get() || lexer.error) {
+lexer_get();
+while (lexer.token.type != LEX_T_END) {
 if (lexer.token.type != LEX_T_INTEGER) {
+lexer_syntax_error(, "expecting address");
 break;
 }
-uint32_t start_ipv4 = 0;
-uint32_t end_ipv4 = 0;
+uint32_t start = ntohl(lexer.token.value.ipv4);
+lexer_get();
 
-start_ipv4 = ntohl(lexer.token.value.ipv4);
-if(lexer_lookahead() == LEX_T_ELLIPSIS) {
-lexer_get();
-lexer_get();
+uint32_t end = start + 1;
+if (lexer_match(, LEX_T_ELLIPSIS)) {
 if (lexer.token.type != LEX_T_INTEGER) {
+lexer_syntax_error(, "expecting address range");
 break;
 }
-end_ipv4 = ntohl(lexer.token.value.ipv4);
-}
-
-/* Validate start_ipv4. */
-if ((end_ipv4 && start_ipv4 > end_ipv4) ||
- start_ipv4 < od->ipam_info->start_ipv4 ||
- start_ipv4 > (od->ipam_info->start_ipv4 +
-   od->ipam_info->total_ipv4s)) {
-/* Invalid format. Continue so we can parse further.*/
-continue;
+end = ntohl(lexer.token.value.ipv4) + 1;
+lexer_get();
 }
 
-if (end_ipv4 > (
-od->ipam_info->start_ipv4 + od->ipam_info->total_ipv4s)) {
-continue;
+/* Clamp start...end to fit the subnet. */
+start = MAX(od->ipam_info->start_ipv4, start);
+end = MIN(od->ipam_info->start_ipv4 + od->ipam_info->total_ipv4s, end);
+if (end > start) {
+bitmap_set_multiple(od->ipam_info->allocated_ipv4s,
+start - od->ipam_info->start_ipv4,
+end - start, 1);
+} else {
+lexer_error(, "excluded addresses not in subnet");
 }
-
-size_t count = end_ipv4 ? (end_ipv4 - start_ipv4) + 1 : 1;
-bitmap_set_multiple(od->ipam_info->allocated_ipv4s,
-start_ipv4 - od->ipam_info->start_ipv4, count, 1);
+}
+if (lexer.error) {
+static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 1);
+VLOG_WARN_RL(, "logical switch "UUID_FMT": bad exclude_ips (%s)",
+ UUID_ARGS(>key), lexer.error);
 }
 lexer_destroy();
 }
@@ -894,7 +894,8 @@ ipam_insert_ip(struct ovn_datapath *od, uint32_t ip)
 
 if (ip >= od->ipam_info->start_ipv4 &&
 ip < (od->ipam_info->start_ipv4 + od->ipam_info->total_ipv4s)) {
-bitmap_set1(od->ipam_info->allocated_ipv4s, ip - 
od->ipam_info->start_ipv4);
+bitmap_set1(od->ipam_info->allocated_ipv4s,
+ip - od->ipam_info->start_ipv4);
 }
 }
 
diff --git a/ovn/ovn-nb.xml b/ovn/ovn-nb.xml
index 75a249a9eb2b..68a00392760a 100644
--- a/ovn/ovn-nb.xml
+++ b/ovn/ovn-nb.xml
@@ -134,26 +134,35 @@
   QOS marking rules that apply to packets 

Re: [ovs-dev] [PATCH v4] netdev-dpdk: leverage the mempool offload support

2017-04-14 Thread Darrell Ball
Acked-by: Darrell Ball 

On 4/12/17, 10:31 PM, "Hemant Agrawal"  wrote:

DPDK 16.07 introduced the support for mempool offload support.
rte_pktmbuf_pool_create is the recommended method for creating pktmbuf
pools. Buffer pools created with rte_mempool_create may not get offloaded
to the underlying offloaded mempools.

This patch, changes the rte_mempool_create to use helper wrapper
"rte_pktmbuf_pool_create" provided by dpdk, so that it can leverage
offloaded mempools.

Signed-off-by: Hemant Agrawal 
Acked-by: Jianbo Liu 
Acked-by: Kevin Traynor 
---
v4:
  * fix the comment as suggested
v3:
  * adding OVS_UNUSED for mp parameter
v2:
  * removing rte_pktmbuf_init as per review comment

 lib/netdev-dpdk.c | 26 +-
 1 file changed, 13 insertions(+), 13 deletions(-)

diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index ddc651b..9fa60fd 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -451,22 +451,19 @@ free_dpdk_buf(struct dp_packet *p)
 }
 
 static void
-ovs_rte_pktmbuf_init(struct rte_mempool *mp,
+ovs_rte_pktmbuf_init(struct rte_mempool *mp OVS_UNUSED,
  void *opaque_arg OVS_UNUSED,
  void *_p,
  unsigned i OVS_UNUSED)
 {
 struct rte_mbuf *pkt = _p;
 
-rte_pktmbuf_init(mp, opaque_arg, _p, i);
-
 dp_packet_init_dpdk((struct dp_packet *) pkt, pkt->buf_len);
 }
 
 static struct dpdk_mp *
 dpdk_mp_create(int socket_id, int mtu)
 {
-struct rte_pktmbuf_pool_private mbp_priv;
 struct dpdk_mp *dmp;
 unsigned mp_size;
 char *mp_name;
@@ -478,9 +475,6 @@ dpdk_mp_create(int socket_id, int mtu)
 dmp->socket_id = socket_id;
 dmp->mtu = mtu;
 dmp->refcount = 1;
-mbp_priv.mbuf_data_room_size = MBUF_SIZE(mtu) - sizeof(struct 
dp_packet);
-mbp_priv.mbuf_priv_size = sizeof(struct dp_packet)
-  - sizeof(struct rte_mbuf);
 /* XXX: this is a really rough method of provisioning memory.
  * It's impossible to determine what the exact memory requirements are
  * when the number of ports and rxqs that utilize a particular mempool 
can
@@ -496,18 +490,24 @@ dpdk_mp_create(int socket_id, int mtu)
 mp_name = xasprintf("ovs_mp_%d_%d_%u", dmp->mtu, dmp->socket_id,
 mp_size);
 
-dmp->mp = rte_mempool_create(mp_name, mp_size, MBUF_SIZE(mtu),
- MP_CACHE_SZ,
- sizeof(struct 
rte_pktmbuf_pool_private),
- rte_pktmbuf_pool_init, _priv,
- ovs_rte_pktmbuf_init, NULL,
- socket_id, 0);
+dmp->mp = rte_pktmbuf_pool_create(mp_name, mp_size,
+  MP_CACHE_SZ,
+  sizeof (struct dp_packet)
+ - sizeof (struct 
rte_mbuf),
+  MBUF_SIZE(mtu)
+ - sizeof(struct 
dp_packet),
+  socket_id);
 if (dmp->mp) {
 VLOG_DBG("Allocated \"%s\" mempool with %u mbufs",
  mp_name, mp_size);
 }
 free(mp_name);
 if (dmp->mp) {
+/* rte_pktmbuf_pool_create has done some initialization of the
+ * rte_mbuf part of each dp_packet, while ovs_rte_pktmbuf_init
+ * initializes some OVS specific fields of dp_packet.
+ */
+rte_mempool_obj_iter(dmp->mp, ovs_rte_pktmbuf_init, NULL);
 return dmp;
 }
 } while (rte_errno == ENOMEM && (mp_size /= 2) >= MIN_NB_MBUF);
-- 
1.9.1



___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] netdev-dpdk: Enable INDIRECT_DESC on DPDK vHostUser.

2017-04-14 Thread Darrell Ball
Acked-by: Darrell Ball 

On 4/10/17, 4:21 AM, "Kevin Traynor"  wrote:

On 04/10/2017 08:03 AM, Darrell Ball wrote:
> 
> 
> On 4/5/17, 7:52 AM, "ovs-dev-boun...@openvswitch.org on behalf of O 
Mahony, Billy"  wrote:
> 
> > -Original Message-
> > From: Kevin Traynor [mailto:ktray...@redhat.com]
> > Sent: Wednesday, April 5, 2017 2:58 PM
> > To: O Mahony, Billy ; Maxime Coquelin
> > ; d...@openvswitch.org
> > Subject: Re: [ovs-dev] [PATCH] netdev-dpdk: Enable INDIRECT_DESC on
> > DPDK vHostUser.
> > 
> > On 03/20/2017 11:18 AM, O Mahony, Billy wrote:
> > > Hi Maxime,
> > >
> > >> -Original Message-
> > >> From: Maxime Coquelin [mailto:maxime.coque...@redhat.com]
> > >> Sent: Friday, March 17, 2017 9:48 AM
> > >> To: O Mahony, Billy ; 
d...@openvswitch.org
> > >> Subject: Re: [ovs-dev] [PATCH] netdev-dpdk: Enable INDIRECT_DESC 
on
> > >> DPDK vHostUser.
> > >>
> > >> Hi Billy,
> > >>
> > >> On 03/01/2017 01:36 PM, Billy O'Mahony wrote:
> > >>> Hi All,
> > >>>
> > >>> I'm creating this patch on the basis of performance results 
outlined
> > >>> below. In summary it appears that enabling INDIRECT_DESC on DPDK
> > >>> vHostUser ports leads to very large increase in performance when
> > >>> using linux stack applications in the guest with no noticable
> > >>> performance drop for DPDK based applications in the guest.
> > >>>
> > >>> Test#1 (VM-VM iperf3 performance)
> > >>>  VMs use DPDK vhostuser ports
> > >>>  OVS bridge is configured for normal action.
> > >>>  OVS version 603381a (on 2.7.0 branch but before release,
> > >>>  also seen on v2.6.0 and v2.6.1)  DPDK v16.11  QEMU v2.5.0 
(also
> > >>> seen with v2.7.1)
> > >>>
> > >>>  Results:
> > >>>   INDIRECT_DESC enabled5.30 Gbit/s
> > >>>   INDIRECT_DESC disabled   0.05 Gbit/s
> > >> This is indeed a big gain.
> > >> However, isn't there a problem when indirect descriptors are 
disabled?
> > >> 0.05 Gbits/s is very low, no?
> > >
> > > [[BO'M]] Yes the disabling of the indirect descriptors feature 
appears to be
> > what causes the very low test result. And the root cause may 
actually be
> > related to this bug
> > https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1668829 .
> > However, turning on the indirect descriptors certainly helps 
greatly.
> > >
> > >>
> > >> Could you share the iperf3 command line you used?
> > >
> > >  [[BO'M]] In the server VM "iperf3 -s" and in the client "iperf3 
-c  > addr> -t 30". -t 30 is the duration (secs) of the test. OVS-DPDK 
bridge was set
> > to use normal action.
> > >
> > Hi Billy,
> > 
> > I ran the iperf test on master and I see very different results 
than you got?
> > 
> > mrg on/indirect off: 7.10 Gbps
> > mrg off/indirect off: 5.05 Gbps
> > mrg off/indirect on: 7.15 Gbps
> 
> [[BO'M]] 
> Hi Kevin,
> 
> By those figures the performance is still +40% in the right direction 
for using indirect descriptors.
> 
> What version of qemu did you use? (if as per the Launchpad bug qemu 
is a root cause here). Also in that case kernel versions may be significant.
> 
> I was using qemu 2.5 (tagged release, built locally) and Ubuntu 
16.04.01 with 4.04 kernel in the guest. 
> 
> I can retry the tests with head of master when I get a chance but the 
patch is still offering a large improvement.
> 
> Cheers,
> Billy.
> 
> I tried it as well, on one server…
> Test#1 (VM-VM iperf3 performance):
> DPDK v16.11  QEMU v2.5.0, OVS 2.7.0 branch also before release/similar 
relevant content.
> mrg off
> Indirect off: 0.164 Gbps
> Indirect on: 2.01Gbps
> 
> Are there any significant reasons not to merge this patch ?
>  

No, I think it's ok to merge. I did some additional testing with % loss
and DPDK in the guest and did not see any significant difference.

Acked-by: Kevin Traynor 

> > 
> > Kevin.
> > 
> > >>
> > >>> Test#2  (Phy-VM-Phy RFC2544 Throughput)  DPDK PMDs are polling 
NIC,
> > >>> DPDK loopback app running in guest.
> > >>>  OVS bridge is configured with port forwarding to VM (via
> > >>> dpdkvhostuser
> > >> ports).
> > >>>  

Re: [ovs-dev] [PATCH] ofproto/bond: Make bond_may_recirc() private within bond.c

2017-04-14 Thread Ben Pfaff
On Thu, Mar 09, 2017 at 04:52:27PM -0800, Andy Zhou wrote:
> Minor refactoring to make the bond code easier to read.
> 
> Signed-off-by: Andy Zhou 

Acked-by: Ben Pfaff 
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH] doc: Apply flake8 to conf.py also.

2017-04-14 Thread Ben Pfaff
Suggested-by: Stephen Finucane 
Signed-off-by: Ben Pfaff 
---
 Documentation/automake.mk | 2 +-
 Documentation/conf.py | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/Documentation/automake.mk b/Documentation/automake.mk
index 9911668c1ca9..8d18bd9714a4 100644
--- a/Documentation/automake.mk
+++ b/Documentation/automake.mk
@@ -91,7 +91,7 @@ DOC_SOURCE = \
Documentation/internals/contributing/libopenvswitch-abi.rst \
Documentation/internals/contributing/submitting-patches.rst \
Documentation/requirements.txt
-
+FLAKE8_PYFILES += Documentation/conf.py
 EXTRA_DIST += $(DOC_SOURCE)
 
 # You can set these variables from the command line.
diff --git a/Documentation/conf.py b/Documentation/conf.py
index e32436be5357..49514ec96571 100644
--- a/Documentation/conf.py
+++ b/Documentation/conf.py
@@ -175,8 +175,8 @@ else:
 html_logo = '_static/logo.png'
 
 # The name of an image file (relative to this directory) to use as a favicon of
-# the docs.  This file should be a Windows icon file (.ico) being 16x16 or 
32x32
-# pixels large.
+# the docs.  This file should be a Windows icon file (.ico) being 16x16 or
+# 32x32 pixels large.
 #
 # html_favicon = None
 
-- 
2.10.2

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] ofproto: Add support of OFPR_PACKET_OUT as packet-in reason

2017-04-14 Thread Ben Pfaff
On Fri, Apr 14, 2017 at 10:20:10AM -0700, Yi-Hung Wei wrote:
> This patch adds support of OFPR_PACKET_OUT as the packet-in reason.
> This packet-in reason is a required feature for OF1.4+, and it indicates
> that the associated packet-in message to the controller is triggered when
> the switch is processing a packet-out message. This reason code is enabled
> by default when OF1.4+ is used.
> 
> Signed-off-by: Yi-Hung Wei 

Thank you for working on this!  I would very much like to say that OVS
supports everything that OpenFlow 1.4 requires (so that we can enable it
by default), so it's nice to see some steps in that direction.

In openflow.rst, you mention that "All required features are now
supported" for packet-in reasons.  Are there optional packet-in reasons
that are still not supported?  If all packet-in reasons are now
supported, then I would prefer to delete the whole item, because these
items are supposed to list areas where there is still potential work to
do.

I don't think that the change to connmgr_send_async_msg() makes sense.
The 'reason' code passed to ofconn_receives_async_msg() is supposed to
be one of the bit indexes into the members of struct ofputil_async_cfg.
Those members are not well documented, but if you look around, for
example at ofputil_async_cfg_default(), you can see that they are
supposed to have OpenFlow version independent meanings.  Can you explain
this change?  By doing a version-dependent mapping before calling
ofconn_receives_async_msg(), we will lose some nuances; for example, we
will lose the difference between explicit and implicit misses.

In xlate_actions(), I believe that you can write:
.in_packet_out = xin->in_packet_out ? true : false,
as
.in_packet_out = xin->in_packet_out,
since both are booleans.

Thanks,

Ben.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [RFC 2/4] doc: Convert ovs-vlan-test to rST

2017-04-14 Thread Ben Pfaff
On Fri, Apr 14, 2017 at 09:00:49PM +0100, Stephen Finucane wrote:
> On Fri, 2017-04-14 at 12:49 -0700, Ben Pfaff wrote:
> > On Fri, Apr 14, 2017 at 06:55:38PM +0100, Stephen Finucane wrote:
> > > On Thu, 2017-04-13 at 21:27 -0700, Ben Pfaff wrote:
> > > > Thank you!  rST is much more readable than nroff.  I have some
> > > > comments
> > > > below.
> > > > 
> > > > On Mon, Apr 10, 2017 at 01:12:28PM +0100, Stephen Finucane wrote:
> > > > > Let's start with a simple one that lets us focus on setting up
> > > > > most
> > > > > of the required "infrastructure" for building man pages using
> > > > > Sphinx.
> > > > > 
> > > > > There are a couple of things worth noting here:
> > > > > 
> > > > > - The 'check-htmldocs' target becomes 'check-docs' as its now
> > > > >   responsible for building man page docs too.
> > > > > 
> > > > > - The outputted file will always have a '.1' suffix. This is 
> > > > > Sphinx's
> > > > >   decision, and likely stems from the man page guidelines [1]
> > > > > which
> > > > >   state that "the name of the man page's source file...is the
> > > > > name
> > > > > of
> > > > >   the command, function or file name, followed by a dot,
> > > > > followed
> > > > > by the
> > > > >   section character".
> > > > 
> > > > It looks to me like the last element of the tuples inside
> > > > man_pages
> > > > in conf.py controls the section.  When I changed 1 to 8 there, it
> > > > switched the manpage to section 8.  So I made that change in
> > > > conf.py
> > > > and I removed the above paragraph from the commit message.
> > > > 
> > > > > Other than that, hurrah for (mostly) legible syntaxes.
> > > > > 
> > > > > [1] http://www.tldp.org/HOWTO/Man-Page/q2.html
> > > > > 
> > > > > Signed-off-by: Stephen Finucane 
> > > > > ---
> > > > > I don't know if this is correctly integrated into the docs
> > > > > build
> > > > > system or not. I need someone to double check this for me. In
> > > > > particular, I think I need to integrate the 'dh_sphinxdoc'
> > > > > package
> > > > > [2] into the Debian build but I've no idea how to.
> > > > > 
> > > > > [2] http://manpages.ubuntu.com/manpages/zesty/man1/dh_sphinxdoc
> > > > > .1.h
> > > > > tml
> > > > 
> > > > I spent a couple of hours working on the build and install system
> > > > here.
> > > > 
> > > > My first thought was to add rules to allow Automake to find and
> > > > install the generated manpages.  This turned out to be a huge
> > > > mess
> > > > that required tons of nasty GNU Make specific crap in the
> > > > makefiles,
> > > > and I didn't like it.
> > > > 
> > > > My second approach was better.  I gave up on integrating with the
> > > > builtin Automake rules for manpages.  All those really do anyway
> > > > is
> > > > handle install and uninstall, so I wrote some Make rules to do
> > > > that.
> > > > They're ugly because they're make+shell, but readable enough if
> > > > you
> > > > squint.
> > > > 
> > > > The main question for install and uninstall is how to choose the
> > > > right section.  The easiest way seems to be to give the .rst
> > > > files
> > > > names that embed the section, like "ovs-vlan-test.8.rst".  This
> > > > is
> > > > also handy to distinguish manpages with the same name but in
> > > > different sections, which is sometimes a reasonable thing to
> > > > have, so
> > > > that's what I did.
> > > 
> > > I still need to have a look into the patch (anything to do with
> > > autotools is generally worth avoiding, heh), but I had a crazy idea
> > > over the weekend: couldn't we just include the compiled man pages
> > > in
> > > the source? The only issue with this would be the '.TH' line, but
> > > there
> > > are solutions for avoiding this [1][2]. If we have man pages
> > > included
> > > in the source, would that be adequate to use the standard autotools
> > > setup for manpages?
> > 
> > I might not understand this idea yet.  Do you mean, to include the
> > nroff output of sphinx in the Git repository?  I don't know why this
> > would help.
> 
> Correct.
> 
> > I'm not sure there are any serious disadvantages to writing Make
> > rules to install and uninstall manpages.
> 
> The only significant advantage would be simplicity for use in other
> build systems. Would the rules you've proposed work for the Debian and
> RHEL packaging, for example?

I understand now.

I think that this change should be invisible to the Debian and RHEL
packaging, because both kinds of packaging use "make install
DESTDIR=..." to install the manpages into a staging tree, then package
them from that staging tree.  With this change, "make install" behaves
the same way as before: both before and after the change, "make install"
puts nroff versions of the manpages into $(DESTDIR)/usr/share/man/man#/
(or wherever they are configured to go).

(I have not actually tested either kind of packaging.)
___
dev mailing list
d...@openvswitch.org

Re: [ovs-dev] [RFC 2/4] doc: Convert ovs-vlan-test to rST

2017-04-14 Thread Stephen Finucane
On Fri, 2017-04-14 at 12:49 -0700, Ben Pfaff wrote:
> On Fri, Apr 14, 2017 at 06:55:38PM +0100, Stephen Finucane wrote:
> > On Thu, 2017-04-13 at 21:27 -0700, Ben Pfaff wrote:
> > > Thank you!  rST is much more readable than nroff.  I have some
> > > comments
> > > below.
> > > 
> > > On Mon, Apr 10, 2017 at 01:12:28PM +0100, Stephen Finucane wrote:
> > > > Let's start with a simple one that lets us focus on setting up
> > > > most
> > > > of the required "infrastructure" for building man pages using
> > > > Sphinx.
> > > > 
> > > > There are a couple of things worth noting here:
> > > > 
> > > > - The 'check-htmldocs' target becomes 'check-docs' as its now
> > > >   responsible for building man page docs too.
> > > > 
> > > > - The outputted file will always have a '.1' suffix. This is 
> > > > Sphinx's
> > > >   decision, and likely stems from the man page guidelines [1]
> > > > which
> > > >   state that "the name of the man page's source file...is the
> > > > name
> > > > of
> > > >   the command, function or file name, followed by a dot,
> > > > followed
> > > > by the
> > > >   section character".
> > > 
> > > It looks to me like the last element of the tuples inside
> > > man_pages
> > > in conf.py controls the section.  When I changed 1 to 8 there, it
> > > switched the manpage to section 8.  So I made that change in
> > > conf.py
> > > and I removed the above paragraph from the commit message.
> > > 
> > > > Other than that, hurrah for (mostly) legible syntaxes.
> > > > 
> > > > [1] http://www.tldp.org/HOWTO/Man-Page/q2.html
> > > > 
> > > > Signed-off-by: Stephen Finucane 
> > > > ---
> > > > I don't know if this is correctly integrated into the docs
> > > > build
> > > > system or not. I need someone to double check this for me. In
> > > > particular, I think I need to integrate the 'dh_sphinxdoc'
> > > > package
> > > > [2] into the Debian build but I've no idea how to.
> > > > 
> > > > [2] http://manpages.ubuntu.com/manpages/zesty/man1/dh_sphinxdoc
> > > > .1.h
> > > > tml
> > > 
> > > I spent a couple of hours working on the build and install system
> > > here.
> > > 
> > > My first thought was to add rules to allow Automake to find and
> > > install the generated manpages.  This turned out to be a huge
> > > mess
> > > that required tons of nasty GNU Make specific crap in the
> > > makefiles,
> > > and I didn't like it.
> > > 
> > > My second approach was better.  I gave up on integrating with the
> > > builtin Automake rules for manpages.  All those really do anyway
> > > is
> > > handle install and uninstall, so I wrote some Make rules to do
> > > that.
> > > They're ugly because they're make+shell, but readable enough if
> > > you
> > > squint.
> > > 
> > > The main question for install and uninstall is how to choose the
> > > right section.  The easiest way seems to be to give the .rst
> > > files
> > > names that embed the section, like "ovs-vlan-test.8.rst".  This
> > > is
> > > also handy to distinguish manpages with the same name but in
> > > different sections, which is sometimes a reasonable thing to
> > > have, so
> > > that's what I did.
> > 
> > I still need to have a look into the patch (anything to do with
> > autotools is generally worth avoiding, heh), but I had a crazy idea
> > over the weekend: couldn't we just include the compiled man pages
> > in
> > the source? The only issue with this would be the '.TH' line, but
> > there
> > are solutions for avoiding this [1][2]. If we have man pages
> > included
> > in the source, would that be adequate to use the standard autotools
> > setup for manpages?
> 
> I might not understand this idea yet.  Do you mean, to include the
> nroff output of sphinx in the Git repository?  I don't know why this
> would help.

Correct.

> I'm not sure there are any serious disadvantages to writing Make
> rules to install and uninstall manpages.

The only significant advantage would be simplicity for use in other
build systems. Would the rules you've proposed work for the Debian and
RHEL packaging, for example?

Stephen
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [action upcall meters 2/5] ofproto-dpif: Add 'meter_ids' to backer

2017-04-14 Thread Andy Zhou
Please ignore this patch.

On Fri, Apr 14, 2017 at 12:46 PM, Andy Zhou  wrote:
> Add 'meter_ids', an id-pool object to manage datapath meter id, i.e.
> provider_meter_id.
>
> Currently, only userspace datapath supports meter, and it implements
> the provider_meter_id management. Moving this function to 'backer'
> allows other datapath implementation to share the same logic.
>
> Signed-off-by: Andy Zhou 
> ---
>  lib/dpif-netdev.c  | 24 
>  ofproto/ofproto-dpif.c | 44 ++--
>  ofproto/ofproto-dpif.h |  4 
>  3 files changed, 46 insertions(+), 26 deletions(-)
>
> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
> index a14a2ebb5b2a..d5417162b7af 100644
> --- a/lib/dpif-netdev.c
> +++ b/lib/dpif-netdev.c
> @@ -260,7 +260,6 @@ struct dp_netdev {
>  /* Meters. */
>  struct ovs_mutex meter_locks[N_METER_LOCKS];
>  struct dp_meter *meters[MAX_METERS]; /* Meter bands. */
> -uint32_t meter_free; /* Next free meter. */
>
>  /* Protects access to ofproto-dpif-upcall interface during revalidator
>   * thread synchronization. */
> @@ -3896,9 +3895,6 @@ dpif_netdev_meter_set(struct dpif *dpif, 
> ofproto_meter_id *meter_id,
>  struct dp_meter *meter;
>  int i;
>
> -if (mid == UINT32_MAX) {
> -mid = dp->meter_free;
> -}
>  if (mid >= MAX_METERS) {
>  return EFBIG; /* Meter_id out of range. */
>  }
> @@ -3958,21 +3954,6 @@ dpif_netdev_meter_set(struct dpif *dpif, 
> ofproto_meter_id *meter_id,
>  dp->meters[mid] = meter;
>  meter_unlock(dp, mid);
>
> -meter_id->uint32 = mid; /* Store on success. */
> -
> -/* Find next free meter */
> -if (dp->meter_free == mid) { /* Now taken. */
> -do {
> -if (++mid >= MAX_METERS) { /* Wrap around */
> -mid = 0;
> -}
> -if (mid == dp->meter_free) { /* Full circle */
> -mid = MAX_METERS;
> -break;
> -}
> -} while (dp->meters[mid]);
> -dp->meter_free = mid; /* Next free meter or MAX_METERS */
> -}
>  return 0;
>  }
>  return ENOMEM;
> @@ -4027,11 +4008,6 @@ dpif_netdev_meter_del(struct dpif *dpif,
>  meter_lock(dp, meter_id);
>  dp_delete_meter(dp, meter_id);
>  meter_unlock(dp, meter_id);
> -
> -/* Keep free meter index as low as possible */
> -if (meter_id < dp->meter_free) {
> -dp->meter_free = meter_id;
> -}
>  }
>  return error;
>  }
> diff --git a/ofproto/ofproto-dpif.c b/ofproto/ofproto-dpif.c
> index 6a5ffb94fa94..a026d4913731 100644
> --- a/ofproto/ofproto-dpif.c
> +++ b/ofproto/ofproto-dpif.c
> @@ -662,6 +662,7 @@ close_dpif_backer(struct dpif_backer *backer)
>  free(backer->type);
>  free(backer->dp_version_string);
>  dpif_close(backer->dpif);
> +id_pool_destroy(backer->meter_ids);
>  free(backer);
>  }
>
> @@ -787,6 +788,15 @@ open_dpif_backer(const char *type, struct dpif_backer 
> **backerp)
>  = check_variable_length_userdata(backer);
>  backer->dp_version_string = dpif_get_dp_version(backer->dpif);
>
> +/* Manage Datpath meter IDs if supported. */
> +struct ofputil_meter_features features;
> +dpif_meter_get_features(backer->dpif, );
> +if (features.max_meters) {
> +backer->meter_ids = id_pool_create(0, features.max_meters);
> +} else {
> +backer->meter_ids = NULL;
> +}
> +
>  return error;
>  }
>
> @@ -5385,6 +5395,17 @@ meter_set(struct ofproto *ofproto_, ofproto_meter_id 
> *meter_id,
>  {
>  struct ofproto_dpif *ofproto = ofproto_dpif_cast(ofproto_);
>
> +/* Provider ID unknown. Use backer to allocate a new DP meter */
> +if (meter_id->uint32 == UINT32_MAX) {
> +if (!ofproto->backer->meter_ids) {
> +return EFBIG; /* Datapath does not support meter.  */
> +}
> +
> +if(!id_pool_alloc_id(ofproto->backer->meter_ids, _id->uint32)) 
> {
> +return ENOMEM; /* Can't allocate a DP meter. */
> +}
> +}
> +
>  switch (dpif_meter_set(ofproto->backer->dpif, meter_id, config)) {
>  case 0:
>  return 0;
> @@ -5414,12 +5435,31 @@ meter_get(const struct ofproto *ofproto_, 
> ofproto_meter_id meter_id,
>  return OFPERR_OFPMMFC_UNKNOWN_METER;
>  }
>
> +struct free_meter_id_args {
> +struct ofproto_dpif *ofproto;
> +ofproto_meter_id meter_id;
> +};
> +
> +static void
> +free_meter_id(struct free_meter_id_args *args)
> +{
> +struct ofproto_dpif *ofproto = args->ofproto;
> +
> +dpif_meter_del(ofproto->backer->dpif, args->meter_id, NULL, 0);
> +id_pool_free_id(ofproto->backer->meter_ids, args->meter_id.uint32);
> +free(args);
> +}
> +
>  static void
>  meter_del(struct ofproto *ofproto_, ofproto_meter_id meter_id)
>  {
> -struct 

Re: [ovs-dev] [PATCH 3/3] doc: Remove some link targets that aren't used anywhere.

2017-04-14 Thread Ben Pfaff
On Fri, Apr 14, 2017 at 07:41:24PM +0100, Stephen Finucane wrote:
> On Fri, 2017-04-14 at 10:31 -0700, Ben Pfaff wrote:
> > It seems that there's a style here that every title has a separate
> > link
> > target even if it's never used.  Maybe there is some reason for that
> > (maybe
> > it should be explained in the documentation style document?), but if
> > not
> > then it seems reasonable to leave out the unused ones since they look
> > to
> > me like just visual clutter.
> > 
> > Signed-off-by: Ben Pfaff 
> 
> Acked-by: Stephen Finucane 

Thank you for the reviews!  I applied these to master.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [RFC 2/4] doc: Convert ovs-vlan-test to rST

2017-04-14 Thread Ben Pfaff
On Fri, Apr 14, 2017 at 06:55:38PM +0100, Stephen Finucane wrote:
> On Thu, 2017-04-13 at 21:27 -0700, Ben Pfaff wrote:
> > Thank you!  rST is much more readable than nroff.  I have some
> > comments
> > below.
> > 
> > On Mon, Apr 10, 2017 at 01:12:28PM +0100, Stephen Finucane wrote:
> > > Let's start with a simple one that lets us focus on setting up most
> > > of the required "infrastructure" for building man pages using
> > > Sphinx.
> > > 
> > > There are a couple of things worth noting here:
> > > 
> > > - The 'check-htmldocs' target becomes 'check-docs' as its now
> > >   responsible for building man page docs too.
> > > 
> > > - The outputted file will always have a '.1' suffix. This is 
> > > Sphinx's
> > >   decision, and likely stems from the man page guidelines [1] which
> > >   state that "the name of the man page's source file...is the name
> > > of
> > >   the command, function or file name, followed by a dot, followed
> > > by the
> > >   section character".
> > 
> > It looks to me like the last element of the tuples inside man_pages
> > in conf.py controls the section.  When I changed 1 to 8 there, it
> > switched the manpage to section 8.  So I made that change in conf.py
> > and I removed the above paragraph from the commit message.
> > 
> > > Other than that, hurrah for (mostly) legible syntaxes.
> > > 
> > > [1] http://www.tldp.org/HOWTO/Man-Page/q2.html
> > > 
> > > Signed-off-by: Stephen Finucane 
> > > ---
> > > I don't know if this is correctly integrated into the docs build
> > > system or not. I need someone to double check this for me. In
> > > particular, I think I need to integrate the 'dh_sphinxdoc' package
> > > [2] into the Debian build but I've no idea how to.
> > > 
> > > [2] http://manpages.ubuntu.com/manpages/zesty/man1/dh_sphinxdoc.1.h
> > > tml
> > 
> > I spent a couple of hours working on the build and install system
> > here.
> > 
> > My first thought was to add rules to allow Automake to find and
> > install the generated manpages.  This turned out to be a huge mess
> > that required tons of nasty GNU Make specific crap in the makefiles,
> > and I didn't like it.
> > 
> > My second approach was better.  I gave up on integrating with the
> > builtin Automake rules for manpages.  All those really do anyway is
> > handle install and uninstall, so I wrote some Make rules to do that.
> > They're ugly because they're make+shell, but readable enough if you
> > squint.
> > 
> > The main question for install and uninstall is how to choose the
> > right section.  The easiest way seems to be to give the .rst files
> > names that embed the section, like "ovs-vlan-test.8.rst".  This is
> > also handy to distinguish manpages with the same name but in
> > different sections, which is sometimes a reasonable thing to have, so
> > that's what I did.
> 
> I still need to have a look into the patch (anything to do with
> autotools is generally worth avoiding, heh), but I had a crazy idea
> over the weekend: couldn't we just include the compiled man pages in
> the source? The only issue with this would be the '.TH' line, but there
> are solutions for avoiding this [1][2]. If we have man pages included
> in the source, would that be adequate to use the standard autotools
> setup for manpages?

I might not understand this idea yet.  Do you mean, to include the
nroff output of sphinx in the Git repository?  I don't know why this
would help.

I'm not sure there are any serious disadvantages to writing Make rules
to install and uninstall manpages.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [action upcall meters 5/5] ofproto: Meter slowpath action when action upcall meters are configured

2017-04-14 Thread Andy Zhou
If a slow path action is a controller action, meter it when the
controller meter is configured.  For other kinds of slow path actions,
meter it when the slowpath meter is configured.

Note, this patch only considers the meters configuration of the
packet's input bridge, which may not be the same bridge that the
action is generated.

Signed-off-by: Andy Zhou 
---
 ofproto/ofproto-dpif-upcall.c | 34 +++---
 ofproto/ofproto-dpif-xlate.c  | 12 ++--
 tests/ofproto-dpif.at | 31 +++
 3 files changed, 68 insertions(+), 9 deletions(-)

diff --git a/ofproto/ofproto-dpif-upcall.c b/ofproto/ofproto-dpif-upcall.c
index 3b28f9a22939..37f345b235b1 100644
--- a/ofproto/ofproto-dpif-upcall.c
+++ b/ofproto/ofproto-dpif-upcall.c
@@ -1018,7 +1018,8 @@ classify_upcall(enum dpif_upcall_type type, const struct 
nlattr *userdata)
 static void
 compose_slow_path(struct udpif *udpif, struct xlate_out *xout,
   const struct flow *flow, odp_port_t odp_in_port,
-  struct ofpbuf *buf)
+  struct ofpbuf *buf, uint32_t slowpath_meter_id,
+  uint32_t controller_meter_id)
 {
 union user_action_cookie cookie;
 odp_port_t port;
@@ -1032,8 +1033,28 @@ compose_slow_path(struct udpif *udpif, struct xlate_out 
*xout,
 ? ODPP_NONE
 : odp_in_port;
 pid = dpif_port_get_pid(udpif->dpif, port, flow_hash_5tuple(flow, 0));
+
+size_t offset;
+size_t ac_offset;
+uint32_t meter_id = xout->slow & SLOW_CONTROLLER ? controller_meter_id
+ : slowpath_meter_id;
+
+if (meter_id != UINT32_MAX) {
+/* If slowpath meter is configured, generate clone(meter, userspace)
+ * action.   */
+offset = nl_msg_start_nested(buf, OVS_ACTION_ATTR_SAMPLE);
+nl_msg_put_u32(buf, OVS_SAMPLE_ATTR_PROBABILITY, UINT32_MAX);
+ac_offset = nl_msg_start_nested(buf, OVS_SAMPLE_ATTR_ACTIONS);
+nl_msg_put_u32(buf, OVS_ACTION_ATTR_METER, meter_id);
+}
+
 odp_put_userspace_action(pid, , sizeof cookie.slow_path,
  ODPP_NONE, false, buf);
+
+if (meter_id != UINT32_MAX) {
+nl_msg_end_nested(buf, ac_offset);
+nl_msg_end_nested(buf, offset);
+}
 }
 
 /* If there is no error, the upcall must be destroyed with upcall_uninit()
@@ -1136,10 +1157,12 @@ upcall_xlate(struct udpif *udpif, struct upcall *upcall,
 ofpbuf_use_const(>put_actions,
  odp_actions->data, odp_actions->size);
 } else {
+uint32_t smid= upcall->ofproto->up.slowpath_meter_id;
+uint32_t cmid = upcall->ofproto->up.controller_meter_id;
 /* upcall->put_actions already initialized by upcall_receive(). */
 compose_slow_path(udpif, >xout, upcall->flow,
   upcall->flow->in_port.odp_port,
-  >put_actions);
+  >put_actions, smid, cmid);
 }
 
 /* This function is also called for slow-pathed flows.  As we are only
@@ -1956,9 +1979,14 @@ revalidate_ukey__(struct udpif *udpif, const struct 
udpif_key *ukey,
 }
 
 if (xoutp->slow) {
+struct ofproto_dpif *ofproto;
+ofproto = xlate_lookup_ofproto(udpif->backer, , NULL);
+uint32_t smid= ofproto->up.slowpath_meter_id;
+uint32_t cmid= ofproto->up.controller_meter_id;
+
 ofpbuf_clear(odp_actions);
 compose_slow_path(udpif, xoutp, , ctx.flow.in_port.odp_port,
-  odp_actions);
+  odp_actions, smid, cmid);
 }
 
 if (odp_flow_key_to_mask(ukey->mask, ukey->mask_len, _mask, )
diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
index 52e0d3f1b0bb..416012ab6930 100644
--- a/ofproto/ofproto-dpif-xlate.c
+++ b/ofproto/ofproto-dpif-xlate.c
@@ -2849,8 +2849,13 @@ compose_sample_action(struct xlate_ctx *ctx,
 return 0;
 }
 
+/* If the slow path meter is configured by the controller,
+ * Insert a meter action before the user space action.   */
+struct ofproto *ofproto = >xin->ofproto->up;
+uint32_t meter_id = ofproto->slowpath_meter_id;
+
 /* No need to generate sample action for 100% sampling rate. */
-bool is_sample = probability < UINT32_MAX;
+bool is_sample = probability < UINT32_MAX || meter_id != UINT32_MAX;
 size_t sample_offset, actions_offset;
 if (is_sample) {
 sample_offset = nl_msg_start_nested(ctx->odp_actions,
@@ -2861,11 +2866,6 @@ compose_sample_action(struct xlate_ctx *ctx,
  OVS_SAMPLE_ATTR_ACTIONS);
 }
 
-/* If the slow path meter is configured by the controller,
- * Insert a meter action before the user space action.   */
-struct ofproto *ofproto = >xin->ofproto->up;
-uint32_t meter_id = ofproto->slowpath_meter_id;
-
 if (meter_id != UINT32_MAX) {
 

[ovs-dev] [action upcall meters 4/5] ofproto: Meter sample action when configured.

2017-04-14 Thread Andy Zhou
When slowpath meter is configured, add meter action when translate
sample action.

Signed-off-by: Andy Zhou 
---
 ofproto/ofproto-dpif-xlate.c |  9 +
 tests/ofproto-dpif.at| 14 ++
 2 files changed, 23 insertions(+)

diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
index a24aef9a43a1..52e0d3f1b0bb 100644
--- a/ofproto/ofproto-dpif-xlate.c
+++ b/ofproto/ofproto-dpif-xlate.c
@@ -2861,6 +2861,15 @@ compose_sample_action(struct xlate_ctx *ctx,
  OVS_SAMPLE_ATTR_ACTIONS);
 }
 
+/* If the slow path meter is configured by the controller,
+ * Insert a meter action before the user space action.   */
+struct ofproto *ofproto = >xin->ofproto->up;
+uint32_t meter_id = ofproto->slowpath_meter_id;
+
+if (meter_id != UINT32_MAX) {
+nl_msg_put_u32(ctx->odp_actions, OVS_ACTION_ATTR_METER, meter_id);
+}
+
 odp_port_t odp_port = ofp_port_to_odp_port(
 ctx->xbridge, ctx->xin->flow.in_port.ofp_port);
 uint32_t pid = dpif_port_get_pid(ctx->xbridge->dpif, odp_port,
diff --git a/tests/ofproto-dpif.at b/tests/ofproto-dpif.at
index 0c2ea384b422..3c3037b16548 100644
--- a/tests/ofproto-dpif.at
+++ b/tests/ofproto-dpif.at
@@ -6491,6 +6491,20 @@ flow-dump from non-dpdk interfaces:
 packets:2, bytes:68, used:0.001s, 
actions:userspace(pid=0,ipfix(output_port=4294967295))
 ])
 
+AT_CHECK([ovs-appctl revalidator/purge])
+dnl
+dnl Add a slowpath meter. The userspace action should be metered.
+AT_CHECK([ovs-ofctl -O OpenFlow13 add-meter br0 'meter=slowpath pktps burst 
stats bands=type=drop rate=3 burst_size=1'])
+
+dnl Send some packets that should be sampled and metered.
+for i in `seq 1 3`; do
+AT_CHECK([ovs-appctl netdev-dummy/receive p1 
'in_port(1),eth(src=50:54:00:00:00:09,dst=50:54:00:00:00:0a),eth_type(0x0800)'])
+done
+AT_CHECK([ovs-appctl dpctl/dump-flows | sed 's/.*\(packets:\)/\1/' | sed 
's/used:[[0-9]].[[0-9]]*s/used:0.001s/'], [0], [dnl
+flow-dump from non-dpdk interfaces:
+packets:2, bytes:68, used:0.001s, 
actions:sample(sample=100.0%,actions(meter(0),userspace(pid=0,ipfix(output_port=4294967295
+])
+
 dnl Remove the IPFIX configuration.
 AT_CHECK([ovs-vsctl clear bridge br0 ipfix])
 AT_CHECK([ovs-appctl revalidator/purge])
-- 
1.8.3.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [action upcall meters 3/5] ofproto: Support action upcall meters

2017-04-14 Thread Andy Zhou
Allow action upcall meters, i.e. slowpath and controller meters,
to be added and displayed.

Keep track of datapath meter ID of those action upcall meters in
ofproto to aid action translation. Later patches will make use of them.

Signed-off-by: Andy Zhou 
---
 lib/ofp-print.c| 33 ++---
 ofproto/ofproto-provider.h |  4 
 ofproto/ofproto.c  | 52 ++
 3 files changed, 82 insertions(+), 7 deletions(-)

diff --git a/lib/ofp-print.c b/lib/ofp-print.c
index a8cdfcbf20b1..140af05950b7 100644
--- a/lib/ofp-print.c
+++ b/lib/ofp-print.c
@@ -1333,11 +1333,36 @@ ofp_print_meter_band(struct ds *s, uint16_t flags,
 }
 
 static void
+ofp_print_meter_id(struct ds *s, uint32_t meter_id, char seperator)
+{
+if (meter_id <= OFPM13_MAX) {
+ds_put_format(s, "meter%c%"PRIu32, seperator, meter_id);
+} else {
+const char *name;
+switch (meter_id) {
+case OFPM13_SLOWPATH:
+name = "slowpath";
+break;
+case OFPM13_CONTROLLER:
+name = "controller";
+break;
+case OFPM13_ALL:
+name = "ALL";
+break;
+default:
+name = "unknown";
+}
+ds_put_format(s, "meter%c%s", seperator, name);
+}
+}
+
+static void
 ofp_print_meter_stats(struct ds *s, const struct ofputil_meter_stats *ms)
 {
 uint16_t i;
 
-ds_put_format(s, "meter:%"PRIu32" ", ms->meter_id);
+ofp_print_meter_id(s, ms->meter_id, ':');
+ds_put_char(s, ' ');
 ds_put_format(s, "flow_count:%"PRIu32" ", ms->flow_count);
 ds_put_format(s, "packet_in_count:%"PRIu64" ", ms->packet_in_count);
 ds_put_format(s, "byte_in_count:%"PRIu64" ", ms->byte_in_count);
@@ -1358,7 +1383,8 @@ ofp_print_meter_config(struct ds *s, const struct 
ofputil_meter_config *mc)
 {
 uint16_t i;
 
-ds_put_format(s, "meter=%"PRIu32" ", mc->meter_id);
+ofp_print_meter_id(s, mc->meter_id, '=');
+ds_put_char(s, ' ');
 
 ofp_print_meter_flags(s, mc->flags);
 
@@ -1412,8 +1438,9 @@ ofp_print_meter_stats_request(struct ds *s, const struct 
ofp_header *oh)
 uint32_t meter_id;
 
 ofputil_decode_meter_request(oh, _id);
+ds_put_char(s, ' ');
 
-ds_put_format(s, " meter=%"PRIu32, meter_id);
+ofp_print_meter_id(s, meter_id, '=');
 }
 
 static const char *
diff --git a/ofproto/ofproto-provider.h b/ofproto/ofproto-provider.h
index 000326d7f79d..688a9e5d32eb 100644
--- a/ofproto/ofproto-provider.h
+++ b/ofproto/ofproto-provider.h
@@ -112,6 +112,10 @@ struct ofproto {
 /* Meter table.  */
 struct ofputil_meter_features meter_features;
 struct hmap meters; /* uint32_t indexed 'struct meter *'.  */
+uint32_t slowpath_meter_id; /* Datapath slowpath meter.  UINT32_MAX
+   if not defined.  */
+uint32_t controller_meter_id;   /* Datapath controller meter. UINT32_MAX
+   if not defined.  */
 
 /* OpenFlow connections. */
 struct connmgr *connmgr;
diff --git a/ofproto/ofproto.c b/ofproto/ofproto.c
index 8c4c7e67f213..abbb849a384b 100644
--- a/ofproto/ofproto.c
+++ b/ofproto/ofproto.c
@@ -568,6 +568,8 @@ ofproto_create(const char *datapath_name, const char 
*datapath_type,
 memset(>meter_features, 0, sizeof ofproto->meter_features);
 }
 hmap_init(>meters);
+ofproto->slowpath_meter_id = UINT32_MAX;
+ofproto->controller_meter_id = UINT32_MAX;
 
 /* Set the initial tables version. */
 ofproto_bump_tables_version(ofproto);
@@ -6232,9 +6234,33 @@ ofproto_get_meter(const struct ofproto *ofproto, 
uint32_t meter_id)
 return NULL;
 }
 
+static uint32_t *
+ofproto_upcall_meter_ptr(struct ofproto *ofproto, uint32_t meter_id)
+{
+switch(meter_id) {
+case OFPM13_SLOWPATH:
+return >slowpath_meter_id;
+break;
+case OFPM13_CONTROLLER:
+return >controller_meter_id;
+break;
+case OFPM13_ALL:
+OVS_NOT_REACHED();
+default:
+return NULL;
+}
+}
+
 static void
 ofproto_add_meter(struct ofproto *ofproto, struct meter *meter)
 {
+uint32_t *upcall_meter_ptr = ofproto_upcall_meter_ptr(ofproto, meter->id);
+
+/* Cache datapath meter IDs of special meters. */
+if (upcall_meter_ptr) {
+*upcall_meter_ptr = meter->provider_meter_id.uint32;
+}
+
 hmap_insert(>meters, >node, hash_int(meter->id, 0));
 }
 
@@ -6248,7 +6274,7 @@ static bool
 ofproto_fix_meter_action(const struct ofproto *ofproto,
  struct ofpact_meter *ma)
 {
-if (ma->meter_id && ma->meter_id <= ofproto->meter_features.max_meters) {
+if (ma->meter_id) {
 const struct meter *meter = ofproto_get_meter(ofproto, ma->meter_id);
 
 if (meter && meter->provider_meter_id.uint32 != UINT32_MAX) {
@@ -6306,6 +6332,12 @@ static void
 meter_destroy(struct ofproto *ofproto, struct meter *meter)
   

[ovs-dev] [action upcall meters 2/5] ofproto-dpif: Use backer to manage datapath meter allocation

2017-04-14 Thread Andy Zhou
Add 'meter_ids', an id-pool object to manage datapath meter id.

Currently, only userspace datapath supports meter, and it implements
the provider_meter_id management. Moving this function to 'backer'
allows other datapath implementation to share the same logic.

Signed-off-by: Andy Zhou 
---
 lib/dpif-netdev.c  | 24 
 ofproto/ofproto-dpif.c | 44 ++--
 ofproto/ofproto-dpif.h |  4 
 3 files changed, 46 insertions(+), 26 deletions(-)

diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index a14a2ebb5b2a..d5417162b7af 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -260,7 +260,6 @@ struct dp_netdev {
 /* Meters. */
 struct ovs_mutex meter_locks[N_METER_LOCKS];
 struct dp_meter *meters[MAX_METERS]; /* Meter bands. */
-uint32_t meter_free; /* Next free meter. */
 
 /* Protects access to ofproto-dpif-upcall interface during revalidator
  * thread synchronization. */
@@ -3896,9 +3895,6 @@ dpif_netdev_meter_set(struct dpif *dpif, ofproto_meter_id 
*meter_id,
 struct dp_meter *meter;
 int i;
 
-if (mid == UINT32_MAX) {
-mid = dp->meter_free;
-}
 if (mid >= MAX_METERS) {
 return EFBIG; /* Meter_id out of range. */
 }
@@ -3958,21 +3954,6 @@ dpif_netdev_meter_set(struct dpif *dpif, 
ofproto_meter_id *meter_id,
 dp->meters[mid] = meter;
 meter_unlock(dp, mid);
 
-meter_id->uint32 = mid; /* Store on success. */
-
-/* Find next free meter */
-if (dp->meter_free == mid) { /* Now taken. */
-do {
-if (++mid >= MAX_METERS) { /* Wrap around */
-mid = 0;
-}
-if (mid == dp->meter_free) { /* Full circle */
-mid = MAX_METERS;
-break;
-}
-} while (dp->meters[mid]);
-dp->meter_free = mid; /* Next free meter or MAX_METERS */
-}
 return 0;
 }
 return ENOMEM;
@@ -4027,11 +4008,6 @@ dpif_netdev_meter_del(struct dpif *dpif,
 meter_lock(dp, meter_id);
 dp_delete_meter(dp, meter_id);
 meter_unlock(dp, meter_id);
-
-/* Keep free meter index as low as possible */
-if (meter_id < dp->meter_free) {
-dp->meter_free = meter_id;
-}
 }
 return error;
 }
diff --git a/ofproto/ofproto-dpif.c b/ofproto/ofproto-dpif.c
index 6a5ffb94fa94..a026d4913731 100644
--- a/ofproto/ofproto-dpif.c
+++ b/ofproto/ofproto-dpif.c
@@ -662,6 +662,7 @@ close_dpif_backer(struct dpif_backer *backer)
 free(backer->type);
 free(backer->dp_version_string);
 dpif_close(backer->dpif);
+id_pool_destroy(backer->meter_ids);
 free(backer);
 }
 
@@ -787,6 +788,15 @@ open_dpif_backer(const char *type, struct dpif_backer 
**backerp)
 = check_variable_length_userdata(backer);
 backer->dp_version_string = dpif_get_dp_version(backer->dpif);
 
+/* Manage Datpath meter IDs if supported. */
+struct ofputil_meter_features features;
+dpif_meter_get_features(backer->dpif, );
+if (features.max_meters) {
+backer->meter_ids = id_pool_create(0, features.max_meters);
+} else {
+backer->meter_ids = NULL;
+}
+
 return error;
 }
 
@@ -5385,6 +5395,17 @@ meter_set(struct ofproto *ofproto_, ofproto_meter_id 
*meter_id,
 {
 struct ofproto_dpif *ofproto = ofproto_dpif_cast(ofproto_);
 
+/* Provider ID unknown. Use backer to allocate a new DP meter */
+if (meter_id->uint32 == UINT32_MAX) {
+if (!ofproto->backer->meter_ids) {
+return EFBIG; /* Datapath does not support meter.  */
+}
+
+if(!id_pool_alloc_id(ofproto->backer->meter_ids, _id->uint32)) {
+return ENOMEM; /* Can't allocate a DP meter. */
+}
+}
+
 switch (dpif_meter_set(ofproto->backer->dpif, meter_id, config)) {
 case 0:
 return 0;
@@ -5414,12 +5435,31 @@ meter_get(const struct ofproto *ofproto_, 
ofproto_meter_id meter_id,
 return OFPERR_OFPMMFC_UNKNOWN_METER;
 }
 
+struct free_meter_id_args {
+struct ofproto_dpif *ofproto;
+ofproto_meter_id meter_id;
+};
+
+static void
+free_meter_id(struct free_meter_id_args *args)
+{
+struct ofproto_dpif *ofproto = args->ofproto;
+
+dpif_meter_del(ofproto->backer->dpif, args->meter_id, NULL, 0);
+id_pool_free_id(ofproto->backer->meter_ids, args->meter_id.uint32);
+free(args);
+}
+
 static void
 meter_del(struct ofproto *ofproto_, ofproto_meter_id meter_id)
 {
-struct ofproto_dpif *ofproto = ofproto_dpif_cast(ofproto_);
+struct free_meter_id_args *arg = xmalloc(sizeof *arg);
 
-dpif_meter_del(ofproto->backer->dpif, meter_id, NULL, 0);
+/* The 'meter_id' may still be in use the xlate code.
+ * Only safe to delet after RCU grace period. */
+arg->ofproto = ofproto_dpif_cast(ofproto_);
+arg->meter_id = meter_id;
+

[ovs-dev] [action upcall meters 2/5] ofproto-dpif: Add 'meter_ids' to backer

2017-04-14 Thread Andy Zhou
Add 'meter_ids', an id-pool object to manage datapath meter id, i.e.
provider_meter_id.

Currently, only userspace datapath supports meter, and it implements
the provider_meter_id management. Moving this function to 'backer'
allows other datapath implementation to share the same logic.

Signed-off-by: Andy Zhou 
---
 lib/dpif-netdev.c  | 24 
 ofproto/ofproto-dpif.c | 44 ++--
 ofproto/ofproto-dpif.h |  4 
 3 files changed, 46 insertions(+), 26 deletions(-)

diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index a14a2ebb5b2a..d5417162b7af 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -260,7 +260,6 @@ struct dp_netdev {
 /* Meters. */
 struct ovs_mutex meter_locks[N_METER_LOCKS];
 struct dp_meter *meters[MAX_METERS]; /* Meter bands. */
-uint32_t meter_free; /* Next free meter. */
 
 /* Protects access to ofproto-dpif-upcall interface during revalidator
  * thread synchronization. */
@@ -3896,9 +3895,6 @@ dpif_netdev_meter_set(struct dpif *dpif, ofproto_meter_id 
*meter_id,
 struct dp_meter *meter;
 int i;
 
-if (mid == UINT32_MAX) {
-mid = dp->meter_free;
-}
 if (mid >= MAX_METERS) {
 return EFBIG; /* Meter_id out of range. */
 }
@@ -3958,21 +3954,6 @@ dpif_netdev_meter_set(struct dpif *dpif, 
ofproto_meter_id *meter_id,
 dp->meters[mid] = meter;
 meter_unlock(dp, mid);
 
-meter_id->uint32 = mid; /* Store on success. */
-
-/* Find next free meter */
-if (dp->meter_free == mid) { /* Now taken. */
-do {
-if (++mid >= MAX_METERS) { /* Wrap around */
-mid = 0;
-}
-if (mid == dp->meter_free) { /* Full circle */
-mid = MAX_METERS;
-break;
-}
-} while (dp->meters[mid]);
-dp->meter_free = mid; /* Next free meter or MAX_METERS */
-}
 return 0;
 }
 return ENOMEM;
@@ -4027,11 +4008,6 @@ dpif_netdev_meter_del(struct dpif *dpif,
 meter_lock(dp, meter_id);
 dp_delete_meter(dp, meter_id);
 meter_unlock(dp, meter_id);
-
-/* Keep free meter index as low as possible */
-if (meter_id < dp->meter_free) {
-dp->meter_free = meter_id;
-}
 }
 return error;
 }
diff --git a/ofproto/ofproto-dpif.c b/ofproto/ofproto-dpif.c
index 6a5ffb94fa94..a026d4913731 100644
--- a/ofproto/ofproto-dpif.c
+++ b/ofproto/ofproto-dpif.c
@@ -662,6 +662,7 @@ close_dpif_backer(struct dpif_backer *backer)
 free(backer->type);
 free(backer->dp_version_string);
 dpif_close(backer->dpif);
+id_pool_destroy(backer->meter_ids);
 free(backer);
 }
 
@@ -787,6 +788,15 @@ open_dpif_backer(const char *type, struct dpif_backer 
**backerp)
 = check_variable_length_userdata(backer);
 backer->dp_version_string = dpif_get_dp_version(backer->dpif);
 
+/* Manage Datpath meter IDs if supported. */
+struct ofputil_meter_features features;
+dpif_meter_get_features(backer->dpif, );
+if (features.max_meters) {
+backer->meter_ids = id_pool_create(0, features.max_meters);
+} else {
+backer->meter_ids = NULL;
+}
+
 return error;
 }
 
@@ -5385,6 +5395,17 @@ meter_set(struct ofproto *ofproto_, ofproto_meter_id 
*meter_id,
 {
 struct ofproto_dpif *ofproto = ofproto_dpif_cast(ofproto_);
 
+/* Provider ID unknown. Use backer to allocate a new DP meter */
+if (meter_id->uint32 == UINT32_MAX) {
+if (!ofproto->backer->meter_ids) {
+return EFBIG; /* Datapath does not support meter.  */
+}
+
+if(!id_pool_alloc_id(ofproto->backer->meter_ids, _id->uint32)) {
+return ENOMEM; /* Can't allocate a DP meter. */
+}
+}
+
 switch (dpif_meter_set(ofproto->backer->dpif, meter_id, config)) {
 case 0:
 return 0;
@@ -5414,12 +5435,31 @@ meter_get(const struct ofproto *ofproto_, 
ofproto_meter_id meter_id,
 return OFPERR_OFPMMFC_UNKNOWN_METER;
 }
 
+struct free_meter_id_args {
+struct ofproto_dpif *ofproto;
+ofproto_meter_id meter_id;
+};
+
+static void
+free_meter_id(struct free_meter_id_args *args)
+{
+struct ofproto_dpif *ofproto = args->ofproto;
+
+dpif_meter_del(ofproto->backer->dpif, args->meter_id, NULL, 0);
+id_pool_free_id(ofproto->backer->meter_ids, args->meter_id.uint32);
+free(args);
+}
+
 static void
 meter_del(struct ofproto *ofproto_, ofproto_meter_id meter_id)
 {
-struct ofproto_dpif *ofproto = ofproto_dpif_cast(ofproto_);
+struct free_meter_id_args *arg = xmalloc(sizeof *arg);
 
-dpif_meter_del(ofproto->backer->dpif, meter_id, NULL, 0);
+/* The 'meter_id' may still be in use the xlate code.
+ * Only safe to delet after RCU grace period. */
+arg->ofproto = ofproto_dpif_cast(ofproto_);
+arg->meter_id = meter_id;
+ 

[ovs-dev] [action upcall meters 1/5] ofproto: Store meters using hmap

2017-04-14 Thread Andy Zhou
Currently, meters are stored in a fixed pointer array. It is not
very efficient since the controller, at least in theory, can
pick any meter id (up to the limits to uint32_t), not necessarily
within the lower end of a region, or in close range to each other.
In particular, OFPM_SLOWPATH and OFPM_CONTROLLER meters are specified
at the high region.

Switching to using hmap. Ofproto layer does not restrict
the number of meters that controller can add, nor does it care
about the value of meter_id. Datapth limits the number of meters
ofproto layer can support at run time.

Signed-off-by: Andy Zhou 
---
 ofproto/ofproto-provider.h |   7 +-
 ofproto/ofproto.c  | 242 +++--
 2 files changed, 146 insertions(+), 103 deletions(-)

diff --git a/ofproto/ofproto-provider.h b/ofproto/ofproto-provider.h
index b7b12cdfd5f4..000326d7f79d 100644
--- a/ofproto/ofproto-provider.h
+++ b/ofproto/ofproto-provider.h
@@ -109,12 +109,9 @@ struct ofproto {
 /* List of expirable flows, in all flow tables. */
 struct ovs_list expirable OVS_GUARDED_BY(ofproto_mutex);
 
-/* Meter table.
- * OpenFlow meters start at 1.  To avoid confusion we leave the first
- * pointer in the array un-used, and index directly with the OpenFlow
- * meter_id. */
+/* Meter table.  */
 struct ofputil_meter_features meter_features;
-struct meter **meters; /* 'meter_features.max_meter' + 1 pointers. */
+struct hmap meters; /* uint32_t indexed 'struct meter *'.  */
 
 /* OpenFlow connections. */
 struct connmgr *connmgr;
diff --git a/ofproto/ofproto.c b/ofproto/ofproto.c
index 7440d5b52092..8c4c7e67f213 100644
--- a/ofproto/ofproto.c
+++ b/ofproto/ofproto.c
@@ -281,7 +281,8 @@ static uint64_t pick_fallback_dpid(void);
 static void ofproto_destroy__(struct ofproto *);
 static void update_mtu(struct ofproto *, struct ofport *);
 static void update_mtu_ofproto(struct ofproto *);
-static void meter_delete(struct ofproto *, uint32_t first, uint32_t last);
+static void meter_delete(struct ofproto *, uint32_t);
+static void meter_delete_all(struct ofproto *);
 static void meter_insert_rule(struct rule *);
 
 /* unixctl. */
@@ -566,8 +567,7 @@ ofproto_create(const char *datapath_name, const char 
*datapath_type,
 } else {
 memset(>meter_features, 0, sizeof ofproto->meter_features);
 }
-ofproto->meters = xzalloc((ofproto->meter_features.max_meters + 1)
-  * sizeof(struct meter *));
+hmap_init(>meters);
 
 /* Set the initial tables version. */
 ofproto_bump_tables_version(ofproto);
@@ -1635,12 +1635,8 @@ ofproto_destroy(struct ofproto *p, bool del)
 return;
 }
 
-if (p->meters) {
-meter_delete(p, 1, p->meter_features.max_meters);
-p->meter_features.max_meters = 0;
-free(p->meters);
-p->meters = NULL;
-}
+meter_delete_all(p);
+hmap_destroy(>meters);
 
 ofproto_flush__(p);
 HMAP_FOR_EACH_SAFE (ofport, next_ofport, hmap_node, >ports) {
@@ -6211,14 +6207,37 @@ handle_flow_monitor_cancel(struct ofconn *ofconn, const 
struct ofp_header *oh)
  * 'provider_meter_id' is for the provider's private use.
  */
 struct meter {
+struct hmap_node node;  /* In ofproto->meters. */
 long long int created;  /* Time created. */
 struct ovs_list rules;  /* List of "struct rule_dpif"s. */
+uint32_t id;/* OpenFlow meter_id. */
 ofproto_meter_id provider_meter_id;
 uint16_t flags; /* Meter flags. */
 uint16_t n_bands;   /* Number of meter bands. */
 struct ofputil_meter_band *bands;
 };
 
+static struct meter *
+ofproto_get_meter(const struct ofproto *ofproto, uint32_t meter_id)
+{
+struct meter *meter;
+uint32_t hash = hash_int(meter_id, 0);
+
+HMAP_FOR_EACH_WITH_HASH (meter, node, hash, >meters) {
+if (meter->id == meter_id) {
+return meter;
+}
+}
+
+return NULL;
+}
+
+static void
+ofproto_add_meter(struct ofproto *ofproto, struct meter *meter)
+{
+hmap_insert(>meters, >node, hash_int(meter->id, 0));
+}
+
 /*
  * This is used in instruction validation at flow set-up time, to map
  * the OpenFlow meter ID to the corresponding datapath provider meter
@@ -6230,7 +6249,7 @@ ofproto_fix_meter_action(const struct ofproto *ofproto,
  struct ofpact_meter *ma)
 {
 if (ma->meter_id && ma->meter_id <= ofproto->meter_features.max_meters) {
-const struct meter *meter = ofproto->meters[ma->meter_id];
+const struct meter *meter = ofproto_get_meter(ofproto, ma->meter_id);
 
 if (meter && meter->provider_meter_id.uint32 != UINT32_MAX) {
 /* Update the action with the provider's meter ID, so that we
@@ -6250,7 +6269,7 @@ meter_insert_rule(struct rule *rule)
 {
 const struct rule_actions *a = rule_get_actions(rule);
 uint32_t meter_id = ofpacts_get_meter(a->ofpacts, 

Re: [ovs-dev] [PATCH 1/6] doc: Also delete stamp file in clean-docs target.

2017-04-14 Thread Ben Pfaff
On Fri, Apr 14, 2017 at 06:16:19PM +0100, Stephen Finucane wrote:
> On Thu, 2017-04-13 at 21:43 -0700, Ben Pfaff wrote:
> > Otherwise "make docs-check" won't necessarily do anything since its
> > apparent target is up to date.
> > 
> > Signed-off-by: Ben Pfaff 
> 
> Acked-by: Stephen Finucane 

Thanks!  I applied this to master.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 2/6] doc: Avoid need to generate conf.py.

2017-04-14 Thread Ben Pfaff
On Fri, Apr 14, 2017 at 06:20:43PM +0100, Stephen Finucane wrote:
> On Thu, 2017-04-13 at 21:43 -0700, Ben Pfaff wrote:
> > It's awkward to have to at the same time generate conf.py from
> > conf.py.in
> > and to keep both versions in the repository.  This avoids the issue.
> > 
> > Signed-off-by: Ben Pfaff 
> > ---
> >  Documentation/automake.mk |  10 --
> >  Documentation/conf.py |  23 ++-
> >  Documentation/conf.py.in  | 349 
> > --
> >  3 files changed, 19 insertions(+), 363 deletions(-)
> >  delete mode 100644 Documentation/conf.py.in
> > 
> > diff --git a/Documentation/automake.mk b/Documentation/automake.mk
> > index ec60e0b1e831..9911668c1ca9 100644
> > --- a/Documentation/automake.mk
> > +++ b/Documentation/automake.mk
> > @@ -3,7 +3,6 @@ DOC_SOURCE = \
> >     Documentation/_static/logo.png \
> >     Documentation/_static/overview.png \
> >     Documentation/conf.py \
> > -   Documentation/conf.py.in \
> >     Documentation/index.rst \
> >     Documentation/contents.rst \
> >     Documentation/intro/index.rst \
> > @@ -125,15 +124,6 @@ clean-docs:
> >     rm -rf $(SPHINXBUILDDIR)/linkcheck
> >     rm -f docs-check
> >  CLEAN_LOCAL += clean-docs
> > -
> > -ALL_LOCAL += $(srcdir)/Documentation/conf.py
> > -$(srcdir)/Documentation/conf.py: $(srcdir)/Documentation/conf.py.in
> > -   $(AM_V_GEN)($(ro_shell) && sed -e
> > 's,[@]VERSION[@],$(VERSION),g' \
> > -   -e 's,[@]OVS_MAJOR[@],$(shell echo $(VERSION) | cut
> > -f1 -d.),g' \
> > -   -e 's,[@]OVS_MINOR[@],$(shell echo $(VERSION) | cut
> > -f2 -d.),g') \
> > -   < $(srcdir)/Documentation/$(@F).in > $(@F).tmp ||
> > exit 1; \
> > -   if cmp -s $(@F).tmp $@; then touch $@; rm $(@F).tmp; else mv
> > $(@F).tmp $@; fi
> > -
> >  endif
> >  .PHONY: check-docs
> >  .PHONY: clean-docs
> > diff --git a/Documentation/conf.py b/Documentation/conf.py
> > index ae672cbe7b0d..bfd7f33d88ba 100644
> > --- a/Documentation/conf.py
> > +++ b/Documentation/conf.py
> > @@ -1,4 +1,3 @@
> > -# Generated automatically -- do not modify!-*- buffer-read-only: 
> > t -*-
> >  # -*- coding: utf-8 -*-
> >  #
> >  # Open vSwitch documentation build configuration file, created by
> > @@ -63,10 +62,26 @@ author = u'The Open vSwitch Development
> > Community'
> >  # |version| and |release|, also used in various other places
> > throughout the
> >  # built documents.
> >  #
> > +import string
> > +import sys
> > +def get_release():
> > +filename = "../configure.ac"
> > +with open(filename, 'rU') as f:
> > +for line in f:
> > +if 'AC_INIT' in line:
> > +# Parse "AC_INIT(openvswitch, 2.7.90, bugs@openvswit
> > ch.org)":
> > +return line.split(',')[1].strip(string.whitespace +
> > '[]')
> > +sys.stderr.write('%s: failed to determine Open vSwitch
> > version\n'
> > + % filename)
> > +sys.exit(1)
> > +release = get_release() # The full version, including alpha/beta/rc
> > tags 
> > +
> >  # The short X.Y version.
> > -version = u'2.7'
> > -# The full version, including alpha/beta/rc tags.
> > -release = u'2.7.90'
> > +#
> > +# However, it's important to know the difference between, e.g., 2.7
> > +# and 2.7.90, which can be very different versions (2.7.90 may be
> > much
> > +# closer to 2.8 than to 2.7), so check for that.
> > +version = release if '.90' in release else
> > '.'.join(release.split('.')[0:2])
> 
> nit: You probably don't need to have this as a function. Might also
> want to run it through pep8/flake8. Other than that though, lgtm.
> 
> Acked-by: Stephen Finucane 

Thanks!  I made your and flake8's suggested changes and applied this to
master.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v3] netdev-dpdk: Implement Tx intermediate queue for dpdk ports.

2017-04-14 Thread Bodireddy, Bhanuprakash
The latency stats published in 
v3(https://mail.openvswitch.org/pipermail/ovs-dev/2017-February/328363.html)
seems to be erroneous due to the way the RFC2544 test was configured in IXIA.  
Please find below the updated latency stats.
Only case 1 and Case 2 stats are published below, where the burst size is 32.

BTW,  While calculating Latency, the stats parameter was set as 'Cut Through' 
meaning
Latency will be calculated as first bit in, first bit out.  Also the acceptable 
Frame Loss % is
set to 1%. Note that the below results are aggregated results of approximately 
9 iterations.

Benchmarks are done on the same commit(83ede47a48eb92053f66815e462e94a39d8a1f2c)
as v3. 

Case 1:   Matching IP flow rules for each IXIA stream
###
Packet64 128
256  512 
Branch  Master  Patch  Master  Patch  MasterPatch 
Master  Patch 
Min  25360199000  30260208890  23490131320 
19620   118700   (ns)
Max   854260577600   868680302440197420195090   160930   
184740   (ns)
Avg 384182   261213412612262091   190386166025   133661   
154787   (ns)

1024  1280  
1518
   Master   Patch  Master PatchMaster Patch
Min   20290  180650  30370  157260  19680   147550   (ns)
Max  304290 239750   178570  216650199140   209050  (ns)
Avg260350 209316   149328  185930170091  177033   (ns)


case 2: ovs-ofctl add-flow br0 in_port=1,action=output:2
###
Packet 64  128 
256  512
Branch  MasterPatch Master  Patch   Master  Patch   Master  
Patch   
Min   27870  30680  1308029160 12000 18970  
   14520  14610  (ns)
Max323790205930   282360   289470   39170  51610
 48340  80670  (ns)
Avg  162219163582 4068541677 21582 41546
 35017  66192  (ns)

 1024 1280  
  1518
   Master   Patch Master Patch  Master Patch
Min  10820  29670   11270   24740  11510   24780  (ns)
Max 29480  7030029900   39010  32460  40010  (ns)
Avg   18926  54582   19239   30636  19087  16722  (ns)

Regards,
Bhanuprakash. 

>-Original Message-
>From: Bodireddy, Bhanuprakash
>Sent: Thursday, February 2, 2017 10:15 PM
>To: d...@openvswitch.org
>Cc: i.maxim...@samsung.com; ktray...@redhat.com; diproiet...@ovn.org;
>Bodireddy, Bhanuprakash ; Fischetti,
>Antonio ; Markus Magnusson
>
>Subject: [PATCH v3] netdev-dpdk: Implement Tx intermediate queue for
>dpdk ports.
>
>After packet classification, packets are queued in to batches depending on the
>matching netdev flow. Thereafter each batch is processed to execute the
>related actions. This becomes particularly inefficient if there are few packets
>in each batch as rte_eth_tx_burst() incurs expensive MMIO writes.
>
>This commit adds back intermediate queue implementation. Packets are
>queued and burst when the packet count exceeds threshold. Also drain logic
>is refactored to handle packets hanging in the tx queues. Testing shows
>significant performance gains with this implementation.
>
>Fixes: b59cc14e032d("netdev-dpdk: Use instant sending instead of queueing
>of packets.")
>Signed-off-by: Bhanuprakash Bodireddy
>>
>Signed-off-by: Antonio Fischetti 
>Co-authored-by: Antonio Fischetti 
>Signed-off-by: Markus Magnusson 
>Co-authored-by: Markus Magnusson 
>---
>v2->v3
>  * Refactor the code
>  * Use thread local copy 'send_port_cache' instead of 'tx_port' while draining
>  * Invoke dp_netdev_drain_txq_port() to drain the packets from the queue
>as
>part of pmd reconfiguration that gets triggered due to port
>addition/deletion
>or change in pmd-cpu-mask.
>  * Invoke netdev_txq_drain() from xps_get_tx_qid() to drain packets in old
>queue. This is possible in XPS case where the tx queue can change after
>timeout.
>  * Fix another bug in netdev_dpdk_eth_tx_burst() w.r.t 'txq->count'.
>
>Latency stats:
>Collected the latency stats with PHY2PHY loopback case using 30 IXIA streams
>/UDP packets/uni direction traffic. All the stats are in nanoseconds. Results
>below compare latency results between Master vs patch.
>
>case 1: Matching IP flow rules 

Re: [ovs-dev] [PATCH 3/3] doc: Remove some link targets that aren't used anywhere.

2017-04-14 Thread Stephen Finucane
On Fri, 2017-04-14 at 10:31 -0700, Ben Pfaff wrote:
> It seems that there's a style here that every title has a separate
> link
> target even if it's never used.  Maybe there is some reason for that
> (maybe
> it should be explained in the documentation style document?), but if
> not
> then it seems reasonable to leave out the unused ones since they look
> to
> me like just visual clutter.
> 
> Signed-off-by: Ben Pfaff 

Acked-by: Stephen Finucane 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 2/2] lib/automake.mk: don't install runtime directories

2017-04-14 Thread Ben Pfaff
On Thu, Apr 13, 2017 at 01:51:30PM -0400, Aaron Conole wrote:
> The Open vSwitch run, log, and DB directories are installed as part of the
> normal `make install` process.  However, this means they are created with
> user and group ownership that may conflict with the desired user.  For
> example, running `make install` as root will install those files as
> root:root, whereas the runtime user desired may be openvswitch:openvswitch.
> 
> Since these directories are automatically created as part of the ovs-ctl
> command, and with the correct user:group permissions, it makes sense to
> delay creation until these directories are actually required.
> 
> Reviewed-by: Markos Chandras 
> Signed-off-by: Aaron Conole 

This seems like a good idea to me once we figure out patch 1.

Acked-by: Ben Pfaff 
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [RFC 2/4] doc: Convert ovs-vlan-test to rST

2017-04-14 Thread Stephen Finucane
On Thu, 2017-04-13 at 21:27 -0700, Ben Pfaff wrote:
> Thank you!  rST is much more readable than nroff.  I have some
> comments
> below.
> 
> On Mon, Apr 10, 2017 at 01:12:28PM +0100, Stephen Finucane wrote:
> > Let's start with a simple one that lets us focus on setting up most
> > of the required "infrastructure" for building man pages using
> > Sphinx.
> > 
> > There are a couple of things worth noting here:
> > 
> > - The 'check-htmldocs' target becomes 'check-docs' as its now
> >   responsible for building man page docs too.
> > 
> > - The outputted file will always have a '.1' suffix. This is 
> > Sphinx's
> >   decision, and likely stems from the man page guidelines [1] which
> >   state that "the name of the man page's source file...is the name
> > of
> >   the command, function or file name, followed by a dot, followed
> > by the
> >   section character".
> 
> It looks to me like the last element of the tuples inside man_pages
> in conf.py controls the section.  When I changed 1 to 8 there, it
> switched the manpage to section 8.  So I made that change in conf.py
> and I removed the above paragraph from the commit message.
> 
> > Other than that, hurrah for (mostly) legible syntaxes.
> > 
> > [1] http://www.tldp.org/HOWTO/Man-Page/q2.html
> > 
> > Signed-off-by: Stephen Finucane 
> > ---
> > I don't know if this is correctly integrated into the docs build
> > system or not. I need someone to double check this for me. In
> > particular, I think I need to integrate the 'dh_sphinxdoc' package
> > [2] into the Debian build but I've no idea how to.
> > 
> > [2] http://manpages.ubuntu.com/manpages/zesty/man1/dh_sphinxdoc.1.h
> > tml
> 
> I spent a couple of hours working on the build and install system
> here.
> 
> My first thought was to add rules to allow Automake to find and
> install the generated manpages.  This turned out to be a huge mess
> that required tons of nasty GNU Make specific crap in the makefiles,
> and I didn't like it.
> 
> My second approach was better.  I gave up on integrating with the
> builtin Automake rules for manpages.  All those really do anyway is
> handle install and uninstall, so I wrote some Make rules to do that.
> They're ugly because they're make+shell, but readable enough if you
> squint.
> 
> The main question for install and uninstall is how to choose the
> right section.  The easiest way seems to be to give the .rst files
> names that embed the section, like "ovs-vlan-test.8.rst".  This is
> also handy to distinguish manpages with the same name but in
> different sections, which is sometimes a reasonable thing to have, so
> that's what I did.

I still need to have a look into the patch (anything to do with
autotools is generally worth avoiding, heh), but I had a crazy idea
over the weekend: couldn't we just include the compiled man pages in
the source? The only issue with this would be the '.TH' line, but there
are solutions for avoiding this [1][2]. If we have man pages included
in the source, would that be adequate to use the standard autotools
setup for manpages?

> The patch didn't delete the old manpage, so I did that too.
> 
> I noticed that the "Synposis" section of the new ovs-vlan-test.rst
> did not use bold and italic in the canonical way for manpages, so I
> added * and ** in the right places.

++

Stephen

[1] https://gitlab.labs.nic.cz/labs/knot/blob/bab62de/doc/Makefile.am#L
149-158
[2] https://gitlab.labs.nic.cz/labs/knot/blob/bab62de/doc/man/kdig.1in
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 1/2] install-doc: suggest to use ovs-ctl for start/stop

2017-04-14 Thread Ben Pfaff
On Thu, Apr 13, 2017 at 01:51:29PM -0400, Aaron Conole wrote:
> The install documentation guided users to manually start/stop
> daemons.  This is good information to have, but with the
> existence of ovs-ctl, is probably not the best way to start
> guiding new users of ovs.
> 
> Suggest that users start by running ovs-ctl start, and
> document the ability to selectively start/stop the daemons.
> The ovs-ctl script is already mentioned a bit in the install
> doc, so this just reinforces its use.
> 
> Suggested-by: Ben Pfaff 
> Signed-off-by: Aaron Conole 

Thanks for working on the documentation!

I am a little worried about adding the casual suggestion to call
ovs-ctl, because ovs-ctl is installed in an out-of-the-way scripts
directory that is not in $PATH by default.  This is because ovs-ctl is
mostly meant as a tool for use by system init scripts rather than one
for the system administrator.  So, if we suggest that the admin should
call it directly, then we should either advise the admin where to find
it, or we should consider moving it to the default $PATH.  What do you
think is the best idea?
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] debian, xenserver: Update logrotate config to match RHEL.

2017-04-14 Thread Greg Rose
On Fri, 2017-04-14 at 10:03 -0700, Ben Pfaff wrote:
> On Fri, Apr 14, 2017 at 09:05:41AM -0700, Greg Rose wrote:
> > On Thu, 2017-04-13 at 10:47 -0700, Ben Pfaff wrote:
> > > Commit 618a5b45ae8b ("rhel: Avoid logrotate error if /var/run/openvswitch
> > > does not exist") updated the RHEL logrotate configuration.  This commit
> > > makes similar changes for Debian, by synchronizing with the RHEL version.
> > > 
> > > In particular:
> > > 
> > > - Indent to match logrotate.conf(5) examples.
> > > 
> > > - Use "sharedscripts" flag, because the postrotate script only needs 
> > > to
> > >   run once regardless of the number of rotations.
> > > 
> > > - Drop "delaycompress", because the postrotate script does make 
> > > daemons
> > >   reopen their log files.
> > > 
> > > - Ignore errors calling vlog/reopen.
> > > 
> > > Also make similar changes to the xenserver logrotate script.  I really
> > > don't know if anyone uses the xenserver packaging anymore though.
> > 
> > Hi Ben,
> > 
> > I'm actually trying to build the latest upstream OVS from github on a
> > Xen Server 7.1 system.  I have followed the instructions as well as I
> > could and everything seems to build pretty well in the Xen Server 7.1
> > DDK VM until finally it fails with the following error output:
> > 
> > Checking for unpackaged
> > file(s): /usr/lib/rpm/check-files 
> > /prj/gvrose/rpmbuild/BUILDROOT/openvswitch-2.7.90-1.x86_64
> > error: Installed (but unpackaged) file(s) found:
> >/usr/bin/ovs-tcpdump
> >/usr/share/man/man7/ovs-fields.7.gz
> >/usr/share/man/man8/ovs-tcpdump.8.gz
> >/usr/share/openvswitch/scripts/ovndb-servers.ocf
> > 
> > 
> > RPM build errors:
> > Installed (but unpackaged) file(s) found:
> >/usr/bin/ovs-tcpdump
> >/usr/share/man/man7/ovs-fields.7.gz
> >/usr/share/man/man8/ovs-tcpdump.8.gz
> >/usr/share/openvswitch/scripts/ovndb-servers.ocf
> > 
> > I suppose there is some file list that needs updating?
> 
> Yes, please look in xenserver/openvswitch-xen.spec.in.
> 
> However, this may be a good sign that no one cares about XenServer,
> since Open vSwitch 2.6.0, released in September last year, also installs
> at least /usr/bin/ovs-tcpdump, and no one has complained.  Perhaps we
> should just delete the packaging, instead, and save ourselves some
> trouble.

LOL - sure. Sounds good to me.  ;^)

In that case I won't bother sending a patch.

- Greg



___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH 3/3] doc: Remove some link targets that aren't used anywhere.

2017-04-14 Thread Ben Pfaff
It seems that there's a style here that every title has a separate link
target even if it's never used.  Maybe there is some reason for that (maybe
it should be explained in the documentation style document?), but if not
then it seems reasonable to leave out the unused ones since they look to
me like just visual clutter.

Signed-off-by: Ben Pfaff 
---
 Documentation/topics/testing.rst | 12 
 1 file changed, 12 deletions(-)

diff --git a/Documentation/topics/testing.rst b/Documentation/topics/testing.rst
index 365416586a94..498aea4d0ae1 100644
--- a/Documentation/topics/testing.rst
+++ b/Documentation/topics/testing.rst
@@ -38,8 +38,6 @@ described in :doc:`/intro/install/general`. You do not need 
to install Open
 vSwitch or to build or load the kernel module to run these test suites. You do
 not need supervisor privilege to run these test suites.
 
-.. _testing-unit-tests:
-
 Unit Tests
 ~~
 
@@ -126,8 +124,6 @@ All the same options are available via TESTSUITEFLAGS.
   You may find that the valgrind results are easier to interpret if you put
   ``-q`` in ``~/.valgrindrc``, since that reduces the amount of output.
 
-.. _testing-oftest:
-
 OFTest
 ~~
 
@@ -181,8 +177,6 @@ standard and other standards.
   of Open vSwitch and OFTest in your bug report, plus any other information
   needed to reproduce the problem.
 
-.. _ryu:
-
 Ryu
 ~~~
 
@@ -211,8 +205,6 @@ omit ``RYUDIR``
   of Open vSwitch and Ryu in your bug report, plus any other information
   needed to reproduce the problem.
 
-.. _datapath-testing:
-
 Datapath testing
 
 
@@ -342,8 +334,6 @@ You should invoke scan-view to view analysis results. The 
last line of output
 from ``clang-analyze`` will list the command (containing results directory)
 that you should invoke to view the results on a browser.
 
-.. _testing-ci:
-
 Continuous Integration with Travis CI
 -
 
@@ -384,8 +374,6 @@ Instructions to setup travis-ci for your GitHub repository:
 4. Pushing a commit to the repository which breaks the build or the
testsuite will now trigger a email sent to myl...@mydomain.org
 
-.. _testing-vsperf:
-
 vsperf
 --
 
-- 
2.10.2

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH 2/3] doc: Convert a named link into a title link.

2017-04-14 Thread Ben Pfaff
It seems like this is easier to read in the source form.  I don't
understand what is typical style here, though.

Signed-off-by: Ben Pfaff 
---
 Documentation/topics/testing.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/topics/testing.rst b/Documentation/topics/testing.rst
index f67b148efdd5..365416586a94 100644
--- a/Documentation/topics/testing.rst
+++ b/Documentation/topics/testing.rst
@@ -188,7 +188,7 @@ Ryu
 
 Ryu is an OpenFlow controller written in Python that includes an extensive
 OpenFlow testsuite. Open vSwitch includes a Makefile target to run Ryu in
-"dummy mode". See :ref:`testing-oftest` above for an explanation of dummy mode.
+"dummy mode". See `OFTest`_ above for an explanation of dummy mode.
 
 To run Ryu tests with Open vSwitch, first read and follow the instructions
 under **Testing** above. Second, obtain a copy of Ryu, install its
-- 
2.10.2

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH 1/3] doc: Convert some bolded words into links.

2017-04-14 Thread Ben Pfaff
Seems more user-friendly.

Signed-off-by: Ben Pfaff 
---
 Documentation/intro/install/general.rst | 5 +++--
 Documentation/topics/testing.rst| 5 +++--
 2 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/Documentation/intro/install/general.rst 
b/Documentation/intro/install/general.rst
index f2cd2b13b2b2..1ea82b6ae709 100644
--- a/Documentation/intro/install/general.rst
+++ b/Documentation/intro/install/general.rst
@@ -365,7 +365,8 @@ Building
  see unusual warnings when you use both together, consider disabling
  ccache.
 
-2. Consider running the testsuite. Refer to **Testing** for instructions.
+2. Consider running the testsuite. Refer to :doc:`/topics/testing` for
+   instructions.
 
 3. Run ``make install`` to install the executables and manpages into the
running system, by default under ``/usr/local``::
@@ -499,7 +500,7 @@ upgrade the database schema:
   $ ovsdb-tool convert /usr/local/etc/openvswitch/conf.db \
   vswitchd/vswitch.ovsschema
 
-4. Start the Open vSwitch daemons as described under **Starting** above.
+4. Start the Open vSwitch daemons as described under `Starting`_ above.
 
 Hot Upgrading
 -
diff --git a/Documentation/topics/testing.rst b/Documentation/topics/testing.rst
index f48665b10710..f67b148efdd5 100644
--- a/Documentation/topics/testing.rst
+++ b/Documentation/topics/testing.rst
@@ -237,8 +237,9 @@ Vagrant
   Requires Vagrant (version 1.7.0 or later) and a compatible hypervisor
 
 .. note::
-  You must **Bootstrap** and **Configure** the sources before you run the steps
-  described here.
+  You must bootstrap and configure the sources (see
+  doc:`/intro/install/general`) before you run the steps described
+  here.
 
 A Vagrantfile is provided allowing to compile and provision the source tree as
 found locally in a virtual machine using the following command::
-- 
2.10.2

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH net-next v2 4/6] vxlan: check valid combinations of address scopes

2017-04-14 Thread Sergei Shtylyov

On 04/14/2017 07:44 PM, Matthias Schiffer wrote:


* Multicast addresses are never valid as local address
* Link-local IPv6 unicast addresses may only be used as remote when the
  local address is link-local as well
* Don't allow link-local IPv6 local/remote addresses without interface

We also store in the flags field if link-local addresses are used for the
follow-up patches that actually make VXLAN over link-local IPv6 work.

Signed-off-by: Matthias Schiffer 
---

v2: was "vxlan: don't allow link-local IPv6 local/remote addresses without
interface" before. v2 does a lot more checks and adds the
VXLAN_F_IPV6_LINKLOCAL flag.

 drivers/net/vxlan.c | 35 +++
 include/net/vxlan.h |  2 ++
 2 files changed, 37 insertions(+)

diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
index 07f89b037681..95a71546e8f2 100644
--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -2881,11 +2881,39 @@ static int vxlan_config_validate(struct net *src_net, 
struct vxlan_config *conf,
if (conf->saddr.sa.sa_family != conf->remote_ip.sa.sa_family)
return -EINVAL;

+   if (vxlan_addr_multicast(>saddr))
+   return -EINVAL;
+
if (conf->saddr.sa.sa_family == AF_INET6) {
if (!IS_ENABLED(CONFIG_IPV6))
return -EPFNOSUPPORT;
use_ipv6 = true;
conf->flags |= VXLAN_F_IPV6;
+
+   if (!(conf->flags & VXLAN_F_COLLECT_METADATA)) {
+   int local_type =
+   ipv6_addr_type(>saddr.sin6.sin6_addr);
+   int remote_type =
+   ipv6_addr_type(>remote_ip.sin6.sin6_addr);
+
+   if (local_type & IPV6_ADDR_LINKLOCAL) {
+   if (!(remote_type & IPV6_ADDR_LINKLOCAL) &&
+   (remote_type != IPV6_ADDR_ANY)) {
+   pr_info("invalid combination of address 
scopes\n");


   Maybe pr_err()?


+   return -EINVAL;
+   }
+
+   conf->flags |= VXLAN_F_IPV6_LINKLOCAL;
+   } else {
+   if (remote_type ==
+   (IPV6_ADDR_UNICAST | IPV6_ADDR_LINKLOCAL)) {
+   pr_info("invalid combination of address 
scopes\n");


   Here as well...


+   return -EINVAL;
+   }
+
+   conf->flags &= ~VXLAN_F_IPV6_LINKLOCAL;
+   }
+   }
}

if (conf->label && !use_ipv6) {

[...]

MBR, Sergei

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 2/6] doc: Avoid need to generate conf.py.

2017-04-14 Thread Stephen Finucane
On Thu, 2017-04-13 at 21:43 -0700, Ben Pfaff wrote:
> It's awkward to have to at the same time generate conf.py from
> conf.py.in
> and to keep both versions in the repository.  This avoids the issue.
> 
> Signed-off-by: Ben Pfaff 
> ---
>  Documentation/automake.mk |  10 --
>  Documentation/conf.py |  23 ++-
>  Documentation/conf.py.in  | 349 
> --
>  3 files changed, 19 insertions(+), 363 deletions(-)
>  delete mode 100644 Documentation/conf.py.in
> 
> diff --git a/Documentation/automake.mk b/Documentation/automake.mk
> index ec60e0b1e831..9911668c1ca9 100644
> --- a/Documentation/automake.mk
> +++ b/Documentation/automake.mk
> @@ -3,7 +3,6 @@ DOC_SOURCE = \
>   Documentation/_static/logo.png \
>   Documentation/_static/overview.png \
>   Documentation/conf.py \
> - Documentation/conf.py.in \
>   Documentation/index.rst \
>   Documentation/contents.rst \
>   Documentation/intro/index.rst \
> @@ -125,15 +124,6 @@ clean-docs:
>   rm -rf $(SPHINXBUILDDIR)/linkcheck
>   rm -f docs-check
>  CLEAN_LOCAL += clean-docs
> -
> -ALL_LOCAL += $(srcdir)/Documentation/conf.py
> -$(srcdir)/Documentation/conf.py: $(srcdir)/Documentation/conf.py.in
> - $(AM_V_GEN)($(ro_shell) && sed -e
> 's,[@]VERSION[@],$(VERSION),g' \
> - -e 's,[@]OVS_MAJOR[@],$(shell echo $(VERSION) | cut
> -f1 -d.),g' \
> - -e 's,[@]OVS_MINOR[@],$(shell echo $(VERSION) | cut
> -f2 -d.),g') \
> - < $(srcdir)/Documentation/$(@F).in > $(@F).tmp ||
> exit 1; \
> - if cmp -s $(@F).tmp $@; then touch $@; rm $(@F).tmp; else mv
> $(@F).tmp $@; fi
> -
>  endif
>  .PHONY: check-docs
>  .PHONY: clean-docs
> diff --git a/Documentation/conf.py b/Documentation/conf.py
> index ae672cbe7b0d..bfd7f33d88ba 100644
> --- a/Documentation/conf.py
> +++ b/Documentation/conf.py
> @@ -1,4 +1,3 @@
> -# Generated automatically -- do not modify!-*- buffer-read-only: 
> t -*-
>  # -*- coding: utf-8 -*-
>  #
>  # Open vSwitch documentation build configuration file, created by
> @@ -63,10 +62,26 @@ author = u'The Open vSwitch Development
> Community'
>  # |version| and |release|, also used in various other places
> throughout the
>  # built documents.
>  #
> +import string
> +import sys
> +def get_release():
> +filename = "../configure.ac"
> +with open(filename, 'rU') as f:
> +for line in f:
> +if 'AC_INIT' in line:
> +# Parse "AC_INIT(openvswitch, 2.7.90, bugs@openvswit
> ch.org)":
> +return line.split(',')[1].strip(string.whitespace +
> '[]')
> +sys.stderr.write('%s: failed to determine Open vSwitch
> version\n'
> + % filename)
> +sys.exit(1)
> +release = get_release() # The full version, including alpha/beta/rc
> tags 
> +
>  # The short X.Y version.
> -version = u'2.7'
> -# The full version, including alpha/beta/rc tags.
> -release = u'2.7.90'
> +#
> +# However, it's important to know the difference between, e.g., 2.7
> +# and 2.7.90, which can be very different versions (2.7.90 may be
> much
> +# closer to 2.8 than to 2.7), so check for that.
> +version = release if '.90' in release else
> '.'.join(release.split('.')[0:2])

nit: You probably don't need to have this as a function. Might also
want to run it through pep8/flake8. Other than that though, lgtm.

Acked-by: Stephen Finucane 

>  # The language for content autogenerated by Sphinx. Refer to
> documentation
>  # for a list of supported languages.
> diff --git a/Documentation/conf.py.in b/Documentation/conf.py.in
> deleted file mode 100644
> index 5ed7006228f2..
> --- a/Documentation/conf.py.in
> +++ /dev/null
> @@ -1,349 +0,0 @@
> -# -*- coding: utf-8 -*-
> -#
> -# Open vSwitch documentation build configuration file, created by
> -# sphinx-quickstart on Fri Sep 30 09:57:36 2016.
> -#
> -# This file is execfile()d with the current directory set to its
> -# containing dir.
> -#
> -# Note that not all possible configuration values are present in
> this
> -# autogenerated file.
> -#
> -# All configuration values have a default; values that are commented
> out
> -# serve to show the default.
> -
> -# If extensions (or modules to document with autodoc) are in another
> directory,
> -# add these directories to sys.path here. If the directory is
> relative to the
> -# documentation root, use os.path.abspath to make it absolute, like
> shown here.
> -#
> -# import os
> -# import sys
> -# sys.path.insert(0, os.path.abspath('.'))
> -try:
> -import ovs_sphinx_theme
> -use_ovs_theme = True
> -except ImportError:
> -print("Cannot find 'ovs_sphinx' package. Falling back to default
> theme.")
> -use_ovs_theme = False
> -
> -# -- General configuration ---
> -
> -
> -# If your documentation needs a minimal Sphinx version, state it
> here.
> -#
> -needs_sphinx = '1.1'
> -
> -# Add any Sphinx extension module names 

[ovs-dev] [PATCH] ofproto: Add support of OFPR_PACKET_OUT as packet-in reason

2017-04-14 Thread Yi-Hung Wei
This patch adds support of OFPR_PACKET_OUT as the packet-in reason.
This packet-in reason is a required feature for OF1.4+, and it indicates
that the associated packet-in message to the controller is triggered when
the switch is processing a packet-out message. This reason code is enabled
by default when OF1.4+ is used.

Signed-off-by: Yi-Hung Wei 
---
 Documentation/topics/design.rst   |  1 +
 Documentation/topics/openflow.rst |  3 ++-
 include/openvswitch/ofp-util.h|  2 ++
 lib/ofp-util.c| 16 
 ofproto/connmgr.c |  5 +++--
 ofproto/ofproto-dpif-xlate.c  |  7 ++-
 ofproto/ofproto-dpif-xlate.h  |  3 +++
 ofproto/ofproto-dpif.c|  1 +
 tests/ofproto-dpif.at | 12 ++--
 tests/ofproto.at  | 12 ++--
 10 files changed, 42 insertions(+), 20 deletions(-)

diff --git a/Documentation/topics/design.rst b/Documentation/topics/design.rst
index adc407ac9227..eb290ab68692 100644
--- a/Documentation/topics/design.rst
+++ b/Documentation/topics/design.rst
@@ -79,6 +79,7 @@ that the message is suppressed.
   ``OFPR_INVALID_TTL``  ------
   ``OFPR_ACTION_SET`` (OF1.4+)  yes---
   ``OFPR_GROUP`` (OF1.4+)   yes---
+  ``OFPR_PACKET_OUT`` (OF1.4+)  yes---
   === === =
 
 .. table:: ``OFPT_FLOW_REMOVED`` / ``NXT_FLOW_REMOVED``
diff --git a/Documentation/topics/openflow.rst 
b/Documentation/topics/openflow.rst
index 771246706958..7adf930989a4 100644
--- a/Documentation/topics/openflow.rst
+++ b/Documentation/topics/openflow.rst
@@ -279,7 +279,8 @@ features are listed in the previous section.
 
   Distinguish ``OFPR_APPLY_ACTION``, ``OFPR_ACTION_SET``, ``OFPR_GROUP``,
   ``OFPR_PACKET_OUT``.  ``NO_MATCH`` was renamed to ``OFPR_TABLE_MISS``.
-  (OFPR_ACTION_SET and OFPR_GROUP are now supported)
+
+  All required features are now supported.
 
   (EXT-136)
 
diff --git a/include/openvswitch/ofp-util.h b/include/openvswitch/ofp-util.h
index f664055c3939..53a7a9d254ed 100644
--- a/include/openvswitch/ofp-util.h
+++ b/include/openvswitch/ofp-util.h
@@ -473,6 +473,8 @@ const char *ofputil_packet_in_reason_to_string(enum 
ofp_packet_in_reason,
size_t bufsize);
 bool ofputil_packet_in_reason_from_string(const char *,
   enum ofp_packet_in_reason *);
+int ofputil_encode_packet_in_reason(enum ofp_packet_in_reason,
+enum ofp_version);
 
 /* A packet-in message, including continuation data.  The format of
  * continuation data is subject to change and thus it is supposed to be opaque
diff --git a/lib/ofp-util.c b/lib/ofp-util.c
index 1f038c61ea97..28e06fcf770f 100644
--- a/lib/ofp-util.c
+++ b/lib/ofp-util.c
@@ -3613,9 +3613,9 @@ ofputil_decode_packet_in(const struct ofp_header *oh, 
bool loose,
 return 0;
 }
 
-static int
-encode_packet_in_reason(enum ofp_packet_in_reason reason,
-enum ofp_version version)
+int
+ofputil_encode_packet_in_reason(enum ofp_packet_in_reason reason,
+enum ofp_version version)
 {
 switch (reason) {
 case OFPR_NO_MATCH:
@@ -3661,7 +3661,7 @@ ofputil_put_packet_in(const struct ofputil_packet_in *pin,
 
 /* Add other properties. */
 ofpprop_put_u8(msg, NXPINT_REASON,
-   encode_packet_in_reason(pin->reason, version));
+   ofputil_encode_packet_in_reason(pin->reason, version));
 
 size_t start = ofpprop_start(msg, NXPINT_METADATA);
 oxm_put_raw(msg, >flow_metadata, version);
@@ -3793,7 +3793,7 @@ ofputil_encode_ofp10_packet_in(const struct 
ofputil_packet_in *pin)
 opi = ofpbuf_put_zeros(msg, offsetof(struct ofp10_packet_in, data));
 opi->total_len = htons(pin->packet_len);
 opi->in_port = htons(ofp_to_u16(pin->flow_metadata.flow.in_port.ofp_port));
-opi->reason = encode_packet_in_reason(pin->reason, OFP10_VERSION);
+opi->reason = ofputil_encode_packet_in_reason(pin->reason, OFP10_VERSION);
 opi->buffer_id = htonl(UINT32_MAX);
 
 return msg;
@@ -3817,7 +3817,7 @@ ofputil_encode_nx_packet_in(const struct 
ofputil_packet_in *pin,
 npi = msg->msg;
 npi->buffer_id = htonl(UINT32_MAX);
 npi->total_len = htons(pin->packet_len);
-npi->reason = encode_packet_in_reason(pin->reason, version);
+npi->reason = ofputil_encode_packet_in_reason(pin->reason, version);
 npi->table_id = pin->table_id;
 npi->cookie = pin->cookie;
 npi->match_len = htons(match_len);
@@ -3863,7 +3863,7 @@ ofputil_encode_ofp11_packet_in(const struct 
ofputil_packet_in *pin)
 pin->flow_metadata.flow.in_port.ofp_port);
 opi->in_phy_port = opi->in_port;
 opi->total_len = htons(pin->packet_len);
-opi->reason = encode_packet_in_reason(pin->reason, OFP11_VERSION);
+

Re: [ovs-dev] [PATCH] debian, xenserver: Update logrotate config to match RHEL.

2017-04-14 Thread Greg Rose
On Fri, 2017-04-14 at 09:05 -0700, Greg Rose wrote:
> On Thu, 2017-04-13 at 10:47 -0700, Ben Pfaff wrote:
> > Commit 618a5b45ae8b ("rhel: Avoid logrotate error if /var/run/openvswitch
> > does not exist") updated the RHEL logrotate configuration.  This commit
> > makes similar changes for Debian, by synchronizing with the RHEL version.
> > 
> > In particular:
> > 
> > - Indent to match logrotate.conf(5) examples.
> > 
> > - Use "sharedscripts" flag, because the postrotate script only needs to
> >   run once regardless of the number of rotations.
> > 
> > - Drop "delaycompress", because the postrotate script does make daemons
> >   reopen their log files.
> > 
> > - Ignore errors calling vlog/reopen.
> > 
> > Also make similar changes to the xenserver logrotate script.  I really
> > don't know if anyone uses the xenserver packaging anymore though.
> 
> Hi Ben,
> 
> I'm actually trying to build the latest upstream OVS from github on a
> Xen Server 7.1 system.  I have followed the instructions as well as I
> could and everything seems to build pretty well in the Xen Server 7.1
> DDK VM until finally it fails with the following error output:
> 
> Checking for unpackaged
> file(s): /usr/lib/rpm/check-files 
> /prj/gvrose/rpmbuild/BUILDROOT/openvswitch-2.7.90-1.x86_64
> error: Installed (but unpackaged) file(s) found:
>/usr/bin/ovs-tcpdump
>/usr/share/man/man7/ovs-fields.7.gz
>/usr/share/man/man8/ovs-tcpdump.8.gz
>/usr/share/openvswitch/scripts/ovndb-servers.ocf
> 
> 
> RPM build errors:
> Installed (but unpackaged) file(s) found:
>/usr/bin/ovs-tcpdump
>/usr/share/man/man7/ovs-fields.7.gz
>/usr/share/man/man8/ovs-tcpdump.8.gz
>/usr/share/openvswitch/scripts/ovndb-servers.ocf
> 
> I suppose there is some file list that needs updating?
> 
> Any pointers will be much appreciated and help me get rolling.
> 
> Thanks,
> 
> - Greg

I found the file list in the xenserver/openvswitch-xen.spec and fixed it
up.  I've got the RPMs all built now.

I'll generate a patch and send later today.

Thanks,

- Greg

> 
> > 
> > CC: Timothy Redaelli 
> > Signed-off-by: Ben Pfaff 
> > ---
> >  debian/openvswitch-switch.logrotate   | 14 +++---
> >  xenserver/etc_logrotate.d_openvswitch | 22 --
> >  2 files changed, 19 insertions(+), 17 deletions(-)
> > 
> > diff --git a/debian/openvswitch-switch.logrotate 
> > b/debian/openvswitch-switch.logrotate
> > index a7a71bdd90ad..7752af90cfed 100644
> > --- a/debian/openvswitch-switch.logrotate
> > +++ b/debian/openvswitch-switch.logrotate
> > @@ -1,16 +1,16 @@
> >  /var/log/openvswitch/*.log {
> >  daily
> >  compress
> > +sharedscripts
> >  create 640 root adm
> > -delaycompress
> >  missingok
> >  rotate 30
> >  postrotate
> > -# Tell Open vSwitch daemons to reopen their log files
> > -if [ -d /var/run/openvswitch ]; then
> > -for pidfile in `cd /var/run/openvswitch && echo *.pid`; do
> > -ovs-appctl -t "${pidfile%%.pid}" vlog/reopen
> > -done
> > -fi
> > +   # Tell Open vSwitch daemons to reopen their log files
> > +   if [ -d /var/run/openvswitch ]; then
> > +   for pidfile in `cd /var/run/openvswitch && echo *.pid`; do
> > +   ovs-appctl -t "${pidfile%%.pid}" vlog/reopen 2>/dev/null || :
> > +   done
> > +   fi
> >  endscript
> >  }
> > diff --git a/xenserver/etc_logrotate.d_openvswitch 
> > b/xenserver/etc_logrotate.d_openvswitch
> > index 73751d4578b0..cd7b3a9d569d 100644
> > --- a/xenserver/etc_logrotate.d_openvswitch
> > +++ b/xenserver/etc_logrotate.d_openvswitch
> > @@ -1,4 +1,4 @@
> > -# Copyright (C) 2009, 2010, 2011, 2012 Nicira, Inc.
> > +# Copyright (C) 2009, 2010, 2011, 2012, 2017 Nicira, Inc.
> >  #
> >  # Copying and distribution of this file, with or without modification,
> >  # are permitted in any medium without royalty provided the copyright
> > @@ -6,14 +6,16 @@
> >  # without warranty of any kind.
> >  
> >  /var/log/openvswitch/*.log {
> > -   daily
> > -   compress
> > -   sharedscripts
> > -   missingok
> > -   postrotate
> > +daily
> > +compress
> > +sharedscripts
> > +missingok
> > +postrotate
> > # Tell Open vSwitch daemons to reopen their log files
> > -for pidfile in `cd /var/run/openvswitch && echo *.pid`; do
> > -ovs-appctl -t "${pidfile%%.pid}" vlog/reopen
> > -done
> > -   endscript
> > +if [ -d /var/run/openvswitch ]; then
> > +   for pidfile in `cd /var/run/openvswitch && echo *.pid`; do
> > +   ovs-appctl -t "${pidfile%%.pid}" vlog/reopen 2>/dev/null || :
> > +   done
> > +   fi
> > +endscript
> >  }
> 
> 



___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 1/6] doc: Also delete stamp file in clean-docs target.

2017-04-14 Thread Stephen Finucane
On Thu, 2017-04-13 at 21:43 -0700, Ben Pfaff wrote:
> Otherwise "make docs-check" won't necessarily do anything since its
> apparent target is up to date.
> 
> Signed-off-by: Ben Pfaff 

Acked-by: Stephen Finucane 
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] debian, xenserver: Update logrotate config to match RHEL.

2017-04-14 Thread Ben Pfaff
On Fri, Apr 14, 2017 at 09:05:41AM -0700, Greg Rose wrote:
> On Thu, 2017-04-13 at 10:47 -0700, Ben Pfaff wrote:
> > Commit 618a5b45ae8b ("rhel: Avoid logrotate error if /var/run/openvswitch
> > does not exist") updated the RHEL logrotate configuration.  This commit
> > makes similar changes for Debian, by synchronizing with the RHEL version.
> > 
> > In particular:
> > 
> > - Indent to match logrotate.conf(5) examples.
> > 
> > - Use "sharedscripts" flag, because the postrotate script only needs to
> >   run once regardless of the number of rotations.
> > 
> > - Drop "delaycompress", because the postrotate script does make daemons
> >   reopen their log files.
> > 
> > - Ignore errors calling vlog/reopen.
> > 
> > Also make similar changes to the xenserver logrotate script.  I really
> > don't know if anyone uses the xenserver packaging anymore though.
> 
> Hi Ben,
> 
> I'm actually trying to build the latest upstream OVS from github on a
> Xen Server 7.1 system.  I have followed the instructions as well as I
> could and everything seems to build pretty well in the Xen Server 7.1
> DDK VM until finally it fails with the following error output:
> 
> Checking for unpackaged
> file(s): /usr/lib/rpm/check-files 
> /prj/gvrose/rpmbuild/BUILDROOT/openvswitch-2.7.90-1.x86_64
> error: Installed (but unpackaged) file(s) found:
>/usr/bin/ovs-tcpdump
>/usr/share/man/man7/ovs-fields.7.gz
>/usr/share/man/man8/ovs-tcpdump.8.gz
>/usr/share/openvswitch/scripts/ovndb-servers.ocf
> 
> 
> RPM build errors:
> Installed (but unpackaged) file(s) found:
>/usr/bin/ovs-tcpdump
>/usr/share/man/man7/ovs-fields.7.gz
>/usr/share/man/man8/ovs-tcpdump.8.gz
>/usr/share/openvswitch/scripts/ovndb-servers.ocf
> 
> I suppose there is some file list that needs updating?

Yes, please look in xenserver/openvswitch-xen.spec.in.

However, this may be a good sign that no one cares about XenServer,
since Open vSwitch 2.6.0, released in September last year, also installs
at least /usr/bin/ovs-tcpdump, and no one has complained.  Perhaps we
should just delete the packaging, instead, and save ourselves some
trouble.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH net-next v2 4/6] vxlan: check valid combinations of address scopes

2017-04-14 Thread Matthias Schiffer
* Multicast addresses are never valid as local address
* Link-local IPv6 unicast addresses may only be used as remote when the
  local address is link-local as well
* Don't allow link-local IPv6 local/remote addresses without interface

We also store in the flags field if link-local addresses are used for the
follow-up patches that actually make VXLAN over link-local IPv6 work.

Signed-off-by: Matthias Schiffer 
---

v2: was "vxlan: don't allow link-local IPv6 local/remote addresses without
interface" before. v2 does a lot more checks and adds the
VXLAN_F_IPV6_LINKLOCAL flag.

 drivers/net/vxlan.c | 35 +++
 include/net/vxlan.h |  2 ++
 2 files changed, 37 insertions(+)

diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
index 07f89b037681..95a71546e8f2 100644
--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -2881,11 +2881,39 @@ static int vxlan_config_validate(struct net *src_net, 
struct vxlan_config *conf,
if (conf->saddr.sa.sa_family != conf->remote_ip.sa.sa_family)
return -EINVAL;
 
+   if (vxlan_addr_multicast(>saddr))
+   return -EINVAL;
+
if (conf->saddr.sa.sa_family == AF_INET6) {
if (!IS_ENABLED(CONFIG_IPV6))
return -EPFNOSUPPORT;
use_ipv6 = true;
conf->flags |= VXLAN_F_IPV6;
+
+   if (!(conf->flags & VXLAN_F_COLLECT_METADATA)) {
+   int local_type =
+   ipv6_addr_type(>saddr.sin6.sin6_addr);
+   int remote_type =
+   ipv6_addr_type(>remote_ip.sin6.sin6_addr);
+
+   if (local_type & IPV6_ADDR_LINKLOCAL) {
+   if (!(remote_type & IPV6_ADDR_LINKLOCAL) &&
+   (remote_type != IPV6_ADDR_ANY)) {
+   pr_info("invalid combination of address 
scopes\n");
+   return -EINVAL;
+   }
+
+   conf->flags |= VXLAN_F_IPV6_LINKLOCAL;
+   } else {
+   if (remote_type ==
+   (IPV6_ADDR_UNICAST | IPV6_ADDR_LINKLOCAL)) {
+   pr_info("invalid combination of address 
scopes\n");
+   return -EINVAL;
+   }
+
+   conf->flags &= ~VXLAN_F_IPV6_LINKLOCAL;
+   }
+   }
}
 
if (conf->label && !use_ipv6) {
@@ -2920,6 +2948,13 @@ static int vxlan_config_validate(struct net *src_net, 
struct vxlan_config *conf,
return -EINVAL;
}
 
+#if IS_ENABLED(CONFIG_IPV6)
+   if (conf->flags & VXLAN_F_IPV6_LINKLOCAL) {
+   pr_info("link-local local/remote addresses require 
interface to be specified\n");
+   return -EINVAL;
+   }
+#endif
+
*lower = NULL;
}
 
diff --git a/include/net/vxlan.h b/include/net/vxlan.h
index 479bb75789ea..b816a0a6686e 100644
--- a/include/net/vxlan.h
+++ b/include/net/vxlan.h
@@ -258,6 +258,7 @@ struct vxlan_dev {
 #define VXLAN_F_REMCSUM_NOPARTIAL  0x1000
 #define VXLAN_F_COLLECT_METADATA   0x2000
 #define VXLAN_F_GPE0x4000
+#define VXLAN_F_IPV6_LINKLOCAL 0x8000
 
 /* Flags that are used in the receive path. These flags must match in
  * order for a socket to be shareable
@@ -272,6 +273,7 @@ struct vxlan_dev {
 /* Flags that can be set together with VXLAN_F_GPE. */
 #define VXLAN_F_ALLOWED_GPE(VXLAN_F_GPE |  \
 VXLAN_F_IPV6 | \
+VXLAN_F_IPV6_LINKLOCAL |   \
 VXLAN_F_UDP_ZERO_CSUM_TX | \
 VXLAN_F_UDP_ZERO_CSUM6_TX |\
 VXLAN_F_UDP_ZERO_CSUM6_RX |\
-- 
2.12.2

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH net-next v2 5/6] vxlan: fix snooping for link-local IPv6 addresses

2017-04-14 Thread Matthias Schiffer
If VXLAN is run over link-local IPv6 addresses, it is necessary to store
the ifindex in the FDB entries. Otherwise, the used interface is undefined
and unicast communication will most likely fail.

Signed-off-by: Matthias Schiffer 
---

v2: use u32 instead of __u32

 drivers/net/vxlan.c | 20 +++-
 1 file changed, 15 insertions(+), 5 deletions(-)

diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
index 95a71546e8f2..2995b57a1551 100644
--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -941,16 +941,24 @@ static int vxlan_fdb_dump(struct sk_buff *skb, struct 
netlink_callback *cb,
  */
 static bool vxlan_snoop(struct net_device *dev,
union vxlan_addr *src_ip, const u8 *src_mac,
-   __be32 vni)
+   u32 src_ifindex, __be32 vni)
 {
struct vxlan_dev *vxlan = netdev_priv(dev);
struct vxlan_fdb *f;
+   u32 ifindex = 0;
+
+#if IS_ENABLED(CONFIG_IPV6)
+   if (src_ip->sa.sa_family == AF_INET6 &&
+   (ipv6_addr_type(_ip->sin6.sin6_addr) & IPV6_ADDR_LINKLOCAL))
+   ifindex = src_ifindex;
+#endif
 
f = vxlan_find_mac(vxlan, src_mac, vni);
if (likely(f)) {
struct vxlan_rdst *rdst = first_remote_rcu(f);
 
-   if (likely(vxlan_addr_equal(>remote_ip, src_ip)))
+   if (likely(vxlan_addr_equal(>remote_ip, src_ip) &&
+  rdst->remote_ifindex == ifindex))
return false;
 
/* Don't migrate static entries, drop packets */
@@ -977,7 +985,7 @@ static bool vxlan_snoop(struct net_device *dev,
 vxlan->cfg.dst_port,
 vni,
 vxlan->default_dst.remote_vni,
-0, NTF_SELF);
+ifindex, NTF_SELF);
spin_unlock(>hash_lock);
}
 
@@ -1246,6 +1254,7 @@ static bool vxlan_set_mac(struct vxlan_dev *vxlan,
  struct sk_buff *skb, __be32 vni)
 {
union vxlan_addr saddr;
+   u32 ifindex = skb->dev->ifindex;
 
skb_reset_mac_header(skb);
skb->protocol = eth_type_trans(skb, vxlan->dev);
@@ -1267,7 +1276,7 @@ static bool vxlan_set_mac(struct vxlan_dev *vxlan,
}
 
if ((vxlan->cfg.flags & VXLAN_F_LEARN) &&
-   vxlan_snoop(skb->dev, , eth_hdr(skb)->h_source, vni))
+   vxlan_snoop(skb->dev, , eth_hdr(skb)->h_source, ifindex, vni))
return false;
 
return true;
@@ -1978,7 +1987,8 @@ static void vxlan_encap_bypass(struct sk_buff *skb, 
struct vxlan_dev *src_vxlan,
}
 
if (dst_vxlan->cfg.flags & VXLAN_F_LEARN)
-   vxlan_snoop(skb->dev, , eth_hdr(skb)->h_source, vni);
+   vxlan_snoop(skb->dev, , eth_hdr(skb)->h_source, 0,
+   vni);
 
u64_stats_update_begin(_stats->syncp);
tx_stats->tx_packets++;
-- 
2.12.2

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH net-next v2 6/6] vxlan: allow multiple VXLANs with same VNI for IPv6 link-local addresses

2017-04-14 Thread Matthias Schiffer
As link-local addresses are only valid for a single interface, we can allow
to use the same VNI for multiple independent VXLANs, as long as the used
interfaces are distinct. This way, VXLANs can always be used as a drop-in
replacement for VLANs with greater ID space.

This also extends VNI lookup to respect the ifindex when link-local IPv6
addresses are used, so using the same VNI on multiple interfaces can
actually work.

Signed-off-by: Matthias Schiffer 
---

v2: use VXLAN_F_IPV6_LINKLOCAL in vxlan_vs_find_vni; rebase.

 drivers/net/vxlan.c | 73 ++---
 1 file changed, 47 insertions(+), 26 deletions(-)

diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
index 2995b57a1551..9054245f0770 100644
--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -224,7 +224,8 @@ static struct vxlan_sock *vxlan_find_sock(struct net *net, 
sa_family_t family,
return NULL;
 }
 
-static struct vxlan_dev *vxlan_vs_find_vni(struct vxlan_sock *vs, __be32 vni)
+static struct vxlan_dev *vxlan_vs_find_vni(struct vxlan_sock *vs, int ifindex,
+  __be32 vni)
 {
struct vxlan_dev *vxlan;
 
@@ -233,17 +234,27 @@ static struct vxlan_dev *vxlan_vs_find_vni(struct 
vxlan_sock *vs, __be32 vni)
vni = 0;
 
hlist_for_each_entry_rcu(vxlan, vni_head(vs, vni), hlist) {
-   if (vxlan->default_dst.remote_vni == vni)
-   return vxlan;
+   if (vxlan->default_dst.remote_vni != vni)
+   continue;
+
+   if (IS_ENABLED(CONFIG_IPV6)) {
+   const struct vxlan_config *cfg = >cfg;
+
+   if ((cfg->flags & VXLAN_F_IPV6_LINKLOCAL) &&
+   cfg->remote_ifindex != ifindex)
+   continue;
+   }
+
+   return vxlan;
}
 
return NULL;
 }
 
 /* Look up VNI in a per net namespace table */
-static struct vxlan_dev *vxlan_find_vni(struct net *net, __be32 vni,
-   sa_family_t family, __be16 port,
-   u32 flags)
+static struct vxlan_dev *vxlan_find_vni(struct net *net, int ifindex,
+   __be32 vni, sa_family_t family,
+   __be16 port, u32 flags)
 {
struct vxlan_sock *vs;
 
@@ -251,7 +262,7 @@ static struct vxlan_dev *vxlan_find_vni(struct net *net, 
__be32 vni,
if (!vs)
return NULL;
 
-   return vxlan_vs_find_vni(vs, vni);
+   return vxlan_vs_find_vni(vs, ifindex, vni);
 }
 
 /* Fill in neighbour message in skbuff. */
@@ -1342,7 +1353,7 @@ static int vxlan_rcv(struct sock *sk, struct sk_buff *skb)
 
vni = vxlan_vni(vxlan_hdr(skb)->vx_vni);
 
-   vxlan = vxlan_vs_find_vni(vs, vni);
+   vxlan = vxlan_vs_find_vni(vs, skb->dev->ifindex, vni);
if (!vxlan)
goto drop;
 
@@ -2006,8 +2017,10 @@ static void vxlan_encap_bypass(struct sk_buff *skb, 
struct vxlan_dev *src_vxlan,
 }
 
 static int encap_bypass_if_local(struct sk_buff *skb, struct net_device *dev,
-struct vxlan_dev *vxlan, union vxlan_addr 
*daddr,
-__be16 dst_port, __be32 vni, struct dst_entry 
*dst,
+struct vxlan_dev *vxlan,
+union vxlan_addr *daddr,
+__be16 dst_port, int dst_ifindex, __be32 vni,
+struct dst_entry *dst,
 u32 rt_flags)
 {
 #if IS_ENABLED(CONFIG_IPV6)
@@ -2023,7 +2036,7 @@ static int encap_bypass_if_local(struct sk_buff *skb, 
struct net_device *dev,
struct vxlan_dev *dst_vxlan;
 
dst_release(dst);
-   dst_vxlan = vxlan_find_vni(vxlan->net, vni,
+   dst_vxlan = vxlan_find_vni(vxlan->net, dst_ifindex, vni,
   daddr->sa.sa_family, dst_port,
   vxlan->cfg.flags);
if (!dst_vxlan) {
@@ -2055,6 +2068,7 @@ static void vxlan_xmit_one(struct sk_buff *skb, struct 
net_device *dev,
struct dst_entry *ndst = NULL;
__be32 vni, label;
__u8 tos, ttl;
+   int ifindex;
int err;
u32 flags = vxlan->cfg.flags;
bool udp_sum = false;
@@ -2075,6 +2089,7 @@ static void vxlan_xmit_one(struct sk_buff *skb, struct 
net_device *dev,
 
dst_port = rdst->remote_port ? rdst->remote_port : 
vxlan->cfg.dst_port;
vni = (rdst->remote_vni) ? : default_vni;
+   ifindex = rdst->remote_ifindex;
local_ip = vxlan->cfg.saddr;
dst_cache = >dst_cache;
md->gbp = skb->mark;
@@ -2108,6 +2123,7 @@ static void vxlan_xmit_one(struct sk_buff *skb, struct 
net_device 

[ovs-dev] [PATCH net-next v2 1/6] vxlan: refactor verification and application of configuration

2017-04-14 Thread Matthias Schiffer
The vxlan_dev_configure function was mixing validation and application of
the vxlan configuration; this could easily lead to bugs with the changelink
operation, as it was hard to see if the function wcould return an error
after parts of the configuration had already been applied.

This commit splits validation and application out of vxlan_dev_configure as
separate functions to make it clearer where error returns are allowed and
where the vxlan_dev or net_device may be configured.

In addition, some validation is moved to vxlan_validate and some
initialization to vxlan_setup where this improves grouping of similar
settings.

Finally, this also fixes two actual bugs:

* if set, conf->mtu would overwrite dev->mtu in each changelink operation,
  reverting other changes of dev->mtu
* the "if (!conf->dst_port)" branch would never be run, as conf->dst_port
  was set in vxlan_setup before. This caused VXLAN-GPE to use the same
  default port as other VXLAN sockets instead of the intended IANA-assigned
  4790.

Signed-off-by: Matthias Schiffer 
---

v2: new patch

 drivers/net/vxlan.c | 202 +---
 1 file changed, 114 insertions(+), 88 deletions(-)

diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
index ebc98bb17a51..86471e961708 100644
--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -2598,6 +2598,10 @@ static void vxlan_setup(struct net_device *dev)
netif_keep_dst(dev);
dev->priv_flags |= IFF_NO_QUEUE;
 
+   /* MTU range: 68 - 65535 */
+   dev->min_mtu = ETH_MIN_MTU;
+   dev->max_mtu = ETH_MAX_MTU;
+
INIT_LIST_HEAD(>next);
spin_lock_init(>hash_lock);
 
@@ -2605,9 +2609,8 @@ static void vxlan_setup(struct net_device *dev)
vxlan->age_timer.function = vxlan_cleanup;
vxlan->age_timer.data = (unsigned long) vxlan;
 
-   vxlan->cfg.dst_port = htons(vxlan_port);
-
vxlan->dev = dev;
+   vxlan->net = dev_net(dev);
 
gro_cells_init(>gro_cells, dev);
 
@@ -2676,11 +2679,19 @@ static int vxlan_validate(struct nlattr *tb[], struct 
nlattr *data[])
}
}
 
+   if (tb[IFLA_MTU]) {
+   u32 mtu = nla_get_u32(data[IFLA_MTU]);
+
+   if (mtu < ETH_MIN_MTU || mtu > ETH_MAX_MTU)
+   return -EINVAL;
+   }
+
if (!data)
return -EINVAL;
 
if (data[IFLA_VXLAN_ID]) {
-   __u32 id = nla_get_u32(data[IFLA_VXLAN_ID]);
+   u32 id = nla_get_u32(data[IFLA_VXLAN_ID]);
+
if (id >= VXLAN_N_VID)
return -ERANGE;
}
@@ -2839,55 +2850,36 @@ static int vxlan_sock_add(struct vxlan_dev *vxlan)
return ret;
 }
 
-static int vxlan_dev_configure(struct net *src_net, struct net_device *dev,
-  struct vxlan_config *conf,
-  bool changelink)
+static int vxlan_config_validate(struct net *src_net, struct vxlan_config 
*conf,
+struct net_device **lower,
+struct vxlan_dev *old)
 {
struct vxlan_net *vn = net_generic(src_net, vxlan_net_id);
-   struct vxlan_dev *vxlan = netdev_priv(dev), *tmp;
-   struct vxlan_rdst *dst = >default_dst;
-   unsigned short needed_headroom = ETH_HLEN;
+   struct vxlan_dev *tmp;
bool use_ipv6 = false;
-   __be16 default_port = vxlan->cfg.dst_port;
-   struct net_device *lowerdev = NULL;
 
-   if (!changelink) {
-   if (conf->flags & VXLAN_F_GPE) {
-   /* For now, allow GPE only together with
-* COLLECT_METADATA. This can be relaxed later; in such
-* case, the other side of the PtP link will have to be
-* provided.
-*/
-   if ((conf->flags & ~VXLAN_F_ALLOWED_GPE) ||
-   !(conf->flags & VXLAN_F_COLLECT_METADATA)) {
-   pr_info("unsupported combination of 
extensions\n");
-   return -EINVAL;
-   }
-   vxlan_raw_setup(dev);
-   } else {
-   vxlan_ether_setup(dev);
+   if (conf->flags & VXLAN_F_GPE) {
+   /* For now, allow GPE only together with
+* COLLECT_METADATA. This can be relaxed later; in such
+* case, the other side of the PtP link will have to be
+* provided.
+*/
+   if ((conf->flags & ~VXLAN_F_ALLOWED_GPE) ||
+   !(conf->flags & VXLAN_F_COLLECT_METADATA)) {
+   pr_info("unsupported combination of extensions\n");
+   return -EINVAL;
}
-
-   /* MTU range: 68 - 65535 */
-   dev->min_mtu = ETH_MIN_MTU;
-   dev->max_mtu = ETH_MAX_MTU;

[ovs-dev] [PATCH net-next v2 3/6] vxlan: improve validation of address family configuration

2017-04-14 Thread Matthias Schiffer
Address families of source and destination addresses must match, and
changelink operations can't change the address family.

In addition, always use the VXLAN_F_IPV6 to check if a VXLAN device uses
IPv4 or IPv6.

Signed-off-by: Matthias Schiffer 
---

v2: new patch

 drivers/net/vxlan.c | 39 ---
 1 file changed, 28 insertions(+), 11 deletions(-)

diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
index bdd19e5037b0..07f89b037681 100644
--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -2459,10 +2459,7 @@ static int vxlan_change_mtu(struct net_device *dev, int 
new_mtu)
struct vxlan_rdst *dst = >default_dst;
struct net_device *lowerdev = __dev_get_by_index(vxlan->net,
 dst->remote_ifindex);
-   bool use_ipv6 = false;
-
-   if (dst->remote_ip.sa.sa_family == AF_INET6)
-   use_ipv6 = true;
+   bool use_ipv6 = !!(vxlan->cfg.flags & VXLAN_F_IPV6);
 
/* This check is different than dev->max_mtu, because it looks at
 * the lowerdev->mtu, rather than the static dev->max_mtu
@@ -2871,11 +2868,20 @@ static int vxlan_config_validate(struct net *src_net, 
struct vxlan_config *conf,
}
}
 
-   if (!conf->remote_ip.sa.sa_family)
+   if (!conf->remote_ip.sa.sa_family && !conf->saddr.sa.sa_family) {
+   /* Unless IPv6 is explicitly requested, assume IPv4 */
conf->remote_ip.sa.sa_family = AF_INET;
+   conf->saddr.sa.sa_family = AF_INET;
+   } else if (!conf->remote_ip.sa.sa_family) {
+   conf->remote_ip.sa.sa_family = conf->saddr.sa.sa_family;
+   } else if (!conf->saddr.sa.sa_family) {
+   conf->saddr.sa.sa_family = conf->remote_ip.sa.sa_family;
+   }
+
+   if (conf->saddr.sa.sa_family != conf->remote_ip.sa.sa_family)
+   return -EINVAL;
 
-   if (conf->remote_ip.sa.sa_family == AF_INET6 ||
-   conf->saddr.sa.sa_family == AF_INET6) {
+   if (conf->saddr.sa.sa_family == AF_INET6) {
if (!IS_ENABLED(CONFIG_IPV6))
return -EPFNOSUPPORT;
use_ipv6 = true;
@@ -2932,11 +2938,9 @@ static int vxlan_config_validate(struct net *src_net, 
struct vxlan_config *conf,
continue;
 
if (tmp->cfg.vni == conf->vni &&
-   (tmp->default_dst.remote_ip.sa.sa_family == AF_INET6 ||
-tmp->cfg.saddr.sa.sa_family == AF_INET6) == use_ipv6 &&
tmp->cfg.dst_port == conf->dst_port &&
-   (tmp->cfg.flags & VXLAN_F_RCV_FLAGS) ==
-   (conf->flags & VXLAN_F_RCV_FLAGS)) {
+   (tmp->cfg.flags & (VXLAN_F_RCV_FLAGS | VXLAN_F_IPV6)) ==
+   (conf->flags & (VXLAN_F_RCV_FLAGS | VXLAN_F_IPV6))) {
pr_info("duplicate VNI %u\n", be32_to_cpu(conf->vni));
return -EEXIST;
}
@@ -3069,22 +3073,35 @@ static int vxlan_nl2conf(struct nlattr *tb[], struct 
nlattr *data[],
}
 
if (data[IFLA_VXLAN_GROUP]) {
+   if (changelink && (conf->remote_ip.sa.sa_family != AF_INET))
+   return -EOPNOTSUPP;
+
conf->remote_ip.sin.sin_addr.s_addr = 
nla_get_in_addr(data[IFLA_VXLAN_GROUP]);
+   conf->remote_ip.sa.sa_family = AF_INET;
} else if (data[IFLA_VXLAN_GROUP6]) {
if (!IS_ENABLED(CONFIG_IPV6))
return -EPFNOSUPPORT;
 
+   if (changelink && (conf->remote_ip.sa.sa_family != AF_INET6))
+   return -EOPNOTSUPP;
+
conf->remote_ip.sin6.sin6_addr = 
nla_get_in6_addr(data[IFLA_VXLAN_GROUP6]);
conf->remote_ip.sa.sa_family = AF_INET6;
}
 
if (data[IFLA_VXLAN_LOCAL]) {
+   if (changelink && (conf->saddr.sa.sa_family != AF_INET))
+   return -EOPNOTSUPP;
+
conf->saddr.sin.sin_addr.s_addr = 
nla_get_in_addr(data[IFLA_VXLAN_LOCAL]);
conf->saddr.sa.sa_family = AF_INET;
} else if (data[IFLA_VXLAN_LOCAL6]) {
if (!IS_ENABLED(CONFIG_IPV6))
return -EPFNOSUPPORT;
 
+   if (changelink && (conf->saddr.sa.sa_family != AF_INET6))
+   return -EOPNOTSUPP;
+
/* TODO: respect scope id */
conf->saddr.sin6.sin6_addr = 
nla_get_in6_addr(data[IFLA_VXLAN_LOCAL6]);
conf->saddr.sa.sa_family = AF_INET6;
-- 
2.12.2

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH net-next v2 2/6] vxlan: get rid of redundant vxlan_dev.flags

2017-04-14 Thread Matthias Schiffer
There is no good reason to keep the flags twice in vxlan_dev and
vxlan_config.

Signed-off-by: Matthias Schiffer 
---

v2: new patch

 drivers/net/vxlan.c   | 76 +--
 include/net/vxlan.h   |  1 -
 net/openvswitch/vport-vxlan.c |  4 +--
 3 files changed, 39 insertions(+), 42 deletions(-)

diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
index 86471e961708..bdd19e5037b0 100644
--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -303,7 +303,7 @@ static int vxlan_fdb_info(struct sk_buff *skb, struct 
vxlan_dev *vxlan,
if (rdst->remote_vni != vxlan->default_dst.remote_vni &&
nla_put_u32(skb, NDA_VNI, be32_to_cpu(rdst->remote_vni)))
goto nla_put_failure;
-   if ((vxlan->flags & VXLAN_F_COLLECT_METADATA) && fdb->vni &&
+   if ((vxlan->cfg.flags & VXLAN_F_COLLECT_METADATA) && fdb->vni &&
nla_put_u32(skb, NDA_SRC_VNI,
be32_to_cpu(fdb->vni)))
goto nla_put_failure;
@@ -417,7 +417,7 @@ static u32 eth_vni_hash(const unsigned char *addr, __be32 
vni)
 static inline struct hlist_head *vxlan_fdb_head(struct vxlan_dev *vxlan,
const u8 *mac, __be32 vni)
 {
-   if (vxlan->flags & VXLAN_F_COLLECT_METADATA)
+   if (vxlan->cfg.flags & VXLAN_F_COLLECT_METADATA)
return >fdb_head[eth_vni_hash(mac, vni)];
else
return >fdb_head[eth_hash(mac)];
@@ -432,7 +432,7 @@ static struct vxlan_fdb *__vxlan_find_mac(struct vxlan_dev 
*vxlan,
 
hlist_for_each_entry_rcu(f, head, hlist) {
if (ether_addr_equal(mac, f->eth_addr)) {
-   if (vxlan->flags & VXLAN_F_COLLECT_METADATA) {
+   if (vxlan->cfg.flags & VXLAN_F_COLLECT_METADATA) {
if (vni == f->vni)
return f;
} else {
@@ -1266,7 +1266,7 @@ static bool vxlan_set_mac(struct vxlan_dev *vxlan,
 #endif
}
 
-   if ((vxlan->flags & VXLAN_F_LEARN) &&
+   if ((vxlan->cfg.flags & VXLAN_F_LEARN) &&
vxlan_snoop(skb->dev, , eth_hdr(skb)->h_source, vni))
return false;
 
@@ -1489,7 +1489,7 @@ static int arp_reduce(struct net_device *dev, struct 
sk_buff *skb, __be32 vni)
 
if (netif_rx_ni(reply) == NET_RX_DROP)
dev->stats.rx_dropped++;
-   } else if (vxlan->flags & VXLAN_F_L3MISS) {
+   } else if (vxlan->cfg.flags & VXLAN_F_L3MISS) {
union vxlan_addr ipa = {
.sin.sin_addr.s_addr = tip,
.sin.sin_family = AF_INET,
@@ -1649,7 +1649,7 @@ static int neigh_reduce(struct net_device *dev, struct 
sk_buff *skb, __be32 vni)
if (netif_rx_ni(reply) == NET_RX_DROP)
dev->stats.rx_dropped++;
 
-   } else if (vxlan->flags & VXLAN_F_L3MISS) {
+   } else if (vxlan->cfg.flags & VXLAN_F_L3MISS) {
union vxlan_addr ipa = {
.sin6.sin6_addr = msg->target,
.sin6.sin6_family = AF_INET6,
@@ -1682,7 +1682,7 @@ static bool route_shortcircuit(struct net_device *dev, 
struct sk_buff *skb)
return false;
pip = ip_hdr(skb);
n = neigh_lookup(_tbl, >daddr, dev);
-   if (!n && (vxlan->flags & VXLAN_F_L3MISS)) {
+   if (!n && (vxlan->cfg.flags & VXLAN_F_L3MISS)) {
union vxlan_addr ipa = {
.sin.sin_addr.s_addr = pip->daddr,
.sin.sin_family = AF_INET,
@@ -1703,7 +1703,7 @@ static bool route_shortcircuit(struct net_device *dev, 
struct sk_buff *skb)
return false;
pip6 = ipv6_hdr(skb);
n = neigh_lookup(ipv6_stub->nd_tbl, >daddr, dev);
-   if (!n && (vxlan->flags & VXLAN_F_L3MISS)) {
+   if (!n && (vxlan->cfg.flags & VXLAN_F_L3MISS)) {
union vxlan_addr ipa = {
.sin6.sin6_addr = pip6->daddr,
.sin6.sin6_family = AF_INET6,
@@ -1977,7 +1977,7 @@ static void vxlan_encap_bypass(struct sk_buff *skb, 
struct vxlan_dev *src_vxlan,
 #endif
}
 
-   if (dst_vxlan->flags & VXLAN_F_LEARN)
+   if (dst_vxlan->cfg.flags & VXLAN_F_LEARN)
vxlan_snoop(skb->dev, , eth_hdr(skb)->h_source, vni);
 
u64_stats_update_begin(_stats->syncp);
@@ -2015,7 +2015,7 @@ static int encap_bypass_if_local(struct sk_buff *skb, 
struct net_device *dev,
dst_release(dst);
dst_vxlan = vxlan_find_vni(vxlan->net, vni,
   daddr->sa.sa_family, dst_port,
-  vxlan->flags);
+  

[ovs-dev] [PATCH net-next v2 0/6] vxlan: cleanup and IPv6 link-local support

2017-04-14 Thread Matthias Schiffer
Running VXLANs over IPv6 link-local addresses allows to use them as a
drop-in replacement for VLANs, avoiding to allocate additional outer IP
addresses to run the VXLAN over.

Since v1, I have added a lot more consistency checks to the address
configuration, making sure address families and scopes match. To simplify
the implementation, I also did some general refactoring of the
configuration handling in the new first patch of the series.

The second patch is more cleanup; is slightly touches OVS code, so that
list is in CC this time, too.

As in v1, the last two patches actually make VXLAN over IPv6 link-local
work, and allow multiple VXLANs wit the same VNI and port, as long as
link-local addresses on different interfaces are used. As suggested, I now
store in the flags field if the VXLAN uses link-local addresses or not.


Matthias Schiffer (6):
  vxlan: refactor verification and application of configuration
  vxlan: get rid of redundant vxlan_dev.flags
  vxlan: improve validation of address family configuration
  vxlan: check valid combinations of address scopes
  vxlan: fix snooping for link-local IPv6 addresses
  vxlan: allow multiple VXLANs with same VNI for IPv6 link-local
addresses

 drivers/net/vxlan.c   | 411 ++
 include/net/vxlan.h   |   3 +-
 net/openvswitch/vport-vxlan.c |   4 +-
 3 files changed, 263 insertions(+), 155 deletions(-)

-- 
2.12.2

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] OVN meeting report

2017-04-14 Thread Ben Pfaff
On Fri, Apr 14, 2017 at 02:48:40PM +0500, Valentine Sinitsyn wrote:
> On 13.04.2017 20:53, Ben Pfaff wrote:
> >On Wed, Apr 12, 2017 at 06:09:28PM +0500, Valentine Sinitsyn wrote:
> >>Is there some design outline for the missing implementation bits?
> >>Specifically, it would be good to know the following:
> >>
> >>1. With clustered OVSDB, a client such as IDL needs two JSON RPC
> >>connections: to the leader (to commit transactions), and a read-only one to
> >>an arbitrary replica set (scaling reads). Will it be implemented on
> >>ovsdb_idl level or encapsulated inside jsonrpc_session? The former seems
> >>natural yet multiple remotes support went to jsonrpc_session already.
> >
> >There are multiple possible approaches here.  The one that I am planning
> >to try out first is to have a client connect to only one randomly
> >selected server, and then have that server be responsible for relaying
> >write transactions to the leader.
> Yes, this is an option. However, our tests suggest that ovsdb-server doesn't
> scale well with respect to (hundreds to thousands) connections. This relay
> approach adds at most one new connection within the cluster per new client
> connection, which could be a bottleneck.

Relaying will take place over the Raft connections among the servers in
the cluster, not over the OVSDB JSON-RPC connections.  The Raft
connections are per-server (although there are N**2 of them for N
servers), so it shouldn't introduce additional per-client connections to
the cluster.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] Is there a non recursive version of json_destroy and json_serialize?

2017-04-14 Thread Ben Pfaff
It seems unlikely that recursion would be the problem, because OVS does
not serialize deeply nested JSON.

On Fri, Apr 14, 2017 at 05:50:06PM +0800, 网络尖兵 wrote:
> Hello, everyone!
> I am trying to port openvswitch to MIPS64 platform. A segfault problem was 
> occurred  when it was sending a huge json buffer between ovsdb-server and 
> ovsdb-client through its monitor ALL command. I found that, in the OVS 
> version I use, the json_destroy  and json_serialize function are recursive 
> function.  I am deeply doubting the stack was overflowed by these operation.
> I send this mail to see if anyone else have encountered this problem and is 
> there a non recursive version of json_destroy and json_serialize? Or if 
> anyone could give me some suggestion on how to resolve it?
> The following are two typical call trace, and the ovsdb-server program core 
> dumped at very different place every time.
> 
> 1.
> Program terminated with signal SIGSEGV, Segmentation fault.
> #0  0x00fff464a154 in hmap_remove (node=0xfff36e89c0, hmap=0xfff36313c0) 
> at /usr/src/debug/ops-openvswitch/git999-r0/lib/hmap.h:245
> 245 /usr/src/debug/ops-openvswitch/git999-r0/lib/hmap.h: No such file or 
> directory.
> (gdb) bt
> #0  0x00fff464a154 in hmap_remove (node=0xfff36e89c0, hmap=0xfff36313c0) 
> at /usr/src/debug/ops-openvswitch/git999-r0/lib/hmap.h:245
> #1  shash_steal (sh=0xfff36313c0, node=0xfff36e89c0) at 
> /usr/src/debug/ops-openvswitch/git999-r0/lib/shash.c:187
> #2  0x00fff464a1cc in shash_delete (sh=, node= out>) at /usr/src/debug/ops-openvswitch/git999-r0/lib/shash.c:176
> #3  0x00fff463ae34 in json_destroy_object (object=0xfff36313c0) at 
> /usr/src/debug/ops-openvswitch/git999-r0/lib/json.c:386
> #4  json_destroy (json=0xfff36313a0) at 
> /usr/src/debug/ops-openvswitch/git999-r0/lib/json.c:352
> #5  0x00fff463ae24 in json_destroy_object (object=0xfff38e9420) at 
> /usr/src/debug/ops-openvswitch/git999-r0/lib/json.c:385
> #6  json_destroy (json=0xfff38e9400) at 
> /usr/src/debug/ops-openvswitch/git999-r0/lib/json.c:352
> #7  0x00fff463ae24 in json_destroy_object (object=0xfff38cd600) at 
> /usr/src/debug/ops-openvswitch/git999-r0/lib/json.c:385
> #8  json_destroy (json=0xfff38cd620) at 
> /usr/src/debug/ops-openvswitch/git999-r0/lib/json.c:352
> #9  0x00fff465e100 in jsonrpc_send (rpc=0xfff3832680, msg= out>) at /usr/src/debug/ops-openvswitch/git999-r0/lib/jsonrpc.c:261
> #10 0x00fff465ed30 in jsonrpc_session_send (s=, 
> msg=)
> at /usr/src/debug/ops-openvswitch/git999-r0/lib/jsonrpc.c:1029
> #11 0x00fff43ee87c in ovsdb_jsonrpc_session_send (s=s@entry=0xfff3619e00, 
> msg=0xfff36dd170)
> at /usr/src/debug/ops-openvswitch/git999-r0/ovsdb/jsonrpc-server.c:1290
> #12 0x00fff43f2ac4 in ovsdb_jsonrpc_session_got_request 
> (request=0xfff381cbd0, s=)
> at /usr/src/debug/ops-openvswitch/git999-r0/ovsdb/jsonrpc-server.c:1256
> #13 ovsdb_jsonrpc_session_run (s=) at 
> /usr/src/debug/ops-openvswitch/git999-r0/ovsdb/jsonrpc-server.c:523
> #14 ovsdb_jsonrpc_session_run_all (remote=) at 
> /usr/src/debug/ops-openvswitch/git999-r0/ovsdb/jsonrpc-server.c:562
> #15 ovsdb_jsonrpc_server_run (svr=0x0) at 
> /usr/src/debug/ops-openvswitch/git999-r0/ovsdb/jsonrpc-server.c:366
> #16 0x000120005d60 in main_loop (exiting=0xffdfaaf8d0, run_process=0x0, 
> remotes=0xffdfaaf888, unixctl=0x12010e4a0, all_dbs=0xffdfaaf818, 
> jsonrpc=0x0) at 
> /usr/src/debug/ops-openvswitch/git999-r0/ovsdb/ovsdb-server.c:161
> #17 main (argc=1, argv=0x12fc19e90) at 
> /usr/src/debug/ops-openvswitch/git999-r0/ovsdb/ovsdb-server.c:357
> 
> 
> 
> (gdb) bt
> #0  json_serialize (json=0x3fd28f747, s=s@entry=0xffdfecf5a0) at 
> /usr/src/debug/ops-openvswitch/git999-r0/lib/json.c:1499
> #1  0x00ffe999a024 in json_serialize_array (s=0xffdfecf5a0, 
> array=0xffe8818ec8)
> at /usr/src/debug/ops-openvswitch/git999-r0/lib/json.c:1617
> #2  json_serialize (json=0xffe8818ec0, s=s@entry=0xffdfecf5a0) at 
> /usr/src/debug/ops-openvswitch/git999-r0/lib/json.c:1517
> #3  0x00ffec80 in json_serialize_object_member (i=i@entry=0, 
> node=node@entry=0xffe8818d60, s=s@entry=0xffdfecf5a0)
> at /usr/src/debug/ops-openvswitch/git999-r0/lib/json.c:1563
> #4  0x00ffee60 in json_serialize_object (s=0xffdfecf5a0, 
> object=0xffe8818de0)
> at /usr/src/debug/ops-openvswitch/git999-r0/lib/json.c:1592
> #5  json_serialize (json=, s=s@entry=0xffdfecf5a0)
> at /usr/src/debug/ops-openvswitch/git999-r0/lib/json.c:1513
> #6  0x00ffe999d210 in json_to_ds (json=, flags= out>, ds=)
> at /usr/src/debug/ops-openvswitch/git999-r0/lib/json.c:1491
> #7  0x00ffe99bde70 in jsonrpc_send (rpc=0xffe8c32680, msg=)
> at /usr/src/debug/ops-openvswitch/git999-r0/lib/jsonrpc.c:252
> #8  0x00ffe99be9a8 in jsonrpc_session_send (s=, 
> msg=)
> at /usr/src/debug/ops-openvswitch/git999-r0/lib/jsonrpc.c:1013
> #9  0x00ffe974e87c in ovsdb_jsonrpc_session_send (s=s@entry=0xffe89b8600, 
> 

Re: [ovs-dev] OVN meeting report

2017-04-14 Thread Valentine Sinitsyn

Hi Ben,

On 13.04.2017 20:53, Ben Pfaff wrote:

On Wed, Apr 12, 2017 at 06:09:28PM +0500, Valentine Sinitsyn wrote:

Hi,

On 04.04.2017 15:29, Valentine Sinitsyn wrote:

On 03.04.2017 20:29, Valentine Sinitsyn wrote:

Hi Ben,

On 23.03.2017 08:11, Ben Pfaff wrote:

Hello everyone.  I am not sure whether I am going to be able to attend
the OVN meeting tomorrow, because I will be in another possibly
distracting meeting, so I'm going to give my report here.

Toward the end of last week I did a full pass of reviews through
patchwork.  The most notable result, I think, is that I applied patches
that add 802.1ad support.  For OVN, this makes it more reasonable to
consider adding support for tagged logical ports--currently, OVN drops
all tagged logical packets--which I've heard requested once or twice,
because it means that they can now be gatewayed to physical ports within
an outer VLAN.  I don't have any plans to work on that, but I think that
it is worth pointing out.

The OVS "Open Source Day" talks have been scheduled at OpenStack
Boston.  They are all on Wednesday:
https://www.openstack.org/summit/boston-2017/summit-schedule/#track=135

I've been spending what dev time I have on database clustering.  Today,
I managed to get it working, with many caveats.  It will take weeks or
months longer to get it finished, tested, and ready for posting.  (If
you want what I have, check out the raft3 branch in my ovs-reviews repo
at github.)

I've checked out your raft3 branch, and even learned how to create an
OVSDB cluster. Thanks for the docs!

What I don't get though is how do I instruct IDL to connect to the
cluster now? Do I just connect to a random server, or there should be
some dispatcher, or whatever?

OK I see this is an ongoing work in your branch.


I had some time to play with raft3 branch last week.

I added very basic and hacky replica set support to IDL and brought up an
OVN setup with clustered southbound database. It works to some extent, yet
if I try to throw several hundreds of logical ports into the mix, the
database becomes inconsistent. The reason is probably the race window
between when the raft leader appends a log entry to other nodes (so a client
such as ovn-northd already sees it) and the entry really appears in the
leader's log itself. Not sure if it is my bug or not. The original code had
some minor issues as well (which is absolutely normal for WIP) - I can send
my (rather trivial) patches if there is any interest.


I'm not surprised that there are inconsistency bugs.  The testing I've
done so far is really sketchy.  Let me assure you that I will implement
much more thorough testing before I will propose anything to be merged.

Sure, I didn't expect it to be bug free either.




Is there some design outline for the missing implementation bits?
Specifically, it would be good to know the following:

1. With clustered OVSDB, a client such as IDL needs two JSON RPC
connections: to the leader (to commit transactions), and a read-only one to
an arbitrary replica set (scaling reads). Will it be implemented on
ovsdb_idl level or encapsulated inside jsonrpc_session? The former seems
natural yet multiple remotes support went to jsonrpc_session already.


There are multiple possible approaches here.  The one that I am planning
to try out first is to have a client connect to only one randomly
selected server, and then have that server be responsible for relaying
write transactions to the leader.
Yes, this is an option. However, our tests suggest that ovsdb-server 
doesn't scale well with respect to (hundreds to thousands) connections. 
This relay approach adds at most one new connection within the cluster 
per new client connection, which could be a bottleneck.


Thanks,
Valentine




2. How does the client know which replica set member is currently a leader?
I just loop over remotes until one accepts the transaction (which is an
awful idea). It would be nice to send some sort of cluster metadata snapshot
to JSON RPC client during initial handshake. Alternatively, one can extend
the "not leader" error object with a leader URL.


If we do adopt the idea that followers relay write transactions to the
leader, then the client doesn't need to know the leader.  But if that
isn't practical, then the Raft thesis, section 6.2, suggests the same
idea as you did, of having the follower point to the leader if it knows
it.


3. For eventual consistency reasons, if an IDL reads from one member (A) but
writes to another one (B), it can try to delete a row not yet in A's
database. This would make all further requests fail with "inconsistent data"
error and basically is what I observe in my tests. How do you plan to
overcome this?


This sounds like a bug in the existing code (not too surprising).  What
is supposed to happen is that the client waits until it receives updated
data from the server, which it knows will eventually arrive because it
knows that its write was against an inconsistent copy.  Then, it