Re: [ovs-dev] [PATCH ovs V2 00/21] Introducing HW offload support for openvswitch

2017-02-02 Thread Simon Horman
Hi Paul,

On Sun, Dec 25, 2016 at 01:39:28PM +0200, Paul Blakey wrote:
> This patch series introduces rule offload functionality to dpif-netlink 
> via netdev ports new flow offloading API. The user can specify whether to
> enable rule offloading or not via OVS configuration. Netdev providers
> are able to implement netdev flow offload API in order to offload rules.

...

I am wondering if this series is available as a git branch somewhere.
As there are a number of patches it would help me greatly.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovs V2 00/21] Introducing HW offload support for openvswitch

2017-02-02 Thread Roi Dayan



On 02/02/2017 12:45, Simon Horman wrote:

Hi Paul,

On Sun, Dec 25, 2016 at 01:39:28PM +0200, Paul Blakey wrote:

This patch series introduces rule offload functionality to dpif-netlink
via netdev ports new flow offloading API. The user can specify whether to
enable rule offloading or not via OVS configuration. Netdev providers
are able to implement netdev flow offload API in order to offload rules.


...

I am wondering if this series is available as a git branch somewhere.
As there are a number of patches it would help me greatly.



Hi Simon,

This series with some additional changes since the we submitted it to 
the list is available here:

https://github.com/roidayan/ovs/commits/hw-offload

We are also working for V3 submission so if you plan some patches over 
this series you should check this branch:

https://github.com/roidayan/ovs/commits/hw-offload-fixes-for-v3-4

Though most of the commit titles might look the same in both branches - 
they are not.


Thanks,
Roi
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [libvirt] [RFC] Vhost-user backends cross-version migration support

2017-02-02 Thread Maxime Coquelin



On 02/01/2017 12:41 PM, Daniel P. Berrange wrote:

On Wed, Feb 01, 2017 at 12:33:22PM +0100, Maxime Coquelin wrote:



On 02/01/2017 10:43 AM, Daniel P. Berrange wrote:

On Wed, Feb 01, 2017 at 10:14:54AM +0100, Michal Privoznik wrote:

On 02/01/2017 09:35 AM, Maxime Coquelin wrote:



Solution 3: Libvirt queries OVS for vhost backend version string: *OK*
==


 The idea is to have a table of supported versions, associated to
key/value pairs. Libvirt could query the list of supported versions
strings for each hosts, and select the first common one among all hosts.


How does libvirt know what hosts to ask? Libvirt aims on managing a
single host. It has no knowledge of other hosts on the network. That's
task for upper layers like RHEV, OpenStack, etc.



 Then, libvirt would ask OVS to probe the vhost-user interfaces in the
selected version (compatibility mode). For example host A runs OVS-2.7,
and host B OVS-2.6. Host A's OVS-2.7 has an OVS-2.6 compatibility mode
(e.g. with indirect descriptors disabled), which should be selected at
vhost-user interface probe time.

 Advantage of doing so is that libvirt does not need any update if new
keys are introduced (i.e. it does not need to know how the new keys have
to be handled), all these checks remain in OVS's vhost-user implementation.


And that's where they should stay. Duplicating code between projects
will inevitably lead to a divergence.



 Ideally, we would support per vhost-user interface compatibility mode,
which may have an impact also on DPDK API, as the Virtio feature update
API is global, and not per port.


In general, I don't think we want any kind of this logic in libvirt. Either:

a) fallback logic should be implemented in qemu (e.g. upon migration it
should detect that the migrated guest uses certain version and thus set
backend to use that version or error out and cancel migration), or

b) libvirt would grew another element/attribute to specify version of
vhost-user backend in use and do nothing more than pass it to qemu. At
the same time, we can provide an API (or extend and existing one, e.g.
virsh domcapabilities) to list all available versions on given host.
Upper layer, which knows what are the possible hosts suitable for
virtualization, can then use this API to ask all the hosts, construct
the matrix, select preferred version and put it into libvirt's domain XML.

But frankly, I don't like b) that much. Lets put the fact this is OVS
aside for a moment. Just pretend this is a generic device in qemu. Would
we do the same magic with it? No! Or lets talk about machine types. You
spawn -M type$((X+1)) guest and then decide to migrate it to a host with
older qemu wich supports just typeX. Well, you get an error. Do we care?
Not at all! It's your responsibility (as user/admin) to upgrade the qemu
so that it supports new machine type. I think the same applies to OVS.


With machine types, if the latest machine type is X, libvirt allows
the mgmt app to spawn a guest with mcahine type X-1, so that you can
later migrate the VM to a host with older QEMU.

With the vhost user device, the VM will always be launched with version
Y. There's currently no way to request launching the vhost user device
wtih version Y-1. So even if you set your machine type to X-1 for
compat with older host, vhost user will be stuck at version Y preventing
the migration.

One argument would be to say that the vhost user featureset should be
determined by the machine type version, instead of introducing a new
version. The complexity is that vhost-user is a pretty dumb device
from QEMUs POV - most of the interesting logic & the features that
need to be constrained lie in code outside of QEMU, in whatever
server is connected to the vhost user socket.

So I can see the value in allowing a simple version string to be
associated with the vhost-user NIC.

What I'm unclear about is how we would be able to report capabilities
for a host to enumerate what versions were possible. Libvirt doesn't
interact with any of the 3rd party vhost user servers, so it is probably
out of scope - it'd be upto the higher level mgmt app to talk to DPDK
and figure out what versions it supports.

That does make me wonder though if libvirt & QEMU need to be involved
at all in any part.


Indeed, if the higher level management tool decides for the VM's machine
type, it is where it should also be done for the vhost-user backend. I
now understand this does not make much sense to have libvirt being
involved in this, all (querying/selecting/setting compat mode) should be
manageable in the upper layer.

I'm not familiar with these layers, so your inputs are really helpful.



When provisioning a new guest, the mgmt app presumably has to talk to
DPDK to setup the NIC port, so DPDK is ready when QEMU launches and
connects. Surely as part of that NIC port setup, it could directly
tell DPDK which version or featureset to permit on the port ? I

[ovs-dev] Kickstart 2017 sales

2017-02-02 Thread David Baker
Hi,

Would you be interested in list cleansing service for your outdated customer
database and start your new year off right?

United E-solutions is here to help you get started your 2017 marketing
projects on the right way with our list cleansing services. Don’t waste your
time on your outdated data, let UES clean up your data and save you precious
time & money.

 Our List Cleansing service includes:

ü  Fresh Email & Phone Appending

ü  Email Address Verification

ü  Inter file Email duplication removal

ü  Domain name correction

ü  Hard & Soft Bounce email separations

ü  Opt-outs separation

ü  Postal Address Appending

ü  Direct Phone & Fax Appending etc.

 

If you are looking for particular email lists from any of your target
geography please let me know and I will get back to you with more
information regarding the same.

 

Feel free to call this number 610-572-4885 to speak with me.


Best Regards,

David Baker | Business Development Associate

Ph: 610-572-4885

To opt-out please type REMOVE in subject line.

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [libvirt] [RFC] Vhost-user backends cross-version migration support

2017-02-02 Thread Daniel P. Berrange
On Thu, Feb 02, 2017 at 03:14:01PM +0100, Maxime Coquelin wrote:
> 
> 
> On 02/01/2017 12:41 PM, Daniel P. Berrange wrote:
> > 
> > It depends where / how in OVS it needs to be set. The only stuff libvirt
> > does with OVS is to run 'add-port' and 'del-port' commands via the ovs
> > cli tool. We pass through arguments from the port profile stored in the
> > XML config.
> > 
> >   
> > 
> > 
> >> interfaceid='09b11c53-8b5c-4eeb-8f00-d84eaa0aaa4f'/>
> > 
> >   
> > 
> > eg those things in  get passed as cli args to the 'add-port'
> > command. Soo if add-port needs this new version string, then we'd need
> > to add the version to the openvswitch virtualport XML.
> > 
> > If the version is provided to OVS in a different command, then it would
> > probably be outside scope of libvirt.
> 
> I think it would make sense to be a parameter of the add-port command.
> But it would be for vhost-user related add-port command, I didn't find
> where/if this is managed in libvirt XML.

For vhost-user, libvirt does not have any interaction with OVS at
all. If the thing that's using the vhost-user UNIX socket, in turn
connects to OVS, that's outside scope of libvirt. IOW, for vhost-user
OVS it seems like that job is for Nova / os-vif to solve.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://entangle-photo.org   -o-http://search.cpan.org/~danberr/ :|
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH RFC 0/1] Remove experimental tag for OVS userspace

2017-02-02 Thread Ian Stokes
Since the introduction of DPDK to OVS, the userspace deployment of OVS has
been considered "experimental". A number of areas were identified to be
improved upon before the tag could be removed.

There has been a conscious effort to address these areas. The improvements
to userspace (specifically regarding DPDK) include:

1. Improved documentation for user deployment and feature usage with DPDK.
2. Improved feature parity between kernel and userspace with DPDK.
3. Introduction of unit tests for DPDK specific components.
4. Availability of deployment/configuration of OVS and DPDK via multiple OS
vendor repos.

As such this patch proposes the removal of the experimental warning tag for
OVS userspace deployments throughout the documentation.

Ian Stokes (1):
  doc: Remove "experimental" warning for userspace.

 Documentation/intro/install/dpdk.rst  |3 ---
 Documentation/intro/install/userspace.rst |4 
 README.rst|6 +++---
 3 files changed, 3 insertions(+), 10 deletions(-)

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH 1/1] doc: Remove "experimental" warning for userspace.

2017-02-02 Thread Ian Stokes
Remove the experimental warning tag in documentation regarding OVS deployed
via userspace.

Signed-off-by: Ian Stokes 
---
 Documentation/intro/install/dpdk.rst  |3 ---
 Documentation/intro/install/userspace.rst |4 
 README.rst|6 +++---
 3 files changed, 3 insertions(+), 10 deletions(-)

diff --git a/Documentation/intro/install/dpdk.rst 
b/Documentation/intro/install/dpdk.rst
index fff0a1a..3018590 100644
--- a/Documentation/intro/install/dpdk.rst
+++ b/Documentation/intro/install/dpdk.rst
@@ -29,9 +29,6 @@ This document describes how to build and install Open vSwitch 
using a DPDK
 datapath. Open vSwitch can use the DPDK library to operate entirely in
 userspace.
 
-.. warning::
-  The DPDK support of Open vSwitch is considered 'experimental'.
-
 Build requirements
 --
 
diff --git a/Documentation/intro/install/userspace.rst 
b/Documentation/intro/install/userspace.rst
index 0368527..ebd0c12 100644
--- a/Documentation/intro/install/userspace.rst
+++ b/Documentation/intro/install/userspace.rst
@@ -34,10 +34,6 @@ This version of Open vSwitch should be built manually with 
``configure`` and
 been recently tested, and so Debian packages are not a recommended way to use
 this version of Open vSwitch.
 
-.. warning::
-  The userspace-only mode of Open vSwitch is considered experimental.  It has
-  not been thoroughly tested.
-
 Building and Installing
 ---
 
diff --git a/README.rst b/README.rst
index f5cdaa5..90050e3 100644
--- a/README.rst
+++ b/README.rst
@@ -38,9 +38,9 @@ following features:
 
 The included Linux kernel module supports Linux 3.10 and up.
 
-Open vSwitch can also operate, at a cost in performance, entirely in userspace,
-without assistance from a kernel module.  This userspace implementation should
-be easier to port than the kernel-based switch.  It is considered experimental.
+Open vSwitch can also operate entirely in userspace without assistance from
+a kernel module.  This userspace implementation should be easier to port than
+the kernel-based switch.
 
 What's here?
 
-- 
1.7.0.7

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2] ovn-controller: Provide the option to set Encap.options:csum

2017-02-02 Thread Numan Siddique
On Wed, Feb 1, 2017 at 4:31 AM, Ben Pfaff  wrote:

> On Sun, Jan 15, 2017 at 12:36:09PM +0530, Numan Siddique wrote:
> > ovn-controller by default enables tunnel encapsulation checksums
> > for geneve tunnels. With this patch user can set the desired value
> > in Open_vSwitch.external_ids:ovn_encap_csum.
> >
> > This option will be useful in cases where enabling tunnel
> > encapsulation checksums incur significant performance loss due to
> > limitations in checksum offloading capabilities of the nics.
> >
> > Signed-off-by: Numan Siddique 
>
> Thanks.  I applied this to master.
>

​Thanks Ben for the review and applying it. Can this be applied to branch
2.7 as well ?

Numan
​
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2] ovn-controller: Provide the option to set Encap.options:csum

2017-02-02 Thread Russell Bryant
On Thu, Feb 2, 2017 at 10:32 AM, Numan Siddique  wrote:

> On Wed, Feb 1, 2017 at 4:31 AM, Ben Pfaff  wrote:
>
> > On Sun, Jan 15, 2017 at 12:36:09PM +0530, Numan Siddique wrote:
> > > ovn-controller by default enables tunnel encapsulation checksums
> > > for geneve tunnels. With this patch user can set the desired value
> > > in Open_vSwitch.external_ids:ovn_encap_csum.
> > >
> > > This option will be useful in cases where enabling tunnel
> > > encapsulation checksums incur significant performance loss due to
> > > limitations in checksum offloading capabilities of the nics.
> > >
> > > Signed-off-by: Numan Siddique 
> >
> > Thanks.  I applied this to master.
> >
>
> ​Thanks Ben for the review and applying it. Can this be applied to branch
> 2.7 as well ?
>

I'm supportive of backporting it.

Just to elaborate on why we care ...

With checksums enabled, we are not able to make use of the
tx-udp_tnl-segmentation offload.  We will be able to leave checksums
enabled once we also have support for tx-udp_tnl-csum-segmentation, which
is present in the upstream kernel but not the RHEL kernel yet.  So, right
now to get the best overall performance, we have to disable checksums (for
current CentOS / RHEL, anyway).

-- 
Russell Bryant
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [libvirt] [RFC] Vhost-user backends cross-version migration support

2017-02-02 Thread Michael S. Tsirkin
On Thu, Feb 02, 2017 at 03:06:21PM +, Daniel P. Berrange wrote:
> On Thu, Feb 02, 2017 at 03:14:01PM +0100, Maxime Coquelin wrote:
> > 
> > 
> > On 02/01/2017 12:41 PM, Daniel P. Berrange wrote:
> > > 
> > > It depends where / how in OVS it needs to be set. The only stuff libvirt
> > > does with OVS is to run 'add-port' and 'del-port' commands via the ovs
> > > cli tool. We pass through arguments from the port profile stored in the
> > > XML config.
> > > 
> > >   
> > > 
> > > 
> > >> > interfaceid='09b11c53-8b5c-4eeb-8f00-d84eaa0aaa4f'/>
> > > 
> > >   
> > > 
> > > eg those things in  get passed as cli args to the 'add-port'
> > > command. Soo if add-port needs this new version string, then we'd need
> > > to add the version to the openvswitch virtualport XML.
> > > 
> > > If the version is provided to OVS in a different command, then it would
> > > probably be outside scope of libvirt.
> > 
> > I think it would make sense to be a parameter of the add-port command.
> > But it would be for vhost-user related add-port command, I didn't find
> > where/if this is managed in libvirt XML.
> 
> For vhost-user, libvirt does not have any interaction with OVS at
> all. If the thing that's using the vhost-user UNIX socket, in turn
> connects to OVS, that's outside scope of libvirt. IOW, for vhost-user
> OVS it seems like that job is for Nova / os-vif to solve.
> 
> Regards,
> Daniel

I don't think they currently understand the issues involved in
cross-version migration though.  This is a complex subject and easy to
get wrong.  It would be significantly better to figure it out at libvirt
level since it already deals with cross-version migration issues.


> -- 
> |: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
> |: http://libvirt.org  -o- http://virt-manager.org :|
> |: http://entangle-photo.org   -o-http://search.cpan.org/~danberr/ :|
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH] doc: Update DPDK version for 2.7 relesase.

2017-02-02 Thread Ian Stokes
Add DPDK version required for the OVS 2.7 release in documentation.

Signed-off-by: Ian Stokes 
---
 Documentation/faq/releases.rst |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/Documentation/faq/releases.rst b/Documentation/faq/releases.rst
index fcff5c3..319c2d7 100644
--- a/Documentation/faq/releases.rst
+++ b/Documentation/faq/releases.rst
@@ -159,6 +159,7 @@ Q: What DPDK version does each Open vSwitch release work 
with?
 2.4.x2.0
 2.5.x2.2
 2.6.x16.07
+2.7.x16.11
  =
 
 Q: I get an error like this when I configure Open vSwitch::
-- 
1.7.0.7

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] doc: Update DPDK version for 2.7 relesase.

2017-02-02 Thread Stokes, Ian
> Subject: [PATCH] doc: Update DPDK version for 2.7 relesase.
> 
> Add DPDK version required for the OVS 2.7 release in documentation.
> 
> Signed-off-by: Ian Stokes 
> ---
>  Documentation/faq/releases.rst |1 +
>  1 files changed, 1 insertions(+), 0 deletions(-)
> 
> diff --git a/Documentation/faq/releases.rst
> b/Documentation/faq/releases.rst index fcff5c3..319c2d7 100644
> --- a/Documentation/faq/releases.rst
> +++ b/Documentation/faq/releases.rst
> @@ -159,6 +159,7 @@ Q: What DPDK version does each Open vSwitch release
> work with?
>  2.4.x2.0
>  2.5.x2.2
>  2.6.x16.07
> +2.7.x16.11
>   =
> 
>  Q: I get an error like this when I configure Open vSwitch::
> --
> 1.7.0.7

Please ignore, will resend to correct commit message typo.

Ian

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 1/1] doc: Remove "experimental" warning for userspace.

2017-02-02 Thread Kevin Traynor
On 02/02/2017 03:21 PM, Ian Stokes wrote:
> Remove the experimental warning tag in documentation regarding OVS deployed
> via userspace.
> 
> Signed-off-by: Ian Stokes 
> ---
>  Documentation/intro/install/dpdk.rst  |3 ---
>  Documentation/intro/install/userspace.rst |4 
>  README.rst|6 +++---
>  3 files changed, 3 insertions(+), 10 deletions(-)
> 

This should also be documented in the NEWS file. With that

Acked-by: Kevin Traynor 

> diff --git a/Documentation/intro/install/dpdk.rst 
> b/Documentation/intro/install/dpdk.rst
> index fff0a1a..3018590 100644
> --- a/Documentation/intro/install/dpdk.rst
> +++ b/Documentation/intro/install/dpdk.rst
> @@ -29,9 +29,6 @@ This document describes how to build and install Open 
> vSwitch using a DPDK
>  datapath. Open vSwitch can use the DPDK library to operate entirely in
>  userspace.
>  
> -.. warning::
> -  The DPDK support of Open vSwitch is considered 'experimental'.
> -
>  Build requirements
>  --
>  
> diff --git a/Documentation/intro/install/userspace.rst 
> b/Documentation/intro/install/userspace.rst
> index 0368527..ebd0c12 100644
> --- a/Documentation/intro/install/userspace.rst
> +++ b/Documentation/intro/install/userspace.rst
> @@ -34,10 +34,6 @@ This version of Open vSwitch should be built manually with 
> ``configure`` and
>  been recently tested, and so Debian packages are not a recommended way to use
>  this version of Open vSwitch.
>  
> -.. warning::
> -  The userspace-only mode of Open vSwitch is considered experimental.  It has
> -  not been thoroughly tested.
> -
>  Building and Installing
>  ---
>  
> diff --git a/README.rst b/README.rst
> index f5cdaa5..90050e3 100644
> --- a/README.rst
> +++ b/README.rst
> @@ -38,9 +38,9 @@ following features:
>  
>  The included Linux kernel module supports Linux 3.10 and up.
>  
> -Open vSwitch can also operate, at a cost in performance, entirely in 
> userspace,
> -without assistance from a kernel module.  This userspace implementation 
> should
> -be easier to port than the kernel-based switch.  It is considered 
> experimental.
> +Open vSwitch can also operate entirely in userspace without assistance from
> +a kernel module.  This userspace implementation should be easier to port than
> +the kernel-based switch.
>  
>  What's here?
>  
> 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH] doc: Update DPDK version for 2.7 release.

2017-02-02 Thread Ian Stokes
Add DPDK version required for the OVS 2.7 release in documentation.

Signed-off-by: Ian Stokes 
---
 Documentation/faq/releases.rst |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/Documentation/faq/releases.rst b/Documentation/faq/releases.rst
index fcff5c3..319c2d7 100644
--- a/Documentation/faq/releases.rst
+++ b/Documentation/faq/releases.rst
@@ -159,6 +159,7 @@ Q: What DPDK version does each Open vSwitch release work 
with?
 2.4.x2.0
 2.5.x2.2
 2.6.x16.07
+2.7.x16.11
  =
 
 Q: I get an error like this when I configure Open vSwitch::
-- 
1.7.0.7

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 1/1] doc: Remove "experimental" warning for userspace.

2017-02-02 Thread Stokes, Ian
> On 02/02/2017 03:21 PM, Ian Stokes wrote:
> > Remove the experimental warning tag in documentation regarding OVS
> > deployed via userspace.
> >
> > Signed-off-by: Ian Stokes 
> > ---
> >  Documentation/intro/install/dpdk.rst  |3 ---
> >  Documentation/intro/install/userspace.rst |4 
> >  README.rst|6 +++---
> >  3 files changed, 3 insertions(+), 10 deletions(-)
> >
> 
> This should also be documented in the NEWS file. With that
> 
> Acked-by: Kevin Traynor 
> 
Thanks, I wasn't sure whether to add it there as it isn't a new feature as such 
but will send a v2 with that addition.

Ian
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v2 1/1] doc: Remove "experimental" warning for userspace.

2017-02-02 Thread Ian Stokes
Remove the experimental warning tag in documentation regarding OVS deployed
via userspace.

Signed-off-by: Ian Stokes 
---
 Documentation/intro/install/dpdk.rst  |3 ---
 Documentation/intro/install/userspace.rst |4 
 NEWS  |2 ++
 README.rst|6 +++---
 4 files changed, 5 insertions(+), 10 deletions(-)

diff --git a/Documentation/intro/install/dpdk.rst 
b/Documentation/intro/install/dpdk.rst
index fff0a1a..3018590 100644
--- a/Documentation/intro/install/dpdk.rst
+++ b/Documentation/intro/install/dpdk.rst
@@ -29,9 +29,6 @@ This document describes how to build and install Open vSwitch 
using a DPDK
 datapath. Open vSwitch can use the DPDK library to operate entirely in
 userspace.
 
-.. warning::
-  The DPDK support of Open vSwitch is considered 'experimental'.
-
 Build requirements
 --
 
diff --git a/Documentation/intro/install/userspace.rst 
b/Documentation/intro/install/userspace.rst
index 0368527..ebd0c12 100644
--- a/Documentation/intro/install/userspace.rst
+++ b/Documentation/intro/install/userspace.rst
@@ -34,10 +34,6 @@ This version of Open vSwitch should be built manually with 
``configure`` and
 been recently tested, and so Debian packages are not a recommended way to use
 this version of Open vSwitch.
 
-.. warning::
-  The userspace-only mode of Open vSwitch is considered experimental.  It has
-  not been thoroughly tested.
-
 Building and Installing
 ---
 
diff --git a/NEWS b/NEWS
index 5efcce2..8600f0e 100644
--- a/NEWS
+++ b/NEWS
@@ -3,6 +3,8 @@ Post-v2.7.0
- Tunnels:
  * Added support to set packet mark for tunnel endpoint using
`egress_pkt_mark` OVSDB option.
+   - Documentation:
+ * OVS deployed in userspace mode no longer tagged as experimental.
 
 v2.7.0 - xx xxx 
 -
diff --git a/README.rst b/README.rst
index f5cdaa5..90050e3 100644
--- a/README.rst
+++ b/README.rst
@@ -38,9 +38,9 @@ following features:
 
 The included Linux kernel module supports Linux 3.10 and up.
 
-Open vSwitch can also operate, at a cost in performance, entirely in userspace,
-without assistance from a kernel module.  This userspace implementation should
-be easier to port than the kernel-based switch.  It is considered experimental.
+Open vSwitch can also operate entirely in userspace without assistance from
+a kernel module.  This userspace implementation should be easier to port than
+the kernel-based switch.
 
 What's here?
 
-- 
1.7.0.7

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v2 0/1] Remove experimental tag for OVS userspace

2017-02-02 Thread Ian Stokes
Since the introduction of DPDK to OVS, the userspace deployment of OVS has
been considered "experimental". A number of areas were identified to be
improved upon before the tag could be removed.

There has been a conscious effort to address these areas. The improvements
to userspace (specifically regarding DPDK) include:

1. Improved documentation for user deployment and feature usage with DPDK.
2. Improved feature parity between kernel and userspace with DPDK.
3. Introduction of unit tests for DPDK specific components.
4. Availability of deployment/configuration of OVS and DPDK via multiple OS
vendor repos.

As such this patch proposes the removal of the experimental warning tag for
OVS userspace deployments throughout the documentation.

Ian Stokes (1):
  doc: Remove "experimental" warning for userspace.

 Documentation/intro/install/dpdk.rst  |3 ---
 Documentation/intro/install/userspace.rst |4 
 NEWS  |2 ++
 README.rst|6 +++---
 4 files changed, 5 insertions(+), 10 deletions(-)

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [libvirt] [RFC] Vhost-user backends cross-version migration support

2017-02-02 Thread Laine Stump

On 02/02/2017 10:06 AM, Daniel P. Berrange wrote:

On Thu, Feb 02, 2017 at 03:14:01PM +0100, Maxime Coquelin wrote:


On 02/01/2017 12:41 PM, Daniel P. Berrange wrote:

It depends where / how in OVS it needs to be set. The only stuff libvirt
does with OVS is to run 'add-port' and 'del-port' commands via the ovs
cli tool.


(aside note: the code that exec's ovs-vsctl was written back when there 
was no standardized API for performing such operations. libvirt would 
prefer to not exec external programs though, and I've heard that OVS may 
now have an official API of some sort for doing things like this (maybe 
via netlink or dbus or something?) If that's the case, can someone point 
me in the right direction?)



  We pass through arguments from the port profile stored in the
XML config.

   
 
 
   
 
   

eg those things in  get passed as cli args to the 'add-port'
command. Soo if add-port needs this new version string, then we'd need
to add the version to the openvswitch virtualport XML.

If the version is provided to OVS in a different command, then it would
probably be outside scope of libvirt.

I think it would make sense to be a parameter of the add-port command.
But it would be for vhost-user related add-port command, I didn't find
where/if this is managed in libvirt XML.

For vhost-user, libvirt does not have any interaction with OVS at
all. If the thing that's using the vhost-user UNIX socket, in turn
connects to OVS, that's outside scope of libvirt. IOW, for vhost-user
OVS it seems like that job is for Nova / os-vif to solve.


This brings up another tangentially related question I came up against 
last night - qemu now has an option to report the host's MTU to the 
guest for virtio and vhost-user interfaces, and Michal Privoznik 
recently pushed patches to set the MTU sent to the guest via an explicit 
 in libvirt's interface config:


   https://bugzilla.redhat.com/1408701

But it would be much nicer if libvirt could learn the MTU of [that stuff 
at the other end of the unix socket] without requiring intervention in 
libvirt's config. For example, I'm just now testing patches for 
tap-based interfaces (connecting to Linux host bridges or OVS switches) 
that query the current MTU of the bridge and report that to qemu; this 
eliminates the burden of configuring each interface of each guest 
individually (and changing that config in all those places if someone 
ever wants to change the MTU of the bridge).


As Dan says, though, libvirt's only interaction in the case of 
vhost-user is with the unix socket. Is there any way to learn what is 
the appropriate MTU from OVS in these cases? Or must Nova (or ovirt or 
some poor user) set that up in the libvirt config for every single 
interface?


___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] ovsdb: fix data loss when OVSDB replication from itself

2017-02-02 Thread Guru Shetty
On 31 January 2017 at 19:27, Guoshuai Li  wrote:

>
>
> This patch removes IP migration and OVSDB services to promote timing
> dependency, which occurs at the master / slave exchange time and may not be
> user-configurable.
> If this dependency requires upper-level processing, not every cluster
> program can easily handle it. For example, building an OVSDB cluster using
> K8S is difficult to handle this dependency.


Assume that I don't know anything about Kubernetes and OVSDB replication.
Can you clearly explain how do you plan to use it with Kubernetes and what
is it that does not work right now?


>
> Replication of OVSDB onto itself seems to be an configuration issue. I
>> don't see
>> why such configuration is ever useful in practice, and probably should
>> be blocked
>> at a higher level.
>>
>> Is there something special about K8 where OVSDB is expected to
>> replicating itself?
>> If so, please explain.
>>
>
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 2/4 v4] lib:Data structures: Skiplist implementation

2017-02-02 Thread Ben Pfaff
On Wed, Dec 28, 2016 at 07:41:46PM +, Rodriguez Betancourt, Esteban wrote:
> Skiplist implementation intended for the IDL compound indexes
> feature.
> 
> Signed-off-by: Esteban Rodriguez Betancourt 

Thanks!

Is there a value to having max_levels as a configurable parameter for
the skiplist?  That is, is there a reason for the caller to specify
anything other than SKIPLIST_MAX_LEVELS?  From the comments, it sounds
like, if more levels are needed, then other code has to change anyway.
That means that the only other reason would be to use fewer levels, but
is there value in that?

skiplist_determine_level() seems like it could be implemented more
simply as something like:
uint32_t level = clz32(random_uint32());
return MIN(level, cl->max_levels);
or if we get rid of max_levels:
return clz32(random_uint32());

s/ocurrence/occurrence/ in a comment.

It looks like free_func is only used in skiplist_destroy(), so It might
make more sense to simply pass it as an argument to skiplist_destroy().

Thanks,

Ben.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] doc: Update DPDK version for 2.7 release.

2017-02-02 Thread Kevin Traynor
On 02/02/2017 04:30 PM, Ian Stokes wrote:
> Add DPDK version required for the OVS 2.7 release in documentation.
> 
> Signed-off-by: Ian Stokes 

Acked-by: Kevin Traynor 

> ---
>  Documentation/faq/releases.rst |1 +
>  1 files changed, 1 insertions(+), 0 deletions(-)
> 
> diff --git a/Documentation/faq/releases.rst b/Documentation/faq/releases.rst
> index fcff5c3..319c2d7 100644
> --- a/Documentation/faq/releases.rst
> +++ b/Documentation/faq/releases.rst
> @@ -159,6 +159,7 @@ Q: What DPDK version does each Open vSwitch release work 
> with?
>  2.4.x2.0
>  2.5.x2.2
>  2.6.x16.07
> +2.7.x16.11
>   =
>  
>  Q: I get an error like this when I configure Open vSwitch::
> 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2 1/1] doc: Remove "experimental" warning for userspace.

2017-02-02 Thread Kevin Traynor
On 02/02/2017 04:44 PM, Ian Stokes wrote:
> Remove the experimental warning tag in documentation regarding OVS deployed
> via userspace.
> 
> Signed-off-by: Ian Stokes 
> ---
>  Documentation/intro/install/dpdk.rst  |3 ---
>  Documentation/intro/install/userspace.rst |4 
>  NEWS  |2 ++
>  README.rst|6 +++---
>  4 files changed, 5 insertions(+), 10 deletions(-)
> 
> diff --git a/Documentation/intro/install/dpdk.rst 
> b/Documentation/intro/install/dpdk.rst
> index fff0a1a..3018590 100644
> --- a/Documentation/intro/install/dpdk.rst
> +++ b/Documentation/intro/install/dpdk.rst
> @@ -29,9 +29,6 @@ This document describes how to build and install Open 
> vSwitch using a DPDK
>  datapath. Open vSwitch can use the DPDK library to operate entirely in
>  userspace.
>  
> -.. warning::
> -  The DPDK support of Open vSwitch is considered 'experimental'.
> -
>  Build requirements
>  --
>  
> diff --git a/Documentation/intro/install/userspace.rst 
> b/Documentation/intro/install/userspace.rst
> index 0368527..ebd0c12 100644
> --- a/Documentation/intro/install/userspace.rst
> +++ b/Documentation/intro/install/userspace.rst
> @@ -34,10 +34,6 @@ This version of Open vSwitch should be built manually with 
> ``configure`` and
>  been recently tested, and so Debian packages are not a recommended way to use
>  this version of Open vSwitch.
>  
> -.. warning::
> -  The userspace-only mode of Open vSwitch is considered experimental.  It has
> -  not been thoroughly tested.
> -
>  Building and Installing
>  ---
>  
> diff --git a/NEWS b/NEWS
> index 5efcce2..8600f0e 100644
> --- a/NEWS
> +++ b/NEWS
> @@ -3,6 +3,8 @@ Post-v2.7.0
> - Tunnels:
>   * Added support to set packet mark for tunnel endpoint using
> `egress_pkt_mark` OVSDB option.
> +   - Documentation:
> + * OVS deployed in userspace mode no longer tagged as experimental.

I think this would be a bit clearer. What do you think?

--- a/NEWS
+++ b/NEWS
@@ -59,4 +59,5 @@ v2.7.0 - xx xxx 
  * Removed support for IPsec tunnels.
- DPDK:
+ * Removal of experimental tag.


>  
>  v2.7.0 - xx xxx 
>  -
> diff --git a/README.rst b/README.rst
> index f5cdaa5..90050e3 100644
> --- a/README.rst
> +++ b/README.rst
> @@ -38,9 +38,9 @@ following features:
>  
>  The included Linux kernel module supports Linux 3.10 and up.
>  
> -Open vSwitch can also operate, at a cost in performance, entirely in 
> userspace,
> -without assistance from a kernel module.  This userspace implementation 
> should
> -be easier to port than the kernel-based switch.  It is considered 
> experimental.
> +Open vSwitch can also operate entirely in userspace without assistance from
> +a kernel module.  This userspace implementation should be easier to port than
> +the kernel-based switch.
>  
>  What's here?
>  
> 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [libvirt] [RFC] Vhost-user backends cross-version migration support

2017-02-02 Thread Michael S. Tsirkin
On Thu, Feb 02, 2017 at 11:47:57AM -0500, Laine Stump wrote:
> On 02/02/2017 10:06 AM, Daniel P. Berrange wrote:
> > On Thu, Feb 02, 2017 at 03:14:01PM +0100, Maxime Coquelin wrote:
> > > 
> > > On 02/01/2017 12:41 PM, Daniel P. Berrange wrote:
> > > > It depends where / how in OVS it needs to be set. The only stuff libvirt
> > > > does with OVS is to run 'add-port' and 'del-port' commands via the ovs
> > > > cli tool.
> 
> (aside note: the code that exec's ovs-vsctl was written back when there was
> no standardized API for performing such operations. libvirt would prefer to
> not exec external programs though, and I've heard that OVS may now have an
> official API of some sort for doing things like this (maybe via netlink or
> dbus or something?) If that's the case, can someone point me in the right
> direction?)
> 
> > > >   We pass through arguments from the port profile stored in the
> > > > XML config.
> > > > 
> > > >
> > > >  
> > > >  
> > > > > > > interfaceid='09b11c53-8b5c-4eeb-8f00-d84eaa0aaa4f'/>
> > > >  
> > > >
> > > > 
> > > > eg those things in  get passed as cli args to the 
> > > > 'add-port'
> > > > command. Soo if add-port needs this new version string, then we'd need
> > > > to add the version to the openvswitch virtualport XML.
> > > > 
> > > > If the version is provided to OVS in a different command, then it would
> > > > probably be outside scope of libvirt.
> > > I think it would make sense to be a parameter of the add-port command.
> > > But it would be for vhost-user related add-port command, I didn't find
> > > where/if this is managed in libvirt XML.
> > For vhost-user, libvirt does not have any interaction with OVS at
> > all. If the thing that's using the vhost-user UNIX socket, in turn
> > connects to OVS, that's outside scope of libvirt. IOW, for vhost-user
> > OVS it seems like that job is for Nova / os-vif to solve.
> 
> This brings up another tangentially related question I came up against last
> night - qemu now has an option to report the host's MTU to the guest for
> virtio and vhost-user interfaces, and Michal Privoznik recently pushed
> patches to set the MTU sent to the guest via an explicit  in
> libvirt's interface config:
> 
>https://bugzilla.redhat.com/1408701
> 
> But it would be much nicer if libvirt could learn the MTU of [that stuff at
> the other end of the unix socket] without requiring intervention in
> libvirt's config. For example, I'm just now testing patches for tap-based
> interfaces (connecting to Linux host bridges or OVS switches) that query the
> current MTU of the bridge and report that to qemu; this eliminates the
> burden of configuring each interface of each guest individually (and
> changing that config in all those places if someone ever wants to change the
> MTU of the bridge).
> 
> As Dan says, though, libvirt's only interaction in the case of vhost-user is
> with the unix socket. Is there any way to learn what is the appropriate MTU
> from OVS in these cases? Or must Nova (or ovirt or some poor user) set that
> up in the libvirt config for every single interface?

We could add commands for all kind of queries to the vhost-user
protocol.  libvirt would have to learn the vhost-user protocol though.
Interested?

-- 
MST
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [libvirt] [RFC] Vhost-user backends cross-version migration support

2017-02-02 Thread Daniel P. Berrange
On Thu, Feb 02, 2017 at 06:21:55PM +0200, Michael S. Tsirkin wrote:
> On Thu, Feb 02, 2017 at 03:06:21PM +, Daniel P. Berrange wrote:
> > On Thu, Feb 02, 2017 at 03:14:01PM +0100, Maxime Coquelin wrote:
> > > 
> > > 
> > > On 02/01/2017 12:41 PM, Daniel P. Berrange wrote:
> > > > 
> > > > It depends where / how in OVS it needs to be set. The only stuff libvirt
> > > > does with OVS is to run 'add-port' and 'del-port' commands via the ovs
> > > > cli tool. We pass through arguments from the port profile stored in the
> > > > XML config.
> > > > 
> > > >   
> > > > 
> > > > 
> > > >> > > interfaceid='09b11c53-8b5c-4eeb-8f00-d84eaa0aaa4f'/>
> > > > 
> > > >   
> > > > 
> > > > eg those things in  get passed as cli args to the 
> > > > 'add-port'
> > > > command. Soo if add-port needs this new version string, then we'd need
> > > > to add the version to the openvswitch virtualport XML.
> > > > 
> > > > If the version is provided to OVS in a different command, then it would
> > > > probably be outside scope of libvirt.
> > > 
> > > I think it would make sense to be a parameter of the add-port command.
> > > But it would be for vhost-user related add-port command, I didn't find
> > > where/if this is managed in libvirt XML.
> > 
> > For vhost-user, libvirt does not have any interaction with OVS at
> > all. If the thing that's using the vhost-user UNIX socket, in turn
> > connects to OVS, that's outside scope of libvirt. IOW, for vhost-user
> > OVS it seems like that job is for Nova / os-vif to solve.
> 
> I don't think they currently understand the issues involved in
> cross-version migration though.  This is a complex subject and easy to
> get wrong.  It would be significantly better to figure it out at libvirt
> level since it already deals with cross-version migration issues.

Libvirt considers vhost-user to be a blackbox though - it just exposes
a UNIX socket, and whatever is on the other end is completely opaque.
The fact that the other end might plumb the data stream into openvswitch
is not something libvirt should know, as we don't want to end up having
to add custom code to libvirt for every different vhost-user server
impl.

IOW, if the version str can be passed to QEMU, and then onto vhost-user
backend in QEMU, then libvirt can be involved. If the version str has
to be given to openvswitch that's not for libvirt to deall with.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://entangle-photo.org   -o-http://search.cpan.org/~danberr/ :|
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [libvirt] [RFC] Vhost-user backends cross-version migration support

2017-02-02 Thread Daniel P. Berrange
On Thu, Feb 02, 2017 at 07:09:24PM +0200, Michael S. Tsirkin wrote:
> On Thu, Feb 02, 2017 at 11:47:57AM -0500, Laine Stump wrote:
> > On 02/02/2017 10:06 AM, Daniel P. Berrange wrote:
> > > On Thu, Feb 02, 2017 at 03:14:01PM +0100, Maxime Coquelin wrote:
> > > > 
> > > > On 02/01/2017 12:41 PM, Daniel P. Berrange wrote:
> > > > > It depends where / how in OVS it needs to be set. The only stuff 
> > > > > libvirt
> > > > > does with OVS is to run 'add-port' and 'del-port' commands via the ovs
> > > > > cli tool.
> > 
> > (aside note: the code that exec's ovs-vsctl was written back when there was
> > no standardized API for performing such operations. libvirt would prefer to
> > not exec external programs though, and I've heard that OVS may now have an
> > official API of some sort for doing things like this (maybe via netlink or
> > dbus or something?) If that's the case, can someone point me in the right
> > direction?)
> > 
> > > > >   We pass through arguments from the port profile stored in the
> > > > > XML config.
> > > > > 
> > > > >
> > > > >  
> > > > >  
> > > > > > > > > interfaceid='09b11c53-8b5c-4eeb-8f00-d84eaa0aaa4f'/>
> > > > >  
> > > > >
> > > > > 
> > > > > eg those things in  get passed as cli args to the 
> > > > > 'add-port'
> > > > > command. Soo if add-port needs this new version string, then we'd need
> > > > > to add the version to the openvswitch virtualport XML.
> > > > > 
> > > > > If the version is provided to OVS in a different command, then it 
> > > > > would
> > > > > probably be outside scope of libvirt.
> > > > I think it would make sense to be a parameter of the add-port command.
> > > > But it would be for vhost-user related add-port command, I didn't find
> > > > where/if this is managed in libvirt XML.
> > > For vhost-user, libvirt does not have any interaction with OVS at
> > > all. If the thing that's using the vhost-user UNIX socket, in turn
> > > connects to OVS, that's outside scope of libvirt. IOW, for vhost-user
> > > OVS it seems like that job is for Nova / os-vif to solve.
> > 
> > This brings up another tangentially related question I came up against last
> > night - qemu now has an option to report the host's MTU to the guest for
> > virtio and vhost-user interfaces, and Michal Privoznik recently pushed
> > patches to set the MTU sent to the guest via an explicit  in
> > libvirt's interface config:
> > 
> >https://bugzilla.redhat.com/1408701
> > 
> > But it would be much nicer if libvirt could learn the MTU of [that stuff at
> > the other end of the unix socket] without requiring intervention in
> > libvirt's config. For example, I'm just now testing patches for tap-based
> > interfaces (connecting to Linux host bridges or OVS switches) that query the
> > current MTU of the bridge and report that to qemu; this eliminates the
> > burden of configuring each interface of each guest individually (and
> > changing that config in all those places if someone ever wants to change the
> > MTU of the bridge).
> > 
> > As Dan says, though, libvirt's only interaction in the case of vhost-user is
> > with the unix socket. Is there any way to learn what is the appropriate MTU
> > from OVS in these cases? Or must Nova (or ovirt or some poor user) set that
> > up in the libvirt config for every single interface?
> 
> We could add commands for all kind of queries to the vhost-user
> protocol.  libvirt would have to learn the vhost-user protocol though.
> Interested?

No, I really don't think libvirt should implement the vhost-user protocol

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://entangle-photo.org   -o-http://search.cpan.org/~danberr/ :|
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [libvirt] [RFC] Vhost-user backends cross-version migration support

2017-02-02 Thread Maxime Coquelin



On 02/02/2017 06:09 PM, Michael S. Tsirkin wrote:

On Thu, Feb 02, 2017 at 11:47:57AM -0500, Laine Stump wrote:

On 02/02/2017 10:06 AM, Daniel P. Berrange wrote:

On Thu, Feb 02, 2017 at 03:14:01PM +0100, Maxime Coquelin wrote:


On 02/01/2017 12:41 PM, Daniel P. Berrange wrote:

It depends where / how in OVS it needs to be set. The only stuff libvirt
does with OVS is to run 'add-port' and 'del-port' commands via the ovs
cli tool.


(aside note: the code that exec's ovs-vsctl was written back when there was
no standardized API for performing such operations. libvirt would prefer to
not exec external programs though, and I've heard that OVS may now have an
official API of some sort for doing things like this (maybe via netlink or
dbus or something?) If that's the case, can someone point me in the right
direction?)


  We pass through arguments from the port profile stored in the
XML config.

   
 
 
   
 
   

eg those things in  get passed as cli args to the 'add-port'
command. Soo if add-port needs this new version string, then we'd need
to add the version to the openvswitch virtualport XML.

If the version is provided to OVS in a different command, then it would
probably be outside scope of libvirt.

I think it would make sense to be a parameter of the add-port command.
But it would be for vhost-user related add-port command, I didn't find
where/if this is managed in libvirt XML.

For vhost-user, libvirt does not have any interaction with OVS at
all. If the thing that's using the vhost-user UNIX socket, in turn
connects to OVS, that's outside scope of libvirt. IOW, for vhost-user
OVS it seems like that job is for Nova / os-vif to solve.


This brings up another tangentially related question I came up against last
night - qemu now has an option to report the host's MTU to the guest for
virtio and vhost-user interfaces, and Michal Privoznik recently pushed
patches to set the MTU sent to the guest via an explicit  in
libvirt's interface config:

   https://bugzilla.redhat.com/1408701

But it would be much nicer if libvirt could learn the MTU of [that stuff at
the other end of the unix socket] without requiring intervention in
libvirt's config. For example, I'm just now testing patches for tap-based
interfaces (connecting to Linux host bridges or OVS switches) that query the
current MTU of the bridge and report that to qemu; this eliminates the
burden of configuring each interface of each guest individually (and
changing that config in all those places if someone ever wants to change the
MTU of the bridge).

As Dan says, though, libvirt's only interaction in the case of vhost-user is
with the unix socket. Is there any way to learn what is the appropriate MTU
from OVS in these cases? Or must Nova (or ovirt or some poor user) set that
up in the libvirt config for every single interface?


We could add commands for all kind of queries to the vhost-user
protocol.  libvirt would have to learn the vhost-user protocol though.
Interested?



I think it could be possible to query the MTU value from the OVS DB
using its JSON RPC-like API, but this is something I haven't tried.
I guess it would need to resolve the ovs interface from the vhost-user
socket path.

Can people familiar with OVS confirm this is something possible?

Regards,
Maxime
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [libvirt] [RFC] Vhost-user backends cross-version migration support

2017-02-02 Thread Michael S. Tsirkin
On Thu, Feb 02, 2017 at 05:10:28PM +, Daniel P. Berrange wrote:
> On Thu, Feb 02, 2017 at 06:21:55PM +0200, Michael S. Tsirkin wrote:
> > On Thu, Feb 02, 2017 at 03:06:21PM +, Daniel P. Berrange wrote:
> > > On Thu, Feb 02, 2017 at 03:14:01PM +0100, Maxime Coquelin wrote:
> > > > 
> > > > 
> > > > On 02/01/2017 12:41 PM, Daniel P. Berrange wrote:
> > > > > 
> > > > > It depends where / how in OVS it needs to be set. The only stuff 
> > > > > libvirt
> > > > > does with OVS is to run 'add-port' and 'del-port' commands via the ovs
> > > > > cli tool. We pass through arguments from the port profile stored in 
> > > > > the
> > > > > XML config.
> > > > > 
> > > > >   
> > > > > 
> > > > > 
> > > > >> > > > interfaceid='09b11c53-8b5c-4eeb-8f00-d84eaa0aaa4f'/>
> > > > > 
> > > > >   
> > > > > 
> > > > > eg those things in  get passed as cli args to the 
> > > > > 'add-port'
> > > > > command. Soo if add-port needs this new version string, then we'd need
> > > > > to add the version to the openvswitch virtualport XML.
> > > > > 
> > > > > If the version is provided to OVS in a different command, then it 
> > > > > would
> > > > > probably be outside scope of libvirt.
> > > > 
> > > > I think it would make sense to be a parameter of the add-port command.
> > > > But it would be for vhost-user related add-port command, I didn't find
> > > > where/if this is managed in libvirt XML.
> > > 
> > > For vhost-user, libvirt does not have any interaction with OVS at
> > > all. If the thing that's using the vhost-user UNIX socket, in turn
> > > connects to OVS, that's outside scope of libvirt. IOW, for vhost-user
> > > OVS it seems like that job is for Nova / os-vif to solve.
> > 
> > I don't think they currently understand the issues involved in
> > cross-version migration though.  This is a complex subject and easy to
> > get wrong.  It would be significantly better to figure it out at libvirt
> > level since it already deals with cross-version migration issues.
> 
> Libvirt considers vhost-user to be a blackbox though - it just exposes
> a UNIX socket, and whatever is on the other end is completely opaque.
> The fact that the other end might plumb the data stream into openvswitch
> is not something libvirt should know, as we don't want to end up having
> to add custom code to libvirt for every different vhost-user server
> impl.
> 
> IOW, if the version str can be passed to QEMU, and then onto vhost-user
> backend in QEMU, then libvirt can be involved. If the version str has
> to be given to openvswitch that's not for libvirt to deall with.
> 
> Regards,
> Daniel

It's not just about passing it to QEMU. The following is needed:
- need to query version when creating VM/device
- need to query supported versions when migrating VM


> -- 
> |: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
> |: http://libvirt.org  -o- http://virt-manager.org :|
> |: http://entangle-photo.org   -o-http://search.cpan.org/~danberr/ :|
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [libvirt] [RFC] Vhost-user backends cross-version migration support

2017-02-02 Thread Daniel P. Berrange
On Thu, Feb 02, 2017 at 07:20:35PM +0200, Michael S. Tsirkin wrote:
> On Thu, Feb 02, 2017 at 05:10:28PM +, Daniel P. Berrange wrote:
> > On Thu, Feb 02, 2017 at 06:21:55PM +0200, Michael S. Tsirkin wrote:
> > > On Thu, Feb 02, 2017 at 03:06:21PM +, Daniel P. Berrange wrote:
> > > > On Thu, Feb 02, 2017 at 03:14:01PM +0100, Maxime Coquelin wrote:
> > > > > 
> > > > > 
> > > > > On 02/01/2017 12:41 PM, Daniel P. Berrange wrote:
> > > > > > 
> > > > > > It depends where / how in OVS it needs to be set. The only stuff 
> > > > > > libvirt
> > > > > > does with OVS is to run 'add-port' and 'del-port' commands via the 
> > > > > > ovs
> > > > > > cli tool. We pass through arguments from the port profile stored in 
> > > > > > the
> > > > > > XML config.
> > > > > > 
> > > > > >   
> > > > > > 
> > > > > > 
> > > > > >> > > > > interfaceid='09b11c53-8b5c-4eeb-8f00-d84eaa0aaa4f'/>
> > > > > > 
> > > > > >   
> > > > > > 
> > > > > > eg those things in  get passed as cli args to the 
> > > > > > 'add-port'
> > > > > > command. Soo if add-port needs this new version string, then we'd 
> > > > > > need
> > > > > > to add the version to the openvswitch virtualport XML.
> > > > > > 
> > > > > > If the version is provided to OVS in a different command, then it 
> > > > > > would
> > > > > > probably be outside scope of libvirt.
> > > > > 
> > > > > I think it would make sense to be a parameter of the add-port command.
> > > > > But it would be for vhost-user related add-port command, I didn't find
> > > > > where/if this is managed in libvirt XML.
> > > > 
> > > > For vhost-user, libvirt does not have any interaction with OVS at
> > > > all. If the thing that's using the vhost-user UNIX socket, in turn
> > > > connects to OVS, that's outside scope of libvirt. IOW, for vhost-user
> > > > OVS it seems like that job is for Nova / os-vif to solve.
> > > 
> > > I don't think they currently understand the issues involved in
> > > cross-version migration though.  This is a complex subject and easy to
> > > get wrong.  It would be significantly better to figure it out at libvirt
> > > level since it already deals with cross-version migration issues.
> > 
> > Libvirt considers vhost-user to be a blackbox though - it just exposes
> > a UNIX socket, and whatever is on the other end is completely opaque.
> > The fact that the other end might plumb the data stream into openvswitch
> > is not something libvirt should know, as we don't want to end up having
> > to add custom code to libvirt for every different vhost-user server
> > impl.
> > 
> > IOW, if the version str can be passed to QEMU, and then onto vhost-user
> > backend in QEMU, then libvirt can be involved. If the version str has
> > to be given to openvswitch that's not for libvirt to deall with.
> > 
> > Regards,
> > Daniel
> 
> It's not just about passing it to QEMU. The following is needed:
> - need to query version when creating VM/device
> - need to query supported versions when migrating VM

Those are both things that nova can do, since it knows what the vhost-user
device in question is connected to, and so can query the versions, and check
versions before triggering migration with libvirt

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://entangle-photo.org   -o-http://search.cpan.org/~danberr/ :|
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [libvirt] [RFC] Vhost-user backends cross-version migration support

2017-02-02 Thread Michael S. Tsirkin
On Thu, Feb 02, 2017 at 05:29:08PM +, Daniel P. Berrange wrote:
> On Thu, Feb 02, 2017 at 07:20:35PM +0200, Michael S. Tsirkin wrote:
> > On Thu, Feb 02, 2017 at 05:10:28PM +, Daniel P. Berrange wrote:
> > > On Thu, Feb 02, 2017 at 06:21:55PM +0200, Michael S. Tsirkin wrote:
> > > > On Thu, Feb 02, 2017 at 03:06:21PM +, Daniel P. Berrange wrote:
> > > > > On Thu, Feb 02, 2017 at 03:14:01PM +0100, Maxime Coquelin wrote:
> > > > > > 
> > > > > > 
> > > > > > On 02/01/2017 12:41 PM, Daniel P. Berrange wrote:
> > > > > > > 
> > > > > > > It depends where / how in OVS it needs to be set. The only stuff 
> > > > > > > libvirt
> > > > > > > does with OVS is to run 'add-port' and 'del-port' commands via 
> > > > > > > the ovs
> > > > > > > cli tool. We pass through arguments from the port profile stored 
> > > > > > > in the
> > > > > > > XML config.
> > > > > > > 
> > > > > > >   
> > > > > > > 
> > > > > > > 
> > > > > > >> > > > > > interfaceid='09b11c53-8b5c-4eeb-8f00-d84eaa0aaa4f'/>
> > > > > > > 
> > > > > > >   
> > > > > > > 
> > > > > > > eg those things in  get passed as cli args to the 
> > > > > > > 'add-port'
> > > > > > > command. Soo if add-port needs this new version string, then we'd 
> > > > > > > need
> > > > > > > to add the version to the openvswitch virtualport XML.
> > > > > > > 
> > > > > > > If the version is provided to OVS in a different command, then it 
> > > > > > > would
> > > > > > > probably be outside scope of libvirt.
> > > > > > 
> > > > > > I think it would make sense to be a parameter of the add-port 
> > > > > > command.
> > > > > > But it would be for vhost-user related add-port command, I didn't 
> > > > > > find
> > > > > > where/if this is managed in libvirt XML.
> > > > > 
> > > > > For vhost-user, libvirt does not have any interaction with OVS at
> > > > > all. If the thing that's using the vhost-user UNIX socket, in turn
> > > > > connects to OVS, that's outside scope of libvirt. IOW, for vhost-user
> > > > > OVS it seems like that job is for Nova / os-vif to solve.
> > > > 
> > > > I don't think they currently understand the issues involved in
> > > > cross-version migration though.  This is a complex subject and easy to
> > > > get wrong.  It would be significantly better to figure it out at libvirt
> > > > level since it already deals with cross-version migration issues.
> > > 
> > > Libvirt considers vhost-user to be a blackbox though - it just exposes
> > > a UNIX socket, and whatever is on the other end is completely opaque.
> > > The fact that the other end might plumb the data stream into openvswitch
> > > is not something libvirt should know, as we don't want to end up having
> > > to add custom code to libvirt for every different vhost-user server
> > > impl.
> > > 
> > > IOW, if the version str can be passed to QEMU, and then onto vhost-user
> > > backend in QEMU, then libvirt can be involved. If the version str has
> > > to be given to openvswitch that's not for libvirt to deall with.
> > > 
> > > Regards,
> > > Daniel
> > 
> > It's not just about passing it to QEMU. The following is needed:
> > - need to query version when creating VM/device
> > - need to query supported versions when migrating VM
> 
> Those are both things that nova can do, since it knows what the vhost-user
> device in question is connected to, and so can query the versions, and check
> versions before triggering migration with libvirt
> 
> Regards,
> Daniel

It can, but then it will need to query libvirt or source for the version
string since it's in the XML.

> -- 
> |: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
> |: http://libvirt.org  -o- http://virt-manager.org :|
> |: http://entangle-photo.org   -o-http://search.cpan.org/~danberr/ :|
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 3/4 v4] ovsdb-idl: IDL Compound Indexes Implementation

2017-02-02 Thread Ben Pfaff
On Wed, Dec 28, 2016 at 07:41:48PM +, Rodriguez Betancourt, Esteban wrote:
> In the C IDL, allows to create multicolumn indexes in the
> tables, that are keep synched with the data in the replica.
> 
> Signed-off-by: Esteban Rodriguez Betancourt 

"git am" refuses to apply this; it says that the patch is corrupt.

Will you rebase and repost the series?  Perhaps you should push it to a
branch somewhere, too, in case there's a similar problem with the
repost.

Thanks,

Ben.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] docs: Add OVS and OVN headings to pages.

2017-02-02 Thread Ben Pfaff
On Thu, Feb 02, 2017 at 02:02:34AM -0500, Russell Bryant wrote:
> Update the "deep dive" and "howto" pages with headings that more clearly
> indicate the separate lists of OVS or OVN content.  Also add a link to
> ovn-architecture from the "deep dive" page as it seems quite relevant
> there.
> 
> Signed-off-by: Russell Bryant 

Acked-by: Ben Pfaff 
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] ovn: Add missing netdev_close in setup_qos.

2017-02-02 Thread Ben Pfaff
On Thu, Feb 02, 2017 at 02:04:07AM -0500, Russell Bryant wrote:
> We missed calling netdev_close in a couple of places.  One was in an error
> condition rarely hit.  The second was just introduced and would be hit in
> all cases where QoS is not in use.
> 
> Fixes: dc2dab6e6de5 ("ovn-controller: Configure interface QoS only if it 
> would actually be used.")
> Signed-off-by: Russell Bryant 

Thank you!

Acked-by: Ben Pfaff 
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 2/2] ovn-controller: Configure interface QoS only if it would actually be used.

2017-02-02 Thread Ben Pfaff
On Thu, Feb 02, 2017 at 01:33:36AM -0500, Russell Bryant wrote:
> On Wed, Feb 1, 2017 at 12:22 PM, Ben Pfaff  wrote:
> 
> > Until now, ovn-controller has unconditionally configured linux-htb on
> > physical interfaces.  QoS is pretty much always trouble, but it's even more
> > trouble if we set it up for no good reason.  We received a bug report, in
> > particular, that doing this disrupts connectivity in Docker.
> >
> 
> This isn't related to this commit, but last time I was looking at this I
> noticed that QoS also doesn't work when traffic leaves the host on a
> physical network (via a logical switch with a localnet port) instead of via
> a tunnel.  It also doesn't work between ports on the same host, but I'm not
> sure if that really matters.

Does this just come down to adding a set_queue() in the right place in
the logical or physical flows somewhere?  It would be nice to make this
work properly.

Thank you for finding the leak.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 1/2] ovn-controller: Fix memory leak and bug in setup_qos().

2017-02-02 Thread Ben Pfaff
On Thu, Feb 02, 2017 at 12:44:02PM +0530, Numan Siddique wrote:
> On Wed, Feb 1, 2017 at 10:52 PM, Ben Pfaff  wrote:
> 
> > The caller of netdev_get_qos() is responsible for freeing its 'details'
> > smap.
> >
> >
> Is this patch a good candidate for back porting to 2.6 branch since it's
> fixing memory leaks ?

After a quick look, I don't think that branch-2.6 has this bug.  If I'm
mistaken, please let me know and I'll look again.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] ovn-trace: Style improvements.

2017-02-02 Thread Ben Pfaff
On Thu, Feb 02, 2017 at 02:18:31AM -0500, Russell Bryant wrote:
> On Fri, Jan 13, 2017 at 2:30 PM, Ben Pfaff  wrote:
> 
> > Signed-off-by: Ben Pfaff 
> > ---
> >  ovn/utilities/ovn-trace.c | 12 +---
> >  1 file changed, 9 insertions(+), 3 deletions(-)
> >
> 
> Acked-by: Russell Bryant 

Thanks, applied to master.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] Openvswitch mode

2017-02-02 Thread Ben Pfaff
If your definition of "hybrid switch" is "implements NORMAL action",
then OVS is a hybrid switch.

On Thu, Feb 02, 2017 at 10:52:13AM +0530, Sumitha Ramesh wrote:
> Hi Ben,
> 
> First of all thanks for your reply.
> When flow with NORMAL action is added, OVS is performing normal switching.
> When Openflow controller is connected to OVS, it performs openflow pipeline
> process.
> So I assumed OVS is a openflow hybrid switch.
> Please correct me if I'm wrong.
> 
> On Wed, Feb 1, 2017 at 9:57 PM, Ben Pfaff  wrote:
> 
> > On Wed, Feb 01, 2017 at 06:29:04PM +0530, Sumitha Ramesh wrote:
> > > Is OVS supports both openflow-hybrid mode and openflow mode? If yes, how
> > to
> > > configure OVS to openflow-hybrid mode?
> > > Im new to OVS, any help will be appreciated.
> >
> > OVS is just OVS and doesn't have two modes.
> >
> > (Furthermore, the Open Networking Foundation's Hybrid Working Group
> > defined a couple of different categories of hybrid switches, and OVS did
> > not fall into any of the categories.  So it's not even clear that OVS is
> > a hybrid switch.)
> >
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] tests: MPLS translate causes issues on slow systems

2017-02-02 Thread Ben Pfaff
On Wed, Feb 01, 2017 at 02:39:36PM +, Alin Serdean wrote:
> On slow systems ofproto/trace in combination with a recirculation ID
> causes issues because the flow is evicted before the second packet
> can reach it.

Thanks, applied to master, branch-2.7, and branch-2.6.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2] sflow: Expose ethernet stats via sFlow

2017-02-02 Thread Neil McKee
Robert,

It's great to get the multicast and broadcast counters in there.

I do wonder if it's worth including the whole ethernet stats block just to
distinguish alignment / FCS /  frame-too-long errors, given that total
errors is already a counter in the generic block.   I know that some follow
symbolErrors particularly carefully.  I think because those are good for
early-warning of fiber/laser degradation.  Any chance we could get those
too?

So it's a thumbs-up from me,  with the hope that we can add symbolErrors
later.

Neil


On Wed, Feb 1, 2017 at 11:35 PM, Robert Wojciechowicz <
robertx.wojciechow...@intel.com> wrote:

> On Thu, Dec 08, 2016 at 12:23:15PM +, Robert Wojciechowicz wrote:
> > Expose existing netdev stats via sFlow.
> > Export sFlow ETHERNET structure with available counters.
> > Map existing stats to counters in the GENERIC INTERFACE
> > sFlow structure.
> > Adjust unit test to accommodate these new counters.
> >
>
> any comments to this patch?
>
> Br,
> Robert
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH] system-ovn.at: Fix race conditions.

2017-02-02 Thread Gurucharan Shetty
The code to wait for a particular type of flow
in ovs-vswitchd was not specific enough. This commit
changes that and to be doubly sure, also uses the
sync command.

Reported-by: Andy Zhou 
Reported-by: Joe Stringer 
Signed-off-by: Gurucharan Shetty 
---
 tests/system-ovn.at | 31 ++-
 1 file changed, 22 insertions(+), 9 deletions(-)

diff --git a/tests/system-ovn.at b/tests/system-ovn.at
index 638ac56..7296550 100644
--- a/tests/system-ovn.at
+++ b/tests/system-ovn.at
@@ -97,7 +97,8 @@ ovn-nbctl -- --id=@nat create nat type="snat" 
logical_ip=192.168.2.2 \
 external_ip=30.0.0.1 -- add logical_router R2 nat @nat
 
 # wait for ovn-controller to catch up.
-OVS_WAIT_UNTIL([ovs-ofctl dump-flows br-int | grep ct\( | grep nat])
+ovn-nbctl --wait=hv sync
+OVS_WAIT_UNTIL([ovs-ofctl dump-flows br-int | grep 'nat(src=30.0.0.1)'])
 
 # 'alice1' should be able to ping 'foo1' directly.
 NS_CHECK_EXEC([alice1], [ping -q -c 3 -i 0.3 -w 2 192.168.1.2 | FORMAT_PING], \
@@ -250,7 +251,8 @@ ovn-nbctl lsp-add alice alice1 \
 ovn-nbctl -- --id=@nat create nat type="snat" logical_ip=192.168.1.2 \
 external_ip=172.16.1.1 -- add logical_router R2 nat @nat
 
-OVS_WAIT_UNTIL([ovs-ofctl dump-flows br-int | grep ct\( | grep nat])
+ovn-nbctl --wait=hv sync
+OVS_WAIT_UNTIL([ovs-ofctl dump-flows br-int | grep 'nat(src=172.16.1.1)'])
 
 # South-North SNAT: 'foo1' pings 'alice1'. But 'alice1' receives traffic
 # from 172.16.1.1
@@ -420,7 +422,8 @@ ovn-nbctl -- --id=@nat create nat type="snat" 
logical_ip=192.168.2.2 \
 external_ip=30.0.0.4 -- add logical_router R3 nat @nat
 
 # wait for ovn-controller to catch up.
-OVS_WAIT_UNTIL([ovs-ofctl dump-flows br-int | grep ct\( | grep nat])
+ovn-nbctl --wait=hv sync
+OVS_WAIT_UNTIL([ovs-ofctl dump-flows br-int | grep 'nat(src=30.0.0.4)'])
 
 # North-South DNAT: 'alice1' should be able to ping 'foo1' via 30.0.0.2
 NS_CHECK_EXEC([alice1], [ping -q -c 3 -i 0.3 -w 2 30.0.0.2 | FORMAT_PING], \
@@ -580,7 +583,9 @@ ovn-nbctl add logical_switch foo load_balancer $uuid
 ovn-nbctl set load_balancer $uuid 
vips:'"30.0.0.2:8000"'='"172.16.1.2:80,172.16.1.3:80,172.16.1.4:80"'
 
 # Wait for ovn-controller to catch up.
-OVS_WAIT_UNTIL([ovs-ofctl -O OpenFlow13 dump-groups br-int | grep ct\(])
+ovn-nbctl --wait=hv sync
+OVS_WAIT_UNTIL([ovs-ofctl -O OpenFlow13 dump-groups br-int | \
+grep 'nat(dst=172.16.1.4:80)'])
 
 # Start webservers in 'bar1', 'bar2' and 'bar3'.
 OVS_START_L7([bar1], [http])
@@ -699,7 +704,9 @@ ovn-nbctl set logical_switch foo load_balancer=$uuid
 ovn-nbctl set load_balancer $uuid 
vips:'"30.0.0.2:8000"'='"192.168.1.3:80,192.168.1.4:80,192.168.1.5:80"'
 
 # Wait for ovn-controller to catch up.
-OVS_WAIT_UNTIL([ovs-ofctl -O OpenFlow13 dump-groups br-int | grep ct\(])
+ovn-nbctl --wait=hv sync
+OVS_WAIT_UNTIL([ovs-ofctl -O OpenFlow13 dump-groups br-int | \
+grep 'nat(dst=192.168.1.5:80)'])
 
 # Start webservers in 'foo2', 'foo3' and 'foo4'.
 OVS_START_L7([foo2], [http])
@@ -846,7 +853,9 @@ ovn-nbctl set logical_router R2 load_balancer=$uuid
 ovn-nbctl set load_balancer $uuid 
vips:'"30.0.0.2:8000"'='"192.168.1.2:80,192.168.2.2:80"'
 
 # Wait for ovn-controller to catch up.
-OVS_WAIT_UNTIL([ovs-ofctl -O OpenFlow13 dump-groups br-int | grep ct\(])
+ovn-nbctl --wait=hv sync
+OVS_WAIT_UNTIL([ovs-ofctl -O OpenFlow13 dump-groups br-int | \
+grep 'nat(dst=192.168.2.2:80)'])
 
 # Start webservers in 'foo1', 'bar1'.
 OVS_START_L7([foo1], [http])
@@ -1020,7 +1029,9 @@ ovn-nbctl set logical_router R2 load_balancer=$uuid
 ovn-nbctl set logical_router R3 load_balancer=$uuid
 
 # Wait for ovn-controller to catch up.
-OVS_WAIT_UNTIL([ovs-ofctl -O OpenFlow13 dump-groups br-int | grep ct\(])
+ovn-nbctl --wait=hv sync
+OVS_WAIT_UNTIL([ovs-ofctl -O OpenFlow13 dump-groups br-int | \
+grep 'nat(dst=192.168.2.2)'])
 
 # Start webservers in 'foo1', 'bar1'.
 OVS_START_L7([foo1], [http])
@@ -1151,7 +1162,8 @@ AT_CHECK([ovn-nbctl lr-nat-add R1 dnat_and_snat 
172.16.1.4 192.168.1.3 foo2 00:0
 # Add a SNAT rule
 AT_CHECK([ovn-nbctl lr-nat-add R1 snat 172.16.1.1 192.168.0.0/16])
 
-OVS_WAIT_UNTIL([ovs-ofctl dump-flows br-int | grep ct\( | grep nat])
+ovn-nbctl --wait=hv sync
+OVS_WAIT_UNTIL([ovs-ofctl dump-flows br-int | grep 'nat(src=172.16.1.1)'])
 
 # North-South DNAT: 'alice1' pings 'foo1' using 172.16.1.3.
 NS_CHECK_EXEC([alice1], [ping -q -c 3 -i 0.3 -w 2 172.16.1.3 | FORMAT_PING], \
@@ -1297,7 +1309,8 @@ AT_CHECK([ovn-nbctl lr-nat-add R1 dnat_and_snat 
172.16.1.4 192.168.2.2 bar1 00:0
 # Add a SNAT rule
 AT_CHECK([ovn-nbctl lr-nat-add R1 snat 172.16.1.1 192.168.0.0/16])
 
-OVS_WAIT_UNTIL([ovs-ofctl dump-flows br-int | grep ct\( | grep nat])
+ovn-nbctl --wait=hv sync
+OVS_WAIT_UNTIL([ovs-ofctl dump-flows br-int | grep 'nat(src=172.16.1.1)'])
 
 echo "-- hv dump --"
 ovs-ofctl show br-int
-- 
1.9.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2] ovn-controller: Provide the option to set Encap.options:csum

2017-02-02 Thread Ben Pfaff
On Thu, Feb 02, 2017 at 10:40:31AM -0500, Russell Bryant wrote:
> On Thu, Feb 2, 2017 at 10:32 AM, Numan Siddique  wrote:
> 
> > On Wed, Feb 1, 2017 at 4:31 AM, Ben Pfaff  wrote:
> >
> > > On Sun, Jan 15, 2017 at 12:36:09PM +0530, Numan Siddique wrote:
> > > > ovn-controller by default enables tunnel encapsulation checksums
> > > > for geneve tunnels. With this patch user can set the desired value
> > > > in Open_vSwitch.external_ids:ovn_encap_csum.
> > > >
> > > > This option will be useful in cases where enabling tunnel
> > > > encapsulation checksums incur significant performance loss due to
> > > > limitations in checksum offloading capabilities of the nics.
> > > >
> > > > Signed-off-by: Numan Siddique 
> > >
> > > Thanks.  I applied this to master.
> > >
> >
> > ​Thanks Ben for the review and applying it. Can this be applied to branch
> > 2.7 as well ?
> >
> 
> I'm supportive of backporting it.
> 
> Just to elaborate on why we care ...
> 
> With checksums enabled, we are not able to make use of the
> tx-udp_tnl-segmentation offload.  We will be able to leave checksums
> enabled once we also have support for tx-udp_tnl-csum-segmentation, which
> is present in the upstream kernel but not the RHEL kernel yet.  So, right
> now to get the best overall performance, we have to disable checksums (for
> current CentOS / RHEL, anyway).

OK, I applied to branch-2.7 too.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] system-ovn.at: Fix race conditions.

2017-02-02 Thread Ben Pfaff
On Wed, Feb 01, 2017 at 11:52:14PM -0800, Gurucharan Shetty wrote:
> The code to wait for a particular type of flow
> in ovs-vswitchd was not specific enough. This commit
> changes that and to be doubly sure, also uses the
> sync command.
> 
> Reported-by: Andy Zhou 
> Reported-by: Joe Stringer 
> Signed-off-by: Gurucharan Shetty 

Seems reasonable, thank you!

Acked-by: Ben Pfaff 
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [libvirt] [RFC] Vhost-user backends cross-version migration support

2017-02-02 Thread Daniel P. Berrange
On Thu, Feb 02, 2017 at 07:31:49PM +0200, Michael S. Tsirkin wrote:
> On Thu, Feb 02, 2017 at 05:29:08PM +, Daniel P. Berrange wrote:
> > On Thu, Feb 02, 2017 at 07:20:35PM +0200, Michael S. Tsirkin wrote:
> > > On Thu, Feb 02, 2017 at 05:10:28PM +, Daniel P. Berrange wrote:
> > > > On Thu, Feb 02, 2017 at 06:21:55PM +0200, Michael S. Tsirkin wrote:
> > > > > On Thu, Feb 02, 2017 at 03:06:21PM +, Daniel P. Berrange wrote:
> > > > > > On Thu, Feb 02, 2017 at 03:14:01PM +0100, Maxime Coquelin wrote:
> > > > > > > 
> > > > > > > 
> > > > > > > On 02/01/2017 12:41 PM, Daniel P. Berrange wrote:
> > > > > > > > 
> > > > > > > > It depends where / how in OVS it needs to be set. The only 
> > > > > > > > stuff libvirt
> > > > > > > > does with OVS is to run 'add-port' and 'del-port' commands via 
> > > > > > > > the ovs
> > > > > > > > cli tool. We pass through arguments from the port profile 
> > > > > > > > stored in the
> > > > > > > > XML config.
> > > > > > > > 
> > > > > > > >   
> > > > > > > > 
> > > > > > > > 
> > > > > > > >> > > > > > > interfaceid='09b11c53-8b5c-4eeb-8f00-d84eaa0aaa4f'/>
> > > > > > > > 
> > > > > > > >   
> > > > > > > > 
> > > > > > > > eg those things in  get passed as cli args to the 
> > > > > > > > 'add-port'
> > > > > > > > command. Soo if add-port needs this new version string, then 
> > > > > > > > we'd need
> > > > > > > > to add the version to the openvswitch virtualport XML.
> > > > > > > > 
> > > > > > > > If the version is provided to OVS in a different command, then 
> > > > > > > > it would
> > > > > > > > probably be outside scope of libvirt.
> > > > > > > 
> > > > > > > I think it would make sense to be a parameter of the add-port 
> > > > > > > command.
> > > > > > > But it would be for vhost-user related add-port command, I didn't 
> > > > > > > find
> > > > > > > where/if this is managed in libvirt XML.
> > > > > > 
> > > > > > For vhost-user, libvirt does not have any interaction with OVS at
> > > > > > all. If the thing that's using the vhost-user UNIX socket, in turn
> > > > > > connects to OVS, that's outside scope of libvirt. IOW, for 
> > > > > > vhost-user
> > > > > > OVS it seems like that job is for Nova / os-vif to solve.
> > > > > 
> > > > > I don't think they currently understand the issues involved in
> > > > > cross-version migration though.  This is a complex subject and easy to
> > > > > get wrong.  It would be significantly better to figure it out at 
> > > > > libvirt
> > > > > level since it already deals with cross-version migration issues.
> > > > 
> > > > Libvirt considers vhost-user to be a blackbox though - it just exposes
> > > > a UNIX socket, and whatever is on the other end is completely opaque.
> > > > The fact that the other end might plumb the data stream into openvswitch
> > > > is not something libvirt should know, as we don't want to end up having
> > > > to add custom code to libvirt for every different vhost-user server
> > > > impl.
> > > > 
> > > > IOW, if the version str can be passed to QEMU, and then onto vhost-user
> > > > backend in QEMU, then libvirt can be involved. If the version str has
> > > > to be given to openvswitch that's not for libvirt to deall with.
> > > > 
> > > > Regards,
> > > > Daniel
> > > 
> > > It's not just about passing it to QEMU. The following is needed:
> > > - need to query version when creating VM/device
> > > - need to query supported versions when migrating VM
> > 
> > Those are both things that nova can do, since it knows what the vhost-user
> > device in question is connected to, and so can query the versions, and check
> > versions before triggering migration with libvirt
> 
> It can, but then it will need to query libvirt or source for the version
> string since it's in the XML.

No, it wouldn't be in the XML at all. Nova on the source queries what
vhostuser version it has and what the target host has, and can prevent
the migration if they're incompatible. I dont think libvirt has to be
involved at all for this, as all the info can be obtained by nova/os-vif
from the vhostuser impl it has configured.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://entangle-photo.org   -o-http://search.cpan.org/~danberr/ :|
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [libvirt] [RFC] Vhost-user backends cross-version migration support

2017-02-02 Thread Michael S. Tsirkin
On Thu, Feb 02, 2017 at 06:21:55PM +, Daniel P. Berrange wrote:
> On Thu, Feb 02, 2017 at 07:31:49PM +0200, Michael S. Tsirkin wrote:
> > On Thu, Feb 02, 2017 at 05:29:08PM +, Daniel P. Berrange wrote:
> > > On Thu, Feb 02, 2017 at 07:20:35PM +0200, Michael S. Tsirkin wrote:
> > > > On Thu, Feb 02, 2017 at 05:10:28PM +, Daniel P. Berrange wrote:
> > > > > On Thu, Feb 02, 2017 at 06:21:55PM +0200, Michael S. Tsirkin wrote:
> > > > > > On Thu, Feb 02, 2017 at 03:06:21PM +, Daniel P. Berrange wrote:
> > > > > > > On Thu, Feb 02, 2017 at 03:14:01PM +0100, Maxime Coquelin wrote:
> > > > > > > > 
> > > > > > > > 
> > > > > > > > On 02/01/2017 12:41 PM, Daniel P. Berrange wrote:
> > > > > > > > > 
> > > > > > > > > It depends where / how in OVS it needs to be set. The only 
> > > > > > > > > stuff libvirt
> > > > > > > > > does with OVS is to run 'add-port' and 'del-port' commands 
> > > > > > > > > via the ovs
> > > > > > > > > cli tool. We pass through arguments from the port profile 
> > > > > > > > > stored in the
> > > > > > > > > XML config.
> > > > > > > > > 
> > > > > > > > >   
> > > > > > > > > 
> > > > > > > > > 
> > > > > > > > >> > > > > > > > interfaceid='09b11c53-8b5c-4eeb-8f00-d84eaa0aaa4f'/>
> > > > > > > > > 
> > > > > > > > >   
> > > > > > > > > 
> > > > > > > > > eg those things in  get passed as cli args to 
> > > > > > > > > the 'add-port'
> > > > > > > > > command. Soo if add-port needs this new version string, then 
> > > > > > > > > we'd need
> > > > > > > > > to add the version to the openvswitch virtualport XML.
> > > > > > > > > 
> > > > > > > > > If the version is provided to OVS in a different command, 
> > > > > > > > > then it would
> > > > > > > > > probably be outside scope of libvirt.
> > > > > > > > 
> > > > > > > > I think it would make sense to be a parameter of the add-port 
> > > > > > > > command.
> > > > > > > > But it would be for vhost-user related add-port command, I 
> > > > > > > > didn't find
> > > > > > > > where/if this is managed in libvirt XML.
> > > > > > > 
> > > > > > > For vhost-user, libvirt does not have any interaction with OVS at
> > > > > > > all. If the thing that's using the vhost-user UNIX socket, in turn
> > > > > > > connects to OVS, that's outside scope of libvirt. IOW, for 
> > > > > > > vhost-user
> > > > > > > OVS it seems like that job is for Nova / os-vif to solve.
> > > > > > 
> > > > > > I don't think they currently understand the issues involved in
> > > > > > cross-version migration though.  This is a complex subject and easy 
> > > > > > to
> > > > > > get wrong.  It would be significantly better to figure it out at 
> > > > > > libvirt
> > > > > > level since it already deals with cross-version migration issues.
> > > > > 
> > > > > Libvirt considers vhost-user to be a blackbox though - it just exposes
> > > > > a UNIX socket, and whatever is on the other end is completely opaque.
> > > > > The fact that the other end might plumb the data stream into 
> > > > > openvswitch
> > > > > is not something libvirt should know, as we don't want to end up 
> > > > > having
> > > > > to add custom code to libvirt for every different vhost-user server
> > > > > impl.
> > > > > 
> > > > > IOW, if the version str can be passed to QEMU, and then onto 
> > > > > vhost-user
> > > > > backend in QEMU, then libvirt can be involved. If the version str has
> > > > > to be given to openvswitch that's not for libvirt to deall with.
> > > > > 
> > > > > Regards,
> > > > > Daniel
> > > > 
> > > > It's not just about passing it to QEMU. The following is needed:
> > > > - need to query version when creating VM/device
> > > > - need to query supported versions when migrating VM
> > > 
> > > Those are both things that nova can do, since it knows what the vhost-user
> > > device in question is connected to, and so can query the versions, and 
> > > check
> > > versions before triggering migration with libvirt
> > 
> > It can, but then it will need to query libvirt or source for the version
> > string since it's in the XML.
> 
> No, it wouldn't be in the XML at all. Nova on the source queries what
> vhostuser version it has and what the target host has, and can prevent
> the migration if they're incompatible.

This is not sufficient. Exactly the same as qemu machine type,
this must be preserved from time of install and moved
wherever VM goes.


> I dont think libvirt has to be
> involved at all for this, as all the info can be obtained by nova/os-vif
> from the vhostuser impl it has configured.
> 
> Regards,
> Daniel

Given we are already confused at libvirt level, I find it
highly unlikely upper levels will do the right thing.

Generally I simply don't understand why would we
expose to higher levels something which is fundamentally
not a policy decision.


> -- 
> |: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
> |: http://libvirt.org  -o- http://virt-manager.org :|
> |: http

Re: [ovs-dev] [PATCH] ofp-actions: Fix tunnel meatdata subfield length in flow action

2017-02-02 Thread Ben Pfaff
On Mon, Jan 09, 2017 at 06:09:33PM -0800, Yi-Hung Wei wrote:
> Previously, if a flow action that involves a tunnel metadata subfield is
> dumpped from vswitchd, the replied subfield length in the OXM header is
> filled with the maximum possible field length, instead of the length
> configured in the tunnel TLV mapping table.
> 
> In order to derive the correct length of a tun_metadata subfield, this patch
> passes the tunnel TLV mapping table (struct tun_table) to where we decode
> the flow actions. extract-ofp-actions is updated to pass the tunnel TLV 
> mapping
> table to the flow action decoding functions. Also, a new error code is
> introduced to identify a flow with an invalid tun_metadata ID. Moreover, a
> testcase is added to prevent future regressions.
> 
> Currently, this patch only addresses the usage of tun_metaata for REG_MOVE 
> action.
> Other actions that may use the tun_metadata will be addressed in subsequent 
> patches.
> 
> VMWare-BZ: #1768370
> Reported-by: Harold Lim 
> Signed-off-by: Yi-Hung Wei 

I think I've seen some side discussion on this patch.  Is it still
relevant as is or is some revision already in progress?

In any case, s/meatdata/metadata/ in the subject.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 1/6] datapath-windows: Fix alignment in MapTunAttrToFlowPut

2017-02-02 Thread Ben Pfaff
I see that Guru applied this.

On Tue, Jan 10, 2017 at 04:48:28PM +, Alin Serdean wrote:
> Found by inspection.
> 
> Signed-off-by: Alin Gabriel Serdean 
> ---
>  datapath-windows/ovsext/Flow.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/datapath-windows/ovsext/Flow.c b/datapath-windows/ovsext/Flow.c
> index 2e8b42b..749d83a 100644
> --- a/datapath-windows/ovsext/Flow.c
> +++ b/datapath-windows/ovsext/Flow.c
> @@ -1809,7 +1809,7 @@ MapTunAttrToFlowPut(PNL_ATTR *keyAttrs,
>  }
>  
>  if (tunAttrs[OVS_TUNNEL_KEY_ATTR_OAM]) {
> -destKey->tunKey.flags |= OVS_TNL_F_OAM;
> +destKey->tunKey.flags |= OVS_TNL_F_OAM;
>  }
>  
>  if (tunAttrs[OVS_TUNNEL_KEY_ATTR_GENEVE_OPTS]) {
> -- 
> 2.10.2.windows.1
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 0/6] Fix tunnel information destination port

2017-02-02 Thread Ben Pfaff
On Tue, Jan 10, 2017 at 04:48:28PM +, Alin Serdean wrote:
> Deploying a simple environment using VXLAN encapsulation at the moment
> does not work under Hyper-V. The tunnel information sent
> by the userspace is not fully treated in the datapath.
> 
> To be more specific every packet that tries to be encapsulated via VXLAN 
> tunnel
> triggers the following line:
> https://github.com/openvswitch/ovs/blob/75e2077e0c43224bcca92746b28b01a4936fc101/datapath-windows/ovsext/Flow.c#L1754
> which is followed by a set action fail.
> 
> The actual attribute that triggers the above is: OVS_TUNNEL_KEY_ATTR_TP_DST.
> 
> This series adds support for that attribute and updates the current tunnel 
> implementations.

Patch 1 was already on master.  I've now applied patches 2 through 5 to
master.  Please let me know if you want these on branch-2.7 also.

Thanks,

Ben.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v5 0/5] datapath-windows: Add support for Ipv4 fragments

2017-02-02 Thread Anand Kumar
Add support for maintaining and tracking IPv4 fragments.
This patch add a new file IpFragment.c and IpFragment.h which 
includes Ipv4 fragment related API's.

---
v4->v5:
  - Modified Patch 3 to retain MRU in _OVS_BUFFER_CONTEXT instead of 
using it in ovsForwardingContext with minor changes in rest of the 
patches.
v3->v4:
  - Rebase
  - Acquire read lock for read operations.
v2->v3:
  - using spinlock instead of RW lock.
  - updated log messages, summary, fixed alignment issues.
v1->v2:
  - Patch 4 updated to make it compile for release mode.
---
Anand Kumar (5):
  datapath-windows: Added a new file to support Ipv4 fragments.
  datapath-windows: Added Ipv4 fragments support in Conntrack
  datapath-windows: Retain MRU value in the _OVS_BUFFER_CONTEXT.
  datapath-windows: Updated OvsTcpSegmentNBL to handle IP fragments.
  datapath-windows: Fragment NBL based on MRU size

 datapath-windows/automake.mk   |   2 +
 datapath-windows/ovsext/Actions.c  |  40 ++-
 datapath-windows/ovsext/BufferMgmt.c   | 195 +
 datapath-windows/ovsext/BufferMgmt.h   |  11 +-
 datapath-windows/ovsext/Conntrack.c|  32 ++-
 datapath-windows/ovsext/Conntrack.h|   6 +-
 datapath-windows/ovsext/Debug.h|   3 +-
 datapath-windows/ovsext/DpInternal.h   |   2 +-
 datapath-windows/ovsext/Geneve.c   |   2 +-
 datapath-windows/ovsext/Gre.c  |   2 +-
 datapath-windows/ovsext/IpFragment.c   | 502 +
 datapath-windows/ovsext/IpFragment.h   |  73 +
 datapath-windows/ovsext/Stt.c  |   2 +-
 datapath-windows/ovsext/Switch.c   |   9 +
 datapath-windows/ovsext/User.c |  22 +-
 datapath-windows/ovsext/Vxlan.c|   2 +-
 datapath-windows/ovsext/ovsext.vcxproj |   2 +
 17 files changed, 830 insertions(+), 77 deletions(-)
 create mode 100644 datapath-windows/ovsext/IpFragment.c
 create mode 100644 datapath-windows/ovsext/IpFragment.h

-- 
2.9.3.windows.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v5 2/5] datapath-windows: Added Ipv4 fragments support in Conntrack

2017-02-02 Thread Anand Kumar
This patch adds support for tracking Ipv4 fragments in conntrack module.
Individual fragments are not tracked and are consumed by the
fragmentation/reassembly. Only the reassembled Ipv4 datagram is tracked and
treated as a single ct entry.

Signed-off-by: Anand Kumar 
---
v4->v5:
- Removed MRU argument from function declarations as MRU is now retained
in _OVS_BUFFER_CONTEXT.
v3->v4: No Change
v2->v3:
- Updated log messages and fixed alignment.
v1->v2: No change
---
 datapath-windows/ovsext/Actions.c   | 13 +
 datapath-windows/ovsext/Conntrack.c | 32 +---
 datapath-windows/ovsext/Conntrack.h |  6 +-
 3 files changed, 39 insertions(+), 12 deletions(-)

diff --git a/datapath-windows/ovsext/Actions.c 
b/datapath-windows/ovsext/Actions.c
index fd952e8..3dea9a9 100644
--- a/datapath-windows/ovsext/Actions.c
+++ b/datapath-windows/ovsext/Actions.c
@@ -2032,11 +2032,16 @@ OvsDoExecuteActions(POVS_SWITCH_CONTEXT switchContext,
 }
 }
 
-status = OvsExecuteConntrackAction(ovsFwdCtx.curNbl, layers,
-   key, (const PNL_ATTR)a);
+status = OvsExecuteConntrackAction(switchContext, 
&ovsFwdCtx.curNbl,
+   ovsFwdCtx.completionList,
+   
ovsFwdCtx.fwdDetail->SourcePortId,
+   layers, key, (const PNL_ATTR)a);
 if (status != NDIS_STATUS_SUCCESS) {
-OVS_LOG_ERROR("CT Action failed");
-dropReason = L"OVS-conntrack action failed";
+/* Pending NBLs are consumed by Defragmentation. */
+if (status != NDIS_STATUS_PENDING) {
+OVS_LOG_ERROR("CT Action failed");
+dropReason = L"OVS-conntrack action failed";
+}
 goto dropit;
 }
 break;
diff --git a/datapath-windows/ovsext/Conntrack.c 
b/datapath-windows/ovsext/Conntrack.c
index d1be480..6ffb55e 100644
--- a/datapath-windows/ovsext/Conntrack.c
+++ b/datapath-windows/ovsext/Conntrack.c
@@ -15,6 +15,7 @@
  */
 
 #include "Conntrack.h"
+#include "IpFragment.h"
 #include "Jhash.h"
 #include "PacketParser.h"
 #include "Event.h"
@@ -312,13 +313,23 @@ OvsCtEntryExpired(POVS_CT_ENTRY entry)
 }
 
 static __inline NDIS_STATUS
-OvsDetectCtPacket(OvsFlowKey *key)
+OvsDetectCtPacket(POVS_SWITCH_CONTEXT switchContext,
+  PNET_BUFFER_LIST *curNbl,
+  OvsCompletionList *completionList,
+  NDIS_SWITCH_PORT_ID sourcePort,
+  OvsFlowKey *key,
+  PNET_BUFFER_LIST *newNbl)
 {
 /* Currently we support only Unfragmented TCP packets */
 switch (ntohs(key->l2.dlType)) {
 case ETH_TYPE_IPV4:
 if (key->ipKey.nwFrag != OVS_FRAG_TYPE_NONE) {
-return NDIS_STATUS_NOT_SUPPORTED;
+return OvsProcessIpv4Fragment(switchContext,
+  curNbl,
+  completionList,
+  sourcePort,
+  key->tunKey.tunnelId,
+  newNbl);
 }
 if (key->ipKey.nwProto == IPPROTO_TCP
 || key->ipKey.nwProto == IPPROTO_UDP
@@ -684,11 +695,16 @@ OvsCtExecute_(PNET_BUFFER_LIST curNbl,
 /*
  *---
  * OvsExecuteConntrackAction
- * Executes Conntrack actions XXX - Add more
+ * Executes Conntrack actions
+ * For the Ipv4 fragments, consume the orginal fragment NBL
+ * XXX - Add more
  *---
  */
 NDIS_STATUS
-OvsExecuteConntrackAction(PNET_BUFFER_LIST curNbl,
+OvsExecuteConntrackAction(POVS_SWITCH_CONTEXT switchContext,
+  PNET_BUFFER_LIST *curNbl,
+  OvsCompletionList *completionList,
+  NDIS_SWITCH_PORT_ID sourcePort,
   OVS_PACKET_HDR_INFO *layers,
   OvsFlowKey *key,
   const PNL_ATTR a)
@@ -699,10 +715,11 @@ OvsExecuteConntrackAction(PNET_BUFFER_LIST curNbl,
 MD_MARK *mark = NULL;
 MD_LABELS *labels = NULL;
 PCHAR helper = NULL;
-
+PNET_BUFFER_LIST newNbl = NULL;
 NDIS_STATUS status;
 
-status = OvsDetectCtPacket(key);
+status = OvsDetectCtPacket(switchContext, curNbl, completionList,
+   sourcePort, key, &newNbl);
 if (status != NDIS_STATUS_SUCCESS) {
 return status;
 }
@@ -735,7 +752,8 @@ OvsExecuteConntrackAction(PNET_BUFFER_LIST curNbl,
 }
 }
 
-status = OvsCtExecute_(curNbl, key, layers,
+/* If newNbl is not allocated, use the current Nbl*/
+status = OvsCtExecu

[ovs-dev] [PATCH v5 4/5] datapath-windows: Updated OvsTcpSegmentNBL to handle IP fragments.

2017-02-02 Thread Anand Kumar
With this patch, OvsTcpSegmentNBL not only supports fragmenting NBL
to TCP segments but also Ipv4 fragments.

To reflect the new changes, renamed function name from OvsTcpSegmentNBL
to OvsFragmentNBL and created a wrapper for OvsTcpSegmentNBL.

Signed-off-by: Anand Kumar 
---
v4->v5: Changed a variable mss to fragmentSize.
v3->v4: No Change
v2->v3:
- Updated log message and function summary
v1->v2:
- Fix compile error for release mode.
---
 datapath-windows/ovsext/BufferMgmt.c | 194 +--
 datapath-windows/ovsext/BufferMgmt.h |  10 +-
 datapath-windows/ovsext/Geneve.c |   2 +-
 datapath-windows/ovsext/Gre.c|   2 +-
 datapath-windows/ovsext/Stt.c|   2 +-
 datapath-windows/ovsext/User.c   |   2 +-
 datapath-windows/ovsext/Vxlan.c  |   2 +-
 7 files changed, 152 insertions(+), 62 deletions(-)

diff --git a/datapath-windows/ovsext/BufferMgmt.c 
b/datapath-windows/ovsext/BufferMgmt.c
index d99052d..0011c10 100644
--- a/datapath-windows/ovsext/BufferMgmt.c
+++ b/datapath-windows/ovsext/BufferMgmt.c
@@ -1084,6 +1084,31 @@ nblcopy_error:
 return NULL;
 }
 
+NDIS_STATUS
+GetIpHeaderInfo(PNET_BUFFER_LIST curNbl,
+UINT32 *hdrSize)
+{
+CHAR *ethBuf[sizeof(EthHdr)];
+EthHdr *eth;
+IPHdr *ipHdr;
+PNET_BUFFER curNb;
+
+curNb = NET_BUFFER_LIST_FIRST_NB(curNbl);
+ASSERT(NET_BUFFER_NEXT_NB(curNb) == NULL);
+
+eth = (EthHdr *)NdisGetDataBuffer(curNb, ETH_HEADER_LENGTH,
+  (PVOID)ðBuf, 1, 0);
+if (eth == NULL) {
+return NDIS_STATUS_INVALID_PACKET;
+}
+ipHdr = (IPHdr *)((PCHAR)eth + ETH_HEADER_LENGTH);
+if (ipHdr == NULL) {
+return NDIS_STATUS_INVALID_PACKET;
+}
+*hdrSize = (UINT32)(ETH_HEADER_LENGTH + (ipHdr->ihl * 4));
+return NDIS_STATUS_SUCCESS;
+}
+
 /*
  * --
  * GetSegmentHeaderInfo
@@ -1113,15 +1138,16 @@ GetSegmentHeaderInfo(PNET_BUFFER_LIST nbl,
 
 /*
  * --
- * FixSegmentHeader
+ * FixPacketHeader
  *
- *Fix IP length, IP checksum, TCP sequence number and TCP checksum
- *in the segment.
+ *Fix IP length, Offset, IP checksum, TCP sequence number and TCP checksum
+ *in the netbuffer if applicable.
  * --
  */
 static NDIS_STATUS
-FixSegmentHeader(PNET_BUFFER nb, UINT16 segmentSize, UINT32 seqNumber,
- BOOLEAN lastPacket, UINT16 packetCounter)
+FixPacketHeader(PNET_BUFFER nb, UINT16 segmentSize, UINT32 seqNumber,
+BOOLEAN lastPacket, UINT16 packetCounter, UINT16 offset,
+BOOLEAN isFragment)
 {
 EthHdr *dstEth = NULL;
 TCPHdr *dstTCP = NULL;
@@ -1140,41 +1166,55 @@ FixSegmentHeader(PNET_BUFFER nb, UINT16 segmentSize, 
UINT32 seqNumber,
 case ETH_TYPE_IPV4_NBO:
 {
 IPHdr *dstIP = NULL;
-
-ASSERT((INT)MmGetMdlByteCount(mdl) - NET_BUFFER_CURRENT_MDL_OFFSET(nb)
+if (!isFragment) {
+ASSERT((INT)MmGetMdlByteCount(mdl) - 
NET_BUFFER_CURRENT_MDL_OFFSET(nb)
 >= sizeof(EthHdr) + sizeof(IPHdr) + sizeof(TCPHdr));
-dstIP = (IPHdr *)((PCHAR)dstEth + sizeof(*dstEth));
-dstTCP = (TCPHdr *)((PCHAR)dstIP + dstIP->ihl * 4);
-ASSERT((INT)MmGetMdlByteCount(mdl) - NET_BUFFER_CURRENT_MDL_OFFSET(nb)
+dstIP = (IPHdr *)((PCHAR)dstEth + sizeof(*dstEth));
+dstTCP = (TCPHdr *)((PCHAR)dstIP + dstIP->ihl * 4);
+ASSERT((INT)MmGetMdlByteCount(mdl) - 
NET_BUFFER_CURRENT_MDL_OFFSET(nb)
 >= sizeof(EthHdr) + dstIP->ihl * 4 + TCP_HDR_LEN(dstTCP));
 
-/* Fix IP length and checksum */
-ASSERT(dstIP->protocol == IPPROTO_TCP);
-dstIP->tot_len = htons(segmentSize + dstIP->ihl * 4 + 
TCP_HDR_LEN(dstTCP));
-dstIP->id += packetCounter;
+/* Fix IP length and checksum */
+ASSERT(dstIP->protocol == IPPROTO_TCP);
+dstIP->tot_len = htons(segmentSize + dstIP->ihl * 4 + 
TCP_HDR_LEN(dstTCP));
+dstIP->id += packetCounter;
+dstTCP->seq = htonl(seqNumber);
+
+/*
+ * Set the TCP FIN and PSH bit only for the last packet
+ * More information can be found under:
+ * 
https://msdn.microsoft.com/en-us/library/windows/hardware/ff568840%28v=vs.85%29.aspx
+ */
+if (dstTCP->fin) {
+dstTCP->fin = lastPacket;
+}
+if (dstTCP->psh) {
+dstTCP->psh = lastPacket;
+}
+UINT16 csumLength = segmentSize + TCP_HDR_LEN(dstTCP);
+dstTCP->check = IPPseudoChecksum(&dstIP->saddr,
+ &dstIP->daddr,
+ IPPROTO_TCP,
+   

[ovs-dev] [PATCH v5 3/5] datapath-windows: Retain MRU value in the _OVS_BUFFER_CONTEXT.

2017-02-02 Thread Anand Kumar
This patch introduces a new field MRU(Maximum Recieved Unit) in the
_OVS_BUFFER_CONTEXT and it is used only for Ip Fragments to retain MRU for
the reassembled IP datagram when the packet is forwarded to userspace.

Signed-off-by: Anand Kumar 
---
v4->v5:
- Refactored the patch as MRU field is removed form 
ovsForwardingContext.
- Added MRU field in _OVS_BUFFER_CONTEXT.
- Updated commit message.
v3->v4: No Change
v2->v3: No change
v1->v2: No change
---
 datapath-windows/ovsext/BufferMgmt.c |  1 +
 datapath-windows/ovsext/BufferMgmt.h |  1 +
 datapath-windows/ovsext/DpInternal.h |  2 +-
 datapath-windows/ovsext/IpFragment.c |  2 ++
 datapath-windows/ovsext/User.c   | 20 +++-
 5 files changed, 24 insertions(+), 2 deletions(-)

diff --git a/datapath-windows/ovsext/BufferMgmt.c 
b/datapath-windows/ovsext/BufferMgmt.c
index 6027c35..d99052d 100644
--- a/datapath-windows/ovsext/BufferMgmt.c
+++ b/datapath-windows/ovsext/BufferMgmt.c
@@ -266,6 +266,7 @@ OvsInitNBLContext(POVS_BUFFER_CONTEXT ctx,
 ctx->flags = flags;
 ctx->srcPortNo = srcPortNo;
 ctx->origDataLength = origDataLength;
+ctx->mru = 0;
 }
 
 
diff --git a/datapath-windows/ovsext/BufferMgmt.h 
b/datapath-windows/ovsext/BufferMgmt.h
index 11a05b2..77b2854 100644
--- a/datapath-windows/ovsext/BufferMgmt.h
+++ b/datapath-windows/ovsext/BufferMgmt.h
@@ -58,6 +58,7 @@ typedef union _OVS_BUFFER_CONTEXT {
 UINT32 origDataLength;
 UINT32 dataOffsetDelta;
 };
+UINT16 mru;
 };
 
 UINT64 value[MEM_ALIGN_SIZE(16) >> 3];
diff --git a/datapath-windows/ovsext/DpInternal.h 
b/datapath-windows/ovsext/DpInternal.h
index f62fc55..9d1a783 100644
--- a/datapath-windows/ovsext/DpInternal.h
+++ b/datapath-windows/ovsext/DpInternal.h
@@ -298,7 +298,7 @@ typedef struct _OVS_PACKET_INFO {
 typedef struct OvsPacketExecute {
uint32_t dpNo;
uint32_t inPort;
-
+   uint16 mru;
uint32_t packetLen;
uint32_t actionsLen;
PNL_MSG_HDR nlMsgHdr;
diff --git a/datapath-windows/ovsext/IpFragment.c 
b/datapath-windows/ovsext/IpFragment.c
index 37abcd8..b0294ce 100644
--- a/datapath-windows/ovsext/IpFragment.c
+++ b/datapath-windows/ovsext/IpFragment.c
@@ -208,6 +208,8 @@ OvsIpv4Reassemble(POVS_SWITCH_CONTEXT switchContext,
 OvsCompleteNBL(switchContext, *curNbl, TRUE);
 }
 /* Store mru in the ovs buffer context. */
+ctx = (POVS_BUFFER_CONTEXT)NET_BUFFER_LIST_CONTEXT_DATA_START(*newNbl);
+ctx->mru = entry->mru;
 *curNbl = *newNbl;
 return status;
 }
diff --git a/datapath-windows/ovsext/User.c b/datapath-windows/ovsext/User.c
index c7ac284..3bfe3d0 100644
--- a/datapath-windows/ovsext/User.c
+++ b/datapath-windows/ovsext/User.c
@@ -283,7 +283,8 @@ OvsNlExecuteCmdHandler(POVS_USER_PARAMS_CONTEXT 
usrParamsCtx,
 [OVS_PACKET_ATTR_ACTIONS] = {.type = NL_A_UNSPEC, .optional = FALSE},
 [OVS_PACKET_ATTR_USERDATA] = {.type = NL_A_UNSPEC, .optional = TRUE},
 [OVS_PACKET_ATTR_EGRESS_TUN_KEY] = {.type = NL_A_UNSPEC,
-.optional = TRUE}
+.optional = TRUE},
+[OVS_PACKET_ATTR_MRU] = { .type = NL_A_U16, .optional = TRUE }
 };
 
 RtlZeroMemory(&execute, sizeof(OvsPacketExecute));
@@ -381,6 +382,10 @@ _MapNlAttrToOvsPktExec(PNL_MSG_HDR nlMsgHdr, PNL_ATTR 
*nlAttrs,
 ASSERT(keyAttrs[OVS_KEY_ATTR_IN_PORT]);
 execute->inPort = NlAttrGetU32(keyAttrs[OVS_KEY_ATTR_IN_PORT]);
 execute->keyAttrs = keyAttrs;
+
+if (nlAttrs[OVS_PACKET_ATTR_MRU]) {
+execute->mru = NlAttrGetU16(nlAttrs[OVS_PACKET_ATTR_MRU]);
+}
 }
 
 NTSTATUS
@@ -397,6 +402,7 @@ OvsExecuteDpIoctl(OvsPacketExecute *execute)
 POVS_VPORT_ENTRYvport = NULL;
 PNL_ATTR tunnelAttrs[__OVS_TUNNEL_KEY_ATTR_MAX];
 OvsFlowKey tempTunKey = {0};
+POVS_BUFFER_CONTEXT ctx;
 
 if (execute->packetLen == 0) {
 status = STATUS_INVALID_PARAMETER;
@@ -459,6 +465,9 @@ OvsExecuteDpIoctl(OvsPacketExecute *execute)
 ndisStatus = OvsExtractFlow(pNbl, execute->inPort, &key, &layers,
  tempTunKey.tunKey.dst == 0 ? NULL : &tempTunKey.tunKey);
 
+ctx = (POVS_BUFFER_CONTEXT)NET_BUFFER_LIST_CONTEXT_DATA_START(pNbl);
+ctx->mru = (UINT32)execute->mru;
+
 if (ndisStatus == NDIS_STATUS_SUCCESS) {
 NdisAcquireRWLockRead(gOvsSwitchContext->dispatchLock, &lockState, 0);
 ndisStatus = OvsActionsExecute(gOvsSwitchContext, NULL, pNbl,
@@ -988,6 +997,7 @@ OvsCreateQueueNlPacket(PVOID userData,
 UINT32 nlMsgSize;
 NL_BUFFER nlBuf;
 PNL_MSG_HDR nlMsg;
+POVS_BUFFER_CONTEXT ctx;
 
 if (vport == NULL){
 /* No vport is not fatal. */
@@ -1071,6 +1081,14 @@ OvsCreateQueueNlPacket(PVOID userData,
 goto fail;
 }
 
+/* Set MRU attribute */
+ctx = (POVS_BUFFER_CONTEXT)NET_BUFFER_LIST_CONTEXT_DATA_START(nbl);
+if (ctx->mru != 0) {
+if (!NlMsgPutTailU16(&nl

[ovs-dev] [PATCH v5 5/5] datapath-windows: Fragment NBL based on MRU size

2017-02-02 Thread Anand Kumar
This patch adds support for Fragmenting NBL based on the MRU value.
MRU value is updated only for Ipv4 fragments, if it is non zero, then
fragment the NBL and send out the new NBL to the vnic.

Signed-off-by: Anand Kumar 
---
v4->v5:
- Use MRU information in the _OVS_BUFFER_CONTEXT to fragment NBL.
v3->v4: No Change
v2->v3:
- Updated log message
v1->v2: No change
---
 datapath-windows/ovsext/Actions.c | 27 +++
 1 file changed, 27 insertions(+)

diff --git a/datapath-windows/ovsext/Actions.c 
b/datapath-windows/ovsext/Actions.c
index 3dea9a9..1a67d2b 100644
--- a/datapath-windows/ovsext/Actions.c
+++ b/datapath-windows/ovsext/Actions.c
@@ -34,6 +34,7 @@
 #include "Vport.h"
 #include "Vxlan.h"
 #include "Geneve.h"
+#include "IpFragment.h"
 
 #ifdef OVS_DBG_MOD
 #undef OVS_DBG_MOD
@@ -864,6 +865,8 @@ OvsOutputForwardingCtx(OvsForwardingContext *ovsFwdCtx)
 NDIS_STATUS status = STATUS_SUCCESS;
 POVS_SWITCH_CONTEXT switchContext = ovsFwdCtx->switchContext;
 PCWSTR dropReason;
+PNET_BUFFER_LIST fragNbl = NULL;
+POVS_BUFFER_CONTEXT ctx;
 
 /*
  * Handle the case where the some of the destination ports are tunneled
@@ -909,6 +912,30 @@ OvsOutputForwardingCtx(OvsForwardingContext *ovsFwdCtx)
 goto dropit;
 }
 
+ctx = 
(POVS_BUFFER_CONTEXT)NET_BUFFER_LIST_CONTEXT_DATA_START(ovsFwdCtx->curNbl);
+if (ctx->mru != 0) {
+/* Fragment nbl based on mru. If it returns NULL then the original
+ * reassembled NBL is sent out to the VIF which will be dropped if
+ * the packet size is more than VIF MTU.
+ */
+fragNbl = OvsFragmentNBL(ovsFwdCtx->switchContext,
+ ovsFwdCtx->curNbl,
+ &(ovsFwdCtx->layers),
+ ctx->mru, 0, TRUE);
+if (fragNbl != NULL) {
+OvsCompleteNBLForwardingCtx(ovsFwdCtx,
+L"Dropped since fragmenting NBL");
+status = OvsInitForwardingCtx(ovsFwdCtx,
+  ovsFwdCtx->switchContext,
+  fragNbl,
+  ovsFwdCtx->srcVportNo,
+  ovsFwdCtx->sendFlags,
+  
NET_BUFFER_LIST_SWITCH_FORWARDING_DETAIL(fragNbl),
+  ovsFwdCtx->completionList,
+  &ovsFwdCtx->layers, FALSE);
+}
+}
+
 OvsSendNBLIngress(ovsFwdCtx->switchContext, ovsFwdCtx->curNbl,
   ovsFwdCtx->sendFlags);
 /* End this pipeline by resetting the corresponding context. */
-- 
2.9.3.windows.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v5 1/5] datapath-windows: Added a new file to support Ipv4 fragments.

2017-02-02 Thread Anand Kumar
This patch adds functionalities to support IPv4 fragments, which will be
used by Conntrack module.

Added a new structure to hold the Ipv4 fragments and a hash table to
hold Ipv4 datagram entries. Also added a clean up thread that runs
every minute to delete the expired IPv4 datagram entries.

The individual fragments are ignored by the conntrack. Once all the
fragments are recieved, a new NBL is created out of the reassembled
fragments and conntrack executes actions on the new NBL.

Created new APIs OvsProcessIpv4Fragment() to process individual fragments,
OvsIpv4Reassemble() to reassemble Ipv4 fragments.

Signed-off-by: Anand Kumar 
---
v4->v5:
- Removed MRU argument from function declarations as MRU is now retained
in _OVS_BUFFER_CONTEXT.
v3->v4:
- Acquire read lock for read operation 
- Undo "trigger cleanup event after reassembling the fragments".
v2->v3:
- Use spinlock instead of RW lock
- Trigger cleanup event after reassembling the fragments.
v1->v2: No change
---
 datapath-windows/automake.mk   |   2 +
 datapath-windows/ovsext/Debug.h|   3 +-
 datapath-windows/ovsext/IpFragment.c   | 500 +
 datapath-windows/ovsext/IpFragment.h   |  73 +
 datapath-windows/ovsext/Switch.c   |   9 +
 datapath-windows/ovsext/ovsext.vcxproj |   2 +
 6 files changed, 588 insertions(+), 1 deletion(-)
 create mode 100644 datapath-windows/ovsext/IpFragment.c
 create mode 100644 datapath-windows/ovsext/IpFragment.h

diff --git a/datapath-windows/automake.mk b/datapath-windows/automake.mk
index 53983ae..4f7b55a 100644
--- a/datapath-windows/automake.mk
+++ b/datapath-windows/automake.mk
@@ -32,6 +32,8 @@ EXTRA_DIST += \
datapath-windows/ovsext/Flow.h \
datapath-windows/ovsext/Gre.h \
datapath-windows/ovsext/Gre.c \
+   datapath-windows/ovsext/IpFragment.c \
+   datapath-windows/ovsext/IpFragment.h \
datapath-windows/ovsext/IpHelper.c \
datapath-windows/ovsext/IpHelper.h \
datapath-windows/ovsext/Jhash.c \
diff --git a/datapath-windows/ovsext/Debug.h b/datapath-windows/ovsext/Debug.h
index cae6ac9..6de1812 100644
--- a/datapath-windows/ovsext/Debug.h
+++ b/datapath-windows/ovsext/Debug.h
@@ -42,8 +42,9 @@
 #define OVS_DBG_STT  BIT32(22)
 #define OVS_DBG_CONTRK   BIT32(23)
 #define OVS_DBG_GENEVE   BIT32(24)
+#define OVS_DBG_IPFRAG   BIT32(25)
 
-#define OVS_DBG_LAST 24  /* Set this to the last defined module number. */
+#define OVS_DBG_LAST 25  /* Set this to the last defined module number. */
 /* Please add above OVS_DBG_LAST. */
 
 #define OVS_DBG_ERRORDPFLTR_ERROR_LEVEL
diff --git a/datapath-windows/ovsext/IpFragment.c 
b/datapath-windows/ovsext/IpFragment.c
new file mode 100644
index 000..37abcd8
--- /dev/null
+++ b/datapath-windows/ovsext/IpFragment.c
@@ -0,0 +1,500 @@
+/*
+ * Copyright (c) 2017 VMware, Inc.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at:
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+#include "Conntrack.h"
+#include "Debug.h"
+#include "IpFragment.h"
+#include "Jhash.h"
+#include "Offload.h"
+#include "PacketParser.h"
+
+#ifdef OVS_DBG_MOD
+#undef OVS_DBG_MOD
+#endif
+#define OVS_DBG_MOD OVS_DBG_IPFRAG
+
+/* Function declarations */
+static VOID OvsIpFragmentEntryCleaner(PVOID data);
+static VOID OvsIpFragmentEntryDelete(POVS_IPFRAG_ENTRY entry);
+
+/* Global and static variables */
+static OVS_IPFRAG_THREAD_CTX ipFragThreadCtx;
+static PNDIS_RW_LOCK_EX ovsIpFragmentHashLockObj;
+static UINT64 ipTotalEntries;
+static PLIST_ENTRY OvsIpFragTable;
+
+NDIS_STATUS
+OvsInitIpFragment(POVS_SWITCH_CONTEXT context)
+{
+
+NDIS_STATUS status;
+HANDLE threadHandle = NULL;
+
+/* Init the sync-lock */
+ovsIpFragmentHashLockObj = NdisAllocateRWLock(context->NdisFilterHandle);
+if (ovsIpFragmentHashLockObj == NULL) {
+return STATUS_INSUFFICIENT_RESOURCES;
+}
+
+/* Init the Hash Buffer */
+OvsIpFragTable = OvsAllocateMemoryWithTag(sizeof(LIST_ENTRY)
+  * IP_FRAG_HASH_TABLE_SIZE,
+  OVS_MEMORY_TAG);
+if (OvsIpFragTable == NULL) {
+NdisFreeRWLock(ovsIpFragmentHashLockObj);
+ovsIpFragmentHashLockObj = NULL;
+return STATUS_INSUFFICIENT_RESOURCES;
+}
+
+for (int i = 0; i < IP_FRAG_HASH_TABLE_SIZE; i++) {
+InitializeListHead(&OvsIpFragTable[i]);
+}
+
+/* Init Cleaner Thread */
+KeInitializeEven

Re: [ovs-dev] [PATCH] odp: Fix sample action in userspace

2017-02-02 Thread Ben Pfaff
On Wed, Jan 11, 2017 at 04:00:04PM -0800, Andy Zhou wrote:
> User space implementation of the sample action is not consistent with
> kernel datapath. In kernel datapath, the side effects of actions
> within the sample actions are not visible to the subsequent actions.
> Current user space handling does not follow the same logic. This patch
> makes them consistent.
> 
> Signed-off-by: Andy Zhou 

Thanks!

This looks like a bug fix, but I suspect that there's no way for
existing code to actually trigger the bug, because I think that OVS at
least up to 2.6 only ever puts actions that do not modify the packet
into a sample action.

Acked-by: Ben Pfaff 
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] ofp-actions: Fix tunnel meatdata subfield length in flow action

2017-02-02 Thread Yi-Hung Wei
Hi Ben,

This patch is updated by [PATCH v2] ofp-actions: Fix variable length
meta-flow field usage in flow actions, and it is applied to master
yesterday.

Best,

-Yi-Hung

On Thu, Feb 2, 2017 at 10:45 AM, Ben Pfaff  wrote:

> On Mon, Jan 09, 2017 at 06:09:33PM -0800, Yi-Hung Wei wrote:
> > Previously, if a flow action that involves a tunnel metadata subfield is
> > dumpped from vswitchd, the replied subfield length in the OXM header is
> > filled with the maximum possible field length, instead of the length
> > configured in the tunnel TLV mapping table.
> >
> > In order to derive the correct length of a tun_metadata subfield, this
> patch
> > passes the tunnel TLV mapping table (struct tun_table) to where we decode
> > the flow actions. extract-ofp-actions is updated to pass the tunnel TLV
> mapping
> > table to the flow action decoding functions. Also, a new error code is
> > introduced to identify a flow with an invalid tun_metadata ID. Moreover,
> a
> > testcase is added to prevent future regressions.
> >
> > Currently, this patch only addresses the usage of tun_metaata for
> REG_MOVE action.
> > Other actions that may use the tun_metadata will be addressed in
> subsequent patches.
> >
> > VMWare-BZ: #1768370
> > Reported-by: Harold Lim 
> > Signed-off-by: Yi-Hung Wei 
>
> I think I've seen some side discussion on this patch.  Is it still
> relevant as is or is some revision already in progress?
>
> In any case, s/meatdata/metadata/ in the subject.
>
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 0/6] Fix tunnel information destination port

2017-02-02 Thread Sairam Venugopal
Hi Ben,

Yes, it will be good to have these on 2.7 as well.

Thanks,
Sairam

-Original Message-
From: ovs-dev-boun...@openvswitch.org [mailto:ovs-dev-boun...@openvswitch.org] 
On Behalf Of Ben Pfaff
Sent: Thursday, February 2, 2017 10:53 AM
To: Alin Serdean 
Cc: d...@openvswitch.org
Subject: Re: [ovs-dev] [PATCH 0/6] Fix tunnel information destination port

On Tue, Jan 10, 2017 at 04:48:28PM +, Alin Serdean wrote:
> Deploying a simple environment using VXLAN encapsulation at the moment 
> does not work under Hyper-V. The tunnel information sent by the 
> userspace is not fully treated in the datapath.
> 
> To be more specific every packet that tries to be encapsulated via 
> VXLAN tunnel triggers the following line:
> https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_openvs
> witch_ovs_blob_75e2077e0c43224bcca92746b28b01a4936fc101_datapath-2Dwin
> dows_ovsext_Flow.c-23L1754&d=DwICAg&c=uilaK90D4TOVoH58JNXRgQ&r=Z6vowHU
> OjP5ysP_g372c49Nqc1vEKqHKNBkR5Q5Z7uo&m=6Sjd7sTDI_6S1unUsi7L1N_smHiu0KY
> gzgelATTDKC8&s=qhs6PeGOp6VJlBNJ3iskoP4odhoCQWMUoZINexxmDJA&e=
> which is followed by a set action fail.
> 
> The actual attribute that triggers the above is: OVS_TUNNEL_KEY_ATTR_TP_DST.
> 
> This series adds support for that attribute and updates the current tunnel 
> implementations.

Patch 1 was already on master.  I've now applied patches 2 through 5 to master. 
 Please let me know if you want these on branch-2.7 also.

Thanks,

Ben.
___
dev mailing list
d...@openvswitch.org
https://urldefense.proofpoint.com/v2/url?u=https-3A__mail.openvswitch.org_mailman_listinfo_ovs-2Ddev&d=DwICAg&c=uilaK90D4TOVoH58JNXRgQ&r=Z6vowHUOjP5ysP_g372c49Nqc1vEKqHKNBkR5Q5Z7uo&m=6Sjd7sTDI_6S1unUsi7L1N_smHiu0KYgzgelATTDKC8&s=7klAtQ4aN6BXfMfvMe_rkcro6tK5viBHUqel951s9T4&e=
 
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] system-ovn.at: Fix race conditions.

2017-02-02 Thread Guru Shetty
On 2 February 2017 at 10:12, Ben Pfaff  wrote:

> On Wed, Feb 01, 2017 at 11:52:14PM -0800, Gurucharan Shetty wrote:
> > The code to wait for a particular type of flow
> > in ovs-vswitchd was not specific enough. This commit
> > changes that and to be doubly sure, also uses the
> > sync command.
> >
> > Reported-by: Andy Zhou 
> > Reported-by: Joe Stringer 
> > Signed-off-by: Gurucharan Shetty 
>
> Seems reasonable, thank you!
>
> Acked-by: Ben Pfaff 
>
Thanks. Applied to master and 2.7
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] ofp-actions: Fix tunnel meatdata subfield length in flow action

2017-02-02 Thread Ben Pfaff
OK, great, thanks.

On Thu, Feb 02, 2017 at 11:01:19AM -0800, Yi-Hung Wei wrote:
> Hi Ben,
> 
> This patch is updated by [PATCH v2] ofp-actions: Fix variable length
> meta-flow field usage in flow actions, and it is applied to master
> yesterday.
> 
> Best,
> 
> -Yi-Hung
> 
> On Thu, Feb 2, 2017 at 10:45 AM, Ben Pfaff  wrote:
> 
> > On Mon, Jan 09, 2017 at 06:09:33PM -0800, Yi-Hung Wei wrote:
> > > Previously, if a flow action that involves a tunnel metadata subfield is
> > > dumpped from vswitchd, the replied subfield length in the OXM header is
> > > filled with the maximum possible field length, instead of the length
> > > configured in the tunnel TLV mapping table.
> > >
> > > In order to derive the correct length of a tun_metadata subfield, this
> > patch
> > > passes the tunnel TLV mapping table (struct tun_table) to where we decode
> > > the flow actions. extract-ofp-actions is updated to pass the tunnel TLV
> > mapping
> > > table to the flow action decoding functions. Also, a new error code is
> > > introduced to identify a flow with an invalid tun_metadata ID. Moreover,
> > a
> > > testcase is added to prevent future regressions.
> > >
> > > Currently, this patch only addresses the usage of tun_metaata for
> > REG_MOVE action.
> > > Other actions that may use the tun_metadata will be addressed in
> > subsequent patches.
> > >
> > > VMWare-BZ: #1768370
> > > Reported-by: Harold Lim 
> > > Signed-off-by: Yi-Hung Wei 
> >
> > I think I've seen some side discussion on this patch.  Is it still
> > relevant as is or is some revision already in progress?
> >
> > In any case, s/meatdata/metadata/ in the subject.
> >
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 0/6] Fix tunnel information destination port

2017-02-02 Thread Ben Pfaff
OK, great, I applied them to branch-2.7 now too.

On Thu, Feb 02, 2017 at 07:12:20PM +, Sairam Venugopal wrote:
> Hi Ben,
> 
> Yes, it will be good to have these on 2.7 as well.
> 
> Thanks,
> Sairam
> 
> -Original Message-
> From: ovs-dev-boun...@openvswitch.org 
> [mailto:ovs-dev-boun...@openvswitch.org] On Behalf Of Ben Pfaff
> Sent: Thursday, February 2, 2017 10:53 AM
> To: Alin Serdean 
> Cc: d...@openvswitch.org
> Subject: Re: [ovs-dev] [PATCH 0/6] Fix tunnel information destination port
> 
> On Tue, Jan 10, 2017 at 04:48:28PM +, Alin Serdean wrote:
> > Deploying a simple environment using VXLAN encapsulation at the moment 
> > does not work under Hyper-V. The tunnel information sent by the 
> > userspace is not fully treated in the datapath.
> > 
> > To be more specific every packet that tries to be encapsulated via 
> > VXLAN tunnel triggers the following line:
> > https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_openvs
> > witch_ovs_blob_75e2077e0c43224bcca92746b28b01a4936fc101_datapath-2Dwin
> > dows_ovsext_Flow.c-23L1754&d=DwICAg&c=uilaK90D4TOVoH58JNXRgQ&r=Z6vowHU
> > OjP5ysP_g372c49Nqc1vEKqHKNBkR5Q5Z7uo&m=6Sjd7sTDI_6S1unUsi7L1N_smHiu0KY
> > gzgelATTDKC8&s=qhs6PeGOp6VJlBNJ3iskoP4odhoCQWMUoZINexxmDJA&e=
> > which is followed by a set action fail.
> > 
> > The actual attribute that triggers the above is: OVS_TUNNEL_KEY_ATTR_TP_DST.
> > 
> > This series adds support for that attribute and updates the current tunnel 
> > implementations.
> 
> Patch 1 was already on master.  I've now applied patches 2 through 5 to 
> master.  Please let me know if you want these on branch-2.7 also.
> 
> Thanks,
> 
> Ben.
> ___
> dev mailing list
> d...@openvswitch.org
> https://urldefense.proofpoint.com/v2/url?u=https-3A__mail.openvswitch.org_mailman_listinfo_ovs-2Ddev&d=DwICAg&c=uilaK90D4TOVoH58JNXRgQ&r=Z6vowHUOjP5ysP_g372c49Nqc1vEKqHKNBkR5Q5Z7uo&m=6Sjd7sTDI_6S1unUsi7L1N_smHiu0KYgzgelATTDKC8&s=7klAtQ4aN6BXfMfvMe_rkcro6tK5viBHUqel951s9T4&e=
>  
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 2/3] rhel: Remove obsolete OVSDPDKVhostPort from ifdown script.

2017-02-02 Thread Ben Pfaff
On Tue, Jan 24, 2017 at 06:21:52PM -0800, Daniele Di Proietto wrote:
> The support for vhost cuse port has been removed long ago.
> 
> Fixes:419876444357("netdev-dpdk: Remove dpdkvhostcuse ports")
> Signed-off-by: Daniele Di Proietto 

Acked-by: Ben Pfaff 
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2 1/1] doc: Remove "experimental" warning for userspace.

2017-02-02 Thread Stokes, Ian
> On 02/02/2017 04:44 PM, Ian Stokes wrote:
> > Remove the experimental warning tag in documentation regarding OVS
> > deployed via userspace.
> >
> > Signed-off-by: Ian Stokes 
> > ---
> >  Documentation/intro/install/dpdk.rst  |3 ---
> >  Documentation/intro/install/userspace.rst |4 
> >  NEWS  |2 ++
> >  README.rst|6 +++---
> >  4 files changed, 5 insertions(+), 10 deletions(-)
> >
> > diff --git a/Documentation/intro/install/dpdk.rst
> > b/Documentation/intro/install/dpdk.rst
> > index fff0a1a..3018590 100644
> > --- a/Documentation/intro/install/dpdk.rst
> > +++ b/Documentation/intro/install/dpdk.rst
> > @@ -29,9 +29,6 @@ This document describes how to build and install
> > Open vSwitch using a DPDK  datapath. Open vSwitch can use the DPDK
> > library to operate entirely in  userspace.
> >
> > -.. warning::
> > -  The DPDK support of Open vSwitch is considered 'experimental'.
> > -
> >  Build requirements
> >  --
> >
> > diff --git a/Documentation/intro/install/userspace.rst
> > b/Documentation/intro/install/userspace.rst
> > index 0368527..ebd0c12 100644
> > --- a/Documentation/intro/install/userspace.rst
> > +++ b/Documentation/intro/install/userspace.rst
> > @@ -34,10 +34,6 @@ This version of Open vSwitch should be built
> > manually with ``configure`` and  been recently tested, and so Debian
> > packages are not a recommended way to use  this version of Open vSwitch.
> >
> > -.. warning::
> > -  The userspace-only mode of Open vSwitch is considered experimental.
> > It has
> > -  not been thoroughly tested.
> > -
> >  Building and Installing
> >  ---
> >
> > diff --git a/NEWS b/NEWS
> > index 5efcce2..8600f0e 100644
> > --- a/NEWS
> > +++ b/NEWS
> > @@ -3,6 +3,8 @@ Post-v2.7.0
> > - Tunnels:
> >   * Added support to set packet mark for tunnel endpoint using
> > `egress_pkt_mark` OVSDB option.
> > +   - Documentation:
> > + * OVS deployed in userspace mode no longer tagged as experimental.
> 
> I think this would be a bit clearer. What do you think?
> 
> --- a/NEWS
> +++ b/NEWS
> @@ -59,4 +59,5 @@ v2.7.0 - xx xxx 
>   * Removed support for IPsec tunnels.
> - DPDK:
> + * Removal of experimental tag.

We can put it under a DPDK header like you have here, I wonder though is the 
removal of experimental only relevant to DPDK or do we need to consider OVS 
userspace without DPDK also? I would think both are well tested at this stage 
so experimental could be removed for both. Thoughts?

> 
> 
> >
> >  v2.7.0 - xx xxx 
> >  -
> > diff --git a/README.rst b/README.rst
> > index f5cdaa5..90050e3 100644
> > --- a/README.rst
> > +++ b/README.rst
> > @@ -38,9 +38,9 @@ following features:
> >
> >  The included Linux kernel module supports Linux 3.10 and up.
> >
> > -Open vSwitch can also operate, at a cost in performance, entirely in
> > userspace, -without assistance from a kernel module.  This userspace
> > implementation should -be easier to port than the kernel-based switch.
> It is considered experimental.
> > +Open vSwitch can also operate entirely in userspace without
> > +assistance from a kernel module.  This userspace implementation
> > +should be easier to port than the kernel-based switch.
> >
> >  What's here?
> >  
> >

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 3/3] rhel: Fix ifup and ifdown after DPDK naming change.

2017-02-02 Thread Ben Pfaff
On Tue, Jan 24, 2017 at 06:21:53PM -0800, Daniele Di Proietto wrote:
> Names like dpdk0 and dpdk1 are not enough to identify a DPDK interface.
> We could update README.RHEL.rst and add
> 
> OVS_EXTRA='set Interface ${DEVICE} options:dpdk-devargs=:01:00.0'
> 
> but a better solution is to add new parameters in the configuration file
> to explicitly specify the dpdk-devargs.
> 
> Fixes: 55e075e65ef9("netdev-dpdk: Arbitrary 'dpdk' port naming")
> Signed-off-by: Daniele Di Proietto 

This seems useful.

I don't understand why this uses "set" then $1.  Are you concerned that
BOND_DPDK_DEVARGS might have multiple words and you want to get just the
first one?

>   OVSDPDKBond)
>   ifup_ovs_bridge
> + set -- ${BOND_DPDK_DEVARGS}
>   for _iface in $BOND_IFACES; do
> - IFACE_TYPES="${IFACE_TYPES} -- set interface ${_iface} 
> type=dpdk"
> + IFACE_TYPES="${IFACE_TYPES} -- set interface ${_iface} 
> type=dpdk options:dpdk-devargs=$1"
> + shift
>   done
>   ovs-vsctl -t ${TIMEOUT} \
>   -- --if-exists del-port "$OVS_BRIDGE" "$DEVICE" \
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] dpif-netdev: Pass Openvswitch other_config smap to dpif.

2017-02-02 Thread Ben Pfaff
On Fri, Jan 27, 2017 at 05:10:04PM -0800, Daniele Di Proietto wrote:
> Currently we parse the 'other_config' column in Openvswitch table in
> bridge.c.  We extract the values (just 'pmd-cpu-mask' for now) and we
> pass them down to the datapath, via different layers.
> 
> If we want to pass other values to dpif-netdev.c (like we recently
> discussed) we would have to touch ofproto.c, ofproto-dpif.c and dpif.c.
> 
> This patch sends the entire other_config column to dpif-netdev, so that
> dpif-netdev can extract the values it's interested in.
> 
> No functional change.
> 
> Signed-off-by: Daniele Di Proietto 
> ---
> I don't like that dpif-netdev receives the whole other_config column,
> because it contains other values which are completely unrelated, but
> unfortunately there's no better place in the database for datapath
> specific configuration.

Acked-by: Ben Pfaff 
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2] ofp-actions: Fix variable length meta-flow field usage in flow actions.

2017-02-02 Thread Joe Stringer
On 1 February 2017 at 11:06, Joe Stringer  wrote:
> On 31 January 2017 at 16:57, Joe Stringer  wrote:
>>  /* Returns true if a variable length meta-flow field 'mff' is not mapped in
>>   * the 'vl_mff_map'. */
>>  bool
>> -mf_vl_mff_not_mapped(const struct mf_field *mff,
>> - const struct vl_mff_map *vl_mff_map)
>> +mf_vl_mff_mapped(const struct mf_field *mff, const struct vl_mff_map *map)
>>  {
>> -if (mff && vl_mff_map) {
>> -if (mff->variable_len && !mff->mapped) {
>> -return true;
>> -}
>> -}
>> -
>> -return false;
>> +return !(map && mff && mff->variable_len && !mff->mapped);
>>  }
>
> Yi-Hung pointed out offline that this reversal doesn't quite sit right
> logically; this function is searching for a specific set of invalid
> conditions, where there is a vl_mff_map, and the field is
> variable-length, and it's not mapped. It's misleading to have all of
> this covered by a function named "...mapped()".
>
> I suggest we retain the original logic but rename the function to
> something like mf_vl_mff_invalid().

I applied this to master.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v3] netdev-dpdk: Implement Tx intermediate queue for dpdk ports.

2017-02-02 Thread Bhanuprakash Bodireddy
After packet classification, packets are queued in to batches depending
on the matching netdev flow. Thereafter each batch is processed to
execute the related actions. This becomes particularly inefficient if
there are few packets in each batch as rte_eth_tx_burst() incurs expensive
MMIO writes.

This commit adds back intermediate queue implementation. Packets are
queued and burst when the packet count exceeds threshold. Also drain
logic is refactored to handle packets hanging in the tx queues. Testing
shows significant performance gains with this implementation.

Fixes: b59cc14e032d("netdev-dpdk: Use instant sending instead of
queueing of packets.")
Signed-off-by: Bhanuprakash Bodireddy >
Signed-off-by: Antonio Fischetti 
Co-authored-by: Antonio Fischetti 
Signed-off-by: Markus Magnusson 
Co-authored-by: Markus Magnusson 
---
v2->v3
  * Refactor the code
  * Use thread local copy 'send_port_cache' instead of 'tx_port' while draining
  * Invoke dp_netdev_drain_txq_port() to drain the packets from the queue as
part of pmd reconfiguration that gets triggered due to port 
addition/deletion
or change in pmd-cpu-mask.
  * Invoke netdev_txq_drain() from xps_get_tx_qid() to drain packets in old
queue. This is possible in XPS case where the tx queue can change after
timeout.
  * Fix another bug in netdev_dpdk_eth_tx_burst() w.r.t 'txq->count'.

Latency stats:
Collected the latency stats with PHY2PHY loopback case using 30 IXIA streams
/UDP packets/uni direction traffic. All the stats are in nanoseconds. Results 
below compare latency results between Master vs patch.

case 1: Matching IP flow rules for each IXIA stream
Eg:  For an IXIA stream with src Ip: 2.2.2.1, dst tip: 5.5.5.1 
ovs-ofctl add-flow br0 dl_type=0x0800,nw_src=2.2.2.1,actions=output:2

For an IXIA stream with src Ip: 4.4.4.1, dst tip: 15.15.15.1
ovs-ofctl add-flow br0 dl_type=0x0800,nw_src=4.4.4.1,actions=output:2

Packet64 128 256 512 
Branch  Master  Patch   Master  Patch   Master  Patch   Master  Patch   
case 1  26100   222000  26190   217930  23890   199000  30370   212440 (min 
latency ns)
1239100 906910  1168740 691040  575470  574240  724360  734050 (max 
latency ns)
1189501 763908  913602  662941  486187  440482  470060  479376 (avg 
latency ns)

 
   1024  1280 1518
   Master   Patch Master Patch  Master Patch
   2832018961026520  220580 23950  200480  (min latency ns)
   701040   67584670  670390 19783490   685930 747040  (max latency ns)
   444033   469297415602 506215 429587 491593  (avg latency ns)


case 2: ovs-ofctl add-flow br0 in_port=1,action=output:2

Packet64 128 256 512 
Branch  Master  Patch   Master  Patch   Master  Patch   Master  Patch   
case 2  18800   33970   19980   30350  2261026800   13500   20220
506140  596690  363010  363370  544520  541570  549120  77414700
459509  473536  254817  256801  287872  287277  290642  301572


   1024  1280 1518
   Master   Patch Master Patch  Master Patch
   2253015850 21350  36020  25970  34300
   549680   131964240 543390 81549210   552060 98207410
   292436   294388285468 305727 295133 300080


case 3 is same as case 1 with INTERIM_QUEUE_BURST_THRESHOLD=16, instead of 32.

(w) patch
case 364   128   2565121024   12801518 
 122700   119890   135200  117530  118900 116640  123710(min)
 972830   808960   574180  696820  36717550   720500  726790(max)
 783315   674814   463256  439412  467041 463093  471967(avg)

case 4 is same as case 2 with INTERIM_QUEUE_BURST_THRESHOLD=16, instead of 32.

(w) patch
case 464   128   2565121024   1280  1518 
 317502614025250   17570   14750  28600 31460(min ns)
 722690   363200   539760  538320  301845040  12556210  132114800(max 
ns)
 485710   253497   285589  284095  293189 282834285829(avg ns)

v1->v2
  * xps_get_tx_qid() is no more called twice. The last used qid is stored so
the drain function will flush the right queue also when XPS is enabled.
  * netdev_txq_drain() is called unconditionally and not just for dpdk ports.
  * txq_drain() takes the 'tx_lock' for queue in case of dynamic tx queues.
  * Restored counting of dropped packets.
  * Changed scheduling of drain function.
  * Updated comments in netdev-provider.h
  * Fixed a comment in dp-packet.h

Details:
 * In worst case scenario with fewer packets in batch, significant
   bottleneck is observed at netdev_dpdk_eth_send() function due to 
   expensive MMIO writes.

 * Also its observed that CPI(cycles per instruction) Rate for the function
   stood between 3.15 and 4.1 which is significantly higher than accept

Re: [ovs-dev] [RFC PATCH v4 0/6] create tunnel devices using rtnetlink interface

2017-02-02 Thread Joe Stringer
Hi Eric,

There's a few patches in this series with co-authored-by, but missing
signed-off-by from the co-author. Also, the author does not need to be
listed as a co-author.

When building this series I'm seeing a bunch of redefinition errors
(across all platforms I test on, ranging from kernel 3.10 to 4.8):

/usr/include/linux/if_tunnel.h:21:9: error: preprocessor token
GRE_CSUM redefined
lib/packets.h:1046:9: this was the original definition
/usr/include/linux/if_tunnel.h:22:9: error: preprocessor token
GRE_ROUTING redefined
lib/packets.h:1047:9: this was the original definition
/usr/include/linux/if_tunnel.h:23:9: error: preprocessor token GRE_KEY redefined
lib/packets.h:1048:9: this was the original definition
/usr/include/linux/if_tunnel.h:24:9: error: preprocessor token GRE_SEQ redefined
lib/packets.h:1049:9: this was the original definition
/usr/include/linux/if_tunnel.h:25:9: error: preprocessor token
GRE_STRICT redefined
lib/packets.h:1050:9: this was the original definition
/usr/include/linux/if_tunnel.h:26:9: error: preprocessor token GRE_REC redefined
lib/packets.h:1051:9: this was the original definition
/usr/include/linux/if_tunnel.h:27:9: error: preprocessor token
GRE_FLAGS redefined
lib/packets.h:1052:9: this was the original definition
/usr/include/linux/if_tunnel.h:28:9: error: preprocessor token
GRE_VERSION redefined
lib/packets.h:1053:9: this was the original definition

I have a few comments on some of the patches.


On 18 January 2017 at 11:45, Eric Garver  wrote:
> This series adds support for the creation of tunnels using the rtnetlink
> interface. This will open the possibility for new features and flags on those
> vports without the need to change vport compatibility code.
>
> Support for STT and LISP have not been added because these are not upstream 
> yet,
> so we don't know how the interface will be like upstream. And there are no
> features in the current drivers right now we could make use of.
>
> Note: This work originally started by Thadeu Lima de Souza Cascardo.
>
> Testing:
>   - kernel 4.9.3, in-tree datapath
> - rtnetlink successfully creates devices
>   - kernel 4.2.8, in-tree datapath
> - rtnetlink is tried, but fails due to no COLLECT_METADATA support
> - genetlink successfully creates devices
>   - kernel 4.2.8, out-of-tree datapath
> - rtnetlink is not tried
> - genetlink successfully creates devices
>
> v2:
>
> We are able to set the MTU to UINT16_MAX since it is not restricted by the
> driver during newlink.
>
> v3:
>
> Prefer to get type from vport before checking if device is opened. Also, 
> disable
> IFLA_VXLAN_LEARNING as it's not enabled on compat vports as well.
>
> v4:
>   - Probe for ovs_geneve on init, this indicates out-of-tree datapath
> - If exists, only try genetlink/compat
> - else, try rtnetlink and fallback to genetlink/compat
>   - Read back and verify devices created with rtnetlink
>   - checkpatch fixes
>
> Eric Garver (4):
>   dpif-netlink: Probe for out-of-tree datapath.
>   dpif-netlink: add VXLAN creation support
>   dpif-netlink: add GRE creation support
>   dpif-netlink: add GENEVE creation support
>
> Thadeu Lima de Souza Cascardo (2):
>   netdev: get device type from vport prefix if it uses one
>   dpif-netlink: break out code to add compat and non-compat vports
>
>  lib/dpif-netlink.c | 679 
> +
>  lib/netdev.c   |  26 +-
>  2 files changed, 651 insertions(+), 54 deletions(-)
>
> --
> 2.10.0
>
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [RFC PATCH v4 1/6] netdev: get device type from vport prefix if it uses one

2017-02-02 Thread Joe Stringer
On 18 January 2017 at 11:45, Eric Garver  wrote:
> From: Thadeu Lima de Souza Cascardo 
>
> If the device name uses a vport prefix, then use that vport type.
>
> Since these names are reserved, we can assume this is the right type.
>
> This is important when we are querying the datapath right after vswitch has
> started and using the right type will be even more important when we add 
> support
> to creating tunnel ports with rtnetlink.
>
> Signed-off-by: Thadeu Lima de Souza Cascardo 

Acked-by: Joe Stringer 
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [RFC PATCH v4 2/6] dpif-netlink: break out code to add compat and non-compat vports

2017-02-02 Thread Joe Stringer
On 18 January 2017 at 11:45, Eric Garver  wrote:
> From: Thadeu Lima de Souza Cascardo 
>
> The vport type for adding tunnels is now compatibility code and any new 
> features
> from tunnels must configure the tunnel as an interface using the tunnel 
> metadata
> support.
>
> In order to be able to add those tunnels, we need to add code to create the
> tunnels and add them as NETDEV vports. And when there is no support to create
> them, we need to use the compatibility code and add them as tunnel vports.
>
> When removing those tunnels, we need to remove the interfaces as well, and
> detecting the right type might be important, at least to distinguish the 
> tunnel
> vports that we should remove and the interfaces that we shouldn't.
>
> Signed-off-by: Thadeu Lima de Souza Cascardo 

This patch involves refactoring as well as the base changes to
add/remove the new tunnel types. It's always easier to review if
functional changes are proposed separately from refactoring/cosmetic.
(I recognize the new functions don't do anything, but they still
combine with the refactoring). That said, with some minor changes this
looks OK to me.

Acked-by: Joe Stringer 

> ---
>  lib/dpif-netlink.c | 201 
> +++--
>  1 file changed, 148 insertions(+), 53 deletions(-)
>
> diff --git a/lib/dpif-netlink.c b/lib/dpif-netlink.c
> index c8b0e37f9365..e6459f358989 100644
> --- a/lib/dpif-netlink.c
> +++ b/lib/dpif-netlink.c
> @@ -781,10 +781,8 @@ get_vport_type(const struct dpif_netlink_vport *vport)
>  }
>
>  static enum ovs_vport_type
> -netdev_to_ovs_vport_type(const struct netdev *netdev)
> +netdev_to_ovs_vport_type(const char *type)
>  {
> -const char *type = netdev_get_type(netdev);
> -
>  if (!strcmp(type, "tap") || !strcmp(type, "system")) {
>  return OVS_VPORT_TYPE_NETDEV;
>  } else if (!strcmp(type, "internal")) {
> @@ -805,19 +803,14 @@ netdev_to_ovs_vport_type(const struct netdev *netdev)
>  }
>
>  static int
> -dpif_netlink_port_add__(struct dpif_netlink *dpif, struct netdev *netdev,
> +dpif_netlink_port_add__(struct dpif_netlink *dpif, const char *name,
> +enum ovs_vport_type type,
> +struct ofpbuf *options,
>  odp_port_t *port_nop)
>  OVS_REQ_WRLOCK(dpif->upcall_lock)
>  {
> -const struct netdev_tunnel_config *tnl_cfg;
> -char namebuf[NETDEV_VPORT_NAME_BUFSIZE];
> -const char *name = netdev_vport_get_dpif_port(netdev,
> -  namebuf, sizeof namebuf);
> -const char *type = netdev_get_type(netdev);
>  struct dpif_netlink_vport request, reply;
>  struct ofpbuf *buf;
> -uint64_t options_stub[64 / 8];
> -struct ofpbuf options;
>  struct nl_sock **socksp = NULL;
>  uint32_t *upcall_pids;
>  int error = 0;
> @@ -832,17 +825,80 @@ dpif_netlink_port_add__(struct dpif_netlink *dpif, 
> struct netdev *netdev,
>  dpif_netlink_vport_init(&request);
>  request.cmd = OVS_VPORT_CMD_NEW;
>  request.dp_ifindex = dpif->dp_ifindex;
> -request.type = netdev_to_ovs_vport_type(netdev);
> -if (request.type == OVS_VPORT_TYPE_UNSPEC) {
> +request.type = type;
> +request.name = name;
> +
> +request.port_no = *port_nop;
> +upcall_pids = vport_socksp_to_pids(socksp, dpif->n_handlers);
> +request.n_upcall_pids = socksp ? dpif->n_handlers : 1;
> +request.upcall_pids = upcall_pids;
> +
> +if (options) {
> +request.options = options->data;
> +request.options_len = options->size;
> +}
> +
> +error = dpif_netlink_vport_transact(&request, &reply, &buf);
> +if (!error) {
> +*port_nop = reply.port_no;
> +} else {
> +if (error == EBUSY && *port_nop != ODPP_NONE) {
> +VLOG_INFO("%s: requested port %"PRIu32" is in use",
> +  dpif_name(&dpif->dpif), *port_nop);
> +}
> +
> +vport_del_socksp(dpif, socksp);
> +goto exit;
> +}
> +
> +if (socksp) {
> +error = vport_add_channels(dpif, *port_nop, socksp);
> +if (error) {
> +VLOG_INFO("%s: could not add channel for port %s",
> +  dpif_name(&dpif->dpif), name);
> +
> +/* Delete the port. */
> +dpif_netlink_vport_init(&request);
> +request.cmd = OVS_VPORT_CMD_DEL;
> +request.dp_ifindex = dpif->dp_ifindex;
> +request.port_no = *port_nop;
> +dpif_netlink_vport_transact(&request, NULL, NULL);
> +vport_del_socksp(dpif, socksp);
> +goto exit;
> +}
> +}
> +free(socksp);
> +
> +exit:
> +ofpbuf_delete(buf);
> +free(upcall_pids);
> +
> +return error;
> +}
> +
> +static int
> +dpif_netlink_port_add_compat(struct dpif_netlink *dpif, struct netdev 
> *netdev,
> + odp_port_t *port_nop)
> +OVS_REQ_WRLOCK(dpif->upcall_lock)
> +{
> +const struct

Re: [ovs-dev] [RFC PATCH v4 3/6] dpif-netlink: Probe for out-of-tree datapath.

2017-02-02 Thread Joe Stringer
On 18 January 2017 at 11:45, Eric Garver  wrote:
> For out-of-tree datapath, only try genetlink/compat.
> For in-tree kernel datapath, try rtnetlink then genetlink.
>
> Signed-off-by: Eric Garver 
> ---
>  lib/dpif-netlink.c | 35 ---
>  1 file changed, 32 insertions(+), 3 deletions(-)
>
> diff --git a/lib/dpif-netlink.c b/lib/dpif-netlink.c
> index e6459f358989..769806eadbf1 100644
> --- a/lib/dpif-netlink.c
> +++ b/lib/dpif-netlink.c
> @@ -210,6 +210,11 @@ static int ovs_packet_family;
>   * Initialized by dpif_netlink_init(). */
>  static unsigned int ovs_vport_mcgroup;
>
> +/* rtnetlink information for OVS.
> + *
> + * Initialized by dpif_netlink_init(). */
> +static bool ovs_datapath_out_of_tree = false;

Perhaps in the comment briefly mention that if this is true, devices
are created using OVS netlink and if it's false devices are created
using NETLINK_ROUTE / as LWT?

> +
>  static int dpif_netlink_init(void);
>  static int open_dpif(const struct dpif_netlink_dp *, struct dpif **);
>  static uint32_t dpif_netlink_port_get_pid(const struct dpif *,
> @@ -1014,11 +1019,13 @@ dpif_netlink_port_add(struct dpif *dpif_, struct 
> netdev *netdev,
>odp_port_t *port_nop)
>  {
>  struct dpif_netlink *dpif = dpif_netlink_cast(dpif_);
> -int error;
> +int error = EOPNOTSUPP;
>
>  fat_rwlock_wrlock(&dpif->upcall_lock);
> -error = dpif_netlink_port_create_and_add(dpif, netdev, port_nop);
> -if (error == EOPNOTSUPP) {
> +if (!ovs_datapath_out_of_tree) {
> +error = dpif_netlink_port_create_and_add(dpif, netdev, port_nop);
> +}
> +if (error) {
>  error = dpif_netlink_port_add_compat(dpif, netdev, port_nop);
>  }
>  fat_rwlock_unlock(&dpif->upcall_lock);
> @@ -2503,6 +2510,7 @@ dpif_netlink_init(void)
>  {
>  static struct ovsthread_once once = OVSTHREAD_ONCE_INITIALIZER;
>  static int error;
> +struct netdev *netdev = NULL;

This can be in the #ifdef __linux__; otherwise you have an unused
variable on other platforms.

>
>  if (ovsthread_once_start(&once)) {
>  error = nl_lookup_genl_family(OVS_DATAPATH_FAMILY,
> @@ -2526,6 +2534,27 @@ dpif_netlink_init(void)
>  error = nl_lookup_genl_mcgroup(OVS_VPORT_FAMILY, 
> OVS_VPORT_MCGROUP,
> &ovs_vport_mcgroup);
>  }
> +#ifdef __linux__
> +if (!error) {
> +error = netdev_open("ovs-system-probe", "geneve", &netdev);
> +if (!error) {
> +error = netdev_geneve_create_kind(netdev, "ovs_geneve");
> +if (error != EOPNOTSUPP) {
> +if (!error) {
> +char namebuf[NETDEV_VPORT_NAME_BUFSIZE];
> +const char *dp_port;
> +
> +dp_port = netdev_vport_get_dpif_port(netdev, namebuf,
> + sizeof namebuf);
> +netdev_geneve_destroy(dp_port);
> +}
> +ovs_datapath_out_of_tree = true;
> +}
> +netdev_close(netdev);
> +error = 0;
> +}
> +}
> +#endif

Geneve isn't added yet, so this patch introduces build breakage.

This doesn't look like it matches the most recent discussion on this topic:
https://mail.openvswitch.org/pipermail/ovs-dev/2017-January/327476.html
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2] packet: proper return type for vlan_tci_to_pcp()

2017-02-02 Thread Shu Shen
On Tue, Jan 31, 2017 at 10:45:31AM -0800, Ben Pfaff wrote:
> By my count, vlan_tci_to_pcp() is used in printf-like format
> specifiers in 7 places in the tree.  Before this patch, it is used
> with the following format specifiers:
> 
> %d  3 times %x  1 time PRIu8   1 time PRIx8   1 time
> 
> Both %d and %x are obviously correct, portable format specifiers for
> int, which is the return type of vlan_tci_to_pcp().  I contend that
> PRIu8 and PRIx8 should be acceptable too because the integer
> promotions convert uint8_t to int anyway.
The problem is that PRIu8 and PRIx8 the specifiers are not promoted to
'u' and 'x' respectively on macOS.

For example, for PRIu*, on Mac OS, /usr/include/inttypes.h:

 #  define __PRI_8_LENGTH_MODIFIER__ "hh"
 #  define __PRI_64_LENGTH_MODIFIER__ "ll"
 #  define PRIu8 __PRI_8_LENGTH_MODIFIER__ "u"
 #  define PRIu16"hu"
 #  define PRIu32"lu"
 #  define PRIu64__PRI_64_LENGTH_MODIFIER__ "u"

While on Linux/glibc, /usr/include/inttypes.h:

 # define PRIu8  "u"
 # define PRIu16 "u"
 # define PRIu32 "u"
 # define PRIu64 __PRI64_PREFIX "u"

where all PRIu* except PRIu64 are the same.

Therefore, using PRIu8 or PRIx8 with the int return type of
vlan_tci_to_pcp() is causing compiler warnings on macOS.

As PRIu8 and PRIx8 are indeed accurate specifier for pcp (Priority code
point), the patch proposes changing the return type of
vlan_tci_to_pcp().

In addition, the patch includes changes of format specifiers to
consistently use PRI* specifier. But unfortunately as you pointed out
below, there are two "%d" occurrences I missed. I'll put these specifier
changes into a separate commit for clarity.

> 
> After this patch, vlan_tci_to_pcp() returns uint8_t and it is used in
> the following format specifiers:
> 
> %d  2 times
> PRIu8   2 times
> PRIx8   2 times
> 
> I still don't see the point.  You're saying, effectively, that %x is not
> acceptable but all the others are.  How is that possible?
Please see above, the intention is to use PRI* specifier throughout, I
missed the two occurrences of '%d'. Will fix in next revision of the
patch(set).

> 
> On Sat, Jan 14, 2017 at 09:59:33AM -0800, Shu Shen wrote:
> > The main change of this patch is in lib/packets.h, where the return type of
> > vlan_tci_to_pcp() and vlan_tci_to_cfi() are changed from int to uint8_t.
> > 
> > Changes to the format specifiers are for portability to mac OS and
> > consistency use of PRIu* specifiers.
> > 
> > On Sat, Jan 14, 2017 at 8:47 AM, Ben Pfaff  wrote:
> > > On Fri, Jan 13, 2017 at 02:04:38PM -0800, Shu Shen wrote:
> > >> The return type of vlan_tci_to_pcp() was int where it's expected to be
> > >> uint8_t and causing implicit truncation when the function is used. On
> > >> some platforms such as macOS, where PRIu8 is defined as "hhx" and no
> > >> promotion of short to int is done, the compiler might throw out Wformat
> > >> message for ds_put_format() calls on the returns value of
> > >> vlan_tci_to_pcp().
> > >>
> > >> vlan_tci_to_cfi() is also fixed with uint8_t as return type although the
> > >> function is not currently being used anywhere.
> > >>
> > >> Format strings in ds_put_format() for printing out returned values from
> > >> vlan_tci_to_pcp() were updated to ensure PRIu8 or PRIx8 are used for
> > >> portability.
> > >>
> > >> Signed-off-by: Shu Shen 
> > >> ---
> > >>
> > >> v2: Fixed typoes for uint8_t in commit message
> > >
> > > This one doesn't really make sense to me.  This patch only changes two
> > > format specifiers away from %d or %x, which are both correct format
> > > specifiers for "int".  Can you explain?
> > >
> > > Thanks,
> > >
> > > Ben.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [RFC PATCH v4 4/6] dpif-netlink: add VXLAN creation support

2017-02-02 Thread Joe Stringer
On 18 January 2017 at 11:45, Eric Garver  wrote:
> Creates VXLAN devices using rtnetlink and tunnel metadata.
>
> Co-Authored-by: Thadeu Lima de Souza Cascardo 
> Co-Authored-by: Eric Garver 
> Signed-off-by: Eric Garver 
> ---
>  lib/dpif-netlink.c | 194 
> -
>  1 file changed, 193 insertions(+), 1 deletion(-)

I think that the vast majority of this code is linux-specific and
should not exist in dpif-netlink.c.

Perhaps we should add a new lib/netdev-lwt.[ch], where the .h file has
the #ifdef __linux__ logic to either declare or define the functions,
then lib/netdev-lwt.c has the implementations of these functions.  The
.h would always be added to the build in lib/automake.mk and the .c
file would only be added in the #if LINUX section.

One might even argue that it'd be tidier if the
dpif_netlink_port_{create,destroy} functions were moved here (and
renamed to something more apt).
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [RFC PATCH v4 5/6] dpif-netlink: add GRE creation support

2017-02-02 Thread Joe Stringer
On 18 January 2017 at 11:45, Eric Garver  wrote:
> Creates GRE devices using rtnetlink and tunnel metadata.
>
> Co-Authored-by: Thadeu Lima de Souza Cascardo 
> Co-Authored-by: Eric Garver 
> Signed-off-by: Eric Garver 

Same comments as VXLAN; also these GRE ones add a bunch of directive
redefinitions on various platforms. I didn't inspect or test because
of these issues.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] OVS - ODL Sync on OF Bundle in 1.3

2017-02-02 Thread Abhijit Kumbhare
Jarno,

Can you please attend the next Thursday OpenDaylight OpenFlow Plugin meeting? 
We would like to discuss the OpenDaylight bundles implementation – but would 
also like to discuss the OvS bundles implementation since we will very likely 
be interacting with the OvS bundles.

The meeting time is:

Feb 9: Thursday 8 am Pacific:

Webex: 
https://meetings.webex.com/collabs/#/meetings/detail?uuid=M2D4SMAUPJLGFE395VGN2HU7M7-9VIB&rnd=180270.26196

If you forget – the meeting time & link is also accessible from our main page: 
https://wiki.opendaylight.org/view/OpenDaylight_OpenFlow_Plugin:Main

I will also send a separate meeting invite once you confirm.

Abhijit

From: Jarno Rajahalme mailto:ja...@ovn.org>>
Date: Wednesday, January 25, 2017 at 2:16 PM
To: Jan Scheurich 
mailto:jan.scheur...@ericsson.com>>
Cc: Abhijit Kumbhare 
mailto:abhijit.kumbh...@ericsson.com>>, Zoltán 
Balogh mailto:zoltan.bal...@ericsson.com>>, László 
Sürü mailto:laszlo.s...@ericsson.com>>, Miklós Pelyva 
mailto:miklos.pel...@ericsson.com>>, Jozef Bacigal 
mailto:jozef.baci...@pantheon.tech>>, Tomáš Slušný 
mailto:tomas.slu...@pantheon.tech>>, Prasanna 
Huddar mailto:prasanna.hud...@ericsson.com>>, 
Shuva Jyoti Kar 
mailto:shuva.jyoti@ericsson.com>>, Sharath 
Kumar V mailto:sharath.kuma...@ericsson.com>>, 
Kanagasundaram K 
mailto:kanagasundara...@ericsson.com>>, Sunil 
Kumar G mailto:sunil.g.ku...@ericsson.com>>, D 
Arunprakash mailto:d.arunprak...@ericsson.com>>, 
Muthukumaran K 
mailto:muthukumara...@ericsson.com>>, 
"d...@openvswitch.org" 
mailto:d...@openvswitch.org>>
Subject: Re: OVS - ODL Sync on OF Bundle in 1.3

This came with too short notice for me, sorry.

  Jarno

On Jan 24, 2017, at 11:48 PM, Jan Scheurich 
mailto:jan.scheur...@ericsson.com>> wrote:

We’ll use below webex instead of Lync/SkypeFB:
https://meetings.webex.com/collabs/#/meetings/detail?uuid=MBFUHFYE8TRG3NPX76F4PLRS2N-3OWH&rnd=231184.55753

Rescheduled to Wednesday due to unavailability of key participants.
@Jarno: As you are the OVS brain behind bundles. Do you have the chance to join 
for a short time to discuss some  details regarding OF 1.3 bundle 
implementation?

Hi Jarno,

OpenDaylight folks are finally starting to implement support of OpenFlow 
bundles as a basis for the bundle-based hitless recync procedure we discussed 
earlier. As ODL does not yet have protocol support for OpenFlow versions 1.4 or 
1.5, they intend to implement the bundle extension to OF 1.3 as specified in 
EXT-230 in
https://www.opennetworking.org/images/stories/downloads/sdn-resources/onf-specifications/openflow/openflow-extensions-1.3.x-pack2.zip

Would you have time for a short meeting on early Monday to discuss and confirm 
whether the OVS implementation of bundles in OF 1.3 is compliant with EXT-230 
and has everything they need?

Thanks, Jan




___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [RFC PATCH v4 6/6] dpif-netlink: add GENEVE creation support

2017-02-02 Thread Joe Stringer
On 18 January 2017 at 11:45, Eric Garver  wrote:
> Creates GENEVE devices using rtnetlink and tunnel metadata.
>
> Co-Authored-by: Thadeu Lima de Souza Cascardo 
> Co-Authored-by: Eric Garver 
> Signed-off-by: Eric Garver 

Same feedback as VXLAN.

You might consider adding this patch before the testing patch, but
don't expose the functionality until the probing commit. Then that
patch can compile.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2] packet: proper return type for vlan_tci_to_pcp()

2017-02-02 Thread Ben Pfaff
On Thu, Feb 02, 2017 at 02:55:03PM -0800, Shu Shen wrote:
> On Tue, Jan 31, 2017 at 10:45:31AM -0800, Ben Pfaff wrote:
> > By my count, vlan_tci_to_pcp() is used in printf-like format
> > specifiers in 7 places in the tree.  Before this patch, it is used
> > with the following format specifiers:
> > 
> > %d  3 times %x  1 time PRIu8   1 time PRIx8   1 time
> > 
> > Both %d and %x are obviously correct, portable format specifiers for
> > int, which is the return type of vlan_tci_to_pcp().  I contend that
> > PRIu8 and PRIx8 should be acceptable too because the integer
> > promotions convert uint8_t to int anyway.
> The problem is that PRIu8 and PRIx8 the specifiers are not promoted to
> 'u' and 'x' respectively on macOS.
> 
> For example, for PRIu*, on Mac OS, /usr/include/inttypes.h:
> 
>  #  define __PRI_8_LENGTH_MODIFIER__ "hh"
>  #  define __PRI_64_LENGTH_MODIFIER__ "ll"
>  #  define PRIu8 __PRI_8_LENGTH_MODIFIER__ "u"
>  #  define PRIu16"hu"
>  #  define PRIu32"lu"
>  #  define PRIu64__PRI_64_LENGTH_MODIFIER__ "u"
> 
> While on Linux/glibc, /usr/include/inttypes.h:
> 
>  # define PRIu8  "u"
>  # define PRIu16 "u"
>  # define PRIu32 "u"
>  # define PRIu64 __PRI64_PREFIX "u"
> 
> where all PRIu* except PRIu64 are the same.
> 
> Therefore, using PRIu8 or PRIx8 with the int return type of
> vlan_tci_to_pcp() is causing compiler warnings on macOS.

What are the warnings?

There is no real problem with using %hhd or %hhu to print an int,
because printf() cannot tell the difference between an int and a char:
they are both passed as int.

> As PRIu8 and PRIx8 are indeed accurate specifier for pcp (Priority code
> point), the patch proposes changing the return type of
> vlan_tci_to_pcp().
> 
> In addition, the patch includes changes of format specifiers to
> consistently use PRI* specifier. But unfortunately as you pointed out
> below, there are two "%d" occurrences I missed. I'll put these specifier
> changes into a separate commit for clarity.

I don't see how putting them in a separate commit provides "clarity"
since the lack of the changes is most of my problem with the commit.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v2] packet: proper return type for vlan_tci_to_pcp()

2017-02-02 Thread Shu Shen
On macOS, PRIu8 and PRIx8 the specifiers are not promoted to 'u' and 'x'
respectively.

For example, for PRIu*, on Mac OS, /usr/include/inttypes.h:

 #  define __PRI_8_LENGTH_MODIFIER__ "hh"
 #  define __PRI_64_LENGTH_MODIFIER__ "ll"
 #  define PRIu8 __PRI_8_LENGTH_MODIFIER__ "u"
 #  define PRIu16"hu"
 #  define PRIu32"lu"
 #  define PRIu64__PRI_64_LENGTH_MODIFIER__ "u"

While on Linux/glibc, /usr/include/inttypes.h:

 # define PRIu8  "u"
 # define PRIu16 "u"
 # define PRIu32 "u"
 # define PRIu64 __PRI64_PREFIX "u"

where all PRIu* except PRIu64 are the same.

Therefore, using PRIu8 or PRIx8 with the int return type of
vlan_tci_to_pcp() is causing compiler warnings on macOS.

The first patch in this patchset fixes the return types of vlan_tci_to_pcp()
and vlan_tci_to_cfi() functions from int to uint8_t, as 8-bit unsigned int is
the expected data type for these two functions. After this patch, warning
message Wformat would no longer occur for these two functions on macOS.

The second patch could be considered as cosmetic or more about coding style,
but it uses PRI* specifier consistently when formating the return values of
vlan_tci_to_pcp(). Although desirable in my opinion, this second patch may be
applied independently from the first one.

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH 1/2] packet: proper return type for vlan_tci_to_pcp()

2017-02-02 Thread Shu Shen
The return type of vlan_tci_to_pcp() was int where it's expected to be
uint8_t and causing implicit truncation when the function is used.

On some platforms such as macOS, where PRIu8 is defined as "hhx" and no
promotion of short to int is done, the compiler might throw out Wformat
message for ds_put_format() calls on the returns value of
vlan_tci_to_pcp().

vlan_tci_to_cfi() is also fixed with uint8_t as return type although the
function is not currently being used anywhere.

Signed-off-by: Shu Shen 
---
 lib/packets.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/packets.h b/lib/packets.h
index c4d3799..2cf5419 100644
--- a/lib/packets.h
+++ b/lib/packets.h
@@ -421,7 +421,7 @@ vlan_tci_to_vid(ovs_be16 vlan_tci)
 
 /* Given the vlan_tci field from an 802.1Q header, in network byte order,
  * returns the priority code point (PCP) in host byte order. */
-static inline int
+static inline uint8_t
 vlan_tci_to_pcp(ovs_be16 vlan_tci)
 {
 return (ntohs(vlan_tci) & VLAN_PCP_MASK) >> VLAN_PCP_SHIFT;
@@ -429,7 +429,7 @@ vlan_tci_to_pcp(ovs_be16 vlan_tci)
 
 /* Given the vlan_tci field from an 802.1Q header, in network byte order,
  * returns the Canonical Format Indicator (CFI). */
-static inline int
+static inline uint8_t
 vlan_tci_to_cfi(ovs_be16 vlan_tci)
 {
 return (vlan_tci & htons(VLAN_CFI)) != 0;
-- 
1.9.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH 2/2] Consistently format specifiers for vlan_tci_to_pcp

2017-02-02 Thread Shu Shen
Format strings in ds_put_format() for printing out returned values from
vlan_tci_to_pcp() were updated to ensure PRIu8 or PRIx8 are used for
portability.

Signed-off-by: Shu Shen 
---
 lib/dpctl.c| 2 +-
 lib/match.c| 2 +-
 lib/odp-util.c | 4 ++--
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/lib/dpctl.c b/lib/dpctl.c
index 23837ce..7b2f4cf 100644
--- a/lib/dpctl.c
+++ b/lib/dpctl.c
@@ -1496,7 +1496,7 @@ dpctl_normalize_actions(int argc, const char *argv[],
 sort_output_actions(af->actions.data, af->actions.size);
 
 if (af->flow.vlan_tci != htons(0)) {
-dpctl_print(dpctl_p, "vlan(vid=%"PRIu16",pcp=%d): ",
+dpctl_print(dpctl_p, "vlan(vid=%"PRIu16",pcp="PRIu8): ",
 vlan_tci_to_vid(af->flow.vlan_tci),
 vlan_tci_to_pcp(af->flow.vlan_tci));
 } else {
diff --git a/lib/match.c b/lib/match.c
index 3fcaec5..d3c6c5a 100644
--- a/lib/match.c
+++ b/lib/match.c
@@ -1221,7 +1221,7 @@ match_format(const struct match *match, struct ds *s, int 
priority)
   colors.end, vlan_tci_to_vid(f->vlan_tci));
 }
 if (pcp_mask) {
-ds_put_format(s, "%sdl_vlan_pcp=%s%d,", colors.param,
+ds_put_format(s, "%sdl_vlan_pcp=%s"PRIu8",", colors.param,
   colors.end, vlan_tci_to_pcp(f->vlan_tci));
 }
 } else if (wc->masks.vlan_tci == htons(0x)) {
diff --git a/lib/odp-util.c b/lib/odp-util.c
index 4106738..52a585a 100644
--- a/lib/odp-util.c
+++ b/lib/odp-util.c
@@ -384,9 +384,9 @@ format_vlan_tci(struct ds *ds, ovs_be16 tci, ovs_be16 mask, 
bool verbose)
 ds_put_char(ds, ',');
 }
 if (verbose || vlan_tci_to_pcp(tci) || vlan_tci_to_pcp(mask)) {
-ds_put_format(ds, "pcp=%d", vlan_tci_to_pcp(tci));
+ds_put_format(ds, "pcp=%"PRIu8, vlan_tci_to_pcp(tci));
 if (vlan_tci_to_pcp(mask) != (VLAN_PCP_MASK >> VLAN_PCP_SHIFT)) {
-ds_put_format(ds, "/0x%x", vlan_tci_to_pcp(mask));
+ds_put_format(ds, "/0x%"PRIx8, vlan_tci_to_pcp(mask));
 }
 ds_put_char(ds, ',');
 }
-- 
1.9.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [ovs-dev, RFC] ovn: support for service function chaining

2017-02-02 Thread jmcdowall
From: John McDowall 

This patchset is the first round at having Service Function Chaining
functionality through OVN. The implementation is done entirely
on the northbound side of OVN. It is a bump on the wire implementation,
so no attempt is being made in keeping state while packets visit each
hop of the chain. ACLs are used as the classifiers, with the augmentation
of action SFC, as well as option column.

The current implementation of traffic redirection to the service chain
is implemented by adding an additional action 'sfc' to the ACL stage. This
overloads the ACL stage and this might not be the best long term approach.
Guidence on whether this is "good enough" for now would be appreciated.

How to leverage load balancing is also an open issue. The current LB solution
in OVN is L3 based. Suggestions on how to implement LB at L2 for SFC would
also be appreciated.

This is not yet ready to be merged, as it lacks unit tests and a rigorous
code review. Nevertheless, it works fine if you take into account a
number of limitations that include:

* missing load balancer integration;
* no ipv6 support;
* chain spanning logical switches (not supported);
* bidirectional chains (not implemented);
* no test cases.
* other suggestions?

This is the code that was used for SFC demo and talk at OVSCon 2016.

Changes:

 * ovn-northd.xml: Added documentation for SFC ACL Action
 * ovn-northd.c: Added new stage for SFC and modified ACL stage to include sfc 
action
 * ovn-architecture.xml Included architecture of SFC in documentation
 * ovn-nb.ovsschema: Extended schema to include port-chain, port-pair-groups, 
port-pairs
and added ACL SFC action
 * ovn-nb.xml: Added documentation for extensions to  ovn-nbctl for port-chain,
port-pair-groups, port-pairs and ACL SFC action
 * ovn-nbctl.c: Added code to extend ovn-nbctl for port-chain,
port-pair-groups, port-pairs and ACL SFC action

Current Limitations

This is not yet ready to be merged, as it lacks unit tests and a rigorous
code review. Nevertheless, it works fine if you take into account a
few limitations:

* missing load balancer integration.
* no ipv6 support.
* chain spanning logical switches (not supported).
* bidirectional chains (not implemented).
* no test cases.
* documentation needs rework as there have been several changes as the code has 
progressed.

Before I work on the limitations and start adding test cases I would like to 
make sure I am on the right track to get this approved for submission. Once it 
is approved for OVN/OVS I can add it to Openstack and also planning on using it 
for container service chaining.

Questions:

* Is the basic approach aligned with the direction of OVN/OVS.
* Is the approach to overloading the current ACL stage best approach for added 
a classifier action.
* How to approach load balancing - current approach in OVN is L# based any 
ideas on how to LB L2 virtual services.

Co-authored-by: Flavio Fernandes 
Reported at: 
https://mail.openvswitch.org/pipermail/ovs-discuss/2016-March/040381.html
Reported at: 
https://mail.openvswitch.org/pipermail/ovs-discuss/2016-May/041359.html

Signed-off-by: John McDowall 
---
 ovn/northd/ovn-northd.8.xml |   5 +
 ovn/northd/ovn-northd.c | 315 +++--
 ovn/ovn-architecture.7.xml  | 184 ++
 ovn/ovn-nb.ovsschema|  64 +++-
 ovn/ovn-nb.xml  | 150 +++-
 ovn/utilities/ovn-nbctl.c   | 825 +++-
 6 files changed, 1515 insertions(+), 28 deletions(-)

diff --git ovn/northd/ovn-northd.8.xml ovn/northd/ovn-northd.8.xml
index ab8fd88..e68d1de 100644
--- ovn/northd/ovn-northd.8.xml
+++ ovn/northd/ovn-northd.8.xml
@@ -304,6 +304,11 @@
 connections.
   
   
+sfc ACLs work as entry points for service function
+chaining, also known as SFC classifiers. Further attributes such
+as what chain to be used are provided via the options column.
+  
+  
 Other ACLs translate to drop; for new or untracked
 connections and ct_commit(ct_label=1/1); for known
 connections.  Setting ct_label marks a connection
diff --git ovn/northd/ovn-northd.c ovn/northd/ovn-northd.c
index a4f76a9..8a0982e 100644
--- ovn/northd/ovn-northd.c
+++ ovn/northd/ovn-northd.c
@@ -106,13 +106,14 @@ enum ovn_stage {
 PIPELINE_STAGE(SWITCH, IN,  PRE_LB, 4, "ls_in_pre_lb")\
 PIPELINE_STAGE(SWITCH, IN,  PRE_STATEFUL,   5, "ls_in_pre_stateful")  \
 PIPELINE_STAGE(SWITCH, IN,  ACL,6, "ls_in_acl")   \
-PIPELINE_STAGE(SWITCH, IN,  QOS_MARK,   7, "ls_in_qos_mark")  \
-PIPELINE_STAGE(SWITCH, IN,  LB, 8, "ls_in_lb")\
-PIPELINE_STAGE(SWITCH, IN,  STATEFUL,   9, "ls_in_stateful")  \
-PIPELINE_STAGE(SWITCH, IN,  ARP_ND_RSP,10, "ls_in_arp_rsp")   \
-PIPELINE_STAGE(SWITCH, IN,  DHCP_OPTIONS,  11, "ls_in_dhcp_options")  \
-PIPELINE_STAGE(SWITCH, IN,  DHCP_RESPONSE, 12, "ls_in_dhcp_r

Re: [ovs-dev] [PATCH v2] packet: proper return type for vlan_tci_to_pcp()

2017-02-02 Thread Shu Shen
On Thu, Feb 02, 2017 at 03:08:09PM -0800, Ben Pfaff wrote:
> On Thu, Feb 02, 2017 at 02:55:03PM -0800, Shu Shen wrote:
> > On Tue, Jan 31, 2017 at 10:45:31AM -0800, Ben Pfaff wrote:
> > > By my count, vlan_tci_to_pcp() is used in printf-like format
> > > specifiers in 7 places in the tree.  Before this patch, it is used
> > > with the following format specifiers:
> > > 
> > > %d  3 times %x  1 time PRIu8   1 time PRIx8   1 time
> > > 
> > > Both %d and %x are obviously correct, portable format specifiers for
> > > int, which is the return type of vlan_tci_to_pcp().  I contend that
> > > PRIu8 and PRIx8 should be acceptable too because the integer
> > > promotions convert uint8_t to int anyway.
> > The problem is that PRIu8 and PRIx8 the specifiers are not promoted to
> > 'u' and 'x' respectively on macOS.
> > 
> > For example, for PRIu*, on Mac OS, /usr/include/inttypes.h:
> > 
> >  #  define __PRI_8_LENGTH_MODIFIER__ "hh"
> >  #  define __PRI_64_LENGTH_MODIFIER__ "ll"
> >  #  define PRIu8 __PRI_8_LENGTH_MODIFIER__ "u"
> >  #  define PRIu16"hu"
> >  #  define PRIu32"lu"
> >  #  define PRIu64__PRI_64_LENGTH_MODIFIER__ "u"
> > 
> > While on Linux/glibc, /usr/include/inttypes.h:
> > 
> >  # define PRIu8  "u"
> >  # define PRIu16 "u"
> >  # define PRIu32 "u"
> >  # define PRIu64 __PRI64_PREFIX "u"
> > 
> > where all PRIu* except PRIu64 are the same.
> > 
> > Therefore, using PRIu8 or PRIx8 with the int return type of
> > vlan_tci_to_pcp() is causing compiler warnings on macOS.
> 
> What are the warnings?
See https://travis-ci.org/openvswitch/ovs/jobs/197378752#L1303
There are plenty of other warnings due to similar situation (if they are
not problems).

> 
> There is no real problem with using %hhd or %hhu to print an int,
> because printf() cannot tell the difference between an int and a char:
> they are both passed as int.
You may be right about printf with regard to an int and a char. But how
about an int specifier and a long variable?

When I noticed the Wformat warning messages on macOS, I have no idea
whether there is a hidden issue where the format specifier might be
truncating a variable.

The only way to be sure is to clean up the code base to remove the
warning messages even if they are just noise.

> 
> > As PRIu8 and PRIx8 are indeed accurate specifier for pcp (Priority code
> > point), the patch proposes changing the return type of
> > vlan_tci_to_pcp().
> > 
> > In addition, the patch includes changes of format specifiers to
> > consistently use PRI* specifier. But unfortunately as you pointed out
> > below, there are two "%d" occurrences I missed. I'll put these specifier
> > changes into a separate commit for clarity.
> 
> I don't see how putting them in a separate commit provides "clarity"
> since the lack of the changes is most of my problem with the commit.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [BUG] ovs-ofctl version 2.5.0 will crash with OFPFMFC_BAD_COMMAND

2017-02-02 Thread Ben Pfaff
Thanks for the additional information.

I tested this script in the OVS sandbox ("make sandbox") and in a VM
setup I sometimes use.  I didn't have any lucky reproducing the crash,
even running ovs-vswitchd under valgrind.  Sorry.

You might want to try either getting a backtrace for the crash or
running ovs-vswitchd under valgrind and then seeing what it has to say.

On Mon, Jan 23, 2017 at 07:58:51AM -0800, Vidyasagara Guntaka wrote:
> Hi Ben,
> 
> We could reproduce this with the latest version 2.6.1.  When we compiled the 
> code, we removed -O2 from CFLAGS.  This seems to make it happen more 
> frequently.  With the following script running, the error starts happening 
> within a few seconds and then continues to happen after every few seconds.  
> In summary, our suspicion is that having no controller set and having no 
> NORMAL processing flow seems to trigger the stack that was pointed out in the 
> gdb session more often and hence we hit this race condition very easily using 
> the following script.  (Even if there is a default NORMAL processing flow 
> entry, after the first iteration of the script below, that will be deleted).
> 
> Also, a few things about the setup - just in case:
>   * enp5s0 belongs to the physical interface on this hypervisor.
>   * vport0 and vport1 belong to tap interfaces corresponding to two VMs 
> running on this hypervisor.
>   * When the script was running, we had pings issued from the VMs so that 
> packets make it to the bridge br0.
> 
> Here is a small script that makes it happen on multiple of our hypervisors:
> 
> while true
> do
> ovs-ofctl add-flow br0 
> "priority=200,table=123,idle_timeout=1,in_port=1,actions=controller"
> ovs-ofctl add-flow br0 
> "priority=200,table=123,idle_timeout=1,in_port=2,actions=controller"
> ovs-ofctl add-flow br0 
> "priority=200,table=123,idle_timeout=1,in_port=3,actions=controller"
> ovs-ofctl add-flow br0 
> "priority=200,table=123,idle_timeout=1,in_port=4,actions=controller"
> ovs-ofctl del-flows br0 
> done
> 
> Here is our bridge br0 setup:
> 
> [root@deepspace ~]# ifconfig br0
> br0: flags=4163  mtu 1500
> inet 192.168.2.142  netmask 255.255.255.0  broadcast 192.168.2.255
> inet6 fe80::213:3bff:fe0f:1301  prefixlen 64  scopeid 0x20
> ether 00:13:3b:0f:13:01  txqueuelen 1000  (Ethernet)
> RX packets 89417814  bytes 12088012200 (11.2 GiB)
> RX errors 0  dropped 82  overruns 0  frame 0
> TX packets 32330647  bytes 3168352394 (2.9 GiB)
> TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
> 
> [root@deepspace ~]# ovs-vsctl show
> 54f89e00-edd2-486e-9626-6d11c7d8b0b6
> Bridge "br0"
> Port "vport1"
> Interface "vport1"
> Port "br0"
> Interface "br0"
> type: internal
> Port vtep
> Interface vtep
> type: vxlan
> options: {key=flow, remote_ip="192.168.1.141"}
> Port "vport0"
> Interface "vport0"
> Port "enp5s0"
> Interface "enp5s0"
> ovs_version: “2.6.1"
> 
> [root@deepspace ~]# ovs-ofctl show br0
> OFPT_FEATURES_REPLY (xid=0x2): dpid:00133b0f1301
> n_tables:254, n_buffers:256
> capabilities: FLOW_STATS TABLE_STATS PORT_STATS QUEUE_STATS ARP_MATCH_IP
> actions: output enqueue set_vlan_vid set_vlan_pcp strip_vlan mod_dl_src 
> mod_dl_dst mod_nw_src mod_nw_dst mod_nw_tos mod_tp_src mod_tp_dst
>  1(enp5s0): addr:00:13:3b:0f:13:01
>  config: 0
>  state:  0
>  current:1GB-FD AUTO_NEG
>  advertised: 10MB-HD 10MB-FD 100MB-HD 100MB-FD 1GB-HD 1GB-FD COPPER 
> AUTO_NEG AUTO_PAUSE AUTO_PAUSE_ASYM
>  supported:  10MB-HD 10MB-FD 100MB-HD 100MB-FD 1GB-HD 1GB-FD COPPER 
> AUTO_NEG
>  speed: 1000 Mbps now, 1000 Mbps max
>  2(vport0): addr:fe:00:00:00:00:03
>  config: 0
>  state:  0
>  current:10MB-FD COPPER
>  speed: 10 Mbps now, 0 Mbps max
>  3(vport1): addr:fe:00:00:00:00:04
>  config: 0
>  state:  0
>  current:10MB-FD COPPER
>  speed: 10 Mbps now, 0 Mbps max
>  4(vtep): addr:aa:97:d2:a9:19:ed
>  config: 0
>  state:  0
>  speed: 0 Mbps now, 0 Mbps max
>  LOCAL(br0): addr:00:13:3b:0f:13:01
>  config: 0
>  state:  0
>  speed: 0 Mbps now, 0 Mbps max
> OFPT_GET_CONFIG_REPLY (xid=0x4): frags=normal miss_send_len=0
> 
> [root@deepspace ~]# ovs-ofctl --version
> ovs-ofctl (Open vSwitch) 2.6.1
> OpenFlow versions 0x1:0x4
> 
> Please let us know if you need anything else to reproduce this.
> 
> Thanks,
> Sagar.
> 
> > On Jan 18, 2017, at 1:19 PM, Ben Pfaff  wrote:
> > 
> > If you can come up with simple reproduction instructions that work for
> > me, I'm happy to track this down.  It's probably something very simple.
> > 
> > On Tue, Jan 17, 2017 at 08:50:20AM -0800, Vidyasagara Guntaka wrote:
> >> This issue happened on our in-use systems and we were trying to find a way
> >> to move forw

Re: [ovs-dev] [RFC PATCH 2/5] OVN: SFC Implementation: New stage for SFC and modified ACL stage

2017-02-02 Thread John McDowall
Ben,

Posted an updated patch – then downloaded it from patchwork and applied to top 
to tree and ran make check and all tests passed.

Regards

John

On 1/31/17, 3:49 PM, "Ben Pfaff"  wrote:

If the patches cannot sensibly be applied separately, then, yes, they
should be a single patch.

On Tue, Jan 31, 2017 at 11:02:30PM +, John McDowall wrote:
> Ah, my bad do you want me to create a single patch file?
> 
> Regards
> 
> John
> 
> On 1/31/17, 2:44 PM, "Ben Pfaff"  wrote:
> 
> Now that I glance at the patch titles, I guess that the problem might 
be
> that this patch depends on some of the later patches.  In general, 
each
> patch should apply, build, and test properly whether or not later
> patches have been applied.
> 
> On Tue, Jan 31, 2017 at 10:03:21PM +, John McDowall wrote:
> > Ben,
> > 
> > Let me create a new patch set against the top of tree.
> > 
> > Regards
> > 
> > John
> > 
> > On 1/31/17, 1:46 PM, "Ben Pfaff"  wrote:
> > 
> > On Tue, Dec 27, 2016 at 02:11:43PM -0800, John McDowall wrote:
> > > This is the major body of code that implements SFC. There is 
a new L2 stage added to
> > > perform the chaining operations and modifications to the ACL 
stage to direct flows
> > > to the service chain.
> > > 
> > > Co-authored-by: Flavio Fernandes 
> > > Reported at: 
https://urldefense.proofpoint.com/v2/url?u=https-3A__mail.openvswitch.org_pipermail_ovs-2Ddiscuss_2016-2DMarch_040381.html&d=DwIBAg&c=V9IgWpI5PvzTw83UyHGVSoW3Uc1MFWe5J8PTfkrzVSo&r=vZ6VUDaavDpfOdPQrz1ED54jEjvAE36A8TVJroVlrOQ&m=0-H45ymu2qKdNfehkwCF8baQWBqDNhngIVaX4MlOpCQ&s=VbhqfPkju3uYqy7303Bfbz0fgnSeIi6aYsQoRwIH1PU&e=
 
> > > Reported at: 
https://urldefense.proofpoint.com/v2/url?u=https-3A__mail.openvswitch.org_pipermail_ovs-2Ddiscuss_2016-2DMay_041359.html&d=DwIBAg&c=V9IgWpI5PvzTw83UyHGVSoW3Uc1MFWe5J8PTfkrzVSo&r=vZ6VUDaavDpfOdPQrz1ED54jEjvAE36A8TVJroVlrOQ&m=0-H45ymu2qKdNfehkwCF8baQWBqDNhngIVaX4MlOpCQ&s=vNieFlb8T7dsSygACJyaJiHvnrQDGCyox17covGw4Ns&e=
 
> > > 
> > > Signed-off-by: John McDowall 
> > 
> > I think that this has bit-rotted because I get tons of compiler 
errors
> > trying to build it.  I tried rewinding my repo to a point from 
December
> > but I still the same ones:
> > 
> > ../ovn/northd/ovn-northd.c:2669:13: error: incomplete 
definition of type 'struct nbrec_logical_port_pair_group'
> > ../ovn/northd/ovn-northd.c:2664:18: note: forward 
declaration of 'struct nbrec_logical_port_pair_group'
> > ../ovn/northd/ovn-northd.c:2669:37: error: incomplete 
definition of type 'struct nbrec_logical_port_pair_group'
> > ../ovn/northd/ovn-northd.c:2664:18: note: forward 
declaration of 'struct nbrec_logical_port_pair_group'
> > ../ovn/northd/ovn-northd.c:2673:30: error: incomplete 
definition of type 'struct nbrec_logical_port_pair_group'
> > ../ovn/northd/ovn-northd.c:2664:18: note: forward 
declaration of 'struct nbrec_logical_port_pair_group'
> > ../ovn/northd/ovn-northd.c:2674:30: error: incomplete 
definition of type 'struct nbrec_logical_port_pair_group'
> > ../ovn/northd/ovn-northd.c:2664:18: note: forward 
declaration of 'struct nbrec_logical_port_pair_group'
> > ../ovn/northd/ovn-northd.c:2701:49: error: no member named 
'options' in 'struct nbrec_acl'
> > ../ovn/northd/ovn-northd.c:2713:37: error: no member named 
'n_port_chains' in 'struct nbrec_logical_switch'
> > ../ovn/northd/ovn-northd.c:2714:30: error: no member named 
'port_chains' in 'struct nbrec_logical_switch'
> > ../ovn/northd/ovn-northd.c:2716:39: error: incomplete 
definition of type 'struct nbrec_logical_port_chain'
> > ../ovn/northd/ovn-northd.c:2710:18: note: forward 
declaration of 'struct nbrec_logical_port_chain'
> > ../ovn/northd/ovn-northd.c:2721:44: error: incomplete 
definition of type 'struct nbrec_logical_port_chain'
> > /usr/include/i386-linux-gnu/bits/string2.h:110:58: note: 
expanded from macro 'strcmp'
> > ../ovn/northd/ovn-northd.c:2710:18: note: forward 
declaration of 'struct nbrec_logical_port_chain'
> > ../ovn/northd/ovn-northd.c:2721:44: error: incomplete 
definition of type 'struct nbrec_logical_port_chain'
> > /usr/include/i386-linux-gnu/bits/string2.h:111:74: note: 
expanded from macro 'strcmp'
> > ../ovn/northd/ovn-northd.c:2710:18: note: forward 
declaration of 'struct nbrec_logical_port_chain'
> > ../ovn/northd/ovn-northd.c:2721:44: error: incomplete 
definition

Re: [ovs-dev] OVS - ODL Sync on OF Bundle in 1.3

2017-02-02 Thread Jarno Rajahalme

> On Feb 2, 2017, at 3:04 PM, Abhijit Kumbhare  
> wrote:
> 
> Jarno,
> 
> Can you please attend the next Thursday OpenDaylight OpenFlow Plugin meeting? 
> We would like to discuss the OpenDaylight bundles implementation – but would 
> also like to discuss the OvS bundles implementation since we will very likely 
> be interacting with the OvS bundles.
> 

Thanks for the invitation, but I’d rather not listen in on this meeting.

OVS simply implements bundles as specified in OpenFlow 1.4, and in the OpenFlow 
1.3 bundles extension. I’d suggest someone from your project experiment with 
bundles with OVS through "ovs-ofctl -O OpenFlow13 bundle” command. Looking at 
the ovs-ofctl man page should give pretty good idea what is possible, and then 
implementing the OpenFlow protocol interface for this in OpenDaylight should be 
pretty straightforward.

Please let me know if you have any specific questions about OVS bundles 
implementation.

Regards,

  Jarno

> The meeting time is:
> 
> Feb 9: Thursday 8 am Pacific:
> 
> Webex: 
> https://meetings.webex.com/collabs/#/meetings/detail?uuid=M2D4SMAUPJLGFE395VGN2HU7M7-9VIB&rnd=180270.26196
>  
> 
> 
> If you forget – the meeting time & link is also accessible from our main 
> page: https://wiki.opendaylight.org/view/OpenDaylight_OpenFlow_Plugin:Main 
> 
> 
> I will also send a separate meeting invite once you confirm.
> 
> Abhijit
> 
> From: Jarno Rajahalme mailto:ja...@ovn.org>>
> Date: Wednesday, January 25, 2017 at 2:16 PM
> To: Jan Scheurich  >
> Cc: Abhijit Kumbhare  >, Zoltán Balogh 
> mailto:zoltan.bal...@ericsson.com>>, László Sürü 
> mailto:laszlo.s...@ericsson.com>>, Miklós Pelyva 
> mailto:miklos.pel...@ericsson.com>>, Jozef 
> Bacigal mailto:jozef.baci...@pantheon.tech>>, 
> Tomáš Slušný  >, Prasanna Huddar 
> mailto:prasanna.hud...@ericsson.com>>, Shuva 
> Jyoti Kar  >, Sharath Kumar V 
> mailto:sharath.kuma...@ericsson.com>>, 
> Kanagasundaram K  >, Sunil Kumar G 
> mailto:sunil.g.ku...@ericsson.com>>, D 
> Arunprakash mailto:d.arunprak...@ericsson.com>>, 
> Muthukumaran K  >, "d...@openvswitch.org 
> "  >
> Subject: Re: OVS - ODL Sync on OF Bundle in 1.3
> 
> This came with too short notice for me, sorry.
> 
>   Jarno
> 
>> On Jan 24, 2017, at 11:48 PM, Jan Scheurich > > wrote:
>> 
>>  <>We’ll use below webex instead of Lync/SkypeFB:
>> https://meetings.webex.com/collabs/#/meetings/detail?uuid=MBFUHFYE8TRG3NPX76F4PLRS2N-3OWH&rnd=231184.55753
>>  
>>  
>> Rescheduled to Wednesday due to unavailability of key participants.
>> @Jarno: As you are the OVS brain behind bundles. Do you have the chance to 
>> join for a short time to discuss some  details regarding OF 1.3 bundle 
>> implementation?
>>  
>> Hi Jarno,
>>  
>> OpenDaylight folks are finally starting to implement support of OpenFlow 
>> bundles as a basis for the bundle-based hitless recync procedure we 
>> discussed earlier. As ODL does not yet have protocol support for OpenFlow 
>> versions 1.4 or 1.5, they intend to implement the bundle extension to OF 1.3 
>> as specified in EXT-230 in 
>> https://www.opennetworking.org/images/stories/downloads/sdn-resources/onf-specifications/openflow/openflow-extensions-1.3.x-pack2.zip
>>  
>> 
>>  
>> Would you have time for a short meeting on early Monday to discuss and 
>> confirm whether the OVS implementation of bundles in OF 1.3 is compliant 
>> with EXT-230 and has everything they need?
>>  
>> Thanks, Jan
>>  
>>  
>> 
> 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 1/2] ovn-controller: Fix memory leak and bug in setup_qos().

2017-02-02 Thread Numan Siddique
On Thu, Feb 2, 2017 at 11:16 PM, Ben Pfaff  wrote:

> On Thu, Feb 02, 2017 at 12:44:02PM +0530, Numan Siddique wrote:
> > On Wed, Feb 1, 2017 at 10:52 PM, Ben Pfaff  wrote:
> >
> > > The caller of netdev_get_qos() is responsible for freeing its 'details'
> > > smap.
> > >
> > >
> > Is this patch a good candidate for back porting to 2.6 branch since it's
> > fixing memory leaks ?
>
> After a quick look, I don't think that branch-2.6 has this bug.  If I'm
> mistaken, please let me know and I'll look again.
>


​Oops, sorry for the confusion.

Thanks
Numan
​
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v3 1/3] ovn: specify options:nat-addresses as "router"

2017-02-02 Thread Mickey Spiegel
Currently in OVN, the "nat-addresses" in the "options" column of a
logical switch port of type "router" must be specified manually.
Typically the user would specify as "nat-addresses" all of the NAT
external IP addresses and load balancer IP addresses that have
already been specified separately on the router.

This patch allows the logical switch port's "nat-addresses" to be
specified as the string "router".  When ovn-northd sees this string,
it automatically copies the following into the southbound
Port_Binding's "nat-addresses" in the "options" column:
The options:router-port's MAC address.
Each NAT external IP address (of any NAT type) specified on the
logical router of options:router-port.
Each load balancer IP address specified on the logical router of
options:router-port.
This will cause the controller where the gateway router resides to
issue gratuitous ARPs for each NAT external IP address and for each
load balancer IP address specified on the gateway router.

This patch is written as if it will be included in OVS 2.7.  If it
is deferred to OVS 2.8, then the OVS version mentioned in ovn-nb.xml
will need to be updated from OVS 2.7 to OVS 2.8.

Signed-off-by: Mickey Spiegel 
Acked-by: Gurucharan Shetty 
---
 ovn/northd/ovn-northd.c | 116 ++--
 ovn/ovn-nb.xml  |  42 +++---
 tests/ovn.at|  60 +
 3 files changed, 187 insertions(+), 31 deletions(-)

diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c
index a4f76a9..79ebac4 100644
--- a/ovn/northd/ovn-northd.c
+++ b/ovn/northd/ovn-northd.c
@@ -1436,6 +1436,88 @@ join_logical_ports(struct northd_context *ctx,
 }
 
 static void
+ip_address_and_port_from_lb_key(const char *key, char **ip_address,
+uint16_t *port);
+
+static void
+get_router_load_balancer_ips(const struct ovn_datapath *od,
+ struct sset *all_ips)
+{
+if (!od->nbr) {
+return;
+}
+
+for (int i = 0; i < od->nbr->n_load_balancer; i++) {
+struct nbrec_load_balancer *lb = od->nbr->load_balancer[i];
+struct smap *vips = &lb->vips;
+struct smap_node *node;
+
+SMAP_FOR_EACH (node, vips) {
+/* node->key contains IP:port or just IP. */
+char *ip_address = NULL;
+uint16_t port;
+
+ip_address_and_port_from_lb_key(node->key, &ip_address, &port);
+if (!ip_address) {
+continue;
+}
+
+if (!sset_contains(all_ips, ip_address)) {
+sset_add(all_ips, ip_address);
+}
+
+free(ip_address);
+}
+}
+}
+
+/* Returns a string consisting of the port's MAC address followed by the
+ * external IP addresses of all NAT rules defined on that router and the
+ * VIPs of all load balancers defined on that router.
+ *
+ * The caller must free the returned string with free(). */
+static char *
+get_nat_addresses(const struct ovn_port *op)
+{
+struct eth_addr mac;
+if (!op->nbrp || !op->od || !op->od->nbr
+|| (!op->od->nbr->n_nat && !op->od->nbr->n_load_balancer)
+|| !eth_addr_from_string(op->nbrp->mac, &mac)) {
+return NULL;
+}
+
+struct ds addresses = DS_EMPTY_INITIALIZER;
+ds_put_format(&addresses, ETH_ADDR_FMT, ETH_ADDR_ARGS(mac));
+
+/* Get NAT IP addresses. */
+for (int i = 0; i < op->od->nbr->n_nat; i++) {
+const struct nbrec_nat *nat;
+nat = op->od->nbr->nat[i];
+
+ovs_be32 ip, mask;
+
+char *error = ip_parse_masked(nat->external_ip, &ip, &mask);
+if (error || mask != OVS_BE32_MAX) {
+free(error);
+continue;
+}
+ds_put_format(&addresses, " %s", nat->external_ip);
+}
+
+/* A set to hold all load-balancer vips. */
+struct sset all_ips = SSET_INITIALIZER(&all_ips);
+get_router_load_balancer_ips(op->od, &all_ips);
+
+const char *ip_address;
+SSET_FOR_EACH(ip_address, &all_ips) {
+ds_put_format(&addresses, " %s", ip_address);
+}
+sset_destroy(&all_ips);
+
+return ds_steal_cstr(&addresses);
+}
+
+static void
 ovn_port_update_sbrec(const struct ovn_port *op,
   struct hmap *chassis_qdisc_queues)
 {
@@ -1524,7 +1606,15 @@ ovn_port_update_sbrec(const struct ovn_port *op,
 
 const char *nat_addresses = smap_get(&op->nbsp->options,
"nat-addresses");
-if (nat_addresses) {
+if (nat_addresses && !strcmp(nat_addresses, "router")) {
+if (op->peer && op->peer->nbrp) {
+char *nats = get_nat_addresses(op->peer);
+if (nats) {
+smap_add(&new, "nat-addresses", nats);
+free(nats);
+}
+}
+} else if (nat_addresses) {
 struct lp

[ovs-dev] [PATCH v3 2/3] ovn: Gratuitous ARP for centralized NAT rules on a distributed router

2017-02-02 Thread Mickey Spiegel
This patch extends gratuitous ARP support for NAT addresses so that it
applies to centralized NAT rules on a distributed router, in addition to
the existing gratuitous ARP support for NAT addresses on gateway routers.

Gratuitous ARP packets for centralized NAT rules on a distributed router
are only generated on the redirect-chassis.  This is achieved by extending
the syntax for "options:nat-addresses" in the southbound database,
allowing the condition 'is_chassis_resident("LPORT_NAME")' to be appended
after the MAC and IP addresses.  This condition is automatically inserted
by ovn-northd when the northbound "options:nat-addresses" is set to
"router" and the peer is a distributed gateway port.

A separate patch will be required to support gratuitous ARP for
distributed NAT rules that specify logical_port and external_mac.  Since
the MAC address differs and the logical port often resides on a different
chassis from the redirect-chassis, these addresses cannot be included in
the same "nat-addresses" string as for centralized NAT rules.

Signed-off-by: Mickey Spiegel 
---
 ovn/controller/pinctrl.c | 104 ---
 ovn/lib/ovn-util.c   |  38 ++---
 ovn/lib/ovn-util.h   |   2 +
 ovn/northd/ovn-northd.c  |  52 +---
 ovn/ovn-nb.xml   |  33 ---
 ovn/ovn-sb.xml   |  31 ++
 tests/ovn.at |  70 +++
 7 files changed, 289 insertions(+), 41 deletions(-)

diff --git a/ovn/controller/pinctrl.c b/ovn/controller/pinctrl.c
index 0cdbf87..c75f753 100644
--- a/ovn/controller/pinctrl.c
+++ b/ovn/controller/pinctrl.c
@@ -37,6 +37,7 @@
 #include "lib/dhcp.h"
 #include "ovn-controller.h"
 #include "ovn/actions.h"
+#include "ovn/lex.h"
 #include "ovn/lib/logical-fields.h"
 #include "ovn/lib/ovn-dhcp.h"
 #include "ovn/lib/ovn-util.h"
@@ -1047,7 +1048,8 @@ send_garp_update(const struct sbrec_port_binding 
*binding_rec,
 
 volatile struct garp_data *garp = NULL;
 /* Update GARP for NAT IP if it exists. */
-if (!strcmp(binding_rec->type, "l3gateway")) {
+if (!strcmp(binding_rec->type, "l3gateway")
+|| !strcmp(binding_rec->type, "patch")) {
 struct lport_addresses *laddrs = NULL;
 laddrs = shash_find_data(nat_addresses, binding_rec->logical_port);
 if (!laddrs) {
@@ -1200,24 +1202,101 @@ get_localnet_vifs_l3gwports(const struct ovsrec_bridge 
*br_int,
 
 const struct local_datapath *ld;
 HMAP_FOR_EACH (ld, hmap_node, local_datapaths) {
-if (!ld->has_local_l3gateway) {
+if (!ld->localnet_port) {
 continue;
 }
 
 for (size_t i = 0; i < ld->ldatapath->n_lports; i++) {
 const struct sbrec_port_binding *pb = ld->ldatapath->lports[i];
-if (!strcmp(pb->type, "l3gateway")
-/* && it's on this chassis */) {
+if ((ld->has_local_l3gateway && !strcmp(pb->type, "l3gateway"))
+|| !strcmp(pb->type, "patch")) {
 sset_add(local_l3gw_ports, pb->logical_port);
 }
 }
 }
 }
 
+static bool
+pinctrl_is_chassis_resident(const struct lport_index *lports,
+const struct sbrec_chassis *chassis,
+const char *port_name)
+{
+const struct sbrec_port_binding *pb
+= lport_lookup_by_name(lports, port_name);
+return pb && pb->chassis && pb->chassis == chassis;
+}
+
+/* Extracts the mac, IPv4 and IPv6 addresses, and logical port from
+ * 'addresses' which should be of the format 'MAC [IP1 IP2 ..]
+ * [is_chassis_resident("LPORT_NAME")]', where IPn should be a valid IPv4
+ * or IPv6 address, and stores them in the 'ipv4_addrs' and 'ipv6_addrs'
+ * fields of 'laddrs'.  The logical port name is stored in 'lport'.
+ *
+ * Returns true if at least 'MAC' is found in 'address', false otherwise.
+ *
+ * The caller must call destroy_lport_addresses() and free(lport). */
+static bool
+extract_addresses_with_port(const char *addresses,
+struct lport_addresses *laddrs,
+char **lport)
+{
+int ofs;
+if (!extract_addresses(addresses, laddrs, &ofs)) {
+return false;
+} else if (ofs >= strlen(addresses)) {
+return true;
+}
+
+struct lexer lexer;
+lexer_init(&lexer, addresses + ofs);
+lexer_get(&lexer);
+
+if (lexer.error || lexer.token.type != LEX_T_ID
+|| !lexer_match_id(&lexer, "is_chassis_resident")) {
+static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 1);
+VLOG_INFO_RL(&rl, "invalid syntax '%s' in address", addresses);
+lexer_destroy(&lexer);
+return true;
+}
+
+if (!lexer_match(&lexer, LEX_T_LPAREN)) {
+static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 1);
+VLOG_INFO_RL(&rl, "Syntax error: expecting '(' after "
+  "'is_chassis_resident' in address '%s

[ovs-dev] [PATCH v3 3/3] ovn: Gratuitous ARP for distributed NAT rules

2017-02-02 Thread Mickey Spiegel
This patch extends gratuitous ARP support for NAT addresses so that it
applies to distributed NAT rules on a distributed logical router.

Gratuitous ARP packets for distributed NAT rules are only generated on
the chassis where the 'logical_port' specified in the NAT rule resides.
Gratuitous ARPs are issued for the 'external_ip' address, resolving to
the 'external_mac'.

Since the MAC address varies for each distributed NAT rule, a separate
'nat_addresses' string must be generated for each distributed NAT rule.
For this reason, in the southbound 'Port_Binding',
'options:nat-addresses' is replaced by a 'nat_addresses' column that
can have an unlimited number of instances.  In order to allow for
upgrades, pinctrl in the ovn-controller can work off either the
'nat_addresses' column (if present), or 'options:nat-addresses'
otherwise.

This patch is written as if it will be included in OVS 2.7.  If it
is deferred to OVS 2.8, then the OVS version mentioned in ovn-sb.xml
will need to be updated from OVS 2.7 to OVS 2.8.

Signed-off-by: Mickey Spiegel 
---
 ovn/controller/pinctrl.c | 78 +++
 ovn/northd/ovn-northd.c  | 87 +---
 ovn/ovn-sb.ovsschema |  9 +++--
 ovn/ovn-sb.xml   | 17 --
 tests/ovn.at | 33 +++---
 5 files changed, 159 insertions(+), 65 deletions(-)

diff --git a/ovn/controller/pinctrl.c b/ovn/controller/pinctrl.c
index c75f753..7ce34f7 100644
--- a/ovn/controller/pinctrl.c
+++ b/ovn/controller/pinctrl.c
@@ -1293,6 +1293,42 @@ extract_addresses_with_port(const char *addresses,
 }
 
 static void
+consider_nat_address(const char *nat_address,
+ const struct sbrec_port_binding *pb,
+ struct sset *nat_address_keys,
+ const struct lport_index *lports,
+ const struct sbrec_chassis *chassis,
+ struct shash *nat_addresses)
+{
+struct lport_addresses *laddrs = xmalloc(sizeof *laddrs);
+char *lport = NULL;
+if (!extract_addresses_with_port(nat_address, laddrs, &lport)
+|| (!lport && !strcmp(pb->type, "patch"))) {
+free(laddrs);
+if (lport) {
+free(lport);
+}
+return;
+} else if (lport) {
+if (!pinctrl_is_chassis_resident(lports, chassis, lport)) {
+free(laddrs);
+free(lport);
+return;
+}
+free(lport);
+}
+
+int i;
+for (i = 0; i < laddrs->n_ipv4_addrs; i++) {
+char *name = xasprintf("%s-%s", pb->logical_port,
+laddrs->ipv4_addrs[i].addr_s);
+sset_add(nat_address_keys, name);
+free(name);
+}
+shash_add(nat_addresses, pb->logical_port, laddrs);
+}
+
+static void
 get_nat_addresses_and_keys(struct sset *nat_address_keys,
struct sset *local_l3gw_ports,
const struct lport_index *lports,
@@ -1306,38 +1342,24 @@ get_nat_addresses_and_keys(struct sset 
*nat_address_keys,
 if (!pb) {
 continue;
 }
-const char *nat_addresses_options = smap_get(&pb->options,
- "nat-addresses");
-if (!nat_addresses_options) {
-continue;
-}
 
-struct lport_addresses *laddrs = xmalloc(sizeof *laddrs);
-char *lport = NULL;
-if (!extract_addresses_with_port(nat_addresses_options, laddrs, &lport)
-|| (!lport && !strcmp(pb->type, "patch"))) {
-free(laddrs);
-if (lport) {
-free(lport);
+if (pb->n_nat_addresses) {
+for (int i = 0; i < pb->n_nat_addresses; i++) {
+consider_nat_address(pb->nat_addresses[i], pb,
+ nat_address_keys, lports, chassis,
+ nat_addresses);
 }
-continue;
-} else if (lport) {
-if (!pinctrl_is_chassis_resident(lports, chassis, lport)) {
-free(laddrs);
-free(lport);
-continue;
+} else {
+/* Continue to support options:nat-addresses for version
+ * upgrade. */
+const char *nat_addresses_options = smap_get(&pb->options,
+ "nat-addresses");
+if (nat_addresses_options) {
+consider_nat_address(nat_addresses_options, pb,
+ nat_address_keys, lports, chassis,
+ nat_addresses);
 }
-free(lport);
-}
-
-int i;
-for (i = 0; i < laddrs->n_ipv4_addrs; i++) {
-char *name = xasprintf("%s-%s", pb->logical_port,
-laddrs->ipv4_addrs[i].addr_s);
-sset_add(nat_address_key