Re: [ovs-dev] [PATCH v2] windows: wmi add include

2017-02-03 Thread Alin Serdean


> -Original Message-
> From: Ben Pfaff [mailto:b...@ovn.org]
> Sent: Wednesday, February 1, 2017 7:36 PM
> To: Alin Serdean 
> Cc: Guru Shetty ; Sairam Venugopal
> ; d...@openvswitch.org
> Subject: Re: [ovs-dev] [PATCH v2] windows: wmi add include
> 
> On Wed, Feb 01, 2017 at 12:56:44AM +, Alin Serdean wrote:
> >
> > > -Original Message-
> > > From: Ben Pfaff [mailto:b...@ovn.org]
> > > Sent: Tuesday, January 31, 2017 11:35 PM
> > > To: Alin Serdean 
> > > Cc: d...@openvswitch.org
> > > Subject: Re: [ovs-dev] [PATCH v2] windows: wmi add include
> > >
> > > On Mon, Jan 30, 2017 at 07:42:31AM +, Alin Serdean wrote:
> > > > Add 'util.h' to includes otherwise the result of the function
> > > > 'ovs_format_message' will be unknown and be converted to int,
> > > > triggering an abort of vswitchd.
> > >
> > > Is there a way to make MSVC fail the compile if a function is called
> > > without a visible prototype?  That would prevent this problem.
> > >
> > > Alternatively, if it's possible to get a warning but not an error,
> > > it might be possible to add something to the appveyor build script
> > > to fail that build if any such warning is reported.
> > [Alin Serdean] Thanks for the idea Ben(made me facepalm for not thinking
> of it).
> > There is something like -Wimplicit-function-declaration(-Wall) and I could
> issue an error while compiling.
> 
> Sounds good.
> 
> > I tried on Linux to add a bogus function, but make stopped during linking
> (from what I remember -Werror is needed to stop compiling).
> 
> By default, on Linux, we enable a lot of warnings but we do not make the
> compiler turn them into errors, because we assume that the common case is
> that some end user or distributor is building a known-good version of OVS
> and that they don't want the build to stop because they're using a slightly
> different version of the compiler that issuse warnings in slightly different
> circumstances.
[Alin Serdean] Good point. I think on Windows the majority will use msi's for 
ease of use, but that is another debate
> 
> But we encourage developers to configure with --enable-Werror, which
> turns all warnings into errors (that break the build).  That way, developers
> can't miss warnings that might be about important issues.
> 
> I thought that we used --enable-Werror in the travis CI system too, but now I
> see that I'm mistaken.  Maybe we should enable it there.
[Alin Serdean] I have no idea atm, I will check it out.
But even if it is enable, for sure it won't work because:
https://github.com/openvswitch/ovs/blob/master/build-aux/cccl#L148
That is the wrapper that transform the gcc arguments to cl(visual studio 
compiler) arguments.
I will spend time to try to come up with a list of matches between the two type 
of compiler arguments.
At the moment we get level one warnings from the build, because that is the 
default setting. I.E.:
https://ci.appveyor.com/project/blp/ovs/build/1.0.2752#L491
I will create a test appveyor account to set up the full scheme, this is too 
important to ignore any further.
Thanks a lot for the idea and answering my questions!
> 
> > Since this has deep implication is it ok if I stop compiling for this type 
> > of
> issue?
> 
> I think so, at least for this particular warning.
> 
> > I enabled it locally and fixing current issues that we have in the code(most
> of them are ok since the type are int, but we need to fix them).
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 0/7] userspace: Support for L3 tunneling

2017-02-03 Thread Jarno Rajahalme
Thanks for this contribution, Jan & al!

I’ll start reviewing these right away. Please find my comment addressing your 
concern below,

> On Feb 3, 2017, at 2:37 AM, Jan Scheurich  wrote:
> 
> This patch set is part of an initiative to deal with non-Ethernet packets in 
> OVS for advanced use cases like L3 tunneling or NSH. The initiative is 
> centering on the new OpenFlow concepts of "Packet type-aware pipeline" (PTAP) 
> and "Generic encap/decap actions" (EXT-382). The overall design is documented 
> in
> https://docs.google.com/document/d/1oWMYUH8sjZJzWa72o2q9kU0N6pNE-rwZcLH3-kbbDR8
> 
> This patch series implements the user-space parts of the support for L3 
> tunnels connected to the legacy Ethernet-only pipeline in OVS.
> 
> In large parts it is an adaptation of the earlier work on L3 tunneling by 
> Lorand Jakab, Simon Horman, Jiri Benc and Yi Yang, adapted to the new design 
> for packet type aware pipelines as prototyped by Jean Tourrilhes.
> 
> It supersedes user-space patches 9-11 of the most recent patch set submitted 
> by Yi Yang 
> (https://mail.openvswitch.org/pipermail/ovs-dev/2016-December/326865.html). 
> The remaining user-space parts of that series (VXLAN-GPE) will be superseded 
> by further patches in the context of this initiative.
> 
> Key changes are the introduction of explicit packet_type fields in the 
> dp_packet and flow structs, as well as a simpler handling of L3 tunnels 
> limited to the ofproto layer.
> 
>   userspace: Add packet_type in dp_packet and flow
>   userspace: Support for push_eth and pop_eth actions
>   userspace: Switching of L3 packets in L2 pipeline
>   ofproto-dpif-upcall: Intialize dump-seq of new flow to zero
>   userspace: L3 tunnel support for GRE and LISP
>   dpif-netlink: Don't send PACKET_TYPE to kernel
>   ofproto-dpif-xlate: refactor compose_output_action__
> 
> For native L3 tunneling with the user-space datapath these patches are 
> complete. To apply L3 tunneling to the kernel datapath, they are dependent on 
> the following other contributions:
> 
> *   [PATCH net-next v13 0/8] openvswitch: support for layer 3 
> encapsulated packets
>   https://mail.openvswitch.org/pipermail/ovs-dev/2016-November/324943.html
>   The net-next patch has been accepted
> *   [RFC PATCH v4 0/6] create tunnel devices using rtnetlink interface
> https://mail.openvswitch.org/pipermail/ovs-dev/2017-January/327749.html
> *   Another user-space patch to send the "layer3" option down via 
> rtnetlink interface
> *   Backports of the kernel datapath patches to the OVS tree module:
>   Patches 01-08 of patch series [PATCH v2 00/17] port Jiri Benc's L3 patchset 
> to ovs
>   https://mail.openvswitch.org/pipermail/ovs-dev/2016-December/326865.html
> *   A compatibility mechanism to send down "layer3" option to the 
> backported kernel module to support older kernels.
> 
> To ease the progress with the implementation of the subsequent user-space 
> patches for the PTAP, EXT-382 and NSH and avoid the pain of frequent 
> re-basing of these large patch series, we would like to request exemption for 
> merging the present patch series before the kernel datapath pieces are all in 
> place.
> 

In general it is fine to add functionality to the userspace datapath even if no 
kernel equivalent exists. Then, should kernel equivalent become available, we 
typically have a choice between maintaining compatibility with different 
datapath implementations of a same feature, or changing the userspace datapath 
implementation to closely match the kernel so that one kind of translation can 
serve both datapaths.  What we do not want is to accept changes to the OVS tree 
kernel module before a feature is upstreamed, as that could lead to an OVS 
release being made with one kind of ABI and then a later upstream kernel 
datapath having a different ABI, where in the worst case specific opcode values 
for actions, key fields, etc. might have different semantics, resulting into 
maintenance difficulties.  As we do not have an ABI for the userspace datapath 
these concerns typically do not apply.

  Jarno

> build-aux/extract-ofp-fields  |   1 +
> datapath/linux/compat/include/linux/openvswitch.h |  19 +
> include/openflow/openflow-common.h|   9 +
> include/openvswitch/flow.h|  30 +-
> include/openvswitch/match.h   |   1 +
> include/openvswitch/meta-flow.h   |  15 +-
> include/openvswitch/ofp-print.h   |   9 +-
> lib/dp-packet.c   |   4 +
> lib/dp-packet.h   |  37 +-
> lib/dpif-netdev.c |  62 ++--
> lib/dpif-netlink.c|  55 ++-
> lib/dpif.c|   8 +-
> lib/flow.c| 100 +++--
> lib/flow.h|   9 +
> 

Re: [ovs-dev] [ovs-dev, RFC] ovn: support for service function chaining

2017-02-03 Thread Ben Pfaff
The approach of using OVN logical ports to build the chain seems clean,
but I wonder whether it's general-purpose enough to do what users want.
If not, that is, if some users want to use physical appliances rather
than VMs or containers that are integrated into OVN, the question arises
of how to do that.  It might be that such appliances can somehow be
integrated with OVN by introducing new types of OVN logical ports, or
some other alternative might need to be invented.

I don't know what operators want.

I believe that Guru (CCed) has some of the same concerns.

On Fri, Feb 03, 2017 at 10:17:05PM +, John McDowall wrote:
> Ben,
> 
> Thanks for the thorough review, I will update the code and resubmit the 
> patch. Any thoughts on the general approach to service chaining etc?
> 
> Regards
> 
> John
> 
> On 2/3/17, 2:10 PM, "Ben Pfaff"  wrote:
> 
> On Thu, Feb 02, 2017 at 03:22:56PM -0800, jmcdow...@paloaltonetworks.com 
> wrote:
> > From: John McDowall 
> > 
> > This patchset is the first round at having Service Function Chaining
> > functionality through OVN. The implementation is done entirely
> > on the northbound side of OVN. It is a bump on the wire implementation,
> > so no attempt is being made in keeping state while packets visit each
> > hop of the chain. ACLs are used as the classifiers, with the 
> augmentation
> > of action SFC, as well as option column.
> 
> Thanks for working on this!  Sorry it's taken so long to review.
> 
> cmp_port_pair_groups() treats two pair groups as equal if either one of
> them has no keys, but this violates transitivity (see
> 
> https://urldefense.proofpoint.com/v2/url?u=https-3A__en.wikipedia.org_wiki_Comparison-5Fsort=DwIBAg=V9IgWpI5PvzTw83UyHGVSoW3Uc1MFWe5J8PTfkrzVSo=vZ6VUDaavDpfOdPQrz1ED54jEjvAE36A8TVJroVlrOQ=D8WTzjM4eBLAIUSaMpsuNQPXOo94n7zg2ZjpkjdkW64=BhGNi3da1UI7sRkk29-zyejanKHyZPF_ac-fm2Ctnis=
>  ).  Example: if 0 is a pair
> group with no keys and x and y are arbitrary pair groups, then this
> function will conclude that x <= 0 and 0 <= y.  By transitivity, we
> would also have x <= y, but that's a contradiction because x and y are
> arbitrary.
> 
> I suggest that empty pair groups should sort as less than nonempty ones,
> e.g.:
> 
> @@ -2772,8 +2772,10 @@ cmp_port_pair_groups(const void *ppg1_, const void 
> *ppg2_)
>  const struct nbrec_logical_port_pair_group *ppg1 = *ppg1p;
>  const struct nbrec_logical_port_pair_group *ppg2 = *ppg2p;
>  
> -if (ppg1->n_sortkey == 0 || ppg2->n_sortkey == 0) {
> -return 0;
> +if (ppg1->n_sortkey == 0) {
> +return ppg2->n_sortkey == 0 ? -1 : 0;
> +} else if (ppg2->n_sortkey == 0) {
> +return 1;
>  }
>  
>  const int64_t key1 = ppg1->sortkey[0];
> 
> ovn-northd.c has some new uses of VLOG without rate-limiters.  These
> should probably be rate-limited, like most VLOG calls in ovn-northd.
> 
> VLOG messages shouldn't include a new-line since VLOG takes care of that
> itself.
> 
> The code has lots of lines longer than the coding style suggested limit
> of 79 columns.
> 
> Comments should generally begin with a capital letter and end with a
> period.
> 
> In build_chain_classifier_entry(), qsort() is a remarkably expensive way
> to find the smallest element in an array.
> 
> In build_chain_classifier_entry(), it's strange to have "" in the middle
> of the string here:
> ds_put_format(ds_action, "outport = %s;"" output;", 
> first_ovn_port->json_key);
> 
> In build_chain(), please don't comment out code:
> //const uint16_t egress_inner_priority = 150;
> 
> In build_chain(), a lot of debug logging has escaped as VLOG_INFO calls.
> 
> In build_chain(), you can use xmemdup() instead of xmalloc() followed by
> memcpy().
> 
> In build_chain(), I don't know why there's a line-splicing \ here:
> VLOG_INFO("Warning: Currently lacking support for 
> more than one port-pair %"PRIuSIZE"\n", \
>   lppg->n_port_pairs);
> Also, we'd usually just use VLOG_WARN instead of writing "Warning:".
> 
> In build_chain(), please always put {} around a conditional statement,
> e.g. here:
> if (!input_port_array[j] || !output_port_array[j]) 
> obtainedAllNeededInfo = false;
> 
> Please wrap at 79 columns in ovn-architecture.7.xml as well.
> 
> I spent some time with the documentation.  I worked on getting rid of
> passive voice because it's often unclear, e.g. "something happens"
> doesn't tell the reader who does it, but "The CMS does something" does.
> I dropped a lot of the items from the life cycle that didn't seem
> really relevant to VNFs (it seemed like they 

Re: [ovs-dev] [ovs-dev, RFC] ovn: support for service function chaining

2017-02-03 Thread John McDowall
Ben,

Thanks for the thorough review, I will update the code and resubmit the patch. 
Any thoughts on the general approach to service chaining etc?

Regards

John

On 2/3/17, 2:10 PM, "Ben Pfaff"  wrote:

On Thu, Feb 02, 2017 at 03:22:56PM -0800, jmcdow...@paloaltonetworks.com 
wrote:
> From: John McDowall 
> 
> This patchset is the first round at having Service Function Chaining
> functionality through OVN. The implementation is done entirely
> on the northbound side of OVN. It is a bump on the wire implementation,
> so no attempt is being made in keeping state while packets visit each
> hop of the chain. ACLs are used as the classifiers, with the augmentation
> of action SFC, as well as option column.

Thanks for working on this!  Sorry it's taken so long to review.

cmp_port_pair_groups() treats two pair groups as equal if either one of
them has no keys, but this violates transitivity (see

https://urldefense.proofpoint.com/v2/url?u=https-3A__en.wikipedia.org_wiki_Comparison-5Fsort=DwIBAg=V9IgWpI5PvzTw83UyHGVSoW3Uc1MFWe5J8PTfkrzVSo=vZ6VUDaavDpfOdPQrz1ED54jEjvAE36A8TVJroVlrOQ=D8WTzjM4eBLAIUSaMpsuNQPXOo94n7zg2ZjpkjdkW64=BhGNi3da1UI7sRkk29-zyejanKHyZPF_ac-fm2Ctnis=
 ).  Example: if 0 is a pair
group with no keys and x and y are arbitrary pair groups, then this
function will conclude that x <= 0 and 0 <= y.  By transitivity, we
would also have x <= y, but that's a contradiction because x and y are
arbitrary.

I suggest that empty pair groups should sort as less than nonempty ones,
e.g.:

@@ -2772,8 +2772,10 @@ cmp_port_pair_groups(const void *ppg1_, const void 
*ppg2_)
 const struct nbrec_logical_port_pair_group *ppg1 = *ppg1p;
 const struct nbrec_logical_port_pair_group *ppg2 = *ppg2p;
 
-if (ppg1->n_sortkey == 0 || ppg2->n_sortkey == 0) {
-return 0;
+if (ppg1->n_sortkey == 0) {
+return ppg2->n_sortkey == 0 ? -1 : 0;
+} else if (ppg2->n_sortkey == 0) {
+return 1;
 }
 
 const int64_t key1 = ppg1->sortkey[0];

ovn-northd.c has some new uses of VLOG without rate-limiters.  These
should probably be rate-limited, like most VLOG calls in ovn-northd.

VLOG messages shouldn't include a new-line since VLOG takes care of that
itself.

The code has lots of lines longer than the coding style suggested limit
of 79 columns.

Comments should generally begin with a capital letter and end with a
period.

In build_chain_classifier_entry(), qsort() is a remarkably expensive way
to find the smallest element in an array.

In build_chain_classifier_entry(), it's strange to have "" in the middle
of the string here:
ds_put_format(ds_action, "outport = %s;"" output;", 
first_ovn_port->json_key);

In build_chain(), please don't comment out code:
//const uint16_t egress_inner_priority = 150;

In build_chain(), a lot of debug logging has escaped as VLOG_INFO calls.

In build_chain(), you can use xmemdup() instead of xmalloc() followed by
memcpy().

In build_chain(), I don't know why there's a line-splicing \ here:
VLOG_INFO("Warning: Currently lacking support for more 
than one port-pair %"PRIuSIZE"\n", \
  lppg->n_port_pairs);
Also, we'd usually just use VLOG_WARN instead of writing "Warning:".

In build_chain(), please always put {} around a conditional statement,
e.g. here:
if (!input_port_array[j] || !output_port_array[j]) 
obtainedAllNeededInfo = false;

Please wrap at 79 columns in ovn-architecture.7.xml as well.

I spent some time with the documentation.  I worked on getting rid of
passive voice because it's often unclear, e.g. "something happens"
doesn't tell the reader who does it, but "The CMS does something" does.
I dropped a lot of the items from the life cycle that didn't seem
really relevant to VNFs (it seemed like they were mostly cut-and-paste).

Some of the introductory documentation confuses me though.  The
following paragraph mentions about the OVN northbound database but it
seems to actually be about the southbound database.  I don't think this
is the right place to explain the logical flows that SFC puts into the
southbound database; that should be in ovn-northd.8.xml instead.  I
deleted the following paragraph but presumably it should be adapted for
ovn-nrothd.8.xml:

 
   Service insertion is implemented by adding 4 new flow rules into the OVN 
northbound database for
   each VNF inserted. The rules are added into the last table in the 
ingress path (L2_LKUP).
   The service insertion rules have a higher priority than the standard 
forwarding rules.
   This means that they 

Re: [ovs-dev] [PATCH] odp: Fix sample action in userspace

2017-02-03 Thread Andy Zhou
On Thu, Feb 2, 2017 at 10:56 AM, Ben Pfaff  wrote:
> On Wed, Jan 11, 2017 at 04:00:04PM -0800, Andy Zhou wrote:
>> User space implementation of the sample action is not consistent with
>> kernel datapath. In kernel datapath, the side effects of actions
>> within the sample actions are not visible to the subsequent actions.
>> Current user space handling does not follow the same logic. This patch
>> makes them consistent.
>>
>> Signed-off-by: Andy Zhou 
>
> Thanks!
>
> This looks like a bug fix, but I suspect that there's no way for
> existing code to actually trigger the bug, because I think that OVS at
> least up to 2.6 only ever puts actions that do not modify the packet
> into a sample action.
>
> Acked-by: Ben Pfaff 

This patch was sent out before we have decided to add the userspace
datapath 'clone' action.  Now that we have the clone action. The
sample action will continue to be used for the current use case, so
your comment will continue to apply.

The fix adds an extra packet copy. The current use cases are not
performance critical so may it is not a big deal. In case this becomes
an
issue, we then should revisit the logic here an see if we can avoid
packet copy in case it is not required.

Pushed to master.   Thanks for the review and comment.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 2/2] ovs-vsctl: Disallow empty bridge, port, and interface names.

2017-02-03 Thread Ben Pfaff
On Fri, Feb 03, 2017 at 01:58:14PM -0800, Andy Zhou wrote:
> On Fri, Feb 3, 2017 at 9:06 AM, Ben Pfaff  wrote:
> > Reported-by: Gabor Locsei 
> > Reported-at: 
> > https://mail.openvswitch.org/pipermail/ovs-discuss/2017-February/043613.html
> > Signed-off-by: Ben Pfaff 
> 
> Acked-by: Andy Zhou 

Thanks Girish, Andy, Gabor.  I applied these patches to master and
branch-2.7.  I also applied the first patch to branch-2.6.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [RFC PATCH v4 0/6] create tunnel devices using rtnetlink interface

2017-02-03 Thread Joe Stringer
On 3 February 2017 at 13:07, Eric Garver  wrote:
> On Thu, Feb 02, 2017 at 02:37:22PM -0800, Joe Stringer wrote:
>> There's a few patches in this series with co-authored-by, but missing
>> signed-off-by from the co-author. Also, the author does not need to be
>> listed as a co-author.
>
> Understood. I thought I followed an example from the git log, but maybe
> I was mistaken. I'll fix it in the next round.

It's possible something was a little different at a particular git
commit, but the tag explanation in the documentation is fairly clear:

http://docs.openvswitch.org/en/latest/internals/contributing/submitting-patches/#tags
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [RFC PATCH v4 3/6] dpif-netlink: Probe for out-of-tree datapath.

2017-02-03 Thread Joe Stringer
On 3 February 2017 at 12:56, Eric Garver  wrote:
> On Thu, Feb 02, 2017 at 02:50:01PM -0800, Joe Stringer wrote:
>> On 18 January 2017 at 11:45, Eric Garver  wrote:
>> > For out-of-tree datapath, only try genetlink/compat.
>> > For in-tree kernel datapath, try rtnetlink then genetlink.
>> >
>> > Signed-off-by: Eric Garver 
>> > ---
>> >  lib/dpif-netlink.c | 35 ---
>> >  1 file changed, 32 insertions(+), 3 deletions(-)
>> >
>> > diff --git a/lib/dpif-netlink.c b/lib/dpif-netlink.c
>> > index e6459f358989..769806eadbf1 100644
>> > --- a/lib/dpif-netlink.c
>> > +++ b/lib/dpif-netlink.c
>> > @@ -210,6 +210,11 @@ static int ovs_packet_family;
>> >   * Initialized by dpif_netlink_init(). */
>> >  static unsigned int ovs_vport_mcgroup;
>> >
>> > +/* rtnetlink information for OVS.
>> > + *
>> > + * Initialized by dpif_netlink_init(). */
>> > +static bool ovs_datapath_out_of_tree = false;
>>
>> Perhaps in the comment briefly mention that if this is true, devices
>> are created using OVS netlink and if it's false devices are created
>> using NETLINK_ROUTE / as LWT?
>
> I'll buff the comment to explain this.
>
>> > +
>> >  static int dpif_netlink_init(void);
>> >  static int open_dpif(const struct dpif_netlink_dp *, struct dpif **);
>> >  static uint32_t dpif_netlink_port_get_pid(const struct dpif *,
>> > @@ -1014,11 +1019,13 @@ dpif_netlink_port_add(struct dpif *dpif_, struct 
>> > netdev *netdev,
>> >odp_port_t *port_nop)
>> >  {
>> >  struct dpif_netlink *dpif = dpif_netlink_cast(dpif_);
>> > -int error;
>> > +int error = EOPNOTSUPP;
>> >
>> >  fat_rwlock_wrlock(>upcall_lock);
>> > -error = dpif_netlink_port_create_and_add(dpif, netdev, port_nop);
>> > -if (error == EOPNOTSUPP) {
>> > +if (!ovs_datapath_out_of_tree) {
>> > +error = dpif_netlink_port_create_and_add(dpif, netdev, port_nop);
>> > +}
>> > +if (error) {
>> >  error = dpif_netlink_port_add_compat(dpif, netdev, port_nop);
>> >  }
>> >  fat_rwlock_unlock(>upcall_lock);
>> > @@ -2503,6 +2510,7 @@ dpif_netlink_init(void)
>> >  {
>> >  static struct ovsthread_once once = OVSTHREAD_ONCE_INITIALIZER;
>> >  static int error;
>> > +struct netdev *netdev = NULL;
>>
>> This can be in the #ifdef __linux__; otherwise you have an unused
>> variable on other platforms.
>
> Thanks for catching. I'll address that.
>
>> >
>> >  if (ovsthread_once_start()) {
>> >  error = nl_lookup_genl_family(OVS_DATAPATH_FAMILY,
>> > @@ -2526,6 +2534,27 @@ dpif_netlink_init(void)
>> >  error = nl_lookup_genl_mcgroup(OVS_VPORT_FAMILY, 
>> > OVS_VPORT_MCGROUP,
>> > _vport_mcgroup);
>> >  }
>> > +#ifdef __linux__
>> > +if (!error) {
>> > +error = netdev_open("ovs-system-probe", "geneve", );
>> > +if (!error) {
>> > +error = netdev_geneve_create_kind(netdev, "ovs_geneve");
>> > +if (error != EOPNOTSUPP) {
>> > +if (!error) {
>> > +char namebuf[NETDEV_VPORT_NAME_BUFSIZE];
>> > +const char *dp_port;
>> > +
>> > +dp_port = netdev_vport_get_dpif_port(netdev, 
>> > namebuf,
>> > + sizeof 
>> > namebuf);
>> > +netdev_geneve_destroy(dp_port);
>> > +}
>> > +ovs_datapath_out_of_tree = true;
>> > +}
>> > +netdev_close(netdev);
>> > +error = 0;
>> > +}
>> > +}
>> > +#endif
>>
>> Geneve isn't added yet, so this patch introduces build breakage.
>
> Oops! I rearranged the patches before submitting. I'll move this patch
> to where I originally had it, after adding the newlink support.
>
>> This doesn't look like it matches the most recent discussion on this topic:
>> https://mail.openvswitch.org/pipermail/ovs-dev/2017-January/327476.html
>
> Correct. I opted not to do the LWT probe on init. We have to verify the
> device after creation anyways, which includes checking for LWT. It
> didn't seem like probing for LWT at init gained anything.

OK, thanks for the explanation. I didn't have the full context to
review these patches so was partly going by just checking that the
feedback from the previous version was applied.

If I follow correctly, the backported datapath will register to allow
device creation via NETLINK_ROUTE with the name ovs_foo for foo in
{geneve,gre,vxlan}; in this patch, if creation of such devices is
successful, or if it fails due to any reason other than EOPNOTSUPP,
then we will attempt to use these out of tree tunnel types?

I guess the point is that the backported datapath can register to
allow such device creation, but it may fail since usually it's
configured over OVS genetlink; however, the failure 

Re: [ovs-dev] [ovs-dev, RFC] ovn: support for service function chaining

2017-02-03 Thread Ben Pfaff
On Thu, Feb 02, 2017 at 03:22:56PM -0800, jmcdow...@paloaltonetworks.com wrote:
> From: John McDowall 
> 
> This patchset is the first round at having Service Function Chaining
> functionality through OVN. The implementation is done entirely
> on the northbound side of OVN. It is a bump on the wire implementation,
> so no attempt is being made in keeping state while packets visit each
> hop of the chain. ACLs are used as the classifiers, with the augmentation
> of action SFC, as well as option column.

Thanks for working on this!  Sorry it's taken so long to review.

cmp_port_pair_groups() treats two pair groups as equal if either one of
them has no keys, but this violates transitivity (see
https://en.wikipedia.org/wiki/Comparison_sort).  Example: if 0 is a pair
group with no keys and x and y are arbitrary pair groups, then this
function will conclude that x <= 0 and 0 <= y.  By transitivity, we
would also have x <= y, but that's a contradiction because x and y are
arbitrary.

I suggest that empty pair groups should sort as less than nonempty ones,
e.g.:

@@ -2772,8 +2772,10 @@ cmp_port_pair_groups(const void *ppg1_, const void 
*ppg2_)
 const struct nbrec_logical_port_pair_group *ppg1 = *ppg1p;
 const struct nbrec_logical_port_pair_group *ppg2 = *ppg2p;
 
-if (ppg1->n_sortkey == 0 || ppg2->n_sortkey == 0) {
-return 0;
+if (ppg1->n_sortkey == 0) {
+return ppg2->n_sortkey == 0 ? -1 : 0;
+} else if (ppg2->n_sortkey == 0) {
+return 1;
 }
 
 const int64_t key1 = ppg1->sortkey[0];

ovn-northd.c has some new uses of VLOG without rate-limiters.  These
should probably be rate-limited, like most VLOG calls in ovn-northd.

VLOG messages shouldn't include a new-line since VLOG takes care of that
itself.

The code has lots of lines longer than the coding style suggested limit
of 79 columns.

Comments should generally begin with a capital letter and end with a
period.

In build_chain_classifier_entry(), qsort() is a remarkably expensive way
to find the smallest element in an array.

In build_chain_classifier_entry(), it's strange to have "" in the middle
of the string here:
ds_put_format(ds_action, "outport = %s;"" output;", 
first_ovn_port->json_key);

In build_chain(), please don't comment out code:
//const uint16_t egress_inner_priority = 150;

In build_chain(), a lot of debug logging has escaped as VLOG_INFO calls.

In build_chain(), you can use xmemdup() instead of xmalloc() followed by
memcpy().

In build_chain(), I don't know why there's a line-splicing \ here:
VLOG_INFO("Warning: Currently lacking support for more than 
one port-pair %"PRIuSIZE"\n", \
  lppg->n_port_pairs);
Also, we'd usually just use VLOG_WARN instead of writing "Warning:".

In build_chain(), please always put {} around a conditional statement,
e.g. here:
if (!input_port_array[j] || !output_port_array[j]) 
obtainedAllNeededInfo = false;

Please wrap at 79 columns in ovn-architecture.7.xml as well.

I spent some time with the documentation.  I worked on getting rid of
passive voice because it's often unclear, e.g. "something happens"
doesn't tell the reader who does it, but "The CMS does something" does.
I dropped a lot of the items from the life cycle that didn't seem
really relevant to VNFs (it seemed like they were mostly cut-and-paste).

Some of the introductory documentation confuses me though.  The
following paragraph mentions about the OVN northbound database but it
seems to actually be about the southbound database.  I don't think this
is the right place to explain the logical flows that SFC puts into the
southbound database; that should be in ovn-northd.8.xml instead.  I
deleted the following paragraph but presumably it should be adapted for
ovn-nrothd.8.xml:

 
   Service insertion is implemented by adding 4 new flow rules into the OVN 
northbound database for
   each VNF inserted. The rules are added into the last table in the ingress 
path (L2_LKUP).
   The service insertion rules have a higher priority than the standard 
forwarding rules.
   This means that they override the existing forwarding rules. There are four
   new rules added for each insertion. Two ingress and two egress, The first 
ingress
   rule sends all traffic destined for the application into the VNF ingress 
port and the
   second rule takes all traffic destined to the application from the VNF 
egress port
   to the application, the priorities are such that the second rule is always 
checked
   first. In the egress direction the rules are similar if the traffic is from 
the
   application it is sent to the VNF egress port and if if is from the 
application
   and is from the VNF ingress port it is delivered to the destination.
   
 

Also this paragraph is confusing because the patch doesn't add any new
table named Service or Logical_Service or anything like that:

 
   The steps in this example refer to the 

Re: [ovs-dev] [PATCH 2/2] ovs-vsctl: Disallow empty bridge, port, and interface names.

2017-02-03 Thread Andy Zhou
On Fri, Feb 3, 2017 at 9:06 AM, Ben Pfaff  wrote:
> Reported-by: Gabor Locsei 
> Reported-at: 
> https://mail.openvswitch.org/pipermail/ovs-discuss/2017-February/043613.html
> Signed-off-by: Ben Pfaff 

Acked-by: Andy Zhou 
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2] ofp-actions: Fix variable length meta-flow field usage in flow actions.

2017-02-03 Thread Joe Stringer
On 2 February 2017 at 13:40, Joe Stringer  wrote:
> On 1 February 2017 at 11:06, Joe Stringer  wrote:
>> On 31 January 2017 at 16:57, Joe Stringer  wrote:
>>>  /* Returns true if a variable length meta-flow field 'mff' is not mapped in
>>>   * the 'vl_mff_map'. */
>>>  bool
>>> -mf_vl_mff_not_mapped(const struct mf_field *mff,
>>> - const struct vl_mff_map *vl_mff_map)
>>> +mf_vl_mff_mapped(const struct mf_field *mff, const struct vl_mff_map *map)
>>>  {
>>> -if (mff && vl_mff_map) {
>>> -if (mff->variable_len && !mff->mapped) {
>>> -return true;
>>> -}
>>> -}
>>> -
>>> -return false;
>>> +return !(map && mff && mff->variable_len && !mff->mapped);
>>>  }
>>
>> Yi-Hung pointed out offline that this reversal doesn't quite sit right
>> logically; this function is searching for a specific set of invalid
>> conditions, where there is a vl_mff_map, and the field is
>> variable-length, and it's not mapped. It's misleading to have all of
>> this covered by a function named "...mapped()".
>>
>> I suggest we retain the original logic but rename the function to
>> something like mf_vl_mff_invalid().
>
> I applied this to master.

Also cherry-picked to branch-2.7.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [RFC PATCH v4 0/6] create tunnel devices using rtnetlink interface

2017-02-03 Thread Eric Garver
On Thu, Feb 02, 2017 at 02:37:22PM -0800, Joe Stringer wrote:
> Hi Eric,

Hi Joe,

Thanks for your feedback! Much appreciated.

> There's a few patches in this series with co-authored-by, but missing
> signed-off-by from the co-author. Also, the author does not need to be
> listed as a co-author.

Understood. I thought I followed an example from the git log, but maybe
I was mistaken. I'll fix it in the next round.

> 
> When building this series I'm seeing a bunch of redefinition errors
> (across all platforms I test on, ranging from kernel 3.10 to 4.8):
> 
> /usr/include/linux/if_tunnel.h:21:9: error: preprocessor token
> GRE_CSUM redefined
> lib/packets.h:1046:9: this was the original definition
> /usr/include/linux/if_tunnel.h:22:9: error: preprocessor token
> GRE_ROUTING redefined
> lib/packets.h:1047:9: this was the original definition
> /usr/include/linux/if_tunnel.h:23:9: error: preprocessor token GRE_KEY 
> redefined
> lib/packets.h:1048:9: this was the original definition
> /usr/include/linux/if_tunnel.h:24:9: error: preprocessor token GRE_SEQ 
> redefined
> lib/packets.h:1049:9: this was the original definition
> /usr/include/linux/if_tunnel.h:25:9: error: preprocessor token
> GRE_STRICT redefined
> lib/packets.h:1050:9: this was the original definition
> /usr/include/linux/if_tunnel.h:26:9: error: preprocessor token GRE_REC 
> redefined
> lib/packets.h:1051:9: this was the original definition
> /usr/include/linux/if_tunnel.h:27:9: error: preprocessor token
> GRE_FLAGS redefined
> lib/packets.h:1052:9: this was the original definition
> /usr/include/linux/if_tunnel.h:28:9: error: preprocessor token
> GRE_VERSION redefined
> lib/packets.h:1053:9: this was the original definition

This seems to be due to including both linux/if_tunnel.h and packets.h.
linux/if_tunnel.h was added by the original patch series. I'll
investigate if it's really needed.

> 
> I have a few comments on some of the patches.

Thanks. I addressed those separately.

> On 18 January 2017 at 11:45, Eric Garver  wrote:
> > This series adds support for the creation of tunnels using the rtnetlink
> > interface. This will open the possibility for new features and flags on 
> > those
> > vports without the need to change vport compatibility code.
> >
> > Support for STT and LISP have not been added because these are not upstream 
> > yet,
> > so we don't know how the interface will be like upstream. And there are no
> > features in the current drivers right now we could make use of.
> >
> > Note: This work originally started by Thadeu Lima de Souza Cascardo.
> >
> > Testing:
> >   - kernel 4.9.3, in-tree datapath
> > - rtnetlink successfully creates devices
> >   - kernel 4.2.8, in-tree datapath
> > - rtnetlink is tried, but fails due to no COLLECT_METADATA support
> > - genetlink successfully creates devices
> >   - kernel 4.2.8, out-of-tree datapath
> > - rtnetlink is not tried
> > - genetlink successfully creates devices
> >
> > v2:
> >
> > We are able to set the MTU to UINT16_MAX since it is not restricted by the
> > driver during newlink.
> >
> > v3:
> >
> > Prefer to get type from vport before checking if device is opened. Also, 
> > disable
> > IFLA_VXLAN_LEARNING as it's not enabled on compat vports as well.
> >
> > v4:
> >   - Probe for ovs_geneve on init, this indicates out-of-tree datapath
> > - If exists, only try genetlink/compat
> > - else, try rtnetlink and fallback to genetlink/compat
> >   - Read back and verify devices created with rtnetlink
> >   - checkpatch fixes
> >
> > Eric Garver (4):
> >   dpif-netlink: Probe for out-of-tree datapath.
> >   dpif-netlink: add VXLAN creation support
> >   dpif-netlink: add GRE creation support
> >   dpif-netlink: add GENEVE creation support
> >
> > Thadeu Lima de Souza Cascardo (2):
> >   netdev: get device type from vport prefix if it uses one
> >   dpif-netlink: break out code to add compat and non-compat vports
> >
> >  lib/dpif-netlink.c | 679 
> > +
> >  lib/netdev.c   |  26 +-
> >  2 files changed, 651 insertions(+), 54 deletions(-)
> >
> > --
> > 2.10.0
> >
> > ___
> > dev mailing list
> > d...@openvswitch.org
> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] SFC patches for OVN

2017-02-03 Thread John McDowall
Jason,

I checked it against top of the git tree. So just download the patch and clone 
the lastest and then

$ git apply –directory=ovn 

If you have any questions/feedback let me know.

Regards

John
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [RFC PATCH v4 4/6] dpif-netlink: add VXLAN creation support

2017-02-03 Thread Eric Garver
On Thu, Feb 02, 2017 at 02:59:42PM -0800, Joe Stringer wrote:
> On 18 January 2017 at 11:45, Eric Garver  wrote:
> > Creates VXLAN devices using rtnetlink and tunnel metadata.
> >
> > Co-Authored-by: Thadeu Lima de Souza Cascardo 
> > Co-Authored-by: Eric Garver 
> > Signed-off-by: Eric Garver 
> > ---
> >  lib/dpif-netlink.c | 194 
> > -
> >  1 file changed, 193 insertions(+), 1 deletion(-)
> 
> I think that the vast majority of this code is linux-specific and
> should not exist in dpif-netlink.c.
> 
> Perhaps we should add a new lib/netdev-lwt.[ch], where the .h file has
> the #ifdef __linux__ logic to either declare or define the functions,
> then lib/netdev-lwt.c has the implementations of these functions.  The
> .h would always be added to the build in lib/automake.mk and the .c
> file would only be added in the #if LINUX section.

That's a nice cleanup. Thanks for the suggestion.

> One might even argue that it'd be tidier if the
> dpif_netlink_port_{create,destroy} functions were moved here (and
> renamed to something more apt).

I'll look into this as well.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] SFC patches for OVN

2017-02-03 Thread Guru Shetty
>
>
>>
> Hi Guru,
>
> Thank you for your reply. I found the post. It was posted by John McDowall
> yesterday.
>
> I'd like to try it. Just wonder which ovs version/branch it should be
> applied to. 2.6.1 or the master branch?
>
> Master branch.



> Thank you
> Jason
>
>
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [RFC PATCH v4 3/6] dpif-netlink: Probe for out-of-tree datapath.

2017-02-03 Thread Eric Garver
On Thu, Feb 02, 2017 at 02:50:01PM -0800, Joe Stringer wrote:
> On 18 January 2017 at 11:45, Eric Garver  wrote:
> > For out-of-tree datapath, only try genetlink/compat.
> > For in-tree kernel datapath, try rtnetlink then genetlink.
> >
> > Signed-off-by: Eric Garver 
> > ---
> >  lib/dpif-netlink.c | 35 ---
> >  1 file changed, 32 insertions(+), 3 deletions(-)
> >
> > diff --git a/lib/dpif-netlink.c b/lib/dpif-netlink.c
> > index e6459f358989..769806eadbf1 100644
> > --- a/lib/dpif-netlink.c
> > +++ b/lib/dpif-netlink.c
> > @@ -210,6 +210,11 @@ static int ovs_packet_family;
> >   * Initialized by dpif_netlink_init(). */
> >  static unsigned int ovs_vport_mcgroup;
> >
> > +/* rtnetlink information for OVS.
> > + *
> > + * Initialized by dpif_netlink_init(). */
> > +static bool ovs_datapath_out_of_tree = false;
> 
> Perhaps in the comment briefly mention that if this is true, devices
> are created using OVS netlink and if it's false devices are created
> using NETLINK_ROUTE / as LWT?

I'll buff the comment to explain this.

> > +
> >  static int dpif_netlink_init(void);
> >  static int open_dpif(const struct dpif_netlink_dp *, struct dpif **);
> >  static uint32_t dpif_netlink_port_get_pid(const struct dpif *,
> > @@ -1014,11 +1019,13 @@ dpif_netlink_port_add(struct dpif *dpif_, struct 
> > netdev *netdev,
> >odp_port_t *port_nop)
> >  {
> >  struct dpif_netlink *dpif = dpif_netlink_cast(dpif_);
> > -int error;
> > +int error = EOPNOTSUPP;
> >
> >  fat_rwlock_wrlock(>upcall_lock);
> > -error = dpif_netlink_port_create_and_add(dpif, netdev, port_nop);
> > -if (error == EOPNOTSUPP) {
> > +if (!ovs_datapath_out_of_tree) {
> > +error = dpif_netlink_port_create_and_add(dpif, netdev, port_nop);
> > +}
> > +if (error) {
> >  error = dpif_netlink_port_add_compat(dpif, netdev, port_nop);
> >  }
> >  fat_rwlock_unlock(>upcall_lock);
> > @@ -2503,6 +2510,7 @@ dpif_netlink_init(void)
> >  {
> >  static struct ovsthread_once once = OVSTHREAD_ONCE_INITIALIZER;
> >  static int error;
> > +struct netdev *netdev = NULL;
> 
> This can be in the #ifdef __linux__; otherwise you have an unused
> variable on other platforms.

Thanks for catching. I'll address that.

> >
> >  if (ovsthread_once_start()) {
> >  error = nl_lookup_genl_family(OVS_DATAPATH_FAMILY,
> > @@ -2526,6 +2534,27 @@ dpif_netlink_init(void)
> >  error = nl_lookup_genl_mcgroup(OVS_VPORT_FAMILY, 
> > OVS_VPORT_MCGROUP,
> > _vport_mcgroup);
> >  }
> > +#ifdef __linux__
> > +if (!error) {
> > +error = netdev_open("ovs-system-probe", "geneve", );
> > +if (!error) {
> > +error = netdev_geneve_create_kind(netdev, "ovs_geneve");
> > +if (error != EOPNOTSUPP) {
> > +if (!error) {
> > +char namebuf[NETDEV_VPORT_NAME_BUFSIZE];
> > +const char *dp_port;
> > +
> > +dp_port = netdev_vport_get_dpif_port(netdev, 
> > namebuf,
> > + sizeof 
> > namebuf);
> > +netdev_geneve_destroy(dp_port);
> > +}
> > +ovs_datapath_out_of_tree = true;
> > +}
> > +netdev_close(netdev);
> > +error = 0;
> > +}
> > +}
> > +#endif
> 
> Geneve isn't added yet, so this patch introduces build breakage.

Oops! I rearranged the patches before submitting. I'll move this patch
to where I originally had it, after adding the newlink support.

> This doesn't look like it matches the most recent discussion on this topic:
> https://mail.openvswitch.org/pipermail/ovs-dev/2017-January/327476.html

Correct. I opted not to do the LWT probe on init. We have to verify the
device after creation anyways, which includes checking for LWT. It
didn't seem like probing for LWT at init gained anything.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2] rhel: Firewall service files for OVN.

2017-02-03 Thread Russell Bryant
On Fri, Feb 3, 2017 at 3:50 PM, Russell Bryant  wrote:

> From: Marcin Mirecki 
>
> Firewall service files allowing to open firewalld
> ports required for running OVN
>
> Signed-off-by: Marcin Mirecki 
> Acked-by: Ben Pfaff 
> Signed-off-by: Russell Bryant 
> ---
>  AUTHORS.rst  | 1 +
>  rhel/automake.mk | 5
> -
>  rhel/openvswitch-fedora.spec.in  | 9
> +
>  rhel/usr_lib_firewalld_services_ovn-central-firewall-service.xml | 7
> +++
>  rhel/usr_lib_firewalld_services_ovn-host-firewall-service.xml| 6
> ++
>  5 files changed, 27 insertions(+), 1 deletion(-)
>  create mode 100644 rhel/usr_lib_firewalld_services_ovn-central-firewall-
> service.xml
>  create mode 100644 rhel/usr_lib_firewalld_services_ovn-host-firewall-
> service.xml
>

We worked on this a bit off-list.  I've applied this version to master and
branch-2.7.  Thanks!


-- 
Russell Bryant
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v2] rhel: Firewall service files for OVN.

2017-02-03 Thread Russell Bryant
From: Marcin Mirecki 

Firewall service files allowing to open firewalld
ports required for running OVN

Signed-off-by: Marcin Mirecki 
Acked-by: Ben Pfaff 
Signed-off-by: Russell Bryant 
---
 AUTHORS.rst  | 1 +
 rhel/automake.mk | 5 -
 rhel/openvswitch-fedora.spec.in  | 9 +
 rhel/usr_lib_firewalld_services_ovn-central-firewall-service.xml | 7 +++
 rhel/usr_lib_firewalld_services_ovn-host-firewall-service.xml| 6 ++
 5 files changed, 27 insertions(+), 1 deletion(-)
 create mode 100644 
rhel/usr_lib_firewalld_services_ovn-central-firewall-service.xml
 create mode 100644 
rhel/usr_lib_firewalld_services_ovn-host-firewall-service.xml

diff --git a/AUTHORS.rst b/AUTHORS.rst
index 8f3fc26..b567fcc 100644
--- a/AUTHORS.rst
+++ b/AUTHORS.rst
@@ -192,6 +192,7 @@ Luigi Rizzo ri...@iet.unipi.it
 Luis E. P.  l...@hotmail.com
 Lukasz Rzasik   lukasz.rza...@gmail.com
 Madhu Challacha...@noironetworks.com
+Marcin Mirecki  mmire...@redhat.com
 Mario Cabrera   mario.cabr...@hpe.com
 Mark D. Graymark.d.g...@intel.com
 Mark Hamilton   mhamil...@nicira.com
diff --git a/rhel/automake.mk b/rhel/automake.mk
index 45aa9b1..df4c19a 100644
--- a/rhel/automake.mk
+++ b/rhel/automake.mk
@@ -30,7 +30,10 @@ EXTRA_DIST += \
rhel/usr_lib_systemd_system_ovs-vswitchd.service \
rhel/usr_lib_systemd_system_ovn-controller.service \
rhel/usr_lib_systemd_system_ovn-controller-vtep.service \
-   rhel/usr_lib_systemd_system_ovn-northd.service
+   rhel/usr_lib_systemd_system_ovn-northd.service \
+   rhel/usr_lib_systemd_system_ovn-northd.service \
+   rhel/usr_lib_firewalld_services_ovn-central-firewall-service.xml \
+   rhel/usr_lib_firewalld_services_ovn-host-firewall-service.xml
 
 update_rhel_spec = \
   $(AM_V_GEN)($(ro_shell) && sed -e 's,[@]VERSION[@],$(VERSION),g') \
diff --git a/rhel/openvswitch-fedora.spec.in b/rhel/openvswitch-fedora.spec.in
index 65e937c..262acd0 100644
--- a/rhel/openvswitch-fedora.spec.in
+++ b/rhel/openvswitch-fedora.spec.in
@@ -127,6 +127,7 @@ files needed to build an external application.
 Summary: Open vSwitch - Open Virtual Network support
 License: ASL 2.0
 Requires: openvswitch openvswitch-ovn-common
+Requires: firewalld-filesystem
 
 %description ovn-central
 OVN, the Open Virtual Network, is a system to support virtual network
@@ -138,6 +139,7 @@ overlays and security groups.
 Summary: Open vSwitch - Open Virtual Network support
 License: ASL 2.0
 Requires: openvswitch openvswitch-ovn-common
+Requires: firewalld-filesystem
 
 %description ovn-host
 OVN, the Open Virtual Network, is a system to support virtual network
@@ -232,6 +234,11 @@ touch 
$RPM_BUILD_ROOT%{_sysconfdir}/openvswitch/system-id.conf
 install -p -m 644 -D selinux/openvswitch-custom.pp \
 
$RPM_BUILD_ROOT%{_datadir}/selinux/packages/%{name}/openvswitch-custom.pp
 
+install rhel/usr_lib_firewalld_services_ovn-central-firewall-service.xml \
+
$RPM_BUILD_ROOT%{_prefix}/lib/firewalld/services/ovn-central-firewall-service.xml
+install rhel/usr_lib_firewalld_services_ovn-host-firewall-service.xml \
+
$RPM_BUILD_ROOT%{_prefix}/lib/firewalld/services/ovn-host-firewall-service.xml
+
 # remove unpackaged files
 rm -f $RPM_BUILD_ROOT%{_bindir}/ovs-parse-backtrace \
 $RPM_BUILD_ROOT%{_sbindir}/ovs-vlan-bug-workaround \
@@ -517,11 +524,13 @@ fi
 %config %{_datadir}/openvswitch/ovn-nb.ovsschema
 %config %{_datadir}/openvswitch/ovn-sb.ovsschema
 %{_unitdir}/ovn-northd.service
+${_prefix}/lib/firewalld/services/ovn-central-firewall-service.xml
 
 %files ovn-host
 %{_bindir}/ovn-controller
 %{_mandir}/man8/ovn-controller.8*
 %{_unitdir}/ovn-controller.service
+${_prefix}/lib/firewalld/services/ovn-host-firewall-service.xml
 
 %files ovn-vtep
 %{_bindir}/ovn-controller-vtep
diff --git a/rhel/usr_lib_firewalld_services_ovn-central-firewall-service.xml 
b/rhel/usr_lib_firewalld_services_ovn-central-firewall-service.xml
new file mode 100644
index 000..e7f871d
--- /dev/null
+++ b/rhel/usr_lib_firewalld_services_ovn-central-firewall-service.xml
@@ -0,0 +1,7 @@
+
+
+  ovn-central-firewall-service
+  Firewall service for ovn central
+  
+  
+
diff --git a/rhel/usr_lib_firewalld_services_ovn-host-firewall-service.xml 
b/rhel/usr_lib_firewalld_services_ovn-host-firewall-service.xml
new file mode 100644
index 000..f606890
--- /dev/null
+++ b/rhel/usr_lib_firewalld_services_ovn-host-firewall-service.xml
@@ -0,0 +1,6 @@
+
+
+  ovn-host-firewall-service
+  Firewall service for ovn host
+  
+
-- 
2.9.3

___
dev mailing list
d...@openvswitch.org

Re: [ovs-dev] [RFC PATCH v4 2/6] dpif-netlink: break out code to add compat and non-compat vports

2017-02-03 Thread Eric Garver
On Thu, Feb 02, 2017 at 02:39:30PM -0800, Joe Stringer wrote:
> On 18 January 2017 at 11:45, Eric Garver  wrote:
> > From: Thadeu Lima de Souza Cascardo 
> >
> > The vport type for adding tunnels is now compatibility code and any new 
> > features
> > from tunnels must configure the tunnel as an interface using the tunnel 
> > metadata
> > support.
> >
> > In order to be able to add those tunnels, we need to add code to create the
> > tunnels and add them as NETDEV vports. And when there is no support to 
> > create
> > them, we need to use the compatibility code and add them as tunnel vports.
> >
> > When removing those tunnels, we need to remove the interfaces as well, and
> > detecting the right type might be important, at least to distinguish the 
> > tunnel
> > vports that we should remove and the interfaces that we shouldn't.
> >
> > Signed-off-by: Thadeu Lima de Souza Cascardo 
> 
> This patch involves refactoring as well as the base changes to
> add/remove the new tunnel types. It's always easier to review if
> functional changes are proposed separately from refactoring/cosmetic.
> (I recognize the new functions don't do anything, but they still
> combine with the refactoring). That said, with some minor changes this
> looks OK to me.

Understood. I'll have a go at spitting this patch.

> Acked-by: Joe Stringer 
> 
> > ---
> >  lib/dpif-netlink.c | 201 
> > +++--
> >  1 file changed, 148 insertions(+), 53 deletions(-)
> >
> > diff --git a/lib/dpif-netlink.c b/lib/dpif-netlink.c
> > index c8b0e37f9365..e6459f358989 100644
> > --- a/lib/dpif-netlink.c
> > +++ b/lib/dpif-netlink.c
> > @@ -781,10 +781,8 @@ get_vport_type(const struct dpif_netlink_vport *vport)
> >  }
> >
> >  static enum ovs_vport_type
> > -netdev_to_ovs_vport_type(const struct netdev *netdev)
> > +netdev_to_ovs_vport_type(const char *type)
> >  {
> > -const char *type = netdev_get_type(netdev);
> > -
> >  if (!strcmp(type, "tap") || !strcmp(type, "system")) {
> >  return OVS_VPORT_TYPE_NETDEV;
> >  } else if (!strcmp(type, "internal")) {
> > @@ -805,19 +803,14 @@ netdev_to_ovs_vport_type(const struct netdev *netdev)
> >  }
> >
> >  static int
> > -dpif_netlink_port_add__(struct dpif_netlink *dpif, struct netdev *netdev,
> > +dpif_netlink_port_add__(struct dpif_netlink *dpif, const char *name,
> > +enum ovs_vport_type type,
> > +struct ofpbuf *options,
> >  odp_port_t *port_nop)
> >  OVS_REQ_WRLOCK(dpif->upcall_lock)
> >  {
> > -const struct netdev_tunnel_config *tnl_cfg;
> > -char namebuf[NETDEV_VPORT_NAME_BUFSIZE];
> > -const char *name = netdev_vport_get_dpif_port(netdev,
> > -  namebuf, sizeof namebuf);
> > -const char *type = netdev_get_type(netdev);
> >  struct dpif_netlink_vport request, reply;
> >  struct ofpbuf *buf;
> > -uint64_t options_stub[64 / 8];
> > -struct ofpbuf options;
> >  struct nl_sock **socksp = NULL;
> >  uint32_t *upcall_pids;
> >  int error = 0;
> > @@ -832,17 +825,80 @@ dpif_netlink_port_add__(struct dpif_netlink *dpif, 
> > struct netdev *netdev,
> >  dpif_netlink_vport_init();
> >  request.cmd = OVS_VPORT_CMD_NEW;
> >  request.dp_ifindex = dpif->dp_ifindex;
> > -request.type = netdev_to_ovs_vport_type(netdev);
> > -if (request.type == OVS_VPORT_TYPE_UNSPEC) {
> > +request.type = type;
> > +request.name = name;
> > +
> > +request.port_no = *port_nop;
> > +upcall_pids = vport_socksp_to_pids(socksp, dpif->n_handlers);
> > +request.n_upcall_pids = socksp ? dpif->n_handlers : 1;
> > +request.upcall_pids = upcall_pids;
> > +
> > +if (options) {
> > +request.options = options->data;
> > +request.options_len = options->size;
> > +}
> > +
> > +error = dpif_netlink_vport_transact(, , );
> > +if (!error) {
> > +*port_nop = reply.port_no;
> > +} else {
> > +if (error == EBUSY && *port_nop != ODPP_NONE) {
> > +VLOG_INFO("%s: requested port %"PRIu32" is in use",
> > +  dpif_name(>dpif), *port_nop);
> > +}
> > +
> > +vport_del_socksp(dpif, socksp);
> > +goto exit;
> > +}
> > +
> > +if (socksp) {
> > +error = vport_add_channels(dpif, *port_nop, socksp);
> > +if (error) {
> > +VLOG_INFO("%s: could not add channel for port %s",
> > +  dpif_name(>dpif), name);
> > +
> > +/* Delete the port. */
> > +dpif_netlink_vport_init();
> > +request.cmd = OVS_VPORT_CMD_DEL;
> > +request.dp_ifindex = dpif->dp_ifindex;
> > +request.port_no = *port_nop;
> > +dpif_netlink_vport_transact(, NULL, NULL);
> > +vport_del_socksp(dpif, socksp);
> > +   

Re: [ovs-dev] SFC patches for OVN

2017-02-03 Thread Guru Shetty
On 3 February 2017 at 11:59, Shuaijun Zhang  wrote:

> Hi All,
>
> I found a link (see below) it mentioned that a set of patches are developed
> to support SFC within OVN
> https://mail.openvswitch.org/pipermail/ovs-dev/2016-November/324909.html
>
> Can anyone please let me know where are these patches, are they in the OVS
> repo in github?
>
They are in the review process. You can find a fresh batch posted in the
last couple of days in the mailing list.


>
> Thank you,
> Jason
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] SFC patches for OVN

2017-02-03 Thread Shuaijun Zhang
Hi All,

I found a link (see below) it mentioned that a set of patches are developed
to support SFC within OVN
https://mail.openvswitch.org/pipermail/ovs-dev/2016-November/324909.html

Can anyone please let me know where are these patches, are they in the OVS
repo in github?

Thank you,
Jason
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v3] dpif-netdev: Conditional EMC insert

2017-02-03 Thread Daniele Di Proietto
2017-01-31 9:55 GMT-08:00 Ciara Loftus :
> Unconditional insertion of EMC entries results in EMC thrashing at high
> numbers of parallel flows. When this occurs, the performance of the EMC
> often falls below that of the dpcls classifier, rendering the EMC
> practically useless.
>
> Instead of unconditionally inserting entries into the EMC when a miss
> occurs, use a 1% probability of insertion. This ensures that the most
> frequent flows have the highest chance of creating an entry in the EMC,
> and the probability of thrashing the EMC is also greatly reduced.
>
> The probability of insertion is configurable, via the
> other_config:emc-insert-prob option. For example the following command
> increases the insertion probability to 1/10 ie. 10%.
>
> ovs-vsctl set Open_vSwitch . other_config:emc-insert-prob=10
>
> Signed-off-by: Ciara Loftus 
> Signed-off-by: Georg Schmuecking 
> Co-authored-by: Georg Schmuecking 

Thanks for v3.

We should add Georg to AUTHORS.rst

One of the unit tests ("PMD - stats") fails, because it checks the emc
hit stats.

I think we can fix it by configuring emc-insert-prob to 1 in that test
(assuming that
we accept emc-insert-prob even without DPDK).

More comments inline.

> ---
> v3:
> - Use new dpif other_config infrastructure to tidy up how the
>   emc-insert-prob value is passed to dpif-netdev.
> v2:
> - Enable probability configurability via other_config:emc-insert-prob
>   option.
>
>  Documentation/howto/dpdk.rst | 20 +
>  NEWS |  2 ++
>  lib/dpif-netdev.c| 53 
> ++--
>  vswitchd/vswitch.xml | 16 +
>  4 files changed, 89 insertions(+), 2 deletions(-)
>
> diff --git a/Documentation/howto/dpdk.rst b/Documentation/howto/dpdk.rst
> index d1e6e89..f2e888b 100644
> --- a/Documentation/howto/dpdk.rst
> +++ b/Documentation/howto/dpdk.rst
> @@ -354,6 +354,26 @@ the `DPDK documentation
>
>  Note: Not all DPDK virtual PMD drivers have been tested and verified to work.
>
> +EMC Insertion Probability
> +-
> +By default 1 in every 100 flows are inserted into the Exact Match Cache 
> (EMC).
> +It is possible to change this insertion probability by setting the
> +``emc-insert-prob`` option::
> +
> +$ ovs-vsctl --no-wait set Open_vSwitch . other_config:emc-insert-prob=N
> +
> +where:
> +
> +``N``
> +  is a positive integer between 0 and 4294967295 (maximum unsigned 32bit 
> int).
> +
> +If ``N`` is set to 1, an insertion will be performed for every flow. The 
> lower
> +the value of ``emc-insert-prob`` the higher the probability of insertion,
> +except for the value 0 which will result in no insertions being performed and
> +thus essentially disabling the EMC.
> +
> +For more information on the EMC refer to :doc:`/intro/install/dpdk` .
> +
>  .. _dpdk-ovs-in-guest:
>
>  OVS with DPDK Inside VMs
> diff --git a/NEWS b/NEWS
> index 6838649..5a21580 100644
> --- a/NEWS
> +++ b/NEWS
> @@ -65,6 +65,8 @@ Post-v2.6.0
> device will not be available for use until a valid dpdk-devargs is
> specified.
>   * Virtual DPDK Poll Mode Driver (vdev PMD) support.
> + * EMC insertion probability is reduced to 1/100 and is configurable via
> +   the new 'other_config:emc-insert-prob' option.
> - Fedora packaging:
>   * A package upgrade does not automatically restart OVS service.
> - ovs-vswitchd/ovs-vsctl:
> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
> index 0be5db5..e514ddb 100644
> --- a/lib/dpif-netdev.c
> +++ b/lib/dpif-netdev.c
> @@ -144,6 +144,10 @@ struct netdev_flow_key {
>  #define EM_FLOW_HASH_MASK (EM_FLOW_HASH_ENTRIES - 1)
>  #define EM_FLOW_HASH_SEGS 2
>
> +/* Default EMC insert probability is 1 / DEFAULT_EM_FLOW_INSERT_PROB */
> +#define DEFAULT_EM_FLOW_INSERT_PROB 100
> +#define DEFAULT_EM_FLOW_INSERT_MIN (UINT32_MAX / DEFAULT_EM_FLOW_INSERT_PROB)
> +
>  struct emc_entry {
>  struct dp_netdev_flow *flow;
>  struct netdev_flow_key key;   /* key.hash used for emc hash value. */
> @@ -254,6 +258,9 @@ struct dp_netdev {
>  uint64_t last_tnl_conf_seq;
>
>  struct conntrack conntrack;
> +
> +/* Probability of EMC insertions is a factor of 'emc_insert_min'.*/
> +atomic_uint32_t emc_insert_min;

I'm not sure this makes sense, but I'm worried about emc_insert_min sharing the
same cacheline with other fields (in this case inside conntrack) that
are updated more
often.  Could we perhaps solve this by prefixing it with
OVS_ALIGNED_VAR(CACHE_LINE_SIZE)?

>  };
>
>  static struct dp_netdev_port *dp_netdev_lookup_port(const struct dp_netdev 
> *dp,
> @@ -1066,6 +1073,8 @@ create_dp_netdev(const char *name, const struct 
> dpif_class *class,
>
>  conntrack_init(>conntrack);
>
> +atomic_init(>emc_insert_min, DEFAULT_EM_FLOW_INSERT_MIN);
> +
>  cmap_init(>poll_threads);
>  

Re: [ovs-dev] [PATCH v2 1/1] doc: Remove "experimental" warning for userspace.

2017-02-03 Thread Daniele Di Proietto
2017-02-03 9:38 GMT-08:00 Kevin Traynor :
> On 02/02/2017 08:22 PM, Stokes, Ian wrote:
>>> On 02/02/2017 04:44 PM, Ian Stokes wrote:
 Remove the experimental warning tag in documentation regarding OVS
 deployed via userspace.

 Signed-off-by: Ian Stokes 
 ---
  Documentation/intro/install/dpdk.rst  |3 ---
  Documentation/intro/install/userspace.rst |4 
  NEWS  |2 ++
  README.rst|6 +++---
  4 files changed, 5 insertions(+), 10 deletions(-)

 diff --git a/Documentation/intro/install/dpdk.rst
 b/Documentation/intro/install/dpdk.rst
 index fff0a1a..3018590 100644
 --- a/Documentation/intro/install/dpdk.rst
 +++ b/Documentation/intro/install/dpdk.rst
 @@ -29,9 +29,6 @@ This document describes how to build and install
 Open vSwitch using a DPDK  datapath. Open vSwitch can use the DPDK
 library to operate entirely in  userspace.

 -.. warning::
 -  The DPDK support of Open vSwitch is considered 'experimental'.
 -
  Build requirements
  --

 diff --git a/Documentation/intro/install/userspace.rst
 b/Documentation/intro/install/userspace.rst
 index 0368527 ..ebd0c12 100644
 --- a/Documentation/intro/install/userspace.rst
 +++ b/Documentation/intro/install/userspace.rst
 @@ -34,10 +34,6 @@ This version of Open vSwitch should be built
 manually with ``configure`` and  been recently tested, and so Debian
 packages are not a recommended way to use  this version of Open vSwitch.

 -.. warning::
 -  The userspace-only mode of Open vSwitch is considered experimental.
 It has
 -  not been thoroughly tested.
 -
  Building and Installing
  ---

 diff --git a/NEWS b/NEWS
 index 5efcce2..8600f0e 100644
 --- a/NEWS
 +++ b/NEWS
 @@ -3,6 +3,8 @@ Post-v2.7.0
 - Tunnels:
   * Added support to set packet mark for tunnel endpoint using
 `egress_pkt_mark` OVSDB option.
 +   - Documentation:
 + * OVS deployed in userspace mode no longer tagged as experimental.
>>>
>>> I think this would be a bit clearer. What do you think?
>>>
>>> --- a/NEWS
>>> +++ b/NEWS
>>> @@ -59,4 +59,5 @@ v2.7.0 - xx xxx 
>>>   * Removed support for IPsec tunnels.
>>> - DPDK:
>>> + * Removal of experimental tag.
>>
>> We can put it under a DPDK header like you have here, I wonder though is the 
>> removal of experimental only relevant to DPDK or do we need to consider OVS 
>> userspace without DPDK also? I would think both are well tested at this 
>> stage so experimental could be removed for both. Thoughts?
>>
>
> Fine by me. I just want it to be clear that it's removal is for/covers
> the DPDK datapath.

Thanks for the patch, I'm in favor of the change for DPDK.

Unless you have a strong reason do to so, I'd prefer to keep the
experimental tag for the userspace datapath without DPDK for the
following reasons:

* I don't see a lot of valid use cases for it.  I think it is
important for testing.
* There's at least a known problem with accessing linux device in
userspace without DPDK for containers with offloads. See
ddcf96d2dcc1("system-tests: Disable offloads in userspace tests.")

What do you think?

Thanks,

Daniele

>
>>>
>>>

  v2.7.0 - xx xxx 
  -
 diff --git a/README.rst b/README.rst
 index f5cdaa5..90050e3 100644
 --- a/README.rst
 +++ b/README.rst
 @@ -38,9 +38,9 @@ following features:

  The included Linux kernel module supports Linux 3.10 and up.

 -Open vSwitch can also operate, at a cost in performance, entirely in
 userspace, -without assistance from a kernel module.  This userspace
 implementation should -be easier to port than the kernel-based switch.
>>> It is considered experimental.
 +Open vSwitch can also operate entirely in userspace without
 +assistance from a kernel module.  This userspace implementation
 +should be easier to port than the kernel-based switch.

  What's here?
  

>>
>
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] doc: Update DPDK version for 2.7 release.

2017-02-03 Thread Daniele Di Proietto
2017-02-02 8:59 GMT-08:00 Kevin Traynor :
> On 02/02/2017 04:30 PM, Ian Stokes wrote:
>> Add DPDK version required for the OVS 2.7 release in documentation.
>>
>> Signed-off-by: Ian Stokes 
>
> Acked-by: Kevin Traynor 

Thanks, pushed to master and branch-2.7

>
>> ---
>>  Documentation/faq/releases.rst |1 +
>>  1 files changed, 1 insertions(+), 0 deletions(-)
>>
>> diff --git a/Documentation/faq/releases.rst b/Documentation/faq/releases.rst
>> index fcff5c3..319c2d7 100644
>> --- a/Documentation/faq/releases.rst
>> +++ b/Documentation/faq/releases.rst
>> @@ -159,6 +159,7 @@ Q: What DPDK version does each Open vSwitch release work 
>> with?
>>  2.4.x2.0
>>  2.5.x2.2
>>  2.6.x16.07
>> +2.7.x16.11
>>   =
>>
>>  Q: I get an error like this when I configure Open vSwitch::
>>
>
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 2/3] rhel: Remove obsolete OVSDPDKVhostPort from ifdown script.

2017-02-03 Thread Daniele Di Proietto





On 02/02/2017 11:49, "Ben Pfaff"  wrote:

>On Tue, Jan 24, 2017 at 06:21:52PM -0800, Daniele Di Proietto wrote:
>> The support for vhost cuse port has been removed long ago.
>> 
>> Fixes:419876444357("netdev-dpdk: Remove dpdkvhostcuse ports")
>> Signed-off-by: Daniele Di Proietto 
>
>Acked-by: Ben Pfaff 

Thanks, pushed to master, branch-2.7 and branch-2.6
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] ovn: Add missing netdev_close in setup_qos.

2017-02-03 Thread Russell Bryant
On Thu, Feb 2, 2017 at 12:41 PM, Ben Pfaff  wrote:

> On Thu, Feb 02, 2017 at 02:04:07AM -0500, Russell Bryant wrote:
> > We missed calling netdev_close in a couple of places.  One was in an
> error
> > condition rarely hit.  The second was just introduced and would be hit in
> > all cases where QoS is not in use.
> >
> > Fixes: dc2dab6e6de5 ("ovn-controller: Configure interface QoS only if it
> would actually be used.")
> > Signed-off-by: Russell Bryant 
>
> Thank you!
>
> Acked-by: Ben Pfaff 
>

Thanks, applied to master and branch-2.7.

-- 
Russell Bryant
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] docs: Add OVS and OVN headings to pages.

2017-02-03 Thread Russell Bryant
On Thu, Feb 2, 2017 at 12:40 PM, Ben Pfaff  wrote:

> On Thu, Feb 02, 2017 at 02:02:34AM -0500, Russell Bryant wrote:
> > Update the "deep dive" and "howto" pages with headings that more clearly
> > indicate the separate lists of OVS or OVN content.  Also add a link to
> > ovn-architecture from the "deep dive" page as it seems quite relevant
> > there.
> >
> > Signed-off-by: Russell Bryant 
>
> Acked-by: Ben Pfaff 
>

Thanks, applied to master.

-- 
Russell Bryant
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [RFC PATCH 2/5] OVN: SFC Implementation: New stage for SFC and modified ACL stage

2017-02-03 Thread Ben Pfaff
Great, I'll have a look.

On Thu, Feb 02, 2017 at 11:50:17PM +, John McDowall wrote:
> Ben,
> 
> Posted an updated patch – then downloaded it from patchwork and applied to 
> top to tree and ran make check and all tests passed.
> 
> Regards
> 
> John
> 
> On 1/31/17, 3:49 PM, "Ben Pfaff"  wrote:
> 
> If the patches cannot sensibly be applied separately, then, yes, they
> should be a single patch.
> 
> On Tue, Jan 31, 2017 at 11:02:30PM +, John McDowall wrote:
> > Ah, my bad do you want me to create a single patch file?
> > 
> > Regards
> > 
> > John
> > 
> > On 1/31/17, 2:44 PM, "Ben Pfaff"  wrote:
> > 
> > Now that I glance at the patch titles, I guess that the problem 
> might be
> > that this patch depends on some of the later patches.  In general, 
> each
> > patch should apply, build, and test properly whether or not later
> > patches have been applied.
> > 
> > On Tue, Jan 31, 2017 at 10:03:21PM +, John McDowall wrote:
> > > Ben,
> > > 
> > > Let me create a new patch set against the top of tree.
> > > 
> > > Regards
> > > 
> > > John
> > > 
> > > On 1/31/17, 1:46 PM, "Ben Pfaff"  wrote:
> > > 
> > > On Tue, Dec 27, 2016 at 02:11:43PM -0800, John McDowall wrote:
> > > > This is the major body of code that implements SFC. There 
> is a new L2 stage added to
> > > > perform the chaining operations and modifications to the 
> ACL stage to direct flows
> > > > to the service chain.
> > > > 
> > > > Co-authored-by: Flavio Fernandes 
> > > > Reported at: 
> https://urldefense.proofpoint.com/v2/url?u=https-3A__mail.openvswitch.org_pipermail_ovs-2Ddiscuss_2016-2DMarch_040381.html=DwIBAg=V9IgWpI5PvzTw83UyHGVSoW3Uc1MFWe5J8PTfkrzVSo=vZ6VUDaavDpfOdPQrz1ED54jEjvAE36A8TVJroVlrOQ=0-H45ymu2qKdNfehkwCF8baQWBqDNhngIVaX4MlOpCQ=VbhqfPkju3uYqy7303Bfbz0fgnSeIi6aYsQoRwIH1PU=
>  
> > > > Reported at: 
> https://urldefense.proofpoint.com/v2/url?u=https-3A__mail.openvswitch.org_pipermail_ovs-2Ddiscuss_2016-2DMay_041359.html=DwIBAg=V9IgWpI5PvzTw83UyHGVSoW3Uc1MFWe5J8PTfkrzVSo=vZ6VUDaavDpfOdPQrz1ED54jEjvAE36A8TVJroVlrOQ=0-H45ymu2qKdNfehkwCF8baQWBqDNhngIVaX4MlOpCQ=vNieFlb8T7dsSygACJyaJiHvnrQDGCyox17covGw4Ns=
>  
> > > > 
> > > > Signed-off-by: John McDowall 
> 
> > > 
> > > I think that this has bit-rotted because I get tons of 
> compiler errors
> > > trying to build it.  I tried rewinding my repo to a point 
> from December
> > > but I still the same ones:
> > > 
> > > ../ovn/northd/ovn-northd.c:2669:13: error: incomplete 
> definition of type 'struct nbrec_logical_port_pair_group'
> > > ../ovn/northd/ovn-northd.c:2664:18: note: forward 
> declaration of 'struct nbrec_logical_port_pair_group'
> > > ../ovn/northd/ovn-northd.c:2669:37: error: incomplete 
> definition of type 'struct nbrec_logical_port_pair_group'
> > > ../ovn/northd/ovn-northd.c:2664:18: note: forward 
> declaration of 'struct nbrec_logical_port_pair_group'
> > > ../ovn/northd/ovn-northd.c:2673:30: error: incomplete 
> definition of type 'struct nbrec_logical_port_pair_group'
> > > ../ovn/northd/ovn-northd.c:2664:18: note: forward 
> declaration of 'struct nbrec_logical_port_pair_group'
> > > ../ovn/northd/ovn-northd.c:2674:30: error: incomplete 
> definition of type 'struct nbrec_logical_port_pair_group'
> > > ../ovn/northd/ovn-northd.c:2664:18: note: forward 
> declaration of 'struct nbrec_logical_port_pair_group'
> > > ../ovn/northd/ovn-northd.c:2701:49: error: no member 
> named 'options' in 'struct nbrec_acl'
> > > ../ovn/northd/ovn-northd.c:2713:37: error: no member 
> named 'n_port_chains' in 'struct nbrec_logical_switch'
> > > ../ovn/northd/ovn-northd.c:2714:30: error: no member 
> named 'port_chains' in 'struct nbrec_logical_switch'
> > > ../ovn/northd/ovn-northd.c:2716:39: error: incomplete 
> definition of type 'struct nbrec_logical_port_chain'
> > > ../ovn/northd/ovn-northd.c:2710:18: note: forward 
> declaration of 'struct nbrec_logical_port_chain'
> > > ../ovn/northd/ovn-northd.c:2721:44: error: incomplete 
> definition of type 'struct nbrec_logical_port_chain'
> > > /usr/include/i386-linux-gnu/bits/string2.h:110:58: note: 
> expanded from macro 'strcmp'
> > > ../ovn/northd/ovn-northd.c:2710:18: note: forward 
> declaration of 'struct nbrec_logical_port_chain'
> > > ../ovn/northd/ovn-northd.c:2721:44: error: incomplete 
> definition of type 'struct 

[ovs-dev] [PATCH 2/2] ovs-vsctl: Disallow empty bridge, port, and interface names.

2017-02-03 Thread Ben Pfaff
Reported-by: Gabor Locsei 
Reported-at: 
https://mail.openvswitch.org/pipermail/ovs-discuss/2017-February/043613.html
Signed-off-by: Ben Pfaff 
---
 tests/ovs-vsctl.at| 12 
 utilities/ovs-vsctl.c | 12 
 2 files changed, 24 insertions(+)

diff --git a/tests/ovs-vsctl.at b/tests/ovs-vsctl.at
index 94b431f..08eb328 100644
--- a/tests/ovs-vsctl.at
+++ b/tests/ovs-vsctl.at
@@ -204,6 +204,9 @@ AT_CHECK([RUN_OVS_VSCTL([add-br a])], [0], [], [], 
[OVS_VSCTL_CLEANUP])
 AT_CHECK([RUN_OVS_VSCTL([add-br a])], [1], [],
   [ovs-vsctl: cannot create a bridge named a because a bridge named a already 
exists
 ], [OVS_VSCTL_CLEANUP])
+AT_CHECK([RUN_OVS_VSCTL([add-br ''])], [1], [],
+  [ovs-vsctl: bridge name must not be empty string
+], [OVS_VSCTL_CLEANUP])
 OVS_VSCTL_CLEANUP
 AT_CLEANUP
 
@@ -278,6 +281,9 @@ AT_CHECK([RUN_OVS_VSCTL(
 AT_CHECK([RUN_OVS_VSCTL([add-port a a1])], [1], [],
   [ovs-vsctl: cannot create a port named a1 because a port named a1 already 
exists on bridge a
 ], [OVS_VSCTL_CLEANUP])
+AT_CHECK([RUN_OVS_VSCTL([add-port a ''])], [1], [],
+  [ovs-vsctl: port name must not be empty string
+], [OVS_VSCTL_CLEANUP])
 OVS_VSCTL_CLEANUP
 AT_CLEANUP
 
@@ -315,6 +321,12 @@ AT_CHECK([RUN_OVS_VSCTL([--may-exist add-bond a bond0 a2 
a1])], [1], [],
   [ovs-vsctl: "--may-exist add-bond a bond0 a2 a1" but bond0 actually has 
interface(s) a1, a2, a3
 ],
   [OVS_VSCTL_CLEANUP])
+AT_CHECK([RUN_OVS_VSCTL([add-bond a '' x y z])], [1], [],
+  [ovs-vsctl: port name must not be empty string
+], [OVS_VSCTL_CLEANUP])
+AT_CHECK([RUN_OVS_VSCTL([add-bond a x '' y z])], [1], [],
+  [ovs-vsctl: interface name must not be empty string
+], [OVS_VSCTL_CLEANUP])
 CHECK_BRIDGES([a, a, 0])
 CHECK_PORTS([a], [bond0])
 CHECK_IFACES([a], [a1], [a2], [a3])
diff --git a/utilities/ovs-vsctl.c b/utilities/ovs-vsctl.c
index 3719046..11e1ff6 100644
--- a/utilities/ovs-vsctl.c
+++ b/utilities/ovs-vsctl.c
@@ -1154,6 +1154,9 @@ cmd_add_br(struct ctl_context *ctx)
 int vlan;
 
 br_name = ctx->argv[1];
+if (!br_name[0]) {
+ctl_fatal("bridge name must not be empty string");
+}
 if (ctx->argc == 2) {
 parent_name = NULL;
 vlan = 0;
@@ -1503,6 +1506,15 @@ add_port(struct ctl_context *ctx,
 struct ovsrec_port *port;
 size_t i;
 
+if (!port_name[0]) {
+ctl_fatal("port name must not be empty string");
+}
+for (size_t i = 0; i < n_ifaces; i++) {
+if (!iface_names[i][0]) {
+ctl_fatal("interface name must not be empty string");
+}
+}
+
 vsctl_context_populate_cache(ctx);
 if (may_exist) {
 struct vsctl_port *vsctl_port;
-- 
2.10.2

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH 1/2] netdev: Reject empty names in netdev_open().

2017-02-03 Thread Ben Pfaff
The empty string is not a valid name for a network device.  I would have
expected that each of the netdev provider implementations would reject an
empty string, but there was a special case for Linux tap devices where they
instead caused unexpected behavior.  This commit should fix the problem for
those devices and every other kind.

Reported-by: Gabor Locsei 
Reported-at: 
https://mail.openvswitch.org/pipermail/ovs-discuss/2017-February/043613.html
Signed-off-by: Ben Pfaff 
---
 lib/netdev.c | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/lib/netdev.c b/lib/netdev.c
index 1e6bb2b..a8d8eda 100644
--- a/lib/netdev.c
+++ b/lib/netdev.c
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2016 Nicira, Inc.
+ * Copyright (c) 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2016, 2017 Nicira, 
Inc.
  *
  * Licensed under the Apache License, Version 2.0 (the "License");
  * you may not use this file except in compliance with the License.
@@ -343,6 +343,14 @@ netdev_open(const char *name, const char *type, struct 
netdev **netdevp)
 struct netdev *netdev;
 int error;
 
+if (!name[0]) {
+/* Reject empty names.  This saves the providers having to do this.  At
+ * least one screwed this up: the netdev-linux "tap" implementation
+ * passed the name directly to the Linux TUNSETIFF call, which treats
+ * an empty string as a request to generate a unique name. */
+return EINVAL;
+}
+
 netdev_initialize();
 
 ovs_mutex_lock(_mutex);
-- 
2.10.2

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [RFC] Vhost-user backends cross-version migration support

2017-02-03 Thread Michael S. Tsirkin
On Fri, Feb 03, 2017 at 03:54:52PM +, Daniel P. Berrange wrote:
> On Fri, Feb 03, 2017 at 05:34:07PM +0200, Michael S. Tsirkin wrote:
> > On Fri, Feb 03, 2017 at 03:11:10PM +0100, Maxime Coquelin wrote:
> > > Hi,
> > > 
> > > On 02/01/2017 09:35 AM, Maxime Coquelin wrote:
> > > > Hi,
> > > > 
> > > >  Few months ago, Michael reported a problem about migrating VMs relying
> > > > on vhost-user between hosts supporting different backend versions:
> > > >  - Message-Id: <20161011173526-mutt-send-email-...@kernel.org>
> > > >  - https://lists.gnu.org/archive/html/qemu-devel/2016-10/msg03026.html
> > > > 
> > > >  The goal of this thread is to draft a proposal based on the outcomes
> > > > of discussions with contributors of the different parties (DPDK/OVS
> > > > /libvirt/...).
> > > 
> > > Thanks the first feedback. It seems to converge that this is Nova's
> > > role, but not Libvirt one to manage these versions from management tool
> > > layer.
> > 
> > 
> > I think the conclusion is not that it should go up the stack.  I think
> > this will just get broken all the time.  No one understands versions and
> > stuff. Even QEMU developers get confused and break compatibility once in
> > a while.
> 
> I don't think it is really true, nor is it really specific to migration,
> it can hit any time you have a network connection that is re-opened even
> without migration. Version compatibility negotiations are an inherant part
> of any network protocol that is expected to evolve over time.
> 
> > My conclusion is that doing it from OVS side is wrong.  Migration is not
> > an OVS thing, it's a QEMU thing, and libvirt abstracts QEMU.People
> > just want migration to work, ok? It's our job to do it, we do not really
> > need a "make things work" flag.
> > 
> > If libvirt does not want to use the vhost-user protocol (which sounds
> > reasonable, it's rather complex) how about qemu providing a small
> > utility to query the port?  We could output json or whatever.
> > 
> > This can help with MTU as well.
> > 
> > And maybe it will help with nowait support - if someone uses the utility
> > to dump backend config once, QEMU can later start the device without
> > feature queries.
> 
> I don't think it is QEMU's job to deal with external components in
> this way and don't think QEMU vhost-user should be treated as special.

It's very special because it's the only time we ship parts of
device emulation out to an external service.
So they need to be versioned together with QEMU.


> This general scenario arises any time there is a QEMU backend is talking
> to an external service/process.
> 
> For example, a virtio-serial talking to a daemon in the host. This daemon
> can support different versions of the protocol being spoken, so we have
> the same compat issue on migration. Or a traditional serial device, which
> the guest is using to talk to an external daemon the host. In a few cases
> we might know what the protocol is, but in general the data stream  is
> opaque to QEMU.

Are there any examples of people actually managing this across versions
and getting it right?

AFAIK there are cases where data stream is opaque to QEMU like -cpu host
and in every such case VM is unmigrateable.

OTOH vhost-user is not opaque to QEMU at all, and that is intentional
so we can support migration.

> QEMU should not have need to learn about every single protocol that might
> be used to communicate with some external service. This is an unbounded
> set of possibilities to deal with, some of which will not even be open
> source.
> 
> This all needs to be delegated to whatever mgmt app is responsible for
> setting up these external services, as it has the right domain knowledge
> about the applications being used, to make the policy decisions that are
> suitable for its usage scenario.
> 
> Regards,
> Daniel

I disagree it should not and in fact this is exactly what QEMU often does.
This is how in the past we were mostly able to emulate any missing
backend features in QEMU to make things transparent to guest.
E.g. it can actually support the same virtio net device with userspace and
kernel backends and migrate between them.

> -- 
> |: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
> |: http://libvirt.org  -o- http://virt-manager.org :|
> |: http://entangle-photo.org   -o-http://search.cpan.org/~danberr/ :|
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [RFC] Vhost-user backends cross-version migration support

2017-02-03 Thread Daniel P. Berrange
On Fri, Feb 03, 2017 at 05:34:07PM +0200, Michael S. Tsirkin wrote:
> On Fri, Feb 03, 2017 at 03:11:10PM +0100, Maxime Coquelin wrote:
> > Hi,
> > 
> > On 02/01/2017 09:35 AM, Maxime Coquelin wrote:
> > > Hi,
> > > 
> > >  Few months ago, Michael reported a problem about migrating VMs relying
> > > on vhost-user between hosts supporting different backend versions:
> > >  - Message-Id: <20161011173526-mutt-send-email-...@kernel.org>
> > >  - https://lists.gnu.org/archive/html/qemu-devel/2016-10/msg03026.html
> > > 
> > >  The goal of this thread is to draft a proposal based on the outcomes
> > > of discussions with contributors of the different parties (DPDK/OVS
> > > /libvirt/...).
> > 
> > Thanks the first feedback. It seems to converge that this is Nova's
> > role, but not Libvirt one to manage these versions from management tool
> > layer.
> 
> 
> I think the conclusion is not that it should go up the stack.  I think
> this will just get broken all the time.  No one understands versions and
> stuff. Even QEMU developers get confused and break compatibility once in
> a while.

I don't think it is really true, nor is it really specific to migration,
it can hit any time you have a network connection that is re-opened even
without migration. Version compatibility negotiations are an inherant part
of any network protocol that is expected to evolve over time.

> My conclusion is that doing it from OVS side is wrong.  Migration is not
> an OVS thing, it's a QEMU thing, and libvirt abstracts QEMU.People
> just want migration to work, ok? It's our job to do it, we do not really
> need a "make things work" flag.
> 
> If libvirt does not want to use the vhost-user protocol (which sounds
> reasonable, it's rather complex) how about qemu providing a small
> utility to query the port?  We could output json or whatever.
> 
> This can help with MTU as well.
> 
> And maybe it will help with nowait support - if someone uses the utility
> to dump backend config once, QEMU can later start the device without
> feature queries.

I don't think it is QEMU's job to deal with external components in
this way and don't think QEMU vhost-user should be treated as special.
This general scenario arises any time there is a QEMU backend is talking
to an external service/process.

For example, a virtio-serial talking to a daemon in the host. This daemon
can support different versions of the protocol being spoken, so we have
the same compat issue on migration. Or a traditional serial device, which
the guest is using to talk to an external daemon the host. In a few cases
we might know what the protocol is, but in general the data stream  is
opaque to QEMU.

QEMU should not have need to learn about every single protocol that might
be used to communicate with some external service. This is an unbounded
set of possibilities to deal with, some of which will not even be open
source.

This all needs to be delegated to whatever mgmt app is responsible for
setting up these external services, as it has the right domain knowledge
about the applications being used, to make the policy decisions that are
suitable for its usage scenario.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://entangle-photo.org   -o-http://search.cpan.org/~danberr/ :|
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [RFC] Vhost-user backends cross-version migration support

2017-02-03 Thread Michael S. Tsirkin
On Fri, Feb 03, 2017 at 03:11:10PM +0100, Maxime Coquelin wrote:
> Hi,
> 
> On 02/01/2017 09:35 AM, Maxime Coquelin wrote:
> > Hi,
> > 
> >  Few months ago, Michael reported a problem about migrating VMs relying
> > on vhost-user between hosts supporting different backend versions:
> >  - Message-Id: <20161011173526-mutt-send-email-...@kernel.org>
> >  - https://lists.gnu.org/archive/html/qemu-devel/2016-10/msg03026.html
> > 
> >  The goal of this thread is to draft a proposal based on the outcomes
> > of discussions with contributors of the different parties (DPDK/OVS
> > /libvirt/...).
> 
> Thanks the first feedback. It seems to converge that this is Nova's
> role, but not Libvirt one to manage these versions from management tool
> layer.


I think the conclusion is not that it should go up the stack.  I think
this will just get broken all the time.  No one understands versions and
stuff. Even QEMU developers get confused and break compatibility once in
a while.

My conclusion is that doing it from OVS side is wrong.  Migration is not
an OVS thing, it's a QEMU thing, and libvirt abstracts QEMU.People
just want migration to work, ok? It's our job to do it, we do not really
need a "make things work" flag.

If libvirt does not want to use the vhost-user protocol (which sounds
reasonable, it's rather complex) how about qemu providing a small
utility to query the port?  We could output json or whatever.

This can help with MTU as well.

And maybe it will help with nowait support - if someone uses the utility
to dump backend config once, QEMU can later start the device without
feature queries.


> This change has has no impact from OVS perspective, same requirements
> apply. I am interested on OVS contributors feedback on the below design
> proposal.
> 
> Especially, I would like to have your opinion on the best way for OVS to
> expose its supported versions:
> - Static file generated at build time from version table described below
> - Entries in the OVS DB
> - Dedicated tool listing strings from the version table described below
> 
> For selecting the right version of the vhost-user backend, do you agree
> it should be done via a new parameter of the ovs-vsctl add-port command
> for dpdkvhostuser ports?
> 
> 
> > Problem statement:
> > ==
> > 
> >  When migrating a VM from one host to another, the interfaces exposed by
> > QEMU must stay unchanged in order to guarantee a successful migration.
> > In the case of vhost-user interface, parameters like supported Virtio
> > feature set, max number of queues, max vring sizes,... must remain
> > compatible. Indeed, the frontend not being re-initialized, no
> > renegotiation happens at migration time.
> > 
> >  For example, we have a VM that runs on host A, which has its vhost-user
> > backend advertising VIRTIO_F_RING_INDIRECT_DESC feature. Since the Guest
> > also support this feature, it is successfully negotiated, and guest
> > transmit packets using indirect descriptor tables, that the backend
> > knows to handle.
> > At some point, the VM is being migrated to host B, which runs an older
> > version of the backend not supporting this VIRTIO_F_RING_INDIRECT_DESC
> > feature. The migration would break, because the Guest still have the
> > VIRTIO_F_RING_INDIRECT_DESC bit sets, and the virtqueue contains some
> > decriptors pointing to indirect tables, that backend B doesn't know to
> > handle.
> >  This is just an example about Virtio features compatibility, but other
> > backend implementation details could cause other failures.
> > 
> >  What we need is to be able to query the destination host's backend to
> > ensure migration is possible. Also, we would need to query this
> > statically, even before the VM is started, to be sure it could be
> > migrated elsewhere for any reason.
> 
> ...
> 
> > 
> > Solution 3: Libvirt queries OVS for vhost backend version string: *OK*
> > ==
> > 
> > 
> >  The idea is to have a table of supported versions, associated to
> > key/value pairs. Libvirt could query the list of supported versions
> > strings for each hosts, and select the first common one among all hosts.
> > 
> >  Then, libvirt would ask OVS to probe the vhost-user interfaces in the
> > selected version (compatibility mode). For example host A runs OVS-2.7,
> > and host B OVS-2.6. Host A's OVS-2.7 has an OVS-2.6 compatibility mode
> > (e.g. with indirect descriptors disabled), which should be selected at
> > vhost-user interface probe time.
> > 
> >  Advantage of doing so is that libvirt does not need any update if new
> > keys are introduced (i.e. it does not need to know how the new keys have
> > to be handled), all these checks remain in OVS's vhost-user implementation.
> > 
> >  Ideally, we would support per vhost-user interface compatibility mode,
> > which may have an impact also on DPDK API, as the Virtio feature update
> > API is global, and not per port.
> > 
> > - 

Re: [ovs-dev] [RFC] Vhost-user backends cross-version migration support

2017-02-03 Thread Maxime Coquelin

Hi,

On 02/01/2017 09:35 AM, Maxime Coquelin wrote:

Hi,

 Few months ago, Michael reported a problem about migrating VMs relying
on vhost-user between hosts supporting different backend versions:
 - Message-Id: <20161011173526-mutt-send-email-...@kernel.org>
 - https://lists.gnu.org/archive/html/qemu-devel/2016-10/msg03026.html

 The goal of this thread is to draft a proposal based on the outcomes
of discussions with contributors of the different parties (DPDK/OVS
/libvirt/...).


Thanks the first feedback. It seems to converge that this is Nova's
role, but not Libvirt one to manage these versions from management tool
layer.

This change has has no impact from OVS perspective, same requirements
apply. I am interested on OVS contributors feedback on the below design
proposal.

Especially, I would like to have your opinion on the best way for OVS to
expose its supported versions:
- Static file generated at build time from version table described below
- Entries in the OVS DB
- Dedicated tool listing strings from the version table described below

For selecting the right version of the vhost-user backend, do you agree
it should be done via a new parameter of the ovs-vsctl add-port command
for dpdkvhostuser ports?



Problem statement:
==

 When migrating a VM from one host to another, the interfaces exposed by
QEMU must stay unchanged in order to guarantee a successful migration.
In the case of vhost-user interface, parameters like supported Virtio
feature set, max number of queues, max vring sizes,... must remain
compatible. Indeed, the frontend not being re-initialized, no
renegotiation happens at migration time.

 For example, we have a VM that runs on host A, which has its vhost-user
backend advertising VIRTIO_F_RING_INDIRECT_DESC feature. Since the Guest
also support this feature, it is successfully negotiated, and guest
transmit packets using indirect descriptor tables, that the backend
knows to handle.
At some point, the VM is being migrated to host B, which runs an older
version of the backend not supporting this VIRTIO_F_RING_INDIRECT_DESC
feature. The migration would break, because the Guest still have the
VIRTIO_F_RING_INDIRECT_DESC bit sets, and the virtqueue contains some
decriptors pointing to indirect tables, that backend B doesn't know to
handle.
 This is just an example about Virtio features compatibility, but other
backend implementation details could cause other failures.

 What we need is to be able to query the destination host's backend to
ensure migration is possible. Also, we would need to query this
statically, even before the VM is started, to be sure it could be
migrated elsewhere for any reason.


...



Solution 3: Libvirt queries OVS for vhost backend version string: *OK*
==


 The idea is to have a table of supported versions, associated to
key/value pairs. Libvirt could query the list of supported versions
strings for each hosts, and select the first common one among all hosts.

 Then, libvirt would ask OVS to probe the vhost-user interfaces in the
selected version (compatibility mode). For example host A runs OVS-2.7,
and host B OVS-2.6. Host A's OVS-2.7 has an OVS-2.6 compatibility mode
(e.g. with indirect descriptors disabled), which should be selected at
vhost-user interface probe time.

 Advantage of doing so is that libvirt does not need any update if new
keys are introduced (i.e. it does not need to know how the new keys have
to be handled), all these checks remain in OVS's vhost-user implementation.

 Ideally, we would support per vhost-user interface compatibility mode,
which may have an impact also on DPDK API, as the Virtio feature update
API is global, and not per port.

- Implementation:
-

 Goal here is just to illustrate this proposal, I'm sure you will have
good suggestion to improve it.
 In OVS vhost-user library, we would introduce a new structure, for
example (neither compiled nor tested):

struct vhostuser_compat {
 char *version;
 uint64_t virtio_features;
 uint32_t max_rx_queue_sz;
 uint32_t max_nr_queues;
};

 *version* field is the compatibility version string.
  It could be something like: "upstream.ovs-dpdk.v2.6"
  In case for example Fedora adds some more patches to its
  package that would break migration to upstream version, it could have
  a dedicated compatibility string: "fc26.ovs-dpdk.v2.6".
  In case OVS-v2.7 does not break compatibility with previous OVS-v2.6
  version, then no need to create a new compatibility entry, just keep
  v2.6 one.

 *virtio_features* field is the Virtio features set for a given
  compatibility version. When an OVS tag is to be created, it would be
  associated to a DPDK version. The Virtio features for these version
  would be stored in this field. It would allow to upgrade the DPDK
  package for example from v16.07 to v16.11 without breaking migration.
  In case the distribution wants to benefit from latests Virtio

[ovs-dev] [PATCH 1/3] datapath: backport: openvswitch: 802.1ad uapi changes.

2017-02-03 Thread Yi Yang
commit 8c146bb9d59aa2ac45222171916ece186c4b3943
Author: Thomas F Herbert 
Date:   Wed Sep 7 12:56:57 2016 -0400

openvswitch: Add support for 8021.AD

Change the description of the VLAN tpid field.

Signed-off-by: Thomas F Herbert 
Acked-by: Pravin B Shelar 
Signed-off-by: David S. Miller 

Signed-off-by: Yi Yang 
---
 datapath/linux/compat/include/linux/openvswitch.h | 17 +
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/datapath/linux/compat/include/linux/openvswitch.h 
b/datapath/linux/compat/include/linux/openvswitch.h
index 425d3a4..b6f43b3 100644
--- a/datapath/linux/compat/include/linux/openvswitch.h
+++ b/datapath/linux/compat/include/linux/openvswitch.h
@@ -629,13 +629,13 @@ struct ovs_action_push_mpls {
  * @vlan_tci: Tag control identifier (TCI) to push.  The CFI bit must be set
  * (but it will not be set in the 802.1Q header that is pushed).
  *
- * The @vlan_tpid value is typically %ETH_P_8021Q.  The only acceptable TPID
- * values are those that the kernel module also parses as 802.1Q headers, to
- * prevent %OVS_ACTION_ATTR_PUSH_VLAN followed by %OVS_ACTION_ATTR_POP_VLAN
- * from having surprising results.
+ * The @vlan_tpid value is typically %ETH_P_8021Q or %ETH_P_8021AD.
+ * The only acceptable TPID values are those that the kernel module also parses
+ * as 802.1Q or 802.1AD headers, to prevent %OVS_ACTION_ATTR_PUSH_VLAN followed
+ * by %OVS_ACTION_ATTR_POP_VLAN from having surprising results.
  */
 struct ovs_action_push_vlan {
-   __be16 vlan_tpid;   /* 802.1Q TPID. */
+   __be16 vlan_tpid;   /* 802.1Q or 802.1ad TPID. */
__be16 vlan_tci;/* 802.1Q TCI (VLAN ID and priority). */
 };
 
@@ -755,9 +755,10 @@ enum ovs_nat_attr {
  * @OVS_ACTION_ATTR_TRUNC: Output packet to port with truncated packet size.
  * @OVS_ACTION_ATTR_USERSPACE: Send packet to userspace according to nested
  * %OVS_USERSPACE_ATTR_* attributes.
- * @OVS_ACTION_ATTR_PUSH_VLAN: Push a new outermost 802.1Q header onto the
- * packet.
- * @OVS_ACTION_ATTR_POP_VLAN: Pop the outermost 802.1Q header off the packet.
+ * @OVS_ACTION_ATTR_PUSH_VLAN: Push a new outermost 802.1Q or 802.1ad header
+ * onto the packet.
+ * @OVS_ACTION_ATTR_POP_VLAN: Pop the outermost 802.1Q or 802.1ad header
+ * from the packet.
  * @OVS_ACTION_ATTR_SAMPLE: Probabilitically executes actions, as specified in
  * the nested %OVS_SAMPLE_ATTR_* attributes.
  * @OVS_ACTION_ATTR_SET: Replaces the contents of an existing header.  The
-- 
2.1.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH 0/3] Backport 802.1ad patches

2017-02-03 Thread Yi Yang
This patch set is to backport 802.1ad patches Eric Garver did to ovs,
per discussion in ovs-dev mailing list, this fix patch
"openvswitch: upcall: Fix vlan handling" depends on them, l3 patch set
depends on this patch set and "openvswitch: upcall: Fix vlan handling".

Once this patch set is merged, l3 patch set can be merged.

Yi Yang (3):
  datapath: backport: openvswitch: 802.1ad uapi changes.
  datapath: backport: vlan: Check for vlan ethernet types for 8021.q or
802.1ad
  datapath: backport: openvswitch: 802.1AD Flow handling, actions, vlan
parsing, netlink attributes

 datapath/actions.c|  16 +-
 datapath/flow.c   |  65 +++--
 datapath/flow.h   |   8 +-
 datapath/flow_netlink.c   | 311 +++---
 datapath/linux/compat/include/linux/if_vlan.h |  24 +-
 datapath/linux/compat/include/linux/openvswitch.h |  17 +-
 datapath/vport.c  |   7 +-
 7 files changed, 312 insertions(+), 136 deletions(-)

-- 
2.1.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [libvirt] [RFC] Vhost-user backends cross-version migration support

2017-02-03 Thread Maxime Coquelin



On 02/03/2017 11:11 AM, Daniel P. Berrange wrote:

On Fri, Feb 03, 2017 at 10:41:12AM +0100, Maxime Coquelin wrote:



On 02/03/2017 10:27 AM, Daniel P. Berrange wrote:

On Thu, Feb 02, 2017 at 08:27:18PM +0200, Michael S. Tsirkin wrote:

On Thu, Feb 02, 2017 at 06:21:55PM +, Daniel P. Berrange wrote:

On Thu, Feb 02, 2017 at 07:31:49PM +0200, Michael S. Tsirkin wrote:

On Thu, Feb 02, 2017 at 05:29:08PM +, Daniel P. Berrange wrote:

On Thu, Feb 02, 2017 at 07:20:35PM +0200, Michael S. Tsirkin wrote:

On Thu, Feb 02, 2017 at 05:10:28PM +, Daniel P. Berrange wrote:

On Thu, Feb 02, 2017 at 06:21:55PM +0200, Michael S. Tsirkin wrote:

On Thu, Feb 02, 2017 at 03:06:21PM +, Daniel P. Berrange wrote:

On Thu, Feb 02, 2017 at 03:14:01PM +0100, Maxime Coquelin wrote:



On 02/01/2017 12:41 PM, Daniel P. Berrange wrote:


It depends where / how in OVS it needs to be set. The only stuff libvirt
does with OVS is to run 'add-port' and 'del-port' commands via the ovs
cli tool. We pass through arguments from the port profile stored in the
XML config.

  


  

  

eg those things in  get passed as cli args to the 'add-port'
command. Soo if add-port needs this new version string, then we'd need
to add the version to the openvswitch virtualport XML.

If the version is provided to OVS in a different command, then it would
probably be outside scope of libvirt.


I think it would make sense to be a parameter of the add-port command.
But it would be for vhost-user related add-port command, I didn't find
where/if this is managed in libvirt XML.


For vhost-user, libvirt does not have any interaction with OVS at
all. If the thing that's using the vhost-user UNIX socket, in turn
connects to OVS, that's outside scope of libvirt. IOW, for vhost-user
OVS it seems like that job is for Nova / os-vif to solve.


I don't think they currently understand the issues involved in
cross-version migration though.  This is a complex subject and easy to
get wrong.  It would be significantly better to figure it out at libvirt
level since it already deals with cross-version migration issues.


Libvirt considers vhost-user to be a blackbox though - it just exposes
a UNIX socket, and whatever is on the other end is completely opaque.
The fact that the other end might plumb the data stream into openvswitch
is not something libvirt should know, as we don't want to end up having
to add custom code to libvirt for every different vhost-user server
impl.

IOW, if the version str can be passed to QEMU, and then onto vhost-user
backend in QEMU, then libvirt can be involved. If the version str has
to be given to openvswitch that's not for libvirt to deall with.

Regards,
Daniel


It's not just about passing it to QEMU. The following is needed:
- need to query version when creating VM/device
- need to query supported versions when migrating VM


Those are both things that nova can do, since it knows what the vhost-user
device in question is connected to, and so can query the versions, and check
versions before triggering migration with libvirt


It can, but then it will need to query libvirt or source for the version
string since it's in the XML.


No, it wouldn't be in the XML at all. Nova on the source queries what
vhostuser version it has and what the target host has, and can prevent
the migration if they're incompatible.


This is not sufficient. Exactly the same as qemu machine type,
this must be preserved from time of install and moved
wherever VM goes.


Nova has the ability todo that.


I dont think libvirt has to be
involved at all for this, as all the info can be obtained by nova/os-vif
from the vhostuser impl it has configured.


Given we are already confused at libvirt level, I find it
highly unlikely upper levels will do the right thing.


The only confusion is about what must consume the data. I mistakenly
thought it was being proposed that the qemu vhostuser backend was
being given a version string. Now it seems the version info must be
set against the 3rd party component that vhostuser connects to, and
that is outside scope of libvirt. That is entirely managed by Nova
today, so Nova is the right thing to manage this versioning info.


This is my understanding, with as example using Nova:
1. Before starting the VM, Nova queries source host's OVS and all
   possible destination hosts' OVS to retrieve supported versions
2. Nova chooses the first common version supported by all hosts
3. Nova create the dpdkvhostuser interfaces in OVS with passing the
   selected version as parameter

That should be enough, right? Or maybe I miss something?


It isn't realistic to check all possible destination hosts when first
starting QEMU, not least becasue they may not yet exist. New hardware
may be deployed *after* QEMU is first started.

When starting the VM, it should just default to using whatever the
latest version is on the host in question.

There can be an optional host configuration parameter that admins can
set to 

[ovs-dev] [PATCH 7/7] ofproto-dpif-xlate: refactor compose_output_action__

2017-02-03 Thread Jan Scheurich
The very long function compose_output_action__() has been re-factored to make
the different cases for output to patch-port, native tunnel port, kernel tunnel
port, recirculation, or termination of a native tunnel at output to LOCAL port
clearer. Larger, self-contained blocks have been split out into separate
functions.

Signed-off-by; Jan Scheurich 
Co-authored-by: Zoltan Balogh 
---
 ofproto/ofproto-dpif-xlate.c | 432 +++
 1 file changed, 230 insertions(+), 202 deletions(-)

diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
index 72dabf8..a19b9f2 100644
--- a/ofproto/ofproto-dpif-xlate.c
+++ b/ofproto/ofproto-dpif-xlate.c
@@ -3072,40 +3072,26 @@ xlate_flow_is_protected(const struct xlate_ctx *ctx, 
const struct flow *flow, co
 xport_in->xbundle->protected && xport_out->xbundle->protected);
 }
 
-static void
-compose_output_action__(struct xlate_ctx *ctx, ofp_port_t ofp_port,
-const struct xlate_bond_recirc *xr, bool check_stp)
+static bool
+check_output_prerequisites(struct xlate_ctx *ctx,
+   const struct xport *xport,
+   struct flow *flow,
+   bool check_stp)
 {
-const struct xport *xport = get_ofp_port(ctx->xbridge, ofp_port);
-struct flow_wildcards *wc = ctx->wc;
-struct flow *flow = >xin->flow;
-struct flow_tnl flow_tnl;
-ovs_be16 flow_vlan_tci;
-uint32_t flow_pkt_mark;
-uint8_t flow_nw_tos;
-odp_port_t out_port, odp_port;
-bool tnl_push_pop_send = false;
-uint8_t dscp;
-
-/* If 'struct flow' gets additional metadata, we'll need to zero it out
- * before traversing a patch port. */
-BUILD_ASSERT_DECL(FLOW_WC_SEQ == 37);
-memset(_tnl, 0, sizeof flow_tnl);
-
 if (!xport) {
 xlate_report(ctx, OFT_WARN, "Nonexistent output port");
-return;
+return false;
 } else if (xport->config & OFPUTIL_PC_NO_FWD) {
 xlate_report(ctx, OFT_DETAIL, "OFPPC_NO_FWD set, skipping output");
-return;
+return false;
 } else if (ctx->mirror_snaplen != 0 && xport->odp_port == ODPP_NONE) {
 xlate_report(ctx, OFT_WARN,
  "Mirror truncate to ODPP_NONE, skipping output");
-return;
+return false;
 } else if (xlate_flow_is_protected(ctx, flow, xport)) {
 xlate_report(ctx, OFT_WARN,
  "Flow is between protected ports, skipping output.");
-return;
+return false;
 } else if (check_stp) {
 if (is_stp(>base_flow)) {
 if (!xport_stp_should_forward_bpdu(xport) &&
@@ -3119,7 +3105,7 @@ compose_output_action__(struct xlate_ctx *ctx, ofp_port_t 
ofp_port,
  "RSTP not managing BPDU in this state, "
  "skipping bpdu output");
 }
-return;
+return false;
 }
 } else if (!xport_stp_forward_state(xport) ||
!xport_rstp_forward_state(xport)) {
@@ -3130,153 +3116,190 @@ compose_output_action__(struct xlate_ctx *ctx, 
ofp_port_t ofp_port,
 xlate_report(ctx, OFT_WARN,
  "RSTP not in forwarding state, skipping output");
 }
-return;
+return false;
 }
 }
+return true;
+}
 
-if (flow->packet_type == PT_ETH && xport->is_layer3 ) {
-/* Ethernet packet to L3 outport -> pop ethernet header. */
-flow->packet_type = PACKET_TYPE(OFPHTN_ETHERTYPE, 
ntohs(flow->dl_type));
+static void
+xlate_output_patch_port(struct xlate_ctx *ctx,
+const struct xport *xport)
+{
+const struct xport *peer = xport->peer;
+struct flow *flow = >xin->flow;
+struct flow old_flow = ctx->xin->flow;
+struct flow_tnl old_flow_tnl_wc = ctx->wc->masks.tunnel;
+bool old_conntrack = ctx->conntracked;
+bool old_was_mpls = ctx->was_mpls;
+ovs_version_t old_version = ctx->xin->tables_version;
+struct ofpbuf old_stack = ctx->stack;
+uint8_t new_stack[1024];
+struct ofpbuf old_action_set = ctx->action_set;
+struct ovs_list *old_trace = ctx->xin->trace;
+uint64_t actset_stub[1024 / 8];
+
+ofpbuf_use_stub(>stack, new_stack, sizeof new_stack);
+ofpbuf_use_stub(>action_set, actset_stub, sizeof actset_stub);
+flow->in_port.ofp_port = peer->ofp_port;
+flow->metadata = htonll(0);
+memset(>tunnel, 0, sizeof flow->tunnel);
+flow->tunnel.metadata.tab = ofproto_get_tun_tab(
+>xbridge->ofproto->up);
+ctx->wc->masks.tunnel.metadata.tab = flow->tunnel.metadata.tab;
+memset(flow->regs, 0, sizeof flow->regs);
+flow->actset_output = OFPP_UNSET;
+clear_conntrack(ctx);
+ctx->xin->trace = xlate_report(ctx, OFT_BRIDGE,
+   "bridge(\"%s\")", 

[ovs-dev] [PATCH 6/7] dpif-netlink: Don't send PACKET_TYPE to kernel

2017-02-03 Thread Jan Scheurich
The kernel datapath does not support the packet_type match field.
Instead it encodes the packet type implicitly by the presence or absence of
the Ethernet attribute in the flow key and mask.

This patch filters the PACKET_TYPE attribute out of netlink flow key and
mask to be sent to the kernel datapath.

Signed-off-by; Zoltan Balogh 

---
 lib/dpif-netlink.c | 37 -
 1 file changed, 32 insertions(+), 5 deletions(-)

diff --git a/lib/dpif-netlink.c b/lib/dpif-netlink.c
index a70d11c..5479c3a 100644
--- a/lib/dpif-netlink.c
+++ b/lib/dpif-netlink.c
@@ -2863,6 +2863,34 @@ dpif_netlink_flow_from_ofpbuf(struct dpif_netlink_flow 
*flow,
 return 0;
 }
 
+static inline void
+nl_msg_put_exclude_packet_type(struct ofpbuf* buf, uint16_t type,
+   const struct nlattr *data,
+   uint16_t data_len)
+{
+const struct nlattr *packet_type = nl_attr_find__(data, data_len,
+  
OVS_KEY_ATTR_PACKET_TYPE);
+if (packet_type) {
+/* exclude PACKET_TYPE Netlink attribute. */
+ovs_assert(NLA_ALIGN(packet_type->nla_len) == NLA_HDRLEN + 
sizeof(uint32_t));
+
+size_t packet_type_len = NLA_HDRLEN + sizeof(uint32_t);
+size_t first_chunk_size =
+(size_t)((uint8_t *)packet_type - (uint8_t *)data);
+size_t second_chunk_size = data_len - first_chunk_size
+   - packet_type_len;
+uint8_t *first_attr = NULL;
+struct nlattr *next_attr = nl_attr_next(packet_type);
+
+first_attr = nl_msg_put_unspec_uninit(buf, type,
+  data_len - packet_type_len);
+memcpy(first_attr, data, first_chunk_size);
+memcpy(first_attr + first_chunk_size, next_attr, second_chunk_size);
+} else {
+nl_msg_put_unspec(buf, type, data, data_len);
+}
+}
+
 /* Appends to 'buf' (which must initially be empty) a "struct ovs_header"
  * followed by Netlink attributes corresponding to 'flow'. */
 static void
@@ -2889,13 +2917,12 @@ dpif_netlink_flow_to_ofpbuf(const struct 
dpif_netlink_flow *flow,
 }
 if (!flow->ufid_terse || !flow->ufid_present) {
 if (flow->key_len) {
-nl_msg_put_unspec(buf, OVS_FLOW_ATTR_KEY,
-  flow->key, flow->key_len);
+nl_msg_put_exclude_packet_type(buf, OVS_FLOW_ATTR_KEY, flow->key,
+   flow->key_len);
 }
-
 if (flow->mask_len) {
-nl_msg_put_unspec(buf, OVS_FLOW_ATTR_MASK,
-  flow->mask, flow->mask_len);
+nl_msg_put_exclude_packet_type(buf, OVS_FLOW_ATTR_MASK, flow->mask,
+   flow->mask_len);
 }
 if (flow->actions || flow->actions_len) {
 nl_msg_put_unspec(buf, OVS_FLOW_ATTR_ACTIONS,
-- 
1.9.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH 3/7] userspace: Switching of L3 packets in L2 pipeline

2017-02-03 Thread Jan Scheurich
Ports have a new layer3 attribute if they send/receive L3 packets.

The packet_type included in structs dp_packet and flow is considered in
ofproto-dpif. The classical L2 match fields (dl_src, dl_dst, dl_type, and
vlan_tci, vlan_vid, vlan_pcp) now have Ethernet as pre-requisite.

A dummy ethernet header is pushed to L3 packets received from L3 ports
before the pipeline processing starts. The ethernet header is popped
before sending a packet to a L3 port.

For datapath ports that can receive L2 or L3 packets, the packet_type
becomes part of the flow key for datapath flows and is handled
appropriately in dpif-netdev.

Signed-off-by: Lorand Jakab 
Signed-off-by: Simon Horman 
Signed-off-by: Jiri Benc 
Signed-off-by: Yi Yang 
Signed-off-by: Jan Scheurich 
Co-authored-by: Zoltan Balogh 

---
 build-aux/extract-ofp-fields  |   1 +
 datapath/linux/compat/include/linux/openvswitch.h |   2 +
 include/openvswitch/match.h   |   1 +
 include/openvswitch/meta-flow.h   |  15 +-
 lib/dpif-netdev.c |  57 
 lib/dpif-netlink.c|   2 +-
 lib/match.c   |  19 ++-
 lib/meta-flow.c   |   2 +
 lib/netdev-vport.c|   8 +-
 lib/netdev.h  |   1 +
 lib/odp-execute.c |   2 +
 lib/odp-util.c| 159 ++
 lib/odp-util.h|   6 +-
 lib/packets.h |   1 -
 ofproto/ofproto-dpif-sflow.c  |   1 +
 ofproto/ofproto-dpif-upcall.c |   4 +-
 ofproto/ofproto-dpif-xlate.c  |  50 ++-
 ofproto/ofproto-dpif.c|   6 +-
 18 files changed, 251 insertions(+), 86 deletions(-)

diff --git a/build-aux/extract-ofp-fields b/build-aux/extract-ofp-fields
index 40f1bb2..fc6ce1e 100755
--- a/build-aux/extract-ofp-fields
+++ b/build-aux/extract-ofp-fields
@@ -39,6 +39,7 @@ FORMATTING = {"decimal":("MFS_DECIMAL",  1,   
8),
   "TCP flags":  ("MFS_TCP_FLAGS",2,   2)}
 
 PREREQS = {"none": "MFP_NONE",
+   "Ethernet": "MFP_ETHERNET",
"ARP": "MFP_ARP",
"VLAN VID": "MFP_VLAN_VID",
"IPv4": "MFP_IPV4",
diff --git a/datapath/linux/compat/include/linux/openvswitch.h 
b/datapath/linux/compat/include/linux/openvswitch.h
index d78d8a8..9e4f0b1 100644
--- a/datapath/linux/compat/include/linux/openvswitch.h
+++ b/datapath/linux/compat/include/linux/openvswitch.h
@@ -361,6 +361,8 @@ enum ovs_key_attr {
/* Only used within kernel data path. */
OVS_KEY_ATTR_TUNNEL_INFO,  /* struct ovs_tunnel_info */
 #endif
+
+   OVS_KEY_ATTR_PACKET_TYPE,  /* be32 packet type */
__OVS_KEY_ATTR_MAX
 };
 
diff --git a/include/openvswitch/match.h b/include/openvswitch/match.h
index 0b5f050..4f4e096 100644
--- a/include/openvswitch/match.h
+++ b/include/openvswitch/match.h
@@ -99,6 +99,7 @@ void match_set_ct_mark(struct match *, uint32_t ct_mark);
 void match_set_ct_mark_masked(struct match *, uint32_t ct_mark, uint32_t mask);
 void match_set_ct_label(struct match *, ovs_u128 ct_label);
 void match_set_ct_label_masked(struct match *, ovs_u128 ct_label, ovs_u128 
mask);
+void match_set_packet_type(struct match *, ovs_be32 packet_type);
 void match_set_skb_priority(struct match *, uint32_t skb_priority);
 void match_set_dl_type(struct match *, ovs_be16);
 void match_set_dl_src(struct match *, const struct eth_addr );
diff --git a/include/openvswitch/meta-flow.h b/include/openvswitch/meta-flow.h
index d5c0971..9daca36 100644
--- a/include/openvswitch/meta-flow.h
+++ b/include/openvswitch/meta-flow.h
@@ -854,7 +854,7 @@ enum OVS_PACKED_ENUM mf_field_id {
  * Type: MAC.
  * Maskable: bitwise.
  * Formatting: Ethernet.
- * Prerequisites: none.
+ * Prerequisites: Ethernet.
  * Access: read/write.
  * NXM: NXM_OF_ETH_SRC(2) since v1.1.
  * OXM: OXM_OF_ETH_SRC(4) since OF1.2 and v1.7.
@@ -870,7 +870,7 @@ enum OVS_PACKED_ENUM mf_field_id {
  * Type: MAC.
  * Maskable: bitwise.
  * Formatting: Ethernet.
- * Prerequisites: none.
+ * Prerequisites: Ethernet.
  * Access: read/write.
  * NXM: NXM_OF_ETH_DST(1) since v1.1.
  * OXM: OXM_OF_ETH_DST(3) since OF1.2 and v1.7.
@@ -889,7 +889,7 @@ enum OVS_PACKED_ENUM mf_field_id {
  * Type: be16.
  * Maskable: no.
  * Formatting: hexadecimal.
- * Prerequisites: none.
+ * Prerequisites: Ethernet.
  * Access: read-only.
  * NXM: NXM_OF_ETH_TYPE(3) since v1.1.
  * OXM: OXM_OF_ETH_TYPE(5) since OF1.2 and v1.7.
@@ -919,7 +919,7 @@ enum 

[ovs-dev] [PATCH 5/7] userspace: L3 tunnel support for GRE and LISP

2017-02-03 Thread Jan Scheurich
Add a boolean "layer3" configuration option for tunnel vports.
The layer3 option defaults to false for all ports except LISP.
GRE ports accept both true and false for "layer3".

A tunnel vport configured with layer3=true receives L3 packets.
which are then converted to Ethernet packets by pushing a dummy
Ethernet heder at the ingress of the OpenFlow pipeline. The
Ethernet header of a packet is stripped before sending to a
layer3 tunnel vport.

Presently a single GRE vport cannot carry both L2 and L3 packets.
But it is possible to create two GRE vports representing the same
GRE tunel, one with layer3=false, the other with layer3=true.
L2 packet from the tunnel are received on the first vport, L3
packets on the second. The controller must send packets to the
layer3 GRE vport to tunnel them without their Ethernet header.

Units tests have been added to check the L3 tunnel handling.

Signed-off-by: Simon Horman 
Signed-off-by: Jiri Benc 
Signed-off-by: Yi Yang 
Signed-off-by; Jan Scheurich 
Co-authored-by: Zoltan Balogh 
---
 lib/netdev-native-tnl.c   | 27 +-
 lib/netdev-vport.c| 14 --
 ofproto/tunnel.c  | 15 +--
 tests/tunnel-push-pop-ipv6.at | 25 ++--
 tests/tunnel-push-pop.at  | 45 +++
 tests/tunnel.at   | 28 +--
 vswitchd/vswitch.xml  | 13 +
 7 files changed, 122 insertions(+), 45 deletions(-)

diff --git a/lib/netdev-native-tnl.c b/lib/netdev-native-tnl.c
index c730e72..7e4b7be 100644
--- a/lib/netdev-native-tnl.c
+++ b/lib/netdev-native-tnl.c
@@ -154,6 +154,10 @@ netdev_tnl_push_ip_header(struct dp_packet *packet,
 *ip_tot_size = dp_packet_size(packet) - sizeof (struct eth_header);
 
 memcpy(eth, header, size);
+/* The encapsulated packet has type Ethernet. Adjust dp_packet. */
+dp_packet_set_packet_type(packet, PT_ETH);
+dp_packet_reset_offsets(packet);
+packet->l3_ofs = sizeof (struct eth_header);
 
 if (netdev_tnl_is_header_ipv6(header)) {
 ip6 = netdev_tnl_ipv6_hdr(eth);
@@ -345,6 +349,7 @@ parse_gre_header(struct dp_packet *packet,
 ovs_16aligned_be32 *options;
 int hlen;
 unsigned int ulen;
+uint16_t greh_protocol;
 
 greh = netdev_tnl_ip_extract_tnl_md(packet, tnl, );
 if (!greh) {
@@ -355,10 +360,6 @@ parse_gre_header(struct dp_packet *packet,
 return -EINVAL;
 }
 
-if (greh->protocol != htons(ETH_TYPE_TEB)) {
-return -EINVAL;
-}
-
 hlen = ulen + gre_header_len(greh->flags);
 if (hlen > dp_packet_size(packet)) {
 return -EINVAL;
@@ -388,6 +389,15 @@ parse_gre_header(struct dp_packet *packet,
 options++;
 }
 
+/* Set the new packet type depending on the GRE protocol field. */
+greh_protocol = ntohs(greh->protocol);
+if (greh_protocol == ETH_TYPE_TEB) {
+packet->packet_type = PT_ETH;
+} else {
+/* Allow all GRE protocol values as Ethertypes */
+packet->packet_type = PACKET_TYPE(OFPHTN_ETHERTYPE, greh_protocol);
+}
+
 return hlen;
 }
 
@@ -451,7 +461,12 @@ netdev_gre_build_header(const struct netdev *netdev,
 
 greh = netdev_tnl_ip_build_header(data, params, IPPROTO_GRE);
 
-greh->protocol = htons(ETH_TYPE_TEB);
+/* XXX: Needs fixing for versatile tunnels in PTAP bridges */
+if (tnl_cfg->is_layer3) {
+greh->protocol = params->flow->dl_type;
+} else {
+greh->protocol = htons(ETH_TYPE_TEB);
+}
 greh->flags = 0;
 
 options = (ovs_16aligned_be32 *) (greh + 1);
@@ -504,6 +519,7 @@ netdev_vxlan_pop_header(struct dp_packet *packet)
 tnl->tun_id = htonll(ntohl(get_16aligned_be32(>vx_vni)) >> 8);
 tnl->flags |= FLOW_TNL_F_KEY;
 
+dp_packet_set_packet_type(packet, PT_ETH);
 dp_packet_reset_packet(packet, hlen + VXLAN_HLEN);
 
 return packet;
@@ -583,6 +599,7 @@ netdev_geneve_pop_header(struct dp_packet *packet)
 tnl->metadata.present.len = opts_len;
 tnl->flags |= FLOW_TNL_F_UDPIF;
 
+dp_packet_set_packet_type(packet, PT_ETH);
 dp_packet_reset_packet(packet, hlen);
 
 return packet;
diff --git a/lib/netdev-vport.c b/lib/netdev-vport.c
index a6c1e23..8653e96 100644
--- a/lib/netdev-vport.c
+++ b/lib/netdev-vport.c
@@ -410,7 +410,7 @@ set_tunnel_config(struct netdev *dev_, const struct smap 
*args, char **errp)
 const char *name = netdev_get_name(dev_);
 const char *type = netdev_get_type(dev_);
 struct ds errors = DS_EMPTY_INITIALIZER;
-bool needs_dst_port, has_csum;
+bool needs_dst_port, has_csum, optional_layer3;
 uint16_t dst_proto = 0, src_proto = 0;
 struct netdev_tunnel_config tnl_cfg;
 struct smap_node *node;
@@ -418,6 +418,7 @@ set_tunnel_config(struct netdev *dev_, const struct smap 
*args, char **errp)
 

[ovs-dev] [PATCH 2/7] userspace: Support for push_eth and pop_eth actions

2017-02-03 Thread Jan Scheurich
Add support for actions push_eth and pop_eth to the netdev datapath and
the supporting libraries. This patch relies on the support for these actions
in the kernel datapath to be present.

Signed-off-by: Lorand Jakab 
Signed-off-by: Simon Horman 
Signed-off-by: Jiri Benc 
Signed-off-by: Yi Yang 
Signed-off-by: Jean Tourrilhes 
Signed-off-by: Jan Scheurich 
Co-authored-by: Zoltan Balogh 
---
 datapath/linux/compat/include/linux/openvswitch.h | 17 ++
 lib/dpif-netdev.c |  2 +
 lib/dpif.c|  2 +
 lib/odp-execute.c | 17 ++
 lib/odp-util.c| 72 ++-
 lib/odp-util.h|  7 +++
 lib/packets.c | 41 +
 lib/packets.h |  4 ++
 ofproto/ofproto-dpif-sflow.c  |  7 +++
 9 files changed, 167 insertions(+), 2 deletions(-)

diff --git a/datapath/linux/compat/include/linux/openvswitch.h 
b/datapath/linux/compat/include/linux/openvswitch.h
index 425d3a4..d78d8a8 100644
--- a/datapath/linux/compat/include/linux/openvswitch.h
+++ b/datapath/linux/compat/include/linux/openvswitch.h
@@ -749,6 +749,18 @@ enum ovs_nat_attr {
 #define OVS_NAT_ATTR_MAX (__OVS_NAT_ATTR_MAX - 1)
 
 /**
+ * struct ovs_action_push_eth - %OVS_ACTION_ATTR_PUSH_ETH action argument.
+ * @addresses: Source and destination MAC addresses.
+ * @eth_type: Ethernet type
+ */
+struct ovs_action_push_eth {
+   struct  ovs_key_ethernet addresses;
+#ifndef __KERNEL__
+   __be16  eth_type;
+#endif
+};
+
+/**
  * enum ovs_action_attr - Action types.
  *
  * @OVS_ACTION_ATTR_OUTPUT: Output packet to port.
@@ -782,6 +794,9 @@ enum ovs_nat_attr {
  * is no MPLS label stack, as determined by ethertype, no action is taken.
  * @OVS_ACTION_ATTR_CT: Track the connection. Populate the conntrack-related
  * entries in the flow key.
+ * @OVS_ACTION_ATTR_PUSH_ETH: Push a new outermost Ethernet header onto the
+ * packet.
+ * @OVS_ACTION_ATTR_POP_ETH: Pop the outermost Ethernet header off the packet.
  *
  * Only a single header can be set with a single %OVS_ACTION_ATTR_SET.  Not all
  * fields within a header are modifiable, e.g. the IPv4 protocol and fragment
@@ -814,6 +829,8 @@ enum ovs_action_attr {
   * bits. */
OVS_ACTION_ATTR_CT,   /* Nested OVS_CT_ATTR_* . */
OVS_ACTION_ATTR_TRUNC,/* u32 struct ovs_action_trunc. */
+   OVS_ACTION_ATTR_PUSH_ETH, /* struct ovs_action_push_eth. */
+   OVS_ACTION_ATTR_POP_ETH,  /* No argument. */
 
 #ifndef __KERNEL__
OVS_ACTION_ATTR_TUNNEL_PUSH,   /* struct ovs_action_push_tnl*/
diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index b630824..973814c 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -4731,6 +4731,8 @@ dp_execute_cb(void *aux_, struct dp_packet_batch 
*packets_,
 case OVS_ACTION_ATTR_UNSPEC:
 case OVS_ACTION_ATTR_TRUNC:
 case OVS_ACTION_ATTR_CLONE:
+case OVS_ACTION_ATTR_PUSH_ETH:
+case OVS_ACTION_ATTR_POP_ETH:
 case __OVS_ACTION_ATTR_MAX:
 OVS_NOT_REACHED();
 }
diff --git a/lib/dpif.c b/lib/dpif.c
index d9816e5..3f82d80 100644
--- a/lib/dpif.c
+++ b/lib/dpif.c
@@ -1183,6 +1183,8 @@ dpif_execute_helper_cb(void *aux_, struct dp_packet_batch 
*packets_,
 case OVS_ACTION_ATTR_SAMPLE:
 case OVS_ACTION_ATTR_TRUNC:
 case OVS_ACTION_ATTR_CLONE:
+case OVS_ACTION_ATTR_PUSH_ETH:
+case OVS_ACTION_ATTR_POP_ETH:
 case OVS_ACTION_ATTR_UNSPEC:
 case __OVS_ACTION_ATTR_MAX:
 OVS_NOT_REACHED();
diff --git a/lib/odp-execute.c b/lib/odp-execute.c
index 9612ef5..4ed4475 100644
--- a/lib/odp-execute.c
+++ b/lib/odp-execute.c
@@ -573,6 +573,8 @@ requires_datapath_assistance(const struct nlattr *a)
 case OVS_ACTION_ATTR_POP_MPLS:
 case OVS_ACTION_ATTR_TRUNC:
 case OVS_ACTION_ATTR_CLONE:
+case OVS_ACTION_ATTR_POP_ETH:
+case OVS_ACTION_ATTR_PUSH_ETH:
 return false;
 
 case OVS_ACTION_ATTR_UNSPEC:
@@ -715,6 +717,21 @@ odp_execute_actions(void *dp, struct dp_packet_batch 
*batch, bool steal,
 return;
 }
 break;
+case OVS_ACTION_ATTR_PUSH_ETH: {
+const struct ovs_action_push_eth *eth = nl_attr_get(a);
+
+DP_PACKET_BATCH_FOR_EACH (packet, batch) {
+push_eth(packet, >addresses.eth_dst,
+ >addresses.eth_src, eth->eth_type);
+}
+break;
+}
+
+case OVS_ACTION_ATTR_POP_ETH:
+DP_PACKET_BATCH_FOR_EACH (packet, batch) {
+pop_eth(packet);
+}
+break;
 
 case OVS_ACTION_ATTR_OUTPUT:
 case 

[ovs-dev] [PATCH 4/7] ofproto-dpif-upcall: Intialize dump-seq of new flow to zero

2017-02-03 Thread Jan Scheurich
This forces updating of flow stat at the next re-validation, even for
flows that are being created when the revalidation has already commenced.

It enables reliable testing of fast path flow stats using ovs-appctl
time/warp after flow creation.

Signed-off-by: Jan Scheurich 
---
 ofproto/ofproto-dpif-upcall.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/ofproto/ofproto-dpif-upcall.c b/ofproto/ofproto-dpif-upcall.c
index 96a89b7..49fed77 100644
--- a/ofproto/ofproto-dpif-upcall.c
+++ b/ofproto/ofproto-dpif-upcall.c
@@ -1118,7 +1118,7 @@ upcall_xlate(struct udpif *udpif, struct upcall *upcall,
  * with pushing its stats eventually. */
 }
 
-upcall->dump_seq = seq_read(udpif->dump_seq);
+upcall->dump_seq = 0;
 upcall->reval_seq = seq_read(udpif->reval_seq);
 
 xlate_actions(, >xout);
-- 
1.9.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH 1/7] userspace: Add packet_type in dp_packet and flow

2017-02-03 Thread Jan Scheurich
This commit adds a packet_type attribute to the structs dp_packet and flow
to explicitly carry the type of the packet as preparation for the
introduction of the so-called packet type-aware pipeline (PTAP) in OVS.

The packet_type is a big-endian 32 bit integer with the encoding as
specified in OpenFlow verion 1.5.

The upper 16 bits contain the packet type name space. Pre-defined values
are defined in openflow-common.h:

enum ofp_header_type_namespaces {
OFPHTN_ONF = 0, /* ONF namespace. */
OFPHTN_ETHERTYPE = 1,   /* ns_type is an Ethertype. */
OFPHTN_IP_PROTO = 2,/* ns_type is a IP protocol number. */
OFPHTN_UDP_TCP_PORT = 3,/* ns_type is a TCP or UDP port. */
OFPHTN_IPV4_OPTION = 4, /* ns_type is an IPv4 option number. */
};

The lower 16 bits specify the actual type in the context of the name space.

Only name spaces 0 and 1 will be supported for now.

For name space OFPHTN_ONF the relevant packet type is 0 (Ethernet).
This is the default packet_type in OVS and the only one supported so far.
Packets of type (OFPHTN_ONF, 0) are called Ethernet packets.

In name space OFPHTN_ETHERTYPE the type is the Ethertype of the packet.
A packet of type (OFPHTN_ETHERTYPE, ) is a standard L2 packet
with the Ethernet header (and any VLAN tags) removed to expose the L3
(or L2.5) payload of the packet. These will simply be called L3 packets.

The Ethernet address fields dl_src and dl_dst in struct flow are not
applicable for an L3 packet and must be zero. However, to maintain
compatibility with the large code base, we have chosen to copy the
Ethertype of an L3 packet into the dl_type field of struct flow.

This does not mean that it will be possible to match on dl_type for L3
packets with PTAP later on. Matching must be done on packet_type instead.

New dp_packets are initialized with packet_type Ethernet. Ports that
receive L3 packets will have to explicitly adjust the packet_type.

Signed-off-by: Jean Tourrilhes 
Signed-off-by: Jan Scheurich 
Co-authored-by: Zoltan Balogh 
---
 include/openflow/openflow-common.h |   9 
 include/openvswitch/flow.h |  30 +--
 include/openvswitch/ofp-print.h|   9 +++-
 lib/dp-packet.c|   4 ++
 lib/dp-packet.h|  37 +-
 lib/dpif-netdev.c  |   3 +-
 lib/dpif-netlink.c |  16 ++
 lib/dpif.c |   6 +--
 lib/flow.c | 100 +++--
 lib/flow.h |   9 
 lib/match.c|   2 +-
 lib/netdev-bsd.c   |   1 +
 lib/netdev-dummy.c |   5 ++
 lib/netdev-linux.c |   4 +-
 lib/nx-match.c |   2 +-
 lib/odp-util.h |   2 +-
 lib/ofp-print.c|  28 +--
 lib/ofp-util.c |   2 +-
 lib/packets.c  |   4 ++
 lib/pcap-file.c|   4 +-
 ofproto/ofproto-dpif-rid.h |   2 +-
 ofproto/ofproto-dpif-xlate.c   |   2 +-
 tests/test-flows.c |   2 +-
 23 files changed, 217 insertions(+), 66 deletions(-)

diff --git a/include/openflow/openflow-common.h 
b/include/openflow/openflow-common.h
index 7b619a9..2382682 100644
--- a/include/openflow/openflow-common.h
+++ b/include/openflow/openflow-common.h
@@ -454,4 +454,13 @@ enum ofp_table_config {
 OFPTC14_VACANCY_EVENTS= 1 << 3, /* Enable vacancy events. */
 };
 
+/* Header and packet type name spaces. */
+enum ofp_header_type_namespaces {
+OFPHTN_ONF = 0, /* ONF namespace. */
+OFPHTN_ETHERTYPE = 1,   /* ns_type is an Ethertype. */
+OFPHTN_IP_PROTO = 2,/* ns_type is a IP protocol number. */
+OFPHTN_UDP_TCP_PORT = 3,/* ns_type is a TCP or UDP port. */
+OFPHTN_IPV4_OPTION = 4, /* ns_type is an IPv4 option number. */
+};
+
 #endif /* openflow/openflow-common.h */
diff --git a/include/openvswitch/flow.h b/include/openvswitch/flow.h
index df80dfe..f16060b 100644
--- a/include/openvswitch/flow.h
+++ b/include/openvswitch/flow.h
@@ -19,11 +19,13 @@
 #include "openflow/nicira-ext.h"
 #include "openvswitch/packets.h"
 #include "openvswitch/util.h"
+#include "byte-order.h"
+#include "lib/packets.h"
 
 /* This sequence number should be incremented whenever anything involving flows
  * or the wildcarding of flows changes.  This will cause build assertion
  * failures in places which likely need to be updated. */
-#define FLOW_WC_SEQ 36
+#define FLOW_WC_SEQ 37
 
 /* Number of Open vSwitch extension 32-bit registers. */
 #define FLOW_N_REGS 16
@@ -61,6 +63,22 @@ const char *flow_tun_flag_to_string(uint32_t flags);
 /* Maximum number of supported MPLS labels. */
 #define FLOW_MAX_MPLS_LABELS 3
 
+#define PACKET_TYPE(NS, NS_TYPE) CONSTANT_HTONL((NS) << 16 | (NS_TYPE))
+#define 

Re: [ovs-dev] [libvirt] [RFC] Vhost-user backends cross-version migration support

2017-02-03 Thread Daniel P. Berrange
On Fri, Feb 03, 2017 at 10:41:12AM +0100, Maxime Coquelin wrote:
> 
> 
> On 02/03/2017 10:27 AM, Daniel P. Berrange wrote:
> > On Thu, Feb 02, 2017 at 08:27:18PM +0200, Michael S. Tsirkin wrote:
> > > On Thu, Feb 02, 2017 at 06:21:55PM +, Daniel P. Berrange wrote:
> > > > On Thu, Feb 02, 2017 at 07:31:49PM +0200, Michael S. Tsirkin wrote:
> > > > > On Thu, Feb 02, 2017 at 05:29:08PM +, Daniel P. Berrange wrote:
> > > > > > On Thu, Feb 02, 2017 at 07:20:35PM +0200, Michael S. Tsirkin wrote:
> > > > > > > On Thu, Feb 02, 2017 at 05:10:28PM +, Daniel P. Berrange 
> > > > > > > wrote:
> > > > > > > > On Thu, Feb 02, 2017 at 06:21:55PM +0200, Michael S. Tsirkin 
> > > > > > > > wrote:
> > > > > > > > > On Thu, Feb 02, 2017 at 03:06:21PM +, Daniel P. Berrange 
> > > > > > > > > wrote:
> > > > > > > > > > On Thu, Feb 02, 2017 at 03:14:01PM +0100, Maxime Coquelin 
> > > > > > > > > > wrote:
> > > > > > > > > > > 
> > > > > > > > > > > 
> > > > > > > > > > > On 02/01/2017 12:41 PM, Daniel P. Berrange wrote:
> > > > > > > > > > > > 
> > > > > > > > > > > > It depends where / how in OVS it needs to be set. The 
> > > > > > > > > > > > only stuff libvirt
> > > > > > > > > > > > does with OVS is to run 'add-port' and 'del-port' 
> > > > > > > > > > > > commands via the ovs
> > > > > > > > > > > > cli tool. We pass through arguments from the port 
> > > > > > > > > > > > profile stored in the
> > > > > > > > > > > > XML config.
> > > > > > > > > > > > 
> > > > > > > > > > > >   
> > > > > > > > > > > > 
> > > > > > > > > > > > 
> > > > > > > > > > > >> > > > > > > > > > > interfaceid='09b11c53-8b5c-4eeb-8f00-d84eaa0aaa4f'/>
> > > > > > > > > > > > 
> > > > > > > > > > > >   
> > > > > > > > > > > > 
> > > > > > > > > > > > eg those things in  get passed as cli args 
> > > > > > > > > > > > to the 'add-port'
> > > > > > > > > > > > command. Soo if add-port needs this new version string, 
> > > > > > > > > > > > then we'd need
> > > > > > > > > > > > to add the version to the openvswitch virtualport XML.
> > > > > > > > > > > > 
> > > > > > > > > > > > If the version is provided to OVS in a different 
> > > > > > > > > > > > command, then it would
> > > > > > > > > > > > probably be outside scope of libvirt.
> > > > > > > > > > > 
> > > > > > > > > > > I think it would make sense to be a parameter of the 
> > > > > > > > > > > add-port command.
> > > > > > > > > > > But it would be for vhost-user related add-port command, 
> > > > > > > > > > > I didn't find
> > > > > > > > > > > where/if this is managed in libvirt XML.
> > > > > > > > > > 
> > > > > > > > > > For vhost-user, libvirt does not have any interaction with 
> > > > > > > > > > OVS at
> > > > > > > > > > all. If the thing that's using the vhost-user UNIX socket, 
> > > > > > > > > > in turn
> > > > > > > > > > connects to OVS, that's outside scope of libvirt. IOW, for 
> > > > > > > > > > vhost-user
> > > > > > > > > > OVS it seems like that job is for Nova / os-vif to solve.
> > > > > > > > > 
> > > > > > > > > I don't think they currently understand the issues involved in
> > > > > > > > > cross-version migration though.  This is a complex subject 
> > > > > > > > > and easy to
> > > > > > > > > get wrong.  It would be significantly better to figure it out 
> > > > > > > > > at libvirt
> > > > > > > > > level since it already deals with cross-version migration 
> > > > > > > > > issues.
> > > > > > > > 
> > > > > > > > Libvirt considers vhost-user to be a blackbox though - it just 
> > > > > > > > exposes
> > > > > > > > a UNIX socket, and whatever is on the other end is completely 
> > > > > > > > opaque.
> > > > > > > > The fact that the other end might plumb the data stream into 
> > > > > > > > openvswitch
> > > > > > > > is not something libvirt should know, as we don't want to end 
> > > > > > > > up having
> > > > > > > > to add custom code to libvirt for every different vhost-user 
> > > > > > > > server
> > > > > > > > impl.
> > > > > > > > 
> > > > > > > > IOW, if the version str can be passed to QEMU, and then onto 
> > > > > > > > vhost-user
> > > > > > > > backend in QEMU, then libvirt can be involved. If the version 
> > > > > > > > str has
> > > > > > > > to be given to openvswitch that's not for libvirt to deall with.
> > > > > > > > 
> > > > > > > > Regards,
> > > > > > > > Daniel
> > > > > > > 
> > > > > > > It's not just about passing it to QEMU. The following is needed:
> > > > > > > - need to query version when creating VM/device
> > > > > > > - need to query supported versions when migrating VM
> > > > > > 
> > > > > > Those are both things that nova can do, since it knows what the 
> > > > > > vhost-user
> > > > > > device in question is connected to, and so can query the versions, 
> > > > > > and check
> > > > > > versions before triggering migration with libvirt
> > > > > 
> > > > > It can, but then it will need to query libvirt or source for the 
> > > > > version
> > > > > string since it's in 

Re: [ovs-dev] [libvirt] [RFC] Vhost-user backends cross-version migration support

2017-02-03 Thread Maxime Coquelin



On 02/03/2017 10:27 AM, Daniel P. Berrange wrote:

On Thu, Feb 02, 2017 at 08:27:18PM +0200, Michael S. Tsirkin wrote:

On Thu, Feb 02, 2017 at 06:21:55PM +, Daniel P. Berrange wrote:

On Thu, Feb 02, 2017 at 07:31:49PM +0200, Michael S. Tsirkin wrote:

On Thu, Feb 02, 2017 at 05:29:08PM +, Daniel P. Berrange wrote:

On Thu, Feb 02, 2017 at 07:20:35PM +0200, Michael S. Tsirkin wrote:

On Thu, Feb 02, 2017 at 05:10:28PM +, Daniel P. Berrange wrote:

On Thu, Feb 02, 2017 at 06:21:55PM +0200, Michael S. Tsirkin wrote:

On Thu, Feb 02, 2017 at 03:06:21PM +, Daniel P. Berrange wrote:

On Thu, Feb 02, 2017 at 03:14:01PM +0100, Maxime Coquelin wrote:



On 02/01/2017 12:41 PM, Daniel P. Berrange wrote:


It depends where / how in OVS it needs to be set. The only stuff libvirt
does with OVS is to run 'add-port' and 'del-port' commands via the ovs
cli tool. We pass through arguments from the port profile stored in the
XML config.

  


  

  

eg those things in  get passed as cli args to the 'add-port'
command. Soo if add-port needs this new version string, then we'd need
to add the version to the openvswitch virtualport XML.

If the version is provided to OVS in a different command, then it would
probably be outside scope of libvirt.


I think it would make sense to be a parameter of the add-port command.
But it would be for vhost-user related add-port command, I didn't find
where/if this is managed in libvirt XML.


For vhost-user, libvirt does not have any interaction with OVS at
all. If the thing that's using the vhost-user UNIX socket, in turn
connects to OVS, that's outside scope of libvirt. IOW, for vhost-user
OVS it seems like that job is for Nova / os-vif to solve.


I don't think they currently understand the issues involved in
cross-version migration though.  This is a complex subject and easy to
get wrong.  It would be significantly better to figure it out at libvirt
level since it already deals with cross-version migration issues.


Libvirt considers vhost-user to be a blackbox though - it just exposes
a UNIX socket, and whatever is on the other end is completely opaque.
The fact that the other end might plumb the data stream into openvswitch
is not something libvirt should know, as we don't want to end up having
to add custom code to libvirt for every different vhost-user server
impl.

IOW, if the version str can be passed to QEMU, and then onto vhost-user
backend in QEMU, then libvirt can be involved. If the version str has
to be given to openvswitch that's not for libvirt to deall with.

Regards,
Daniel


It's not just about passing it to QEMU. The following is needed:
- need to query version when creating VM/device
- need to query supported versions when migrating VM


Those are both things that nova can do, since it knows what the vhost-user
device in question is connected to, and so can query the versions, and check
versions before triggering migration with libvirt


It can, but then it will need to query libvirt or source for the version
string since it's in the XML.


No, it wouldn't be in the XML at all. Nova on the source queries what
vhostuser version it has and what the target host has, and can prevent
the migration if they're incompatible.


This is not sufficient. Exactly the same as qemu machine type,
this must be preserved from time of install and moved
wherever VM goes.


Nova has the ability todo that.


I dont think libvirt has to be
involved at all for this, as all the info can be obtained by nova/os-vif
from the vhostuser impl it has configured.


Given we are already confused at libvirt level, I find it
highly unlikely upper levels will do the right thing.


The only confusion is about what must consume the data. I mistakenly
thought it was being proposed that the qemu vhostuser backend was
being given a version string. Now it seems the version info must be
set against the 3rd party component that vhostuser connects to, and
that is outside scope of libvirt. That is entirely managed by Nova
today, so Nova is the right thing to manage this versioning info.


This is my understanding, with as example using Nova:
1. Before starting the VM, Nova queries source host's OVS and all
   possible destination hosts' OVS to retrieve supported versions
2. Nova chooses the first common version supported by all hosts
3. Nova create the dpdkvhostuser interfaces in OVS with passing the
   selected version as parameter

That should be enough, right? Or maybe I miss something?

Thanks,
Maxime
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [libvirt] [RFC] Vhost-user backends cross-version migration support

2017-02-03 Thread Daniel P. Berrange
On Thu, Feb 02, 2017 at 08:27:18PM +0200, Michael S. Tsirkin wrote:
> On Thu, Feb 02, 2017 at 06:21:55PM +, Daniel P. Berrange wrote:
> > On Thu, Feb 02, 2017 at 07:31:49PM +0200, Michael S. Tsirkin wrote:
> > > On Thu, Feb 02, 2017 at 05:29:08PM +, Daniel P. Berrange wrote:
> > > > On Thu, Feb 02, 2017 at 07:20:35PM +0200, Michael S. Tsirkin wrote:
> > > > > On Thu, Feb 02, 2017 at 05:10:28PM +, Daniel P. Berrange wrote:
> > > > > > On Thu, Feb 02, 2017 at 06:21:55PM +0200, Michael S. Tsirkin wrote:
> > > > > > > On Thu, Feb 02, 2017 at 03:06:21PM +, Daniel P. Berrange 
> > > > > > > wrote:
> > > > > > > > On Thu, Feb 02, 2017 at 03:14:01PM +0100, Maxime Coquelin wrote:
> > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > On 02/01/2017 12:41 PM, Daniel P. Berrange wrote:
> > > > > > > > > > 
> > > > > > > > > > It depends where / how in OVS it needs to be set. The only 
> > > > > > > > > > stuff libvirt
> > > > > > > > > > does with OVS is to run 'add-port' and 'del-port' commands 
> > > > > > > > > > via the ovs
> > > > > > > > > > cli tool. We pass through arguments from the port profile 
> > > > > > > > > > stored in the
> > > > > > > > > > XML config.
> > > > > > > > > > 
> > > > > > > > > >   
> > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > >> > > > > > > > > interfaceid='09b11c53-8b5c-4eeb-8f00-d84eaa0aaa4f'/>
> > > > > > > > > > 
> > > > > > > > > >   
> > > > > > > > > > 
> > > > > > > > > > eg those things in  get passed as cli args to 
> > > > > > > > > > the 'add-port'
> > > > > > > > > > command. Soo if add-port needs this new version string, 
> > > > > > > > > > then we'd need
> > > > > > > > > > to add the version to the openvswitch virtualport XML.
> > > > > > > > > > 
> > > > > > > > > > If the version is provided to OVS in a different command, 
> > > > > > > > > > then it would
> > > > > > > > > > probably be outside scope of libvirt.
> > > > > > > > > 
> > > > > > > > > I think it would make sense to be a parameter of the add-port 
> > > > > > > > > command.
> > > > > > > > > But it would be for vhost-user related add-port command, I 
> > > > > > > > > didn't find
> > > > > > > > > where/if this is managed in libvirt XML.
> > > > > > > > 
> > > > > > > > For vhost-user, libvirt does not have any interaction with OVS 
> > > > > > > > at
> > > > > > > > all. If the thing that's using the vhost-user UNIX socket, in 
> > > > > > > > turn
> > > > > > > > connects to OVS, that's outside scope of libvirt. IOW, for 
> > > > > > > > vhost-user
> > > > > > > > OVS it seems like that job is for Nova / os-vif to solve.
> > > > > > > 
> > > > > > > I don't think they currently understand the issues involved in
> > > > > > > cross-version migration though.  This is a complex subject and 
> > > > > > > easy to
> > > > > > > get wrong.  It would be significantly better to figure it out at 
> > > > > > > libvirt
> > > > > > > level since it already deals with cross-version migration issues.
> > > > > > 
> > > > > > Libvirt considers vhost-user to be a blackbox though - it just 
> > > > > > exposes
> > > > > > a UNIX socket, and whatever is on the other end is completely 
> > > > > > opaque.
> > > > > > The fact that the other end might plumb the data stream into 
> > > > > > openvswitch
> > > > > > is not something libvirt should know, as we don't want to end up 
> > > > > > having
> > > > > > to add custom code to libvirt for every different vhost-user server
> > > > > > impl.
> > > > > > 
> > > > > > IOW, if the version str can be passed to QEMU, and then onto 
> > > > > > vhost-user
> > > > > > backend in QEMU, then libvirt can be involved. If the version str 
> > > > > > has
> > > > > > to be given to openvswitch that's not for libvirt to deall with.
> > > > > > 
> > > > > > Regards,
> > > > > > Daniel
> > > > > 
> > > > > It's not just about passing it to QEMU. The following is needed:
> > > > > - need to query version when creating VM/device
> > > > > - need to query supported versions when migrating VM
> > > > 
> > > > Those are both things that nova can do, since it knows what the 
> > > > vhost-user
> > > > device in question is connected to, and so can query the versions, and 
> > > > check
> > > > versions before triggering migration with libvirt
> > > 
> > > It can, but then it will need to query libvirt or source for the version
> > > string since it's in the XML.
> > 
> > No, it wouldn't be in the XML at all. Nova on the source queries what
> > vhostuser version it has and what the target host has, and can prevent
> > the migration if they're incompatible.
> 
> This is not sufficient. Exactly the same as qemu machine type,
> this must be preserved from time of install and moved
> wherever VM goes.

Nova has the ability todo that.

> > I dont think libvirt has to be
> > involved at all for this, as all the info can be obtained by nova/os-vif
> > from the vhostuser impl it has configured.
> 
> Given we are already confused at libvirt level,