Re: [ovs-dev] [PATCH v2] windows: wmi add include
> -Original Message- > From: Ben Pfaff [mailto:b...@ovn.org] > Sent: Wednesday, February 1, 2017 7:36 PM > To: Alin Serdean> Cc: Guru Shetty ; Sairam Venugopal > ; d...@openvswitch.org > Subject: Re: [ovs-dev] [PATCH v2] windows: wmi add include > > On Wed, Feb 01, 2017 at 12:56:44AM +, Alin Serdean wrote: > > > > > -Original Message- > > > From: Ben Pfaff [mailto:b...@ovn.org] > > > Sent: Tuesday, January 31, 2017 11:35 PM > > > To: Alin Serdean > > > Cc: d...@openvswitch.org > > > Subject: Re: [ovs-dev] [PATCH v2] windows: wmi add include > > > > > > On Mon, Jan 30, 2017 at 07:42:31AM +, Alin Serdean wrote: > > > > Add 'util.h' to includes otherwise the result of the function > > > > 'ovs_format_message' will be unknown and be converted to int, > > > > triggering an abort of vswitchd. > > > > > > Is there a way to make MSVC fail the compile if a function is called > > > without a visible prototype? That would prevent this problem. > > > > > > Alternatively, if it's possible to get a warning but not an error, > > > it might be possible to add something to the appveyor build script > > > to fail that build if any such warning is reported. > > [Alin Serdean] Thanks for the idea Ben(made me facepalm for not thinking > of it). > > There is something like -Wimplicit-function-declaration(-Wall) and I could > issue an error while compiling. > > Sounds good. > > > I tried on Linux to add a bogus function, but make stopped during linking > (from what I remember -Werror is needed to stop compiling). > > By default, on Linux, we enable a lot of warnings but we do not make the > compiler turn them into errors, because we assume that the common case is > that some end user or distributor is building a known-good version of OVS > and that they don't want the build to stop because they're using a slightly > different version of the compiler that issuse warnings in slightly different > circumstances. [Alin Serdean] Good point. I think on Windows the majority will use msi's for ease of use, but that is another debate > > But we encourage developers to configure with --enable-Werror, which > turns all warnings into errors (that break the build). That way, developers > can't miss warnings that might be about important issues. > > I thought that we used --enable-Werror in the travis CI system too, but now I > see that I'm mistaken. Maybe we should enable it there. [Alin Serdean] I have no idea atm, I will check it out. But even if it is enable, for sure it won't work because: https://github.com/openvswitch/ovs/blob/master/build-aux/cccl#L148 That is the wrapper that transform the gcc arguments to cl(visual studio compiler) arguments. I will spend time to try to come up with a list of matches between the two type of compiler arguments. At the moment we get level one warnings from the build, because that is the default setting. I.E.: https://ci.appveyor.com/project/blp/ovs/build/1.0.2752#L491 I will create a test appveyor account to set up the full scheme, this is too important to ignore any further. Thanks a lot for the idea and answering my questions! > > > Since this has deep implication is it ok if I stop compiling for this type > > of > issue? > > I think so, at least for this particular warning. > > > I enabled it locally and fixing current issues that we have in the code(most > of them are ok since the type are int, but we need to fix them). ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH 0/7] userspace: Support for L3 tunneling
Thanks for this contribution, Jan & al! I’ll start reviewing these right away. Please find my comment addressing your concern below, > On Feb 3, 2017, at 2:37 AM, Jan Scheurichwrote: > > This patch set is part of an initiative to deal with non-Ethernet packets in > OVS for advanced use cases like L3 tunneling or NSH. The initiative is > centering on the new OpenFlow concepts of "Packet type-aware pipeline" (PTAP) > and "Generic encap/decap actions" (EXT-382). The overall design is documented > in > https://docs.google.com/document/d/1oWMYUH8sjZJzWa72o2q9kU0N6pNE-rwZcLH3-kbbDR8 > > This patch series implements the user-space parts of the support for L3 > tunnels connected to the legacy Ethernet-only pipeline in OVS. > > In large parts it is an adaptation of the earlier work on L3 tunneling by > Lorand Jakab, Simon Horman, Jiri Benc and Yi Yang, adapted to the new design > for packet type aware pipelines as prototyped by Jean Tourrilhes. > > It supersedes user-space patches 9-11 of the most recent patch set submitted > by Yi Yang > (https://mail.openvswitch.org/pipermail/ovs-dev/2016-December/326865.html). > The remaining user-space parts of that series (VXLAN-GPE) will be superseded > by further patches in the context of this initiative. > > Key changes are the introduction of explicit packet_type fields in the > dp_packet and flow structs, as well as a simpler handling of L3 tunnels > limited to the ofproto layer. > > userspace: Add packet_type in dp_packet and flow > userspace: Support for push_eth and pop_eth actions > userspace: Switching of L3 packets in L2 pipeline > ofproto-dpif-upcall: Intialize dump-seq of new flow to zero > userspace: L3 tunnel support for GRE and LISP > dpif-netlink: Don't send PACKET_TYPE to kernel > ofproto-dpif-xlate: refactor compose_output_action__ > > For native L3 tunneling with the user-space datapath these patches are > complete. To apply L3 tunneling to the kernel datapath, they are dependent on > the following other contributions: > > * [PATCH net-next v13 0/8] openvswitch: support for layer 3 > encapsulated packets > https://mail.openvswitch.org/pipermail/ovs-dev/2016-November/324943.html > The net-next patch has been accepted > * [RFC PATCH v4 0/6] create tunnel devices using rtnetlink interface > https://mail.openvswitch.org/pipermail/ovs-dev/2017-January/327749.html > * Another user-space patch to send the "layer3" option down via > rtnetlink interface > * Backports of the kernel datapath patches to the OVS tree module: > Patches 01-08 of patch series [PATCH v2 00/17] port Jiri Benc's L3 patchset > to ovs > https://mail.openvswitch.org/pipermail/ovs-dev/2016-December/326865.html > * A compatibility mechanism to send down "layer3" option to the > backported kernel module to support older kernels. > > To ease the progress with the implementation of the subsequent user-space > patches for the PTAP, EXT-382 and NSH and avoid the pain of frequent > re-basing of these large patch series, we would like to request exemption for > merging the present patch series before the kernel datapath pieces are all in > place. > In general it is fine to add functionality to the userspace datapath even if no kernel equivalent exists. Then, should kernel equivalent become available, we typically have a choice between maintaining compatibility with different datapath implementations of a same feature, or changing the userspace datapath implementation to closely match the kernel so that one kind of translation can serve both datapaths. What we do not want is to accept changes to the OVS tree kernel module before a feature is upstreamed, as that could lead to an OVS release being made with one kind of ABI and then a later upstream kernel datapath having a different ABI, where in the worst case specific opcode values for actions, key fields, etc. might have different semantics, resulting into maintenance difficulties. As we do not have an ABI for the userspace datapath these concerns typically do not apply. Jarno > build-aux/extract-ofp-fields | 1 + > datapath/linux/compat/include/linux/openvswitch.h | 19 + > include/openflow/openflow-common.h| 9 + > include/openvswitch/flow.h| 30 +- > include/openvswitch/match.h | 1 + > include/openvswitch/meta-flow.h | 15 +- > include/openvswitch/ofp-print.h | 9 +- > lib/dp-packet.c | 4 + > lib/dp-packet.h | 37 +- > lib/dpif-netdev.c | 62 ++-- > lib/dpif-netlink.c| 55 ++- > lib/dpif.c| 8 +- > lib/flow.c| 100 +++-- > lib/flow.h| 9 + >
Re: [ovs-dev] [ovs-dev, RFC] ovn: support for service function chaining
The approach of using OVN logical ports to build the chain seems clean, but I wonder whether it's general-purpose enough to do what users want. If not, that is, if some users want to use physical appliances rather than VMs or containers that are integrated into OVN, the question arises of how to do that. It might be that such appliances can somehow be integrated with OVN by introducing new types of OVN logical ports, or some other alternative might need to be invented. I don't know what operators want. I believe that Guru (CCed) has some of the same concerns. On Fri, Feb 03, 2017 at 10:17:05PM +, John McDowall wrote: > Ben, > > Thanks for the thorough review, I will update the code and resubmit the > patch. Any thoughts on the general approach to service chaining etc? > > Regards > > John > > On 2/3/17, 2:10 PM, "Ben Pfaff"wrote: > > On Thu, Feb 02, 2017 at 03:22:56PM -0800, jmcdow...@paloaltonetworks.com > wrote: > > From: John McDowall > > > > This patchset is the first round at having Service Function Chaining > > functionality through OVN. The implementation is done entirely > > on the northbound side of OVN. It is a bump on the wire implementation, > > so no attempt is being made in keeping state while packets visit each > > hop of the chain. ACLs are used as the classifiers, with the > augmentation > > of action SFC, as well as option column. > > Thanks for working on this! Sorry it's taken so long to review. > > cmp_port_pair_groups() treats two pair groups as equal if either one of > them has no keys, but this violates transitivity (see > > https://urldefense.proofpoint.com/v2/url?u=https-3A__en.wikipedia.org_wiki_Comparison-5Fsort=DwIBAg=V9IgWpI5PvzTw83UyHGVSoW3Uc1MFWe5J8PTfkrzVSo=vZ6VUDaavDpfOdPQrz1ED54jEjvAE36A8TVJroVlrOQ=D8WTzjM4eBLAIUSaMpsuNQPXOo94n7zg2ZjpkjdkW64=BhGNi3da1UI7sRkk29-zyejanKHyZPF_ac-fm2Ctnis= > ). Example: if 0 is a pair > group with no keys and x and y are arbitrary pair groups, then this > function will conclude that x <= 0 and 0 <= y. By transitivity, we > would also have x <= y, but that's a contradiction because x and y are > arbitrary. > > I suggest that empty pair groups should sort as less than nonempty ones, > e.g.: > > @@ -2772,8 +2772,10 @@ cmp_port_pair_groups(const void *ppg1_, const void > *ppg2_) > const struct nbrec_logical_port_pair_group *ppg1 = *ppg1p; > const struct nbrec_logical_port_pair_group *ppg2 = *ppg2p; > > -if (ppg1->n_sortkey == 0 || ppg2->n_sortkey == 0) { > -return 0; > +if (ppg1->n_sortkey == 0) { > +return ppg2->n_sortkey == 0 ? -1 : 0; > +} else if (ppg2->n_sortkey == 0) { > +return 1; > } > > const int64_t key1 = ppg1->sortkey[0]; > > ovn-northd.c has some new uses of VLOG without rate-limiters. These > should probably be rate-limited, like most VLOG calls in ovn-northd. > > VLOG messages shouldn't include a new-line since VLOG takes care of that > itself. > > The code has lots of lines longer than the coding style suggested limit > of 79 columns. > > Comments should generally begin with a capital letter and end with a > period. > > In build_chain_classifier_entry(), qsort() is a remarkably expensive way > to find the smallest element in an array. > > In build_chain_classifier_entry(), it's strange to have "" in the middle > of the string here: > ds_put_format(ds_action, "outport = %s;"" output;", > first_ovn_port->json_key); > > In build_chain(), please don't comment out code: > //const uint16_t egress_inner_priority = 150; > > In build_chain(), a lot of debug logging has escaped as VLOG_INFO calls. > > In build_chain(), you can use xmemdup() instead of xmalloc() followed by > memcpy(). > > In build_chain(), I don't know why there's a line-splicing \ here: > VLOG_INFO("Warning: Currently lacking support for > more than one port-pair %"PRIuSIZE"\n", \ > lppg->n_port_pairs); > Also, we'd usually just use VLOG_WARN instead of writing "Warning:". > > In build_chain(), please always put {} around a conditional statement, > e.g. here: > if (!input_port_array[j] || !output_port_array[j]) > obtainedAllNeededInfo = false; > > Please wrap at 79 columns in ovn-architecture.7.xml as well. > > I spent some time with the documentation. I worked on getting rid of > passive voice because it's often unclear, e.g. "something happens" > doesn't tell the reader who does it, but "The CMS does something" does. > I dropped a lot of the items from the life cycle that didn't seem > really relevant to VNFs (it seemed like they
Re: [ovs-dev] [ovs-dev, RFC] ovn: support for service function chaining
Ben, Thanks for the thorough review, I will update the code and resubmit the patch. Any thoughts on the general approach to service chaining etc? Regards John On 2/3/17, 2:10 PM, "Ben Pfaff"wrote: On Thu, Feb 02, 2017 at 03:22:56PM -0800, jmcdow...@paloaltonetworks.com wrote: > From: John McDowall > > This patchset is the first round at having Service Function Chaining > functionality through OVN. The implementation is done entirely > on the northbound side of OVN. It is a bump on the wire implementation, > so no attempt is being made in keeping state while packets visit each > hop of the chain. ACLs are used as the classifiers, with the augmentation > of action SFC, as well as option column. Thanks for working on this! Sorry it's taken so long to review. cmp_port_pair_groups() treats two pair groups as equal if either one of them has no keys, but this violates transitivity (see https://urldefense.proofpoint.com/v2/url?u=https-3A__en.wikipedia.org_wiki_Comparison-5Fsort=DwIBAg=V9IgWpI5PvzTw83UyHGVSoW3Uc1MFWe5J8PTfkrzVSo=vZ6VUDaavDpfOdPQrz1ED54jEjvAE36A8TVJroVlrOQ=D8WTzjM4eBLAIUSaMpsuNQPXOo94n7zg2ZjpkjdkW64=BhGNi3da1UI7sRkk29-zyejanKHyZPF_ac-fm2Ctnis= ). Example: if 0 is a pair group with no keys and x and y are arbitrary pair groups, then this function will conclude that x <= 0 and 0 <= y. By transitivity, we would also have x <= y, but that's a contradiction because x and y are arbitrary. I suggest that empty pair groups should sort as less than nonempty ones, e.g.: @@ -2772,8 +2772,10 @@ cmp_port_pair_groups(const void *ppg1_, const void *ppg2_) const struct nbrec_logical_port_pair_group *ppg1 = *ppg1p; const struct nbrec_logical_port_pair_group *ppg2 = *ppg2p; -if (ppg1->n_sortkey == 0 || ppg2->n_sortkey == 0) { -return 0; +if (ppg1->n_sortkey == 0) { +return ppg2->n_sortkey == 0 ? -1 : 0; +} else if (ppg2->n_sortkey == 0) { +return 1; } const int64_t key1 = ppg1->sortkey[0]; ovn-northd.c has some new uses of VLOG without rate-limiters. These should probably be rate-limited, like most VLOG calls in ovn-northd. VLOG messages shouldn't include a new-line since VLOG takes care of that itself. The code has lots of lines longer than the coding style suggested limit of 79 columns. Comments should generally begin with a capital letter and end with a period. In build_chain_classifier_entry(), qsort() is a remarkably expensive way to find the smallest element in an array. In build_chain_classifier_entry(), it's strange to have "" in the middle of the string here: ds_put_format(ds_action, "outport = %s;"" output;", first_ovn_port->json_key); In build_chain(), please don't comment out code: //const uint16_t egress_inner_priority = 150; In build_chain(), a lot of debug logging has escaped as VLOG_INFO calls. In build_chain(), you can use xmemdup() instead of xmalloc() followed by memcpy(). In build_chain(), I don't know why there's a line-splicing \ here: VLOG_INFO("Warning: Currently lacking support for more than one port-pair %"PRIuSIZE"\n", \ lppg->n_port_pairs); Also, we'd usually just use VLOG_WARN instead of writing "Warning:". In build_chain(), please always put {} around a conditional statement, e.g. here: if (!input_port_array[j] || !output_port_array[j]) obtainedAllNeededInfo = false; Please wrap at 79 columns in ovn-architecture.7.xml as well. I spent some time with the documentation. I worked on getting rid of passive voice because it's often unclear, e.g. "something happens" doesn't tell the reader who does it, but "The CMS does something" does. I dropped a lot of the items from the life cycle that didn't seem really relevant to VNFs (it seemed like they were mostly cut-and-paste). Some of the introductory documentation confuses me though. The following paragraph mentions about the OVN northbound database but it seems to actually be about the southbound database. I don't think this is the right place to explain the logical flows that SFC puts into the southbound database; that should be in ovn-northd.8.xml instead. I deleted the following paragraph but presumably it should be adapted for ovn-nrothd.8.xml: Service insertion is implemented by adding 4 new flow rules into the OVN northbound database for each VNF inserted. The rules are added into the last table in the ingress path (L2_LKUP). The service insertion rules have a higher priority than the standard forwarding rules. This means that they
Re: [ovs-dev] [PATCH] odp: Fix sample action in userspace
On Thu, Feb 2, 2017 at 10:56 AM, Ben Pfaffwrote: > On Wed, Jan 11, 2017 at 04:00:04PM -0800, Andy Zhou wrote: >> User space implementation of the sample action is not consistent with >> kernel datapath. In kernel datapath, the side effects of actions >> within the sample actions are not visible to the subsequent actions. >> Current user space handling does not follow the same logic. This patch >> makes them consistent. >> >> Signed-off-by: Andy Zhou > > Thanks! > > This looks like a bug fix, but I suspect that there's no way for > existing code to actually trigger the bug, because I think that OVS at > least up to 2.6 only ever puts actions that do not modify the packet > into a sample action. > > Acked-by: Ben Pfaff This patch was sent out before we have decided to add the userspace datapath 'clone' action. Now that we have the clone action. The sample action will continue to be used for the current use case, so your comment will continue to apply. The fix adds an extra packet copy. The current use cases are not performance critical so may it is not a big deal. In case this becomes an issue, we then should revisit the logic here an see if we can avoid packet copy in case it is not required. Pushed to master. Thanks for the review and comment. ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH 2/2] ovs-vsctl: Disallow empty bridge, port, and interface names.
On Fri, Feb 03, 2017 at 01:58:14PM -0800, Andy Zhou wrote: > On Fri, Feb 3, 2017 at 9:06 AM, Ben Pfaffwrote: > > Reported-by: Gabor Locsei > > Reported-at: > > https://mail.openvswitch.org/pipermail/ovs-discuss/2017-February/043613.html > > Signed-off-by: Ben Pfaff > > Acked-by: Andy Zhou Thanks Girish, Andy, Gabor. I applied these patches to master and branch-2.7. I also applied the first patch to branch-2.6. ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [RFC PATCH v4 0/6] create tunnel devices using rtnetlink interface
On 3 February 2017 at 13:07, Eric Garverwrote: > On Thu, Feb 02, 2017 at 02:37:22PM -0800, Joe Stringer wrote: >> There's a few patches in this series with co-authored-by, but missing >> signed-off-by from the co-author. Also, the author does not need to be >> listed as a co-author. > > Understood. I thought I followed an example from the git log, but maybe > I was mistaken. I'll fix it in the next round. It's possible something was a little different at a particular git commit, but the tag explanation in the documentation is fairly clear: http://docs.openvswitch.org/en/latest/internals/contributing/submitting-patches/#tags ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [RFC PATCH v4 3/6] dpif-netlink: Probe for out-of-tree datapath.
On 3 February 2017 at 12:56, Eric Garverwrote: > On Thu, Feb 02, 2017 at 02:50:01PM -0800, Joe Stringer wrote: >> On 18 January 2017 at 11:45, Eric Garver wrote: >> > For out-of-tree datapath, only try genetlink/compat. >> > For in-tree kernel datapath, try rtnetlink then genetlink. >> > >> > Signed-off-by: Eric Garver >> > --- >> > lib/dpif-netlink.c | 35 --- >> > 1 file changed, 32 insertions(+), 3 deletions(-) >> > >> > diff --git a/lib/dpif-netlink.c b/lib/dpif-netlink.c >> > index e6459f358989..769806eadbf1 100644 >> > --- a/lib/dpif-netlink.c >> > +++ b/lib/dpif-netlink.c >> > @@ -210,6 +210,11 @@ static int ovs_packet_family; >> > * Initialized by dpif_netlink_init(). */ >> > static unsigned int ovs_vport_mcgroup; >> > >> > +/* rtnetlink information for OVS. >> > + * >> > + * Initialized by dpif_netlink_init(). */ >> > +static bool ovs_datapath_out_of_tree = false; >> >> Perhaps in the comment briefly mention that if this is true, devices >> are created using OVS netlink and if it's false devices are created >> using NETLINK_ROUTE / as LWT? > > I'll buff the comment to explain this. > >> > + >> > static int dpif_netlink_init(void); >> > static int open_dpif(const struct dpif_netlink_dp *, struct dpif **); >> > static uint32_t dpif_netlink_port_get_pid(const struct dpif *, >> > @@ -1014,11 +1019,13 @@ dpif_netlink_port_add(struct dpif *dpif_, struct >> > netdev *netdev, >> >odp_port_t *port_nop) >> > { >> > struct dpif_netlink *dpif = dpif_netlink_cast(dpif_); >> > -int error; >> > +int error = EOPNOTSUPP; >> > >> > fat_rwlock_wrlock(>upcall_lock); >> > -error = dpif_netlink_port_create_and_add(dpif, netdev, port_nop); >> > -if (error == EOPNOTSUPP) { >> > +if (!ovs_datapath_out_of_tree) { >> > +error = dpif_netlink_port_create_and_add(dpif, netdev, port_nop); >> > +} >> > +if (error) { >> > error = dpif_netlink_port_add_compat(dpif, netdev, port_nop); >> > } >> > fat_rwlock_unlock(>upcall_lock); >> > @@ -2503,6 +2510,7 @@ dpif_netlink_init(void) >> > { >> > static struct ovsthread_once once = OVSTHREAD_ONCE_INITIALIZER; >> > static int error; >> > +struct netdev *netdev = NULL; >> >> This can be in the #ifdef __linux__; otherwise you have an unused >> variable on other platforms. > > Thanks for catching. I'll address that. > >> > >> > if (ovsthread_once_start()) { >> > error = nl_lookup_genl_family(OVS_DATAPATH_FAMILY, >> > @@ -2526,6 +2534,27 @@ dpif_netlink_init(void) >> > error = nl_lookup_genl_mcgroup(OVS_VPORT_FAMILY, >> > OVS_VPORT_MCGROUP, >> > _vport_mcgroup); >> > } >> > +#ifdef __linux__ >> > +if (!error) { >> > +error = netdev_open("ovs-system-probe", "geneve", ); >> > +if (!error) { >> > +error = netdev_geneve_create_kind(netdev, "ovs_geneve"); >> > +if (error != EOPNOTSUPP) { >> > +if (!error) { >> > +char namebuf[NETDEV_VPORT_NAME_BUFSIZE]; >> > +const char *dp_port; >> > + >> > +dp_port = netdev_vport_get_dpif_port(netdev, >> > namebuf, >> > + sizeof >> > namebuf); >> > +netdev_geneve_destroy(dp_port); >> > +} >> > +ovs_datapath_out_of_tree = true; >> > +} >> > +netdev_close(netdev); >> > +error = 0; >> > +} >> > +} >> > +#endif >> >> Geneve isn't added yet, so this patch introduces build breakage. > > Oops! I rearranged the patches before submitting. I'll move this patch > to where I originally had it, after adding the newlink support. > >> This doesn't look like it matches the most recent discussion on this topic: >> https://mail.openvswitch.org/pipermail/ovs-dev/2017-January/327476.html > > Correct. I opted not to do the LWT probe on init. We have to verify the > device after creation anyways, which includes checking for LWT. It > didn't seem like probing for LWT at init gained anything. OK, thanks for the explanation. I didn't have the full context to review these patches so was partly going by just checking that the feedback from the previous version was applied. If I follow correctly, the backported datapath will register to allow device creation via NETLINK_ROUTE with the name ovs_foo for foo in {geneve,gre,vxlan}; in this patch, if creation of such devices is successful, or if it fails due to any reason other than EOPNOTSUPP, then we will attempt to use these out of tree tunnel types? I guess the point is that the backported datapath can register to allow such device creation, but it may fail since usually it's configured over OVS genetlink; however, the failure
Re: [ovs-dev] [ovs-dev, RFC] ovn: support for service function chaining
On Thu, Feb 02, 2017 at 03:22:56PM -0800, jmcdow...@paloaltonetworks.com wrote: > From: John McDowall> > This patchset is the first round at having Service Function Chaining > functionality through OVN. The implementation is done entirely > on the northbound side of OVN. It is a bump on the wire implementation, > so no attempt is being made in keeping state while packets visit each > hop of the chain. ACLs are used as the classifiers, with the augmentation > of action SFC, as well as option column. Thanks for working on this! Sorry it's taken so long to review. cmp_port_pair_groups() treats two pair groups as equal if either one of them has no keys, but this violates transitivity (see https://en.wikipedia.org/wiki/Comparison_sort). Example: if 0 is a pair group with no keys and x and y are arbitrary pair groups, then this function will conclude that x <= 0 and 0 <= y. By transitivity, we would also have x <= y, but that's a contradiction because x and y are arbitrary. I suggest that empty pair groups should sort as less than nonempty ones, e.g.: @@ -2772,8 +2772,10 @@ cmp_port_pair_groups(const void *ppg1_, const void *ppg2_) const struct nbrec_logical_port_pair_group *ppg1 = *ppg1p; const struct nbrec_logical_port_pair_group *ppg2 = *ppg2p; -if (ppg1->n_sortkey == 0 || ppg2->n_sortkey == 0) { -return 0; +if (ppg1->n_sortkey == 0) { +return ppg2->n_sortkey == 0 ? -1 : 0; +} else if (ppg2->n_sortkey == 0) { +return 1; } const int64_t key1 = ppg1->sortkey[0]; ovn-northd.c has some new uses of VLOG without rate-limiters. These should probably be rate-limited, like most VLOG calls in ovn-northd. VLOG messages shouldn't include a new-line since VLOG takes care of that itself. The code has lots of lines longer than the coding style suggested limit of 79 columns. Comments should generally begin with a capital letter and end with a period. In build_chain_classifier_entry(), qsort() is a remarkably expensive way to find the smallest element in an array. In build_chain_classifier_entry(), it's strange to have "" in the middle of the string here: ds_put_format(ds_action, "outport = %s;"" output;", first_ovn_port->json_key); In build_chain(), please don't comment out code: //const uint16_t egress_inner_priority = 150; In build_chain(), a lot of debug logging has escaped as VLOG_INFO calls. In build_chain(), you can use xmemdup() instead of xmalloc() followed by memcpy(). In build_chain(), I don't know why there's a line-splicing \ here: VLOG_INFO("Warning: Currently lacking support for more than one port-pair %"PRIuSIZE"\n", \ lppg->n_port_pairs); Also, we'd usually just use VLOG_WARN instead of writing "Warning:". In build_chain(), please always put {} around a conditional statement, e.g. here: if (!input_port_array[j] || !output_port_array[j]) obtainedAllNeededInfo = false; Please wrap at 79 columns in ovn-architecture.7.xml as well. I spent some time with the documentation. I worked on getting rid of passive voice because it's often unclear, e.g. "something happens" doesn't tell the reader who does it, but "The CMS does something" does. I dropped a lot of the items from the life cycle that didn't seem really relevant to VNFs (it seemed like they were mostly cut-and-paste). Some of the introductory documentation confuses me though. The following paragraph mentions about the OVN northbound database but it seems to actually be about the southbound database. I don't think this is the right place to explain the logical flows that SFC puts into the southbound database; that should be in ovn-northd.8.xml instead. I deleted the following paragraph but presumably it should be adapted for ovn-nrothd.8.xml: Service insertion is implemented by adding 4 new flow rules into the OVN northbound database for each VNF inserted. The rules are added into the last table in the ingress path (L2_LKUP). The service insertion rules have a higher priority than the standard forwarding rules. This means that they override the existing forwarding rules. There are four new rules added for each insertion. Two ingress and two egress, The first ingress rule sends all traffic destined for the application into the VNF ingress port and the second rule takes all traffic destined to the application from the VNF egress port to the application, the priorities are such that the second rule is always checked first. In the egress direction the rules are similar if the traffic is from the application it is sent to the VNF egress port and if if is from the application and is from the VNF ingress port it is delivered to the destination. Also this paragraph is confusing because the patch doesn't add any new table named Service or Logical_Service or anything like that: The steps in this example refer to the
Re: [ovs-dev] [PATCH 2/2] ovs-vsctl: Disallow empty bridge, port, and interface names.
On Fri, Feb 3, 2017 at 9:06 AM, Ben Pfaffwrote: > Reported-by: Gabor Locsei > Reported-at: > https://mail.openvswitch.org/pipermail/ovs-discuss/2017-February/043613.html > Signed-off-by: Ben Pfaff Acked-by: Andy Zhou ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH v2] ofp-actions: Fix variable length meta-flow field usage in flow actions.
On 2 February 2017 at 13:40, Joe Stringerwrote: > On 1 February 2017 at 11:06, Joe Stringer wrote: >> On 31 January 2017 at 16:57, Joe Stringer wrote: >>> /* Returns true if a variable length meta-flow field 'mff' is not mapped in >>> * the 'vl_mff_map'. */ >>> bool >>> -mf_vl_mff_not_mapped(const struct mf_field *mff, >>> - const struct vl_mff_map *vl_mff_map) >>> +mf_vl_mff_mapped(const struct mf_field *mff, const struct vl_mff_map *map) >>> { >>> -if (mff && vl_mff_map) { >>> -if (mff->variable_len && !mff->mapped) { >>> -return true; >>> -} >>> -} >>> - >>> -return false; >>> +return !(map && mff && mff->variable_len && !mff->mapped); >>> } >> >> Yi-Hung pointed out offline that this reversal doesn't quite sit right >> logically; this function is searching for a specific set of invalid >> conditions, where there is a vl_mff_map, and the field is >> variable-length, and it's not mapped. It's misleading to have all of >> this covered by a function named "...mapped()". >> >> I suggest we retain the original logic but rename the function to >> something like mf_vl_mff_invalid(). > > I applied this to master. Also cherry-picked to branch-2.7. ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [RFC PATCH v4 0/6] create tunnel devices using rtnetlink interface
On Thu, Feb 02, 2017 at 02:37:22PM -0800, Joe Stringer wrote: > Hi Eric, Hi Joe, Thanks for your feedback! Much appreciated. > There's a few patches in this series with co-authored-by, but missing > signed-off-by from the co-author. Also, the author does not need to be > listed as a co-author. Understood. I thought I followed an example from the git log, but maybe I was mistaken. I'll fix it in the next round. > > When building this series I'm seeing a bunch of redefinition errors > (across all platforms I test on, ranging from kernel 3.10 to 4.8): > > /usr/include/linux/if_tunnel.h:21:9: error: preprocessor token > GRE_CSUM redefined > lib/packets.h:1046:9: this was the original definition > /usr/include/linux/if_tunnel.h:22:9: error: preprocessor token > GRE_ROUTING redefined > lib/packets.h:1047:9: this was the original definition > /usr/include/linux/if_tunnel.h:23:9: error: preprocessor token GRE_KEY > redefined > lib/packets.h:1048:9: this was the original definition > /usr/include/linux/if_tunnel.h:24:9: error: preprocessor token GRE_SEQ > redefined > lib/packets.h:1049:9: this was the original definition > /usr/include/linux/if_tunnel.h:25:9: error: preprocessor token > GRE_STRICT redefined > lib/packets.h:1050:9: this was the original definition > /usr/include/linux/if_tunnel.h:26:9: error: preprocessor token GRE_REC > redefined > lib/packets.h:1051:9: this was the original definition > /usr/include/linux/if_tunnel.h:27:9: error: preprocessor token > GRE_FLAGS redefined > lib/packets.h:1052:9: this was the original definition > /usr/include/linux/if_tunnel.h:28:9: error: preprocessor token > GRE_VERSION redefined > lib/packets.h:1053:9: this was the original definition This seems to be due to including both linux/if_tunnel.h and packets.h. linux/if_tunnel.h was added by the original patch series. I'll investigate if it's really needed. > > I have a few comments on some of the patches. Thanks. I addressed those separately. > On 18 January 2017 at 11:45, Eric Garverwrote: > > This series adds support for the creation of tunnels using the rtnetlink > > interface. This will open the possibility for new features and flags on > > those > > vports without the need to change vport compatibility code. > > > > Support for STT and LISP have not been added because these are not upstream > > yet, > > so we don't know how the interface will be like upstream. And there are no > > features in the current drivers right now we could make use of. > > > > Note: This work originally started by Thadeu Lima de Souza Cascardo. > > > > Testing: > > - kernel 4.9.3, in-tree datapath > > - rtnetlink successfully creates devices > > - kernel 4.2.8, in-tree datapath > > - rtnetlink is tried, but fails due to no COLLECT_METADATA support > > - genetlink successfully creates devices > > - kernel 4.2.8, out-of-tree datapath > > - rtnetlink is not tried > > - genetlink successfully creates devices > > > > v2: > > > > We are able to set the MTU to UINT16_MAX since it is not restricted by the > > driver during newlink. > > > > v3: > > > > Prefer to get type from vport before checking if device is opened. Also, > > disable > > IFLA_VXLAN_LEARNING as it's not enabled on compat vports as well. > > > > v4: > > - Probe for ovs_geneve on init, this indicates out-of-tree datapath > > - If exists, only try genetlink/compat > > - else, try rtnetlink and fallback to genetlink/compat > > - Read back and verify devices created with rtnetlink > > - checkpatch fixes > > > > Eric Garver (4): > > dpif-netlink: Probe for out-of-tree datapath. > > dpif-netlink: add VXLAN creation support > > dpif-netlink: add GRE creation support > > dpif-netlink: add GENEVE creation support > > > > Thadeu Lima de Souza Cascardo (2): > > netdev: get device type from vport prefix if it uses one > > dpif-netlink: break out code to add compat and non-compat vports > > > > lib/dpif-netlink.c | 679 > > + > > lib/netdev.c | 26 +- > > 2 files changed, 651 insertions(+), 54 deletions(-) > > > > -- > > 2.10.0 > > > > ___ > > dev mailing list > > d...@openvswitch.org > > https://mail.openvswitch.org/mailman/listinfo/ovs-dev > ___ > dev mailing list > d...@openvswitch.org > https://mail.openvswitch.org/mailman/listinfo/ovs-dev ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] SFC patches for OVN
Jason, I checked it against top of the git tree. So just download the patch and clone the lastest and then $ git apply –directory=ovn If you have any questions/feedback let me know. Regards John ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [RFC PATCH v4 4/6] dpif-netlink: add VXLAN creation support
On Thu, Feb 02, 2017 at 02:59:42PM -0800, Joe Stringer wrote: > On 18 January 2017 at 11:45, Eric Garverwrote: > > Creates VXLAN devices using rtnetlink and tunnel metadata. > > > > Co-Authored-by: Thadeu Lima de Souza Cascardo > > Co-Authored-by: Eric Garver > > Signed-off-by: Eric Garver > > --- > > lib/dpif-netlink.c | 194 > > - > > 1 file changed, 193 insertions(+), 1 deletion(-) > > I think that the vast majority of this code is linux-specific and > should not exist in dpif-netlink.c. > > Perhaps we should add a new lib/netdev-lwt.[ch], where the .h file has > the #ifdef __linux__ logic to either declare or define the functions, > then lib/netdev-lwt.c has the implementations of these functions. The > .h would always be added to the build in lib/automake.mk and the .c > file would only be added in the #if LINUX section. That's a nice cleanup. Thanks for the suggestion. > One might even argue that it'd be tidier if the > dpif_netlink_port_{create,destroy} functions were moved here (and > renamed to something more apt). I'll look into this as well. ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] SFC patches for OVN
> > >> > Hi Guru, > > Thank you for your reply. I found the post. It was posted by John McDowall > yesterday. > > I'd like to try it. Just wonder which ovs version/branch it should be > applied to. 2.6.1 or the master branch? > > Master branch. > Thank you > Jason > > ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [RFC PATCH v4 3/6] dpif-netlink: Probe for out-of-tree datapath.
On Thu, Feb 02, 2017 at 02:50:01PM -0800, Joe Stringer wrote: > On 18 January 2017 at 11:45, Eric Garverwrote: > > For out-of-tree datapath, only try genetlink/compat. > > For in-tree kernel datapath, try rtnetlink then genetlink. > > > > Signed-off-by: Eric Garver > > --- > > lib/dpif-netlink.c | 35 --- > > 1 file changed, 32 insertions(+), 3 deletions(-) > > > > diff --git a/lib/dpif-netlink.c b/lib/dpif-netlink.c > > index e6459f358989..769806eadbf1 100644 > > --- a/lib/dpif-netlink.c > > +++ b/lib/dpif-netlink.c > > @@ -210,6 +210,11 @@ static int ovs_packet_family; > > * Initialized by dpif_netlink_init(). */ > > static unsigned int ovs_vport_mcgroup; > > > > +/* rtnetlink information for OVS. > > + * > > + * Initialized by dpif_netlink_init(). */ > > +static bool ovs_datapath_out_of_tree = false; > > Perhaps in the comment briefly mention that if this is true, devices > are created using OVS netlink and if it's false devices are created > using NETLINK_ROUTE / as LWT? I'll buff the comment to explain this. > > + > > static int dpif_netlink_init(void); > > static int open_dpif(const struct dpif_netlink_dp *, struct dpif **); > > static uint32_t dpif_netlink_port_get_pid(const struct dpif *, > > @@ -1014,11 +1019,13 @@ dpif_netlink_port_add(struct dpif *dpif_, struct > > netdev *netdev, > >odp_port_t *port_nop) > > { > > struct dpif_netlink *dpif = dpif_netlink_cast(dpif_); > > -int error; > > +int error = EOPNOTSUPP; > > > > fat_rwlock_wrlock(>upcall_lock); > > -error = dpif_netlink_port_create_and_add(dpif, netdev, port_nop); > > -if (error == EOPNOTSUPP) { > > +if (!ovs_datapath_out_of_tree) { > > +error = dpif_netlink_port_create_and_add(dpif, netdev, port_nop); > > +} > > +if (error) { > > error = dpif_netlink_port_add_compat(dpif, netdev, port_nop); > > } > > fat_rwlock_unlock(>upcall_lock); > > @@ -2503,6 +2510,7 @@ dpif_netlink_init(void) > > { > > static struct ovsthread_once once = OVSTHREAD_ONCE_INITIALIZER; > > static int error; > > +struct netdev *netdev = NULL; > > This can be in the #ifdef __linux__; otherwise you have an unused > variable on other platforms. Thanks for catching. I'll address that. > > > > if (ovsthread_once_start()) { > > error = nl_lookup_genl_family(OVS_DATAPATH_FAMILY, > > @@ -2526,6 +2534,27 @@ dpif_netlink_init(void) > > error = nl_lookup_genl_mcgroup(OVS_VPORT_FAMILY, > > OVS_VPORT_MCGROUP, > > _vport_mcgroup); > > } > > +#ifdef __linux__ > > +if (!error) { > > +error = netdev_open("ovs-system-probe", "geneve", ); > > +if (!error) { > > +error = netdev_geneve_create_kind(netdev, "ovs_geneve"); > > +if (error != EOPNOTSUPP) { > > +if (!error) { > > +char namebuf[NETDEV_VPORT_NAME_BUFSIZE]; > > +const char *dp_port; > > + > > +dp_port = netdev_vport_get_dpif_port(netdev, > > namebuf, > > + sizeof > > namebuf); > > +netdev_geneve_destroy(dp_port); > > +} > > +ovs_datapath_out_of_tree = true; > > +} > > +netdev_close(netdev); > > +error = 0; > > +} > > +} > > +#endif > > Geneve isn't added yet, so this patch introduces build breakage. Oops! I rearranged the patches before submitting. I'll move this patch to where I originally had it, after adding the newlink support. > This doesn't look like it matches the most recent discussion on this topic: > https://mail.openvswitch.org/pipermail/ovs-dev/2017-January/327476.html Correct. I opted not to do the LWT probe on init. We have to verify the device after creation anyways, which includes checking for LWT. It didn't seem like probing for LWT at init gained anything. ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH v2] rhel: Firewall service files for OVN.
On Fri, Feb 3, 2017 at 3:50 PM, Russell Bryantwrote: > From: Marcin Mirecki > > Firewall service files allowing to open firewalld > ports required for running OVN > > Signed-off-by: Marcin Mirecki > Acked-by: Ben Pfaff > Signed-off-by: Russell Bryant > --- > AUTHORS.rst | 1 + > rhel/automake.mk | 5 > - > rhel/openvswitch-fedora.spec.in | 9 > + > rhel/usr_lib_firewalld_services_ovn-central-firewall-service.xml | 7 > +++ > rhel/usr_lib_firewalld_services_ovn-host-firewall-service.xml| 6 > ++ > 5 files changed, 27 insertions(+), 1 deletion(-) > create mode 100644 rhel/usr_lib_firewalld_services_ovn-central-firewall- > service.xml > create mode 100644 rhel/usr_lib_firewalld_services_ovn-host-firewall- > service.xml > We worked on this a bit off-list. I've applied this version to master and branch-2.7. Thanks! -- Russell Bryant ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH v2] rhel: Firewall service files for OVN.
From: Marcin MireckiFirewall service files allowing to open firewalld ports required for running OVN Signed-off-by: Marcin Mirecki Acked-by: Ben Pfaff Signed-off-by: Russell Bryant --- AUTHORS.rst | 1 + rhel/automake.mk | 5 - rhel/openvswitch-fedora.spec.in | 9 + rhel/usr_lib_firewalld_services_ovn-central-firewall-service.xml | 7 +++ rhel/usr_lib_firewalld_services_ovn-host-firewall-service.xml| 6 ++ 5 files changed, 27 insertions(+), 1 deletion(-) create mode 100644 rhel/usr_lib_firewalld_services_ovn-central-firewall-service.xml create mode 100644 rhel/usr_lib_firewalld_services_ovn-host-firewall-service.xml diff --git a/AUTHORS.rst b/AUTHORS.rst index 8f3fc26..b567fcc 100644 --- a/AUTHORS.rst +++ b/AUTHORS.rst @@ -192,6 +192,7 @@ Luigi Rizzo ri...@iet.unipi.it Luis E. P. l...@hotmail.com Lukasz Rzasik lukasz.rza...@gmail.com Madhu Challacha...@noironetworks.com +Marcin Mirecki mmire...@redhat.com Mario Cabrera mario.cabr...@hpe.com Mark D. Graymark.d.g...@intel.com Mark Hamilton mhamil...@nicira.com diff --git a/rhel/automake.mk b/rhel/automake.mk index 45aa9b1..df4c19a 100644 --- a/rhel/automake.mk +++ b/rhel/automake.mk @@ -30,7 +30,10 @@ EXTRA_DIST += \ rhel/usr_lib_systemd_system_ovs-vswitchd.service \ rhel/usr_lib_systemd_system_ovn-controller.service \ rhel/usr_lib_systemd_system_ovn-controller-vtep.service \ - rhel/usr_lib_systemd_system_ovn-northd.service + rhel/usr_lib_systemd_system_ovn-northd.service \ + rhel/usr_lib_systemd_system_ovn-northd.service \ + rhel/usr_lib_firewalld_services_ovn-central-firewall-service.xml \ + rhel/usr_lib_firewalld_services_ovn-host-firewall-service.xml update_rhel_spec = \ $(AM_V_GEN)($(ro_shell) && sed -e 's,[@]VERSION[@],$(VERSION),g') \ diff --git a/rhel/openvswitch-fedora.spec.in b/rhel/openvswitch-fedora.spec.in index 65e937c..262acd0 100644 --- a/rhel/openvswitch-fedora.spec.in +++ b/rhel/openvswitch-fedora.spec.in @@ -127,6 +127,7 @@ files needed to build an external application. Summary: Open vSwitch - Open Virtual Network support License: ASL 2.0 Requires: openvswitch openvswitch-ovn-common +Requires: firewalld-filesystem %description ovn-central OVN, the Open Virtual Network, is a system to support virtual network @@ -138,6 +139,7 @@ overlays and security groups. Summary: Open vSwitch - Open Virtual Network support License: ASL 2.0 Requires: openvswitch openvswitch-ovn-common +Requires: firewalld-filesystem %description ovn-host OVN, the Open Virtual Network, is a system to support virtual network @@ -232,6 +234,11 @@ touch $RPM_BUILD_ROOT%{_sysconfdir}/openvswitch/system-id.conf install -p -m 644 -D selinux/openvswitch-custom.pp \ $RPM_BUILD_ROOT%{_datadir}/selinux/packages/%{name}/openvswitch-custom.pp +install rhel/usr_lib_firewalld_services_ovn-central-firewall-service.xml \ + $RPM_BUILD_ROOT%{_prefix}/lib/firewalld/services/ovn-central-firewall-service.xml +install rhel/usr_lib_firewalld_services_ovn-host-firewall-service.xml \ + $RPM_BUILD_ROOT%{_prefix}/lib/firewalld/services/ovn-host-firewall-service.xml + # remove unpackaged files rm -f $RPM_BUILD_ROOT%{_bindir}/ovs-parse-backtrace \ $RPM_BUILD_ROOT%{_sbindir}/ovs-vlan-bug-workaround \ @@ -517,11 +524,13 @@ fi %config %{_datadir}/openvswitch/ovn-nb.ovsschema %config %{_datadir}/openvswitch/ovn-sb.ovsschema %{_unitdir}/ovn-northd.service +${_prefix}/lib/firewalld/services/ovn-central-firewall-service.xml %files ovn-host %{_bindir}/ovn-controller %{_mandir}/man8/ovn-controller.8* %{_unitdir}/ovn-controller.service +${_prefix}/lib/firewalld/services/ovn-host-firewall-service.xml %files ovn-vtep %{_bindir}/ovn-controller-vtep diff --git a/rhel/usr_lib_firewalld_services_ovn-central-firewall-service.xml b/rhel/usr_lib_firewalld_services_ovn-central-firewall-service.xml new file mode 100644 index 000..e7f871d --- /dev/null +++ b/rhel/usr_lib_firewalld_services_ovn-central-firewall-service.xml @@ -0,0 +1,7 @@ + + + ovn-central-firewall-service + Firewall service for ovn central + + + diff --git a/rhel/usr_lib_firewalld_services_ovn-host-firewall-service.xml b/rhel/usr_lib_firewalld_services_ovn-host-firewall-service.xml new file mode 100644 index 000..f606890 --- /dev/null +++ b/rhel/usr_lib_firewalld_services_ovn-host-firewall-service.xml @@ -0,0 +1,6 @@ + + + ovn-host-firewall-service + Firewall service for ovn host + + -- 2.9.3 ___ dev mailing list d...@openvswitch.org
Re: [ovs-dev] [RFC PATCH v4 2/6] dpif-netlink: break out code to add compat and non-compat vports
On Thu, Feb 02, 2017 at 02:39:30PM -0800, Joe Stringer wrote: > On 18 January 2017 at 11:45, Eric Garverwrote: > > From: Thadeu Lima de Souza Cascardo > > > > The vport type for adding tunnels is now compatibility code and any new > > features > > from tunnels must configure the tunnel as an interface using the tunnel > > metadata > > support. > > > > In order to be able to add those tunnels, we need to add code to create the > > tunnels and add them as NETDEV vports. And when there is no support to > > create > > them, we need to use the compatibility code and add them as tunnel vports. > > > > When removing those tunnels, we need to remove the interfaces as well, and > > detecting the right type might be important, at least to distinguish the > > tunnel > > vports that we should remove and the interfaces that we shouldn't. > > > > Signed-off-by: Thadeu Lima de Souza Cascardo > > This patch involves refactoring as well as the base changes to > add/remove the new tunnel types. It's always easier to review if > functional changes are proposed separately from refactoring/cosmetic. > (I recognize the new functions don't do anything, but they still > combine with the refactoring). That said, with some minor changes this > looks OK to me. Understood. I'll have a go at spitting this patch. > Acked-by: Joe Stringer > > > --- > > lib/dpif-netlink.c | 201 > > +++-- > > 1 file changed, 148 insertions(+), 53 deletions(-) > > > > diff --git a/lib/dpif-netlink.c b/lib/dpif-netlink.c > > index c8b0e37f9365..e6459f358989 100644 > > --- a/lib/dpif-netlink.c > > +++ b/lib/dpif-netlink.c > > @@ -781,10 +781,8 @@ get_vport_type(const struct dpif_netlink_vport *vport) > > } > > > > static enum ovs_vport_type > > -netdev_to_ovs_vport_type(const struct netdev *netdev) > > +netdev_to_ovs_vport_type(const char *type) > > { > > -const char *type = netdev_get_type(netdev); > > - > > if (!strcmp(type, "tap") || !strcmp(type, "system")) { > > return OVS_VPORT_TYPE_NETDEV; > > } else if (!strcmp(type, "internal")) { > > @@ -805,19 +803,14 @@ netdev_to_ovs_vport_type(const struct netdev *netdev) > > } > > > > static int > > -dpif_netlink_port_add__(struct dpif_netlink *dpif, struct netdev *netdev, > > +dpif_netlink_port_add__(struct dpif_netlink *dpif, const char *name, > > +enum ovs_vport_type type, > > +struct ofpbuf *options, > > odp_port_t *port_nop) > > OVS_REQ_WRLOCK(dpif->upcall_lock) > > { > > -const struct netdev_tunnel_config *tnl_cfg; > > -char namebuf[NETDEV_VPORT_NAME_BUFSIZE]; > > -const char *name = netdev_vport_get_dpif_port(netdev, > > - namebuf, sizeof namebuf); > > -const char *type = netdev_get_type(netdev); > > struct dpif_netlink_vport request, reply; > > struct ofpbuf *buf; > > -uint64_t options_stub[64 / 8]; > > -struct ofpbuf options; > > struct nl_sock **socksp = NULL; > > uint32_t *upcall_pids; > > int error = 0; > > @@ -832,17 +825,80 @@ dpif_netlink_port_add__(struct dpif_netlink *dpif, > > struct netdev *netdev, > > dpif_netlink_vport_init(); > > request.cmd = OVS_VPORT_CMD_NEW; > > request.dp_ifindex = dpif->dp_ifindex; > > -request.type = netdev_to_ovs_vport_type(netdev); > > -if (request.type == OVS_VPORT_TYPE_UNSPEC) { > > +request.type = type; > > +request.name = name; > > + > > +request.port_no = *port_nop; > > +upcall_pids = vport_socksp_to_pids(socksp, dpif->n_handlers); > > +request.n_upcall_pids = socksp ? dpif->n_handlers : 1; > > +request.upcall_pids = upcall_pids; > > + > > +if (options) { > > +request.options = options->data; > > +request.options_len = options->size; > > +} > > + > > +error = dpif_netlink_vport_transact(, , ); > > +if (!error) { > > +*port_nop = reply.port_no; > > +} else { > > +if (error == EBUSY && *port_nop != ODPP_NONE) { > > +VLOG_INFO("%s: requested port %"PRIu32" is in use", > > + dpif_name(>dpif), *port_nop); > > +} > > + > > +vport_del_socksp(dpif, socksp); > > +goto exit; > > +} > > + > > +if (socksp) { > > +error = vport_add_channels(dpif, *port_nop, socksp); > > +if (error) { > > +VLOG_INFO("%s: could not add channel for port %s", > > + dpif_name(>dpif), name); > > + > > +/* Delete the port. */ > > +dpif_netlink_vport_init(); > > +request.cmd = OVS_VPORT_CMD_DEL; > > +request.dp_ifindex = dpif->dp_ifindex; > > +request.port_no = *port_nop; > > +dpif_netlink_vport_transact(, NULL, NULL); > > +vport_del_socksp(dpif, socksp); > > +
Re: [ovs-dev] SFC patches for OVN
On 3 February 2017 at 11:59, Shuaijun Zhangwrote: > Hi All, > > I found a link (see below) it mentioned that a set of patches are developed > to support SFC within OVN > https://mail.openvswitch.org/pipermail/ovs-dev/2016-November/324909.html > > Can anyone please let me know where are these patches, are they in the OVS > repo in github? > They are in the review process. You can find a fresh batch posted in the last couple of days in the mailing list. > > Thank you, > Jason > ___ > dev mailing list > d...@openvswitch.org > https://mail.openvswitch.org/mailman/listinfo/ovs-dev > ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] SFC patches for OVN
Hi All, I found a link (see below) it mentioned that a set of patches are developed to support SFC within OVN https://mail.openvswitch.org/pipermail/ovs-dev/2016-November/324909.html Can anyone please let me know where are these patches, are they in the OVS repo in github? Thank you, Jason ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH v3] dpif-netdev: Conditional EMC insert
2017-01-31 9:55 GMT-08:00 Ciara Loftus: > Unconditional insertion of EMC entries results in EMC thrashing at high > numbers of parallel flows. When this occurs, the performance of the EMC > often falls below that of the dpcls classifier, rendering the EMC > practically useless. > > Instead of unconditionally inserting entries into the EMC when a miss > occurs, use a 1% probability of insertion. This ensures that the most > frequent flows have the highest chance of creating an entry in the EMC, > and the probability of thrashing the EMC is also greatly reduced. > > The probability of insertion is configurable, via the > other_config:emc-insert-prob option. For example the following command > increases the insertion probability to 1/10 ie. 10%. > > ovs-vsctl set Open_vSwitch . other_config:emc-insert-prob=10 > > Signed-off-by: Ciara Loftus > Signed-off-by: Georg Schmuecking > Co-authored-by: Georg Schmuecking Thanks for v3. We should add Georg to AUTHORS.rst One of the unit tests ("PMD - stats") fails, because it checks the emc hit stats. I think we can fix it by configuring emc-insert-prob to 1 in that test (assuming that we accept emc-insert-prob even without DPDK). More comments inline. > --- > v3: > - Use new dpif other_config infrastructure to tidy up how the > emc-insert-prob value is passed to dpif-netdev. > v2: > - Enable probability configurability via other_config:emc-insert-prob > option. > > Documentation/howto/dpdk.rst | 20 + > NEWS | 2 ++ > lib/dpif-netdev.c| 53 > ++-- > vswitchd/vswitch.xml | 16 + > 4 files changed, 89 insertions(+), 2 deletions(-) > > diff --git a/Documentation/howto/dpdk.rst b/Documentation/howto/dpdk.rst > index d1e6e89..f2e888b 100644 > --- a/Documentation/howto/dpdk.rst > +++ b/Documentation/howto/dpdk.rst > @@ -354,6 +354,26 @@ the `DPDK documentation > > Note: Not all DPDK virtual PMD drivers have been tested and verified to work. > > +EMC Insertion Probability > +- > +By default 1 in every 100 flows are inserted into the Exact Match Cache > (EMC). > +It is possible to change this insertion probability by setting the > +``emc-insert-prob`` option:: > + > +$ ovs-vsctl --no-wait set Open_vSwitch . other_config:emc-insert-prob=N > + > +where: > + > +``N`` > + is a positive integer between 0 and 4294967295 (maximum unsigned 32bit > int). > + > +If ``N`` is set to 1, an insertion will be performed for every flow. The > lower > +the value of ``emc-insert-prob`` the higher the probability of insertion, > +except for the value 0 which will result in no insertions being performed and > +thus essentially disabling the EMC. > + > +For more information on the EMC refer to :doc:`/intro/install/dpdk` . > + > .. _dpdk-ovs-in-guest: > > OVS with DPDK Inside VMs > diff --git a/NEWS b/NEWS > index 6838649..5a21580 100644 > --- a/NEWS > +++ b/NEWS > @@ -65,6 +65,8 @@ Post-v2.6.0 > device will not be available for use until a valid dpdk-devargs is > specified. > * Virtual DPDK Poll Mode Driver (vdev PMD) support. > + * EMC insertion probability is reduced to 1/100 and is configurable via > + the new 'other_config:emc-insert-prob' option. > - Fedora packaging: > * A package upgrade does not automatically restart OVS service. > - ovs-vswitchd/ovs-vsctl: > diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c > index 0be5db5..e514ddb 100644 > --- a/lib/dpif-netdev.c > +++ b/lib/dpif-netdev.c > @@ -144,6 +144,10 @@ struct netdev_flow_key { > #define EM_FLOW_HASH_MASK (EM_FLOW_HASH_ENTRIES - 1) > #define EM_FLOW_HASH_SEGS 2 > > +/* Default EMC insert probability is 1 / DEFAULT_EM_FLOW_INSERT_PROB */ > +#define DEFAULT_EM_FLOW_INSERT_PROB 100 > +#define DEFAULT_EM_FLOW_INSERT_MIN (UINT32_MAX / DEFAULT_EM_FLOW_INSERT_PROB) > + > struct emc_entry { > struct dp_netdev_flow *flow; > struct netdev_flow_key key; /* key.hash used for emc hash value. */ > @@ -254,6 +258,9 @@ struct dp_netdev { > uint64_t last_tnl_conf_seq; > > struct conntrack conntrack; > + > +/* Probability of EMC insertions is a factor of 'emc_insert_min'.*/ > +atomic_uint32_t emc_insert_min; I'm not sure this makes sense, but I'm worried about emc_insert_min sharing the same cacheline with other fields (in this case inside conntrack) that are updated more often. Could we perhaps solve this by prefixing it with OVS_ALIGNED_VAR(CACHE_LINE_SIZE)? > }; > > static struct dp_netdev_port *dp_netdev_lookup_port(const struct dp_netdev > *dp, > @@ -1066,6 +1073,8 @@ create_dp_netdev(const char *name, const struct > dpif_class *class, > > conntrack_init(>conntrack); > > +atomic_init(>emc_insert_min, DEFAULT_EM_FLOW_INSERT_MIN); > + > cmap_init(>poll_threads); >
Re: [ovs-dev] [PATCH v2 1/1] doc: Remove "experimental" warning for userspace.
2017-02-03 9:38 GMT-08:00 Kevin Traynor: > On 02/02/2017 08:22 PM, Stokes, Ian wrote: >>> On 02/02/2017 04:44 PM, Ian Stokes wrote: Remove the experimental warning tag in documentation regarding OVS deployed via userspace. Signed-off-by: Ian Stokes --- Documentation/intro/install/dpdk.rst |3 --- Documentation/intro/install/userspace.rst |4 NEWS |2 ++ README.rst|6 +++--- 4 files changed, 5 insertions(+), 10 deletions(-) diff --git a/Documentation/intro/install/dpdk.rst b/Documentation/intro/install/dpdk.rst index fff0a1a..3018590 100644 --- a/Documentation/intro/install/dpdk.rst +++ b/Documentation/intro/install/dpdk.rst @@ -29,9 +29,6 @@ This document describes how to build and install Open vSwitch using a DPDK datapath. Open vSwitch can use the DPDK library to operate entirely in userspace. -.. warning:: - The DPDK support of Open vSwitch is considered 'experimental'. - Build requirements -- diff --git a/Documentation/intro/install/userspace.rst b/Documentation/intro/install/userspace.rst index 0368527 ..ebd0c12 100644 --- a/Documentation/intro/install/userspace.rst +++ b/Documentation/intro/install/userspace.rst @@ -34,10 +34,6 @@ This version of Open vSwitch should be built manually with ``configure`` and been recently tested, and so Debian packages are not a recommended way to use this version of Open vSwitch. -.. warning:: - The userspace-only mode of Open vSwitch is considered experimental. It has - not been thoroughly tested. - Building and Installing --- diff --git a/NEWS b/NEWS index 5efcce2..8600f0e 100644 --- a/NEWS +++ b/NEWS @@ -3,6 +3,8 @@ Post-v2.7.0 - Tunnels: * Added support to set packet mark for tunnel endpoint using `egress_pkt_mark` OVSDB option. + - Documentation: + * OVS deployed in userspace mode no longer tagged as experimental. >>> >>> I think this would be a bit clearer. What do you think? >>> >>> --- a/NEWS >>> +++ b/NEWS >>> @@ -59,4 +59,5 @@ v2.7.0 - xx xxx >>> * Removed support for IPsec tunnels. >>> - DPDK: >>> + * Removal of experimental tag. >> >> We can put it under a DPDK header like you have here, I wonder though is the >> removal of experimental only relevant to DPDK or do we need to consider OVS >> userspace without DPDK also? I would think both are well tested at this >> stage so experimental could be removed for both. Thoughts? >> > > Fine by me. I just want it to be clear that it's removal is for/covers > the DPDK datapath. Thanks for the patch, I'm in favor of the change for DPDK. Unless you have a strong reason do to so, I'd prefer to keep the experimental tag for the userspace datapath without DPDK for the following reasons: * I don't see a lot of valid use cases for it. I think it is important for testing. * There's at least a known problem with accessing linux device in userspace without DPDK for containers with offloads. See ddcf96d2dcc1("system-tests: Disable offloads in userspace tests.") What do you think? Thanks, Daniele > >>> >>> v2.7.0 - xx xxx - diff --git a/README.rst b/README.rst index f5cdaa5..90050e3 100644 --- a/README.rst +++ b/README.rst @@ -38,9 +38,9 @@ following features: The included Linux kernel module supports Linux 3.10 and up. -Open vSwitch can also operate, at a cost in performance, entirely in userspace, -without assistance from a kernel module. This userspace implementation should -be easier to port than the kernel-based switch. >>> It is considered experimental. +Open vSwitch can also operate entirely in userspace without +assistance from a kernel module. This userspace implementation +should be easier to port than the kernel-based switch. What's here? >> > > ___ > dev mailing list > d...@openvswitch.org > https://mail.openvswitch.org/mailman/listinfo/ovs-dev ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH] doc: Update DPDK version for 2.7 release.
2017-02-02 8:59 GMT-08:00 Kevin Traynor: > On 02/02/2017 04:30 PM, Ian Stokes wrote: >> Add DPDK version required for the OVS 2.7 release in documentation. >> >> Signed-off-by: Ian Stokes > > Acked-by: Kevin Traynor Thanks, pushed to master and branch-2.7 > >> --- >> Documentation/faq/releases.rst |1 + >> 1 files changed, 1 insertions(+), 0 deletions(-) >> >> diff --git a/Documentation/faq/releases.rst b/Documentation/faq/releases.rst >> index fcff5c3..319c2d7 100644 >> --- a/Documentation/faq/releases.rst >> +++ b/Documentation/faq/releases.rst >> @@ -159,6 +159,7 @@ Q: What DPDK version does each Open vSwitch release work >> with? >> 2.4.x2.0 >> 2.5.x2.2 >> 2.6.x16.07 >> +2.7.x16.11 >> = >> >> Q: I get an error like this when I configure Open vSwitch:: >> > > ___ > dev mailing list > d...@openvswitch.org > https://mail.openvswitch.org/mailman/listinfo/ovs-dev ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH 2/3] rhel: Remove obsolete OVSDPDKVhostPort from ifdown script.
On 02/02/2017 11:49, "Ben Pfaff"wrote: >On Tue, Jan 24, 2017 at 06:21:52PM -0800, Daniele Di Proietto wrote: >> The support for vhost cuse port has been removed long ago. >> >> Fixes:419876444357("netdev-dpdk: Remove dpdkvhostcuse ports") >> Signed-off-by: Daniele Di Proietto > >Acked-by: Ben Pfaff Thanks, pushed to master, branch-2.7 and branch-2.6 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH] ovn: Add missing netdev_close in setup_qos.
On Thu, Feb 2, 2017 at 12:41 PM, Ben Pfaffwrote: > On Thu, Feb 02, 2017 at 02:04:07AM -0500, Russell Bryant wrote: > > We missed calling netdev_close in a couple of places. One was in an > error > > condition rarely hit. The second was just introduced and would be hit in > > all cases where QoS is not in use. > > > > Fixes: dc2dab6e6de5 ("ovn-controller: Configure interface QoS only if it > would actually be used.") > > Signed-off-by: Russell Bryant > > Thank you! > > Acked-by: Ben Pfaff > Thanks, applied to master and branch-2.7. -- Russell Bryant ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH] docs: Add OVS and OVN headings to pages.
On Thu, Feb 2, 2017 at 12:40 PM, Ben Pfaffwrote: > On Thu, Feb 02, 2017 at 02:02:34AM -0500, Russell Bryant wrote: > > Update the "deep dive" and "howto" pages with headings that more clearly > > indicate the separate lists of OVS or OVN content. Also add a link to > > ovn-architecture from the "deep dive" page as it seems quite relevant > > there. > > > > Signed-off-by: Russell Bryant > > Acked-by: Ben Pfaff > Thanks, applied to master. -- Russell Bryant ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [RFC PATCH 2/5] OVN: SFC Implementation: New stage for SFC and modified ACL stage
Great, I'll have a look. On Thu, Feb 02, 2017 at 11:50:17PM +, John McDowall wrote: > Ben, > > Posted an updated patch – then downloaded it from patchwork and applied to > top to tree and ran make check and all tests passed. > > Regards > > John > > On 1/31/17, 3:49 PM, "Ben Pfaff"wrote: > > If the patches cannot sensibly be applied separately, then, yes, they > should be a single patch. > > On Tue, Jan 31, 2017 at 11:02:30PM +, John McDowall wrote: > > Ah, my bad do you want me to create a single patch file? > > > > Regards > > > > John > > > > On 1/31/17, 2:44 PM, "Ben Pfaff" wrote: > > > > Now that I glance at the patch titles, I guess that the problem > might be > > that this patch depends on some of the later patches. In general, > each > > patch should apply, build, and test properly whether or not later > > patches have been applied. > > > > On Tue, Jan 31, 2017 at 10:03:21PM +, John McDowall wrote: > > > Ben, > > > > > > Let me create a new patch set against the top of tree. > > > > > > Regards > > > > > > John > > > > > > On 1/31/17, 1:46 PM, "Ben Pfaff" wrote: > > > > > > On Tue, Dec 27, 2016 at 02:11:43PM -0800, John McDowall wrote: > > > > This is the major body of code that implements SFC. There > is a new L2 stage added to > > > > perform the chaining operations and modifications to the > ACL stage to direct flows > > > > to the service chain. > > > > > > > > Co-authored-by: Flavio Fernandes > > > > Reported at: > https://urldefense.proofpoint.com/v2/url?u=https-3A__mail.openvswitch.org_pipermail_ovs-2Ddiscuss_2016-2DMarch_040381.html=DwIBAg=V9IgWpI5PvzTw83UyHGVSoW3Uc1MFWe5J8PTfkrzVSo=vZ6VUDaavDpfOdPQrz1ED54jEjvAE36A8TVJroVlrOQ=0-H45ymu2qKdNfehkwCF8baQWBqDNhngIVaX4MlOpCQ=VbhqfPkju3uYqy7303Bfbz0fgnSeIi6aYsQoRwIH1PU= > > > > > Reported at: > https://urldefense.proofpoint.com/v2/url?u=https-3A__mail.openvswitch.org_pipermail_ovs-2Ddiscuss_2016-2DMay_041359.html=DwIBAg=V9IgWpI5PvzTw83UyHGVSoW3Uc1MFWe5J8PTfkrzVSo=vZ6VUDaavDpfOdPQrz1ED54jEjvAE36A8TVJroVlrOQ=0-H45ymu2qKdNfehkwCF8baQWBqDNhngIVaX4MlOpCQ=vNieFlb8T7dsSygACJyaJiHvnrQDGCyox17covGw4Ns= > > > > > > > > > Signed-off-by: John McDowall > > > > > > > I think that this has bit-rotted because I get tons of > compiler errors > > > trying to build it. I tried rewinding my repo to a point > from December > > > but I still the same ones: > > > > > > ../ovn/northd/ovn-northd.c:2669:13: error: incomplete > definition of type 'struct nbrec_logical_port_pair_group' > > > ../ovn/northd/ovn-northd.c:2664:18: note: forward > declaration of 'struct nbrec_logical_port_pair_group' > > > ../ovn/northd/ovn-northd.c:2669:37: error: incomplete > definition of type 'struct nbrec_logical_port_pair_group' > > > ../ovn/northd/ovn-northd.c:2664:18: note: forward > declaration of 'struct nbrec_logical_port_pair_group' > > > ../ovn/northd/ovn-northd.c:2673:30: error: incomplete > definition of type 'struct nbrec_logical_port_pair_group' > > > ../ovn/northd/ovn-northd.c:2664:18: note: forward > declaration of 'struct nbrec_logical_port_pair_group' > > > ../ovn/northd/ovn-northd.c:2674:30: error: incomplete > definition of type 'struct nbrec_logical_port_pair_group' > > > ../ovn/northd/ovn-northd.c:2664:18: note: forward > declaration of 'struct nbrec_logical_port_pair_group' > > > ../ovn/northd/ovn-northd.c:2701:49: error: no member > named 'options' in 'struct nbrec_acl' > > > ../ovn/northd/ovn-northd.c:2713:37: error: no member > named 'n_port_chains' in 'struct nbrec_logical_switch' > > > ../ovn/northd/ovn-northd.c:2714:30: error: no member > named 'port_chains' in 'struct nbrec_logical_switch' > > > ../ovn/northd/ovn-northd.c:2716:39: error: incomplete > definition of type 'struct nbrec_logical_port_chain' > > > ../ovn/northd/ovn-northd.c:2710:18: note: forward > declaration of 'struct nbrec_logical_port_chain' > > > ../ovn/northd/ovn-northd.c:2721:44: error: incomplete > definition of type 'struct nbrec_logical_port_chain' > > > /usr/include/i386-linux-gnu/bits/string2.h:110:58: note: > expanded from macro 'strcmp' > > > ../ovn/northd/ovn-northd.c:2710:18: note: forward > declaration of 'struct nbrec_logical_port_chain' > > > ../ovn/northd/ovn-northd.c:2721:44: error: incomplete > definition of type 'struct
[ovs-dev] [PATCH 2/2] ovs-vsctl: Disallow empty bridge, port, and interface names.
Reported-by: Gabor LocseiReported-at: https://mail.openvswitch.org/pipermail/ovs-discuss/2017-February/043613.html Signed-off-by: Ben Pfaff --- tests/ovs-vsctl.at| 12 utilities/ovs-vsctl.c | 12 2 files changed, 24 insertions(+) diff --git a/tests/ovs-vsctl.at b/tests/ovs-vsctl.at index 94b431f..08eb328 100644 --- a/tests/ovs-vsctl.at +++ b/tests/ovs-vsctl.at @@ -204,6 +204,9 @@ AT_CHECK([RUN_OVS_VSCTL([add-br a])], [0], [], [], [OVS_VSCTL_CLEANUP]) AT_CHECK([RUN_OVS_VSCTL([add-br a])], [1], [], [ovs-vsctl: cannot create a bridge named a because a bridge named a already exists ], [OVS_VSCTL_CLEANUP]) +AT_CHECK([RUN_OVS_VSCTL([add-br ''])], [1], [], + [ovs-vsctl: bridge name must not be empty string +], [OVS_VSCTL_CLEANUP]) OVS_VSCTL_CLEANUP AT_CLEANUP @@ -278,6 +281,9 @@ AT_CHECK([RUN_OVS_VSCTL( AT_CHECK([RUN_OVS_VSCTL([add-port a a1])], [1], [], [ovs-vsctl: cannot create a port named a1 because a port named a1 already exists on bridge a ], [OVS_VSCTL_CLEANUP]) +AT_CHECK([RUN_OVS_VSCTL([add-port a ''])], [1], [], + [ovs-vsctl: port name must not be empty string +], [OVS_VSCTL_CLEANUP]) OVS_VSCTL_CLEANUP AT_CLEANUP @@ -315,6 +321,12 @@ AT_CHECK([RUN_OVS_VSCTL([--may-exist add-bond a bond0 a2 a1])], [1], [], [ovs-vsctl: "--may-exist add-bond a bond0 a2 a1" but bond0 actually has interface(s) a1, a2, a3 ], [OVS_VSCTL_CLEANUP]) +AT_CHECK([RUN_OVS_VSCTL([add-bond a '' x y z])], [1], [], + [ovs-vsctl: port name must not be empty string +], [OVS_VSCTL_CLEANUP]) +AT_CHECK([RUN_OVS_VSCTL([add-bond a x '' y z])], [1], [], + [ovs-vsctl: interface name must not be empty string +], [OVS_VSCTL_CLEANUP]) CHECK_BRIDGES([a, a, 0]) CHECK_PORTS([a], [bond0]) CHECK_IFACES([a], [a1], [a2], [a3]) diff --git a/utilities/ovs-vsctl.c b/utilities/ovs-vsctl.c index 3719046..11e1ff6 100644 --- a/utilities/ovs-vsctl.c +++ b/utilities/ovs-vsctl.c @@ -1154,6 +1154,9 @@ cmd_add_br(struct ctl_context *ctx) int vlan; br_name = ctx->argv[1]; +if (!br_name[0]) { +ctl_fatal("bridge name must not be empty string"); +} if (ctx->argc == 2) { parent_name = NULL; vlan = 0; @@ -1503,6 +1506,15 @@ add_port(struct ctl_context *ctx, struct ovsrec_port *port; size_t i; +if (!port_name[0]) { +ctl_fatal("port name must not be empty string"); +} +for (size_t i = 0; i < n_ifaces; i++) { +if (!iface_names[i][0]) { +ctl_fatal("interface name must not be empty string"); +} +} + vsctl_context_populate_cache(ctx); if (may_exist) { struct vsctl_port *vsctl_port; -- 2.10.2 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH 1/2] netdev: Reject empty names in netdev_open().
The empty string is not a valid name for a network device. I would have expected that each of the netdev provider implementations would reject an empty string, but there was a special case for Linux tap devices where they instead caused unexpected behavior. This commit should fix the problem for those devices and every other kind. Reported-by: Gabor LocseiReported-at: https://mail.openvswitch.org/pipermail/ovs-discuss/2017-February/043613.html Signed-off-by: Ben Pfaff --- lib/netdev.c | 10 +- 1 file changed, 9 insertions(+), 1 deletion(-) diff --git a/lib/netdev.c b/lib/netdev.c index 1e6bb2b..a8d8eda 100644 --- a/lib/netdev.c +++ b/lib/netdev.c @@ -1,5 +1,5 @@ /* - * Copyright (c) 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2016 Nicira, Inc. + * Copyright (c) 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2016, 2017 Nicira, Inc. * * Licensed under the Apache License, Version 2.0 (the "License"); * you may not use this file except in compliance with the License. @@ -343,6 +343,14 @@ netdev_open(const char *name, const char *type, struct netdev **netdevp) struct netdev *netdev; int error; +if (!name[0]) { +/* Reject empty names. This saves the providers having to do this. At + * least one screwed this up: the netdev-linux "tap" implementation + * passed the name directly to the Linux TUNSETIFF call, which treats + * an empty string as a request to generate a unique name. */ +return EINVAL; +} + netdev_initialize(); ovs_mutex_lock(_mutex); -- 2.10.2 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [RFC] Vhost-user backends cross-version migration support
On Fri, Feb 03, 2017 at 03:54:52PM +, Daniel P. Berrange wrote: > On Fri, Feb 03, 2017 at 05:34:07PM +0200, Michael S. Tsirkin wrote: > > On Fri, Feb 03, 2017 at 03:11:10PM +0100, Maxime Coquelin wrote: > > > Hi, > > > > > > On 02/01/2017 09:35 AM, Maxime Coquelin wrote: > > > > Hi, > > > > > > > > Few months ago, Michael reported a problem about migrating VMs relying > > > > on vhost-user between hosts supporting different backend versions: > > > > - Message-Id: <20161011173526-mutt-send-email-...@kernel.org> > > > > - https://lists.gnu.org/archive/html/qemu-devel/2016-10/msg03026.html > > > > > > > > The goal of this thread is to draft a proposal based on the outcomes > > > > of discussions with contributors of the different parties (DPDK/OVS > > > > /libvirt/...). > > > > > > Thanks the first feedback. It seems to converge that this is Nova's > > > role, but not Libvirt one to manage these versions from management tool > > > layer. > > > > > > I think the conclusion is not that it should go up the stack. I think > > this will just get broken all the time. No one understands versions and > > stuff. Even QEMU developers get confused and break compatibility once in > > a while. > > I don't think it is really true, nor is it really specific to migration, > it can hit any time you have a network connection that is re-opened even > without migration. Version compatibility negotiations are an inherant part > of any network protocol that is expected to evolve over time. > > > My conclusion is that doing it from OVS side is wrong. Migration is not > > an OVS thing, it's a QEMU thing, and libvirt abstracts QEMU.People > > just want migration to work, ok? It's our job to do it, we do not really > > need a "make things work" flag. > > > > If libvirt does not want to use the vhost-user protocol (which sounds > > reasonable, it's rather complex) how about qemu providing a small > > utility to query the port? We could output json or whatever. > > > > This can help with MTU as well. > > > > And maybe it will help with nowait support - if someone uses the utility > > to dump backend config once, QEMU can later start the device without > > feature queries. > > I don't think it is QEMU's job to deal with external components in > this way and don't think QEMU vhost-user should be treated as special. It's very special because it's the only time we ship parts of device emulation out to an external service. So they need to be versioned together with QEMU. > This general scenario arises any time there is a QEMU backend is talking > to an external service/process. > > For example, a virtio-serial talking to a daemon in the host. This daemon > can support different versions of the protocol being spoken, so we have > the same compat issue on migration. Or a traditional serial device, which > the guest is using to talk to an external daemon the host. In a few cases > we might know what the protocol is, but in general the data stream is > opaque to QEMU. Are there any examples of people actually managing this across versions and getting it right? AFAIK there are cases where data stream is opaque to QEMU like -cpu host and in every such case VM is unmigrateable. OTOH vhost-user is not opaque to QEMU at all, and that is intentional so we can support migration. > QEMU should not have need to learn about every single protocol that might > be used to communicate with some external service. This is an unbounded > set of possibilities to deal with, some of which will not even be open > source. > > This all needs to be delegated to whatever mgmt app is responsible for > setting up these external services, as it has the right domain knowledge > about the applications being used, to make the policy decisions that are > suitable for its usage scenario. > > Regards, > Daniel I disagree it should not and in fact this is exactly what QEMU often does. This is how in the past we were mostly able to emulate any missing backend features in QEMU to make things transparent to guest. E.g. it can actually support the same virtio net device with userspace and kernel backends and migrate between them. > -- > |: http://berrange.com -o-http://www.flickr.com/photos/dberrange/ :| > |: http://libvirt.org -o- http://virt-manager.org :| > |: http://entangle-photo.org -o-http://search.cpan.org/~danberr/ :| ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [RFC] Vhost-user backends cross-version migration support
On Fri, Feb 03, 2017 at 05:34:07PM +0200, Michael S. Tsirkin wrote: > On Fri, Feb 03, 2017 at 03:11:10PM +0100, Maxime Coquelin wrote: > > Hi, > > > > On 02/01/2017 09:35 AM, Maxime Coquelin wrote: > > > Hi, > > > > > > Few months ago, Michael reported a problem about migrating VMs relying > > > on vhost-user between hosts supporting different backend versions: > > > - Message-Id: <20161011173526-mutt-send-email-...@kernel.org> > > > - https://lists.gnu.org/archive/html/qemu-devel/2016-10/msg03026.html > > > > > > The goal of this thread is to draft a proposal based on the outcomes > > > of discussions with contributors of the different parties (DPDK/OVS > > > /libvirt/...). > > > > Thanks the first feedback. It seems to converge that this is Nova's > > role, but not Libvirt one to manage these versions from management tool > > layer. > > > I think the conclusion is not that it should go up the stack. I think > this will just get broken all the time. No one understands versions and > stuff. Even QEMU developers get confused and break compatibility once in > a while. I don't think it is really true, nor is it really specific to migration, it can hit any time you have a network connection that is re-opened even without migration. Version compatibility negotiations are an inherant part of any network protocol that is expected to evolve over time. > My conclusion is that doing it from OVS side is wrong. Migration is not > an OVS thing, it's a QEMU thing, and libvirt abstracts QEMU.People > just want migration to work, ok? It's our job to do it, we do not really > need a "make things work" flag. > > If libvirt does not want to use the vhost-user protocol (which sounds > reasonable, it's rather complex) how about qemu providing a small > utility to query the port? We could output json or whatever. > > This can help with MTU as well. > > And maybe it will help with nowait support - if someone uses the utility > to dump backend config once, QEMU can later start the device without > feature queries. I don't think it is QEMU's job to deal with external components in this way and don't think QEMU vhost-user should be treated as special. This general scenario arises any time there is a QEMU backend is talking to an external service/process. For example, a virtio-serial talking to a daemon in the host. This daemon can support different versions of the protocol being spoken, so we have the same compat issue on migration. Or a traditional serial device, which the guest is using to talk to an external daemon the host. In a few cases we might know what the protocol is, but in general the data stream is opaque to QEMU. QEMU should not have need to learn about every single protocol that might be used to communicate with some external service. This is an unbounded set of possibilities to deal with, some of which will not even be open source. This all needs to be delegated to whatever mgmt app is responsible for setting up these external services, as it has the right domain knowledge about the applications being used, to make the policy decisions that are suitable for its usage scenario. Regards, Daniel -- |: http://berrange.com -o-http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://entangle-photo.org -o-http://search.cpan.org/~danberr/ :| ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [RFC] Vhost-user backends cross-version migration support
On Fri, Feb 03, 2017 at 03:11:10PM +0100, Maxime Coquelin wrote: > Hi, > > On 02/01/2017 09:35 AM, Maxime Coquelin wrote: > > Hi, > > > > Few months ago, Michael reported a problem about migrating VMs relying > > on vhost-user between hosts supporting different backend versions: > > - Message-Id: <20161011173526-mutt-send-email-...@kernel.org> > > - https://lists.gnu.org/archive/html/qemu-devel/2016-10/msg03026.html > > > > The goal of this thread is to draft a proposal based on the outcomes > > of discussions with contributors of the different parties (DPDK/OVS > > /libvirt/...). > > Thanks the first feedback. It seems to converge that this is Nova's > role, but not Libvirt one to manage these versions from management tool > layer. I think the conclusion is not that it should go up the stack. I think this will just get broken all the time. No one understands versions and stuff. Even QEMU developers get confused and break compatibility once in a while. My conclusion is that doing it from OVS side is wrong. Migration is not an OVS thing, it's a QEMU thing, and libvirt abstracts QEMU.People just want migration to work, ok? It's our job to do it, we do not really need a "make things work" flag. If libvirt does not want to use the vhost-user protocol (which sounds reasonable, it's rather complex) how about qemu providing a small utility to query the port? We could output json or whatever. This can help with MTU as well. And maybe it will help with nowait support - if someone uses the utility to dump backend config once, QEMU can later start the device without feature queries. > This change has has no impact from OVS perspective, same requirements > apply. I am interested on OVS contributors feedback on the below design > proposal. > > Especially, I would like to have your opinion on the best way for OVS to > expose its supported versions: > - Static file generated at build time from version table described below > - Entries in the OVS DB > - Dedicated tool listing strings from the version table described below > > For selecting the right version of the vhost-user backend, do you agree > it should be done via a new parameter of the ovs-vsctl add-port command > for dpdkvhostuser ports? > > > > Problem statement: > > == > > > > When migrating a VM from one host to another, the interfaces exposed by > > QEMU must stay unchanged in order to guarantee a successful migration. > > In the case of vhost-user interface, parameters like supported Virtio > > feature set, max number of queues, max vring sizes,... must remain > > compatible. Indeed, the frontend not being re-initialized, no > > renegotiation happens at migration time. > > > > For example, we have a VM that runs on host A, which has its vhost-user > > backend advertising VIRTIO_F_RING_INDIRECT_DESC feature. Since the Guest > > also support this feature, it is successfully negotiated, and guest > > transmit packets using indirect descriptor tables, that the backend > > knows to handle. > > At some point, the VM is being migrated to host B, which runs an older > > version of the backend not supporting this VIRTIO_F_RING_INDIRECT_DESC > > feature. The migration would break, because the Guest still have the > > VIRTIO_F_RING_INDIRECT_DESC bit sets, and the virtqueue contains some > > decriptors pointing to indirect tables, that backend B doesn't know to > > handle. > > This is just an example about Virtio features compatibility, but other > > backend implementation details could cause other failures. > > > > What we need is to be able to query the destination host's backend to > > ensure migration is possible. Also, we would need to query this > > statically, even before the VM is started, to be sure it could be > > migrated elsewhere for any reason. > > ... > > > > > Solution 3: Libvirt queries OVS for vhost backend version string: *OK* > > == > > > > > > The idea is to have a table of supported versions, associated to > > key/value pairs. Libvirt could query the list of supported versions > > strings for each hosts, and select the first common one among all hosts. > > > > Then, libvirt would ask OVS to probe the vhost-user interfaces in the > > selected version (compatibility mode). For example host A runs OVS-2.7, > > and host B OVS-2.6. Host A's OVS-2.7 has an OVS-2.6 compatibility mode > > (e.g. with indirect descriptors disabled), which should be selected at > > vhost-user interface probe time. > > > > Advantage of doing so is that libvirt does not need any update if new > > keys are introduced (i.e. it does not need to know how the new keys have > > to be handled), all these checks remain in OVS's vhost-user implementation. > > > > Ideally, we would support per vhost-user interface compatibility mode, > > which may have an impact also on DPDK API, as the Virtio feature update > > API is global, and not per port. > > > > -
Re: [ovs-dev] [RFC] Vhost-user backends cross-version migration support
Hi, On 02/01/2017 09:35 AM, Maxime Coquelin wrote: Hi, Few months ago, Michael reported a problem about migrating VMs relying on vhost-user between hosts supporting different backend versions: - Message-Id: <20161011173526-mutt-send-email-...@kernel.org> - https://lists.gnu.org/archive/html/qemu-devel/2016-10/msg03026.html The goal of this thread is to draft a proposal based on the outcomes of discussions with contributors of the different parties (DPDK/OVS /libvirt/...). Thanks the first feedback. It seems to converge that this is Nova's role, but not Libvirt one to manage these versions from management tool layer. This change has has no impact from OVS perspective, same requirements apply. I am interested on OVS contributors feedback on the below design proposal. Especially, I would like to have your opinion on the best way for OVS to expose its supported versions: - Static file generated at build time from version table described below - Entries in the OVS DB - Dedicated tool listing strings from the version table described below For selecting the right version of the vhost-user backend, do you agree it should be done via a new parameter of the ovs-vsctl add-port command for dpdkvhostuser ports? Problem statement: == When migrating a VM from one host to another, the interfaces exposed by QEMU must stay unchanged in order to guarantee a successful migration. In the case of vhost-user interface, parameters like supported Virtio feature set, max number of queues, max vring sizes,... must remain compatible. Indeed, the frontend not being re-initialized, no renegotiation happens at migration time. For example, we have a VM that runs on host A, which has its vhost-user backend advertising VIRTIO_F_RING_INDIRECT_DESC feature. Since the Guest also support this feature, it is successfully negotiated, and guest transmit packets using indirect descriptor tables, that the backend knows to handle. At some point, the VM is being migrated to host B, which runs an older version of the backend not supporting this VIRTIO_F_RING_INDIRECT_DESC feature. The migration would break, because the Guest still have the VIRTIO_F_RING_INDIRECT_DESC bit sets, and the virtqueue contains some decriptors pointing to indirect tables, that backend B doesn't know to handle. This is just an example about Virtio features compatibility, but other backend implementation details could cause other failures. What we need is to be able to query the destination host's backend to ensure migration is possible. Also, we would need to query this statically, even before the VM is started, to be sure it could be migrated elsewhere for any reason. ... Solution 3: Libvirt queries OVS for vhost backend version string: *OK* == The idea is to have a table of supported versions, associated to key/value pairs. Libvirt could query the list of supported versions strings for each hosts, and select the first common one among all hosts. Then, libvirt would ask OVS to probe the vhost-user interfaces in the selected version (compatibility mode). For example host A runs OVS-2.7, and host B OVS-2.6. Host A's OVS-2.7 has an OVS-2.6 compatibility mode (e.g. with indirect descriptors disabled), which should be selected at vhost-user interface probe time. Advantage of doing so is that libvirt does not need any update if new keys are introduced (i.e. it does not need to know how the new keys have to be handled), all these checks remain in OVS's vhost-user implementation. Ideally, we would support per vhost-user interface compatibility mode, which may have an impact also on DPDK API, as the Virtio feature update API is global, and not per port. - Implementation: - Goal here is just to illustrate this proposal, I'm sure you will have good suggestion to improve it. In OVS vhost-user library, we would introduce a new structure, for example (neither compiled nor tested): struct vhostuser_compat { char *version; uint64_t virtio_features; uint32_t max_rx_queue_sz; uint32_t max_nr_queues; }; *version* field is the compatibility version string. It could be something like: "upstream.ovs-dpdk.v2.6" In case for example Fedora adds some more patches to its package that would break migration to upstream version, it could have a dedicated compatibility string: "fc26.ovs-dpdk.v2.6". In case OVS-v2.7 does not break compatibility with previous OVS-v2.6 version, then no need to create a new compatibility entry, just keep v2.6 one. *virtio_features* field is the Virtio features set for a given compatibility version. When an OVS tag is to be created, it would be associated to a DPDK version. The Virtio features for these version would be stored in this field. It would allow to upgrade the DPDK package for example from v16.07 to v16.11 without breaking migration. In case the distribution wants to benefit from latests Virtio
[ovs-dev] [PATCH 1/3] datapath: backport: openvswitch: 802.1ad uapi changes.
commit 8c146bb9d59aa2ac45222171916ece186c4b3943 Author: Thomas F HerbertDate: Wed Sep 7 12:56:57 2016 -0400 openvswitch: Add support for 8021.AD Change the description of the VLAN tpid field. Signed-off-by: Thomas F Herbert Acked-by: Pravin B Shelar Signed-off-by: David S. Miller Signed-off-by: Yi Yang --- datapath/linux/compat/include/linux/openvswitch.h | 17 + 1 file changed, 9 insertions(+), 8 deletions(-) diff --git a/datapath/linux/compat/include/linux/openvswitch.h b/datapath/linux/compat/include/linux/openvswitch.h index 425d3a4..b6f43b3 100644 --- a/datapath/linux/compat/include/linux/openvswitch.h +++ b/datapath/linux/compat/include/linux/openvswitch.h @@ -629,13 +629,13 @@ struct ovs_action_push_mpls { * @vlan_tci: Tag control identifier (TCI) to push. The CFI bit must be set * (but it will not be set in the 802.1Q header that is pushed). * - * The @vlan_tpid value is typically %ETH_P_8021Q. The only acceptable TPID - * values are those that the kernel module also parses as 802.1Q headers, to - * prevent %OVS_ACTION_ATTR_PUSH_VLAN followed by %OVS_ACTION_ATTR_POP_VLAN - * from having surprising results. + * The @vlan_tpid value is typically %ETH_P_8021Q or %ETH_P_8021AD. + * The only acceptable TPID values are those that the kernel module also parses + * as 802.1Q or 802.1AD headers, to prevent %OVS_ACTION_ATTR_PUSH_VLAN followed + * by %OVS_ACTION_ATTR_POP_VLAN from having surprising results. */ struct ovs_action_push_vlan { - __be16 vlan_tpid; /* 802.1Q TPID. */ + __be16 vlan_tpid; /* 802.1Q or 802.1ad TPID. */ __be16 vlan_tci;/* 802.1Q TCI (VLAN ID and priority). */ }; @@ -755,9 +755,10 @@ enum ovs_nat_attr { * @OVS_ACTION_ATTR_TRUNC: Output packet to port with truncated packet size. * @OVS_ACTION_ATTR_USERSPACE: Send packet to userspace according to nested * %OVS_USERSPACE_ATTR_* attributes. - * @OVS_ACTION_ATTR_PUSH_VLAN: Push a new outermost 802.1Q header onto the - * packet. - * @OVS_ACTION_ATTR_POP_VLAN: Pop the outermost 802.1Q header off the packet. + * @OVS_ACTION_ATTR_PUSH_VLAN: Push a new outermost 802.1Q or 802.1ad header + * onto the packet. + * @OVS_ACTION_ATTR_POP_VLAN: Pop the outermost 802.1Q or 802.1ad header + * from the packet. * @OVS_ACTION_ATTR_SAMPLE: Probabilitically executes actions, as specified in * the nested %OVS_SAMPLE_ATTR_* attributes. * @OVS_ACTION_ATTR_SET: Replaces the contents of an existing header. The -- 2.1.0 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH 0/3] Backport 802.1ad patches
This patch set is to backport 802.1ad patches Eric Garver did to ovs, per discussion in ovs-dev mailing list, this fix patch "openvswitch: upcall: Fix vlan handling" depends on them, l3 patch set depends on this patch set and "openvswitch: upcall: Fix vlan handling". Once this patch set is merged, l3 patch set can be merged. Yi Yang (3): datapath: backport: openvswitch: 802.1ad uapi changes. datapath: backport: vlan: Check for vlan ethernet types for 8021.q or 802.1ad datapath: backport: openvswitch: 802.1AD Flow handling, actions, vlan parsing, netlink attributes datapath/actions.c| 16 +- datapath/flow.c | 65 +++-- datapath/flow.h | 8 +- datapath/flow_netlink.c | 311 +++--- datapath/linux/compat/include/linux/if_vlan.h | 24 +- datapath/linux/compat/include/linux/openvswitch.h | 17 +- datapath/vport.c | 7 +- 7 files changed, 312 insertions(+), 136 deletions(-) -- 2.1.0 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [libvirt] [RFC] Vhost-user backends cross-version migration support
On 02/03/2017 11:11 AM, Daniel P. Berrange wrote: On Fri, Feb 03, 2017 at 10:41:12AM +0100, Maxime Coquelin wrote: On 02/03/2017 10:27 AM, Daniel P. Berrange wrote: On Thu, Feb 02, 2017 at 08:27:18PM +0200, Michael S. Tsirkin wrote: On Thu, Feb 02, 2017 at 06:21:55PM +, Daniel P. Berrange wrote: On Thu, Feb 02, 2017 at 07:31:49PM +0200, Michael S. Tsirkin wrote: On Thu, Feb 02, 2017 at 05:29:08PM +, Daniel P. Berrange wrote: On Thu, Feb 02, 2017 at 07:20:35PM +0200, Michael S. Tsirkin wrote: On Thu, Feb 02, 2017 at 05:10:28PM +, Daniel P. Berrange wrote: On Thu, Feb 02, 2017 at 06:21:55PM +0200, Michael S. Tsirkin wrote: On Thu, Feb 02, 2017 at 03:06:21PM +, Daniel P. Berrange wrote: On Thu, Feb 02, 2017 at 03:14:01PM +0100, Maxime Coquelin wrote: On 02/01/2017 12:41 PM, Daniel P. Berrange wrote: It depends where / how in OVS it needs to be set. The only stuff libvirt does with OVS is to run 'add-port' and 'del-port' commands via the ovs cli tool. We pass through arguments from the port profile stored in the XML config. eg those things in get passed as cli args to the 'add-port' command. Soo if add-port needs this new version string, then we'd need to add the version to the openvswitch virtualport XML. If the version is provided to OVS in a different command, then it would probably be outside scope of libvirt. I think it would make sense to be a parameter of the add-port command. But it would be for vhost-user related add-port command, I didn't find where/if this is managed in libvirt XML. For vhost-user, libvirt does not have any interaction with OVS at all. If the thing that's using the vhost-user UNIX socket, in turn connects to OVS, that's outside scope of libvirt. IOW, for vhost-user OVS it seems like that job is for Nova / os-vif to solve. I don't think they currently understand the issues involved in cross-version migration though. This is a complex subject and easy to get wrong. It would be significantly better to figure it out at libvirt level since it already deals with cross-version migration issues. Libvirt considers vhost-user to be a blackbox though - it just exposes a UNIX socket, and whatever is on the other end is completely opaque. The fact that the other end might plumb the data stream into openvswitch is not something libvirt should know, as we don't want to end up having to add custom code to libvirt for every different vhost-user server impl. IOW, if the version str can be passed to QEMU, and then onto vhost-user backend in QEMU, then libvirt can be involved. If the version str has to be given to openvswitch that's not for libvirt to deall with. Regards, Daniel It's not just about passing it to QEMU. The following is needed: - need to query version when creating VM/device - need to query supported versions when migrating VM Those are both things that nova can do, since it knows what the vhost-user device in question is connected to, and so can query the versions, and check versions before triggering migration with libvirt It can, but then it will need to query libvirt or source for the version string since it's in the XML. No, it wouldn't be in the XML at all. Nova on the source queries what vhostuser version it has and what the target host has, and can prevent the migration if they're incompatible. This is not sufficient. Exactly the same as qemu machine type, this must be preserved from time of install and moved wherever VM goes. Nova has the ability todo that. I dont think libvirt has to be involved at all for this, as all the info can be obtained by nova/os-vif from the vhostuser impl it has configured. Given we are already confused at libvirt level, I find it highly unlikely upper levels will do the right thing. The only confusion is about what must consume the data. I mistakenly thought it was being proposed that the qemu vhostuser backend was being given a version string. Now it seems the version info must be set against the 3rd party component that vhostuser connects to, and that is outside scope of libvirt. That is entirely managed by Nova today, so Nova is the right thing to manage this versioning info. This is my understanding, with as example using Nova: 1. Before starting the VM, Nova queries source host's OVS and all possible destination hosts' OVS to retrieve supported versions 2. Nova chooses the first common version supported by all hosts 3. Nova create the dpdkvhostuser interfaces in OVS with passing the selected version as parameter That should be enough, right? Or maybe I miss something? It isn't realistic to check all possible destination hosts when first starting QEMU, not least becasue they may not yet exist. New hardware may be deployed *after* QEMU is first started. When starting the VM, it should just default to using whatever the latest version is on the host in question. There can be an optional host configuration parameter that admins can set to
[ovs-dev] [PATCH 7/7] ofproto-dpif-xlate: refactor compose_output_action__
The very long function compose_output_action__() has been re-factored to make the different cases for output to patch-port, native tunnel port, kernel tunnel port, recirculation, or termination of a native tunnel at output to LOCAL port clearer. Larger, self-contained blocks have been split out into separate functions. Signed-off-by; Jan ScheurichCo-authored-by: Zoltan Balogh --- ofproto/ofproto-dpif-xlate.c | 432 +++ 1 file changed, 230 insertions(+), 202 deletions(-) diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c index 72dabf8..a19b9f2 100644 --- a/ofproto/ofproto-dpif-xlate.c +++ b/ofproto/ofproto-dpif-xlate.c @@ -3072,40 +3072,26 @@ xlate_flow_is_protected(const struct xlate_ctx *ctx, const struct flow *flow, co xport_in->xbundle->protected && xport_out->xbundle->protected); } -static void -compose_output_action__(struct xlate_ctx *ctx, ofp_port_t ofp_port, -const struct xlate_bond_recirc *xr, bool check_stp) +static bool +check_output_prerequisites(struct xlate_ctx *ctx, + const struct xport *xport, + struct flow *flow, + bool check_stp) { -const struct xport *xport = get_ofp_port(ctx->xbridge, ofp_port); -struct flow_wildcards *wc = ctx->wc; -struct flow *flow = >xin->flow; -struct flow_tnl flow_tnl; -ovs_be16 flow_vlan_tci; -uint32_t flow_pkt_mark; -uint8_t flow_nw_tos; -odp_port_t out_port, odp_port; -bool tnl_push_pop_send = false; -uint8_t dscp; - -/* If 'struct flow' gets additional metadata, we'll need to zero it out - * before traversing a patch port. */ -BUILD_ASSERT_DECL(FLOW_WC_SEQ == 37); -memset(_tnl, 0, sizeof flow_tnl); - if (!xport) { xlate_report(ctx, OFT_WARN, "Nonexistent output port"); -return; +return false; } else if (xport->config & OFPUTIL_PC_NO_FWD) { xlate_report(ctx, OFT_DETAIL, "OFPPC_NO_FWD set, skipping output"); -return; +return false; } else if (ctx->mirror_snaplen != 0 && xport->odp_port == ODPP_NONE) { xlate_report(ctx, OFT_WARN, "Mirror truncate to ODPP_NONE, skipping output"); -return; +return false; } else if (xlate_flow_is_protected(ctx, flow, xport)) { xlate_report(ctx, OFT_WARN, "Flow is between protected ports, skipping output."); -return; +return false; } else if (check_stp) { if (is_stp(>base_flow)) { if (!xport_stp_should_forward_bpdu(xport) && @@ -3119,7 +3105,7 @@ compose_output_action__(struct xlate_ctx *ctx, ofp_port_t ofp_port, "RSTP not managing BPDU in this state, " "skipping bpdu output"); } -return; +return false; } } else if (!xport_stp_forward_state(xport) || !xport_rstp_forward_state(xport)) { @@ -3130,153 +3116,190 @@ compose_output_action__(struct xlate_ctx *ctx, ofp_port_t ofp_port, xlate_report(ctx, OFT_WARN, "RSTP not in forwarding state, skipping output"); } -return; +return false; } } +return true; +} -if (flow->packet_type == PT_ETH && xport->is_layer3 ) { -/* Ethernet packet to L3 outport -> pop ethernet header. */ -flow->packet_type = PACKET_TYPE(OFPHTN_ETHERTYPE, ntohs(flow->dl_type)); +static void +xlate_output_patch_port(struct xlate_ctx *ctx, +const struct xport *xport) +{ +const struct xport *peer = xport->peer; +struct flow *flow = >xin->flow; +struct flow old_flow = ctx->xin->flow; +struct flow_tnl old_flow_tnl_wc = ctx->wc->masks.tunnel; +bool old_conntrack = ctx->conntracked; +bool old_was_mpls = ctx->was_mpls; +ovs_version_t old_version = ctx->xin->tables_version; +struct ofpbuf old_stack = ctx->stack; +uint8_t new_stack[1024]; +struct ofpbuf old_action_set = ctx->action_set; +struct ovs_list *old_trace = ctx->xin->trace; +uint64_t actset_stub[1024 / 8]; + +ofpbuf_use_stub(>stack, new_stack, sizeof new_stack); +ofpbuf_use_stub(>action_set, actset_stub, sizeof actset_stub); +flow->in_port.ofp_port = peer->ofp_port; +flow->metadata = htonll(0); +memset(>tunnel, 0, sizeof flow->tunnel); +flow->tunnel.metadata.tab = ofproto_get_tun_tab( +>xbridge->ofproto->up); +ctx->wc->masks.tunnel.metadata.tab = flow->tunnel.metadata.tab; +memset(flow->regs, 0, sizeof flow->regs); +flow->actset_output = OFPP_UNSET; +clear_conntrack(ctx); +ctx->xin->trace = xlate_report(ctx, OFT_BRIDGE, + "bridge(\"%s\")",
[ovs-dev] [PATCH 6/7] dpif-netlink: Don't send PACKET_TYPE to kernel
The kernel datapath does not support the packet_type match field. Instead it encodes the packet type implicitly by the presence or absence of the Ethernet attribute in the flow key and mask. This patch filters the PACKET_TYPE attribute out of netlink flow key and mask to be sent to the kernel datapath. Signed-off-by; Zoltan Balogh--- lib/dpif-netlink.c | 37 - 1 file changed, 32 insertions(+), 5 deletions(-) diff --git a/lib/dpif-netlink.c b/lib/dpif-netlink.c index a70d11c..5479c3a 100644 --- a/lib/dpif-netlink.c +++ b/lib/dpif-netlink.c @@ -2863,6 +2863,34 @@ dpif_netlink_flow_from_ofpbuf(struct dpif_netlink_flow *flow, return 0; } +static inline void +nl_msg_put_exclude_packet_type(struct ofpbuf* buf, uint16_t type, + const struct nlattr *data, + uint16_t data_len) +{ +const struct nlattr *packet_type = nl_attr_find__(data, data_len, + OVS_KEY_ATTR_PACKET_TYPE); +if (packet_type) { +/* exclude PACKET_TYPE Netlink attribute. */ +ovs_assert(NLA_ALIGN(packet_type->nla_len) == NLA_HDRLEN + sizeof(uint32_t)); + +size_t packet_type_len = NLA_HDRLEN + sizeof(uint32_t); +size_t first_chunk_size = +(size_t)((uint8_t *)packet_type - (uint8_t *)data); +size_t second_chunk_size = data_len - first_chunk_size + - packet_type_len; +uint8_t *first_attr = NULL; +struct nlattr *next_attr = nl_attr_next(packet_type); + +first_attr = nl_msg_put_unspec_uninit(buf, type, + data_len - packet_type_len); +memcpy(first_attr, data, first_chunk_size); +memcpy(first_attr + first_chunk_size, next_attr, second_chunk_size); +} else { +nl_msg_put_unspec(buf, type, data, data_len); +} +} + /* Appends to 'buf' (which must initially be empty) a "struct ovs_header" * followed by Netlink attributes corresponding to 'flow'. */ static void @@ -2889,13 +2917,12 @@ dpif_netlink_flow_to_ofpbuf(const struct dpif_netlink_flow *flow, } if (!flow->ufid_terse || !flow->ufid_present) { if (flow->key_len) { -nl_msg_put_unspec(buf, OVS_FLOW_ATTR_KEY, - flow->key, flow->key_len); +nl_msg_put_exclude_packet_type(buf, OVS_FLOW_ATTR_KEY, flow->key, + flow->key_len); } - if (flow->mask_len) { -nl_msg_put_unspec(buf, OVS_FLOW_ATTR_MASK, - flow->mask, flow->mask_len); +nl_msg_put_exclude_packet_type(buf, OVS_FLOW_ATTR_MASK, flow->mask, + flow->mask_len); } if (flow->actions || flow->actions_len) { nl_msg_put_unspec(buf, OVS_FLOW_ATTR_ACTIONS, -- 1.9.1 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH 3/7] userspace: Switching of L3 packets in L2 pipeline
Ports have a new layer3 attribute if they send/receive L3 packets. The packet_type included in structs dp_packet and flow is considered in ofproto-dpif. The classical L2 match fields (dl_src, dl_dst, dl_type, and vlan_tci, vlan_vid, vlan_pcp) now have Ethernet as pre-requisite. A dummy ethernet header is pushed to L3 packets received from L3 ports before the pipeline processing starts. The ethernet header is popped before sending a packet to a L3 port. For datapath ports that can receive L2 or L3 packets, the packet_type becomes part of the flow key for datapath flows and is handled appropriately in dpif-netdev. Signed-off-by: Lorand JakabSigned-off-by: Simon Horman Signed-off-by: Jiri Benc Signed-off-by: Yi Yang Signed-off-by: Jan Scheurich Co-authored-by: Zoltan Balogh --- build-aux/extract-ofp-fields | 1 + datapath/linux/compat/include/linux/openvswitch.h | 2 + include/openvswitch/match.h | 1 + include/openvswitch/meta-flow.h | 15 +- lib/dpif-netdev.c | 57 lib/dpif-netlink.c| 2 +- lib/match.c | 19 ++- lib/meta-flow.c | 2 + lib/netdev-vport.c| 8 +- lib/netdev.h | 1 + lib/odp-execute.c | 2 + lib/odp-util.c| 159 ++ lib/odp-util.h| 6 +- lib/packets.h | 1 - ofproto/ofproto-dpif-sflow.c | 1 + ofproto/ofproto-dpif-upcall.c | 4 +- ofproto/ofproto-dpif-xlate.c | 50 ++- ofproto/ofproto-dpif.c| 6 +- 18 files changed, 251 insertions(+), 86 deletions(-) diff --git a/build-aux/extract-ofp-fields b/build-aux/extract-ofp-fields index 40f1bb2..fc6ce1e 100755 --- a/build-aux/extract-ofp-fields +++ b/build-aux/extract-ofp-fields @@ -39,6 +39,7 @@ FORMATTING = {"decimal":("MFS_DECIMAL", 1, 8), "TCP flags": ("MFS_TCP_FLAGS",2, 2)} PREREQS = {"none": "MFP_NONE", + "Ethernet": "MFP_ETHERNET", "ARP": "MFP_ARP", "VLAN VID": "MFP_VLAN_VID", "IPv4": "MFP_IPV4", diff --git a/datapath/linux/compat/include/linux/openvswitch.h b/datapath/linux/compat/include/linux/openvswitch.h index d78d8a8..9e4f0b1 100644 --- a/datapath/linux/compat/include/linux/openvswitch.h +++ b/datapath/linux/compat/include/linux/openvswitch.h @@ -361,6 +361,8 @@ enum ovs_key_attr { /* Only used within kernel data path. */ OVS_KEY_ATTR_TUNNEL_INFO, /* struct ovs_tunnel_info */ #endif + + OVS_KEY_ATTR_PACKET_TYPE, /* be32 packet type */ __OVS_KEY_ATTR_MAX }; diff --git a/include/openvswitch/match.h b/include/openvswitch/match.h index 0b5f050..4f4e096 100644 --- a/include/openvswitch/match.h +++ b/include/openvswitch/match.h @@ -99,6 +99,7 @@ void match_set_ct_mark(struct match *, uint32_t ct_mark); void match_set_ct_mark_masked(struct match *, uint32_t ct_mark, uint32_t mask); void match_set_ct_label(struct match *, ovs_u128 ct_label); void match_set_ct_label_masked(struct match *, ovs_u128 ct_label, ovs_u128 mask); +void match_set_packet_type(struct match *, ovs_be32 packet_type); void match_set_skb_priority(struct match *, uint32_t skb_priority); void match_set_dl_type(struct match *, ovs_be16); void match_set_dl_src(struct match *, const struct eth_addr ); diff --git a/include/openvswitch/meta-flow.h b/include/openvswitch/meta-flow.h index d5c0971..9daca36 100644 --- a/include/openvswitch/meta-flow.h +++ b/include/openvswitch/meta-flow.h @@ -854,7 +854,7 @@ enum OVS_PACKED_ENUM mf_field_id { * Type: MAC. * Maskable: bitwise. * Formatting: Ethernet. - * Prerequisites: none. + * Prerequisites: Ethernet. * Access: read/write. * NXM: NXM_OF_ETH_SRC(2) since v1.1. * OXM: OXM_OF_ETH_SRC(4) since OF1.2 and v1.7. @@ -870,7 +870,7 @@ enum OVS_PACKED_ENUM mf_field_id { * Type: MAC. * Maskable: bitwise. * Formatting: Ethernet. - * Prerequisites: none. + * Prerequisites: Ethernet. * Access: read/write. * NXM: NXM_OF_ETH_DST(1) since v1.1. * OXM: OXM_OF_ETH_DST(3) since OF1.2 and v1.7. @@ -889,7 +889,7 @@ enum OVS_PACKED_ENUM mf_field_id { * Type: be16. * Maskable: no. * Formatting: hexadecimal. - * Prerequisites: none. + * Prerequisites: Ethernet. * Access: read-only. * NXM: NXM_OF_ETH_TYPE(3) since v1.1. * OXM: OXM_OF_ETH_TYPE(5) since OF1.2 and v1.7. @@ -919,7 +919,7 @@ enum
[ovs-dev] [PATCH 5/7] userspace: L3 tunnel support for GRE and LISP
Add a boolean "layer3" configuration option for tunnel vports. The layer3 option defaults to false for all ports except LISP. GRE ports accept both true and false for "layer3". A tunnel vport configured with layer3=true receives L3 packets. which are then converted to Ethernet packets by pushing a dummy Ethernet heder at the ingress of the OpenFlow pipeline. The Ethernet header of a packet is stripped before sending to a layer3 tunnel vport. Presently a single GRE vport cannot carry both L2 and L3 packets. But it is possible to create two GRE vports representing the same GRE tunel, one with layer3=false, the other with layer3=true. L2 packet from the tunnel are received on the first vport, L3 packets on the second. The controller must send packets to the layer3 GRE vport to tunnel them without their Ethernet header. Units tests have been added to check the L3 tunnel handling. Signed-off-by: Simon HormanSigned-off-by: Jiri Benc Signed-off-by: Yi Yang Signed-off-by; Jan Scheurich Co-authored-by: Zoltan Balogh --- lib/netdev-native-tnl.c | 27 +- lib/netdev-vport.c| 14 -- ofproto/tunnel.c | 15 +-- tests/tunnel-push-pop-ipv6.at | 25 ++-- tests/tunnel-push-pop.at | 45 +++ tests/tunnel.at | 28 +-- vswitchd/vswitch.xml | 13 + 7 files changed, 122 insertions(+), 45 deletions(-) diff --git a/lib/netdev-native-tnl.c b/lib/netdev-native-tnl.c index c730e72..7e4b7be 100644 --- a/lib/netdev-native-tnl.c +++ b/lib/netdev-native-tnl.c @@ -154,6 +154,10 @@ netdev_tnl_push_ip_header(struct dp_packet *packet, *ip_tot_size = dp_packet_size(packet) - sizeof (struct eth_header); memcpy(eth, header, size); +/* The encapsulated packet has type Ethernet. Adjust dp_packet. */ +dp_packet_set_packet_type(packet, PT_ETH); +dp_packet_reset_offsets(packet); +packet->l3_ofs = sizeof (struct eth_header); if (netdev_tnl_is_header_ipv6(header)) { ip6 = netdev_tnl_ipv6_hdr(eth); @@ -345,6 +349,7 @@ parse_gre_header(struct dp_packet *packet, ovs_16aligned_be32 *options; int hlen; unsigned int ulen; +uint16_t greh_protocol; greh = netdev_tnl_ip_extract_tnl_md(packet, tnl, ); if (!greh) { @@ -355,10 +360,6 @@ parse_gre_header(struct dp_packet *packet, return -EINVAL; } -if (greh->protocol != htons(ETH_TYPE_TEB)) { -return -EINVAL; -} - hlen = ulen + gre_header_len(greh->flags); if (hlen > dp_packet_size(packet)) { return -EINVAL; @@ -388,6 +389,15 @@ parse_gre_header(struct dp_packet *packet, options++; } +/* Set the new packet type depending on the GRE protocol field. */ +greh_protocol = ntohs(greh->protocol); +if (greh_protocol == ETH_TYPE_TEB) { +packet->packet_type = PT_ETH; +} else { +/* Allow all GRE protocol values as Ethertypes */ +packet->packet_type = PACKET_TYPE(OFPHTN_ETHERTYPE, greh_protocol); +} + return hlen; } @@ -451,7 +461,12 @@ netdev_gre_build_header(const struct netdev *netdev, greh = netdev_tnl_ip_build_header(data, params, IPPROTO_GRE); -greh->protocol = htons(ETH_TYPE_TEB); +/* XXX: Needs fixing for versatile tunnels in PTAP bridges */ +if (tnl_cfg->is_layer3) { +greh->protocol = params->flow->dl_type; +} else { +greh->protocol = htons(ETH_TYPE_TEB); +} greh->flags = 0; options = (ovs_16aligned_be32 *) (greh + 1); @@ -504,6 +519,7 @@ netdev_vxlan_pop_header(struct dp_packet *packet) tnl->tun_id = htonll(ntohl(get_16aligned_be32(>vx_vni)) >> 8); tnl->flags |= FLOW_TNL_F_KEY; +dp_packet_set_packet_type(packet, PT_ETH); dp_packet_reset_packet(packet, hlen + VXLAN_HLEN); return packet; @@ -583,6 +599,7 @@ netdev_geneve_pop_header(struct dp_packet *packet) tnl->metadata.present.len = opts_len; tnl->flags |= FLOW_TNL_F_UDPIF; +dp_packet_set_packet_type(packet, PT_ETH); dp_packet_reset_packet(packet, hlen); return packet; diff --git a/lib/netdev-vport.c b/lib/netdev-vport.c index a6c1e23..8653e96 100644 --- a/lib/netdev-vport.c +++ b/lib/netdev-vport.c @@ -410,7 +410,7 @@ set_tunnel_config(struct netdev *dev_, const struct smap *args, char **errp) const char *name = netdev_get_name(dev_); const char *type = netdev_get_type(dev_); struct ds errors = DS_EMPTY_INITIALIZER; -bool needs_dst_port, has_csum; +bool needs_dst_port, has_csum, optional_layer3; uint16_t dst_proto = 0, src_proto = 0; struct netdev_tunnel_config tnl_cfg; struct smap_node *node; @@ -418,6 +418,7 @@ set_tunnel_config(struct netdev *dev_, const struct smap *args, char **errp)
[ovs-dev] [PATCH 2/7] userspace: Support for push_eth and pop_eth actions
Add support for actions push_eth and pop_eth to the netdev datapath and the supporting libraries. This patch relies on the support for these actions in the kernel datapath to be present. Signed-off-by: Lorand JakabSigned-off-by: Simon Horman Signed-off-by: Jiri Benc Signed-off-by: Yi Yang Signed-off-by: Jean Tourrilhes Signed-off-by: Jan Scheurich Co-authored-by: Zoltan Balogh --- datapath/linux/compat/include/linux/openvswitch.h | 17 ++ lib/dpif-netdev.c | 2 + lib/dpif.c| 2 + lib/odp-execute.c | 17 ++ lib/odp-util.c| 72 ++- lib/odp-util.h| 7 +++ lib/packets.c | 41 + lib/packets.h | 4 ++ ofproto/ofproto-dpif-sflow.c | 7 +++ 9 files changed, 167 insertions(+), 2 deletions(-) diff --git a/datapath/linux/compat/include/linux/openvswitch.h b/datapath/linux/compat/include/linux/openvswitch.h index 425d3a4..d78d8a8 100644 --- a/datapath/linux/compat/include/linux/openvswitch.h +++ b/datapath/linux/compat/include/linux/openvswitch.h @@ -749,6 +749,18 @@ enum ovs_nat_attr { #define OVS_NAT_ATTR_MAX (__OVS_NAT_ATTR_MAX - 1) /** + * struct ovs_action_push_eth - %OVS_ACTION_ATTR_PUSH_ETH action argument. + * @addresses: Source and destination MAC addresses. + * @eth_type: Ethernet type + */ +struct ovs_action_push_eth { + struct ovs_key_ethernet addresses; +#ifndef __KERNEL__ + __be16 eth_type; +#endif +}; + +/** * enum ovs_action_attr - Action types. * * @OVS_ACTION_ATTR_OUTPUT: Output packet to port. @@ -782,6 +794,9 @@ enum ovs_nat_attr { * is no MPLS label stack, as determined by ethertype, no action is taken. * @OVS_ACTION_ATTR_CT: Track the connection. Populate the conntrack-related * entries in the flow key. + * @OVS_ACTION_ATTR_PUSH_ETH: Push a new outermost Ethernet header onto the + * packet. + * @OVS_ACTION_ATTR_POP_ETH: Pop the outermost Ethernet header off the packet. * * Only a single header can be set with a single %OVS_ACTION_ATTR_SET. Not all * fields within a header are modifiable, e.g. the IPv4 protocol and fragment @@ -814,6 +829,8 @@ enum ovs_action_attr { * bits. */ OVS_ACTION_ATTR_CT, /* Nested OVS_CT_ATTR_* . */ OVS_ACTION_ATTR_TRUNC,/* u32 struct ovs_action_trunc. */ + OVS_ACTION_ATTR_PUSH_ETH, /* struct ovs_action_push_eth. */ + OVS_ACTION_ATTR_POP_ETH, /* No argument. */ #ifndef __KERNEL__ OVS_ACTION_ATTR_TUNNEL_PUSH, /* struct ovs_action_push_tnl*/ diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index b630824..973814c 100644 --- a/lib/dpif-netdev.c +++ b/lib/dpif-netdev.c @@ -4731,6 +4731,8 @@ dp_execute_cb(void *aux_, struct dp_packet_batch *packets_, case OVS_ACTION_ATTR_UNSPEC: case OVS_ACTION_ATTR_TRUNC: case OVS_ACTION_ATTR_CLONE: +case OVS_ACTION_ATTR_PUSH_ETH: +case OVS_ACTION_ATTR_POP_ETH: case __OVS_ACTION_ATTR_MAX: OVS_NOT_REACHED(); } diff --git a/lib/dpif.c b/lib/dpif.c index d9816e5..3f82d80 100644 --- a/lib/dpif.c +++ b/lib/dpif.c @@ -1183,6 +1183,8 @@ dpif_execute_helper_cb(void *aux_, struct dp_packet_batch *packets_, case OVS_ACTION_ATTR_SAMPLE: case OVS_ACTION_ATTR_TRUNC: case OVS_ACTION_ATTR_CLONE: +case OVS_ACTION_ATTR_PUSH_ETH: +case OVS_ACTION_ATTR_POP_ETH: case OVS_ACTION_ATTR_UNSPEC: case __OVS_ACTION_ATTR_MAX: OVS_NOT_REACHED(); diff --git a/lib/odp-execute.c b/lib/odp-execute.c index 9612ef5..4ed4475 100644 --- a/lib/odp-execute.c +++ b/lib/odp-execute.c @@ -573,6 +573,8 @@ requires_datapath_assistance(const struct nlattr *a) case OVS_ACTION_ATTR_POP_MPLS: case OVS_ACTION_ATTR_TRUNC: case OVS_ACTION_ATTR_CLONE: +case OVS_ACTION_ATTR_POP_ETH: +case OVS_ACTION_ATTR_PUSH_ETH: return false; case OVS_ACTION_ATTR_UNSPEC: @@ -715,6 +717,21 @@ odp_execute_actions(void *dp, struct dp_packet_batch *batch, bool steal, return; } break; +case OVS_ACTION_ATTR_PUSH_ETH: { +const struct ovs_action_push_eth *eth = nl_attr_get(a); + +DP_PACKET_BATCH_FOR_EACH (packet, batch) { +push_eth(packet, >addresses.eth_dst, + >addresses.eth_src, eth->eth_type); +} +break; +} + +case OVS_ACTION_ATTR_POP_ETH: +DP_PACKET_BATCH_FOR_EACH (packet, batch) { +pop_eth(packet); +} +break; case OVS_ACTION_ATTR_OUTPUT: case
[ovs-dev] [PATCH 4/7] ofproto-dpif-upcall: Intialize dump-seq of new flow to zero
This forces updating of flow stat at the next re-validation, even for flows that are being created when the revalidation has already commenced. It enables reliable testing of fast path flow stats using ovs-appctl time/warp after flow creation. Signed-off-by: Jan Scheurich--- ofproto/ofproto-dpif-upcall.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/ofproto/ofproto-dpif-upcall.c b/ofproto/ofproto-dpif-upcall.c index 96a89b7..49fed77 100644 --- a/ofproto/ofproto-dpif-upcall.c +++ b/ofproto/ofproto-dpif-upcall.c @@ -1118,7 +1118,7 @@ upcall_xlate(struct udpif *udpif, struct upcall *upcall, * with pushing its stats eventually. */ } -upcall->dump_seq = seq_read(udpif->dump_seq); +upcall->dump_seq = 0; upcall->reval_seq = seq_read(udpif->reval_seq); xlate_actions(, >xout); -- 1.9.1 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH 1/7] userspace: Add packet_type in dp_packet and flow
This commit adds a packet_type attribute to the structs dp_packet and flow to explicitly carry the type of the packet as preparation for the introduction of the so-called packet type-aware pipeline (PTAP) in OVS. The packet_type is a big-endian 32 bit integer with the encoding as specified in OpenFlow verion 1.5. The upper 16 bits contain the packet type name space. Pre-defined values are defined in openflow-common.h: enum ofp_header_type_namespaces { OFPHTN_ONF = 0, /* ONF namespace. */ OFPHTN_ETHERTYPE = 1, /* ns_type is an Ethertype. */ OFPHTN_IP_PROTO = 2,/* ns_type is a IP protocol number. */ OFPHTN_UDP_TCP_PORT = 3,/* ns_type is a TCP or UDP port. */ OFPHTN_IPV4_OPTION = 4, /* ns_type is an IPv4 option number. */ }; The lower 16 bits specify the actual type in the context of the name space. Only name spaces 0 and 1 will be supported for now. For name space OFPHTN_ONF the relevant packet type is 0 (Ethernet). This is the default packet_type in OVS and the only one supported so far. Packets of type (OFPHTN_ONF, 0) are called Ethernet packets. In name space OFPHTN_ETHERTYPE the type is the Ethertype of the packet. A packet of type (OFPHTN_ETHERTYPE, ) is a standard L2 packet with the Ethernet header (and any VLAN tags) removed to expose the L3 (or L2.5) payload of the packet. These will simply be called L3 packets. The Ethernet address fields dl_src and dl_dst in struct flow are not applicable for an L3 packet and must be zero. However, to maintain compatibility with the large code base, we have chosen to copy the Ethertype of an L3 packet into the dl_type field of struct flow. This does not mean that it will be possible to match on dl_type for L3 packets with PTAP later on. Matching must be done on packet_type instead. New dp_packets are initialized with packet_type Ethernet. Ports that receive L3 packets will have to explicitly adjust the packet_type. Signed-off-by: Jean TourrilhesSigned-off-by: Jan Scheurich Co-authored-by: Zoltan Balogh --- include/openflow/openflow-common.h | 9 include/openvswitch/flow.h | 30 +-- include/openvswitch/ofp-print.h| 9 +++- lib/dp-packet.c| 4 ++ lib/dp-packet.h| 37 +- lib/dpif-netdev.c | 3 +- lib/dpif-netlink.c | 16 ++ lib/dpif.c | 6 +-- lib/flow.c | 100 +++-- lib/flow.h | 9 lib/match.c| 2 +- lib/netdev-bsd.c | 1 + lib/netdev-dummy.c | 5 ++ lib/netdev-linux.c | 4 +- lib/nx-match.c | 2 +- lib/odp-util.h | 2 +- lib/ofp-print.c| 28 +-- lib/ofp-util.c | 2 +- lib/packets.c | 4 ++ lib/pcap-file.c| 4 +- ofproto/ofproto-dpif-rid.h | 2 +- ofproto/ofproto-dpif-xlate.c | 2 +- tests/test-flows.c | 2 +- 23 files changed, 217 insertions(+), 66 deletions(-) diff --git a/include/openflow/openflow-common.h b/include/openflow/openflow-common.h index 7b619a9..2382682 100644 --- a/include/openflow/openflow-common.h +++ b/include/openflow/openflow-common.h @@ -454,4 +454,13 @@ enum ofp_table_config { OFPTC14_VACANCY_EVENTS= 1 << 3, /* Enable vacancy events. */ }; +/* Header and packet type name spaces. */ +enum ofp_header_type_namespaces { +OFPHTN_ONF = 0, /* ONF namespace. */ +OFPHTN_ETHERTYPE = 1, /* ns_type is an Ethertype. */ +OFPHTN_IP_PROTO = 2,/* ns_type is a IP protocol number. */ +OFPHTN_UDP_TCP_PORT = 3,/* ns_type is a TCP or UDP port. */ +OFPHTN_IPV4_OPTION = 4, /* ns_type is an IPv4 option number. */ +}; + #endif /* openflow/openflow-common.h */ diff --git a/include/openvswitch/flow.h b/include/openvswitch/flow.h index df80dfe..f16060b 100644 --- a/include/openvswitch/flow.h +++ b/include/openvswitch/flow.h @@ -19,11 +19,13 @@ #include "openflow/nicira-ext.h" #include "openvswitch/packets.h" #include "openvswitch/util.h" +#include "byte-order.h" +#include "lib/packets.h" /* This sequence number should be incremented whenever anything involving flows * or the wildcarding of flows changes. This will cause build assertion * failures in places which likely need to be updated. */ -#define FLOW_WC_SEQ 36 +#define FLOW_WC_SEQ 37 /* Number of Open vSwitch extension 32-bit registers. */ #define FLOW_N_REGS 16 @@ -61,6 +63,22 @@ const char *flow_tun_flag_to_string(uint32_t flags); /* Maximum number of supported MPLS labels. */ #define FLOW_MAX_MPLS_LABELS 3 +#define PACKET_TYPE(NS, NS_TYPE) CONSTANT_HTONL((NS) << 16 | (NS_TYPE)) +#define
Re: [ovs-dev] [libvirt] [RFC] Vhost-user backends cross-version migration support
On Fri, Feb 03, 2017 at 10:41:12AM +0100, Maxime Coquelin wrote: > > > On 02/03/2017 10:27 AM, Daniel P. Berrange wrote: > > On Thu, Feb 02, 2017 at 08:27:18PM +0200, Michael S. Tsirkin wrote: > > > On Thu, Feb 02, 2017 at 06:21:55PM +, Daniel P. Berrange wrote: > > > > On Thu, Feb 02, 2017 at 07:31:49PM +0200, Michael S. Tsirkin wrote: > > > > > On Thu, Feb 02, 2017 at 05:29:08PM +, Daniel P. Berrange wrote: > > > > > > On Thu, Feb 02, 2017 at 07:20:35PM +0200, Michael S. Tsirkin wrote: > > > > > > > On Thu, Feb 02, 2017 at 05:10:28PM +, Daniel P. Berrange > > > > > > > wrote: > > > > > > > > On Thu, Feb 02, 2017 at 06:21:55PM +0200, Michael S. Tsirkin > > > > > > > > wrote: > > > > > > > > > On Thu, Feb 02, 2017 at 03:06:21PM +, Daniel P. Berrange > > > > > > > > > wrote: > > > > > > > > > > On Thu, Feb 02, 2017 at 03:14:01PM +0100, Maxime Coquelin > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On 02/01/2017 12:41 PM, Daniel P. Berrange wrote: > > > > > > > > > > > > > > > > > > > > > > > > It depends where / how in OVS it needs to be set. The > > > > > > > > > > > > only stuff libvirt > > > > > > > > > > > > does with OVS is to run 'add-port' and 'del-port' > > > > > > > > > > > > commands via the ovs > > > > > > > > > > > > cli tool. We pass through arguments from the port > > > > > > > > > > > > profile stored in the > > > > > > > > > > > > XML config. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > interfaceid='09b11c53-8b5c-4eeb-8f00-d84eaa0aaa4f'/> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > eg those things in get passed as cli args > > > > > > > > > > > > to the 'add-port' > > > > > > > > > > > > command. Soo if add-port needs this new version string, > > > > > > > > > > > > then we'd need > > > > > > > > > > > > to add the version to the openvswitch virtualport XML. > > > > > > > > > > > > > > > > > > > > > > > > If the version is provided to OVS in a different > > > > > > > > > > > > command, then it would > > > > > > > > > > > > probably be outside scope of libvirt. > > > > > > > > > > > > > > > > > > > > > > I think it would make sense to be a parameter of the > > > > > > > > > > > add-port command. > > > > > > > > > > > But it would be for vhost-user related add-port command, > > > > > > > > > > > I didn't find > > > > > > > > > > > where/if this is managed in libvirt XML. > > > > > > > > > > > > > > > > > > > > For vhost-user, libvirt does not have any interaction with > > > > > > > > > > OVS at > > > > > > > > > > all. If the thing that's using the vhost-user UNIX socket, > > > > > > > > > > in turn > > > > > > > > > > connects to OVS, that's outside scope of libvirt. IOW, for > > > > > > > > > > vhost-user > > > > > > > > > > OVS it seems like that job is for Nova / os-vif to solve. > > > > > > > > > > > > > > > > > > I don't think they currently understand the issues involved in > > > > > > > > > cross-version migration though. This is a complex subject > > > > > > > > > and easy to > > > > > > > > > get wrong. It would be significantly better to figure it out > > > > > > > > > at libvirt > > > > > > > > > level since it already deals with cross-version migration > > > > > > > > > issues. > > > > > > > > > > > > > > > > Libvirt considers vhost-user to be a blackbox though - it just > > > > > > > > exposes > > > > > > > > a UNIX socket, and whatever is on the other end is completely > > > > > > > > opaque. > > > > > > > > The fact that the other end might plumb the data stream into > > > > > > > > openvswitch > > > > > > > > is not something libvirt should know, as we don't want to end > > > > > > > > up having > > > > > > > > to add custom code to libvirt for every different vhost-user > > > > > > > > server > > > > > > > > impl. > > > > > > > > > > > > > > > > IOW, if the version str can be passed to QEMU, and then onto > > > > > > > > vhost-user > > > > > > > > backend in QEMU, then libvirt can be involved. If the version > > > > > > > > str has > > > > > > > > to be given to openvswitch that's not for libvirt to deall with. > > > > > > > > > > > > > > > > Regards, > > > > > > > > Daniel > > > > > > > > > > > > > > It's not just about passing it to QEMU. The following is needed: > > > > > > > - need to query version when creating VM/device > > > > > > > - need to query supported versions when migrating VM > > > > > > > > > > > > Those are both things that nova can do, since it knows what the > > > > > > vhost-user > > > > > > device in question is connected to, and so can query the versions, > > > > > > and check > > > > > > versions before triggering migration with libvirt > > > > > > > > > > It can, but then it will need to query libvirt or source for the > > > > > version > > > > > string since it's in
Re: [ovs-dev] [libvirt] [RFC] Vhost-user backends cross-version migration support
On 02/03/2017 10:27 AM, Daniel P. Berrange wrote: On Thu, Feb 02, 2017 at 08:27:18PM +0200, Michael S. Tsirkin wrote: On Thu, Feb 02, 2017 at 06:21:55PM +, Daniel P. Berrange wrote: On Thu, Feb 02, 2017 at 07:31:49PM +0200, Michael S. Tsirkin wrote: On Thu, Feb 02, 2017 at 05:29:08PM +, Daniel P. Berrange wrote: On Thu, Feb 02, 2017 at 07:20:35PM +0200, Michael S. Tsirkin wrote: On Thu, Feb 02, 2017 at 05:10:28PM +, Daniel P. Berrange wrote: On Thu, Feb 02, 2017 at 06:21:55PM +0200, Michael S. Tsirkin wrote: On Thu, Feb 02, 2017 at 03:06:21PM +, Daniel P. Berrange wrote: On Thu, Feb 02, 2017 at 03:14:01PM +0100, Maxime Coquelin wrote: On 02/01/2017 12:41 PM, Daniel P. Berrange wrote: It depends where / how in OVS it needs to be set. The only stuff libvirt does with OVS is to run 'add-port' and 'del-port' commands via the ovs cli tool. We pass through arguments from the port profile stored in the XML config. eg those things in get passed as cli args to the 'add-port' command. Soo if add-port needs this new version string, then we'd need to add the version to the openvswitch virtualport XML. If the version is provided to OVS in a different command, then it would probably be outside scope of libvirt. I think it would make sense to be a parameter of the add-port command. But it would be for vhost-user related add-port command, I didn't find where/if this is managed in libvirt XML. For vhost-user, libvirt does not have any interaction with OVS at all. If the thing that's using the vhost-user UNIX socket, in turn connects to OVS, that's outside scope of libvirt. IOW, for vhost-user OVS it seems like that job is for Nova / os-vif to solve. I don't think they currently understand the issues involved in cross-version migration though. This is a complex subject and easy to get wrong. It would be significantly better to figure it out at libvirt level since it already deals with cross-version migration issues. Libvirt considers vhost-user to be a blackbox though - it just exposes a UNIX socket, and whatever is on the other end is completely opaque. The fact that the other end might plumb the data stream into openvswitch is not something libvirt should know, as we don't want to end up having to add custom code to libvirt for every different vhost-user server impl. IOW, if the version str can be passed to QEMU, and then onto vhost-user backend in QEMU, then libvirt can be involved. If the version str has to be given to openvswitch that's not for libvirt to deall with. Regards, Daniel It's not just about passing it to QEMU. The following is needed: - need to query version when creating VM/device - need to query supported versions when migrating VM Those are both things that nova can do, since it knows what the vhost-user device in question is connected to, and so can query the versions, and check versions before triggering migration with libvirt It can, but then it will need to query libvirt or source for the version string since it's in the XML. No, it wouldn't be in the XML at all. Nova on the source queries what vhostuser version it has and what the target host has, and can prevent the migration if they're incompatible. This is not sufficient. Exactly the same as qemu machine type, this must be preserved from time of install and moved wherever VM goes. Nova has the ability todo that. I dont think libvirt has to be involved at all for this, as all the info can be obtained by nova/os-vif from the vhostuser impl it has configured. Given we are already confused at libvirt level, I find it highly unlikely upper levels will do the right thing. The only confusion is about what must consume the data. I mistakenly thought it was being proposed that the qemu vhostuser backend was being given a version string. Now it seems the version info must be set against the 3rd party component that vhostuser connects to, and that is outside scope of libvirt. That is entirely managed by Nova today, so Nova is the right thing to manage this versioning info. This is my understanding, with as example using Nova: 1. Before starting the VM, Nova queries source host's OVS and all possible destination hosts' OVS to retrieve supported versions 2. Nova chooses the first common version supported by all hosts 3. Nova create the dpdkvhostuser interfaces in OVS with passing the selected version as parameter That should be enough, right? Or maybe I miss something? Thanks, Maxime ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [libvirt] [RFC] Vhost-user backends cross-version migration support
On Thu, Feb 02, 2017 at 08:27:18PM +0200, Michael S. Tsirkin wrote: > On Thu, Feb 02, 2017 at 06:21:55PM +, Daniel P. Berrange wrote: > > On Thu, Feb 02, 2017 at 07:31:49PM +0200, Michael S. Tsirkin wrote: > > > On Thu, Feb 02, 2017 at 05:29:08PM +, Daniel P. Berrange wrote: > > > > On Thu, Feb 02, 2017 at 07:20:35PM +0200, Michael S. Tsirkin wrote: > > > > > On Thu, Feb 02, 2017 at 05:10:28PM +, Daniel P. Berrange wrote: > > > > > > On Thu, Feb 02, 2017 at 06:21:55PM +0200, Michael S. Tsirkin wrote: > > > > > > > On Thu, Feb 02, 2017 at 03:06:21PM +, Daniel P. Berrange > > > > > > > wrote: > > > > > > > > On Thu, Feb 02, 2017 at 03:14:01PM +0100, Maxime Coquelin wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > On 02/01/2017 12:41 PM, Daniel P. Berrange wrote: > > > > > > > > > > > > > > > > > > > > It depends where / how in OVS it needs to be set. The only > > > > > > > > > > stuff libvirt > > > > > > > > > > does with OVS is to run 'add-port' and 'del-port' commands > > > > > > > > > > via the ovs > > > > > > > > > > cli tool. We pass through arguments from the port profile > > > > > > > > > > stored in the > > > > > > > > > > XML config. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > interfaceid='09b11c53-8b5c-4eeb-8f00-d84eaa0aaa4f'/> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > eg those things in get passed as cli args to > > > > > > > > > > the 'add-port' > > > > > > > > > > command. Soo if add-port needs this new version string, > > > > > > > > > > then we'd need > > > > > > > > > > to add the version to the openvswitch virtualport XML. > > > > > > > > > > > > > > > > > > > > If the version is provided to OVS in a different command, > > > > > > > > > > then it would > > > > > > > > > > probably be outside scope of libvirt. > > > > > > > > > > > > > > > > > > I think it would make sense to be a parameter of the add-port > > > > > > > > > command. > > > > > > > > > But it would be for vhost-user related add-port command, I > > > > > > > > > didn't find > > > > > > > > > where/if this is managed in libvirt XML. > > > > > > > > > > > > > > > > For vhost-user, libvirt does not have any interaction with OVS > > > > > > > > at > > > > > > > > all. If the thing that's using the vhost-user UNIX socket, in > > > > > > > > turn > > > > > > > > connects to OVS, that's outside scope of libvirt. IOW, for > > > > > > > > vhost-user > > > > > > > > OVS it seems like that job is for Nova / os-vif to solve. > > > > > > > > > > > > > > I don't think they currently understand the issues involved in > > > > > > > cross-version migration though. This is a complex subject and > > > > > > > easy to > > > > > > > get wrong. It would be significantly better to figure it out at > > > > > > > libvirt > > > > > > > level since it already deals with cross-version migration issues. > > > > > > > > > > > > Libvirt considers vhost-user to be a blackbox though - it just > > > > > > exposes > > > > > > a UNIX socket, and whatever is on the other end is completely > > > > > > opaque. > > > > > > The fact that the other end might plumb the data stream into > > > > > > openvswitch > > > > > > is not something libvirt should know, as we don't want to end up > > > > > > having > > > > > > to add custom code to libvirt for every different vhost-user server > > > > > > impl. > > > > > > > > > > > > IOW, if the version str can be passed to QEMU, and then onto > > > > > > vhost-user > > > > > > backend in QEMU, then libvirt can be involved. If the version str > > > > > > has > > > > > > to be given to openvswitch that's not for libvirt to deall with. > > > > > > > > > > > > Regards, > > > > > > Daniel > > > > > > > > > > It's not just about passing it to QEMU. The following is needed: > > > > > - need to query version when creating VM/device > > > > > - need to query supported versions when migrating VM > > > > > > > > Those are both things that nova can do, since it knows what the > > > > vhost-user > > > > device in question is connected to, and so can query the versions, and > > > > check > > > > versions before triggering migration with libvirt > > > > > > It can, but then it will need to query libvirt or source for the version > > > string since it's in the XML. > > > > No, it wouldn't be in the XML at all. Nova on the source queries what > > vhostuser version it has and what the target host has, and can prevent > > the migration if they're incompatible. > > This is not sufficient. Exactly the same as qemu machine type, > this must be preserved from time of install and moved > wherever VM goes. Nova has the ability todo that. > > I dont think libvirt has to be > > involved at all for this, as all the info can be obtained by nova/os-vif > > from the vhostuser impl it has configured. > > Given we are already confused at libvirt level,