Re: [ovs-dev] [PATCH] ofproto/bond: Fix bond/show when all interfaces are disabled
> On Feb 17, 2017, at 4:49 AM, Andy Zhou wrote:
>
> Without this patch, when all slaves are disabled, the 'bond/show'
> command still shows the mac address of last active slave in
> 'active slave mac' output. This patch clears them to zeros.
>
> Signed-off-by: Andy Zhou
> ---
>  ofproto/bond.c | 12 ++++++++----
>  1 file changed, 8 insertions(+), 4 deletions(-)
>
> diff --git a/ofproto/bond.c b/ofproto/bond.c
> index c138593..260023e 100644
> --- a/ofproto/bond.c
> +++ b/ofproto/bond.c
> @@ -488,10 +488,13 @@ bond_find_slave_by_mac(const struct bond *bond, const struct eth_addr mac)
>  static void
>  bond_active_slave_changed(struct bond *bond)
>  {
> -    struct eth_addr mac;
> -
> -    netdev_get_etheraddr(bond->active_slave->netdev, &mac);
> -    bond->active_slave_mac = mac;
> +    if (bond->active_slave) {
> +        struct eth_addr mac;
> +        netdev_get_etheraddr(bond->active_slave->netdev, &mac);
> +        bond->active_slave_mac = mac;
> +    } else {
> +        bond->active_slave_mac = eth_addr_zero;
> +    }
>      bond->active_slave_changed = true;
>      seq_change(connectivity_seq_get());
>  }
> @@ -1866,6 +1869,7 @@ bond_choose_active_slave(struct bond *bond)
>              bond_active_slave_changed(bond);
>          }
>      } else if (old_active_slave) {
> +        bond_active_slave_changed(bond);
>          VLOG_INFO_RL(&rl, "bond %s: all interfaces disabled", bond->name);
>      }
>  }
> --
> 1.9.1

Looks good to me.

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH 3/3] xlate: Translate openflow clone into odp sample action.
When datapath does not support the 'clone' action directly,
generate sample action (with 100% probability) instead.

Specifically, currently, there is no plan to support the 'clone'
action on the Linux kernel datapath directly, so the sample action
will be used to translate the openflow clone action for this datapath.

Signed-off-by: Andy Zhou
---
 ofproto/ofproto-dpif-xlate.c | 38
 tests/ofproto-dpif.at        |  2 +-
 2 files changed, 29 insertions(+), 11 deletions(-)

diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
index c4ca5d2..1a5fdf8 100644
--- a/ofproto/ofproto-dpif-xlate.c
+++ b/ofproto/ofproto-dpif-xlate.c
@@ -4659,18 +4659,36 @@ xlate_sample_action(struct xlate_ctx *ctx,
                           tunnel_out_port, false);
 }

-/* Only called if the datapath supports 'OVS_ACTION_ATTR_CLONE'.
- *
- * Translates 'oc' within OVS_ACTION_ATTR_CLONE. */
+/* Use datapath 'clone' or sample to enclose the translation of 'oc'. */
 static void
 compose_clone_action(struct xlate_ctx *ctx, const struct ofpact_nest *oc)
 {
     size_t clone_offset = nl_msg_start_nested(ctx->odp_actions,
                                               OVS_ACTION_ATTR_CLONE);
+    do_xlate_actions(oc->actions, ofpact_nest_get_action_len(oc), ctx);
+    nl_msg_end_non_empty_nested(ctx->odp_actions, clone_offset);
+}
+
+/* Use datapath 'sample' action to translate clone. */
+static void
+compose_clone_action_using_sample(struct xlate_ctx *ctx,
+                                  const struct ofpact_nest *oc)
+{
+    size_t offset = nl_msg_start_nested(ctx->odp_actions,
+                                        OVS_ACTION_ATTR_SAMPLE);
+
+    size_t ac_offset = nl_msg_start_nested(ctx->odp_actions,
+                                           OVS_SAMPLE_ATTR_ACTIONS);
     do_xlate_actions(oc->actions, ofpact_nest_get_action_len(oc), ctx);
-    nl_msg_end_non_empty_nested(ctx->odp_actions, clone_offset);
+    if (nl_msg_end_non_empty_nested(ctx->odp_actions, ac_offset)) {
+        nl_msg_cancel_nested(ctx->odp_actions, offset);
+    } else {
+        nl_msg_put_u32(ctx->odp_actions, OVS_SAMPLE_ATTR_PROBABILITY,
+                       UINT32_MAX); /* 100% probability. */
+        nl_msg_end_nested(ctx->odp_actions, offset);
+    }
 }

 static void
@@ -4690,16 +4708,16 @@ xlate_clone(struct xlate_ctx *ctx, const struct ofpact_nest *oc)
     ofpbuf_use_stub(&ctx->action_set, actset_stub, sizeof actset_stub);
     ofpbuf_put(&ctx->action_set, old_action_set.data, old_action_set.size);

+    /* Datapath clone action will make sure the pre clone packets
+     * are used for actions after clone. Save and restore
+     * ctx->base_flow to reflect this for the openflow pipeline. */
+    struct flow old_base_flow = ctx->base_flow;
     if (ctx->xbridge->support.clone) {
-        /* Datapath clone action will make sure the pre clone packets
-         * are used for actions after clone. Save and restore
-         * ctx->base_flow to reflect this for the openflow pipeline. */
-        struct flow old_base_flow = ctx->base_flow;
         compose_clone_action(ctx, oc);
-        ctx->base_flow = old_base_flow;
     } else {
-        do_xlate_actions(oc->actions, ofpact_nest_get_action_len(oc), ctx);
+        compose_clone_action_using_sample(ctx, oc);
     }
+    ctx->base_flow = old_base_flow;

     ofpbuf_uninit(&ctx->action_set);
     ctx->action_set = old_action_set;
diff --git a/tests/ofproto-dpif.at b/tests/ofproto-dpif.at
index e861d9f..f1415e4 100644
--- a/tests/ofproto-dpif.at
+++ b/tests/ofproto-dpif.at
@@ -6457,7 +6457,7 @@ AT_CHECK([ovs-appctl dpif/disable-dp-clone br0], [0],
 AT_CHECK([ovs-appctl ofproto/trace ovs-dummy 'in_port(1),eth(src=50:54:00:00:00:09,dst=50:54:00:00:00:0a),eth_type(0x0800),ipv4(src=10.10.10.2,dst=10.10.10.1,proto=1,tos=1,ttl=128,frag=no),icmp(type=8,code=0)'], [0], [stdout])
 AT_CHECK([tail -1 stdout], [0], [dnl
-Datapath actions: set(ipv4(src=10.10.10.2,dst=192.168.4.4)),2,set(eth(src=80:81:81:81:81:81)),set(ipv4(src=10.10.10.2,dst=192.168.5.5)),3,set(eth(src=50:54:00:00:00:09)),set(ipv4(src=10.10.10.2,dst=10.10.10.1)),4
+Datapath actions: sample(sample=100.0%,actions(set(ipv4(src=10.10.10.2,dst=192.168.4.4)),2)),sample(sample=100.0%,actions(set(eth(src=80:81:81:81:81:81)),set(ipv4(src=10.10.10.2,dst=192.168.5.5)),3)),4
 ])

 OVS_VSWITCHD_STOP
--
1.8.3.1
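The test change in the patch above drops the trailing set() actions that used to restore the original headers before output 4. A minimal sketch of why, assuming a toy flow structure: everything named toy_* is hypothetical and is not OVS code. Actions nested in a clone (or a 100% sample) run against state that is saved first and put back afterwards, so nothing needs to be undone explicitly.

```c
#include <stdint.h>

/* Hypothetical, much-reduced stand-in for OVS's 'struct flow'. */
struct toy_flow {
    uint32_t ipv4_dst;
};

typedef void toy_actions_fn(struct toy_flow *);

/* Example nested action list: rewrites the destination address. */
static void
toy_set_dst(struct toy_flow *flow)
{
    flow->ipv4_dst = 0xc0a80404;    /* 192.168.4.4 */
}

/* Runs 'actions' the way a clone would.  The caller's view of the packet
 * is saved before the nested actions and restored after them. */
static void
toy_clone(struct toy_flow *flow, toy_actions_fn *actions)
{
    struct toy_flow saved = *flow;

    actions(flow);
    *flow = saved;
}
```

Calling toy_clone(&f, toy_set_dst) leaves f unchanged, while calling toy_set_dst(&f) directly rewrites it. This mirrors moving the old_base_flow save and restore outside the 'if' so that it covers both the clone and the sample path.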
[ovs-dev] [PATCH 2/3] lib: Refactor nested netlink APIs.
Future patches will make use of those changes.

Signed-off-by: Andy Zhou
---
 lib/netlink.c | 19 ++++++++++++++++---
 lib/netlink.h |  3 ++-
 2 files changed, 18 insertions(+), 4 deletions(-)

diff --git a/lib/netlink.c b/lib/netlink.c
index ad7d35a..ae4c72a 100644
--- a/lib/netlink.c
+++ b/lib/netlink.c
@@ -467,16 +467,29 @@ nl_msg_end_nested(struct ofpbuf *msg, size_t offset)
     attr->nla_len = msg->size - offset;
 }

-/* Same as nls_msg_end_nested() when the nested Netlink contains non empty
- * message. Otherwise, drop the nested message header from 'msg'.*/
+/* Cancel a nested Netlink attribute in 'msg'. 'offset' should be the value
+ * returned by nl_msg_start_nested(). */
 void
+nl_msg_cancel_nested(struct ofpbuf *msg, size_t offset)
+{
+    msg->size = offset;
+}
+
+/* Same as nls_msg_end_nested() when the nested Netlink contains non empty
+ * message. Otherwise, drop the nested message header from 'msg'.
+ *
+ * Return true if the nested message has been dropped. */
+bool
 nl_msg_end_non_empty_nested(struct ofpbuf *msg, size_t offset)
 {
     nl_msg_end_nested(msg, offset);

     struct nlattr *attr = ofpbuf_at_assert(msg, offset, sizeof *attr);
     if (!nl_attr_get_size(attr)) {
-        msg->size = offset;
+        nl_msg_cancel_nested(msg, offset);
+        return true;
+    } else {
+        return false;
     }
 }

diff --git a/lib/netlink.h b/lib/netlink.h
index 7646f91..bb4dbf6 100644
--- a/lib/netlink.h
+++ b/lib/netlink.h
@@ -79,7 +79,8 @@ void nl_msg_put_string(struct ofpbuf *, uint16_t type, const char *value);

 size_t nl_msg_start_nested(struct ofpbuf *, uint16_t type);
 void nl_msg_end_nested(struct ofpbuf *, size_t offset);
-void nl_msg_end_non_empty_nested(struct ofpbuf *, size_t offset);
+void nl_msg_cancel_nested(struct ofpbuf *, size_t offset);
+bool nl_msg_end_non_empty_nested(struct ofpbuf *, size_t offset);
 void nl_msg_put_nested(struct ofpbuf *, uint16_t type,
                        const void *data, size_t size);

--
1.8.3.1
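The start/end/cancel contract in this patch can be sketched outside OVS with a toy buffer. This is only an illustration under stated assumptions: struct toy_buf, TOY_HDRLEN and the toy_* functions are hypothetical stand-ins for ofpbuf, NLA_HDRLEN and the nl_msg_* calls. The sketch does no bounds checking or alignment, both of which real Netlink attributes require.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_HDRLEN 4            /* Size of the toy attribute header. */

/* Toy stand-in for OVS's 'struct ofpbuf'; fixed size, no bounds checks. */
struct toy_buf {
    uint8_t data[256];
    size_t size;
};

/* Appends a header with a placeholder length of 0 and returns its offset. */
static size_t
toy_start_nested(struct toy_buf *b, uint16_t type)
{
    size_t offset = b->size;
    uint16_t len = 0;

    memcpy(&b->data[offset], &len, sizeof len);
    memcpy(&b->data[offset + 2], &type, sizeof type);
    b->size += TOY_HDRLEN;
    return offset;
}

/* Appends 'n' payload bytes inside the currently open attribute. */
static void
toy_put(struct toy_buf *b, const void *p, size_t n)
{
    memcpy(&b->data[b->size], p, n);
    b->size += n;
}

/* Closes the attribute by patching in its final length. */
static void
toy_end_nested(struct toy_buf *b, size_t offset)
{
    uint16_t len = (uint16_t) (b->size - offset);

    memcpy(&b->data[offset], &len, sizeof len);
}

/* Drops the attribute started at 'offset', header included. */
static void
toy_cancel_nested(struct toy_buf *b, size_t offset)
{
    b->size = offset;
}

/* Closes the attribute, or drops it if it carries no payload.
 * Returns true if it was dropped. */
static bool
toy_end_non_empty_nested(struct toy_buf *b, size_t offset)
{
    if (b->size - offset == TOY_HDRLEN) {
        toy_cancel_nested(b, offset);
        return true;
    }
    toy_end_nested(b, offset);
    return false;
}
```

The boolean return is what lets a caller holding an outer attribute decide whether to cancel the outer one as well when the inner one turns out to be empty.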
[ovs-dev] [PATCH 1/3] ofproto-dpif: Enhance execute_controller_action().
Allow execute_controller_action() to accept actions encoded with
nested netlink attributes.

execute_controller_action() can be called during 'xlate_actions'.
It tries to execute all actions translated so far to get the current
packet that needs to be sent to the controller. This works fine
until the action is enclosed within a nested netlink message and
the action translation has not finished yet. For example:

    A, clone(B, controller, C)

In this case, we cannot execute 'clone' since its translation has
not been finished (C is missing). However, A still needs to be executed
before the packet can be sent to the controller.

The solution is to make a copy of the odp actions translated so far,
and 'fix up' the copy so that it can be executed. The original odp
actions are left intact so that xlate can continue.

Signed-off-by: Andy Zhou
---
 ofproto/ofproto-dpif-xlate.c | 149
 1 file changed, 144 insertions(+), 5 deletions(-)

diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
index 503a347..c4ca5d2 100644
--- a/ofproto/ofproto-dpif-xlate.c
+++ b/ofproto/ofproto-dpif-xlate.c
@@ -3805,13 +3805,150 @@ flood_packets(struct xlate_ctx *ctx, bool all)
     ctx->nf_output_iface = NF_OUT_FLOOD;
 }

+/* Copy and reformat a partially xlated odp actions to a new
+ * odp actions list in 'b', so that the new actions list
+ * can be executed by odp_execute_actions.
+ *
+ * When xlate using nested odp actions, such as sample and clone,
+ * The nested action created by nl_msg_start_nested() may not
+ * have been properly closed yet, thus can not be executed
+ * directly.
+ *
+ * Since unclosed nested action has to be last action, it can be
+ * fixed by skip the outer header, and treat the actions within
+ * as if they are outside the nested attribute. Since the effect
+ * of executing them on packet is the same.
+ *
+ * As an optimization, a fully closed 'sample' or 'clone' action
+ * is skipped since their execution has no effect to the packet.
+ *
+ * Returns true if success. 'b' contains the new actions list.
+ * The caller is responsible for dispose 'b'.
+ *
+ * Returns false if error, 'b' has been freed already. */
+static bool
+xlate_fixup_actions(struct ofpbuf *b, const struct nlattr *actions,
+                    size_t actions_len)
+{
+    const struct nlattr *a;
+    unsigned int left;
+
+    NL_ATTR_FOR_EACH_UNSAFE (a, left, actions, actions_len) {
+        int type = nl_attr_type(a);
+
+        switch ((enum ovs_action_attr) type) {
+        case OVS_ACTION_ATTR_HASH:
+        case OVS_ACTION_ATTR_PUSH_VLAN:
+        case OVS_ACTION_ATTR_POP_VLAN:
+        case OVS_ACTION_ATTR_PUSH_MPLS:
+        case OVS_ACTION_ATTR_POP_MPLS:
+        case OVS_ACTION_ATTR_SET:
+        case OVS_ACTION_ATTR_SET_MASKED:
+        case OVS_ACTION_ATTR_TRUNC:
+        case OVS_ACTION_ATTR_OUTPUT:
+        case OVS_ACTION_ATTR_TUNNEL_PUSH:
+        case OVS_ACTION_ATTR_TUNNEL_POP:
+        case OVS_ACTION_ATTR_USERSPACE:
+        case OVS_ACTION_ATTR_RECIRC:
+        case OVS_ACTION_ATTR_CT:
+            ofpbuf_put(b, a, nl_attr_len_pad(a, left));
+            break;
+
+        case OVS_ACTION_ATTR_CLONE:
+            /* If the clone action has been fully xlated, it can
+             * be skipped, since any actions executed within clone
+             * do not affect the current packet.
+             *
+             * When xlating actions wihtin clone, the clone action,
+             * because it is an nested netlink attribute, do not have
+             * a vlaid 'nla_len'; it will be zero instead. Skip
+             * the clone heaer to find the start of the actions
+             * enclosed. Treat those actions as if they are written
+             * outside of clone. */
+            if (!a->nla_len) {
+                bool ok;
+                if (left < NLA_HDRLEN) {
+                    goto error;
+                }
+
+                ok = xlate_fixup_actions(b, nl_attr_get_unspec(a, 0),
+                                         left - NLA_HDRLEN);
+                if (!ok) {
+                    goto error;
+                }
+            }
+            break;
+
+        case OVS_ACTION_ATTR_SAMPLE:
+            if (!a->nla_len) {
+                bool ok;
+                if (left < NLA_HDRLEN) {
+                    goto error;
+                }
+                const struct nlattr *attr = nl_attr_get_unspec(a, 0);
+                left -= NLA_HDRLEN;
+
+                while (left > 0 &&
+                       nl_attr_type(attr) != OVS_SAMPLE_ATTR_ACTIONS) {
+                    /* Only OVS_SAMPLE_ATTR_ACTIONS can have unclosed
+                     * nested netlink attribute. */
+                    if (!attr->nla_len) {
+                        goto error;
+                    }
+
+                    left -= NLA_ALIGN(attr->nla_len);
+                    attr = nl_attr_next(attr);
+                }
+
Re: [ovs-dev] [PATCH] netdev-dpdk: Fix rx_error stat for dpdk ports.
2017-02-16 7:31 GMT-08:00 Ian Stokes:
> "rx_error" stat for a DPDK interface was calculated with the assumption that
> dropped packets due to hardware buffer overload were counted as errors
> in DPDK and the rte ierror stat included rte imissed packets i.e.
>
>     rx_errors = rte_stats.ierrors - rte_stats.imissed
>
> This results in negative statistic values as imissed packets are no longer
> counted as part of ierror since DPDK v.16.04.
>
> Fix this by setting rx_errors equal to ierrors only.
>
> Fixes: 9e3ddd45 (netdev-dpdk: Add some missing statistics.)
> CC: Timo Puha
> Reported-by: Stepan Andrushko
> Signed-off-by: Ian Stokes

Good catch! I've applied this to master, branch-2.7 and branch-2.6.

> ---
>  lib/netdev-dpdk.c |    3 +--
>  1 files changed, 1 insertions(+), 2 deletions(-)
>
> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
> index 94568a1..ee53c4c 100644
> --- a/lib/netdev-dpdk.c
> +++ b/lib/netdev-dpdk.c
> @@ -2067,8 +2067,7 @@ out:
>      stats->tx_packets = rte_stats.opackets;
>      stats->rx_bytes = rte_stats.ibytes;
>      stats->tx_bytes = rte_stats.obytes;
> -    /* DPDK counts imissed as errors, but count them here as dropped instead */
> -    stats->rx_errors = rte_stats.ierrors - rte_stats.imissed;
> +    stats->rx_errors = rte_stats.ierrors;
>      stats->tx_errors = rte_stats.oerrors;
>
>      rte_spinlock_lock(&dev->stats_lock);
> --
> 1.7.0.7
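A small sketch of the arithmetic behind the bug, assuming 64-bit unsigned counters as in DPDK's rte_eth_stats. The toy_rte_stats structure and both helper functions are hypothetical and only model the two fields involved. Once imissed is no longer included in ierrors, the subtraction can wrap around. The wrapped value is a huge number when read as unsigned and a negative one when read as signed.

```c
#include <stdint.h>

/* Hypothetical stand-in for the two rte_eth_stats fields involved. */
struct toy_rte_stats {
    uint64_t ierrors;
    uint64_t imissed;
};

/* Old calculation: wraps modulo 2^64 whenever imissed > ierrors. */
static uint64_t
rx_errors_old(const struct toy_rte_stats *s)
{
    return s->ierrors - s->imissed;
}

/* Fixed calculation: ierrors no longer includes imissed. */
static uint64_t
rx_errors_fixed(const struct toy_rte_stats *s)
{
    return s->ierrors;
}
```

With ierrors = 2 and imissed = 5, the old calculation yields 2^64 - 3, which is -3 as a signed 64-bit value. The fixed one reports 2.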
Re: [ovs-dev] manpages in rst?
> -----Original Message-----
> From: ovs-dev-boun...@openvswitch.org [mailto:ovs-dev-boun...@openvswitch.org] On Behalf Of Russell Bryant
> Sent: Thursday, February 16, 2017 8:10 PM
> To: Ben Pfaff
> Cc: ovs dev
> Subject: Re: [ovs-dev] manpages in rst?
>
> On Thu, Feb 16, 2017 at 12:24 PM, Ben Pfaff wrote:
>
> > Currently, we have some manpages written directly in nroff. This is
> > an awful format, that is difficult to read and difficult to write.
> > Other manpages are written in a custom XML format that, while it is
> > easier to read and write, isn't any standard format and so we can't
> > expect anyone else (person or program) to understand it. This is not
> > ideal. It's difficult to include either format in the readthedocs
> > documentation, too.
> >
> > I'm thinking about starting to write manpages in reStructuredText
> > (rst). This would make it much easier to include them in the
> > readthedocs pages, and ReST seems to convert pretty well to nroff for
> > installing as real manpages. For example, try fetching
> > http://docutils.sourceforge.net/sandbox/manpage-writer/input/test.txt,
> > which is a rst file, and then running "rst2man test.txt > test.man"
> > and viewing test.man with "man -l" or "groffer". The output looks fine.
> >
> > I think that all we'd need for this is a build dependency on
> > python-docutils to ensure that rst2man is available at build time.
> >
> > Does anyone have comments?
> >
>
> +1 for rst. That should make it easier to integrate the same content
> into docs.openvswitch.org.
>
> I think sphinx has support for man pages, but it has been a long time since
> I've used it.
>
> --
> Russell Bryant

[Alin Serdean] +1 for rst, it has chm support http://docutils.sourceforge.net/sandbox/chm-writer/ :D, "default" man for Windows.

Thanks,
Alin.
[ovs-dev] [PATCH RFC v5 8/8] NEWS: Add item for creating tunnels via rtnetlink
Signed-off-by: Eric Garver
---
 NEWS | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/NEWS b/NEWS
index ce9fe8803280..922de349db1d 100644
--- a/NEWS
+++ b/NEWS
@@ -3,6 +3,9 @@ Post-v2.7.0
    - Tunnels:
      * Added support to set packet mark for tunnel endpoint using
        `egress_pkt_mark` OVSDB option.
+     * When using Linux kernel datapath tunnels may be created using rtnetlink.
+       This will allow us to take advantage of new tunnel features without
+       having to make changes to the vport modules.
    - EMC insertion probability is reduced to 1% and is configurable via
      the new 'other_config:emc-insert-inv-prob' option.

--
2.10.0
[ovs-dev] [PATCH RFC v5 7/8] dpif-netlink: Probe for out-of-tree datapath.
For out-of-tree datapath, only try genetlink/compat.
For in-tree kernel datapath, try rtnetlink then genetlink.

Signed-off-by: Eric Garver
---
 lib/dpif-netlink.c   | 16 +++++++++++++---
 lib/dpif-rtnetlink.c | 39 +++++++++++++++++++++++++++++++++++++++
 lib/dpif-rtnetlink.h |  7 +++++++
 3 files changed, 59 insertions(+), 3 deletions(-)

diff --git a/lib/dpif-netlink.c b/lib/dpif-netlink.c
index ae8204ffe6e1..991004491ec2 100644
--- a/lib/dpif-netlink.c
+++ b/lib/dpif-netlink.c
@@ -211,6 +211,12 @@ static int ovs_packet_family;
  * Initialized by dpif_netlink_init(). */
 static unsigned int ovs_vport_mcgroup;

+/* If true, tunnel devices are created using OVS compat/genetlink.
+ * If false, tunnel devices are created with rtnetlink and using light weight
+ * tunnels. If we fail to create the tunnel the rtnetlink+LWT, then we fallback
+ * to using the compat interface. */
+static bool ovs_tunnels_out_of_tree = true;
+
 static int dpif_netlink_init(void);
 static int open_dpif(const struct dpif_netlink_dp *, struct dpif **);
 static uint32_t dpif_netlink_port_get_pid(const struct dpif *,
@@ -979,11 +985,13 @@ dpif_netlink_port_add(struct dpif *dpif_, struct netdev *netdev,
                       odp_port_t *port_nop)
 {
     struct dpif_netlink *dpif = dpif_netlink_cast(dpif_);
-    int error;
+    int error = EOPNOTSUPP;

     fat_rwlock_wrlock(&dpif->upcall_lock);
-    error = dpif_rtnetlink_port_create_and_add(dpif, netdev, port_nop);
-    if (error == EOPNOTSUPP) {
+    if (!ovs_tunnels_out_of_tree) {
+        error = dpif_rtnetlink_port_create_and_add(dpif, netdev, port_nop);
+    }
+    if (error) {
         error = dpif_netlink_port_add_compat(dpif, netdev, port_nop);
     }
     fat_rwlock_unlock(&dpif->upcall_lock);
@@ -2495,6 +2503,8 @@ dpif_netlink_init(void)
                                            &ovs_vport_mcgroup);
         }

+        ovs_tunnels_out_of_tree = dpif_rtnetlink_probe_oot_tunnels();
+
         ovsthread_once_done(&once);
     }

diff --git a/lib/dpif-rtnetlink.c b/lib/dpif-rtnetlink.c
index 60a5003b88ca..853de6b764e8 100644
--- a/lib/dpif-rtnetlink.c
+++ b/lib/dpif-rtnetlink.c
@@ -474,3 +474,42 @@ dpif_rtnetlink_port_destroy(const char *name, const char *type)
     }
     return 0;
 }
+
+/**
+ * This is to probe for whether the modules are out-of-tree (openvswitch) or
+ * in-tree (upstream kernel).
+ *
+ * We probe for "ovs_geneve" via rtnetlink. As long as this returns something
+ * other than EOPNOTSUPP we know that the module in use is the out-of-tree one.
+ * This will be used to determine what netlink interface to use when creating
+ * ports; rtnetlink or compat/genetlink.
+ *
+ * See ovs_tunnels_out_of_tree
+ */
+bool
+dpif_rtnetlink_probe_oot_tunnels(void)
+{
+    struct netdev *netdev = NULL;
+    bool out_of_tree = false;
+    int error;
+
+    error = netdev_open("ovs-system-probe", "geneve", &netdev);
+    if (!error) {
+        error = dpif_rtnetlink_geneve_create_kind(netdev, "ovs_geneve");
+        if (error != EOPNOTSUPP) {
+            if (!error) {
+                char namebuf[NETDEV_VPORT_NAME_BUFSIZE];
+                const char *dp_port;
+
+                dp_port = netdev_vport_get_dpif_port(netdev, namebuf,
+                                                     sizeof namebuf);
+                dpif_rtnetlink_geneve_destroy(dp_port);
+            }
+            out_of_tree = true;
+        }
+        netdev_close(netdev);
+        error = 0;
+    }
+
+    return out_of_tree;
+}
diff --git a/lib/dpif-rtnetlink.h b/lib/dpif-rtnetlink.h
index 515820f02e66..5bb578c4ac65 100644
--- a/lib/dpif-rtnetlink.h
+++ b/lib/dpif-rtnetlink.h
@@ -24,6 +24,8 @@ int dpif_rtnetlink_port_create(struct netdev *netdev);
 int dpif_rtnetlink_port_destroy(const char *name, const char *type);

+bool dpif_rtnetlink_probe_oot_tunnels(void);
+
 #ifndef __linux__
 /* Dummy implementations for non Linux builds.
  *
@@ -41,6 +43,11 @@ static inline int dpif_rtnetlink_port_destroy(const char *name OVS_UNUSED,
     return EOPNOTSUPP;
 }

+static inline bool dpif_rtnetlink_probe_oot_tunnels(void)
+{
+    return true;
+}
+
 #endif

 #endif /* DPIF_RTNETLINK_H */
--
2.10.0
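The control flow this patch gives dpif_netlink_port_add() can be restated as a standalone sketch. The toy_* names below are hypothetical. The callbacks stand in for the rtnetlink and compat creation paths and simply count how often they run.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical creator callback: returns 0 on success or a positive errno. */
typedef int toy_create_fn(void);

static int toy_rtnetlink_calls;   /* Times an rtnetlink creator ran. */
static int toy_compat_calls;      /* Times the compat creator ran. */

static int toy_rtnetlink_ok(void) { toy_rtnetlink_calls++; return 0; }
static int toy_rtnetlink_einval(void) { toy_rtnetlink_calls++; return EINVAL; }
static int toy_compat_ok(void) { toy_compat_calls++; return 0; }

/* Mirrors the patched logic: skip rtnetlink entirely for an out-of-tree
 * datapath, and fall back to compat on any rtnetlink error. */
static int
toy_port_add(bool tunnels_out_of_tree, toy_create_fn *rtnetlink,
             toy_create_fn *compat)
{
    int error = EOPNOTSUPP;

    if (!tunnels_out_of_tree) {
        error = rtnetlink();
    }
    if (error) {
        error = compat();
    }
    return error;
}
```

Initializing error to EOPNOTSUPP is what routes the out-of-tree case straight to the compat path. Compared with the earlier patch in the series, the fallback now triggers on any rtnetlink error rather than only on EOPNOTSUPP.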
[ovs-dev] [PATCH RFC v5 3/8] dpif-netlink: code to create/destroy tunnel ports via rtnetlink
In order to be able to add those tunnels, we need to add code to create the
tunnels and add them as NETDEV vports. And when there is no support to create
them, we need to fallback to compatibility code and add them as tunnel vports.

When removing those tunnels, we need to remove the interfaces as well, and
detecting the right type might be important, at least to distinguish the tunnel
vports that we should remove and the interfaces that we shouldn't.

Co-authored-by: Thadeu Lima de Souza Cascardo
Signed-off-by: Thadeu Lima de Souza Cascardo
Signed-off-by: Eric Garver
---
 lib/automake.mk      |  3 +++
 lib/dpif-netlink.c   | 58
 lib/dpif-netlink.h   |  2 ++
 lib/dpif-rtnetlink.c | 60
 lib/dpif-rtnetlink.h | 46
 5 files changed, 162 insertions(+), 7 deletions(-)
 create mode 100644 lib/dpif-rtnetlink.c
 create mode 100644 lib/dpif-rtnetlink.h

diff --git a/lib/automake.mk b/lib/automake.mk
index abc9d0d5cc4e..288a828d9007 100644
--- a/lib/automake.mk
+++ b/lib/automake.mk
@@ -351,6 +351,8 @@ if LINUX
 lib_libopenvswitch_la_SOURCES += \
 	lib/dpif-netlink.c \
 	lib/dpif-netlink.h \
+	lib/dpif-rtnetlink.c \
+	lib/dpif-rtnetlink.h \
 	lib/if-notifier.c \
 	lib/if-notifier.h \
 	lib/netdev-linux.c \
@@ -381,6 +383,7 @@ if WIN32
 lib_libopenvswitch_la_SOURCES += \
 	lib/dpif-netlink.c \
 	lib/dpif-netlink.h \
+	lib/dpif-rtnetlink.h \
 	lib/netdev-windows.c \
 	lib/netlink-conntrack.c \
 	lib/netlink-conntrack.h \
diff --git a/lib/dpif-netlink.c b/lib/dpif-netlink.c
index a882ea683114..ae8204ffe6e1 100644
--- a/lib/dpif-netlink.c
+++ b/lib/dpif-netlink.c
@@ -34,6 +34,7 @@

 #include "bitmap.h"
 #include "dpif-provider.h"
+#include "dpif-rtnetlink.h"
 #include "openvswitch/dynamic-string.h"
 #include "flow.h"
 #include "fat-rwlock.h"
@@ -221,6 +222,9 @@ static void dpif_netlink_vport_to_ofpbuf(const struct dpif_netlink_vport *,
                                          struct ofpbuf *);
 static int dpif_netlink_vport_from_ofpbuf(struct dpif_netlink_vport *,
                                           const struct ofpbuf *);
+static int
+dpif_netlink_port_query__(const struct dpif_netlink *dpif, odp_port_t port_no,
+                          const char *port_name, struct dpif_port *dpif_port);

 static struct dpif_netlink *
 dpif_netlink_cast(const struct dpif *dpif)
@@ -780,7 +784,7 @@ get_vport_type(const struct dpif_netlink_vport *vport)
     return "unknown";
 }

-static enum ovs_vport_type
+enum ovs_vport_type
 netdev_to_ovs_vport_type(const char *type)
 {
     if (!strcmp(type, "tap") || !strcmp(type, "system")) {
@@ -942,8 +946,33 @@ dpif_netlink_port_add_compat(struct dpif_netlink *dpif, struct netdev *netdev,

 }

+static int
+dpif_rtnetlink_port_create_and_add(struct dpif_netlink *dpif,
+                                   struct netdev *netdev, odp_port_t *port_nop)
+    OVS_REQ_WRLOCK(dpif->upcall_lock)
+{
+    int error;
+    char namebuf[NETDEV_VPORT_NAME_BUFSIZE];
+    const char *name = netdev_vport_get_dpif_port(netdev,
+                                                  namebuf, sizeof namebuf);

+    error = dpif_rtnetlink_port_create(netdev);
+    if (error) {
+        if (error != EOPNOTSUPP) {
+            VLOG_DBG("Failed to create %s with rtnetlink. error = %d",
+                     netdev_get_name(netdev), error);
+        }
+        return error;
+    }

+    error = dpif_netlink_port_add__(dpif, name, OVS_VPORT_TYPE_NETDEV, NULL,
+                                    port_nop);
+    if (error) {
+        VLOG_DBG("failed to add port, destroying: %d", error);
+        dpif_rtnetlink_port_destroy(name, netdev_get_type(netdev));
+    }
+    return error;
+}

 static int
 dpif_netlink_port_add(struct dpif *dpif_, struct netdev *netdev,
@@ -953,7 +982,10 @@ dpif_netlink_port_add(struct dpif *dpif_, struct netdev *netdev,
     int error;

     fat_rwlock_wrlock(&dpif->upcall_lock);
-    error = dpif_netlink_port_add_compat(dpif, netdev, port_nop);
+    error = dpif_rtnetlink_port_create_and_add(dpif, netdev, port_nop);
+    if (error == EOPNOTSUPP) {
+        error = dpif_netlink_port_add_compat(dpif, netdev, port_nop);
+    }
     fat_rwlock_unlock(&dpif->upcall_lock);

     return error;
@@ -964,19 +996,23 @@ dpif_netlink_port_del__(struct dpif_netlink *dpif, odp_port_t port_no)
     OVS_REQ_WRLOCK(dpif->upcall_lock)
 {
     struct dpif_netlink_vport vport;
+    struct dpif_port dpif_port;
     int error;

+    error = dpif_netlink_port_query__(dpif, port_no, NULL, &dpif_port);
+    if (error) {
+        return error;
+    }
+
     dpif_netlink_vport_init(&vport);
     vport.cmd = OVS_VPORT_CMD_DEL;
     vport.dp_ifindex = dpif->dp_ifindex;
     vport.port_no = port_no;
 #ifdef _WIN32
-    struct dpif_port temp_dpif_port;
-
[ovs-dev] [PATCH RFC v5 5/8] dpif-netlink: add GRE creation support
Creates GRE devices using rtnetlink and tunnel metadata.

Co-Authored-by: Thadeu Lima de Souza Cascardo
Signed-off-by: Thadeu Lima de Souza Cascardo
Signed-off-by: Eric Garver
---
 lib/dpif-rtnetlink.c | 108
 1 file changed, 108 insertions(+)

diff --git a/lib/dpif-rtnetlink.c b/lib/dpif-rtnetlink.c
index 8b6574d1a145..c06aa82256bc 100644
--- a/lib/dpif-rtnetlink.c
+++ b/lib/dpif-rtnetlink.c
@@ -39,6 +39,10 @@
 #define IFLA_VXLAN_COLLECT_METADATA 25
 #endif

+#if IFLA_GRE_MAX < 18
+#define IFLA_GRE_COLLECT_METADATA 18
+#endif
+
 static const struct nl_policy rtlink_policy[] = {
     [IFLA_LINKINFO] = { .type = NL_A_NESTED },
 };
@@ -77,6 +81,12 @@ dpif_rtnetlink_vxlan_destroy(const char *name)
 }

 static int
+dpif_rtnetlink_gre_destroy(const char *name)
+{
+    return dpif_rtnetlink_destroy(name);
+}
+
+static int
 dpif_rtnetlink_vxlan_verify(struct netdev *netdev, const char *name,
                             const char *kind)
 {
@@ -198,6 +208,102 @@ dpif_rtnetlink_vxlan_create(struct netdev *netdev)
     return dpif_rtnetlink_vxlan_create_kind(netdev, "vxlan");
 }

+static int
+dpif_rtnetlink_gre_verify(struct netdev *netdev OVS_UNUSED, const char *name,
+                          const char *kind)
+{
+    int err;
+    struct ofpbuf request, *reply;
+    struct ifinfomsg *ifmsg;
+
+    static const struct nl_policy gre_policy[] = {
+        [IFLA_GRE_COLLECT_METADATA] = { .type = NL_A_FLAG },
+    };
+
+    ofpbuf_init(&request, 0);
+    nl_msg_put_nlmsghdr(&request, 0, RTM_GETLINK,
+                        NLM_F_REQUEST);
+    ofpbuf_put_zeros(&request, sizeof(struct ifinfomsg));
+    nl_msg_put_string(&request, IFLA_IFNAME, name);
+
+    err = nl_transact(NETLINK_ROUTE, &request, &reply);
+    if (!err) {
+        struct nlattr *rtlink[ARRAY_SIZE(rtlink_policy)];
+        struct nlattr *linkinfo[ARRAY_SIZE(linkinfo_policy)];
+        struct nlattr *gre[ARRAY_SIZE(gre_policy)];
+
+        ifmsg = ofpbuf_at(reply, NLMSG_HDRLEN, sizeof *ifmsg);
+        if (!nl_policy_parse(reply, NLMSG_HDRLEN + sizeof *ifmsg,
+                             rtlink_policy, rtlink,
+                             ARRAY_SIZE(rtlink_policy)) ||
+            !nl_parse_nested(rtlink[IFLA_LINKINFO], linkinfo_policy,
+                             linkinfo, ARRAY_SIZE(linkinfo_policy)) ||
+            strcmp(nl_attr_get_string(linkinfo[IFLA_INFO_KIND]), kind) ||
+            !nl_parse_nested(linkinfo[IFLA_INFO_DATA], gre_policy, gre,
+                             ARRAY_SIZE(gre_policy))) {
+            err = EINVAL;
+        }
+        if (!err) {
+            if (!nl_attr_get_flag(gre[IFLA_GRE_COLLECT_METADATA])) {
+                err = EINVAL;
+            }
+        }
+        ofpbuf_uninit(reply);
+    }
+    ofpbuf_uninit(&request);
+    return err;
+}
+
+static int
+dpif_rtnetlink_gre_create_kind(struct netdev *netdev, const char *kind)
+{
+    int err;
+    struct ofpbuf request, *reply;
+    size_t linkinfo_off, infodata_off;
+    char namebuf[NETDEV_VPORT_NAME_BUFSIZE];
+    const char *name = netdev_vport_get_dpif_port(netdev,
+                                                  namebuf, sizeof namebuf);
+    struct ifinfomsg *ifinfo;
+    const struct netdev_tunnel_config *tnl_cfg;
+    tnl_cfg = netdev_get_tunnel_config(netdev);
+    if (!tnl_cfg) {
+        return EINVAL;
+    }
+
+    ofpbuf_init(&request, 0);
+    nl_msg_put_nlmsghdr(&request, 0, RTM_NEWLINK,
+                        NLM_F_REQUEST | NLM_F_ACK | NLM_F_CREATE);
+    ifinfo = ofpbuf_put_zeros(&request, sizeof(struct ifinfomsg));
+    ifinfo->ifi_change = ifinfo->ifi_flags = IFF_UP;
+    nl_msg_put_string(&request, IFLA_IFNAME, name);
+    nl_msg_put_u32(&request, IFLA_MTU, UINT16_MAX);
+    linkinfo_off = nl_msg_start_nested(&request, IFLA_LINKINFO);
+    nl_msg_put_string(&request, IFLA_INFO_KIND, kind);
+    infodata_off = nl_msg_start_nested(&request, IFLA_INFO_DATA);
+    nl_msg_put_flag(&request, IFLA_GRE_COLLECT_METADATA);
+    nl_msg_end_nested(&request, infodata_off);
+    nl_msg_end_nested(&request, linkinfo_off);
+
+    err = nl_transact(NETLINK_ROUTE, &request, &reply);
+
+    if (!err) {
+        ofpbuf_uninit(reply);
+    }
+
+    if (!err && (err = dpif_rtnetlink_gre_verify(netdev, name, kind))) {
+        dpif_rtnetlink_gre_destroy(name);
+    }
+
+    ofpbuf_uninit(&request);
+    return err;
+}
+
+static int
+dpif_rtnetlink_gre_create(struct netdev *netdev)
+{
+    return dpif_rtnetlink_gre_create_kind(netdev, "gretap");
+}
+
 int
 dpif_rtnetlink_port_create(struct netdev *netdev)
 {
@@ -205,6 +311,7 @@ dpif_rtnetlink_port_create(struct netdev *netdev)
     case OVS_VPORT_TYPE_VXLAN:
         return dpif_rtnetlink_vxlan_create(netdev);
     case OVS_VPORT_TYPE_GRE:
+        return dpif_rtnetlink_gre_create(netdev);
     case OVS_VPORT_TYPE_GENEVE:
     case OVS_VPORT_TYPE_NETDEV:
     case OVS_VPORT_TYPE_INTERNAL:
@@ -225,6 +332,7 @@ dpif_rtnetlink_port_destroy(const char *name, const char *type)
     case OVS_VPORT_TYPE_VXLAN:
         return
[ovs-dev] [PATCH RFC v5 4/8] dpif-netlink: add VXLAN creation support
Creates VXLAN devices using rtnetlink and tunnel metadata.

Co-Authored-by: Thadeu Lima de Souza Cascardo
Signed-off-by: Thadeu Lima de Souza Cascardo
Signed-off-by: Eric Garver
---
 lib/dpif-rtnetlink.c | 181
 lib/dpif-rtnetlink.h |   2 +-
 2 files changed, 181 insertions(+), 2 deletions(-)

diff --git a/lib/dpif-rtnetlink.c b/lib/dpif-rtnetlink.c
index b309c88d187a..8b6574d1a145 100644
--- a/lib/dpif-rtnetlink.c
+++ b/lib/dpif-rtnetlink.c
@@ -18,14 +18,192 @@

 #include "dpif-rtnetlink.h"

+#include
+#include
+#include
+
 #include "dpif-netlink.h"
+#include "netdev-vport.h"
+#include "netlink-socket.h"
+
+/*
+ * On some older systems, these enums are not defined.
+ */
+#ifndef IFLA_VXLAN_MAX
+#define IFLA_VXLAN_MAX 0
+#define IFLA_VXLAN_PORT 15
+#endif
+#if IFLA_VXLAN_MAX < 20
+#define IFLA_VXLAN_UDP_ZERO_CSUM6_RX 20
+#define IFLA_VXLAN_GBP 23
+#define IFLA_VXLAN_COLLECT_METADATA 25
+#endif
+
+static const struct nl_policy rtlink_policy[] = {
+    [IFLA_LINKINFO] = { .type = NL_A_NESTED },
+};
+static const struct nl_policy linkinfo_policy[] = {
+    [IFLA_INFO_KIND] = { .type = NL_A_STRING },
+    [IFLA_INFO_DATA] = { .type = NL_A_NESTED },
+};
+
+
+static int
+dpif_rtnetlink_destroy(const char *name)
+{
+    int err;
+    struct ofpbuf request, *reply;
+
+    ofpbuf_init(&request, 0);
+    nl_msg_put_nlmsghdr(&request, 0, RTM_DELLINK,
+                        NLM_F_REQUEST | NLM_F_ACK);
+    ofpbuf_put_zeros(&request, sizeof(struct ifinfomsg));
+    nl_msg_put_string(&request, IFLA_IFNAME, name);
+
+    err = nl_transact(NETLINK_ROUTE, &request, &reply);
+
+    if (!err) {
+        ofpbuf_uninit(reply);
+    }
+
+    ofpbuf_uninit(&request);
+    return err;
+}
+
+static int
+dpif_rtnetlink_vxlan_destroy(const char *name)
+{
+    return dpif_rtnetlink_destroy(name);
+}
+
+static int
+dpif_rtnetlink_vxlan_verify(struct netdev *netdev, const char *name,
+                            const char *kind)
+{
+    int err;
+    struct ofpbuf request, *reply;
+    struct ifinfomsg *ifmsg;
+    const struct netdev_tunnel_config *tnl_cfg;
+
+    static const struct nl_policy vxlan_policy[] = {
+        [IFLA_VXLAN_COLLECT_METADATA] = { .type = NL_A_U8 },
+        [IFLA_VXLAN_LEARNING] = { .type = NL_A_U8 },
+        [IFLA_VXLAN_UDP_ZERO_CSUM6_RX] = { .type = NL_A_U8 },
+        [IFLA_VXLAN_PORT] = { .type = NL_A_U16 },
+    };
+
+    tnl_cfg = netdev_get_tunnel_config(netdev);
+    if (!tnl_cfg) {
+        return EINVAL;
+    }
+
+    ofpbuf_init(&request, 0);
+    nl_msg_put_nlmsghdr(&request, 0, RTM_GETLINK,
+                        NLM_F_REQUEST);
+    ofpbuf_put_zeros(&request, sizeof(struct ifinfomsg));
+    nl_msg_put_string(&request, IFLA_IFNAME, name);
+
+    err = nl_transact(NETLINK_ROUTE, &request, &reply);
+    if (!err) {
+        struct nlattr *rtlink[ARRAY_SIZE(rtlink_policy)];
+        struct nlattr *linkinfo[ARRAY_SIZE(linkinfo_policy)];
+        struct nlattr *vxlan[ARRAY_SIZE(vxlan_policy)];
+
+        ifmsg = ofpbuf_at(reply, NLMSG_HDRLEN, sizeof *ifmsg);
+        if (!nl_policy_parse(reply, NLMSG_HDRLEN + sizeof *ifmsg,
+                             rtlink_policy, rtlink,
+                             ARRAY_SIZE(rtlink_policy)) ||
+            !nl_parse_nested(rtlink[IFLA_LINKINFO], linkinfo_policy,
+                             linkinfo, ARRAY_SIZE(linkinfo_policy)) ||
+            strcmp(nl_attr_get_string(linkinfo[IFLA_INFO_KIND]), kind) ||
+            !nl_parse_nested(linkinfo[IFLA_INFO_DATA], vxlan_policy, vxlan,
+                             ARRAY_SIZE(vxlan_policy))) {
+            err = EINVAL;
+        }
+        if (!err) {
+            if (0 != nl_attr_get_u8(vxlan[IFLA_VXLAN_LEARNING]) ||
+                1 != nl_attr_get_u8(vxlan[IFLA_VXLAN_COLLECT_METADATA]) ||
+                1 != nl_attr_get_u8(vxlan[IFLA_VXLAN_UDP_ZERO_CSUM6_RX]) ||
+                tnl_cfg->dst_port !=
+                    nl_attr_get_be16(vxlan[IFLA_VXLAN_PORT])) {
+                err = EINVAL;
+            }
+        }
+        if (!err) {
+            if ((tnl_cfg->exts & (1 << OVS_VXLAN_EXT_GBP)) &&
+                !(vxlan[IFLA_VXLAN_GBP] &&
+                  nl_attr_get_flag(vxlan[IFLA_VXLAN_GBP]))) {
+                err = EINVAL;
+            }
+        }
+        ofpbuf_uninit(reply);
+    }
+    ofpbuf_uninit(&request);
+    return err;
+}
+
+static int
+dpif_rtnetlink_vxlan_create_kind(struct netdev *netdev, const char *kind)
+{
+    int err;
+    struct ofpbuf request, *reply;
+    size_t linkinfo_off, infodata_off;
+    char namebuf[NETDEV_VPORT_NAME_BUFSIZE];
+    const char *name = netdev_vport_get_dpif_port(netdev,
+                                                  namebuf, sizeof namebuf);
+    struct ifinfomsg *ifinfo;
+    const struct netdev_tunnel_config *tnl_cfg;
+    tnl_cfg = netdev_get_tunnel_config(netdev);
+    if (!tnl_cfg) {
+        return EINVAL;
+    }
+
+    ofpbuf_init(&request, 0);
+    nl_msg_put_nlmsghdr(&request, 0, RTM_NEWLINK,
+                        NLM_F_REQUEST |
[ovs-dev] [PATCH RFC v5 0/8] create tunnel devices using rtnetlink interface
This series adds support for the creation of tunnels using the rtnetlink interface. This will open the possibility for new features and flags on those vports without the need to change vport compatibility code.

Support for STT and LISP have not been added because these are not upstream yet, so we don't know how the interface will be like upstream. And there are no features in the current drivers right now we could make use of.

Note: This work originally started by Thadeu Lima de Souza Cascardo.

Testing:
- kernel 4.9.3, in-tree datapath
  - rtnetlink successfully creates devices
- kernel 4.2.8, in-tree datapath
  - rtnetlink is tried, but fails due to no COLLECT_METADATA support
  - genetlink successfully creates devices
- kernel 4.2.8, out-of-tree datapath
  - rtnetlink is not tried
  - genetlink successfully creates devices

v2: We are able to set the MTU to UINT16_MAX since it is not restricted by the driver during newlink.

v3: Prefer to get type from vport before checking if device is opened. Also, disable IFLA_VXLAN_LEARNING as it's not enabled on compat vports as well.

v4:
- Probe for ovs_geneve on init, this indicates out-of-tree datapath
  - If exists, only try genetlink/compat
  - else, try rtnetlink and fallback to genetlink/compat
- Read back and verify devices created with rtnetlink
- checkpatch fixes

v5:
- Move rtnetlink code to a new file lib/dpif-rtnetlink.c. This is for creating/destroying linux tunnel devices.
- Move probe patch to after GENEVE, so it doesn't break build.
- Add NEWS item
- Split patch 2 into two parts (patch 2 and 3 in this series)

Eric Garver (7):
  dpif-netlink: break up code that creates compat ports
  dpif-netlink: code to create/destroy tunnel ports via rtnetlink
  dpif-netlink: add VXLAN creation support
  dpif-netlink: add GRE creation support
  dpif-netlink: add GENEVE creation support
  dpif-netlink: Probe for out-of-tree datapath.
  NEWS: Add item for creating tunnels via rtnetlink

Thadeu Lima de Souza Cascardo (1):
  netdev: get device type from vport prefix if it uses one

 NEWS                 |   3 +
 lib/automake.mk      |   3 +
 lib/dpif-netlink.c   | 199 +---
 lib/dpif-netlink.h   |   2 +
 lib/dpif-rtnetlink.c | 515 +++
 lib/dpif-rtnetlink.h |  53 ++
 lib/netdev.c         |  26 ++-
 7 files changed, 735 insertions(+), 66 deletions(-)
 create mode 100644 lib/dpif-rtnetlink.c
 create mode 100644 lib/dpif-rtnetlink.h

-- 
2.10.0
___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH RFC v5 1/8] netdev: get device type from vport prefix if it uses one
From: Thadeu Lima de Souza Cascardo

If the device name uses a vport prefix, then use that vport type. Since these names are reserved, we can assume this is the right type.

This is important when we are querying the datapath right after vswitch has started and using the right type will be even more important when we add support to creating tunnel ports with rtnetlink.

Signed-off-by: Thadeu Lima de Souza Cascardo
---
 lib/netdev.c | 26 +++---
 1 file changed, 23 insertions(+), 3 deletions(-)

diff --git a/lib/netdev.c b/lib/netdev.c
index a8d8edad7243..26c413601550 100644
--- a/lib/netdev.c
+++ b/lib/netdev.c
@@ -288,6 +288,21 @@ netdev_enumerate_types(struct sset *types)
     }
 }
 
+static const char *
+netdev_vport_type_from_name(const char *name)
+{
+    struct netdev_registered_class *rc;
+    const char *type;
+    CMAP_FOR_EACH (rc, cmap_node, &netdev_classes) {
+        const char *dpif_port = netdev_vport_class_get_dpif_port(rc->class);
+        if (dpif_port && !strncmp(name, dpif_port, strlen(dpif_port))) {
+            type = rc->class->type;
+            return type;
+        }
+    }
+    return NULL;
+}
+
 /* Check that the network device name is not the same as any of the registered
  * vport providers' dpif_port name (dpif_port is NULL if the vport provider
  * does not define it) or the datapath internal port name (e.g. ovs-system).
@@ -1811,9 +1826,14 @@ netdev_get_vports(size_t *size)
 const char *
 netdev_get_type_from_name(const char *name)
 {
-    struct netdev *dev = netdev_from_name(name);
-    const char *type = dev ? netdev_get_type(dev) : NULL;
-    netdev_close(dev);
+    struct netdev *dev;
+    const char *type;
+    type = netdev_vport_type_from_name(name);
+    if (type == NULL) {
+        dev = netdev_from_name(name);
+        type = dev ? netdev_get_type(dev) : NULL;
+        netdev_close(dev);
+    }
     return type;
 }
 
-- 
2.10.0
___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
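For readers following the patch above, the prefix lookup can be tried outside the OVS tree with a minimal sketch along these lines. Everything here is an assumption made for illustration: the `sketch_` names, the static table that stands in for the registered netdev classes, and the prefix strings themselves. It only shows the `strncmp` prefix test, not the real cmap iteration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a registered netdev class: a type name plus
 * the reserved datapath port prefix (NULL if the class defines none). */
struct sketch_vport_class {
    const char *type;
    const char *dpif_port;
};

/* Illustrative table only; the real classes live in OVS's class registry. */
static const struct sketch_vport_class sketch_classes[] = {
    { "vxlan",  "vxlan_sys" },
    { "geneve", "genev_sys" },
    { "gre",    "gre_sys"   },
    { "system", NULL        },
};

/* Returns the type whose reserved prefix starts 'name', or NULL if no
 * prefix matches and the caller has to fall back to opening the device. */
const char *
sketch_type_from_name(const char *name)
{
    size_t n = sizeof sketch_classes / sizeof sketch_classes[0];

    for (size_t i = 0; i < n; i++) {
        const char *prefix = sketch_classes[i].dpif_port;

        if (prefix && !strncmp(name, prefix, strlen(prefix))) {
            return sketch_classes[i].type;
        }
    }
    return NULL;
}
```

With this table, a name such as "vxlan_sys_4789" resolves to "vxlan" without opening any device, while "eth0" returns NULL and would take the fallback path.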
Re: [ovs-dev] [PATCH] ovs-appctl: Print lacp_fallback_ab info in "bond/show".
On Thu, Feb 16, 2017 at 2:52 AM, nickcooper-zhangtonghao wrote:
> Signed-off-by: nickcooper-zhangtonghao

Applied. Thanks.
___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH] ofproto/bond: Fix bond/show when all interfaces are disabled
Without this patch, when all slaves are disabled, the 'bond/show' command still shows the mac address of last active slave in 'active slave mac' output. This patch clears them to zeros. Signed-off-by: Andy Zhou--- ofproto/bond.c | 12 1 file changed, 8 insertions(+), 4 deletions(-) diff --git a/ofproto/bond.c b/ofproto/bond.c index c138593..260023e 100644 --- a/ofproto/bond.c +++ b/ofproto/bond.c @@ -488,10 +488,13 @@ bond_find_slave_by_mac(const struct bond *bond, const struct eth_addr mac) static void bond_active_slave_changed(struct bond *bond) { -struct eth_addr mac; - -netdev_get_etheraddr(bond->active_slave->netdev, ); -bond->active_slave_mac = mac; +if (bond->active_slave) { +struct eth_addr mac; +netdev_get_etheraddr(bond->active_slave->netdev, ); +bond->active_slave_mac = mac; +} else { +bond->active_slave_mac = eth_addr_zero; +} bond->active_slave_changed = true; seq_change(connectivity_seq_get()); } @@ -1866,6 +1869,7 @@ bond_choose_active_slave(struct bond *bond) bond_active_slave_changed(bond); } } else if (old_active_slave) { +bond_active_slave_changed(bond); VLOG_INFO_RL(, "bond %s: all interfaces disabled", bond->name); } } -- 1.9.1 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
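The behaviour change in the patch above can be reproduced in isolation with a small sketch. The `sketch_` structs below are hypothetical, heavily reduced stand-ins for the real bond and slave structures, and the slave's MAC is read from a plain field rather than through the netdev layer.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, reduced stand-ins for the structures used in ofproto/bond.c. */
struct sketch_eth_addr { uint8_t ea[6]; };
struct sketch_slave { struct sketch_eth_addr mac; };
struct sketch_bond {
    struct sketch_slave *active_slave;       /* NULL when all are disabled */
    struct sketch_eth_addr active_slave_mac; /* what 'bond/show' would print */
};

/* Mirrors the patched logic: copy the active slave's MAC if there is one,
 * otherwise clear the recorded MAC so a stale address is not reported. */
void
sketch_active_slave_changed(struct sketch_bond *bond)
{
    if (bond->active_slave) {
        bond->active_slave_mac = bond->active_slave->mac;
    } else {
        memset(&bond->active_slave_mac, 0, sizeof bond->active_slave_mac);
    }
}
```

Calling the function once with an active slave and once with `active_slave` set to NULL shows the recorded MAC going from the slave's address to all zeros, which is the output the commit message describes.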
Re: [ovs-dev] [PATCH v2] ofproto/bond: Drop traffic in balance-tcp mode without lacp.
On Wed, Feb 15, 2017 at 5:32 PM, nickcooper-zhangtonghao wrote:
> The balance-tcp mode requires the upstream switch to support 802.3ad
> with successful LACP negotiation. When bond ports are configured to
> balance-tcp mode without lacp or lacp is disabled, drop the traffic.
>
> Signed-off-by: nickcooper-zhangtonghao

Applied. Thanks.
___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] OVN: Preserving state between logical datapaths
On Fri, Feb 17, 2017 at 12:44:09AM +0500, Valentine Sinitsyn wrote: > On 16.02.2017 21:24, Ben Pfaff wrote: > >On Thu, Feb 16, 2017 at 08:46:34PM +0500, Valentine Sinitsyn wrote: > >>Imagine you want to mark a packet in logical switch datapath then use this > >>mark in logical router datapath somehow (an artificial use-case would be > >>policy routing based on VM port, not destination IP address). > >> > >>Is there a better way than using packet mark (which also doesn't seem to > >>survive "output" action, yet it's easily fixable)? I assume OVS/OVN 2.6 on > >>Linux with in-kernel datapath, if this matters. > > > >The main issue for preserving metadata from one logical datapath to > >another is that there needs to be a way to do it when we cross from one > >hypervisor to another. This is straightforward in Geneve, by adding an > >extra option. So far, we haven't had enough motivation to add extra > >options for this purpose. > Makes sense. Let us consider an artificial use-case above, though. If we are > only concerned about preserving metadata across logical patch ports, the > traffic is local to hypervisor. What then? If that's the only case you care about, all you have to do is delete the code in ovn-controller that zeros out metadata when packets go from ingress to egress or from one logical datapath to another. (This could cause some problems because the logical flows tables assume that registers are zero at the beginning of the pipeline.) ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH branch-2.7] docs: Add dpdk stable release to DPDK install docs.
DPDK now provides a stable release branch. Modify install docs to use the DPDK v.16.11 stable branch snapshot to benefit from most recent bug fixes. Signed-off-by: Ian Stokes--- Documentation/intro/install/dpdk.rst |6 +++--- Documentation/topics/dpdk/vhost-user.rst |8 2 files changed, 7 insertions(+), 7 deletions(-) diff --git a/Documentation/intro/install/dpdk.rst b/Documentation/intro/install/dpdk.rst index 3018590..0cf4dab 100644 --- a/Documentation/intro/install/dpdk.rst +++ b/Documentation/intro/install/dpdk.rst @@ -64,9 +64,9 @@ Install DPDK #. Download the `DPDK sources`_, extract the file and set ``DPDK_DIR``:: $ cd /usr/src/ - $ wget http://fast.dpdk.org/rel/dpdk-16.11.tar.xz - $ tar xf dpdk-16.11.tar.xz - $ export DPDK_DIR=/usr/src/dpdk-16.11 + $ wget http://dpdk.org/browse/dpdk-stable/snapshot/dpdk-stable-16.11.tar.xz + $ tar xf dpdk-stable-16.11.tar.xz + $ export DPDK_DIR=/usr/src/dpdk-stable-16.11 $ cd $DPDK_DIR #. (Optional) Configure DPDK as a shared library diff --git a/Documentation/topics/dpdk/vhost-user.rst b/Documentation/topics/dpdk/vhost-user.rst index 5448bd2..a0fd582 100644 --- a/Documentation/topics/dpdk/vhost-user.rst +++ b/Documentation/topics/dpdk/vhost-user.rst @@ -278,9 +278,9 @@ To begin, instantiate a guest as described in :ref:`dpdk-vhost-user` or DPDK sources to VM and build DPDK:: $ cd /root/dpdk/ -$ wget http://fast.dpdk.org/rel/dpdk-16.11.tar.xz -$ tar xf dpdk-16.11.tar.xz -$ export DPDK_DIR=/root/dpdk/dpdk-16.11 +$ wget http://dpdk.org/browse/dpdk-stable/snapshot/dpdk-stable-16.11.tar.xz +$ tar xf dpdk-stable-16.11.tar.xz +$ export DPDK_DIR=/usr/src/dpdk-stable-16.11 $ export DPDK_TARGET=x86_64-native-linuxapp-gcc $ export DPDK_BUILD=$DPDK_DIR/$DPDK_TARGET $ cd $DPDK_DIR @@ -364,7 +364,7 @@ Sample XML - + -- 1.7.0.7 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH v10] dpif-netdev: Conditional EMC insert
2017-02-16 3:01 GMT-08:00 Kevin Traynor: > On 02/16/2017 10:22 AM, Ciara Loftus wrote: >> Unconditional insertion of EMC entries results in EMC thrashing at high >> numbers of parallel flows. When this occurs, the performance of the EMC >> often falls below that of the dpcls classifier, rendering the EMC >> practically useless. >> >> Instead of unconditionally inserting entries into the EMC when a miss >> occurs, use a 1% probability of insertion. This ensures that the most >> frequent flows have the highest chance of creating an entry in the EMC, >> and the probability of thrashing the EMC is also greatly reduced. >> >> The probability of insertion is configurable, via the >> other_config:emc-insert-inv-prob option. This value sets the average >> probability of insertion to 1/emc-insert-inv-prob. >> >> For example the following command changes the insertion probability to >> (on average) 1 in every 20 packets ie. 1/20 ie. 5%. >> >> ovs-vsctl set Open_vSwitch . other_config:emc-insert-inv-prob=20 >> >> Signed-off-by: Ciara Loftus >> Signed-off-by: Georg Schmuecking >> Co-authored-by: Georg Schmuecking >> Acked-by: Kevin Traynor >> --- >> v10: >> - Fixed typo in commit message >> - Only store insert_min when value has changed >> - Add prints to reflect changes in the DB > > Thanks for the changes, LGTM. > Kevin. Thanks a lot. 
I squashed the following incremental to support values that don't fit in a signed integer:

diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index 35d3eda5e..31aee51a2 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -2762,7 +2762,9 @@ dpif_netdev_set_config(struct dpif *dpif, const struct smap *other_config)
 {
     struct dp_netdev *dp = get_dp_netdev(dpif);
     const char *cmask = smap_get(other_config, "pmd-cpu-mask");
-    int insert_prob = smap_get_int(other_config, "emc-insert-inv-prob", -1);
+    unsigned long long insert_prob =
+        smap_get_ullong(other_config, "emc-insert-inv-prob",
+                        DEFAULT_EM_FLOW_INSERT_INV_PROB);
     uint32_t insert_min, cur_min;
 
     if (!nullable_string_is_equal(dp->pmd_cmask, cmask)) {
@@ -2772,7 +2774,7 @@ dpif_netdev_set_config(struct dpif *dpif, const struct smap *other_config)
     }
 
     atomic_read_relaxed(&dp->emc_insert_min, &cur_min);
-    if (insert_prob >= 0 && insert_prob <= UINT32_MAX) {
+    if (insert_prob <= UINT32_MAX) {
         insert_min = insert_prob == 0 ? 0 : UINT32_MAX / insert_prob;
     } else {
         insert_min = DEFAULT_EM_FLOW_INSERT_MIN;
@@ -2784,7 +2786,7 @@ dpif_netdev_set_config(struct dpif *dpif, const struct smap *other_config)
         if (insert_min == 0) {
             VLOG_INFO("EMC has been disabled");
         } else {
-            VLOG_INFO("EMC insertion probability changed to 1/%i (~%.2f%%)",
+            VLOG_INFO("EMC insertion probability changed to 1/%llu (~%.2f%%)",
                       insert_prob, (100 / (float)insert_prob));
         }
     }

and pushed this to master.
___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
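As a rough illustration of the arithmetic in the incremental above, the mapping from the configured inverse probability to a 32-bit threshold can be sketched as follows. The `sketch_` names and the default of 100 (taken from the 1% figure in the commit message) are assumptions, and the comparison in `sketch_should_insert` is only one plausible way to apply the threshold to a uniformly distributed random value; it is not quoted from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed default inverse probability, i.e. 1 in 100 (about 1%). */
#define SKETCH_DEFAULT_INV_PROB 100

/* Maps the inverse probability N to a threshold in [0, UINT32_MAX].
 * N == 0 disables insertion; values that do not fit in 32 bits fall
 * back to the default, as the patched range check does. */
uint32_t
sketch_insert_min(unsigned long long inv_prob)
{
    if (inv_prob > UINT32_MAX) {
        inv_prob = SKETCH_DEFAULT_INV_PROB;
    }
    return inv_prob == 0 ? 0 : (uint32_t) (UINT32_MAX / inv_prob);
}

/* 'random_value' is assumed to be uniform over the full uint32_t range,
 * so roughly 1 in N calls returns true for a threshold built from N. */
bool
sketch_should_insert(uint32_t insert_min, uint32_t random_value)
{
    return insert_min != 0 && random_value <= insert_min;
}
```

With N = 20 the threshold is UINT32_MAX / 20, so about 5% of uniformly drawn values fall at or below it, matching the 1/20 example in the commit message. With N = 1 every value passes, and with N = 0 none do.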
Re: [ovs-dev] [PATCH] docker.rst: Add documentation to open up TCP ports.
On 16 February 2017 at 10:25, Ben Pfaffwrote: > On Tue, Feb 07, 2017 at 04:54:44AM -0800, Gurucharan Shetty wrote: > > Signed-off-by: Gurucharan Shetty > > --- > > Documentation/howto/docker.rst | 6 ++ > > 1 file changed, 6 insertions(+) > > > > diff --git a/Documentation/howto/docker.rst > b/Documentation/howto/docker.rst > > index e23ca75..7845699 100644 > > --- a/Documentation/howto/docker.rst > > +++ b/Documentation/howto/docker.rst > > @@ -90,6 +90,12 @@ The "overlay" mode > > > >$ /usr/share/openvswitch/scripts/ovn-ctl start_northd > > > > + With Open vSwitch version of 2.7 or greater, you need to run the > following > > + additional commands:: > > + > > + $ ovn-nbctl set-connection ptcp:6641 > > + $ ovn-sbctl set-connection ptcp:6642 > > Acked-by: Ben Pfaff > > We really should provide documentation on how to set up SSL. > Yeah. I do have documentation for SSL support in kubernetes here: https://github.com/openvswitch/ovn-kubernetes/blob/master/docs/INSTALL.SSL.md But it only says - do it this way and not try to explain all the other potential ways to do it. ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] manpages in rst?
On Thu, Feb 16, 2017 at 12:24 PM, Ben Pfaffwrote: > Currently, we have some manpages written directly in nroff. This is an > awful format, that is difficult to read and difficult to write. Other > manpages are written in a custom XML format that, while it is easier to > read and write, isn't any standard format and so we can't expect anyone > else (person or program) to understand it. This is not ideal. It's > difficult to include either format in the readthedocs documentation, > too. > > I'm thinking about starting to write manpages in REstructured Text > (rst). This would make it much easier to include them in the > readthedocs pages, and ReST seems to convert pretty well to nroff for > installing as real manpages. For example, try fetching > http://docutils.sourceforge.net/sandbox/manpage-writer/input/test.txt, > which is a rst file, and then running "rst2man test.txt > test.man" and > viewing test.man with "man -l" or "groffer". The output looks fine. > > I think that all we'd need for this is a build dependency on > python-docutils to ensure that rst2man is available at build time. > > Does anyone have comments? > +1 for rst. That should make it easier to integrate the same content into docs.openvswitch.org. I think sphinx has support for man pages, but it has been a long time since I've used it. -- Russell Bryant ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] Control and Savings in Accounts Payable
Main errors in Accounts Payable. Control and Savings in Accounts Payable. Practical financial tools (Golden Rules) are presented to improve the control policies of the Accounts Payable area and generate savings for the company. Improve the control, planning, and oversight of suppliers; the course also lists the most common errors in Accounts Payable and covers basic legal knowledge to avoid surprises and keep finances healthy. If you would like us to send you the syllabus, with no obligation, reply to this email with the word: Info-Finanzas, together with the requested details. Name: Phone: Email: and we will send you complete information on this topic, which is included in our Comprehensive Accounting and Finance Training Plan. Call center: 018002129393 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] OVN: Preserving state between logical datapaths
On Thu, Feb 16, 2017 at 08:46:34PM +0500, Valentine Sinitsyn wrote: > Imagine you want to mark a packet in logical switch datapath then use this > mark in logical router datapath somehow (an artificial use-case would be > policy routing based on VM port, not destination IP address). > > Is there a better way than using packet mark (which also doesn't seem to > survive "output" action, yet it's easily fixable)? I assume OVS/OVN 2.6 on > Linux with in-kernel datapath, if this matters. The main issue for preserving metadata from one logical datapath to another is that there needs to be a way to do it when we cross from one hypervisor to another. This is straightforward in Geneve, by adding an extra option. So far, we haven't had enough motivation to add extra options for this purpose. ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] OVN: Preserving state between logical datapaths
Hi all, Imagine you want to mark a packet in logical switch datapath then use this mark in logical router datapath somehow (an artificial use-case would be policy routing based on VM port, not destination IP address). Is there a better way than using packet mark (which also doesn't seem to survive "output" action, yet it's easily fixable)? I assume OVS/OVN 2.6 on Linux with in-kernel datapath, if this matters. Many thanks, Valentine Sinitsyn ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH] netdev-dpdk: Fix rx_error stat for dpdk ports.
"rx_error" stat for a DPDK interface was calculated with the assumption that dropped packets due to hardware buffer overload were counted as errors in DPDK and the rte ierror stat included rte imissed packets i.e.

    rx_errors = rte_stats.ierrors - rte_stats.imissed

This results in negative statistic values as imissed packets are no longer counted as part of ierror since DPDK v.16.04. Fix this by setting rx_errors equal to ierrors only.

Fixes: 9e3ddd45 (netdev-dpdk: Add some missing statistics.)
CC: Timo Puha
Reported-by: Stepan Andrushko
Signed-off-by: Ian Stokes
---
 lib/netdev-dpdk.c |    3 +--
 1 files changed, 1 insertions(+), 2 deletions(-)

diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index 94568a1..ee53c4c 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -2067,8 +2067,7 @@ out:
     stats->tx_packets = rte_stats.opackets;
     stats->rx_bytes = rte_stats.ibytes;
     stats->tx_bytes = rte_stats.obytes;
-    /* DPDK counts imissed as errors, but count them here as dropped instead */
-    stats->rx_errors = rte_stats.ierrors - rte_stats.imissed;
+    stats->rx_errors = rte_stats.ierrors;
     stats->tx_errors = rte_stats.oerrors;
 
     rte_spinlock_lock(&dev->stats_lock);
-- 
1.7.0.7
___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH] doc: Describe backporting process.
On Wed, 2017-02-15 at 15:05 -0800, Joe Stringer wrote: > This patch documents the backporting process, and provides a > walkthrough > for developers who would like to backport upstream Linux patches into > the Open vSwitch tree. Nothing in this documentation should be > surprising or new; it merely puts the existing process into words. > > Signed-off-by: Joe Stringer> Acked-by: Ben Pfaff Excellent guide - I had no idea OVS and the net-next tree were as closely related as they are. Couple of nits below but nothing I'd block on. Acked-by: Stephen Finucane > --- > Documentation/automake.mk | 1 + > Documentation/index.rst| 1 + > Documentation/internals/contributing/backports.rst | 232 > + > Documentation/internals/contributing/index.rst | 1 + > 4 files changed, 235 insertions(+) > create mode 100644 > Documentation/internals/contributing/backports.rst > > diff --git a/Documentation/automake.mk b/Documentation/automake.mk > index 42553f0b57ff..610d8ccc6f96 100644 > --- a/Documentation/automake.mk > +++ b/Documentation/automake.mk > @@ -80,6 +80,7 @@ EXTRA_DIST += \ > Documentation/internals/release-process.rst \ > Documentation/internals/security.rst \ > Documentation/internals/contributing/index.rst \ > + Documentation/internals/contributing/backports.rst \ > Documentation/internals/contributing/coding-style.rst \ > Documentation/internals/contributing/coding-style- > windows.rst \ > Documentation/internals/contributing/documentation-style.rst > \ > diff --git a/Documentation/index.rst b/Documentation/index.rst > index 02b376fc2a08..8cfb9f3f47a8 100644 > --- a/Documentation/index.rst > +++ b/Documentation/index.rst > @@ -98,6 +98,7 @@ Learn more about the Open vSwitch project and about > how you can contribute: > :doc:`internals/security` > > - **Contributing:** :doc:`internals/contributing/submitting-patches` > | > + :doc:`internals/contributing/backports` | I guess 'backporting-patches' might be more in keeping with existing files? 
> :doc:`internals/contributing/coding-style` | > :doc:`internals/contributing/coding-style-windows` > > diff --git a/Documentation/internals/contributing/backports.rst > b/Documentation/internals/contributing/backports.rst > new file mode 100644 > index ..d1fa35007f01 > --- /dev/null > +++ b/Documentation/internals/contributing/backports.rst > @@ -0,0 +1,232 @@ > +.. Copyright (c) 2017 ??? > + Licensed under the Apache License, Version 2.0 (the > "License"); you may > + not use this file except in compliance with the License. You > may obtain > + a copy of the License at > + > + http://www.apache.org/licenses/LICENSE-2.0 > + > + Unless required by applicable law or agreed to in writing, > software > + distributed under the License is distributed on an "AS IS" > BASIS, WITHOUT > + WARRANTIES OR CONDITIONS OF ANY KIND, either express or > implied. See the > + License for the specific language governing permissions and > limitations > + under the License. > + > + Convention for heading levels in Open vSwitch documentation: > + > + === Heading 0 (reserved for the title in a document) > + --- Heading 1 > + ~~~ Heading 2 > + +++ Heading 3 > + ''' Heading 4 > + > + Avoid deeper levels because they do not render well. > + > +=== > +Backporting patches > +=== > + > +.. note:: > + > +This is an advanced topic for developers and maintainers. > Readers should > +familiarize themselves with building and running Open vSwitch, > with the git > +tool, and with the Open vSwitch patch submission process. > + > +The backporting of patches from one git tree to another takes > multiple forms > +within Open vSwitch, but is broadly applied in the following > fashion: > + > +- Contributors submit their proposed changes to the latest > development branch > +- Contributors and maintainers provide feedback on the patches > +- When the change is satisfactory, maintainers apply the patch to > the > + development branch. 
> +- Maintainers backport changes from a development branch to release > branches. > + > +With regards to Open vSwitch user space code and code that does not > comprise > +the Linux datapath and compat code, the development branch is > `master` in the > +Open vSwitch repository. Patches are applied first to this branch, > then to the > +most recent `branch-X.Y`, then earlier `branch-X.Z`, and so on. The > most common > +kind of patch in this category is a bugfix which affects master and > other > +branches. > + > +For Linux datapath code, the primary development branch is in the > `net-next`_ > +tree as described in the section below, and patch discussion occurs > on the > +`netdev`_ mailing list. Patches are first applied to
[ovs-dev] ID 8d6ba737-775e8bdc-f95f16f3-1b460259 - Company Complaint
This message has been generated in response to the company complaint submitted to Companies House. (CC01) Company Complaint for the above company was accepted on 16/02/2017. Please check attached documents for more information. The submission number is id: 8d6ba737-775e8bdc-f95f16f3-1b460259 Please quote this number in any communications with Companies House. Note: Attached documents are encrypted with a unique Private Key. Service Desk tel +44 (0)303 8097 432 or email enquir...@companieshouse.gov.uk Note: This email was sent from a notification-only email address which cannot accept incoming email. Please do not reply directly to this message If you’re unsure an email is from Companies House: Crown Logo Do not reply to it or click on any links Report the suspicious email to Companies House All content is available under the Open Government Licence v3.0, except where otherwise stated ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH v10] dpif-netdev: Conditional EMC insert
On 02/16/2017 10:22 AM, Ciara Loftus wrote: > Unconditional insertion of EMC entries results in EMC thrashing at high > numbers of parallel flows. When this occurs, the performance of the EMC > often falls below that of the dpcls classifier, rendering the EMC > practically useless. > > Instead of unconditionally inserting entries into the EMC when a miss > occurs, use a 1% probability of insertion. This ensures that the most > frequent flows have the highest chance of creating an entry in the EMC, > and the probability of thrashing the EMC is also greatly reduced. > > The probability of insertion is configurable, via the > other_config:emc-insert-inv-prob option. This value sets the average > probability of insertion to 1/emc-insert-inv-prob. > > For example the following command changes the insertion probability to > (on average) 1 in every 20 packets ie. 1/20 ie. 5%. > > ovs-vsctl set Open_vSwitch . other_config:emc-insert-inv-prob=20 > > Signed-off-by: Ciara Loftus> Signed-off-by: Georg Schmuecking > Co-authored-by: Georg Schmuecking > Acked-by: Kevin Traynor > --- > v10: > - Fixed typo in commit message > - Only store insert_min when value has changed > - Add prints to reflect changes in the DB Thanks for the changes, LGTM. Kevin. ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH] ovs-appctl: Print lacp_fallback_ab info in "bond/show".
Signed-off-by: nickcooper-zhangtonghao--- ofproto/bond.c | 3 +++ tests/lacp.at | 9 + 2 files changed, 12 insertions(+) diff --git a/ofproto/bond.c b/ofproto/bond.c index 2e018aa..de75f87 100644 --- a/ofproto/bond.c +++ b/ofproto/bond.c @@ -1325,6 +1325,9 @@ bond_print_details(struct ds *ds, const struct bond *bond) break; } +ds_put_format(ds, "lacp_fallback_ab: %s\n", + bond->lacp_fallback_ab ? "true" : "false"); + ds_put_cstr(ds, "active slave mac: "); ds_put_format(ds, ETH_ADDR_FMT, ETH_ADDR_ARGS(bond->active_slave_mac)); slave = bond_find_slave_by_mac(bond, bond->active_slave_mac); diff --git a/tests/lacp.at b/tests/lacp.at index 8f78e79..20ec09e 100644 --- a/tests/lacp.at +++ b/tests/lacp.at @@ -124,6 +124,7 @@ bond-hash-basis: 0 updelay: 0 ms downdelay: 0 ms lacp_status: negotiated +lacp_fallback_ab: false active slave mac: 00:00:00:00:00:00(none) slave p1: disabled @@ -288,6 +289,7 @@ bond-hash-basis: 0 updelay: 0 ms downdelay: 0 ms lacp_status: negotiated +lacp_fallback_ab: false slave p0: enabled may_enable: true @@ -302,6 +304,7 @@ bond-hash-basis: 0 updelay: 0 ms downdelay: 0 ms lacp_status: negotiated +lacp_fallback_ab: false slave p2: enabled may_enable: true @@ -423,6 +426,7 @@ bond-hash-basis: 0 updelay: 0 ms downdelay: 0 ms lacp_status: negotiated +lacp_fallback_ab: false slave p0: disabled @@ -439,6 +443,7 @@ bond-hash-basis: 0 updelay: 0 ms downdelay: 0 ms lacp_status: negotiated +lacp_fallback_ab: false slave p2: disabled @@ -553,6 +558,7 @@ bond-hash-basis: 0 updelay: 0 ms downdelay: 0 ms lacp_status: negotiated +lacp_fallback_ab: false slave p0: disabled @@ -569,6 +575,7 @@ bond-hash-basis: 0 updelay: 0 ms downdelay: 0 ms lacp_status: negotiated +lacp_fallback_ab: false slave p2: disabled @@ -688,6 +695,7 @@ bond-hash-basis: 0 updelay: 0 ms downdelay: 0 ms lacp_status: negotiated +lacp_fallback_ab: false slave p0: enabled @@ -704,6 +712,7 @@ bond-hash-basis: 0 updelay: 0 ms downdelay: 0 ms lacp_status: negotiated +lacp_fallback_ab: false slave 
p2: enabled -- 1.8.3.1 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH v10] dpif-netdev: Conditional EMC insert
Unconditional insertion of EMC entries results in EMC thrashing at high
numbers of parallel flows. When this occurs, the performance of the EMC
often falls below that of the dpcls classifier, rendering the EMC
practically useless.

Instead of unconditionally inserting entries into the EMC when a miss
occurs, use a 1% probability of insertion. This ensures that the most
frequent flows have the highest chance of creating an entry in the EMC,
and the probability of thrashing the EMC is also greatly reduced.

The probability of insertion is configurable via the
other_config:emc-insert-inv-prob option. This value sets the average
probability of insertion to 1/emc-insert-inv-prob.

For example, the following command changes the insertion probability to
(on average) 1 in every 20 packets, i.e. 1/20 or 5%:

    ovs-vsctl set Open_vSwitch . other_config:emc-insert-inv-prob=20

Signed-off-by: Ciara Loftus
Signed-off-by: Georg Schmuecking
Co-authored-by: Georg Schmuecking
Acked-by: Kevin Traynor
---
v10:
- Fixed typo in commit message
- Only store insert_min when value has changed
- Add prints to reflect changes in the DB
v9:
- Revert back to original 1/N formula for configuring the probability
  (don't use percentiles).
- Rename option to reflect inverse probability & update documentation
  to make the wording clearer.
v8:
- Floating point precision percentiles
- Moved NEWS entry to post-2.7 section and is no longer in the DPDK
  specific section.
v7:
- Remove further code duplication
v6:
- Refactor the code to remove duplication around calls to
  emc_probabilistic_insert()
v5:
- Use percentiles for emc-insert-prob (0-100%)
- Update docs to reflect the option not exclusive to the DPDK datapath.
v4:
- Added Georg to Authors file
- Set emc-insert-prob=1 for 'PMD - stats' unit test
- Use read_relaxed on atomic var
- Correctly allow for 0 and 100% probabilities
- Cache align new element in dp_netdev struct
- Revert to default probability if invalid emc-insert-prob set
- Allow configurability for non-DPDK case
v3:
- Use new dpif other_config infrastructure to tidy up how the
  emc-insert-prob value is passed to dpif-netdev.
v2:
- Enable probability configurability via other_config:emc-insert-prob
  option.

 AUTHORS.rst                  |  1 +
 Documentation/howto/dpdk.rst | 20
 NEWS                         |  2 ++
 lib/dpif-netdev.c            | 57
 tests/pmd.at                 |  1 +
 vswitchd/vswitch.xml         | 17 +
 6 files changed, 94 insertions(+), 4 deletions(-)

diff --git a/AUTHORS.rst b/AUTHORS.rst
index f247df5..9a37423 100644
--- a/AUTHORS.rst
+++ b/AUTHORS.rst
@@ -385,6 +385,7 @@ Eric Lopez elo...@nicira.com
 Frido Roose fr.ro...@gmail.com
 Gaetano Catalli gaetano.cata...@gmail.com
 Gavin Remaley gavin_rema...@selinc.com
+Georg Schmuecking georg.schmueck...@ericsson.com
 George Shuklin ama...@desunote.ru
 Gerald Rogers gerald.rog...@intel.com
 Ghanem Bahri bahri.gha...@gmail.com
diff --git a/Documentation/howto/dpdk.rst b/Documentation/howto/dpdk.rst
index d1e6e89..52cb3fc 100644
--- a/Documentation/howto/dpdk.rst
+++ b/Documentation/howto/dpdk.rst
@@ -354,6 +354,26 @@ the `DPDK documentation
 
 Note: Not all DPDK virtual PMD drivers have been tested and verified to work.
 
+EMC Insertion Probability
+-------------------------
+By default 1 in every 100 flows are inserted into the Exact Match Cache (EMC).
+It is possible to change this insertion probability by setting the
+``emc-insert-inv-prob`` option::
+
+    $ ovs-vsctl --no-wait set Open_vSwitch . other_config:emc-insert-inv-prob=N
+
+where:
+
+``N``
+  is a positive integer representing the inverse probability of insertion ie.
+  on average 1 in every N packets with a unique flow will generate an EMC
+  insertion.
+
+If ``N`` is set to 1, an insertion will be performed for every flow. If set to
+0, no insertions will be performed and the EMC will effectively be disabled.
+
+For more information on the EMC refer to :doc:`/intro/install/dpdk` .
+
 .. _dpdk-ovs-in-guest:
 
 OVS with DPDK Inside VMs
diff --git a/NEWS b/NEWS
index aebd99c..b14c76d 100644
--- a/NEWS
+++ b/NEWS
@@ -3,6 +3,8 @@ Post-v2.7.0
    - Tunnels:
      * Added support to set packet mark for tunnel endpoint using
        `egress_pkt_mark` OVSDB option.
+   - EMC insertion probability is reduced to 1% and is configurable via
+     the new 'other_config:emc-insert-inv-prob' option.
 
 v2.7.0 - xx xxx
 -
diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index 0be5db5..35d3eda 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -144,6 +144,11 @@ struct netdev_flow_key {
 #define EM_FLOW_HASH_MASK (EM_FLOW_HASH_ENTRIES - 1)
 #define
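For readers following the thread, the 1/N behaviour described in the commit message can be sketched as below. This is a minimal illustration only, not the code from lib/dpif-netdev.c: the names emc_insert_threshold() and emc_should_insert() are hypothetical, and the caller is assumed to supply a uniformly distributed 32-bit random value.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: converts the inverse probability N (the value of
 * emc-insert-inv-prob) into a threshold on a 32-bit random number.
 * N == 0 yields a threshold of 0, which is treated as "never insert". */
static uint32_t
emc_insert_threshold(uint32_t inv_prob)
{
    return inv_prob ? UINT32_MAX / inv_prob : 0;
}

/* Hypothetical helper: decides whether a miss should populate the cache.
 * 'random_value' is assumed to be uniform over [0, UINT32_MAX], so the
 * check passes with probability of roughly 1/N. */
static bool
emc_should_insert(uint32_t random_value, uint32_t threshold)
{
    return threshold != 0 && random_value <= threshold;
}
```

With N=1 the threshold is UINT32_MAX and every draw passes, so every miss inserts. With N=20 about 5% of draws pass, and with N=0 the explicit zero check keeps the cache from ever being populated.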
[ovs-dev] [patch_v6 7/8] dpdk: Enable NAT tests for userspace datapath.
Signed-off-by: Darrell Ball
Acked-by: Flavio Leitner
---
 tests/system-userspace-macros.at | 7 ++-
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/tests/system-userspace-macros.at b/tests/system-userspace-macros.at
index 631f71a..6e3d468 100644
--- a/tests/system-userspace-macros.at
+++ b/tests/system-userspace-macros.at
@@ -99,9 +99,6 @@ m4_define([CHECK_CONNTRACK_LOCAL_STACK],
 # CHECK_CONNTRACK_NAT()
 #
 # Perform requirements checks for running conntrack NAT tests. The userspace
-# doesn't support NATs yet, so skip the tests
+# datapath supports NAT.
 #
-m4_define([CHECK_CONNTRACK_NAT],
-[
-AT_SKIP_IF([:])
-])
+m4_define([CHECK_CONNTRACK_NAT])
-- 
1.9.1
[ovs-dev] [patch_v6 8/8] dpdk: Update feature alert documentation
Signed-off-by: Darrell Ball
---
 Documentation/faq/releases.rst | 2 +-
 NEWS                           | 2 ++
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/Documentation/faq/releases.rst b/Documentation/faq/releases.rst
index 319c2d7..eb9187c 100644
--- a/Documentation/faq/releases.rst
+++ b/Documentation/faq/releases.rst
@@ -103,7 +103,7 @@ Q: Are all features available with all datapaths?
 ===================== ============== ============== ========= =======
 Feature               Linux upstream Linux OVS tree Userspace Hyper-V
 ===================== ============== ============== ========= =======
-NAT                   4.6            YES            NO        NO
+NAT                   4.6            YES            Yes       NO
 Connection tracking   4.3            YES            PARTIAL   PARTIAL
 Tunnel - LISP         NO             YES            NO        NO
 Tunnel - STT          NO             YES            NO        YES
diff --git a/NEWS b/NEWS
index aebd99c..890549f 100644
--- a/NEWS
+++ b/NEWS
@@ -3,6 +3,8 @@ Post-v2.7.0
    - Tunnels:
      * Added support to set packet mark for tunnel endpoint using
        `egress_pkt_mark` OVSDB option.
+   - Userspace Datapath NAT:
+     * Added NAT support for userspace datapath.
 
 v2.7.0 - xx xxx
 -
-- 
1.9.1
[ovs-dev] [patch_v6 4/8] dpdk: Userspace Datapath: Introduce NAT Support.
This patch introduces NAT support for the userspace datapath. The
conntrack module changes are in this patch.

The per packet scope of lookups for NAT and un_NAT is at the bucket
level rather than global. One hash table is introduced to support
create/delete handling. The create/delete events may be further
optimized, if the need becomes clear.

Some NAT options with limited utility (persistent, random) are not
supported yet, but will be supported in a later patch.

Signed-off-by: Darrell Ball
---
 lib/conntrack-private.h |  16 +-
 lib/conntrack.c         | 782 ++--
 lib/conntrack.h         |  46 +++
 3 files changed, 751 insertions(+), 93 deletions(-)

diff --git a/lib/conntrack-private.h b/lib/conntrack-private.h
index 493865f..a7c2ae4 100644
--- a/lib/conntrack-private.h
+++ b/lib/conntrack-private.h
@@ -51,14 +51,23 @@ struct conn_key {
     uint16_t zone;
 };
 
+struct nat_conn_key_node {
+    struct hmap_node node;
+    struct conn_key key;
+    struct conn_key value;
+};
+
 struct conn {
     struct conn_key key;
     struct conn_key rev_key;
     long long expiration;
     struct ovs_list exp_node;
     struct hmap_node node;
-    uint32_t mark;
     ovs_u128 label;
+    /* XXX: consider flattening. */
+    struct nat_action_info_t *nat_info;
+    uint32_t mark;
+    uint8_t conn_type;
 };
 
 enum ct_update_res {
@@ -67,6 +76,11 @@ enum ct_update_res {
     CT_UPDATE_NEW,
 };
 
+enum ct_conn_type {
+    CT_CONN_TYPE_DEFAULT,
+    CT_CONN_TYPE_UN_NAT,
+};
+
 struct ct_l4_proto {
     struct conn *(*new_conn)(struct conntrack_bucket *, struct dp_packet *pkt,
                              long long now);
diff --git a/lib/conntrack.c b/lib/conntrack.c
index d0e106f..49760c0 100644
--- a/lib/conntrack.c
+++ b/lib/conntrack.c
@@ -76,6 +76,20 @@ static void set_label(struct dp_packet *, struct conn *,
                       const struct ovs_key_ct_labels *mask);
 static void *clean_thread_main(void *f_);
 
+static struct nat_conn_key_node *
+nat_conn_keys_lookup(struct hmap *nat_conn_keys,
+                     const struct conn_key *key,
+                     uint32_t basis);
+
+static void
+nat_conn_keys_remove(struct hmap *nat_conn_keys,
+                     const struct conn_key *key,
+                     uint32_t basis);
+
+static bool
+nat_select_range_tuple(struct conntrack *ct, const struct conn *conn,
+                       struct conn *nat_conn);
+
 static struct ct_l4_proto *l4_protos[] = {
     [IPPROTO_TCP] = &ct_proto_tcp,
     [IPPROTO_UDP] = &ct_proto_other,
@@ -90,7 +104,7 @@ long long ct_timeout_val[] = {
 };
 
 /* If the total number of connections goes above this value, no new connections
- * are accepted */
+ * are accepted; this is for CT_CONN_TYPE_DEFAULT connections. */
 #define DEFAULT_N_CONN_LIMIT 300
 
 /* Initializes the connection tracker 'ct'. The caller is responsible for
@@ -101,6 +115,11 @@ conntrack_init(struct conntrack *ct)
     unsigned i, j;
     long long now = time_msec();
 
+    ct_rwlock_init(&ct->nat_resources_lock);
+    ct_rwlock_wrlock(&ct->nat_resources_lock);
+    hmap_init(&ct->nat_conn_keys);
+    ct_rwlock_unlock(&ct->nat_resources_lock);
+
     for (i = 0; i < CONNTRACK_BUCKETS; i++) {
         struct conntrack_bucket *ctb = &ct->buckets[i];
 
@@ -139,13 +158,24 @@ conntrack_destroy(struct conntrack *ct)
         ovs_mutex_destroy(&ctb->cleanup_mutex);
         ct_lock_lock(&ctb->lock);
         HMAP_FOR_EACH_POP(conn, node, &ctb->connections) {
-            atomic_count_dec(&ct->n_conn);
+            if (conn->conn_type == CT_CONN_TYPE_DEFAULT) {
+                atomic_count_dec(&ct->n_conn);
+            }
             delete_conn(conn);
         }
         hmap_destroy(&ctb->connections);
         ct_lock_unlock(&ctb->lock);
         ct_lock_destroy(&ctb->lock);
     }
+    ct_rwlock_wrlock(&ct->nat_resources_lock);
+    struct nat_conn_key_node *nat_conn_key_node;
+    HMAP_FOR_EACH_POP(nat_conn_key_node, node, &ct->nat_conn_keys) {
+        free(nat_conn_key_node);
+    }
+    hmap_destroy(&ct->nat_conn_keys);
+    ct_rwlock_unlock(&ct->nat_resources_lock);
+    ct_rwlock_destroy(&ct->nat_resources_lock);
+
 }
 
 static unsigned hash_to_bucket(uint32_t hash)
@@ -158,29 +188,186 @@ static unsigned hash_to_bucket(uint32_t hash)
 }
 
 static void
-write_ct_md(struct dp_packet *pkt, uint16_t state, uint16_t zone,
-            uint32_t mark, ovs_u128 label)
+write_ct_md(struct dp_packet *pkt, uint16_t zone, uint32_t mark,
+            ovs_u128 label)
 {
-    pkt->md.ct_state = state | CS_TRACKED;
+    pkt->md.ct_state |= CS_TRACKED;
     pkt->md.ct_zone = zone;
     pkt->md.ct_mark = mark;
     pkt->md.ct_label = label;
 }
 
+static void
+nat_packet(struct dp_packet *pkt, const struct conn *conn)
+{
+    if (conn->nat_info->nat_action & NAT_ACTION_SRC) {
+        pkt->md.ct_state |= CS_SRC_NAT;
+        if (conn->key.dl_type == htons(ETH_TYPE_IP)) {
+            struct ip_header *nh = dp_packet_l3(pkt);
+            packet_set_ipv4_addr(pkt, &nh->ip_src,
[ovs-dev] [patch_v6 6/8] dpdk: Add missing CHECK_CONNTRACK_ALG guards.
Signed-off-by: Darrell Ball
Acked-by: Flavio Leitner
---
 tests/system-traffic.at | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/tests/system-traffic.at b/tests/system-traffic.at
index a15e059..e97a45d 100644
--- a/tests/system-traffic.at
+++ b/tests/system-traffic.at
@@ -2601,6 +2601,7 @@ m4_define([CHECK_FTP_NAT],
 AT_SKIP_IF([test $HAVE_FTP = no])
 CHECK_CONNTRACK()
 CHECK_CONNTRACK_NAT()
+CHECK_CONNTRACK_ALG()
 
 OVS_TRAFFIC_VSWITCHD_START()
 
@@ -2815,6 +2816,8 @@ AT_SETUP([conntrack - IPv6 FTP with NAT])
 AT_SKIP_IF([test $HAVE_FTP = no])
 CHECK_CONNTRACK()
 CHECK_CONNTRACK_NAT()
+CHECK_CONNTRACK_ALG()
+
 OVS_TRAFFIC_VSWITCHD_START()
 
 ADD_NAMESPACES(at_ns0, at_ns1)
-- 
1.9.1
[ovs-dev] [patch_v6 5/8] dpdk: Enhance V6 NAT test.
Signed-off-by: Darrell Ball
Acked-by: Flavio Leitner
---
 tests/system-traffic.at | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/tests/system-traffic.at b/tests/system-traffic.at
index 29dd6d6..a15e059 100644
--- a/tests/system-traffic.at
+++ b/tests/system-traffic.at
@@ -2777,15 +2777,17 @@ ADD_VETH(p0, at_ns0, br0, "fc00::1/96")
 NS_CHECK_EXEC([at_ns0], [ip link set dev p0 address 80:88:88:88:88:88])
 ADD_VETH(p1, at_ns1, br0, "fc00::2/96")
 NS_CHECK_EXEC([at_ns1], [ip -6 neigh add fc00::240 lladdr 80:88:88:88:88:88 dev p1])
+NS_CHECK_EXEC([at_ns1], [ip -6 neigh add fc00::241 lladdr 80:88:88:88:88:88 dev p1])
 
 dnl Allow any traffic from ns0->ns1. Only allow nd, return traffic from ns1->ns0.
 AT_DATA([flows.txt], [dnl
 priority=1,action=drop
 priority=10,icmp6,action=normal
-priority=100,in_port=1,ip6,action=ct(commit,nat(src=fc00::240)),2
+priority=100,in_port=1,ip6,action=ct(commit,nat(src=fc00::240-fc00::241)),2
 priority=100,in_port=2,ct_state=-trk,ip6,action=ct(nat,table=0)
 priority=100,in_port=2,ct_state=+trk+est,ip6,action=1
 priority=200,in_port=2,ct_state=+trk+new,icmp6,icmpv6_code=0,icmpv6_type=135,nd_target=fc00::240,action=ct(commit,nat(dst=fc00::1)),1
+priority=200,in_port=2,ct_state=+trk+new,icmp6,icmpv6_code=0,icmpv6_type=135,nd_target=fc00::241,action=ct(commit,nat(dst=fc00::1)),1
 ])
 
 AT_CHECK([ovs-ofctl --bundle add-flows br0 flows.txt])
-- 
1.9.1
[ovs-dev] [patch_v6 3/8] dpdk: Remove batch sorting in userspace conntrack.
Packet batch sorting is removed for three reasons:

1) The following patches for NAT change the locking marshalling, so
   batching loses its benefit.
2) For real mixtures of flows, either in hypervisors or gateways, the
   batch sorting won't provide benefit and will just be a tax.
3) Code clarity.

Signed-off-by: Darrell Ball
---
 lib/conntrack.c | 49 +++--
 1 file changed, 11 insertions(+), 38 deletions(-)

diff --git a/lib/conntrack.c b/lib/conntrack.c
index 0a611a2..d0e106f 100644
--- a/lib/conntrack.c
+++ b/lib/conntrack.c
@@ -284,16 +284,8 @@ conntrack_execute(struct conntrack *ct, struct dp_packet_batch *pkt_batch,
     enum { KEY_ARRAY_SIZE = NETDEV_MAX_BURST };
 #endif
     struct conn_lookup_ctx ctxs[KEY_ARRAY_SIZE];
-    int8_t bucket_list[CONNTRACK_BUCKETS];
-    struct {
-        unsigned bucket;
-        unsigned long maps;
-    } arr[KEY_ARRAY_SIZE];
     long long now = time_msec();
     size_t i = 0;
-    uint8_t arrcnt = 0;
-
-    BUILD_ASSERT_DECL(sizeof arr[0].maps * CHAR_BIT >= NETDEV_MAX_BURST);
 
     if (helper) {
         static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 5);
@@ -302,48 +294,29 @@ conntrack_execute(struct conntrack *ct, struct dp_packet_batch *pkt_batch,
         /* Continue without the helper */
     }
 
-    memset(bucket_list, INT8_C(-1), sizeof bucket_list);
     for (i = 0; i < cnt; i++) {
-        unsigned bucket;
 
         if (!conn_key_extract(ct, pkts[i], dl_type, &ctxs[i], zone)) {
             write_ct_md(pkts[i], CS_INVALID, zone, 0, OVS_U128_ZERO);
             continue;
         }
 
-        bucket = hash_to_bucket(ctxs[i].hash);
-        if (bucket_list[bucket] == INT8_C(-1)) {
-            bucket_list[bucket] = arrcnt;
-
-            arr[arrcnt].maps = 0;
-            ULLONG_SET1(arr[arrcnt].maps, i);
-            arr[arrcnt++].bucket = bucket;
-        } else {
-            ULLONG_SET1(arr[bucket_list[bucket]].maps, i);
-        }
-    }
-
-    for (i = 0; i < arrcnt; i++) {
-        struct conntrack_bucket *ctb = &ct->buckets[arr[i].bucket];
-        size_t j;
-
+        unsigned bucket = hash_to_bucket(ctxs[i].hash);
+        struct conntrack_bucket *ctb = &ct->buckets[bucket];
         ct_lock_lock(&ctb->lock);
+        conn_key_lookup(ctb, &ctxs[i], now);
 
-        ULLONG_FOR_EACH_1(j, arr[i].maps) {
-            struct conn *conn;
-
-            conn_key_lookup(ctb, &ctxs[j], now);
+        struct conn *conn = process_one(ct, pkts[i], &ctxs[i], zone,
+                                        commit, now);
 
-            conn = process_one(ct, pkts[j], &ctxs[j], zone, commit, now);
-
-            if (conn && setmark) {
-                set_mark(pkts[j], conn, setmark[0], setmark[1]);
-            }
+        if (conn && setmark) {
+            set_mark(pkts[i], conn, setmark[0], setmark[1]);
+        }
 
-            if (conn && setlabel) {
-                set_label(pkts[j], conn, &setlabel[0], &setlabel[1]);
-            }
+        if (conn && setlabel) {
+            set_label(pkts[i], conn, &setlabel[0], &setlabel[1]);
         }
+
         ct_lock_unlock(&ctb->lock);
     }
 
-- 
1.9.1
[ovs-dev] [patch_v6 2/8] dpdk: Parse NAT netlink for userspace datapath.
Signed-off-by: Darrell Ball
---
 lib/conntrack-private.h |  9 --
 lib/conntrack.c         |  3 +-
 lib/conntrack.h         | 29 -
 lib/dpif-netdev.c       | 83 +++--
 tests/test-conntrack.c  |  8 +++--
 5 files changed, 115 insertions(+), 17 deletions(-)

diff --git a/lib/conntrack-private.h b/lib/conntrack-private.h
index 013f19f..493865f 100644
--- a/lib/conntrack-private.h
+++ b/lib/conntrack-private.h
@@ -29,15 +29,6 @@
 #include "packets.h"
 #include "unaligned.h"
 
-struct ct_addr {
-    union {
-        ovs_16aligned_be32 ipv4;
-        union ovs_16aligned_in6_addr ipv6;
-        ovs_be32 ipv4_aligned;
-        struct in6_addr ipv6_aligned;
-    };
-};
-
 struct ct_endpoint {
     struct ct_addr addr;
     union {
diff --git a/lib/conntrack.c b/lib/conntrack.c
index 9bea3d9..0a611a2 100644
--- a/lib/conntrack.c
+++ b/lib/conntrack.c
@@ -273,7 +273,8 @@ conntrack_execute(struct conntrack *ct, struct dp_packet_batch *pkt_batch,
                   ovs_be16 dl_type, bool commit, uint16_t zone,
                   const uint32_t *setmark,
                   const struct ovs_key_ct_labels *setlabel,
-                  const char *helper)
+                  const char *helper,
+                  const struct nat_action_info_t *nat_action_info OVS_UNUSED)
 {
     struct dp_packet **pkts = pkt_batch->packets;
     size_t cnt = pkt_batch->count;
diff --git a/lib/conntrack.h b/lib/conntrack.h
index 254f61c..288808b 100644
--- a/lib/conntrack.h
+++ b/lib/conntrack.h
@@ -26,6 +26,8 @@
 #include "openvswitch/thread.h"
 #include "openvswitch/types.h"
 #include "ovs-atomic.h"
+#include "ovs-thread.h"
+#include "packets.h"
 
 /* Userspace connection tracker
  *
@@ -61,6 +63,30 @@ struct dp_packet_batch;
 
 struct conntrack;
 
+struct ct_addr {
+    union {
+        ovs_16aligned_be32 ipv4;
+        union ovs_16aligned_in6_addr ipv6;
+        ovs_be32 ipv4_aligned;
+        struct in6_addr ipv6_aligned;
+    };
+};
+
+enum nat_action_e {
+    NAT_ACTION_SRC = 1 << 0,
+    NAT_ACTION_SRC_PORT = 1 << 1,
+    NAT_ACTION_DST = 1 << 2,
+    NAT_ACTION_DST_PORT = 1 << 3,
+};
+
+struct nat_action_info_t {
+    struct ct_addr min_addr;
+    struct ct_addr max_addr;
+    uint16_t min_port;
+    uint16_t max_port;
+    uint16_t nat_action;
+};
+
 void conntrack_init(struct conntrack *);
 void conntrack_destroy(struct conntrack *);
 
@@ -68,7 +94,8 @@ int conntrack_execute(struct conntrack *, struct dp_packet_batch *,
                       ovs_be16 dl_type, bool commit, uint16_t zone,
                       const uint32_t *setmark,
                       const struct ovs_key_ct_labels *setlabel,
-                      const char *helper);
+                      const char *helper,
+                      const struct nat_action_info_t *nat_action_info);
 
 struct conntrack_dump {
     struct conntrack *ct;
diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index 0be5db5..231e609 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -97,7 +97,8 @@ static struct shash dp_netdevs OVS_GUARDED_BY(dp_netdev_mutex)
 static struct vlog_rate_limit upcall_rl = VLOG_RATE_LIMIT_INIT(600, 600);
 
 #define DP_NETDEV_CS_SUPPORTED_MASK (CS_NEW | CS_ESTABLISHED | CS_RELATED \
-                                     | CS_INVALID | CS_REPLY_DIR | CS_TRACKED)
+                                     | CS_INVALID | CS_REPLY_DIR | CS_TRACKED \
+                                     | CS_SRC_NAT | CS_DST_NAT)
 #define DP_NETDEV_CS_UNSUPPORTED_MASK (~(uint32_t)DP_NETDEV_CS_SUPPORTED_MASK)
 
 static struct odp_support dp_netdev_support = {
@@ -4689,6 +4690,9 @@ dp_execute_cb(void *aux_, struct dp_packet_batch *packets_,
         const char *helper = NULL;
         const uint32_t *setmark = NULL;
         const struct ovs_key_ct_labels *setlabel = NULL;
+        struct nat_action_info_t nat_action_info;
+        struct nat_action_info_t *nat_action_info_ref = NULL;
+        bool nat_config = false;
 
         NL_ATTR_FOR_EACH_UNSAFE (b, left, nl_attr_get(a),
                                  nl_attr_get_size(a)) {
@@ -4710,15 +4714,88 @@ dp_execute_cb(void *aux_, struct dp_packet_batch *packets_,
             case OVS_CT_ATTR_LABELS:
                 setlabel = nl_attr_get(b);
                 break;
-            case OVS_CT_ATTR_NAT:
+            case OVS_CT_ATTR_NAT: {
+                const struct nlattr *b_nest;
+                unsigned int left_nest;
+                bool ip_min_specified = false;
+                bool proto_num_min_specified = false;
+                bool ip_max_specified = false;
+                bool proto_num_max_specified = false;
+                memset(&nat_action_info, 0, sizeof nat_action_info);
+                nat_action_info_ref = &nat_action_info;
+
+                NL_NESTED_FOR_EACH_UNSAFE (b_nest, left_nest, b) {
+                    enum ovs_nat_attr sub_type_nest = nl_attr_type(b_nest);
+
[ovs-dev] [patch_v6 1/8] dpdk: Export packet_set_ipv6_addr() for DPDK.
The NAT changes in this series need both packet_set_ipv4_addr() and
packet_set_ipv6_addr() to be exported. The IPv4 API was already exported
by an unrelated patch, so only the IPv6 one is exported here.

Signed-off-by: Darrell Ball
Acked-by: Flavio Leitner
---
 lib/packets.c | 2 +-
 lib/packets.h | 4
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/lib/packets.c b/lib/packets.c
index fa70df6..94e7d87 100644
--- a/lib/packets.c
+++ b/lib/packets.c
@@ -986,7 +986,7 @@ packet_update_csum128(struct dp_packet *packet, uint8_t proto,
     }
 }
 
-static void
+void
 packet_set_ipv6_addr(struct dp_packet *packet, uint8_t proto,
                      ovs_16aligned_be32 addr[4],
                      const struct in6_addr *new_addr,
diff --git a/lib/packets.h b/lib/packets.h
index c4d3799..850f192 100644
--- a/lib/packets.h
+++ b/lib/packets.h
@@ -1100,6 +1100,10 @@ void packet_set_ipv4_addr(struct dp_packet *packet, ovs_16aligned_be32 *addr,
 void packet_set_ipv6(struct dp_packet *, const struct in6_addr *src,
                      const struct in6_addr *dst, uint8_t tc,
                      ovs_be32 fl, uint8_t hlmit);
+void packet_set_ipv6_addr(struct dp_packet *packet, uint8_t proto,
+                          ovs_16aligned_be32 addr[4],
+                          const struct in6_addr *new_addr,
+                          bool recalculate_csum);
 void packet_set_tcp_port(struct dp_packet *, ovs_be16 src, ovs_be16 dst);
 void packet_set_udp_port(struct dp_packet *, ovs_be16 src, ovs_be16 dst);
 void packet_set_sctp_port(struct dp_packet *, ovs_be16 src, ovs_be16 dst);
-- 
1.9.1
[ovs-dev] [patch_v6 0/8] Userspace Datapath: Introduce NAT support.
This patch series introduces NAT support for the userspace datapath.

The per packet scope of lookups for NAT and un_NAT is at the bucket
level rather than global. One hash table is introduced to support
create/delete handling. The create/delete events may be further
optimized, if the need becomes clear.

The existing NAT tests are enabled for the dpdk datapath, with an added
enhancement to the V6 NAT test.

Some NAT options with limited utility (persistent, random) are not
supported yet, but will be supported in a later patch.

One V6 API is exported to facilitate selective editing of the V6
header: packet_set_ipv6_addr().

ALG and fragmentation support are not included here but are being
worked on.

I realize patch 4 is big. It may be clearer and easier to keep as a
single patch, so I have done that after some discussion.

v5->v6:
  Add releases file NAT alert, as pointed out by Flavio.

  Add some missing details in the commit messages of a couple of
  patches, as mentioned by Flavio.

  Flushed the bug queue - found a couple of bugs in testing over the
  last week.
  a) nat_range_hash was missing the intended conn entry address and
     port fields :-); I guess it was missed since the corresponding nat
     info address and port fields were there.
  b) The netlink parsing math was off for min/max address in NAT range.

v4->v5:
  Remove packet sorting in userspace datapath conntrack.

  Simplify conntrack state code.

  Fix sparse error.

  Address code review comments from Daniele.

v3->v4:
  Fix rev_key vs key for nat_conn_keys access in a couple of places;
  this would have affected cleanup. At the same time, rename some
  variables and change the nat_conn_keys APIs to use conn key, rather
  than conn.

  Fix conntrack_flush() CT_CONN_TYPE_DEFAULT flag placement; the
  intention was that it be the same as in sweep_bucket().

  Fix nat_ipv6_addrs_delta() max boundary checking logic. I also
  enhanced the conntrack - IPv6 HTTP with NAT test to give it more
  coverage as partial penance.

  Rebase.

v2->v3:
  Fix a theoretical resend for closed connection restart. Parse out a
  function to help and also limit conn_state_update() to one.

  I decided to cap the V6 address range delta at 4 billion using an
  internal adjustment (user visibility not required).

  Some cleanup of deprecated code path.

  Parse out some more changes as separate patches.

v1->v2:
  Updates/fixes that were missed in v1 patches.

Darrell Ball (8):
  dpdk: Export packet_set_ipv6_addr() for DPDK.
  dpdk: Parse NAT netlink for userspace datapath.
  dpdk: Remove batch sorting in userspace conntrack.
  dpdk: Userspace Datapath: Introduce NAT Support.
  dpdk: Enhance V6 NAT test.
  dpdk: Add missing CHECK_CONNTRACK_ALG guards.
  dpdk: Enable NAT tests for userspace datapath.
  dpdk: Update feature alert documentation

 Documentation/faq/releases.rst   |   2 +-
 NEWS                             |   2 +
 lib/conntrack-private.h          |  25 +-
 lib/conntrack.c                  | 810 +--
 lib/conntrack.h                  |  75 +++-
 lib/dpif-netdev.c                |  83 +++-
 lib/packets.c                    |   2 +-
 lib/packets.h                    |   4 +
 tests/system-traffic.at          |   7 +-
 tests/system-userspace-macros.at |   7 +-
 tests/test-conntrack.c           |   8 +-
 11 files changed, 881 insertions(+), 144 deletions(-)

-- 
1.9.1
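The v2->v3 note about capping the V6 address range delta can be illustrated with a small sketch. This is an illustration under stated assumptions, not the nat_ipv6_addrs_delta() implementation from the series: the function name ipv6_range_delta_capped() is hypothetical, addresses are taken as 16 bytes in network byte order, and a max that is not above min is assumed to yield 0.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns (max - min) for two IPv6 addresses given as
 * 16 bytes in network byte order, capped at UINT32_MAX (about 4 billion).
 * Returns 0 when max <= min; that choice is an assumption of this sketch. */
static uint32_t
ipv6_range_delta_capped(const uint8_t min[16], const uint8_t max[16])
{
    uint8_t diff[16];
    int borrow = 0;
    int i;

    /* memcmp() on big-endian bytes orders the addresses numerically. */
    if (memcmp(max, min, 16) <= 0) {
        return 0;
    }

    /* Byte-wise subtraction with borrow, least significant byte first. */
    for (i = 15; i >= 0; i--) {
        int d = (int) max[i] - (int) min[i] - borrow;

        borrow = d < 0;
        diff[i] = (uint8_t) (borrow ? d + 256 : d);
    }

    /* Any bit set in the upper 96 bits means the delta exceeds 32 bits. */
    for (i = 0; i < 12; i++) {
        if (diff[i]) {
            return UINT32_MAX;
        }
    }

    return ((uint32_t) diff[12] << 24) | ((uint32_t) diff[13] << 16)
           | ((uint32_t) diff[14] << 8) | (uint32_t) diff[15];
}
```

For the range used in the enhanced V6 NAT test, fc00::240-fc00::241, this yields a delta of 1, while a range spanning more than 32 bits of address space is clamped rather than overflowing.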