Hi Martin,
Did some basic testing, and it all works fine. See some comments inline
below.
Cheers,
Eelco
On 7 Dec 2020, at 4:32, Martin Varghese wrote:
From: Martin Varghese <martin.vargh...@nokia.com>
There are various L3 encapsulation standards using UDP being discussed
to
leverage the UDP based load balancing capability of different
networks.
MPLSoUDP (https://tools.ietf.org/html/rfc7510) is one among them.
The Bareudp tunnel provides a generic L3 encapsulation support for
tunnelling different L3 protocols like MPLS, IP, NSH etc. inside a UDP
tunnel.
An example to create a bareudp device to tunnel MPLS traffic is given
below.::
$ ovs-vsctl add-port br_mpls udp_port -- set interface udp_port \
type=bareudp options:remote_ip=2.1.1.3
options:local_ip=2.1.1.2 \
options:payload_type=0x8847 options:dst_port=6635 \
options:packet_type="legacy_l3" \
ofport_request=$bareudp_egress_port
The bareudp device supports special handling for MPLS & IP as
they can have multiple ethertypes. MPLS protocol can have ethertypes
ETH_P_MPLS_UC (unicast) & ETH_P_MPLS_MC (multicast). IP protocol can
have
ethertypes ETH_P_IP (v4) & ETH_P_IPV6 (v6).
The bareudp device to tunnel L3 traffic with multiple ethertypes
(MPLS & IP) can be created by passing the L3 protocol name as string
in
the field payload_type. An example to create bareudp device to tunnel
MPLS unicast & multicast traffic is given below.::
$ ovs-vsctl add-port br_mpls udp_port -- set interface
udp_port \
type=bareudp options:remote_ip=2.1.1.3
options:local_ip=2.1.1.2 \
options:payload_type=mpls options:dst_port=6635 \
options:packet_type="legacy_l3"
Signed-off-by: Martin Varghese <martin.vargh...@nokia.com>
Acked-By: Greg Rose <gvrose8...@gmail.com>
Tested-by: Greg Rose <gvrose8...@gmail.com>
---
Changes in v2:
- Removed vport-bareudp module.
Changes in v3:
- Added net-next upstream commit id and message to commit message.
Changes in v4:
- Removed kernel datapath changes.
Changes in v5:
- Fixed release notes errors.
- Fixed coding errors in dpif-netlink-rtnl.c.
Changes in v6:
- Added code to enable rx metadata collection in the kernel
device.
- Added version history.
Changes in v7:
- Fixed release notes errors.
- Added Skip tests for older kernels.
- Changed bareudp ovs_vport_type to 111.
- Added Acked-by & tested by from gvrose8...@gmail.com
Changes in v8:
- Removed the code added in v6 to enable rx metadata collection in
the kernel device. This flag was never added to any kernel
release; rx metadata collection is always enabled in the kernel
bareudp module.
Documentation/automake.mk | 1 +
Documentation/faq/bareudp.rst | 62
+++++++++++++++++++
Documentation/faq/index.rst | 1 +
Documentation/faq/releases.rst | 1 +
NEWS | 5 +-
.../linux/compat/include/linux/openvswitch.h | 9 +++
lib/dpif-netlink-rtnl.c | 53 ++++++++++++++++
lib/dpif-netlink.c | 5 ++
lib/netdev-vport.c | 27 +++++++-
lib/netdev.h | 1 +
ofproto/ofproto-dpif-xlate.c | 1 +
tests/system-layer3-tunnels.at | 48 ++++++++++++++
12 files changed, 211 insertions(+), 3 deletions(-)
create mode 100644 Documentation/faq/bareudp.rst
diff --git a/Documentation/automake.mk b/Documentation/automake.mk
index f85c4320e..ea3475f35 100644
--- a/Documentation/automake.mk
+++ b/Documentation/automake.mk
@@ -88,6 +88,7 @@ DOC_SOURCE = \
Documentation/faq/terminology.rst \
Documentation/faq/vlan.rst \
Documentation/faq/vxlan.rst \
+ Documentation/faq/bareudp.rst \
Documentation/internals/index.rst \
Documentation/internals/authors.rst \
Documentation/internals/bugs.rst \
diff --git a/Documentation/faq/bareudp.rst
b/Documentation/faq/bareudp.rst
new file mode 100644
index 000000000..ef437631c
--- /dev/null
+++ b/Documentation/faq/bareudp.rst
@@ -0,0 +1,62 @@
+..
+ Licensed under the Apache License, Version 2.0 (the "License");
you may
+ not use this file except in compliance with the License. You
may obtain
+ a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing,
software
+ distributed under the License is distributed on an "AS IS"
BASIS, WITHOUT
+ WARRANTIES OR CONDITIONS OF ANY KIND, either express or
implied. See the
+ License for the specific language governing permissions and
limitations
+ under the License.
+
+ Convention for heading levels in Open vSwitch documentation:
+
+ ======= Heading 0 (reserved for the title in a document)
+ ------- Heading 1
+ ~~~~~~~ Heading 2
+ +++++++ Heading 3
+ ''''''' Heading 4
+
+ Avoid deeper levels because they do not render well.
+
+=======
+Bareudp
+=======
+
+Q: What is Bareudp?
+
+ A: There are various L3 encapsulation standards using UDP being
discussed
+ to leverage the UDP based load balancing capability of
different
+ networks. MPLSoUDP (__ https://tools.ietf.org/html/rfc7510) is
one among
+ them.
+
+ The Bareudp tunnel provides a generic L3 encapsulation support
for
+ tunnelling different L3 protocols like MPLS, IP, NSH etc.
inside a UDP
+ tunnel.
+
+ An example to create bareudp device to tunnel MPLS traffic is
given
+ below.::
+
+ $ ovs-vsctl add-port br_mpls udp_port -- set interface
udp_port \
+ type=bareudp options:remote_ip=2.1.1.3
options:local_ip=2.1.1.2 \
+ options:payload_type=0x8847 options:dst_port=6635 \
I think it would be good to explain what the payload_type is used for as
it's not clear from this text, and I had to read the kernel code to
understand.
Maybe add an example on how to redirect traffic to this tunnel, as it
will only accept the specific ethertype.
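To make the point concrete, here is a rough sketch of how I understand the payload_type check after reading the kernel code. It is an illustration under that assumption only, not the kernel's actual implementation: the helper name bareudp_payload_accepted() is made up, values are in host byte order, and "multiproto" stands for the mode that gets enabled when payload_type is given as a protocol name (mpls or ip).

```c
#include <stdbool.h>
#include <stdint.h>

/* Ethertype values as defined in <linux/if_ether.h>, repeated here so the
 * sketch is self-contained. */
#define ETH_P_IP      0x0800
#define ETH_P_IPV6    0x86DD
#define ETH_P_MPLS_UC 0x8847
#define ETH_P_MPLS_MC 0x8848

/* Hypothetical helper: returns true if a packet with ethertype 'proto' would
 * be accepted by a bareudp device configured with ethertype 'configured'.
 * 'multiproto' models the multi-protocol mode. */
static bool
bareudp_payload_accepted(uint16_t configured, bool multiproto, uint16_t proto)
{
    if (proto == configured) {
        return true;            /* Exact match is always accepted. */
    }
    if (!multiproto) {
        return false;           /* Single ethertype only. */
    }
    if (configured == ETH_P_MPLS_UC && proto == ETH_P_MPLS_MC) {
        return true;            /* MPLS device also carries multicast. */
    }
    if (configured == ETH_P_IP && proto == ETH_P_IPV6) {
        return true;            /* IP device also carries IPv6. */
    }
    return false;
}
```

So with payload_type=0x8847 only MPLS unicast goes through, while payload_type=mpls lets both 0x8847 and 0x8848 through. Something along these lines in the text would already help.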
+ options:packet_type="legacy_l3" \
Looking at the code, it seems we only support packet_type=legacy_l3 (or
ptap), so we could remove it in the examples as it will default to L3.
+ ofport_request=$bareudp_egress_port
+
Maybe also the ofport_request option can be removed, as it adds no value
here.
+ The bareudp device supports special handling for MPLS & IP as
they can
+ have multiple ethertypes.
+ MPLS protocol can have ethertypes ETH_P_MPLS_UC (unicast) &
+ ETH_P_MPLS_MC (multicast). IP protocol can have ethertypes
ETH_P_IP (v4)
+ & ETH_P_IPV6 (v6).
+
+ The bareudp device to tunnel L3 traffic with multiple
ethertypes
+ (MPLS & IP) can be created by passing the L3 protocol name as
string in
+ the field payload_type. An example to create bareudp device to
tunnel
+ MPLS unicast & multicast traffic is given below.::
+
+ $ ovs-vsctl add-port br_mpls udp_port -- set interface
udp_port \
+ type=bareudp options:remote_ip=2.1.1.3
options:local_ip=2.1.1.2 \
+ options:payload_type=mpls options:dst_port=6635 \
+ options:packet_type="legacy_l3"
Same as above on packet_type.
Maybe also add an example for IP over UDP?
diff --git a/Documentation/faq/index.rst b/Documentation/faq/index.rst
index 334b828b2..1dd29986a 100644
--- a/Documentation/faq/index.rst
+++ b/Documentation/faq/index.rst
@@ -30,6 +30,7 @@ Open vSwitch FAQ
.. toctree::
:maxdepth: 2
+ bareudp
configuration
contributing
design
diff --git a/Documentation/faq/releases.rst
b/Documentation/faq/releases.rst
index 3623e3f40..68cbf1dbc 100644
--- a/Documentation/faq/releases.rst
+++ b/Documentation/faq/releases.rst
@@ -138,6 +138,7 @@ Q: Are all features available with all datapaths?
Tunnel - ERSPAN 4.18 2.10 2.10
NO
Tunnel - ERSPAN-IPv6 4.18 2.10 2.10
NO
Tunnel - GTP-U NO NO 2.14
NO
+ Tunnel - Bareudp 5.7 NO NO
NO
QoS - Policing YES 1.1 2.6
NO
QoS - Shaping YES 1.1 NO
NO
sFlow YES 1.0 1.0
NO
diff --git a/NEWS b/NEWS
index 7e291a180..e3bc34a3f 100644
--- a/NEWS
+++ b/NEWS
@@ -75,7 +75,10 @@ v2.14.0 - 17 Aug 2020
- GTP-U Tunnel Protocol
* Add two new fields: tun_gtpu_flags, tun_gtpu_msgtype.
* Only support for userspace datapath.
-
+ - Bareudp Tunnel
+ * Bareudp device support is present in linux kernel from version
5.7
+ * Kernel bareudp device is not backported to ovs tree.
+ * Userspace datapath support is not added
Any plans on adding this?
static const char *
vport_type_to_kind(enum ovs_vport_type type,
@@ -113,6 +129,8 @@ vport_type_to_kind(enum ovs_vport_type type,
}
case OVS_VPORT_TYPE_GTPU:
return NULL;
+ case OVS_VPORT_TYPE_BAREUDP:
+ return "bareudp";
case OVS_VPORT_TYPE_NETDEV:
case OVS_VPORT_TYPE_INTERNAL:
case OVS_VPORT_TYPE_LISP:
@@ -243,6 +261,24 @@ dpif_netlink_rtnl_geneve_verify(const struct
netdev_tunnel_config *tnl_cfg,
return err;
}
+static int
+dpif_netlink_rtnl_bareudp_verify(const struct netdev_tunnel_config
*tnl_cfg,
+ const char *kind, struct ofpbuf
*reply)
+{
+ struct nlattr *bareudp[ARRAY_SIZE(bareudp_policy)];
+ int err;
+
+ err = rtnl_policy_parse(kind, reply, bareudp_policy, bareudp,
+ ARRAY_SIZE(bareudp_policy));
+ if (!err) {
+ if ((tnl_cfg->dst_port !=
nl_attr_get_be16(bareudp[IFLA_BAREUDP_PORT]))
+ || (tnl_cfg->payload_ethertype
+ !=
nl_attr_get_be16(bareudp[IFLA_BAREUDP_ETHERTYPE]))) {
+ err = EINVAL;
+ }
+ }
+ return err;
+}
static int
dpif_netlink_rtnl_verify(const struct netdev_tunnel_config *tnl_cfg,
@@ -275,6 +311,9 @@ dpif_netlink_rtnl_verify(const struct
netdev_tunnel_config *tnl_cfg,
case OVS_VPORT_TYPE_GENEVE:
err = dpif_netlink_rtnl_geneve_verify(tnl_cfg, kind, reply);
break;
+ case OVS_VPORT_TYPE_BAREUDP:
+ err = dpif_netlink_rtnl_bareudp_verify(tnl_cfg, kind, reply);
+ break;
case OVS_VPORT_TYPE_NETDEV:
case OVS_VPORT_TYPE_INTERNAL:
case OVS_VPORT_TYPE_LISP:
@@ -357,6 +396,19 @@ dpif_netlink_rtnl_create(const struct
netdev_tunnel_config *tnl_cfg,
nl_msg_put_u8(&request, IFLA_GENEVE_UDP_ZERO_CSUM6_RX, 1);
nl_msg_put_be16(&request, IFLA_GENEVE_PORT,
tnl_cfg->dst_port);
break;
+ case OVS_VPORT_TYPE_BAREUDP:
+ nl_msg_put_be16(&request, IFLA_BAREUDP_ETHERTYPE,
+ tnl_cfg->payload_ethertype);
+ if ((tnl_cfg->payload_ethertype == htons(ETH_TYPE_MPLS)) ||
+ (tnl_cfg->payload_ethertype ==
htons(ETH_TYPE_MPLS_MCAST))) {
+ nl_msg_put_u16(&request, IFLA_BAREUDP_SRCPORT_MIN,
+ BAREUDP_MPLS_SRCPORT_MIN);
So why do we set this for MPLS only? All other proposals have the same
min port guidance:
- https://tools.ietf.org/html/draft-xu-intarea-ip-in-udp-09
- https://tools.ietf.org/html/rfc8086
+ }
+ nl_msg_put_be16(&request, IFLA_BAREUDP_PORT,
tnl_cfg->dst_port);
+ if (tnl_cfg->exts & (1 << OVS_BAREUDP_EXT_MULTIPROTO_MODE)) {
+ nl_msg_put_flag(&request, IFLA_BAREUDP_MULTIPROTO_MODE);
+ }
+ break;
case OVS_VPORT_TYPE_NETDEV:
case OVS_VPORT_TYPE_INTERNAL:
case OVS_VPORT_TYPE_LISP:
@@ -470,6 +522,7 @@ dpif_netlink_rtnl_port_destroy(const char *name,
const char *type)
case OVS_VPORT_TYPE_ERSPAN:
case OVS_VPORT_TYPE_IP6ERSPAN:
case OVS_VPORT_TYPE_IP6GRE:
+ case OVS_VPORT_TYPE_BAREUDP:
return dpif_netlink_rtnl_destroy(name);
case OVS_VPORT_TYPE_NETDEV:
case OVS_VPORT_TYPE_INTERNAL:
diff --git a/lib/dpif-netlink.c b/lib/dpif-netlink.c
index 2f881e4fa..ceb56c685 100644
--- a/lib/dpif-netlink.c
+++ b/lib/dpif-netlink.c
@@ -749,6 +749,9 @@ get_vport_type(const struct dpif_netlink_vport
*vport)
case OVS_VPORT_TYPE_GTPU:
return "gtpu";
+ case OVS_VPORT_TYPE_BAREUDP:
+ return "bareudp";
+
case OVS_VPORT_TYPE_UNSPEC:
case __OVS_VPORT_TYPE_MAX:
break;
@@ -784,6 +787,8 @@ netdev_to_ovs_vport_type(const char *type)
return OVS_VPORT_TYPE_GRE;
} else if (!strcmp(type, "gtpu")) {
return OVS_VPORT_TYPE_GTPU;
+ } else if (!strcmp(type, "bareudp")) {
+ return OVS_VPORT_TYPE_BAREUDP;
} else {
return OVS_VPORT_TYPE_UNSPEC;
}
diff --git a/lib/netdev-vport.c b/lib/netdev-vport.c
index 0252b61de..c86d420d7 100644
--- a/lib/netdev-vport.c
+++ b/lib/netdev-vport.c
@@ -112,7 +112,7 @@ netdev_vport_needs_dst_port(const struct netdev
*dev)
return (class->get_config == get_tunnel_config &&
(!strcmp("geneve", type) || !strcmp("vxlan", type) ||
!strcmp("lisp", type) || !strcmp("stt", type) ||
- !strcmp("gtpu", type)));
+ !strcmp("gtpu", type) || !strcmp("bareudp",type)));
}
const char *
@@ -219,6 +219,8 @@ netdev_vport_construct(struct netdev *netdev_)
dev->tnl_cfg.dst_port = port ? htons(port) :
htons(STT_DST_PORT);
} else if (!strcmp(type, "gtpu")) {
dev->tnl_cfg.dst_port = port ? htons(port) :
htons(GTPU_DST_PORT);
+ } else if (!strcmp(type, "bareudp")) {
+ dev->tnl_cfg.dst_port = htons(port);
}
dev->tnl_cfg.dont_fragment = true;
@@ -438,6 +440,8 @@ tunnel_supported_layers(const char *type,
return TNL_L2 | TNL_L3;
} else if (!strcmp(type, "gtpu")) {
return TNL_L3;
+ } else if (!strcmp(type, "bareudp")) {
+ return TNL_L3;
} else {
return TNL_L2;
}
@@ -745,6 +749,16 @@ set_tunnel_config(struct netdev *dev_, const
struct smap *args, char **errp)
goto out;
}
}
+ } else if (!strcmp(node->key, "payload_type")) {
+ if (!strcmp(node->value, "mpls")) {
+ tnl_cfg.payload_ethertype = htons(ETH_TYPE_MPLS);
+ tnl_cfg.exts |= (1 <<
OVS_BAREUDP_EXT_MULTIPROTO_MODE);
+ } else if (!strcmp(node->value, "ip")) {
+ tnl_cfg.payload_ethertype = htons(ETH_TYPE_IP);
+ tnl_cfg.exts |= (1 <<
OVS_BAREUDP_EXT_MULTIPROTO_MODE);
+ } else {
+ tnl_cfg.payload_ethertype =
htons(atoi(node->value));
As the kernel only supports IPv4, IPv6, MPLS, and MPLS_MULTI, why not
return an error here if it's not one of these four?
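A minimal, untested sketch of what I have in mind, written as a standalone function so it can be compiled on its own. The name parse_payload_type() and its signature are hypothetical, the ETH_TYPE_* values are redefined locally, and the result is in host byte order (the caller would still apply htons()). It uses strtoul() with base 0 so that hex input such as 0x8847 is parsed; note that atoi() stops at the 'x' and returns 0 for that string.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Local copies of the ethertype values, for a self-contained example. */
#define ETH_TYPE_IP         0x0800
#define ETH_TYPE_IPV6       0x86dd
#define ETH_TYPE_MPLS       0x8847
#define ETH_TYPE_MPLS_MCAST 0x8848

/* Hypothetical helper: parses a payload_type option value.  Returns 0 and
 * fills in '*ethertype' (host byte order) and '*multiproto' on success, or
 * EINVAL if the value is not one of the four supported ethertypes. */
static int
parse_payload_type(const char *value, uint16_t *ethertype, bool *multiproto)
{
    unsigned long n;
    char *end;

    *multiproto = false;
    if (!strcmp(value, "mpls")) {
        *ethertype = ETH_TYPE_MPLS;
        *multiproto = true;
        return 0;
    }
    if (!strcmp(value, "ip")) {
        *ethertype = ETH_TYPE_IP;
        *multiproto = true;
        return 0;
    }

    /* Base 0 accepts decimal, octal, and 0x-prefixed hex. */
    errno = 0;
    n = strtoul(value, &end, 0);
    if (errno || end == value || *end != '\0') {
        return EINVAL;
    }

    switch (n) {
    case ETH_TYPE_IP:
    case ETH_TYPE_IPV6:
    case ETH_TYPE_MPLS:
    case ETH_TYPE_MPLS_MCAST:
        *ethertype = (uint16_t) n;
        return 0;
    default:
        return EINVAL;      /* Not supported by the kernel device. */
    }
}
```

In set_tunnel_config() the EINVAL case would then add a message to the errors string and take the existing error path, the same way the other options do.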
+ }
} else {
ds_put_format(&errors, "%s: unknown %s argument '%s'\n",
name,
type, node->key);
@@ -917,7 +931,8 @@ get_tunnel_config(const struct netdev *dev, struct
smap *args)
(!strcmp("vxlan", type) && dst_port != VXLAN_DST_PORT) ||
(!strcmp("lisp", type) && dst_port != LISP_DST_PORT) ||
(!strcmp("stt", type) && dst_port != STT_DST_PORT) ||
- (!strcmp("gtpu", type) && dst_port != GTPU_DST_PORT)) {
+ (!strcmp("gtpu", type) && dst_port != GTPU_DST_PORT) ||
+ !strcmp("bareudp", type)) {
smap_add_format(args, "dst_port", "%d", dst_port);
}
}
@@ -1243,6 +1258,14 @@ netdev_vport_tunnel_register(void)
},
{{NULL, NULL, 0, 0}}
},
+ { "udp_sys",
+ {
+ TUNNEL_FUNCTIONS_COMMON,
+ .type = "bareudp",
+ .get_ifindex = NETDEV_VPORT_GET_IFINDEX,
+ },
+ {{NULL, NULL, 0, 0}}
+ },
};
static struct ovsthread_once once = OVSTHREAD_ONCE_INITIALIZER;
diff --git a/lib/netdev.h b/lib/netdev.h
index fb5073056..b705a9e56 100644
--- a/lib/netdev.h
+++ b/lib/netdev.h
@@ -107,6 +107,7 @@ struct netdev_tunnel_config {
bool out_key_flow;
ovs_be64 out_key;
+ ovs_be16 payload_ethertype;
ovs_be16 dst_port;
bool ip_src_flow;
diff --git a/ofproto/ofproto-dpif-xlate.c
b/ofproto/ofproto-dpif-xlate.c
index 11aa20754..7eeff14f6 100644
--- a/ofproto/ofproto-dpif-xlate.c
+++ b/ofproto/ofproto-dpif-xlate.c
@@ -3573,6 +3573,7 @@ propagate_tunnel_data_to_flow(struct xlate_ctx
*ctx, struct eth_addr dmac,
case OVS_VPORT_TYPE_VXLAN:
case OVS_VPORT_TYPE_GENEVE:
case OVS_VPORT_TYPE_GTPU:
+ case OVS_VPORT_TYPE_BAREUDP:
nw_proto = IPPROTO_UDP;
break;
case OVS_VPORT_TYPE_LISP:
diff --git a/tests/system-layer3-tunnels.at
b/tests/system-layer3-tunnels.at
index 1232964bb..8423add2b 100644
--- a/tests/system-layer3-tunnels.at
+++ b/tests/system-layer3-tunnels.at
These tests also get executed as part of the userspace test set,
system-userspace-testsuite.at, where they will fail, so they need to be
excluded there.
@@ -152,3 +152,51 @@ AT_CHECK([tail -1 stdout], [0],
OVS_VSWITCHD_STOP
AT_CLEANUP
+
+AT_SETUP([layer3 - ping over MPLS Bareudp])
+OVS_CHECK_MIN_KERNEL(5, 7)
+OVS_TRAFFIC_VSWITCHD_START([_ADD_BR([br1])])
+ADD_NAMESPACES(at_ns0, at_ns1)
+
+ADD_VETH(p0, at_ns0, br0, "10.1.1.1/24", "36:b1:ee:7c:01:01")
+ADD_VETH(p1, at_ns1, br1, "10.1.1.2/24", "36:b1:ee:7c:01:02")
+
+ADD_OVS_TUNNEL([bareudp], [br0], [at_bareudp0], [8.1.1.3],
[8.1.1.2/24],
+ [ options:local_ip=8.1.1.2
options:packet_type="legacy_l3" options:payload_type=mpls
options:dst_port=6635])
+
+ADD_OVS_TUNNEL([bareudp], [br1], [at_bareudp1], [8.1.1.2],
[8.1.1.3/24],
+ [options:local_ip=8.1.1.3
options:packet_type="legacy_l3" options:payload_type=mpls
options:dst_port=6635])
+
+AT_DATA([flows0.txt], [dnl
+table=0,priority=100,dl_type=0x0800
actions=push_mpls:0x8847,set_mpls_label:3,output:at_bareudp0
+table=0,priority=100,dl_type=0x8847 in_port=at_bareudp0
actions=pop_mpls:0x0800,set_field:36:b1:ee:7c:01:01->dl_dst,set_field:36:b1:ee:7c:01:02->dl_src,output:ovs-p0
+table=0,priority=10 actions=normal
+])
Maybe it would be good to also have an IP test case?
+AT_DATA([flows1.txt], [dnl
+table=0,priority=100,dl_type=0x0800
actions=push_mpls:0x8847,set_mpls_label:3,output:at_bareudp1
+table=0,priority=100,dl_type=0x8847 in_port=at_bareudp1
actions=pop_mpls:0x0800,set_field:36:b1:ee:7c:01:02->dl_dst,set_field:36:b1:ee:7c:01:01->dl_src,output:ovs-p1
+table=0,priority=10 actions=normal
+])
+
+AT_CHECK([ip link add patch0 type veth peer name patch1])
+on_exit 'ip link del patch0'
+
+AT_CHECK([ip link set dev patch0 up])
+AT_CHECK([ip link set dev patch1 up])
+AT_CHECK([ovs-vsctl add-port br0 patch0])
+AT_CHECK([ovs-vsctl add-port br1 patch1])
+
+
+AT_CHECK([ovs-ofctl -O OpenFlow13 add-flows br0 flows0.txt])
+AT_CHECK([ovs-ofctl -O OpenFlow13 add-flows br1 flows1.txt])
+
+NS_CHECK_EXEC([at_ns0], [ping -q -c 3 -i 0.3 -w 2 10.1.1.2 |
FORMAT_PING], [0], [dnl
+3 packets transmitted, 3 received, 0% packet loss, time 0ms
+])
+
+NS_CHECK_EXEC([at_ns1], [ping -q -c 3 -i 0.3 -w 2 10.1.1.1 |
FORMAT_PING], [0], [dnl
+3 packets transmitted, 3 received, 0% packet loss, time 0ms
+])
+OVS_TRAFFIC_VSWITCHD_STOP
+AT_CLEANUP
--
2.18.4
Can you also update the vswitchd/ovs-vswitchd.conf.db.5 man page with
the new tunnel and options?